Archive

Posts Tagged ‘computational legal studies’

The Data Deluge [Via The Economist]

March 1st, 2010

The cover story of this week’s Economist is entitled The Data Deluge. This is, of course, a favorite topic of the hosts of this blog. While a number of folks have already highlighted this trend, we are happy to see a mainstream outlet such as the Economist reporting on the era of big data. Indeed, the convergence of rapidly increasing computing power, and decreasing data storage costs, on one side, and large scale data collection and digitization on the other … has already impacted practices in the business, government and scientific communities. There is ample reason to believe that more is on the way.

In our estimation, for the particular class of questions for which data is available, two major implications of the deluge are worth reiterating: (1) there is no need to make assumptions about the asymptotic performance of a particular sampling frame when population level data is readily available; and (2) what statistical sampling was to the 20th century, data filtering may very well be to the 21st ….

dmartink Uncategorized , ,

The Development of Structure in the Citation Network of the United States Supreme Court — Now in HD!

February 11th, 2010

What are some of the key takeaway points?

(1) The Supreme Court’s increasing reliance upon its own decisions over the 1800-1830 window.

(2) The important role of maritime/admiralty law in the early years of the Supreme Court’s citation network.  At least with respect to the Supreme Court’s citation network, these maritime decisions are the root of the Supreme Court’s jurisprudence.

(3) The increasing centrality of decisions such as Marbury v. Madison and Martin v. Hunter’s Lessee to the overall network.

The Development of Structure in the SCOTUS Citation Network

The visualization offered above is the largest weakly connected component of the citation network of the United States Supreme Court (1800-1829). Each time slice visualizes the aggregate network as of the year in question.

In our paper entitled Distance Measures for Dynamic Citation Networks, we offer some thoughts on the early SCOTUS citation network.  In reviewing the visual above, note: “[T]he Court’s early citation practices indicate a general absence of references to its own prior decisions. While the court did invoke well-established legal concepts, those concepts were often originally developed in alternative domains or jurisdictions. At some level, the lack of self-reference and corresponding reliance upon external sources is not terribly surprising. Namely, there often did not exist a set of established Supreme Court precedents for the class of disputes which reached the high court. Thus, it was necessary for the jurisprudence of the United States Supreme Court, seen through the prism of its case-to-case citation network, to transition through a loading phase. During this loading phase, the largest weakly connected component of the graph generally lacked any meaningful clustering. However, this sparsely connected graph would soon give way, and by the early 1820’s, the largest weakly connected component displayed detectable structure.”

What are the elements of the network?

What are the labels?

To help orient the end-user, the visualization highlights several important decisions of the United States Supreme Court offered within the relevant time period:

Marbury v. Madison, 5 U.S. 137 (1803) we labeled as “Marbury”
Murray v. The Charming Betsey, 6 U.S. 64 (1804) we labeled as “Charming Betsey”
Martin v. Hunter’s Lessee, 14 U.S. 304 (1816) we labeled as “Martin’s Lessee”
The Anna Maria, 15 U.S. 327 (1817) we labeled as “Anna Maria”
McCulloch v. Maryland, 17 U.S. 316 (1819) we labeled as “McCulloch”

Why do cases not always enter the visualization when they are decided?

As we are interested in the core set of cases, we are only visualizing the largest weakly connected component of the United States Supreme Court citation network. Cases are not added until they are linked to the LWCC.  For example, Marbury v. Madison is not added to the visualization until a few years after it is decided.
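
The delayed entry described above can be illustrated with a minimal sketch. It is not the code behind the visualization; the case names, years and citations are hypothetical toy data, and the helper names `largest_weak_component` and `entry_years` are ours. For each year it builds the aggregate network of cases decided so far, finds the largest weakly connected component (ignoring citation direction), and records the first year each case appears in it.

```python
from collections import defaultdict


def largest_weak_component(nodes, edges):
    """Return the largest weakly connected component as a set of nodes.

    Edge direction is ignored, which is what "weakly" connected means.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for citing, cited in edges:
        parent[find(citing)] = find(cited)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    # Ties are broken arbitrarily here; a real analysis would need a rule.
    return max(groups.values(), key=len) if groups else set()


def entry_years(decided, citations, years):
    """Map each case to the first year it appears in the LWCC."""
    entered = {}
    for year in years:
        # Aggregate network: every case and citation up to this year.
        nodes = [c for c, y in decided.items() if y <= year]
        edges = [(a, b) for a, b in citations
                 if decided[a] <= year and decided[b] <= year]
        for case in largest_weak_component(nodes, edges):
            entered.setdefault(case, year)
    return entered


# Hypothetical toy data: case -> year decided, and (citing, cited) pairs.
decided = {"A": 1800, "B": 1801, "C": 1803, "D": 1804, "E": 1806}
citations = [("B", "A"), ("D", "B"), ("E", "C"), ("E", "D")]
entered = entry_years(decided, citations, range(1800, 1807))
print(entered)
```

In this toy network, case C is decided in 1803 but nothing links it to the main component until case E cites it in 1806, so it only enters the picture then. That is the same pattern as the Marbury v. Madison example.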

How do I best view the visualization?

Given this is a high-definition video, it may take a few seconds to load.  We believe that it is worth the wait.  In our view, the video is best consumed with (1) Full Screen, (2) HD On and (3) Scaling Off.

Where can I find related papers?

Here is a non-exhaustive list of related scholarship:

Michael Bommarito, Daniel Katz, Jon Zelner & James Fowler, Distance Measures for Dynamic Citation Networks (Under Review).

Yonatan Lupu & James Fowler, The Strategic Content Model of Supreme Court Opinion Writing, APSA 2009 Toronto Meeting Paper.

Michael Bommarito, Daniel Katz & Jon Zelner, Law as a Seamless Web? Comparison of Various Network Representations of the United States Supreme Court Corpus (1791-2005) in Proceedings of the 12th Intl. Conference on Artificial Intelligence and Law (2009).

Frank Cross, Thomas Smith & Antonio Tomarchio, The Reagan Revolution in the Network of Law, 57 Emory L. J. 1227 (2008).

James Fowler & Sangick Jeon, The Authority of Supreme Court Precedent, 30 Soc. Networks 16 (2008).

Elizabeth Leicht, Gavin Clarkson, Kerby Shedden & Mark Newman, Large-Scale Structure of Time Evolving Citation Networks, 59 European Physical Journal B 75 (2007).

Thomas Smith, The Web of the Law, 44 San Diego L. Rev. 309 (2007).

James Fowler, Timothy R. Johnson, James F. Spriggs II, Sangick Jeon & Paul J. Wahlbeck, Network Analysis and the Law: Measuring the Legal Importance of Precedents at the U.S. Supreme Court, 15 Political Analysis 324 (2007).


dmartink Uncategorized , , ,

The Development of Structure in the Citation Network of the United States Supreme Court

February 10th, 2010

What are some of the key takeaway points?

(1) The Supreme Court’s increasing reliance upon its own decisions over the 1800-1830 window.

(2) The important role of maritime/admiralty law in the early years of the Supreme Court’s citation network.  At least with respect to the Supreme Court’s citation network, these maritime decisions are the root of the Supreme Court’s jurisprudence.

(3) The increasing centrality of decisions such as Marbury v. Madison and Martin v. Hunter’s Lessee to the overall network.

The Development of Structure in the SCOTUS Citation Network

The visualization offered above is the largest weakly connected component of the citation network of the United States Supreme Court (1800-1829). Each time slice visualizes the aggregate network as of the year in question.

In our paper entitled Distance Measures for Dynamic Citation Networks, we offer some thoughts on the early SCOTUS citation network.  In reviewing the visual above, note: “[T]he Court’s early citation practices indicate a general absence of references to its own prior decisions. While the court did invoke well-established legal concepts, those concepts were often originally developed in alternative domains or jurisdictions. At some level, the lack of self-reference and corresponding reliance upon external sources is not terribly surprising. Namely, there often did not exist a set of established Supreme Court precedents for the class of disputes which reached the high court. Thus, it was necessary for the jurisprudence of the United States Supreme Court, seen through the prism of its case-to-case citation network, to transition through a loading phase. During this loading phase, the largest weakly connected component of the graph generally lacked any meaningful clustering. However, this sparsely connected graph would soon give way, and by the early 1820’s, the largest weakly connected component displayed detectable structure.”

What are the elements of the network?

What are the labels?

To help orient the end-user, the visualization highlights several important decisions of the United States Supreme Court offered within the relevant time period:

Marbury v. Madison, 5 U.S. 137 (1803) we labeled as “Marbury”
Murray v. The Charming Betsey, 6 U.S. 64 (1804) we labeled as “Charming Betsey”
Martin v. Hunter’s Lessee, 14 U.S. 304 (1816) we labeled as “Martin’s Lessee”
The Anna Maria, 15 U.S. 327 (1817) we labeled as “Anna Maria”
McCulloch v. Maryland, 17 U.S. 316 (1819) we labeled as “McCulloch”

Why do cases not always enter the visualization when they are decided?

As we are interested in the core set of cases, we are only visualizing the largest weakly connected component of the United States Supreme Court citation network. Cases are not added until they are linked to the LWCC.  For example, Marbury v. Madison is not added to the visualization until a few years after it is decided.

How do I best view the visualization?

To view the video full screen, click on the full screen icon contained in the Vimeo bottom banner.  Check out the NEW Hi-Def (HD) version of the video!


dmartink Uncategorized , , ,

United States Court of Appeals & Parallel Tag Clouds from IBM Research [Repost from 10/23]

February 3rd, 2010

Download the paper: Collins, Christopher; Viégas, Fernanda B.; Wattenberg, Martin. Parallel Tag Clouds to Explore Faceted Text Corpora. To appear in Proceedings of the IEEE Symposium on Visual Analytics Science and Technology (VAST), October 2009. [Note: The paper is 24.5 MB]

Here is the abstract: Do court cases differ from place to place? What kind of picture do we get by looking at a country’s collection of law cases? We introduce Parallel Tag Clouds: a new way to visualize differences amongst facets of very large metadata-rich text corpora. We have pointed Parallel Tag Clouds at a collection of over 600,000 US Circuit Court decisions spanning a period of 50 years and have discovered regional as well as linguistic differences between courts. The visualization technique combines graphical elements from parallel coordinates and traditional tag clouds to provide rich overviews of a document collection while acting as an entry point for exploration of individual texts. We augment basic parallel tag clouds with a details-in-context display and an option to visualize changes over a second facet of the data, such as time. We also address text mining challenges such as selecting the best words to visualize, and how to do so in reasonable time periods to maintain interactivity.

dmartink Uncategorized , ,

Science Magazine Policy Forum: Accessible Reproducible Research

January 23rd, 2010

From the Abstract … “Scientific publications have at least two goals: (i) to announce a result and (ii) to convince readers that the result is correct. Mathematics papers are expected to contain a proof complete enough to allow knowledgeable readers to fill in any details. Papers in experimental science should describe the results and provide a clear enough protocol to allow successful repetition and extension.”  (Institutional or Individual Subscription Required).

dmartink Uncategorized , ,

Netflix Challenge for SCOTUS Prediction?

January 13th, 2010

During our break from blogging, Ian Ayres offered a very interesting post over at Freakonomics entitled “Prediction Markets vs. Super Crunching: Which Can Better Predict How Justice Kennedy Will Vote?” In general terms, the post compares the well known statistical model offered by Martin-Quinn to the new Supreme Court Fantasy League created by Josh Blackman. We were particularly interested in a sentence located at the end of the post … “[T]he fantasy league predictions would probably be more accurate if market participants had to actually put their money behind their predictions (as with intrade.com).”  This point is well taken. Extending the idea of having some “skin in the game,” we wondered what sort of intellectual returns could be generated for the field of quantitative Supreme Court prediction by some sort of Netflix style SCOTUS challenge.

The Martin-Quinn model has significantly advanced the field of quantitative analysis of the United States Supreme Court. However, despite all of the benefits the model has offered, it is unlikely to be the last word on the question. While only time will tell, an improved prediction algorithm might very well be generated through the application of ideas in machine learning and via incorporation of additional components such as text, citations, etc.

With a significant financial sum at stake … even far less than the real Netflix challenge … it is certainly possible that a non-trivial improvement could be generated. In a discussion among a few of us here at the Michigan CSCS lab, we generated the following non-exhaustive set of possible ground rules for a Netflix Style SCOTUS challenge:

  1. To unseat the Martin-Quinn Model, the winning team should be required to make a non-trivial improvement upon its out-of-sample historical success.
  2. To prevent overfitting, the authors of this non-trivial improvement should be required to best the existing model for some prospective period.
  3. All of those who submit agree to publish their code in a standard programming language (C, Java, Python, etc.) with reasonable commenting / documentation.
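
As a rough illustration of how the first two ground rules might be checked, here is a minimal sketch. Everything in it is an assumption for illustration: the vote data are made up, `min_gain` is a hypothetical threshold for "non-trivial" (we have not proposed a number), and the function names are ours. A challenger only wins if it beats the baseline by that margin on both the historical out-of-sample window and the prospective window.

```python
def accuracy(predicted, actual):
    """Fraction of votes predicted correctly."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty sequences")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def unseats_baseline(challenger, baseline, actual, min_gain=0.02):
    """True if the challenger beats the baseline by at least min_gain.

    min_gain is a hypothetical stand-in for "non-trivial improvement".
    """
    gain = accuracy(challenger, actual) - accuracy(baseline, actual)
    return gain >= min_gain


# Made-up votes (1 = reverse, 0 = affirm) for two evaluation windows.
historical = {
    "actual":     [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    "baseline":   [1, 0, 1, 1, 0, 1, 1, 0, 0, 1],  # 6 of 10 correct
    "challenger": [1, 0, 1, 1, 0, 1, 0, 1, 0, 1],  # 8 of 10 correct
}
prospective = {
    "actual":     [1, 1, 0, 1, 0],
    "baseline":   [1, 1, 0, 0, 1],  # 3 of 5 correct
    "challenger": [1, 1, 0, 1, 1],  # 4 of 5 correct
}

# Rule 1 (historical out-of-sample) and rule 2 (prospective) must both hold.
wins = all(
    unseats_baseline(w["challenger"], w["baseline"], w["actual"])
    for w in (historical, prospective)
)
print("Challenger unseats baseline:", wins)
```

Requiring the prospective window as well as the historical one is what guards against a model that has simply been tuned to the past.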

dmartink Uncategorized , , , ,

The Age of Computational Science?

December 15th, 2009

Visualizing the Linkage Structure of the Law Blogosphere

November 22nd, 2009

Law Blogosphere

So this is Version 1.0 of our series regarding the linkage structure of the Law Blogosphere. We are currently working on a Version 2.0 that will feature documentation and a larger set of law blogs. Check back soon for more!

dmartink Uncategorized , , ,

“Sink Method” Poster for Conference on Empirical Legal Studies (CELS 2009 @ USC)

November 20th, 2009

Sinks Poster

As we mentioned in previous posts, Seadragon is a really cool product. Please note that load times may vary depending upon your specific machine configuration as well as the strength of your internet connection. For those not familiar with how to operate it, please see below. In our view, Full Screen is the best way to go ….

dmartink Uncategorized , , , , ,

Computational Legal Studies™