Predicting the Behavior of the Supreme Court of the United States: A General Approach (Katz, Bommarito & Blackman)

SCOTUS Prediction Model
Abstract: “Building upon developments in theoretical and applied machine learning, as well as the efforts of various scholars including Guimera and Sales-Pardo (2011), Ruger et al. (2004), and Martin et al. (2004), we construct a model designed to predict the voting behavior of the Supreme Court of the United States. Using the extremely randomized tree method first proposed in Geurts, et al. (2006), a method similar to the random forest approach developed in Breiman (2001), as well as novel feature engineering, we predict more than sixty years of decisions by the Supreme Court of the United States (1953-2013). Using only data available prior to the date of decision, our model correctly identifies 69.7% of the Court’s overall affirm and reverse decisions and correctly forecasts 70.9% of the votes of individual justices across 7,700 cases and more than 68,000 justice votes. Our performance is consistent with the general level of prediction offered by prior scholars. However, our model is distinctive as it is the first robust, generalized, and fully predictive model of Supreme Court voting behavior offered to date. Our model predicts six decades of behavior of thirty Justices appointed by thirteen Presidents. With a more sound methodological foundation, our results represent a major advance for the science of quantitative legal prediction and portend a range of other potential applications, such as those described in Katz (2013).”
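For readers who want to experiment, here is a minimal sketch of the modeling approach named in the abstract using scikit-learn's ExtraTreesClassifier (an implementation of the extremely randomized trees of Geurts et al. 2006). The feature names and toy rows are hypothetical placeholders rather than the paper's actual feature engineering; the real pipeline lives in the GitHub repository noted below.

```python
# Minimal sketch of an extremely randomized trees classifier for affirm/reverse
# prediction. Features and toy data are hypothetical, not the paper's actual features.
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

# Hypothetical case-level features, all knowable before the decision date.
df = pd.DataFrame({
    "term":            [1953, 1967, 1980, 1995, 2001, 2013],
    "issue_area":      [1, 8, 2, 1, 9, 3],
    "lower_court_dir": [0, 1, 1, 0, 1, 0],   # direction of the decision below
    "petitioner_type": [3, 1, 2, 3, 5, 1],
    "reversed":        [1, 1, 0, 1, 0, 1],   # target: did the Court reverse?
})

X, y = df.drop(columns="reversed"), df["reversed"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

clf = ExtraTreesClassifier(n_estimators=500, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```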

You can access the current draft of the paper via SSRN or via the physics arXiv.  Full code is publicly available on GitHub.  See also the LexPredict site.  More on this to come soon …

Network Analysis and the Law — 3D-Hi-Def Visualization of the Time Evolving Citation Network of the United States Supreme Court

What are some of the key takeaway points?

(1) The Supreme Court’s increasing reliance upon its own decisions over the 1800-1830 window.

(2) The important role of maritime/admiralty law in the early years of the Supreme Court’s citation network. At least with respect to the Supreme Court’s citation network, these maritime decisions are the root of the Supreme Court’s jurisprudence.

(3) The increasing centrality of decisions such as Marbury v. Madison and Martin v. Hunter’s Lessee to the overall network (see the short sketch below).
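For concreteness, here is a toy sketch of what “centrality” means in this setting: a decision becomes more central as later cases cite it, directly and indirectly. The case names are real, but the edges are invented for illustration.

```python
# Toy illustration of citation-network centrality. Edge u -> v means "case u cites case v".
# The edges below are invented for this example, not taken from our data.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("Martin v. Hunter's Lessee", "Marbury v. Madison"),
    ("McCulloch v. Maryland", "Marbury v. Madison"),
    ("Cohens v. Virginia", "Marbury v. Madison"),
    ("Cohens v. Virginia", "Martin v. Hunter's Lessee"),
])

print("times cited:", dict(G.in_degree()))
# PageRank is one simple centrality measure that rewards heavily cited cases.
print("pagerank:", nx.pagerank(G))
```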

The Development of Structure in the SCOTUS Citation Network

The visualization offered above is the largest weakly connected component of the citation network of the United States Supreme Court (1800-1829). Each time slice visualizes the aggregate network as of the year in question.

In our paper entitled Distance Measures for Dynamic Citation Networks, we offer some thoughts on the early SCOTUS citation network. In reviewing the visual above, note: “[T]he Court’s early citation practices indicate a general absence of references to its own prior decisions. While the court did invoke well-established legal concepts, those concepts were often originally developed in alternative domains or jurisdictions. At some level, the lack of self-reference and corresponding reliance upon external sources is not terribly surprising. Namely, there often did not exist a set of established Supreme Court precedents for the class of disputes which reached the high court. Thus, it was necessary for the jurisprudence of the United States Supreme Court, seen through the prism of its case-to-case citation network, to transition through a loading phase. During this loading phase, the largest weakly connected component of the graph generally lacked any meaningful clustering. However, this sparsely connected graph would soon give way, and by the early 1820’s, the largest weakly connected component displayed detectable structure.”
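For those who want to reproduce the basic construction, here is a minimal sketch, assuming a citation edge list with decision years, of how the time-sliced largest weakly connected component (LWCC) can be computed with networkx. The file name and column names are assumptions for illustration, not the format of our actual data.

```python
# Minimal sketch of the time-sliced LWCC construction described above.
# The file name and columns ("citing", "cited", "citing_year") are assumed.
import networkx as nx
import pandas as pd

edges = pd.read_csv("scotus_citations.csv")  # one row per citation

def lwcc_as_of(year):
    """Build the aggregate citation network through `year` and return its
    largest weakly connected component (LWCC)."""
    G = nx.DiGraph()
    G.add_edges_from(
        edges.loc[edges.citing_year <= year, ["citing", "cited"]].itertuples(index=False)
    )
    if G.number_of_nodes() == 0:
        return G
    largest = max(nx.weakly_connected_components(G), key=len)
    return G.subgraph(largest).copy()

for year in range(1800, 1830):
    C = lwcc_as_of(year)
    # Average clustering on the undirected view is one crude signal of emerging structure.
    clustering = nx.average_clustering(C.to_undirected()) if C.number_of_nodes() else 0.0
    print(year, C.number_of_nodes(), round(clustering, 3))
```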

What are the elements of the network?

Each node is a decision of the United States Supreme Court, and each directed edge is a citation from one decision to an earlier decision.

What are the labels?

To help orient the end-user, the visualization highlights several important decisions of the United States Supreme Court offered within the relevant time period:

Marbury v. Madison, 5 U.S. 137 (1803), which we labeled “Marbury”
Murray v. The Charming Betsey, 6 U.S. 64 (1804), which we labeled “Charming Betsey”
Martin v. Hunter’s Lessee, 14 U.S. 304 (1816), which we labeled “Martin’s Lessee”
The Anna Maria, 15 U.S. 327 (1817), which we labeled “Anna Maria”
McCulloch v. Maryland, 17 U.S. 316 (1819), which we labeled “McCulloch”

Why do cases not always enter the visualization when they are decided?

As we are interested in the core set of cases, we visualize only the largest weakly connected component (LWCC) of the United States Supreme Court citation network. Cases are not added until they are linked to the LWCC. For example, Marbury v. Madison does not enter the visualization until a few years after it is decided.
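Continuing the sketch above, one can check the first year a given case joins the LWCC, which is the year it would enter the visualization. The node identifier used here (“5 U.S. 137” for Marbury) is illustrative; use whatever identifiers your edge list actually carries.

```python
# Continuing the earlier sketch: find the first year a case joins the LWCC,
# i.e., the year it would enter the visualization. Node ID is illustrative.
def first_year_in_lwcc(case_id, start=1800, end=1830):
    for year in range(start, end):
        if case_id in lwcc_as_of(year):
            return year
    return None

print(first_year_in_lwcc("5 U.S. 137"))  # expected to be later than 1803, the decision year
```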

How do I best view the visualization?

Given this is a high-definition video, it may take a few seconds to load. We believe it is worth the wait. In our view, the video is best consumed (1) Full Screen, (2) HD On, (3) Scaling Off.

Where can I find related papers?

Here is a non-exhaustive list of related scholarship:

Daniel Martin Katz, Network Analysis Reveals the Structural Position of Foreign Law in the Early Jurisprudence of the United States Supreme Court (Working Paper – 2014)

Yonatan Lupu & James H. Fowler, Strategic Citations to Precedent on the U.S. Supreme Court, 42 Journal of Legal Studies 151 (2013)

Michael Bommarito, Daniel Martin Katz, Jon Zelner & James Fowler, Distance Measures for Dynamic Citation Networks, 389 Physica A 4201 (2010).

Michael Bommarito, Daniel Martin Katz & Jon Zelner, Law as a Seamless Web? Comparison of Various Network Representations of the United States Supreme Court Corpus (1791-2005) in Proceedings of the 12th Intl. Conference on Artificial Intelligence and Law (2009).

Frank Cross, Thomas Smith & Antonio Tomarchio, The Reagan Revolution in the Network of Law, 57 Emory L. J. 1227 (2008).

James Fowler & Sangick Jeon, The Authority of Supreme Court Precedent, 30 Soc. Networks 16 (2008).

Elizabeth Leicht, Gavin Clarkson, Kerby Shedden & Mark Newman, Large-Scale Structure of Time Evolving Citation Networks, 59 European Physical Journal B 75 (2007).

Thomas Smith, The Web of the Law, 44 San Diego L.R. 309 (2007).

James Fowler, Timothy R. Johnson, James F. Spriggs II, Sangick Jeon & Paul J. Wahlbeck, Network Analysis and the Law: Measuring the Legal Importance of Precedents at the U.S. Supreme Court, 15 Political Analysis 324 (2007).

A Rational But Ultimately Unsuccessful Critique of Nate Silver

This article is reasonable insofar as it is a rational argument against Nate Silver’s work at 538 (rather than the ridiculous nonsense he had to endure from folks who are totally clueless – UnSkewedPolls.com, etc.).  However, it is ultimately unsuccessful.

“Nate Silver didn’t nail it; the pollsters did.”  Not true.  They both got it correct (or as correct as can be judged when only one event is being modeled).

“To be fair, the art of averaging isn’t simple.”  Well, it is not just averaging.  Pure averaging is totally stupid.  This is weighting, and it is non-trivial because you need to build a notion of how much signal vs. noise to assign to each {pollster, time point} combination.  Some of these polling outfits are totally disreputable and some have historic “house effects” (see, e.g., Rasmussen).  With respect to time, the question is how much of the past is useful for predicting the future, so you need some sort of decay function to phase out the impact of prior data points (prior polls) on your current prediction.
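To make the point concrete, here is a minimal sketch of the weighting idea, not Silver’s actual model: correct each poll for an assumed pollster house effect, then weight it by an exponential decay in its age. All numbers are invented for illustration.

```python
# Toy sketch of weighting rather than pure averaging (not Silver's actual model).
# Each poll is corrected for an assumed "house effect" and weighted by its age.
import numpy as np

polls = [
    # (pollster, days_before_election, reported_margin_in_points)
    ("Pollster A", 20, +1.0),
    ("Pollster B", 10, +3.0),
    ("Pollster C",  3, +2.0),
]
house_effects = {"Pollster A": +1.5, "Pollster B": 0.0, "Pollster C": -0.5}  # assumed biases
HALF_LIFE = 14.0  # days: a poll two weeks older counts half as much

weights, adjusted = [], []
for pollster, age, margin in polls:
    adjusted.append(margin - house_effects[pollster])  # strip the assumed house effect
    weights.append(0.5 ** (age / HALF_LIFE))           # down-weight older polls

print(f"weighted margin estimate: {np.average(adjusted, weights=weights):+.2f} points")
```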

It is correct to say that Nate Silver’s model cannot be validated in a traditional sense – he uses simulation – because on every day other than election day there is no way to execute a direct test of the accuracy of the model.  Simulation is basically as good as we can do in an environment where there is only one event, and it is perfectly valid as a scientific endeavor.  If folks want to complain and actually be taken seriously, they can come up with their own positive approach.  The scientific community can engage the competing claims.  The Princeton Election Consortium, for example, offers just such a challenge to the 538 methodology.
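Here is an equally minimal sketch of what simulation-based forecasting looks like, again not the 538 model: draw the final margin from a distribution centered on a poll-based estimate and report the share of simulated elections a candidate wins.

```python
# Toy sketch of simulation-based forecasting (not the 538 model): simulate many
# elections from a poll-based estimate and report the fraction the candidate wins.
import numpy as np

rng = np.random.default_rng(0)
mean_margin = 2.0    # hypothetical weighted poll estimate, in percentage points
uncertainty = 2.5    # hypothetical std. dev. covering sampling + systematic error

margins = rng.normal(mean_margin, uncertainty, size=100_000)
print(f"simulated win probability: {(margins > 0).mean():.1%}")
```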

No matter what, 538 is a hell of a lot better than the status quo practices that existed prior to its founding in early 2008.  The level of jealousy directed toward Nate Silver is completely transparent.  If you want to get all Popperian, go right ahead, but then you have little or nothing to say about this or most other prediction problems.  This is what happened in quantitative finance / algo trading, and the arbitrage went to those who were not worried about whether what they were doing was science or just engineering [insert Sheldon Cooper quote here].

One thing we can hope comes out of all this is that all of the data-free speculation that was undertaken prior to the election can be put to bed.  I am talking about you, Dick Morris, Karl Rove, etc. – perhaps you guys should consider retirement and leave the arguments to the serious quants.

9 Weeks to Go — House and Senate Control as Measured by the Iowa Electronic Market

With nine weeks to go before the 2010 Midterm Elections, it is worth checking in with the Iowa Electronic Markets to see where things stand. Per the IEM: “The IEM 2010 Congressional Election Markets are real-money futures markets where contract payoffs will be determined by the votes cast in the 2010 U.S. Congressional Elections.” The “Congress10” market (plotted above) is based on the composition of both houses of Congress.
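As a side note on reading these markets, winner-take-all contract prices are commonly interpreted as implied probabilities. A small sketch follows; the RH_DS10 contract code is the one discussed below, while the other codes and all of the prices are assumed for illustration.

```python
# Toy sketch: treat winner-take-all contract prices as implied probabilities.
# RH_DS10 (Republican House / Democratic Senate) is discussed in the text; the
# other contract codes and all of the prices here are invented for this example.
prices = {"RH_DS10": 0.55, "RH_RS10": 0.25, "DH_DS10": 0.15, "DH_RS10": 0.05}

total = sum(prices.values())
implied = {contract: price / total for contract, price in prices.items()}  # normalize to sum to 1
for contract, prob in sorted(implied.items(), key=lambda kv: -kv[1]):
    print(f"{contract}: {prob:.0%}")
```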

Take a look at the plot above. You will notice there has been significant movement in the past few weeks. Consistent with the beliefs of a number of pundits, the dominant scenario for 2010 is split control, “RH_DS10” (i.e., a Republican House and a Democratic Senate). Whether you view this outcome as good or bad, it is important to emphasize that there is still time left and these trends could reverse.