Measuring the Complexity of the Law: The United States Code (By Daniel Martin Katz & Michael J. Bommarito)

From our abstract:  “Einstein’s razor, a corollary of Ockham’s razor, is often paraphrased as follows: make everything as simple as possible, but not simpler.  This rule of thumb describes the challenge that designers of a legal system face—to craft simple laws that produce desired ends, but not to pursue simplicity so far as to undermine those ends.  Complexity, simplicity’s inverse, taxes cognition and increases the likelihood of suboptimal decisions.  In addition, unnecessary legal complexity can drive a misallocation of human capital toward comprehending and complying with legal rules and away from other productive ends.

While many scholars have offered descriptive accounts or theoretical models of legal complexity, empirical research to date has been limited to simple measures of size, such as the number of pages in a bill.  No extant research rigorously applies a meaningful model to real data.  As a consequence, we have no reliable means to determine whether a new bill, regulation, order, or precedent substantially affects legal complexity.

In this paper, we address this need by developing a proposed empirical framework for measuring relative legal complexity.  This framework is based on “knowledge acquisition,” an approach at the intersection of psychology and computer science, which can take into account the structure, language, and interdependence of law. We then demonstrate the descriptive value of this framework by applying it to the U.S. Code’s Titles, scoring and ranking them by their relative complexity.  Our framework is flexible, intuitive, and transparent, and we offer this approach as a first step in developing a practical methodology for assessing legal complexity.”

This is a draft version, so we invite your comments (katzd@law.msu.edu) and (michael.bommarito@gmail.com).  Also, for those who might be interested, we are building out a full replication page for the paper.  In the meantime, all of the relevant code and data can be accessed at GitHub and from the Cornell Legal Information Institute.

UPDATE: Paper was named “Download of the Week” by Legal Theory Blog.

Bloomberg Government – Another Tool for Navigating the Increasingly Complex Information Environment?

While I am hardly here to shill for Bloomberg, the introduction of Bloomberg Government into the market for government information does represent an important development worthy of highlighting.  Coverage from a few weeks back is located here and here.

While I hope to explore the actual product in the coming months, the front page highlights both its coverage and its informational interface. Whether aimed at sophisticated or unsophisticated actors, the selection of this sort of dashboard-style interface is important, as it is precisely the sort of HCI that has been shown to help end users navigate complex information environments.

The ever-increasing access to digitized governmental information provides a real arbitrage opportunity for a specific firm to serve as the default third-party provider of that information. Whether Bloomberg Government will fill this void is likely a function of (1) the novelty of its informational inputs and (2) the quality of the HCI experienced by target end users. Only time will tell…

Measuring the Complexity of the Law: The United States Code

Understanding the sources of complexity in legal systems is a matter long considered by legal commentators. In tackling the question, scholars have applied various approaches including descriptive, theoretical and, in some cases, empirical analysis. The list is long but would certainly include work such as Long & Swingen (1987), Schuck (1992), White (1992), Kaplow (1995), Epstein (1997), Kades (1997), Wright (2000) and Holz (2007). Notwithstanding the significant contributions made by these and other scholars, we argue that an extensive empirical inquiry into the complexity of the law still remains to be undertaken.

While certainly just a slice of the broader legal universe, the United States Code represents a substantively important body of law familiar to both legal scholars and laypersons. In published form, the Code spans many volumes. Those volumes feature hundreds of thousands of provisions and tens of millions of words. The United States Code is obviously complicated; however, measuring its size and complexity has proven to be non-trivial.

In our paper, entitled A Mathematical Approach to the Study of the United States Code, we hope to contribute to the effort by formalizing the United States Code as a mathematical object with a hierarchical structure, a citation network, and an associated text function that projects language onto specific vertices.

In the visualization above, Figure (a) is the full United States Code visualized to the section level. In other words, each ring is a layer of a hierarchical tree that halts at the section level. Of course, many sections feature a variety of nested sub-sections, etc. For example, the well-known 26 U.S.C. 501(c)(3) is only shown above at the depth of Section 501.  If we added all of these layers, there would simply be additional rings. For those interested in the visualization of specific Titles of the United States Code … we have previously created fully zoomable visualizations of Title 17 (Copyright), Title 11 (Bankruptcy),  Title 26 (Tax) [at section depth], Title 26 (Tax) [Capital Gains & Losses] as well as specific pieces of legislation such as the original Health Care Bill — HR 3962.

In the visualization above, Figure (b) combines this hierarchical structure together with a citation network.  We have previously visualized the United States Code citation network and have a working paper entitled Properties of the United States Code Citation Network. Figure (b) is thus a realization of the full United States Code through the section level.

With this representation in place, it is possible to measure the size of the Code using its various structural features, such as its vertices V and edges E.  It is possible to measure the full Code at various time snapshots and consider whether the Code is growing or shrinking. Using a limited window of data, we observe growth not only in the size of the code but also in its network of dependencies (i.e., its citation network).
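To make the idea concrete, here is a minimal sketch (with invented node labels, not actual Code data) of how such a combined hierarchy-plus-citation object might be represented with plain dictionaries and measured by counting vertices and edges:

```python
# Hypothetical, simplified sketch of the Code as a hierarchy plus citations.
# The node labels below are illustrative, not drawn from the actual Code.

# Hierarchical structure: child -> parent (e.g., Section -> Chapter -> Title)
hierarchy = {
    "T26/ch1": "T26",
    "T26/ch1/s501": "T26/ch1",
    "T11/ch7": "T11",
    "T11/ch7/s701": "T11/ch7",
}

# Citation network: directed edges between sections
citations = {("T26/ch1/s501", "T11/ch7/s701")}

def size_measures(hierarchy, citations):
    """Count vertices V and edges E of the combined structure."""
    vertices = set(hierarchy) | set(hierarchy.values())
    for a, b in citations:
        vertices |= {a, b}
    # One tree edge per child -> parent link; citation edges counted separately
    return {"V": len(vertices), "E_tree": len(hierarchy), "E_cite": len(citations)}

print(size_measures(hierarchy, citations))  # → {'V': 6, 'E_tree': 4, 'E_cite': 1}
```

Comparing these counts across time snapshots of the Code is then a matter of rebuilding the structures from each snapshot and diffing the measures.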

Of course, growth in the size of the United States Code alone is not necessarily analogous to an increase in complexity.  Indeed, while we believe the size of the code generally tends to contribute to “complexity,” some additional measures are needed.  Thus, our paper features structural measurements such as the number of sections, section sizes, etc.

In addition, we apply the well-known Shannon entropy measure (borrowed from Information Theory) to evaluate the “complexity” of the message passing / language contained therein.  Shannon entropy has a long intellectual history and has been used as a measure of complexity by many scholars.  Here is the formula for Shannon entropy: H(X) = −Σ p(x) log₂ p(x), where the sum runs over the symbols x and p(x) is the probability of observing symbol x.
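As an illustrative sketch (not the paper's actual implementation), the entropy of a stream of tokens can be computed directly from its empirical frequencies:

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy (bits per token) of a token stream:
    H = -sum(p * log2(p)) over the empirical token distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A uniform 4-symbol stream has entropy log2(4) = 2 bits per token;
# a stream with only one symbol has entropy 0.
print(shannon_entropy(["a", "b", "c", "d"]))  # → 2.0
print(shannon_entropy(["a", "a", "a"]))       # → 0.0 (fully predictable)
```

Intuitively, a body of text drawn from a larger, more evenly used vocabulary scores higher, which is what makes entropy a plausible proxy for linguistic complexity.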

For those interested in reviewing the full paper, it is forthcoming in Physica A: Statistical Mechanics and its Applications. For those not familiar, Physica A is a journal published by Elsevier and is a popular outlet for Econophysics and Quantitative Finance. A current draft of the paper is available on the SSRN and the physics arXiv.

We are currently working on a follow-up paper that is longer, more detailed, and designed for a general audience.  Even if you have little or no interest in the analysis of the United States Code, we hope principles such as entropy, structure, etc. will prove useful in the measurement of other classes of legal documents including contracts, treaties, administrative regulations, etc.

The United States Code — The Movie — Featuring Title 16 — Conservation

Above is a movie displaying Title 16 (Conservation), a subset of the content contained within the United States Code. At more than 2,400 pages (download it here), Title 16 is one of the larger titles in the US Code.  Yet, it is not the largest.  For example, Title 26 (Internal Revenue Code) and Title 42 (Public Health and Welfare) are far larger than the object displayed above.

Now, you might be wondering why we chose to generate this movie. We envisioned at least two purposes.

(1) The title of this blog is Computational Legal Studies.  One of our major goals is to develop or apply tools that scale to life in the era of Big Data. Given the scope of an object such as the United States Code, it is clear that a significant class of potential analysis cannot reasonably be undertaken without the use of computational tools.  Thus, with respect to developing new insights, we believe computational linguistics, information theory, and applied graph theory can be of great use.  For those interested, our new paper entitled A Mathematical Approach to the Study of the United States Code offers our initial exploration of the possibilities.

(2) We believe this movie can be a meaningful pedagogical device.  Many students enter law school and are dismayed when, even in statutory-based classes, they are not exclusively reviewing the black letter law. Given the scope of this and other large bodies of documents, no model of legal education can be exclusively dedicated to teaching black letter law. Instead, such training is appropriately devoted to a mixture of existing legal rules as well as the development of information acquisition protocols that train students to navigate the relevant landscape.

Hustle and Flow: A Social Network Analysis of the American Federal Judiciary [Repost from 3/25]

Zoom on Network

Together with Derek Stafford from the University of Michigan Department of Political Science, Hustle and Flow: A Social Network Analysis of the American Federal Judiciary represents our initial foray into Computational Legal Studies. The full paper contains a number of interesting visualizations where we draw various federal judges together on the basis of their shared law clerks (1995-2004). The screen print above is a zoom on the very center of the network.  Yellow Nodes represent Supreme Court Justices, Green Nodes represent Circuit Court Justices, Blue Nodes represent District Court Justices.

There exist many high quality formal models of judicial decision making, including those considering decisions rendered by judges in the judicial hierarchy, whistle blowing, etc. One component which might meaningfully contribute to the extant literature is the rigorous consideration of the social and professional relationships between jurists and the impacts (if any) these relationships impose upon outcomes. Indeed, from a modeling standpoint, we believe the “judicial game” is a game on a graph, one where an individual strategic jurist must take stock of the time-evolving social topology upon which he or she is operating. Even among judges of equal institutional rank, we observe jurists with widely variant levels of social authority (specifically, social authority follows a power law distribution).

So what does all of this mean? Take whistle blowing — the power law distribution implies that if the average judge has a whistle, the “super-judges” we identify within the paper could be said to have an air horn. With the goal of enriching positive political theory / formal modeling of the courts, we believe the development of a positive theory of judicial social structure can enrich our understanding of the dynamics of prestige and influence. In addition, we believe, at least in part, “judicial peer effects” can help legal doctrine socially spread across the network. In that vein, here is a view of our operationalization of the social landscape … a wide shot of the broader network visualized using the Kamada-Kawai visualization algorithm:

Here is the current abstract for the paper: Scholars have long asserted that social structure is an important feature of a variety of societal institutions. As part of a larger effort to develop a fully integrated model of judicial decision making, we argue that social structure, operationalized as the professional and social connections between judicial actors, partially directs outcomes in the hierarchical federal judiciary. Since different social structures impose dissimilar consequences upon outputs, the precursor to evaluating the doctrinal consequences that a given social structure imposes is a descriptive effort to characterize its properties. Given the difficulty associated with obtaining appropriate data for federal judges, it is necessary to rely upon a proxy measure to paint a picture of the social landscape. In the aggregate, we believe the flow of law clerks reflects a reasonable proxy for social and professional linkages between jurists. Having collected available information for all federal judicial law clerks employed by an Article III judge during the “natural” Rehnquist Court (1995-2004), we use these roughly 19,000 clerk events to craft a series of network based visualizations. Using network analysis, our visualizations and subsequent analytics provide insight into the path of peer effects in the federal judiciary. For example, we find the distribution of “degrees” is highly skewed, implying the social structure is dictated by a small number of socially prominent actors. Using a variety of centrality measures, we identify these socially prominent jurists. Next, we draw from the extant complexity literature and offer a possible generative process responsible for producing such inequality in social authority. While the complete adjudication of a generative process is beyond the scope of this article, our results contribute to a growing literature documenting the highly-skewed distribution of authority across the common law and its constitutive institutions.
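For readers curious about the mechanics, the “degree” computation behind the skew finding is straightforward to sketch; the judge labels and edges below are invented for illustration and are not our clerk data:

```python
from collections import Counter

def degree_centrality(edges):
    """Degree of each node in an undirected 'shared clerk' network,
    where an edge links two judges who employed the same clerk."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Hypothetical edges: judges linked when a clerk moves between their chambers
edges = [
    ("J1", "J2"), ("J1", "J3"), ("J1", "J4"), ("J1", "J5"),  # J1 is a hub
    ("J2", "J3"), ("J4", "J5"),
]
deg = degree_centrality(edges)
print(deg.most_common(1))  # → [('J1', 4)]
```

In a highly skewed network, a histogram of these degree values shows a long tail: most judges have few connections, while a handful of “super-judges” have many.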

Visualization of the Ideological History of the Supreme Court

USSC MQ Scores

Here is a cool visual for the Martin-Quinn Scores. For those of you not familiar, the Martin-Quinn paper and “MQ Scores” represented a significant breakthrough in the field of judicial politics. On that note, Stephen Jessee & Alexander Tahk have done a nice job both bringing their data up to date and extending their work.  For those interested, click on the visual above and check out all of the relevant links contained within this post.  

An Exchange in Need of Empirics and an Analytical or Computational Model


[Image: the quoted exchange between Chief Justice Roberts and Mr. Katyal regarding Section 5]

On a recent flight, I read Jeffrey Toobin’s New Yorker article on Chief Justice Roberts entitled “No More Mr. Nice Guy.”  The exchange quoted above is drawn from this article. While I believe it is appropriate to engage empirical data where available, the underlying discussion is not one exclusively subjectable to empirical inquiry.  Rather, it is, at least in part, a question in need of a formal theoretic model. Chief Justice Roberts and Mr. Katyal are implicitly discussing a “state of the world” not yet realized, but which would be realized if the statute were not to exist. What would benefit the discussion is a principled manner to adjudicate between the two inferences drawn above.  Namely, it would be useful to fully evaluate what behavior would likely follow if the statute were not to exist.

There exist a variety of mathematical modeling techniques which could inject some much needed rigor into the above discussion. To my knowledge, such an applied model has yet to be offered.  The Supreme Court’s decision in the matter is soon forthcoming. Given the nature of the exchange above, there is reason to believe that if Chief Justice Roberts prevails, we will get our model, as the “state of the world” discussed above will no longer be hypothetical…

Taking Judicial Content Seriously–Lupu & Fowler's Strategic Content Model

Roe v. Wade Citation Network

In my conversations with judicial politics scholars, many lament that our existing approaches tend to ignore opinion content.  For those interested in embedding opinion content into existing theories of judicial decision making, consider Yonatan Lupu & James Fowler’s paper recently posted to the SSRN.

The authors present a strategic model of judicial bargaining over opinion content.  They note … “we find that the Court generates opinions that are better grounded in law when more justices write concurring opinions.”  To generate the specification for “grounding in law,” the authors use Kleinberg’s Hubs and Authorities Algorithm calculated at the time the opinion was authored. The Strategic Content Paper is available here.
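For those unfamiliar with the algorithm, below is a minimal, self-contained power-iteration sketch of Kleinberg’s Hubs and Authorities (HITS) computation on a toy citation network; the opinion labels are hypothetical, and this is a simplified illustration rather than the authors’ actual code:

```python
def hits(edges, iterations=50):
    """Minimal power-iteration sketch of Kleinberg's HITS algorithm.
    edges: iterable of (citing, cited) pairs. Returns (hubs, authorities)."""
    nodes = {n for e in edges for n in e}
    hubs = {n: 1.0 for n in nodes}
    auths = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the opinions citing you
        auths = {n: sum(hubs[a] for a, b in edges if b == n) for n in nodes}
        norm = sum(v * v for v in auths.values()) ** 0.5 or 1.0
        auths = {n: v / norm for n, v in auths.items()}
        # Hub score: sum of authority scores of the opinions you cite
        hubs = {n: sum(auths[b] for a, b in edges if a == n) for n in nodes}
        norm = sum(v * v for v in hubs.values()) ** 0.5 or 1.0
        hubs = {n: v / norm for n, v in hubs.items()}
    return hubs, auths

# Toy citation network: later opinions citing earlier ones (hypothetical labels)
edges = [("Op3", "Op1"), ("Op4", "Op1"), ("Op4", "Op2")]
hubs, auths = hits(edges)
print(max(auths, key=auths.get))  # → Op1 (cited by the most well-connected hubs)
```

In the opinion-network setting, a high authority score marks an opinion that is cited by opinions which themselves cite well; computing the scores on the network as it stood when an opinion was authored avoids crediting it with citations it had not yet received.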

The visual above is drawn from a related Fowler project located here.  Another very worthwhile paper authored by Fowler, Johnson, Spriggs, Jeon & Wahlbeck is located here.

Hustle & Flow: A Network Analysis of the American Federal Judiciary


This paper, written by CLS Blog Co-Founder Daniel Katz and Derek Stafford from the University of Michigan Department of Political Science, represents an initial foray into Computational Legal Studies by the graduate students here at the University of Michigan Center for the Study of Complex Systems.  The full paper contains a number of interesting visualizations where we draw various federal judges together on the basis of their shared law clerks (1995-2004).  The screen print above is a zoom on the very center of the network.  Yellow Nodes represent Supreme Court Justices, Green Nodes represent Circuit Court Justices, Blue Nodes represent District Court Justices.  Here is a wide shot of the broader network visualized using the Kamada-Kawai visualization algorithm:


Here is the abstract: Scholars have long asserted that social structure is an important feature of a variety of societal institutions. As part of a larger effort to develop a fully integrated model of judicial decision making, we argue that social structure, operationalized as the professional and social connections between judicial actors, partially directs outcomes in the hierarchical federal judiciary. Since different social structures impose dissimilar consequences upon outputs, the precursor to evaluating the doctrinal consequences that a given social structure imposes is a descriptive effort to characterize its properties. Given the difficulty associated with obtaining appropriate data for federal judges, it is necessary to rely upon a proxy measure to paint a picture of the social landscape. In the aggregate, we believe the flow of law clerks reflects a reasonable proxy for social and professional linkages between jurists. Having collected available information for all federal judicial law clerks employed by an Article III judge during the “natural” Rehnquist Court (1995-2004), we use these roughly 19,000 clerk events to craft a series of network based visualizations. Using network analysis, our visualizations and subsequent analytics provide insight into the path of peer effects in the federal judiciary. For example, we find the distribution of “degrees” is highly skewed, implying the social structure is dictated by a small number of socially prominent actors. Using a variety of centrality measures, we identify these socially prominent jurists. Next, we draw from the extant complexity literature and offer a possible generative process responsible for producing such inequality in social authority. While the complete adjudication of a generative process is beyond the scope of this article, our results contribute to a growing literature documenting the highly-skewed distribution of authority across the common law and its constitutive institutions.