Computational Legal Studies

My Slides from New and Emerging Legal Infrastructures Conference (NELIC) [ @ Berkeley Law ]

April 15, 2011 clsadmin

UPDATED SLIDES – Midwest Law & Econ Association – Indiana Law – Sept 2011

Quantitative Legal Prediction – Midwest Law and Economics Conference 2011

New and Emerging Legal Infrastructures Conference (NELIC) [ @ Berkeley Law ]

April 15, 2011 clsadmin

Law as a Complex System: Updated Reading List – April 2011

April 14, 2011 clsadmin

Wordle of President Obama’s Budget Speech at George Washington University

April 13, 2011 clsadmin

I realize that Wordles are not among the most scientific things on the planet (see Drew Conway’s post on this point over at ZIA). However, I just thought I would play around with President Obama’s speech on the budget delivered today at George Washington University. See above …

Building a Better Legal Search Engine, Part 1: Searching the U.S. Code

April 11, 2011 clsadmin

Cross Post from Michael Bommarito’s Blog – “Last week, I mentioned that I am excited to give a keynote in two weeks on Law and Computation at the University of Houston Law Center alongside Stephen Wolfram, Carl Malamud, Seth Chandler, and my buddy Dan Katz from here at the CLS Blog. The first part in my blog series leading up to this talk will focus on indexing and searching the U.S. Code with structured, public domain data and open source software.

Before diving into the technical aspects, I thought it would be useful to provide some background on what the U.S. Code is and why it exists. Let’s start with an example – the Dodd-Frank Wall Street Reform and Consumer Protection Act. After the final version of HR 4173 was passed by both houses and enrolled in July of 2010, it received a new identifier, Public Law 111-203. This public law, along with private laws, resolutions, amendments, and proclamations, is published in order of enactment in the Statutes at Large. The Statutes at Large is therefore a compilation of all these sources of law dating back to the Declaration of Independence itself, and as such, is the authoritative source of statutory law.

If we think about the organization and contents of the Statutes at Large, it quickly becomes clear why the Code exists. The basic task of a legal practitioner is to determine what the state of law is with respect to a given set of facts at a certain time, typically now. Let’s return to the Dodd-Frank example above. Let’s say we’re in the compliance department at a financial institution and we’d like to know how the new proprietary trading rules affect us. To do this, we might perform the following tasks:

Search for laws by concept, e.g., depository institution or derivative.
Ensure that these laws are current and comprehensive.
Build a set of rules or guidelines from these laws.
Interpret these rules in the context of our facts.

However, the Statutes at Large is not well-suited to these tasks.

It is sorted by date of enactment, not by concept.
It contains laws that may affect multiple legal concepts.
It contains laws that reference other laws for definitions or rules.
It contains laws that amend or repeal other laws.

Based on our goal and these properties of the Statutes, we need to perform an exhaustive search every time we have a new question. This is pretty clearly bad if we want to get anything done (but hey, maybe you’re not in-house and you bill by the hour). So what might we do to re-organize the Statutes to make it easier for us to use the law?

Organize the law by concept, possibly hierarchically.
Combine laws that refer or amend one another.
Remove laws that have expired or have been repealed.
Provide convenient citations or identifiers for legal concepts.

A systematic organization of the Statutes at Large that followed these rules would make our lives significantly easier. We could search for concepts and use the hierarchical context of these results to navigate related ideas. We could rest assured that the material we read was near-comprehensive and current. Furthermore, we could communicate more succinctly by referencing a small number of organized sections instead of hundreds of Public Laws.

As you might have guessed, this organizational scheme defines the United States Code as produced by the Office of the Law Revision Counsel. While the LRC traditionally distributes copies of the Code as ASCII files on CD-ROMs, they recently began distributing copies of the code in XHTML. We’ll be using these copies to build our index, so if you’d like to follow along, you should download them from here – http://uscode.house.gov/xhtml/.

If we’d like to build a legal search engine, the Code is arguably the best place to start. While there are other important statutory and judicial sources like the Code of Federal Regulations or the Federal Reporter, the Code is as close to capital-L Law as it gets.

In this part of the post series, I’m going to build an index of the text of the Code from the 2009 and 2010 LRC snapshots. To do this, we’ll use the excellent Apache Lucene library for Java. Lucene is, in their own words, a “high-performance, full-featured text search engine library written entirely in Java.” As we’ll see in later posts, Lucene (with its sister project, Solr) is a very easy and powerful tool to develop fast, web-based search interfaces. Before we dive into the code below the break, let’s take a look at what we’re working towards. Below is a search for the term “swap” across the entire Code. We’re displaying the top five results, and these were produced in a little over a second on my laptop.”

To view the images, click over to Michael Bommarito’s Blog (click here for direct access). Additional technical specifications and code are also available.
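
For readers who want a feel for what a full-text index does before clicking through, here is a minimal sketch in Python rather than Java. It is not Lucene and not the code from Michael's post: it is a toy in-memory inverted index that ranks by raw term frequency, and the section identifiers and texts are made up for illustration. Lucene adds text analysis, a much better scoring model and an on-disk index on top of this basic idea.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(sections):
    """Map each term to a Counter of {section_id: term frequency}."""
    index = defaultdict(Counter)
    for section_id, text in sections.items():
        for term in tokenize(text):
            index[term][section_id] += 1
    return index

def search(index, query, top_n=5):
    """Rank sections by the summed frequency of the query terms."""
    scores = Counter()
    for term in tokenize(query):
        scores.update(index.get(term, {}))
    return scores.most_common(top_n)

# Hypothetical section texts, not taken from the actual Code.
sections = {
    "sec-a": "The term swap means any agreement that is a swap agreement.",
    "sec-b": "A depository institution shall report each swap.",
    "sec-c": "This section concerns highways.",
}
index = build_index(sections)
results = search(index, "swap")
```

Here `results` lists the two sections that mention "swap", with the section that uses the term most often ranked first.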

Ignite Law 2011 & The ABA Techshow

April 10, 2011 clsadmin

On the eve of the ABA Tech Show, I am looking forward to attending Ignite Law 2011 @ The Chicago Hilton. For those not familiar, Ignite offers a unique style of presentation (6 minutes total with automatically advancing slides). For a certain class of ideas, Ignite offers the maximal information compression approach to concept introduction. Anyway, the topics of the talks look interesting.

Tomorrow, I will be attending some of the sessions at the Tech Show. If anyone is attending the conference and would like to touch base, feel free to ping me.

The Silicon Valley Ecosystem (i.e. The Money Network) [NYTimes]

April 8, 2011 clsadmin

The Patent Conference @ KU Law

April 7, 2011 clsadmin

Today, I am traveling to Kansas for The Patent Conference (aka Pat Con). Tomorrow, I will be presenting our methods paper Distance Measures for Dynamic Citation Networks (published in the statistical mechanics journal Physica A in October 2010). While the specific applied example in the paper is focused on case-to-case legal citations, the formalization and method we present therein are generally applicable to all dynamic directed acyclic graphs (including the patent citation network). Thus, we are interested in discussing how to leverage our approach to better understand the path of innovation that is revealed in datasets such as the NBER patent dataset.

Map Your Moves – A Visual Exploration of Where New Yorkers Moved [Moritz Stefaner]

April 6, 2011 clsadmin

HT to Barry Ritholtz

James Fowler – Back to the Village [TEDxSanDiego]

April 4, 2011 clsadmin

Transportation in Contemporary Society: A Complex Systems Approach [Via MIT World]

April 3, 2011 clsadmin

From the abstract: “In the nineteen fifties and sixties, students of transportation focused on building infrastructure and applied lessons from the physical sciences to designing mobility. Mobility was facilely linked to the engines of economic growth and expanding GDP. In time, that perspective was replaced by a focus on transportation systems and networks. There was a newfound emphasis on environmental impacts, land use, and intermodal freight. There was also a growing concern about unpriced externalities. Today, Joseph Sussman explains, with many of those problems still unsolved, transportation has entered a new phase: a period of immense complexity. CLIOS, which stands for complex, large scale, interconnected, open and sociotechnical, is an acronym that is becoming the mantra of transportation engineers. While it is not as far-reaching as “chaos” to a physicist, it is an approach with far-reaching consequences for the transportation field. To participate in “Complexity 101” engineers must take account of stochastic systems, difficulties relating cause and effect, and non-linear behaviors. They must also recognize complex feedback loops between macro and micro issues; time scale anomalies, and evaluative complexity brought by new stakeholders. Sussman observes, “Even if we could wish away behavioral complexity, it would not mean that we know what we should do.” He says that transportation engineering must now embrace management, the social sciences and planning and he warns us to eschew narrow representations of complex systems because they are implicitly easier to solve.”

Six Degrees of Marbury v. Madison : A Sink Based Visualization [v2]

April 2, 2011 clsadmin

The visualization above is something we are calling the “six degrees” of Marbury v. Madison. It was originally produced for use in our paper Distance Measures for Dynamic Citation Networks. Due to space considerations, we ended up leaving it on the cutting room floor. However, the visual is designed to highlight the idea of a “sink.”

Sinks are one of the core concepts which we outline in Distance Measures for Dynamic Citation Networks, 389 Physica A 4201 (October 1 2010). Looking through the prism of a citation network, sinks are the roots to which a given legal concept, academic idea or patent-based innovation can be traced. From each citation in a non-sink node, it is possible to trace the chains of citations back to their root (which we call a sink). In the visualization above, the root or sink node is the famed United States Supreme Court decision Marbury v. Madison. Starting from the center and working out to the edge, the first ring contains cases that directly cite Marbury v. Madison. The next ring contains cases which cite cases that cite Marbury v. Madison. The ring after that contains cases which cite cases which cite cases that cite Marbury v. Madison, and so on…
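
To make the ring idea concrete, here is a minimal Python sketch of how such rings could be computed. It is an illustration, not the code behind the visualization: it assumes each case is placed in the ring given by its shortest citation chain back to the sink, and the case names and the `rings_around_sink` helper are hypothetical.

```python
from collections import deque

def rings_around_sink(citations, sink):
    """Group cases by the length of their shortest citation chain to `sink`.

    `citations` maps each case to the list of cases it cites.
    Ring 1 holds cases that cite the sink directly, ring 2 holds cases
    that cite a ring-1 case, and so on.
    """
    # Invert the edges so we can walk from a cited case to the cases citing it.
    cited_by = {}
    for citing, cited_list in citations.items():
        for cited in cited_list:
            cited_by.setdefault(cited, []).append(citing)

    # Breadth-first search outward from the sink.
    distance = {sink: 0}
    queue = deque([sink])
    while queue:
        case = queue.popleft()
        for citing in cited_by.get(case, []):
            if citing not in distance:  # first visit gives the shortest chain
                distance[citing] = distance[case] + 1
                queue.append(citing)

    rings = {}
    for case, d in distance.items():
        if d > 0:
            rings.setdefault(d, set()).add(case)
    return rings

# Made-up citation data for illustration only.
citations = {
    "Case A": ["Marbury"],
    "Case B": ["Marbury"],
    "Case C": ["Case A"],
    "Case D": ["Case C", "Case B"],
    "Case E": ["Unrelated"],
}
rings = rings_around_sink(citations, "Marbury")
```

In this toy data, "Case D" reaches the sink both through "Case C" and through "Case B"; under the shortest-chain assumption it lands in the second ring, and "Case E" appears in no ring because it never traces back to the sink.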

Anyway, one of the major contributions of our Distance Measures for Dynamic Citation Networks paper is that it allows us to use these sinks to create a pairwise distance/similarity measure between the ith and jth units. In this instance, the units in this directed acyclic network are the ith and jth decisions of the United States Supreme Court.

Now, it is important to note that cases contain many citations and thus can be oriented relative to many different sinks. So, even if a case can be traced to the Marbury sink – this does not preclude it from being traced to other sinks as well. Also, it is possible to construct a variety of mathematical functions to characterize the sink-based distance between units. For instance, the importance of a sink might decay as its shortest path length increases. An alternative measure might weight the importance of each sink by the number of unique ancestors shared between nodes i and j that are descended from a given sink of interest. Indeed, many fine-grained choices are possible but they require justification drawn from the given substantive problem.
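
As one hedged illustration of the decay idea, the Python sketch below scores two cases by the sinks they share, discounting each shared sink by the combined length of the shortest citation chains leading to it. This particular function, the `decay` parameter and the toy data are assumptions made for the example; it is not the measure defined in the paper.

```python
from collections import deque

def sink_distances(citations, start):
    """Shortest citation-chain length from `start` to every sink it can reach.

    A sink is a case that cites nothing inside the network.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        case = queue.popleft()
        for cited in citations.get(case, []):
            if cited not in distance:
                distance[cited] = distance[case] + 1
                queue.append(cited)
    # Keep only the reachable cases that have no outgoing citations.
    return {c: d for c, d in distance.items() if not citations.get(c)}

def sink_similarity(citations, i, j, decay=0.5):
    """Sum decay**(d_i + d_j) over the sinks shared by cases i and j."""
    di = sink_distances(citations, i)
    dj = sink_distances(citations, j)
    return sum(decay ** (di[s] + dj[s]) for s in di.keys() & dj.keys())

# Made-up citation data for illustration only.
citations = {
    "Sink 1": [],
    "Sink 2": [],
    "Case A": ["Sink 1"],
    "Case B": ["Sink 1", "Sink 2"],
    "Case C": ["Case B"],
    "Case D": ["Sink 2"],
}
```

With this choice, two cases that share no sink score zero, and a shared sink counts for less the further either case sits from it. Swapping in a different weighting, such as the shared-ancestor count mentioned above, would only change the body of `sink_similarity`.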

As mentioned above, this method has potential applications including tracing the spread of technological innovation in patent citations or the spread of ideas in a set of academic articles. However, given that our primary interest is judicial citations, we are working on a follow-up to the “sinks” paper. In this follow-up paper, we hope to carry these and other ideas forward into a definitive community detection method for judicial citation networks.

To preview, at least two major dynamics must be considered in any null model for community detection. First, case-to-case citations can help contribute to the fractal nature of legal systems. In other words, we are pretty far from any sort of Gaussian null model. However, this is easy enough to confront with an alternative null: some highly skewed distribution (e.g., a power law or a power law with a cutoff).

The second dynamic is the difficult part: the cross-fertilization of legal concepts. This is a time-evolving network where ideas are referenced/imported from otherwise unrelated or previously unrelated domains. The examples of cross-fertilization are numerous. One of my personal favorite non-SCOTUS examples is the use of the tort doctrine of “trespass to chattels” in the context of web scraping.

Anyway, we hope to have more to come on the topic of SCOTUS community detection in the weeks and months to come. In the meantime, please check out a Dynamic 3D Hi Definition United States Supreme Court Visualization.

Oyez @ Chicago Kent Releases Free OyezToday App for iPhone

March 31, 2011 clsadmin

Kudos to Jerry Goldman, the other folks at the Oyez Project as well as the Chicago-Kent College of Law for making this free resource available to the public!

From the description: “OYEZTODAY at IIT Chicago-Kent College of Law offers you the latest information and media on the current business of the Supreme Court of the United States. OYEZTODAY provides: easy-to-grasp abstracts for every case granted review, timely and searchable audio of oral arguments + transcripts, and up-to-date summaries of the Court’s most recent decisions including the Court’s full opinions. You will have access to all this information on your iPhone with the ability to share reactions on Facebook, Twitter, or by email. (Recordings of opinion announcements from the bench will follow when the Court releases these files to the National Archives at the start of the Court’s next Term). Chicago-Kent is proud to provide this free service to enhance the public’s understanding of the Supreme Court and current legal controversies.”

Network Structure of Production [From PNAS]

March 30, 2011 clsadmin

From the abstract: “Complex social networks have received increasing attention from researchers. Recent work has focused on mechanisms that produce scale-free networks. We theoretically and empirically characterize the buyer–supplier network of the US economy and find that purely scale-free models have trouble matching key attributes of the network. We construct an alternative model that incorporates realistic features of firms’ buyer–supplier relationships and estimate the model’s parameters using microdata on firms’ self-reported customers. This alternative framework is better able to match the attributes of the actual economic network and aids in further understanding several important economic phenomena.”

Daniel Martin Katz, Ron Dolin & Michael Bommarito, Legal Informatics, Cambridge University Press (2021) (Edited Volume) < Cambridge >

Corinna Coupette, Janis Beckedorf, Dirk Hartung, Michael Bommarito, & Daniel Martin Katz, Measuring Law Over Time: A Network Analytical Framework with an Application to Statutes and Regulations in the United States and Germany, 9 Front. Phys. 658463 (2021) < Frontiers in Physics > < Supplemental Material >

Daniel Martin Katz, Legal Innovation (Book Foreword) in Mapping Legal Innovation: Trends and Perspectives (Springer) (Antoine Masson & Gavin Robinson, eds.) (2021) < Springer >

Michael Bommarito, Daniel Martin Katz & Eric Detterman, LexNLP: Natural Language Processing and Information Extraction For Legal and Regulatory Texts in Research Handbook on Big Data Law (Edward Elgar Press) (Roland Vogl, ed.) (2021) < Edward Elgar > < Github > < SSRN > < arXiv >

Daniel Martin Katz, Corinna Coupette, Janis Beckedorf & Dirk Hartung, Complex Societies and the Growth of the Law, 10 Scientific Reports 18737 (2020) < Nature Research > < Supplemental Material >

Edward D. Lee, Daniel Martin Katz, Michael J. Bommarito II, Paul Ginsparg, Sensitivity of Collective Outcomes Identifies Pivotal Components, 17 Journal of the Royal Society Interface 167 (2020) < Journal of the Royal Society Interface > < Supplemental Material >

Michael Bommarito, Daniel Martin Katz & Eric Detterman, OpenEDGAR: Open Source Software for SEC EDGAR Analysis, MIT Computational Law Report (2020) < MIT Law > < Github >

J.B. Ruhl & Daniel Martin Katz, Mapping the Law with Artificial Intelligence in Law of Artificial Intelligence and Smart Machines (ABA Press) (2019) < ABA Press >

J.B. Ruhl & Daniel Martin Katz, Harnessing the Complexity of Legal Systems for Governing Global Challenges in Global Challenges, Governance, and Complexity (Edward Elgar) (2019) < Edward Elgar >

J.B. Ruhl & Daniel Martin Katz, Mapping Law’s Complexity with ‘Legal Maps’ in Complexity Theory and Law: Mapping an Emergent Jurisprudence (Taylor & Francis) (2018) < Taylor & Francis >

Michael Bommarito & Daniel Martin Katz, Measuring and Modeling the U.S. Regulatory Ecosystem, 168 Journal of Statistical Physics 1125 (2017) < J Stat Phys >

Daniel Martin Katz, Michael Bommarito & Josh Blackman, A General Approach for Predicting the Behavior of the Supreme Court of the United States, PLoS ONE 12(4): e0174698 (2017) < PLoS One >

J.B. Ruhl, Daniel Martin Katz & Michael Bommarito, Harnessing Legal Complexity, 355 Science 1377 (2017) < Science >

J.B. Ruhl & Daniel Martin Katz, Measuring, Monitoring, and Managing Legal Complexity, 101 Iowa Law Review 191 (2015) < SSRN >

Paul Lippe, Daniel Martin Katz & Dan Jackson, Legal by Design: A New Paradigm for Handling Complexity in Banking Regulation and Elsewhere in Law, 93 Oregon Law Review 831 (2015) < SSRN >

Paul Lippe, Jan Putnis, Daniel Martin Katz & Ian Hurst, How Smart Resolution Planning Can Help Banks Improve Profitability And Reduce Risk, Banking Perspective Quarterly (2015) < SSRN >

Daniel Martin Katz, The MIT School of Law? A Perspective on Legal Education in the 21st Century, Illinois Law Review 1431 (2014) < SSRN > < Slides >

Daniel Martin Katz & Michael Bommarito, Measuring the Complexity of the Law: The United States Code, 22 Journal of Artificial Intelligence & Law 1 (2014) < Springer > < SSRN >

Daniel Martin Katz, Quantitative Legal Prediction – or – How I Learned to Stop Worrying and Start Preparing for the Data Driven Future of the Legal Services Industry, 62 Emory Law Journal 909 (2013) < SSRN >

Daniel Martin Katz, Joshua Gubler, Jon Zelner, Michael Bommarito, Eric Provins & Eitan Ingall, Reproduction of Hierarchy? A Social Network Analysis of the American Law Professoriate, 61 Journal of Legal Education 76 (2011) < SSRN >

Michael Bommarito, Daniel Martin Katz & Jillian Isaacs-See, An Empirical Survey of the Written Decisions of the United States Tax Court (1990-2008), 30 Virginia Tax Review 523 (2011) < SSRN >

Daniel Martin Katz, Michael Bommarito, Julie Seaman, Adam Candeub & Eugene Agichtein, Legal N-Grams? A Simple Approach to Track the Evolution of Legal Language in Proceedings of JURIX: The 24th International Conference on Legal Knowledge and Information Systems (2011) < SSRN >

Daniel Martin Katz & Derek Stafford, Hustle and Flow: A Social Network Analysis of the American Federal Judiciary, 71 Ohio State Law Journal 457 (2010) < SSRN >

Michael Bommarito & Daniel Martin Katz, A Mathematical Approach to the Study of the United States Code, 389 Physica A 4195 (2010) < SSRN > < arXiv >

Michael Bommarito, Daniel Martin Katz & Jonathan Zelner, On the Stability of Community Detection Algorithms on Longitudinal Citation Data in Proceedings of the 6th Conference on Applications of Social Network Analysis (2010) < SSRN > < arXiv >

Michael Bommarito, Daniel Martin Katz, Jonathan Zelner & James Fowler, Distance Measures for Dynamic Citation Networks, 389 Physica A 4201 (2010) < SSRN > < arXiv >

Michael Bommarito, Daniel Martin Katz & Jonathan Zelner, Law as a Seamless Web? Comparing Various Network Representations of the United States Supreme Court Corpus (1791-2005) in Proceedings of the 12th International Conference on Artificial Intelligence and Law (2009) < SSRN >

Marvin Krislov & Daniel Martin Katz, Taking State Constitutions Seriously, 17 Cornell Journal of Law & Public Policy 295 (2008) < SSRN >

Daniel Martin Katz, Derek Stafford & Eric Provins, Social Architecture, Judicial Peer Effects and the ‘Evolution’ of the Law: Toward a Positive Theory of Judicial Social Structure, 23 Georgia State Law Review 975 (2008) < SSRN >

Daniel Martin Katz, Institutional Rules, Strategic Behavior and the Legacy of Chief Justice William Rehnquist: Setting the Record Straight on Dickerson v. United States, 22 Journal of Law & Politics 303 (2006) < SSRN >

Daniel Martin Katz, Michael Bommarito, Tyler Sollinger & James Ming Chen, Law on the Market? Abnormal Stock Returns and Supreme Court Decision-Making < SSRN > < arXiv > < Slides >

Daniel Martin Katz, Michael Bommarito & Josh Blackman, Crowdsourcing Accurately and Robustly Predicts Supreme Court Decisions < SSRN > < arXiv > < Slides >

Daniel Martin Katz & Michael Bommarito, Regulatory Dynamics Revealed by the Securities Filings of Registered Companies < Slides >

Pierpaolo Vivo, Daniel Martin Katz & J.B. Ruhl (Editors), The Physics of the Law: Legal Systems Through the Prism of Complexity Science, Special Collection for Frontiers in Physics (2021 Forthcoming) < Frontiers in Physics >

Corinna Coupette, Dirk Hartung, Janis Beckedorf, Maximilian Bother & Daniel Martin Katz, Law Smells – Defining and Detecting Problematic Patterns in Legal Drafting < SSRN >

Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz & Nikolaos Aletras, LexGLUE: A Benchmark Dataset for Legal Language Understanding in English < arXiv > < SSRN >