Computational Legal Studies

Programming Dynamic Models in Python

October 11, 2009 clsadmin

In this series of tutorials, we are going to focus on the theory and implementation of transmission models in some kind of population.

In epidemiology, it is common to model the transmission of a pathogen from one person to another. In the social sciences and law, we may be interested in thinking about the way in which individuals influence each other’s opinions, ideology and actions.

These two examples are different, but in many ways analogous: it is not difficult to imagine the influence that one individual has on another as being similar to the infectivity of a virus in the sense that both have the ability to change the state of an individual. One may go from being susceptible to being infected, or from unconvinced to convinced.

Additionally, social networks have become an important area of study for epidemiological modelers. We can imagine that the nature of the network is different than the ones we think about in the social sciences: when studying outbreaks of a sexually transmitted disease, one doesn’t care all that much about the friendship networks of the people involved, while this would be very important for understanding the impact of social influence on depression and anxiety.

As someone who spends a lot of time working in the space where epidemiology and sociology overlap, I end up thinking a lot about these models – and their potential application to new and different problems and am really excited to share them with a broader audience here. In this first tutorial, I’m going to introduce a simple Susceptible-Infected-Recovered (SIR) model from infectious disease epidemiology and show a simple, Pythonic implementation of it. We’ll work through the process of writing and optimizing this kind of model in Python and, in the final tutorials, will cover how to include a social network in the simulation model.

In order to use the example below, all you need to have installed is a current version of Python (2.4+ is probably best) and the excellent Python plotting package Matplotlib in order to view output. If you don’t have Matplotlib and don’t want to go and install it (although I guarantee you won’t regret it), just comment out import for Pylab and any lines related to plotting.

Model Assumptions

1. State Space / Markov Model

Before getting into the mechanics of the model, let’s talk about the theory and assumptions behind the model as it is implemented here:

The SIR model is an example of a ‘state space‘ model, and the version we’ll be talking about here is a discrete time, stochastic implementation that has the Markov property, which is to say that its state at time t+1 is only conditional on the parameters of the model and its state at time t.

A simple state-space diagram (courtesy of wikipedia) — A simple state-space diagram (courtesy of Wikipedia)

For the uninitiated, in a state-space model, we imagine that each individual in the system can only be in one state at a time and transitions from state to state as a function of the model parameters, i.e., the infectivity of the pathogen or idea and the rate of recovery from infection…and the other states of the system. In other words, the system has endogenous dynamics. This is what makes it both interesting and in some ways difficult to work with.

In the SIR model, we assume that each infectious individual infects each susceptible individual at rate beta. So, if beta = .5, there is a 50% chance that each susceptible individual will be infected by an exposure to an infectious individual. For this reason, as the number of infected individuals in the system grows, the rate at which the remaining susceptible individuals is infected also grows until the pool of susceptible individuals is depleted and the epidemic dies out.

The other parameter we care about is gamma, or the rate of recovery. If gamma is also equal to .5, we assume that the average individual has a 50% chance of recovering on a given day, and the average duration of infectiousness will be 1/gamma, or 2 days.

We refer to the ratio beta/gamma as the basic reproductive ratio, or Ro (‘R naught’). When this number is less than one, we typically expect outbreaks to die out quickly. When this quantity is greater than one, we expect that the epidemic will grow and potentially saturate the whole population.

2. Homogeneous Mixing:

We’re assuming a world in which everyone has simultaneous contact with everyone else. In other words, we’re thinking of a totally connected social network. If you’re a regular reader of this blog, a social network enthusiast, or in some other way a thinking person, this assumption probably seems unreasonable. It turns out, however, that for many diseases, this assumption of homogeneous or ‘mass-action’ mixing, which was actually originally borrowed from chemistry, turns out to be a reasonable approximation.

For instance, if we are trying to approximate the transmission dynamics of a very infectious pathogen like measles in a city or town, we can usually overlook social network effects at this scale and obtain a very good fit to the data. This is because even very weak contacts can transmit measles, so that friendships and other types of close contacts are not good predictors of risk. Instead, we we are better off looking at a higher level of organization – the pattern of connection between towns and cities to understand outbreaks. In a social context, something like panic may be thought of as being super-infectious (for a really interesting study about the potential relationship between social panic and flu dynamics, see this paper by Josh Epstein).

This is, however, a generally problematic assumption for most problems of social influence, but an understanding of this most basic version of the model is necessary to move on to more complicated contact patterns.

3. Exponentially distributed infectious periods:

In the most basic implementation of the SIR model, we assume that each infectious individual has some probability of recovering on every step. If our model steps forwards in days and individuals have a .5 probability of recovery on each day, we should expect that the time to recovery follows an exponential distribution. This means that most people will be pretty close to the mean, but some will take a relatively long time to recover. This is accurate for a lot of cases, but definitely not for all. In some diseases, recovery times may be lognormal, power-law or bimodally disributed. For social models, the notion of an ‘infectious period’ may not make a tremendous amount of sense at all. But it allows for a very simple and transparent implementation, so we’ll use it here.

CLICK THROUGH TO SEE THE IMPLEMENTATION and RELEVANT PYTHON CODE!

Continue reading “Programming Dynamic Models in Python”

On the Road Again … Midwest Law & Economics @ Notre Dame Law

October 8, 2009 clsadmin

Visualizing the Campaign Contributions to Senators in the 110th Congress — The TARP EDITION [Repost from 3/26])

October 7, 2009 clsadmin

This is a repost our previous Senators of the 110th Congress Campaign Finance Visualization. Last week we highlighted one specific element of the graph (i.e. Senator Dodd and the TARP Banks). Now, we wanted to bring the full graph back to the front of the page for your consideration. Here is the content of the old post with a few small additions….

“As part of our commitment to provide original content, we offer a Computational Legal Studies approach to the study of the current campaign finance environment. If you click below you can zoom in and read the labels on the institutions and the senators. The visualization memorializes contributions to the members of the 110th Congress (2007 -2009). Highlighted in green are the primary recipients of the TARP.

In the post below, we offer detailed documentation of this visualization. One Important Point the Visualization Algorithm we use does force the red team, blue team separation. Rather, the behavior of firms and senators produces the separation.

Four Important Principles: (1) Squares (i.e. Institutions) introduce money into the system and Circles (i.e. Senators) receive money (2) Both Institutions and Senators are sized by dollars contributed or dollars received– Larger = More Money (3) Senators are colored by Party. (4) The TARP Banks are colored in Green. “

Personas from MIT Media Lab

October 6, 2009 clsadmin

Remix of VisualComplexity.com [From Bestiario.org]

October 5, 2009 clsadmin

Getting Ready for First Monday? Time Series of SCOTUS Word Usage

October 2, 2009 clsadmin

John Holland’s 80th Birthday @ Michigan CSCS

October 1, 2009 clsadmin

Visualizing Campaign Contributions of the 110th Congress: Senate Banking Chairman Chris Dodd and the TARP Banks

September 30, 2009 clsadmin

Algorithmic Community Detection in Networks

September 29, 2009 clsadmin

Community detection in networks is an extremely important part of the broader network science literature. For quite a while, we have meant to highlight the extremely useful review article written by Mason Porter (Oxford) Jukka-Pekka Onnela (Harvard/Oxford) and Peter J Mucha (UNC). Rather than offer our description of the article, we thought it best to highlight commentary on the subject provided by the authors.

For example, in describing the paper over at Harvard’s Complexity and Social Networks Blog Jukka-Pekka Onnela posted the following… “Uncovering the “community” structure of social networks has a long history, but communities play a pivotal role in almost all networks across disciplines. Intuitively, one can think of a network community as consisting of a group of nodes that are relatively densely connected to each other but sparsely connected to other dense groups of nodes. Communities are important because they are thought to have a strong bearing on functional units in many networks. So, for example, communities in social networks can correspond to different social groups, such as family, whereas web pages dealing with a given subject tend to form topical communities. The concept is simple enough, but it turns out that coming up with precise mathematical definitions and algorithms for community detection is one of the most challenging problems in network science. Recently, a lot of the research in this area has been done using ideas from statistical physics, which has an arsenal of tools and concepts to tackle the problem. Unfortunately (but understandably) relatively few non-physicists like to read statistical physics papers.”

These scholars quote Mark Newman noting “[T]he development of methods for ﬁnding communities within networks is a thriving sub-area of the ﬁeld, with an enormous number of diﬀerent techniques under development. Methods for understanding what the communities mean after you ﬁnd them are, by contrast, still quite primitive, and much needs to be done if we are to gain real knowledge from the output of our computer programs.” They later note “the problem of how to validate and use communities once they are identified is almost completely open.”

Anyway, if you are interested in learning more about this important piece of the network science toolkit … we suggest you read this paper!

Workshop on Information in Networks ( WIN @ NYU Stern)

September 26, 2009 clsadmin

Finished the First Day of the Two Day Workshop on Information in Networks. This has been a really great conference thus far. Both the speakers and the participants in the poster session have all offered very high quality work. We were very happy to be able to participate!

Law as a Seamless Web … Poster for WIN Conference @ NYU Stern

September 22, 2009 clsadmin

As we mentioned in previous posts, Seadragon is a really cool product. Please note load times may vary depending upon your specific machine configuration as well as the strength of your internet connection. For those not familiar with how to operate it please see below. In our view, the Full Screen is best the way to go ….

Patent Citation Networks Revisited: Signs of 21st Century Change?

September 18, 2009 clsadmin

Visualizing Voter Turnout 2008 v 2004 [From Good]

September 17, 2009 clsadmin

Visualizing Contributions to the 110th Congress—House Edition (Repost)

September 16, 2009 clsadmin

Daniel Martin Katz, Ron Dolin & Michael Bommarito, Legal Informatics, Cambridge University Press (2021) (Edited Volume) < Cambridge >

Corinna Coupette, Janis Beckedorf, Dirk Hartung, Michael Bommarito, & Daniel Martin Katz, Measuring Law Over Time: A Network Analytical Framework with an Application to Statutes and Regulations in the United States and Germany, 9 Front. Phys. 658463 (2021) < Frontiers in Physics > < Supplemental Material >

Daniel Martin Katz, Legal Innovation (Book Forward) in Mapping Legal Innovation: Trends and Perspectives (Springer) (Antoine Masson & Gavin Robinson, eds.) (2021) < Springer >

Michael Bommarito, Daniel Martin Katz & Eric Detterman, LexNLP: Natural Language Processing and Information Extraction For Legal and Regulatory Texts in Research Handbook on Big Data Law (Edward Elgar Press) (Roland Vogl, ed.) (2021) < Edward Elgar > < Github > < SSRN > < arXiv >

Daniel Martin Katz, Corinna Coupette, Janis Beckedorf & Dirk Hartung, Complex Societies and the Growth of the Law, 10 Scientific Reports 18737 (2020) < Nature Research > < Supplemental Material >

Edward D. Lee, Daniel Martin Katz, Michael J. Bommarito II, Paul Ginsparg, Sensitivity of Collective Outcomes Identifies Pivotal Components, 17 Journal of the Royal Society Interface 167 (2020) < Journal of the Royal Society Interface > < Supplemental Material >

Michael Bommarito, Daniel Martin Katz & Eric Detterman, OpenEDGAR: Open Source Software for SEC EDGAR Analysis, MIT Computational Law Report (2020) < MIT Law > < Github >

J.B. Ruhl & Daniel Martin Katz, Mapping the Law with Artificial Intelligence in Law of Artificial Intelligence and Smart Machines (ABA Press) (2019) < ABA Press >

J.B. Ruhl & Daniel Martin Katz, Harnessing the Complexity of Legal Systems for Governing Global Challenges in Global Challenges, Governance, and Complexity (Edward Elgar) (2019) < Edward Elgar >

J.B. Ruhl & Daniel Martin Katz, Mapping Law’s Complexity with ‘Legal Maps’ in Complexity Theory and Law: Mapping an Emergent Jurisprudence (Taylor & Francis) (2018) < Taylor & Francis >

Michael Bommarito & Daniel Martin Katz, Measuring and Modeling the U.S. Regulatory Ecosystem, 168 Journal of Statistical Physics 1125 (2017) < J Stat Phys >

Daniel Martin Katz, Michael Bommarito & Josh Blackman, A General Approach for Predicting the Behavior of the Supreme Court of the United States, PLoS ONE 12(4): e0174698 (2017) < PLoS One >

J.B. Ruhl, Daniel Martin Katz & Michael Bommarito, Harnessing Legal Complexity, 355 Science 1377 (2017) < Science >

J.B. Ruhl & Daniel Martin Katz, Measuring, Monitoring, and Managing Legal Complexity, 101 Iowa Law Review 191 (2015) < SSRN >

Paul Lippe, Daniel Martin Katz & Dan Jackson, Legal by Design: A New Paradigm for Handling Complexity in Banking Regulation and Elsewhere in Law, 93 Oregon Law Review 831 (2015) < SSRN >

Paul Lippe, Jan Putnis, Daniel Martin Katz & Ian Hurst, How Smart Resolution Planning Can Help Banks Improve Profitability And Reduce Risk, Banking Perspective Quarterly (2015) < SSRN >

Daniel Martin Katz, The MIT School of Law? A Perspective on Legal Education in the 21st Century, Illinois Law Review 1431 (2014) < SSRN > < Slides >

Daniel Martin Katz & Michael Bommarito, Measuring the Complexity of the Law: The United States Code, 22 Journal of Artificial Intelligence & Law 1 (2014) < Springer > < SSRN >

Daniel Martin Katz, Quantitative Legal Prediction – or – How I Learned to Stop Worrying and Start Preparing for the Data Driven Future of the Legal Services Industry, 62 Emory Law Journal 909 (2013) < SSRN >

Daniel Martin Katz, Joshua Gubler, Jon Zelner, Michael Bommarito, Eric Provins & Eitan Ingall, Reproduction of Hierarchy? A Social Network Analysis of the American Law Professoriate, 61 Journal of Legal Education 76 (2011) < SSRN >

Michael Bommarito, Daniel Martin Katz & Jillian Isaacs-See, An Empirical Survey of the Written Decisions of the United States Tax Court (1990-2008), 30 Virginia Tax Review 523 (2011) < SSRN >

Daniel Martin Katz, Michael Bommarito, Juile Seaman, Adam Candeub, Eugene Agichtein, Legal N-Grams? A Simple Approach to Track the Evolution of Legal Language in Proceedings of JURIX: The 24th International Conference on Legal Knowledge and Information Systems (2011) < SSRN >

Daniel Martin Katz & Derek Stafford, Hustle and Flow: A Social Network Analysis of the American Federal Judiciary, 71 Ohio State Law Journal 457 (2010) < SSRN >

Michael Bommarito & Daniel Martin Katz, A Mathematical Approach to the Study of the United States Code, 389 Physica A 4195 (2010) < SSRN > < arXiv >

Michael Bommarito, Daniel Martin Katz & Jonathan Zelner, On the Stability of Community Detection Algorithms on Longitudinal Citation Data in Proceedings of the 6th Conference on Applications of Social Network Analysis (2010) < SSRN > < arXiv >

Michael Bommarito, Daniel Martin Katz, Jonathan Zelner & James Fowler, Distance Measures for Dynamic Citation Networks 389 Physica A 4201 (2010) < SSRN > < arXiv >

Michael Bommarito, Daniel Martin Katz & Jonathan Zelner, Law as a Seamless Web? Comparing Various Network Representations of the United States Supreme Court Corpus (1791-2005) in Proceedings of the 12th International Conference on Artificial Intelligence and Law (2009) < SSRN >

Marvin Krislov & Daniel Martin Katz, Taking State Constitutions Seriously, 17 Cornell Journal of Law & Public Policy 295 (2008) < SSRN >

Daniel Martin Katz, Derek Stafford & Eric Provins, Social Architecture, Judicial Peer Effects and the ‘Evolution’ of the Law: Toward a Positive Theory of Judicial Social Structure, 23 Georgia State Law Review 975 (2008) < SSRN >

Daniel Martin Katz, Institutional Rules, Strategic Behavior and the Legacy of Chief Justice William Rehnquist: Setting the Record Straight on Dickerson v. United States, 22 Journal of Law & Politics 303 (2006) < SSRN >

Daniel Martin Katz, Michael Bommarito, Tyler Sollinger & James Ming Chen, Law on the Market? Abnormal Stock Returns and Supreme Court Decision-Making < SSRN > < arXiv > < Slides >

Daniel Martin Katz, Michael Bommarito & Josh Blackman, Crowdsourcing Accurately and Robustly Predicts Supreme Court Decisions < SSRN > < arXiv > < Slides >

Daniel Martin Katz & Michael Bommarito, Regulatory Dynamics Revealed by the Securities Filings of Registered Companies < Slides >

Pierpaolo Vivo, Daniel Martin Katz & J.B. Ruhl (Editors), The Physics of the Law: Legal Systems Through the Prism of Complexity Science, Special Collection for Frontiers in Physics (2021 Forthcoming) < Frontiers in Physics >

Corinna Coupette, Dirk Hartung, Janis Beckedorf, Maximilian Bother & Daniel Martin Katz, Law Smells – Defining and Detecting Problematic Patterns in Legal Drafting < SSRN >

Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz & Nikolaos Aletras, LexGLUE: A Benchmark Dataset for Legal Language Understanding in English < arXiv > < SSRN >