Programming Dynamic Models in Python-Part 3: Outbreak on a Network

In this post, we will continue building on the basic models we discussed in the first and second tutorials. If you haven’t had a chance to take a look at them yet, definitely go back and at least skim them, since the ideas and code there form the backbone of what we’ll be doing here. In this tutorial, we will build a model that can simulate outbreaks of disease on a small-world network (although the code can support arbitrary networks). This tutorial represents a shift away from both (a) the mass-action mixing of the first two and (b) the assumption of social homogeneity across individuals that allowed us to take some shortcuts to simplify model code and speed execution. Put another way, we’re moving more in the direction of individual-based modeling. When we’re done, your model should be producing network plots in which red nodes are individuals who have been infected before the end of the run, blue nodes are never-infected individuals, and green ones are the index cases who are infectious at the beginning of the run. Your model will also be putting out interesting and unpredictable results. In order to do this one, …
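As a rough illustration of the kind of model described above, here is a minimal sketch using only the standard library. It is not the tutorial's own code: the function names (`small_world`, `run_outbreak`, `node_colors`), the SIR-style states, and the parameters `beta` (per-contact infection probability) and `gamma` (per-step recovery probability) are all assumptions made for this example.

```python
import random

def small_world(n, k, p, rng):
    """Ring lattice (each node tied to its k nearest neighbors, k even),
    with each lattice edge rewired to a random node with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                candidates = [c for c in range(n) if c != i and c not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

def run_outbreak(adj, index_cases, beta, gamma, rng, max_steps=1000):
    """Discrete-time outbreak: S (susceptible), I (infectious), R (recovered)."""
    state = {node: "S" for node in adj}
    for node in index_cases:
        state[node] = "I"
    ever_infected = set(index_cases)
    for _ in range(max_steps):
        infectious = [node for node, s in state.items() if s == "I"]
        if not infectious:
            break  # the outbreak has died out
        newly_infected = set()
        for node in infectious:
            # Infection travels only along network ties, not by mass action.
            for neighbor in adj[node]:
                if state[neighbor] == "S" and rng.random() < beta:
                    newly_infected.add(neighbor)
        for node in infectious:
            if rng.random() < gamma:
                state[node] = "R"
        for node in newly_infected:
            state[node] = "I"
        ever_infected |= newly_infected
    return state, ever_infected

def node_colors(adj, index_cases, ever_infected):
    """Color scheme from the post: green = index case, red = infected
    during the run, blue = never infected."""
    index_set = set(index_cases)
    return {
        node: "green" if node in index_set
        else "red" if node in ever_infected
        else "blue"
        for node in adj
    }

rng = random.Random(42)
graph = small_world(n=100, k=4, p=0.1, rng=rng)
index_cases = rng.sample(sorted(graph), 2)
state, ever_infected = run_outbreak(graph, index_cases, beta=0.3, gamma=0.2, rng=rng)
colors = node_colors(graph, index_cases, ever_infected)
print(len(ever_infected), "of", len(graph), "nodes were infected during the run")
```

Because transmission depends on who is tied to whom, repeated runs with different seeds can give quite different outbreak sizes, which is the kind of variability the post alludes to.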

Visualizing Dynamic Networks with Python, Igraph, and SONIA

igraph2sonia Example 1 from Michael Bommarito on Vimeo. When it comes to quickly motivating a point or engaging students in a classroom, one of the most effective tools is visualization. Not only do movies provide fun and excitement, but they also allow viewers to leverage the abilities of the visual cortex to infer dynamics and patterns in the animated system. For our recent research, dynamic graphs are the type of system of interest. As I’ve covered before, Python is my language of choice for most programming tasks. Furthermore, Python is a very accessible language, even for beginners. However, when it comes to visualizing dynamic networks, we need another tool. Our tool of choice is SONIA, the Social Network Image Animator. I thought I’d provide a helpful little function to generate SONIA input files from igraph objects, along with a few examples. This function takes as input an igraph.Graph object and a file name to store the SONIA output in. Every vertex in the Graph object should have a time attribute specified, either simply as an integer indicating the start time, or as a tuple or list of the form (startTime,endTime). Check out the following two examples if you need more …
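To make the two accepted forms of the time attribute concrete, here is a small sketch of how such values might be normalized before writing any output. It is not the post's function: the helper name `vertex_intervals` is hypothetical, and the rule that a bare start time lasts until the latest time seen on any vertex is an assumption made for this example. It works on a plain list so that it runs without igraph installed.

```python
def vertex_intervals(times):
    """Normalize per-vertex time values to (start, end) pairs.

    Each value is either a bare start time or a (startTime, endTime)
    tuple or list, as described in the post.
    """
    if not times:
        return []
    parsed = []
    for value in times:
        if isinstance(value, (tuple, list)):
            if len(value) != 2:
                raise ValueError("expected (startTime, endTime), got %r" % (value,))
            start, end = value
        else:
            start, end = value, None  # only a start time was given
        parsed.append((start, end))
    # Assumption (not from the post): a vertex with no end time stays
    # present until the latest time observed on any vertex.
    last = max(start if end is None else end for start, end in parsed)
    intervals = []
    for start, end in parsed:
        end = last if end is None else end
        if end < start:
            raise ValueError("end time %r precedes start time %r" % (end, start))
        intervals.append((start, end))
    return intervals

# With igraph, the input list might come from g.vs["time"].
times = [0, (1, 3), [2, 5], 4]
print(vertex_intervals(times))
```

Resolving every vertex to an explicit interval up front keeps the later file-writing step simple, since it no longer needs to care which form the user supplied.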

Computer Programming and the Law — OR — How I Learned to Learn Live with Python and Leverage Developments in Information Science

One of our very first posts highlighted a recent article in Science Magazine describing the possibilities of and perils associated with a computational revolution in the social sciences. A very timely article by Paul Ohm (UC-Boulder Law School) entitled Computer Programming and the Law: A New Research Agenda represents the legal studies analog of the Science Magazine article. From information retrieval to analysis to visualization, we believe this article outlines the Computational Legal Studies playbook in a very accessible manner. Prior to founding this blog, we had little doubt that developments in informatics and the science associated with Web 2.0 would benefit the production of a wide class of theoretical and empirical legal scholarship. In order to lower the costs to collective action and generate a forum for interested scholars, we believed it would be useful to produce the Computational Legal Studies Blog. The early results have been very satisfying. For example, it has helped us link to the work of Paul Ohm. For those interested in learning more about not only the potential benefits of a computational revolution in legal science but also some of the relevant mechanics, we strongly suggest you consider giving his new article a read!

OpenEDGAR: Open Source Software for SEC EDGAR Analysis is published in MIT Computational Law Report

Today our paper, “OpenEDGAR: Open Source Software for SEC EDGAR Analysis,” was published in MIT Computational Law Report. ABSTRACT: OpenEDGAR is an open source Python framework designed to rapidly construct research databases based on the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system operated by the US Securities and Exchange Commission (SEC). OpenEDGAR is built on the Django application framework, supports distributed compute across one or more servers, and includes functionality to (i) retrieve and parse index and filing data from EDGAR, (ii) build tables for key metadata like form type and filer, (iii) retrieve, parse, and update CIK to ticker and industry mappings, (iv) extract content and metadata from filing documents, and (v) search filing document contents. OpenEDGAR is designed for use in both academic research and industrial applications, and is distributed under MIT License at https://github.com/LexPredict/openedgar

NumPy Review Paper in Nature

ABSTRACT: “Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves and in the first imaging of a black hole. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analyzing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.” Access Paper via Nature.

Back to Future in Legal Artificial Intelligence — Expert Systems, Data Science and the Need for Peer Reviewed Technical Scholarship

In the broader field of Artificial Intelligence (A.I.) there is a major divide between Data Driven A.I. and Rules Based A.I. Of course, it is possible to combine these approaches, but let’s keep them separate and easy for now. Rules Based AI in the form of expert systems peaked in the late 1980’s and culminated in the last AI Winter. Absent a few commercial examples such as TurboTax, the world moved on and Data Driven A.I. took hold. But here in #LegalTech #LawTech #LegalAI #LegalAcademy – it seems more and more like we have gone ‘Back to the A.I. Future’ (and brought an IF-THEN back in the Delorean). Even in 2020, we see individuals and companies touting themselves for taking us Back to the A.I. Future. There is nothing wrong with Expert Systems or Rules Based AI per se. In law, the first expert system was created by Richard Susskind and Phillip Capper in the 1980’s. Richard discussed this back at ReInventLaw NYC in 2014. There are some use cases where Legal Expert Systems (Rules Based AI) are appropriate. For example, it makes the most sense in the A2J context. Indeed, offerings such as A2J Author and …

OpenEDGAR: Open Source Software for SEC EDGAR Analysis (Michael Bommarito, Daniel Martin Katz & Eric Detterman)

Our next paper — OpenEDGAR – Open Source Software for SEC Edgar Analysis is now available.  This paper explores a range of #OpenSource tools we have developed to explore the EDGAR system operated by the US Securities and Exchange Commission (SEC).  While a range of more sophisticated extraction and clause classification protocols can be developed leveraging LexNLP and other open and closed source tools, we provide some very simple code examples as an illustrative starting point. Click here for Paper:   < SSRN > < arXiv > Access Codebase Here: < Github > Abstract:  OpenEDGAR is an open source Python framework designed to rapidly construct research databases based on the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system operated by the US Securities and Exchange Commission (SEC). OpenEDGAR is built on the Django application framework, supports distributed compute across one or more servers, and includes functionality to (i) retrieve and parse index and filing data from EDGAR, (ii) build tables for key metadata like form type and filer, (iii) retrieve, parse, and update CIK to ticker and industry mappings, (iv) extract content and metadata from filing documents, and (v) search filing document contents. OpenEDGAR is designed for use in both academic research and industrial applications, and is distributed under MIT License at …

LexNLP: Natural Language Processing and Information Extraction For Legal and Regulatory Texts (Bommarito, Katz, Detterman)

Paper Abstract – LexNLP is an open source Python package focused on natural language processing and machine learning for legal and regulatory text. The package includes functionality to (i) segment documents, (ii) identify key text such as titles and section headings, (iii) extract over eighteen types of structured information like distances and dates, (iv) extract named entities such as companies and geopolitical entities, (v) transform text into features for model training, and (vi) build unsupervised and supervised models such as word embedding or tagging models. LexNLP includes pre-trained models based on thousands of unit tests drawn from real documents available from the SEC EDGAR database as well as various judicial and regulatory proceedings. LexNLP is designed for use in both academic research and industrial applications, and is distributed at https://github.com/LexPredict/lexpredict-lexnlp

ICPSR Summer Course

2010 ICPSR Summer Program in Quantitative Methods: Introduction to Computing for Complex Systems

1. Syllabus for the Computing Module (PDF), Syllabus for the Lecture (PDF)
2. Slides for Class 1, 7/21/10 (from SlideShare Channel)
3. Slides for Class 2, 7/22/10 (from SlideShare Channel)
4. Slides for Class 3, 7/23/10 (from SlideShare Channel)
5. Assignment #1
6. Slides for Class 4, 7/26/10 (from SlideShare Channel)
7. The Schelling Segregation Model Mapping Exercise
8. Slides for Class 5, 7/27/10 (from SlideShare Channel)
9. The Camouflage Model Implementations: Example 1 (.nlogo file), Example 2 (.nlogo file), Example 3 (.nlogo file), Example 4 (PDF)
10. Slides for Class 6, 7/28/10 (from SlideShare Channel)
11. Slides for Class 7, 7/29/10 (from SlideShare Channel)
    Download Netlogo SIR Timing Tutorial
    Download SIR Example Netlogo Model
    Python Code for SIR Model from Github
12. Slides for Class 8, 7/30/10 (from SlideShare Channel)
    Download Fire Automation Part 1
    Download Fire Automation Part 2
13. Assignment #2
14. Slides for Class 9, 8/2/10 (from SlideShare Channel)
15. Slides for Class 10, 8/3/10 (from SlideShare Channel) …

Using R for Quantitative Methods for Lawyers and Legal Analytics Courses (Professors Katz + Bommarito)

While its performance is sometimes problematic for some extremely large data problems, R (with the R Studio frontend) is the data science language du jour for many small to medium data problems. Among other things, R is great because it is open source and hyper-customizable, with thousands of packages available to be loaded for a specific problem. While Python and SQL are also important parts of the overall data science toolkit, we use R as our preferred language in both Quantitative Methods for Lawyers (3 credits) and our Legal Analytics course (2 credits). We have found that students who are diligent can make amazing strides in a relatively short amount of time. For example, see this final project by Pat Ellis from last year’s course. Here are some introductory resources that we have developed to get folks started: Loading R and R Studio R Boot Camp – Part 1 – Loading Datasets and Basic Data Exploration Data Cleaning and Additional Resources R Boot Camp – Part 2 – Statistical Tests Using R Basic Data Visualization in R Scatter Plots, Covariance, Correlation Using R Intro to Regression Analysis Using R Over the balance of the 2014-2015 academic year, Mike and …

Intro to Complex Systems Models & Methods

2015 ICPSR Summer Program in Quantitative Methods
Professor Daniel Martin Katz (Illinois Tech – Chi Kent)
Professor Michael J. Bommarito (UMich CSCS)
(Please Note Slide Links are Live Starting After Class)

KATZ LECTURE + LAB SLIDES FOR THE 2015 COURSE (Starts 7.20.15)
(0) Syllabus for the Lecture (PDF)

WEEK 1
(1) Slides from Lab Session 1 – Introduction to Complex Systems (M 7.20.15)
(2) Slides from Lecture 1 – Introduction to Complex Systems (Tu 7.21.15)
(3) Slides from Lab Session 2 – Introduction to Complex Systems (Tu 7.21.15)
(4) Slides from Lecture 2 – Introduction to Complex Systems (W 7.22.15)
(5) Slides from Lab Session 3 – Introduction to Complex Systems (W 7.22.15)
(6) Slides from Lecture 3 – Introduction to Complex Systems (Th 7.23.15)
(7a) Slides from Lab Session 4 – Introduction to Complex Systems (Th 7.23.15)
(7b) Mapping the Schelling Segregation Model (Download + Print)
(8) Slides from Lecture 4 – Introduction to Complex Systems (F 7.24.15)
(9) Slides from Lab Session 5 – Introduction to Complex Systems (F 7.24.15)
(10) Evolution of Cooperation on a Social Network (By Greg Todd Jones)
     Download Model
     Download Paper

WEEK 2
(11) Slides from Lecture 5 – Introduction to Complex …