SIGN UP FOR FREE – May 2021 – The 2021 Edition of Bucerius Legal Tech Essentials! https://techsummer.law-school.de/ – When we introduced Bucerius Legal Tech Essentials in 2020, we were overwhelmed by how many great lecturers and students (5000+) decided to join us. Not only was it one of the largest LegalTech programs ever created, it was also incredibly intense, fun and engaging. It should come as no surprise that we are going for a version 2.0 in 2021. The program’s core remains the same: a free online educational experience with many of the lecturers that would normally be on our campus!
So again, sign up for free. More information to follow!
This week I hosted the Elevate Together Podcast for our first ‘Inside the Engine Room’ Series with my guest Eric Detterman, VP Data Engineering and Solutions at Elevate. In our wide-ranging conversation, we talked about the path from ad hoc machine learning projects to building enterprise-grade products, thoughts on tech stacks, the decomposition of Legal A.I. products into their component parts (UI, database, workflow, engine, etc.), as well as key IT questions such as how to push and pull data with best-in-class API infrastructure and deployment (Docker, Kubernetes). Check it out here or on Apple Podcasts, SoundCloud or Spotify.
ABSTRACT: “Causality understanding between events is a critical natural language processing task that is helpful in many areas, including health care, business risk management and finance. On close examination, one can find a huge amount of textual content both in the form of formal documents or in content arising from social media like Twitter, dedicated to communicating and exploring various types of causality in the real world. Recognizing these “Cause-Effect” relationships between natural language events continues to remain a challenge simply because it is often expressed implicitly. Implicit causality is hard to detect through most of the techniques employed in literature and can also, at times be perceived as ambiguous or vague. Also, although well-known datasets do exist for this problem, the examples in them are limited in the range and complexity of the causal relationships they depict especially when related to implicit relationships. Most of the contemporary methods are either based on lexico-semantic pattern matching or are feature-driven supervised methods. Therefore, as expected these methods are more geared towards handling explicit causal relationships leading to limited coverage for implicit relationships and are hard to generalize. In this paper, we investigate the language model’s capabilities for causal association among events expressed in natural language text using sentence context combined with event information, and by leveraging masked event context with in-domain and out-of-domain data distribution. Our proposed methods achieve the state-of-art performance in three different data distributions and can be leveraged for extraction of a causal diagram and/or building a chain of events from unstructured text.”
We have just posted our NEW PAPER featuring a combined dataset of network and text data which is roughly 120 MILLION words (tokens) in size: “Measuring Law Over Time: A Network Analytical Framework with an Application to Statutes and Regulations in the United States and Germany.” Access the paper draft via SSRN.
ABSTRACT: How do complex social systems evolve in the modern world? This question lies at the heart of social physics, and network analysis has proven critical in providing answers to it. In recent years, network analysis has also been used to gain a quantitative understanding of law as a complex adaptive system, but most research has focused on legal documents of a single type, and there exists no unified framework for quantitative legal document analysis using network analytical tools. Against this background, we present a comprehensive framework for analyzing legal documents as multi-dimensional, dynamic document networks. We demonstrate the utility of this framework by applying it to an original dataset of statutes and regulations from two different countries, the United States and Germany, spanning more than twenty years (1998–2019). Our framework provides tools for assessing the size and connectivity of the legal system as viewed through the lens of specific document collections as well as for tracking the evolution of individual legal documents over time. Implementing the framework for our dataset, we find that at the federal level, the American legal system is increasingly dominated by regulations, whereas the German legal system remains governed by statutes. This holds regardless of whether we measure the systems at the macro, the meso, or the micro level.
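To make the idea of tracking the size and connectivity of a document network over time more concrete, here is a minimal Python sketch. It is not the paper's framework: it assumes a single dimension (cross-references between documents), the yearly snapshots and identifiers such as `stat_1` and `reg_7` are made up for illustration, and the function names `snapshot_stats` and `track_over_time` are hypothetical.

```python
def snapshot_stats(edges):
    """Size and connectivity of one snapshot of a reference network.

    `edges` is a list of (source, target) pairs, each meaning that one
    document cites another. This representation is an assumption made
    for this sketch, not the data model used in the paper.
    """
    unique_edges = set(edges)
    nodes = {node for edge in unique_edges for node in edge}
    n = len(nodes)
    # Density of a directed graph without self-loops: |E| / (n * (n - 1)).
    density = len(unique_edges) / (n * (n - 1)) if n > 1 else 0.0
    return {"nodes": n, "edges": len(unique_edges), "density": density}


def track_over_time(snapshots):
    """Compute the per-year statistics for a {year: edges} mapping."""
    return {year: snapshot_stats(edges) for year, edges in sorted(snapshots.items())}


# Toy snapshots with invented identifiers (not taken from the dataset).
toy_snapshots = {
    1998: [("stat_1", "stat_2")],
    2019: [("stat_1", "stat_2"), ("reg_7", "stat_1"), ("reg_7", "stat_2")],
}
toy_stats = track_over_time(toy_snapshots)
```

Comparing the entries of `toy_stats` year by year gives a crude macro-level view of growth. A fuller treatment would add further dimensions and the meso- and micro-level measures that the abstract mentions.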
ABSTRACT: “The mushroom body of the fruit fly brain is one of the best studied systems in neuroscience. At its core it consists of a population of Kenyon cells, which receive inputs from multiple sensory modalities. These cells are inhibited by the anterior paired lateral neuron, thus creating a sparse high dimensional representation of the inputs. In this work we study a mathematical formalization of this network motif and apply it to learning the correlational structure between words and their context in a corpus of unstructured text, a common natural language processing (NLP) task. We show that this network can learn semantic representations of words and can generate both static and context-dependent word embeddings. Unlike conventional methods (e.g., BERT, GloVe) that use dense representations for word embedding, our algorithm encodes semantic meaning of words and their context in the form of sparse binary hash codes. The quality of the learned representations is evaluated on word similarity analysis, word-sense disambiguation, and document classification. It is shown that not only can the fruit fly network motif achieve performance comparable to existing methods in NLP, but, additionally, it uses only a fraction of the computational resources (shorter training time and smaller memory footprint).”
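For readers who want a feel for what a “sparse binary hash code” is, here is a minimal Python sketch of the general idea: project an input onto many units, then let only the most active few stay on. It is an illustration under loud assumptions, not the paper's algorithm. The projection here is fixed and random rather than learned, the top-k step is only a stand-in for the inhibition described in the abstract, and the names `make_projection`, `sparse_hash` and `overlap` are hypothetical.

```python
import random


def make_projection(n_inputs, n_units, fan_in, seed=0):
    """Give each unit a random subset of input indices to listen to.

    Assumption for this sketch: connections are random and fixed.
    """
    rng = random.Random(seed)
    return [rng.sample(range(n_inputs), fan_in) for _ in range(n_units)]


def sparse_hash(x, projection, k):
    """Return a binary code in which exactly k units are active."""
    activations = [sum(x[i] for i in idxs) for idxs in projection]
    # Winner-take-all: keep the k most active units, silence the rest.
    # sorted() is stable, so ties are broken by the lower unit index.
    ranked = sorted(range(len(activations)), key=lambda j: activations[j], reverse=True)
    code = [0] * len(activations)
    for j in ranked[:k]:
        code[j] = 1
    return code


def overlap(code_a, code_b):
    """Count the units that are active in both codes (a simple similarity)."""
    return sum(1 for a, b in zip(code_a, code_b) if a and b)


# Toy input: a bag-of-words style vector over a 20-word vocabulary.
toy_projection = make_projection(n_inputs=20, n_units=50, fan_in=4, seed=1)
toy_input = [1 if i in (1, 3, 5) else 0 for i in range(20)]
toy_code = sparse_hash(toy_input, toy_projection, k=5)
```

Because each code is mostly zeros, two codes can be compared by counting shared active units with `overlap`, which is cheap compared with arithmetic on dense vectors.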
We are very pleased to announce that pre-orders for “Legal Informatics” (Cambridge University Press, coming in early 2021) are now available on Amazon / Cambridge. Our book is designed to be an introduction to the academic discipline underlying the economic and technological transformation of the legal industry. Legal Informatics features contributions from more than two dozen academic and industry experts. Chapters cover the history and principles of legal informatics and background technical concepts, including natural language processing and distributed ledger technology. The volume also presents real-world case studies that offer important insights into document review, due diligence, compliance, case prediction, billing, negotiation and settlement, contracting, patent management, legal research, and online dispute resolution. It is a hardbound book of roughly 600 pages.