Causal BERT: Language Models for Causality Detection Between Events Expressed in Text

ABSTRACT: “Causality understanding between events is a critical natural language processing task that is helpful in many areas, including health care, business risk management and finance. On close examination, one can find a huge amount of textual content both in the form of formal documents or in content arising from social media like Twitter, dedicated to communicating and exploring various types of causality in the real world. Recognizing these “Cause-Effect” relationships between natural language events continues to remain a challenge simply because it is often expressed implicitly. Implicit causality is hard to detect through most of the techniques employed in literature and can also, at times be perceived as ambiguous or vague. Also, although well-known datasets do exist for this problem, the examples in them are limited in the range and complexity of the causal relationships they depict especially when related to implicit relationships. Most of the contemporary methods are either based on lexico-semantic pattern matching or are feature-driven supervised methods. Therefore, as expected these methods are more geared towards handling explicit causal relationships leading to limited coverage for implicit relationships and are hard to generalize. In this paper, we investigate the language model’s capabilities for causal association among events expressed in natural language text using sentence context combined with event information, and by leveraging masked event context with in-domain and out-of-domain data distribution. Our proposed methods achieve the state-of-art performance in three different data distributions and can be leveraged for extraction of a causal diagram and/or building a chain of events from unstructured text.”

An Interesting Paper — ACCESS HERE

Measuring Law Over Time: A Network Analytical Framework with an Application to Statutes and Regulations in the United States and Germany

We have just posted our NEW PAPER, which features a combined dataset of network and text data totaling roughly 120 MILLION words (tokens): “Measuring Law Over Time: A Network Analytical Framework with an Application to Statutes and Regulations in the United States and Germany.” Access the paper draft via SSRN.

ABSTRACT: How do complex social systems evolve in the modern world? This question lies at the heart of social physics, and network analysis has proven critical in providing answers to it. In recent years, network analysis has also been used to gain a quantitative understanding of law as a complex adaptive system, but most research has focused on legal documents of a single type, and there exists no unified framework for quantitative legal document analysis using network analytical tools. Against this background, we present a comprehensive framework for analyzing legal documents as multi-dimensional, dynamic document networks. We demonstrate the utility of this framework by applying it to an original dataset of statutes and regulations from two different countries, the United States and Germany, spanning more than twenty years (1998–2019). Our framework provides tools for assessing the size and connectivity of the legal system as viewed through the lens of specific document collections as well as for tracking the evolution of individual legal documents over time. Implementing the framework for our dataset, we find that at the federal level, the American legal system is increasingly dominated by regulations, whereas the German legal system remains governed by statutes. This holds regardless of whether we measure the systems at the macro, the meso, or the micro level.
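To make the idea of tracking "size and connectivity" over time a bit more concrete, here is a minimal sketch on made-up data. The document names, the years chosen, and the `snapshot_stats` helper are all hypothetical illustrations; this is not the paper's dataset or its actual framework, just one simple way to compare yearly snapshots of a reference network.

```python
from collections import defaultdict

# Hypothetical toy data: (year, citing_document, cited_document).
# These are NOT taken from the paper's dataset.
citations = [
    (1998, "statute_a", "statute_b"),
    (1998, "regulation_x", "statute_a"),
    (2019, "statute_a", "statute_b"),
    (2019, "regulation_x", "statute_a"),
    (2019, "regulation_y", "statute_a"),
    (2019, "regulation_y", "regulation_x"),
]


def snapshot_stats(edges):
    """Summarize each yearly snapshot as document count, reference count, density."""
    by_year = defaultdict(lambda: {"nodes": set(), "edges": set()})
    for year, src, dst in edges:
        by_year[year]["nodes"].update((src, dst))
        by_year[year]["edges"].add((src, dst))

    stats = {}
    for year, snap in sorted(by_year.items()):
        n, m = len(snap["nodes"]), len(snap["edges"])
        # Density of a directed graph without self-loops: m / (n * (n - 1)).
        density = m / (n * (n - 1)) if n > 1 else 0.0
        stats[year] = {"documents": n, "references": m, "density": density}
    return stats


stats = snapshot_stats(citations)
for year, row in stats.items():
    print(year, row)
```

Comparing the rows for two years gives a crude picture of growth: more documents and more references between them, with density indicating whether connectivity keeps pace with size.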

#LegalComplexity #LegalDataScience #NetworkScience #LegalAI #SocialPhysics #LegalNLP #ComplexSystems

Can A Fruit Fly Learn Word Embeddings?

A very interesting conference proceedings paper, available on arXiv.

ABSTRACT: “The mushroom body of the fruit fly brain is one of the best studied systems in neuroscience. At its core it consists of a population of Kenyon cells, which receive inputs from multiple sensory modalities. These cells are inhibited by the anterior paired lateral neuron, thus creating a sparse high dimensional representation of the inputs. In this work we study a mathematical formalization of this network motif and apply it to learning the correlational structure between words and their context in a corpus of unstructured text, a common natural language processing (NLP) task. We show that this network can learn semantic representations of words and can generate both static and context-dependent word embeddings. Unlike conventional methods (e.g., BERT, GloVe) that use dense representations for word embedding, our algorithm encodes semantic meaning of words and their context in the form of sparse binary hash codes. The quality of the learned representations is evaluated on word similarity analysis, word-sense disambiguation, and document classification. It is shown that not only can the fruit fly network motif achieve performance comparable to existing methods in NLP, but, additionally, it uses only a fraction of the computational resources (shorter training time and smaller memory footprint).”
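The "sparse binary hash code" idea from the abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the weights below are random rather than learned (the paper's learning rule is not reproduced here), and the function names, the sizes, and the toy input vector are hypothetical. What it does show is the inhibition step, where only the k most strongly driven cells stay active.

```python
import random


def sparse_binary_hash(x, weights, k):
    """Return a binary code with exactly k active units (k-winners-take-all)."""
    # Each model "Kenyon cell" computes a weighted sum of the input vector.
    activations = [sum(w * v for w, v in zip(row, x)) for row in weights]
    # Global inhibition: keep only the k cells with the largest activation.
    winners = sorted(
        range(len(activations)), key=lambda i: activations[i], reverse=True
    )[:k]
    code = [0] * len(activations)
    for i in winners:
        code[i] = 1
    return code


def shared_active_units(a, b):
    """Count positions where both codes are active, a simple similarity score."""
    return sum(1 for u, v in zip(a, b) if u == 1 and v == 1)


# Assumed sizes for illustration: 8 input features, 40 cells, 4 winners.
rng = random.Random(0)
n_inputs, n_cells, k = 8, 40, 4
weights = [[rng.random() for _ in range(n_inputs)] for _ in range(n_cells)]

x = [1, 0, 0, 1, 0, 1, 0, 0]  # toy input, e.g. a bag of context words
code = sparse_binary_hash(x, weights, k)
print(code, shared_active_units(code, code))
```

Because every code has exactly k ones out of many cells, comparing two inputs reduces to counting shared active positions, which hints at why such representations can be cheap to store and compare.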

Legal Informatics – Cambridge University Press (2021)

We are very pleased to announce that pre-orders for “Legal Informatics” (Cambridge University Press, coming in early 2021) are now available on Amazon / Cambridge. Our book is designed to be an introduction to the academic discipline underlying the economic and technological transformation of the legal industry. Legal Informatics features contributions from more than two dozen academic and industry experts. Its chapters cover the history and principles of legal informatics and background technical concepts, including natural language processing and distributed ledger technology. The volume also presents real-world case studies that offer important insights into document review, due diligence, compliance, case prediction, billing, negotiation and settlement, contracting, patent management, legal research, and online dispute resolution. It is a hardbound book of roughly 600 pages.

#LegalInformatics #LegalTech #LegalInnovation #MachineLearning #NetworkScience #NLP #LegalScience

A Third Vaccine Success – Oxford University breakthrough on global COVID-19 vaccine

Very promising news — and now some of the key questions for 2021 …

What is the Venn diagram of these candidate vaccines?
Hopefully they do not overlap perfectly, so that a patient can take one vaccine if another proves to be ineffective.

Where is the testing regime to allow folks to explore the efficacy at the personal level?
While it is helpful to offer a characterization of the mean-field performance of a vaccine, we cannot expect folks to ‘get back to normal’ unless they have some personal assurance that the vaccine has actually worked for them.

How long does immunity last?
This is still unknown. Six months? One year? Longer? Also, even if the vaccine ‘fails’ or its protection wanes, how much does it reduce the severity of COVID-19?

What about children?
Trials for children have yet to begin (or have only recently started). While children appear to have had fewer issues with COVID-19 (perhaps because of exposure to other coronaviruses, etc.), there is still the question of how well the vaccine will perform in children.