Big data has been a buzzword for a few years, and its expansion into other fields is by no means slowing down. As always, where there is a “gold rush” around a new concept, lawyers do not fall far behind, even when it is about numbers, something lawyers do not particularly claim as a forte.

Advocates and legislators have been actively working on substantive legal issues involving big data, such as data security and privacy protection. At the same time, lawyers with the most revolutionary minds, or the least appetite for the mundane and repetitive aspects of lawyering, have started to venture into legal tech analytics. Legal tech is a broad term that covers data analytics, machine learning and deep learning techniques, and other artificial intelligence technologies. Some of those endeavors are still nascent in data science labs, but many are already selling services or products. In 2017, LexisNexis, one of the two largest legal research databases, acquired Ravel Law, a startup that created the online court case analytics application “Context,” which uses artificial intelligence technology. Now in LexisNexis it is a breeze to search by the name of a judge or an expert and receive summary statistics of their actions in court, such as the rate at which a judge grants a particular motion or the success rate of an expert’s testimony. As of the beginning of this year, there were more than 700 legal tech companies and startups [1]. These include Atrium LLP, a San Francisco-based online law firm that provides transactional legal services, and Blue J Legal, a Canadian provider of online tax law software, to name a few.

This blog post focuses on a special application of big data in the field of law: how big data methods and tools can help with patent analytics. Patent data is one of the areas of law that naturally fits the strengths and character of big data analytics. The United States Patent and Trademark Office (“USPTO”) receives more than 500,000 patent applications every year [2]. The sheer volume of patents by itself calls for big data technologies for computational analytics.

According to Dr. Tim Pohlmann of IPlytics [3], big data may affect patent analytics in the following ways: (1) valuing patents; (2) identifying patented technologies, to understand technology clusters without reading patents; (3) extrapolating information from patented technologies, to identify patents relevant to a given portfolio or licensing program; (4) monitoring activity across the patent world, such as patent litigation, trades, and licensing deals; and (5) tracking trends in each type of activity and predicting the outcome of future activities of a similar type, such as the likelihood of success in a patent litigation case based on prior activity in the field.

I would classify the applications of big data in patent law practice by area of legal specialization. Roughly, three types of legal work involve patents: patent prosecution, patent litigation, and tech transactions. Patent prosecution refers to the patent application process. Patent litigation involves disputes over patent validity or infringement. Tech transactional work handles patent evaluations in licensing or mergers and acquisitions. Big data approaches affect each type of legal work differently.

First, big data analytics provide the big picture for understanding the patented technologies of a company or an industry, and help company executives make business decisions on how to build their patent portfolios and licensing programs. The patent prosecution process is a negotiation between inventors and the patent office. Both the USPTO and the European Patent Office (“EPO”) store an enormous amount of information, including patent applications, decisions, and all other supplemental materials. These data are available to the public for free. By analyzing patent office data, we can get a sense of trends in the number of patent applications or in the rates at which applications are granted or abandoned. These findings probably do not affect the actual mechanics of a patent application, but they are very useful for business owners who want to understand how easy or hard it is for a certain type of patent to be granted, or where a company’s technology sits among the clusters of patents in its industry. The analytics help business owners make informed decisions on their patent strategy, i.e., which fields they should focus on when developing patented technology, or how to outperform their competitors in a certain field of patented technology.
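To make the idea of grant and abandonment rates concrete, here is a minimal Python sketch. It assumes a tiny, made-up list of application records; the field layout, the status labels, and the function name `outcome_rates` are all hypothetical and do not reflect the actual format of USPTO or EPO data.

```python
from collections import defaultdict

# Hypothetical, hand-made records: (filing_year, final_status).
# Real patent office data uses different fields and many more statuses.
applications = [
    (2015, "granted"),
    (2015, "abandoned"),
    (2015, "granted"),
    (2016, "granted"),
    (2016, "abandoned"),
    (2016, "abandoned"),
    (2016, "pending"),
]


def outcome_rates(records):
    """Return {year: {"granted": rate, "abandoned": rate}} among decided applications."""
    counts = defaultdict(lambda: defaultdict(int))
    for year, status in records:
        counts[year][status] += 1

    rates = {}
    for year, by_status in sorted(counts.items()):
        # Pending applications have no outcome yet, so they are left out.
        decided = by_status["granted"] + by_status["abandoned"]
        if decided == 0:
            continue
        rates[year] = {
            "granted": by_status["granted"] / decided,
            "abandoned": by_status["abandoned"] / decided,
        }
    return rates


for year, r in outcome_rates(applications).items():
    print(year, f"granted {r['granted']:.0%}, abandoned {r['abandoned']:.0%}")
```

The same counting, run over real records and broken down by technology class instead of year, is the kind of summary a business owner could use to judge how hard a certain type of patent is to obtain.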

Second, big data analytics help patent litigators understand the track record of court decisions on certain issues in patent law. This information helps litigants estimate their chances of winning and adjust their strategies accordingly. Take Context in LexisNexis as an example. After a case is assigned to a judge, litigants may search the judge’s name on Context and find the record of the judge’s decisions on, for example, motions to dismiss. A motion to dismiss is one of the first motions a defendant would consider filing before moving forward with discovery or other merits-based arguments. Judges have different views on motions to dismiss. Some love them, and some hate them. Usually, experienced lawyers rely on their “experience” in deciding how wise it is to bring a motion to dismiss in front of a certain judge. That “experience” is a mini database of anecdotes from direct or indirect interactions with the judge in court. The mini database on a judge varies significantly from lawyer to lawyer, which to a large extent distinguishes good lawyers from bad ones. Context provides more comprehensive (if not complete) data on what a judge has done so far when handling such motions. Context obtains the information by automatically processing hundreds and thousands of court opinions and litigation histories. Using Context is like talking to the world’s most experienced litigator, one who knows every detail of every judge’s activities in court. To a certain degree, it also bridges the gap between those who can and cannot afford “experienced” lawyers.
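As a rough illustration of the kind of summary statistic described above, the Python sketch below computes how often a judge granted a given type of motion. It is not how Context works internally; the `Ruling` record, the judge names, and the `grant_rate` function are hypothetical, and the data is invented.

```python
from dataclasses import dataclass


@dataclass
class Ruling:
    judge: str
    motion: str
    granted: bool


# Invented sample data; a real system would extract rulings from court records.
rulings = [
    Ruling("Judge A", "motion to dismiss", True),
    Ruling("Judge A", "motion to dismiss", False),
    Ruling("Judge A", "motion to dismiss", False),
    Ruling("Judge A", "motion to dismiss", False),
    Ruling("Judge B", "motion to dismiss", True),
    Ruling("Judge B", "motion to dismiss", True),
    Ruling("Judge B", "summary judgment", False),
]


def grant_rate(all_rulings, judge, motion):
    """Return (grant rate, number of rulings) for one judge and motion type."""
    relevant = [r for r in all_rulings if r.judge == judge and r.motion == motion]
    if not relevant:
        return None, 0  # no record to draw on
    granted = sum(1 for r in relevant if r.granted)
    return granted / len(relevant), len(relevant)


rate, n = grant_rate(rulings, "Judge A", "motion to dismiss")
print(f"Judge A granted {rate:.0%} of {n} motions to dismiss")
```

Returning the number of rulings alongside the rate matters: a rate built on a handful of decisions is closer to one lawyer's anecdotes than to a complete record.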

Third, tech transactional work benefits from big data analytics because the analytics allow a better valuation of patents, which is essential in most business transactions involving patents. In addition, a big-picture understanding of a company’s patent portfolio is important for estimating the potential value of the whole company, either relative to the other companies currently in the same industry or in the context of the industry’s historical development.

As a last note, it is easy to confuse big data analytics with legal tech that automates the processing of textual documents or the production of legal documents. Both Atrium and Blue J Legal, mentioned above, fall into the latter type of legal tech. However, big data analytics is a slightly different creature. Big data analytics allow lawyers to look at a large amount of similarly structured information and draw conclusions that may affect litigation strategy or the legal opinions to be provided.




Su Li

Su Li worked as a statistician/research methodologist at UC Berkeley law school and later as a research director at UC Hastings College of the Law. She is experienced in providing consultation services on quantitative and qualitative research design, survey methods, data management, and statistical modeling. She received her Ph.D. in Sociology and Master’s degree in Mathematical Methods for Social Sciences from Northwestern University. She is currently a J.D.