Data science and big data analytics

Data science and big data analytics are increasingly critical tools in litigation. This is particularly true in the healthcare and life sciences sectors, where private entities and government payors routinely use electronic medical record (EMR), claim adjudication, enterprise resource planning (ERP), and other systems to capture and store large volumes of detailed data. Companies such as IQVIA and Truven Health Analytics and government agencies such as the Food and Drug Administration and the Centers for Medicare & Medicaid Services publish large, structured data sets of various types. Together, these private and public data sources contain billions of records that are commonly used in litigation.

As the data expand exponentially in both scale and complexity, Bates White keeps pace by continuously investing in our people, tools, and infrastructure. We maintain industry-leading data storage, processing, analytics, and statistical modeling capabilities. Our experts are skilled at identifying the appropriate data and approaches for a given context, implementing sophisticated analyses of large and complex data sets, and interpreting and communicating results in a way that has a strong impact and is easy to understand. 

We have broad and deep experience applying our expertise in nearly every corner of the healthcare and life sciences sectors. This includes deriving key insights from structured and unstructured “big data” from pharmaceutical and biotechnology companies, hospitals, pharmacy benefit managers (PBMs), medical device manufacturers, and medical providers, among others.

Our insights help clients make informed decisions and assist them in securing favorable outcomes in litigation.


  • Big data assessment of price-fixing allegations—Leveraged Hadoop big data technology on behalf of a large generic pharmaceutical manufacturer to process, store, and analyze billions of public and private sales and prescription records to assess pricing and market share patterns over time. Data sources included transactional sales, Symphony Health, IQVIA National Sales Perspective, and IQVIA National Prescription Audit. Developed a customized tool using Solr to search and synthesize millions of unstructured documents and data files, enabling us to quickly identify key documents and information. 
  • Statistical modeling of False Claims Act (FCA) allegations—Supported a pharmaceutical company with statistical simulation modeling of the impact of various factors on prescribing behavior to assess causation in a matter involving alleged kickbacks. Leveraged parallel computing to estimate computationally intensive models involving thousands of simulations on hundreds of millions of prescriptions.
  • Data conversion process and analysis help quantify effects of conduct—Automated a process to create a database from unstructured and non-digitized information on behalf of a hospital in a contractual dispute with a vendor. Developed algorithms to convert thousands of pages of qualitative information and data contained in PDF documents into a machine-readable database that allowed the team to reliably and accurately quantify the effects of alleged conduct. 
  • Creation of insights from unstructured information—In a False Claims Act matter, reviewed detailed medical records for hundreds of patients. For each patient, took hundreds of pages of non-digitized medical records and developed an automated process to graph provider care over time. 
  • Big data analytics of healthcare claims—Synthesized billions of medical and pharmacy claims records from more than ten insurers to identify and analyze patient-level trends. Leveraged big data technologies (Spark and Hadoop) to produce key insights for the client within a couple of weeks of receiving over a terabyte of claims data.
  • Big data analytics of IQVIA data—Analyzed 20 years of IQVIA Xponent data for thousands of products and millions of prescribers, totaling nearly 5 billion prescriptions. 


As an alternative to big data analytics, statistical sampling may be preferred when key information is available only in hard copy records that are expensive and time-consuming to obtain. We are experts in developing appropriate sampling and extrapolation methodologies to draw reliable conclusions. For example, on behalf of a leading healthcare provider in an arbitration focused on a chronic healthcare condition, our experts submitted an expert affidavit on a statistical sampling protocol related to patient records. The arbitration panel selected the sampling methodology proposed by our experts.
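
To illustrate the general idea of sampling and extrapolation, here is a minimal sketch in Python. It is not the protocol used in the matter described above, whose details are not given here. It assumes a simple random sample drawn without replacement and a normal-approximation confidence interval. The function name `extrapolate_total`, the per-record dollar amounts, and the population size are all hypothetical, and the data are synthetic.

```python
import math
import random
import statistics


def extrapolate_total(sample_values, population_size, z=1.96):
    """Estimate a population total from a simple random sample.

    Assumes the sample was drawn without replacement from a population of
    `population_size` records. Returns (estimate, lower, upper), where the
    bounds are a normal-approximation interval (z=1.96 is roughly 95%).
    """
    n = len(sample_values)
    if n < 2 or n > population_size:
        raise ValueError("need 2 <= sample size <= population size")

    mean = statistics.fmean(sample_values)
    sd = statistics.stdev(sample_values)  # sample standard deviation

    # Finite population correction: shrinks the error as the sample
    # approaches the full population.
    fpc = math.sqrt((population_size - n) / population_size)

    total = population_size * mean
    se_total = population_size * sd / math.sqrt(n) * fpc
    return total, total - z * se_total, total + z * se_total


if __name__ == "__main__":
    # Synthetic example: 10,000 hypothetical records, 400 reviewed.
    rng = random.Random(0)
    population = [round(rng.uniform(0, 500), 2) for _ in range(10_000)]
    sample = rng.sample(population, 400)

    estimate, lower, upper = extrapolate_total(sample, len(population))
    print(f"Estimated total: {estimate:,.0f} (interval {lower:,.0f} to {upper:,.0f})")
```

In practice, the design choice between simple random, stratified, or other sampling schemes depends on how the records vary and how costly each record is to retrieve; the sketch above covers only the simplest case.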
