Case Study

Data Analysis, Searching, and Predictive Coding on Foreign Language Documents


A bank engaged Buckley to conduct an internal investigation of employees with large volumes of email.


  • Industry:

    Financial Services

  • Client Size:

    1,000 to 5,000 employees

  • Duration:

    6 weeks

  1. The Challenge

    A bank anticipating a potential regulatory action launched an internal investigation into the activities of certain employees. The bank collected two terabytes of email from six employees, more than 330 GB of data per employee. Even after deduplication and keyword searching, and with the vast majority of the documents in Spanish, the bank still faced over 360,000 potentially relevant documents, which would have required more than 9,000 attorney hours to review.
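    The figures above imply a rough per-custodian volume and review rate. The short Python sketch below works through that arithmetic. It assumes decimal units (1 TB = 1,000 GB), and the variable names are illustrative rather than taken from the engagement.

```python
# Figures quoted in the case study.
total_data_gb = 2 * 1000   # two terabytes, assuming 1 TB = 1,000 GB
custodians = 6
documents = 360_000        # remaining after deduplication and keyword search
review_hours = 9_000       # estimated attorney hours for a full linear review

gb_per_custodian = total_data_gb / custodians
docs_per_hour = documents / review_hours

print(f"{gb_per_custodian:.0f} GB per custodian")            # 333 GB per custodian
print(f"{docs_per_hour:.0f} documents per attorney hour")    # 40 documents per attorney hour
```

    Because the case study says "over 360,000" documents and "more than 9,000" hours, the 40-documents-per-hour figure is an approximation, not a stated review rate.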

  2. The Forté Way

    Due to the unusually high volume of data per custodian, the FORTÉ team formulated a tiered, multilayered approach that combined keyword searches, email threading, and predictive coding. As a result, the case team reviewed fewer than 18,000 documents, 5 percent of the 360,000 documents that hit on the search terms, and ultimately identified 1,500 relevant documents and 350 key documents. At a high level, the process entailed:

    • Prioritizing the processing of a single key custodian so that the investigative team could determine the appropriate data and keyword searches
    • Prioritizing the processing and review of additional custodians’ emails based on the information learned from the initial custodian’s emails
    • Using email threading to suppress repetitive chain content
    • Using predictive coding analytics to identify the most relevant content, and training the system using native Spanish-speaking attorneys
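    The email-threading step can be illustrated with a minimal Python sketch. It is not the FORTÉ team's tooling: it assumes each email carries a hypothetical `thread_id` and `sent` timestamp, and that the latest message in a thread quotes everything before it. Production threading tools compare message content and also keep branches that contain unique text.

```python
def suppress_thread_duplicates(emails):
    """Keep only the latest message in each thread.

    Assumes each email is a dict with the hypothetical keys 'id',
    'thread_id', and 'sent' (an ISO timestamp string, so plain string
    comparison orders messages chronologically).
    """
    latest = {}
    for email in emails:
        current = latest.get(email["thread_id"])
        if current is None or email["sent"] > current["sent"]:
            latest[email["thread_id"]] = email
    # Return the surviving messages in chronological order.
    return sorted(latest.values(), key=lambda e: e["sent"])


# Illustrative data only: two messages in thread T1, one in thread T2.
emails = [
    {"id": "a1", "thread_id": "T1", "sent": "2020-01-01T09:00"},
    {"id": "a2", "thread_id": "T1", "sent": "2020-01-01T10:30"},
    {"id": "b1", "thread_id": "T2", "sent": "2020-01-02T08:15"},
]
kept = suppress_thread_duplicates(emails)
print([e["id"] for e in kept])  # ['a2', 'b1']
```

    Reviewers then read one message per thread instead of every reply in the chain.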

    This process saved the client thousands of hours of attorney review time.
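    To put a rough number on that saving, the Python sketch below compares the hours for a full linear review with the hours for the documents actually reviewed. It assumes the reviewed set moves at the rate implied by the original estimate (360,000 documents in 9,000 hours), which the case study does not state, and it ignores the time spent training the predictive coding system.

```python
documents = 360_000
linear_review_hours = 9_000
reviewed_documents = 18_000   # upper bound; the team reviewed fewer than this

# Implied review rate (an assumption derived from the original estimate).
docs_per_hour = documents / linear_review_hours
reviewed_fraction = reviewed_documents / documents
reviewed_hours = reviewed_documents / docs_per_hour
hours_saved = linear_review_hours - reviewed_hours

print(f"Reviewed {reviewed_fraction:.0%} of documents")                           # Reviewed 5% of documents
print(f"About {reviewed_hours:.0f} hours instead of {linear_review_hours:,}")     # About 450 hours instead of 9,000
print(f"Roughly {hours_saved:,.0f} hours saved")                                  # Roughly 8,550 hours saved
```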

  3. .


    Internal investigations allow for maximum flexibility and creativity in identifying key content quickly and efficiently. Even though the investigative team is not bound by parameters that would otherwise have to be negotiated with an opposing party, the key to success still lies in balancing cost, efficiency, and defensibility.

Using predictive coding, the case team reviewed fewer than 18,000 of the 360,000 documents, saving the client thousands of hours of linear review time.

