Legal Tech Memo 2014: Driving Efficiencies and Mitigating Risk in the Era of Big Data

Now that the Information Governance message has finally gotten through to nearly every CIO and General Counsel, not to mention the vendor community, what’s next?

Over successive Legal Techs (this year’s LTNY 2014 was my sixth in a row), information management technology and services have grown progressively more sophisticated. Some might argue they have also become easier to use and integrate into the enterprise.

Meanwhile, Technology Assisted Review (TAR), Predictive Coding and Early Case Assessment (ECA) have all become standard tools in the ediscovery arsenal – and are accepted by most courts today as viable alternatives to a heretofore much more labor-intensive and costly manual review process.

In the right hands, this new generation of ediscovery/information governance tools can be leveraged throughout an enterprise’s corporate legal department and beyond, in other departments including compliance, records management, risk management, marketing, finance and lines of business.

However, one of the great challenges for any large corporation or organization today remains how best to address the deluge of Big Data. With a variety of technology and service provider offerings and methodologies to choose from, what is the best approach?

Big Data Comes of Age

In 2009, Analytics, Big Data and the Cloud were emerging trends. Five years later, Big Data is an integral part of the business fabric of every information-centric enterprise. Analytic tools and their variants are essential to managing Big Data.

The Cloud is fast becoming the primary Big Data transportation vehicle connecting individuals and organizations to vast amounts of compute power and data storage – all optimized to support a myriad of popular applications from Amazon, Facebook, LinkedIn and Twitter to so-called Software as a Service (SaaS) applications such as Google Analytics.

At the same time, eDiscovery requests have increased dramatically. Better tools along with more data are a recipe for a spike in data access requests, whether triggered by outside regulatory agencies, an increased number of lawsuits or internal requests from across the enterprise to leverage electronic data originally captured to meet regulatory compliance requirements.

Big Data Spike in Financial Services: Example

A Fortune 500 financial services firm I am familiar with has seen the number of internal ediscovery requests jump from 400 a year in 2004 to 400 a month in 2013, a twelvefold increase – and counting. While changes to the Federal Rules of Civil Procedure (FRCP) helped to accelerate the pace of ediscovery activities, the firm also offers variable annuities and mutual funds and is therefore regulated by the SEC as well.

Thus, the company is required by SEC Rule 17a-4 to retain every electronic interaction between its 12,000 agents and their clients and prospects. Every month or so, it adds 100 million more “objects” to its content archive, which now exceeds 3 billion emails, documents, instant messages, tweets and other forms of social media.

From this collection of Big Data, larger highly regulated firms, in particular, are compelled to carve out portions of data for a variety of purposes. Most of these larger organizations have learned to rely on a mix of in-house technologies and service providers to analyze, categorize, collect, cull, move, secure and store mostly unstructured or semi-structured data, such as documents and emails, that do not fit neatly into the structured world of traditional relational database management systems.

People and Process First, Then Technology

A recent ZDNet article on Analytics quoting Gartner analysts suggests, “Easy-to-use tools does not mean it leads to better decisions; if you don’t know what you’re doing, it can lead to spectacular failure. Supporting this point, Gartner predicts that by 2016, 25 percent of organizations using consumer data will face reputation damage due to inadequate understanding of information trust issues.”

In other words, great technology in inexperienced hands can lead to big trouble including privacy breaches, brand damage, litigation exposure and higher costs. When meeting ediscovery demands is the primary goal, most large organizations have concluded that acquiring the right combination of in-house technologies and outside services that offer specific subject matter expertise and proven process skills is the best strategy for reducing costs and data volumes.

Responsive Data vs. Data Volume: Paying by the Gig

The time-honored practice of paying to store ESI (electronically stored information) by volume in the age of Big Data is a budget buster and non-starter for many organizations. Most on-premise ediscovery tools and appliances as well as Cloud-based solutions have a data-by-volume pricing component. Therefore, it makes perfect sense to lower ESI or data volumes to lower costs.

However, organizations that delete data run the risk of losing potentially valuable information or of spoliating potentially responsive data relevant to litigation or a regulatory inquiry. This retain-or-delete dilemma favors service and solution providers whose primary focus is offering tactical, put-out-the-fire approaches to data management that worked well five or more years ago but not in today’s Big Data environment.

Big Data Volumes Driving Management Innovation

The pulse benchmarks launched last November by Kroll Ontrack indicate, “Since 2008, the average number of source gigabytes per project has declined by nearly 50 gigabytes due to more robust tools to select what is collected for ediscovery processing and review.”  In Kroll’s words, the decline is the result of “Smarter technology, savvier litigants and smaller collections.”

But technology innovation and smarter litigants tell only a portion of the story. The larger picture was revealed during LTNY’s Managing Big Data Track panel sessions moderated by UnitedLex President Dave Deppe and Jason Straight, UnitedLex’s Chief Privacy Officer.

Deppe’s panel included senior litigation support staff from GE and Google. Several key takeaways from the session include:

  • A long-term strategy for managing the litigation lifecycle is critical and goes well beyond deploying ediscovery tools and services. Acquiring outside subject matter expertise to mitigate internal litigation support bandwidth issues and provide defensibility through proven processes is key.
  • Price is not the deciding factor in selecting service providers. Efficiency, quality and a comprehensive, end-to-end approach to litigation lifecycle management are more important.
  • The relationship between inside and outside counsel and providers needs to be managed effectively. Good people (SMEs) and process can move the dial farther than technology, which often does not make a difference – especially in the wrong hands.
  • Building a litigation support team that includes a panel of key partners, directed by a senior member of the internal legal staff, can dramatically influence ediscovery outcomes and help lower downstream costs. Aggressively negotiating search terms and, as a result, influencing outside counsel is just one example.
  • When not enough attention is paid to search terms, 60 to 95 percent of documents are responsive. Litigation support teams must stop reacting and arm outside counsel with the right search terms to obtain the right results.
  • Over-collecting creates problems downstream. Conduct interviews with data owners and custodians. Ask them what data should be on hold. They will likely know.
  • Define, refine and repeat the process, train others and keep it in place.
  • Develop litigation readiness protocols, come up with a plan and play by the rules.

Merging of eDiscovery and Data Security

Straight’s panel focused on why and how organizations should develop a “Risk-based approach to cyber security threats.” With the advent of Cloud computing, sensitive data is at risk from both internal and external threats.

Panelist Ted Kobus, National Co-leader of Privacy and Data Protection at law firm Baker Hostetler, shared his concerns about the possibility of cyber-terrorists shutting down critical infrastructure such as power plants. With regard to ediscovery, law firms often have digital links to client file systems which, if compromised, could leak sensitive data or intellectual property or give hackers access to customer records.

In light of recent high-profile cyber breaches suffered by Target Stores and others, Kobus and other panelists emphasized the need to develop an “Incident Response Plan” that includes stakeholders from across the enterprise beyond just IT, including legal, compliance, HR, operations, brand management, marketing and sales.

Kobus emphasized that management needs to “embrace a culture of securing data” as a key component of an enterprise’s Big Data strategy. As the slide below indicates, a risk-based approach to managing corporate data addresses simple but critical questions.

UnitedLex LTNY 2014 Cyber Risk Panel Slide

Many organizations have created a Chief Data or Information Governance Officer position responsible for determining how various stakeholders throughout the enterprise are using data, and cyber insurance is becoming much more popular. Big Data management, compliance and data security are intrinsically connected. Moreover, the development of a data plan is critical to the survival of many corporations and its importance must not be overlooked or diminished.

Big Data Innovators to Watch in 2014 – and Beyond 

UnitedLex:  It is relatively easy to make the argument that it takes more than technology to innovate. UnitedLex has grown rapidly on the merits of its “Complete Litigation Lifecycle Management” solution, which provides a “unified litigation solution that is consultative in nature, provides a legally defensible, high quality process and can drive low cost through technology and global delivery centers.”

UnitedLex Domain Experts leverage best-of-breed ediscovery tools such as kCura’s Relativity for document review as well as their own Questio consultant-led, technology-enabled service that “combines targeted automation and data analysis expertise to intelligently reduce data and significantly reduce cost and risk while dramatically improving the timeliness and efficiency of data analysis and document review.”

UnitedLex has reason to believe its Questio service is “materially changing the way eDiscovery is practiced” because UnitedLex SMEs help to significantly reduce risk and, on average, reduce customers’ total project cost (TPC) by 40 to 50% or more. This “change” is primarily achieved by reducing data volumes, avoiding legal sanctions, securely hosting data, strictly adhering to jurisdictional and ethics requirements and, most of all, developing a true litigation and risk partnership with its customers.

How Questio Reduces Risk

61% of eDiscovery-related sanctions are caused by a failure to identify and preserve potentially responsive data. (UnitedLex)

Vound: In the words of CTO Peter Mercer, Vound provides a “Forensic Search tool that allows corporations to focus on the 99 percent of cases not going to court.” Targeting corporate counsel and internal audit, Vound’s Intella is “a powerful process, search and analysis tool that enables customers to easily find critical data. All products feature our unique ‘cluster map’ to visualize relevant relationships and drill down to the most pertinent evidence.” Intella can be installed on a laptop or in the cloud.

Intella works almost exclusively with unstructured and semi-structured data such as documents, emails and metadata, dividing data into “facets” using a predefined, in-line multi-classification scoring scheme. The solution is used by forensic and crime analysts as well as ediscovery experts to “put the big picture together” and bridge the gap between Big Data (too much data) and ediscovery (reducing the size of relevant data sets). The aim is to manage risk by identifying fraud patterns, improve efficiency by enabling early assessments, and lower costs.

Catalyst:  Insight is a “revolutionary new ediscovery platform from Catalyst, a pioneer in secure, cloud-based discovery. Engineered from the ground up for the demanding requirements of even the biggest legal matters, Insight is the first to harness the power of an XML engine—combining metadata, tags and text in a unified, searchable data store.”

According to Founder and CEO John Tredennick, Catalyst has deployed at least three NoSQL databases on the backend to offer Insight users “unprecedented speed, visual analytics and ‘no limits’ scalability, all delivered securely from the cloud in an elegant and simple-to-use interface—without the costs, complications or compromises of an e-discovery appliance.”

In addition, Catalyst has engineered ACID transactions, dynamic faceted search, specialized caching techniques and relational-like join capabilities into Insight in order to deliver a reliable, fast and easy-to-use solution that produces results, from raw data to review, “in minutes not days” – even with large document sets exceeding 10 million.




The above-mentioned trio of service and solution providers represents a sea change in the way corporations big and small will strategically approach ediscovery in the era of Big Data and Cloud computing. Deploying tactical solutions that meet only short-term goals leads to higher costs and increased risk.

Services and technology solutions that can be leveraged across the enterprise by a variety of data stakeholders are a much more logical and cost-effective approach for today’s business climate than point solutions that meet only departmental needs.

A risk-based approach to managing Big Data assets throughout the enterprise – including customer and partner facing data – is not only a good idea; an organization’s survival may depend on it.

About Gary MacFadden

Gary's career in the IT industry spans more than 25 years starting in software sales and marketing for IBM partners DAPREX and Ross Data Systems, then moving to the IT Advisory and Consulting industry with META Group, Giga and IDC. He is a co-founder of The Robert Frances Group where his responsibilities have included business development, sales, marketing, senior management and research roles. For the past several years, Gary has been a passionate advocate for the use of analytics and information governance solutions to transform business and customer outcomes, and regularly consults with both end-user and vendor organizations who serve the banking, healthcare, insurance, high tech and utilities industries. Gary is also a frequent contributor to the research portal, a sought after speaker for industry events and blogs frequently on Healthcare IT (HIT) topics.