
The growth of unstructured data poses real challenges. Many organizations struggle to manage unstructured data, including text, images, videos, and PDFs, due to the size and rate of growth of the data. For the staff at the law firm Katten Muchin Rosenman LLP (known as Katten Law), regulation and security have become new concerns.
The sheer volume of unstructured data is difficult to comprehend. As part of its Global Datasphere study a few years ago, analyst firm IDC predicted that by 2025, the planet will generate more than 175 zettabytes of data per year (an estimate it has since lowered to 163 ZB).
Just storing 163 ZB of raw data would require roughly 163 billion 1TB drives, which is clearly not feasible, since the total storage capacity of all media in the world (HDD, flash, tape, and even phones) combined is only about 13 ZB, according to IDC. Incidentally, IDC also notes that only about 7.5 ZB of data is actually written to storage media, meaning most data is never stored at all and existing storage is effectively over-provisioned.
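The scale of these figures is easier to see with a quick unit check. The following is a minimal sketch, assuming decimal (SI) units where 1 TB is 10^12 bytes and 1 ZB is 10^21 bytes; the function name and constants are illustrative and do not come from IDC.

```python
# Decimal (SI) storage units, expressed in bytes (an assumption; IDC's
# exact unit convention is not stated in the article).
TB = 10**12
ZB = 10**21


def drives_needed(total_bytes: int, drive_bytes: int = TB) -> int:
    """Return how many drives of drive_bytes are needed, rounded up."""
    return -(-total_bytes // drive_bytes)


projected = 163 * ZB   # IDC's revised yearly data estimate
capacity = 13 * ZB     # IDC's estimate of total installed capacity

print(drives_needed(projected))        # 163000000000
print(round(projected / capacity, 1))  # 12.5
```

Under these assumptions, a year of projected data would fill about 163 billion 1TB drives, or about 12.5 times the world's installed capacity.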
Katten Law is used to big growth rates. The firm, which employs 700 lawyers worldwide, must store hundreds of millions of documents from thousands of client cases dating back decades. In total, the firm stores about 240 TB of data, a figure that is growing 20% to 25% each year, says Alexander Diaz, the firm’s director of infrastructure and data center operations.
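To put that growth rate in perspective, here is a minimal sketch of a compound-growth projection. It assumes the 240 TB starting point and the 20% to 25% annual range quoted above stay constant, which is a simplification; the function name is hypothetical.

```python
def project_capacity(start_tb: float, annual_growth: float, years: int) -> float:
    """Project stored data after a number of years of compound growth."""
    return start_tb * (1 + annual_growth) ** years


# Compare the low and high ends of the quoted growth range.
for rate in (0.20, 0.25):
    for years in (1, 3, 5):
        projected = project_capacity(240, rate, years)
        print(f"{rate:.0%} growth, {years} yr: {projected:.0f} TB")
```

At either end of the range, the firm's data would double in somewhere between three and four years if nothing were archived or deleted.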

Until recently, the law firm ran its own unstructured data archiving system, taking data from its primary Windows file system and moving it to an archive storage server colocated in the firm’s data center.
But Katten Law ran into some operational issues with the archive, which led it to explore alternatives, Diaz told Datanami in a recent interview. The firm invited Komprise, a provider of unstructured data management software, to conduct a proof of concept.
“During the proof of concept, we found that about 70% of the files stored on the file servers were old and hadn’t been accessed for more than three years or the case had been closed,” says Diaz. “Another reason we proposed a large-scale archiving project was to limit the risk if we were to encounter a ransomware event, because these files would not be affected.”
As Katten Law researched the software, they discovered other benefits. For example, many archiving solutions implement stubs in the production file system that represent the archived data. When data needs to be retrieved, users present that stub to the archiving solution, which retrieves the data. But if something happens to the stub, it can be very difficult to access the archived data again, Diaz says.
“Komprise takes a different approach,” he says. “They use symbolic links… basically like shortcuts. So you have a shortcut on your Windows desktop that references a path to the actual file or program on your operating system. And if that shortcut or symbolic link breaks or disappears, you can still find the original file or program.”
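The behavior Diaz describes is a general property of symbolic links and can be illustrated with a short, hedged sketch. This is not Komprise's implementation; the file names are hypothetical, and on Windows creating symbolic links may require elevated privileges or Developer Mode.

```python
import os
import tempfile


def demo_broken_link() -> bool:
    """Show that deleting a symbolic link leaves the target file intact."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "archived_brief.txt")  # stands in for the archived file
        link = os.path.join(tmp, "brief_link.txt")        # stands in for the user-facing link

        with open(target, "w") as f:
            f.write("case file contents")

        os.symlink(target, link)      # the link only stores a path to the target

        with open(link) as f:         # reading through the link reaches the target
            via_link = f.read()

        os.remove(link)               # removes the link itself, not the target

        return via_link == "case file contents" and os.path.exists(target)


print(demo_broken_link())
```

Because the link holds nothing but a path, losing it does not affect the data it points to, as long as the file's location is still known.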
The ability to go beyond purely time-based archiving of unstructured data is another benefit of using Komprise software, according to Diaz. Many traditional archiving packages archive files based only on a set period of inactivity, so if a document related to a case hasn’t been accessed for three years, it is automatically archived.
That doesn’t work so well in the legal industry, Diaz said.
“Legal cases, especially litigation cases, often sit around for a while and then get picked up later,” he said. “Let’s say we’re representing someone. After the verdict is issued, there’s a period of time between the initial case and the appeal. So using time alone as a criterion doesn’t necessarily work.”
Komprise gave Katten Law the ability to archive files related to a case based on the date the case actually closed, rather than based on an arbitrary number of years the case remains untouched. After documents are archived, if a user needs to retrieve a read-only copy of the data, they can do so by simply clicking a shortcut on their desktop, which pulls the data from the Komprise archive to a local storage appliance, from which the user can retrieve it, Diaz says.
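The difference between the two policies can be sketched in a few lines. This is an illustrative model only, not Komprise's API: the record fields, function names, and sample dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class FileRecord:
    path: str
    last_accessed: date
    case_closed_on: Optional[date] = None  # None means the case is still open


def archivable_by_age(rec: FileRecord, today: date, max_idle_days: int) -> bool:
    """Time-only rule: archive anything idle longer than max_idle_days."""
    return today - rec.last_accessed > timedelta(days=max_idle_days)


def archivable_by_case(rec: FileRecord, today: date) -> bool:
    """Case-aware rule: archive only once the case has actually closed."""
    return rec.case_closed_on is not None and rec.case_closed_on <= today


today = date(2024, 6, 1)
three_years = 3 * 365

# An idle file in a case that is still open, for example awaiting appeal.
awaiting_appeal = FileRecord("smith_v_jones/brief.docx", date(2020, 1, 15))
# A recently opened file in a case that has closed.
closed_case = FileRecord("acme_merger/memo.docx", date(2024, 3, 1), date(2024, 4, 1))

print(archivable_by_age(awaiting_appeal, today, three_years))  # True
print(archivable_by_case(awaiting_appeal, today))              # False
print(archivable_by_age(closed_case, today, three_years))      # False
print(archivable_by_case(closed_case, today))                  # True
```

In this sketch the time-only rule archives files from a dormant but open case and misses files from a closed one, which is the mismatch Diaz describes.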
The firm is currently migrating its primary storage platform from traditional spinning disks to flash storage. By moving more data to a Komprise-based archive running on Microsoft Azure Blob Storage, Diaz says, the firm can give users the benefits of faster primary storage while keeping costs down.

“Komprise has been very consistent for us,” he says. “We started off with cases closed or data not being accessed for over three years. About six months ago, we lowered the threshold to no access for two years or cases closed, and ultimately migrated another 40TB to Azure.”
Reducing file storage on Windows file shares will also save the law firm money, especially as it transitions to a new platform later this year. “We won’t have to buy as much storage, which will save us money on future purchases,” says Diaz.
The benefits of increased security for Katten Law’s data are harder to measure, but with ransomware on the rise again this year, the added protection clearly has real value for law firms.
“I can’t emphasize enough that the risk is also mitigated because archived files are not impacted by any hacker or ransomware event,” Diaz says. “Hackers and ransomware can’t access those files. They can’t be impacted by any security event.”
Related Items:
Unstructured data management still in its infancy, according to Komprise
Conquering the unstructured data problem
The growth of unstructured data is blowing holes in IT budgets