The Evolution of Storage Systems for Media Asset Management #IBC2015

Dave Fellinger

I had the opportunity to tour the Library of Congress film archive in Culpeper, Virginia. There, in an underground bunker, the library has stored 140,000,000 feet of nitrocellulose film and 180,000,000 feet of safety film. As I walked deep into the bunker bored into the side of a hill, I could not help but think that the entire archive could be stored on just two racks of our very dense DDN storage. We are proud to be a part of the Library’s endeavor to digitize these precious creative assets for future generations.

In the days of nitrocellulose, the film editing process required the skillful use of a glue brush. Media asset management was accomplished with a card catalog pointing to a rack in the studio vault where one could find a can of film with a label matching a handwritten leader. Since then, we have seen several complete evolutions of the process, from editing done on custom workstations to the use of commodity hardware and non-linear editing software. Storage has evolved through multiple generations as well, and we have supported this entire process.

DDN entered the Fibre Channel storage market specifically for media with an arbitrated loop product over 20 years ago. Those were the first days of the Storage Area Network (SAN), and we functioned behind managed hubs and switches to enable the first collaborative environments. In 2000, we introduced a storage product with a custom ASIC that presented a virtual Fibre Channel environment. This was the first product to introduce a guaranteed Quality of Service (QoS) and was, in effect, a “perfect” disk drive that never varied in latency or availability. We even offered a management API that allowed post-production facilities to guarantee privacy for users through scripted, managed permissions to common archives that could scale to unprecedented sizes.

In 2006, we were the first to offer an InfiniBand (IB) host interface, which allowed SANs to be deployed with a lower-latency interconnect. By 2008, we had developed an architecture that enabled complete virtual file systems to reside in a common memory space with the storage system, completely eliminating the transmission latencies normally associated with a serial Small Computer System Interface (SCSI) transfer.

Initially, we focused on dedicated hardware with field-programmable gate array (FPGA) state machines to guarantee consistency and high performance. The problem with that approach was that we were relying on file systems that were not designed to have the same attributes. A typical file system handles the writing of data in a largely serial process. A V-node entry is created as the file name and is associated with an I-node entry in a file allocation table containing extent lists that indicate the placement of data on devices such as disk drives. The deletion of data releases the blocks that were utilized in the I-node so that they can be gathered for another write operation. As data is written, locks are placed on either block-based segments or entire files to ensure file integrity in the case of two or more users trying to write to the same file at the same time.

Since blocks are gathered in a new I-node for each file creation, and since these blocks have been released from a delete operation, data placement efficiency on disk drives is not a priority. Files can be modified by the inclusion of additional blocks of data, but this operation, again, does not guarantee efficient data placement. The end result is called file system fragmentation, and it decreases efficiency as a file system ages. Typical disk drives are very efficient when used in contiguous operations, but file system fragmentation increases the number of disk seek operations. An average seek operation takes about 15 milliseconds (ms), which is quite long: at 30 frames per second, a frame lasts about 33 ms, so a single seek consumes nearly half the duration of an entire frame of media.
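The seek-time arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 15 ms seek and the 30 frames per second come from the text, while the function names and the idea of counting seeks per frame are illustrative assumptions, not a model of any particular drive or file system.

```python
# Figures taken from the text; everything else here is an illustrative assumption.
AVG_SEEK_MS = 15.0        # average disk seek time quoted above
FRAMES_PER_SECOND = 30.0  # playback rate quoted above


def frame_duration_ms(fps: float) -> float:
    """Time available to deliver one frame, in milliseconds."""
    return 1000.0 / fps


def seek_fraction_of_frame(seek_ms: float, fps: float) -> float:
    """Share of a single frame period consumed by one seek."""
    return seek_ms / frame_duration_ms(fps)


def frame_budget_left_ms(seeks_per_frame: int, seek_ms: float, fps: float) -> float:
    """Time left in a frame period after the given number of seeks.

    A negative result means the seeks alone overrun the frame period,
    before any data has actually been transferred.
    """
    return frame_duration_ms(fps) - seeks_per_frame * seek_ms
```

Under these figures a frame period is about 33.3 ms and one seek uses 45 percent of it. Two seeks leave roughly 3.3 ms for the actual transfer, and a third seek overruns the frame entirely, which is why fragmentation matters so much for real-time playback.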

We started to study file system operations and even counted instructions and machine cycles required to create files of specific lengths. We then started to think about the efficiencies that could be realized if we could consider the data immutable rather than amendable. Considering that a great deal of released and distributed media data is, in fact, immutable, it seemed reasonable to create a file system that was specifically built to efficiently store this type of data.

After several years of development and testing, we released our first object-based file system for immutable data, called the Web Object Scaler (WOS). This data storage software was designed specifically for data placement and retrieval efficiency for immutable media data. The concept of V-nodes in directory structures and I-nodes in file allocation tables was eliminated completely and replaced with a very efficient, single-layer data placement process that does not have to rely on locking for data consistency. Data is purposefully grouped together in “buckets” of like size. For example, all 1 MB objects are kept together such that if one is deleted, another can take that same location without creating fragmentation or any degradation of performance. Each object consists of an assemblage of data, metadata, a checksum for guaranteeing data consistency, and a distribution policy that can be modified over time based upon distribution requirements. Very large objects are stored specifically on multiple disk drives, which can be accessed simultaneously to guarantee consistent delivery. Finally, we developed several methods of failure resilience that go well beyond simple replication and allow orderly data migration, which is critical in the maintenance of large-scale storage systems. The end result is a data storage technology that is so efficient that it can be filled to 99 percent of its capacity and still function perfectly. The addressing scheme allows the storage of 1 trillion objects in effectively one namespace.
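The bucketing idea can be illustrated with a toy in-memory store. This is a minimal sketch under stated assumptions and not how WOS is implemented: the `BucketStore` class, its method names, the use of SHA-256 as the checksum, and the integer object IDs are all hypothetical, and the distribution policy and multi-drive striping described above are left out.

```python
import hashlib
from collections import defaultdict


class BucketStore:
    """Toy store that keeps objects in fixed-size slots grouped by bucket size."""

    def __init__(self, bucket_sizes):
        self.bucket_sizes = sorted(bucket_sizes)
        self.slots = defaultdict(list)  # bucket size -> list of slots (None = free)
        self.free = defaultdict(list)   # bucket size -> indices of free slots
        self.index = {}                 # object id -> (bucket size, slot index)
        self.next_id = 0

    def _bucket_for(self, length):
        # Pick the smallest bucket whose slot size can hold the object.
        for size in self.bucket_sizes:
            if length <= size:
                return size
        raise ValueError("object larger than the largest bucket")

    def put(self, data, metadata=None):
        size = self._bucket_for(len(data))
        record = {
            "data": data,
            "metadata": metadata or {},
            "checksum": hashlib.sha256(data).hexdigest(),
        }
        if self.free[size]:
            # Reuse a slot released by an earlier delete: same size, no fragmentation.
            slot = self.free[size].pop()
            self.slots[size][slot] = record
        else:
            slot = len(self.slots[size])
            self.slots[size].append(record)
        object_id = self.next_id
        self.next_id += 1
        self.index[object_id] = (size, slot)
        return object_id

    def get(self, object_id):
        size, slot = self.index[object_id]
        record = self.slots[size][slot]
        # Verify the stored checksum before returning the data.
        if hashlib.sha256(record["data"]).hexdigest() != record["checksum"]:
            raise IOError("checksum mismatch")
        return record["data"]

    def delete(self, object_id):
        size, slot = self.index.pop(object_id)
        self.slots[size][slot] = None
        self.free[size].append(slot)

    def location(self, object_id):
        return self.index[object_id]
```

With buckets of, say, 1 KB and 1 MB, storing two small objects, deleting the first, and storing a third places the third in exactly the slot the first vacated. Because objects are immutable and every slot in a bucket is the same size, a delete never leaves an awkwardly sized hole, and no per-file locking is needed to modify data in place.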

Of course, media data is created utilizing file systems that allow manipulations and modifications of the data in real time. We believe that there is a need for both types of data storage technologies in a media production facility and have built “bridges” that allow published media segments or entire works to be moved to far more suitable object-based storage. As data sets grow in size, the selection of a specific storage system based upon data distribution and usage will no longer be optional but mandatory to enable manageability and overall site simplification.

The evolution of storage systems to support the creative process is far from over. For example, DDN will be introducing a product that can store data to a massive non-volatile random access memory (NVRAM) structure hosted on a switched system bus, not a storage bus. This will yield even better efficiencies for data-intensive workloads like rendering and real-time editing with preview. We intend to be the first to offer non-volatile memory (NVM) types that are faster, more efficient, and far more resilient than NAND flash. Finally, we will be the first to offer interconnect buses that allow even lower latencies than are possible today.

Stay tuned. The next few years will be some of the most evolutionary and exciting in storage technology to support the media industry.
