Is There an LOSF Monster Lurking in Your Data Center?
Have you heard of the dreaded LOSF beast? While recognized in academia and the industry, like Bloody Mary of folklore, many are afraid to say its name out loud out of fear that it will appear and cripple their digital transformation engines.
While at first glance your data center may look like a safe place, it is rife with danger that can make the hairs on the back of your neck stand up, despite the climate control. Are the blinking lights a subtle M. Night Shyamalan-style warning of costs that may soon spiral out of control? Is the fan noise secretly masking an impending performance or reliability attack?
Another potential horror is the rise of applications that use lots of small files (LOSF). While these apps may not sound scary, they can terrorize your legacy data architecture and have been the demise of many an HPC cluster.
What does this boogeyman actually look like? The ghastly LOSF beast can produce not just millions but billions of small files, each under 1MB in size, and sneakily mix them among large files. To make matters worse, 1MB is the large end: many of the files in these workloads are only a few kilobytes or smaller. LOSF applications are common in social networking sites, electronic design automation, high-performance computing, natural language processing, and life science research.
The LOSF monster feeds on challenges in metadata management, access performance, and storage efficiency to maim and cripple these applications on-premises. Many an LOSF workload has left customers completely paralyzed when they tried to move these “impossible” applications to the cloud.
How the Beast Was Born
The gruesome LOSF monster wasn't conjured with blood magic, maiden sacrifice, or even a creepy board game found in the attic – it spontaneously emerged to take advantage of the curse of legacy data architectures. File system “houses” that originally focused on the HPC space, such as Lustre, GlusterFS, GPFS, and HDFS, were built on ground cursed by slow, cobweb-ridden hard drives. As these large mansions went up on this land, they were limited by an architecture designed for high aggregate bandwidth on large files, including how they dealt with metadata management, data layout, striping design, and cache management.
As new, high-performance, small-file workloads moved into these creepy old houses at the end of the block, eerie warning signs began to appear – random creaking in the data center, sudden hot spots, workloads that suddenly stopped and started, and the blood-curdling wailing of data scientists echoing through the halls.
At the source of the hard drive curse that enabled the emergence of the LOSF beast is metadata management. When a file needs to be read or written, the client asking for it first makes a request to the storage to look up the appropriate location, then it opens that location, does its read/write, and finally closes it. This means that every data movement requires multiple metadata operations, swapping metadata between disk and memory at a tremendous rate and slowly bleeding the architecture's performance dry. This is what attracted the LOSF monster and what sustains it going forward.
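To see the toll this takes, here is a minimal Python sketch (using assumed, hypothetical file counts and sizes, and not a real benchmark) that reads the same total payload twice: once as thousands of small files, and once as a single large file. The small-file pass pays the lookup/open/close metadata toll on every single read.

```python
import os
import tempfile
import time

# Hypothetical LOSF-style workload: 10,000 files of 4 KiB each (~40 MiB total).
NUM_SMALL = 10_000
SMALL_SIZE = 4 * 1024
payload = os.urandom(SMALL_SIZE)

with tempfile.TemporaryDirectory() as root:
    # Write the payload as many small files...
    small_paths = []
    for i in range(NUM_SMALL):
        p = os.path.join(root, f"chunk_{i:05d}.bin")
        with open(p, "wb") as f:
            f.write(payload)
        small_paths.append(p)

    # ...and as one large file of the same total size.
    big_path = os.path.join(root, "one_big_file.bin")
    with open(big_path, "wb") as f:
        for _ in range(NUM_SMALL):
            f.write(payload)

    # Many small files: every read pays lookup + open + read + close.
    t0 = time.perf_counter()
    for p in small_paths:
        with open(p, "rb") as f:
            f.read()
    small_elapsed = time.perf_counter() - t0

    # One large file: a single open/close amortized over the whole payload.
    t0 = time.perf_counter()
    with open(big_path, "rb") as f:
        while f.read(1 << 20):
            pass
    large_elapsed = time.perf_counter() - t0

    print(f"{NUM_SMALL} small files: {small_elapsed:.3f}s")
    print(f"one large file:   {large_elapsed:.3f}s")
```

On a hard-drive-backed file system, each of those per-file lookups is a potential seek, which is exactly the wound the LOSF beast feeds on.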
The Monkey’s Paw
As more LOSF workloads move to the cloud, surely existing cloud storage offerings can tame this beast, right? As with the Monkey's Paw, there are consequences to every wish… While most cloud object stores can handle the scale of billions of objects, their per-object latency is far too high for next-generation workloads. The enterprise storage vendors that have moved to the cloud have simply brought their same cursed architectures along with them, and in some cases have made things worse. Like Freddy without his claws, the ability to tune in the cloud to meet the LOSF beast head-on is limited or lost.
Don’t Abandon All Hope
One of the key tenets of any good horror movie is that you will regret going back into the spooky old house. And no matter how many flash-y exorcists you bring in to stop the beast, it still won't change the decades-old curse. And Jason will be back for yet another sequel.
The WEKA Data Platform isn't burdened by the sins of the past and has the silver bullets to smite the LOSF monster at first sight. WEKA is armed with the fresh approach of creating virtual metadata servers that scale on the fly with every server that is added to the cluster. Along with WEKA's super-secret data layout algorithms, which spread all metadata and data across the cluster in small 4K chunks, this delivers incredibly low latency and high performance whether the IO size is small, large, or a mixture of both, keeping the LOSF monster at bay.
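For illustration only, here is a generic hash-based sharding sketch in Python. It is not WEKA's actual (and proprietary) algorithm; it is a hypothetical example of why spreading ownership of small 4K chunks across every metadata shard in a cluster keeps any single server from becoming the bottleneck.

```python
import hashlib

# Illustrative sketch only: generic hash-based placement of per-file metadata
# and chunks across N shards. NOT WEKA's actual algorithm; it just shows how
# distributing metadata across every node avoids a single-metadata-server
# bottleneck.

CHUNK_SIZE = 4 * 1024  # 4 KiB chunks, mirroring the chunk size in the text

def shard_for(key: str, num_shards: int) -> int:
    """Map a file path (or chunk id) to one of num_shards metadata owners."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def chunk_owners(path: str, file_size: int, num_shards: int) -> list[int]:
    """Each 4 KiB chunk of a file can land on a different shard, so both
    metadata and data load spread more evenly as servers (shards) are added."""
    num_chunks = max(1, -(-file_size // CHUNK_SIZE))  # ceiling division
    return [shard_for(f"{path}#{i}", num_shards) for i in range(num_chunks)]

# Example: growing the cluster immediately spreads the same file wider.
for shards in (4, 8, 16):
    owners = chunk_owners("/data/model/weights.bin", 64 * 1024, shards)
    print(shards, "shards ->", owners)
```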
And because the WEKA Data Platform is software-defined, the exact same technology is used whether on-prem or in the cloud, finally breaking the curse of the Monkey's Paw: no tradeoffs in performance or capacity in the cloud.
For a more detailed – and less creepy – look at the challenge of LOSF, check out our Lots of Small Files solution brief.