
WEKA File System Tiering on Microsoft Azure

A WEKA technical demo showing performance in the Azure cloud with tiering, including cold reads from Azure hot blob.


Welcome to the WEKA demo series. Today we’re going to be demonstrating WEKA data platform performance in Microsoft Azure.

WEKA makes transparent tiering a reality by using our high-performance parallelization and advanced feature set to leverage object storage as a tier, allowing you to take advantage of the economics that object stores bring to the table.

For this demonstration, the environment is entirely inside Azure.
We’re going to show you how WEKA is configured for tiering, and then we’ll measure performance in a number of ways, including how WEKA can even handle cold reads of tiered data from Azure Blob at very high performance.

First, let’s log in. The WEKA deployment consists of just under 30TB of SSD capacity. We can see that the default file system is provisioned with 25TB of SSD capacity and that tiering is enabled, as indicated by the blue icon displayed in the tiering column.

Using the CLI on a client, we see that our default file system is mounted at /mnt/weka. The total size of the file system is 250TB thanks to tiering, with only 25TB of SSD capacity provisioned. Let’s quickly resize the file system, reducing the total size from 250TB to 200TB. We’ll run the df command again to see that the change is immediately reflected on our client.

Let’s edit the file system one last time and increase the total capacity just a little bit, to 1 exabyte, while keeping the SSD at 25TB. Once again, the client sees this change immediately. Total file system capacity is thin provisioned, so showing an exabyte of capacity won’t bill you for an exabyte until you actually fill the space.
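For reference, the same resize can be driven from the WEKA CLI. The commands below are a sketch: the flag names follow the standard weka fs update syntax but should be verified against your WEKA version, and the file system name (default) and mount path are taken from this demo.

    # Check capacity as the client sees it
    df -h /mnt/weka

    # Shrink total capacity to 200TB, keeping 25TB of SSD
    # (flag names assume the standard WEKA CLI; see weka fs update --help)
    weka fs update default --total-capacity 200TB --ssd-capacity 25TB

    # Grow total (thin-provisioned) capacity toward an exabyte, SSD unchanged
    # (capacity unit suffixes may vary by version)
    weka fs update default --total-capacity 1EB --ssd-capacity 25TB

    # The client reflects each change immediately
    df -h /mnt/weka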

Single-client performance section
Now let’s demonstrate single-client performance. We’re going to monitor performance from the WEKA web interface, and we’ll use FIO to simulate load on the WEKA file system by writing a set of 200 files of 500 megabytes each, using a one megabyte block size.
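For those following along, here is a minimal sketch of the kind of FIO job being described: 200 files of 500 megabytes each, written sequentially in one megabyte blocks against the WEKA mount. The paths and tuning options are illustrative rather than the exact command used in the demo.

    # 200 files x 500MB = ~100GB total, 1MB sequential writes
    fio --name=weka-write --directory=/mnt/weka/fio \
        --rw=write --bs=1m --nrfiles=200 --size=100g \
        --ioengine=libaio --direct=1 --iodepth=16 \
        --group_reporting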

We can see that our single client is writing to WEKA at approximately 3.5 to 4GB per second, and we can also see from core usage and utilization that WEKA is far from saturated from a performance perspective.

If we switch to a more detailed view, you can see that WEKA is using its parallelization to distribute data across all hosts in the system, preventing any hotspots.

By the end of the command, we have written just over 100GB of data to our WEKA file system. Now let’s adjust the FIO command: before, we issued one megabyte writes; this time we’ll do one megabyte reads so we can measure our read bandwidth from a single client. The web interface now reflects the new read IO performance.
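The read pass is the same job with the direction flipped; keeping the same job name lets FIO read the files laid down by the write pass instead of creating new ones (again, illustrative options, not the exact demo command).

    fio --name=weka-write --directory=/mnt/weka/fio \
        --rw=read --bs=1m --nrfiles=200 --size=100g \
        --ioengine=libaio --direct=1 --iodepth=16 \
        --group_reporting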

As you can see, a single client is able to achieve over 10GB per second reading against the data set, while still using minimal resources.

Multi-client performance section
While single-client performance is nice, we’ll now want to use multiple clients to see what performance the cluster can deliver.

We’ll be able to issue commands on multiple clients from this one CLI window. We can see that we have six clients in total, all identical to the client we just tested. Let’s fire off our FIO command so that all six clients begin writing files to our WEKA cluster at the same time. Almost immediately, we can see WEKA is delivering up to 24GB per second of write throughput.
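The demo drives all six clients from one window. One way to reproduce that is FIO’s built-in client/server mode; the hostnames below are placeholders, and the job file mirrors the single-client options shown earlier.

    # On each of the six clients, start an FIO server:
    fio --server

    # From a coordinating node, run the same write job on all six clients at once
    # (client1..client6 are placeholder hostnames):
    fio --client=client1 --client=client2 --client=client3 \
        --client=client4 --client=client5 --client=client6 \
        weka-write.fio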

Remember, this performance is from a relatively small amount of SSD, roughly 29TB.

If we switch windows, you can see the performance per client, and that the file system is growing.

Switching back, we can observe core utilization and see that we aren’t fully saturating our WEKA system. This deployment is capable of more performance if we had more clients to put load on it.

Now let’s adjust FIO and see how much read throughput can be achieved with the six clients. We can see the breakout of performance from the top consumers tab in the web interface. It looks like our six clients are pushing approximately 30GB per second of aggregate bandwidth.

Single-client 4K section
Now that we’ve measured bandwidth on both single and multiple clients, let’s do the same for 4K IOPS.

We’ll issue another FIO command, and this time we’ll use a single client to write in 4K blocks instead of the one megabyte block size we used for throughput. We see the single client is able to push nearly 375,000 IOPS at under 400 microseconds of latency.

If we adjust FIO to do 4K reads, we see WEKA delivering over 420,000 IOPS at under 300 microseconds of latency to our client. Impressive.
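Here is a sketch of the 4K jobs: small random IO with enough outstanding requests to expose IOPS and latency. The queue depth and job count are illustrative and may differ from the demo’s exact parameters.

    # 4K random writes from a single client
    fio --name=weka-4k --directory=/mnt/weka/fio4k \
        --rw=randwrite --bs=4k --size=10g \
        --ioengine=libaio --direct=1 --iodepth=32 --numjobs=8 \
        --group_reporting

    # 4K random reads of the same files: flip the direction, keep the job name
    fio --name=weka-4k --directory=/mnt/weka/fio4k \
        --rw=randread --bs=4k --size=10g \
        --ioengine=libaio --direct=1 --iodepth=32 --numjobs=8 \
        --group_reporting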

Multi-client 4K section
As we did before when measuring bandwidth, we’ll now use six clients to push an aggregate 4K IOPS load. We can see from the GUI that WEKA is pushing over 2 million IOPS.

From the CLI, you can see that the real number is actually closer to 2.4 million IOPS.
Our core utilization is in the high 90s, and we’re approaching the performance limit of this small cluster.

Now, we could easily add more backend storage servers to this deployment and see linear performance gains. But if this is already more than enough performance, you could just as easily scale the system exclusively with Azure hot blob storage.

FS create and Object direct
Now, let’s show you a feature of WEKA tiering that allows IO to go directly against the object store. We’ll create a new WEKA file system within the cluster and use object direct mounting to access the hot blob capacity.

We’ll navigate to Manage File systems and create a new file system. Since this file system will simply serve as a multi-protocol fast cache in front of Azure Blob, we don’t need to provision very much SSD capacity. SSD will only be used to store new writes along with any file system metadata.

We will enable tiering and provision it with our Azure Blob bucket.

The file system is immediately created and is available to mount.
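The same file system can also be created from the CLI. The sketch below assumes the Azure hot blob container is already attached to the cluster as an object store named azure-blob; the file system name, group, capacities, and flags are illustrative and should be checked against your WEKA version.

    # Small SSD footprint, tiering enabled against the already-attached
    # Azure hot blob bucket registered as "azure-blob"
    weka fs create blobfs default 100TB --ssd-capacity 1TB --obs-name azure-blob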

To mount the file system on our clients, we’ll need to perform a couple of steps. First, let’s create the local mount path on all of our clients where this file system will be mounted.

Next, we’ll issue the mount command with an option that directs the file system to ignore any timed tiering policies and immediately move data to Azure hot blob as soon as it lands on SSD. The metadata will always remain on SSD; however, any future reads of the data through an object direct mount will never be cached on SSD. Instead, WEKA will transparently pass IO between the client and Azure Blob.

Let’s run the mount command across our clients so you can see the difference between the original and the new object direct mounts. Again, object direct mounts provide a very fast write cache while immediately evacuating any written data to the Azure hot blob bucket linked to the file system.
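A sketch of the client-side steps follows. The mount option name obs_direct reflects WEKA’s object-direct mount mode but should be confirmed in the documentation for your version; the backend hostname and file system name are placeholders.

    # Create the local mount point on every client
    mkdir -p /mnt/blobfs

    # Mount the new file system in object-direct mode: writes are evacuated to the
    # attached object store as soon as they land on SSD, and reads are never cached on SSD
    mount -t wekafs backend-0/blobfs /mnt/blobfs -o obs_direct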

Write performance to object direct mounts
Now let’s measure the write performance to the object direct mount. Same as before, we’re going to create IO by writing to the object direct mount in one megabyte blocks. We’ll use all six of our clients to show total throughput of this file system.

We can see that the performance is no different from a non-object-direct mount. We’re able to achieve over 21GB per second of write bandwidth.
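The benchmark itself is unchanged; only the target directory points at the object-direct mount. Here is a sketch of the job file passed to the six FIO clients (path and file name are illustrative):

    # weka-write-blobfs.fio - same options as the earlier write job,
    # aimed at the object-direct mount
    [weka-write]
    directory=/mnt/blobfs/fio
    rw=write
    bs=1m
    nrfiles=200
    size=100g
    ioengine=libaio
    direct=1
    iodepth=16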

As stated before, WEKA is evacuating the data to Azure hot blob, as shown by the OBS upload column.
WEKA will continue to do this until only the metadata remains on SSD. With object direct, you’ll need to measure your write rate and your upload rate to Blob in order to size the SSD write buffer with enough room for incoming writes. In this demo, for example, the clients wrote at over 21GB per second while uploads drain at well under 1GB per second, so the buffer has to absorb roughly that difference for as long as a write burst lasts.

Automatic evacuation of data
Now, let’s monitor automatic data tiering from SSD to Azure Hot Blob.

You’ll see from the web interface that our file system contains just over 630 gigabytes of data. We’ll speed up the video while it automatically tiers all of the written data to Azure hot blob. All that should remain on SSD once the file system is fully tiered is the metadata.
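One simple way to watch the drain from a CLI is to poll the file system capacity listing; the command below assumes the standard weka fs listing, and its output columns may vary by version.

    # Poll SSD vs. total usage every 10 seconds while data tiers to Azure hot blob
    watch -n 10 weka fs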

While results may vary, this deployment is observing roughly 700 to 800MB per second of upload speed to Azure hot blob.

As the tiering finishes, you’ll see that the only remaining consumption on SSD is the 2.46GB of metadata.

Cold reads from object
Now that our data resides on Blob, let’s measure how quickly it can be accessed. This is a key measurement for use cases involving large amounts of data. Customers want confidence that if they have hundreds of petabytes of WEKA backed by an object store, they can bring huge amounts of cold data back very quickly. We’ll benchmark performance the same way we did in the prior tests, with load across all six of our clients.

As before, we’ll read in one megabyte blocks. We’ll configure the command to run for 90 seconds so that we capture the sustained bandwidth from Azure hot blob to WEKA. Typically, reads would trigger the data to be cached on SSD, but since this is an object direct mount, reads will always be pulled from Blob.
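A sketch of the time-bounded read job: time_based with a 90 second runtime keeps FIO reading for the full window regardless of how much data one pass covers. Options are illustrative, run from all six clients as before.

    fio --name=weka-write --directory=/mnt/blobfs/fio \
        --rw=read --bs=1m --nrfiles=200 --size=100g \
        --ioengine=libaio --direct=1 --iodepth=16 \
        --time_based --runtime=90 \
        --group_reporting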

We can see almost immediately that we’re reading data from Azure Blob at speeds up to 20GB per second. And, while it does vary a little bit, the performance is rather impressive, especially considering the economic advantage of storing data on Azure Hot Blob.

Looking at the details, we see that used SSD capacity remains at 2.46GB, which is the size of our metadata. Also note that servicing of the tiered data is fully balanced across every WEKA backend storage node. This balance helps WEKA achieve optimal performance from Azure Blob and other object stores. Each WEKA backend is responsible for an equal portion of the namespace, and each performs the appropriate API operations when interacting with tiered data.

We can also see that each of the six clients is achieving approximately 2 to 3.5 gigabytes per second of reads from Azure Blob. This means that with WEKA, you can achieve NFS-like performance from cold reads of your data stored in Azure Blob.

Closing
WEKA object direct tiering works with POSIX and NFS mounts and SMB shares, giving you great economics and performance no matter which applications need the data. Object direct mounts are a great option when you want to preserve SSD capacity for your most demanding workloads.

In addition to object direct, WEKA provides enterprise storage features like quotas, snapshots, multi-protocol access, backup and disaster recovery, and more. For more information, check out www.weka.io, and have a great day!