Doubling Down on AI: The Amazon Way

AWS re:Invent 2024 is in the rearview mirror, but the future of AI in the cloud has never been brighter. In Part 1, we unpacked how AWS is in the midst of its own data center transformation to support AI training and inference workloads, and how WEKA is helping customers build their AI workloads on top of AWS. In Part 2, we dig deeper into our observations on AWS's offerings for building and training AI models and for deploying inference at scale.

Customer Choice Wins the Day

Customer choice was a huge theme that came roaring back at re:Invent this year. AWS has been making this case for years – first about databases (AWS offers six different database engines, many now available in both serverless and provisioned versions), then about analytics tools and AI frameworks, among many other things. The same is true for models – and it's why Amazon Bedrock offers over 100 different models for builders to choose from for their specific use case: maximizing performance, optimizing for cost, generating images and video, fine-tuning models for in-house applications, powering agentic AI, building a RAG pipeline to improve response quality, and many more scenarios.
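To make that choice concrete, here is a minimal sketch of browsing the Bedrock model catalog and sending a prompt to one model via the Converse API. It assumes boto3 with AWS credentials and a region already configured, and that the model ID shown (a placeholder we picked for illustration) is enabled in your account.

```python
import boto3

# Control-plane client: enumerate the foundation models available in this region.
bedrock = boto3.client("bedrock", region_name="us-east-1")
models = bedrock.list_foundation_models()["modelSummaries"]
for m in models[:10]:
    print(m["modelId"], "-", m.get("providerName", "unknown provider"))

# Runtime client: send a prompt to one chosen model via the Converse API.
# The model ID below is a placeholder; swap in any model ID from the listing
# above that is enabled for your account.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
response = runtime.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize what RAG is in two sentences."}]}],
    inferenceConfig={"maxTokens": 256, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```

Swapping models is a one-line change to modelId – which is exactly the kind of flexibility builders are after.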

We couldn’t agree more! At WEKA, we’ve been hard at work thinking about how to provide a high-performance data environment that enables this kind of customer choice. WEKA is already working with many of the leading model providers – like Stability AI, Contextual AI, Synthesia, and many more. They’ve told us that a highly flexible, adjustable data environment is an absolute must-have for rapid development of different types of AI models and for the various approaches to deploying enterprise AI at scale – and that is exactly what we’re delivering.

Because WEKA takes a software-first approach to delivering high-performance data that runs the same on any type of infrastructure – SageMaker HyperPod on AWS, CycleCloud on Azure, Google Cloud, Oracle Cloud, on-premises, and now most of the specialty GPU cloud service providers – customers have the ultimate flexibility in choosing where to build and train their models. And it’s not just where to do the model training; it’s also the choice of compute, where WEKA customers can run across NVIDIA GPUs, Amazon’s custom Trainium and Inferentia accelerators, NVIDIA’s Arm-based Grace CPUs, and much more.

There’s also the flexibility of WEKA’s zero-tuning architecture, which automatically tiers training data between object and flash storage: intelligent tiering algorithms detect changes in the data environment and bring the right data into the performance tier at the right time. This ensures every AI workload (and every other workload, for that matter) gets the right level of performance every time…AND it optimizes for cost at the same time by tiering cold data back down to object storage in S3.
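For readers who like to see the concept in code, here is a deliberately simplified, recency-based tiering heuristic in Python. It is a toy illustration of the general promote-to-flash / demote-to-object idea, not WEKA's actual tiering algorithm; the class, paths, and threshold are all invented for the example.

```python
import time

FLASH, OBJECT = "flash", "object"
HOT_WINDOW_SECONDS = 3600  # toy threshold: data touched in the last hour counts as "hot"

class ToyTieringPolicy:
    """A deliberately simple recency-based tiering heuristic (illustrative only)."""

    def __init__(self):
        self.last_access = {}   # path -> last access timestamp
        self.placement = {}     # path -> FLASH or OBJECT

    def record_access(self, path: str) -> None:
        self.last_access[path] = time.time()
        # Promote on access so the next read is served from the performance tier.
        self.placement[path] = FLASH

    def rebalance(self) -> None:
        """Demote anything that has gone cold back to object storage."""
        now = time.time()
        for path, ts in self.last_access.items():
            if now - ts > HOT_WINDOW_SECONDS:
                self.placement[path] = OBJECT

policy = ToyTieringPolicy()
policy.record_access("/datasets/train/shard-0001.tar")
policy.rebalance()
print(policy.placement)  # {'/datasets/train/shard-0001.tar': 'flash'}
```

The real value of a zero-tuning system is that nobody has to write or babysit a policy like this one; the point of the sketch is only to show what "the right data in the performance tier at the right time" means mechanically.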

Better Performing AI Everywhere

The next exciting set of announcements came around the rest of Amazon Bedrock, where Amazon made impressive strides toward solving some of the most pressing problems with AI models today – hallucination (Automated Reasoning checks), distilling smaller, faster expert models and agents (Model Distillation), agentic AI (multi-agent collaboration), and speed (latency-optimized inference). Again, the exciting thing is that WEKA is already iterating in the inference space, most recently with the introduction of WARRP, the WEKA AI RAG Reference Platform. All of these innovations point to the same goal: the data “firehose” that is WEKA is architected to keep the GPUs in these inference farms fed, drive utilization to the maximum, and ensure customers get the best performance – and the most value – out of their AI infrastructure investment.
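As one concrete example, latency-optimized inference is exposed through a performance configuration on the Bedrock runtime APIs. The sketch below reflects our reading of the launch documentation; it assumes a recent boto3 and a model and region where the feature is available, and the exact parameter name and supported models may change, so check the current docs before relying on it.

```python
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# performanceConfig is our reading of the latency-optimized inference launch docs;
# it requires a recent boto3 and a model/region where the feature is offered.
response = runtime.converse(
    modelId="us.anthropic.claude-3-5-haiku-20241022-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": "Give me a one-line status summary."}]}],
    inferenceConfig={"maxTokens": 128},
    performanceConfig={"latency": "optimized"},
)
print(response["output"]["message"]["content"][0]["text"])
```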

If you found this insightful, helpful, entertaining or anything else, let me know!