“At Together AI, we are obsessed with speed and efficiency. That’s why we built the Together Inference Engine that provides the fastest inference speeds in the industry. We are excited to leverage WEKA’s Augmented Memory Grid capability to reduce the time involved in prompt caching and improve the flexibility of leveraging this cache across multiple nodes, reducing latency and benefitting the more than 500,000 AI developers building on Together AI.”
Ce Zhang, Chief Technology Officer at Together AI.