When DeepSeek-R1 first emerged, the prevailing fear that shook the industry was that advanced reasoning could be achieved with less infrastructure.
As it turns out, that’s not necessarily the case. At least, according to Together AI, the rise of DeepSeek and open-source reasoning has had the exact opposite effect: Instead of reducing the need for infrastructure, it is increasing it.
That increased demand has helped fuel the growth of Together AI’s platform and business. Today the company announced a $305 million series B round of funding, led by General Catalyst and co-led by Prosperity7. Together AI first emerged in 2023 with an aim to simplify enterprise use of open-source large language models (LLMs). The company expanded in 2024 with the Together enterprise platform, which enables AI deployment in virtual private cloud (VPC) and on-premises environments. In 2025, Together AI is growing its platform once again with reasoning clusters and agentic AI capabilities.
The company claims that its AI deployment platform has more than 450,000 registered developers and that the business has grown 6X overall year-over-year. The company’s customers include enterprises as well as AI startups such as Krea AI, Captions and Pika Labs.
“We are now serving models across all modalities: language and reasoning and images and audio and video,” Vipul Prakash, CEO of Together AI, told VentureBeat.
The huge impact DeepSeek-R1 is having on AI infrastructure demand
DeepSeek-R1 was hugely disruptive when it first debuted for a number of reasons, one of which was the implication that a leading-edge open-source reasoning model could be built and deployed with less infrastructure than a proprietary model.
However, Prakash explained, Together AI has grown its infrastructure in part to support increased demand for DeepSeek-R1-related workloads.
“It’s a fairly expensive model to run inference on,” he said. “It has 671 billion parameters and you need to distribute it over multiple servers. And because the quality is higher, there’s generally more demand on the top end, which means you need more capacity.”
Additionally, he noted that DeepSeek-R1 generally has longer-lived requests that can last two to three minutes. Tremendous user demand for DeepSeek-R1 is further driving the need for more infrastructure.
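To see why a 671-billion-parameter model has to be spread over multiple servers, consider a rough back-of-the-envelope sketch. It counts only the memory needed to hold the weights. The bytes per parameter, the 80 GB of memory per GPU and the eight GPUs per server are illustrative assumptions rather than figures from Together AI, and a real deployment would also need room for the KV cache and activations.

```python
import math

def min_servers_for_weights(num_params, bytes_per_param,
                            gpu_memory_gb=80, gpus_per_server=8):
    """Lower bound on servers needed just to hold the model weights.

    gpu_memory_gb and gpus_per_server are assumed values for illustration,
    not a description of any specific provider's hardware.
    """
    weights_gb = num_params * bytes_per_param / 1e9
    server_gb = gpu_memory_gb * gpus_per_server
    return math.ceil(weights_gb / server_gb)

R1_PARAMS = 671e9  # parameter count quoted in the article

# Assumed 1 byte per parameter (8-bit weights): 671 GB of weights
print(min_servers_for_weights(R1_PARAMS, bytes_per_param=1))  # 2

# Assumed 2 bytes per parameter (16-bit weights): 1,342 GB of weights
print(min_servers_for_weights(R1_PARAMS, bytes_per_param=2))  # 3
```

Even under these generous assumptions, the weights alone do not fit on a single eight-GPU server, which is consistent with Prakash's point about distributing the model.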
To meet that demand, Together AI has rolled out a service it calls “reasoning clusters” that provision dedicated capacity, ranging from 128 to 2,000 chips, to run models at the best possible performance.
How Together AI is helping organizations use reasoning AI
There are a number of specific areas where Together AI is seeing usage of reasoning models. These include:
- Coding agents: Reasoning models help break down larger problems into steps.
- Reducing hallucinations: The reasoning process helps to verify the outputs of models, thus reducing hallucinations, which is important for applications where accuracy is crucial.
- Improving non-reasoning models: Customers are distilling and improving the quality of non-reasoning models.
- Enabling self-improvement: The use of reinforcement learning with reasoning models allows models to recursively self-improve without relying on large amounts of human-labeled data.
Agentic AI is also driving increased demand for AI infrastructure
Together AI is also seeing increased infrastructure demand as its users embrace agentic AI.
Prakash explained that agentic workflows, where a single user request results in thousands of API calls to complete a task, are putting more compute demand on Together AI’s infrastructure.
To help support agentic AI workloads, Together AI recently acquired CodeSandbox, whose technology provides lightweight, fast-booting virtual machines (VMs) that securely execute arbitrary code within the Together AI cloud, where the language models also reside. This allows Together AI to reduce the latency between the agentic code and the models that need to be called, improving the performance of agentic workflows.
Nvidia Blackwell is already having an impact
All AI platforms are facing increased demands.
That’s one of the reasons why Nvidia keeps rolling out new silicon that provides more performance. Nvidia’s latest chip is the Blackwell GPU, which is now being deployed at Together AI.
Prakash said Nvidia Blackwell chips cost around 25% more than the previous generation, but provide 2X the performance. The GB200 platform with Blackwell chips is particularly well-suited for training and inference of mixture-of-experts (MoE) models, which are trained across multiple InfiniBand-connected servers. He noted that Blackwell chips are also expected to provide a bigger performance boost for inference of larger models, compared to smaller models.
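Taken at face value, the two figures Prakash cited imply a gain in performance per dollar. The short sketch below works through that arithmetic. It assumes the 25% and 2X figures apply to the same workload and ignores power, networking and other operating costs, so it is an illustration rather than a pricing model.

```python
def perf_per_dollar_gain(cost_ratio, perf_ratio):
    """Relative performance per dollar versus the previous generation.

    cost_ratio: new chip cost divided by old chip cost (1.25 = 25% more).
    perf_ratio: new chip performance divided by old chip performance.
    """
    return perf_ratio / cost_ratio

# Figures as quoted in the article: about 25% more cost, 2X the performance
gain = perf_per_dollar_gain(cost_ratio=1.25, perf_ratio=2.0)
print(f"{gain:.2f}x performance per dollar")  # 1.60x performance per dollar
```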
The competitive landscape of agentic AI
The market of AI infrastructure platforms is fiercely competitive.
Together AI faces competition from both established cloud providers and AI infrastructure startups. All the hyperscalers, including Microsoft, AWS and Google, have AI platforms. There is also an emerging category of AI-focused players such as Groq and SambaNova that are all aiming for a slice of the lucrative market.
Together AI has a full-stack offering, including GPU infrastructure with software platform layers on top. This allows customers to easily build with open-source models or develop their own models on the Together AI platform. The company also has a research focus on developing optimizations and accelerated runtimes for both inference and training.
“For instance, we serve the DeepSeek-R1 model at 85 tokens per second and Azure serves it at 7 tokens per second,” said Prakash. “There is a fairly widening gap in the performance and cost that we can provide to our customers.”
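To put those throughput claims in perspective, the sketch below converts tokens per second into the time a user would wait for a response. The 1,000-token response length is a hypothetical value chosen for illustration, and the calculation assumes the quoted rates are steady per-request generation speeds, which the article does not specify.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to generate num_tokens at a constant decoding rate."""
    return num_tokens / tokens_per_second

RESPONSE_TOKENS = 1000  # hypothetical response length, not from the article

# Rates as claimed by Prakash in the quote above
fast = generation_seconds(RESPONSE_TOKENS, 85)
slow = generation_seconds(RESPONSE_TOKENS, 7)

print(f"At 85 tokens/s: {fast:.1f} s")   # At 85 tokens/s: 11.8 s
print(f"At 7 tokens/s: {slow:.1f} s")    # At 7 tokens/s: 142.9 s
print(f"Speedup: {slow / fast:.1f}x")    # Speedup: 12.1x
```

For reasoning models, which often emit long chains of intermediate tokens before the final answer, a gap of this size would separate a wait of seconds from a wait of minutes.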