DeepInfra Secures $8 Million Seed Round to Disrupt the AI Inference Arena

TL;DR:

  • DeepInfra offers an innovative solution for deploying large language model (LLM) chatbots at just $1 per 1 million tokens, undercutting competitors.
  • The company raised an $8 million seed round led by A.Capital and Felicis and is gaining recognition for its efficient infrastructure.
  • DeepInfra primarily targets small-to-medium-sized businesses (SMBs), providing access to open-source language models and machine learning models.
  • The founders’ expertise in managing servers globally positions DeepInfra as a market disruptor with a potential 10x cost advantage.
  • Open-source LLMs are gaining traction due to lower costs and customization options, and DeepInfra is well-positioned to drive this trend forward.
  • Longer context models are expected to dominate future AI applications, offering versatility and sophistication.
  • DeepInfra’s commitment to data privacy and security enhances its appeal to enterprises.

Main AI News:

In the fast-evolving landscape of AI and machine learning, business leaders and IT decision-makers are constantly seeking cost-effective ways to deploy large language model (LLM) chatbots for their employees and customers. The question that often arises is, “How do you launch it, and what’s the price tag?”

DeepInfra, a pioneering company founded by former engineers from IMO Messenger, has emerged from stealth mode to provide a clear answer to these pressing questions. Their proposition is simple yet revolutionary: they take care of setting up the models on private servers for their clients, all at an incredibly low rate of $1 per 1 million tokens in or out. This stands in stark contrast to the $10 per 1 million tokens charged by OpenAI’s GPT-4 Turbo and the $11.02 per 1 million tokens by Anthropic’s Claude 2.
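For a sense of scale, the short script below applies the quoted per-million-token rates to a hypothetical workload. The 500 million tokens per month is an illustrative assumption, not a figure from the article or from DeepInfra:

```python
# Per-million-token rates quoted in the article (USD).
rates = {
    "DeepInfra": 1.00,
    "OpenAI GPT-4 Turbo": 10.00,
    "Anthropic Claude 2": 11.02,
}

monthly_tokens = 500_000_000  # hypothetical workload: 500M tokens/month

for provider, rate in rates.items():
    cost = monthly_tokens / 1_000_000 * rate
    print(f"{provider:>20}: ${cost:>8,.2f}/month")
```

At that assumed volume, the bill comes to $500 versus roughly $5,000 to $5,510 per month, which is the 10x cost gap the article goes on to cite.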

VentureBeat exclusively broke the news about DeepInfra, highlighting their successful $8 million seed round led by A.Capital and Felicis. The company’s mission is to offer a range of open-source model inferences, including Meta’s Llama 2 and CodeLlama, along with customized versions of these and other open-source models.

“We wanted to provide CPUs and a low-cost way of deploying trained machine learning models,” explained Nikola Borisov, DeepInfra’s Founder and CEO, in an interview with VentureBeat. “We already saw a lot of people working on the training side of things, and we wanted to provide value on the inference side.”

The Infrastructural Challenge

While considerable attention has gone to the GPU resources required for training LLMs, the challenge of running these models efficiently in production, known as inference, has received far less. According to Borisov, the main difficulty in serving these models is accommodating a significant number of concurrent users on the same hardware and model simultaneously. Large language models generate tokens one at a time, and each token demands substantial computation and memory bandwidth. The key, then, is scheduling requests so as to avoid redundant computation and keep server capacity fully utilized.
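To make that serving problem concrete, the sketch below simulates the continuous-batching idea in pure Python: concurrent requests share one decode loop, each step emits one token per active request, and finished requests immediately free their slot for waiting ones. The scheduler, batch size, and request lengths are all illustrative assumptions, not DeepInfra’s actual serving stack:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    rid: int
    tokens_needed: int  # tokens this request wants generated
    tokens_done: int = 0

def serve(requests, max_batch_size=4):
    """Simulate one accelerator's decode loop with continuous batching."""
    waiting = deque(requests)
    active = []
    steps = 0
    while waiting or active:
        # Admit waiting requests into any free batch slots.
        while waiting and len(active) < max_batch_size:
            active.append(waiting.popleft())
        # One decode step: every active request emits exactly one token.
        # The per-step cost is roughly constant, so batching amortizes it
        # across all concurrent users instead of paying it per user.
        for r in active:
            r.tokens_done += 1
        steps += 1
        # Retire finished requests, freeing their slots immediately.
        active = [r for r in active if r.tokens_done < r.tokens_needed]
    return steps

lengths = [5, 3, 8, 2, 6]
reqs = [Request(rid=i, tokens_needed=n) for i, n in enumerate(lengths)]
print("decode steps with batching:", serve(reqs))  # 8 for this toy input
print("decode steps one at a time:", sum(lengths))  # 24
```

In the toy run, five requests finish in 8 shared decode steps instead of 24 sequential ones, which is the kind of utilization gain the paragraph describes.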

DeepInfra’s Unique Expertise

DeepInfra’s competitive edge stems from its founders’ extensive experience in managing large fleets of servers in data centers worldwide. Aydin Senkut, founder and managing partner of Felicis, attested to their exceptional capabilities, comparing them to “international programming Olympic gold medal winners.” Senkut recognized the potential of DeepInfra’s efficiency in building server infrastructure and compute resources, which enables the company to offer its services at such cost-effective rates.

Affordability as a Market Disruptor

Affordability is a pivotal factor in the adoption of AI and LLMs. DeepInfra’s ability to provide up to a 10x cost advantage positions it as a significant disruptor in the market. As Senkut pointed out, while the potential of AI is widely acknowledged, cost remains a major concern. The cost advantage not only benefits DeepInfra but also empowers its customers to harness LLM technology affordably in their applications.

Targeting SMBs with Open-Source AI

Initially, DeepInfra intends to cater to small-to-medium-sized businesses (SMBs) with its inference hosting services, as these entities are often more budget-conscious. Their focus is on providing access to cutting-edge open-source language models and machine learning models.

DeepInfra anticipates staying closely connected to the open-source AI community, tracking new model releases and tuning efforts. Borisov emphasized the potential for diverse variants of large language models to emerge with minimal computational requirements, thanks to the open-source ecosystem’s collaborative nature.

A Bright Future for Open-Source LLMs

The rise of open-source large language models and generative AI has been remarkable, largely due to their cost-effectiveness and flexibility. DeepInfra is poised to play a pivotal role in this trend by continuously onboarding new models and meeting evolving demands.

Borisov believes that longer context models will shape the future, catering to increasingly sophisticated use cases. Additionally, DeepInfra’s commitment to data privacy and security is expected to appeal to enterprises concerned about safeguarding their sensitive information. Borisov said that the company neither stores nor uses any of the input prompts, further enhancing its appeal in a data-conscious world.

Conclusion:

DeepInfra’s emergence as an affordable AI inference provider is poised to disrupt the market significantly. With its cost-effective solutions, efficient infrastructure, and commitment to open-source AI, it caters to the growing demand for accessible and customizable AI applications. The rise of open-source LLMs, coupled with DeepInfra’s expertise, is reshaping the AI landscape, making advanced AI technology more practical and affordable for businesses across the board.
