Reimagining the Data Center for Generative AI

TL;DR:

  • ChatGPT, an impressive chatbot built on OpenAI’s GPT series of large language models, is transforming AI interactions.
  • Incorporating internal corporate data into generative AI is challenging because building models from scratch is expensive.
  • Fine-tuning existing open-source LLMs on internal data is a more practical and cost-effective solution.
  • Cloud solutions offer diverse training options, but on-site deployment with accelerated hardware remains a viable choice.
  • Strategic planning and collaboration are crucial for making informed decisions in AI investments.

Main AI News:

In the fast-evolving landscape of artificial intelligence, the rise of ChatGPT has been nothing short of remarkable. Built on OpenAI’s GPT series of large language models (LLMs), the ubiquitous chatbot has impressed users with its content-generation capabilities, from answering complex questions to automating tasks such as writing software code and producing marketing copy. This generative AI technology is changing the way we interact with machines.

However, embracing the potential of generative AI within data centers presents unique challenges. Most existing models are trained on publicly available data, making them less suitable for enterprise applications that require querying sensitive internal documents. For enterprises to harness the full potential of these models, incorporating internal corporate data becomes essential. But does this mean starting from scratch?

Building large language models like GPT-3 or GPT-4 from scratch is an expensive undertaking for any data center. The initial training alone demands enormous computing power, often thousands of expensive GPUs clustered together for weeks or even months. The BLOOM model, a 176-billion-parameter alternative to GPT-3, required a staggering 117 days of training on a 384-GPU cluster. As model size increases, so does the demand for more GPUs and specialized training techniques, putting the whole process out of reach for many organizations.

Moreover, running inference on these models around the clock adds further cost, given the high-performance chips it requires. Deploying even 500 of Nvidia’s DGX A100 multi-GPU servers, each carrying a considerable price tag, would be a massive investment for any company, especially one not solely focused on AI. The back-of-the-envelope math below illustrates the scale.
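To put rough numbers on those figures, here is a minimal calculation in Python. The GPU count and training duration come from the BLOOM example above; the per-GPU-hour rate and per-server price are hypothetical placeholders, not quoted prices.

```python
# Back-of-the-envelope cost math. TRAIN_GPUS and TRAIN_DAYS come from the
# BLOOM example; the dollar figures below are hypothetical assumptions.
TRAIN_GPUS = 384
TRAIN_DAYS = 117
gpu_hours = TRAIN_GPUS * TRAIN_DAYS * 24      # ~1.08 million GPU-hours
assumed_rate_usd = 2.00                       # hypothetical cloud price per GPU-hour
print(f"Training: {gpu_hours:,} GPU-hours, roughly ${gpu_hours * assumed_rate_usd:,.0f}")

INFERENCE_SERVERS = 500
assumed_server_usd = 200_000                  # hypothetical price per multi-GPU server
print(f"Inference fleet: roughly ${INFERENCE_SERVERS * assumed_server_usd:,.0f} in hardware alone")
```

Even under conservative assumptions, both lines land in the millions of dollars or more, which is why from-scratch training is cost-prohibitive for most organizations.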

So, what’s the ideal approach to a data center fit for the age of AI? For most organizations, fine-tuning existing open-source LLMs on internal data, such as corporate documents and customer emails, is the more practical and cost-effective path. It is considerably lighter in time, budget, and effort, making it an attractive alternative; a minimal sketch of what that can look like follows.
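The sketch below shows one common shape such fine-tuning takes, using the open-source Hugging Face transformers and datasets libraries. The model name, file path, and hyperparameters are illustrative placeholders for the example, not recommendations from the article.

```python
# A minimal sketch of fine-tuning an open-source causal LM on internal text.
# Assumptions: "gpt2" stands in for any open-source hub model, and
# "internal_docs.txt" is a hypothetical file with one document per line.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("text", data_files={"train": "internal_docs.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# For causal LMs the collator copies input ids into labels; the model shifts them.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="finetuned-internal-llm",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=5e-5,
)

trainer = Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator)
trainer.train()
trainer.save_model("finetuned-internal-llm")
```

A run like this fits on a single GPU for small models, which is exactly the budget contrast with multi-week, multi-hundred-GPU pretraining.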

Hugging Face’s model hub alone hosts over 250,000 open-source models for natural language processing, computer vision, and audio tasks. Leveraging these existing models, often with no training at all, is an excellent starting point for companies exploring AI applications, as the snippet below shows. And for those still intent on building an LLM from scratch, starting small and using managed cloud infrastructure and machine learning services is the more reasonable option.
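As a minimal sketch, a hub model can be pulled down and tried as-is with the transformers pipeline API; the model name and input text here are illustrative, not endorsements.

```python
# Trying an existing hub model with no training at all.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
report = "Replace this with a long internal report or customer email thread..."
print(summarizer(report, max_length=60, min_length=10)[0]["summary_text"])
```

If the off-the-shelf output is close but not quite right, that is usually the signal to move to the fine-tuning approach sketched earlier rather than to pretraining.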

Cloud solutions offer a wide array of training options beyond Nvidia GPUs, including AMD and Intel chips and specialized accelerators such as Google TPUs and AWS Trainium. However, where local regulations restrict cloud usage, on-site deployment with accelerated hardware such as GPUs remains a viable choice.

Planning is a crucial step before committing to AI investments. Working with stakeholders and subject matter experts, technical decision-makers must define a clear strategy that weighs the business case for the investment against the future demands of the workloads. This thoughtful approach lets enterprises make informed decisions about hardware selection, the use of pre-existing models, and the right AI partners for their particular journey.

Conclusion:

The rise of ChatGPT and other generative AI technologies is reshaping the AI landscape. While building large language models from scratch is costly, most organizations can get what they need by fine-tuning existing LLMs on internal data. Cloud solutions provide flexibility, though regulatory constraints can make on-site deployment the better fit in some cases. Strategic planning and collaboration are essential for businesses to capture the benefits of AI and stay adaptable in this rapidly evolving market.
