Nvidia announces wide accessibility of DGX Cloud, a cloud-based AI supercomputing service

TL;DR:

  • Nvidia announces wide accessibility of DGX Cloud, a cloud-based AI supercomputing service.
  • DGX Cloud offers thousands of virtual Nvidia GPUs on Oracle Cloud Infrastructure (OCI).
  • The purpose-built infrastructure supports training large, complex generative AI models.
  • DGX Cloud simplifies complex infrastructure management, providing a user-friendly “serverless AI” experience.
  • Businesses can remotely access their own AI supercomputer for training, eliminating the need for a supercomputing data center.
  • DGX Cloud enables parallel processing, leading to faster training compared to traditional cloud computing.
  • Companies can establish their “AI center of excellence” with a pool of supercomputing capacity for multiple AI projects.
  • Generative AI’s growth is driving increased demand for accelerated computing infrastructure.
  • DGX Cloud optimizes AI development with Nvidia Base Command Platform and Nvidia AI Enterprise.
  • Amgen leverages DGX Cloud for drug discovery, achieving significant speedups in training and analysis.
  • Rental-based DGX Cloud instances feature powerful Nvidia GPUs with high-performance storage.
  • Nvidia AI Enterprise software facilitates accelerated data science pipelines and production AI development.

Main AI News:

In a notable development for artificial intelligence, Nvidia has announced the broad availability of DGX Cloud, a cloud-based AI supercomputing service. The platform gives users seamless access to thousands of virtual Nvidia GPUs on Oracle Cloud Infrastructure (OCI), with infrastructure hosted in both the United States and the United Kingdom.

The DGX Cloud service was first introduced at Nvidia’s GTC conference in March 2023, and its primary objective is to equip enterprises with the infrastructure and software required for training advanced models in generative AI and other AI-centric fields.

The purpose-built infrastructure behind DGX Cloud is specifically designed for the demands of generative AI, ensuring efficient AI supercomputing for training large, complex models such as language models. Tony Paikeday, Senior Director of DGX Platforms at Nvidia, explained that DGX Cloud adopts a best-of-breed computing architecture, employing large clusters of dedicated DGX Cloud instances interconnected through an ultra-high-bandwidth, low-latency Nvidia network fabric. This setup mirrors how many businesses have successfully deployed DGX SuperPODs on-premises.

Simplifying the management of complex infrastructure, DGX Cloud offers a user-friendly “serverless AI” experience, allowing developers to focus on running experiments, building prototypes, and achieving viable models without the burden of infrastructure concerns.

Paikeday emphasized the transformative impact of DGX Cloud, stating that previously, organizations seeking to develop generative AI models had limited options, mainly relying on on-premises data center infrastructure. With DGX Cloud, any organization can now remotely access their own AI supercomputer for training large, complex models from the convenience of a web browser, without the need to maintain a supercomputing data center.

One of the most significant advantages of DGX Cloud lies in its ability to enable generative AI developers to distribute hefty workloads across multiple compute nodes in parallel, leading to impressive training speedups of two to three times compared to traditional cloud computing.
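The article doesn’t describe the mechanism behind that speedup, but the usual approach is data parallelism: each node computes gradients on its own shard of the data, and the gradients are averaged across nodes (an all-reduce) so every node applies the same update. Below is a minimal pure-Python sketch of that idea; the function names and toy model are illustrative assumptions, not any real DGX Cloud API, and real multi-node training would use a framework such as PyTorch over Nvidia’s network fabric.

```python
# Conceptual sketch of data-parallel training: several "nodes" each
# compute gradients on their own data shard, then average them before
# applying one shared update. All names here are illustrative.

def gradient(w, shard):
    """Gradient of mean squared error for the toy model y = w*x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(grads):
    """Average the per-node gradients, as a network all-reduce would."""
    return sum(grads) / len(grads)

def train_step(w, shards, lr=0.01):
    per_node = [gradient(w, s) for s in shards]  # runs concurrently on real nodes
    return w - lr * all_reduce_mean(per_node)

# Toy data for y = 3x, split round-robin across 4 "nodes".
data = [(x, 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]

w = 0.0
for _ in range(200):
    w = train_step(w, shards)
print(round(w, 2))  # converges toward 3.0
```

Because every shard contributes equally to the averaged gradient, the four “nodes” together behave exactly like one node training on the full dataset, only with the per-step work divided four ways.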

Furthermore, DGX Cloud empowers businesses to establish their own “AI center of excellence,” supporting large developer teams working on numerous AI projects concurrently. This setup allows projects to benefit from a pool of supercomputing capacity that automatically caters to AI workloads as needed.

The significance of generative AI’s growth cannot be overstated, as leading companies across various industries have embraced AI as a business imperative. As a result, the demand for accelerated computing infrastructure has surged, and Nvidia has meticulously optimized the architecture of DGX Cloud to meet these escalating computational demands.

Developers often encounter challenges in data preparation, building initial prototypes, and efficiently utilizing GPU infrastructure. DGX Cloud addresses these concerns through the combination of Nvidia Base Command Platform and Nvidia AI Enterprise, which aim to expedite the journey to production-ready models with accelerated data science libraries, optimized AI frameworks, pretrained AI models, and workflow management software for faster model creation.

A shining example of DGX Cloud’s impact can be seen in biotechnology firm Amgen, which has leveraged the platform to expedite drug discovery. By combining DGX Cloud with Nvidia BioNeMo large language model (LLM) software and Nvidia AI Enterprise software, including Nvidia RAPIDS data science acceleration libraries, Amgen has been able to focus on deeper biology, leaving AI infrastructure and ML engineering concerns behind.

Notably, Amgen reports remarkable achievements, with DGX Cloud enabling the rapid analysis of trillions of antibody sequences, leading to swift developments in synthetic proteins. The company has observed three times faster training of protein LLMs with BioNeMo and up to 100 times faster post-training analysis with Nvidia RAPIDS compared to alternative platforms.

The rental-based DGX Cloud instances each feature eight Nvidia A100 or H100 80GB Tensor Core GPUs, delivering 640GB of GPU memory per node. A high-performance, low-latency fabric enables workloads to scale across interconnected clusters, effectively transforming multiple instances into a unified, massive GPU.
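As a rough illustration of what 640GB per node means in practice, one can estimate the memory footprint of a large model’s training state. The sketch below assumes a common rule of thumb of roughly 16 bytes per parameter for mixed-precision training with an Adam-style optimizer (fp16 weights and gradients plus fp32 optimizer state); this heuristic is an assumption of the example, not a DGX Cloud specification, and it ignores activation memory, so real usage is higher.

```python
import math

BYTES_PER_PARAM = 16          # rough mixed-precision Adam heuristic, not a DGX figure
NODE_GPU_MEMORY_GB = 8 * 80   # eight 80GB GPUs per node = 640GB

def training_state_gb(params_billion):
    """Approximate GB of weight + gradient + optimizer state."""
    return params_billion * 1e9 * BYTES_PER_PARAM / 1e9

def nodes_needed(params_billion):
    """Minimum nodes just to hold that state, if sharded evenly across nodes."""
    return math.ceil(training_state_gb(params_billion) / NODE_GPU_MEMORY_GB)

print(training_state_gb(7))   # 7B params -> 112.0 GB of state: fits on one node
print(nodes_needed(70))       # 70B params -> 1120 GB -> 2 nodes at minimum
```

Even this lower bound shows why training frontier-scale models pushes jobs across many interconnected instances, which is exactly the scaling the low-latency fabric is there to support.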

Additionally, DGX Cloud comes equipped with high-performance storage, ensuring a comprehensive solution for generative AI training. The package also includes Nvidia AI Enterprise, a software layer featuring over 100 end-to-end AI frameworks and pretrained models. This software facilitates accelerated data science pipelines and expedites the development and deployment of production AI.

Paikeday emphasized that DGX Cloud not only provides large computational resources but also enhances data scientists’ productivity and resource utilization. With immediate access to launch several jobs concurrently and run multiple generative AI programs in parallel, supported by Nvidia’s expert team, developers can optimize their code and workloads efficiently.

Conclusion:

Nvidia’s launch of DGX Cloud on Oracle Cloud Infrastructure marks a major advancement in the realm of generative AI training. With its user-friendly approach and parallel processing capabilities, DGX Cloud empowers businesses to harness the full potential of generative AI. This offering is expected to revolutionize the market by enabling enterprises to accelerate AI development and leverage AI-driven insights for significant competitive advantage. Industries across the board are likely to witness remarkable innovations and economic growth as they tap into the transformative power of DGX Cloud.
