Anshul Jindal & Martin Piercy

Aug 20, 2025 • World Congress 2025

Your Next AI Needs 10,000 GPUs. Now What?

Training large language models is a networking problem, not a compute problem. Learn how to keep thousands of GPUs from sitting idle.

#1about 2 minutes

Introduction to large-scale AI infrastructure challenges

An overview of the topics to be covered, from the progress of generative AI to the compute requirements for training and inference.

#2about 4 minutes

Understanding the fundamental shift to generative AI

Generative AI creates novel content, moving beyond prediction to unlock new use cases in coding, content creation, and customer experience.

#3about 6 minutes

Using NVIDIA NIMs and blueprints to deploy models

NVIDIA Inference Microservices (NIMs) and blueprints provide pre-packaged, optimized containers to quickly deploy models for tasks like retrieval-augmented generation (RAG).

#4about 4 minutes

An overview of the AI model development lifecycle

Building a production-ready model involves a multi-stage process including data curation, distributed training, alignment, optimized inference, and implementing guardrails.

#5about 6 minutes

Understanding parallelism techniques for distributed AI training

Training massive models requires splitting them across thousands of GPUs using tensor, pipeline, and data parallelism to manage compute and communication.

#6about 2 minutes

The scale of GPU compute for training and inference

Training large models like Llama requires millions of GPU hours, while inference for a single large model can demand a full multi-GPU server.

#7about 3 minutes

Key hardware and network design for AI infrastructure

Effective multi-node training depends on high-speed interconnects like NVLink and network architectures designed to minimize communication latency between GPUs.

#8about 3 minutes

Accessing global GPU capacity with DGX Cloud Lepton

NVIDIA's DGX Cloud Lepton is a marketplace connecting developers to a global network of cloud partners for scalable, on-demand GPU compute.

The rise of general-purpose GPU computing

01:40 MIN

The rise of general-purpose GPU computing

Accelerating Python on GPUs

Understanding the NVIDIA GB200 supercomputer architecture

02:27 MIN

Understanding the NVIDIA GB200 supercomputer architecture

A Deep Dive on How To Leverage the NVIDIA GB200 for Ultra-Fast Training and Inference on Kubernetes

Accessing software, models, and training resources

01:46 MIN

Accessing software, models, and training resources

Accelerating Python on GPUs

Building an AI factory with all the essential components

03:07 MIN

Building an AI factory with all the essential components

AI Factories at Scale

Overview of the NVIDIA AI Enterprise software platform

01:51 MIN

Overview of the NVIDIA AI Enterprise software platform

Efficient deployment and inference of GPU-accelerated LLMs

The evolution of GPUs from graphics to AI computing

01:24 MIN

The evolution of GPUs from graphics to AI computing

Accelerating Python on GPUs

NVIDIA's platform for the end-to-end AI workflow

01:04 MIN

NVIDIA's platform for the end-to-end AI workflow

Trends, Challenges and Best Practices for AI at the Edge

A look inside the NIM container architecture

02:27 MIN

A look inside the NIM container architecture

Efficient deployment and inference of GPU-accelerated LLMs

Featured Partners

WWC24 - Ankit Patel - Unlocking the Future Breakthrough Application Performance and Capabilities with NVIDIA

WWC24 - Ankit Patel - Unlocking the Future Breakthrough Application Performance and Capabilities with NVIDIA

Ankit Patel

about 2 years ago • World Congress 2024

A Deep Dive on How To Leverage the NVIDIA GB200 for Ultra-Fast Training and Inference on Kubernetes

A Deep Dive on How To Leverage the NVIDIA GB200 for Ultra-Fast Training and Inference on Kubernetes

Kevin Klues

about 7 months ago • World Congress 2025

Efficient deployment and inference of GPU-accelerated LLMs

Efficient deployment and inference of GPU-accelerated LLMs

Adolf Hohl

about 2 years ago • World Congress 2024

Unveiling the Magic: Scaling Large Language Models to Serve Millions

Unveiling the Magic: Scaling Large Language Models to Serve Millions

Patrick Koss

about 7 months ago • World Congress 2025

How AI Models Get Smarter

How AI Models Get Smarter

Ankit Patel

about 8 months ago • World Congress 2025

AI Factories at Scale

AI Factories at Scale

Thomas Schmidt

about 2 years ago • World Congress 2024

Exploring LLMs across clouds

Exploring LLMs across clouds

Tomislav Tipurić

about 7 months ago • World Congress 2025

Generative AI power on the web: making web apps smarter with WebGPU and WebNN

Generative AI power on the web: making web apps smarter with WebGPU and WebNN

Christian Liebel

about 2 years ago • World Congress 2024

Related Articles

View all articles

Daniel Cranney

Stephan Gillich - Bringing AI Everywhere

In the ever-evolving world of technology, AI continues to be the frontier for innovation and transformation. Stephan Gillich, from the AI Center of Excellence at Intel, dove into the subject in a recent session titled "Bringing AI Everywhere," sheddi...

Stephan Gillich - Bringing AI Everywhere

Krissy Davis

The Best Large Language Models on The Market

Large language models are sophisticated programs that enable machines to comprehend and generate human-like text. They have been the foundation of natural language processing for almost a decade. Although generative AI has only recently gained popula...

The Best Large Language Models on The Market

Daniel Cranney

Panel Discussion: Responsible AI in Practice - Real-World Examples and Challenges

IntroductionIn the ever-evolving landscape of artificial intelligence, the concept of "responsible AI" has emerged as a cornerstone for ethical and practical AI implementation. During the WWC24 Panel discussion, three eminent experts—Mina, Bjorn Brin...

Panel Discussion: Responsible AI in Practice - Real-World Examples and Challenges

Adrien Book

How AI Will Eat The World 🤖

Of generative-AI-for-everything and synthetic pleasuresRemember the web3 hype? Tech bros with easy access to cheap liquidity wanted to create a decentralised, peer-to-peer internet powered by blockchain technology. Spoiler alert, it did not work. And...

How AI Will Eat The World 🤖

From learning to earning

Jobs that call for the skills explored in this talk.

Product Owner/Projektleiter (m/w/d)

relyon AG
Tübingen, Germany

Junior

Intermediate

Senior

Scrum

Data Engineer (f/m/d) - AI

smartclip Europe GmbH
Hamburg, Germany

Intermediate

Senior

ETL

Java

Scala

Machine Learning & Data Engineer

vengine GmbH
Hamburg, Germany

Junior

Intermediate

Python

AI Test Architect

Nvidia

Remote

Senior

PyTorch

Tensorflow

Machine Learning

Senior AI Platform Expert Kubernetes GPU/HPC Workloads

BWI GmbH

Senior

Linux

DevOps

Ansible

Terraform

Kubernetes

AI & Embedded ML Engineer (Real-Time Edge Optimization)

autonomous-teaming

Remote

GIT

Linux

PyTorch

Senior HPC Performance Engineer

Nvidia

Remote

Senior

Docker

Ansible

PyTorch

Tensorflow

+1

AI Engineer - Generative AI /pixelhead)

Conrad Electronic SE

Forward Deployed Engineering Manager, GenAI Applications

Scale AI

Remote

Intermediate

DevOps