18-19 June

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon India 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in India Standard Time (UTC+5:30). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.
Thursday June 18, 2026 12:40pm - 1:10pm IST
As large language models (LLMs) continue to grow in size and demand, single-node inferencing quickly becomes a bottleneck for performance, scalability, and cost. While vLLM has become popular for efficient LLM serving on a single node, it does not fully address the challenges of distributed inferencing across multiple GPUs and nodes in Kubernetes environments.

This talk introduces llm-d, an emerging cloud-native project designed to enable distributed LLM inferencing on Kubernetes. We will cover why vLLM gained popularity and the limitations it faces when scaling beyond a single node. We will then explore how llm-d goes a step further by enabling multi-node, multi-GPU inferencing with cloud-native primitives.

Attendees will learn how llm-d fits into modern Kubernetes platforms and how it improves scalability and resource utilization. The session focuses on practical architecture, design trade-offs, and real-world use cases rather than theory, and includes a demo of how llm-d distributes load.
Speakers
Ravindra Patil

Principal Technical Support Engineer, Red Hat
I am an AI evangelist working on the AI team at Red Hat. I really like to learn and explore how the world can benefit from this AI revolution. I am also very keen on the evaluation of AI models, to make sure that LLMs are bias-free and responsible.
Jasmine 2 (Level 3)
  AI + ML
