<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>2025-08 on Sebastian Scheinkman - Red Hat Openshift, Networking, Kubernetes and Cloud Native</title><link>https://sebasblog.com/archives/2025-08/</link><description>Recent content in 2025-08 on Sebastian Scheinkman - Red Hat Openshift, Networking, Kubernetes and Cloud Native</description><generator>Hugo</generator><language>en-us</language><copyright>sebasblog.com</copyright><lastBuildDate>Sun, 24 Aug 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://sebasblog.com/archives/2025-08/index.xml" rel="self" type="application/rss+xml"/><item><title>Unlocking AI at Scale</title><link>https://sebasblog.com/p/unlocking-ai-at-scale/</link><pubDate>Sun, 24 Aug 2025 00:00:00 +0000</pubDate><guid>https://sebasblog.com/p/unlocking-ai-at-scale/</guid><description>&lt;h1 id="unlocking-ai-at-scale-a-deep-dive-into-rdma-infiniband-and-roce-with-nvidia-mellanox"&gt;&lt;strong&gt;Unlocking AI at Scale: A Deep Dive into RDMA, InfiniBand, and RoCE with NVIDIA Mellanox&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The exponential growth in the scale and complexity of artificial intelligence models, particularly Large Language Models (LLMs), has created an unprecedented communication bottleneck in distributed computing systems. As these models expand beyond the memory capacity of a single GPU or even a single server, they necessitate multi-node clusters where efficient inter-node communication is paramount. In this high-stakes environment, traditional networking stacks like TCP/IP, which have served as the backbone of the internet for decades, are no longer sufficient for the demands of modern AI workloads. The overhead associated with CPU-managed data transfers and protocol processing introduces latency that can cripple the performance of tightly coupled GPU clusters.&lt;/p&gt;</description></item></channel></rss>