OpnCrafter
Module 2 of 10: Generative UI

React Server Components & Streaming UI

Jan 2, 2026 • 20 min read

Generative UI requires a rendering model that sends content to the browser as it's produced, not after everything is ready. A standard HTTP request waits until the full response is assembled, then sends it all at once — this is catastrophic for LLM responses that stream tokens over 5-10 seconds. React Server Components with HTTP chunked transfer encoding solve exactly this problem: the server sends UI chunks to the browser incrementally, making LLM-powered interfaces feel alive and responsive.

1. The Core Problem: Why Streaming Matters Psychologically

Users have well-calibrated latency expectations from decades of web browsing. Even a 2-second delay feels unacceptable for most interactions — users assume something is broken and refresh or abandon. LLMs take 2-10 seconds to produce substantive responses. The solution isn't making LLMs faster (that's an infrastructure problem) — it's making the wait feel shorter through progressive disclosure.

❌ Without streaming

User submits → 0s: blank screen
→ 2s: still blank
→ 5s: user thinks it's broken
→ 7s: full response appears
Perceived as slow or broken, even when the backend is fast

✅ With streaming

User submits → 0ms: skeleton appears
→ 200ms: "Analyzing your request..."
→ 1s: text starts streaming in
→ 5s: chart appears
Same total time, feels instant

2. HTTP Chunked Transfer Encoding: The Technical Foundation

# Standard HTTP request (atomic — full response before sending):
POST /api/chat → Server processes everything → HTTP/1.1 200 OK + full body → Browser renders

# Chunked Transfer Encoding (streaming - sends pieces as ready):
POST /api/chat → Server begins work → HTTP/1.1 200 OK
                                       Transfer-Encoding: chunked
                                       
                                       → Chunk 1: <Skeleton /> shell markup (50ms)
                                       → Chunk 2: "Searching database..." (200ms)
                                       → Chunk 3: token by token text (1000ms)
                                       → Chunk 4: <StockChart /> (3000ms)
                                       → 0 (final zero-length chunk = stream complete)

# In Next.js App Router, streaming happens automatically:
# - Server Components can use async/await — their output streams as resolved
# - Suspense boundaries define what streams together vs. independently
# - React's streaming renderer (renderToPipeableStream) handles the chunking

# Verification: Open Chrome DevTools → Network
# Look for the chat POST request → Response tab
# You'll see the RSC payload format (not readable JSON):
# 0:["$@1",["$","div",null,{"children":"Loading..."}]]
# 1:"$Sreact.suspense"
# (This is the React Flight format — correctly indicates streaming is working)
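The same mechanism is easy to observe outside Next.js. Below is a minimal sketch in plain Node.js (the strings, delays, and the `streamDemo` helper are all made up for illustration): a server streams three chunks with artificial delays, and a `fetch` client reads each chunk as it arrives instead of waiting for the full body.

```typescript
import http from "node:http";
import { once } from "node:events";
import type { AddressInfo } from "node:net";

// A server that streams three chunks, and a client that reads them
// incrementally. Node switches to Transfer-Encoding: chunked on its own
// when we write incrementally without setting a Content-Length.
async function streamDemo(): Promise<string[]> {
  const server = http.createServer(async (_req, res) => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    for (const piece of ["skeleton|", "searching...|", "final text"]) {
      res.write(piece); // each write is flushed as its own HTTP chunk
      await new Promise((r) => setTimeout(r, 30));
    }
    res.end(); // sends the terminating zero-length chunk
  });
  server.listen(0);
  await once(server, "listening");
  const { port } = server.address() as AddressInfo;

  // Client side: read the body chunk by chunk as it arrives.
  const res = await fetch(`http://127.0.0.1:${port}/`);
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  const chunks: string[] = [];
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    chunks.push(decoder.decode(value, { stream: true }));
  }
  server.close();
  return chunks;
}

const chunks = await streamDemo();
console.log(chunks.length, "chunks received");
```

The key point is on the server: each `res.write` produces a chunk the client can render immediately, which is exactly what React's streaming renderer does with UI payloads.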

3. React Suspense: Declarative Streaming Boundaries

// app/chat/page.tsx (Server Component)
// Suspense boundaries define INDEPENDENT streaming regions
import { Suspense } from 'react';

export default function ChatPage({ params }: { params: { chatId: string } }) {
    return (
        <div className="chat-layout">
            {/* 1. Shell renders IMMEDIATELY (0ms) — not async */}
            <ChatHeader />
            <VoiceButton />  {/* Interactive immediately */}

            {/* 2. Past messages load from DB — streams independently */}
            {/* While loading: shows skeleton. Doesn't block input area. */}
            <Suspense fallback={<MessageListSkeleton count={5} />}>
                <PastMessagesList chatId={params.chatId} />
            </Suspense>

            {/* 3. Sidebar loads independently — doesn't block chat */}
            <Suspense fallback={<SidebarSkeleton />}>
                <ConversationHistory />
            </Suspense>

            {/* 4. Input is immediately interactive — ready before messages load */}
            <ChatInput />
        </div>
    );
}

// ANTI-PATTERN: Don't put everything in one Suspense
// This makes the ENTIRE page wait for the slowest component
function BadExample() {
    return (
        <Suspense fallback={<FullPageSpinner />}>
            <div>
                <ChatHeader />      {/* These don't need to wait... */}
                <ChatInput />       {/* ...but they're blocked by PastMessages */}
                <PastMessagesList />  {/* This is the slow one */}
            </div>
        </Suspense>
    );
}

// CORRECT: Granular Suspense = parallel streaming
function GoodExample() {
    return (
        <div>
            <ChatHeader />  {/* Instant */}
            <Suspense fallback={<Skeleton />}>
                <PastMessagesList />  {/* Only this part is blocked */}
            </Suspense>
            <ChatInput />  {/* Instant — user can type while messages load */}
        </div>
    );
}
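The timing difference is easy to model without React. In this toy simulation (the component names, delays, and both functions are hypothetical), a single boundary produces one paint after the slowest child resolves, while granular boundaries paint each section in completion order:

```typescript
// Hypothetical per-component load times — header instant, messages slowest.
const load = (name: string, ms: number) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(name), ms));

// One big boundary: nothing paints until the slowest child resolves.
async function singleBoundary(): Promise<string[]> {
  const all = await Promise.all([
    load("header", 0),
    load("sidebar", 30),
    load("messages", 90),
  ]);
  return [all.join("+")]; // a single combined paint at ~90ms
}

// Granular boundaries: each section paints as soon as it resolves.
async function granularBoundaries(): Promise<string[]> {
  const paints: string[] = [];
  await Promise.all(
    [load("header", 0), load("sidebar", 30), load("messages", 90)].map((p) =>
      p.then((name) => {
        paints.push(name); // paints land in completion order
      })
    )
  );
  return paints;
}

console.log(await singleBoundary());     // one late paint
console.log(await granularBoundaries()); // three paints, fastest first
```

Total work is identical in both versions; only the paint schedule changes — which is the whole argument for granular boundaries.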

4. createStreamableUI: Background Async Updates

// The most powerful streaming primitive in the Vercel AI SDK
// Allows updating UI from WITHIN background async work
// Even after the Server Action has already "returned"

'use server';
import { createStreamableUI } from 'ai/rsc';

export async function runStockAnalysis(symbol: string) {
    // 1. Create a mutable UI stream — start with loading state
    const ui = createStreamableUI(
        <div style={{ color: 'var(--text-secondary)', display: 'flex', alignItems: 'center', gap: '0.5rem' }}>
            <Spinner size="sm" />
            <span>Initializing analysis for {symbol}...</span>
        </div>
    );

    // 2. Run async work CONCURRENTLY (IIFE pattern)
    // The IIFE runs in the background — Server Action returns immediately below
    (async () => {
        try {
            // Step A: Fetch price data
            ui.update(
                <div>
                    <Spinner size="sm" /> Fetching 90 days of price data...
                </div>
            );
            const priceHistory = await fetchPriceHistory(symbol, '90d');

            // Step B: Fetch news
            ui.update(
                <div>
                    <Spinner size="sm" /> Searching for recent news ({symbol})...
                    <PricePreview data={priceHistory} />
                </div>
            );
            const recentNews = await fetchRelevantNews(symbol);

            // Step C: Run LLM analysis
            ui.update(
                <div>
                    <Spinner size="sm" /> Generating AI commentary...
                    <PriceChart data={priceHistory} />
                    <NewsList items={recentNews} />
                </div>
            );
            const analysis = await generateAnalysis(priceHistory, recentNews, symbol);

            // Step D: Finalize — replace entire UI with complete result
            ui.done(
                <div>
                    <PriceChart data={priceHistory} showAnnotations />
                    <AnalysisSummary text={analysis} />
                    <NewsList items={recentNews} expanded />
                </div>
            );
        } catch (error) {
            // Always handle errors to avoid hanging streams
            ui.done(
                <div style={{ color: '#f87171' }}>
                    Analysis failed: {error instanceof Error ? error.message : 'Unknown error'}. Please try again.
                </div>
            );
        }
    })();

    // 3. Return IMMEDIATELY — client receives the streamable value
    // Background IIFE continues updating the same UI node asynchronously
    return { id: Date.now(), display: ui.value };
}

// Key insight: ui.value is a special React node that "subscribes"
// to updates from the server via the RSC streaming connection.
// Client renders it like a normal component — updates appear automatically.
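The mechanics can be sketched without React at all. This toy `createStreamable` (a hypothetical stand-in, not the AI SDK API) keeps the same `update()`/`done()` contract but yields plain strings over an async iterator instead of React nodes over the RSC wire:

```typescript
// Toy stand-in for createStreamableUI: a mutable value that a background
// task can keep updating after the creating function has returned.
function createStreamable<T>(initial: T) {
  const queue: T[] = [initial];
  let wake: (() => void) | null = null;
  let closed = false;
  const push = (value: T) => {
    queue.push(value);
    wake?.(); // wake a waiting consumer, if any
  };
  return {
    update: (value: T) => push(value),
    done: (value: T) => {
      push(value);
      closed = true;
      wake?.();
    },
    async *[Symbol.asyncIterator]() {
      for (;;) {
        while (queue.length > 0) yield queue.shift()!;
        if (closed) return;
        await new Promise<void>((resolve) => (wake = resolve));
      }
    },
  };
}

// Like the Server Action: return immediately, keep updating from an IIFE.
function runAnalysis() {
  const ui = createStreamable("Initializing...");
  (async () => {
    await new Promise((r) => setTimeout(r, 10));
    ui.update("Fetching price data...");
    await new Promise((r) => setTimeout(r, 10));
    ui.done("Final analysis ready");
  })();
  return ui; // caller receives the stream before the work finishes
}

// The "client" observes every intermediate state, in order.
for await (const state of runAnalysis()) console.log(state);
```

In the real primitive, the consumer is React on the client and the transport is the RSC streaming connection, but the shape is the same: return a subscribable value now, mutate it later.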

Frequently Asked Questions

Does streaming work on Vercel Edge Runtime or only Node.js?

The Vercel AI SDK's streaming primitives work on both runtimes, but with important limitations. Edge Runtime blocks most Node.js APIs (file system, native modules, some npm packages). If your streaming action uses Prisma (requires Node.js), heavy npm dependencies, or file system access, you must use the Node.js runtime: export const runtime = 'nodejs' in the route or page file. Edge Runtime is excellent for pure streaming text (LLM completions with no DB) — it has lower cold-start latency. Most production Generative UI apps with database persistence use Node.js runtime and set maxDuration = 60 to allow for long LLM operations.
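In the App Router, both switches are route segment config exports at the top of the file that runs the streaming action (the path here is illustrative):

```typescript
// app/api/chat/route.ts — route segment config
export const runtime = 'nodejs';  // required for Prisma, fs, native modules
export const maxDuration = 60;    // seconds; subject to your plan's limits
```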

Why does my streaming stop halfway through on Vercel free tier?

Vercel's free tier enforces a 10-second function execution timeout by default. LLM responses generating 500+ tokens can easily exceed this, especially with additional data fetching before the LLM call. Solutions: (1) Upgrade to the Pro plan (60-second default, with longer limits available for long-running functions). (2) Start streaming as early as possible — bytes already sent reach the user, but the function is still terminated at the limit, so streaming alone does not extend the timeout. (3) Restructure: fetch data first, then stream only the LLM call, minimizing total execution time. (4) Raise maxDuration (e.g. maxDuration = 300 on Pro) for long-running analysis tasks. Always test with a stopwatch against your tier's actual limits.

Conclusion

HTTP chunked transfer encoding and React Suspense form the technical foundation of Generative UI. Streaming transforms 7-second blank-screen waits into engaging, progressive experiences where users see skeleton shapes immediately, status updates throughout, and content appearing piece by piece. The createStreamableUI primitive extends this further by allowing background async work to update the same UI node after the Server Action has returned — enabling the skeleton → intermediate state → final component animation pattern that makes AI interfaces feel alive. Granular Suspense boundaries (one per independently-loading section) ensure that slow components don't block fast ones, keeping the interface interactive even while complex AI operations run in parallel.

👨‍💻
Written by

Vivek

AI Engineer

Full-stack AI engineer with 4+ years building LLM-powered products, autonomous agents, and RAG pipelines. I've shipped AI features to production for startups and worked hands-on with GPT-4o, LangChain, LlamaIndex, and the Vercel AI SDK. I started OpnCrafter to share everything I wish I had when learning — no fluff, just working code and real-world context.

GPT-4o · LangChain · Next.js · Vector DBs · RAG · Vercel AI SDK
