opncrafter
Module 5 of 10: Generative UI

Managing Latency with Suspense

Jan 2, 2026 • 18 min read

AI is slow. A complex chain — retrieve chunks, rerank, synthesize, stream — can take 8-12 seconds. In traditional web development, 3 seconds of latency is where users start abandoning. In AI applications, users accept longer waits — but only if they understand what's happening. The engineering discipline of loading state design is where many AI products succeed or fail at the UX layer.

1. The Psychology: Why Perceived Wait Time Is All That Matters

🚫 Blank Screen: Panic mode. Users assume it's broken. Abandonment after 2-3s.

🔄 Generic Spinner: Better. Users know something is happening. Tolerate 5-8s.

💀 Skeleton Screen: Best. Users see the SHAPE of the incoming content. Tolerate 10-15s.

Research consistently shows that skeleton screens feel faster than equivalent spinner-based interfaces — not because they are faster (they aren't), but because users have a cognitive model of what they're waiting for. The shape of the incoming content creates positive anticipation rather than anxious uncertainty.

2. The Generator Pattern: Free Skeletons

// The RSC Generator Pattern: yield for immediate skeleton, return for final UI
// This is the most important pattern in Generative UI development
// Assumes: import { z } from 'zod'; the definition lives in streamUI's `tools` map

// ✅ CORRECT: Yield skeleton immediately, do async work, return final UI
show_stock_chart: {
    description: 'Render a price chart for a stock symbol',  // the model uses this to pick the tool
    parameters: z.object({ symbol: z.string(), period: z.string() }),
    generate: async function* ({ symbol, period }) {
        
        // YIELD: This flushes to the browser in <50ms (no async work yet)
        // The user IMMEDIATELY sees the skeleton — zero perceived latency
        yield (
            <BotCard>
                <StockChartSkeleton
                    symbol={symbol}  // Show the ticker so user knows WHAT is loading
                    period={period}
                />
            </BotCard>
        );

        // ASYNC WORK: This takes 1-3 seconds — user stares at skeleton
        // The two fetches are independent, so run them in parallel
        const [stockData, newsData] = await Promise.all([
            fetchStockHistory(symbol, period),
            fetchRelevantNews(symbol),
        ]);

        // Can yield INTERMEDIATE states too (e.g., "Loaded data, analyzing...")
        // yield <BotCard><AnalyzingIndicator symbol={symbol} /></BotCard>;
        // const analysis = await analyzeWithLLM(stockData, newsData);

        // RETURN: React reconciles DOM — skeleton smoothly replaced by real content
        return (
            <BotCard>
                <StockChart
                    data={stockData}
                    symbol={symbol}
                    period={period}
                    news={newsData}
                />
            </BotCard>
        );
    }
}

// ❌ WRONG: Doing async work before yielding — user waits 3 seconds blank
generate: async function* ({ symbol }) {
    const data = await fetchStockHistory(symbol);  // Block for 3 seconds first
    yield <BotCard><StockChartSkeleton /></BotCard>;  // Skeleton barely shows
    return <BotCard><StockChart data={data} /></BotCard>;
}
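The yield-then-return mechanics are plain async-generator semantics and can be observed without React or the AI SDK. A minimal sketch, yielding strings instead of JSX so the flush order is visible (fetchStockHistory here is a hypothetical stand-in that resolves after 50ms):

```typescript
// fetchStockHistory is a hypothetical stand-in (not an SDK function):
// it resolves with fake price points after a short delay.
const fetchStockHistory = (symbol: string): Promise<number[]> =>
  new Promise<number[]>((resolve) => setTimeout(() => resolve([101, 103, 102]), 50));

// Same shape as the streamUI generate function, but yielding plain strings
// instead of JSX so the flush order is easy to observe.
async function* generate(symbol: string): AsyncGenerator<string, string> {
  yield `skeleton:${symbol}`; // flushed before any async work starts
  const data = await fetchStockHistory(symbol); // the 1-3s wait lives here
  return `chart:${symbol}:${data.length} points`; // final UI replaces the skeleton
}

// The framework plays the consumer role: it renders every yield, then the return.
async function run(): Promise<string[]> {
  const frames: string[] = [];
  const gen = generate("AAPL");
  let step = await gen.next();
  while (!step.done) {
    frames.push(step.value);
    step = await gen.next();
  }
  frames.push(step.value); // the generator's return value
  return frames;
}

run().then((frames) => console.log(frames));
// → ["skeleton:AAPL", "chart:AAPL:3 points"]
```

Whatever consumes the generator receives the skeleton frame immediately; the expensive await happens between the first and last frame, exactly where streamUI swaps skeleton for chart.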

3. Building a Shimmer Skeleton Component

// File: components/skeletons/StockChartSkeleton.tsx
// CRITICAL RULE: Skeleton MUST be the exact same height as the final component
// Different heights cause Cumulative Layout Shift (CLS) — jarring jumps

interface StockChartSkeletonProps {
    symbol: string;   // Show symbol so user knows what's loading
    period: string;
}

export function StockChartSkeleton({ symbol, period }: StockChartSkeletonProps) {
    return (
        // Same dimensions as the real StockChart component: 380px total height
        <div style={{ padding: '1rem', borderRadius: '12px', border: '1px solid rgba(255,255,255,0.1)', height: '380px' }}>
            
            {/* Header area — matches real component layout */}
            <div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: '1rem' }}>
                
                {/* Symbol display — show real data, not skeleton */}
                <div>
                    <div style={{ fontWeight: 'bold', fontSize: '1.1rem' }}>{symbol}</div>
                    <div style={{ fontSize: '0.75rem', color: 'var(--text-secondary)' }}>{period} chart</div>
                </div>

                {/* Price area — skeleton placeholder */}
                <div style={{ textAlign: 'right' }}>
                    <div style={{
                        height: '1.5rem', width: '80px', borderRadius: '4px',
                        background: 'rgba(255,255,255,0.08)',
                        animation: 'pulse 2s cubic-bezier(0.4, 0, 0.6, 1) infinite',
                        marginBottom: '0.25rem',
                    }} />
                    <div style={{
                        height: '1rem', width: '50px', borderRadius: '4px',
                        background: 'rgba(255,255,255,0.06)',
                        animation: 'pulse 2s cubic-bezier(0.4, 0, 0.6, 1) infinite',
                    }} />
                </div>
            </div>

            {/* Chart area placeholder - 280px to match ResponsiveContainer height */}
            <div style={{
                height: '280px',
                background: 'rgba(255,255,255,0.04)',
                borderRadius: '8px',
                position: 'relative',
                overflow: 'hidden',
                animation: 'pulse 2s cubic-bezier(0.4, 0, 0.6, 1) infinite',
            }}>
                {/* Fake chart line using SVG — gives brain the right shape to anticipate */}
                <svg width="100%" height="100%" style={{ position: 'absolute', inset: 0 }}>
                    <path
                        d="M 0 200 Q 100 150 200 180 Q 300 120 400 140 Q 500 100 600 130 Q 700 90 800 110"
                        stroke="rgba(255,255,255,0.08)"
                        strokeWidth="2"
                        fill="none"
                    />
                </svg>
                {/* Shimmer sweep overlay */}
                <div style={{
                    position: 'absolute', inset: 0,
                    background: 'linear-gradient(90deg, transparent, rgba(255,255,255,0.04), transparent)',
                    animation: 'shimmer 2s infinite',
                }} />
            </div>
        </div>
    );
}

// Add to your global CSS:
// @keyframes pulse {
//     0%, 100% { opacity: 1; }
//     50% { opacity: 0.5; }
// }
// @keyframes shimmer {
//     0% { transform: translateX(-100%); }
//     100% { transform: translateX(100%); }
// }

4. Granular Suspense: Streaming Text Alongside Loading Charts

// Don't wrap entire chat in a single Suspense — be granular
// Each tool call should have its own independent skeleton

// GOOD: Text streams while chart is still loading
// The user reads the LLM's text analysis while the chart fetches data

// In your RSC streamUI handler:
generate: async function* ({ symbols }) {
    // Yield initial skeleton
    yield <BotCard><MultiChartSkeleton symbols={symbols} /></BotCard>;

    // Meanwhile, the AI text response is streaming separately
    // The text starts appearing within 600ms
    // The chart data arrives 2-3s later
    // Both are visible simultaneously!

    const data = await Promise.all(symbols.map(fetchStockData));
    return <BotCard><MultiLineChart data={data} symbols={symbols} /></BotCard>;
}

// The Vercel AI SDK handles concurrent streaming automatically:
// text stream: starts immediately when LLM starts generating
// tool call yields: flushed immediately when yielded
// They render in separate React trees — no blocking

// CLS Prevention: Always measure your real component
// Then make your skeleton the exact same height
// Test by temporarily slowing down your API calls:
// await new Promise(r => setTimeout(r, 5000)); // Add 5s delay for testing
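That delay trick can be packaged as a reusable wrapper. A minimal sketch, assuming a hypothetical withArtificialDelay helper (not part of the Vercel AI SDK):

```typescript
// withArtificialDelay is a hypothetical dev-only helper (not an SDK utility).
// It wraps any promise-returning fetcher so every call waits `ms` first,
// letting you watch skeletons hold their height under realistic latency.
function withArtificialDelay<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  ms: number
): (...args: A) => Promise<R> {
  return async (...args: A) => {
    await new Promise((resolve) => setTimeout(resolve, ms)); // simulate slow API
    return fn(...args); // then run the real fetch unchanged
  };
}

// Usage (development only):
// const fetchStockHistory = withArtificialDelay(realFetchStockHistory, 5000);
```

Because the wrapper preserves the fetcher's signature, you can swap it in at the import site without touching any generate function.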

Frequently Asked Questions

How do I measure my component's actual rendered height for CLS-free skeletons?

The most reliable method: temporarily set background: 'red' on your real component, load it, and take a screenshot. Measure the pixel height. Then match your skeleton to that exact height. For variable-height components (text analysis outputs that vary in length), use a min-height on the container that matches the shortest reasonable output, and accept that longer outputs will push content down rather than trying to perfectly predict height. The alternative is to use a fixed-height container with internal scrolling for variable content — this completely eliminates CLS at the cost of some visual truncation.
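The min-height and fixed-height options can be expressed as two small style presets. A minimal sketch; the HeightStrategy name and the pixel values are illustrative assumptions, not measurements from a real component:

```typescript
// Both strategies as style presets. HeightStrategy and the pixel values
// are illustrative assumptions, not measurements from a real component.
type HeightStrategy = "minHeight" | "fixedScroll";

function containerStyle(strategy: HeightStrategy): Record<string, string> {
  return strategy === "minHeight"
    ? { minHeight: "120px" }                  // shortest reasonable output; longer outputs grow
    : { height: "280px", overflowY: "auto" }; // pinned height; long outputs scroll internally
}
```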

Should I show skeleton or text when the LLM is deciding which tool to call?

Show an animated "Thinking..." indicator or a contextual message like "Analyzing your request..." for the period between user message submission and first tool yield. This phase typically takes 500-1500ms depending on model and system prompt length. Avoid showing a chart skeleton during this phase — you don't know yet whether the LLM will call the chart tool, a text tool, or multiple tools. A generic "thinking" state followed by tool-specific skeletons (once you know what the agent is doing) is the correct UX pattern. The Vercel AI SDK provides isLoading flags that distinguish between these phases.
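The progression described above (generic thinking state, then tool-specific skeleton, then real content) can be modeled as a tiny phase map. A minimal sketch under assumed names; ChatPhase and loadingIndicator are hypothetical, not SDK exports:

```typescript
// ChatPhase and loadingIndicator are hypothetical names, not SDK exports.
type ChatPhase = "thinking" | "tool-loading" | "done";

function loadingIndicator(phase: ChatPhase, toolName?: string): string | null {
  switch (phase) {
    case "thinking":
      // Tool choice is still unknown, so stay generic (no chart skeleton yet)
      return "Analyzing your request...";
    case "tool-loading":
      // Now the tool is known; the tool-specific skeleton takes over
      return toolName ? `Loading ${toolName}...` : "Loading...";
    case "done":
      return null; // real content has replaced every placeholder
  }
}
```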

Conclusion

Loading state design is where many AI applications succeed or fail at the UX layer — not at the AI quality layer. The generator pattern in RSC makes yielding skeletons essentially free: one line of code provides immediate visual feedback that dramatically improves perceived performance. The three rules to internalize: always yield skeletons before any async work, ensure skeletons are the exact same height as final components (CLS is the enemy), and use granular Suspense boundaries so text streams independently of chart loading. These patterns transform 8-second waits from friction points into acceptable — even engaging — experiences.

👨‍💻
Written by

Vivek

AI Engineer

Full-stack AI engineer with 4+ years building LLM-powered products, autonomous agents, and RAG pipelines. I've shipped AI features to production for startups and worked hands-on with GPT-4o, LangChain, LlamaIndex, and the Vercel AI SDK. I started OpnCrafter to share everything I wish I had when learning — no fluff, just working code and real-world context.

GPT-4oLangChainNext.jsVector DBsRAGVercel AI SDK
