
React 18/19 streaming internals, chunked Transfer-Encoding mechanics, selective hydration priority, hydration mismatch root causes, React.lazy vs dynamic() vs RSC boundary, and preventing flash before hydration.

A-5 — Streaming SSR and the Suspense Architecture

Who this is for: Architects who want to understand the mechanics underneath PPR's dynamic holes and Next.js's streaming responses — how React 18/19 actually streams HTML from the server, how selective hydration works, how Suspense nesting affects the flush order, and where the failure modes are.


The Two Rendering Pipelines

React has two server rendering pipelines. Understanding which one Next.js uses in which situation is the prerequisite for everything in this module.

renderToString: the legacy pipeline. Synchronous. Renders the entire tree to a string and returns it. No streaming, and no waiting for data: a component that suspends is emitted as its Suspense fallback and left for the client to render. Still used by Pages Router in some configurations, but never used by App Router.

renderToPipeableStream / renderToReadableStream — the concurrent pipeline. Asynchronous. Streams output in chunks as async work completes. Full Suspense support. This is what App Router uses for every non-static response.

renderToPipeableStream is for Node.js environments (writable streams). renderToReadableStream is for the Web Streams API (edge runtime, Cloudflare Workers, Deno). Next.js uses whichever is appropriate for the runtime — you never call these directly, but understanding them explains the behaviour you observe.


How renderToPipeableStream Streams

When Next.js calls renderToPipeableStream, it passes four callbacks:

ts
import { renderToPipeableStream } from 'react-dom/server';

// `element` is the root React element; `response` is the Node.js HTTP response.
const { pipe, abort } = renderToPipeableStream(element, {
  onShellReady() {
    // The "shell" (everything outside Suspense boundaries) is ready.
    // This is when Next.js starts piping to the HTTP response.
    response.statusCode = 200;
    response.setHeader('Content-Type', 'text/html');
    pipe(response);
  },
  onShellError(error) {
    // The shell itself threw, so no HTML has been sent yet.
    // Next.js falls back to error.tsx.
    response.statusCode = 500;
    response.end('<h1>Error</h1>');
  },
  onAllReady() {
    // Everything has resolved. Used for static generation.
    // For streaming responses, you don't wait for this.
  },
  onError(error) {
    // Fires for any server-side render error, recoverable or not.
    // For an error inside a Suspense boundary, error.tsx for that segment activates.
  },
});

The key insight: onShellReady fires as soon as everything outside the Suspense boundaries has rendered, with each pending boundary represented by its fallback. Next.js starts streaming at this point without waiting for the suspended work. The HTTP response headers are sent, the browser starts receiving HTML, and data is still being fetched in the background.


Flush Points and Chunk Boundaries

The HTML that renderToPipeableStream sends isn't random chunks — it has a defined structure that corresponds to how React resolves Suspense boundaries.

Chunk 1 (onShellReady fires):
  - DOCTYPE, <head>, all static <meta> tags, CSS links
  - Synchronous layout content
  - Suspense fallbacks (<!--$?--><template id="B:0"></template><!--/$-->)

Chunk 2 (first Suspense boundary resolves, ~200ms):
  - <div hidden id="S:0">...resolved content...</div>
  - <script>$RC("B:0","S:0")</script>

Chunk 3 (second Suspense boundary resolves, ~400ms):
  - <div hidden id="S:1">...resolved content...</div>
  - <script>$RC("B:1","S:1")</script>

$RC is React's built-in content-replacement function, injected into the HTML by the streaming runtime. It moves the resolved content from its hidden <div> into the position held by the Suspense fallback. Once React is running on the client, that boundary can then be hydrated on its own, without touching the rest of the tree.

Each Suspense boundary is an independent flush point. They resolve and flush independently — a slow database query for one boundary doesn't block a fast query for another.
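
The flush order described above can be modelled without React. The following is a toy sketch, not React's implementation: `Boundary`, `resolvesAtMs`, and `flushOrder` are hypothetical names chosen for illustration. It emits the shell first with one placeholder per boundary, then one chunk per boundary in the order the data resolves.

```typescript
// Toy model of streaming flush order. Not React's implementation:
// every name here is hypothetical and exists only for illustration.
interface Boundary {
  id: number;           // index used in the B:<id> / S:<id> markers
  resolvesAtMs: number; // when this boundary's data becomes available
  html: string;         // resolved content
}

function flushOrder(shellHtml: string, boundaries: Boundary[]): string[] {
  // Chunk 1: the shell, with one placeholder per pending boundary.
  const placeholders = boundaries
    .map((b) => `<!--$?--><template id="B:${b.id}"></template><!--/$-->`)
    .join("");
  const chunks = [shellHtml + placeholders];

  // Later chunks: one per boundary, ordered by resolution time rather than
  // by position in the document. A slow boundary never delays a fast one.
  const byResolution = [...boundaries].sort(
    (a, b) => a.resolvesAtMs - b.resolvesAtMs,
  );
  for (const b of byResolution) {
    chunks.push(
      `<div hidden id="S:${b.id}">${b.html}</div>` +
        `<script>$RC("B:${b.id}","S:${b.id}")</script>`,
    );
  }
  return chunks;
}

const chunks = flushOrder("<main>", [
  { id: 0, resolvesAtMs: 400, html: "slow" },
  { id: 1, resolvesAtMs: 200, html: "fast" },
]);
```

Here boundary 1 appears second in the document but is flushed first, because its data resolves first.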


Suspense Nesting and Resolution Order

Suspense boundaries can be nested, and the nesting order matters for what the user sees.

tsx
export default function Page() {
  return (
    <Suspense fallback={<PageSkeleton />}> {/* outer */}
      <Header />
      <Suspense fallback={<SidebarSkeleton />}> {/* inner A */}
        <Sidebar />
      </Suspense>
      <Suspense fallback={<ContentSkeleton />}> {/* inner B */}
        <MainContent />
      </Suspense>
    </Suspense>
  );
}

Resolution rules:

  • The outer boundary shows its fallback until both the outer shell AND all inner boundaries (or their fallbacks) are ready to render.
  • Inner boundaries show their own fallbacks independently.
  • A parent boundary that hasn't resolved yet holds all its children — inner fallbacks don't render until the parent shell is ready.

In the example above: while Header is resolving, the outer <PageSkeleton /> shows. Once Header resolves, the page shell shows with <SidebarSkeleton /> and <ContentSkeleton /> visible. Sidebar and MainContent then resolve independently.

The practical implication: wrap only the slow parts in Suspense. Wrapping the entire page in a single Suspense boundary (outer only) means the user sees only the fallback until the slowest data resolves. Granular Suspense gives granular progressive rendering.
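
The resolution rules can be expressed as a small recursive function. This is a toy model under assumed timings (Header at 100ms, Sidebar at 300ms, MainContent at 200ms); `BoundaryNode` and `visibleAt` are hypothetical names, not React APIs.

```typescript
// Toy model of nested Suspense reveal rules; all names are hypothetical.
interface BoundaryNode {
  name: string;
  readyAtMs: number;        // when this boundary's own direct content resolves
  children: BoundaryNode[]; // boundaries nested directly inside it
}

// Returns what is on screen at `nowMs`, one entry per visible boundary.
function visibleAt(node: BoundaryNode, nowMs: number): string[] {
  if (nowMs < node.readyAtMs) {
    // The parent is still pending: its fallback covers everything inside,
    // so no inner fallback or inner content is rendered.
    return [`${node.name}: fallback`];
  }
  // The parent has resolved: inner boundaries now reveal independently.
  const visible = [`${node.name}: content`];
  for (const child of node.children) {
    visible.push(...visibleAt(child, nowMs));
  }
  return visible;
}

const page: BoundaryNode = {
  name: "outer",
  readyAtMs: 100, // Header (assumed timing)
  children: [
    { name: "innerA", readyAtMs: 300, children: [] }, // Sidebar
    { name: "innerB", readyAtMs: 200, children: [] }, // MainContent
  ],
};
```

Calling `visibleAt(page, 50)` reports only the outer fallback, while `visibleAt(page, 150)` reports the outer content with both inner fallbacks, matching the sequence described for the example.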


Selective Hydration — How It Works

Hydration in the streaming model is not a single pass over the entire page. React 18+ introduced selective hydration: it hydrates Suspense boundaries as they arrive, independently.

The sequence:

1. Server sends initial HTML with Suspense fallbacks
2. hydrateRoot() (from react-dom/client) begins hydrating the shell
3. Sidebar and MainContent chunks arrive → React starts hydrating <Sidebar />
4. User clicks on <MainContent /> area before it's hydrated
5. React prioritises hydrating <MainContent /> first (user interaction)
6. <MainContent /> finishes hydrating → the click is replayed against it
7. Sidebar hydration completes after

Steps 4 and 5 are the critical behaviour: React can detect user interaction on not-yet-hydrated areas and prioritise hydrating those areas first. A user click doesn't get dropped. It is queued and replayed after hydration of that specific subtree.

This is what renderToReadableStream and renderToPipeableStream enable that renderToString cannot. With renderToString, hydration is all-or-nothing — the browser waits for the full HTML, then React hydrates everything at once, blocking the main thread.
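
The prioritise-then-replay behaviour can be sketched as a small queue. This is an illustrative model only: `HydrationQueue` and its methods are hypothetical, and React's real scheduler is considerably more involved.

```typescript
// Toy scheduler illustrating selective hydration. Hypothetical names only.
class HydrationQueue {
  private pending: string[] = [];
  private replay = new Map<string, string[]>();
  readonly log: string[] = [];

  // A boundary's HTML has arrived and is waiting to be hydrated.
  enqueue(boundary: string): void {
    this.pending.push(boundary);
  }

  // The user interacted with a boundary that is not hydrated yet:
  // remember the event and move that boundary to the front of the queue.
  interact(boundary: string, event: string): void {
    const index = this.pending.indexOf(boundary);
    if (index === -1) return; // already hydrated: normal handling, not modelled here
    this.pending.splice(index, 1);
    this.pending.unshift(boundary);
    const queued = this.replay.get(boundary) ?? [];
    queued.push(event);
    this.replay.set(boundary, queued);
  }

  // Hydrate the next boundary, then replay any events captured for it.
  hydrateNext(): void {
    const boundary = this.pending.shift();
    if (boundary === undefined) return;
    this.log.push(`hydrate ${boundary}`);
    for (const event of this.replay.get(boundary) ?? []) {
      this.log.push(`replay ${event} on ${boundary}`);
    }
    this.replay.delete(boundary);
  }
}

const queue = new HydrationQueue();
queue.enqueue("Sidebar");
queue.enqueue("MainContent");
queue.interact("MainContent", "click"); // user clicks before hydration
queue.hydrateNext();
queue.hydrateNext();
```

After the two `hydrateNext()` calls, `queue.log` shows MainContent hydrated first, its click replayed, and Sidebar hydrated last.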


next/dynamic vs React.lazy vs Async Server Components

Three tools for deferred loading — they solve different problems:

React.lazy: client-side code splitting only. Wraps a Client Component import so the bundle chunk isn't downloaded until the component is rendered. Works with Suspense on the client. It does not remove the component from server rendering; it only changes when the component's JavaScript loads in the browser.

next/dynamic — Next.js wrapper around React.lazy with additional SSR control:

tsx
import dynamic from 'next/dynamic';

// Client-side only: not rendered on server at all
const HeavyChart = dynamic(() => import('./HeavyChart'), {
  ssr: false,
  loading: () => <ChartSkeleton />,
});

// Rendered on server, hydrated on client: same as React.lazy but with loading state
const RichEditor = dynamic(() => import('./RichEditor'), {
  loading: () => <EditorSkeleton />,
});

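Use ssr: false when the component uses browser-only APIs (canvas, WebGL, window) that would crash during SSR. In the App Router, recent Next.js versions accept ssr: false only inside a Client Component, so that dynamic() call has to live in a 'use client' file. Use ssr: true (default) when you want SSR output but can tolerate delayed hydration.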

Async Server Components — not code splitting at all. The component runs on the server, its data fetches happen server-side, and the result is streamed to the client as HTML. No client-side bundle chunk. The right choice for components that fetch data and have no interactivity.

The decision:

  • Need server data, no client interactivity → async Server Component
  • Need client interactivity, browser APIs → next/dynamic with ssr: false
  • Need client interactivity, want SSR → next/dynamic with ssr: true (default) or React.lazy

Suspense During Build (SSG) vs Runtime (SSR)

Suspense behaves differently during static generation:

Static generation (SSG/ISR): Next.js uses onAllReady — it waits for all Suspense boundaries to resolve before writing the static HTML file. There's no streaming during build; the final static HTML contains fully-rendered content (no placeholders). This is correct behaviour — you want complete content in your static files.

Runtime streaming (SSR): Next.js uses onShellReady and streams. Placeholders go to the browser; content arrives later.

PPR: The static shell is generated at build time using a render pass that treats dynamic boundaries as permanent placeholders (they don't resolve). The shell HTML is stored with those placeholder nodes intact. At request time, the shell is served immediately, and a separate streaming render resolves only the dynamic boundaries.

Understanding this distinction clarifies a common confusion: "why does my page have a loading skeleton in the static HTML?" — because it's a PPR page with dynamic holes, and the static shell intentionally preserves the fallback.
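
The three modes differ in what the first HTML payload contains. The sketch below is a toy model, assuming every slot sits behind its own Suspense boundary and is still pending when the shell is produced; `Mode`, `Slot`, and `initialHtml` are hypothetical names.

```typescript
// Toy model: what the first HTML payload contains under each mode.
type Mode = "static" | "streaming" | "ppr";

interface Slot {
  fallback: string;
  content: string;
  dynamic: boolean; // true if the slot depends on request-time data
}

function initialHtml(mode: Mode, slots: Slot[]): string {
  return slots
    .map((slot) => {
      switch (mode) {
        case "static":
          // The build waits for everything (onAllReady): no placeholders remain.
          return slot.content;
        case "streaming":
          // The shell flushes first (onShellReady): pending slots are fallbacks.
          return slot.fallback;
        case "ppr":
          // Build-time shell: static slots resolve, dynamic holes keep their fallback.
          return slot.dynamic ? slot.fallback : slot.content;
      }
    })
    .join("");
}

const slots: Slot[] = [
  { fallback: "[nav skeleton]", content: "<nav/>", dynamic: false },
  { fallback: "[cart skeleton]", content: "<cart/>", dynamic: true },
];
```

With these slots, the PPR payload contains the real nav next to the cart skeleton, which is the "loading skeleton in the static HTML" situation described above.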


Error Handling in Async Server Components

Errors thrown inside async Server Components behave differently depending on where the component is:

tsx
// Outside Suspense: error propagates to nearest error.tsx
export default async function Page() {
  const data = await fetchData(); // throws
  return <div>{data}</div>; // never reached
}

// Inside Suspense: error.tsx for that segment activates
<Suspense fallback={<Skeleton />}>
  <AsyncComponent /> {/* throws */}
</Suspense>

Next.js's error.tsx files are React Error Boundary wrappers. Each route segment can have its own error.tsx. When an async Server Component throws:

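  1. The error passes through any enclosing Suspense boundary, because Suspense handles pending state, not errors
  2. The nearest Error Boundary above the component catches it and switches to its error state
  3. If there's an error.tsx in scope, it is that boundary, and it renders
  4. If not, the error propagates up to the nearest parent that has one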
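
The upward search in the last two steps can be sketched as a walk over route segments. This is a simplified model for errors thrown by a page or its components; `nearestErrorFile` and its parameters are hypothetical, not a Next.js API.

```typescript
// Hypothetical helper: which error.tsx handles an error thrown in a segment?
function nearestErrorFile(
  segmentPath: string,         // e.g. "app/products/[id]"
  dirsWithErrorFile: string[], // directories that contain an error.tsx
): string | null {
  const parts = segmentPath.split("/");
  // Walk from the throwing segment up towards the root.
  while (parts.length > 0) {
    const dir = parts.join("/");
    if (dirsWithErrorFile.indexOf(dir) !== -1) return `${dir}/error.tsx`;
    parts.pop();
  }
  return null; // no error.tsx found anywhere on the path in this model
}

const handledBy = nearestErrorFile("app/products/[id]", ["app", "app/products"]);
```

In this call the `[id]` segment has no error.tsx of its own, so the lookup settles on the one in `app/products`.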

The error.tsx component receives an error prop and a reset function. The reset function re-renders the error boundary — useful for transient errors (network blip, rate limit) where retrying might succeed.

tsx
// app/products/[id]/error.tsx
'use client'; // Error boundaries must be Client Components

export default function ProductError({
  error,
  reset,
}: {
  error: Error & { digest?: string };
  reset: () => void;
}) {
  return (
    <div>
      <h2>Failed to load product</h2>
      <p>Error: {error.message}</p>
      <button onClick={reset}>Try again</button>
    </div>
  );
}

The digest on the error is a hash that Next.js logs server-side. In production, error messages are redacted from the client (you see "An error occurred" instead of your stack trace) — the digest is how you correlate a client-side error to the server log.
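
The redact-and-correlate idea can be illustrated in a few lines. This is a sketch only: Next.js computes its digest internally, and the SHA-256 hash, the `serverLog` map, and `redactForClient` below are assumptions made for illustration, not Next.js's algorithm or API.

```typescript
import { createHash } from "node:crypto";

// Illustrative only: the hashing scheme here is an assumption,
// not how Next.js actually derives its digest.
const serverLog = new Map<string, Error>();

function redactForClient(error: Error): { message: string; digest: string } {
  const digest = createHash("sha256")
    .update(`${error.message}\n${error.stack ?? ""}`)
    .digest("hex")
    .slice(0, 10);
  serverLog.set(digest, error); // the full error stays on the server
  return { message: "An error occurred", digest }; // only this reaches the client
}

const original = new Error("connection refused: db-primary");
const clientError = redactForClient(original);
```

The client-side object carries no internal detail, but looking up `clientError.digest` in the server-side log recovers the original error.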


The loading.tsx and not-found.tsx Shortcuts

loading.tsx is syntactic sugar for wrapping the entire route segment in a Suspense boundary with that file as the fallback:

app/
  products/
    loading.tsx   ← fallback for any async in page.tsx
    page.tsx

This is equivalent to wrapping <Page /> in <Suspense fallback={<Loading />}> at the layout level. It's a convenience — for granular control over which parts of the page stream, use explicit Suspense boundaries inside your component.

not-found.tsx is the same pattern for notFound() calls — it renders when any Server Component in the segment calls notFound(). The notFound() function throws a special error that Next.js routes to not-found.tsx instead of error.tsx.
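
The routing of a special thrown value can be sketched as follows. This is a hypothetical model: `NOT_FOUND_TAG`, `notFoundModel`, and `renderSegment` are invented for illustration and do not reflect how Next.js marks or detects the error internally.

```typescript
// Hypothetical marker used to tell a "not found" signal from a real failure.
const NOT_FOUND_TAG = "MODEL_NOT_FOUND";

// Stand-in for notFound(): throws a tagged error and never returns.
function notFoundModel(): never {
  throw Object.assign(new Error("not found"), { tag: NOT_FOUND_TAG });
}

// Renders a segment, or names the file that would render in its place.
function renderSegment(render: () => string): string {
  try {
    return render();
  } catch (thrown) {
    const tag = (thrown as { tag?: string }).tag;
    // The tagged signal goes to not-found.tsx; anything else goes to error.tsx.
    return tag === NOT_FOUND_TAG ? "not-found.tsx" : "error.tsx";
  }
}

const missing = renderSegment(() => notFoundModel());
const broken = renderSegment(() => {
  throw new Error("database unavailable");
});
```

Both paths use the same throw-and-catch mechanism; only the tag on the thrown value decides which file renders.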


Where We Go From Here

A-6 moves to Server Actions at the architectural level — the security model, how Server Actions communicate through the network, CSRF protection, progressive enhancement, and the patterns that scale to large mutation-heavy applications. With A-5's understanding of the streaming mechanism, A-6 explains how mutations integrate with the streaming rendering model.


Suspense Placement Anti-Patterns — When Streaming Makes Performance Worse

The correct mental model: a Suspense boundary does not speed up the data inside it. It allows the content OUTSIDE it to render immediately while content inside waits. Place it wrong and you eliminate that benefit.

Anti-Pattern 1: Wrapping the Entire Page

tsx
// WRONG: the entire page content streams as one chunk
// Users wait for the slowest data before seeing ANYTHING
export default function DashboardPage() {
  return (
    <Suspense fallback={<PageSkeleton />}>
      <FastUserProfile /> {/* 50ms */}
      <SlowAnalytics /> {/* 800ms */}
      <RecentActivity /> {/* 200ms */}
    </Suspense>
  )
}

The user sees the skeleton for 800ms. The 50ms and 200ms components wait for SlowAnalytics. Streaming buys nothing here.

tsx
// CORRECT: wrap only the slow component
export default function DashboardPage() {
  return (
    <>
      <FastUserProfile /> {/* streams immediately, ~50ms */}
      <Suspense fallback={<ActivitySkeleton />}>
        <RecentActivity /> {/* streams at ~200ms */}
      </Suspense>
      <Suspense fallback={<AnalyticsSkeleton />}>
        <SlowAnalytics /> {/* streams at ~800ms */}
      </Suspense>
    </>
  )
}

Users see FastUserProfile in 50ms. RecentActivity appears at 200ms. SlowAnalytics appears at 800ms. Progressive disclosure.
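
The arithmetic behind those numbers is simple enough to write down. The sketch below is a simplified model using the timings from the example; `Part`, `singleBoundary`, and `perPartBoundaries` are hypothetical names, and the per-part case treats each component as revealed when its own data is ready.

```typescript
interface Part {
  name: string;
  ms: number; // time until this part's data is ready
}

// One boundary around everything: nothing is revealed until the slowest part is done.
function singleBoundary(parts: Part[]): Record<string, number> {
  const slowest = Math.max(...parts.map((p) => p.ms));
  const revealedAt: Record<string, number> = {};
  for (const p of parts) revealedAt[p.name] = slowest;
  return revealedAt;
}

// Granular boundaries: each part is revealed when its own data is ready.
function perPartBoundaries(parts: Part[]): Record<string, number> {
  const revealedAt: Record<string, number> = {};
  for (const p of parts) revealedAt[p.name] = p.ms;
  return revealedAt;
}

const dashboard: Part[] = [
  { name: "FastUserProfile", ms: 50 },
  { name: "SlowAnalytics", ms: 800 },
  { name: "RecentActivity", ms: 200 },
];
```

Total time to a complete page is 800ms either way; what changes is how early the first useful content appears.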

Anti-Pattern 2: Nested Suspense Creates Serial Waterfalls

Sibling Suspense boundaries fetch concurrently. A boundary that a component renders only after its own await cannot start fetching until that await finishes, so the two fetches run serially.

tsx
// WRONG: the inner boundary is rendered after UserProfile's await = serial fetching
// UserProfile fetches user (200ms)
// THEN ActivityFeed starts fetching (300ms)
// Total: 500ms
// (fetchUser and user.name are placeholders for illustration)
async function UserProfile({ userId }: { userId: string }) {
  const user = await fetchUser(userId); // fetches user data: 200ms
  return (
    <section>
      <h2>{user.name}</h2>
      <Suspense fallback={<ActivitySkeleton />}>
        <ActivityFeed userId={userId} /> {/* fetches activity: 300ms, but starts AFTER the await above */}
      </Suspense>
    </section>
  );
}

tsx
// CORRECT: sibling Suspense = parallel fetching
// Both start immediately, total: max(200ms, 300ms) = 300ms
<>
  <Suspense fallback={<UserSkeleton />}>
    <UserProfile userId={id} /> {/* 200ms */}
  </Suspense>
  <Suspense fallback={<ActivitySkeleton />}>
    <ActivityFeed userId={id} /> {/* 300ms, starts in parallel */}
  </Suspense>
</>

Rule: if two data dependencies are independent, their Suspense boundaries must be siblings, not nested inside a component that awaits the other's data.
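
The two totals follow from when each fetch is allowed to start. The sketch below is a toy timing model using the 200ms and 300ms figures from the example; `Fetcher`, `startsAfterThis`, and `finishTimes` are hypothetical names.

```typescript
interface Fetcher {
  name: string;
  durationMs: number;
  // Fetches that cannot begin until this one has finished.
  startsAfterThis: Fetcher[];
}

// Computes when each fetch finishes, given that roots all start at `startMs`.
function finishTimes(
  roots: Fetcher[],
  startMs = 0,
  out: Record<string, number> = {},
): Record<string, number> {
  for (const f of roots) {
    const doneAt = startMs + f.durationMs;
    out[f.name] = doneAt;
    // Dependents begin only once this fetch is done: the waterfall.
    finishTimes(f.startsAfterThis, doneAt, out);
  }
  return out;
}

const activity: Fetcher = { name: "ActivityFeed", durationMs: 300, startsAfterThis: [] };

// Serial: ActivityFeed waits for UserProfile.
const serial = finishTimes([
  { name: "UserProfile", durationMs: 200, startsAfterThis: [activity] },
]);

// Parallel: both start at time zero.
const parallel = finishTimes([
  { name: "UserProfile", durationMs: 200, startsAfterThis: [] },
  activity,
]);
```

In the serial layout ActivityFeed finishes at 500ms; as independent roots it finishes at 300ms, with no change to either fetch's own duration.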

Anti-Pattern 3: Suspense on Synchronous Components

Wrapping a component that doesn't fetch data in Suspense adds React overhead with no benefit:

tsx
// POINTLESS: ButtonGroup has no async data
<Suspense fallback={null}>
  <ButtonGroup actions={actions} />
</Suspense>

Suspense only makes sense when the wrapped component suspends — i.e., it awaits data or uses the use() hook on a Promise.

How error.tsx and Suspense Interact

An error thrown inside a Suspense boundary propagates through the boundary to the nearest error.tsx. The Suspense boundary's fallback disappears and the error boundary's UI renders instead.

This means if you want per-component error recovery, you need both:

tsx
<Suspense fallback={<AnalyticsSkeleton />}>
  <ErrorBoundaryWrapper> {/* catches errors from SlowAnalytics */}
    <SlowAnalytics />
  </ErrorBoundaryWrapper>
</Suspense>

Without the error wrapper, a SlowAnalytics error propagates out of the Suspense boundary and up to the nearest segment-level error.tsx — possibly taking down the entire page segment.

The ErrorBoundaryWrapper is a Client Component that renders an error fallback in place of the failed component:

tsx
'use client'

import { ErrorBoundary } from 'react-error-boundary'

export function ErrorBoundaryWrapper({ children }: { children: React.ReactNode }) {
  return (
    <ErrorBoundary fallback={<AnalyticsError />}>
      {children}
    </ErrorBoundary>
  )
}
