Hibryda a25e024d54 docs: add GPUI breakthrough (4.5% → 0.83% CPU) with shared-entity pattern

2026-03-19 22:36:13 +01:00


GPUI Framework — Detailed Findings

Date: 2026-03-19 (updated 2026-03-19 22:30)
Source: Hands-on prototyping (ui-gpui/, 2,490 lines) + source code analysis of GPUI 0.2.2 + Zed editor + 4 parallel research agents

BREAKTHROUGH: 4.5% → 0.83% CPU

The #1 optimization rule in GPUI: Never make animated state an Entity CHILD of the view that renders it. Make it a PEER Entity that the view READS.

WRONG (4.5% CPU):
  Workspace → ProjectGrid → ProjectBox → PulsingDot (child Entity)
  cx.notify() on PulsingDot → mark_view_dirty walks ancestors
  → Workspace dirty → refreshing=true → ALL .cached() children miss

RIGHT (0.83% CPU):
  Workspace → ProjectGrid → ProjectBox (reads BlinkState)
  BlinkState (peer Entity, not child)
  cx.notify() on BlinkState → only dirties views that .read(cx) it
  → ProjectBox dirty, Workspace/ProjectGrid NOT dirty → siblings cached

Key mechanisms:

.cached(StyleRefinement::default()) on Entity children → on a cache hit GPUI replays the previous frame's scene for that view (its render() is not called)
Shared state Entity instead of child Entity → isolates dirty propagation
mark_view_dirty() walks ancestors — this is GPUI's fundamental constraint
refreshing=true on cache miss cascades down → children can't cache if parent is dirty
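
The WRONG/RIGHT layouts above can be reproduced with a small standalone model. This is a sketch in plain Rust, not GPUI code: `Tree`, `notify_view`, `must_render` and the two layout functions are hypothetical names, and the model only assumes the two rules stated above (a notify on a view dirties all of its ancestors, and a view cannot be served from cache if any ancestor is dirty).

```rust
use std::collections::HashSet;

/// Toy view tree: each view stores the index of its parent (None for the root).
struct Tree {
    names: Vec<&'static str>,
    parent: Vec<Option<usize>>,
}

impl Tree {
    fn new() -> Self {
        Tree { names: Vec::new(), parent: Vec::new() }
    }

    fn add(&mut self, name: &'static str, parent: Option<usize>) -> usize {
        self.names.push(name);
        self.parent.push(parent);
        self.names.len() - 1
    }

    /// Models the ancestor walk: the notified view and all its ancestors become dirty.
    fn notify_view(&self, view: usize, dirty: &mut HashSet<usize>) {
        let mut cur = Some(view);
        while let Some(v) = cur {
            dirty.insert(v);
            cur = self.parent[v];
        }
    }

    /// Models the cascade: a view is re-rendered if it or any ancestor is dirty.
    fn must_render(&self, view: usize, dirty: &HashSet<usize>) -> bool {
        let mut cur = Some(view);
        while let Some(v) = cur {
            if dirty.contains(&v) {
                return true;
            }
            cur = self.parent[v];
        }
        false
    }

    /// Names of all views that miss the cache, in creation order.
    fn rendered(&self, dirty: &HashSet<usize>) -> Vec<&'static str> {
        (0..self.names.len())
            .filter(|&v| self.must_render(v, dirty))
            .map(|v| self.names[v])
            .collect()
    }
}

/// WRONG layout: the animated dot is a child view and notifies itself.
fn child_entity_layout() -> Vec<&'static str> {
    let mut t = Tree::new();
    let ws = t.add("Workspace", None);
    let grid = t.add("ProjectGrid", Some(ws));
    let box_a = t.add("ProjectBoxA", Some(grid));
    let _box_b = t.add("ProjectBoxB", Some(grid));
    let dot = t.add("PulsingDot", Some(box_a));
    let mut dirty = HashSet::new();
    t.notify_view(dot, &mut dirty);
    t.rendered(&dirty)
}

/// RIGHT layout: blink state lives outside the tree; only its readers get dirty.
fn peer_entity_layout() -> Vec<&'static str> {
    let mut t = Tree::new();
    let ws = t.add("Workspace", None);
    let grid = t.add("ProjectGrid", Some(ws));
    let box_a = t.add("ProjectBoxA", Some(grid));
    let _box_b = t.add("ProjectBoxB", Some(grid));
    let readers = [box_a]; // views that read the shared state during render
    let mut dirty = HashSet::new();
    for r in readers {
        dirty.insert(r); // no ancestor walk, per the behaviour described above
    }
    t.rendered(&dirty)
}

fn main() {
    println!("child entity: {:?}", child_entity_layout());
    println!("peer entity:  {:?}", peer_entity_layout());
}
```

In this model the child-entity layout re-renders all five views, including the unrelated sibling ProjectBoxB, while the peer-entity layout re-renders only the one box that reads the shared state.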

Overview

GPUI is the GPU-accelerated UI framework powering the Zed editor. The crate itself is Apache-2.0 licensed (the Zed app is GPL-3.0). It is pre-1.0, with breaking changes every 2-3 months. The Zed repo has 76.7k stars, and Zed is backed by $32M in Sequoia funding.

Architecture

Rendering Model

Hybrid immediate + retained mode
Target: 120 FPS (Metal on macOS, Vulkan via wgpu on Linux/Windows)
No CSS — all styling via Rust method chains: .bg(), .text_color(), .border(), .rounded(), .px(), .py(), .flex()
Text rendering via GPU glyph atlas
Binary size: ~12MB (bare GPUI app)
RAM baseline: ~73-200MB (depending on complexity)

Component Model

struct MyView { /* state */ }

impl Render for MyView {
    fn render(&mut self, window: &mut Window, cx: &mut Context<Self>) -> impl IntoElement {
        div().flex().bg(rgba(0x1e1e2eff)).child("Hello")
    }
}

Entity System

Entity<T> — strong reference to a view/model (Arc-based)
WeakEntity<T> — weak reference (used in async contexts)
Context<T> — mutable access to entity + app state during render/update
cx.new(|cx| T::new()) — create child entity
entity.update(cx, |view, cx| { ... }) — mutate entity from parent
cx.notify() — mark entity as dirty for re-render

State Management

Entity<T> for shared mutable state (like Svelte stores)
cx.observe(&entity, callback) — react to entity changes
entity.read(cx) — read entity state (SUBSCRIBES to changes in render context!)
cx.notify() — trigger re-render of current entity

Event Loop

X11 (Linux)

Uses the calloop event loop with a periodic timer at the monitor refresh rate (from xrandr CRTC info; fallback ≈16.7 ms = 60 Hz)
Each tick: check invalidator.is_dirty() → if false, skip frame (near-zero cost)
If dirty: full window draw + present
When window hidden: timer removed entirely (0% CPU)
File: crates/gpui_linux/src/linux/x11/client.rs

Wayland

Entirely compositor-driven: wl_surface.frame() callback
No fixed timer — compositor tells GPUI when to render
Even more efficient than X11 (no polling)
File: crates/gpui_linux/src/linux/wayland/client.rs

Window Repaint Flow

cx.notify(entity_id)
  → App::notify() → push Effect::Notify
  → Window::invalidate() → set dirty=true
  → calloop tick → is_dirty()=true → Window::draw()
  → Window::draw() → render all dirty views → paint to GPU → present
  → is_dirty()=false → sleep until next tick or next notify
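
The cost structure of this flow can be seen in a minimal simulation. The sketch below is plain Rust, not GPUI: `Window`, `tick`, and `simulate` are hypothetical names, and the only assumption is the behaviour described above (a notify sets a dirty flag, each refresh tick either draws and clears the flag or skips the frame).

```rust
/// Toy window: a dirty flag plus counters for drawn and skipped frames.
struct Window {
    dirty: bool,
    draws: u32,
    skipped: u32,
}

impl Window {
    fn new() -> Self {
        Window { dirty: false, draws: 0, skipped: 0 }
    }

    /// Models cx.notify(): only sets the flag, never draws immediately.
    fn notify(&mut self) {
        self.dirty = true;
    }

    /// Models one refresh tick: draw if dirty, otherwise skip the frame.
    fn tick(&mut self) {
        if self.dirty {
            self.draws += 1;
            self.dirty = false;
        } else {
            self.skipped += 1;
        }
    }
}

/// Runs `ticks` refresh ticks with one notify every `notify_every` ticks
/// (`notify_every` must be non-zero). Returns (frames drawn, frames skipped).
fn simulate(ticks: u32, notify_every: u32) -> (u32, u32) {
    let mut w = Window::new();
    for t in 0..ticks {
        if t % notify_every == 0 {
            w.notify();
        }
        w.tick();
    }
    (w.draws, w.skipped)
}

/// Several notifies between two ticks coalesce into a single draw.
fn coalesced_draws() -> u32 {
    let mut w = Window::new();
    w.notify();
    w.notify();
    w.notify();
    w.tick();
    w.draws
}

fn main() {
    // One second at 60 Hz with a 500 ms blink (one notify every 30 ticks).
    println!("blink:      {:?}", simulate(60, 30));
    // Notifying every tick behaves like a continuous animation-frame loop.
    println!("every tick: {:?}", simulate(60, 1));
    println!("coalesced:  {}", coalesced_draws());
}
```

At 60 Hz a 500 ms blink draws 2 frames per second and skips 58, whereas notifying on every tick draws all 60. That ratio is the intuition behind preferring timer-driven notifies over a per-frame loop.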

Animation Patterns

What We Tested (chronological)

Approach	Result	CPU
`request_animation_frame()` in render	Continuous vsync loop, full window repaint every frame	90%
`cx.spawn()` + `tokio::time::sleep()`	spawn didn't fire from `cx.new()` closure	N/A
Custom `Element` with `paint_quad()` + `request_animation_frame()`	Works but same vsync loop	90%
`cx.spawn()` + `background_executor().timer(200ms)` + `cx.notify()`	Works but timer spawns don't fire from `cx.new()`	10-15%
Zed BlinkManager pattern: `cx.spawn()` from `entity.update()` after registration + `timer(500ms)` + `cx.notify()`	Works correctly	5%
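Same + cached SharedStrings + removed diagnostics	Same pattern, cheaper render tree	4.5%
Shared `BlinkState` peer Entity + `.cached()` children (see BREAKTHROUGH)	Only views that read the state re-render; siblings stay cached	0.83%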

The Correct Pattern (Zed BlinkManager)

From Zed's crates/editor/src/blink_manager.rs:

fn blink_cursors(&mut self, epoch: usize, cx: &mut Context<Self>) {
    if epoch == self.blink_epoch && self.enabled && !self.blinking_paused {
        self.visible = !self.visible;
        cx.notify();  // marks view dirty, does NOT repaint immediately

        let epoch = self.next_blink_epoch();
        let interval = self.blink_interval;
        cx.spawn(async move |this, cx| {
            cx.background_executor().timer(interval).await;
            this.update(cx, |this, cx| this.blink_cursors(epoch, cx));
        }).detach();
    }
}

Key elements:

Epoch counter — each next_blink_epoch() increments a counter. Stale timers check epoch == self.blink_epoch and bail out. No timer accumulation.
Recursive spawn — each blink schedules the next one. No continuous loop.
background_executor().timer() — sleeps the async task for the interval. NOT tokio::time::sleep (GPUI has its own executor).
cx.notify() — marks ONLY this view dirty. Window draws on next calloop tick.
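
The epoch check is easier to follow without the async machinery. The following is a simplified model in plain Rust, not Zed's code: the pending timers are a `VecDeque` of epochs instead of spawned tasks, and `Blink`, `restart`, and `demo` are hypothetical names introduced for illustration.

```rust
use std::collections::VecDeque;

/// Minimal blink state: the current epoch and the visibility flag.
struct Blink {
    blink_epoch: usize,
    visible: bool,
}

impl Blink {
    /// Invalidates every timer scheduled so far and returns the new epoch.
    fn next_blink_epoch(&mut self) -> usize {
        self.blink_epoch += 1;
        self.blink_epoch
    }

    /// Timer callback. Returns the epoch for the next timer,
    /// or None if this timer is stale and must not reschedule.
    fn blink_cursors(&mut self, epoch: usize) -> Option<usize> {
        if epoch != self.blink_epoch {
            return None;
        }
        self.visible = !self.visible;
        Some(self.next_blink_epoch())
    }

    /// Hypothetical restart (for example on user input): starts a fresh timer chain.
    fn restart(&mut self) -> usize {
        self.visible = true;
        self.next_blink_epoch()
    }
}

/// Starts two chains back to back, then processes ten timer events.
/// Returns (timers that fired, stale timers dropped, timers still pending).
fn demo() -> (usize, usize, usize) {
    let mut blink = Blink { blink_epoch: 0, visible: true };
    let mut pending: VecDeque<usize> = VecDeque::new();
    pending.push_back(blink.restart()); // chain A, epoch 1
    pending.push_back(blink.restart()); // chain B, epoch 2 (chain A is now stale)

    let (mut fired, mut stale) = (0, 0);
    for _ in 0..10 {
        let Some(epoch) = pending.pop_front() else { break };
        match blink.blink_cursors(epoch) {
            Some(next) => {
                fired += 1;
                pending.push_back(next); // recursive spawn: schedule exactly one successor
            }
            None => stale += 1, // stale timer bails out without rescheduling
        }
    }
    (fired, stale, pending.len())
}

fn main() {
    println!("(fired, stale, pending) = {:?}", demo());
}
```

Even though two chains were started, exactly one timer is pending at the end: the stale one is dropped the first time it wakes, so restarts do not accumulate timers.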

Critical Gotcha: `cx.spawn()` Inside `cx.new()`

In our prototype, cx.spawn() called inside a cx.new(|cx| { ... }) closure did NOT execute: the spawn was created, but the async task never ran. Our working explanation is that the entity is not yet registered with the window at that point.

Fix: Call start_blinking() via entity.update(cx, |dot, cx| dot.start_blinking(cx)) AFTER cx.new() returns, from the parent's init_subviews().

Why 4.5% CPU Instead of ~1% (Zed)

cx.notify() on PulsingDot (a child Entity) marks the dot and every ancestor dirty: ProjectBox, ProjectGrid, and Workspace. Once Workspace is dirty, the views below it can no longer be served from cache, so our prototype rebuilds ~200 div elements per window redraw.

Zed achieves ~1% because:

Its render tree is heavily optimized with caching
Static parts are pre-computed, not rebuilt per frame
Entity children (Entity<T>) passed via .child(entity) are cached by GPUI
Zed's element tree is much deeper but uses IDs for efficient diffing

Optimization path from 4.5% (as planned before the shared-entity fix described under BREAKTHROUGH):

Move header, tab bar, content area into separate Entity views (GPUI caches them independently)
Avoid re-building tab buttons and static text on every render
Use SharedString everywhere (done — reduced from 5% to 4.5%)

API Reference (GPUI 0.2.2)

Entity Creation

let entity: Entity<T> = cx.new(|cx: &mut Context<T>| T::new());

Entity Update (from parent)

entity.update(cx, |view: &mut T, cx: &mut Context<T>| {
    view.do_something(cx);
});

Async Spawn (from entity context)

cx.spawn(async move |weak: WeakEntity<Self>, cx: &mut AsyncApp| {
    cx.background_executor().timer(Duration::from_millis(500)).await;
    weak.update(cx, |view, cx| {
        view.mutate();
        cx.notify();
    }).ok();
}).detach();

Timer

cx.background_executor().timer(Duration::from_millis(500)).await;

Color

// rgba(0xRRGGBBAA) — u32 hex
let green = rgba(0xa6e3a1ff);
let semi_transparent = rgba(0xa6e3a180);
// Note: alpha on bg() may not work on small elements; use color interpolation instead
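
One way to read "color interpolation" here is to blend the foreground toward the background and paint the result fully opaque, instead of lowering alpha. The sketch below is plain Rust on packed 0xRRGGBBAA values, independent of GPUI; `unpack`, `pack`, and `lerp_color` are hypothetical helpers, and the background value is the one used in the Component Model example.

```rust
/// Splits 0xRRGGBBAA into [r, g, b, a] bytes.
fn unpack(hex: u32) -> [u8; 4] {
    hex.to_be_bytes()
}

/// Packs [r, g, b, a] bytes back into 0xRRGGBBAA.
fn pack(channels: [u8; 4]) -> u32 {
    u32::from_be_bytes(channels)
}

/// Per-channel linear interpolation; `t` is clamped to 0.0..=1.0.
/// t = 0.0 returns `from`, t = 1.0 returns `to`.
fn lerp_color(from: u32, to: u32, t: f32) -> u32 {
    let t = t.clamp(0.0, 1.0);
    let (a, b) = (unpack(from), unpack(to));
    let mut out = [0u8; 4];
    for i in 0..4 {
        let v = a[i] as f32 + (b[i] as f32 - a[i] as f32) * t;
        out[i] = v.round() as u8;
    }
    pack(out)
}

fn main() {
    let green = 0xa6e3a1ff; // dot color from the example above
    let background = 0x1e1e2eff; // background used in the Component Model example
    // Fade the dot toward the background in four steps, always fully opaque.
    for step in 0..=4 {
        let t = step as f32 / 4.0;
        println!("t = {:.2} -> {:#010x}", t, lerp_color(green, background, t));
    }
}
```

The resulting u32 can be passed to rgba() like any other literal, so the dot appears to fade while every painted color keeps an alpha of 0xff.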

Layout

div()
    .flex()           // display: flex
    .flex_col()       // flex-direction: column
    .flex_row()       // flex-direction: row
    .flex_1()         // flex: 1
    .w_full()         // width: 100%
    .h(px(36.0))      // height: 36px
    .min_w(px(400.0))
    .px(px(12.0))     // padding-left + padding-right
    .py(px(8.0))      // padding-top + padding-bottom
    .gap(px(8.0))     // gap
    .rounded(px(8.0)) // border-radius
    .border_1()       // border-width: 1px
    .border_color(color)
    .bg(color)
    .text_color(color)
    .text_size(px(13.0))
    .overflow_hidden()
    .items_center()   // align-items: center
    .justify_center() // justify-content: center
    .cursor_pointer()
    .hover(|s| s.bg(hover_color))
    .id("unique-id")  // for diffing + hit testing
    .child(element_or_entity)
    .children(option_entity) // renders Some, skips None

Custom Element (for direct GPU painting)

impl Element for MyElement {
    type RequestLayoutState = ();
    type PrepaintState = ();

    fn id(&self) -> Option<ElementId> { None }
    fn source_location(&self) -> Option<&'static Location<'static>> { None }

    fn request_layout(&mut self, ..., window: &mut Window, cx: &mut App)
        -> (LayoutId, ()) {
        let layout_id = window.request_layout(Style { size: ..., .. }, [], cx);
        (layout_id, ())
    }

    fn prepaint(&mut self, ..., bounds: Bounds<Pixels>, ...) -> () { () }

    fn paint(&mut self, ..., bounds: Bounds<Pixels>, ..., window: &mut Window, ...) {
        window.paint_quad(fill(bounds, color).corner_radii(radius));
    }
}

Animation Frame Scheduling

// From render() — schedules next vsync frame (CAUTION: 60fps = 90% CPU)
window.request_animation_frame();

// From anywhere — schedule callback on next frame
window.on_next_frame(|window, cx| { /* ... */ });

Window

window.set_window_title("Title");
window.request_animation_frame(); // schedule next frame
window.on_next_frame(callback);   // callback on next frame
window.paint_quad(quad);          // direct GPU paint in Element::paint()

Known Limitations (2026-03)

Pre-1.0 API — breaking changes every 2-3 months
No per-view-only repaint by default: cx.notify() on a view also dirties its ancestors, which defeats .cached() on siblings (workaround: shared peer Entity, see BREAKTHROUGH)
cx.spawn() in cx.new() doesn't fire — must call after entity registration
rgba() alpha on .bg() unreliable for small elements — use color interpolation
No CSS — every style must be expressed via Rust methods
No WebDriver — can't use existing E2E test infrastructure
No plugin host API — must build your own (WASM/wasmtime or subprocess)
Sparse documentation — "read Zed source" is the primary reference
macOS-first — Linux (X11/Wayland) added 2025, Windows added late 2025
X11 calloop polls at monitor Hz — non-zero baseline CPU even when idle (~0.5%)

Comparison with Dioxus Blitz

Aspect	GPUI	Dioxus Blitz
Styling	Rust methods (`.bg()`, `.flex()`)	CSS (same as browser)
Animation	Spawn + timer + notify (~4.5% CPU; 0.83% with the shared-entity pattern)	Class toggle + no CSS transition (~5% CPU)
Animation limit	cx.notify propagates to window	CSS transition = full scene repaint
Custom paint	Yes (Element trait + paint_quad)	No (CSS only, no shader/canvas API)
Render model	Retained views + element diff	HTML/CSS via Vello compute shaders
Terminal	alacritty_terminal + GPUI rendering	xterm.js in WebView (or custom build)
Migration cost	Full rewrite (no web tech)	Low (same wry webview as Tauri)
Ecosystem	60+ components (gpui-component)	CSS ecosystem (any web component)
Text rendering	GPU glyph atlas	Vello compute shader text

Files in Our Prototype

ui-gpui/
├── Cargo.toml
├── src/
│   ├── main.rs              — App entry, window creation
│   ├── theme.rs             — Catppuccin Mocha as const Rgba values
│   ├── state.rs             — AppState, Project, AgentSession types
│   ├── backend.rs           — GpuiEventSink + Backend (PtyManager bridge)
│   ├── workspace.rs         — Root view (sidebar + grid + statusbar)
│   ├── components/
│   │   ├── sidebar.rs       — Icon rail
│   │   ├── status_bar.rs    — Bottom bar (agent counts, cost)
│   │   ├── project_grid.rs  — Grid of ProjectBox entities
│   │   ├── project_box.rs   — Project card (header, tabs, content, dot)
│   │   ├── agent_pane.rs    — Message list + prompt
│   │   ├── pulsing_dot.rs   — Animated status dot (BlinkManager pattern)
│   │   ├── settings.rs      — Settings drawer
│   │   └── command_palette.rs — Ctrl+K overlay
│   └── terminal/
│       ├── renderer.rs      — GPU terminal (alacritty_terminal cells)
│       └── pty_bridge.rs    — PTY via agor-core
