# GPUI Framework — Detailed Findings
**Date:** 2026-03-19 (updated 2026-03-19 22:30)
**Source:** Hands-on prototyping (ui-gpui/, 2,490 lines) + source code analysis of GPUI 0.2.2 + Zed editor + 4 parallel research agents
## BREAKTHROUGH: 4.5% → 0.83% CPU
**The #1 optimization rule in GPUI:** Never make animated state an Entity CHILD of the view that renders it. Make it a PEER Entity that the view READS.
```
WRONG (4.5% CPU):
  Workspace → ProjectGrid → ProjectBox → PulsingDot (child Entity)
  cx.notify() on PulsingDot → mark_view_dirty walks ancestors
    → Workspace dirty → refreshing=true → ALL .cached() children miss

RIGHT (0.83% CPU):
  Workspace → ProjectGrid → ProjectBox (reads BlinkState)
  BlinkState (peer Entity, not child)
  cx.notify() on BlinkState → only dirties views that .read(cx) it
    → ProjectBox dirty, Workspace/ProjectGrid NOT dirty → siblings cached
```
Key mechanisms:
1. **`.cached(StyleRefinement::default())`** on Entity children → GPU scene replay (zero Rust code)
2. **Shared state Entity** instead of child Entity → isolates dirty propagation
3. **`mark_view_dirty()` walks ancestors** — this is GPUI's fundamental constraint
4. **`refreshing=true` on cache miss** cascades down → children can't cache if parent is dirty
## DEEP DIVE: Why 3% is the floor for div-based views (2026-03-19 late)
### The full dirty propagation chain (confirmed by source analysis)
```
cx.notify(entity_id)
  → App::notify → WindowInvalidator::invalidate_view(entity_id)
    → dirty_views.insert(entity_id)
  → draw() → invalidate_entities() → mark_view_dirty(entity_id)
    → dispatch_tree.view_path_reversed(entity_id) → walks parent chain
    → inserts entity + ALL render-time ancestors into dirty_views
  → AnyView::prepaint checks: dirty_views.contains(MY entity_id)
    → if true: cache MISS, re-render
    → if false: cache HIT, replay GPU commands
```
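The ancestor walk can be reproduced in a few lines of plain Rust. This is a toy model, not GPUI code: `ViewTree`, `add_view`, and `is_cache_hit` are hypothetical names, view ids are plain integers, and the `refreshing=true` cascade is deliberately left out. It only shows which ids end up in `dirty_views` after a notify.

```rust
use std::collections::{HashMap, HashSet};

/// Toy model of a view tree: each view id maps to its parent (None for the root).
/// Hypothetical structure for illustration; GPUI's dispatch tree is more involved.
pub struct ViewTree {
    parent: HashMap<u32, Option<u32>>,
    dirty_views: HashSet<u32>,
}

impl ViewTree {
    pub fn new() -> Self {
        Self { parent: HashMap::new(), dirty_views: HashSet::new() }
    }

    /// Register a view under an optional parent.
    pub fn add_view(&mut self, id: u32, parent: Option<u32>) {
        self.parent.insert(id, parent);
    }

    /// Model of `mark_view_dirty`: insert the view and every ancestor.
    pub fn mark_view_dirty(&mut self, id: u32) {
        let mut current = Some(id);
        while let Some(view) = current {
            // If the view was already dirty, its ancestors were inserted
            // by an earlier walk, so the loop can stop here.
            if !self.dirty_views.insert(view) {
                break;
            }
            current = self.parent.get(&view).copied().flatten();
        }
    }

    /// Model of the prepaint check: a view absent from `dirty_views` is a cache hit.
    pub fn is_cache_hit(&self, id: u32) -> bool {
        !self.dirty_views.contains(&id)
    }
}
```

With a Workspace (1) holding two ProjectBoxes (2, 3) and a dot (4) under the first box, `mark_view_dirty(4)` dirties 4, 2, and 1, while the sibling box 3 stays a cache hit in this model.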
### Why Zed achieves <1% with the SAME mechanism
Zed's chain is BlinkManager → observer → `cx.notify()` on Editor, which puts Editor, Pane, and Workspace ALL in dirty_views.
ALL three re-render, but each render is cheap:
- Workspace::render() = ~5 div builders = <0.1ms
- Pane::render() = ~10 div builders = <0.2ms
- Editor::render() = returns EditorElement struct = <0.01ms
- EditorElement::prepaint/paint = real work, bounded to visible rows
Total: <1ms per blink frame. Our ProjectBox::render() = ~9 divs + 3 tab buttons = ~15ms.
The 15ms includes Taffy layout + element prepaint + paint pipeline overhead per div.
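These per-frame costs line up with the measured CPU figures under a simple model: CPU share is roughly frame cost times frames per second. The sketch below assumes one notify produces exactly one frame and that all work lands on a single core; `cpu_percent` and `frames_per_second` are hypothetical helper names.

```rust
/// Frames per second produced by a timer that fires every `interval_ms`.
pub fn frames_per_second(interval_ms: f64) -> f64 {
    1000.0 / interval_ms
}

/// Share of one core (in percent) spent rendering, given the cost of one
/// frame in milliseconds and the number of frames drawn per second.
pub fn cpu_percent(frame_cost_ms: f64, frames_per_second: f64) -> f64 {
    // Milliseconds of work per second, out of 1000 ms, scaled to percent.
    frame_cost_ms * frames_per_second / 1000.0 * 100.0
}
```

A 500ms blink interval gives 2 frames per second, so a 15ms frame costs about 3% of a core, which matches the observed floor. The same 15ms at 60 frames per second would be about 90%, roughly in line with the `request_animation_frame()` measurements, and a sub-1ms frame at 2 frames per second stays under 0.2%.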
### SharedBlink pattern (Arc<AtomicBool>)
Replaced Entity<BlinkState> with `Arc<AtomicBool>`:
- Background timer toggles atomic bool every 500ms
- Timer calls cx.notify() on ProjectBox directly (no intermediate entity)
- ProjectBox::render() reads bool atomically — no Entity subscription overhead
- Same 3% CPU — confirms cost is in render() itself, not entity machinery
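A minimal sketch of the shared flag itself, using only the standard library. The `SharedBlink` type and its method names are assumptions made for illustration; the GPUI timer and the `cx.notify()` call that would follow each toggle are not shown.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Cloneable handle to a blink flag shared between a timer and a view.
/// Hypothetical type; the prototype's actual struct may differ.
#[derive(Clone)]
pub struct SharedBlink {
    visible: Arc<AtomicBool>,
}

impl SharedBlink {
    pub fn new() -> Self {
        Self { visible: Arc::new(AtomicBool::new(true)) }
    }

    /// Timer side: flip the flag atomically and return the new value.
    pub fn toggle(&self) -> bool {
        // fetch_xor(true) flips the bit and returns the previous value.
        !self.visible.fetch_xor(true, Ordering::Relaxed)
    }

    /// Render side: read the flag without locking or subscribing.
    pub fn is_visible(&self) -> bool {
        self.visible.load(Ordering::Relaxed)
    }
}
```

The timer task would hold one clone and call `toggle()` every 500ms; the view holds another clone and calls `is_visible()` inside `render()`. `Ordering::Relaxed` is assumed to be sufficient here because the flag carries no other data with it.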
### Hierarchy flattening (4 levels → 2 levels)
Removed ProjectGrid entity level. ProjectBoxes render directly from Workspace.
Dispatch tree: Workspace → ProjectBox (2 levels, same as Zed's Workspace → Pane).
CPU unchanged at 3% — confirms cost is per-frame render(), not ancestor walk count.
### Path to <1%: Custom Element for ProjectBox header
Convert header (accent stripe + dot + name + CWD + tab bar) from div tree to
custom `impl Element` that paints directly via `window.paint_quad()` +
`window.text_system().shape_line().paint()`. This eliminates:
- 9 div() constructor calls
- 9 Taffy layout nodes
- 9 prepaint traversals
- 9 paint traversals
Replaced by ~6 direct GPU primitive insertions (paint_quad × 4 + shape_line × 2).
Expected reduction: 15ms → <1ms per frame.
## Overview
GPUI is the GPU-accelerated UI framework powering the Zed editor. Apache-2.0 licensed (the crate itself; Zed app is GPL-3.0). Pre-1.0 with breaking changes every 2-3 months. 76.7k stars (Zed repo), $32M Sequoia funding.
## Architecture
### Rendering Model
- Hybrid immediate + retained mode
- Target: 120 FPS (Metal on macOS, Vulkan via wgpu on Linux/Windows)
- No CSS: all styling via Rust method chains: `.bg()`, `.text_color()`, `.border()`, `.rounded()`, `.px()`, `.py()`, `.flex()`
- Text rendering via GPU glyph atlas
- Binary size: ~12MB (bare GPUI app)
- RAM baseline: ~73-200MB (depending on complexity)
### Component Model
```rust
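use gpui::{div, prelude::*, rgba, Context, Window};

struct MyView { /* state */ }

impl Render for MyView {
    // Called whenever this view is dirty; returns a fresh element tree.
    fn render(&mut self, _window: &mut Window, _cx: &mut Context<Self>) -> impl IntoElement {
        div().flex().bg(rgba(0x1e1e2eff)).child("Hello")
    }
}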
```
### Entity System
- `Entity<T>`: strong reference to a view/model (Arc-based)
- `WeakEntity<T>`: weak reference (used in async contexts)
- `Context<T>`: mutable access to entity + app state during render/update
- `cx.new(|cx| T::new())`: create child entity
- `entity.update(cx, |view, cx| { ... })`: mutate entity from parent
- `cx.notify()`: mark entity as dirty for re-render
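The strong/weak split explains why async tasks hold a `WeakEntity<T>` and why their `update` calls can fail. The sketch below models this with the standard library's `Rc` and `Weak`; `Handle` and `WeakHandle` are hypothetical stand-ins and do not mirror GPUI's real types or signatures.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Stand-in for `Entity<T>`: a strong handle that keeps the state alive.
pub struct Handle<T>(Rc<RefCell<T>>);

/// Stand-in for `WeakEntity<T>`: does not keep the state alive.
pub struct WeakHandle<T>(Weak<RefCell<T>>);

impl<T> Handle<T> {
    pub fn new(value: T) -> Self {
        Handle(Rc::new(RefCell::new(value)))
    }

    pub fn downgrade(&self) -> WeakHandle<T> {
        WeakHandle(Rc::downgrade(&self.0))
    }

    /// Mutate the state; always succeeds because the handle is strong.
    pub fn update<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        f(&mut self.0.borrow_mut())
    }
}

impl<T> WeakHandle<T> {
    /// Mutate the state if it is still alive; returns None once every
    /// strong handle has been dropped.
    pub fn update<R>(&self, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        let strong = self.0.upgrade()?;
        let result = f(&mut strong.borrow_mut());
        Some(result)
    }
}
```

A timer task that outlives its view sees `None` from `update` in this model and can simply stop, instead of keeping the view alive forever.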
### State Management
- `Entity<T>`: for shared mutable state (like Svelte stores)
- `cx.observe(&entity, callback)`: react to entity changes
- `entity.read(cx)`: read entity state (SUBSCRIBES to changes in render context!)
- `cx.notify()`: trigger re-render of current entity
## Event Loop
### X11 (Linux)
- Uses `calloop` event loop with periodic timer at monitor refresh rate (from xrandr CRTC info, fallback 16.6ms = 60Hz)
- Each tick: check `invalidator.is_dirty()`; if false, skip frame (near-zero cost)
- If dirty: full window draw + present
- When window hidden: timer removed entirely (0% CPU)
- File: `crates/gpui_linux/src/linux/x11/client.rs`
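The tick behaviour described above can be sketched without any windowing code. This is a simplified model under stated assumptions: `frame_interval`, `FrameLoop`, and its methods are hypothetical names, and a counter stands in for the real draw and present calls.

```rust
use std::time::Duration;

/// Frame interval derived from the monitor refresh rate, falling back to
/// 60 Hz (about 16.6 ms) when no usable rate is available.
pub fn frame_interval(refresh_hz: Option<f64>) -> Duration {
    let hz = refresh_hz.filter(|hz| *hz > 0.0).unwrap_or(60.0);
    Duration::from_secs_f64(1.0 / hz)
}

/// Toy frame loop: a dirty flag plus a counter of frames actually drawn.
pub struct FrameLoop {
    dirty: bool,
    frames_drawn: u64,
}

impl FrameLoop {
    pub fn new() -> Self {
        Self { dirty: false, frames_drawn: 0 }
    }

    /// Stand-in for invalidation triggered by a notify.
    pub fn notify(&mut self) {
        self.dirty = true;
    }

    /// One timer tick: draw only if something was invalidated.
    /// Returns true when a frame was drawn.
    pub fn tick(&mut self) -> bool {
        if !self.dirty {
            return false; // clean window: skip the frame
        }
        self.frames_drawn += 1; // stands in for full window draw + present
        self.dirty = false;
        true
    }

    pub fn frames_drawn(&self) -> u64 {
        self.frames_drawn
    }
}
```

In this model an idle window still wakes on every tick but does no drawing, which is consistent with the small non-zero idle baseline noted for X11.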
### Wayland
- Entirely compositor-driven: `wl_surface.frame()` callback
- No fixed timer: compositor tells GPUI when to render
- Even more efficient than X11 (no polling)
- File: `crates/gpui_linux/src/linux/wayland/client.rs`
### Window Repaint Flow
```
cx.notify(entity_id)
→ App::notify() → push Effect::Notify
→ Window::invalidate() → set dirty=true
→ calloop tick → is_dirty()=true → Window::draw()
→ Window::draw() → render all dirty views → paint to GPU → present
→ is_dirty()=false → sleep until next tick or next notify
```
## Animation Patterns
### What We Tested (chronological)
| Approach | Result | CPU |
|----------|--------|-----|
| `request_animation_frame()` in render | Continuous vsync loop, full window repaint every frame | **90%** |
| `cx.spawn()` + `tokio::time::sleep()` | spawn didn't fire from `cx.new()` closure | N/A |
| Custom `Element` with `paint_quad()` + `request_animation_frame()` | Works but same vsync loop | **90%** |
| `cx.spawn()` + `background_executor().timer(200ms)` + `cx.notify()` | Works but timer spawns don't fire from `cx.new()` | **10-15%** |
| Zed BlinkManager pattern: `cx.spawn()` from `entity.update()` after registration + `timer(500ms)` + `cx.notify()` | **Works correctly** | **5%** |
| Same + cached SharedStrings + removed diagnostics | Same pattern, cheaper render tree | **4.5%** |
### The Correct Pattern (Zed BlinkManager)
From Zed's `crates/editor/src/blink_manager.rs`:
```rust
fn blink_cursors(&mut self, epoch: usize, cx: &mut Context<Self>) {
    if epoch == self.blink_epoch && self.enabled && !self.blinking_paused {
        self.visible = !self.visible;
        cx.notify(); // marks view dirty, does NOT repaint immediately
        let epoch = self.next_blink_epoch();
        let interval = self.blink_interval;
        cx.spawn(async move |this, cx| {
            cx.background_executor().timer(interval).await;
            this.update(cx, |this, cx| this.blink_cursors(epoch, cx));
        }).detach();
    }
}
```
Key elements:
1. **Epoch counter**: each `next_blink_epoch()` call increments a counter. Stale timers check `epoch == self.blink_epoch` and bail out. No timer accumulation.
2. **Recursive spawn**: each blink schedules the next one. No continuous loop.
3. **`background_executor().timer()`**: sleeps the async task for the interval. NOT `tokio::time::sleep` (GPUI has its own executor).
4. **`cx.notify()`**: marks ONLY this view dirty. Window draws on next calloop tick.
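The epoch check is easy to get wrong, so here is a stripped-down model of just that part. It is a sketch, not Zed's code: the `Blinker` type is hypothetical, the `blinking_paused` flag is omitted, and instead of spawning a timer, `blink` returns the epoch that the next timer would carry.

```rust
/// Minimal model of the epoch-guarded blink state.
pub struct Blinker {
    blink_epoch: usize,
    visible: bool,
    enabled: bool,
}

impl Blinker {
    pub fn new() -> Self {
        Self { blink_epoch: 0, visible: true, enabled: true }
    }

    /// Invalidate all outstanding timers and return the new epoch.
    pub fn next_blink_epoch(&mut self) -> usize {
        self.blink_epoch += 1;
        self.blink_epoch
    }

    /// Called when a timer fires. Returns the epoch for the next timer,
    /// or None when the timer was stale and nothing should be scheduled.
    pub fn blink(&mut self, epoch: usize) -> Option<usize> {
        if epoch != self.blink_epoch || !self.enabled {
            return None; // stale or disabled: bail out without toggling
        }
        self.visible = !self.visible;
        Some(self.next_blink_epoch())
    }

    pub fn is_visible(&self) -> bool {
        self.visible
    }
}
```

If something restarts blinking while a timer is in flight, the old timer fires with an outdated epoch, gets `None`, and schedules nothing further. Only the newest timer keeps the chain alive, which is why timers do not accumulate.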
### Critical Gotcha: `cx.spawn()` Inside `cx.new()`
**`cx.spawn()` called inside a `cx.new(|cx| { ... })` closure does NOT execute.** The entity is not yet registered with the window at that point. The spawn is created but the async task never runs.
**Fix:** Call `start_blinking()` via `entity.update(cx, |dot, cx| dot.start_blinking(cx))` AFTER `cx.new()` returns, from the parent's `init_subviews()`.
### Why 4.5% CPU Instead of ~1% (Zed)
`cx.notify()` on PulsingDot propagates to the window level. GPUI redraws ALL dirty views in the window including parent views (Workspace, ProjectGrid, ProjectBox). Our prototype rebuilds ~200 div elements per window redraw.
Zed achieves ~1% because:
1. Its render tree is heavily optimized with caching
2. Static parts are pre-computed, not rebuilt per frame
3. Entity children (`Entity<T>`) passed via `.child(entity)` are cached by GPUI
4. Zed's element tree is much deeper but uses IDs for efficient diffing
Our remaining 4.5% optimization path:
- Move header, tab bar, content area into separate Entity views (GPUI caches them independently)
- Avoid re-building tab buttons and static text on every render
- Use `SharedString` everywhere (done: reduced from 5% to 4.5%)
## API Reference (GPUI 0.2.2)
### Entity Creation
```rust
let entity: Entity<T> = cx.new(|cx: &mut Context<T>| T::new());
```
### Entity Update (from parent)
```rust
entity.update(cx, |view: &mut T, cx: &mut Context<T>| {
    view.do_something(cx);
});
```
### Async Spawn (from entity context)
```rust
cx.spawn(async move |weak: WeakEntity<Self>, cx: &mut AsyncApp| {
    cx.background_executor().timer(Duration::from_millis(500)).await;
    weak.update(cx, |view, cx| {
        view.mutate();
        cx.notify();
    }).ok();
}).detach();
```
### Timer
```rust
cx.background_executor().timer(Duration::from_millis(500)).await;
```
### Color
```rust
// rgba(0xRRGGBBAA) — u32 hex
let green = rgba(0xa6e3a1ff);
let semi_transparent = rgba(0xa6e3a180);
// Note: alpha on bg() may not work on small elements; use color interpolation instead
```
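The workaround in the note above (color interpolation instead of alpha) amounts to blending the translucent color into the known background ahead of time and passing the opaque result to `.bg()`. The helpers below are a standalone sketch with hypothetical names (`unpack_rgba`, `blend_over`); they operate on raw `0xRRGGBBAA` values and do not touch GPUI's color types.

```rust
/// Split a 0xRRGGBBAA value into its four 8-bit channels [r, g, b, a].
pub fn unpack_rgba(hex: u32) -> [u8; 4] {
    hex.to_be_bytes()
}

/// Blend `fg` over `bg` using the alpha channel of `fg`, returning a fully
/// opaque 0xRRGGBBFF value. The alpha of `bg` is ignored.
pub fn blend_over(fg: u32, bg: u32) -> u32 {
    let [fr, fg_green, fb, fa] = unpack_rgba(fg);
    let [br, bg_green, bb, _] = unpack_rgba(bg);
    let t = fa as f32 / 255.0;
    // Linear interpolation per channel: b + (f - b) * t, rounded to nearest.
    let mix = |f: u8, b: u8| (b as f32 + (f as f32 - b as f32) * t).round() as u8;
    u32::from_be_bytes([mix(fr, br), mix(fg_green, bg_green), mix(fb, bb), 0xff])
}
```

For a pulsing dot, this means computing a handful of opaque shades between the background and the accent color once, then switching between them on each blink.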
### Layout
```rust
div()
    .flex()                     // display: flex
    .flex_col()                 // flex-direction: column
    .flex_row()                 // flex-direction: row
    .flex_1()                   // flex: 1
    .w_full()                   // width: 100%
    .h(px(36.0))                // height: 36px
    .min_w(px(400.0))
    .px(px(12.0))               // padding-left + padding-right
    .py(px(8.0))                // padding-top + padding-bottom
    .gap(px(8.0))               // gap
    .rounded(px(8.0))           // border-radius
    .border_1()                 // border-width: 1px
    .border_color(color)
    .bg(color)
    .text_color(color)
    .text_size(px(13.0))
    .overflow_hidden()
    .items_center()             // align-items: center
    .justify_center()           // justify-content: center
    .cursor_pointer()
    .hover(|s| s.bg(hover_color))
    .id("unique-id")            // for diffing + hit testing
    .child(element_or_entity)
    .children(option_entity)    // renders Some, skips None
```
### Custom Element (for direct GPU painting)
```rust
impl Element for MyElement {
    type RequestLayoutState = ();
    type PrepaintState = ();

    fn id(&self) -> Option<ElementId> { None }
    fn source_location(&self) -> Option<&'static Location<'static>> { None }

    fn request_layout(&mut self, ..., window: &mut Window, cx: &mut App)
        -> (LayoutId, ()) {
        let layout_id = window.request_layout(Style { size: ..., .. }, [], cx);
        (layout_id, ())
    }

    fn prepaint(&mut self, ..., bounds: Bounds<Pixels>, ...) -> () { () }

    fn paint(&mut self, ..., bounds: Bounds<Pixels>, ..., window: &mut Window, ...) {
        window.paint_quad(fill(bounds, color).corner_radii(radius));
    }
}
```
### Animation Frame Scheduling
```rust
// From render() — schedules next vsync frame (CAUTION: 60fps = 90% CPU)
window.request_animation_frame();
// From anywhere — schedule callback on next frame
window.on_next_frame(|window, cx| { /* ... */ });
```
### Window
```rust
window.set_window_title("Title");
window.request_animation_frame(); // schedule next frame
window.on_next_frame(callback); // callback on next frame
window.paint_quad(quad); // direct GPU paint in Element::paint()
```
## Known Limitations (2026-03)
1. **Pre-1.0 API**: breaking changes every 2-3 months
2. **No per-view-only repaint**: `cx.notify()` propagates to window level, redraws all dirty views
3. **`cx.spawn()` in `cx.new()` doesn't fire**: must call after entity registration
4. **`rgba()` alpha on `.bg()` unreliable**: for small elements, use color interpolation
5. **No CSS**: every style must be expressed via Rust methods
6. **No WebDriver**: can't use existing E2E test infrastructure
7. **No plugin host API**: must build your own (WASM/wasmtime or subprocess)
8. **Sparse documentation**: "read Zed source" is the primary reference
9. **macOS-first**: Linux (X11/Wayland) added 2025, Windows added late 2025
10. **X11 calloop polls at monitor Hz**: non-zero baseline CPU even when idle (~0.5%)
## Comparison with Dioxus Blitz
| Aspect | GPUI | Dioxus Blitz |
|--------|------|-------------|
| Styling | Rust methods (`.bg()`, `.flex()`) | CSS (same as browser) |
| Animation | Spawn + timer + notify (~4.5% CPU) | Class toggle + no CSS transition (~5% CPU) |
| Animation limit | cx.notify propagates to window | CSS transition = full scene repaint |
| Custom paint | Yes (Element trait + paint_quad) | No (CSS only, no shader/canvas API) |
| Render model | Retained views + element diff | HTML/CSS via Vello compute shaders |
| Terminal | alacritty_terminal + GPUI rendering | xterm.js in WebView (or custom build) |
| Migration cost | Full rewrite (no web tech) | Low (same wry webview as Tauri) |
| Ecosystem | 60+ components (gpui-component) | CSS ecosystem (any web component) |
| Text rendering | GPU glyph atlas | Vello compute shader text |
## Files in Our Prototype
```
ui-gpui/
├── Cargo.toml
├── src/
│ ├── main.rs — App entry, window creation
│ ├── theme.rs — Catppuccin Mocha as const Rgba values
│ ├── state.rs — AppState, Project, AgentSession types
│ ├── backend.rs — GpuiEventSink + Backend (PtyManager bridge)
│ ├── workspace.rs — Root view (sidebar + grid + statusbar)
│ ├── components/
│ │ ├── sidebar.rs — Icon rail
│ │ ├── status_bar.rs — Bottom bar (agent counts, cost)
│ │ ├── project_grid.rs — Grid of ProjectBox entities
│ │ ├── project_box.rs — Project card (header, tabs, content, dot)
│ │ ├── agent_pane.rs — Message list + prompt
│ │ ├── pulsing_dot.rs — Animated status dot (BlinkManager pattern)
│ │ ├── settings.rs — Settings drawer
│ │ └── command_palette.rs — Ctrl+K overlay
│ └── terminal/
│ ├── renderer.rs — GPU terminal (alacritty_terminal cells)
│ └── pty_bridge.rs — PTY via agor-core
```
## Sources
- [GPUI crate (crates.io)](https://crates.io/crates/gpui)
- [GPUI README](https://github.com/zed-industries/zed/blob/main/crates/gpui/README.md)
- [Zed BlinkManager source](https://github.com/zed-industries/zed/blob/main/crates/editor/src/blink_manager.rs)
- [GPUI X11 client source](https://github.com/zed-industries/zed/blob/main/crates/gpui_linux/src/linux/x11/client.rs)
- [GPUI Wayland client source](https://github.com/zed-industries/zed/blob/main/crates/gpui_linux/src/linux/wayland/client.rs)
- [GPUI window.rs source](https://github.com/zed-industries/zed/blob/main/crates/gpui/src/window.rs)
- [gpui-component library](https://github.com/4t145/gpui-component)
- [awesome-gpui list](https://github.com/zed-industries/awesome-gpui)
- [Zed GPU rendering blog](https://zed.dev/blog/videogame)