Architecture
This document covers how chit works internally. It's for power users who want to understand the implementation, contribute, or debug complex scenarios.
Component Overview
┌─────────────────────────────────────────────────────────────┐
│ Chat │
│ ┌─────────┐ ┌───────────┐ ┌─────────┐ ┌─────────────┐ │
│ │ ID │ │ Session │ │ Config │ │Continuation │ │
│ └─────────┘ └───────────┘ └─────────┘ └─────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Handle() │ │
│ │ ┌──────────┐ ┌──────────────────┐ │ │
│ │ │Processor │◄────────────►│ Continuation │ │ │
│ │ └──────────┘ └──────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────┐ │ │
│ │ │ Emitter │───────────────────────────────────────►│ │
│ │ └──────────┘ Output │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Chat owns all state. Handle() is the single entry point. It delegates to either the Processor (fresh input) or Continuation (resumed input), then routes the result through the Emitter.
Pipeline Architecture
Chit uses pipz to compose reliability patterns. When you configure options like WithRetry or WithTimeout, chit builds a processing pipeline:
Input → [Rate Limit] → [Timeout] → [Retry] → [Middleware...] → Terminal → Result
The Terminal is the innermost layer that calls the actual processor. Everything else wraps it.
Continuation Pipeline
When a continuation is resumed, chit builds a separate pipeline with the same options:
```go
func (c *Chat) buildContinuationPipeline(cont Continuation) pipz.Chainable[*ChatRequest] {
	pipeline := NewContinuationTerminal(cont) // Different terminal
	for _, opt := range c.pipelineOpts {      // Same options applied
		pipeline = opt(pipeline)
	}
	return pipeline
}
```
This ensures reliability parity — a continuation gets the same retry, timeout, rate limiting, and middleware as a fresh processor call.
The Handle Lifecycle
Every call to Handle(ctx, input) follows this flow:
┌─────────────────────────────────────────────┐
│ Handle(input) │
├─────────────────────────────────────────────┤
│ 1. Emit InputReceived signal │
│ 2. Inject emitter into context │
│ 3. Append user message to session │
│ 4. Build ChatRequest │
│ 5. Check for continuation │
│ ├─ YES: Emit TurnResumed │
│ │ Build continuation pipeline │
│ │ Process through pipeline │
│ └─ NO: Process through main pipeline │
│ 6. On error: Emit ProcessingFailed │
│ 7. On success: Emit ProcessingCompleted │
│ 8. Handle result via handleResult() │
└─────────────────────────────────────────────┘
Result Handling
┌─────────────────────────────────────────────┐
│ handleResult(result) │
├─────────────────────────────────────────────┤
│ Switch on result type: │
│ │
│ Response: │
│ 1. Append to session as assistant │
│ 2. Emit ResponseEmitted signal │
│ 3. Call emitter.Emit() with message │
│ │
│ Yield: │
│ 1. Store continuation on Chat │
│ 2. Emit TurnYielded signal │
│ 3. Call emitter.Emit() with prompt │
│ │
│ Unknown: Return ErrUnknownResultType │
└─────────────────────────────────────────────┘
Concurrency Model
Chat uses a mutex to protect mutable state:
- continuation — stored and cleared during yield/resume
- All session mutations happen under the lock or after it is released
The lock is held briefly:
- Lock at start of Handle
- Read/clear continuation
- Unlock before calling processor or continuation
This allows long-running processor calls without blocking other goroutines that might query HasContinuation().
Context Injection
Before calling the processor, Chat injects the emitter into context:
```go
ctx = WithEmitter(ctx, c.emitter)
result, err := c.processor.Process(ctx, input, c.session)
```
This allows processors (or underlying primitives like scio) to push resources during processing without needing explicit emitter references.
Retrieve it with:
```go
emitter := chit.EmitterFromContext(ctx)
if emitter != nil {
	emitter.Push(ctx, resource)
}
```
Signal Flow
Signals are emitted at key lifecycle points via capitan:
| Point | Signal | Fields |
|---|---|---|
| New() | ChatCreated | chat_id |
| Handle() start | InputReceived | chat_id, input, input_size |
| Before process | ProcessingStarted | chat_id |
| If resuming | TurnResumed | chat_id |
| On error | ProcessingFailed | chat_id, processing_duration, error |
| On success | ProcessingCompleted | chat_id, processing_duration |
| Response emit | ResponseEmitted | chat_id, role, content_size |
| Yield store | TurnYielded | chat_id, prompt |
ResourcePushed is available but not emitted by Chat — it's for processors that push resources to emit themselves.
Design Q&A
Why is Emitter required, not optional?
Chat is fundamentally about streaming output. A chat without output isn't useful. Making it required:
- Eliminates nil checks throughout
- Makes the contract explicit
- Simplifies testing (always provide a mock)
Why inject Emitter via context instead of passing directly?
Context injection decouples processors from chit's Emitter type. A processor built with cogito/ago/scio doesn't need to know about chit — it just checks context for an emitter. This enables:
- Reusable processors across different chat implementations
- Testing processors in isolation
- Gradual adoption
Why store continuation on Chat instead of returning it?
Storing the continuation makes the API simpler for callers:
```go
// Simple: just call Handle repeatedly
chat.Handle(ctx, "start")
chat.Handle(ctx, "my name")

// vs. the alternative: caller manages the continuation
result := chat.Handle(ctx, "start")
if yield, ok := result.(*Yield); ok {
	result = yield.Continuation(ctx, "my name")
}
```
The Chat-manages-continuation approach matches how real chat UIs work — you don't see the continuation machinery, you just send messages.
Why dual-channel Emitter (Emit + Push)?
Text and structured data serve different purposes:
- Emit — conversational flow, rendered as chat bubbles
- Push — data payloads, rendered in side panels or used by UI logic
Separating them lets UI implementations handle each appropriately. A terminal UI might ignore Push; a web UI might render pushed resources in a data viewer.
Performance
Memory: Chat holds references, not copies. Session messages accumulate but that's zyn's concern.
Latency: Handle adds minimal overhead — signal emission, mutex acquire/release, context creation. The processor dominates latency.
Concurrency: Safe for concurrent Handle() calls, but calls serialize on the mutex. For parallel conversations, use separate Chat instances.
Next Steps
- Testing Guide — test your processors and emitters
- API Reference — function signatures
- Types Reference — complete type documentation