ADR-007: Event System Architecture
Status
Accepted
Context
Toast needs a domain event system to decouple operations from their side effects. Today, side effects are wired directly into the operations that trigger them, and planned features would deepen that coupling:
- Audit logging - Direct auditLogService.log() calls embedded in services
- Future webhooks - Would require adding webhook dispatch to every operation
- Future SSE streaming - Real-time updates would need explicit push logic
The PRD explicitly calls out events as "product surface, not internal plumbing" with the requirement that payloads be "self-contained enough to act without additional API calls."
Requirements from PRD
"Events are product surface, not internal plumbing. External systems (including AI) build automations we haven't imagined."
Key requirements:
- Rich, typed payloads - Self-contained enough to act without follow-up API calls
- Multiple delivery mechanisms (webhooks, SSE, event log)
- Event taxonomy is product design - emit events for automations you haven't built yet
- Audit trail - who/what triggered each event (human, AI, automation)
Decision
Implement a domain event system with three components:
- Event Type System - Strongly-typed event interfaces with discriminated unions
- Event Bus Abstraction - Pluggable bus with in-memory implementation for development/testing
- Event-Driven Audit Logging - Audit logging as an event subscriber, not direct calls
Event Naming Convention
Events follow resource.verb naming in past tense (auth.login and auth.logout keep their conventional short form):
- content.created (not content.create or createContent)
- staff.updated (not staff.update or updateStaff)
- auth.login (not auth.logged_in or login)
Past tense reflects that events are immutable facts about things that have already happened.
Event Taxonomy
Initial events based on existing audit log call sites:
| Event Type | Resource | Trigger |
|---|---|---|
| content.created | content | New post/page created |
| content.updated | content | Post/page modified |
| staff.created | staff | New staff member added |
| staff.updated | staff | Staff profile changed |
| site.created | site | New site provisioned |
| auth.login | session | Staff member signs in |
| auth.logout | session | Staff member signs out |
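One way to keep the table and the code in step is to hold the taxonomy in a single typed constant and derive the event-type union from it. The sketch below is only an illustration: the names EVENT_TAXONOMY, EventType, isEventType, and resourceOf are hypothetical and do not come from this ADR; only the seven event types, their resources, and their triggers are taken from the table.

```typescript
// Hypothetical registry mirroring the taxonomy table; all names are illustrative.
const EVENT_TAXONOMY = {
  'content.created': { resource: 'content', trigger: 'New post/page created' },
  'content.updated': { resource: 'content', trigger: 'Post/page modified' },
  'staff.created': { resource: 'staff', trigger: 'New staff member added' },
  'staff.updated': { resource: 'staff', trigger: 'Staff profile changed' },
  'site.created': { resource: 'site', trigger: 'New site provisioned' },
  'auth.login': { resource: 'session', trigger: 'Staff member signs in' },
  'auth.logout': { resource: 'session', trigger: 'Staff member signs out' },
} as const;

// Union of every known event type, derived from the registry keys.
type EventType = keyof typeof EVENT_TAXONOMY;

// Runtime guard for untrusted strings (for example, a subscription filter).
// hasOwnProperty avoids matching inherited keys such as 'toString'.
function isEventType(value: string): value is EventType {
  return Object.prototype.hasOwnProperty.call(EVENT_TAXONOMY, value);
}

// Look up the resource a given event type belongs to.
function resourceOf(type: EventType): string {
  return EVENT_TAXONOMY[type].resource;
}
```

With this shape, adding a row to the taxonomy is a one-line change that widens EventType everywhere it is used.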
Event Structure
All events share a base structure:
interface DomainEvent<
  T extends string = string,
  D extends Record<string, unknown> = Record<string, unknown>,
> {
  id: string; // Unique event ID (UUID)
  type: T; // Event type (e.g., 'content.created')
  siteId: string; // Multi-tenant isolation
  timestamp: string; // ISO 8601 timestamp
  actor: {
    type: ActorType; // 'staff' | 'api_key' | 'automation' | 'system'
    id: string | null; // Actor ID (null for system events)
  };
  data: D; // Event-specific payload
}
Concrete events narrow the type and data fields for type safety:
interface ContentCreatedEvent extends DomainEvent<'content.created', ContentCreatedData> {
  type: 'content.created';
  data: ContentCreatedData;
}
Discriminated Union for Type Safety
The AnyDomainEvent union enables exhaustive pattern matching:
function handleEvent(event: AnyDomainEvent) {
  switch (event.type) {
    case 'content.created':
      // TypeScript knows event.data is ContentCreatedData
      console.log(event.data.title);
      break;
    case 'auth.login':
      // TypeScript knows event.data is AuthLoginData
      console.log(event.data.email);
      break;
    // ... other cases
  }
}
Event Bus Design
The event bus provides fire-and-forget semantics:
interface EventBus {
  emit(event: DomainEvent): void; // Returns void, not Promise
  subscribe(eventType: string, handler: EventHandler): void;
  unsubscribeAll(): void;
}
Key design decisions:
- emit() returns void - Callers must never await event handling. This prevents cascading failures and keeps the primary operation fast.
- Wildcard subscription - Handlers can subscribe to '*' to receive all events (used by audit log subscriber).
- Error isolation - A failing subscriber never crashes the emitter or blocks other subscribers. Errors are logged but not propagated.
- Deferred execution - Handlers run via queueMicrotask() to ensure emit returns immediately.
Implementation Strategy
Phase 1: Foundation (this ADR)
- Event type system with all current operations
- In-memory event bus for development/testing
- Event factory functions for type-safe event creation
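A factory function for type-safe event creation might look like the sketch below. It is an assumption-laden illustration rather than the contents of domain-events.ts: createEvent, contentCreated, and the EventDeps injection of ID and clock sources are hypothetical names, and ContentCreatedData is reduced to the single title field that appears in this ADR's examples.

```typescript
type ActorType = 'staff' | 'api_key' | 'automation' | 'system';

interface Actor {
  type: ActorType;
  id: string | null;
}

interface DomainEvent<
  T extends string = string,
  D extends Record<string, unknown> = Record<string, unknown>,
> {
  id: string;
  type: T;
  siteId: string;
  timestamp: string;
  actor: Actor;
  data: D;
}

// Hypothetical payload. A type alias is used because an interface without an
// index signature does not satisfy the Record<string, unknown> constraint.
type ContentCreatedData = { title: string };
type ContentCreatedEvent = DomainEvent<'content.created', ContentCreatedData>;

// Assumed dependencies, injected so tests can pin IDs and timestamps.
interface EventDeps {
  newId: () => string; // e.g. a UUID generator in production
  now: () => Date;
}

interface EventContext {
  siteId: string;
  actor: Actor;
}

// Generic builder: fills in the base fields shared by all events.
function createEvent<T extends string, D extends Record<string, unknown>>(
  deps: EventDeps,
  type: T,
  context: EventContext,
  data: D,
): DomainEvent<T, D> {
  return {
    id: deps.newId(),
    type,
    siteId: context.siteId,
    timestamp: deps.now().toISOString(),
    actor: context.actor,
    data,
  };
}

// Per-event factory: callers cannot misspell the type or pass the wrong payload.
function contentCreated(
  deps: EventDeps,
  context: EventContext,
  data: ContentCreatedData,
): ContentCreatedEvent {
  return createEvent(deps, 'content.created', context, data);
}
```

Injecting newId and now keeps the factories deterministic in tests; production wiring would supply a real UUID generator and the system clock.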
Phase 2: Audit Integration
- Audit logging becomes an event subscriber
- Existing auditLogService.log() calls replaced with eventBus.emit()
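As a rough sketch of Phase 2, audit logging can be registered as a single wildcard subscriber. The signature of auditLogService.log() is not given in this ADR, so AuditEntry, AuditWriter, toAuditEntry, and registerAuditSubscriber below are hypothetical stand-ins; only the '*' subscription and the event fields come from the text above.

```typescript
type ActorType = 'staff' | 'api_key' | 'automation' | 'system';

interface DomainEvent {
  id: string;
  type: string;
  siteId: string;
  timestamp: string;
  actor: { type: ActorType; id: string | null };
  data: Record<string, unknown>;
}

type EventHandler = (event: DomainEvent) => void | Promise<void>;

// Only the part of the event bus this subscriber needs.
interface Subscribable {
  subscribe(eventType: string, handler: EventHandler): void;
}

// Hypothetical audit record; field names are assumptions for illustration.
interface AuditEntry {
  eventId: string;
  siteId: string;
  action: string;
  actorType: ActorType;
  actorId: string | null;
  occurredAt: string;
  details: Record<string, unknown>;
}

// Hypothetical stand-in for the existing audit log service.
interface AuditWriter {
  log(entry: AuditEntry): void | Promise<void>;
}

// Pure mapping from a domain event to an audit entry, easy to unit test.
function toAuditEntry(event: DomainEvent): AuditEntry {
  return {
    eventId: event.id,
    siteId: event.siteId,
    action: event.type,
    actorType: event.actor.type,
    actorId: event.actor.id,
    occurredAt: event.timestamp,
    details: event.data,
  };
}

// One wildcard subscription replaces the per-service audit calls.
function registerAuditSubscriber(bus: Subscribable, writer: AuditWriter): void {
  bus.subscribe('*', (event) => writer.log(toAuditEntry(event)));
}
```

Services then only call eventBus.emit(); they no longer need a reference to the audit service at all.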
Phase 3: Delivery Mechanisms (future)
- Webhooks subscriber with retry logic
- SSE streaming subscriber
- Event log with cursor pagination
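The event log with cursor pagination is future work and this ADR does not specify its design. Purely to illustrate the idea, the sketch below uses a hypothetical in-memory log (SketchEventLog, LoggedEvent, and EventPage are invented names) in which the cursor is the ID of the last event a consumer has seen.

```typescript
interface LoggedEvent {
  id: string;
  type: string;
  siteId: string;
  timestamp: string;
}

interface EventPage {
  events: LoggedEvent[];
  nextCursor: string | null; // pass back to read() to continue
  hasMore: boolean;
}

class SketchEventLog {
  private readonly entries: LoggedEvent[] = [];

  // Append-only: events are immutable facts, so there is no update or delete.
  append(event: LoggedEvent): void {
    this.entries.push(event);
  }

  // Returns up to `limit` events for one site, appended after the event
  // whose id equals `cursor` (or from the start when cursor is null).
  read(siteId: string, cursor: string | null, limit: number): EventPage {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new RangeError('limit must be a positive integer');
    }
    const forSite = this.entries.filter((e) => e.siteId === siteId);
    let start = 0;
    if (cursor !== null) {
      const index = forSite.findIndex((e) => e.id === cursor);
      if (index === -1) throw new Error(`unknown cursor: ${cursor}`);
      start = index + 1;
    }
    const events = forSite.slice(start, start + limit);
    const last = events[events.length - 1];
    return {
      events,
      // On an empty page the cursor is unchanged, so a consumer can poll again later.
      nextCursor: last ? last.id : cursor,
      hasMore: start + events.length < forSite.length,
    };
  }
}
```

A cursor tied to an event ID, rather than a numeric offset, stays valid as new events are appended, which suits a log that external consumers poll.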
Production Backend (future)
- BullMQ with Redis for persistent, reliable event delivery
- Separate read/write connections
- Dead letter queue for failed deliveries
Alternatives Considered
1. Keep Direct Audit Calls
Approach: Continue calling auditLogService.log() directly from services.
Pros:
- Simpler, no new abstraction
- Already working
Cons:
- Every new subscriber (webhooks, SSE) requires touching every service
- No standardized event shape
- Tight coupling between operations and side effects
2. Generic Event Emitter (Node.js EventEmitter)
Approach: Use Node's built-in EventEmitter for pub/sub.
Pros:
- No dependencies
- Familiar API
Cons:
- Synchronous by default
- No type safety on event payloads
- No built-in error isolation
- Not suitable for future distributed scenarios
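The first and third drawbacks can be observed directly with Node's built-in EventEmitter: emit() invokes listeners synchronously in registration order, and an exception thrown by one listener propagates to the caller of emit() and skips the listeners after it. The function name runEmitterDemo below is hypothetical; the EventEmitter calls are the standard node:events API.

```typescript
import { EventEmitter } from 'node:events';

// Runs a small scenario and reports which listeners ran and what emit() threw.
function runEmitterDemo(): { order: string[]; caught: unknown } {
  const emitter = new EventEmitter();
  const order: string[] = [];

  emitter.on('content.created', () => {
    order.push('first');
    throw new Error('subscriber failed');
  });
  emitter.on('content.created', () => {
    // Never reached: the first listener's exception aborts dispatch.
    order.push('second');
  });

  let caught: unknown = null;
  try {
    // Listeners run inside this call, before it returns.
    emitter.emit('content.created', { title: 'Hello' });
    order.push('after-emit');
  } catch (error) {
    // The failing subscriber surfaces in the emitting operation.
    caught = error;
  }
  return { order, caught };
}
```

In this scenario only the first listener runs, and the emitting code has to handle a failure that belongs to a subscriber, which is the coupling the event bus design is meant to avoid.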
3. Full Message Queue Immediately (BullMQ)
Approach: Start with BullMQ/Redis from day one.
Pros:
- Production-ready persistence
- Built-in retry, dead letter queues
- Distributed worker support
Cons:
- Requires Redis dependency for development
- Over-engineered for current scale
- Harder to test without Redis
Consequences
Positive
- Loose coupling - Services emit events without knowing who listens
- Type safety - Discriminated unions enable exhaustive pattern matching
- Testability - InMemoryEventBus enables synchronous, isolated tests
- Extensibility - Adding webhooks/SSE is a new subscriber, not changes to every service
- Audit trail - Every operation that emits an event is logged automatically by the audit subscriber
- Self-documenting - Event types serve as a catalog of what the system does
Negative
- Indirection - Code path from operation to audit log is less obvious
- Learning curve - Developers must understand event-driven patterns
- Eventual consistency - Side effects happen asynchronously (by design)
Trade-offs Accepted
The indirection cost is acceptable because:
- Events are a documented, explicit abstraction
- Type safety ensures compile-time correctness
- The pattern scales to webhooks, SSE, and other subscribers
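The compile-time correctness claimed above depends on handlers being exhaustive over the event union. A common TypeScript idiom for enforcing that is a never-typed default branch, sketched here with a two-member union. The helper names assertNever and describeEvent are hypothetical, and the payloads are reduced to the title and email fields used in this ADR's examples.

```typescript
type ContentCreatedEvent = { type: 'content.created'; data: { title: string } };
type AuthLoginEvent = { type: 'auth.login'; data: { email: string } };

// Reduced stand-in for the full AnyDomainEvent union.
type AnyDomainEvent = ContentCreatedEvent | AuthLoginEvent;

// Accepts only `never`: if a union member is unhandled, the call site
// fails to compile. At runtime it guards against untyped input.
function assertNever(value: never): never {
  throw new Error(`Unhandled event: ${JSON.stringify(value)}`);
}

function describeEvent(event: AnyDomainEvent): string {
  switch (event.type) {
    case 'content.created':
      return `Content created: ${event.data.title}`;
    case 'auth.login':
      return `Login: ${event.data.email}`;
    default:
      // Adding a new member to AnyDomainEvent without a case above
      // turns this line into a compile error.
      return assertNever(event);
  }
}
```

With this guard in place, extending the taxonomy forces every exhaustive handler to be revisited before the code compiles.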
File Structure
apps/api/src/events/
├── types.ts # Base interfaces (DomainEvent, EventBus, EventHandler)
├── domain-events.ts # Concrete event types, payloads, factory functions
├── event-bus.ts # InMemoryEventBus implementation
├── index.ts # Re-exports for consumers
├── event-bus.test.ts # Event bus unit tests
└── domain-events.test.ts # Event type tests
References
- PRD Event System section: docs/decisions/000-product-requirements.md
- Issue #283: Event System Foundation (epic)
- Issue #284: Event Type System & Taxonomy (this work)
- Issue #285: Event Bus Implementation (completed)
- Issue #286: Wire Audit Logging as Event Subscriber