ADR-007: Event System Architecture

Status

Accepted

Context

Toast needs a domain event system to decouple operations from their side effects. Today, side effects are wired directly into the operations that trigger them, and planned features would have to be wired in the same way:

Audit logging - Direct auditLogService.log() calls embedded in services
Future webhooks - Would require adding webhook dispatch to every operation
Future SSE streaming - Real-time updates would need explicit push logic

The PRD explicitly calls out events as "product surface, not internal plumbing" with the requirement that payloads be "self-contained enough to act without additional API calls."

Requirements from PRD

"Events are product surface, not internal plumbing. External systems (including AI) build automations we haven't imagined."

Key requirements:

Rich, typed payloads - Self-contained enough to act without follow-up API calls
Multiple delivery mechanisms (webhooks, SSE, event log)
Event taxonomy is product design - emit events for automations you haven't built yet
Audit trail - who/what triggered each event (human, AI, automation)

Decision

Implement a domain event system with three components:

Event Type System - Strongly-typed event interfaces with discriminated unions
Event Bus Abstraction - Pluggable bus with in-memory implementation for development/testing
Event-Driven Audit Logging - Audit logging as an event subscriber, not direct calls

Event Naming Convention

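Events follow resource.verb naming, with the verb in the past tense where one applies:

content.created (not content.create or createContent)
staff.updated (not staff.update or updateStaff)
auth.login (not auth.logged_in or login)

Past tense reflects that events are immutable facts about things that have already happened. The auth events are the exception: auth.login and auth.logout use the noun form rather than a past-tense verb.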
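
One way to have the compiler check the resource.verb shape is a template literal type. The following is a minimal sketch, not the project's actual types: the RESOURCES and VERBS lists are taken from the event names in this ADR, and isEventName is a hypothetical helper for names that arrive from untyped sources.

```typescript
// Hypothetical sketch: these lists mirror the event names in this ADR.
const RESOURCES = ["content", "staff", "site", "auth"] as const;
const VERBS = ["created", "updated", "login", "logout"] as const;

type Resource = (typeof RESOURCES)[number];
type Verb = (typeof VERBS)[number];

// Template literal type: every name must look like `<resource>.<verb>`.
type EventName = `${Resource}.${Verb}`;

// Runtime guard for strings that are not statically typed.
function isEventName(value: string): value is EventName {
  const dot = value.indexOf(".");
  if (dot < 0) return false;
  const resource = value.slice(0, dot);
  const verb = value.slice(dot + 1);
  return (
    (RESOURCES as readonly string[]).includes(resource) &&
    (VERBS as readonly string[]).includes(verb)
  );
}

const example: EventName = "content.created"; // compiles
// const bad: EventName = "createContent";    // compile error: not resource.verb
```

This type is deliberately loose: it also accepts combinations such as site.login that are not in the taxonomy. A plain union of the exact event names is stricter if that matters.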

Event Taxonomy

Initial events based on existing audit log call sites:

Event Type	Resource	Trigger
`content.created`	content	New post/page created
`content.updated`	content	Post/page modified
`staff.created`	staff	New staff member added
`staff.updated`	staff	Staff profile changed
`site.created`	site	New site provisioned
`auth.login`	session	Staff member signs in
`auth.logout`	session	Staff member signs out

Event Structure

All events share a base structure:

interface DomainEvent<
  T extends string = string,
  D extends Record<string, unknown> = Record<string, unknown>,
> {
  id: string; // Unique event ID (UUID)
  type: T; // Event type (e.g., 'content.created')
  siteId: string; // Multi-tenant isolation
  timestamp: string; // ISO 8601 timestamp
  actor: {
    type: ActorType; // 'staff' | 'api_key' | 'automation' | 'system'
    id: string | null; // Actor ID (null for system events)
  };
  data: D; // Event-specific payload
}

Concrete events narrow the type and data fields for type safety:

interface ContentCreatedEvent extends DomainEvent<'content.created', ContentCreatedData> {
  type: 'content.created';
  data: ContentCreatedData;
}
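
A factory function can fill in the shared fields (id, timestamp) so that call sites supply only the tenant, actor, and payload. The sketch below is an illustration under assumptions: the ContentCreatedData fields other than title are invented for the example, and createContentCreatedEvent is a hypothetical name rather than the function in domain-events.ts.

```typescript
import { randomUUID } from "node:crypto";

type ActorType = "staff" | "api_key" | "automation" | "system";

interface Actor {
  type: ActorType;
  id: string | null; // null for system events
}

interface DomainEvent<
  T extends string = string,
  D extends Record<string, unknown> = Record<string, unknown>,
> {
  id: string;
  type: T;
  siteId: string;
  timestamp: string;
  actor: Actor;
  data: D;
}

// Assumed payload shape: only `title` appears elsewhere in this ADR.
type ContentCreatedData = {
  contentId: string;
  title: string;
};

type ContentCreatedEvent = DomainEvent<"content.created", ContentCreatedData>;

// Hypothetical factory: generates the ID and timestamp, fixes the type literal.
function createContentCreatedEvent(
  siteId: string,
  actor: Actor,
  data: ContentCreatedData,
): ContentCreatedEvent {
  return {
    id: randomUUID(),
    type: "content.created",
    siteId,
    timestamp: new Date().toISOString(),
    actor,
    data,
  };
}

const event = createContentCreatedEvent(
  "site_1",
  { type: "staff", id: "staff_42" },
  { contentId: "post_7", title: "Hello" },
);
```

Keeping ID and timestamp generation inside the factory means a service cannot emit an event with a missing or malformed envelope.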

Discriminated Union for Type Safety

The AnyDomainEvent union enables exhaustive pattern matching:

function handleEvent(event: AnyDomainEvent) {
  switch (event.type) {
    case 'content.created':
      // TypeScript knows event.data is ContentCreatedData
      console.log(event.data.title);
      break;
    case 'auth.login':
      // TypeScript knows event.data is AuthLoginData
      console.log(event.data.email);
      break;
    // ... other cases
  }
}
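
The switch above narrows each case, but it does not by itself fail when a new event type is added and no case handles it. A common TypeScript idiom for that is a default branch that passes the event to a function taking never. The sketch below assumes a two-member AnyDomainEvent union with trimmed-down fields; assertNever and describeEvent are hypothetical names.

```typescript
// Illustrative two-event union; the real AnyDomainEvent covers the full taxonomy.
interface BaseEvent<T extends string, D> {
  id: string;
  type: T;
  siteId: string;
  timestamp: string;
  data: D;
}

type ContentCreatedEvent = BaseEvent<"content.created", { title: string }>;
type AuthLoginEvent = BaseEvent<"auth.login", { email: string }>;
type AnyDomainEvent = ContentCreatedEvent | AuthLoginEvent;

// Hypothetical helper: only callable when every union member has been handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled event: ${JSON.stringify(value)}`);
}

function describeEvent(event: AnyDomainEvent): string {
  switch (event.type) {
    case "content.created":
      return `Content created: ${event.data.title}`;
    case "auth.login":
      return `Login: ${event.data.email}`;
    default:
      // If a member is added to AnyDomainEvent without a case above,
      // `event` is no longer `never` here and this call fails to compile.
      return assertNever(event);
  }
}
```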

Event Bus Design

The event bus provides fire-and-forget semantics:

interface EventBus {
  emit(event: DomainEvent): void; // Returns void, not Promise
  subscribe(eventType: string, handler: EventHandler): void;
  unsubscribeAll(): void;
}

Key design decisions:

emit() returns void - Callers must never await event handling. This prevents cascading failures and keeps the primary operation fast.
Wildcard subscription - Handlers can subscribe to '*' to receive all events (used by audit log subscriber).
Error isolation - A failing subscriber never crashes the emitter or blocks other subscribers. Errors are logged but not propagated.
Deferred execution - Handlers are scheduled via queueMicrotask(), so emit() returns before any handler runs.
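
The four decisions above can be combined in a small in-memory bus. This is a sketch under assumptions, not the contents of event-bus.ts: the DomainEvent shape is trimmed (no actor field), EventHandler is assumed to allow async handlers, and errors are reported with console.error for lack of a known logger.

```typescript
interface DomainEvent {
  id: string;
  type: string;
  siteId: string;
  timestamp: string;
  data: Record<string, unknown>;
}

// Assumption: handlers may be sync or async.
type EventHandler = (event: DomainEvent) => void | Promise<void>;

class InMemoryEventBus {
  private handlers = new Map<string, EventHandler[]>();

  subscribe(eventType: string, handler: EventHandler): void {
    const list = this.handlers.get(eventType) ?? [];
    list.push(handler);
    this.handlers.set(eventType, list);
  }

  emit(event: DomainEvent): void {
    // Snapshot subscribers: exact-type matches plus '*' wildcard handlers.
    const targets = [
      ...(this.handlers.get(event.type) ?? []),
      ...(this.handlers.get("*") ?? []),
    ];
    for (const handler of targets) {
      // Defer each handler so emit() returns before any subscriber runs.
      queueMicrotask(() => {
        // The async wrapper turns sync throws and rejections into one
        // rejected promise, so a failure never reaches other handlers.
        (async () => handler(event))().catch((error: unknown) => {
          console.error(`Event handler failed for ${event.type}`, error);
        });
      });
    }
  }

  unsubscribeAll(): void {
    this.handlers.clear();
  }
}

const bus = new InMemoryEventBus();
const seen: string[] = [];
bus.subscribe("content.created", () => {
  throw new Error("boom"); // logged, does not affect the handlers below
});
bus.subscribe("content.created", (e) => {
  seen.push(e.id);
});
bus.subscribe("*", (e) => {
  seen.push(`*:${e.id}`);
});
bus.emit({
  id: "evt_1",
  type: "content.created",
  siteId: "site_1",
  timestamp: new Date().toISOString(),
  data: { title: "Hello" },
});
// `seen` is still empty here: handlers run once the microtask queue drains.
```

Because handlers are deferred, code that asserts on their effects (tests in particular) has to yield first, for example by awaiting a resolved promise or a timer.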

Implementation Strategy

Phase 1: Foundation (this ADR)

Event type system with all current operations
In-memory event bus for development/testing
Event factory functions for type-safe event creation

Phase 2: Audit Integration

Audit logging becomes an event subscriber
Existing auditLogService.log() calls replaced with eventBus.emit()
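
In Phase 2, the audit log becomes one wildcard subscriber that maps each event to an audit record. A rough sketch follows. The AuditEntry shape and the AuditLogService.log(entry) signature are assumptions, since this ADR does not show the real ones, and registerAuditSubscriber and toAuditEntry are hypothetical names.

```typescript
interface DomainEvent {
  id: string;
  type: string;
  siteId: string;
  timestamp: string;
  actor: {
    type: "staff" | "api_key" | "automation" | "system";
    id: string | null;
  };
  data: Record<string, unknown>;
}

type EventHandler = (event: DomainEvent) => void | Promise<void>;

interface EventBus {
  emit(event: DomainEvent): void;
  subscribe(eventType: string, handler: EventHandler): void;
  unsubscribeAll(): void;
}

// Assumed shapes: the real audit log types are not given in this ADR.
interface AuditEntry {
  eventId: string;
  siteId: string;
  action: string;
  actorType: string;
  actorId: string | null;
  occurredAt: string;
  details: Record<string, unknown>;
}

interface AuditLogService {
  log(entry: AuditEntry): void | Promise<void>;
}

// Pure mapping from an event to an audit record, easy to test in isolation.
function toAuditEntry(event: DomainEvent): AuditEntry {
  return {
    eventId: event.id,
    siteId: event.siteId,
    action: event.type,
    actorType: event.actor.type,
    actorId: event.actor.id,
    occurredAt: event.timestamp,
    details: event.data,
  };
}

// One wildcard subscription replaces the per-service audit calls.
function registerAuditSubscriber(bus: EventBus, audit: AuditLogService): void {
  bus.subscribe("*", (event) => audit.log(toAuditEntry(event)));
}
```

With this in place, a service calls eventBus.emit(...) with the event and no longer references the audit log at all.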

Phase 3: Delivery Mechanisms (future)

Webhooks subscriber with retry logic
SSE streaming subscriber
Event log with cursor pagination

Production Backend (future)

BullMQ with Redis for persistent, reliable event delivery
Separate read/write connections
Dead letter queue for failed deliveries

Alternatives Considered

1. Keep Direct Audit Calls

Approach: Continue calling auditLogService.log() directly from services.

Pros:

Simpler, no new abstraction
Already working

Cons:

Every new subscriber (webhooks, SSE) requires touching every service
No standardized event shape
Tight coupling between operations and side effects

2. Generic Event Emitter (Node.js EventEmitter)

Approach: Use Node's built-in EventEmitter for pub/sub.

Pros:

No dependencies
Familiar API

Cons:

Synchronous by default - listeners run inside emit()
No type safety on event payloads
No built-in error isolation - an exception thrown by a listener propagates to the emitter
Not suitable for future distributed scenarios

3. Full Message Queue Immediately (BullMQ)

Approach: Start with BullMQ/Redis from day one.

Pros:

Production-ready persistence
Built-in retry, dead letter queues
Distributed worker support

Cons:

Requires Redis dependency for development
Over-engineered for current scale
Harder to test without Redis

Consequences

Positive

Loose coupling - Services emit events without knowing who listens
Type safety - Discriminated unions enable exhaustive pattern matching
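Testability - InMemoryEventBus enables fast, isolated tests with no external services (handlers are deferred via queueMicrotask(), so tests must let the microtask queue drain before asserting)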
Extensibility - Adding webhooks/SSE is a new subscriber, not changes to every service
Audit trail - Every operation that emits an event is logged automatically by the audit subscriber
Self-documenting - Event types serve as a catalog of what the system does

Negative

Indirection - Code path from operation to audit log is less obvious
Learning curve - Developers must understand event-driven patterns
Eventual consistency - Side effects happen asynchronously (by design)

Trade-offs Accepted

The indirection cost is acceptable because:

Events are a documented, explicit abstraction
Type safety catches mismatched event types and payloads at compile time
The pattern scales to webhooks, SSE, and other subscribers

File Structure

apps/api/src/events/
├── types.ts              # Base interfaces (DomainEvent, EventBus, EventHandler)
├── domain-events.ts      # Concrete event types, payloads, factory functions
├── event-bus.ts          # InMemoryEventBus implementation
├── index.ts              # Re-exports for consumers
├── event-bus.test.ts     # Event bus unit tests
└── domain-events.test.ts # Event type tests

References

PRD Event System section: docs/decisions/000-product-requirements.md
Issue #283: Event System Foundation (epic)
Issue #284: Event Type System & Taxonomy (this work)
Issue #285: Event Bus Implementation (completed)
Issue #286: Wire Audit Logging as Event Subscriber
