ADR-007: Event System Architecture

Status

Accepted

Context

Toast needs a domain event system to decouple operations from their side effects. Today, side effects are wired directly into the operations that trigger them, and each planned feature would add more of the same coupling:

  • Audit logging - Direct auditLogService.log() calls embedded in services
  • Future webhooks - Would require adding webhook dispatch to every operation
  • Future SSE streaming - Real-time updates would need explicit push logic

The PRD explicitly calls out events as "product surface, not internal plumbing" with the requirement that payloads be "self-contained enough to act without additional API calls."

Requirements from PRD

"Events are product surface, not internal plumbing. External systems (including AI) build automations we haven't imagined."

Key requirements:

  1. Rich, typed payloads - Self-contained enough to act without follow-up API calls
  2. Multiple delivery mechanisms (webhooks, SSE, event log)
  3. Event taxonomy is product design - emit events for automations you haven't built yet
  4. Audit trail - who/what triggered each event (human, AI, automation)

Decision

Implement a domain event system with three components:

  1. Event Type System - Strongly-typed event interfaces with discriminated unions
  2. Event Bus Abstraction - Pluggable bus with in-memory implementation for development/testing
  3. Event-Driven Audit Logging - Audit logging as an event subscriber, not direct calls

Event Naming Convention

Events follow resource.verb naming, with the verb in past tense:

  • content.created (not content.create or createContent)
  • staff.updated (not staff.update or updateStaff)
  • auth.login (not auth.logged_in or login)

Past tense reflects that events are immutable facts about things that have already happened. The session events auth.login and auth.logout are the exception: they keep the short form rather than a past-tense verb.

Event Taxonomy

Initial events based on existing audit log call sites:

Event Type      | Resource | Trigger
content.created | content  | New post/page created
content.updated | content  | Post/page modified
staff.created   | staff    | New staff member added
staff.updated   | staff    | Staff profile changed
site.created    | site     | New site provisioned
auth.login      | session  | Staff member signs in
auth.logout     | session  | Staff member signs out
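
One way to keep this taxonomy checkable in code is to hold it in a single constant and derive the event type union from it. The sketch below transcribes the table above; the names EVENT_RESOURCES, EventType, and isEventType are illustrative assumptions, not identifiers confirmed by this ADR.

```typescript
// Event type -> resource, transcribed from the taxonomy table above.
// EVENT_RESOURCES is a hypothetical name used only for this sketch.
const EVENT_RESOURCES = {
  'content.created': 'content',
  'content.updated': 'content',
  'staff.created': 'staff',
  'staff.updated': 'staff',
  'site.created': 'site',
  'auth.login': 'session',
  'auth.logout': 'session',
} as const;

// Union of every known event type string, derived from the keys above.
type EventType = keyof typeof EVENT_RESOURCES;

// Hypothetical guard for untrusted input, such as a subscription filter
// supplied by an API client.
function isEventType(value: string): value is EventType {
  return Object.prototype.hasOwnProperty.call(EVENT_RESOURCES, value);
}
```

With this shape, adding a row to the taxonomy is a one-line change, and any code that switches over EventType can be made to fail compilation until it handles the new event.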

Event Structure

All events share a base structure:

interface DomainEvent<
  T extends string = string,
  D extends Record<string, unknown> = Record<string, unknown>,
> {
  id: string; // Unique event ID (UUID)
  type: T; // Event type (e.g., 'content.created')
  siteId: string; // Multi-tenant isolation
  timestamp: string; // ISO 8601 timestamp
  actor: {
    type: ActorType; // 'staff' | 'api_key' | 'automation' | 'system'
    id: string | null; // Actor ID (null for system events)
  };
  data: D; // Event-specific payload
}

Concrete events narrow the type and data fields for type safety:

interface ContentCreatedEvent extends DomainEvent<'content.created', ContentCreatedData> {
  type: 'content.created';
  data: ContentCreatedData;
}
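
Phase 1 below mentions factory functions for type-safe event creation. A minimal sketch of what one might look like follows. The function name createContentCreatedEvent and the contentId field are assumptions for illustration; only title appears elsewhere in this ADR. The payload is declared as a type alias because an interface would not satisfy the Record<string, unknown> constraint on D.

```typescript
import { randomUUID } from 'node:crypto';

type ActorType = 'staff' | 'api_key' | 'automation' | 'system';

interface Actor {
  type: ActorType;
  id: string | null; // null for system events
}

interface DomainEvent<
  T extends string = string,
  D extends Record<string, unknown> = Record<string, unknown>,
> {
  id: string;
  type: T;
  siteId: string;
  timestamp: string;
  actor: Actor;
  data: D;
}

// Hypothetical payload shape; contentId is an assumption for this sketch.
type ContentCreatedData = {
  contentId: string;
  title: string;
};

type ContentCreatedEvent = DomainEvent<'content.created', ContentCreatedData>;

// Hypothetical factory: fills in id, type, and timestamp so callers supply
// only the tenant, the actor, and the payload.
function createContentCreatedEvent(
  siteId: string,
  actor: Actor,
  data: ContentCreatedData,
): ContentCreatedEvent {
  return {
    id: randomUUID(),
    type: 'content.created',
    siteId,
    timestamp: new Date().toISOString(),
    actor,
    data,
  };
}

const event = createContentCreatedEvent(
  'site_1',
  { type: 'staff', id: 'staff_42' },
  { contentId: 'post_7', title: 'Hello' },
);
```

Centralizing construction this way keeps ID and timestamp generation out of the services, so every emitter produces the same base shape.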

Discriminated Union for Type Safety

The AnyDomainEvent union enables exhaustive pattern matching:

function handleEvent(event: AnyDomainEvent) {
  switch (event.type) {
    case 'content.created':
      // TypeScript knows event.data is ContentCreatedData
      console.log(event.data.title);
      break;
    case 'auth.login':
      // TypeScript knows event.data is AuthLoginData
      console.log(event.data.email);
      break;
    // ... other cases
  }
}
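
The switch above narrows each case, but it does not by itself fail when a new event type is added. A common TypeScript idiom for enforcing exhaustiveness is a default branch that only accepts never. The sketch below uses a reduced two-member union and a hypothetical assertNever helper; the real AnyDomainEvent would list every event.

```typescript
type ContentCreatedEvent = { type: 'content.created'; data: { title: string } };
type AuthLoginEvent = { type: 'auth.login'; data: { email: string } };

// Reduced union for illustration only.
type AnyDomainEvent = ContentCreatedEvent | AuthLoginEvent;

// Hypothetical helper: callable only when every union member has been handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled event: ${JSON.stringify(value)}`);
}

function describeEvent(event: AnyDomainEvent): string {
  switch (event.type) {
    case 'content.created':
      return `created "${event.data.title}"`;
    case 'auth.login':
      return `login by ${event.data.email}`;
    default:
      // If a member is added to AnyDomainEvent without a case above,
      // `event` is no longer `never` here and this call fails to compile.
      return assertNever(event);
  }
}
```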

Event Bus Design

The event bus provides fire-and-forget semantics:

interface EventBus {
  emit(event: DomainEvent): void; // Returns void, not Promise
  subscribe(eventType: string, handler: EventHandler): void;
  unsubscribeAll(): void;
}

Key design decisions:

  1. emit() returns void - Callers must never await event handling. This prevents cascading failures and keeps the primary operation fast.

  2. Wildcard subscription - Handlers can subscribe to '*' to receive all events (used by audit log subscriber).

  3. Error isolation - A failing subscriber never crashes the emitter or blocks other subscribers. Errors are logged but not propagated.

  4. Deferred execution - Handlers run via queueMicrotask() to ensure emit returns immediately.
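
The four decisions above can be sketched together in a small in-memory bus. This is an illustration under stated assumptions, not the project's InMemoryEventBus: the EventHandler signature, the class name SketchEventBus, and the injectable onError callback are all hypothetical, and the event shape is reduced to the fields the bus needs.

```typescript
interface DomainEvent {
  id: string;
  type: string;
  data: Record<string, unknown>;
}

// Assumed handler signature; the ADR names EventHandler but does not define it.
type EventHandler = (event: DomainEvent) => void | Promise<void>;
type ErrorSink = (err: unknown, event: DomainEvent) => void;

class SketchEventBus {
  private handlers = new Map<string, EventHandler[]>();
  private onError: ErrorSink;

  // onError is a hypothetical hook; by default failures are only logged.
  constructor(onError: ErrorSink = (err) => console.error('event handler failed', err)) {
    this.onError = onError;
  }

  subscribe(eventType: string, handler: EventHandler): void {
    const list = this.handlers.get(eventType) ?? [];
    list.push(handler);
    this.handlers.set(eventType, list);
  }

  emit(event: DomainEvent): void {
    // Exact-match subscribers first, then wildcard subscribers.
    const targets = [
      ...(this.handlers.get(event.type) ?? []),
      ...(this.handlers.get('*') ?? []),
    ];
    for (const handler of targets) {
      // Deferred execution: nothing runs before emit() returns.
      queueMicrotask(() => {
        try {
          // try/catch covers synchronous throws, .catch covers rejections.
          Promise.resolve(handler(event)).catch((err) => this.onError(err, event));
        } catch (err) {
          this.onError(err, event);
        }
      });
    }
  }

  unsubscribeAll(): void {
    this.handlers.clear();
  }
}
```

Because each handler gets its own microtask and its own error boundary, one failing subscriber neither reaches the emitter nor prevents later subscribers from running.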

Implementation Strategy

Phase 1: Foundation (this ADR)

  • Event type system with all current operations
  • In-memory event bus for development/testing
  • Event factory functions for type-safe event creation

Phase 2: Audit Integration

  • Audit logging becomes an event subscriber
  • Existing auditLogService.log() calls replaced with eventBus.emit()
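
As a rough sketch of Phase 2, the audit log can be wired as a wildcard subscriber that maps each event to an audit entry. The real signature of auditLogService.log() is not shown in this ADR, so AuditEntry, AuditLogService, toAuditEntry, and registerAuditSubscriber below are assumptions made for illustration.

```typescript
interface DomainEvent {
  id: string;
  type: string;
  siteId: string;
  timestamp: string;
  actor: { type: string; id: string | null };
  data: Record<string, unknown>;
}

type EventHandler = (event: DomainEvent) => void;

interface EventBus {
  emit(event: DomainEvent): void;
  subscribe(eventType: string, handler: EventHandler): void;
  unsubscribeAll(): void;
}

// Hypothetical audit record shape.
interface AuditEntry {
  eventId: string;
  action: string;
  siteId: string;
  actorType: string;
  actorId: string | null;
  occurredAt: string;
  details: Record<string, unknown>;
}

// Hypothetical service interface standing in for auditLogService.
interface AuditLogService {
  log(entry: AuditEntry): void;
}

// Pure mapping from an event to an audit entry, easy to unit test.
function toAuditEntry(event: DomainEvent): AuditEntry {
  return {
    eventId: event.id,
    action: event.type,
    siteId: event.siteId,
    actorType: event.actor.type,
    actorId: event.actor.id,
    occurredAt: event.timestamp,
    details: event.data,
  };
}

// Subscribes once to '*' so every emitted event produces an audit entry.
function registerAuditSubscriber(bus: EventBus, audit: AuditLogService): void {
  bus.subscribe('*', (event) => {
    audit.log(toAuditEntry(event));
  });
}
```

Services then call only eventBus.emit(); whether an audit entry, a webhook, or an SSE message results is decided by which subscribers are registered at startup.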

Phase 3: Delivery Mechanisms (future)

  • Webhooks subscriber with retry logic
  • SSE streaming subscriber
  • Event log with cursor pagination

Production Backend (future)

  • BullMQ with Redis for persistent, reliable event delivery
  • Separate read/write connections
  • Dead letter queue for failed deliveries

Alternatives Considered

1. Keep Direct Audit Calls

Approach: Continue calling auditLogService.log() directly from services.

Pros:

  • Simpler, no new abstraction
  • Already working

Cons:

  • Every new subscriber (webhooks, SSE) requires touching every service
  • No standardized event shape
  • Tight coupling between operations and side effects

2. Generic Event Emitter (Node.js EventEmitter)

Approach: Use Node's built-in EventEmitter for pub/sub.

Pros:

  • No dependencies
  • Familiar API

Cons:

  • Synchronous by default
  • No type safety on event payloads
  • No built-in error isolation
  • Not suitable for future distributed scenarios
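
The synchronous dispatch and missing error isolation can be seen in a few lines. In the sketch below, the first listener throws, the exception surfaces from emit(), and the second listener is never called. The event name and payload are placeholders.

```typescript
import { EventEmitter } from 'node:events';

const emitter = new EventEmitter();
const calls: string[] = [];

emitter.on('content.created', () => {
  calls.push('first');
  throw new Error('subscriber failed');
});

emitter.on('content.created', () => {
  calls.push('second');
});

let propagated = false;
try {
  // Listeners run synchronously, in registration order, inside emit().
  emitter.emit('content.created', { title: 'Hello' });
} catch (_err) {
  // The listener's exception reaches the caller of emit().
  propagated = true;
}
// At this point only the first listener has run and the emitter's caller
// had to handle the subscriber's failure.
```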

3. Full Message Queue Immediately (BullMQ)

Approach: Start with BullMQ/Redis from day one.

Pros:

  • Production-ready persistence
  • Built-in retry, dead letter queues
  • Distributed worker support

Cons:

  • Requires Redis dependency for development
  • Over-engineered for current scale
  • Harder to test without Redis

Consequences

Positive

  • Loose coupling - Services emit events without knowing who listens
  • Type safety - Discriminated unions enable exhaustive pattern matching
  • Testability - InMemoryEventBus enables isolated, in-process tests with no Redis dependency
  • Extensibility - Adding webhooks/SSE is a new subscriber, not changes to every service
  • Audit trail - Every operation automatically logged via event subscriber
  • Self-documenting - Event types serve as a catalog of what the system does

Negative

  • Indirection - Code path from operation to audit log is less obvious
  • Learning curve - Developers must understand event-driven patterns
  • Eventual consistency - Side effects happen asynchronously (by design)

Trade-offs Accepted

The indirection cost is acceptable because:

  1. Events are a documented, explicit abstraction
  2. Type safety catches mismatched event types and payloads at compile time
  3. The pattern scales to webhooks, SSE, and other subscribers

File Structure

apps/api/src/events/
├── types.ts              # Base interfaces (DomainEvent, EventBus, EventHandler)
├── domain-events.ts      # Concrete event types, payloads, factory functions
├── event-bus.ts          # InMemoryEventBus implementation
├── index.ts              # Re-exports for consumers
├── event-bus.test.ts     # Event bus unit tests
└── domain-events.test.ts # Event type tests

References

  • PRD Event System section: docs/decisions/000-product-requirements.md
  • Issue #283: Event System Foundation (epic)
  • Issue #284: Event Type System & Taxonomy (this work)
  • Issue #285: Event Bus Implementation (completed)
  • Issue #286: Wire Audit Logging as Event Subscriber
