Architecture & Tech Stack
The GamiWays Core engine follows a 4-layer architecture; the stack described here is taken from the gami-digidouble-core repo. Every technology choice is justified by latency, sovereignty and maintainability constraints.
Single entry point of the engine, exposing REST and WebSocket endpoints. Fastify is chosen for its performance (reported as roughly 3× faster than Express), its TypeScript-first plugin ecosystem, and its support for SSE streaming, which is critical for real-time LLM responses.
Business use case layer: StartSession, SendMessage, ResumeSession, SwitchAvatar. Each use case orchestrates domain services without knowing infrastructure details. This separation guarantees testability and long-term maintainability.
The engine core: Avatar (AI persona), Game Master (async director), Memory System v3 (3 layers), Context Manager (3-dimension assembly), Knowledge Pipeline (RAG). No infrastructure dependency — the domain is pure and portable.
Infrastructure adapters: PostgreSQL + pgvector for relational persistence and RAG embeddings, Redis for session cache and working memory, and self-hosted Langfuse for LLM observability (traces, costs, latencies). All adapters are swappable via interfaces.
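As a minimal sketch of how a use case can depend on an interface rather than on a concrete adapter, the example below wires a SwitchAvatar use case to an in-memory store. The `SessionStore` interface, the `Session` shape and the `InMemorySessionStore` class are hypothetical names chosen for illustration, not taken from the gami-digidouble-core repo, and the calls are synchronous only to keep the sketch short; a Redis- or PostgreSQL-backed adapter would most likely be asynchronous.

```typescript
// Hypothetical session shape, for illustration only.
interface Session {
  id: string;
  avatarId: string;
  messages: string[];
}

// Hypothetical port: the only thing the use case knows about persistence.
interface SessionStore {
  get(sessionId: string): Session | undefined;
  save(session: Session): void;
}

// In-memory adapter, standing in for a Redis- or PostgreSQL-backed one.
class InMemorySessionStore implements SessionStore {
  private readonly sessions = new Map<string, Session>();

  get(sessionId: string): Session | undefined {
    return this.sessions.get(sessionId);
  }

  save(session: Session): void {
    this.sessions.set(session.id, session);
  }
}

// Use case: orchestrates through the port and never imports an adapter.
class SwitchAvatar {
  private readonly store: SessionStore;

  constructor(store: SessionStore) {
    this.store = store;
  }

  execute(sessionId: string, avatarId: string): Session {
    const session = this.store.get(sessionId);
    if (session === undefined) {
      throw new Error(`Unknown session: ${sessionId}`);
    }
    // Copy instead of mutating, so the stored value only changes via save().
    const updated: Session = { ...session, avatarId };
    this.store.save(updated);
    return updated;
  }
}
```

Swapping the adapter then means passing a different `SessionStore` implementation to the constructor; the use case itself stays unchanged, which is also what makes it easy to test with a fake store.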
Memory System v3 is one of the engine's core innovations. It solves the token explosion problem in long sessions by distributing memory across 3 levels with deterministic selection policies.
Active context window: recent messages, GM state, available tokens. Sub-ms access.
Conversation summaries, extracted facts, scenario progression. Hydrated at session start.
Persistent user profile: preferences, learning history, biographical facts. Semantic retrieval via pgvector.
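The three levels above can be read as a priority order for filling a limited context window. The sketch below shows one possible deterministic selection policy under a token budget; it is an illustration, not the engine's actual policy. The level names (`working`, `session`, `longTerm`), the `MemoryItem` shape and the 4-characters-per-token estimate are all assumptions made for the example.

```typescript
// Hypothetical level names for the three memory levels described above.
type MemoryLevel = "working" | "session" | "longTerm";

interface MemoryItem {
  level: MemoryLevel;
  text: string;
  createdAt: number; // epoch milliseconds
}

// Lower number means higher priority when filling the context window.
const LEVEL_PRIORITY: Record<MemoryLevel, number> = {
  working: 0,
  session: 1,
  longTerm: 2,
};

// Crude estimate (about 4 characters per token); an assumption for this sketch.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Deterministic selection: sort by level, then newest first, then by text
// as a final tie-breaker, and greedily keep every item that still fits.
function selectMemory(items: MemoryItem[], tokenBudget: number): MemoryItem[] {
  const ordered = [...items].sort(
    (a, b) =>
      LEVEL_PRIORITY[a.level] - LEVEL_PRIORITY[b.level] ||
      b.createdAt - a.createdAt ||
      (a.text < b.text ? -1 : a.text > b.text ? 1 : 0),
  );

  const selected: MemoryItem[] = [];
  let used = 0;
  for (const item of ordered) {
    const cost = estimateTokens(item.text);
    if (used + cost > tokenBudget) continue; // skip items that do not fit
    selected.push(item);
    used += cost;
  }
  return selected;
}
```

Because the ordering has no random or time-dependent component, the same set of items and the same budget always produce the same context, which keeps long sessions reproducible and bounded in token cost.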
Every technology choice is driven by concrete constraints: <2s latency, data sovereignty, no vendor lock-in, and long-term maintainability by a small team.
The <2s constraint structures every choice
Latency is not just a technical problem; it is an experience problem. Beyond 2 seconds, users lose their train of thought and the avatar stops being a presence. The goal is <2s end-to-end, with first sound within 500ms. This is why Fastify, Redis and SSE streaming are non-negotiable.
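To make the role of SSE streaming concrete, the sketch below formats LLM output chunks as server-sent event frames, so the client can start rendering (or synthesizing audio) before the full response exists. The framing follows the standard `text/event-stream` format; the event names `token` and `done` and both function names are hypothetical.

```typescript
// Format one server-sent event frame. Multi-line payloads need one
// "data:" line per line of text, and a blank line terminates the frame.
function formatSseEvent(event: string, data: string): string {
  const dataLines = data.split(/\r?\n/).map((line) => `data: ${line}`);
  return [`event: ${event}`, ...dataLines, "", ""].join("\n");
}

// Turn a sequence of LLM output chunks into SSE frames, ending with a
// hypothetical "done" event so the client knows the response is complete.
function* toSseFrames(chunks: Iterable<string>): Generator<string> {
  for (const chunk of chunks) {
    yield formatSseEvent("token", chunk);
  }
  yield formatSseEvent("done", "");
}
```

In a Fastify handler, frames like these would typically be written to the response as they are produced, with the `Content-Type` header set to `text/event-stream`.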
| Threshold | Qualification | UX Impact | Status |
|---|---|---|---|
| <500ms | Perceptual fluidity | The user perceives a slight delay, but the interaction remains natural. Target for TTS first audio. | ✓ Target |
| 1s | Acceptable | Conversational comfort threshold. Beyond this, users start anticipating the wait. Target for TTFB (first video frame). | ✓ Target |
| 2s | Natural limit | Conversational naturalness threshold (Nielsen 1993). Beyond this, conversation becomes a series of waits. GamiWays TTFR target. | R&D Goal |
| 6–12s | Engagement break | Current prototype latency (HeyGem OS). The user loses the thread and the avatar stops being a presence. This is the problem to solve. | Current problem |
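The thresholds in the table can be turned into a simple check for monitoring or tests. The sketch below maps a measured latency to a band; the band names are hypothetical, and because the table leaves the range between 2s and 6s unlabelled, treating everything above 2s as over the limit is an assumption.

```typescript
// Hypothetical band names derived from the table's thresholds.
type LatencyBand = "fluid" | "acceptable" | "natural-limit" | "over-limit";

// Classify a measured latency in milliseconds against the thresholds.
// Assumption: anything above 2000 ms counts as over the limit.
function classifyLatency(ms: number): LatencyBand {
  if (ms < 500) return "fluid";
  if (ms <= 1000) return "acceptable";
  if (ms <= 2000) return "natural-limit";
  return "over-limit";
}
```

A check like this could run against measured first-audio or first-response times, flagging any session that falls outside the band it is supposed to meet.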