ACADEMIC 2026-05-10 PDF, 32 pages

Workflow Memory for Long-Horizon Agentic Composition: Architecture, Dual-Mode Retrieval, and Retrieval-Correction Signal

Volodymyr Ovcharov — LEX AI LLC, Kyiv, Ukraine

Abstract

Sixty percent of context tokens in current LLM agentic sessions are wasted on redundant re-explanation of decisions already made in prior sessions. We present a three-layer workflow memory architecture (domain, workflow, practitioner) with dual-mode retrieval: pull mode at session start and push mode for refreshing dormant tasks. The key insight is that the memory layer produces alignment data (the retrieval-correction signal) rather than only consuming it.
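
To make the abstract's vocabulary concrete, the following is a minimal illustrative sketch in Python. It is not the paper's implementation. The three layer names come from the abstract; everything else (the class and field names, matching entries by task name, the session counter, and the `dormant_after` threshold) is a hypothetical assumption chosen only to show how pull retrieval, push retrieval, and a logged correction signal could fit together.

```python
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    # Layer names are taken from the abstract.
    DOMAIN = "domain"
    WORKFLOW = "workflow"
    PRACTITIONER = "practitioner"


@dataclass
class MemoryEntry:
    layer: Layer
    task: str       # hypothetical: entries are keyed by a task name
    text: str       # the remembered decision
    last_used: int  # hypothetical: session counter at last retrieval


@dataclass
class WorkflowMemory:
    entries: list = field(default_factory=list)
    corrections: list = field(default_factory=list)

    def pull(self, task, session):
        """Pull mode: at session start, fetch the entries for the active task."""
        hits = [e for e in self.entries if e.task == task]
        for e in hits:
            e.last_used = session
        return hits

    def push(self, session, dormant_after=3):
        """Push mode: surface entries whose task has been dormant (assumed threshold)."""
        return [e for e in self.entries if session - e.last_used >= dormant_after]

    def correct(self, entry, corrected_text):
        """Log a (retrieved, corrected) pair, then update the stored entry."""
        self.corrections.append((entry.text, corrected_text))
        entry.text = corrected_text
```

In this sketch, the `corrections` list stands in for the retrieval-correction signal: each time a practitioner overrides a retrieved entry, the pair is recorded, so the memory layer accumulates data as a side effect of normal use. How the paper actually ranks, stores, or uses such pairs is not stated in the abstract.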