From Ontology-Controlled Systems to Oversight-Controlled Training: Formal Foundations for Human–LLM Alignment Signal Validation
Volodymyr Ovcharov — LEX AI LLC, Kyiv, Ukraine
Abstract
Ontology-based filtering of human oversight signals predicts downstream outcome quality: sessions that a formal domain constitution classifies as full oversight exhibit a 3–6× higher rejection rate, which concentrates the most informative alignment actions in those sessions. Five axiomatically defined conditions, expressed in the ALC description logic, formalize when human edit traces constitute a valid RLHF training signal.