The degree to which the AI maintains coherent information, personality, and logic across turns and sessions.
Framework for evaluating consistency in AI systems
Industry standards for conversational coherence
No contradictions across the entire conversation or session. Remembers and references previous statements accurately, maintains a stable personality and tone throughout, and keeps all responses logically coherent.
Example: User asks about pricing in message 1, asks follow-up in message 10 → AI provides consistent pricing information without contradiction
Minimal contradictions, and quickly self-corrects when one is noticed. Good context retention across turns, stable personality and tone, logical flow maintained.
Occasional minor contradictions that don't undermine the core message. Generally remembers context but may need reminders, mostly stable personality with slight variations, logic mostly sound with occasional gaps.
Multiple contradictions within the conversation, forgets important context from earlier turns. Personality/tone shifts noticeably, logical gaps that require the user to re-explain.
Frequent contradictions that confuse the user, minimal context retention. Unstable personality (formal → casual → formal), logical incoherence across responses.
Direct contradictions within the same response, no context retention even within a few turns. Personality completely unstable, responses don't follow from the user's input.
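The pricing example given for the top level (message 1 versus message 10) is the kind of contradiction that can be partly checked by machine. The following is a minimal sketch, not part of the rubric itself: the function name `find_price_contradictions`, the `PRICE_PATTERN` regex, and the "the X plan costs $N" phrasing are all assumptions made for illustration. A real evaluation would need far more robust fact extraction.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches statements such as "the Pro plan costs $49".
# Real transcripts would need a much more flexible extractor.
PRICE_PATTERN = re.compile(
    r"\bthe (\w+) plan costs \$(\d+(?:\.\d{2})?)",
    re.IGNORECASE,
)


def find_price_contradictions(ai_messages):
    """Return {plan: sorted distinct prices} for plans quoted at more than one price."""
    quoted = defaultdict(set)
    for text in ai_messages:
        for plan, price in PRICE_PATTERN.findall(text):
            quoted[plan.lower()].add(float(price))
    # A plan is contradictory only if the AI gave two or more different prices.
    return {
        plan: sorted(prices)
        for plan, prices in quoted.items()
        if len(prices) > 1
    }


if __name__ == "__main__":
    transcript = [
        "The Pro plan costs $49 per month.",
        "Happy to help with setup.",
        "As mentioned, the Pro plan costs $59.",
    ]
    print(find_price_contradictions(transcript))
```

An empty result means no price contradiction was detected by this narrow check. It says nothing about tone stability or logical flow, which still need human or model-based judgment.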
Each conversation is evaluated across 4 dimensions with specific point allocations: