Vitruvyan Docs
LLM / AI Layer (LLMAgent)
What it does
- Enforces a single entry point for LLM calls across services: get_llm_agent() → LLMAgent.
- Provides operational safety primitives:
  - rate limiting to prevent provider throttling
  - circuit breaker for graceful degradation
  - cache as an optimization layer (Redis-backed when available)
  - metrics for latency/tokens/cache hit rate
- Supports multiple interaction styles:
  - complete() for a simple prompt plus an optional system prompt
  - complete_with_messages() for full message arrays
  - complete_with_tools() for function calling / tool execution
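To make the rate-limiting primitive listed above more concrete, here is a minimal sketch of a limiter that reserves an estimated token budget before a call is sent. It is not the code in llm_agent.py: the class name TokenRateLimiter, the helper estimate_tokens, the sliding-window approach, and the 4-characters-per-token heuristic are all assumptions made for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class TokenRateLimiter:
    """Hypothetical sliding-window limiter that reserves tokens before a call."""

    def __init__(
        self,
        max_tokens_per_window: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_tokens = max_tokens_per_window
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._events: Deque[Tuple[float, int]] = deque()  # (timestamp, reserved tokens)
        self._used = 0

    def _evict_expired(self, now: float) -> None:
        # Drop reservations that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def try_acquire(self, estimated_tokens: int) -> bool:
        """Reserve budget for a call; return False if it would exceed the limit."""
        now = self.clock()
        self._evict_expired(now)
        if self._used + estimated_tokens > self.max_tokens:
            return False
        self._events.append((now, estimated_tokens))
        self._used += estimated_tokens
        return True


def estimate_tokens(prompt: str, max_completion_tokens: int) -> int:
    """Rough estimate (about 4 characters per token); an assumption, not a tokenizer."""
    return len(prompt) // 4 + max_completion_tokens
```

A caller would check try_acquire(estimate_tokens(prompt, max_completion_tokens)) before contacting the provider and back off when it returns False. The clock parameter exists only so the behavior can be exercised deterministically.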
Core contract
“Gateway, not a brain”
- LLMAgent is a gateway: it does not own business logic, domain prompts, or truth rules.
- Callers (nodes/services) own:
  - prompts and schemas
  - validation (Truth layer) and governance decisions
  - what gets persisted or emitted
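The division of responsibilities above might look roughly like the following on the caller side. This is a sketch under stated assumptions: the gateway is represented by a plain callable rather than the real LLMAgent, and the classify_intent function, the ALLOWED_INTENTS set, and the JSON reply shape are hypothetical, caller-owned details invented for illustration.

```python
import json
from typing import Callable, Dict

# Stand-in for the gateway: any callable that maps a prompt to raw text.
Gateway = Callable[[str], str]

# Hypothetical caller-owned schema: the gateway knows nothing about it.
ALLOWED_INTENTS = {"command", "greeting", "question"}


def classify_intent(gateway: Gateway, user_text: str) -> Dict[str, str]:
    """Caller-side logic: build the prompt, call the gateway, validate the reply."""
    prompt = (
        "Classify the intent of the message as one of "
        f"{sorted(ALLOWED_INTENTS)} and reply with JSON "
        '{"intent": "<label>"}.\n'
        f"Message: {user_text}"
    )
    raw = gateway(prompt)

    # Validation stays with the caller, not with the gateway.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("gateway reply is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("gateway reply is not a JSON object")
    intent = data.get("intent")
    if not isinstance(intent, str) or intent not in ALLOWED_INTENTS:
        raise ValueError(f"unexpected intent: {intent!r}")
    return {"intent": intent}
```

Because the gateway is passed in as a callable, the prompt, the schema, and the validation rules can change without touching the gateway, and the caller can be tested with a fake.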
Configuration (model resolution)
Model resolution chain:
VITRUVYAN_LLM_MODEL → GRAPH_LLM_MODEL → OPENAI_MODEL → gpt-4o-mini
Required secret:
OPENAI_API_KEY
Code: vitruvyan_core/core/agents/llm_agent.py
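The resolution chain above can be sketched as a first-match lookup over environment variables. The variable names and the gpt-4o-mini default come from the chain itself; the function names resolve_model and require_api_key, and the choice to treat empty or whitespace-only values as unset, are assumptions and may differ from what llm_agent.py actually does.

```python
import os
from typing import Mapping, Optional

# Order taken from the resolution chain; the first configured value wins.
MODEL_ENV_VARS = ("VITRUVYAN_LLM_MODEL", "GRAPH_LLM_MODEL", "OPENAI_MODEL")
DEFAULT_MODEL = "gpt-4o-mini"


def resolve_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first configured model name, else the default."""
    env = os.environ if env is None else env
    for name in MODEL_ENV_VARS:
        value = env.get(name, "").strip()
        if value:  # assumption: empty or whitespace-only counts as unset
            return value
    return DEFAULT_MODEL


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Fail fast when the required secret is missing."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Passing a mapping instead of reading os.environ directly keeps the lookup easy to test; calling resolve_model() with no arguments reads the real environment.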
Typical usage
Failure modes (designed behavior)
- If call volume approaches the provider's limits: the rate limiter blocks calls pre-emptively (predictive token accounting), before the provider starts throttling.
- If failures accumulate: the circuit breaker opens to stop cascading errors, then auto-resets after cooldown.
- If Redis/cache is unavailable: caching is disabled automatically (warning logged), calls still proceed.
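The circuit-breaker behavior described above (open after accumulated failures, auto-reset after a cooldown) can be illustrated with a small sketch. It is not the implementation in llm_agent.py: the class names, the consecutive-failure threshold, and the default values are assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, resets after cooldown."""

    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown_seconds
        self.clock = clock  # injectable so tests do not need to sleep
        self._failures = 0
        self._opened_at: Optional[float] = None

    def _is_open(self) -> bool:
        if self._opened_at is None:
            return False
        if self.clock() - self._opened_at >= self.cooldown:
            # Cooldown elapsed: auto-reset and let traffic through again.
            self._opened_at = None
            self._failures = 0
            return False
        return True

    def call(self, fn: Callable[[], T]) -> T:
        """Run fn unless the breaker is open; track failures and successes."""
        if self._is_open():
            raise CircuitOpenError("circuit open; skipping provider call")
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self.clock()
            raise
        self._failures = 0  # a success clears the consecutive-failure count
        return result
```

While the breaker is open, callers receive CircuitOpenError immediately instead of waiting on a failing provider, which is what stops errors from cascading. A production breaker would typically also need locking for concurrent use, which this sketch omits.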
Integration points
In the current architecture, LLMAgent is used by orchestration nodes and services that need:
- intent classification / parsing
- tool routing and structured extraction
- conversational composition
See: docs/architecture/MAPPA_ARCHITETTURALE_MODULI.md
References (deep dive)
- LLM gateway: vitruvyan_core/core/agents/llm_agent.py
- Conversational layer context: .github/Vitruvyan_Appendix_F_Conversational_Layer.md