Problem
Enterprise ITSM teams drown in repetitive incident triage. Off-the-shelf cloud LLMs raise compliance and data residency concerns. Existing solutions don't integrate cleanly with ServiceNow data models or respect on-prem constraints.
Solution
Python middleware (Flask + LangChain) running on a Dell Precision workstation with an RTX 6000 GPU, serving gemma2:27b via Ollama. ChromaDB provides a vector store over incident history. A PII sanitization layer scrubs data before any prompt construction. A REST API is exposed to ServiceNow for inline incident enrichment. Zero internet egress required.
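The sanitize-then-construct flow is the core of the compliance story: no raw ticket text ever reaches the model. A minimal sketch of that layer is below; the pattern set, placeholder labels, and function names are illustrative assumptions, not the deployed implementation (a production build would use a vetted PII library and org-specific rules).

```python
import re

# Illustrative PII patterns (hypothetical; a real deployment would be stricter).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b(?:\+?\d{1,2}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def sanitize(text: str) -> str:
    """Replace detected PII with typed placeholders before prompt construction."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def build_prompt(incident_text: str, similar_incidents: list[str]) -> str:
    """Assemble the enrichment prompt from sanitized inputs only."""
    context = "\n".join(f"- {sanitize(s)}" for s in similar_incidents)
    return (
        "You are an ITSM triage assistant.\n"
        f"Similar past incidents:\n{context}\n\n"
        f"New incident:\n{sanitize(incident_text)}\n"
        "Suggest a category, priority, and first remediation step."
    )
```

The resulting prompt string is what the middleware would POST to the local Ollama endpoint; since every field passes through `sanitize` first, the zero-egress guarantee holds even if logging is enabled downstream.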
Outcome
Faster incident resolution, reduced analyst load on tier-1 triage, and a deployment model suitable for regulated industries (banking, healthcare, government). The architecture demonstrated that production-grade AI augmentation doesn't require sending data to a cloud LLM.
