AI Integration · 2026

AI document Q&A for legal-tech SaaS (anonymised)

Rebuilt a retrofitted GPT Q&A feature as an AI-native pipeline: pgvector retrieval, reranking, structured outputs, an eval harness, and an LLM gateway with caching. Cost per question dropped 74%, median latency dropped 77%, and the hallucination rate fell below 1%.

Section 01 · Problem

What we were asked to solve.

The client had a GPT wrapper bolted onto their existing legal research product. It was slow, expensive, and had a visibly high hallucination rate. They were considering abandoning the feature.

Section 02 · Approach

How we engineered it.

Three-week engagement. Week 1: migrated document embeddings into pgvector, designed the reranking pipeline, and built a 40-case eval set. Week 2: migrated to structured outputs and built the LLM gateway with caching and model routing. Week 3: production cut-over with two days of shadow-mode comparison, then full rollout behind a feature flag.
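
To make the Week 2 gateway concrete, here is a minimal sketch of caching plus model routing. Everything specific in it is an assumption for illustration: the model names, the context-length routing rule, the in-memory cache, and the `LLMGateway` class itself are hypothetical, and the backends are plain callables rather than any real provider SDK. The production gateway may well differ.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical model names; the case study does not say which models were used.
CHEAP_MODEL = "small-model"
STRONG_MODEL = "large-model"


@dataclass
class LLMGateway:
    """Routes each request to a model and caches answers by request hash."""

    backends: Dict[str, Callable[[str], str]]  # model name -> callable(prompt) -> answer
    max_cheap_context_chars: int = 2000  # assumed routing threshold
    cache: Dict[str, str] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def route(self, context: str) -> str:
        # Assumed rule: short contexts go to the cheaper model.
        if len(context) <= self.max_cheap_context_chars:
            return CHEAP_MODEL
        return STRONG_MODEL

    def _key(self, model: str, question: str, context: str) -> str:
        # Normalise case and whitespace so trivially different phrasings share an entry.
        normalised = " ".join(question.lower().split())
        raw = "\x1f".join([model, normalised, context])
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def ask(self, question: str, context: str) -> str:
        model = self.route(context)
        key = self._key(model, question, context)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        prompt = f"Context:\n{context}\n\nQuestion: {question}"
        answer = self.backends[model](prompt)
        self.cache[key] = answer
        return answer
```

Keying the cache on the model, the normalised question, and the retrieved context means a repeated question only hits the cache when it would be answered from the same source text, which matters for a legal product where the underlying documents change.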

Section 03 · Stack

What we built with.

· Next.js
· Postgres + pgvector
· LangChain
· Langfuse
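
As a rough illustration of how pgvector retrieval and a reranking step fit together, the sketch below builds a nearest-neighbour query and then reorders the candidates. The table name `document_chunks`, its columns, and the helper names are hypothetical. The reranker here is a simple blend of vector similarity and word overlap, used only as a stand-in: the case study does not say which reranking method was used. `<=>` is pgvector's cosine-distance operator.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical schema: document_chunks(id, content, embedding vector).
RETRIEVE_SQL = (
    "SELECT id, content, embedding <=> %s::vector AS distance "
    "FROM document_chunks "
    "ORDER BY embedding <=> %s::vector "
    "LIMIT %s"
)


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_query(embedding: Sequence[float], k: int = 20) -> Tuple[str, tuple]:
    """Return SQL and parameters suitable for a DB-API cursor.execute call."""
    vec = to_vector_literal(embedding)
    return RETRIEVE_SQL, (vec, vec, k)


@dataclass
class Candidate:
    chunk_id: int
    content: str
    distance: float  # cosine distance returned by the query


def lexical_overlap(question: str, content: str) -> float:
    """Fraction of question words that also appear in the chunk."""
    q = set(question.lower().split())
    c = set(content.lower().split())
    return len(q & c) / len(q) if q else 0.0


def rerank(question: str, candidates: List[Candidate],
           top_n: int = 5, alpha: float = 0.5) -> List[Candidate]:
    """Stand-in reranker: blend vector similarity with word overlap."""
    def score(c: Candidate) -> float:
        similarity = 1.0 - c.distance
        return alpha * similarity + (1.0 - alpha) * lexical_overlap(question, c.content)
    return sorted(candidates, key=score, reverse=True)[:top_n]
```

The usual pattern is to retrieve a generous candidate set (here `k=20`) cheaply by vector distance, then pass only the top few reranked chunks to the model, which keeps prompt size and cost down.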

Section 04 · Outcome

What shipped.

Result 01

Cost per question: $0.08 → $0.021 (74% reduction)

Result 02

Median latency: 4s → 900ms (77% reduction)

Result 03

Hallucination rate: 6% → under 1% (over 8 weeks post-launch)

Result 04

Monthly AI bill: $4,800 → $1,260

Result 05

40 evals shipped in CI, preventing regressions
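
For a sense of what evals in CI can look like, here is a minimal sketch of a harness that fails the build when answers regress. The `EvalCase` shape, the phrase-matching check, and the `ci_exit_code` helper are assumptions for illustration. The real 40-case set and its scoring method are not described on this page.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class EvalCase:
    question: str
    must_include: List[str]      # phrases a grounded answer should contain
    must_not_include: List[str]  # phrases that indicate a known wrong answer


def run_evals(answer_fn: Callable[[str], str],
              cases: Sequence[EvalCase]) -> Tuple[float, List[str]]:
    """Run every case and return (pass rate, questions that failed)."""
    if not cases:
        raise ValueError("eval set is empty")
    failures: List[str] = []
    for case in cases:
        answer = answer_fn(case.question).lower()
        has_required = all(p.lower() in answer for p in case.must_include)
        has_forbidden = any(p.lower() in answer for p in case.must_not_include)
        if not has_required or has_forbidden:
            failures.append(case.question)
    passed = len(cases) - len(failures)
    return passed / len(cases), failures


def ci_exit_code(pass_rate: float, threshold: float = 1.0) -> int:
    """Return 0 when the pass rate meets the threshold, 1 otherwise."""
    return 0 if pass_rate >= threshold else 1
```

In a CI job, `answer_fn` would call the Q&A pipeline and the script would exit with `ci_exit_code(pass_rate)`, so a prompt or model change that breaks a known-good answer blocks the merge.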
