What is Shim?

Shim is a JSON repair layer for LLM outputs.

We intercept malformed JSON from language models and repair it in sub-millisecond time. No retries. No buffering delays. No production failures.

The Problem We Solve

LLMs produce malformed JSON that breaks production APIs. This isn't a bug—it's a fundamental characteristic of how language models work.

Truncation (15-20% of outputs)

LLM hits token limit before closing brackets. Incomplete objects break JSON.parse().

Markdown Wrapping (10-15% of outputs)

LLM adds ```json fences around valid JSON. Parser chokes on the wrapper.
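
A minimal Python sketch of fence stripping, assuming the model wraps its entire reply in a single fenced block. The `strip_fences` name and the regular expression are illustrative choices, not Shim's actual implementation.

```python
import json
import re

# Matches a reply that is entirely one fenced block, with an optional
# language tag such as "json" after the opening fence.
FENCE_RE = re.compile(
    r"^\s*```[A-Za-z0-9_-]*[ \t]*\r?\n(.*?)\r?\n?[ \t]*```\s*$",
    re.DOTALL,
)


def strip_fences(text: str) -> str:
    """Return the fence body if the text is fenced, else the trimmed text."""
    match = FENCE_RE.match(text)
    return match.group(1).strip() if match else text.strip()


fenced = "```json\n{\"ok\": true}\n```"
print(json.loads(strip_fences(fenced)))  # {'ok': True}
```

Unfenced input passes through unchanged apart from surrounding whitespace, so the function is safe to apply to every response.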

Schema Drift (5-10% of outputs)

LLM returns wrong types or missing required fields. Validation fails downstream.

Your options before Shim:

  • Retry → Expensive ($0.002/retry × 1,000 failures/day = $2/day) and slow (2-5s delay)
  • OutputFixingParser → Waits for full output (2-5s buffering delay), breaks streaming
  • Regex hacks → Brittle, breaks on nested objects, maintenance nightmare

Shim: Receive. Repair. Return. Sub-millisecond repair, <10ms total API latency.

How Shim Works

Shim intercepts LLM outputs, repairs syntax and schema errors, and returns validated JSON in sub-millisecond time.

Shim Repair Pipeline
Stage                | Action                                                      | Latency | Confidence
1. Syntax Repair     | Remove markdown fences, close brackets, fix trailing commas | <0.05ms | High
2. Schema Validation | Validate against provided JSON Schema                       | <0.03ms | High
3. Type Coercion     | Coerce types to match schema (e.g., "30" → 30)              | <0.02ms | Medium
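
Stages 2 and 3 can be illustrated with a short Python sketch. It assumes a heavily simplified, hypothetical schema format (required field name mapped to a Python type) instead of full JSON Schema, and the confidence scoring is an assumption modeled on the table above: a coercion lowers confidence to medium, an unresolved error to low.

```python
from typing import Any

# Hypothetical, heavily simplified schema: required field name -> expected type.
Schema = dict[str, type]


def validate_and_coerce(data: dict[str, Any], schema: Schema) -> dict[str, Any]:
    """Check required fields and coerce numeric strings; report every change."""
    result = dict(data)
    repairs: list[str] = []
    errors: list[str] = []

    for name, expected in schema.items():
        if name not in result:
            errors.append(f"missing required field: {name}")
            continue
        value = result[name]
        # bool is a subclass of int in Python, so it must not pass as a number.
        is_bool_as_number = isinstance(value, bool) and expected in (int, float)
        if isinstance(value, expected) and not is_bool_as_number:
            continue
        # Only numeric strings are coerced, e.g. "30" -> 30.
        if isinstance(value, str) and expected in (int, float):
            try:
                result[name] = expected(value.strip())
                repairs.append(f"coerced {name} from str to {expected.__name__}")
                continue
            except ValueError:
                pass
        errors.append(f"wrong type for {name}: expected {expected.__name__}")

    # Assumed scoring: any coercion gives medium, any remaining error gives low.
    confidence = "low" if errors else "medium" if repairs else "high"
    return {"data": result, "repairs": repairs, "errors": errors, "confidence": confidence}


report = validate_and_coerce({"name": "Ada", "age": "30"}, {"name": str, "age": int})
print(report["data"], report["confidence"])  # {'name': 'Ada', 'age': 30} medium
```

Returning the list of repairs alongside the data lets the caller decide whether a coerced value is acceptable for its use case.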

Who Uses Shim

Indie Developers

Building AI apps with LangChain, LlamaIndex, or custom frameworks. Ship faster without worrying about JSON edge cases.

Startups

Shipping AI features to production. Need reliability without hiring a dedicated AI infrastructure team.

Enterprises

Deploying AI at scale with compliance requirements. Zero data persistence meets security audits.

Performance Benchmarks

According to Shim's internal testing (February 2026):

  • Repair engine latency: <0.1ms (P99: 0.03ms)
  • Success rate for syntax repairs: 99.9%
  • Speed vs. OutputFixingParser: 200-500x faster
  • Edge locations globally (Cloudflare): 330+

Source: Shim Engineering Team, tested on 10M+ real-world LLM outputs

Our Principles

1. Zero Data Persistence

We never store your content. Period.

  • No raw LLM outputs stored
  • No repaired JSON stored
  • No field names or values logged
  • Only metadata tracked: repair types, confidence scores, latency

2. Never Fail Silently

Every response includes confidence scores and exact repairs applied. No surprises.

  • HTTP 200 always, even for failures (structured error envelope)
  • Confidence levels: high, medium, low (pessimistic, never overpromise)
  • Detailed metadata shows exactly what changed
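
Because the status code is always 200, a client has to read success or failure from the response body. The Python sketch below shows one way a client might do that. The envelope shape and every field name (`ok`, `confidence`, `data`, `repairs`, `error`) are hypothetical and are not taken from Shim's API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class RepairEnvelope:
    """Hypothetical response body; field names are assumptions, not Shim's API."""
    ok: bool
    confidence: str                      # "high", "medium", or "low"
    data: Optional[Any] = None
    repairs: list[str] = field(default_factory=list)
    error: Optional[str] = None


def unwrap(body: str, accept: tuple[str, ...] = ("high", "medium")) -> Any:
    """Return the repaired data, or raise if the envelope reports a failure."""
    envelope = RepairEnvelope(**json.loads(body))
    # The status code cannot signal failure here, so check the body instead.
    if not envelope.ok:
        raise ValueError(f"repair failed: {envelope.error}")
    if envelope.confidence not in accept:
        raise ValueError(f"confidence too low: {envelope.confidence}")
    return envelope.data


success = json.dumps(asdict(RepairEnvelope(
    ok=True, confidence="medium", data={"age": 30},
    repairs=["coerced age from str to int"],
)))
failure = json.dumps(asdict(RepairEnvelope(
    ok=False, confidence="low", error="unrepairable input",
)))

print(unwrap(success))  # {'age': 30}
```

Calling `unwrap(failure)` raises `ValueError`, which keeps a failed repair from flowing silently into downstream code.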

3. Degrade, Don't Break

Never break production, even during abuse or quota overages.

  • Overage = throttle, not block
  • Rate limit = slow down, not reject
  • Even at 150% quota, return basic repairs
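
One possible reading of this policy, sketched in Python. The function name, the delay formula, the 500 ms cap, and the choice to switch to basic repairs at exactly 150% of quota are all assumptions made for illustration. Only the principle (slow down, never reject) and the 150% figure come from the text above.

```python
def degrade_policy(used: int, quota: int) -> dict[str, object]:
    """Map current usage to a service level; never returns a rejection."""
    percent = used * 100 // quota        # integer math avoids float rounding
    if percent <= 100:
        return {"mode": "full", "delay_ms": 0}
    # Over quota: add a delay that grows with the overage, capped at 500 ms.
    delay_ms = min(500, (percent - 100) * 10)
    # Assumed threshold: at or past 150% of quota, serve basic repairs only.
    mode = "basic" if percent >= 150 else "full"
    return {"mode": mode, "delay_ms": delay_ms}


for used in (800, 1200, 1500):
    print(used, degrade_policy(used, quota=1000))
```

Every branch returns a service level, so there is no code path that blocks a caller outright.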

Why Shim Exists

We built Shim because we were tired of LLM outputs breaking production.

Every AI developer hits this wall: your LLM works perfectly in testing, then production traffic reveals the edge cases. Truncated JSON. Markdown fences. Type mismatches. The failure rate is 15-30% depending on your prompt complexity.

The existing solutions all suck:

  • Retries cost money and add 2-5 seconds of latency
  • OutputFixingParser waits for full output (kills streaming, adds 2-5s delay)
  • Regex hacks break on nested objects and become unmaintainable

We needed something that:

  1. Works with streaming (token-by-token repair, no buffering)
  2. Is fast (sub-millisecond, not multi-second)
  3. Respects privacy (zero data persistence)
  4. Never breaks production (graceful degradation, always return valid JSON)
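
The first requirement, repairing a stream without buffering it, can be sketched in Python as a small state machine that tracks open brackets and string state as chunks arrive. The `StreamingBalancer` class is a hypothetical illustration, not Shim's engine, and it only handles missing closers. A stream that stops after a dangling key, colon, or comma would need extra handling.

```python
import json


class StreamingBalancer:
    """Tracks bracket and string state across chunks without storing the text."""

    def __init__(self) -> None:
        self._stack: list[str] = []      # expected closers, innermost last
        self._in_string = False
        self._escaped = False

    def feed(self, chunk: str) -> None:
        """Update state from one chunk; cost is linear in the chunk length."""
        for ch in chunk:
            if self._in_string:
                if self._escaped:
                    self._escaped = False
                elif ch == "\\":
                    self._escaped = True
                elif ch == '"':
                    self._in_string = False
            elif ch == '"':
                self._in_string = True
            elif ch == "{":
                self._stack.append("}")
            elif ch == "[":
                self._stack.append("]")
            elif ch in "}]" and self._stack:
                self._stack.pop()

    def closing_suffix(self) -> str:
        """Characters that would balance everything seen so far."""
        suffix = ""
        if self._in_string:
            # A dangling backslash is completed as an escaped backslash
            # so that the closing quote is not swallowed.
            suffix = ("\\" if self._escaped else "") + '"'
        return suffix + "".join(reversed(self._stack))


balancer = StreamingBalancer()
received = ""
for chunk in ['{"items": [{"id": 1}, ', '{"id": 2, "name": "wid']:
    balancer.feed(chunk)
    received += chunk                    # kept here only to demonstrate the result

print(json.loads(received + balancer.closing_suffix()))
```

The balancer itself holds only a stack and two flags, so a valid closing suffix is available at any point in the stream, including the moment the model hits its token limit.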

That's Shim. A reliability layer for AI agents. Infrastructure, not a feature.

Ready to fix your JSON?

Start with 1,000 free repairs per month. No credit card required.