TECH 2026-02-25 10 min

How We Built an MCP Server with 56 Tools for Legal AI

One endpoint. Three services. 58 MCP tools (56 when this post was first published). Triple transport: stdio for Claude Desktop, HTTP REST for web apps, SSE for streaming. Every tool call goes through an 11-step pipeline with cost tracking at each stage. The number of tools will grow. The architecture does not care.

One endpoint. Three services. Triple transport. Here is what it takes to build a production MCP server that actually scales.

The Problem: Legal AI Needs More Than a Single API Call

When a lawyer asks "Negatory or vindication claim for unauthorized occupation of a land plot?", the answer requires searching 200+ court decisions, retrieving texts from the Civil Code and the Land Code, comparing "for" and "against" practice, checking precedents, and synthesizing a strategic recommendation.

This is not a single LLM call. It is an orchestrated pipeline of 5-7 tool calls.

Architecture: 56 Tools, Three Services, One Gateway

A single environment variable, ENABLE_UNIFIED_GATEWAY=true, turns the backend into an aggregation point: one gateway in front of the tools of all three services.
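
A rough sketch of what such a flag could control, assuming a registry that maps each tool name either to a local handler or to a remote service. Only ENABLE_UNIFIED_GATEWAY comes from the setup described here; ToolRoute, resolveRoute, and the shape of the registries are hypothetical.

```typescript
// Hypothetical sketch: how a gateway flag might pick local vs remote execution.
type ToolRoute =
  | { kind: "local"; handler: string }
  | { kind: "remote"; serviceUrl: string };

type Env = Record<string, string | undefined>;

// Assumption: only the literal string "true" enables the gateway.
function isGatewayEnabled(env: Env): boolean {
  return env["ENABLE_UNIFIED_GATEWAY"] === "true";
}

function resolveRoute(
  toolName: string,
  localTools: Set<string>,
  remoteTools: Map<string, string>,
  env: Env,
): ToolRoute | undefined {
  if (localTools.has(toolName)) {
    return { kind: "local", handler: toolName };
  }
  // Remote tools are only reachable when the gateway is switched on.
  const serviceUrl = remoteTools.get(toolName);
  if (serviceUrl !== undefined && isGatewayEnabled(env)) {
    return { kind: "remote", serviceUrl };
  }
  return undefined;
}
```

Passing the environment in as a parameter, rather than reading it globally, keeps the routing decision easy to test.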

Triple Transport

stdio (MCP Native)

Pure JSON-RPC over stdin/stdout, used by Claude Desktop and the MCP CLI. No network layer, so no HTTP overhead.
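
For illustration, this is roughly what a tool call looks like on the wire. The tools/call method and its name/arguments parameters follow the MCP specification, and stdio messages are newline-delimited JSON. The helper names and the argument values below are placeholders, not taken from the actual server.

```typescript
// Sketch of MCP tool-call framing over stdio (JSON-RPC 2.0, one message per line).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: serialize a tool call as a single line for stdin.
function encodeToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): string {
  const request: ToolCallRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
  // JSON.stringify without indentation never emits raw newlines,
  // so the trailing "\n" is the only message delimiter.
  return JSON.stringify(request) + "\n";
}

// Hypothetical helper: parse one response line read from stdout.
function decodeLine(
  line: string,
): { id: number | string | null; result?: unknown; error?: unknown } {
  return JSON.parse(line.trim());
}

const wireMessage = encodeToolCall(1, "get_legislation_section", {
  rada_id: "<rada_id>",
  query: "<text query>",
});
```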

HTTP REST API

POST /api/tools/:toolName with a Bearer token. A batch endpoint handles parallel execution. Sending an Accept: text/event-stream header switches the response to SSE.
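
A minimal client-side sketch of such a request. The path, the Bearer token, and the Accept header come from the description above; the request body shape (tool arguments sent directly as JSON) and the helper name buildToolRequest are assumptions.

```typescript
// Sketch: build the HTTP request for a single tool call.
interface ToolHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildToolRequest(
  baseUrl: string,
  toolName: string,
  args: Record<string, unknown>,
  token: string,
  stream = false,
): ToolHttpRequest {
  return {
    url: `${baseUrl}/api/tools/${encodeURIComponent(toolName)}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
      // Asking for text/event-stream opts in to an SSE response.
      Accept: stream ? "text/event-stream" : "application/json",
    },
    // Assumption: the tool arguments are the whole JSON body.
    body: JSON.stringify(args),
  };
}
```

The returned url, method, headers, and body can be handed to fetch or any other HTTP client.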

SSE (MCP-over-SSE)

Two variants: ChatGPT/OpenAI protocol (/sse) and standard MCP SSE (/v1/sse).
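
Both variants share the same SSE wire format: an optional event line, one or more data lines, and a blank line that ends the frame. A small sketch of writing and reading such frames follows; the event names and payload are illustrative, not the server's actual events.

```typescript
// Sketch: format and parse a single Server-Sent Events frame.
function formatSseEvent(event: string, payload: unknown): string {
  // Compact JSON has no raw newlines, so one data line is enough.
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

function parseSseEvent(frame: string): { event: string; data: string } {
  let event = "message"; // default event type in the SSE format
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) {
      event = line.slice("event:".length).trim();
    } else if (line.startsWith("data:")) {
      // A single leading space after the colon is not part of the value.
      dataLines.push(line.slice("data:".length).replace(/^ /, ""));
    }
  }
  // Multiple data lines are joined with newlines.
  return { event, data: dataLines.join("\n") };
}
```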

Call Flow: 11 Steps

1. dualAuth: JWT or API key
2. Balance check → 402 if insufficient
3. Credit calculation for the tool
4. Cost tracking: pending record
5. Cost estimation before execution
6. Gateway routing: local or remote?
7. Execution in AsyncLocalStorage context
8. Handler dispatch → domain logic
9. Tracking completion: actual tokens
10. Credit deduction after success
11. Response with cost breakdown
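
The billing-related part of this flow can be sketched as follows. This is a simplified illustration, not the production pipeline: it is synchronous, runs local handlers only, merges credit calculation and cost estimation into one lookup, and treats any non-positive balance as insufficient. All type and function names are hypothetical.

```typescript
interface Account { userId: string; balance: number }
interface CostRecord {
  tool: string;
  status: "pending" | "completed" | "failed";
  estimated: number;
  actual?: number;
}
interface ToolResponse {
  status: number;
  body?: unknown;
  error?: string;
  cost?: { estimated: number; charged: number };
}
type Handler = (args: Record<string, unknown>) => { result: unknown; credits: number };

function callTool(
  account: Account | undefined,
  tool: string,
  args: Record<string, unknown>,
  handlers: Map<string, Handler>,
  estimate: (tool: string) => number,
  ledger: CostRecord[],
): ToolResponse {
  // Step 1: authentication is assumed to have produced `account`.
  if (account === undefined) return { status: 401, error: "unauthenticated" };
  // Step 2: refuse early when the balance is exhausted.
  if (account.balance <= 0) return { status: 402, error: "insufficient balance" };
  // Steps 3-5: price the call and record it as pending.
  const estimated = estimate(tool);
  const record: CostRecord = { tool, status: "pending", estimated };
  ledger.push(record);
  // Steps 6-8: routing and dispatch, local handlers only in this sketch.
  const handler = handlers.get(tool);
  if (handler === undefined) {
    record.status = "failed";
    return { status: 404, error: "unknown tool" };
  }
  try {
    const { result, credits } = handler(args);
    // Step 9: complete the tracking record with the actual cost.
    record.status = "completed";
    record.actual = credits;
    // Step 10: deduct only after the handler succeeded.
    account.balance -= credits;
    // Step 11: return the result with a cost breakdown.
    return { status: 200, body: result, cost: { estimated, charged: credits } };
  } catch {
    record.status = "failed";
    return { status: 500, error: "tool failed" };
  }
}
```

Writing the pending record before execution means a crashed call still leaves a trace in the ledger, while deducting only after success means a failed call costs the user nothing.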

Patterns That Saved Us

Cost hints in descriptions: every tool's description includes an estimated cost, so the LLM can weigh it while planning which tools to call.
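
One way to attach such a hint, shown as a sketch. The name, description, and inputSchema fields are the standard MCP tool definition fields; the wording of the hint, the credit figure, and the helper withCostHint are assumptions.

```typescript
// Sketch: append an estimated cost to a tool description.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

function withCostHint(tool: ToolDefinition, estimatedCredits: number): ToolDefinition {
  // Return a copy so the base definition stays reusable.
  return {
    ...tool,
    description: `${tool.description} Estimated cost: ~${estimatedCredits} credits.`,
  };
}

const baseTool: ToolDefinition = {
  name: "get_legislation_section",
  description: "Retrieve a section of a law.",
  inputSchema: { type: "object" },
};
const hintedTool = withCostHint(baseTool, 1); // the figure 1 is illustrative
```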

Budget-aware models: the reasoning_budget parameter selects the model (quick → nano, deep → gpt-5.1).
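
The mapping itself can be as small as a lookup table. In this sketch the model strings are copied as written above ("nano" is kept as shorthand rather than expanded to a full model ID), and falling back to the cheapest tier for unknown values is an assumption.

```typescript
// Sketch: map a reasoning_budget value to a model name.
type ReasoningBudget = "quick" | "deep";

const MODEL_BY_BUDGET: Record<ReasoningBudget, string> = {
  quick: "nano",
  deep: "gpt-5.1",
};

function selectModel(reasoningBudget: string | undefined): string {
  if (reasoningBudget === "quick" || reasoningBudget === "deep") {
    return MODEL_BY_BUDGET[reasoningBudget];
  }
  // Assumption: missing or unknown budgets fall back to the cheapest tier.
  return MODEL_BY_BUDGET.quick;
}
```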

Vault isolation: userId is injected at the transport level, so the tool schema knows nothing about authentication.
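
Since the call flow runs handlers inside an AsyncLocalStorage context, one plausible way to get this isolation is to carry the userId in that context. The sketch below uses Node's AsyncLocalStorage; runAsUser, vaultPathFor, and the path layout are hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface RequestContext { userId: string }

const requestContext = new AsyncLocalStorage<RequestContext>();

// Transport layer: called once authentication has resolved the user.
function runAsUser<T>(userId: string, fn: () => T): T {
  return requestContext.run({ userId }, fn);
}

// Tool handler: its arguments carry only domain fields, never a userId.
function vaultPathFor(args: { folder: string }): string {
  const ctx = requestContext.getStore();
  if (ctx === undefined) {
    throw new Error("no authenticated request context");
  }
  return `vault/${ctx.userId}/${args.folder}`;
}

const examplePath = runAsUser("user-1", () => vaultPathFor({ folder: "contracts" }));
```

Because the handler never accepts a userId argument, a model cannot ask for another user's vault by changing tool parameters.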

Route normalization: without it, every tool name and every UUID in a request path becomes its own label value, and 56 tools plus UUIDs create thousands of time series in Prometheus.
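
A minimal sketch of such normalization, applied to the path before it is used as a metric label. The /api/tools/ prefix comes from the REST transport described earlier; the placeholder names and the choice to collapse the tool name into the route are assumptions.

```typescript
// Sketch: collapse high-cardinality path segments into placeholders.
const UUID_RE =
  /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;

function normalizeRoute(path: string): string {
  return path
    .replace(/\?.*$/, "") // drop the query string
    .replace(/^\/api\/tools\/[^/]+/, "/api/tools/:toolName")
    .replace(UUID_RE, ":id");
}
```

With this, every tool call is reported under the single route /api/tools/:toolName. If per-tool metrics are needed, the tool name can go into a separate label, since its cardinality is bounded by the number of tools.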

Numbers

56 tools across 3 services
12 handler classes in the backend
3 transports per service
5,191 legislation articles
16 state registries
Latency: 200ms (cache) to 8s (deep analysis)

The number of tools will grow. The architecture does not care.

Update: New Tools (March 2026)

The total number of MCP tools has grown from 56 to 58 thanks to two new tools in the mcp_openreyestr service.

New tools:

openreyestr_search_erb_debtors — search the Unified Debtors Registry (ERB). Allows finding individuals and legal entities with active enforcement proceedings, filtered by recovery type and debt category.
openreyestr_search_nbu_banks — search the NBU bank registry. Provides access to information about banking institutions, their status (active, liquidation), licenses, and contact details.

Improvements to existing tools:

The get_legislation_section tool now supports vector search as a fallback strategy. If the user provides a rada_id and a text query without a specific article number, the system automatically performs semantic search across the vector index of the relevant law, returning the most relevant sections.
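
The selection logic can be sketched as below. Only rada_id is a parameter name taken from the description above; the article and query field names, the Strategy type, and chooseStrategy are assumptions made for illustration.

```typescript
// Sketch: choose between direct lookup and vector-search fallback.
interface SectionRequest {
  rada_id: string;
  article?: string; // hypothetical name for the article number
  query?: string;   // hypothetical name for the free-text query
}

type Strategy =
  | { kind: "direct"; rada_id: string; article: string }
  | { kind: "vector"; rada_id: string; query: string }
  | { kind: "invalid"; reason: string };

function chooseStrategy(req: SectionRequest): Strategy {
  const article = (req.article ?? "").trim();
  const query = (req.query ?? "").trim();
  // A specific article number always wins: exact lookup is cheaper and precise.
  if (article !== "") {
    return { kind: "direct", rada_id: req.rada_id, article };
  }
  // Fallback: no article number, but a text query scoped to one law.
  if (query !== "") {
    return { kind: "vector", rada_id: req.rada_id, query };
  }
  return { kind: "invalid", reason: "need an article number or a text query" };
}
```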