LEX — AI Legal Platform for Law Firms

AI-powered legal analysis platform for law firms and corporate counsel.


ACADEMIC PDF, 24 pages

Tokenizer Fertility and Zero-Shot Performance of Foundation Models on Ukrainian Legal Text: A Comparative Study

Volodymyr Ovcharov — LEX AI LLC, Kyiv, Ukraine


Abstract

Tokenizer fertility varies 1.6× across foundation models on Ukrainian legal text, yet this cost-critical dimension is absent from model selection practice. The study reports three findings: (1) Qwen 3 consumes 60% more tokens than Llama-family models on the same text; (2) NVIDIA Nemotron Super 3 (120B) outperforms Mistral Large 3, a model with 5.6× more parameters, at one third of the cost; (3) few-shot prompting degrades performance by up to 26 percentage points on Ukrainian.
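To make the fertility comparison concrete, here is a minimal sketch. It assumes fertility is measured as the average number of tokens per whitespace-delimited word, which is a common definition, though the abstract does not state the exact one used in the paper. The `chunk_tokenizer` helper and the sample sentence are hypothetical stand-ins for illustration; they are not the tokenizers or the corpus evaluated in the study.

```python
from typing import Callable, Iterable, List

# Any function that maps a string to a list of tokens.
Tokenizer = Callable[[str], List[str]]


def fertility(tokenize: Tokenizer, texts: Iterable[str]) -> float:
    """Average number of tokens produced per whitespace-delimited word."""
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_tokens += len(tokenize(text))
        total_words += len(text.split())
    if total_words == 0:
        raise ValueError("corpus contains no words")
    return total_tokens / total_words


def chunk_tokenizer(chunk_size: int) -> Tokenizer:
    """Hypothetical toy tokenizer: cuts each word into fixed-size chunks.

    Real subword tokenizers (BPE, SentencePiece) behave differently; this
    only mimics the effect of a coarser or finer vocabulary.
    """
    def tokenize(text: str) -> List[str]:
        return [
            word[i:i + chunk_size]
            for word in text.split()
            for i in range(0, len(word), chunk_size)
        ]
    return tokenize


# Illustrative sentence, not taken from the paper's corpus.
corpus = ["Суд постановив задовольнити позов"]

coarse = fertility(chunk_tokenizer(4), corpus)  # fewer, longer tokens
fine = fertility(chunk_tokenizer(2), corpus)    # more, shorter tokens
ratio = fine / coarse                           # relative token consumption

print(f"coarse={coarse:.2f} fine={fine:.2f} ratio={ratio:.2f}")
```

Under a fixed per-token price, this ratio approximates how much more the same text would cost to process with the higher-fertility tokenizer. To compare real models, a real tokenizer's encode function could be passed in place of the toy one.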

