TECH 2026-04-17 6 min

Open Doors: Looking for Independent AI/ML Engineers and Open-Source Contributors

Name: LEX
Author: SecondLayer

LEX AI is opening most of its platform as open source. We welcome strong engineers (AI/ML, backend, data, frontend) to contribute or join the team. This post covers what's already open, who we're looking for, and how to get involved.

A small team has been building LEX AI since 2024. We're now opening most of the platform as open source and inviting independent engineers to join, both as contributors and as future team members.

What LEX AI Is

LEX is a Ukrainian legal AI platform. It offers semantic search across 100M+ court decisions (EDRSR, the largest open court decisions corpus in Europe), legislation from the Ukrainian Parliament, OSINT and due diligence, consultations, and billing. The stack is assembled as MCP (Model Context Protocol) servers behind a unified gateway.

Our second product — Panoptic (panoptic.com.ua) — is an OSINT platform aggregating 18+ intelligence data sources: sanctions, corporate ownership, credential breaches, IP/domain reputation, GDELT, INTERPOL, World Bank Debarment.

We're building Harvey.ai-level quality for Ukrainian jurisprudence on open-weight models — DeepSeek-V3, Llama, Qwen — because the data is unique (no such corpus exists in the EU), and open-weight models after continued pre-training deliver 90%+ of flagship LLM quality on domain tasks at a fraction of the cost.

Our Repository Layout

We maintain two repositories, and this is important to understand up front.

1. `overthelex/secondlayer` — public, open source

The main monorepo, now public:

https://github.com/overthelex/secondlayer

Almost the entire platform is there:

Three MCP servers (mcp_backend, mcp_rada, mcp_openreyestr) — court cases, parliament, business registry
Web frontend (lexwebapp) — React 19, Vite, TailwindCSS, Zustand, TanStack Query
Shared TypeScript package (packages/shared) — LLM manager, logger, cost tracker, SSE handler, database base class
Developer Console (platform) — platform.legal.org.ua, the developer portal: API keys, documentation, integration examples
Data importers for 340M+ records from 15 government APIs, including EDRSR, Verkhovna Rada, NACP, OpenReyestr, OpenSanctions, GLEIF, ICIJ Offshore Leaks, HIBP, NVD, INTERPOL, and World Bank
Full CI/CD — self-hosted GitHub Actions runner, blue-green deploy over SSH, Claude Code auto-fix agents for failing builds
All deployment configuration — Docker Compose for local, blue-green compose for production, nginx, manage-gateway script
Playwright E2E + Jest/Vitest unit tests
Migrations for three PostgreSQL instances
Internal documentation, architecture notes

Clone it, read it, run it locally. Everything needed for a working instance is there.
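
To make the "MCP servers behind a unified gateway" idea concrete, here is a minimal TypeScript sketch of how a gateway might route a tool call to one of the three servers listed above. The request shape follows the MCP `tools/call` method (JSON-RPC 2.0). Everything else is an assumption made for illustration and is not taken from the repository: the `court` / `rada` / `registry` prefixes, the URLs and ports, the `prefix_tool` naming scheme, and the `route` function itself.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: "tools/call";
  params: { name: string; arguments?: Record<string, unknown> };
}

// Hypothetical prefix-to-server map. The server names come from the list
// above; the prefixes, hosts, and ports are illustrative assumptions.
const BACKENDS = new Map<string, string>([
  ["court", "http://mcp_backend:3000"],
  ["rada", "http://mcp_rada:3001"],
  ["registry", "http://mcp_openreyestr:3002"],
]);

interface RoutedCall {
  url: string;                // backend that should receive the call
  forwarded: ToolCallRequest; // same request with the prefix stripped
}

// Pick a backend from the tool-name prefix ("rada_search_laws" -> "rada")
// and rewrite the request so the backend sees its local tool name.
function route(req: ToolCallRequest): RoutedCall {
  const fullName = req.params.name;
  const sep = fullName.indexOf("_");
  if (sep <= 0 || sep === fullName.length - 1) {
    throw new Error(`tool name "${fullName}" is not of the form prefix_tool`);
  }
  const url = BACKENDS.get(fullName.slice(0, sep));
  if (url === undefined) {
    throw new Error(`no backend registered for tool "${fullName}"`);
  }
  return {
    url,
    forwarded: {
      ...req,
      params: { ...req.params, name: fullName.slice(sep + 1) },
    },
  };
}
```

Under these assumptions, a call to `rada_search_laws` would be forwarded to the `mcp_rada` server as `search_laws`, and an unknown prefix fails fast instead of being silently dropped. The actual gateway in the repository may route differently.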

2. `overthelex/secondlayer-core` — private, closed source

A separate repository we deliberately keep private. It contains:

Chat and orchestration logic — how user queries are classified, routed between tools, and composed into multi-step responses
Production prompts — exact templates, few-shot examples, system messages used in production for classification, summarization, citation checks, tool selection
Billing and payment business logic — credit deduction rules, subscription tier resolution, Monobank callback handlers
Anti-abuse and rate-limiting heuristics we don't want adversaries to enumerate

This is the minimum closed surface that protects our product positioning without holding back the open parts. The whole "chat logic" — prompt engineering, tool orchestration, model cascading, response composition — lives here, and it is not public. The open repository expects this layer as a dependency but ships fully functional stub implementations for contributors.
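
One way to picture the stub arrangement described above is the following minimal TypeScript sketch. It is not taken from the repository: the `ChatOrchestrator` interface, the `StubOrchestrator` class, its keyword rules, and the `resolveOrchestrator` helper are all hypothetical names chosen for illustration. The idea is that the open code depends only on an interface, tries to load the private implementation, and falls back to a stub when that implementation is absent.

```typescript
// Hypothetical intent labels; the real classifier lives in the private repo.
type Intent = "case_search" | "legislation" | "general";

// Hypothetical contract the open repo could define for the closed layer.
interface ChatOrchestrator {
  classify(query: string): Intent;
  respond(query: string): Promise<string>;
}

// Stub for external contributors: naive keyword rules, no production prompts.
class StubOrchestrator implements ChatOrchestrator {
  classify(query: string): Intent {
    const q = query.toLowerCase();
    if (q.includes("court") || q.includes("decision")) return "case_search";
    if (q.includes("law") || q.includes("statute")) return "legislation";
    return "general";
  }

  async respond(query: string): Promise<string> {
    // Echo the query with its label so the rest of the stack can be exercised.
    return `[stub:${this.classify(query)}] ${query}`;
  }
}

// A loader returns the private implementation or rejects if it is missing.
type CoreLoader = () => Promise<ChatOrchestrator>;

// Prefer the private implementation; fall back to the stub on any failure.
async function resolveOrchestrator(loadCore: CoreLoader): Promise<ChatOrchestrator> {
  try {
    return await loadCore();
  } catch {
    return new StubOrchestrator();
  }
}
```

In a real setup the loader would probably be a dynamic `import()` of the private package; in this sketch, a loader that rejects stands in for "core not installed", and `resolveOrchestrator` then returns the stub.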

If you join the team, you get access to secondlayer-core from day one. If you contribute externally, you work against the open repo and the stubs — that already covers everything except production prompt engineering.

Who We're Looking For

We don't hire by job title. We're looking for people who already do strong work — and want to do it on a meaningful domain, with real data and real users.

AI/ML engineers:

LoRA fine-tuning of large models (70B+), continued pre-training
Embeddings fine-tuning (BGE-M3, custom encoders) for retrieval
RLHF, constitutional alignment, adversarial training setups
Hands-on with Vertex AI / SageMaker HyperPod / Trainium / TPU v5p on multi-node clusters
Retrieval-augmented generation, citation verification, hallucination guards

Backend / distributed systems:

PostgreSQL at billion-row scale (pgvector, partitioning, TOAST optimizations)
Event-driven architectures, queues, replication, PgBouncer
MCP servers, tool orchestration, LLM gateways, cost tracking

Data engineering / OSINT:

Scraping at scale (rate-limiting, proxy rotation, resume logic, checkpointing)
ETL for government open registries
Sanctions screening, KYC/AML, due diligence pipelines

Frontend:

React 19 + TypeScript at production level
Complex UI for legal analytics (data-heavy dashboards, evidence panels)
Ukrainian i18n, accessibility, performance optimization

Philosophy

Open everything that doesn't break the business. We don't hide the architecture — it isn't the competitive edge. The edge is data, domain quality, and iteration speed.
Pragmatism over hype. A distributed monolith today can be the right answer. Microservices ≠ virtue. A framework ≠ a solution.
Legal deserves serious AI engineering. Not "a chatbot with statutes" — real legal modeling: constitutional alignment, citation verification, jurisdictional specialization.
Open source by default. If the code doesn't contain proprietary prompts, API keys, or client data, it's public.

How to Join

As a contributor:

Check open issues on GitHub (github.com/overthelex/secondlayer)
Submit a PR — we review within 48 hours
For large changes, open a discussion first

As a hiring candidate:

Email vladimir@legal.org.ua with a short resume. No page-long cover letter needed — show three things:

What you've done before (GitHub, a link to a specific project with detail)
Why this domain — legal AI, open data, OSINT — interests you
What you want to build in the next 6 months

We respond fast. The interview consists of a technical discussion (no LeetCode), a pair-programming session on a real task from the backlog, and a coffee chat with the team.

Our Promise

Fully remote. The team is distributed across Europe.
No micromanagement. Trust by default. Output matters more than Slack presence.
Prod access from day one. No "probation month" in read-only.
Compute budget. If an idea needs a GPU cluster, we talk to Google Cloud, AWS, and Nebius and find the resource.
Publication under your name. Your work is your credit. We don't hide contributors.

Context

We're currently in active conversations with Google Cloud and AWS about sponsorship for a 12-month ML training plan (195K–265K, DeepSeek-V3 685B continued pre-training on 50–80B tokens of the EDRSR corpus). We have paying users and B2B clients. Not a startup-in-a-garage, not another enterprise clone. Something in between — and that's what makes the work interesting.

If you're excited by building real AI infrastructure for jurisprudence on the largest open court decisions corpus in Europe, let's talk.

Open repo: https://github.com/overthelex/secondlayer
Closed core (chat logic): overthelex/secondlayer-core (private, access granted on hire)
Contact: vladimir@legal.org.ua
Site: https://legal.org.ua
