LEGAL 2026-04-02 18 min

AI Model Safety on Open Registries: Asimov's Laws as an Ethical Framework

How do we ensure that a model with access to 50M+ records doesn't become a tool for pressuring the innocent? This article covers Asimov's Three Laws adapted for legal AI, threat scenarios, and architectural solutions for RLHF training on GCP.

Introduction

Lex AI LLC has spent the past 6 months developing a specialized AI model trained on the complete corpus of Ukraine's open government registries: the Unified State Registry of Court Decisions (EDRSR), the legal entities registry, the debtors registry, data from the Verkhovna Rada (Parliament), NAPC (National Agency on Corruption Prevention), the Ministry of Internal Affairs' wanted persons and vehicles registries, NIPO patent registries, and more. Training takes place on Google Cloud Platform (GCP) infrastructure using RLHF (Reinforcement Learning from Human Feedback) and fine-tuning techniques.

This article raises a fundamental question: how do we ensure that a model with access to an unprecedented volume of structured data about individuals and legal entities does not become a tool for pressuring the innocent?

1. Asimov's Three Laws as an Ethical Foundation

In 1942, Isaac Asimov formulated the Three Laws of Robotics, which remain one of the most intuitive ethical frameworks for AI systems.

First Law: Do No Harm

A robot may not injure a human being or, through inaction, allow a human being to come to harm.

In the context of a legal AI model, this means: the model must not generate conclusions, arguments, or connections that could be used for groundless accusations or pressure against an individual. Even when data is formally public, its aggregation and interpretation can create a false picture that causes real harm.

The most acute issue here is the aggregation effect: individually, each registry record is harmless, but combining them can fabricate a "suspect profile" out of nothing. Closely related is the problem of correlation without causation – the model can find statistical relationships between facts that have no causal connection whatsoever and present them as meaningful. Finally, there is selection bias in the training data: if the model is trained predominantly on guilty verdicts (which are statistically more common), it may carry a built-in tilt toward prosecution without even "realizing" it.

Second Law: Obey Humans (But Not Against the First)

A robot must obey orders given it by human beings except where such orders would conflict with the First Law.

This is a critically important principle. Even if a user explicitly asks the model to "find everything that can be used against person X," the model should provide objective information from the registries but refuse to construct a prosecutorial narrative. It must explicitly state that the presence of records in registries does not constitute proof of guilt and suggest that exculpatory circumstances also be considered. Obedience does not mean complicity in manipulation.

Third Law: Protect Yourself (But Not Against the First or Second)

A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

In the context of an AI system, this concerns model integrity: protection against adversarial attacks, prompt injection, and manipulations aimed at circumventing ethical constraints. The model must be resilient against attempts to "convince" it to violate the First Law. If an attacker tries to push the model beyond its boundaries through a series of incremental requests, the system must recognize this pattern and stop.

2. Specific Threats: The Model as a Weapon of Pressure

2.1. The "Dossier on Demand" Scenario

An attacker asks the model to compile everything known about an individual: court cases (including those where the person was a witness or victim), related legal entities, debt obligations, and connections to other persons through co-founding companies.

Why this is dangerous: The result looks like an "objective analysis" but is in fact a manipulative presentation of information. A person who had 3 court cases as a plaintiff (i.e., was defending their own rights) appears in such a dossier as "a person with numerous court disputes." Context is destroyed; only the count remains.

Defense: The model must always indicate the person's procedural status in each case – plaintiff, defendant, third party, victim – along with the case outcome. Without this context, any aggregation is potentially manipulative.
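
The defense above can be illustrated with a minimal sketch. The CaseRecord shape, the Role values, and the field names are assumptions made for illustration, not the EDRSR schema or the actual Lex AI implementation. The point is that the summary never emits a bare case count: every line is tied to a procedural role and an outcome.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    PLAINTIFF = "plaintiff"
    DEFENDANT = "defendant"
    THIRD_PARTY = "third party"
    VICTIM = "victim"


@dataclass(frozen=True)
class CaseRecord:
    # Hypothetical record shape; real registry fields will differ.
    case_number: str
    role: Role                 # the person's procedural status in this case
    outcome: Optional[str]     # None when the registry records no outcome


def summarize_cases(cases: list[CaseRecord]) -> list[str]:
    """Return one line per (role, outcome) group, never a bare total."""
    groups = Counter(
        (c.role.value, c.outcome or "outcome not recorded") for c in cases
    )
    return [
        f"{count} case(s) as {role}: {outcome}"
        for (role, outcome), count in sorted(groups.items())
    ]
```

With this shape, the person from the example above is reported as having "3 case(s) as plaintiff" together with the outcome, rather than as "a person with numerous court disputes."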

2.2. The "Guilt by Association" Scenario

The model discovers that a person co-founded a company whose other founder has a criminal record. Without context, this creates a false impression of involvement. The person may be an impeccable entrepreneur who has no idea about their business partner's past, yet the aggregated analysis puts them in the same category.

Defense: The model must explicitly separate facts about the person themselves from facts about related persons, accompanying each such association with a disclaimer about the absence of legal liability for the actions of third parties.
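
One possible way to enforce this separation in the output structure is sketched below. The Fact and Report types and the disclaimer wording are hypothetical; the sketch only shows that facts about third parties live in a separate section and each carries the disclaimer.

```python
from dataclasses import dataclass, field

# Illustrative wording; the real disclaimer text would be set by lawyers.
THIRD_PARTY_DISCLAIMER = (
    "Facts about related persons do not imply any legal liability "
    "of the subject for the actions of third parties."
)


@dataclass(frozen=True)
class Fact:
    text: str
    about: str  # identifier of the person this fact actually concerns


@dataclass
class Report:
    subject: str
    own_facts: list[str] = field(default_factory=list)
    related_facts: list[str] = field(default_factory=list)


def build_report(subject: str, facts: list[Fact]) -> Report:
    """Split facts into those about the subject and those about others."""
    report = Report(subject)
    for fact in facts:
        if fact.about == subject:
            report.own_facts.append(fact.text)
        else:
            # Every association is labeled with whom it concerns
            # and accompanied by the disclaimer.
            report.related_facts.append(
                f"{fact.text} [concerns {fact.about}; {THIRD_PARTY_DISCLAIMER}]"
            )
    return report
```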

2.3. The "Old Sins" Scenario

The model finds a court decision from 15 years ago in which a person was found guilty of a minor offense. The conviction has long since been expunged, but the data remains in the EDRSR. In legal terms, this person has a completely clean record – but the machine does not understand this without specialized training.

Defense: The model must account for statutes of limitations, criminal record expungement, and the right to be forgotten. Information that, by law, should no longer affect a person's reputation must not be presented as current. Time is not just metadata – it is a legally significant factor.
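
A minimal sketch of treating time as a legally significant factor follows. It assumes a hypothetical expunged_on field that has already been determined by a vetted legal source; the sketch does not attempt to compute expungement terms itself, since those depend on the applicable law.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Conviction:
    # Hypothetical shape; expunged_on is assumed to be supplied by a
    # vetted legal rule set, not derived in this sketch.
    case_number: str
    decided_on: date
    expunged_on: Optional[date]


def is_current(conviction: Conviction, as_of: date) -> bool:
    """A conviction is current only if it is not yet expunged at `as_of`."""
    return conviction.expunged_on is None or conviction.expunged_on > as_of


def current_convictions(items: list[Conviction], as_of: date) -> list[Conviction]:
    """Keep only convictions that may still be presented as current."""
    return [c for c in items if is_current(c, as_of)]
```

Filtering before the data reaches the response layer means an expunged conviction cannot leak into an analysis as if it were a present-day fact.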

3. Architectural Solutions for Ensuring Safety

3.1. Safety Layer in RLHF Training

When training the model on GCP using RLHF, it is critically important to include negative examples in the process – teaching the model to recognize and reject requests aimed at constructing prosecutorial narratives. In parallel, response balancing is essential: for every "aggravating" fact, the model should automatically seek context and mitigating circumstances. Finally, systematic red teaming is required: a dedicated team deliberately tries to "break" the model and use it for manipulation.

3.2. Access Levels and Auditing

The system provides three access levels. At the first, public level, only basic registry search without aggregation is available – users can find a specific court decision or company but cannot build a comprehensive profile of a person. The second level, intended for attorneys and lawyers, unlocks aggregated analysis but accompanies every response with ethical disclaimers and logs requests to an audit trail. The third level – for courts and law enforcement – provides full analysis but with mandatory auditing of every request and the ability to investigate abuse.

Each level has different constraints on the depth of analysis and data cross-referencing.
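
The three levels could be expressed as a small authorization table, as in the sketch below. The operation names ("search", "aggregate", "full_analysis"), the level names, and the in-memory audit log are assumptions for illustration; a production system would use a durable, tamper-evident log.

```python
from datetime import datetime, timezone
from enum import IntEnum


class AccessLevel(IntEnum):
    PUBLIC = 1
    LEGAL_PROFESSIONAL = 2
    COURT_OR_LAW_ENFORCEMENT = 3


# Hypothetical operation names mapped to the levels described above.
ALLOWED_OPERATIONS = {
    AccessLevel.PUBLIC: {"search"},
    AccessLevel.LEGAL_PROFESSIONAL: {"search", "aggregate"},
    AccessLevel.COURT_OR_LAW_ENFORCEMENT: {"search", "aggregate", "full_analysis"},
}

# In-memory stand-in for an audit trail.
audit_log: list[dict] = []


def authorize(user_id: str, level: AccessLevel, operation: str) -> bool:
    """Check the operation against the level; log levels 2 and 3."""
    allowed = operation in ALLOWED_OPERATIONS[level]
    if level >= AccessLevel.LEGAL_PROFESSIONAL:
        audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user_id,
            "level": level.name,
            "operation": operation,
            "allowed": allowed,
        })
    return allowed
```

Denied attempts are logged as well, since a refused request for full analysis is itself useful evidence when investigating abuse.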

3.3. Mandatory Disclaimers

The model must automatically append to every analytical response: the source of each fact (specific registry, case number, date), procedural context (the person's role in the case and the outcome), a general disclaimer that the presence of information in a registry does not constitute proof of guilt, and a recommendation to consult a qualified lawyer for legal assessment. This is not "fine print" – it is an integral part of every response, without which any analysis is incomplete and potentially dangerous.
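
One way to make the disclaimers structurally unavoidable is to let the response renderer refuse any fact that lacks its source or procedural context. The sketch below assumes a hypothetical SourcedFact type and illustrative disclaimer wording.

```python
from dataclasses import dataclass, fields

# Illustrative wording, not the production text.
GENERAL_DISCLAIMER = (
    "The presence of information in a registry does not constitute proof of guilt."
)
LAWYER_NOTE = "Consult a qualified lawyer for a legal assessment."


@dataclass(frozen=True)
class SourcedFact:
    # Hypothetical shape covering the mandatory elements listed above.
    statement: str
    registry: str
    case_number: str
    decided_on: str  # ISO date string
    role: str        # the person's role in the case
    outcome: str


def render_response(facts: list[SourcedFact]) -> str:
    """Render facts with source and context; always end with disclaimers."""
    lines = []
    for fact in facts:
        missing = [f.name for f in fields(fact) if not getattr(fact, f.name)]
        if missing:
            raise ValueError(f"fact lacks required context: {', '.join(missing)}")
        lines.append(
            f"{fact.statement} (source: {fact.registry}, case {fact.case_number}, "
            f"{fact.decided_on}; role: {fact.role}; outcome: {fact.outcome})"
        )
    lines += [GENERAL_DISCLAIMER, LAWYER_NOTE]
    return "\n".join(lines)
```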

3.4. The Presumption of Innocence (Hardcoded)

This is not a setting or a parameter – it is a fundamental rule built into the system at the architectural level:

The model always assumes that a person is innocent until a court has established otherwise through a legally binding verdict.

In practice, this means that pending cases are presented solely as "under consideration," with no hint at a probable outcome. Acquittals and dismissed cases are given the same priority as convictions – the model does not bury positive information. And the model categorically does not make predictions about the outcomes of pending cases, even if statistically "similar cases" ended in a particular way.
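
These presentation rules can be sketched as a fixed label table plus an ordering that ignores the outcome. The CaseState values, the labels other than "under consideration", and the Case shape are assumptions; the sketch deliberately contains no field or function that could carry a predicted outcome.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class CaseState(Enum):
    PENDING = "pending"
    CONVICTED = "convicted"   # assumed to mean a legally binding verdict
    ACQUITTED = "acquitted"
    DISMISSED = "dismissed"


# Pending cases get a neutral label with no hint at a probable outcome.
_LABELS = {
    CaseState.PENDING: "under consideration",
    CaseState.CONVICTED: "conviction (legally binding)",
    CaseState.ACQUITTED: "acquittal",
    CaseState.DISMISSED: "case dismissed",
}


@dataclass(frozen=True)
class Case:
    case_number: str
    state: CaseState
    last_event: date


def present(cases: list[Case]) -> list[str]:
    """Order by date only, so acquittals are never ranked below convictions."""
    ordered = sorted(cases, key=lambda c: c.last_event, reverse=True)
    return [f"Case {c.case_number}: {_LABELS[c.state]}" for c in ordered]
```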

4. Fine-Tuning on Ukrainian Registries: Specific Challenges

4.1. Data Quality

Ukraine's open registries have well-known quality issues. The same person may appear under different name variants due to duplicate records and transliteration errors. Some records are incomplete – missing case outcomes, making correct analysis impossible. Additionally, there are significant update delays: a decision may be overturned on appeal, but the original record in the registry remains unchanged.

The model must account for these limitations and not draw conclusions from potentially inaccurate data. Uncertainty in input data must be transparently conveyed in the response, not masked by a confident tone.
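
As a rough illustration of conveying uncertainty rather than masking it, the sketch below attaches explicit warnings to a record for each of the three quality issues named above. The RegistryRecord fields and the warning texts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryRecord:
    # Hypothetical shape for illustration only.
    case_number: str
    name_variants: tuple[str, ...]  # spellings under which the person appears
    outcome: Optional[str]          # None when the registry has no outcome
    appeal_checked: bool            # whether later appellate decisions were looked up


def quality_warnings(record: RegistryRecord) -> list[str]:
    """List the data-quality caveats that must accompany this record."""
    warnings = []
    if len(set(record.name_variants)) > 1:
        warnings.append(
            "Identity match is uncertain: several name spellings were merged."
        )
    if record.outcome is None:
        warnings.append("Case outcome is not recorded in the registry.")
    if not record.appeal_checked:
        warnings.append(
            "The decision may have been changed on appeal; this was not verified."
        )
    return warnings
```

An empty list means none of these three checks fired, not that the record is guaranteed to be accurate.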

4.2. Wartime Context

A separate class of sensitivity concerns data related to wartime conditions. Registries of displaced persons, data on persons eligible for military service, information from temporarily occupied territories – all of this requires special handling. The model categorically must not provide information that could reveal the location of individuals, aggregate data that in combination could identify military personnel, or use internally displaced person status as a negative factor in any analysis. This is not merely an ethical rule – in wartime, it is a matter of people's physical safety.

4.3. Training Scale and Infrastructure

Training on GCP operates on a massive corpus: over 50 million EDRSR court decisions, approximately 5 million legal entity records, NAPC data, and patent registries. Fine-tuning runs on GCP A3/A3+ instances with H100 GPUs. The work is planned as 6 months of iterations following a "data -> training -> red teaming -> correction -> repeat" loop. All data stays within the GCP EU region (europe-west4) and is encrypted at rest and in transit.

5. Legal Liability

As the developer, Lex AI LLC is responsible for ensuring that data processing complies with Ukraine's Law "On Personal Data Protection" and with the GDPR where data of EU citizens appears in the registries. The company is obligated to ensure every individual's right to access information about themselves, correct inaccuracies, and request data deletion, as well as to prevent the model from being used for persecution, blackmail, or unlawful pressure.

The key point: even when data is public, its mass aggregation and intelligent analysis create a qualitatively new kind of information that requires separate legal regulation. Openness of data does not mean openness to abuse. Between the right to access public information and the right to privacy lies a fine line, and an AI model must be on the right side of that line.

6. Practical Recommendations

For Model Developers (the Lex AI Team)

Before releasing each model version, an "Asimov Test" must be conducted – verification against at least 100 potential abuse scenarios, from direct requests for compromising material to sophisticated multi-step manipulations. For independent oversight of the model's development, an Ethics Board should be established – a council of lawyers, human rights advocates, and technical specialists not subordinate to the product team.
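
The "Asimov Test" release gate described above might be organized along the following lines. This is a sketch under stated assumptions: the model is treated as a plain callable from prompt to response, and each scenario supplies its own is_safe judge; both interfaces are hypothetical, and real scenarios would need human-reviewed safety criteria.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AbuseScenario:
    # Hypothetical shape: one prompt for direct requests,
    # several prompts for multi-step manipulations.
    name: str
    prompts: tuple[str, ...]
    is_safe: Callable[[list[str]], bool]  # judges the full list of responses


def run_asimov_test(
    model: Callable[[str], str],
    scenarios: list[AbuseScenario],
    min_scenarios: int = 100,
) -> list[str]:
    """Return names of failed scenarios; an empty list means the gate passed."""
    if len(scenarios) < min_scenarios:
        raise ValueError(
            f"need at least {min_scenarios} scenarios, got {len(scenarios)}"
        )
    failures = []
    for scenario in scenarios:
        responses = [model(prompt) for prompt in scenario.prompts]
        if not scenario.is_safe(responses):
            failures.append(scenario.name)
    return failures
```

Judging the whole list of responses, rather than each one separately, is what lets a scenario catch incremental manipulation where no single answer looks harmful on its own.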

On the technical level, a complete audit log of all requests for aggregated analysis of individuals must be maintained to enable investigation of abuse. Mass analysis of lists of persons without justification and authorization must be prohibited at the API level. Additionally, rate limiting must restrict the number of analytical requests about a single individual within a time period – if someone makes 50 queries about one person in an hour, that is a signal for the security system.
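
The per-individual rate signal can be sketched as a sliding-window counter keyed by (user, subject). The class name and interface are assumptions; the default threshold of 50 queries per hour is taken from the example above, and timestamps are passed in explicitly to keep the sketch deterministic.

```python
from collections import defaultdict, deque


class SubjectQueryMonitor:
    """Flags a user who queries the same individual too often in a window."""

    def __init__(self, limit: int = 50, window_seconds: float = 3600.0):
        self.limit = limit
        self.window = window_seconds
        # (user_id, subject_id) -> timestamps of recent queries
        self._hits: dict[tuple[str, str], deque] = defaultdict(deque)

    def record(self, user_id: str, subject_id: str, now: float) -> bool:
        """Record one query at time `now` (seconds); return True if flagged."""
        hits = self._hits[(user_id, subject_id)]
        hits.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        return len(hits) >= self.limit
```

A flag here is only a signal for the security system; whether to block, throttle, or escalate to a human reviewer is a separate policy decision.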

For Model Users

Analysis results are informational, not legal conclusions. They cannot be used as evidence in court or as grounds for making legally significant decisions without consulting a qualified lawyer. Aggregated analysis should not be used to pressure individuals without legal grounds, and the currency of any information should always be verified against primary sources, as registries may contain outdated or incomplete data.

7. The Zeroth Law: Protecting Humanity

Asimov later added the Zeroth Law:

A robot may not harm humanity, or, by inaction, allow humanity to come to harm.

This law supersedes all others. In the context of a legal AI model, it means: even if protecting a specific individual conflicts with the interests of society (for example, the person has indeed committed a crime), the model must still not substitute itself for the court. Its role is to provide information and context, not to pass judgment.

The temptation to "help justice" through algorithmic analysis is extraordinarily powerful. But experience shows that when technology takes the place of the judge, injustice tends to follow. From predictive policing in the United States to China's social credit system, automated judgment has repeatedly drawn criticism for systemic discrimination against the most vulnerable.

The model is a tool of justice, not justice itself.

Conclusion

Building an AI model trained on the complete corpus of Ukraine's open registries is a technologically feasible and legally valuable project. However, the potential for abuse is significant. Asimov's Three Laws, adapted to the legal AI context, provide a clear ethical framework: do not generate prosecutorial narratives and always provide context; fulfill user requests but refuse manipulative aggregation; be resilient against attempts to circumvent ethical constraints.

Lex AI LLC commits to upholding these principles at every stage of development – from data collection to RLHF training on GCP to every response the model delivers to the end user.

Technology must serve justice, not be weaponized against it.

Lex AI LLC, 2026.