How Korrespond works — Hard-RAG, EU-hosted Claude, 3-pass pipeline

Architecture

Three passes. Each with a distinct job.

The pipeline is intentionally sequential: Pass 1 is cheap and fast (gpt-4o-mini); Pass 2 is expensive and runs only once the situation is clear enough; Pass 3 is optional and user-triggered.

Pass 1 · gpt-4o-mini

Classify & gap-check

Parses the intake and returns a structured JSON classification:

summary — one-sentence case summary
parties — identified actors
applicable_acts — relevant statute sets
missing_facts[] — gaps that would hurt draft quality
suggested_goal — inferred goal if none stated

If missing_facts is non-empty, Pass 1 emits a clarify gate instead of handing off to Pass 2. No credit is deducted until Pass 2 starts.
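To make the gate concrete, here is a minimal sketch of the Pass 1 result and the routing decision. The field names come from the list above; the Python types, the class name Classification, and the function next_step are assumptions made for illustration, not Korrespond's actual code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Classification:
    """Pass 1 output. Field names follow the list above; types are assumed."""
    summary: str
    parties: list[str]
    applicable_acts: list[str]
    missing_facts: list[str] = field(default_factory=list)
    suggested_goal: Optional[str] = None


def next_step(c: Classification) -> str:
    """Hypothetical routing decision taken after Pass 1."""
    # Any open gap stops the pipeline at the clarify gate. Pass 2, and with
    # it the credit deduction, only starts once missing_facts is empty.
    return "clarify" if c.missing_facts else "pass2"


if __name__ == "__main__":
    case = Classification(
        summary="Parent asks for the reasons behind a decision.",
        parties=["parent", "Barnevernet"],
        applicable_acts=["forvaltningsloven"],
        missing_facts=["date of the decision"],
    )
    print(next_step(case))
```

Keeping the gate as a pure function of the classification means the clarify decision can be tested without calling any model.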

Pass 2 · gpt-4o

Retrieve → draft → check → translate

Four sub-steps, each verified before proceeding:

Retrieve: hybrid dense + BM25 search across the preset corpus slices; top 8 passages returned with source IDs
Draft: gpt-4o generates the letter using [CITE:N] tokens referencing only retrieved source IDs
Self-check: strips any [CITE:N] token whose source ID isn't in the retrieved pool; flags deadline/goal/tone compliance
Translate: Norwegian draft → working language (single call)
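The page does not say how the dense and BM25 rankings are merged in the Retrieve step. One common method is reciprocal rank fusion (RRF), sketched below. Treat the choice of RRF, the constant k = 60, and the function name rrf_fuse as assumptions; only the top-8 cut-off comes from the text above.

```python
def rrf_fuse(dense: list[str], bm25: list[str],
             k: int = 60, top_n: int = 8) -> list[str]:
    """Merge two ranked lists of passage IDs with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in (dense, bm25):
        for rank, passage_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); passages found by both
            # retrievers accumulate two contributions.
            scores[passage_id] = scores.get(passage_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties break on ID so the order is stable.
    ordered = sorted(scores, key=lambda pid: (-scores[pid], pid))
    return ordered[:top_n]


if __name__ == "__main__":
    dense_hits = ["a", "b", "c"]
    bm25_hits = ["b", "d", "a"]
    print(rrf_fuse(dense_hits, bm25_hits))
```

RRF works on ranks rather than raw scores, so the vector similarity and the BM25 score never need to be put on a common scale.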

Pass 3 · optional

Formal citation refine

User-triggered (+1 credit). Runs a jurisdiction-scoped retrieval, then rewrites inline citations in formal style and appends a Rettskilder (legal sources) block:

Norwegian: jf. forvaltningsloven § 17
ECHR: full case name, application number, date, paragraph
Both: combined domestic + ECHR grounds

Hard-RAG

Every § citation is verified before it reaches you.

Hard-RAG means the model is constrained to only cite what it retrieved. No § number can appear in the final draft unless a corresponding source passage was actually found and fetched.

User intake + body preset → Corpus slice selection → Hybrid search (dense vector + BM25) → Top 8 passages with source IDs →
Passages injected into gpt-4o prompt → Draft with [CITE:N] tokens only → Self-check: verify each [CITE:N] resolves → Strip unverified citations

The self-check pass parses every [CITE:N] token in the draft and looks up the source ID N in the retrieved pool. If N is not in the pool, the citation is removed and the paragraph is rewritten without it. The output also flags whether the deadline was addressed, whether the stated goal was achieved, and whether the tone matched the selected chip.
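The token check described above can be sketched in a few lines. This is a simplified illustration, not the production self-check: it only deletes unresolved tokens and tidies the whitespace they leave behind, whereas the real pass also rewrites the paragraph. The function name strip_unverified and the use of integer source IDs are assumptions.

```python
import re

CITE = re.compile(r"\[CITE:(\d+)\]")


def strip_unverified(draft: str, retrieved_ids: set[int]) -> tuple[str, list[int]]:
    """Remove [CITE:N] tokens whose N is not in the retrieved pool."""
    removed: list[int] = []

    def keep_or_drop(match: re.Match) -> str:
        n = int(match.group(1))
        if n in retrieved_ids:
            return match.group(0)  # citation resolves, keep it untouched
        removed.append(n)          # record what was stripped for the audit
        return ""

    cleaned = CITE.sub(keep_or_drop, draft)
    # Tidy the gap a dropped token leaves before punctuation or between words.
    cleaned = re.sub(r"[ \t]+([.,;])", r"\1", cleaned)
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned)
    return cleaned, removed


if __name__ == "__main__":
    text = "You must give reasons [CITE:2]. See also [CITE:9]."
    print(strip_unverified(text, {1, 2, 3}))
```

Returning the list of removed IDs alongside the cleaned text lets a caller report how many citations failed verification.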

What happens when no statute fits?

If no corpus passage closely matches the situation, the draft is produced in plain language without § references. A note in the output says: "No cited law sources — draft is plain-language (no § references available from corpus)." This is the intentional, honest behaviour: a draft with no citations is better than one with fake citations.

Knowledge base

220,000+ passages across 8 corpus slices.

The legal corpus is split into named slices. Each recipient body preset maps to a set of slices, so retrieval is always scoped to the right area of law.

220K+ total indexed passages

8 corpus slices

1,731 FNV tribunal decisions

23 ECHR Norwegian-family cases

Azure AI Search (West Europe)

Hybrid dense vector + BM25

Corpus slices

child_welfare · echr · family_core · bufdir_guidance · norwegian_courts · broader_legal · dbn_resources · hague

Body preset → slice mapping (examples)

Recipient body	Corpus slices loaded
Barnevernet	child_welfare · echr · family_core
Bufdir	family_core · echr · bufdir_guidance
NAV	broader_legal (NAV-loven)
Skole / Barnehage / SFO	broader_legal (opplæringslova / barnehageloven)
Statsforvalteren	child_welfare · broader_legal
Trygderetten / Tingretten	norwegian_courts · broader_legal
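The mapping in the table can be expressed as a plain lookup. The sketch below copies the rows shown above. The names PRESET_SLICES and slices_for, and the decision to raise an error for an unknown body, are assumptions for illustration.

```python
PRESET_SLICES: dict[str, list[str]] = {
    "Barnevernet": ["child_welfare", "echr", "family_core"],
    "Bufdir": ["family_core", "echr", "bufdir_guidance"],
    "NAV": ["broader_legal"],
    "Skole / Barnehage / SFO": ["broader_legal"],
    "Statsforvalteren": ["child_welfare", "broader_legal"],
    "Trygderetten / Tingretten": ["norwegian_courts", "broader_legal"],
}


def slices_for(body: str) -> list[str]:
    """Return the corpus slices to search for a recipient body preset."""
    try:
        # Return a copy so callers cannot mutate the shared mapping.
        return list(PRESET_SLICES[body])
    except KeyError:
        # Assumed behaviour: refuse to search rather than fall back to the
        # whole corpus, so retrieval never leaves the intended area of law.
        raise ValueError(f"no slice mapping for recipient body: {body!r}") from None


if __name__ == "__main__":
    print(slices_for("Barnevernet"))
```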

EU-hosted model

EU-hosted Claude, grounded in Norwegian legal text.

EU · AWS Bedrock

EU-hosted Claude

Korrespond runs on EU-hosted Claude (AWS Bedrock, EU region), grounded on every request in the passages retrieved from Norwegian child-welfare and administrative law. Because it is constrained to that corpus rather than answering from memory, it works within the procedural vocabulary of forvaltningsloven: what triggers a § 17 right to be heard, what a lawful § 24 reasoned decision must contain, and how barnevernsloven § 6-3 frames the child's best-interest standard.

In the Korrespond pipeline, EU-hosted Claude handles legal synthesis alongside Azure gpt-4o. Retrieval assembles the relevant statute passages, gpt-4o structures the draft, and Claude produces the final legal reasoning within the Hard-RAG constraint, with every § citation verified against its source. Your data stays in the EU throughout.

Hard-RAG · forvaltningsloven · barnevernsloven · child-welfare corpus · Norwegian bokmål output · EU Bedrock

Model responsibilities in the pipeline

Pass	Model	Role
Pass 1 classify	gpt-4o-mini	Fast structured classification + gap detection
Pass 1 clarify questions	gpt-4o-mini + EU Claude	Domain-aware question generation
Pass 2 draft	gpt-4o	Full letter generation within Hard-RAG constraints
Pass 2 self-check	gpt-4o-mini	Citation verification + tone/goal/deadline audit
Pass 2 translate	gpt-4o-mini	Norwegian → working language translation
Pass 3 refine	gpt-4o	Formal citation rewrite + Rettskilder block

Pass 3 — Formal citation refine

Court-ready citations in two styles.

The optional third pass does a jurisdiction-scoped retrieval run, then rewrites the draft with formal inline citations and a Rettskilder appendix. Two distinct citation formats are supported:

🇳🇴

Norwegian statute style

Inline citations use jf. (with reference to) and the official statute name + section: jf. forvaltningsloven § 17, jf. opplæringslova § 9 A-4, jf. barnevernsloven § 6-3. Section numbers are verified against the corpus before inclusion.

⚖️

ECHR citation style

Full European Court of Human Rights citation format: case name · application number · date · chamber/Grand Chamber · paragraph. Example: Strand Lobben m.fl. mot Norge, EMD-37283/13 (Storkammer, 10.09.2019), § 207. Sources pulled from the ECHR corpus slice and HUDOC.
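Both citation styles are regular enough to build from structured fields. The sketch below reproduces the two example strings given above. The class name EchrCase, its field names, and the two formatter functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


def cite_statute(act: str, section: str) -> str:
    """Norwegian statute style: jf. + statute name + section."""
    return f"jf. {act} § {section}"


@dataclass
class EchrCase:
    name: str            # case name as cited in Norwegian
    application_no: str  # for example "37283/13"
    formation: str       # chamber or Grand Chamber (Storkammer)
    decided: date
    paragraph: int


def cite_echr(case: EchrCase) -> str:
    """ECHR style: name, application number, formation, date, paragraph."""
    return (
        f"{case.name}, EMD-{case.application_no} "
        f"({case.formation}, {case.decided:%d.%m.%Y}), § {case.paragraph}"
    )


if __name__ == "__main__":
    print(cite_statute("forvaltningsloven", "17"))
    print(cite_echr(EchrCase(
        name="Strand Lobben m.fl. mot Norge",
        application_no="37283/13",
        formation="Storkammer",
        decided=date(2019, 9, 10),
        paragraph=207,
    )))
```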

Example refined output


Refined draft (Norwegian + English) with opplæringslova § 9 A-4 and EMK artikkel 8 inline citations.

Anchor queries for ECHR mode

For Barnevernet and Bufdir cases, the ECHR refine pass runs specific anchor queries targeting the most-cited Norwegian family cases in the HUDOC corpus:

Strand Lobben m.fl. mot Norge · Johansen mot Norge · K.O. og V.M. mot Norge · Aune mot Norge · EMK Art. 8 family life Norway · EMK Art. 6 fair trial

Privacy & security

Your documents never leave your session.

Privacy by design

All uploaded files are extracted to text in memory using PHP's in-process file handlers. The raw binary is never written to disk on the server.
Session context (your narrative, uploaded text, drafts) is scoped to your authenticated session and discarded when the session ends.
Azure OpenAI (gpt-4o, gpt-4o-mini) is deployed in the West Europe region. Data processed via Azure OpenAI is not used for model training under the default enterprise agreement.
Azure AI Search (bnl-legal-search) stores only the public legal corpus — statutes, tribunal decisions, ECHR judgments. None of your case information is stored in the search index.
Qdrant vector database stores only the public corpus embeddings — no user data.
Telemetry logged: tool name, language, output type, pass count, latency, source count. No case text, no names, no case references are logged.
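One way to enforce the telemetry rule in the last item is an allow-list: anything not explicitly named is dropped before logging. The sketch below illustrates that idea. The key names, the millisecond unit, and the function telemetry_record are assumptions; only the six logged categories come from the list above.

```python
ALLOWED_FIELDS = (
    "tool_name",
    "language",
    "output_type",
    "pass_count",
    "latency_ms",
    "source_count",
)


def telemetry_record(event: dict) -> dict:
    """Keep only allow-listed keys; case text and names never pass through."""
    return {key: event[key] for key in ALLOWED_FIELDS if key in event}


if __name__ == "__main__":
    raw = {
        "tool_name": "korrespond",
        "language": "nb",
        "output_type": "letter",
        "pass_count": 2,
        "latency_ms": 840,
        "source_count": 8,
        "case_text": "must never be logged",
    }
    print(telemetry_record(raw))
```

An allow-list fails safe: a new field added upstream stays out of the logs until someone deliberately adds it.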
