Search
OLDP exposes the same Elasticsearch-backed search engine on three surfaces:
| Surface | Endpoint | Use case |
|---|---|---|
| Web UI | | Human browsing with facets, citation chip, pagination |
| REST | | Programmatic full-text queries |
| MCP | | Agent-driven research |
All three share the same backend (CaseIndex / LawIndex in
oldp/apps/cases/search_indexes.py and oldp/apps/laws/search_indexes.py)
and the same SearchQueryBuilder (oldp/apps/search/api.py), so a result
that matches in one surface also matches in the others — subject to each
surface’s input contract.
Query syntax
The keyword query (q / text / query) is parsed as Elasticsearch
query_string syntax on every surface. Supported operators:
| Feature | Example | Notes |
|---|---|---|
| Implicit AND | | All bare terms must match (default operator is AND). |
| Exact phrase | | Terms must appear adjacent, in order. Typographic quotes are normalized: pasting |
| Exclude | | Drops documents containing the excluded term ( |
| OR | | Case-sensitive: lowercase |
| Required | | Mostly redundant given the AND default. |
| Grouping | | Parentheses group sub-expressions. |
| Wildcard | | |
| Fuzzy | | Edit-distance match for typos. |
| Field-scoped | | Restricts a clause to one field. Footgun: |
Malformed fragments (a stray quote, unbalanced parens) are sanitized before they reach Elasticsearch, so they degrade gracefully instead of erroring.
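The sanitizing step described above can be pictured with a minimal sketch. This is not OLDP's actual sanitizer: the function name `sanitize_query` and the exact repair rules (drop the last stray quote, drop unpaired parentheses) are assumptions chosen for illustration.

```python
# Hypothetical sanitizer for query_string input; an illustration only.
TYPOGRAPHIC_QUOTES = str.maketrans({
    "\u201c": '"', "\u201d": '"', "\u201e": '"', "\u00ab": '"', "\u00bb": '"',
})


def sanitize_query(raw: str) -> str:
    """Normalize typographic quotes and drop unbalanced quotes/parentheses."""
    text = raw.translate(TYPOGRAPHIC_QUOTES)

    # An odd number of double quotes means one is stray: drop the last one.
    if text.count('"') % 2 == 1:
        idx = text.rfind('"')
        text = text[:idx] + text[idx + 1:]

    # Skip closing parentheses that have no opener (quoted parens are
    # not special-cased here, to keep the sketch short).
    kept, depth = [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                continue
            depth -= 1
        kept.append(ch)
    cleaned = "".join(kept)

    # Remove openers that were never closed, starting from the right.
    while depth > 0:
        idx = cleaned.rfind("(")
        cleaned = cleaned[:idx] + cleaned[idx + 1:]
        depth -= 1

    # Collapse the whitespace left behind by removed characters.
    return " ".join(cleaned.split())
```

With these rules, `(Miete OR Pacht` degrades to `Miete OR Pacht` rather than producing a parse error.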
German language handling
Free-text fields use a German analyzer (german_legal: light stemming +
german_normalization + a legal synonym filter), so queries match across
morphology and common variants automatically — no special syntax needed:
- Inflection / plurals: Vertrag matches Verträge, Frist matches Fristen.
- Umlaut / ß normalization: Massnahme matches Maßnahme, Strasse matches Straße.
- Gendered roles (synonyms): Vermieter matches Vermieterin, Kläger matches Klägerin.
- Spelling variants (synonyms): Schadenersatz matches Schadensersatz (Fugen-s).
Stemming is intentionally light (legal precision over recall): distinct
lemmas are kept apart, e.g. Kündigung (termination) does not collapse
into kundig (knowledgeable). There is no stopword removal, so function
words inside an exact phrase ("Treu und Glauben") still line up.
Changing the analyzer requires a full reindex — see Elasticsearch and the deployment runbook.
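For orientation, an analyzer of this shape could be declared roughly as follows. Only the analyzer name `german_legal` and the three ingredients (light stemming, `german_normalization`, a synonym filter) come from the description above; the filter names `legal_synonyms` and `german_light_stem`, the filter order, and the synonym entries are assumptions, so treat this as a sketch rather than the project's real index settings.

```python
# Sketch of Elasticsearch index settings for a light German legal analyzer.
# Filter names, order, and the synonym list are illustrative assumptions.
GERMAN_LEGAL_SETTINGS = {
    "analysis": {
        "filter": {
            # Light stemmer: conflates inflections without aggressive merging.
            "german_light_stem": {"type": "stemmer", "language": "light_german"},
            # Comma-separated terms are treated as equivalent.
            "legal_synonyms": {
                "type": "synonym",
                "synonyms": [
                    "vermieter, vermieterin",
                    "kläger, klägerin",
                    "schadenersatz, schadensersatz",
                ],
            },
        },
        "analyzer": {
            "german_legal": {
                "type": "custom",
                "tokenizer": "standard",
                # No stop filter, so phrases keep their function words.
                "filter": [
                    "lowercase",
                    "legal_synonyms",
                    "german_normalization",
                    "german_light_stem",
                ],
            }
        },
    }
}


def filter_chain(settings: dict, analyzer: str) -> list:
    """Return the ordered token-filter names of one analyzer."""
    return settings["analysis"]["analyzer"][analyzer]["filter"]
```

Because the analyzer runs at index time as well as query time, any change to this chain only takes effect for documents that are indexed again.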
Available filters
Every filter composes with every other filter (logical AND).
| Filter | Web param | REST param | MCP kwarg ( | Notes |
|---|---|---|---|---|
| Keyword query | | | | Lucene syntax supported on all surfaces. REST requires this; web/MCP allow empty if other filters are set. |
| Date range | | | | |
| Court | | - (use | | Exact match on the court code (e.g. |
| Decision type | | - | | Exact, e.g. |
| Cited law section | | | | Both required together. Case-only: silently ignored on |
| Cited case id | | | | Mutually exclusive with the law-citation pair (law citation wins when both are sent). |
| Sort | | | | |
Combined filters
Citation filters compose with keyword + facets + date range. The form does not pick one filter at the cost of another — every filter narrows the result set independently.
# Cases citing § 823 BGB that mention "Mietrecht"
/search/?q=Mietrecht&cited_law_book=bgb&cited_law_section=823
# Same, restricted to BGH and 2020
/search/?q=Mietrecht&selected_facets=court_exact:BGH
    &start_date=2020-01-01&end_date=2020-12-31
    &cited_law_book=bgb&cited_law_section=823
# Newest cases first
/search/?q=Mietrecht&order_by=date
The citation chip in the web UI shows the active citation filter with a
clear (×) link that removes only the citation params and keeps q /
facets intact. The “Sort by” group in the sidebar auto-submits on
change. Clicking a year tile under the publication-date facet preserves
the citation chip, every selected facet, and the current sort.
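The web URLs above can also be assembled programmatically. The following sketch uses only the parameters that appear in the examples (`q`, `selected_facets`, `start_date`, `end_date`, `cited_law_book`, `cited_law_section`, `order_by`); the helper name `build_search_url` and the choice to raise on a half-specified law citation are assumptions of this sketch, not documented behaviour.

```python
from urllib.parse import urlencode


def build_search_url(q="", court=None, start_date=None, end_date=None,
                     cited_law_book=None, cited_law_section=None,
                     order_by=None):
    """Compose a /search/ URL; every filter given narrows the result set."""
    # The law-citation pair only makes sense with both halves present.
    if bool(cited_law_book) != bool(cited_law_section):
        raise ValueError("cited_law_book and cited_law_section go together")

    params = []
    if q:
        params.append(("q", q))
    if court:
        params.append(("selected_facets", f"court_exact:{court}"))
    if start_date:
        params.append(("start_date", start_date))
    if end_date:
        params.append(("end_date", end_date))
    if cited_law_book:
        params.append(("cited_law_book", cited_law_book))
        params.append(("cited_law_section", cited_law_section))
    if order_by:
        params.append(("order_by", order_by))
    # urlencode percent-encodes reserved characters such as ":".
    return "/search/?" + urlencode(params)
```

Note that `urlencode` emits the facet value as `court_exact%3ABGH`, which is the encoded form of `court_exact:BGH`.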
Sort order
order_by=date orders by case publication date, newest first.
order_by=most_cited orders by reverse-citation count
(citing_cases_count) — how often a decision is cited by other cases —
surfacing landmark precedent first. An empty or unrecognized value keeps
Elasticsearch’s relevance-score ordering. The results-count label
tells you which order is active:
192 documents sorted by date.
4 citing cases, sorted by date.
The German UI renders the equivalent: “192 Dokumente sortiert nach Datum.” / “4 zitierende Entscheidungen, sortiert nach Datum.”
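The three sort modes map naturally onto an Elasticsearch `sort` clause. The sketch below is an assumption about how such a mapping could look: `citing_cases_count` is the field named above, while the date field name `date` and the function name `sort_clause` are placeholders.

```python
def sort_clause(order_by):
    """Translate the order_by parameter into an Elasticsearch sort clause."""
    if order_by == "date":
        # Newest first. "date" is an assumed field name.
        return [{"date": {"order": "desc"}}]
    if order_by == "most_cited":
        # Most frequently cited decisions first.
        return [{"citing_cases_count": {"order": "desc"}}]
    # Empty or unrecognized values fall back to relevance ordering.
    return ["_score"]
```

Falling back to `_score` for unknown values, rather than rejecting them, matches the lenient behaviour described for the `order_by` parameter.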
REST examples
# Keyword + cited law section
curl -G "https://de.openlegaldata.io/api/cases/search/" \
  --data-urlencode "text=Mietrecht" \
  --data-urlencode "cited_law_book=bgb" \
  --data-urlencode "cited_law_section=823" \
  -H "Authorization: Token $OLDP_API_TOKEN"
# Keyword + date range
curl -G "https://de.openlegaldata.io/api/cases/search/" \
  --data-urlencode "text=Urheberrecht" \
  --data-urlencode "start_date=2023-01-01" \
  --data-urlencode "end_date=2023-12-31" \
  -H "Authorization: Token $OLDP_API_TOKEN"
# Cases citing a specific case (citation-graph filter)
curl -G "https://de.openlegaldata.io/api/cases/search/" \
  --data-urlencode "text=Schadensersatz" \
  --data-urlencode "cited_case=12345" \
  -H "Authorization: Token $OLDP_API_TOKEN"
The REST contract requires the text param (returns 400 otherwise).
Citation params alone aren’t accepted on the REST surface — use the
dedicated nested actions (/api/laws/<id>/citing_cases/,
/api/cases/<id>/citing_cases/) for “all citing cases” without keyword
refinement. See API overview — Citations & Cross-References
for those endpoints.
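A client can enforce the required `text` parameter before a request ever leaves the machine. The sketch below prepares (but does not send) a request with the standard library; the endpoint, parameter names, and `Token` header come from the curl examples, while the helper name `build_case_search_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://de.openlegaldata.io/api/cases/search/"


def build_case_search_request(text, token, **filters):
    """Prepare a GET request for the REST case search without sending it."""
    # The REST surface rejects requests without a keyword, so fail early.
    if not text or not text.strip():
        raise ValueError("the REST search requires a non-empty text parameter")
    # Skip filters left as None so they do not appear in the query string.
    params = {"text": text}
    params.update({k: v for k, v in filters.items() if v is not None})
    return Request(
        f"{SEARCH_URL}?{urlencode(params)}",
        headers={"Authorization": f"Token {token}"},
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; keeping construction separate makes the contract easy to check without network access.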
MCP examples
# All citing cases of § 823 BGB, paginated — no keyword
get_cases_for_law(book_code="BGB", section="823", limit=20)
# Same set, narrowed by keyword + court (combined search)
search_cases(query="Mietrecht", court_code="BGH",
             cited_law_book="bgb", cited_law_section="823", limit=10)
# Cases citing a specific case (reverse citation) narrowed by keyword
search_cases(query="Vermieterpflichten", cited_case_id=12345, limit=10)
# Date-range only with citation graph
search_cases(query="", start_date="2023-01-01", end_date="2023-12-31",
             cited_law_book="bgb", cited_law_section="823")
For “give me every citing case” without keyword refinement, prefer the
dedicated get_cases_for_law / get_citing_cases tools — search_cases
exists for combined searches that intersect citations with keyword,
court, or date.
Unified and similarity search (MCP)
# One call across BOTH legislation and case law, grouped by type.
# Use when you don't know whether the answer is statute or jurisprudence.
search_legal(query="Eigenbedarf", limit=5)
# -> {"laws": [...], "cases": [...], "total_laws": N, "total_cases": M}
# Find cases textually similar to a known on-point decision (more_like_this).
get_similar_cases(case_id=12345, limit=10)
search_legal returns results grouped by type rather than as one
merged ranked list: court decisions are long and out-score short statute
texts on relevance, so a merged top-N would return only cases and bury the
on-point law. For type-specific control (citation/court/date filters), use
search_cases / search_laws directly.
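The effect of grouping can be shown with a small, self-contained sketch. The hit structure (`type`, `id`, `score`) and the function name `group_by_type` are assumptions; only the four output keys mirror the `search_legal` response shown above.

```python
def group_by_type(hits, limit):
    """Rank laws and cases separately instead of merging them."""
    laws = [h for h in hits if h["type"] == "law"]
    cases = [h for h in hits if h["type"] == "case"]

    def top(items):
        # Highest score first, truncated to the per-type limit.
        return sorted(items, key=lambda h: h["score"], reverse=True)[:limit]

    return {
        "laws": top(laws),
        "cases": top(cases),
        "total_laws": len(laws),
        "total_cases": len(cases),
    }


# Long decisions out-score the short statute text in this toy data.
hits = [
    {"type": "case", "id": 1, "score": 9.1},
    {"type": "case", "id": 2, "score": 8.7},
    {"type": "case", "id": 3, "score": 8.2},
    {"type": "law", "id": 573, "score": 3.4},
]
merged_top2 = sorted(hits, key=lambda h: h["score"], reverse=True)[:2]
grouped = group_by_type(hits, limit=2)
```

In this toy data the merged top two contains only cases, while the grouped result still returns the single law hit in its own list.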
When a book_code (e.g. for get_cases_for_law) or a court code/slug
(for get_court) is not found, the error carries a suggestions list of
the closest existing codes (a did_you_mean hint), e.g.
"DSGVOO" → ["DSGVO", …].
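How such suggestions can be produced is easy to sketch with the standard library. The matcher OLDP actually uses is not stated here, so `difflib.get_close_matches`, the cutoff of 0.6, and the sample code list are all assumptions.

```python
import difflib


def suggest_codes(unknown, known_codes, limit=3):
    """Return the closest known codes for an unrecognized one."""
    # Compare case-insensitively but return codes in their stored spelling.
    by_upper = {code.upper(): code for code in known_codes}
    matches = difflib.get_close_matches(
        unknown.upper(), list(by_upper), n=limit, cutoff=0.6
    )
    return [by_upper[m] for m in matches]


# Sample codes for illustration only.
KNOWN_BOOK_CODES = ["DSGVO", "BGB", "StGB", "ZPO"]
suggestions = suggest_codes("DSGVOO", KNOWN_BOOK_CODES)
```

A client or agent can retry with the first suggestion, or present the list to the user when several codes are plausible.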
Surface differences at a glance
| | Web | REST | MCP |
|---|---|---|---|
| Keyword required? | No (any other filter unlocks the query) | Yes ( | No |
| Facet selection | | Use | Built-in kwargs |
| Sort order | | | |
| Citation params | Optional | Optional | Optional |
| Highlighting | Inline in result list | | |
| Pagination | Standard | | |
Backend details
Citation lookups use multi-value fields on CaseIndex:
- cited_laws: list of "<book_slug>__<section_slug>" tokens
- cited_cases: list of cited-case PKs (as strings)
The cited_laws tokens are built via
oldp.apps.cases.search_indexes.cited_law_token(book, section);
consumers should call this helper rather than concatenating the token
manually. The slug pair is stable across book revisions, so citation
queries keep matching when a book is revised.
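To show how these fields might be queried, here is a sketch that turns citation parameters into an Elasticsearch term filter. The local `_token_stand_in` only imitates the documented `"<book_slug>__<section_slug>"` shape so that the example runs on its own; real code should call the `cited_law_token` helper, which may slugify its inputs differently. The function name `citation_filter` is hypothetical.

```python
def _token_stand_in(book_slug, section_slug):
    """Stand-in for cited_law_token(); use the real helper in OLDP code."""
    return f"{book_slug}__{section_slug}"


def citation_filter(book_slug=None, section_slug=None, cited_case_id=None):
    """Build a term filter on cited_laws or cited_cases, or None."""
    # The law-citation pair takes precedence when both kinds are sent.
    if book_slug and section_slug:
        return {"term": {"cited_laws": _token_stand_in(book_slug, section_slug)}}
    if cited_case_id is not None:
        # cited_cases stores primary keys as strings.
        return {"term": {"cited_cases": str(cited_case_id)}}
    return None
```

Because both fields are multi-valued, a single `term` clause matches any case whose list contains the requested token or id.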
most_cited sorting uses a denormalized Case.citing_cases_count (number
of distinct accepted cases citing a decision), mirrored into the
CaseIndex.citing_cases_count ES field. The count is approximate between
recompute runs. Recompute it off the hot path with
manage.py update_citing_counts (a single grouped aggregate, ~90s) after
an ingestion and reference-extraction pass, then reindex to mirror the
new counts into Elasticsearch.
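The counting rule (distinct accepted citing cases per cited decision) can be expressed in a few lines of plain Python. The real command runs a grouped aggregate in the database; the in-memory version below, including the name `citing_cases_counts` and the `(citing_id, cited_id)` pair format, is an illustrative assumption.

```python
from collections import defaultdict


def citing_cases_counts(references, accepted_case_ids):
    """Count distinct accepted citing cases for each cited case.

    references: iterable of (citing_case_id, cited_case_id) pairs.
    accepted_case_ids: set of case ids that count as citing sources.
    """
    citing = defaultdict(set)
    for citing_id, cited_id in references:
        if citing_id in accepted_case_ids:
            # A set keeps repeated citations from the same case distinct.
            citing[cited_id].add(citing_id)
    return {cited: len(sources) for cited, sources in citing.items()}
```

A case that cites the same decision several times contributes one to the count, and citations from cases outside the accepted set are ignored.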
For the underlying index fields, the operator-run reindex command, and
the ES outage behaviour on each surface, see
Elasticsearch. For the citation-graph endpoints
(REST /api/{cases,laws}/<id>/citing_*/ and the flat /api/references/),
see API overview — Citations & Cross-References.