AI Search is under active development. Results may vary in accuracy and completeness, and features or UI labels can change between releases.
Overview
Nobly AI Search combines two fundamentally different search technologies to find documents:
- BM25 (Text/Keyword Search): Traditional full-text search that finds documents containing the exact words you typed. Similar to how a search engine matches keywords — it looks for the specific terms in your query across document text, keywords, and metadata.
- Vector/Semantic Search: AI-powered search that understands the meaning behind your query. Even if a document doesn’t contain the exact words you typed, it can be found if it covers the same concept. For example, searching “medical expenses” could find documents about “health insurance claims” or “hospital invoices.”
Search config
AI Search does not use named modes. Instead, the Search config section of the settings panel exposes three independent toggles that together determine how a query is processed.
Auto weights
| Setting | Default |
|---|---|
| Auto weights | On |
LLM rewrite
| Setting | Default | Requires |
|---|---|---|
| LLM rewrite | On | Auto weights |
LLM rerank
| Setting | Default |
|---|---|
| LLM rerank | On |
How results are ranked
Each search passes through a pipeline. The stages that actually run depend on your Search config and server availability.
Stage 1: Query rewrite (optional)
If LLM rewrite is on, the query is rewritten into BM25 and embedding variants before retrieval. The original query is always included alongside the rewritten forms.
Stage 2: BM25 keyword ranking
A full-text search ranks documents by how well they match your query terms. This considers:
- Term frequency (how often your search terms appear)
- Document length (shorter documents with the same matches score higher)
- Term rarity (rarer terms contribute more to the score)
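These three factors are exactly what the BM25 formula combines. A minimal sketch in Python, using the standard `k1` and `b` free parameters (the server's actual tuning is not documented here, so treat this as illustrative):

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one document (a list of terms) against the query terms.

    `corpus` is the full list of documents, used only to compute the
    average document length and term rarity (IDF). k1 and b are the
    standard BM25 free parameters.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)                # term frequency
        if tf == 0:
            continue
        df = sum(1 for d in corpus if term in d)  # documents containing the term
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))  # rarer terms weigh more
        # Length normalization: shorter documents with the same matches score higher.
        norm = k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * tf * (k1 + 1) / (tf + norm)
    return score
```

Note how the length normalization term `norm` grows with document length, so a short document containing your terms outranks a long one containing the same terms.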
Stage 3: Vector similarity ranking
Your query is converted into a high-dimensional vector and compared against pre-computed vectors for every indexed document chunk. Documents whose vectors are closest to your query vector rank highest. Similarity is measured as cosine similarity (0.0 = completely unrelated, 1.0 = identical meaning).
Stage 4: Reciprocal Rank Fusion (RRF)
BM25 and vector rankings are combined into a single ordering using RRF, weighted by the BM25/semantic split chosen by Auto weights (or your manual sliders). RRF ensures neither ranking system fully dominates — a document ranked #100 in BM25 but #1 in vector still surfaces.
Stage 5: LLM reranker (optional)
If LLM rerank is on, the top RRF candidates are read by an LLM that scores each one against your query (0–100) and produces a short reasoning text. The LLM score replaces the RRF score as the displayed relevance score.
Summary
| LLM stages on | Displayed score |
|---|---|
| Neither | RRF score |
| Rewrite only | RRF score computed from rewritten queries |
| Rerank only | LLM reranker score (0–100) |
| Both | LLM reranker score applied to RRF-fused results from rewritten queries |
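The weighted fusion in Stage 4 can be sketched as follows. The constant `k = 60` is the common default from the RRF literature; the server's actual constant and weight handling are assumptions here:

```python
def rrf_fuse(bm25_ranking, vector_ranking, bm25_weight=0.5, k=60):
    """Fuse two ranked lists of document ids with weighted RRF.

    Each list is ordered best-first. A document at rank r contributes
    weight / (k + r) from that list, so a document ranked #100 in BM25
    but #1 in the vector ranking still surfaces near the top.
    """
    scores = {}
    for rank, doc in enumerate(bm25_ranking, start=1):
        scores[doc] = scores.get(doc, 0.0) + bm25_weight / (k + rank)
    for rank, doc in enumerate(vector_ranking, start=1):
        scores[doc] = scores.get(doc, 0.0) + (1.0 - bm25_weight) / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Setting `bm25_weight` to 1.0 or 0.0 reduces the fusion to a pure BM25 or pure semantic ordering, which is what the manual weight sliders control.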
Understanding the relevance score
Each result displays a relevance score from 0 to 100. The source depends on whether LLM rerank ran:
- With LLM rerank: the score is the reranker’s direct 0–100 estimate of how well the document answers your query.
- Without LLM rerank: the score is derived from the RRF fusion, capped by semantic similarity so the top score reflects actual match strength rather than just position in the result list.
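One way to picture "derived from the RRF fusion, capped by semantic similarity" is the sketch below. This is an illustrative assumption, not the server's actual formula:

```python
def displayed_scores(rrf_scores, best_similarity):
    """Scale raw RRF scores to 0-100, capping the top score at the best
    result's cosine similarity so rank #1 alone never implies 100.

    rrf_scores: fused scores, best first. best_similarity: 0.0-1.0,
    the cosine similarity of the top result.
    """
    if not rrf_scores:
        return []
    cap = best_similarity * 100  # e.g. similarity 0.55 caps the top score at 55
    top = rrf_scores[0]
    return [round(cap * s / top) for s in rrf_scores]
```

Under this model a query with only moderate semantic overlap tops out at a moderate score even though its best result is still ranked first.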
Important properties:
- Scores are relative to the result set. The same document can show different scores depending on which other results were returned.
- The top result always has the highest score, but its absolute value depends on query quality. A strong match might show 92; a weaker query might top out at 55.
- Lower scores are not necessarily bad. A top score of 60 on a partial-term keyword query can still be the right document.
- Rerank-on and rerank-off scores are not directly comparable. A 70 from LLM rerank and a 70 from RRF are measured differently.
Match sources - Why did this result appear?
Hovering over (or clicking) a result’s relevance score reveals detailed information about why this result was returned. Each result can have one or more match sources:
TextContent
The document’s text content was semantically similar to your query. This means the AI understood a conceptual connection between your query and the document’s actual text, even if the exact words differ. When available, the match detail shows which page numbers contain the relevant text and an excerpt of the matching passage.
Keywords
One or more of the document’s Nobly Insight keyword values matched terms in your query. The match detail shows exactly which keyword type and value matched, and which of your query terms caused the match. Example: Searching for 847291 might show:
Keywords — Kundenummer: 847291
Metadata
The document’s metadata fields (document name, document type, or creator username) matched terms in your query. Example: Searching for Hansen might show:
Metadata — DocumentName: Policy-Hansen-2024.pdf
Multiple sources
A result can appear from multiple sources simultaneously. A result that matches via both Keywords and TextContent is generally a stronger match than one appearing from a single source — it means the document matched both on exact data and on conceptual meaning.
What gets searched
When you run an AI search, your query is matched against a composite of all available information about each document:
- Document name — the file/document title
- Document type name — the Nobly Insight document type
- Creator — the username that stored the document
- All keyword values — every keyword type and value assigned to the document (formatted as “type: value”)
- Full document text content — the entire extracted text from the document (PDF text, Word content, etc.)
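Conceptually, the composite text for each document looks like the concatenation sketched below. The dict keys are hypothetical; the real index schema is internal:

```python
def composite_search_text(doc):
    """Build the composite text a query is matched against.

    Keyword values are formatted as "type: value", matching the list
    above. All field names here are illustrative, not the real schema.
    """
    parts = [
        doc["name"],           # document name
        doc["type_name"],      # Nobly Insight document type
        doc["created_by"],     # creator username
    ]
    parts += [f"{kw_type}: {value}" for kw_type, value in doc["keywords"].items()]
    parts.append(doc["text"])  # full extracted text content
    return "\n".join(parts)
```

This is why a query like 847291 can match via Keywords while Hansen matches via Metadata: both values end up in the same searchable composite.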
Search progress stages
While a search runs, a progress indicator shows what the system is doing. The exact stages depend on your Search config and server features:
- Resolving search options…
- Rewriting query… (only with LLM rewrite)
- Building embeddings…
- Searching…
- The AI is reading the results and ranking them by relevance… (only with LLM rerank)
- Finishing…
Search query tips
For exact lookups (IDs, numbers, codes)
Just type the value directly. With Auto weights on, the server picks an appropriate BM25/semantic split for code-like queries.
2001701234
INV-2024-0847
For finding specific documents
Use a few distinctive terms — Auto weights will balance exact and semantic matching.
Hansen pension
insurance policy 2024
For exploratory / conceptual search
Describe what you’re looking for in natural language. LLM rerank sharpens the ordering further when enabled.
documents about employee health benefits changes
customer complaints regarding delayed payments
General advice
- Be specific when you can. More distinctive terms lead to better BM25 matches.
- Use natural language for broad topics. The semantic engine and the LLM reranker excel at understanding intent.
- Don’t worry about exact wording. LLM rewrite expands your query to cover variations.
- Try turning Auto weights off if the default split doesn’t surface what you expect — the sliders give you direct control.
Search operators and syntax
Currently, AI Search does not support advanced search operators. The following do not work:
| Syntax | Status | Notes |
|---|---|---|
| * (wildcard) | Not supported | Treated as a literal character |
| "exact phrase" | Not supported | Quotes are treated as literal characters |
| AND / OR | Not supported | Treated as regular words |
| -term (exclusion) | Not supported | Treated as a literal character |
| field:value | Not supported | Treated as regular text |
- BM25 naturally handles multi-word queries by matching individual terms
- Semantic search understands phrasing and context, so typing health insurance claims will find documents about that topic even without phrase operators
- The BM25 tokenizer may apply stemming at the database level, which provides some automatic fuzzy matching for word variations
Advanced settings
The settings panel next to the search bar is organized into collapsible sections.
Search config
- Auto weights — on by default. When off, set Text weight and Semantic weight sliders manually (0–100%, step 5%, always summing to 100%).
- LLM rewrite — on by default. Requires Auto weights. Hidden when disabled on the server.
- LLM rerank — on by default. Hidden when disabled on the server.
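When Auto weights is off, the two sliders are linked. The sketch below illustrates the stated constraint (5% steps, 0–100% range, always summing to 100%); it is an assumption about the UI logic, not its actual code:

```python
def set_text_weight(value):
    """Snap a requested text weight to the 5% step, clamp it to 0-100,
    and derive the semantic weight so the pair always sums to 100."""
    snapped = max(0, min(100, 5 * round(value / 5)))
    return snapped, 100 - snapped
```

Moving one slider therefore always moves the other in the opposite direction.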
Document types
Restrict the search to one or more document types or document type groups. When empty, all accessible document types are searched.
Date range
Restrict results to documents with a document date between From and To. Either endpoint is optional.
Columns
Toggle which columns appear in the results table and drag to reorder. Available columns:
- Document ID
- Document name
- Document type
- Document date
- Created by
- File extension
- Reasoning (populated by LLM rerank)
Max results
Control how many results are returned: 10, 20, 50, 100, 200, 500, or 1000. Default is 50. Higher limits take longer to process — LLM rerank time scales roughly with this number. The server may enforce a maximum cap.
Understanding different score ranges
Here are practical examples of what different score ranges typically mean. Remember that scores differ meaningfully depending on whether LLM rerank ran — treat these as rules of thumb rather than fixed thresholds.
High scores (75-100)
Strong match. The document is highly relevant to your query.
Medium scores (45-74)
Moderate match. The document is related to your query but may not be a direct hit. Could be:
- A partial keyword match (some terms found, others not)
- A conceptually related document that covers adjacent topics
- A document where your terms appear but in different contexts
Low scores (20-44)
Weak match. The document has some tenuous connection to your query. Worth checking if the higher-ranked results didn’t have what you need, but don’t expect a strong match.
Very low scores (below 20)
Marginal match. Typically only appears when few results are available and the system is returning the least-bad options. Consider refining your query.
Security and permissions
AI Search respects Nobly Insight document security at all times:
- Only documents the user has permission to view are returned. Permission checking happens at the database level during search execution, not after.
- Security keywords (keyword types flagged for security) are synced to the search index and used for access control evaluation.
- User group permissions are synced periodically (every 30 minutes by default) from Nobly Insight to the search index.
- A user will never see documents in search results that they would not be able to access through normal Nobly Insight document retrieval.
For a document to appear in a user’s results, all of the following must hold:
- The document has been indexed (indexing is a separate process)
- The user’s group has permission to the document type
- Security keyword restrictions are not filtering the document out
Frequently asked questions
Why do I get different results when I rephrase my query?
Different words produce different embeddings and different BM25 matches. With LLM rewrite on, the rewritten variants further depend on the original phrasing. This is normal and expected.
Why is the top score only 55%?
The relevance score reflects match quality, not ranking position. A top of 55% means the best match has moderate similarity to your query. This is common for partial keyword overlap or broad queries. The results may still be exactly what you need.
Can I search for documents by date?
Yes — use the Date range section of the settings panel to filter by document date. Dates inside the free-text query itself are treated as text and matched against keyword values.
What's the difference between AI Search and standard Document Search?
Standard Document Search uses structured filters: you select document types, date ranges, and keyword values from predefined fields. It queries the Nobly Insight database directly with exact criteria.
AI Search uses free-text input against a separate search index (PostgreSQL with BM25 + vector embeddings), optionally enhanced by LLM rewrite and LLM rerank. It can find documents by meaning, not just exact field values. AI Search supports document-type and date-range filtering via its settings panel, but does not support keyword-value filters.
Why doesn't wildcard search work?
The search system does not parse special characters as operators. The * character and quote marks are treated as literal text. Instead of wildcards, rely on the semantic component — the AI embedding naturally handles variations and related terms without needing explicit wildcards.
How current is the search index?
Documents must be indexed before they appear in AI Search results. The index is updated through a separate indexing pipeline — newly stored documents may not appear immediately. Security permissions are synced every 30 minutes by default. Check with your system administrator for the indexing schedule specific to your environment.
