Prasanth Janardhanan

Why Semantic Search Alone Fails on Legal Text (And How Hybrid Search Fixed It)

I was building a RAG pipeline to classify AI systems under the EU AI Act — feed in a plain-English description of your AI system, get back the risk tier, the relevant articles, and a compliance checklist with citations. Classification was working perfectly. All 8 test scenarios nailed the correct risk tier.

But the confidence scores were stuck at 36%.

The problem wasn’t the LLM. It wasn’t the prompts. It was the retrieval. The fix taught me something worth sharing: when semantic search falls short, and when adding keyword search actually makes things worse.


Self-Hosted RAG Architecture for Privacy-Sensitive Document Search with Golang

Remember the days of boolean search operators? The frustrating experience of trying to find that one document by guessing the exact keywords it might contain? For decades, our ability to search through documents has been limited by keyword matching—a primitive approach that fails to capture the richness of human language and intent.

Traditional search technology has evolved through several stages: from basic keyword matching, to stemming and lemmatization, and eventually to statistical weighting schemes like TF-IDF (Term Frequency-Inverse Document Frequency). Each iteration brought incremental improvements, but they all shared a fundamental limitation: they match the words in a query rather than the meaning behind it.
