Meet Josef Q’s new document pre-processing engine—expertly optimized for legal and compliance

Some say RAG is dead—but not us!

Introducing Josef Q’s all-new document pre-processing engine purpose-built for legal and compliance, featuring proprietary hierarchy-based chunking, data augmentation, and contextual enrichment.

After two years of research with customers like the global insurer Bupa and L’Oréal, alongside universities like NYU and Cornell Tech, we’ve learned a lot.

One key lesson is that to produce hyper-accurate, reliable legal AI, you must optimize your document pre-processing for the conventions of the content you’re looking to unlock. For our customers, that means documentation such as policies, agreements, and playbooks.

Here’s a look at Josef’s all-new pre-processing engine, and how it helps more legal and compliance teams unlock the true value of AI on Josef Q.

Watch Josef Co-founder and COO Sam Flynn’s quick walk-through below! 


The problem with traditional chunking

Most retrieval-based AI tools use simple chunking strategies that divide text into fixed-size blocks. While this approach may work for simpler content, at Josef we know firsthand that legal documents demand nuance and extra care.

Splitting content into fixed-size blocks can inadvertently slice through critical context, dividing sections, clauses, or legal arguments into unstructured segments when they should be read and understood as a whole.

The result is copilots and other AI tools that may help subject matter experts shave a few minutes off tasks here and there, but that can also retrieve mismatched or incomplete information and, in some cases, produce a dreaded hallucination!
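To make the problem concrete, here is a minimal Python sketch of fixed-size chunking. The policy excerpt, the function name `fixed_size_chunks`, and the 60-character block size are all invented for illustration; this is not how any particular tool is implemented.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive blocks of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical policy excerpt, invented for this example.
POLICY = (
    "4.1 Employees may claim travel expenses. "
    "4.2 Claims above $500 require written approval from a manager."
)

chunks = fixed_size_chunks(POLICY, 60)
for chunk in chunks:
    # Each block is cut purely by character count, ignoring clause boundaries.
    print(repr(chunk))
```

Because clause 4.2 is longer than the block size, no single chunk can contain it in full. A retriever that returns only one of the blocks would see either the threshold or the approval requirement, but not both together.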

Hierarchy-based chunking tailored for legal and compliance content

Josef Q’s new pre-processing engine has been optimized following extensive research and analysis of thousands of complex legal and compliance documents.

The result is a pre-processing powerhouse that leverages hierarchy-based chunking to tackle the challenges of simple RAG head-on. Instead of imposing arbitrary breaks, our approach analyzes and mirrors the natural structure of legal documents, taking into account sections, subsections, clauses, paragraphs, and more.

Now, each time an end user asks a tool a question on Josef Q, the tool’s underlying content is analyzed according to its inherent hierarchy. The most relevant sections are then passed to the LLM to deliver answers teams can trust.
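The general idea of hierarchy-based chunking can be sketched in a few lines of Python. This is an illustrative approximation only, not Josef’s proprietary engine: it assumes sections are numbered in the style "4", "4.1", "4.1.2", and the names `Chunk`, `HEADING`, and `hierarchy_chunks` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed convention: a line starting with "4", "4.", or "4.1" opens a new section.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(.*)$")


@dataclass
class Chunk:
    number: str            # section number, e.g. "4.2"
    breadcrumb: list[str]  # parent headings, e.g. ["4 Expenses"]
    text: str              # full text of this section


def hierarchy_chunks(document: str) -> list[Chunk]:
    """Split a numbered document along its own section boundaries."""
    chunks: list[Chunk] = []
    headings: dict[str, str] = {}  # section number -> first line of that section
    current: Optional[Chunk] = None
    for raw in document.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            number, body = match.group(1), match.group(2)
            headings[number] = body
            parts = number.split(".")
            parents = [".".join(parts[:i]) for i in range(1, len(parts))]
            breadcrumb = [f"{n} {headings[n]}" for n in parents if n in headings]
            current = Chunk(number=number, breadcrumb=breadcrumb, text=body)
            chunks.append(current)
        elif current is not None:
            # Continuation lines stay with the section they belong to.
            current.text += " " + line
        # Text before the first numbered heading is ignored in this sketch.
    return chunks


# Hypothetical policy excerpt, invented for this example.
DOC = """\
4. Expenses
4.1 Employees may claim travel expenses.
4.2 Claims above $500 require written approval
from a manager.
"""

sections = hierarchy_chunks(DOC)
for section in sections:
    print(section.number, section.breadcrumb, section.text)
```

Here each clause stays whole, even when it wraps across lines, and each chunk records which parent section it sits under. Real legal documents use many more conventions (lettered sub-clauses, schedules, defined terms) that a production system would need to handle.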

Added data augmentation and contextual enrichment

Josef Q now also comes supercharged with high-quality data augmentation and contextual enrichment. Specialized algorithms optimized for legal and compliance content ensure that every piece of data within a policy, for example, is segmented correctly and enriched with semantic layers.

Data augmentation during the content retrieval process introduces controlled variations, expanding the model’s exposure to diverse phrasings and contexts.

Contextual enrichment adds metadata and semantic cues that help tools grasp the subtleties of legal language, ensuring nothing important is lost. Together, these techniques enable Josef’s LLM to efficiently scan, filter, and extract the most relevant information to form a tool’s answer.
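As a rough illustration of these two ideas, the Python sketch below prefixes each chunk with its document and section context, and expands a question into a phrasing variant before matching. Everything here is an assumption made for the example: the `SYNONYMS` table, the function names, and the word-overlap scoring are stand-ins, and treating augmentation as question variants is one possible reading rather than a description of Josef’s algorithms.

```python
import re

# Hypothetical synonym table; a real system would use a much richer source.
SYNONYMS = {"sign-off": "approval", "boss": "manager"}


def tokens(text: str) -> set[str]:
    """Lowercase a text and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9$-]+", text.lower()))


def enrich(chunk_text: str, document: str, section_path: list[str]) -> str:
    """Prefix a chunk with where it sits, so it still makes sense on its own."""
    context = " > ".join([document, *section_path])
    return f"[{context}] {chunk_text}"


def augment(question: str) -> list[str]:
    """Return the question plus one variant with synonyms swapped in."""
    variant = question
    for word, replacement in SYNONYMS.items():
        variant = variant.replace(word, replacement)
    return [question] if variant == question else [question, variant]


def best_match(question: str, enriched_chunks: list[str]) -> str:
    """Pick the chunk sharing the most words with any variant of the question."""
    def score(chunk: str) -> int:
        chunk_tokens = tokens(chunk)
        return max(len(tokens(q) & chunk_tokens) for q in augment(question))
    return max(enriched_chunks, key=score)


# Hypothetical chunks from an invented travel policy.
enriched = [
    enrich("Employees may claim travel expenses.",
           "Travel Policy", ["4 Expenses", "4.1"]),
    enrich("Claims above $500 require written approval from a manager.",
           "Travel Policy", ["4 Expenses", "4.2"]),
]

source = best_match("do i need sign-off from my boss for expenses?", enriched)
print(source)
```

In this toy example the question never uses the words "approval" or "manager", and the matching clause never mentions "expenses". The synonym variant and the section prefix supply those missing links, which is the intuition behind pairing augmentation with enrichment.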


Supercharging more Josef customers

With our all-new pre-processing engine, Josef Q Q&A tools aren’t just the smartest of the bunch. They also enable teams to:

  1. Launch more hyper-accurate tools—faster than ever: You don’t have to spend countless hours training and optimizing your tools with or without the help of IT. We’ve done the hard work for you behind the scenes!
  2. Audit answer sources instantly via interactive lists: Need to check where an answer came from? The new engine not only generates accurate answers, but also displays an interactive list of sources for easy reference.
  3. Deliver user-friendly answers in a range of formats: No one likes a wall of text. Josef Q tools now summarize and display answers in a range of rich formats, featuring hyperlinks, lists, bold, italics and more.

A future-proofed, self-service platform

RAG isn’t dead. It’s evolving! Josef Q’s new pre-processing engine represents its future, enabling two key things:

One: Teams can create tools fit for the whole business, not just subject matter experts needing a quick helping hand.

Unlike other AI-powered platforms, Josef isn’t just for experts. With our new pre-processing engine, legal and compliance teams can create even more tools that handle all the tedious tasks themselves—giving clients access to services they need around the clock!

Two: Josef can continue laying a strong foundation for the future of RAG.

Any AI-powered platform using RAG will soon need to leverage one or more of the emerging RAG approaches, including search, vision language models (VLMs), multi-step reasoning, and more. Our new engine helps us get there, and we can’t wait to see where it’ll take us, and the platform, next.

Coming soon…

Josef Q’s new document pre-processing engine is just one of the recent ways we’ve doubled down on our mission to help customers launch the most reliable and hyper-accurate Q&A tools out there, and there’s more to come.

Keep an eye out for upcoming features, including automatic question categorization, built-in tool strength assessments, and more.

Why Bupa thinks Josef Q is “so easy!”

Watch Josef Co-founder and CEO Tom Dreyfus chat with Bupa’s Head of Legal Operations, Claire Nuske, to learn how the global insurer developed a range of Q&A tools across Legal, People, Marketing, and Workplace Relations.


See Josef Q in action

Book a demo to see how you can increase access to policies, playbooks and guidance with hyper-accurate Q&As.
