Case study

Hereditary genealogy · ~200 employees

Guenifey automates heir-file processing and saves 30 minutes per case.

An Azure pipeline now handles every inbound case file: reading documents (including handwritten ones), extracting the data, and creating entries in the genealogy platform — with no operator opening a single PDF.

2 months

Azure OpenAI (GPT-4o)Azure Document IntelligenceAzure FunctionsMCP Server

−30 minper processed file

~2 hrecovered every week

0manual entry errors

ARCHITECTURE

How it works

Incoming email

PDF attachments

Azure Doc Intelligence

OCR + handwriting

GPT-4o

Structuring & validation

Automated checks

Duplicates · dates · fields

Business software

Entries created automatically

About the client

Guenifey is a specialist cabinet in hereditary genealogy. Its business: identifying and locating legal heirs during estate proceedings, on behalf of notarial practices.

With over 200 employees, the firm processes thousands of succession files every year. Each case begins with the collection of a corpus of civil and administrative documents — birth certificates, family booklets, identity documents, municipal records — used to reconstruct family ties and determine each heir's entitlements.

The process is rigorously documentary. Its quality depends on the ability to read, interpret and correctly enter data from heterogeneous sources, some very old, often partially handwritten.

The trigger

Each incoming file followed the same pattern: an email containing several PDFs, accompanied by a few contextual notes. An operator would take stock of it, open the PDFs one by one, read through the documents — sometimes handwritten pages that were difficult to decipher — extract the relevant data, and manually enter it into the business application.

The time spent on this task alone reached thirty minutes per file. Across four to five cases a week, that added up to one to two hours of work entirely consumed by data entry.

The risk of error was real. Duplicates occasionally appeared in the database; incorrectly transcribed dates slipped through entry filters. None of these incidents were critical in isolation, but each meant correction time — and a structural fragility in a field where documentary accuracy is non-negotiable.

Our approach

OLIXID began by mapping the workflow as it existed: receiving the email, reading the attachments, manually extracting data, entering it into the application. The goal was not to replace the operator, but to eliminate the mechanical part of their work: opening files, copying fields, checking for duplicates.

The main technical obstacle was not document extraction, but the interface with the business application. Its API was sparsely documented and required substantial integration engineering. The engagement ran for two months — largely to harden this integration layer through iterative testing with the Guenifey team.

What we delivered

The pipeline activates as soon as an email arrives at the dedicated address. PDF attachments are passed to Azure Document Intelligence, which extracts their content — including handwritten sections. GPT-4o then takes this raw content, structures the data according to the schema expected by the business application, and triggers a set of automated checks: date consistency, duplicate detection, missing mandatory fields.

If the checks pass, entries are created directly in the application. The operator receives a notification and can validate or correct the result in seconds — without having opened a single PDF.

The whole system runs on Azure Functions and an MCP server, hosted within Guenifey's Microsoft environment.

Results

Processing a file, which previously took an operator thirty minutes, is now handled by the pipeline in a matter of minutes. Based on four to five cases per week, that amounts to one to two hours recovered every week — time returned to higher-value tasks.

Data entry errors stemming from manual transcription — duplicates, incorrectly copied dates — have been eliminated at source. The automated check blocks anomalies before they reach the database.

Day-to-day impact

Avant

Previously, processing a file meant opening each PDF, identifying the relevant documents among others, sometimes deciphering several pages of handwritten text, and copying the data field by field. The difficulty varied with document legibility: some older records required genuine reading effort.

Aujourd'hui

Today, the operator sends the email to the dedicated address. A few minutes later, the entries are created and a notification is waiting. They review, validate if everything is correct, or make corrections if needed. Reading PDFs is no longer a task — it is an exception.

What's next

The pipeline runs autonomously. The engagement is complete. Guenifey has embedded the system within its Microsoft environment and continues to operate it at its usual weekly volume.