How we detect ghost farmers in real time

"Ghost farmers" are people listed as beneficiaries who never received the seedlings, or who were double-enrolled across districts. They are the single biggest reputational risk for institutional funders of agroforestry programmes. When EU and World Bank auditors ask "where did the money go?", what they're really asking is "did the people you said exist actually exist, and did they get what you said they got?"

Most distribution programmes can't answer that question. Records sit in parish books, beneficiary lists are entered into Excel after the fact, and there's no cross-referencing across programmes that operate in the same district. Nurseryz.io's ghost-farmer detector closes that gap with five rules that run automatically every hour.

The detector doesn't catch fraud. It catches data inconsistencies that humans can then investigate, and in the large majority of cases the inconsistency turns out to be exactly what audit teams want to know about.

The five detection rules

Each rule is implemented as a separate Rule class under App\Services\Ghost\Rules\, so we can tune thresholds per programme without touching the detector orchestrator. The five rules in production:

Rule	Signal	Severity
Duplicate phone	Same phone number on ≥3 distributions in one programme	Medium
Duplicate national ID	Same NID across two or more distributions	Critical
Uncontacted >60 days	Distribution with no survival report after 60 days	Medium
Calendar anomaly	Distribution date outside the programme's start/end window	Critical
Suspicious concentration	3+ distributions per farmer in the same programme	Medium
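To make the table concrete, here is a minimal sketch of three of the rules, written in Python for brevity rather than in the production codebase's language. The `Distribution` and `Flag` records, the `evaluate` method name, and the `run_rules` helper are assumptions made for illustration; only the signals, default thresholds, and severities come from the table above. Each rule is assumed to receive the distributions of a single programme.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Distribution:  # hypothetical record shape
    id: int
    farmer_id: int
    phone: str
    distributed_on: date
    last_survival_report: Optional[date] = None


@dataclass(frozen=True)
class Flag:  # hypothetical record shape
    rule_type: str
    severity: str
    distribution_id: int


class DuplicatePhoneRule:
    rule_type = "duplicate_phone"
    severity = "medium"

    def __init__(self, threshold: int = 3):
        self.threshold = threshold

    def evaluate(self, distributions, today):
        # Count how many distributions share each phone number.
        counts = Counter(d.phone for d in distributions)
        return [Flag(self.rule_type, self.severity, d.id)
                for d in distributions if counts[d.phone] >= self.threshold]


class UncontactedRule:
    rule_type = "uncontacted"
    severity = "medium"

    def __init__(self, max_days: int = 60):
        self.max_days = max_days

    def evaluate(self, distributions, today):
        # No survival report yet, and the distribution is older than max_days.
        return [Flag(self.rule_type, self.severity, d.id)
                for d in distributions
                if d.last_survival_report is None
                and (today - d.distributed_on).days > self.max_days]


class CalendarAnomalyRule:
    rule_type = "calendar_anomaly"
    severity = "critical"

    def __init__(self, start: date, end: date):
        self.start, self.end = start, end

    def evaluate(self, distributions, today):
        # Distribution date falls outside the programme window.
        return [Flag(self.rule_type, self.severity, d.id)
                for d in distributions
                if not (self.start <= d.distributed_on <= self.end)]


def run_rules(rules, distributions, today):
    """Run every rule independently and collect the flags in rule order."""
    flags = []
    for rule in rules:
        flags.extend(rule.evaluate(distributions, today))
    return flags
```

Because each rule takes its threshold in its constructor, a programme-specific value can be passed in without changing the loop that runs the rules.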

The two critical rules, duplicate national ID and calendar anomaly, cannot be resolved by programme managers. Only a super-admin can dismiss them, because they are the rules that most directly indicate either fraud or data corruption that an external auditor would flag.

False positives, and what we do about them

Across the first 200 flags raised by the live detector, roughly 8% were genuine fraud, 24% were legitimate (e.g. the same farmer enrolled in two programmes through different cooperatives), and 68% were data-entry errors — phone numbers transposed, NIDs duplicated due to copy-paste, dates entered in the wrong format.

That last bucket is the most valuable. Catching data-entry errors before they reach the donor PDF is the entire point. A 6% data-error rate on enrolment can be the difference between a donor signing off on the report and an audit team sending it back six months later.

The resolution flow

Every flag is in one of four states: open, verified (confirmed fraud, escalated), resolved (fixed and closed), or false positive (rule fired but data is legitimate). Each state change requires a resolution note that lands on the audit trail and surfaces in the donor PDF.

Programme managers can resolve the three medium-severity rules.
Super-admin is required for critical rules — duplicate NID and calendar anomaly.
Funders see the entire queue read-only, including resolution notes.
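The permission model above can be sketched as a small state-change function. This is an illustrative Python sketch, not the production code: the role names, the `GhostFlag` record, and the audit-entry fields are hypothetical, and it assumes a super-admin may also act on medium-severity flags, which the text does not state explicitly.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

STATES = {"open", "verified", "resolved", "false_positive"}
CRITICAL_RULES = {"duplicate_nid", "calendar_anomaly"}


@dataclass
class GhostFlag:  # hypothetical record shape
    rule_type: str
    state: str = "open"
    audit_trail: list = field(default_factory=list)


def can_change_state(role: str, rule_type: str) -> bool:
    if role == "super_admin":
        return True  # assumption: super-admin can act on every rule
    if role == "programme_manager":
        return rule_type not in CRITICAL_RULES
    return False  # funders and any other role are read-only


def change_state(flag: GhostFlag, new_state: str, role: str, user: str, note: str) -> GhostFlag:
    if new_state not in STATES:
        raise ValueError(f"unknown state: {new_state}")
    if not can_change_state(role, flag.rule_type):
        raise PermissionError(f"{role} cannot change a {flag.rule_type} flag")
    if not note.strip():
        raise ValueError("a resolution note is required for every state change")
    # Record the change before applying it, so the trail keeps the old state.
    flag.audit_trail.append({
        "from": flag.state,
        "to": new_state,
        "by": user,
        "note": note,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    flag.state = new_state
    return flag
```

Rejecting an empty note inside the same function that changes the state means there is no code path that changes a flag without leaving a trail entry.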

The complete audit chain — when the flag was raised, what rule fired, what evidence it was based on, who resolved it, and what note they left — is queryable per programme and exportable to CSV for donor compliance.
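As a rough illustration of the per-programme export, the sketch below filters audit entries by programme and writes them as CSV using Python's standard `csv` module. The column names and the dictionary shape of an entry are assumptions chosen to mirror the fields listed above; the real export format may differ.

```python
import csv
import io

# Hypothetical column names mirroring the audit chain described above.
AUDIT_COLUMNS = [
    "programme_id", "rule_type", "raised_at",
    "evidence", "resolved_by", "resolution_note",
]


def export_audit_csv(entries, programme_id):
    """Return the audit entries of one programme as a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=AUDIT_COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for entry in entries:
        if entry["programme_id"] == programme_id:
            writer.writerow(entry)  # missing keys are written as empty cells
    return buffer.getvalue()
```

Using `csv.DictWriter` rather than joining strings by hand keeps free-text fields such as resolution notes safe when they contain commas or quotes.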

Calibration per programme

The thresholds in the table above are defaults. Each programme can tune them up or down via the super-admin Programme view. For example, a Lamwo District programme during the long rainy season might want the "uncontacted >60 days" threshold relaxed to 90 days, because rainy-season access disruptions can legitimately delay first survival reports. Calibration changes are themselves logged.
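One way to model defaults plus logged per-programme overrides is sketched below in Python. The threshold keys, the `ProgrammeCalibration` class, and the shape of a log entry are hypothetical; only the default values (3, 2, 60, 3) are taken from the rules table.

```python
from dataclasses import dataclass, field

# Defaults from the rules table; the key names are illustrative.
DEFAULT_THRESHOLDS = {
    "duplicate_phone_min_distributions": 3,
    "duplicate_nid_min_distributions": 2,
    "uncontacted_max_days": 60,
    "concentration_min_distributions": 3,
}


@dataclass
class ProgrammeCalibration:
    programme_id: int
    overrides: dict = field(default_factory=dict)
    change_log: list = field(default_factory=list)

    def threshold(self, key: str) -> int:
        # Fall back to the default when the programme has no override.
        return self.overrides.get(key, DEFAULT_THRESHOLDS[key])

    def set_threshold(self, key: str, value: int, changed_by: str) -> None:
        if key not in DEFAULT_THRESHOLDS:
            raise KeyError(key)
        if value < 1:
            raise ValueError("threshold must be a positive integer")
        # Log the change first so the entry records the previous value.
        self.change_log.append({
            "key": key,
            "old": self.threshold(key),
            "new": value,
            "changed_by": changed_by,
        })
        self.overrides[key] = value
```

For the rainy-season example, `set_threshold("uncontacted_max_days", 90, changed_by=...)` would relax one rule for one programme while every other threshold keeps its default.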

Detection runs on a 60-minute cron schedule and can also be triggered on demand with the "Re-scan now" button on the funder Ghost Flags page. Re-runs are idempotent: the same (programme, rule_type, farmer_id, distribution_id) tuple never produces duplicate flags.
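The idempotency guarantee can be illustrated with an in-memory store keyed by that tuple. This Python sketch is an assumption about the general technique, not a description of the production storage; in a database the same effect is commonly achieved with a unique constraint on the four columns. `FlagStore` and `scan` are hypothetical names.

```python
class FlagStore:
    """In-memory stand-in for the flags table, keyed by the identity tuple."""

    def __init__(self):
        self._flags = {}

    def upsert(self, programme_id, rule_type, farmer_id, distribution_id) -> bool:
        key = (programme_id, rule_type, farmer_id, distribution_id)
        if key in self._flags:
            return False  # already flagged: leave the existing flag and its state alone
        self._flags[key] = {"state": "open"}
        return True

    def __len__(self):
        return len(self._flags)


def scan(store: FlagStore, findings) -> int:
    """Store each finding once; return how many new flags were created."""
    return sum(store.upsert(*finding) for finding in findings)
```

Running `scan` twice over the same findings creates flags only the first time, so a manual re-scan between two cron runs cannot inflate the queue or reset a flag that has already been resolved.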

What this isn't

The detector doesn't determine intent. It surfaces data inconsistency for human review. It's not an AI; it doesn't predict; it doesn't score farmers. It runs five deterministic rules and produces a queue of records that humans then triage.

That's by design. Institutional buyers — the EU's Green Deal partnerships, the World Bank's Agri-Industrialisation framework — explicitly require that automated systems making accountability decisions be auditable and explainable. Rule-based detection is auditable; ML scoring is not, at least not yet at the bar these partners require.
