Methodology Note

How we detect ghost farmers in real time.

Five rules running on a 60-minute cadence. Critical-severity flags wait for super-admin sign-off; everything else routes to the programme manager for triage. The math, the false-positive rate, and how we tune the gates per programme.

"Ghost farmers" — people listed as beneficiaries who never received the seedlings, or who were double-enrolled across districts — are the single biggest reputational risk for institutional funders of agroforestry programmes. When EU and World Bank auditors ask "where did the money go?" what they're really asking is "did the people you said exist, actually exist, and did they get what you said they got?"

Most distribution programmes can't answer that question. Records sit in parish books, beneficiary lists are entered into Excel after the fact, and there's no cross-referencing across programmes that operate in the same district. Nurseryz.io's ghost-farmer detector closes that gap with five rules that run automatically every hour.

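The detector doesn't catch fraud. It catches data inconsistency that humans can then investigate, and in roughly three out of four of the first 200 live flags (the fraud and data-entry-error buckets in the breakdown below), that inconsistency turned out to be exactly what audit teams want to know about.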

The five detection rules

Each rule is implemented as a separate Rule class under App\Services\Ghost\Rules\, so we can tune thresholds per programme without touching the detector orchestrator. The five rules in production:

| Rule | Signal | Severity |
| --- | --- | --- |
| Duplicate phone | Same phone number on ≥3 distributions in one programme | Medium |
| Duplicate national ID | Same NID across two or more distributions | Critical |
| Uncontacted >60 days | Distribution with no survival report after 60 days | Medium |
| Calendar anomaly | Distribution date outside the programme's start/end window | Critical |
| Suspicious concentration | 3+ distributions per farmer in the same programme | Medium |
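
To make the one-class-per-rule layout concrete, here is a minimal sketch. It is written in Python purely for illustration and is not the production code under App\Services\Ghost\Rules\. The `Distribution` record shape, the class names, the `check` method and the `detect` function are all assumptions; only the rule definitions, default thresholds and severities are taken from the table above. The duplicate-NID rule follows the literal wording of the table, and the production rule may scope it differently.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Distribution:
    # Hypothetical record shape; the real schema is not described in this note.
    id: int
    programme: str
    farmer_id: int
    phone: str
    national_id: str
    distributed_on: date
    last_survival_report: Optional[date] = None


@dataclass(frozen=True)
class Flag:
    rule_type: str
    severity: str
    distribution_id: int


def _flag_groups(rows, key, min_count, rule_type, severity):
    """Flag every row whose group (by `key`) has at least `min_count` members."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return [Flag(rule_type, severity, row.id)
            for group in groups.values() if len(group) >= min_count
            for row in group]


class DuplicatePhoneRule:
    def __init__(self, min_count=3):  # table default: >=3 in one programme
        self.min_count = min_count

    def check(self, rows, today):
        return _flag_groups(rows, lambda r: (r.programme, r.phone),
                            self.min_count, "duplicate_phone", "medium")


class DuplicateNationalIdRule:
    def check(self, rows, today):  # literal reading: same NID on 2+ distributions
        return _flag_groups(rows, lambda r: r.national_id,
                            2, "duplicate_national_id", "critical")


class UncontactedRule:
    def __init__(self, max_days=60):  # table default: 60 days
        self.max_days = max_days

    def check(self, rows, today):
        return [Flag("uncontacted", "medium", r.id) for r in rows
                if r.last_survival_report is None
                and (today - r.distributed_on).days > self.max_days]


class CalendarAnomalyRule:
    def __init__(self, windows):  # assumed shape: {programme: (start, end)}
        self.windows = windows

    def check(self, rows, today):
        flags = []
        for r in rows:
            start, end = self.windows[r.programme]
            if not (start <= r.distributed_on <= end):
                flags.append(Flag("calendar_anomaly", "critical", r.id))
        return flags


class SuspiciousConcentrationRule:
    def __init__(self, min_count=3):  # table default: 3+ per farmer
        self.min_count = min_count

    def check(self, rows, today):
        return _flag_groups(rows, lambda r: (r.programme, r.farmer_id),
                            self.min_count, "suspicious_concentration", "medium")


def detect(rows, rules, today):
    """Orchestrator: runs each rule and collects flags; it holds no thresholds."""
    return [flag for rule in rules for flag in rule.check(rows, today)]
```

Because each threshold lives in a rule's constructor, a programme-specific run only needs a differently configured rule instance, for example `UncontactedRule(max_days=90)`, while `detect` stays unchanged.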

The two critical rules, duplicate national ID and calendar anomaly, cannot be resolved by programme managers. Only a super-admin can dismiss them, because they are the rules that most directly indicate either fraud or data corruption that an external auditor would flag.
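
The severity gate can be expressed as a single permission check. The sketch below is illustrative only: the role strings, the rule identifiers and the `can_resolve` function are hypothetical names, and it assumes that a super-admin may also resolve medium-severity flags, which this note does not state explicitly.

```python
# Rule identifiers are assumed names for the two critical rules in the table.
CRITICAL_RULES = {"duplicate_national_id", "calendar_anomaly"}


def can_resolve(role: str, rule_type: str) -> bool:
    """Return True if `role` is allowed to close a flag raised by `rule_type`."""
    if rule_type in CRITICAL_RULES:
        # Critical flags wait for super-admin sign-off.
        return role == "super_admin"
    # Everything else routes to the programme manager for triage.
    # Assumption: a super-admin can act on these flags as well.
    return role in {"super_admin", "programme_manager"}
```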

False positives, and what we do about them

Across the first 200 flags raised by the live detector, roughly 8% were genuine fraud, 24% were legitimate (e.g. the same farmer enrolled in two programmes through different cooperatives), and 68% were data-entry errors — phone numbers transposed, NIDs duplicated due to copy-paste, dates entered in the wrong format.
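
For readers who prefer counts to percentages, the breakdown can be worked through as below. The percentages are the rounded figures quoted above, so the counts are approximations rather than exact tallies. Treating the "legitimate" bucket as the false-positive rate is an interpretation based on the flag states described in the resolution flow.

```python
TOTAL_FLAGS = 200

# Rounded shares quoted in the text, as whole percentages.
shares_pct = {"genuine_fraud": 8, "legitimate": 24, "data_entry_error": 68}

# Approximate flag counts per bucket (integer arithmetic avoids float noise).
counts = {bucket: TOTAL_FLAGS * pct // 100 for bucket, pct in shares_pct.items()}

# Flags where the rule fired but the data was legitimate.
false_positive_pct = shares_pct["legitimate"]

# Flags that pointed at something worth fixing or escalating.
actionable_pct = shares_pct["genuine_fraud"] + shares_pct["data_entry_error"]

print(counts)              # approximate counts per bucket
print(false_positive_pct)  # share of flags that were false positives
print(actionable_pct)      # share of flags that were fraud or data errors
```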

That last bucket is the most valuable. Catching data-entry errors before they reach the donor PDF is the entire point. If a programme has a 6% data-error rate on enrolment, that's the difference between a donor signing off on the report and an audit team flagging it back six months later.

The resolution flow

Every flag is in one of four states: open, verified (confirmed fraud, escalated), resolved (fixed and closed), or false positive (rule fired but data is legitimate). Each state change requires a resolution note, which is written to the audit trail and surfaces in the donor PDF.
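
A minimal sketch of the "no state change without a note" rule follows. The `GhostFlag` class, its field names and the shape of an audit entry are hypothetical. This note does not say which transitions between the four states are permitted, so the sketch only checks that the target state is one of the four and that a note is present.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

STATES = {"open", "verified", "resolved", "false_positive"}


@dataclass
class GhostFlag:
    rule_type: str
    state: str = "open"
    audit_trail: list = field(default_factory=list)

    def transition(self, new_state: str, actor: str, note: str) -> None:
        """Move the flag to `new_state`, recording who did it and why."""
        if new_state not in STATES:
            raise ValueError(f"unknown state: {new_state}")
        if not note.strip():
            # Every state change must carry a resolution note.
            raise ValueError("a resolution note is required")
        self.audit_trail.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "from": self.state,
            "to": new_state,
            "note": note.strip(),
        })
        self.state = new_state
```

Rejecting the change before anything is appended keeps the audit trail free of entries that have no explanation attached.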

The complete audit chain — when the flag was raised, what rule fired, what evidence it was based on, who resolved it, and what note they left — is queryable per programme and exportable to CSV for donor compliance.
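
As an illustration of the per-programme CSV export, the sketch below writes one row per audit entry. The column names mirror the five elements listed above but are assumed, as are the `export_audit_csv` function and the dictionary shape of an entry. The actual export format is not specified in this note.

```python
import csv
import io

# Assumed column names, one per element of the audit chain.
FIELDS = ["programme", "raised_at", "rule_type", "evidence",
          "resolved_by", "resolution_note"]


def export_audit_csv(entries, programme):
    """Return the audit entries for one programme as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        if entry["programme"] == programme:
            writer.writerow(entry)
    return buf.getvalue()
```

`csv.DictWriter` quotes any field that contains a comma, so free-text resolution notes do not break the column layout.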

Calibration per programme

The thresholds in the table above are defaults. Each programme can tune them up or down via the super-admin Programme view. For example, a Lamwo District programme during the long rainy season might want the "uncontacted >60 days" threshold relaxed to 90 days, because rainy-season access disruptions can legitimately delay first survival reports. Calibration changes are themselves logged.
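
One way to model per-programme calibration is a table of defaults plus logged overrides. In the sketch below, the `Calibration` class, the threshold key names and the log entry shape are hypothetical. Only the default values (3, 60 and 3) and the requirement that changes are logged come from this note.

```python
# Defaults from the rules table; the key names are assumptions.
DEFAULTS = {
    "duplicate_phone_min": 3,
    "uncontacted_max_days": 60,
    "concentration_min": 3,
}


class Calibration:
    def __init__(self):
        self.overrides = {}  # {programme: {threshold_key: value}}
        self.log = []        # every change is recorded here

    def get(self, programme, key):
        """Return the programme's override, or the default if none is set."""
        return self.overrides.get(programme, {}).get(key, DEFAULTS[key])

    def set(self, programme, key, value, actor):
        """Override one threshold for one programme and log the change."""
        if key not in DEFAULTS:
            raise KeyError(f"unknown threshold: {key}")
        old = self.get(programme, key)
        self.overrides.setdefault(programme, {})[key] = value
        self.log.append({"programme": programme, "key": key,
                         "old": old, "new": value, "actor": actor})
```

Relaxing the uncontacted threshold for one programme, for example `cal.set("lamwo", "uncontacted_max_days", 90, "super_admin")`, leaves every other programme on the 60-day default.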

Detection runs on a 60-minute cron schedule and can also be triggered on demand with the "Re-scan now" button on the funder Ghost Flags page. Re-runs are idempotent: the same (programme, rule_type, farmer_id, distribution_id) tuple never produces duplicate flags.
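
One common way to get this kind of idempotency is a uniqueness constraint on the four-column tuple, combined with an insert that skips rows that already exist. The sketch below uses an in-memory SQLite database for illustration. The table name, the `state` column and the `record_flags` function are assumptions, and the production database and schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ghost_flags (
        programme        TEXT    NOT NULL,
        rule_type        TEXT    NOT NULL,
        farmer_id        INTEGER NOT NULL,
        distribution_id  INTEGER NOT NULL,
        state            TEXT    NOT NULL DEFAULT 'open',
        UNIQUE (programme, rule_type, farmer_id, distribution_id)
    )
""")


def record_flags(conn, flags):
    """Insert flags, skipping tuples already stored. Returns the number added."""
    count_sql = "SELECT COUNT(*) FROM ghost_flags"
    before = conn.execute(count_sql).fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO ghost_flags "
        "(programme, rule_type, farmer_id, distribution_id) VALUES (?, ?, ?, ?)",
        flags,
    )
    conn.commit()
    return conn.execute(count_sql).fetchone()[0] - before
```

Because a skipped insert leaves the existing row untouched, a flag that has already been triaged keeps its state when the hourly run or a manual re-scan finds the same tuple again.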

What this isn't

The detector doesn't determine intent. It surfaces data inconsistency for human review. It's not an AI; it doesn't predict; it doesn't score farmers. It runs five deterministic rules and produces a queue of records that humans then triage.

That's by design. Institutional buyers — the EU's Green Deal partnerships, the World Bank's Agri-Industrialisation framework — explicitly require that automated systems making accountability decisions be auditable and explainable. Rule-based detection is auditable; ML scoring is not, at least not yet at the bar these partners require.
