Lead Sourcing Bottleneck — Directory-to-CRM Pipeline
A lead-sourcing pipeline that removes the prospecting bottleneck — finds businesses in a target directory, scores each one for fit with AI, and hands qualified, context-rich leads to the sales team. Runs locally with no cloud rental.
Capability demonstration · Sales Operations · Published 2026-05-26
Results
- ▲200+ qualified leads delivered from initial pipeline runs
- ▲100+ businesses contacted with AI-generated pain-point-specific outreach
- ▲50% phone-pickup rate on the qualified set — confirming real, reachable decision-makers
- ▲Two-stage LLM scoring (classify, then rank against a weighted rubric) before any human touches a lead
- ▲Every lead ships with an LLM-written pain-point brief so the caller opens with context, not a cold script
The Bottleneck
Plenty of businesses have a sales motion that depends on finding the right prospects in public directories — then manually copying names, numbers, and addresses into a spreadsheet and guessing which ones are worth a call. It's slow, it doesn't scale, and the qualification is inconsistent because it lives in one person's head.
The question this build answers: is "find and qualify the right prospects" a bottleneck you can hand to an AI system? It is — as long as the qualification logic can be written down.
What Was Built
A fully local AI pipeline — no cloud APIs, no monthly per-seat licensing. The system is built on LangGraph with a five-stage StateGraph:
**Scrape → Categorize → Qualify → Insert → Report**
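The five-stage flow above can be sketched in plain Python as a shared state dictionary passed through one function per stage. This is a dependency-free stand-in for the LangGraph StateGraph, not the actual build: the stage bodies, field names, and sample lead are all hypothetical placeholders.

```python
from typing import TypedDict


class PipelineState(TypedDict, total=False):
    """Shared state; each stage reads earlier keys and adds its own."""
    raw_leads: list
    categorized: list
    qualified: list
    inserted: int
    report: str


def scrape(state: PipelineState) -> PipelineState:
    # Placeholder: a real stage would fetch directory listings here.
    return {"raw_leads": [{"name": "Acme Plumbing", "category": "Plumber"}]}


def categorize(state: PipelineState) -> PipelineState:
    # Placeholder for the LLM classification step (hypothetical label).
    return {"categorized": [dict(lead, icp="home_services") for lead in state["raw_leads"]]}


def qualify(state: PipelineState) -> PipelineState:
    # Placeholder for the LLM scoring step (hypothetical fixed score).
    return {"qualified": [dict(lead, score=0.8) for lead in state["categorized"]]}


def insert(state: PipelineState) -> PipelineState:
    # Placeholder: a real stage would write to the database.
    return {"inserted": len(state["qualified"])}


def report(state: PipelineState) -> PipelineState:
    return {"report": f"{state['inserted']} qualified leads stored"}


STAGES = [scrape, categorize, qualify, insert, report]


def run_pipeline() -> PipelineState:
    state: PipelineState = {}
    for stage in STAGES:
        # Merge each stage's partial update into the shared state.
        state.update(stage(state))
    return state


final_state = run_pipeline()
print(final_state["report"])
```

In a LangGraph version, each function would presumably be registered as a node with `add_node` and chained with `add_edge`; the partial-update pattern stays the same.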
Every stage runs on local infrastructure. A public directory is scraped across many locations and categories. Each raw lead is passed through a local LLM (Ollama) in two stages — first to classify the business into an ICP taxonomy, then to score it for fit against a weighted rubric.
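A minimal sketch of the classify-then-score pattern, assuming the model is asked to reply in JSON. The taxonomy labels, prompts, and reply shapes are illustrative assumptions, and the model is injected as a plain callable so the example runs without a model server. A stub stands in for the local LLM here.

```python
import json
from typing import Callable

# Hypothetical ICP taxonomy; the real labels are not given in the source.
ICP_LABELS = ["home_services", "professional_services", "retail", "other"]


def classify(lead: dict, llm: Callable[[str], str]) -> str:
    """Stage 1: map a raw lead onto one taxonomy label."""
    prompt = (
        "Classify this business into exactly one of: "
        + ", ".join(ICP_LABELS)
        + '. Reply with JSON like {"label": "<one label>"}.\n'
        + json.dumps(lead)
    )
    reply = json.loads(llm(prompt))
    label = reply.get("label", "other")
    # Guard against labels outside the taxonomy.
    return label if label in ICP_LABELS else "other"


def score(lead: dict, label: str, llm: Callable[[str], str]) -> dict:
    """Stage 2: ask for a fit score between 0 and 1 plus a short reason."""
    prompt = (
        f"Score this {label} business for fit from 0 to 1. "
        'Reply with JSON like {"score": 0.5, "reason": "<why>"}.\n'
        + json.dumps(lead)
    )
    reply = json.loads(llm(prompt))
    clamped = max(0.0, min(1.0, float(reply["score"])))
    return {"score": clamped, "reason": str(reply.get("reason", ""))}


def fake_llm(prompt: str) -> str:
    # Stand-in for a call to the local model; returns canned JSON.
    if prompt.startswith("Classify"):
        return '{"label": "home_services"}'
    return '{"score": 0.82, "reason": "Matches target segment"}'


lead = {"name": "Acme Plumbing", "category": "Plumber", "city": "Springfield"}
label = classify(lead, fake_llm)
result = score(lead, label, fake_llm)
print(label, result)
```

Keeping the two stages as separate calls means a lead that lands outside the taxonomy can be dropped before the more expensive scoring prompt runs.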
Qualified leads are stored in a self-hosted database with tier, pain points, and a plain-English explanation of the score — so the caller knows exactly what to say before they dial.
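As an illustration of what such a record might look like, here is a sketch using Python's built-in `sqlite3` with an in-memory database. The source does not name the database engine or the schema, so SQLite, the table name, and every column here are assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the self-hosted database
conn.execute(
    """
    CREATE TABLE leads (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        phone TEXT,
        tier TEXT NOT NULL,
        pain_points TEXT NOT NULL,       -- JSON-encoded list of strings
        score_explanation TEXT NOT NULL  -- plain-English reason for the score
    )
    """
)


def insert_lead(conn: sqlite3.Connection, lead: dict) -> int:
    """Store one qualified lead and return its row id."""
    cur = conn.execute(
        "INSERT INTO leads (name, phone, tier, pain_points, score_explanation) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            lead["name"],
            lead.get("phone"),
            lead["tier"],
            json.dumps(lead["pain_points"]),
            lead["score_explanation"],
        ),
    )
    conn.commit()
    return cur.lastrowid


lead_id = insert_lead(
    conn,
    {
        "name": "Acme Plumbing",
        "phone": "555-0100",
        "tier": "A",
        "pain_points": ["Missed after-hours calls", "Manual job scheduling"],
        "score_explanation": "Target category with a listed phone number.",
    },
)
row = conn.execute(
    "SELECT tier, pain_points FROM leads WHERE id = ?", (lead_id,)
).fetchone()
print(row[0], json.loads(row[1]))
```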
Why It Maps Cleanly to an AI System
Three properties make this bottleneck a clean fit for automation:
**The source is structured.** A directory or listing site has predictable fields — name, contact, location, category. Structured input is what a scraper and an LLM can both reason about reliably.
**The qualification is a rubric, not a vibe.** Once "a good lead" can be expressed as weighted criteria, an LLM can apply it consistently across thousands of records — far more consistently than a tired human at row 400.
**The output is a handoff, not a decision.** The pipeline doesn't close anyone. It arms a human with a ranked list and a reason. The judgement stays where it belongs and the grunt-work goes where it doesn't.
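To make "a rubric, not a vibe" concrete, here is a worked sketch of a weighted rubric. The criteria, weights, and tier cut-offs are invented for illustration. It assumes each criterion is rated from 0 to 1 (for example by the LLM) and that the weighting itself is done in ordinary code.

```python
# Hypothetical criteria and weights; a real rubric would come from the sales team.
RUBRIC = {
    "fits_target_category": 40,
    "has_phone_number": 30,
    "in_service_area": 20,
    "has_website": 10,
}


def weighted_score(ratings: dict) -> float:
    """Combine per-criterion ratings (each 0..1) into one 0..1 score."""
    total_weight = sum(RUBRIC.values())
    earned = sum(weight * ratings.get(name, 0.0) for name, weight in RUBRIC.items())
    return earned / total_weight


def tier(score: float) -> str:
    """Map a score to a tier using illustrative cut-offs."""
    if score >= 0.75:
        return "A"
    if score >= 0.5:
        return "B"
    return "C"


ratings = {
    "fits_target_category": 1.0,
    "has_phone_number": 1.0,
    "in_service_area": 0.5,
    "has_website": 0.0,
}
lead_score = weighted_score(ratings)  # (40 + 30 + 10 + 0) / 100
print(lead_score, tier(lead_score))
```

Because the weights live in one table, changing what counts as a good lead is a one-line edit that applies identically to every record.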
The Human Loop
The pipeline doesn't replace the salesperson; it arms them.
```
Scrape → LLM Categorize → LLM Qualify → Database
                                           ↓
                    Caller reviews pain_points → dials with context
```
Every qualified lead includes a `pain_points` field generated by the LLM — specific operational pain points for that business type. The caller opens with a relevant, specific observation instead of a cold script.
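One way the caller-facing view could be rendered from a stored lead is sketched below. Only `pain_points` is named in the source; the other keys and the brief's layout are hypothetical.

```python
def call_brief(lead: dict) -> str:
    """Format a short pre-call brief from a qualified lead record."""
    lines = [f"{lead['name']} (tier {lead['tier']})"]
    # One bullet per LLM-generated pain point.
    lines += [f"- {point}" for point in lead["pain_points"]]
    lines.append(f"Why it scored: {lead['score_explanation']}")
    return "\n".join(lines)


brief = call_brief(
    {
        "name": "Acme Plumbing",
        "tier": "A",
        "pain_points": ["Missed after-hours calls", "Manual job scheduling"],
        "score_explanation": "Target category with a listed phone number.",
    }
)
print(brief)
```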
Outcome
- **200+ qualified leads** delivered from initial pipeline runs
- **100+ businesses** contacted with personalized, context-aware outreach
- **50% phone-pickup rate** on valid numbers — the qualification gate is filtering for real businesses
- A **repeatable pipeline shape** that ports to any directory-sourced sales motion