
Enrichment workflows: from a Reddit username to a verified work email in 90 seconds

Reddit username to verified work email in 90 seconds. The exact enrichment pipeline, the realistic hit rate, and the fallback branch when there's no LinkedIn.

Arthur, Founder, Shadow Inbox
Published Dec 19, 2025 · 7 min read

A Reddit username is a useless lead. A verified work email at the same person's company is a conversation. The gap between the two is roughly 90 seconds of automated work if you've wired the pipeline correctly, and 30 minutes if you haven't.

We've run this enrichment graph on tens of thousands of usernames inside Shadow Inbox. The hit rate isn't 100% — and anyone selling you 100% is lying — but the realistic number is good enough to make the channel work. Here's the exact path.

The username is the wrapper. The verified email is the lead. Everything in between is a 90-second automation problem.

The full pipeline, end to end.

Reddit username (e.g. u/jane_ops_devops)
   │
   ▼
1. Comment history scrape (last 200 items, ~3s)
   │
   ▼
2. Company hint extraction (LLM, ~1.5s)
   │
   ├─ branch: no company hint → website/portfolio search fallback
   ▼
3. LinkedIn search via Apollo or Clay (~5s)
   │
   ├─ branch: no LinkedIn match → comment-history company guess
   ▼
4. Email pattern guess via Hunter (~2s)
   │
   ▼
5. SMTP verify (~3s)
   │
   ▼
Verified work email + role + company

Total wall-clock time when everything works: 12 to 18 seconds. The "90 seconds" headline includes retry, fallback branches, and the polite rate-limit pauses that keep you from getting blocked.
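The retries and pauses that stretch 15 seconds toward 90 can be sketched as a small wrapper around each step. This is a minimal illustration, not the production code: the name withRetry, the attempt count, and the delay values are all assumptions.

```javascript
// Sketch: retry a pipeline step with a growing pause between attempts.
// `withRetry`, `attempts`, and `delayMs` are illustrative assumptions.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(step, { attempts = 3, delayMs = 1000 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await step(); // success: hand the result back immediately
    } catch (err) {
      lastError = err;
      // Pause before the next attempt; the wait grows linearly (1x, 2x, ...).
      if (i < attempts - 1) await sleep(delayMs * (i + 1));
    }
  }
  throw lastError; // every attempt failed; let the caller take a fallback branch
}
```

A caller would wrap each network step, for example withRetry(() => fetchHistory(username)), where fetchHistory stands in for whatever function performs the step.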

38-52%: end-to-end hit rate from username to verified email
12-18s: wall-clock time per username when all steps succeed
~$0.10: retail cost per fully-resolved contact
1 in 5: no-LinkedIn cases that still yield a company from comment history

Step 1: Comment history scrape.

The Reddit username is your only input. The first move is to read what they've written.

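We pull the last 200 comments and posts via the public JSON endpoint. Reddit listings return at most 100 items per request, so 200 items takes two calls, with the second passing the after cursor from the first response:

curl "https://www.reddit.com/user/jane_ops_devops/.json?limit=100" \
  -H "User-Agent: shadow-inbox/1.0 (by /u/your-account)"

# second page: AFTER_CURSOR is the data.after value from the first response
curl "https://www.reddit.com/user/jane_ops_devops/.json?limit=100&after=AFTER_CURSOR" \
  -H "User-Agent: shadow-inbox/1.0 (by /u/your-account)"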

Set a real user-agent. Reddit serves anonymous requests but rate-limits aggressively if your UA looks like a bot. We respect 1 request per second per user lookup.

The response gives you every public action — comment bodies, post bodies, subreddits, timestamps. Concatenate the bodies, deduplicate, and pass the blob to step 2.
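As a rough sketch of the concatenate-and-deduplicate step, assuming the standard Reddit listing shape (data.children, with comments as kind "t1" and posts as kind "t3"). The helper name listingToBlob and the separator format are our own choices for illustration.

```javascript
// Sketch: flatten a Reddit listing into one deduplicated text blob for the LLM step.
// `listingToBlob` is a hypothetical helper name, not part of any SDK.
function listingToBlob(listing) {
  const seen = new Set();
  const bodies = [];
  for (const child of listing?.data?.children ?? []) {
    const d = child.data ?? {};
    // Comments (kind "t1") carry `body`; posts (kind "t3") carry `title` and `selftext`.
    const text = child.kind === "t1"
      ? d.body
      : [d.title, d.selftext].filter(Boolean).join("\n");
    const normalized = (text ?? "").trim();
    if (!normalized || seen.has(normalized)) continue; // skip empties and exact repeats
    seen.add(normalized);
    // Keep the subreddit as context; it helps the model judge industry hints.
    bodies.push(`[r/${d.subreddit ?? "unknown"}] ${normalized}`);
  }
  return bodies.join("\n---\n");
}
```

Deduplication here is exact-match only. Near-duplicates (the same comment pasted with small edits) pass through, which is usually harmless for hint extraction.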

Step 2: Company hint extraction.

A small LLM call reads the comment history and looks for any clue about who the OP works for.

You are extracting employer hints from a user's public Reddit history.
Read the concatenated comments below.
 
Look for any of:
  - explicit employer mention ("at Acme", "I work for X", "my company Y")
  - role with company context ("CTO of Z", "running ops at Q")
  - signature blocks
  - tool stacks that imply a company size or industry
  - portfolio links or websites in profile
 
Return JSON:
{
  "company_name": "<name or null>",
  "role_hint": "<role or null>",
  "industry_hint": "<industry or null>",
  "evidence": ["<quoted span>", ...],
  "confidence": 0.0-1.0
}

This is Claude Sonnet 4.6 at temperature 0. Requiring the model to quote its evidence matters here because users mention companies for many reasons, and most mentions are not employment. Someone who writes "I love what Stripe is doing" is not a Stripe employee.

Hit rate at this stage: roughly 55-70% of usernames yield some kind of company or industry hint. The other 30-45% have anonymized profiles with no work-context comments.
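One way to make the evidence field earn its keep is to reject any hint whose quotes don't appear in the history you sent. The sketch below assumes the JSON schema from the prompt above; the function name validateHint and the 0.5 confidence floor are illustrative, not values from our pipeline.

```javascript
// Sketch: accept an LLM hint only if its quoted evidence is really in the history.
// `validateHint` and the default 0.5 threshold are illustrative assumptions.
function validateHint(rawJson, historyBlob, minConfidence = 0.5) {
  let hint;
  try {
    hint = JSON.parse(rawJson);
  } catch {
    return null; // the model returned something that is not JSON
  }
  if (!hint || typeof hint.company_name !== "string" || !hint.company_name.trim()) return null;
  if (typeof hint.confidence !== "number" || hint.confidence < minConfidence) return null;

  const haystack = historyBlob.toLowerCase();
  const evidence = Array.isArray(hint.evidence) ? hint.evidence : [];
  // Keep only quotes that appear verbatim (case-insensitive) in the source text.
  const grounded = evidence.filter(
    (q) => typeof q === "string" && q.trim() && haystack.includes(q.trim().toLowerCase())
  );
  if (grounded.length === 0) return null; // no verifiable quote: treat as "no hint"
  return { ...hint, evidence: grounded };
}
```

This check only confirms the quotes are real. Whether a real quote implies employment is still the model's judgment, which is why the confidence floor stays in place.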

Step 3: LinkedIn search via Apollo or Clay.

With a company hint plus the Reddit display name, we search Apollo (or Clay, or any data provider you prefer) for the person.

The query shape:

const apolloMatch = await apollo.peopleSearch({
  q_organization_name: companyHint,
  q_keywords: usernameAsHumanName(reddit.username),
  // also try the display name from their Reddit profile if set
  page: 1,
  per_page: 5
});

We try a few variations: the literal username, the username with separators removed, the display name, and a "first name + company" combination if the username looks personal (e.g., u/jane_ops_devops → "Jane" at Acme).
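A minimal sketch of how those variations might be generated. The helper name usernameVariants and the first-token heuristic are assumptions for illustration; in particular, treating the first alphabetic token as a first name is naive and will produce junk for handles like u/devops_guy.

```javascript
// Sketch: derive search variants from a Reddit handle.
// `usernameVariants` is a hypothetical helper; the heuristics are assumptions.
function usernameVariants(username, displayName = null) {
  const handle = username.replace(/^u\//i, ""); // drop a leading "u/"
  const parts = handle.split(/[_\-.\d]+/).filter(Boolean);
  // Literal handle, then the handle with separators removed.
  const variants = [handle, handle.replace(/[_\-.]/g, "")];
  if (displayName && displayName.trim()) variants.push(displayName.trim());
  // Treat the first token as a possible first name when it is purely alphabetic.
  if (parts.length > 0 && /^[a-z]{2,}$/i.test(parts[0])) {
    const first = parts[0];
    variants.push(first[0].toUpperCase() + first.slice(1).toLowerCase());
  }
  return [...new Set(variants)]; // preserve order, drop duplicates
}
```

Each variant becomes one search attempt against the provider, tried in order until one returns a plausible match at the hinted company.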

Hit rate at this stage, given a good company hint: roughly 60-70%. Taking 0.65 as a representative rate for both step 2 and step 3, the compound rate from username to LinkedIn match is about 0.65 × 0.65 ≈ 42%.
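The compounding is plain multiplication of per-stage rates. A quick sketch using the ranges quoted for steps 2 and 3 shows how wide the spread gets (the variable names are ours):

```javascript
// Multiply per-stage hit rates to get the rate for the whole chain.
function compound(...rates) {
  return rates.reduce((acc, rate) => acc * rate, 1);
}

// Step 2 (company hint found) x step 3 (LinkedIn match, given a hint).
const pessimistic = compound(0.55, 0.60); // about 0.33
const typical = compound(0.65, 0.65);     // about 0.42
const optimistic = compound(0.70, 0.70);  // about 0.49

console.log({ pessimistic, typical, optimistic });
```

The spread is the reason to measure each stage separately: a few points gained at step 2 carry through every stage after it.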

Step 4: Email pattern guess via Hunter.

Once you have a name and a company, you don't need to scrape an email. You need to derive it from the company's pattern.

Hunter exposes the pattern via its domain-search endpoint:

const pattern = await hunter.domainSearch({ domain: company.domain });
// returns e.g. { pattern: "{first}.{last}@{domain}", confidence: 96 }
 
const guessedEmail = applyPattern(pattern.pattern, {
  first: person.first_name,
  last: person.last_name,
  domain: company.domain
});

Hunter's pattern confidence is calibrated reasonably well — anything over 90% is near-certain to be the company's pattern. Below 70% we treat as a guess and run extra verification.
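The applyPattern helper in the snippet above is ours, not Hunter's. A minimal sketch of what it could look like follows; the {f} and {l} initial placeholders and the diacritic stripping are assumptions about what real-world patterns require.

```javascript
// Sketch of an applyPattern helper. Placeholders beyond {first}, {last},
// {domain} (here {f} and {l} for initials) are assumptions about pattern syntax.
function applyPattern(pattern, { first, last, domain }) {
  // Lowercase, strip diacritics (e.g. "José" -> "jose"), drop non-alphanumerics.
  const clean = (s) =>
    s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase().replace(/[^a-z0-9]/g, "");
  const values = {
    first: clean(first),
    last: clean(last),
    f: clean(first).charAt(0),
    l: clean(last).charAt(0),
    domain: domain.trim().toLowerCase(),
  };
  const filled = pattern.replace(/\{(\w+)\}/g, (match, key) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`Unknown placeholder: ${match}`);
    }
    return values[key];
  });
  // Patterns that omit @{domain} get the domain appended.
  return filled.includes("@") ? filled : `${filled}@${values.domain}`;
}
```

Throwing on an unknown placeholder is deliberate: a silently half-filled address would go on to the verification step and waste a credit.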

Step 5: SMTP verify.

The pattern gives you a guess. SMTP verification tells you whether the mailbox accepts mail. Use Hunter's email-verifier endpoint, or roll your own probe that opens an SMTP session with the domain's mail server and checks the response to RCPT TO.

const result = await hunter.emailVerifier({ email: guessedEmail });
// result.status: "valid" | "invalid" | "accept_all" | "unknown"

Four outcomes worth distinguishing:

  • valid: the SMTP server confirmed the mailbox. Send.
  • accept_all: the server accepts everything (catch-all domain). You don't actually know if your guess is right. Treat as soft-positive and send anyway, but consider a follow-up channel.
  • invalid: mailbox doesn't exist. Try the next pattern variant or fall back.
  • unknown: server didn't respond clearly. Retry once, then treat as soft-positive.

Roughly 80% of pattern guesses on Hunter-confident patterns return valid or accept_all. The other 20% need a fallback.
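The handling of those outcomes can be sketched as a small decision function. The action names ("send", "retry", "try_next_pattern", "fallback") are labels invented for this illustration, not part of Hunter's API.

```javascript
// Sketch: map a verifier status to the next pipeline action.
// Action names are illustrative, not Hunter's vocabulary.
function nextAction(status, { retried = false } = {}) {
  switch (status) {
    case "valid":
      return { action: "send", soft: false };
    case "accept_all":
      // Catch-all domain: the guess is unconfirmed, so flag it as soft.
      return { action: "send", soft: true, note: "catch-all domain; add a follow-up channel" };
    case "invalid":
      return { action: "try_next_pattern" };
    case "unknown":
      // Retry once, then fall through to a soft-positive send.
      return retried ? { action: "send", soft: true } : { action: "retry" };
    default:
      return { action: "fallback" }; // any status not discussed above
  }
}
```

Carrying the soft flag through to the dashboard lets the operator see which emails were confirmed and which were only accepted.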

The no-LinkedIn fallback branch.

About a third of usernames don't resolve to a LinkedIn profile. They're privacy-conscious, they use a different name, or they're not on LinkedIn at all. The branch:

LinkedIn miss
   │
   ▼
Re-read company hint from step 2
   │
   ▼
Direct company website lookup (team page, contact page, blog authors)
   │
   ▼
If found: derive email via Hunter pattern as before
If not found: surface the username + company hint to operator
              (they decide whether to engage publicly on Reddit)

About 1 in 5 no-LinkedIn cases still yields a verified email through this branch. The other 4 we surface as "engage on platform only" — comment publicly, build relationship, get the email later.
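The branch logic is simple enough to sketch as one routing function. The field names (linkedinMatch, companyHint, sitePerson) and route labels are hypothetical, chosen only to mirror the steps described above.

```javascript
// Sketch: decide which route a username takes after the LinkedIn search.
// Field names and route labels are hypothetical.
function resolveRoute({ linkedinMatch = null, companyHint = null, sitePerson = null }) {
  // Normal path: the data provider found the person.
  if (linkedinMatch) return { route: "email", person: linkedinMatch, via: "linkedin" };
  // Fallback path: company known, and the person appears on the company website.
  if (companyHint && sitePerson) return { route: "email", person: sitePerson, via: "website" };
  // Nothing resolvable: hand the username and any hint to the operator.
  return { route: "engage_on_platform", companyHint };
}
```

Both "email" routes continue into the same pattern-guess and verification steps; only the source of the person's name differs.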

A worked example.

Username: u/jane_ops_devops

Step 1: Comment history pull — 187 comments and posts, mostly in r/devops, r/kubernetes, r/sre. Several mentions of "our terraform setup" and one "our team at Pelican Logistics deals with this every week."

Step 2: LLM extracts company_name: "Pelican Logistics", role_hint: "ops/SRE", industry_hint: "logistics", confidence 0.82.

Step 3: Apollo search for "Jane" at "Pelican Logistics" returns one match: Jane Hartwell, Senior Site Reliability Engineer.

Step 4: Hunter pattern for pelicanlogistics.com returns {first}.{last}@{domain} at 94% confidence. Derived: jane.hartwell@pelicanlogistics.com.

Step 5: SMTP verify returns valid.

Total wall-clock: 14 seconds. Hit. The operator now has a verified email tied to a specific Reddit post the OP wrote 6 hours ago, with the original Reddit context preserved.

The cold message that follows is its own craft: the playbook is in the contextual cold message piece, and the multi-channel sequencing thinking is in the sequencing piece.

Where this stops working.

Privacy-pseudonymous accounts. A user who never names their employer in any comment, has no display name, and uses a generic username is essentially un-enrichable. About 12-18% of accounts in our data fall into this bucket. Engage them on platform only.

Tiny companies. If the OP works at a 4-person company that has no LinkedIn presence and runs on personal Gmail addresses, you'll burn through Hunter pattern guesses and get nothing. The fallback is a public comment and waiting for them to DM you.

Catch-all domains. Companies that accept email at any address make verification meaningless — every guess returns "accept_all." You can send, but bounce data won't tell you if your guess was right. Pair these with a LinkedIn message as a backstop.

Reddit's anti-scraping pushback. They've gotten more aggressive in the last 18 months. Use the official API with OAuth credentials, respect rate limits, and don't try to be clever with rotating user-agents — that's what gets your IP blocked.

The full pipeline this slots into is in the Reddit lead generation playbook. And the cost-vs-build analysis on whether to wire this yourself is in build vs buy sales intelligence.

Where Shadow Inbox fits.

The pipeline above is exactly what we run inside Shadow Inbox when an operator clicks "enrich" on a high-intent post. The branches, fallbacks, and verification all happen automatically and you get the verified email in the dashboard along with the original Reddit context. If you'd rather wire this yourself, the steps above are the recipe — and they're the recipe we'd use if we were building it again from scratch.

FAQ

What's the realistic end-to-end hit rate from username to verified email?
About 38 to 52 percent across the cohorts we've measured. The variance comes from how technical the audience is: r/devops users have noisy LinkedIn presences, while r/sales users are easier to find. We tell operators to expect 4 verified emails per 10 promising usernames as a working baseline.

What do we do when there's no LinkedIn at all?
We branch to comment-history scraping for company hints (employer mentions, project references, signature blocks in long-form replies). About 1 in 5 username-with-no-LinkedIn cases yields a company name from comment history. From there, the company website usually lists a team page or a generic contact, and you can pattern-guess from there.

Is comment-history scraping ToS-compliant?
Reading public comment history through Reddit's official API or public JSON endpoints is fine. What's not fine is impersonating users, scraping behind authentication you don't own, or republishing the data publicly. We treat enriched data as private to the operator who triggered the enrichment, and we don't retain it across customers.

Why use Hunter for the email pattern when Apollo also returns emails?
Apollo's emails are good but they're cached. Hunter's pattern engine plus SMTP verify gives you a fresher answer and tells you the confidence of the guess. We use Apollo for company/title resolution and Hunter for the actual email derivation. They complement each other rather than compete.

How much does the full pipeline cost per enriched contact?
Roughly 8 to 14 cents per fully-resolved contact when you're paying retail rates for Apollo + Hunter + a small LLM step. At volume you can negotiate that down, and Shadow Inbox pools the API costs across customers so the marginal cost per signal is closer to 3 cents. Either way, it's pennies relative to the value of a real conversation.