Shadow Inbox/blog

The 5-minute HN monitoring setup that replaces a $5K SDR tool

HackerNews monitoring with the Algolia API, a cron, and an optional LLM filter. The full setup costs nothing and replaces the $5K SDR tool you almost bought.

Arthur, Founder, Shadow Inbox
Published Mar 11, 2026 · 6 min read

Five minutes. One Algolia HTTP call. A cron job. An optional 30-line LLM classifier. That's the entire HN monitoring stack you need to replace the SDR intelligence tool your VP of Sales is about to expense for $5K a year.

We've built this exact setup for hundreds of HN keyword monitors inside Shadow Inbox. It's not complicated. The complicated part is knowing which keywords to monitor and what to do with the results, both of which we'll cover.

HackerNews has a free, undocumented-but-stable search API. Most of the SDR tool category is reselling it back to you with a logo on top.

The whole stack, in five steps.

1. Algolia HN search API      (free, no auth)
2. Keyword + filter spec      (JSON, version-controlled)
3. Cron job or webhook        (every 15 min)
4. LLM intent filter          (optional, $0.002/post)
5. Slack / email destination  (Resend, webhook, whatever)

The first three steps are 5 minutes. Steps 4 and 5 add another 20 minutes if you've never wired up an LLM call before. The whole thing fits in a single Vercel cron + edge function, or a node cron.js on a $5 VPS.

  • $0: cost of the Algolia HN API per request
  • 15 min: polling interval that catches every signal you care about
  • $0.002: cost per post if you add an LLM intent filter
  • 5 min: from zero to first alert in your inbox

Step 1: The Algolia HN search API.

HN runs on Algolia for search. The API is public, stable since 2014, and documented by Algolia rather than by HN itself. The base endpoint:

https://hn.algolia.com/api/v1/search_by_date

Two query types matter:

  • search — relevance ranked. Use for one-off queries.
  • search_by_date — chronological. Use for monitoring (you want new stuff, not popular stuff).

The full param list is on the Algolia HN search docs page. The ones you'll actually use:

  • query — your keyword string
  • tags — filter by post type: story, comment, ask_hn, show_hn, front_page
  • numericFilters — filter by timestamp, points, comment count
  • hitsPerPage — default 20, max 1000
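Putting those params together, a one-off query is a few lines of TypeScript. This is a sketch; `buildSearchUrl` is our own helper name, not part of the API.

```typescript
// Build a search_by_date URL for one keyword (buildSearchUrl is our name).
function buildSearchUrl(query: string, tags: string, maxAgeSec: number): URL {
  const url = new URL("https://hn.algolia.com/api/v1/search_by_date");
  url.searchParams.set("query", query);
  url.searchParams.set("tags", tags);
  // created_at_i is the post's Unix timestamp; keep only fresh posts.
  url.searchParams.set(
    "numericFilters",
    `created_at_i>${Math.floor(Date.now() / 1000) - maxAgeSec}`
  );
  return url;
}

// Last hour of "circleci alternative" comments: no auth, no API key.
const url = buildSearchUrl("circleci alternative", "comment", 3600);
console.log(url.origin + url.pathname);
// → https://hn.algolia.com/api/v1/search_by_date
// Then, on Node 18+: const { hits } = await (await fetch(url)).json();
```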

Step 2: The keyword + filter spec.

Don't hardcode keywords in your cron. Put them in a JSON file you can edit without redeploying.

{
  "monitors": [
    {
      "name": "self-hosted-analytics",
      "queries": ["plausible alternative", "self-hosted analytics", "umami vs"],
      "tags": "(story,ask_hn,comment)",
      "min_points": 1,
      "max_age_minutes": 30
    },
    {
      "name": "ci-pain",
      "queries": ["github actions slow", "circleci alternative"],
      "tags": "(comment)",
      "min_points": 0,
      "max_age_minutes": 60
    }
  ]
}

Two non-obvious choices baked into this shape:

Comments are first-class signals. Most monitoring tools only watch stories. The buyer who's actually frustrated is in the 47-comment Show HN thread saying "yeah we tried CircleCI and it was painful." Watch comments.

Max age, not min age. You want fresh posts, not popular ones. A post that was filtered out last poll cycle should not show up again 6 hours later when it hits the front page. Filter on created_at_i > now - 30min.

Step 3: The cron.

A 30-line Node script that hits the API, dedupes against a flat-file or Postgres seen_ids table, and pushes new hits to step 4.

// hn-monitor.ts — runs every 15 min.
// loadSeenIds/saveSeenIds/processHit are your own helpers: the first two
// persist already-alerted objectIDs (flat file or Postgres seen_ids table),
// processHit runs steps 4 and 5 (LLM filter, then Slack/email).
import { readFileSync } from "fs";

const config = JSON.parse(readFileSync("./monitors.json", "utf8"));
const seen = await loadSeenIds();
const now = Math.floor(Date.now() / 1000);

for (const monitor of config.monitors) {
  for (const q of monitor.queries) {
    const url = new URL("https://hn.algolia.com/api/v1/search_by_date");
    url.searchParams.set("query", q);
    url.searchParams.set("tags", monitor.tags);
    // Freshness filter: only posts newer than max_age_minutes.
    url.searchParams.set(
      "numericFilters",
      `created_at_i>${now - monitor.max_age_minutes * 60}`
    );
    const hits = (await fetch(url).then(r => r.json())).hits;
    for (const hit of hits) {
      if (seen.has(hit.objectID)) continue; // already alerted on this one
      await processHit(hit, monitor);
      seen.add(hit.objectID);
    }
  }
}

// Persist the dedupe set so the next run doesn't re-alert.
await saveSeenIds(seen);
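The dedupe side can start as a flat file. A minimal sketch of the load/save helpers (the names are ours; swap the file for a Postgres seen_ids table when volume grows):

```typescript
// Flat-file dedupe helpers (assumed names, assumed ./seen-ids.json path).
import { readFile, writeFile } from "fs/promises";

const SEEN_PATH = "./seen-ids.json";

async function loadSeenIds(): Promise<Set<string>> {
  try {
    return new Set(JSON.parse(await readFile(SEEN_PATH, "utf8")));
  } catch {
    return new Set(); // first run: no file yet
  }
}

async function saveSeenIds(seen: Set<string>): Promise<void> {
  await writeFile(SEEN_PATH, JSON.stringify([...seen]), "utf8");
}
```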

Stick that in a vercel.json cron, a GitHub Actions schedule, or a literal crontab -e entry. Whatever you have. Don't let infrastructure be the reason this doesn't ship.
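If you go the Vercel route, the cron entry is a few lines of vercel.json. The `/api/hn-monitor` path is an assumption; point it at whatever route wraps the script above.

```json
{
  "crons": [
    { "path": "/api/hn-monitor", "schedule": "*/15 * * * *" }
  ]
}
```

The same `*/15 * * * *` schedule string works verbatim in a `crontab -e` entry or a GitHub Actions `schedule` trigger.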

Step 4: The LLM intent filter.

The Algolia API returns posts that match your keywords. Most of those posts will not be buyers. They'll be devs evaluating, students researching, hobbyists nerd-sniping. You need an LLM filter or you'll send yourself junk.

The shape of the prompt:

You are filtering HackerNews posts for buying intent.
The user sells: {one-line product description}.
 
Read the post below. Answer in JSON:
{
  "intent": "buyer_now" | "buyer_soon" | "evaluating" | "off_topic",
  "confidence": 0.0-1.0,
  "evidence": "quoted span from the post that justifies the verdict"
}
 
Only mark buyer_now if the OP states they are deciding within 30 days
and have a current pain. Otherwise default to evaluating or off_topic.

Run this with Claude Sonnet 4.6 at temperature 0. Cost is roughly $0.002 per post for typical HN body length. For 50 posts a day, that's $3 a month.

The chain-of-thought trick (forcing evidence quotation before classification) cuts false positives roughly in half. We cover the full LLM intent classifier design in our buying intent piece.
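Whatever model you call, validate its JSON before trusting it. A small guard, sketched to match the verdict shape in the prompt above (`parseVerdict` is our name, not a library function):

```typescript
// Parse and validate the classifier's JSON verdict.
type Intent = "buyer_now" | "buyer_soon" | "evaluating" | "off_topic";
interface Verdict { intent: Intent; confidence: number; evidence: string; }

function parseVerdict(raw: string): Verdict | null {
  try {
    const v = JSON.parse(raw);
    const intents = ["buyer_now", "buyer_soon", "evaluating", "off_topic"];
    if (!intents.includes(v.intent)) return null;
    if (typeof v.confidence !== "number" || v.confidence < 0 || v.confidence > 1) return null;
    if (typeof v.evidence !== "string") return null;
    return v as Verdict;
  } catch {
    return null; // model returned non-JSON: drop the post or retry once
  }
}
```

A `null` here should fail closed: no alert, rather than a junk alert.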

Step 5: Push to where humans look.

Don't build a dashboard you'll never check. Push to where you already work.

  • Slack — incoming webhook, one channel per monitor. Format: title link, score, evidence quote, "reply" button.
  • Email — Resend or your SMTP. Daily digest at 8:30am with everything from the last 24 hours sorted by intent score.
  • Inbox — your existing CRM via Zapier or a direct API call.
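For the Slack option, the formatting step is a pure function. A sketch, assuming the Algolia field names (`objectID`, `title`, `story_title`) and our own `formatSlackPayload` name:

```typescript
// Turn one filtered hit into a Slack incoming-webhook payload.
function formatSlackPayload(
  hit: { objectID: string; title?: string; story_title?: string },
  verdict: { intent: string; evidence: string }
) {
  const title = hit.title ?? hit.story_title ?? "(comment)";
  const link = `https://news.ycombinator.com/item?id=${hit.objectID}`;
  return {
    text: `*${verdict.intent}* · <${link}|${title}>\n> ${verdict.evidence}`,
  };
}

// Sending it is one POST to the webhook URL you created in Slack:
// await fetch(process.env.SLACK_WEBHOOK_URL!, {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(formatSlackPayload(hit, verdict)),
// });
```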

The right destination depends on the decay curve. HN posts decay slower than Reddit (median useful-comment window is 18 hours, not 6), so a daily email digest works. If you're chasing breaking Show HN launches, use Slack.

The full multi-channel sequencing thinking — what to do once a signal lands somewhere — is in the multi-channel sequencing playbook.

A working example: monitoring CI/CD pain.

Suppose you sell a faster CI runner. Here's the actual config we'd run:

{
  "monitors": [
    {
      "name": "ci-frustration",
      "queries": [
        "github actions slow",
        "ci pipeline slow",
        "circleci alternative",
        "ci minutes expensive",
        "buildkite vs"
      ],
      "tags": "(story,comment,ask_hn)",
      "max_age_minutes": 30
    }
  ]
}

Five keywords, hits posts and comments, polled every 15 minutes. In a typical week this surfaces 8 to 15 raw matches, of which the LLM filter promotes 2 to 4 to buyer_now. That's a small enough volume that a founder can write personal replies without burning out.

The HN-specific intent playbook — Show HN, Ask HN, Who's Hiring, and the gold most people miss — is laid out in the HN intent piece. For comment-mining specifically, the HN comments piece covers the etiquette of replying without getting flamed.

When this stops working.

Three failure modes worth naming:

Your category has no shared vocabulary on HN. If your buyers don't use the words you'd search for, the keyword monitor finds nothing. Sell to non-technical buyers? HN isn't the channel.

You set keywords too broad. "API" matches 80 posts a day. "REST API rate limiting frustration" matches one post a month. Start narrow and broaden.

You don't act within 24 hours. HN threads die. If your monitor fires at 6pm Friday and you check Monday morning, the OP has moved on. At minimum, push the signal somewhere you'll see it on weekends if you care about weekend posts.

Where Shadow Inbox fits.

The setup above is the version you build yourself in an afternoon. Shadow Inbox is the version with the intent filter pre-built, the Slack/email destinations wired up, the dedupe and decay timer baked in, and the same pipeline running on Reddit (and soon LinkedIn, X, Quora). If you'd rather skip the wiring, that's what we do. If you'd rather own the stack, the JSON above is what we'd build again.

● FAQ

Why use the Algolia HN API instead of the official HN Firebase API?
Algolia gives you full-text search, filters, and tag-based queries in one HTTP call. The Firebase API gives you raw item IDs and you have to fetch each one, which means hundreds of requests per query. For monitoring keywords across all of HN, Algolia is the right tool. Use Firebase only when you need real-time submission events.
How often should we poll the API?
Every 15 minutes is plenty for HN. The site moves slower than Reddit and most posts have a multi-hour decay curve. Polling every minute is wasteful and you'll hit Algolia's rate limit eventually. If you need true realtime, the Firebase API has a /v0/updates endpoint, but for buyer-intent monitoring, 15-minute granularity is fine.
Will an LLM filter step add latency we care about?
Adding a Claude Sonnet 4.6 classifier on top of each result adds about 800ms per post and roughly $0.002 per post. For 50 posts a day that's pennies and the latency is invisible because you're already polling on a cron. Skip the LLM only if you're chasing sub-minute alerting.
What's the minimum keyword count to make this worthwhile?
One sharp keyword can be enough. We have customers monitoring a single competitor's product name and getting 4 to 8 high-intent posts a month. The cost of running this for one keyword is the same as for fifty, so over-monitor and let the LLM filter cut the noise.