use case · headless cms

Headless CMS spam protection.

Updated May 12, 2026

Sanity, Contentful, Strapi, Hygraph — the modern headless CMS layer ships with great content modeling and zero spam classification. The moment you let users submit anything (comments, reviews, profile bios, support tickets) the spam shows up. One webhook hop through Siftfy lets you classify per-document before it goes live, without coupling your CMS to a specific anti-spam vendor.

Webhooks are the right boundary

Every modern headless CMS can fire a webhook on document create, update, and publish. That's the natural place to insert classification: after the user has submitted and, where your workflow allows, before the document is visible to the public. The classification result decides whether the document goes live, gets unpublished, or routes to a review queue. You write that decision logic in your own code, on your own infrastructure, in a language you already use. Nothing in the CMS changes beyond the webhook configuration.

The pattern below is a single edge function that accepts any headless-CMS payload, looks up which fields to classify per document type, sends them to /v1/predict, and applies the decision through your CMS's mutation API.

Generic webhook receiver

typescript
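// Generic webhook receiver for Sanity, Contentful, Strapi, Hygraph, and
// anything else that POSTs JSON on document changes. It assumes the payload
// exposes the document type, id, and fields at the top level; if your CMS
// wraps the document in an envelope, unwrap it before the lookups below.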
//
// Strategy: classify only documents of types where users supply the text
// (comments, reviews, profile bios). Skip your own editorial content.

const SPAM_THRESHOLD = 0.85;
const QUEUE_THRESHOLD = 0.5;
const SIFTFY_KEY = process.env.SIFTFY_KEY!;

// Map: document type -> array of field paths to concatenate and classify.
// Edit per project. Anything not in this map is passed through untouched.
const CLASSIFY_FIELDS: Record<string, string[]> = {
  comment: ["body", "authorName"],
  review:  ["body", "title"],
  profile: ["bio"],
};

export async function POST(req: Request): Promise<Response> {
  const doc = await req.json();
  const type = doc._type ?? doc.contentType ?? doc.kind;
  const id = doc._id ?? doc.id ?? doc.sys?.id;

  const fields = CLASSIFY_FIELDS[type];
  if (!fields) return Response.json({ skipped: true });

  const text = fields
    .map((p) => p.split(".").reduce((o, k) => o?.[k], doc))
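    // Keep non-empty strings only, so objects or numbers never get joined in.
    .filter((v): v is string => typeof v === "string" && v.trim() !== "")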
    .join("\n\n");
  if (!text) return Response.json({ ok: true });

  let probability = 0;
  try {
    const resp = await fetch("https://api.siftfy.io/v1/predict", {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-API-Key": SIFTFY_KEY },
      body: JSON.stringify({ text }),
      signal: AbortSignal.timeout(2000),
    });
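    if (resp.ok) {
      const result = await resp.json();
      // Only accept a numeric score; a malformed body leaves probability at 0.
      if (typeof result?.spam_probability === "number") {
        probability = result.spam_probability;
      }
    }
  } catch {
    // Fail open on timeouts and network errors: better to let one borderline
    // doc through than to block writes. Non-2xx responses also fail open.
  }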

  if (probability >= SPAM_THRESHOLD) {
    await mutateDoc(id, { unpublish: true, spam: true });
  } else if (probability >= QUEUE_THRESHOLD) {
    await mutateDoc(id, { needsReview: true });
  }

  return Response.json({ ok: true, type, id, probability });
}

// CMS-specific. Sanity: client.patch(id).set({...}).commit().
// Contentful: management API entry.unpublish() / patch.
// Strapi: admin REST PUT /content-manager/collection-types/...
async function mutateDoc(id: string, patch: object): Promise<void> {
  // ...
}

The CLASSIFY_FIELDS map is the only piece that changes per project — list the document types where users provide text, and which field paths to concatenate. Editorial content (your own posts, your own marketing pages) isn't in the map and is skipped. This keeps you from accidentally classifying a launch announcement as borderline because it uses urgent language.
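Field paths in the map can be dotted to reach nested fields. The sketch below shows one way that lookup and concatenation could work in isolation. The helper names getPath and extractText and the sample document are hypothetical, not part of any CMS SDK or of Siftfy.

```typescript
// Hypothetical helpers: resolve dotted field paths against a document and
// join the non-empty string values into a single text to classify.
function getPath(doc: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>((node, key) => {
    // Stop descending as soon as we hit something that is not an object.
    if (node !== null && typeof node === "object") {
      return (node as Record<string, unknown>)[key];
    }
    return undefined;
  }, doc);
}

function extractText(doc: unknown, paths: string[]): string {
  return paths
    .map((p) => getPath(doc, p))
    .filter((v): v is string => typeof v === "string" && v.trim() !== "")
    .join("\n\n");
}

// Sample document, invented for illustration only.
const sampleReview = {
  _type: "review",
  title: "Great product",
  body: "Arrived on time.",
  author: { name: "Sam", bio: "" },
};

// Missing paths and empty strings are skipped rather than treated as errors.
const reviewText = extractText(sampleReview, [
  "title",
  "body",
  "author.bio",
  "missing.field",
]);
// reviewText === "Great product\n\nArrived on time."
```

Skipping missing or empty fields keeps the receiver tolerant of partial payloads, at the cost of silently ignoring a mistyped path, so it may be worth logging when a configured path resolves to nothing.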

CMS-specific notes

  • Sanity. Configure under API → Webhooks. Set the webhook's GROQ filter so it only fires on document types you classify, and use the projection to trim the payload to the fields you need. The mutation step uses @sanity/client with a write token.
  • Contentful. Webhooks live under Settings → Webhooks. Use the Content Management API for the mutation step; Contentful expects entries to be unpublished before they can be deleted.
  • Strapi. Webhooks are configured in the admin panel; the payload includes the full entity. The mutation step uses Strapi's own REST endpoints with an admin API token.
  • Hygraph / GraphCMS. Webhooks under Project settings → Webhooks. The platform's mutation API is GraphQL — straightforward but typed.
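Because each CMS wraps the document differently, a small normalization step in front of the receiver can help. The sketch below is illustrative only: normalizePayload and NormalizedDoc are hypothetical names, and the payload shapes in the comments are assumptions about each vendor's default webhook body that you should check against a real delivery from your own project.

```typescript
// Hypothetical normalizer: map vendor-specific webhook envelopes to a common
// { type, id, doc } shape. The shapes below are assumptions, not guarantees.
interface NormalizedDoc {
  type: string | undefined;
  id: string | undefined;
  doc: Record<string, any>;
}

function normalizePayload(payload: Record<string, any>): NormalizedDoc {
  // Assumed Strapi-style envelope: { event, model, entry: { id, ... } }
  if (payload.entry && typeof payload.model === "string") {
    const rawId = payload.entry.id;
    return {
      type: payload.model,
      id: rawId != null ? String(rawId) : undefined,
      doc: payload.entry,
    };
  }
  // Assumed Hygraph-style envelope: { operation, data: { __typename, id, ... } }
  if (payload.data?.__typename) {
    return { type: payload.data.__typename, id: payload.data.id, doc: payload.data };
  }
  // Assumed Contentful-style entry: { sys: { id, contentType: { sys: { id } } }, fields }
  if (payload.sys?.contentType?.sys?.id) {
    return { type: payload.sys.contentType.sys.id, id: payload.sys.id, doc: payload };
  }
  // Flat payloads such as a Sanity document: { _type, _id, ... }
  return { type: payload._type, id: payload._id ?? payload.id, doc: payload };
}
```

With this in place, the keys in CLASSIFY_FIELDS need to match whatever type string each CMS reports (casing included), and field paths need to follow the nesting of the normalized doc. For a Contentful-style entry with locale-keyed fields, that could mean a path such as fields.body.en-US.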

Edge cases worth handling

  • Republish on edit. If a user edits a comment that was originally clean, the webhook fires again. Re-classify rather than caching the old score — edits are how spammers slip through.
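  • Multilingual content. The model is primarily English-trained, so treat scores on other languages with more caution. Raise the block threshold on document types that serve non-English markets, so uncertain scores land in the review queue instead of being unpublished outright.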
  • Review-queue UX. Use a CMS-native needsReview: true flag so editors filter the queue inside the CMS they already use, instead of bouncing into a separate tool.
  • Backfill. The first time you wire this up, run a one-off script over existing user-submitted documents to score the backlog. Most teams find a handful of historical spam documents that slipped through.
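
The multilingual advice above amounts to making thresholds a per-type setting rather than a pair of global constants. A minimal sketch, assuming the same 0.85 and 0.5 defaults as the receiver; the review_de type name, the 0.95 override, and the decide helper are invented for illustration.

```typescript
// Hypothetical per-type thresholds: defaults match the receiver's constants,
// and individual document types can override either value.
type Decision = "publish" | "review" | "block";

interface Thresholds {
  block: number;
  review: number;
}

const DEFAULT_THRESHOLDS: Thresholds = { block: 0.85, review: 0.5 };

// Example override for a made-up German-language review type.
const THRESHOLD_OVERRIDES: Record<string, Partial<Thresholds>> = {
  review_de: { block: 0.95 },
};

function decide(type: string, probability: number): Decision {
  // Types without an override fall back to the defaults.
  const t: Thresholds = { ...DEFAULT_THRESHOLDS, ...THRESHOLD_OVERRIDES[type] };
  if (probability >= t.block) return "block";
  if (probability >= t.review) return "review";
  return "publish";
}

// The same score is blocked for English reviews but only queued for review_de.
const englishDecision = decide("review", 0.9); // "block"
const germanDecision = decide("review_de", 0.9); // "review"
```

Raising only the block threshold keeps the review threshold unchanged, so borderline non-English documents still reach an editor instead of being published unchecked.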

Try it free

10,000 documents / month free. Read the /v1/predict reference, or peek at related use cases: comments, static sites, Ghost CMS.