Emcognito WebMail uses Siftfy to keep multi-domain inboxes clean.

Updated May 12, 2026

Emcognito WebMail is a webmail product built around a single unified inbox spanning many business domains. Every inbound message — from any of those domains — runs through POST /v1/predict before it ever reaches the user's notification path.

Built by the same team. Disclosed up-front.

The product

Emcognito WebMail collapses many email addresses across many custom domains into one inbox for one human. Each incoming message displays a colour stripe in the margin keyed to the domain it landed on, and replies auto-select the matching From: address based on which domain received the original. Mail is DKIM-signed per domain.

The signup model is one paying user with their own DNS records pointing at the Emcognito MTA — see wm.emcognito.com for product details and pricing.

Why post-store classification

The webmail backend classifies after the MTA hands off to the storage layer, not during SMTP. Two reasons:

  • The classifier is never on the SMTP critical path. A 10 ms hop from inside our cluster to api.siftfy.io is cheap; an SMTP-time call to the same place would couple our mail availability to a third party outside our incident perimeter.
  • Fail-open is cheap. The message is already persisted before the classifier call. If Siftfy times out, the message stays deliverable and lands in Inbox with a verdict="unavailable" metadata tag — we'd rather let one borderline message through than drop real mail during a transient outage.
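The ordering above (persist first, classify second, notify last) can be sketched roughly as follows. This is an illustration under assumptions, not the production pipeline: `handle_stored_message`, the `classify` and `notify` callables, and the timeout value are all hypothetical.

```python
import asyncio

# Assumed timeout for illustration only; the real value lives in config.
CLASSIFY_TIMEOUT_SECONDS = 0.05


async def handle_stored_message(msg_id, classify, notify):
    """Run after the message is persisted. Never raises on classifier failure.

    `classify` and `notify` are hypothetical async callables standing in for
    the classifier call and the push-notification path.
    """
    try:
        verdict = await asyncio.wait_for(classify(msg_id), CLASSIFY_TIMEOUT_SECONDS)
    except Exception:
        # Fail open: the message is already stored, so it stays deliverable.
        verdict = "unavailable"
    # Only messages classified as spam skip the notification path.
    if verdict != "spam":
        await notify(msg_id)
    return verdict
```

Because the call happens after storage, a slow or failing classifier only changes the verdict tag; it cannot delay or reject the SMTP transaction.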

Two thresholds, one model

Siftfy returns a calibrated probability between 0 and 1. The backend reads two thresholds from config:

  • SPAM_AUTO_THRESHOLD — at or above this, the message moves to the Spam folder and skips push notifications. Production is currently set conservatively.
  • SPAM_SUSPICIOUS_THRESHOLD — between this and the auto threshold, the message stays in Inbox but carries a "suspicious" annotation visible in the message view. Useful for marketing-adjacent content that users want to see but want flagged.

Both thresholds tune off the same calibrated probability — no retraining, no second model. That's the three-bucket pattern the rest of the docs describe, applied to email.
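As a rough sketch, the three-bucket mapping is a pure function of the probability and the two thresholds. The function name `bucket` and the threshold values in the usage note are assumptions for illustration; the production values are not stated here.

```python
def bucket(probability: float, auto: float, suspicious: float) -> str:
    """Map a calibrated spam probability onto one of three verdict buckets."""
    if not 0.0 <= suspicious <= auto <= 1.0:
        raise ValueError("expected 0 <= suspicious <= auto <= 1")
    if probability >= auto:
        return "spam"        # moved to the Spam folder, no push notification
    if probability >= suspicious:
        return "suspicious"  # stays in Inbox with an annotation
    return "clean"
```

With hypothetical thresholds of 0.9 and 0.6, `bucket(0.7, auto=0.9, suspicious=0.6)` lands in the "suspicious" bucket. Changing either threshold changes the product behaviour without touching the model.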

The integration, end-to-end

Three pieces of information persist on every classified message: the raw probability, the verdict bucket, and a spamModel tag of the form siftfy:<likelihood>. The provider tag is the cheapest insurance against a future provider swap — historical data stays interpretable even if we change classifiers.
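One way to picture the persisted record is shown below. The class and helper names are hypothetical, and the likelihood value "high" in the usage note is a made-up example; only the three fields and the `siftfy:<likelihood>` tag shape come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpamMetadata:
    """Hypothetical shape of the per-message spam record."""
    probability: float  # raw calibrated probability from the classifier
    verdict: str        # verdict bucket, e.g. "spam", "suspicious", "clean"
    spam_model: str     # provider tag of the form "<provider>:<likelihood>"


def make_model_tag(provider: str, likelihood: str) -> str:
    """Build the provider tag stored alongside the verdict."""
    return f"{provider}:{likelihood}"


def provider_of(tag: str) -> str:
    """Recover the provider name from a stored tag."""
    return tag.partition(":")[0]
```

For example, `provider_of(make_model_tag("siftfy", "high"))` returns `"siftfy"`, so historical rows can still be grouped by classifier after a provider change.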

python
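# Simplified from the production webmail backend. Two thresholds:
# SPAM_AUTO_THRESHOLD moves to the Spam folder; SPAM_SUSPICIOUS_THRESHOLD
# annotates the message for review but keeps it in Inbox.
# repository, settings, SpamDecision, preference_action, and
# build_detector_text belong to the backend and are not shown here.
import httpx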

async def classify_message(user_id: str, msg_id: str) -> SpamDecision:
    msg = await repository.get_message(user_id, msg_id)
    if msg.get("folder") != "inbox":
        return SpamDecision(verdict="skipped")

    # User-level allow/block overrides the classifier. "Report spam" /
    # "Not spam" feedback writes a sender or domain preference that
    # short-circuits the API call entirely.
    if (pref := await preference_action(user_id, msg)) == "block":
        await repository.update_message_folder(user_id, msg_id, "spam")
        return SpamDecision(verdict="blocked", moved_to_spam=True)
    if pref == "allow":
        return SpamDecision(verdict="allowlisted")

    # Build the classifier input from the headers + body. Subject + From
    # + To + Served-To carry the spam signal almost as well as the body
    # itself for forwarded marketing campaigns.
    text = build_detector_text(msg)[:20_000]

    try:
        async with httpx.AsyncClient(timeout=settings.spam_detector_timeout_seconds) as client:
            resp = await client.post(
                f"{settings.spam_detector_url}/predict",
                headers={"X-API-Key": settings.spam_detector_api_key},
                json={"text": text},
            )
            resp.raise_for_status()
            payload = resp.json()
        # Parse inside the try block so a malformed response also fails open.
        probability = float(payload["spam_probability"])
        likelihood = payload.get("likelihood", "")
    except Exception as exc:
        # Fail open: store the message, mark detector unavailable, deliver.
        # Better to let one borderline message through than to drop legitimate
        # mail because of a transient outage.
        error = str(exc)[:300]
        await repository.update_message_spam_metadata(
            user_id, msg_id, verdict="unavailable", error=error
        )
        return SpamDecision(verdict="unavailable", error=error)

    if probability >= settings.spam_auto_threshold:
        verdict = "spam"
    elif probability >= settings.spam_suspicious_threshold:
        verdict = "suspicious"
    else:
        verdict = "clean"

    await repository.update_message_spam_metadata(
        user_id, msg_id,
        verdict=verdict,
        probability=probability,
        # Tag with provider so a future swap stays traceable in the data layer.
        model=f"siftfy:{likelihood}",
    )
    if verdict == "spam":
        await repository.update_message_folder(user_id, msg_id, "spam")
    return SpamDecision(verdict=verdict, probability=probability)

User feedback ("Report spam", "Not spam") writes to a per-user sender/domain allow-block table that overrides Siftfy's verdict for future mail from that sender. The classifier is the default; user preference is the escape hatch.
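A minimal sketch of such a lookup, assuming the table is loaded as a plain dict keyed by lowercase sender address or domain. The function name `lookup_preference`, the dict shape, and the rule that a sender-level entry beats a domain-level one are assumptions, not the production schema.

```python
from email.utils import parseaddr
from typing import Dict, Optional


def lookup_preference(prefs: Dict[str, str], from_header: str) -> Optional[str]:
    """Return "allow", "block", or None for the sender in a From: header.

    `prefs` maps a lowercase address or domain to "allow" or "block".
    Assumption: an exact sender match takes priority over a domain match.
    """
    address = parseaddr(from_header)[1].lower()
    domain = address.rpartition("@")[2]
    return prefs.get(address) or prefs.get(domain)
```

A `None` result means no preference exists and the message goes on to the classifier call.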

What we got from it

  • Single round trip. Spam classification is one HTTP POST from inside the cluster. No queues, no async callback, no second model service to run.
  • Calibrated thresholding. The two thresholds map cleanly onto product UX (move-to-Spam vs. flag-in-Inbox) without retraining or a custom decision layer.
  • Provider-agnostic data layer. Tagging with siftfy:<likelihood> means a future swap to a different classifier (or a self-hosted one) doesn't lose the historical context.
  • Build cost: half a day, including config wiring, a Kubernetes secret, threshold tuning, and a test that mocks the API response.

Try Siftfy free

10,000 classifications / month free. /v1/predict reference, related patterns: contact forms, comments.