Emcognito WebMail uses Siftfy to keep multi-domain inboxes clean.
Updated May 12, 2026
Emcognito WebMail is a webmail product built around a single unified inbox spanning many business domains. Every inbound message — from any of those domains — runs through POST /v1/predict before it ever reaches the user's notification path.
The product
Emcognito WebMail collapses many email addresses across many custom domains into one inbox for one human. Each incoming message displays a colour stripe in the margin keyed to the domain it landed on, and replies auto-select the matching From: address based on which domain received the original. Mail is DKIM-signed per domain.
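To make the reply behaviour concrete, here is a minimal sketch of From: selection keyed on the receiving domain. It is not the product's code: `pick_reply_from` and the `owned` mapping (domain to the user's address on that domain) are hypothetical names, and the example domains are placeholders.

```python
from email.utils import getaddresses
from typing import Iterable, Mapping, Optional


def pick_reply_from(
    recipient_headers: Iterable[str],
    owned: Mapping[str, str],
) -> Optional[str]:
    """Return the user's address for the first recipient domain they own.

    recipient_headers: raw To/Cc header values from the original message.
    owned: hypothetical mapping of lowercase domain -> From: address.
    """
    for _name, address in getaddresses(list(recipient_headers)):
        domain = address.rpartition("@")[2].lower()
        if domain in owned:
            return owned[domain]
    # No owned domain among the recipients; the caller picks a default.
    return None
```

A caller would pass the original message's To and Cc headers and fall back to the account's default address when the function returns `None`.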
The signup model is one paying user with their own DNS records pointing at the Emcognito MTA — see wm.emcognito.com for product details and pricing.
Why post-store classification
The webmail backend classifies after the MTA hands off to the storage layer, not during SMTP. Two reasons:
- The classifier is never on the SMTP critical path. A 10 ms hop from inside our cluster to api.siftfy.io is cheap; an SMTP-time call to the same place would couple our mail availability to a third party outside our incident perimeter.
- Fail-open is cheap. The message is already persisted before the classifier call. If Siftfy times out, the message stays deliverable and lands in Inbox with a verdict="unavailable" metadata tag. We'd rather let one borderline message through than drop real mail during a transient outage.
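The ordering described above (persist first, classify second, fail open) can be sketched in a few lines. This is an illustration under assumptions, not the production hook: `on_message_stored`, the dict-based `store`, and the injected `score` coroutine are all hypothetical stand-ins for the storage layer and the Siftfy call.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def on_message_stored(
    store: Dict[str, dict],
    msg_id: str,
    raw: str,
    score: Callable[[str], Awaitable[float]],
    timeout_seconds: float = 2.0,
) -> dict:
    # Step 1: persist before anything else, so no later failure can
    # lose the message. It starts out deliverable, in the inbox.
    record = {"raw": raw, "folder": "inbox", "verdict": None}
    store[msg_id] = record

    # Step 2: call the classifier with a deadline. Timeouts and
    # transport errors are both treated as "detector unavailable".
    try:
        probability = await asyncio.wait_for(score(raw), timeout=timeout_seconds)
    except Exception:
        record["verdict"] = "unavailable"  # fail open: stays in inbox
        return record

    record["probability"] = probability
    record["verdict"] = "scored"  # threshold bucketing happens elsewhere
    return record
```

Because the record is written before the classifier is awaited, the failure branch only has to annotate the message; it never has to recover it.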
Two thresholds, one model
Siftfy returns a calibrated probability between 0 and 1. The backend reads two thresholds from config:
- SPAM_AUTO_THRESHOLD: at or above this, the message moves to the Spam folder and skips push notifications. Production is currently set conservatively.
- SPAM_SUSPICIOUS_THRESHOLD: at or above this but below the auto threshold, the message stays in Inbox but carries a "suspicious" annotation visible in the message view. Useful for marketing-adjacent content that users want to see but also want flagged.
Both thresholds tune off the same calibrated probability — no retraining, no second model. That's the three-bucket pattern the rest of the docs describe, applied to email.
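As a small sketch of the three-bucket mapping, the function below turns one probability into a verdict and checks that the two thresholds are ordered sensibly. The function name and the threshold values in the usage note are illustrative assumptions; the production values are not stated here.

```python
def bucket(
    probability: float,
    auto_threshold: float,
    suspicious_threshold: float,
) -> str:
    """Map a calibrated spam probability onto spam / suspicious / clean."""
    # A suspicious threshold above the auto threshold would make the
    # middle bucket unreachable, so reject that configuration early.
    if not 0.0 <= suspicious_threshold <= auto_threshold <= 1.0:
        raise ValueError("expected 0 <= suspicious <= auto <= 1")
    if probability >= auto_threshold:
        return "spam"
    if probability >= suspicious_threshold:
        return "suspicious"
    return "clean"
```

With hypothetical thresholds of 0.9 and 0.6, `bucket(0.72, 0.9, 0.6)` lands in the suspicious bucket; raising or lowering either threshold moves the boundaries without touching the model.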
The integration, end-to-end
Three pieces of information persist on every classified message: the raw probability, the verdict bucket, and a spamModel tag of the form siftfy:<likelihood>. The provider tag is the cheapest insurance against a future provider swap — historical data stays interpretable even if we change classifiers.
```python
# Simplified from the production webmail backend. Two thresholds:
# SPAM_AUTO_THRESHOLD moves to the Spam folder; SPAM_SUSPICIOUS_THRESHOLD
# annotates the message for review but keeps it in Inbox.
# settings, repository, SpamDecision, preference_action and
# build_detector_text are application code not shown here.
import httpx


async def classify_message(user_id: str, msg_id: str) -> SpamDecision:
    msg = await repository.get_message(user_id, msg_id)
    if msg.get("folder") != "inbox":
        return SpamDecision(verdict="skipped")

    # User-level allow/block overrides the classifier. "Report spam" /
    # "Not spam" feedback writes a sender or domain preference that
    # short-circuits the API call entirely.
    if (pref := await preference_action(user_id, msg)) == "block":
        await repository.update_message_folder(user_id, msg_id, "spam")
        return SpamDecision(verdict="blocked", moved_to_spam=True)
    if pref == "allow":
        return SpamDecision(verdict="allowlisted")

    # Build the classifier input from the headers + body. Subject + From
    # + To + Served-To carry the spam signal almost as well as the body
    # itself for forwarded marketing campaigns.
    text = build_detector_text(msg)[:20_000]

    try:
        async with httpx.AsyncClient(timeout=settings.spam_detector_timeout_seconds) as client:
            resp = await client.post(
                f"{settings.spam_detector_url}/predict",
                headers={"X-API-Key": settings.spam_detector_api_key},
                json={"text": text},
            )
            resp.raise_for_status()
            payload = resp.json()
    except Exception as exc:
        # Fail open: store the message, mark detector unavailable, deliver.
        # Better to let one borderline message through than to drop legitimate
        # mail because of a transient outage.
        await repository.update_message_spam_metadata(
            user_id, msg_id, verdict="unavailable", error=str(exc)[:300]
        )
        return SpamDecision(verdict="unavailable", error=str(exc))

    probability = float(payload["spam_probability"])
    likelihood = payload.get("likelihood", "")

    if probability >= settings.spam_auto_threshold:
        verdict = "spam"
    elif probability >= settings.spam_suspicious_threshold:
        verdict = "suspicious"
    else:
        verdict = "clean"

    await repository.update_message_spam_metadata(
        user_id, msg_id,
        verdict=verdict,
        probability=probability,
        # Tag with provider so a future swap stays traceable in the data layer.
        model=f"siftfy:{likelihood}",
    )
    if verdict == "spam":
        await repository.update_message_folder(user_id, msg_id, "spam")
    return SpamDecision(verdict=verdict, probability=probability)
```

User feedback ("Report spam", "Not spam") writes to a per-user sender/domain allow-block table that overrides Siftfy's verdict for future mail from that sender. The classifier is the default; user preference is the escape hatch.
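One way the allow-block table could work is sketched below. Everything here is an assumption for illustration: the in-memory `Prefs` dict stands in for the real table, `record_feedback` and `lookup_preference` are hypothetical names, and the rule that an exact sender entry wins over a domain entry is a design choice of this sketch rather than documented behaviour.

```python
from email.utils import parseaddr
from typing import Dict, Optional, Tuple

# Hypothetical table: (user_id, kind, value) -> "allow" | "block",
# where kind is "sender" or "domain".
Prefs = Dict[Tuple[str, str, str], str]

FEEDBACK_TO_ACTION = {"report_spam": "block", "not_spam": "allow"}


def record_feedback(prefs: Prefs, user_id: str, from_header: str, feedback: str) -> None:
    """Store the user's verdict against the exact sender address."""
    _name, address = parseaddr(from_header)
    prefs[(user_id, "sender", address.lower())] = FEEDBACK_TO_ACTION[feedback]


def lookup_preference(prefs: Prefs, user_id: str, from_header: str) -> Optional[str]:
    """Return "allow", "block", or None when the classifier should decide."""
    _name, address = parseaddr(from_header)
    address = address.lower()
    domain = address.rpartition("@")[2]
    # Assumption: the more specific sender entry takes precedence.
    for key in ((user_id, "sender", address), (user_id, "domain", domain)):
        if key in prefs:
            return prefs[key]
    return None
```

A `None` result is the signal to fall through to the API call; any other result short-circuits it, which is what keeps user preference cheaper than a classification.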
What we got from it
- Single round trip. Spam classification is one HTTP POST inside the cluster. No queues, no async callback, no second model service to run.
- Calibrated thresholding. The two thresholds map cleanly onto product UX (move-to-Spam vs. flag-in-Inbox) without retraining or a custom decision layer.
- Provider-agnostic data layer. Tagging with siftfy:<likelihood> means a future swap to a different classifier (or a self-hosted one) doesn't lose the historical context.
- Build cost: half a day. Including config wiring, Kubernetes secret, threshold tuning, and the test shape that mocks the API response.
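To show why the provider prefix keeps historical rows interpretable, here is a minimal sketch of reading the tag back. `ModelTag` and `parse_model_tag` are hypothetical names, and the label "high" in the usage note is an invented example value, not a documented Siftfy likelihood.

```python
from typing import NamedTuple


class ModelTag(NamedTuple):
    provider: str
    label: str


def parse_model_tag(tag: str) -> ModelTag:
    """Split a stored "<provider>:<label>" tag into its two parts."""
    provider, separator, label = tag.partition(":")
    if not separator or not provider:
        raise ValueError(f"not a provider tag: {tag!r}")
    # The label may be empty: the backend writes "siftfy:" when the
    # response carries no likelihood field.
    return ModelTag(provider=provider, label=label)
```

For example, `parse_model_tag("siftfy:high").provider` returns `"siftfy"`, so a report over old messages can group or filter by provider after a swap instead of mixing scores from different classifiers.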
10,000 classifications / month free. /v1/predict reference, related patterns: contact forms, comments.