A technical explanation of Verigin's detection pipeline, published false positive rates, confidence intervals, and the appeals process. We publish this because institutions can't buy what they can't audit.
Verigin runs two fundamentally different signal types on every piece of content. They are complementary: one is probabilistic, one is cryptographic. Neither is sufficient alone.
Statistical models analyze patterns in text, images, and video. Multiple independent detectors run in parallel — each with different strengths — and their outputs are reconciled into a single confidence score. This layer covers the vast majority of web content, which has no provenance metadata.
This layer is inherently probabilistic and can produce false positives; measured FP rates are published below.
The Coalition for Content Provenance and Authenticity standard embeds a cryptographically signed chain of custody directly into files — from the moment of capture to publication. Where a valid C2PA manifest exists, Verigin reads and validates the signature.
The signal is binary and certain when present: a validly signed manifest cannot produce a false positive. Coverage is growing as hardware adoption expands.
Two detection models run in parallel for every text submission. Results are blended 50/50 into a single AI probability score.
| Model | Provider | Strength | Weighting |
|---|---|---|---|
| Winston AI v2 | GoWinston.ai | GPT-4, Claude, Gemini, Llama outputs. Auto language detection — supports non-English content. Returns 0–100 (higher = more human). | 50% |
| GPTZero | GPTZero.me | Strong on academic and journalistic writing styles. Returns completely_generated_prob directly as the AI score. | 50% |
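As a concrete sketch, the 50/50 blend can be written in a few lines of Python. The function name and exact rescaling are illustrative, not Verigin's implementation: per the table, Winston's 0–100 score is higher-is-more-human and so must be inverted, while GPTZero's completely_generated_prob is already an AI probability in [0, 1].

```python
def blend_text_scores(winston_human_0_100: float, gptzero_ai_prob: float) -> float:
    """Blend the two text detectors 50/50 into one AI-probability score.

    Winston returns 0-100 where higher means more human, so it is
    inverted and rescaled to [0, 1]; GPTZero's completely_generated_prob
    is used directly.
    """
    winston_ai = 1.0 - winston_human_0_100 / 100.0
    return 0.5 * winston_ai + 0.5 * gptzero_ai_prob
```

With a Winston human-likeness of 32/100 (AI probability 0.68) and a GPTZero probability of 0.74, the blend is 0.71, matching the sample response shown later in this document.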
Before scoring, text is preprocessed to reduce false positives caused by quoted source material.
Two image detection models run in parallel. The final score is the maximum of both outputs — not the average. This is a deliberate design choice: the two models cover different attack surfaces, and averaging would suppress true positives.
| Model | Provider | Attack surface | Aggregation |
|---|---|---|---|
| AI-generated image detector | Sightengine | Fully synthetic images: Midjourney, DALL-E, Stable Diffusion, Firefly. Excels at detecting images generated entirely by AI. | max() |
| rd-context-img | Reality Defender | AI-modified real photographs: inpainting, generative fill, object removal, face swap. The only commercially available model with reliable detection of manipulated real photos. | max() |
Images can be submitted by URL (Sightengine fetches directly) or as a file upload (presigned S3 URL → Reality Defender async polling).
Both detection calls run concurrently. Typical latency: 400–900ms. Reality Defender uses async polling with a 30-second timeout.
Final ai_score = max(sightengine_score, reality_defender_score). Both individual scores are returned in verbose mode.
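The concurrent fan-out and max() aggregation described above can be sketched as follows. The two detector coroutines are hypothetical stand-ins for the real vendor calls (which would issue HTTP requests to Sightengine and poll Reality Defender); only the orchestration pattern is the point here.

```python
import asyncio

# Hypothetical stand-ins for the two vendor calls. Real implementations
# would perform network I/O; these return fixed scores for illustration.
async def sightengine_detect(image_url: str) -> float:
    await asyncio.sleep(0)   # placeholder for network latency
    return 0.12              # fully-synthetic-image score

async def reality_defender_detect(image_url: str) -> float:
    await asyncio.sleep(0)   # placeholder for async polling
    return 0.88              # AI-modified-photo score

async def score_image(image_url: str, timeout: float = 30.0) -> float:
    # Both detectors run concurrently; the whole fan-out is bounded by
    # a 30-second timeout, mirroring Reality Defender's polling limit.
    se_score, rd_score = await asyncio.wait_for(
        asyncio.gather(
            sightengine_detect(image_url),
            reality_defender_detect(image_url),
        ),
        timeout=timeout,
    )
    # max(), not mean(): the two models cover different attack surfaces,
    # so averaging would suppress a true positive caught by only one.
    return max(se_score, rd_score)
```

Note how max() preserves the Reality Defender hit (0.88) even though the Sightengine score (0.12) is low: averaging would have reported 0.50 and buried the signal.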
Result is cached by content hash for 24 hours. Repeated calls for the same image return the cached result instantly at no additional API cost.
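A minimal sketch of the 24-hour content-hash cache. SHA-256 keying and an in-memory dict are assumptions for illustration; the document does not specify the production hash or store.

```python
import hashlib
import time

_CACHE: dict[str, tuple[float, float]] = {}  # sha256 hex -> (score, stored_at)
TTL_SECONDS = 24 * 60 * 60                   # results are cached for 24 hours

def content_hash(data: bytes) -> str:
    """Key cache entries by a hash of the content itself."""
    return hashlib.sha256(data).hexdigest()

def cached_score(data: bytes, compute) -> float:
    """Return the cached score for identical content within 24 hours;
    otherwise call compute(data), store the result, and return it."""
    key = content_hash(data)
    now = time.time()
    hit = _CACHE.get(key)
    if hit is not None and now - hit[1] < TTL_SECONDS:
        return hit[0]          # cache hit: no additional API cost
    score = compute(data)      # cache miss: run the detection pipeline
    _CACHE[key] = (score, now)
    return score
```

Because the key is derived from the bytes, resubmitting the identical image returns instantly without re-invoking either detection API.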
The Coalition for Content Provenance and Authenticity (C2PA) is an open standard backed by Adobe, Google, Microsoft, the BBC, and 6,000+ member organizations. It enables cameras, editing software, and publishing platforms to embed a cryptographically signed chain of custody into a file.
Verigin verifies C2PA signatures using the official c2pa-python library (open-source, maintained by the Content Authenticity Initiative). Where a valid C2PA manifest exists, the validation result is returned in the c2pa field of the response.

Verigin returns detection signals with confidence intervals, not binary verdicts. This is a deliberate product design decision, not a hedge. Automated adverse actions based solely on a probabilistic score create legal liability and cause harm to falsely flagged individuals.
What Verigin returns
```json
{
  "content_type": "text",
  "ai_score": 0.71,
  "confidence": "moderate",
  "signals_elevated": 4,
  "signals_total": 7,
  "c2pa": null,
  "recommendation": "human_review",
  "verbose": {
    "winston_score": 0.68,
    "gptzero_score": 0.74,
    "model_agreement": "high",
    "language": "en",
    "char_count": 1842
  }
}
```
How to read this output: ai_score is the blended AI probability, confidence reflects agreement between the underlying models, and recommendation indicates the suggested action (human_review when the score crosses the 0.5 threshold).
These rates are measured on our internal test sets and updated quarterly. A false positive is a human-authored piece of content that Verigin scores above 0.5 (the threshold at which "human review" is recommended).
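Under that definition, the FP rate is simply the flagged fraction of a human-authored test set. A minimal sketch (the function name is illustrative):

```python
def false_positive_rate(human_scores: list[float], threshold: float = 0.5) -> float:
    """FP rate: fraction of known human-authored items whose score
    exceeds the threshold at which human review is recommended."""
    if not human_scores:
        raise ValueError("empty test set")
    flagged = sum(1 for s in human_scores if s > threshold)
    return flagged / len(human_scores)
```

For example, if 2 of 4 human-written samples score above 0.5, the measured FP rate for that slice is 50%.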
Last measured: March 2026 | Next update: June 2026
| Content type | FP Rate (current) | Public target | Notes |
|---|---|---|---|
| Text — standard English | 6.2% | < 5% | Journalism, blog, business writing. Improving with preprocessing updates. |
| Text — non-native English | 11.4% | < 8% | Elevated. Bias audit in progress — see below. Do not use for adverse screening of ESL writers without additional human review. |
| Text — academic / scientific | 8.1% | < 7% | Formal writing style overlaps with AI output patterns. Improving. |
| Images — AI-generated (synthetic) | 9.3% | < 12% | Strong performance on photorealistic AI images. Higher FP on stylized/illustrated content. |
| Images — AI-modified (deepfake) | 14.7% | < 18% | Manipulation detection is harder. Heavy compression and resaving reduce signal quality. |
AI detection models trained predominantly on native English text systematically flag non-native English writing at elevated false positive rates. This is the highest asymmetric risk in Verigin's methodology.
Affected populations include: ESL journalists and academics, neurodivergent writers whose prose patterns differ from training data, writers from non-Western academic traditions, and any writer whose style diverges significantly from mainstream Anglo-American conventions.
We are testing across non-native English writers, neurodivergent writing styles, and non-Western academic conventions. Results will be published regardless of findings, including if they are worse than the estimates above.
All false positive disclosures are segmented by writer type and writing context, not just content type, because aggregate rates can mask systematic bias against specific populations.
We are convening a bias mitigation advisory board that includes a computational linguistics researcher and a digital rights advocate. Advisory board members will be named publicly when confirmed.
Verigin has shared this methodology documentation proactively with the EFF and ACLU before any complaint is filed. We invite review and critique.
Any content that receives a Verigin score can be submitted for manual review. Human reviewers examine the raw signal data — not just the blended score — and issue a revised assessment if warranted.
Email appeals@verigin.ai with the content URL or hash and the original score. Include any context you believe is relevant (writing style, language, subject matter).
A trained reviewer examines the raw signal output from both detection models, the preprocessing log, and the content itself. The 48-hour SLA applies to Pro and Enterprise customers. Free tier: best effort.
If the appeal is upheld, a revised score is issued and the cached result is updated. The original score and the revised score are both retained in the audit log — Verigin does not delete evidence of errors.
Appeal volumes and uphold rates are reported in quarterly methodology updates. Systematic false positive patterns identified through appeals trigger model recalibration.
Transparency about limitations is as important as accuracy claims.
Heavily edited AI text, AI text run through paraphrasing tools, and AI content from models not in the training data may score as human. Detection is probabilistic and has known blind spots.
Verigin detects probable origin — not quality, accuracy, truthfulness, or editorial value. A human-written article can be false. An AI-generated article can be accurate. These are different questions.
Text submissions under 300 characters do not have enough signal for a reliable score. Results for short text are returned with a low-confidence flag.
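The short-text rule can be sketched as a simple guard. The helper name and return shape are hypothetical; the 300-character floor comes from the paragraph above.

```python
MIN_RELIABLE_CHARS = 300  # below this, there is not enough signal for a reliable score

def apply_short_text_flag(text: str, confidence: str) -> str:
    """Downgrade confidence to 'low' for submissions under the minimum
    character count; otherwise pass the model confidence through."""
    return "low" if len(text) < MIN_RELIABLE_CHARS else confidence
```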
Verigin is a triage tool that identifies content warranting closer review. It is not designed to make final decisions about employment, publication, or legal matters without human judgment.
We answer methodology questions directly. No sales process required.