Claude Model Checker
Is your Claude endpoint actually Claude?
Many API providers and proxies claim to serve genuine Anthropic models but deliver cheaper substitutes without disclosure. Clarus runs 15 independent behavioral probes to detect model substitution, downgrading, and fraud, and reports a confidence level with every verdict.
15 Probes
3 Profiles
EN/RU Language
Check Your Endpoint →
How It Works
1
Configure Endpoint
Enter the API base URL and your key. Fetch the available models, pick one, then choose the API format and the expected model profile. Clarus verifies that the endpoint is reachable before starting.
2
Run 15 Probes
Behavioral, structural, and capability tests run in parallel — each targeting a different aspect of the model's identity. A pre-flight check confirms the model responds before the full suite begins.
3
Get Verdict
Receive an AUTHENTIC, DOWNGRADE, or FRAUD verdict with a substitution probability, a proxy-adjusted behavioral score, a confidence level, a per-probe breakdown, and human-readable signals.
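The three steps can be pictured with a small sketch: probes run concurrently, and their scores are folded into one label. Everything here is an assumption made for illustration, not Clarus's actual scoring: the `ProbeResult` type, the 0 to 1 score scale, the unweighted mean, and the 0.8 / 0.5 thresholds are all hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ProbeResult:
    name: str
    score: float  # assumed scale: 0.0 looks substituted, 1.0 looks genuine


async def run_probe(name: str, score: float) -> ProbeResult:
    # Stand-in for a real probe; a real one would call the endpoint here.
    await asyncio.sleep(0)
    return ProbeResult(name, score)


def verdict(results: list[ProbeResult]) -> tuple[str, float]:
    # Unweighted mean of probe scores; the thresholds are illustrative only.
    mean = sum(r.score for r in results) / len(results)
    if mean >= 0.8:
        label = "AUTHENTIC"
    elif mean >= 0.5:
        label = "DOWNGRADE"
    else:
        label = "FRAUD"
    return label, mean


async def run_suite(scores: dict[str, float]) -> tuple[str, float]:
    # gather() runs all probes concurrently and preserves input order.
    results = await asyncio.gather(*(run_probe(n, s) for n, s in scores.items()))
    return verdict(list(results))
```

Calling `asyncio.run(run_suite({"identity": 1.0, "timing": 0.9}))` returns the label together with the mean score. A real suite would likely weight probes differently and report per-probe signals as well.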
Probe Suite
Core Probes
Models List
Fetches /models and checks for official Claude IDs, suspicious custom names, and unusual model counts.
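A check like this could be approximated as follows. The sketch assumes that official IDs start with `claude-` and contain only lowercase letters, digits, dots, and hyphens; that pattern, the `max_expected` cutoff, and the function name are hypothetical heuristics rather than an authoritative registry.

```python
import re

# Assumed heuristic: official IDs look like "claude-<lowercase, digits, . or ->".
OFFICIAL_ID = re.compile(r"^claude-[a-z0-9.\-]+$")


def classify_model_ids(ids: list[str], max_expected: int = 50) -> dict:
    """Split a /models listing into plausible and suspicious IDs."""
    official = [m for m in ids if OFFICIAL_ID.match(m)]
    suspicious = [m for m in ids if not OFFICIAL_ID.match(m)]
    return {
        "official": official,
        "suspicious": suspicious,
        # An empty or very long listing is treated as unusual (assumed cutoff).
        "unusual_count": len(ids) == 0 or len(ids) > max_expected,
    }
```

For example, `classify_model_ids(["claude-example-model", "Claude_Custom"])` puts the placeholder ID `claude-example-model` under `official` and `Claude_Custom` under `suspicious`.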
System Prompt
Attempts to extract the system prompt and analyzes it for proxy personas, competitor names, or injection artifacts.
Identity
Asks direct identity questions. A genuine Claude model identifies itself consistently, while substitutes tend to slip when pressed.
Thinking
Tests extended and adaptive thinking capability. Claude 4 models have unique thinking behaviors that cheap substitutes can't replicate.
Timing
Measures response latency patterns. Proxy relays add detectable overhead; cached responses come back suspiciously fast.
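One way to turn raw latencies into signals is sketched below. The 50 ms "cache" floor and the 3× median outlier ratio are assumed values chosen for illustration; sensible thresholds depend on the model, prompt length, and network path.

```python
from statistics import median


def timing_signals(latencies_s: list[float],
                   fast_floor_s: float = 0.05,
                   slow_ratio: float = 3.0) -> dict:
    """Flag responses that are implausibly fast or far slower than typical."""
    mid = median(latencies_s)
    return {
        "median_s": mid,
        # Responses quicker than the floor may have come from a cache.
        "suspiciously_fast": [t for t in latencies_s if t < fast_floor_s],
        # Responses far above the median hint at relay or queueing overhead.
        "slow_outliers": [t for t in latencies_s if t > slow_ratio * mid],
    }
```

The median is used instead of the mean so that a single slow request does not shift the baseline.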
Refusal Style
Tests refusal phrasing and boundary patterns unique to Anthropic's Constitutional AI training. Different models refuse differently.
Knowledge
Verifies knowledge cutoff, context window size, and maximum output token limit match the claimed model's documented specs.
Encoding
Tests tokenization behavior and special encoding patterns. Different model families handle edge-case tokens differently.
Creative
Evaluates creative writing quality, vocabulary richness, and stylistic fingerprints that vary significantly across model tiers.
Capability Probes
Streaming
Analyzes SSE stream chunk patterns and timing. Proxy layers buffer and re-emit chunks in detectably different ways.
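Chunk timing can be summarized from the arrival timestamps of each SSE event. The sketch below is a hypothetical summary, not the probe's actual metric: it computes the mean gap, the coefficient of variation, and the share of near-simultaneous chunks, using an assumed 2 ms burst threshold.

```python
from statistics import mean, pstdev


def chunk_gap_stats(arrival_times_s: list[float],
                    burst_gap_s: float = 0.002) -> dict:
    """Summarize gaps between consecutive stream chunks."""
    gaps = [b - a for a, b in zip(arrival_times_s, arrival_times_s[1:])]
    if not gaps:
        raise ValueError("need at least two chunks")
    avg = mean(gaps)
    return {
        "mean_gap_s": avg,
        # High variation means uneven delivery, e.g. long stalls then bursts.
        "cv": pstdev(gaps) / avg if avg > 0 else 0.0,
        # Chunks arriving almost together suggest buffering and re-emission.
        "burst_fraction": sum(g <= burst_gap_s for g in gaps) / len(gaps),
    }
```

A stream that stalls and then delivers several chunks at once shows a high `cv` and a high `burst_fraction`, while evenly paced output keeps both close to zero.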
Tools
Tests function/tool calling behavior, JSON schema adherence, and tool-use phrasing. Substitutes often fail or format differently.
Context Window
Sends a prompt near the claimed context limit and verifies the model can actually process it, exposing silent truncation.
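A common way to expose silent truncation is a "needle" test: place a random code at the very start of a long prompt and ask for it at the end. The sketch below assumes roughly 4 characters per token, which is only a rough estimate; the filler text, function names, and wording of the question are hypothetical.

```python
import secrets


def build_needle_prompt(target_tokens: int,
                        chars_per_token: int = 4) -> tuple[str, str]:
    """Return (prompt, needle) with the needle placed before the filler."""
    needle = secrets.token_hex(8)  # random 16-character hex code
    header = f"Remember this code: {needle}.\n"
    question = "\nWhat was the code stated at the very beginning? Reply with the code only."
    filler_unit = "The quick brown fox jumps over the lazy dog. "
    # Character budget left for filler after the header and the question.
    budget = max(0, target_tokens * chars_per_token - len(header) - len(question))
    filler = (filler_unit * (budget // len(filler_unit) + 1))[:budget]
    return header + filler + question, needle


def recalled(response_text: str, needle: str) -> bool:
    # If the start of the prompt was truncated, the needle cannot be recalled.
    return needle in response_text
```

If a proxy drops the beginning of an over-long prompt, the model never sees the code and `recalled` returns `False`. A real test would size the prompt with the provider's own token counting rather than a character estimate.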
Model Specific
Tailored per-model tests: Opus 4.7 adaptive depth scaling, Opus 4.6 output limits, Sonnet 4.6 extended thinking blocks and self-knowledge.
Fingerprint
Deep behavioral fingerprinting across multiple independent dimensions: math style, logic patterns, formatting preferences.
Consistency
Sends the same prompt 5× at temperature 0 and measures Jaccard similarity between the responses. Proxy backends that rotate models produce inconsistent outputs.
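Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to lowercase, whitespace-split word sets and averages over every pair of responses; the tokenization and the pairwise averaging are assumptions, since the probe's exact method is not specified here.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty responses are treated as identical
    return len(sa & sb) / len(sa | sb)


def mean_pairwise_jaccard(outputs: list[str]) -> float:
    """Average Jaccard similarity over all pairs of responses."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        raise ValueError("need at least two outputs")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

Five identical responses score 1.0, and the score falls as responses share fewer words. For instance, `jaccard("a b c", "b c d")` is 0.5, because two of the four distinct words are shared.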
Ready to verify your endpoint?
Check Your Endpoint →