Clarus — Claude Model Checker

◈ Claude Model Checker

Is your Claude endpoint actually Claude?

Many API providers and proxies claim to serve genuine Anthropic models but deliver cheaper substitutes without disclosure. Clarus runs 15 independent behavioral probes to detect model substitution, downgrading, and fraud, and reports a confidence level alongside its verdict.

15 Probes

3 Profiles

EN/RU Language


⚙

Configure Endpoint

Enter the API base URL and your key. Fetch the available models, pick one, then choose the API format and the expected model profile. Clarus verifies that the endpoint is reachable before starting.


◈

Run 15 Probes

Behavioral, structural, and capability tests run in parallel — each targeting a different aspect of the model's identity. A pre-flight check confirms the model responds before the full suite begins.
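
The pre-flight-then-parallel flow described above could be sketched roughly as follows. This is a minimal illustration, not Clarus's actual implementation: `run_suite`, the `Probe` type, and the convention that each probe returns a score between 0 and 1 are all assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# Hypothetical probe type: an async callable returning a score in [0, 1].
Probe = Callable[[], Awaitable[float]]


async def run_suite(
    preflight: Callable[[], Awaitable[bool]],
    probes: Dict[str, Probe],
) -> Dict[str, Optional[float]]:
    # Stop early if the model does not respond at all.
    if not await preflight():
        raise RuntimeError("pre-flight check failed: model did not respond")

    names = list(probes)
    # Launch every probe concurrently; exceptions are returned, not raised,
    # so one failing probe does not cancel the others.
    results = await asyncio.gather(
        *(probes[name]() for name in names), return_exceptions=True
    )
    # A probe that raised is recorded as None (no score available).
    return {
        name: (None if isinstance(result, BaseException) else result)
        for name, result in zip(names, results)
    }
```

Returning `None` for a failed probe keeps "no data" distinct from "low score", which matters if the scores are later aggregated.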


✔

Get Verdict

Receive an AUTHENTIC, DOWNGRADE, or FRAUD verdict with a substitution probability, a proxy-adjusted behavioral score, a confidence level, a per-probe breakdown, and human-readable signals.

⊞

Models List

Fetches /models and checks for official Claude IDs vs suspicious custom names and unusual model counts.
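
As a rough sketch of this kind of check, the snippet below sorts a list of model IDs into official-looking and suspicious names. The `claude-` prefix pattern, the `max_expected` threshold, and the function name are assumptions for illustration; the rules Clarus actually applies are not documented here.

```python
import re
from typing import Dict, List

# Assumption: official IDs look like "claude-" followed by lowercase
# letters, digits, dots, or hyphens. Real naming rules may differ.
OFFICIAL_ID = re.compile(r"^claude-[a-z0-9.\-]+$")


def classify_model_ids(ids: List[str], max_expected: int = 50) -> Dict[str, object]:
    official = [m for m in ids if OFFICIAL_ID.match(m)]
    suspicious = [m for m in ids if not OFFICIAL_ID.match(m)]
    return {
        "official": official,
        "suspicious": suspicious,
        # An empty list or a very long one is treated as unusual
        # (hypothetical threshold).
        "unusual_count": len(ids) == 0 or len(ids) > max_expected,
    }
```

A name such as `gpt-proxy-fast` would land in `suspicious`, while an ID like `claude-sonnet-4-6` (an illustrative value) would be counted as official-looking.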

◙

System Prompt

Attempts to extract the system prompt and analyzes it for proxy personas, competitor names, or injection artifacts.

◈

Identity

Asks direct identity questions. Genuine Claude knows exactly who it is; substitutes tend to reveal themselves under pressure.

◉

Thinking

Tests extended and adaptive thinking capability. Claude 4 models have distinctive thinking behaviors that cheaper substitutes struggle to replicate.

◷

Timing

Measures response latency patterns. Proxy relays add detectable overhead; cached responses come back suspiciously fast.
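
A latency check along these lines might look like the sketch below. The thresholds `fast_s` and `slow_s` and both function names are hypothetical; sensible values depend on the model, prompt size, and network path.

```python
import time
from statistics import median
from typing import Callable, List


def measure_latencies(call: Callable[[], object], runs: int = 5) -> List[float]:
    # Time each call with a monotonic high-resolution clock.
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return samples


def timing_flags(latencies_s: List[float], fast_s: float = 0.05,
                 slow_s: float = 10.0) -> List[str]:
    if not latencies_s:
        return ["no_samples"]
    flags = []
    # A response faster than any plausible generation suggests caching.
    if min(latencies_s) < fast_s:
        flags.append("suspiciously_fast")
    # A high median suggests relay overhead (or simply a slow network).
    if median(latencies_s) > slow_s:
        flags.append("high_median_latency")
    return flags
```

Here `call` would wrap a single request to the endpoint; the flags are signals to weigh with other probes rather than proof on their own.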

✔

Refusal Style

Tests refusal phrasing and boundary patterns unique to Anthropic's Constitutional AI training. Different models refuse differently.

◙

Knowledge

Verifies knowledge cutoff, context window size, and maximum output token limit match the claimed model's documented specs.

⌘

Encoding

Tests tokenization behavior and special encoding patterns. Different model families handle edge-case tokens differently.

◉

Creative

Evaluates creative writing quality, vocabulary richness, and stylistic fingerprints that vary significantly across model tiers.
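
One simple proxy for vocabulary richness is the type-token ratio: distinct words divided by total words. The sketch below is an illustrative stand-in, since the metrics Clarus actually uses are not specified; the tokenization rule is an assumption.

```python
import re


def type_token_ratio(text: str) -> float:
    # Assumption: a "word" is a run of ASCII letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)
```

The ratio falls as texts get longer, so it is only meaningful when comparing responses of similar length.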

⚡

Streaming

Analyzes SSE stream chunk patterns and timing. Proxy layers buffer and re-emit chunks in detectably different ways.
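
To illustrate the idea, the sketch below takes chunk arrival timestamps and summarizes the gaps between them. A buffering layer would be expected to show a long pause followed by a burst of near-simultaneous chunks. The `burst_eps_s` threshold and the function name are assumptions, not Clarus's actual parameters.

```python
from typing import Dict, List


def chunk_gap_stats(arrivals_s: List[float],
                    burst_eps_s: float = 0.002) -> Dict[str, float]:
    # Gaps between consecutive chunk arrival times, in seconds.
    gaps = [b - a for a, b in zip(arrivals_s, arrivals_s[1:])]
    if not gaps:
        return {"gaps": 0, "burst_fraction": 0.0, "max_gap_s": 0.0}
    # Chunks that arrive almost back to back count as part of a burst.
    burst = sum(1 for g in gaps if g < burst_eps_s)
    return {
        "gaps": len(gaps),
        "burst_fraction": burst / len(gaps),
        "max_gap_s": max(gaps),
    }
```

The timestamps would be recorded with a monotonic clock as each SSE event is read from the response stream.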

⚙

Tools

Tests function/tool calling behavior, JSON schema adherence, and tool-use phrasing. Substitutes often fail or format differently.
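
A minimal sketch of a schema-adherence check is shown below. It covers only required fields and basic types from a JSON-Schema-style object and is not a full validator; the function name and the problem messages are hypothetical.

```python
import json
from typing import Any, Dict, List

# Mapping from JSON Schema type names to Python types.
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_tool_args(raw_args: str, schema: Dict[str, Any]) -> List[str]:
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError:
        return ["arguments are not valid JSON"]
    if not isinstance(args, dict):
        return ["arguments are not a JSON object"]

    problems = []
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required field: {name}")

    properties = schema.get("properties", {})
    for name, value in args.items():
        type_name = properties.get(name, {}).get("type")
        expected = _TYPES.get(type_name)
        if expected is None:
            continue  # Unknown field or untyped property: not checked here.
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if not isinstance(value, expected) or is_bool_as_number:
            problems.append(f"wrong type for {name}")
    return problems
```

An empty list means the arguments passed these basic checks; anything else is a signal that the model formats tool calls differently from what the schema asks for.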

▤

Context Window

Sends a prompt near the claimed context limit and verifies the model can actually process it, exposing silent truncation.
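
One common way to expose silent truncation is a "needle" test: place a fact at the very start of a long prompt and ask for it at the end. The sketch below is an assumption about how such a probe could be built, not a description of Clarus's internals. It sizes the prompt in characters for simplicity, whereas real context limits are measured in tokens.

```python
def build_needle_prompt(target_chars: int, needle: str) -> str:
    # The needle goes first, so a backend that drops early context loses it.
    head = f"Remember this code: {needle}\n"
    filler_line = "The quick brown fox jumps over the lazy dog.\n"
    tail = "\nWhat was the code stated at the very beginning? Reply with the code only."
    # Fill the remaining budget with neutral filler text.
    budget = target_chars - len(head) - len(tail)
    n_lines = max(0, budget // len(filler_line))
    return head + filler_line * n_lines + tail


def needle_recalled(response: str, needle: str) -> bool:
    # The probe passes only if the exact code appears in the reply.
    return needle in response
```

If the endpoint accepts the request but the reply does not contain the needle, the prompt was likely truncated or the effective context window is smaller than claimed.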

◈

Model Specific

Tailored per-model tests: Opus 4.7 adaptive depth scaling, Opus 4.6 output limits, Sonnet 4.6 extended thinking blocks and self-knowledge.

⇌

Fingerprint

Deep behavioral fingerprinting across multiple independent dimensions: math style, logic patterns, formatting preferences.

⇄

Consistency

Sends the same prompt 5× at temperature 0 and measures the Jaccard similarity between the responses. Proxy backends that rotate models produce inconsistent outputs.
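
Jaccard similarity is the size of the intersection of two sets divided by the size of their union. A minimal sketch of the comparison, assuming responses are compared as sets of lowercase whitespace-separated words (the tokenization Clarus uses is not stated):

```python
from itertools import combinations
from typing import List


def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # Two empty responses are treated as identical.
    return len(sa & sb) / len(sa | sb)


def mean_pairwise_jaccard(outputs: List[str]) -> float:
    # Average similarity over every unordered pair of responses.
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

Five responses give ten pairs. A mean close to 1.0 indicates stable output, while a noticeably lower value suggests that different models or settings answered different requests.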