← All skills
Tencent SkillHub Β· AI

Clarity Gate

Pre-ingestion verification for epistemic quality in RAG systems. Ensures documents are properly qualified before entering knowledge bases. Produces CGD (Clar...


Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

  • Target platform: OpenClaw
  • Install method: Manual import
  • Extraction: Extract archive
  • Prerequisites: OpenClaw
  • Primary doc: SKILL.md

Package facts

  • Download mode: Yavira redirect
  • Package format: ZIP package
  • Source platform: Tencent SkillHub
  • What's included: SKILL.md, scripts/claim_id.py, scripts/document_hash.py

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

  • Source: Tencent SkillHub
  • Verification: Indexed source record
  • Version: 2.1.3

Documentation


Clarity Gate v2.1

Purpose: Pre-ingestion verification system that enforces epistemic quality before documents enter RAG knowledge bases. Produces Clarity-Gated Documents (CGD) compliant with the Clarity Gate Format Specification v2.1.

Core Question: "If another LLM reads this document, will it mistake assumptions for facts?"

Core Principle: "Detection finds what is; enforcement ensures what should be. In practice: find the missing uncertainty markers before they become confident hallucinations."

What's New in v2.1

| Feature | Description |
|---------|-------------|
| Claim Completion Status | PENDING/VERIFIED determined by field presence (no explicit status field) |
| Source Field Semantics | Actionable source (PENDING) vs. what-was-found (VERIFIED) |
| Claim ID Format Guidance | Hash-based IDs preferred, collision analysis for scale |
| Body Structure Requirements | HITL Verification Record section mandatory when claims exist |
| New Validation Codes | E-ST10, W-ST11, W-HC01, W-HC02, E-SC06 (FORMAT_SPEC); E-TB01-07 (SOT validation) |
| Bundled Scripts | claim_id.py and document_hash.py for deterministic computations |

Specifications

This skill implements and references:

| Specification | Version | Location |
|---------------|---------|----------|
| Clarity Gate Format (Unified) | v2.1 | docs/CLARITY_GATE_FORMAT_SPEC.md |

Note: v2.0 unifies CGD and SOT into a single .cgd.md format. SOT is now a CGD with an optional tier: block.

Validation Codes

Clarity Gate defines validation codes for structural and semantic checks per FORMAT_SPEC v2.1:

HITL Claim Validation (§1.3.2-1.3.3)

| Code | Check | Severity |
|------|-------|----------|
| W-HC01 | Partial confirmed-by/confirmed-date fields | WARNING |
| W-HC02 | Vague source (e.g., "industry reports", "TBD") | WARNING |
| E-SC06 | Schema error in hitl-claims structure | ERROR |

Body Structure (§1.2.1)

| Code | Check | Severity |
|------|-------|----------|
| E-ST10 | Missing ## HITL Verification Record when claims exist | ERROR |
| W-ST11 | Table rows don't match hitl-claims count | WARNING |

SOT Table Validation (§3.1)

| Code | Check | Severity |
|------|-------|----------|
| E-TB01 | No ## Verified Claims section | ERROR |
| E-TB02 | Table has no data rows | ERROR |
| E-TB03 | Required columns missing | ERROR |
| E-TB04 | Column order wrong | ERROR |
| E-TB05 | Empty cell in required column | ERROR |
| E-TB06 | Invalid date format in Verified column | ERROR |
| E-TB07 | Verified date in future (beyond 24h grace) | ERROR |

Note: Additional validation codes may be defined in RFC-001 (clarification document) but are not part of the normative FORMAT_SPEC.

Bundled Scripts

This skill includes Python scripts for deterministic computations per FORMAT_SPEC.

scripts/claim_id.py

Computes stable, hash-based claim IDs for HITL tracking (per §1.3.4).

    # Generate claim ID
    python scripts/claim_id.py "Base price is $99/mo" "api-pricing/1"
    # Output: claim-75fb137a

    # Run test vectors
    python scripts/claim_id.py --test

Algorithm:
  1. Normalize text (strip + collapse whitespace)
  2. Concatenate with location using pipe delimiter
  3. SHA-256 hash, take first 8 hex chars
  4. Prefix with "claim-"

Test vectors:
  • claim_id("Base price is $99/mo", "api-pricing/1") → claim-75fb137a
  • claim_id("The API supports GraphQL", "features/1") → claim-eb357742
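The four algorithm steps can be sketched in a few lines of Python. This is a minimal illustration, not the bundled script: the function name `compute_claim_id`, the text-then-location ordering around the pipe, and the UTF-8 encoding are assumptions, so its output is not guaranteed to match the published test vectors.

```python
import hashlib
import re


def compute_claim_id(text: str, location: str, length: int = 8) -> str:
    """Hypothetical helper mirroring the four documented steps."""
    # Step 1: strip the ends and collapse internal whitespace runs.
    normalized = re.sub(r"\s+", " ", text.strip())
    # Step 2: join text and location with a pipe (ordering is an assumption).
    payload = f"{normalized}|{location}"
    # Step 3: SHA-256 over the UTF-8 bytes (encoding is an assumption),
    # keeping only the first `length` hex characters.
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # Step 4: add the "claim-" prefix.
    return f"claim-{digest[:length]}"


if __name__ == "__main__":
    print(compute_claim_id("Base price is $99/mo", "api-pricing/1"))
```

Because normalization happens before hashing, the same claim text with different spacing yields the same ID, while the same text at a different location yields a different one.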

scripts/document_hash.py

Computes document SHA-256 hash per FORMAT_SPEC §2.2-2.4 with full canonicalization.

    # Compute hash
    python scripts/document_hash.py my-doc.cgd.md
    # Output: 7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730

    # Verify existing hash
    python scripts/document_hash.py --verify my-doc.cgd.md
    # Output: PASS: Hash verified: 7d865e...

    # Run normalization tests
    python scripts/document_hash.py --test

Algorithm (per §2.2-2.4):
  1. Extract content between opening ---\n and <!-- CLARITY_GATE_END -->
  2. Remove document-sha256 line from YAML frontmatter ONLY (with multiline continuation support)
  3. Canonicalize:
      • Strip trailing whitespace per line
      • Collapse 3+ consecutive newlines to 2
      • Normalize final newline (exactly 1 LF)
      • UTF-8 NFC normalization
  4. Compute SHA-256

Cross-platform normalization:
  • BOM removed if present
  • CRLF to LF (Windows)
  • CR to LF (old Mac)
  • Boundary detection (prevents hash computation on content outside CGD structure)
  • Whitespace variations produce identical hashes (deterministic across platforms)
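The canonicalization step is what makes the hash stable across platforms. The sketch below covers only that step plus the final SHA-256. It deliberately omits boundary extraction and document-sha256 line removal, and the order in which the normalizations run is an assumption rather than something taken from the spec. The names `canonicalize` and `sha256_of_canonical` are illustrative.

```python
import hashlib
import re
import unicodedata


def canonicalize(text: str) -> str:
    """Illustrative canonicalization; the ordering of steps is an assumption."""
    # Remove a leading byte-order mark if present.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Normalize Windows (CRLF) and old Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Strip trailing whitespace on every line.
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    # Collapse runs of three or more newlines down to two.
    text = re.sub(r"\n{3,}", "\n\n", text)
    # End with exactly one LF.
    text = text.rstrip("\n") + "\n"
    # Apply Unicode NFC normalization.
    return unicodedata.normalize("NFC", text)


def sha256_of_canonical(text: str) -> str:
    """Hash the canonical form as UTF-8 and return lowercase hex."""
    return hashlib.sha256(canonicalize(text).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    windows_copy = "title: demo  \r\n\r\n\r\n\r\nbody\r\n"
    unix_copy = "title: demo\n\nbody\n"
    print(sha256_of_canonical(windows_copy) == sha256_of_canonical(unix_copy))
```

Two copies of a document that differ only in line endings, trailing spaces, or extra blank lines reduce to the same canonical string, so they produce the same digest.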

The Key Distinction

Existing tools like UnScientify and HedgeHunter (CoNLL-2010) detect uncertainty markers already present in text ("Is uncertainty expressed?"). Clarity Gate enforces their presence where epistemically required ("Should uncertainty be expressed but isn't?").

| Tool Type | Question | Example |
|-----------|----------|---------|
| Detection | "Does this text contain hedges?" | UnScientify/HedgeHunter find "may", "possibly" |
| Enforcement | "Should this claim be hedged but isn't?" | Clarity Gate flags "Revenue will be $50M" |

Critical Limitation

Clarity Gate verifies FORM, not TRUTH. This skill checks whether claims are properly marked as uncertain; it cannot verify if claims are actually true.

Risk: An LLM can hallucinate facts INTO a document, then "pass" Clarity Gate by adding source markers to false claims.

Solution: HITL (Human-In-The-Loop) verification is MANDATORY before declaring PASS.

When to Use

  • Before ingesting documents into RAG systems
  • Before sharing documents with other AI systems
  • After writing specifications, state docs, or methodology descriptions
  • When a document contains projections, estimates, or hypotheses
  • Before publishing claims that haven't been validated
  • When handing off documentation between LLM sessions

Relationship to Spec Suite

The 9 Verification Points guide semantic review: content quality checks that require judgment (human or AI). They answer questions like "Should this claim be hedged?" and "Are these numbers consistent?" When review completes, output a CGD file conforming to CLARITY_GATE_FORMAT_SPEC.md.

The C/S rules in CLARITY_GATE_FORMAT_SPEC.md validate file structure, not semantic content.

The connection:
  • Semantic findings (9 points) determine what issues exist
  • Issues are recorded in CGD state fields (clarity-status, hitl-status, hitl-pending-count)
  • State consistency is enforced by structural rules (C7-C10)

Example: If Point 5 (Data Consistency) finds conflicting numbers, you'd mark clarity-status: UNCLEAR until resolved. Rule C7 then ensures you can't claim REVIEWED while still UNCLEAR.

Epistemic Checks (Core Focus: Points 1-4)

1. HYPOTHESIS vs FACT LABELING

Every claim must be clearly marked as validated or hypothetical.

| Fails | Passes |
|-------|--------|
| "Our architecture outperforms competitors" | "Our architecture outperforms competitors [benchmark data in Table 3]" |
| "The model achieves 40% improvement" | "The model achieves 40% improvement [measured on dataset X]" |

Fix: Add markers: "PROJECTED:", "HYPOTHESIS:", "UNTESTED:", "(estimated)", "~", "?"

2. UNCERTAINTY MARKER ENFORCEMENT

Forward-looking statements require qualifiers.

| Fails | Passes |
|-------|--------|
| "Revenue will be $50M by Q4" | "Revenue is projected to be $50M by Q4" |
| "The feature will reduce churn" | "The feature is expected to reduce churn" |

Fix: Add "projected", "estimated", "expected", "designed to", "intended to"

3. ASSUMPTION VISIBILITY

Implicit assumptions that affect interpretation must be explicit.

| Fails | Passes |
|-------|--------|
| "The system scales linearly" | "The system scales linearly [assuming <1000 concurrent users]" |
| "Response time is 50ms" | "Response time is 50ms [under standard load conditions]" |

Fix: Add bracketed conditions: "[assuming X]", "[under conditions Y]", "[when Z]"

4. AUTHORITATIVE-LOOKING UNVALIDATED DATA

Tables with specific percentages and checkmarks look like measured data.

Red flag: Tables with specific numbers (89%, 95%, 100%) without sources

Fix: Add "(guess)", "(est.)", "?" to numbers. Add explicit warning: "PROJECTED VALUES - NOT MEASURED"

Data Quality Checks (Complementary: Points 5-7)

5. DATA CONSISTENCY

Scan for conflicting numbers, dates, or facts within the document.

Red flag: "500 users" in one section, "750 users" in another

Fix: Reconcile conflicts or explicitly note the discrepancy with explanation.

6. IMPLICIT CAUSATION

Claims that imply causation without evidence.

Red flag: "Shorter prompts improve response quality" (plausible but unproven)

Fix: Reframe as hypothesis: "Shorter prompts MAY improve response quality (hypothesis, not validated)"

7. FUTURE STATE AS PRESENT

Describing planned/hoped outcomes as if already achieved.

Red flag: "The system processes 10,000 requests per second" (when it hasn't been built)

Fix: Use future/conditional: "The system is DESIGNED TO process..." or "TARGET: 10,000 rps"

Verification Routing (Points 8-9)

8. TEMPORAL COHERENCE

Document dates and timestamps must be internally consistent and plausible.

| Fails | Passes |
|-------|--------|
| "Last Updated: December 2024" (when current is 2026) | "Last Updated: January 2026" |
| v1.0.0 dated 2024-12-23, v1.1.0 dated 2024-12-20 | Versions in chronological order |

Sub-checks:
  • Document date vs current date
  • Internal chronology (versions, events in order)
  • Reference freshness ("current", "now", "today" claims)

Fix: Update dates, add "as of [date]" qualifiers, flag stale claims

9. EXTERNALLY VERIFIABLE CLAIMS

Specific numbers that could be fact-checked should be flagged for verification.

| Type | Example | Risk |
|------|---------|------|
| Pricing | "Costs ~$0.005 per call" | API pricing changes |
| Statistics | "Papers average 15-30 equations" | May be wildly off |
| Rates/ratios | "40% of researchers use X" | Needs citation |
| Competitor claims | "No competitor offers Y" | May be outdated |

Fix options:
  • Add source with date
  • Add uncertainty marker
  • Route to HITL or external search
  • Generalize ("low cost" instead of "$0.005")
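The internal-chronology sub-check of Point 8 lends itself to a mechanical test. The following is a minimal sketch under stated assumptions: versions are dotted integers with an optional "v" prefix, dates are ISO YYYY-MM-DD strings, and the helper names are hypothetical rather than part of the skill.

```python
from datetime import date


def parse_version(version: str) -> tuple[int, ...]:
    """Turn 'v1.10.0' into (1, 10, 0) so versions sort numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def out_of_order_versions(history: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (older, newer) version pairs where the newer one has an earlier date."""
    ordered = sorted(history, key=lambda item: parse_version(item[0]))
    problems = []
    # Compare each version with its immediate successor.
    for (prev_version, prev_date), (cur_version, cur_date) in zip(ordered, ordered[1:]):
        if date.fromisoformat(cur_date) < date.fromisoformat(prev_date):
            problems.append((prev_version, cur_version))
    return problems


if __name__ == "__main__":
    print(out_of_order_versions([("v1.0.0", "2024-12-23"), ("v1.1.0", "2024-12-20")]))
```

Run on the failing example from the table, this flags the v1.0.0 / v1.1.0 pair, since the later version carries the earlier date.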

The Verification Hierarchy

    Claim Extracted --> Does Source of Truth Exist?
                                |
                +---------------+---------------+
               YES                              NO
                |                               |
        Tier 1: Automated               Tier 2: HITL
        Consistency & Verification      Two-Round Verification
                |                               |
          PASS / BLOCK          Round A → Round B → APPROVE / REJECT

Tier 1: Automated Verification

A. Internal Consistency
  • Figure vs. Text contradictions
  • Abstract vs. Body mismatches
  • Table vs. Prose conflicts
  • Numerical consistency

B. External Verification (Extension Interface)
  • User-provided connectors to structured sources
  • Financial systems, Git commits, CRM, etc.

Tier 2: Two-Round HITL Verification (MANDATORY)

Round A: Derived Data Confirmation
  • Claims from sources found in session
  • Human confirms interpretation, not truth

Round B: True HITL Verification
  • Claims needing actual verification
  • No source found, human's own data, extrapolations

CGD Output Format

When producing a Clarity-Gated Document, use this format per CLARITY_GATE_FORMAT_SPEC.md v2.1:

    ---
    clarity-gate-version: 2.1
    processed-date: 2026-01-12
    processed-by: Claude + Human Review
    clarity-status: CLEAR
    hitl-status: REVIEWED
    hitl-pending-count: 0
    points-passed: 1-9
    rag-ingestable: true  # computed by validator - do not set manually
    document-sha256: 7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730
    hitl-claims:
      - id: claim-75fb137a
        text: "Revenue projection is $50M"
        value: "$50M"
        source: "Q3 planning doc"
        location: "revenue-projections/1"
        round: B
        confirmed-by: Francesco
        confirmed-date: 2026-01-12
    ---

    # Document Title

    [Document body with epistemic markers applied]

    Claims like "Revenue will be $50M" become "Revenue is **projected** to be $50M *(unverified projection)*"

    ---

    ## HITL Verification Record

    ### Round A: Derived Data Confirmation
    - Claim 1 (source) ✓
    - Claim 2 (source) ✓

    ### Round B: True HITL Verification
    | # | Claim | Status | Verified By | Date |
    |---|-------|--------|-------------|------|
    | 1 | [claim] | ✓ Confirmed | [name] | [date] |

    <!-- CLARITY_GATE_END -->
    Clarity Gate: CLEAR | REVIEWED

Required CGD Elements (per spec):

  1. YAML frontmatter with all required fields:
      • clarity-gate-version: Tool version (no "v" prefix)
      • processed-date: YYYY-MM-DD format
      • processed-by: Processor name
      • clarity-status: CLEAR or UNCLEAR
      • hitl-status: PENDING, REVIEWED, or REVIEWED_WITH_EXCEPTIONS
      • hitl-pending-count: Integer ≥ 0
      • points-passed: e.g., 1-9 or 1-4,7,9
      • hitl-claims: List of verified claims (may be empty [])
  2. End marker (HTML comment + status line):

         <!-- CLARITY_GATE_END -->
         Clarity Gate: <clarity-status> | <hitl-status>

  3. HITL verification record (if status is REVIEWED)

Optional/Computed Fields:
  • rag-ingestable: Computed by validators, not manually set. Shows true only when CLEAR | REVIEWED with no exclusion blocks.
  • document-sha256: Required. 64-char lowercase hex hash for integrity verification. See spec §2 for computation rules.
  • exclusions-coverage: Optional. Fraction of body inside exclusion blocks (0.0–1.0).

Escape Mechanism: To write about markers like *(estimated)* without triggering parsing, wrap in backticks: `*(estimated)*`
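The rule for the computed rag-ingestable field (true only for CLEAR | REVIEWED with no exclusion blocks) and the two-line end marker are simple enough to sketch. The function names below are hypothetical, and a real validator may apply further checks from the spec that are not shown here.

```python
def compute_rag_ingestable(clarity_status: str, hitl_status: str,
                           has_exclusion_blocks: bool) -> bool:
    """True only for CLEAR | REVIEWED documents without exclusion blocks."""
    return (
        clarity_status == "CLEAR"
        and hitl_status == "REVIEWED"
        and not has_exclusion_blocks
    )


def end_marker(clarity_status: str, hitl_status: str) -> str:
    """Build the HTML comment plus the status line that closes a CGD."""
    return f"<!-- CLARITY_GATE_END -->\nClarity Gate: {clarity_status} | {hitl_status}"


if __name__ == "__main__":
    print(compute_rag_ingestable("CLEAR", "REVIEWED", False))
    print(end_marker("CLEAR", "REVIEWED"))
```

Deriving the flag from the other fields, rather than reading it from the frontmatter, is what keeps a hand-edited `rag-ingestable: true` from slipping through.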

Claim Completion Status (v2.1)

Claim verification status is determined by field presence, not an explicit status field:

| State | confirmed-by | confirmed-date | Meaning |
|-------|--------------|----------------|---------|
| PENDING | absent | absent | Awaiting human verification |
| VERIFIED | present | present | Human has confirmed |
| (invalid) | present | absent | W-HC01: partial fields |
| (invalid) | absent | present | W-HC01: partial fields |

Why no explicit status field? Field presence is self-enforcing: you can't accidentally set status without providing who/when.
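The four-row state table maps directly onto a small function. This is a sketch, assuming a claim is a plain dict keyed by the frontmatter field names and that an empty value counts as absent. The name `claim_state` is hypothetical.

```python
def claim_state(claim: dict) -> str:
    """Derive the claim state from field presence alone."""
    # Treating empty strings or None as "absent" is an assumption.
    has_confirmer = bool(claim.get("confirmed-by"))
    has_date = bool(claim.get("confirmed-date"))
    if has_confirmer and has_date:
        return "VERIFIED"
    if not has_confirmer and not has_date:
        return "PENDING"
    # Exactly one of the two fields is present: partial confirmation.
    return "W-HC01"


if __name__ == "__main__":
    print(claim_state({"id": "claim-75fb137a"}))
    print(claim_state({"confirmed-by": "Francesco", "confirmed-date": "2026-01-12"}))
    print(claim_state({"confirmed-by": "Francesco"}))
```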

Source Field Semantics (v2.1)

The source field meaning changes based on claim state:

| State | source Contains | Example |
|-------|-----------------|---------|
| PENDING | Where to verify (actionable) | "Check Q3 planning doc" |
| VERIFIED | What was found (evidence) | "Q3 planning doc, page 12" |

Vague source detection (W-HC02): Sources like "industry reports", "research", "TBD" trigger warnings.
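A vague-source check could be as simple as a lookup against known placeholder phrases. The sketch below uses only the three examples named above. The exact-match comparison, the case folding, and the name `is_vague_source` are assumptions, and the actual W-HC02 rule in the spec may be broader.

```python
# Only the three examples from the text; a real list would likely be longer.
VAGUE_SOURCES = {"industry reports", "research", "tbd"}


def is_vague_source(source: str) -> bool:
    """Return True when the source matches a known vague phrase (W-HC02)."""
    return source.strip().lower() in VAGUE_SOURCES


if __name__ == "__main__":
    for candidate in ("TBD", "industry reports", "Q3 planning doc, page 12"):
        print(candidate, "->", "W-HC02" if is_vague_source(candidate) else "ok")
```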

Claim ID Format (v2.1)

General pattern: claim-[a-z0-9._-]{1,64} (alphanumeric, dots, underscores, hyphens)

| Approach | Pattern | Example | Use Case |
|----------|---------|---------|----------|
| Hash-based (preferred) | claim-[a-f0-9]{8,} | claim-75fb137a | Deterministic, collision-resistant |
| Sequential | claim-[0-9]+ | claim-1, claim-2 | Simple documents |
| Semantic | claim-[a-z0-9-]+ | claim-revenue-q3 | Human-friendly |

Collision probability: At 1,000 claims with 8-char hex IDs: ~0.012%. For >1,000 claims, use 12+ hex characters.

Recommendation: Use hash-based IDs generated by scripts/claim_id.py for consistency and collision resistance.
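The quoted collision figure follows from the standard birthday approximation, p ≈ 1 − exp(−n(n−1)/(2N)), where N = 16^k is the number of possible k-character hex IDs. The sketch below reproduces it. The function name is illustrative, and the model assumes hash outputs are uniformly distributed.

```python
import math


def collision_probability(num_claims: int, hex_chars: int) -> float:
    """Birthday-bound estimate that at least two claim IDs collide."""
    id_space = 16 ** hex_chars
    pairs = num_claims * (num_claims - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / id_space)


if __name__ == "__main__":
    for chars in (8, 12):
        p = collision_probability(1000, chars)
        print(f"{chars} hex chars, 1,000 claims: {p * 100:.6f}%")
```

For 1,000 claims and 8 hex characters this gives roughly 0.0116%, consistent with the ~0.012% stated above. Moving to 12 characters multiplies the ID space by 16^4 = 65,536.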

Exclusion Blocks

When content cannot be resolved (no SME available, legacy prose, etc.), mark it as excluded rather than leaving it ambiguous:

    <!-- CG-EXCLUSION:BEGIN id=auth-legacy-1 -->
    Legacy authentication details that require SME review...
    <!-- CG-EXCLUSION:END id=auth-legacy-1 -->

Rules:
  • IDs must match: [A-Za-z0-9][A-Za-z0-9._-]{0,63}
  • No nesting or overlapping blocks
  • Each ID used only once
  • Requires hitl-status: REVIEWED_WITH_EXCEPTIONS
  • Must document exceptions-reason and exceptions-ids in frontmatter

Important: Documents with exclusion blocks are not RAG-ingestable. They're rejected entirely (no partial ingestion).

See CLARITY_GATE_FORMAT_SPEC.md §4 for complete rules.
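The first three rules (ID pattern, no nesting or overlap, unique IDs) can be checked with a single pass over the markers. This is a rough sketch, not the spec's validator: the marker regex and its whitespace tolerance are assumptions, the function name is hypothetical, and markers inside fenced code are not skipped.

```python
import re

ID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,63}")
# Assumed marker shape, based on the example block in the text.
MARKER = re.compile(r"<!--\s*CG-EXCLUSION:(BEGIN|END)\s+id=(\S+)\s*-->")


def check_exclusion_blocks(body: str) -> list[str]:
    """Return human-readable problems found in exclusion markers."""
    errors = []
    open_id = None   # ID of the block currently open, if any
    seen = set()     # IDs already used by a BEGIN marker
    for match in MARKER.finditer(body):
        kind, block_id = match.group(1), match.group(2)
        if kind == "BEGIN":
            if not ID_PATTERN.fullmatch(block_id):
                errors.append(f"invalid id: {block_id}")
            if block_id in seen:
                errors.append(f"duplicate id: {block_id}")
            if open_id is not None:
                errors.append(f"nested or overlapping block: {block_id} inside {open_id}")
            seen.add(block_id)
            open_id = block_id
        else:
            if open_id is None:
                errors.append(f"END without BEGIN: {block_id}")
            elif block_id != open_id:
                errors.append(f"mismatched END: expected {open_id}, got {block_id}")
            open_id = None
    if open_id is not None:
        errors.append(f"unclosed block: {open_id}")
    return errors


if __name__ == "__main__":
    sample = (
        "<!-- CG-EXCLUSION:BEGIN id=auth-legacy-1 -->\n"
        "Legacy authentication details that require SME review...\n"
        "<!-- CG-EXCLUSION:END id=auth-legacy-1 -->\n"
    )
    print(check_exclusion_blocks(sample))
```

The frontmatter requirements (hitl-status, exceptions-reason, exceptions-ids) are outside the scope of this sketch.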

SOT Validation

When validating a Source of Truth file, the skill checks both format compliance (per CLARITY_GATE_FORMAT_SPEC.md) and content quality (the 9 points).

Format Compliance (Structural Rules)

SOT documents are CGDs with a tier: block. They require a ## Verified Claims section with a valid table.

| Code | Check | Severity |
|------|-------|----------|
| E-TB01 | No ## Verified Claims section | ERROR |
| E-TB02 | Table has no data rows | ERROR |
| E-TB03 | Required columns missing (Claim, Value, Source, Verified) | ERROR |
| E-TB04 | Column order wrong (Claim not first or Verified not last) | ERROR |
| E-TB05 | Empty cell in required column | ERROR |
| E-TB06 | Invalid date format in Verified column | ERROR |
| E-TB07 | Verified date in future (beyond 24h grace) | ERROR |
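Four of these checks reduce to simple comparisons once the table header and the Verified cell have been extracted. The sketch below covers E-TB03, E-TB04, E-TB06, and E-TB07 only. The YYYY-MM-DD date format, the UTC interpretation, and the helper names are assumptions, and table parsing itself is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

REQUIRED_COLUMNS = ["Claim", "Value", "Source", "Verified"]


def check_columns(header: list[str]) -> Optional[str]:
    """Return E-TB03 or E-TB04 for a bad header, else None."""
    if any(column not in header for column in REQUIRED_COLUMNS):
        return "E-TB03"
    if header[0] != "Claim" or header[-1] != "Verified":
        return "E-TB04"
    return None


def check_verified_date(value: str, now: datetime) -> Optional[str]:
    """Return E-TB06 or E-TB07 for a bad Verified cell, else None."""
    try:
        # YYYY-MM-DD at UTC midnight is an assumed interpretation.
        verified = datetime.strptime(value.strip(), "%Y-%m-%d").replace(tzinfo=timezone.utc)
    except ValueError:
        return "E-TB06"
    # Dates up to 24 hours ahead of `now` are tolerated.
    if verified > now + timedelta(hours=24):
        return "E-TB07"
    return None


if __name__ == "__main__":
    now = datetime(2026, 1, 12, 12, 0, tzinfo=timezone.utc)
    print(check_columns(["Claim", "Value", "Source", "Verified"]))
    print(check_verified_date("2026-01-15", now))
```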

Content Quality (9 Points)

The 9 Verification Points apply to SOT content:

| Point | SOT Application |
|-------|-----------------|
| 1-4 | Check claims in ## Verified Claims are actually verified |
| 5 | Check for conflicting values across tables |
| 6 | Check claims don't imply unsupported causation |
| 7 | Check table doesn't state futures as present |
| 8 | Check dates are chronologically consistent |
| 9 | Flag specific numbers for external check |

SOT-Specific Requirements

  • Tier block required: SOT is a CGD with tier: block containing level, owner, version, promoted-date, promoted-by
  • Structured claims table: ## Verified Claims section with columns: Claim, Value, Source, Verified
  • Table outside exclusions: The verified claims table must NOT be inside an exclusion block
  • Staleness markers: Use [STABLE], [CHECK], [VOLATILE], [SNAPSHOT] in content
      • [STABLE]: Safe to cite without rechecking
      • [CHECK]: Verify before citing
      • [VOLATILE]: Changes frequently; always verify
      • [SNAPSHOT]: Point-in-time data; include date when citing

Output Format

After running Clarity Gate, report:

    ## Clarity Gate Results

    **Document:** [filename]
    **Issues Found:** [number]

    ### Critical (will cause hallucination)
    - [issue + location + fix]

    ### Warning (could cause equivocation)
    - [issue + location + fix]

    ### Temporal (date/time issues)
    - [issue + location + fix]

    ### Externally Verifiable Claims
    | # | Claim | Type | Suggested Verification |
    |---|-------|------|------------------------|
    | 1 | [claim] | Pricing | [where to verify] |

    ---

    ## Round A: Derived Data Confirmation
    - [claim] ([source])

    Reply "confirmed" or flag any I misread.

    ---

    ## Round B: HITL Verification Required
    | # | Claim | Why HITL Needed | Human Confirms |
    |---|-------|-----------------|----------------|
    | 1 | [claim] | [reason] | [ ] True / [ ] False |

    ---

    **Would you like me to produce an annotated CGD version?**

    ---

    **Verdict:** PENDING CONFIRMATION

Severity Levels

| Level | Definition | Action |
|-------|------------|--------|
| CRITICAL | LLM will likely treat hypothesis as fact | Must fix before use |
| WARNING | LLM might misinterpret | Should fix |
| TEMPORAL | Date/time inconsistency detected | Verify and update |
| VERIFIABLE | Specific claim that could be fact-checked | Route to HITL or external search |
| ROUND A | Derived from witnessed source | Quick confirmation |
| ROUND B | Requires true verification | Cannot pass without confirmation |
| PASS | Clearly marked, no ambiguity, verified | No action needed |

Quick Scan Checklist

| Pattern | Action |
|---------|--------|
| Specific percentages (89%, 73%) | Add source or mark as estimate |
| Comparison tables | Add "PROJECTED" header |
| "Achieves", "delivers", "provides" | Use "designed to", "intended to" if not validated |
| Checkmarks | Verify these are confirmed |
| "100%" anything | Almost always needs qualification |
| "Last Updated: [date]" | Check against current date |
| Version numbers with dates | Verify chronological order |
| "$X.XX" or "~$X" (pricing) | Flag for external verification |
| "averages", "typically" | Flag for source/citation |
| Competitor capability claims | Flag for external verification |
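Several rows of the checklist are plain text patterns, so a first pass can be automated before the judgment-based review. The sketch below covers five of them. The regular expressions are rough assumptions rather than patterns defined by the skill, and rows that need context (comparison tables, checkmarks, competitor claims) are left to a reviewer.

```python
import re

# (pattern, suggested action) pairs for the purely textual checklist rows.
QUICK_SCAN = [
    (re.compile(r"\b100%"), 'Qualify the "100%" claim'),
    (re.compile(r"\b\d+(?:\.\d+)?%"), "Add source or mark as estimate"),
    (re.compile(r"~?\$\d+(?:\.\d+)?"), "Flag pricing for external verification"),
    (re.compile(r"\b(?:averages|typically)\b", re.IGNORECASE), "Flag for source/citation"),
    (re.compile(r"\b(?:achieves|delivers|provides)\b", re.IGNORECASE),
     'Use "designed to" or "intended to" if not validated'),
]


def quick_scan(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, matched text, suggested action) for each hit."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, action in QUICK_SCAN:
            for match in pattern.finditer(line):
                findings.append((line_no, match.group(0), action))
    return findings


if __name__ == "__main__":
    sample = "The model achieves 89% accuracy.\nCosts ~$0.005 per call."
    for finding in quick_scan(sample):
        print(finding)
```

A hit is only a prompt for review: a percentage followed by a bracketed source would still match, so the results feed the checklist rather than replace it.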

What This Skill Does NOT Do

  • Does not classify document types (use Stream Coding for that)
  • Does not restructure documents
  • Does not add deep links or references
  • Does not evaluate writing quality
  • Does not check factual accuracy autonomously (requires HITL)

Related Projects

| Project | Purpose | URL |
|---------|---------|-----|
| Source of Truth Creator | Create epistemically calibrated docs | github.com/frmoretto/source-of-truth-creator |
| Stream Coding | Documentation-first methodology | github.com/frmoretto/stream-coding |
| ArXiParse | Scientific paper verification | arxiparse.org |

v2.1.3 (2026-03-02)

  • FIXED: document_hash.py now implements full FORMAT_SPEC §2.1-2.4 compliance
  • FIXED: Fence-aware end marker detection (Quine Protection per §2.3/§8.5)
  • FIXED: All 4 deployment copies converged to single canonical implementation
  • ADDED: canonicalize() function: trailing whitespace stripping, newline collapsing, NFC normalization
  • ADDED: YAML-aware document-sha256 removal with multiline continuation support (§2.2)
  • ADDED: Fence-tracking test vectors (7 new tests, 15 total)

v2.1.0 (2026-01-27)

  • ADDED: Claim Completion Status semantics (PENDING/VERIFIED by field presence)
  • ADDED: Source Field Semantics (actionable vs. what-was-found)
  • ADDED: Claim ID Format guidance with collision analysis
  • ADDED: Body Structure Requirements (HITL Verification Record mandatory when claims exist)
  • ADDED: New validation codes: E-ST10, W-ST11, W-HC01, W-HC02, E-SC06 (FORMAT_SPEC §1.2-1.3)
  • ADDED: Bundled scripts: claim_id.py, document_hash.py
  • UPDATED: References to FORMAT_SPEC v2.1
  • UPDATED: CGD output example to version 2.1

v2.0.0 (2026-01-13)

  • ADDED: agentskills.io compliant YAML frontmatter
  • ADDED: Clarity Gate Format Specification v2.0 compliance (unified CGD/SOT)
  • ADDED: SOT validation support with E-TB* error codes
  • ADDED: Validation rules mapping (9 points → rule codes)
  • ADDED: CGD output format template with <!-- CLARITY_GATE_END --> markers
  • ADDED: Quine Protection note (§2.3 fence-aware marker detection)
  • ADDED: Redacted Export feature (§8.11)
  • UPDATED: hitl-claims format to v2.0 schema (id, text, value, source, location, round)
  • UPDATED: End marker format to HTML comment style
  • UPDATED: Unified format spec v2.0 (single .cgd.md extension)
  • RESTRUCTURED: For multi-platform skill discovery

v1.6 (2025-12-31)

  • Added Two-Round HITL verification system
  • Round A: Derived Data Confirmation
  • Round B: True HITL Verification

v1.5 (2025-12-28)

  • Added Point 8: Temporal Coherence
  • Added Point 9: Externally Verifiable Claims

v1.4 (2025-12-23)

  • Added CGD annotation output mode

v1.3 (2025-12-21)

  • Restructured points into Epistemic (1-4) and Data Quality (5-7)

v1.2 (2025-12-21)

  • Added Source of Truth request step

v1.1 (2025-12-21)

  • Added HITL Fact Verification (mandatory)

v1.0 (2025-11)

  • Initial release with 6-point verification

Version: 2.1.3
Spec Version: 2.1
Author: Francesco Marinoni Moretto
License: CC-BY-4.0
