Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Provides a comprehensive testing methodology for AI software, covering strategy design, unit, integration, and end-to-end tests with coverage and reporting g...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
The definitive testing methodology for AI agents. From test strategy to execution, coverage to reporting: everything you need to ship quality software.
Before writing a single test, design the strategy.
```yaml
project:
  name: ""
  type: web-app | api | mobile | library | cli | data-pipeline
  languages: [typescript, python, go, java]
  frameworks: [react, express, django, spring]

risk_profile:
  data_sensitivity: low | medium | high | critical  # PII, financial, health
  user_impact: internal | b2b | b2c | life-safety
  deployment_frequency: daily | weekly | monthly
  regulatory: [none, SOC2, HIPAA, PCI-DSS, GDPR]

test_scope:
  in_scope: []      # Features, services, components
  out_of_scope: []  # Explicitly excluded (with reason)

environments:
  dev: { url: "", db: "local" }
  staging: { url: "", db: "seeded" }
  prod: { url: "", smoke_only: true }
```
| Risk profile | Unit | Integration | E2E | Performance | Security | Accessibility |
|---|---|---|---|---|---|---|
| Internal tool | ✅ Core | ✅ API | ⚠️ Happy path | ❌ | ⚠️ Basic | ❌ |
| B2B SaaS | ✅ Full | ✅ Full | ✅ Critical flows | ✅ Load | ✅ OWASP Top 10 | ✅ WCAG AA |
| B2C high-traffic | ✅ Full | ✅ Full | ✅ Full | ✅ Stress + soak | ✅ Full | ✅ WCAG AA |
| Financial/Health | ✅ Full + mutation | ✅ Full + contract | ✅ Full + chaos | ✅ Full suite | ✅ Pen test | ✅ WCAG AAA |
```
      /    E2E    \     5-10%   → Critical user journeys only
     / Integration \    20-30%  → API contracts, service boundaries
    /  Unit Tests   \   60-70%  → Business logic, pure functions
```

- Anti-pattern: Ice cream cone → more E2E than unit tests. Slow, flaky, expensive. Fix by pushing test coverage DOWN the pyramid.
- Anti-pattern: Hourglass → lots of unit + E2E, no integration. Misses contract bugs between services.
Every unit test follows this structure:

```typescript
describe('PricingCalculator', () => {
  // Group by behavior, not by method
  describe('when customer has volume discount', () => {
    it('applies tiered pricing above threshold', () => {
      // ARRANGE → Set up the scenario
      const calculator = new PricingCalculator();
      const customer = createCustomer({ tier: 'enterprise', units: 150 });

      // ACT → Execute the behavior under test
      const price = calculator.calculate(customer);

      // ASSERT → Verify the outcome (ONE logical assertion)
      expect(price).toEqual({
        subtotal: 12000,
        discount: 1800, // 15% volume discount
        total: 10200,
      });
    });
  });
});
```
Format: `[unit] [scenario] [expected behavior]`

✅ Good:
- PricingCalculator applies 15% discount when units exceed 100
- UserService throws NotFoundError when user ID is invalid
- parseDate returns null for malformed ISO strings

❌ Bad: `test1`, `should work`, `calculates price`
- Business logic → pricing, rules, calculations, state machines
- Data transformations → parsers, formatters, serializers, mappers
- Edge cases → boundaries, null/undefined, empty collections, overflow
- Error handling → every catch block, every validation path
- Pure functions → easiest to test, highest ROI
- Framework internals (React rendering, Express routing)
- Simple getters/setters with no logic
- Third-party library behavior
- Implementation details (private methods, internal state)
| Dependency type | Strategy | Example |
|---|---|---|
| Database | Mock the repository/DAO | `jest.mock('./userRepo')` |
| HTTP API | Mock the client or use MSW | `msw.http.get('/api/users', ...)` |
| File system | Mock `fs` or use temp dirs | `jest.mock('fs/promises')` |
| Time/Date | Fake timers | `jest.useFakeTimers()` |
| Randomness | Seed or mock | `jest.spyOn(Math, 'random')` |
| Environment | Override env vars | `process.env.NODE_ENV = 'test'` |

Rule: Mock at boundaries, not internals. If you're mocking a class you own, your design might need refactoring.
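A minimal sketch of the mock-at-boundaries rule in Jest. The `./userRepo` / `./userService` module names and the `findById` signature are illustrative assumptions, not part of this package:

```typescript
// Hypothetical modules: ./userRepo is the boundary, ./userService is under test.
jest.mock('./userRepo'); // mock the boundary module, not internal classes

import { findById } from './userRepo';
import { getUserProfile } from './userService';

const mockedFindById = findById as jest.MockedFunction<typeof findById>;

test('returns a profile for a known user', async () => {
  // Control the boundary's behavior for this one scenario.
  mockedFindById.mockResolvedValue({ id: 'u1', name: 'Ada' });

  await expect(getUserProfile('u1')).resolves.toMatchObject({ name: 'Ada' });
});
```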
| Metric | Minimum | Good | Excellent |
|---|---|---|---|
| Line coverage | 70% | 85% | 95%+ |
| Branch coverage | 60% | 80% | 90%+ |
| Function coverage | 75% | 90% | 95%+ |
| Critical path coverage | 100% | 100% | 100% |

Warning: 100% coverage ≠ quality. Coverage measures what code ran, not what was verified. A test with no assertions has coverage but no value.
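If you use Jest, the "Good" column can be enforced mechanically: `coverageThreshold` fails the CI run whenever coverage drops below the floor. A sketch, where `./src/billing/` stands in for whatever your critical-path module is:

```typescript
// jest.config.ts — enforce the coverage floor in CI
import type { Config } from 'jest';

const config: Config = {
  collectCoverage: true,
  coverageThreshold: {
    global: { lines: 85, branches: 80, functions: 90 },
    // Hold critical paths to 100%, per the table above.
    './src/billing/': { lines: 100, branches: 100 },
  },
};

export default config;
```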
For every API endpoint, test:

```yaml
endpoint: POST /api/orders
tests:
  happy_path:
    - Valid request returns 201 with order ID
    - Response matches schema
    - Database record created correctly
    - Events/webhooks fired
  validation:
    - Missing required fields → 400 with field errors
    - Invalid data types → 400 with type errors
    - Business rule violations → 422 with explanation
  authentication:
    - No token → 401
    - Expired token → 401
    - Wrong role → 403
    - Valid token → proceeds
  edge_cases:
    - Duplicate request (idempotency) → same response
    - Concurrent requests → no race condition
    - Maximum payload size → 413 or graceful handling
    - Special characters in input → no injection
  error_handling:
    - Database down → 503 with retry hint
    - External service timeout → 504 or fallback
    - Rate limit exceeded → 429 with retry-after
```
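A sketch of the first two checklist groups as Supertest tests, assuming an Express-style `app` export; the payload shape and token handling are illustrative:

```typescript
import request from 'supertest';
import { app } from '../src/app'; // hypothetical app export

const token = 'test-jwt'; // stand-in; mint a real test token in setup

describe('POST /api/orders', () => {
  it('returns 201 with an order ID for a valid request', async () => {
    const res = await request(app)
      .post('/api/orders')
      .set('Authorization', `Bearer ${token}`)
      .send({ sku: 'sku_123', quantity: 2 }); // illustrative payload
    expect(res.status).toBe(201);
    expect(res.body.orderId).toEqual(expect.any(String));
  });

  it('returns 400 with field errors when required fields are missing', async () => {
    const res = await request(app)
      .post('/api/orders')
      .set('Authorization', `Bearer ${token}`)
      .send({});
    expect(res.status).toBe(400);
    expect(res.body.errors).toBeDefined();
  });
});
```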
When services communicate, test the contract:

```yaml
contract:
  consumer: order-service
  provider: payment-service
  interactions:
    - description: "Process payment"
      request:
        method: POST
        path: /payments
        body:
          amount: 99.99
          currency: USD
          order_id: "ord_123"
      response:
        status: 200
        body:
          payment_id: "pay_xxx"  # string, not null
          status: "completed"    # enum: completed|pending|failed

breaking_changes:  # NEVER do these without versioning
  - Remove a field from response
  - Change a field's type
  - Add a required field to request
  - Change the URL path
  - Change error response format
```
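The interaction above translates roughly to a Pact consumer test. This is a sketch assuming pact-js v3 (`@pact-foundation/pact`); the inline `fetch` stands in for your real payment client:

```typescript
import { PactV3, MatchersV3 } from '@pact-foundation/pact';

const { like } = MatchersV3;

const provider = new PactV3({ consumer: 'order-service', provider: 'payment-service' });

test('processes a payment', async () => {
  provider
    .uponReceiving('a payment request')
    .withRequest({
      method: 'POST',
      path: '/payments',
      body: { amount: 99.99, currency: 'USD', order_id: 'ord_123' },
    })
    .willRespondWith({
      status: 200,
      // Matchers pin the *type*: payment_id must be a string, never null.
      body: { payment_id: like('pay_xxx'), status: 'completed' },
    });

  await provider.executeTest(async (mockServer) => {
    // Point the client at the Pact mock server; real code would call your payment SDK.
    const res = await fetch(`${mockServer.url}/payments`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ amount: 99.99, currency: 'USD', order_id: 'ord_123' }),
    });
    expect((await res.json()).status).toBe('completed');
  });
});
```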
- Each test gets a clean state → use transactions that roll back, or truncate between tests
- Use factories, not fixtures → `createUser({ role: 'admin' })` beats hardcoded SQL dumps (see the factory sketch below)
- Test migrations → run migrate-up, migrate-down, migrate-up (roundtrip)
- Test constraints → unique violations, FK cascades, NOT NULL
- Test queries → especially complex JOINs, aggregations, window functions
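A minimal factory sketch, assuming `@faker-js/faker`; the `User` shape is illustrative. Defaults are realistic but random, and each test overrides only the fields it cares about:

```typescript
import { faker } from '@faker-js/faker';

interface User {
  id: string;
  email: string;
  role: 'member' | 'admin';
  plan: 'free' | 'enterprise';
}

// Build a valid user with random-but-realistic defaults; override per test.
export function createUser(overrides: Partial<User> = {}): User {
  return {
    id: faker.string.uuid(),
    email: faker.internet.email(),
    role: 'member',
    plan: 'free',
    ...overrides,
  };
}

// Usage: const admin = createUser({ role: 'admin', plan: 'enterprise' });
```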
Identify and test the flows that generate revenue or block users:

```yaml
critical_journeys:
  - name: "Sign up → First value"
    steps:
      - Visit landing page
      - Click sign up
      - Fill registration form
      - Verify email
      - Complete onboarding
      - Perform first key action
    max_duration: 3 minutes
  - name: "Purchase flow"
    steps:
      - Browse products
      - Add to cart
      - Enter shipping
      - Enter payment
      - Confirm order
      - Receive confirmation email
    max_duration: 2 minutes
  - name: "Login → Core task → Logout"
    steps:
      - Login (password + SSO + MFA variants)
      - Navigate to core feature
      - Complete primary workflow
      - Verify result
      - Logout
    max_duration: 1 minute
```
- Test user behavior, not implementation → click buttons by text/role, not by CSS class
- Use data-testid sparingly → only when no accessible selector exists
- Wait for state, not time → `waitFor(element)`, not `sleep(3000)`
- Isolate test data → each test creates its own users/data
- Run in CI with retries → 1 retry for flaky network; investigate if flake rate exceeds 5%
1. `getByRole('button', { name: 'Submit' })` → accessible, resilient
2. `getByLabelText('Email')` → form-specific, accessible
3. `getByText('Welcome back')` → content-based
4. `getByTestId('submit-btn')` → explicit test hook
5. `querySelector('.btn-primary')` → ❌ fragile, breaks on CSS changes
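Those principles together, as a short Playwright sketch against a hypothetical staging login page (URL and credentials are placeholders):

```typescript
import { test, expect } from '@playwright/test';

test('login → core task', async ({ page }) => {
  await page.goto('https://staging.example.com/login'); // placeholder URL

  // Select by accessible role and label, never by CSS class.
  await page.getByLabel('Email').fill('qa-user@example.com');
  await page.getByLabel('Password').fill('placeholder-password');
  await page.getByRole('button', { name: 'Sign in' }).click();

  // Wait for state, not time: the assertion retries until the element is visible.
  await expect(page.getByText('Welcome back')).toBeVisible();
});
```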
| Symptom | Likely cause | Fix |
|---|---|---|
| Passes locally, fails in CI | Timing/race condition | Add explicit waits; check CI resource limits |
| Fails intermittently | Shared state between tests | Isolate test data; reset state |
| Fails after deploy | Environment difference | Check env vars, API versions, feature flags |
| Fails at a specific time | Time-dependent logic | Mock dates/times; avoid time-sensitive assertions |
| Fails in parallel | Resource contention | Use unique ports/DBs per worker |

Rule: Quarantine flaky tests within 24 hours. A flaky test suite that everyone ignores is worse than no tests.
```yaml
performance_tests:
  smoke:
    vus: 5
    duration: 1m
    purpose: "Verify test works"
  load:
    vus: 100        # Expected concurrent users
    duration: 10m
    ramp_up: 2m
    purpose: "Normal traffic behavior"
    thresholds:
      p95_response: <500ms
      error_rate: <1%
  stress:
    vus: 300        # 3x expected load
    duration: 15m
    ramp_up: 5m
    purpose: "Find breaking point"
  soak:
    vus: 80
    duration: 2h
    purpose: "Memory leaks, connection exhaustion"
  spike:
    stages:
      - { vus: 50, duration: 2m }
      - { vus: 500, duration: 30s }  # Sudden spike
      - { vus: 50, duration: 2m }
    purpose: "Recovery behavior"
```
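The `load` stage above maps onto a k6 script roughly like this sketch (k6 scripts are JavaScript; the target URL is a placeholder, and the thresholds mirror the p95 < 500ms / error rate < 1% figures):

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 },  // ramp up to expected load
    { duration: '10m', target: 100 }, // hold normal traffic
    { duration: '1m', target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // p95 response < 500ms
    http_req_failed: ['rate<0.01'],   // error rate < 1%
  },
};

export default function () {
  const res = http.get('https://staging.example.com/api/health'); // placeholder URL
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
```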
| Metric | Web app | API | Background job |
|---|---|---|---|
| Response time (p50) | <200ms | <100ms | N/A |
| Response time (p95) | <1s | <500ms | N/A |
| Response time (p99) | <3s | <1s | N/A |
| Throughput | >100 rps | >500 rps | >1000/min |
| Error rate | <0.1% | <0.1% | <0.5% |
| CPU usage | <70% | <70% | <90% |
| Memory growth | <5%/hr | <2%/hr | <10%/hr |
```yaml
db_performance:
  query_tests:
    - name: "Dashboard aggregate query"
      baseline: 50ms
      max_acceptable: 200ms
      with_1M_rows: measure
      with_10M_rows: measure
  index_verification:
    - Run EXPLAIN ANALYZE on all critical queries
    - Verify no sequential scans on tables >10K rows
    - Check index usage statistics weekly
  connection_pool:
    - Test at max connections
    - Verify graceful handling when pool exhausted
    - Monitor connection wait time
```
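The query budget can be guarded by a test. A sketch using node-postgres with an illustrative query; wall-clock timing is noisy, so a check like this belongs in the dedicated performance stage against the seeded dataset, not the unit lane:

```typescript
import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.TEST_DATABASE_URL });

test('dashboard aggregate stays under the 200ms budget', async () => {
  const start = Date.now();
  // Illustrative stand-in for the real dashboard aggregate query.
  await pool.query('SELECT account_id, count(*) FROM events GROUP BY account_id');
  const elapsed = Date.now() - start;

  expect(elapsed).toBeLessThan(200); // max_acceptable from the plan above
});

afterAll(() => pool.end());
```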
```yaml
security_tests:
  A01_broken_access_control:
    - [ ] Horizontal privilege escalation (access other user's data)
    - [ ] Vertical privilege escalation (access admin functions)
    - [ ] IDOR (Insecure Direct Object References)
    - [ ] Missing function-level access control
    - [ ] CORS misconfiguration
  A02_cryptographic_failures:
    - [ ] Sensitive data in transit (TLS 1.2+)
    - [ ] Sensitive data at rest (encryption)
    - [ ] Password hashing (bcrypt/argon2, not MD5/SHA)
    - [ ] No secrets in code/logs/URLs
  A03_injection:
    - [ ] SQL injection (parameterized queries)
    - [ ] NoSQL injection
    - [ ] Command injection (OS commands)
    - [ ] XSS (stored, reflected, DOM-based)
    - [ ] Template injection (SSTI)
  A04_insecure_design:
    - [ ] Rate limiting on auth endpoints
    - [ ] Account lockout after N failures
    - [ ] CAPTCHA on public forms
    - [ ] Business logic abuse scenarios
  A05_security_misconfiguration:
    - [ ] Default credentials removed
    - [ ] Error messages don't leak stack traces
    - [ ] Security headers set (CSP, HSTS, X-Frame-Options)
    - [ ] Directory listing disabled
    - [ ] Unnecessary HTTP methods disabled
  A07_auth_failures:
    - [ ] Brute force protection
    - [ ] Session fixation
    - [ ] Session timeout
    - [ ] JWT validation (signature, expiry, issuer)
    - [ ] MFA bypass attempts
```
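A sketch of the first A01 item (horizontal privilege escalation) with Supertest; `loginAs`, the app export, and the seeded order ID are hypothetical test fixtures:

```typescript
import request from 'supertest';
import { app } from '../src/app';         // hypothetical app export
import { loginAs } from './helpers/auth'; // hypothetical test helper

// A01: user A must never be able to read user B's order.
test("denies access to another user's order", async () => {
  const tokenA = await loginAs('user-a@example.com');

  const res = await request(app)
    .get('/api/orders/ord_owned_by_user_b') // seeded to belong to user B
    .set('Authorization', `Bearer ${tokenA}`);

  // 403 or 404 are both acceptable; 200 means an IDOR hole.
  expect([403, 404]).toContain(res.status);
});
```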
Test every user input with:

```yaml
injection_payloads:
  sql: ["' OR 1=1--", "'; DROP TABLE users;--", "1 UNION SELECT * FROM users"]
  xss: ["<script>alert(1)</script>", "<img onerror=alert(1) src=x>", "javascript:alert(1)"]
  path_traversal: ["../../etc/passwd", "..\\..\\windows\\system32", "%2e%2e%2f"]
  command: ["; ls -la", "| cat /etc/passwd", "$(whoami)", "`id`"]

boundary_values:
  strings: ["", " ", "a" * 10000, null, undefined, "emoji: 🎯", "unicode: é à ü", "rtl: مرحبا"]
  numbers: [0, -1, 2147483647, -2147483648, NaN, Infinity, 0.1 + 0.2]
  arrays: [[], [null], Array(10000)]
  dates: ["1970-01-01", "2099-12-31", "invalid-date", "2024-02-29", "2023-02-29"]
```
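Payload tables plug directly into parameterized tests. A Jest `test.each` sketch against a hypothetical search endpoint: every payload should be rejected or safely escaped, never crash the server or leak a raw database error:

```typescript
import request from 'supertest';
import { app } from '../src/app'; // hypothetical app export

const sqlPayloads = ["' OR 1=1--", "'; DROP TABLE users;--", '1 UNION SELECT * FROM users'];

test.each(sqlPayloads)('search endpoint survives payload %s', async (payload) => {
  const res = await request(app).get('/api/search').query({ q: payload });

  expect(res.status).toBeLessThan(500);             // no crash
  expect(res.text).not.toMatch(/syntax error|sql/i); // no leaked DB errors
});
```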
| Need | JavaScript/TS | Python | Go | Java |
|---|---|---|---|---|
| Unit | Vitest / Jest | pytest | testing + testify | JUnit 5 |
| API | Supertest | httpx + pytest | net/http/httptest | RestAssured |
| E2E (browser) | Playwright | Playwright | chromedp | Selenium |
| Performance | k6 | Locust | vegeta | Gatling |
| Contract | Pact | Pact | Pact | Pact |
| Security | ZAP + custom | Bandit + custom | gosec | SpotBugs |
```yaml
pipeline:
  stage_1_fast:         # <2 min, blocks PR
    - Lint + type check
    - Unit tests
    - Security: dependency scan (npm audit / safety)
  stage_2_thorough:     # <10 min, blocks merge
    - Integration tests
    - Contract tests
    - Security: SAST scan
    - Coverage report + threshold check
  stage_3_confidence:   # <30 min, blocks deploy
    - E2E critical journeys
    - Visual regression (if applicable)
    - Security: container scan
  stage_4_post_deploy:  # After deploy to staging
    - Smoke tests against staging
    - Performance baseline check
    - Security: DAST scan (ZAP)
  stage_5_production:   # After prod deploy
    - Smoke tests (critical paths only)
    - Synthetic monitoring enabled
    - Canary metrics watching
```
```yaml
test_data_strategy:
  unit_tests:
    approach: factories            # Builder pattern, create exactly what you need
    example: "createUser({ role: 'admin', plan: 'enterprise' })"
  integration_tests:
    approach: seeded_database
    reset: per_test_suite          # Transaction rollback or truncate
    sensitive_data: anonymized     # Never use real PII
  e2e_tests:
    approach: api_setup            # Create data via API before test
    cleanup: after_each            # Delete created data
    isolation: unique_identifiers  # Timestamp or UUID in test data
  performance_tests:
    approach: representative_dataset
    volume: 10x_production         # Test with more data than prod
    generation: faker_libraries    # Realistic but synthetic
```
```yaml
metrics:
  test_suite_health:
    total_tests: 0
    passing: 0
    failing: 0
    skipped: 0  # >5% skipped = tech debt alarm
    flaky: 0    # >2% flaky = quarantine immediately
  coverage:
    line: "0%"
    branch: "0%"
    critical_paths: "0%"  # Must be 100%
  execution:
    unit_duration: "0s"         # Target: <30s
    integration_duration: "0s"  # Target: <5m
    e2e_duration: "0s"          # Target: <15m
    total_ci_time: "0s"         # Target: <20m
  defect_metrics:
    bugs_found_in_test: 0
    bugs_escaped_to_prod: 0
    escape_rate: "0%"  # Target: <5%
    mttr: "0h"         # Mean time to resolve
  trends:              # Track weekly
    new_tests_added: 0
    tests_deleted: 0   # Healthy deletion = removing redundant tests
    coverage_delta: "+0%"
    flake_rate_delta: "+0%"
```
| Dimension | Weight | Scoring |
|---|---|---|
| Test coverage | 20% | <60% = 0, 60-70% = 5, 70-80% = 10, 80-90% = 15, 90%+ = 20 |
| Critical path coverage | 20% | <100% = 0, 100% = 20 |
| Defect escape rate | 15% | >10% = 0, 5-10% = 5, 2-5% = 10, <2% = 15 |
| Test suite speed | 10% | >30m = 0, 20-30m = 3, 10-20m = 7, <10m = 10 |
| Flake rate | 10% | >5% = 0, 2-5% = 3, 1-2% = 7, <1% = 10 |
| Security test coverage | 10% | None = 0, Basic = 3, OWASP Top 10 = 7, Full = 10 |
| Documentation | 5% | None = 0, Basic = 2, Complete = 5 |
| Automation ratio | 10% | <50% = 0, 50-70% = 3, 70-90% = 7, 90%+ = 10 |

Scoring: 0-40 = 🔴 Critical | 41-60 = 🟡 Needs Work | 61-80 = 🟢 Good | 81-100 = 🏆 Excellent
```yaml
accessibility_checklist:
  level_a:   # Minimum compliance
    - [ ] All images have alt text
    - [ ] All form inputs have labels
    - [ ] Color is not the only visual indicator
    - [ ] Page has proper heading hierarchy (h1 → h2 → h3)
    - [ ] All functionality available via keyboard
    - [ ] Focus is visible and logical
    - [ ] No content flashes >3 times/second
  level_aa:  # Standard compliance (recommended)
    - [ ] Color contrast ratio ≥4.5:1 (normal text)
    - [ ] Color contrast ratio ≥3:1 (large text)
    - [ ] Text resizable to 200% without loss
    - [ ] Skip navigation links
    - [ ] Consistent navigation across pages
    - [ ] Error suggestions provided
    - [ ] ARIA landmarks for page regions

tools:
  - axe-core (automated, catches ~30% of issues)
  - Lighthouse accessibility audit
  - Manual keyboard navigation test
  - Screen reader testing (VoiceOver/NVDA)
```
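The automated axe-core pass covers roughly a third of these checks; keyboard and screen-reader testing stay manual. A sketch using `@axe-core/playwright` against a placeholder URL:

```typescript
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('dashboard has no WCAG A/AA violations', async ({ page }) => {
  await page.goto('https://staging.example.com/dashboard'); // placeholder URL

  // Scan the rendered page against the WCAG A and AA rule sets.
  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa'])
    .analyze();

  expect(results.violations).toEqual([]);
});
```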
```yaml
compatibility_tests:
  when_updating_api:
    - [ ] All existing fields still present in response
    - [ ] No field type changes (string → number)
    - [ ] New required request fields have defaults
    - [ ] Deprecated fields still work (with warning header)
    - [ ] Error format unchanged
    - [ ] Pagination behavior unchanged
    - [ ] Rate limits not reduced

versioning_strategy:
  - URL versioning: /v1/users, /v2/users
  - "Header versioning: Accept: application/vnd.api+json;version=2"
  - Sunset header for deprecated versions
  - Minimum 6-month deprecation notice
```
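One way to pin "fields still present, types unchanged" in code is to freeze the response schema, sketched here with zod; the order shape and route are illustrative. Parsing a live response with the frozen schema fails CI the moment a field disappears or changes type:

```typescript
import { z } from 'zod';
import request from 'supertest';
import { app } from '../src/app'; // hypothetical app export

// Frozen v1 contract: change it only by shipping a new API version.
const OrderV1 = z.object({
  orderId: z.string(),
  total: z.number(),
  status: z.enum(['pending', 'completed', 'failed']),
});

test('GET /v1/orders/:id still satisfies the v1 contract', async () => {
  const res = await request(app).get('/v1/orders/ord_123');
  expect(res.status).toBe(200);

  // Throws (failing the test) on missing fields or type drift.
  OrderV1.parse(res.body);
});
```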
```yaml
chaos_tests:
  network:
    - Service dependency goes down → graceful degradation?
    - Network latency increases 10x → timeout handling?
    - DNS resolution fails → fallback behavior?
  infrastructure:
    - Database primary fails → replica promotion?
    - Cache (Redis) goes down → DB fallback works?
    - Disk fills up → alerting + graceful failure?
  application:
    - Memory pressure → OOM handling?
    - CPU saturation → request queuing?
    - Certificate expiry → monitoring alert?
  data:
    - Corrupt message in queue → dead letter + alert?
    - Schema migration fails mid-way → rollback works?
    - Clock skew between services → idempotency holds?
```
1. Review requirements → identify test scenarios before code is written (shift-left)
2. Write test cases → cover happy path, edge cases, error cases, security
3. Review PR tests → are tests meaningful? Do they test behavior, not implementation?
4. Run full suite → unit + integration + E2E for affected areas
5. Report findings → use the test report template above
1. Write the failing test first → reproduce the bug as a test
2. Verify the fix makes the test pass → the test IS the proof
3. Check for regression → run related test suites
4. Add to the regression suite → bug tests prevent re-introduction
```yaml
weekly_review:
  monday:
    - Review flaky test quarantine → fix or delete
    - Check coverage trends → declining = tech debt
    - Review escaped defects → update test strategy
  friday:
    - Update test health dashboard
    - Clean up obsolete tests
    - Document new testing patterns discovered
    - Plan next week's testing focus
```
"Create test strategy for [project/feature]" β Full strategy brief "Write unit tests for [function/class]" β AAA pattern tests with edge cases "Test this API endpoint: [method] [path]" β Full API test checklist "Review these tests for quality" β Test code review with scoring "Generate performance test plan" β k6/Locust test design "Security test [feature/endpoint]" β OWASP-based test checklist "Create test report for [release]" β Formatted test report "What's our test health?" β Dashboard with metrics and recommendations "Find gaps in our test coverage" β Analysis with prioritized recommendations "Help debug this flaky test" β Root cause analysis with fix suggestions "Set up CI test pipeline" β Stage-by-stage pipeline config "Accessibility audit [page/component]" β WCAG checklist with findings