Requirements

- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Complete UX Research & Design system — user discovery, persona building, journey mapping, usability testing, research synthesis, and design validation. Zero...
Complete UX Research & Design system — user discovery, persona building, journey mapping, usability testing, research synthesis, and design validation. Zero...
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

Fresh install:

> I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.

Upgrade:

> I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Complete UX research methodology — from discovery to validated design decisions. No scripts, no APIs, no dependencies. Pure agent skill.
project: "[Product/Feature Name]" research_question: "[What do we need to learn?]" business_context: objective: "[Business goal this research supports]" decision: "[What decision will this research inform?]" stakeholders: ["PM", "Design Lead", "Engineering"] deadline: "YYYY-MM-DD" scope: product_area: "[Feature/flow being studied]" user_segment: "[Who are we studying?]" geographic: "[Regions/markets]" methodology: "[See selection matrix below]" sample_size: "[See calculator below]" timeline: planning: "Week 1" recruiting: "Week 1-2" fieldwork: "Week 2-3" analysis: "Week 3-4" reporting: "Week 4" budget: participant_incentives: "$X" tools: "$X" total: "$X" success_criteria: - "[Specific insight we need]" - "[Confidence level required]" - "[Actionable output format]"
| Method | Best For | Sample Size | Time | Cost | Confidence |
|---|---|---|---|---|---|
| User Interviews | Deep "why" understanding, exploring unknowns | 5-15 | 2-4 weeks | $$ | High (qualitative) |
| Usability Testing | Finding interaction problems, validating flows | 5-8 per round | 1-2 weeks | $$ | High (behavioral) |
| Surveys | Quantifying attitudes, measuring satisfaction | 100-400+ | 1-2 weeks | $ | High (statistical) |
| Card Sorting | Information architecture, navigation labels | 15-30 (open), 30+ (closed) | 1 week | $ | Medium |
| Diary Studies | Long-term behavior, context of use | 10-15 | 2-6 weeks | $$$ | High (longitudinal) |
| A/B Testing | Comparing specific design variants | 1000+ per variant | 1-4 weeks | $ | Very High |
| Contextual Inquiry | Understanding real environment, workflows | 4-8 | 2-3 weeks | $$$ | Very High |
| Tree Testing | Validating IA without visual design | 50+ | 1 week | $ | High |
| First-Click Testing | Navigation effectiveness | 30-50 | 1 week | $ | Medium |
| Concept Testing | Early-stage idea validation | 8-15 | 1-2 weeks | $$ | Medium |
| Heuristic Evaluation | Expert review of existing UI | 3-5 evaluators | 2-3 days | $ | Medium |
| Competitive UX Audit | Understanding market standards | N/A | 1 week | $ | Low-Medium |
```
Do you know WHAT the problem is?
│
├── NO → Generative Research
│   ├── Need context? → Contextual Inquiry
│   ├── Need attitudes? → User Interviews
│   ├── Need behaviors over time? → Diary Study
│   └── Need broad patterns? → Survey (exploratory)
│
└── YES → Evaluative Research
    ├── Have a prototype/product?
    │   ├── YES → Usability Testing
    │   │   ├── Early concept → Concept Test (paper/low-fi)
    │   │   ├── Key flow → Task-based Usability Test
    │   │   └── Comparing options → A/B Test
    │   └── NO →
    │       ├── Testing IA → Card Sort / Tree Test
    │       └── Testing content → First-Click Test
    └── Need expert opinion fast? → Heuristic Evaluation
```
Qualitative (interviews, usability):

- 5 users find ~85% of usability issues (Nielsen)
- 8-12 for thematic saturation in interviews
- 15+ for diverse populations or complex domains
- Rule: keep going until you hear the same things 3x

Quantitative (surveys):

| Population | 90% Confidence ±5% | 95% Confidence ±5% | 99% Confidence ±5% |
|---|---|---|---|
| 100 | 74 | 80 | 87 |
| 500 | 176 | 217 | 285 |
| 1,000 | 214 | 278 | 399 |
| 10,000 | 264 | 370 | 622 |
| 100,000+ | 271 | 384 | 660 |

A/B Tests:

- MDE (Minimum Detectable Effect) drives sample size (see the worked check below)
- 5% MDE, 80% power, 95% confidence → ~1,600 per variant
- 2% MDE → ~10,000 per variant
- Always run for full business cycles (min 1 week)
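Those A/B figures follow from the standard two-proportion sample-size approximation. A worked check, assuming a worst-case 50% baseline conversion rate (p = 0.5), an absolute MDE of δ, two-sided α = 0.05 (z = 1.96), and 80% power (z = 0.84):

$$
n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}\, p(1-p)}{\delta^{2}} = \frac{2\,(1.96 + 0.84)^{2} \times 0.25}{(0.05)^{2}} \approx 1{,}568 \text{ per variant}
$$

Plugging in δ = 0.02 gives ≈9,800, matching the ~10,000 above. Baselines far from 50% shrink p(1−p), and the required sample shrinks with it.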
```yaml
screener:
  title: "[Study Name] Participant Screener"

  target_profile:
    demographics:
      age_range: "[e.g., 25-45]"
      location: "[e.g., US-based]"
      language: "[e.g., English-fluent]"
    behavioral:
      product_usage: "[e.g., Uses [product] 3+ times/week]"
      experience_level: "[e.g., 1+ year with similar tools]"
      recent_activity: "[e.g., Made a purchase in last 30 days]"
    psychographic:
      decision_maker: "[e.g., Primary household purchaser]"
      tech_comfort: "[e.g., Comfortable with mobile apps]"

  screening_questions:
    - question: "How often do you use [product category]?"
      type: "single-select"
      options: ["Daily", "Weekly", "Monthly", "Rarely", "Never"]
      qualify: ["Daily", "Weekly"]
      disqualify: ["Never"]
    - question: "Which of these tools do you currently use?"
      type: "multi-select"
      options: ["Tool A", "Tool B", "Tool C", "None"]
      qualify_min: 1
    - question: "What is your primary role?"
      type: "single-select"
      options: ["Developer", "Designer", "PM", "Marketing", "Other"]
      qualify: ["Developer", "Designer", "PM"]
    - question: "Have you participated in a UX study in the last 6 months?"
      type: "single-select"
      options: ["Yes", "No"]
      disqualify: ["Yes"]  # Avoid professional participants

  anti-patterns:
    - "Works at a competitor or in UX research"
    - "Family/friends of team members"
    - "Participated in study for this product before"

  incentive: "$75 for 60-min session"

  recruiting_channels:
    - channel: "Existing user database"
      quality: "★★★★★"
      cost: "Free"
    - channel: "UserTesting.com / UserInterviews.com"
      quality: "★★★★"
      cost: "$50-150/participant"
    - channel: "Social media recruitment"
      quality: "★★★"
      cost: "Free-$$"
    - channel: "Craigslist / local posting"
      quality: "★★"
      cost: "$"
```
- Screener doesn't lead (no "right" answers obvious)
- Mix of demographics within target segment
- No more than 20% from single recruiting source
- At least 1 "edge case" participant (power user, new user, accessibility needs)
- Over-recruit by 20% for no-shows
- Consent form prepared and sent in advance
- Incentive delivery method confirmed
- 80/20 rule: Participant talks 80%, you talk 20%
- Never ask "Would you use this?" — people can't predict future behavior
- Ask about past behavior, not hypothetical futures
- Follow the energy — when they get animated, dig deeper
- Silence is a tool — pause 5 seconds after they answer; they'll elaborate
- "Tell me more about that" — your most powerful phrase
- Watch for say/do gaps — note when claimed behavior contradicts observed behavior
- Record everything — audio minimum, video ideal, notes always
```yaml
participant:
  id: "P01"
  date: "YYYY-MM-DD"
  demographics: "[age, role, experience level]"
  session_duration: "58 min"

key_quotes:
  - quote: "[Exact words]"
    timestamp: "12:34"
    context: "[What prompted this]"
    theme: "[Emerging theme tag]"

observations:
  behaviors:
    - "[What they DID, not what they said]"
  emotions:
    - "[Frustration when..., delight when..., confusion at...]"
  workarounds:
    - "[Creative solutions they've built]"

pain_points:
  - pain: "[Specific problem]"
    severity: "[1-5]"
    frequency: "[daily/weekly/monthly/rarely]"
    current_solution: "[How they cope]"

needs:
  - need: "[Unmet need identified]"
    type: "[functional/emotional/social]"
    evidence: "[Quote or behavior that reveals this]"

surprises:
  - "[Anything unexpected — these are gold]"

moderator_notes:
  - "[Post-session reflection, what to adjust for next interview]"
```
```yaml
persona:
  name: "[Realistic name — not cutesy]"
  photo: "[Representative stock photo description]"
  archetype: "[1-3 word label, e.g., 'The Overwhelmed Manager']"

  demographics:
    age: "[Range or specific]"
    role: "[Job title / life stage]"
    experience: "[Years with product/domain]"
    tech_proficiency: "[Novice / Intermediate / Advanced / Expert]"
    environment: "[Office / remote / mobile / field]"

  # MOST IMPORTANT SECTION
  goals:
    primary: "[The #1 thing they're trying to accomplish]"
    secondary:
      - "[Supporting goal]"
      - "[Supporting goal]"
    underlying: "[The emotional/social need behind the functional goal]"

  frustrations:
    - frustration: "[Specific pain point]"
      frequency: "[How often — from research data]"
      severity: "[1-5]"
      current_workaround: "[What they do today]"
      evidence: "[P03, P07, P11 mentioned this]"

  behaviors:
    usage_pattern: "[When, where, how often they engage]"
    decision_process: "[How they evaluate options]"
    information_sources: "[Where they learn / get help]"
    social_influence: "[Who influences their decisions]"
    key_workflows:
      - "[Task 1 — frequency — duration]"
      - "[Task 2 — frequency — duration]"

  mental_models:
    - "[How they think about [concept] — often surprising]"
    - "[Vocabulary they use — not our jargon]"

  motivations:
    gains: "[What success looks like to them]"
    fears: "[What failure looks like]"
    triggers: "[What prompts them to act]"
    barriers: "[What stops them from acting]"

  quotes:
    - "\"[Real quote from research that captures this persona]\""
    - "\"[Another revealing quote]\""

  design_implications:
    must_have:
      - "[Feature/quality this persona absolutely needs]"
    should_have:
      - "[Important but not dealbreaker]"
    must_avoid:
      - "[Things that will drive this persona away]"
    communication_style: "[How to talk to this persona]"

  data_sources:
    interviews: "[# of participants who map to this persona]"
    survey_segment: "[% of survey respondents]"
    analytics_cohort: "[Behavioral data that identifies this group]"
```
- Based on real research data, not assumptions
- Represents a meaningful segment (not 1 outlier)
- Goals are specific enough to design for
- Frustrations include frequency + severity (not just a list)
- Contains at least 2 real quotes
- Design implications are actionable
- Reviewed with 3+ stakeholders
- Cross-checked against analytics data
- Does NOT describe everyone (a good persona excludes people)
```yaml
anti_persona:
  name: "[Label]"
  description: "[Who this is]"
  why_excluded: "[Business reason — too small a segment, wrong market, etc.]"
  risk_if_included: "[What happens to the product if we try to serve them too]"
```
```yaml
journey_map:
  title: "[Persona] — [Goal/Scenario]"
  persona: "[Which persona]"
  scenario: "[Specific situation triggering this journey]"

  stages:
    - stage: "1. Awareness / Trigger"
      duration: "[Time in this stage]"
      goals: "[What they want to accomplish]"
      actions:
        - "[Step they take]"
        - "[Step they take]"
      touchpoints:
        - "[Where they interact — website, app, email, phone, in-person]"
      thoughts:
        - "\"[What they're thinking — from research]\""
      emotions:
        rating: 3  # 1=frustrated, 3=neutral, 5=delighted
        feeling: "[Curious but uncertain]"
      pain_points:
        - "[Problem encountered]"
      opportunities:
        - "[How we could improve this moment]"

    - stage: "2. Consideration / Research"
      # ... same structure
    - stage: "3. Decision / Sign-Up"
      # ... same structure
    - stage: "4. Onboarding / First Use"
      # ... same structure
    - stage: "5. Regular Use / Value Realization"
      # ... same structure
    - stage: "6. Expansion / Advocacy (or Churn)"
      # ... same structure

  moments_of_truth:
    - moment: "[Critical make-or-break interaction]"
      stage: "[Which stage]"
      current_experience: "[What happens now — score 1-5]"
      desired_experience: "[What should happen — score 1-5]"
      gap: "[Difference = priority]"

  service_blueprint_layer:  # Optional — behind-the-scenes
    - stage: "[Stage name]"
      frontstage: "[What user sees]"
      backstage: "[What team does]"
      support_systems: "[Tools/processes involved]"
      failure_points: "[Where things break down]"
```
Plot emotions across the journey as a single curve per persona: the y-axis runs 1-5 (1 = 😤 Angry, 2 = 😟 Frustrated, 3 = 😐 Neutral, 4 = ☺ Happy, 5 = ★ Delighted) and the x-axis runs across the journey stages (Stage 1 through Stage 5).
- Based on research, not assumptions (note data source for each insight)
- One persona per map (don't average)
- Include BOTH functional and emotional dimensions
- Identify "moments of truth" — the 2-3 interactions that make or break the experience
- Prioritize opportunities by gap size (desired minus current)
- Include backstage/blueprint layer for service design
```yaml
usability_test:
  study_name: "[Name]"
  objective: "[What design question are we answering?]"

  format:
    type: "[Moderated / Unmoderated]"
    location: "[Remote / In-person / Lab]"
    device: "[Desktop / Mobile / Tablet / Cross-device]"
    duration: "60 min"
    recording: "[Screen + audio + face camera]"

  prototype:
    fidelity: "[Paper / Wireframe / Hi-fi / Live product]"
    tool: "[Figma / InVision / Live URL]"
    scope: "[Which flows are testable]"
    known_limitations: "[What won't work in the prototype]"

  participants:
    target: 5-8
    criteria: "[From screener — link to Phase 2]"
    incentive: "$75"

  tasks:
    - task_id: "T1"
      scenario: "You need to [context]. Using this app, [goal]."
      success_criteria:
        - "[Specific completion definition]"
      time_limit: "5 min"
      priority: "critical"  # critical / important / nice-to-know
      metrics:
        - completion_rate
        - time_on_task
        - error_count
        - satisfaction_rating
    - task_id: "T2"
      scenario: "[Next task...]"
      # ... same structure

  post_task_questions:
    - "On a scale of 1-7, how easy was that? (SEQ)"
    - "What did you expect to happen when you [action]?"
    - "Was anything confusing?"

  post_test_questions:
    - "SUS (System Usability Scale) — 10 questions"
    - "What was the easiest part?"
    - "What was the most frustrating part?"
    - "Would you use this? Why/why not?"
    - "What's missing?"
```
- Set the scene — give context, not instructions ("You want to book a flight to NYC next Friday" NOT "Click the search button")
- Don't use interface words — say "find" not "navigate to," say "purchase" not "add to cart and checkout"
- Make it realistic — use scenarios from actual research data
- One goal per task — don't combine ("book a flight AND a hotel")
- Order: easy → hard — build confidence before complex tasks (a sketch task applying these rules follows this list)
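Here is the flight-booking scenario above written as a task in the test-plan format. The success criterion and time limit are illustrative, not prescribed by the skill:

```yaml
# Illustrative task: context not instructions, no interface words, one goal
- task_id: "T1"
  scenario: "You want to book a flight to NYC next Friday. Find and book one that works for you."
  success_criteria:
    - "Reaches the booking confirmation screen with a Friday flight selected"
  time_limit: "5 min"
  priority: "critical"
  metrics:
    - completion_rate
    - time_on_task
```

Note what it avoids: no "click the search button," no "add to cart and checkout," and exactly one goal (no hotel).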
| Severity | Label | Definition | Action |
|---|---|---|---|
| 0 | Not a problem | Disagreement among evaluators, no real issue | None |
| 1 | Cosmetic | Noticed but doesn't affect task completion | Fix if time allows |
| 2 | Minor | Causes hesitation or minor inefficiency | Schedule fix |
| 3 | Major | Causes significant difficulty, workarounds needed | Fix before launch |
| 4 | Catastrophic | Prevents task completion entirely | Fix immediately |
```yaml
finding:
  id: "UF-001"
  title: "[Short descriptive title]"
  severity: 3  # 0-4
  frequency: "4/5 participants"
  task: "T2"

  observation: "[What happened — factual, behavioral]"

  evidence:
    - participant: "P01"
      behavior: "[What they did]"
      quote: "\"[What they said]\""
      timestamp: "14:22"
    - participant: "P03"
      behavior: "[What they did]"

  root_cause: "[Why this happened — mental model mismatch, visibility, feedback, etc.]"

  recommendation:
    change: "[Specific design change]"
    rationale: "[Why this will fix it]"
    effort: "[S/M/L]"
    impact: "[High/Medium/Low]"

  heuristic_violated: "[Which Nielsen heuristic, if applicable]"
```
| # | Heuristic | What to Check |
|---|---|---|
| 1 | Visibility of system status | Loading indicators, progress bars, confirmation messages |
| 2 | Match real world | Labels match user language, not internal jargon |
| 3 | User control & freedom | Undo, back, cancel, exit are easy to find |
| 4 | Consistency & standards | Same action = same result everywhere |
| 5 | Error prevention | Confirmations, constraints, smart defaults |
| 6 | Recognition > recall | Options visible, not memorized |
| 7 | Flexibility & efficiency | Shortcuts for experts, simple for novices |
| 8 | Aesthetic & minimalist | No unnecessary information competing for attention |
| 9 | Error recovery | Clear error messages with solutions, not codes |
| 10 | Help & documentation | Searchable, task-focused, concise |
Rate each heuristic 1-5 per screen/flow:

```yaml
heuristic_audit:
  screen: "[Screen/Flow name]"
  evaluator: "[Name]"
  date: "YYYY-MM-DD"

  scores:
    visibility_of_status: 4
    real_world_match: 3
    user_control: 2
    consistency: 4
    error_prevention: 3
    recognition_over_recall: 4
    flexibility_efficiency: 2
    aesthetic_minimal: 3
    error_recovery: 1
    help_documentation: 2

  total: 28  # out of 50
  grade: "C"  # A=45+, B=38+, C=28+, D=20+, F=<20

  critical_issues:
    - heuristic: "Error recovery"
      location: "[Where]"
      issue: "[What's wrong]"
      fix: "[Recommendation]"
```
1. Extract: Pull every observation, quote, behavior onto individual notes
2. Cluster: Group similar notes (bottom-up, not top-down)
3. Label: Name each cluster with a theme (use participant language)
4. Hierarchy: Group clusters into meta-themes
5. Prioritize: Rank by frequency × impact (see the sketch after this list)
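A minimal sketch of what steps 2-4 can produce. The participants, notes, and labels here are hypothetical, included only to show the shape of the output:

```yaml
# Hypothetical affinity-map fragment (Cluster → Label → Hierarchy)
meta_theme: "Getting started feels harder than it should"
clusters:
  - label: "I didn't know where to begin"   # named in participant language
    notes:
      - "P02 re-read the welcome email three times before acting"
      - "P05: \"I just clicked around until something worked\""
  - label: "Setup asks for things I don't have yet"
    notes:
      - "P03 abandoned setup to go hunt for account credentials"
priority: "high"   # frequency (7/12 participants) × impact (blocks first use)
```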
```yaml
theme:
  name: "[Theme label — use participant language]"
  description: "[2-3 sentence summary]"

  evidence:
    participant_count: "8/12 participants"
    segments_affected: ["Persona A", "Persona B"]
    quotes:
      - participant: "P03"
        quote: "\"[Exact quote]\""
      - participant: "P07"
        quote: "\"[Exact quote]\""
    behaviors_observed:
      - "[What they did]"
      - "[Pattern across participants]"
    data_points:
      - "[Any quantitative support — survey %, analytics, etc.]"

  impact:
    on_users: "[How this affects their experience]"
    on_business: "[Revenue, retention, acquisition, support cost impact]"
    severity: "High"  # High / Medium / Low

  insight: "[The 'so what' — what does this mean for design?]"

  recommendations:
    - recommendation: "[Specific, actionable change]"
      effort: "M"
      impact: "High"
      confidence: "High"  # based on evidence strength
```
Every insight must follow: Observation + Evidence + So What + Now What

"Users consistently [OBSERVATION] — seen in [X/Y participants, with supporting quotes]. This matters because [SO WHAT — impact on goals/business]. We should [NOW WHAT — specific recommendation]."

Bad insight: "Users found the navigation confusing"

Good insight: "7 of 12 participants couldn't find the settings page within 30 seconds. 4 looked in the profile menu, 2 used search, 1 gave up. This maps to 15% of support tickets ('How do I change my password'). Moving settings to the top-level nav and adding a search shortcut would reduce discovery time and cut related support volume."
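Rendered in the theme template above, that good insight might look like the following. The numbers are lifted from the example; the effort/impact ratings are placeholders:

```yaml
theme:
  name: "Can't find settings"
  description: "Participants expect settings in the top-level nav and fail to find its current location."
  evidence:
    participant_count: "7/12 participants"
    behaviors_observed:
      - "4 looked in the profile menu, 2 used search, 1 gave up"
    data_points:
      - "15% of support tickets ('How do I change my password')"
  insight: "Settings placement contradicts the top-level mental model, costing discovery time and driving support volume."
  recommendations:
    - recommendation: "Move settings to the top-level nav and add a search shortcut"
      effort: "M"        # placeholder rating
      impact: "High"
      confidence: "High"
```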
| Dimension | Weight | Criteria |
|---|---|---|
| Methodology Rigor | 20% | Right method for question, adequate sample, proper recruiting |
| Data Quality | 15% | Rich observations, real quotes, behavioral evidence |
| Analysis Depth | 20% | Beyond surface themes, root causes identified, patterns across segments |
| Insight Actionability | 25% | Specific recommendations, effort/impact rated, prioritized |
| Presentation Clarity | 10% | Stakeholders can understand and act without explanation |
| Business Connection | 10% | Findings connected to business metrics and goals |

Scoring:

- 90-100: Publication-quality research
- 75-89: Strong actionable research
- 60-74: Adequate — some gaps in methodology or analysis
- 40-59: Weak — findings are surface-level or poorly supported
- Below 40: Redo — methodology flaws undermine findings

A worked example of the weighted arithmetic follows.
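With hypothetical dimension scores on a 0-100 scale, the weighted total is the sum of score × weight:

```yaml
# Hypothetical study scored against the rubric above
rubric_score:
  methodology_rigor:     { score: 80, weight: 0.20 }   # contributes 16.0
  data_quality:          { score: 90, weight: 0.15 }   # contributes 13.5
  analysis_depth:        { score: 70, weight: 0.20 }   # contributes 14.0
  insight_actionability: { score: 80, weight: 0.25 }   # contributes 20.0
  presentation_clarity:  { score: 60, weight: 0.10 }   # contributes 6.0
  business_connection:   { score: 70, weight: 0.10 }   # contributes 7.0
  total: 76.5   # lands in the 75-89 band: "Strong actionable research"
```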
| Dimension | Questions to Ask |
|---|---|
| Clarity | Can users understand what this is and what to do within 5 seconds? |
| Alignment | Does this solve the problem identified in research? For the right persona? |
| Mental Model | Does it match how users think about this task? (from interview data) |
| Priority | Does the visual hierarchy match user task priority? |
| Simplicity | Can anything be removed without losing function? |
```yaml
design_review:
  screen: "[Screen name]"
  reviewer: "[Name]"
  date: "YYYY-MM-DD"

  research_alignment:
    - check: "Addresses top pain point from research"
      status: "✅ / ❌ / ⚠️"
      notes: "[Which finding this addresses]"
    - check: "Uses language from user interviews (not internal jargon)"
      status: "✅ / ❌ / ⚠️"
    - check: "Matches mental model revealed in research"
      status: "✅ / ❌ / ⚠️"
    - check: "Works for primary persona AND doesn't break for secondary"
      status: "✅ / ❌ / ⚠️"

  usability:
    - check: "Primary action is visually dominant"
      status: "✅ / ❌ / ⚠️"
    - check: "Error states designed and messaged"
      status: "✅ / ❌ / ⚠️"
    - check: "Empty states designed (first use, no data, no results)"
      status: "✅ / ❌ / ⚠️"
    - check: "Loading states designed"
      status: "✅ / ❌ / ⚠️"
    - check: "Edge cases handled (long text, missing data, permissions)"
      status: "✅ / ❌ / ⚠️"

  accessibility:
    - check: "Color contrast meets WCAG AA (4.5:1 text, 3:1 UI)"
      status: "✅ / ❌ / ⚠️"
    - check: "Touch targets ≥44px"
      status: "✅ / ❌ / ⚠️"
    - check: "Information not conveyed by color alone"
      status: "✅ / ❌ / ⚠️"
    - check: "Logical reading/tab order"
      status: "✅ / ❌ / ⚠️"
    - check: "Alt text for meaningful images"
      status: "✅ / ❌ / ⚠️"

  overall_score: "[1-5]"
  ship_decision: "Ready / Needs changes / Needs testing / Needs research"
```
```
research/
├── YYYY/
│   ├── Q1/
│   │   ├── [study-name]/
│   │   │   ├── plan.yaml            # Research brief
│   │   │   ├── screener.yaml        # Recruiting criteria
│   │   │   ├── guide.md             # Interview/test guide
│   │   │   ├── notes/               # Per-participant notes
│   │   │   │   ├── P01.yaml
│   │   │   │   └── P02.yaml
│   │   │   ├── synthesis/           # Themes, affinity maps
│   │   │   ├── personas/            # Updated personas
│   │   │   ├── journey-maps/        # Updated maps
│   │   │   ├── report.md            # Final report
│   │   │   └── recordings/          # Session recordings (link)
│   │   └── [next-study]/
│   └── Q2/
├── personas/                        # Master persona library
│   ├── persona-a.yaml
│   └── persona-b.yaml
├── journey-maps/                    # Master journey maps
├── insights-database.yaml           # Cross-study insight tracker
└── research-calendar.yaml           # Planned studies
```
```yaml
insights_database:
  - insight_id: "INS-001"
    theme: "[Category]"
    insight: "[The insight]"
    first_found: "2026-01-15"
    studies: ["Study A", "Study C", "Study F"]
    evidence_strength: "Strong"  # 3+ studies
    status: "Addressed"  # Open / In Progress / Addressed / Won't Fix
    design_response: "[What was done]"
    impact_measured: "[Before/after metric if available]"
```
| Metric | How to Measure | Target |
|---|---|---|
| Findings → shipped features | % of recommendations implemented within 2 quarters | >60% |
| Pre/post usability scores | SUS score before vs after changes | +10 points |
| Support ticket reduction | Related ticket volume after design change | -25% |
| Task completion rate | Usability test success rate over time | >85% |
| Time on task | Average task time trend | Decreasing |
| Stakeholder confidence | Post-study survey: "How useful was this?" | >4/5 |
| Command | What It Does |
|---|---|
| "Plan a research study for [topic]" | Generate research brief YAML |
| "Build a screener for [audience]" | Generate screening questionnaire |
| "Create interview guide for [topic]" | Generate interview questions and structure |
| "Build persona from [data/notes]" | Synthesize data into persona YAML |
| "Map the journey for [persona + goal]" | Generate journey map |
| "Plan usability test for [prototype]" | Generate test plan with tasks |
| "Run heuristic evaluation of [screen/flow]" | Score against Nielsen's 10 |
| "Synthesize findings from [study]" | Generate themes and insights |
| "Write research report for [study]" | Generate executive summary and recommendations |
| "Score this research [report/study]" | Evaluate against quality rubric |
| "Review this design against research" | CAMPS critique + alignment check |
| "Set up research repository" | Create folder structure and templates |
- Guerrilla testing: coffee shop intercepts (5-min tests, buy them a coffee)
- Internal users: use colleagues from different departments (not product/design team)
- Social media: post in relevant communities for volunteers
- Existing users: email opt-in for research panel
- Video call with screen share (Zoom, Google Meet)
- Async: Loom recordings of tasks + written responses
- Unmoderated: UserTesting.com, Maze, Lookback
- Diary studies: use messaging apps (WhatsApp, Telegram) for daily check-ins
"5 users, 1 week, 3 critical findings" — the minimum viable study Pair research with existing touchpoints (support calls, sales demos) Frame as risk reduction: "Would you rather discover this before or after launch?" Show past research ROI (support ticket reduction, conversion improvement)
- Check sample composition — different segments may have different needs
- Prioritize by business impact: which segment is more valuable?
- Run a survey to quantify: "60% prefer A, 40% prefer B"
- Consider designing for both (progressive disclosure, personalization)
- Don't just translate — localize scenarios and contexts
- Account for cultural response bias (e.g., reluctance to criticize in some cultures)
- Use local moderators when possible
- Adjust incentives to local norms
- Watch for design patterns that don't transfer (icons, colors, reading direction)
- Recruit participants with disabilities (screen reader users, motor impairments, cognitive differences)
- Test with actual assistive technology, not simulation
- Include in regular studies (at least 1 participant with accessibility needs per study)
- WCAG compliance testing is NOT a substitute for research with disabled users

Built by AfrexAI — Autonomous Intelligence for Business