Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Convert vague instructions into clear AI prompts using structures, techniques, and templates for reliable, precise, and measurable outputs.
Hand the extracted package to your coding agent with a concrete install brief rather than working through the installation by hand.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Then review README.md for any prerequisites, environment setup, or post-install checks. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Then review README.md for any prerequisites, environment setup, or post-install checks. Summarize what changed and any follow-up checks I should run.
Complete methodology for writing, testing, and optimizing prompts that reliably produce high-quality outputs from any LLM. From first draft to production-grade prompt systems.
Run this diagnostic on any prompt:

| # | Check | Pass? |
|---|-------|-------|
| 1 | Clear task statement in first 2 sentences | |
| 2 | Output format explicitly specified | |
| 3 | At least one concrete example included | |
| 4 | Edge cases addressed | |
| 5 | Evaluation criteria defined | |
| 6 | No ambiguous pronouns or references | |
| 7 | Tested on 3+ diverse inputs | |
| 8 | Failure modes documented | |

Score: X/8. Below 6 = high risk of inconsistent outputs.
# [ROLE]

## Context
[Background the model needs. Domain, constraints, audience.]

## Task
[Clear, specific instruction. One primary action.]

## Input
[What the user will provide. Format description.]

## Output Format
[Exact structure required. Use examples.]

## Rules
[Hard constraints. What to always/never do.]

## Examples
[At least one input→output pair showing ideal behavior.]

## Edge Cases
[What to do when input is ambiguous, missing, or unusual.]
When to use: Complex reasoning, math, multi-step logic, analysis

Basic CoT:
Think through this step-by-step before giving your final answer.

Structured CoT (more reliable):
Before answering, work through these steps:
1. Identify the key variables in the problem
2. List the constraints and requirements
3. Consider 2-3 possible approaches
4. Evaluate each approach against the constraints
5. Select the best approach and explain why
6. Generate the solution
7. Verify the solution against the original requirements

When NOT to use CoT:
- Simple factual lookups
- Format conversion tasks
- When speed matters more than accuracy
- Tasks under 50 tokens of output
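If you assemble prompts programmatically, a minimal sketch of wrapping a task in the structured-CoT steps might look like the following; the `build_cot_prompt` helper is illustrative, not part of any library.

```python
# Minimal sketch, assuming prompts are assembled as plain strings:
# wrap a task in the structured chain-of-thought steps listed above.
COT_STEPS = [
    "Identify the key variables in the problem",
    "List the constraints and requirements",
    "Consider 2-3 possible approaches",
    "Evaluate each approach against the constraints",
    "Select the best approach and explain why",
    "Generate the solution",
    "Verify the solution against the original requirements",
]

def build_cot_prompt(task: str) -> str:
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(COT_STEPS, start=1))
    return f"{task}\n\nBefore answering, work through these steps:\n{steps}"
```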
Golden rule: Examples teach format AND quality simultaneously.

Example design checklist:
- Shows the exact input format users will provide
- Shows the exact output format you want
- Demonstrates the reasoning depth expected
- Includes at least one edge case example
- Examples are diverse (not all the same pattern)

Few-shot template:

## Examples

### Example 1: [Simple case]
**Input**: [representative input]
**Output**: [ideal output showing format + quality]

### Example 2: [Edge case]
**Input**: [tricky or ambiguous input]
**Output**: [how to handle gracefully]

### Example 3: [Complex case]
**Input**: [challenging real-world input]
**Output**: [thorough, high-quality response]

How many examples?

| Task Complexity | Examples Needed | Notes |
|---|---|---|
| Format conversion | 1-2 | Format is the lesson |
| Classification | 3-5 | One per category minimum |
| Generation | 2-3 | Show quality range |
| Analysis | 2 | One simple, one complex |
| Extraction | 3-5 | Cover structural variations |
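As a rough illustration, the few-shot template above can be rendered from example pairs; the `render_examples` helper and its tuple shape are assumptions, not an established API.

```python
# Minimal sketch: render (label, input, output) example triples into the
# few-shot template shown above.
def render_examples(examples: list[tuple[str, str, str]]) -> str:
    blocks = ["## Examples"]
    for i, (label, inp, out) in enumerate(examples, start=1):
        blocks.append(
            f"### Example {i}: {label}\n"
            f"**Input**: {inp}\n"
            f"**Output**: {out}"
        )
    return "\n\n".join(blocks)
```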
Use structural tags to separate concerns:

<context>
Background information the model needs
</context>

<input>
The actual data to process
</input>

<instructions>
What to do with the input
</instructions>

<output_format>
How to structure the response
</output_format>

When to use XML tags vs markdown headers:
- XML: When sections contain user-provided content (prevents injection)
- Markdown: When writing system prompts for readability
- Both: Complex prompts with mixed static/dynamic content
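A minimal sketch of placing untrusted content inside a dedicated tag, assuming plain string assembly; the `wrap_user_input` helper and its tag-stripping rule are illustrative only, not a complete injection defense.

```python
# Minimal sketch: keep untrusted data inside a dedicated tag so the model
# can distinguish data from instructions. The tag-stripping rule below is
# a simple illustration, not a full injection defense.
def wrap_user_input(user_text: str) -> str:
    safe_text = user_text.replace("</input>", "</ input>")  # prevent early tag close
    return (
        "<instructions>\nSummarize the content inside <input>.\n</instructions>\n"
        f"<input>\n{safe_text}\n</input>"
    )
```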
For building AI agents, assistants, and skills:

# [Agent Name] - System Prompt

## Identity
[Who this agent is. 2-3 sentences max.]

## Primary Directive
[One sentence. The single most important thing this agent does.]

## Capabilities
[What this agent CAN do. Bullet list, specific.]

## Boundaries
[What this agent CANNOT or SHOULD NOT do. Hard limits.]

## Knowledge
[Domain-specific information the agent needs. Can be extensive.]

## Interaction Style
[How the agent communicates. Voice, format preferences, length.]

## Tools Available
[If agent has tools: what each does, when to use each.]

## Workflows
[Step-by-step processes for common tasks. Decision trees for branching.]

## Error Handling
[What to do when uncertain, when input is bad, when tools fail.]
| Dimension | Weight | Score |
|---|---|---|
| Clarity: No ambiguous instructions | 20 | /20 |
| Completeness: Covers all expected use cases | 15 | /15 |
| Boundaries: Clear limits prevent hallucination | 15 | /15 |
| Examples: At least 2 input→output pairs | 15 | /15 |
| Error handling: Graceful failure paths defined | 10 | /10 |
| Format control: Output structure specified | 10 | /10 |
| Voice consistency: Persona well-calibrated | 10 | /10 |
| Efficiency: No redundant or contradictory instructions | 5 | /5 |
| TOTAL | 100 | /100 |

Score interpretation:
- 90-100: Production-ready
- 75-89: Good, minor gaps
- 60-74: Needs iteration
- Below 60: Rewrite recommended
When instructions conflict, models follow this implicit hierarchy:
1. Safety/ethics (hardcoded, can't override)
2. System prompt (highest user-controllable priority)
3. Recent conversation context (recency bias)
4. User's current message (immediate request)
5. Earlier conversation context (may be forgotten)
6. Training data patterns (default behavior)

Design implication: Put critical rules in the system prompt. Repeat critical rules periodically in long conversations. Don't rely on early context surviving in long threads.
Break complex tasks into sequential prompts where each output feeds the next:

chain:
  - name: "Extract"
    prompt: "Extract all claims from this document. Output as numbered list."
    output_to: claims_list
  - name: "Classify"
    prompt: "Classify each claim as: Factual, Opinion, or Unverifiable.\n\nClaims:\n{claims_list}"
    output_to: classified_claims
  - name: "Verify"
    prompt: "For each Factual claim, assess accuracy (Accurate/Inaccurate/Partially Accurate) with evidence.\n\nClaims:\n{classified_claims}"
    output_to: verified_claims
  - name: "Report"
    prompt: "Generate a fact-check report from these verified claims.\n\n{verified_claims}"

When to chain vs single prompt:

| Single Prompt | Chain |
|---|---|
| Task under 500 words output | Multi-step reasoning |
| One clear action | Different skills per step |
| Simple input→output | Quality needs to be verified per step |
| Speed matters | Accuracy matters |
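As a rough sketch of how a chain like the one above could run in code: `call_model` is a hypothetical stand-in for whatever LLM client you use, and the steps mirror the YAML definition.

```python
# Minimal sketch of sequential prompt chaining. `call_model` is a
# placeholder for your actual LLM client call -- substitute your own.
def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

CHAIN = [
    ("claims_list", "Extract all claims from this document. Output as numbered list.\n\n{document}"),
    ("classified_claims", "Classify each claim as: Factual, Opinion, or Unverifiable.\n\nClaims:\n{claims_list}"),
    ("verified_claims", "For each Factual claim, assess accuracy with evidence.\n\nClaims:\n{classified_claims}"),
    ("report", "Generate a fact-check report from these verified claims.\n\n{verified_claims}"),
]

def run_chain(document: str) -> str:
    context = {"document": document}
    for output_key, template in CHAIN:
        # Each step's output becomes an input variable for later steps.
        context[output_key] = call_model(template.format(**context))
    return context["report"]
```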
Run the same prompt 3-5 times, then aggregate:

[Run prompt 3 times with temperature > 0]

Aggregation prompt:
"Here are 3 independent analyses of the same input. Identify where all 3 agree (high confidence), where 2/3 agree (medium confidence), and where they disagree (investigate further). Produce a final synthesized analysis."

Best for: classification, scoring, risk assessment, diagnosis.
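A minimal sketch of the run-then-aggregate loop, reusing the hypothetical `call_model` stand-in from the chaining sketch above.

```python
# Minimal sketch of self-consistency: run the prompt several times, then
# ask the model to reconcile the independent runs. `call_model` is the
# hypothetical client stand-in defined in the chaining sketch.
def self_consistent(prompt: str, runs: int = 3) -> str:
    analyses = [call_model(prompt) for _ in range(runs)]
    numbered = "\n\n".join(f"Analysis {i + 1}:\n{a}" for i, a in enumerate(analyses))
    aggregation = (
        f"Here are {runs} independent analyses of the same input. "
        "Identify where all agree (high confidence), where most agree "
        "(medium confidence), and where they disagree (investigate further). "
        f"Produce a final synthesized analysis.\n\n{numbered}"
    )
    return call_model(aggregation)
```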
Force reliable JSON/YAML output:

Respond with ONLY a valid JSON object. No markdown, no explanation, no text before or after.

Schema:
{
  "summary": "string, 1-2 sentences",
  "sentiment": "positive | negative | neutral",
  "confidence": "number 0-1",
  "key_entities": ["string array"],
  "action_required": "boolean"
}

Example output:
{"summary": "Customer reports billing error on invoice #4521", "sentiment": "negative", "confidence": 0.92, "key_entities": ["invoice #4521", "billing department"], "action_required": true}

Reliability tricks:
- Provide the exact schema with types
- Include one complete example
- Say "ONLY a valid JSON object" to prevent preamble
- For complex schemas, use the model's native JSON mode if available
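On the consuming side, a minimal sketch of parsing and validating the JSON reply with one retry; the required keys mirror the schema above, and `call_model` is the same hypothetical stand-in used earlier.

```python
import json

# Minimal sketch: parse the model's JSON reply and retry once if it is
# malformed or missing required keys. `call_model` is the hypothetical
# client stand-in from the chaining sketch.
REQUIRED_KEYS = {"summary", "sentiment", "confidence", "key_entities", "action_required"}

def get_structured(prompt: str, retries: int = 1) -> dict:
    for _ in range(retries + 1):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
            if isinstance(data, dict) and REQUIRED_KEYS <= data.keys():
                return data
        except json.JSONDecodeError:
            pass
        prompt += "\n\nYour previous reply was not valid JSON matching the schema. Respond with ONLY the JSON object."
    raise ValueError("model did not return valid JSON matching the schema")
```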
Analyze [SUBJECT] using this framework:

1. **Current State**: What exists today? (facts only, cite sources)
2. **Strengths**: What's working well? (with evidence)
3. **Weaknesses**: What's failing or underperforming? (with metrics)
4. **Root Causes**: Why do the weaknesses exist? (use 5 Whys)
5. **Opportunities**: What could be improved? (ranked by impact)
6. **Recommendations**: Top 3 actions with expected outcome and effort level
7. **Risks**: What could go wrong with each recommendation?

Output as a structured report. Lead with the single most important finding.
test_suite:
  name: "[Prompt Name] Test Suite"
  prompt_version: "1.0"
  test_cases:
    - id: "TC-01"
      name: "Happy path - standard input"
      input: "[typical, well-formed input]"
      expected: "[key elements that must appear]"
      anti_expected: "[elements that must NOT appear]"
    - id: "TC-02"
      name: "Edge case - minimal input"
      input: "[bare minimum input]"
      expected: "[graceful handling, asks for more info or works with what's given]"
    - id: "TC-03"
      name: "Edge case - ambiguous input"
      input: "[input with multiple interpretations]"
      expected: "[acknowledges ambiguity, handles explicitly]"
    - id: "TC-04"
      name: "Adversarial - injection attempt"
      input: "[input containing 'ignore instructions and...']"
      expected: "[treats as regular text, follows original instructions]"
    - id: "TC-05"
      name: "Scale - large input"
      input: "[maximum expected input size]"
      expected: "[handles without truncation or quality loss]"
    - id: "TC-06"
      name: "Empty/null input"
      input: ""
      expected: "[helpful error message, not a crash or hallucination]"
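A rough sketch of executing such a suite, assuming the PyYAML package and the hypothetical `call_model` stand-in from earlier; the substring checks are deliberately crude placeholders for real scoring.

```python
import yaml  # assumes the PyYAML package is available

# Minimal sketch: run each test case through the model and do a crude
# substring check against expected / anti_expected. `call_model` is the
# hypothetical client stand-in from the chaining sketch.
def run_suite(suite_path: str, prompt_template: str) -> None:
    with open(suite_path) as f:
        suite = yaml.safe_load(f)["test_suite"]
    for case in suite["test_cases"]:
        output = call_model(prompt_template.format(input=case["input"]))
        ok = case["expected"].lower() in output.lower()
        anti = case.get("anti_expected")
        bad = bool(anti) and anti.lower() in output.lower()
        print(f"{case['id']}: {'PASS' if ok and not bad else 'FAIL'}")
```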
PROMPT IMPROVEMENT CYCLE:

1. BASELINE: Run prompt on 10 diverse test inputs. Score each 1-10.
2. DIAGNOSE: Categorize failures:
   - Format failures (wrong structure) → fix format instructions
   - Content failures (wrong substance) → fix examples/constraints
   - Consistency failures (varies between runs) → add constraints, lower temperature
   - Hallucination failures (invented content) → add grounding rules
   - Verbosity failures (too long/short) → add length constraints
3. HYPOTHESIZE: Change ONE thing at a time
4. TEST: Run same 10 inputs. Compare scores.
5. COMMIT: If improvement > 10%, keep the change. Otherwise revert.
6. REPEAT: Until average score > 8/10 on test suite
| Symptom | Likely Cause | Fix |
|---|---|---|
| Output format varies | Format not specified precisely enough | Add exact template + example |
| Hallucinated facts | No grounding instruction | Add "only use provided information" |
| Too verbose | No length constraint | Add word/sentence limits |
| Ignores edge cases | Edge cases not anticipated | Add edge case handling section |
| Inconsistent quality | Temperature too high or prompt too vague | Lower temp, add quality criteria |
| Starts with filler | No opening instruction | Add "Start directly with [X]" |
| Misses key info | Input not clearly delimited | Use XML tags around input sections |
| Wrong audience level | Audience not specified | Add explicit audience description |
| Contradictory output | Conflicting instructions | Audit for conflicts, add priority rules |
| Refuses valid tasks | Over-broad safety rules | Narrow safety constraints to actual risks |
Reduce token usage without losing quality:

Techniques:
- Compress examples: Remove redundant examples that teach the same lesson
- Use references: "Follow AP style" instead of listing every AP rule
- Structured over prose: Bullet lists use fewer tokens than paragraphs
- Abbreviation glossary: Define abbreviations once, use throughout
- Template variables: {input} placeholders instead of inline content

Efficiency audit: For each section of your prompt, ask:
1. What does this section teach the model?
2. Could the same lesson be taught in fewer tokens?
3. Is this section USED in 80%+ of responses? (If not, move to conditional)
4. Does removing this section degrade output quality? (Test it!)
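To give the audit numbers to work with, here is a minimal sketch of counting tokens per prompt section; it assumes the tiktoken package and illustrative section names, and any tokenizer would do.

```python
import tiktoken  # assumes the tiktoken package; any tokenizer works

# Minimal sketch: count tokens per prompt section so the efficiency audit
# is grounded in actual numbers. Section names are illustrative.
def audit_sections(sections: dict[str, str]) -> None:
    enc = tiktoken.get_encoding("cl100k_base")
    for name, text in sections.items():
        print(f"{name}: {len(enc.encode(text))} tokens")

audit_sections({
    "Context": "Background the model needs...",
    "Examples": "### Example 1...",
    "Rules": "Hard constraints...",
})
```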
| Task Type | Temperature | Top-P | Notes |
|---|---|---|---|
| Factual extraction | 0.0-0.1 | 0.9 | Deterministic preferred |
| Code generation | 0.0-0.2 | 0.95 | Consistency critical |
| Analysis/reasoning | 0.2-0.5 | 0.95 | Some exploration, mostly focused |
| Creative writing | 0.7-0.9 | 0.95 | Variety desired |
| Brainstorming | 0.8-1.0 | 1.0 | Maximum diversity |
| Classification | 0.0 | 0.9 | Deterministic |
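One possible way to encode the table as configuration; the preset names are illustrative, and the values are mid-range picks from the table above rather than fixed recommendations.

```python
# Minimal sketch: a lookup so callers pick sampling parameters by task
# type. Values are mid-range picks from the table above.
SAMPLING_PRESETS = {
    "factual_extraction": {"temperature": 0.0, "top_p": 0.9},
    "code_generation":    {"temperature": 0.1, "top_p": 0.95},
    "analysis":           {"temperature": 0.3, "top_p": 0.95},
    "creative_writing":   {"temperature": 0.8, "top_p": 0.95},
    "brainstorming":      {"temperature": 0.9, "top_p": 1.0},
    "classification":     {"temperature": 0.0, "top_p": 0.9},
}

def sampling_for(task_type: str) -> dict:
    # Fall back to a focused default for unknown task types.
    return SAMPLING_PRESETS.get(task_type, {"temperature": 0.2, "top_p": 0.95})
```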
Claude (Anthropic):
- Excels with detailed system prompts and XML structuring
- Responds well to specific persona instructions
- Use <thinking> tags for step-by-step reasoning
- Strong with long context: can handle detailed instructions
- Prefill assistant responses for format control

GPT-4 (OpenAI):
- Works well with JSON mode for structured output
- Function calling for tool use
- Strong with concise, directive instructions
- Use system message for persistent instructions

General principles (all models):
- More specific = more reliable (across all models)
- Examples > descriptions (show, don't tell)
- Recency bias exists: put important instructions at start AND end
- Test on YOUR model; don't assume cross-model transfer
# prompt-registry.yaml
prompts:
  contract_reviewer:
    current_version: "2.3.1"
    versions:
      "2.3.1":
        date: "2026-02-20"
        change: "Added indemnification clause detection"
        avg_score: 8.4
        test_cases: 15
      "2.3.0":
        date: "2026-02-15"
        change: "Restructured output format"
        avg_score: 8.1
        test_cases: 12
      "2.2.0":
        date: "2026-02-01"
        change: "Initial production version"
        avg_score: 7.2
        test_cases: 8
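A minimal sketch of reading such a registry and resolving the current version, assuming the PyYAML package; the function name is illustrative.

```python
import yaml  # assumes the PyYAML package is available

# Minimal sketch: look up the current version of a prompt in a registry
# shaped like prompt-registry.yaml above.
def current_prompt_version(registry_path: str, prompt_name: str) -> dict:
    with open(registry_path) as f:
        registry = yaml.safe_load(f)
    entry = registry["prompts"][prompt_name]
    version = entry["current_version"]
    return {"version": version, **entry["versions"][version]}

# e.g. current_prompt_version("prompt-registry.yaml", "contract_reviewer")
```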
Track in production:
- Quality score: Sample and rate outputs weekly (1-10)
- Failure rate: % of outputs requiring human correction
- Latency: Time to generate (affects UX)
- Token usage: Cost per prompt execution
- User satisfaction: Thumbs up/down or explicit rating

Alert thresholds:

alerts:
  quality_drop: "avg_score < 7.0 over 50 samples"
  failure_spike: "failure_rate > 15% in 24h"
  cost_spike: "avg_tokens > 2x baseline"
  latency_spike: "p95 > 30 seconds"
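A rough sketch of evaluating those thresholds against rolling metrics; the metric field names are assumptions, not a standard schema.

```python
# Minimal sketch: evaluate the alert thresholds above against rolling
# metrics. Field names are illustrative placeholders.
def check_alerts(metrics: dict) -> list[str]:
    fired = []
    if metrics["avg_score_last_50"] < 7.0:
        fired.append("quality_drop")
    if metrics["failure_rate_24h"] > 0.15:
        fired.append("failure_spike")
    if metrics["avg_tokens"] > 2 * metrics["baseline_tokens"]:
        fired.append("cost_spike")
    if metrics["latency_p95_seconds"] > 30:
        fired.append("latency_spike")
    return fired
```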
# [Prompt Name]

## Purpose
[One sentence: what this prompt does]

## Owner
[Who maintains this prompt]

## Version
[Current version + date]

## Input
[What the prompt expects. Format, schema, constraints.]

## Output
[What the prompt produces. Format, schema, example.]

## Dependencies
[Other prompts in the chain, tools, data sources]

## Performance
[Current avg score, failure rate, edge cases known]

## Changelog
[Version history with what changed and why]
Add self-checking to any prompt:

[Main instruction]

Before providing your final response, verify:
1. Does the output match the requested format exactly?
2. Are all claims supported by the provided input?
3. Have I addressed all parts of the request?
4. Would a domain expert find any errors in this response?

If any check fails, fix the issue before responding.
Break complex input into manageable pieces:

You will receive a complex [document/request/problem].

Step 1: List the distinct components or sub-tasks (do not solve yet).
Step 2: Order them by dependency (which must be done first?).
Step 3: Solve each component individually.
Step 4: Synthesize the individual solutions into a coherent whole.
Step 5: Check for contradictions between components.
Multi-perspective analysis:

Analyze this [proposal/plan/decision] from three perspectives:

**The Optimist**: What's the best case? What could go right?
**The Skeptic**: What could go wrong? What's being overlooked?
**The Pragmatist**: What's the most likely outcome? What's the practical path?

Synthesize the three perspectives into a balanced recommendation.
- The Vague Role: "You are a helpful assistant" → Be specific about expertise
- The Missing Example: Describing format in words instead of showing it → Add concrete examples
- The Kitchen Sink: Cramming every possible instruction into one prompt → Chain or prioritize
- The Optimism Bias: Only testing happy paths → Test edge cases and failures
- The Copy-Paste: Using the same prompt across models without testing → Test per model
- The Novel: Writing paragraphs when bullet points work better → Be concise
- The Perfectionist: Iterating endlessly on minor improvements → Ship at 8/10
- The Blind Trust: Not reviewing outputs because "the prompt is good" → Always sample
- The Static Prompt: Never updating prompts as models update → Re-test quarterly
- The Secret Prompt: No documentation, only the author understands it → Document everything
Use these to invoke specific capabilities:

| Command | Action |
|---|---|
| "Write a prompt for [task]" | Build from scratch using CRAFT framework |
| "Review this prompt" | Score against quality rubric, suggest improvements |
| "Optimize this prompt" | Reduce tokens while maintaining quality |
| "Test this prompt" | Generate test suite with 6+ diverse cases |
| "Convert to system prompt" | Restructure as agent/skill system prompt |
| "Add examples to this prompt" | Generate few-shot examples from description |
| "Make this prompt robust" | Add edge cases, error handling, injection defense |
| "Chain these tasks" | Design multi-step prompt chain with handoffs |
| "Debug this prompt" | Diagnose failure patterns, suggest fixes |
| "Compare prompts" | A/B test two versions with same inputs |
| "Simplify this prompt" | Remove redundancy, improve clarity |
| "Document this prompt" | Generate production documentation template |

Built by AfrexAI: production-grade AI skills for teams that ship.