Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
This skill should be used when the user asks to "optimize prompts", "design prompt templates", "evaluate LLM outputs", "build agentic systems", "implement RAG", "create few-shot examples", "analyze token usage", or "design AI workflows". Use for prompt engineering patterns, LLM evaluation frameworks, agent architectures, and structured output design.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

Install brief:
"I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."

Upgrade brief:
"I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
Prompt engineering patterns, LLM evaluation frameworks, and agentic system design.
Contents
- Quick Start
- Tools Overview
  - Prompt Optimizer
  - RAG Evaluator
  - Agent Orchestrator
- Prompt Engineering Workflows
  - Prompt Optimization Workflow
  - Few-Shot Example Design
  - Structured Output Design
- Reference Documentation
- Common Patterns
- Quick Reference
Quick Start

# Analyze and optimize a prompt file
python scripts/prompt_optimizer.py prompts/my_prompt.txt --analyze

# Evaluate RAG retrieval quality
python scripts/rag_evaluator.py --contexts contexts.json --questions questions.json

# Visualize agent workflow from definition
python scripts/agent_orchestrator.py agent_config.yaml --visualize
Tools Overview

Prompt Optimizer

Analyzes prompts for token efficiency, clarity, and structure. Generates optimized versions.

Input: Prompt text file or string
Output: Analysis report with optimization suggestions

Usage:

# Analyze a prompt file
python scripts/prompt_optimizer.py prompt.txt --analyze

# Output:
# Token count: 847
# Estimated cost: $0.0025 (GPT-4)
# Clarity score: 72/100
# Issues found:
# - Ambiguous instruction at line 3
# - Missing output format specification
# - Redundant context (lines 12-15 repeat lines 5-8)
# Suggestions:
# 1. Add explicit output format: "Respond in JSON with keys: ..."
# 2. Remove redundant context to save 89 tokens
# 3. Clarify "analyze" -> "list the top 3 issues with severity ratings"

# Generate optimized version
python scripts/prompt_optimizer.py prompt.txt --optimize --output optimized.txt

# Count tokens for cost estimation
python scripts/prompt_optimizer.py prompt.txt --tokens --model gpt-4

# Extract and manage few-shot examples
python scripts/prompt_optimizer.py prompt.txt --extract-examples --output examples.json
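If you want a quick sanity check on what the --tokens mode measures, token counts can be estimated directly with the tiktoken library. This is a minimal sketch, not the script's actual implementation, and the per-1K-token rate below is an assumed placeholder you should replace with your model's current pricing.

```python
# Rough token/cost estimate for a prompt file, assuming tiktoken is installed.
# The USD rate is an assumed placeholder, not a quoted price.
import tiktoken

def estimate_tokens(text: str, model: str = "gpt-4") -> int:
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(text))

with open("prompt.txt") as f:
    prompt = f.read()

tokens = estimate_tokens(prompt)
assumed_rate_per_1k = 0.03  # placeholder input price, USD per 1K tokens
print(f"Token count: {tokens}")
print(f"Estimated cost: ${tokens / 1000 * assumed_rate_per_1k:.4f}")
```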
RAG Evaluator

Evaluates Retrieval-Augmented Generation quality by measuring context relevance and answer faithfulness.

Input: Retrieved contexts (JSON) and questions/answers
Output: Evaluation metrics and quality report

Usage:

# Evaluate retrieval quality
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json

# Output:
# === RAG Evaluation Report ===
# Questions evaluated: 50
#
# Retrieval Metrics:
#   Context Relevance: 0.78 (target: >0.80)
#   Retrieval Precision@5: 0.72
#   Coverage: 0.85
#
# Generation Metrics:
#   Answer Faithfulness: 0.91
#   Groundedness: 0.88
#
# Issues Found:
# - 8 questions had no relevant context in top-5
# - 3 answers contained information not in context
#
# Recommendations:
# 1. Improve chunking strategy for technical documents
# 2. Add metadata filtering for date-sensitive queries

# Evaluate with custom metrics
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json \
    --metrics relevance,faithfulness,coverage

# Export detailed results
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json \
    --output report.json --verbose
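The exact JSON schema the evaluator expects is not spelled out here, so the snippet below is only a guess at a plausible shape for the two input files; every field name is an assumption, so check the script's --help output or source for the real format before building an evaluation set.

```python
# Hypothetical input files for rag_evaluator.py -- field names are assumptions,
# not the script's documented schema.
import json

contexts = [
    {
        "question_id": "q1",
        "retrieved": [
            {"text": "Chunk of source text...", "source": "doc_a.md", "score": 0.82},
            {"text": "Another chunk...", "source": "doc_b.md", "score": 0.74},
        ],
    }
]

questions = [
    {
        "id": "q1",
        "question": "What does the service return on timeout?",
        "reference_answer": "A 504 error with a retry-after header.",
        "generated_answer": "It returns a 504 and asks the client to retry.",
    }
]

with open("retrieved.json", "w") as f:
    json.dump(contexts, f, indent=2)
with open("eval_set.json", "w") as f:
    json.dump(questions, f, indent=2)
```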
Agent Orchestrator

Parses agent definitions and visualizes execution flows. Validates tool configurations.

Input: Agent configuration (YAML/JSON)
Output: Workflow visualization, validation report

Usage:

# Validate agent configuration
python scripts/agent_orchestrator.py agent.yaml --validate

# Output:
# === Agent Validation Report ===
# Agent: research_assistant
# Pattern: ReAct
#
# Tools (4 registered):
#   [OK]   web_search  - API key configured
#   [OK]   calculator  - No config needed
#   [WARN] file_reader - Missing allowed_paths
#   [OK]   summarizer  - Prompt template valid
#
# Flow Analysis:
#   Max depth: 5 iterations
#   Estimated tokens/run: 2,400-4,800
#   Potential infinite loop: No
#
# Recommendations:
# 1. Add allowed_paths to file_reader for security
# 2. Consider adding early exit condition for simple queries

# Visualize agent workflow (ASCII)
python scripts/agent_orchestrator.py agent.yaml --visualize

# Output:
# ┌───────────────────────────┐
# │    research_assistant     │
# │      (ReAct Pattern)      │
# └─────────────┬─────────────┘
#               │
#      ┌────────▼────────┐
#      │   User Query    │
#      └────────┬────────┘
#               │
#      ┌────────▼────────┐
#      │      Think      │◄─────────────────┐
#      └────────┬────────┘                  │
#               │                           │
#      ┌────────▼────────┐                  │
#      │   Select Tool   │                  │
#      └────────┬────────┘                  │
#               │                           │
#   ┌───────────┼───────────┐               │
#   ▼           ▼           ▼               │
# [web_search] [calculator] [file_reader]   │
#   │           │           │               │
#   └───────────┼───────────┘               │
#               │                           │
#      ┌────────▼────────┐                  │
#      │     Observe     │──────────────────┘
#      └────────┬────────┘
#               │
#      ┌────────▼────────┐
#      │  Final Answer   │
#      └─────────────────┘

# Export workflow as Mermaid diagram
python scripts/agent_orchestrator.py agent.yaml --visualize --format mermaid
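The agent.yaml schema is not documented in this section, so the sketch below only shows one plausible shape built from the fields the validation report mentions (agent name, pattern, tools, iteration limit). Treat every key name as an assumption and confirm the real format in references/agentic_system_design.md or the script source.

```python
# Hypothetical agent.yaml for agent_orchestrator.py -- key names are guesses
# derived from the validation report above, not a documented schema.
import yaml  # PyYAML

agent_config = {
    "name": "research_assistant",
    "pattern": "ReAct",
    "max_iterations": 5,
    "tools": [
        {"name": "web_search", "api_key_env": "SEARCH_API_KEY"},
        {"name": "calculator"},
        {"name": "file_reader", "allowed_paths": ["./data"]},
        {"name": "summarizer", "prompt_template": "prompts/summarize.txt"},
    ],
}

with open("agent.yaml", "w") as f:
    yaml.safe_dump(agent_config, f, sort_keys=False)
```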
Prompt Engineering Workflows

Prompt Optimization Workflow

Use when improving an existing prompt's performance or reducing token costs.

Step 1: Baseline current prompt

python scripts/prompt_optimizer.py current_prompt.txt --analyze --output baseline.json

Step 2: Identify issues

Review the analysis report for:
- Token waste (redundant instructions, verbose examples)
- Ambiguous instructions (unclear output format, vague verbs)
- Missing constraints (no length limits, no format specification)

Step 3: Apply optimization patterns

| Issue | Pattern to Apply |
| --- | --- |
| Ambiguous output | Add explicit format specification |
| Too verbose | Extract to few-shot examples |
| Inconsistent results | Add role/persona framing |
| Missing edge cases | Add constraint boundaries |

Step 4: Generate optimized version

python scripts/prompt_optimizer.py current_prompt.txt --optimize --output optimized.txt

Step 5: Compare results

python scripts/prompt_optimizer.py optimized.txt --analyze --compare baseline.json
# Shows: token reduction, clarity improvement, issues resolved

Step 6: Validate with test cases

Run both prompts against your evaluation set and compare outputs (see the sketch below).
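Step 6 is left to the reader, so here is a minimal sketch of one way to run both prompt versions over the same test cases. The call_model helper and the eval_cases.json format are hypothetical stand-ins, not part of the skill; wrap whatever LLM client and scoring method you actually use.

```python
# Minimal A/B harness for Step 6 -- call_model is a hypothetical stand-in
# for your LLM client; the test-case file format is also an assumption.
import json

def call_model(prompt: str) -> str:
    raise NotImplementedError("wrap your LLM client here")

def run_eval(prompt_template: str, cases: list[dict]) -> list[dict]:
    results = []
    for case in cases:
        prompt = prompt_template.replace("{input}", case["input"])
        results.append({
            "input": case["input"],
            "expected": case.get("expected"),
            "output": call_model(prompt),
        })
    return results

with open("eval_cases.json") as f:
    cases = json.load(f)  # e.g. [{"input": "...", "expected": "..."}, ...]

baseline = run_eval(open("current_prompt.txt").read(), cases)
candidate = run_eval(open("optimized.txt").read(), cases)

# Compare side by side; swap in whatever scoring you use (exact match, rubric, judge model).
for b, c in zip(baseline, candidate):
    print(f"input    : {b['input'][:60]}")
    print(f"baseline : {b['output'][:80]}")
    print(f"optimized: {c['output'][:80]}")
```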
Few-Shot Example Design

Use when creating examples for in-context learning.

Step 1: Define the task clearly

Task: Extract product entities from customer reviews
Input: Review text
Output: JSON with {product_name, sentiment, features_mentioned}

Step 2: Select diverse examples (3-5 recommended)

| Example Type | Purpose |
| --- | --- |
| Simple case | Shows basic pattern |
| Edge case | Handles ambiguity |
| Complex case | Multiple entities |
| Negative case | What NOT to extract |

Step 3: Format consistently

Example 1:
Input: "Love my new iPhone 15, the camera is amazing!"
Output: {"product_name": "iPhone 15", "sentiment": "positive", "features_mentioned": ["camera"]}

Example 2:
Input: "The laptop was okay but battery life is terrible."
Output: {"product_name": "laptop", "sentiment": "mixed", "features_mentioned": ["battery life"]}

Step 4: Validate example quality

python scripts/prompt_optimizer.py prompt_with_examples.txt --validate-examples
# Checks: consistency, coverage, format alignment

Step 5: Test with held-out cases

Ensure model generalizes beyond your examples.
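As a complement to the --validate-examples check in Step 4, a quick hand-rolled consistency pass over examples in the Step 3 format might look like the sketch below. It is illustrative only, not the optimizer's implementation; the required keys come from the task definition in Step 1.

```python
# Quick consistency check for few-shot examples in the Step 3 format.
# Illustrative sketch, not the --validate-examples implementation.
import json

REQUIRED_KEYS = {"product_name", "sentiment", "features_mentioned"}

examples = [
    {"input": "Love my new iPhone 15, the camera is amazing!",
     "output": '{"product_name": "iPhone 15", "sentiment": "positive", "features_mentioned": ["camera"]}'},
    {"input": "The laptop was okay but battery life is terrible.",
     "output": '{"product_name": "laptop", "sentiment": "mixed", "features_mentioned": ["battery life"]}'},
]

for i, ex in enumerate(examples, 1):
    try:
        parsed = json.loads(ex["output"])
    except json.JSONDecodeError as err:
        print(f"Example {i}: output is not valid JSON ({err})")
        continue
    missing = REQUIRED_KEYS - parsed.keys()
    extra = parsed.keys() - REQUIRED_KEYS
    if missing or extra:
        print(f"Example {i}: missing keys {sorted(missing)}, unexpected keys {sorted(extra)}")
    else:
        print(f"Example {i}: OK")
```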
Reference Documentation

| File | Contains | Load when user asks about |
| --- | --- | --- |
| references/prompt_engineering_patterns.md | 10 prompt patterns with input/output examples | "which pattern?", "few-shot", "chain-of-thought", "role prompting" |
| references/llm_evaluation_frameworks.md | Evaluation metrics, scoring methods, A/B testing | "how to evaluate?", "measure quality", "compare prompts" |
| references/agentic_system_design.md | Agent architectures (ReAct, Plan-Execute, Tool Use) | "build agent", "tool calling", "multi-agent" |
Common Patterns

| Pattern | When to Use | Example |
| --- | --- | --- |
| Zero-shot | Simple, well-defined tasks | "Classify this email as spam or not spam" |
| Few-shot | Complex tasks, consistent format needed | Provide 3-5 examples before the task |
| Chain-of-Thought | Reasoning, math, multi-step logic | "Think step by step..." |
| Role Prompting | Expertise needed, specific perspective | "You are an expert tax accountant..." |
| Structured Output | Need parseable JSON/XML | Include schema + format enforcement |
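To make the Structured Output row concrete, here is a small illustrative skeleton that states a schema in the prompt and parses the reply defensively. The schema, the retry logic, and the call_model helper are all placeholders of my own, not part of the skill's scripts.

```python
# Illustrative structured-output pattern: state the schema, demand JSON only,
# then parse defensively. call_model is a hypothetical LLM-client wrapper.
import json

SCHEMA_HINT = """Respond with JSON only, matching exactly this schema:
{"product_name": string, "sentiment": "positive" | "negative" | "mixed",
 "features_mentioned": [string, ...]}"""

def call_model(prompt: str) -> str:
    raise NotImplementedError("wrap your LLM client here")

def extract(review: str, retries: int = 2) -> dict:
    prompt = f"{SCHEMA_HINT}\n\nReview: {review}"
    for _ in range(retries + 1):
        raw = call_model(prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            # Re-ask with a stricter reminder if the model wrapped the JSON in prose.
            prompt = f"{SCHEMA_HINT}\n\nReturn only the JSON object. Review: {review}"
    raise ValueError("model never returned valid JSON")
```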
Quick Reference

# Prompt Analysis
python scripts/prompt_optimizer.py prompt.txt --analyze     # Full analysis
python scripts/prompt_optimizer.py prompt.txt --tokens      # Token count only
python scripts/prompt_optimizer.py prompt.txt --optimize    # Generate optimized version

# RAG Evaluation
python scripts/rag_evaluator.py --contexts ctx.json --questions q.json    # Evaluate
python scripts/rag_evaluator.py --contexts ctx.json --compare baseline    # Compare to baseline

# Agent Development
python scripts/agent_orchestrator.py agent.yaml --validate         # Validate config
python scripts/agent_orchestrator.py agent.yaml --visualize        # Show workflow
python scripts/agent_orchestrator.py agent.yaml --estimate-cost    # Token estimation