Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Build, optimize, and debug RAG pipelines with chunking strategies, retrieval tuning, evaluation metrics, and production monitoring.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
User wants to implement, improve, or troubleshoot Retrieval-Augmented Generation systems.
| Topic | File |
|---|---|
| Pipeline components & architecture | architecture.md |
| Implementation patterns & code | implementation.md |
| Evaluation metrics & debugging | evaluation.md |
| Security & compliance | security.md |
- Architecture design: select embedding models, vector DBs, and chunking strategies based on requirements
- Implementation: write ingestion pipelines, query handlers, and update logic
- Retrieval optimization: tune top-k, reranking, and hybrid search parameters
- Evaluation: build test datasets, measure recall/precision, diagnose failures
- Production ops: monitor quality drift, set up alerts, debug degradation
- Security: PII detection, access control, compliance requirements
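The retrieval step in the capabilities above can be sketched with plain cosine similarity over precomputed vectors. This is a minimal illustration, not part of the package: the function name is invented, and real pipelines would obtain both vectors from the same embedding model rather than hard-coded arrays.

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=3):
    """Return indices of the k rows of doc_vecs most similar to query_vec,
    by cosine similarity (highest score first)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                  # cosine similarity per document
    return np.argsort(-scores)[:k]  # best-scoring document indices
```

Because both sides are normalized with the same procedure, scores stay comparable across queries, which matters later when monitoring similarity-score drift.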
Before recommending architecture, ask:
- What document types and volume?
- Latency requirements (real-time chat vs batch)?
- Update frequency (how often do docs change)?
- Access control needs (who can see what)?
- Compliance constraints (GDPR, HIPAA, SOC2)?
- Budget (managed vs self-hosted, embedding costs)?
- Never skip access control: filter at retrieval time, not after
- Always overlap chunks: 10-20% overlap prevents context loss at boundaries
- Evaluate before optimizing: build the eval dataset first, then tune
- Same embedding model: query and documents must use the identical model
- Monitor similarity scores: dropping averages signal drift or issues
- Plan for deletion: GDPR erasure requires re-embedding capability
| Symptom | Likely Cause | Fix |
|---|---|---|
| Wrong docs retrieved | Query too vague, poor chunks | Query expansion, smaller chunks |
| Relevant doc missed | Not indexed, low similarity | Check ingestion, hybrid search |
| Hallucinated answers | Context too short | Increase top-k, better reranking |
| Slow responses | Large chunks, no caching | Optimize chunk size, cache embeddings |
| Inconsistent results | Non-deterministic reranking | Set seeds, use stable sorting |
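Diagnosing the symptoms above is easier with a baseline metric first. A minimal recall@k over a hand-labeled eval set can be sketched as follows; the function names and the `(retrieved_ids, relevant_ids)` pair format are illustrative assumptions, not an API from the package.

```python
def recall_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the relevant docs that appear in the top-k retrieved list."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def mean_recall_at_k(eval_set, k=5):
    """Average recall@k over (retrieved_ids, relevant_ids) pairs."""
    return sum(recall_at_k(r, rel, k) for r, rel in eval_set) / len(eval_set)
```

Tracking this number before and after a change (chunk size, top-k, reranker) tells you whether a "fix" from the table actually moved retrieval quality.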