
Bioinformatics

Analyze DNA, RNA, and protein sequences with alignment, variant calling, and expression analysis pipelines.


Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform: OpenClaw
Install method: Manual import
Extraction: Extract archive
Prerequisites: OpenClaw
Primary doc: SKILL.md

Package facts

Download mode: Yavira redirect
Package format: ZIP package
Source platform: Tencent SkillHub
What's included: SKILL.md, formats.md, memory-template.md, rnaseq.md, setup.md, tools.md

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.

New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source: Tencent SkillHub
Verification: Indexed source record
Version: 1.0.0

Documentation


Setup

On first use, read setup.md for integration guidelines. Create ~/bioinformatics/ with user consent to store project context and preferences.

When to Use

User needs to analyze biological sequences, run genomic pipelines, or interpret sequencing data. Agent handles sequence alignment, variant calling, expression analysis, and format conversions.

Architecture

Memory lives in ~/bioinformatics/. See memory-template.md for structure.

~/bioinformatics/
├── memory.md      # Projects, preferences, reference genomes
├── pipelines/     # Saved pipeline configurations
└── results/       # Analysis outputs and logs
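The layout above could be created with a small helper like the following. This is a minimal sketch, not the skill's own setup code: the function name init_bioinformatics_home is hypothetical, and it assumes the user has already agreed to the directory being created.

```shell
# Sketch: create the memory layout (hypothetical helper, not part of the skill).
# Usage: init_bioinformatics_home [base_dir]   (defaults to ~/bioinformatics)
init_bioinformatics_home() {
    base="${1:-$HOME/bioinformatics}"
    mkdir -p "$base/pipelines" "$base/results"
    # Create memory.md only if it is missing, so existing notes are never overwritten.
    [ -e "$base/memory.md" ] || : > "$base/memory.md"
    printf '%s\n' "$base"
}
```

Calling init_bioinformatics_home with no argument targets ~/bioinformatics/; running it again is harmless because mkdir -p and the existence check leave existing content alone.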

Quick Reference

Topic             File
Setup process     setup.md
Memory template   memory-template.md
File formats      formats.md
Tool commands     tools.md
RNA-seq pipeline  rnaseq.md
Variant calling   variants.md

1. Verify Input Quality First

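Before any analysis, check input data quality:
  • FASTQ: Run FastQC; check per-base quality and adapter content.
  • BAM: Confirm the file is intact (samtools quickcheck), coordinate-sorted (SO:coordinate on the @HD header line), and indexed. Note that quickcheck only tests integrity, not sort order.
  • VCF: Confirm the header parses (bcftools view -h).

Bad input → garbage output. Always QC first.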
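Because samtools quickcheck reports only whether a BAM is intact, sort order has to be read from the header. The sketch below is one way to do that; the helper name sam_sort_order is hypothetical, and it assumes the header follows the SAM convention of an @HD line carrying an SO: field.

```shell
# Read SAM header text on stdin and print the sort order from the @HD line.
# Prints "unknown" when no SO: field is present.
sam_sort_order() {
    awk -F'\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
        }
        END { if (!found) print "unknown" }
    '
}
```

A typical call would be samtools view -H aligned.bam | sam_sort_order, run after samtools quickcheck aligned.bam has returned success.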

2. Use Reference Genome Consistently

Track which reference is used per project:
  • Human: GRCh38/hg38 (preferred) or GRCh37/hg19
  • Mouse: GRCm39/mm39 or GRCm38/mm10

Mixing references within one analysis produces invalid results. Store reference info in ~/bioinformatics/memory.md per project.
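One way to catch a mixed-reference mistake early is to compare the contigs in a BAM header against the FASTA index of the reference in use. The following is a sketch under assumptions: contigs_missing_from_fai is a hypothetical helper, and the .fai file is assumed to be the tab-separated index written by samtools faidx (name in column 1, length in column 2).

```shell
# Print contigs (name and length) found in the SAM header on stdin but absent
# from the given FASTA index (.fai). Empty output means the header and the
# reference agree on every contig name and length.
contigs_missing_from_fai() {
    awk -F'\t' -v fai="$1" '
        BEGIN {
            # Load reference contig names and lengths from the .fai file.
            while ((getline line < fai) > 0) {
                split(line, f, "\t")
                ref[f[1] "\t" f[2]] = 1
            }
        }
        $1 == "@SQ" {
            name = ""; len = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len = substr($i, 4)
            }
            if (!((name "\t" len) in ref)) print name "\t" len
        }
    '
}
```

For example, samtools view -H aligned.bam | contigs_missing_from_fai reference.fa.fai lists any contig whose name or length differs, which also flags chr1 versus 1 naming mismatches.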

3. Preserve Raw Data

NEVER modify original FASTQ/BAM files:
  • Work on copies
  • Keep originals read-only
  • Log every transformation step
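The read-only and logging rules can be sketched with two small helpers. Both names (protect_raw, log_step) and the log format are assumptions made for illustration, not something the skill package defines.

```shell
# Sketch: mark raw inputs read-only and record each transformation step.
# protect_raw and log_step are hypothetical helper names.

# Remove write permission for everyone on the given files.
protect_raw() {
    for f in "$@"; do
        chmod a-w "$f"
    done
}

# Append a UTC timestamp and the given command text to a log file.
log_step() {
    logfile="$1"; shift
    printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$logfile"
}
```

A session might call protect_raw raw/*.fastq.gz once, then log_step ~/bioinformatics/results/steps.log followed by the exact command text before each command that writes a derived file.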

4. Resource Awareness

Bioinformatics commands can consume massive resources:
  • Check file sizes before operations
  • Use streaming when possible (samtools view | ...)
  • Estimate memory needs (BWA: ~6GB for human genome)
  • Warn before operations expected to run longer than 10 minutes
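Checking file size before a heavy operation can be as simple as the sketch below. The helper names and the idea of a fixed MiB threshold are assumptions for illustration; a sensible threshold depends on the machine and the tool.

```shell
# Print a file's size in whole mebibytes (rounded down).
size_mib() {
    bytes="$(wc -c < "$1")"
    echo $(( bytes / 1048576 ))
}

# Succeed (exit 0) when the file is larger than the given threshold in MiB.
is_larger_than_mib() {
    [ "$(size_mib "$1")" -gt "$2" ]
}
```

For instance, is_larger_than_mib aligned.bam 10240 && echo "aligned.bam is over 10 GiB; this may take a while" gives the user a chance to stop before a long-running sort or merge.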

5. Reproducibility

Every analysis must be reproducible:
  • Log exact tool versions (samtools --version)
  • Save command parameters
  • Record input file checksums for critical analyses
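Input checksums can be captured in a manifest file next to the results. This is a sketch: record_checksums is a hypothetical helper, and it assumes either GNU sha256sum or, failing that, shasum -a 256 is on the PATH.

```shell
# Append "sha256  filename" lines for the given input files to a manifest.
# Uses sha256sum when available and falls back to shasum -a 256 otherwise.
record_checksums() {
    manifest="$1"; shift
    if command -v sha256sum > /dev/null 2>&1; then
        sha256sum "$@" >> "$manifest"
    else
        shasum -a 256 "$@" >> "$manifest"
    fi
}
```

Tool versions can be appended to the same directory with a line such as samtools --version | head -n 1 >> versions.txt, and the manifest can later be re-checked with sha256sum -c.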

Common Traps

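  • Wrong chromosome naming: chr1 vs 1 causes silent failures. For plain-text interval files such as BED, sed 's/^chr//' strips the prefix. For VCF use bcftools annotate --rename-chrs, and for BAM rewrite the header with samtools reheader, so that header contig names change as well.
  • Unsorted BAM: Most tools expect coordinate-sorted input. Symptoms: errors, or wrong results with no warning.
  • Index missing: BAM needs .bai (or .csi); compressed VCF needs .tbi or .csi. Commands fail cryptically without them.
  • Memory exhaustion: Large BAM operations can kill the session. Stream where possible, and set --threads with care: samtools sort applies its -m memory limit per thread, so more threads means more memory.
  • Stale indices: After modifying a BAM/VCF, regenerate its index. An old index can return wrong or truncated records.
  • 0-based vs 1-based coordinates: BED is 0-based and half-open; VCF and GFF are 1-based. Off-by-one bugs are common.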
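The coordinate trap is easiest to see in a conversion. The sketch below turns BED intervals into 1-based, closed region strings of the kind accepted by samtools view and bcftools view -r; bed_to_region is a hypothetical helper name.

```shell
# Convert tab-separated BED intervals (0-based, half-open) on stdin into
# 1-based, closed region strings of the form chrom:start-end.
# Only the start shifts by one; the end value is numerically unchanged.
bed_to_region() {
    awk -F'\t' 'NF >= 3 && $1 !~ /^#/ { printf "%s:%d-%d\n", $1, $2 + 1, $3 }'
}
```

The BED line chr1, 0, 100 covers the first 100 bases, which is chr1:1-100 in 1-based coordinates; forgetting the +1 on the start silently shifts every interval by one base.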

File Formats Quick Reference

Format    Purpose               Key Tool
FASTA     Reference sequences   samtools faidx
FASTQ     Raw reads + quality   seqtk, fastp
SAM/BAM   Aligned reads         samtools
VCF/BCF   Variants              bcftools
BED       Genomic intervals     bedtools
GFF/GTF   Gene annotations      gffread
BigWig    Coverage tracks       deepTools

Quality Control

    # FASTQ quality report (FastQC expects the output directory to exist)
    mkdir -p qc_reports
    fastqc sample.fastq.gz -o qc_reports/

    # Trim adapters + low quality
    fastp -i R1.fq.gz -I R2.fq.gz -o R1.clean.fq.gz -O R2.clean.fq.gz

    # BAM statistics
    samtools flagstat aligned.bam
    samtools stats aligned.bam > stats.txt

Alignment

    # Index reference (once)
    bwa index reference.fa

    # Align paired-end reads
    bwa mem -t 8 reference.fa R1.fq.gz R2.fq.gz | \
        samtools sort -o aligned.bam -

    # Index BAM
    samtools index aligned.bam

Variant Calling

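    # Call variants
    bcftools mpileup -Ou -f reference.fa aligned.bam | \
        bcftools call -mv -Oz -o variants.vcf.gz

    # Index VCF (writes a .csi index by default; add -t for .tbi)
    bcftools index variants.vcf.gz

    # Soft-filter: tag records with QUAL<20 as LowQual and write the result to a new file
    bcftools filter -s LowQual -e 'QUAL<20' -Oz -o variants.filtered.vcf.gz variants.vcf.gz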

Data Manipulation

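    # Extract region (requires an indexed BAM)
    samtools view -b aligned.bam chr1:1000000-2000000 > region.bam

    # Convert BAM to FASTQ: collate first so mates are adjacent, and use new
    # output names so the original FASTQ files are not overwritten
    samtools collate -u -O aligned.bam | \
        samtools fastq -1 extracted_R1.fq.gz -2 extracted_R2.fq.gz -0 /dev/null -s /dev/null -n

    # Merge BAMs
    samtools merge merged.bam sample1.bam sample2.bam

    # Subset VCF by region (requires an indexed VCF)
    bcftools view -r chr1:1000-2000 variants.vcf.gz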

Security & Privacy

Data access:
  • Only reads files user explicitly provides as input
  • Writes outputs to directories user specifies
  • Stores preferences in ~/bioinformatics/ (with consent)

Data that stays local:
  • All sequence data processed locally
  • No external API calls for analysis
  • Pipeline configs in ~/bioinformatics/

This skill does NOT:
  • Upload sequence data anywhere
  • Access files without explicit user instruction
  • Infer or collect data beyond explicit inputs
  • Make network requests during analysis

Note: Installing tools (conda, brew) and downloading reference genomes requires internet access. These are user-initiated actions.

Related Skills

Install with clawhub install <slug> if user confirms:
  • data-analysis: statistical interpretation
  • statistics: hypothesis testing
  • science: research methodology

Feedback

  • If useful: clawhub star bioinformatics
  • Stay updated: clawhub sync

Category context

Data access, storage, extraction, analysis, reporting, and insight generation.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.
