โ† All skills
Tencent SkillHub ยท Data Analysis

Data Analyst

Data visualization, report generation, SQL queries, and spreadsheet automation. Transform your AI agent into a data-savvy analyst that turns raw data into actionable insights.


Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

  • Target platform: OpenClaw
  • Install method: Manual import
  • Extraction: Extract archive
  • Prerequisites: OpenClaw
  • Primary doc: SKILL.md

Package facts

  • Download mode: Yavira redirect
  • Package format: ZIP package
  • Source platform: Tencent SkillHub
  • What's included: SKILL.md, scripts/data-init.sh, scripts/query.sh

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of working through the steps manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

  • Source: Tencent SkillHub
  • Verification: Indexed source record
  • Version: 1.0.0

Documentation

Primary doc: SKILL.md (27 sections)

Data Analyst Skill 📊

Turn your AI agent into a data analysis powerhouse. Query databases, analyze spreadsheets, create visualizations, and generate insights that drive decisions.

What This Skill Does

  • ✅ SQL Queries – Write and execute queries against databases
  • ✅ Spreadsheet Analysis – Process CSV, Excel, Google Sheets data
  • ✅ Data Visualization – Create charts, graphs, and dashboards
  • ✅ Report Generation – Automated reports with insights
  • ✅ Data Cleaning – Handle missing data, outliers, formatting
  • ✅ Statistical Analysis – Descriptive stats, trends, correlations

Quick Start

  1. Configure your data sources in TOOLS.md:

```markdown
### Data Sources
- Primary DB: [Connection string or description]
- Spreadsheets: [Google Sheets URL / local path]
- Data warehouse: [BigQuery/Snowflake/etc.]
```

  2. Set up your workspace:

```bash
./scripts/data-init.sh
```

  3. Start analyzing!

Common Query Templates

Basic Data Exploration

```sql
-- Row count
SELECT COUNT(*) FROM table_name;

-- Sample data
SELECT * FROM table_name LIMIT 10;

-- Column statistics
SELECT
  COUNT(*) AS count,
  COUNT(DISTINCT column_name) AS unique_values,
  MIN(column_name) AS min_val,
  MAX(column_name) AS max_val
FROM table_name;
```

Time-Based Analysis

```sql
-- Daily aggregation
SELECT
  DATE(created_at) AS date,
  COUNT(*) AS daily_count,
  SUM(amount) AS daily_total
FROM transactions
GROUP BY DATE(created_at)
ORDER BY date DESC;

-- Month-over-month comparison (100.0 forces float division)
SELECT
  DATE_TRUNC('month', created_at) AS month,
  COUNT(*) AS count,
  LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at)) AS prev_month,
  (COUNT(*) - LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at)))
    * 100.0 / NULLIF(LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at)), 0) AS growth_pct
FROM transactions
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;
```

Cohort Analysis

```sql
-- User cohort by signup month
SELECT
  DATE_TRUNC('month', u.created_at) AS cohort_month,
  DATE_TRUNC('month', o.created_at) AS activity_month,
  COUNT(DISTINCT u.id) AS users
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY cohort_month, activity_month
ORDER BY cohort_month, activity_month;
```

Funnel Analysis

```sql
-- Conversion funnel over the last 30 days
WITH funnel AS (
  SELECT
    COUNT(DISTINCT CASE WHEN event = 'page_view' THEN user_id END) AS views,
    COUNT(DISTINCT CASE WHEN event = 'signup' THEN user_id END) AS signups,
    COUNT(DISTINCT CASE WHEN event = 'purchase' THEN user_id END) AS purchases
  FROM events
  WHERE date >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
  views,
  signups,
  ROUND(signups * 100.0 / NULLIF(views, 0), 2) AS signup_rate,
  purchases,
  ROUND(purchases * 100.0 / NULLIF(signups, 0), 2) AS purchase_rate
FROM funnel;
```

Common Data Quality Issues

| Issue | Detection | Solution |
|-------|-----------|----------|
| Missing values | IS NULL or empty string | Impute, drop, or flag |
| Duplicates | GROUP BY with HAVING COUNT(*) > 1 | Deduplicate with rules |
| Outliers | Z-score > 3 or IQR method | Investigate, cap, or exclude |
| Inconsistent formats | Sample and pattern match | Standardize with transforms |
| Invalid values | Range checks, referential integrity | Validate and correct |
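
The z-score and IQR checks in the table translate directly to pandas. A minimal sketch, assuming a numeric column named value (the sample frame is illustrative):

```python
import pandas as pd

# Illustrative data; substitute your own frame and column.
df = pd.DataFrame({"value": [10, 12, 11, 13, 12] * 4 + [300]})

# Z-score method: flag points more than 3 standard deviations from the mean.
z = (df["value"] - df["value"].mean()) / df["value"].std()
zscore_outliers = df[z.abs() > 3]

# IQR method: flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, q3 = df["value"].quantile([0.25, 0.75])
iqr = q3 - q1
iqr_outliers = df[(df["value"] < q1 - 1.5 * iqr) | (df["value"] > q3 + 1.5 * iqr)]

print(zscore_outliers)
print(iqr_outliers)
```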

Data Cleaning SQL Patterns

```sql
-- Find duplicates
SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

-- Find nulls
SELECT
  COUNT(*) AS total,
  SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END) AS null_emails,
  SUM(CASE WHEN name IS NULL THEN 1 ELSE 0 END) AS null_names
FROM users;

-- Standardize text
UPDATE products SET category = LOWER(TRIM(category));

-- Remove outliers (IQR method)
WITH stats AS (
  SELECT
    PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY value) AS q1,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY value) AS q3
  FROM data
)
SELECT *
FROM data, stats
WHERE value BETWEEN q1 - 1.5 * (q3 - q1) AND q3 + 1.5 * (q3 - q1);
```

Data Cleaning Checklist

```markdown
# Data Quality Audit: [Dataset]

## Row-Level Checks
- [ ] Total row count: [X]
- [ ] Duplicate rows: [X]
- [ ] Rows with any null: [X]

## Column-Level Checks
| Column | Type | Nulls | Unique | Min | Max | Issues |
|--------|------|-------|--------|-----|-----|--------|
| [col] | [type] | [n] | [n] | [v] | [v] | [notes] |

## Data Lineage
- Source: [Where data came from]
- Last updated: [Date]
- Known issues: [List]

## Cleaning Actions Taken
1. [Action and reason]
2. [Action and reason]
```
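
When the dataset is already loaded in pandas, the row-level numbers can be filled in mechanically. A minimal sketch, assuming a local data.csv (the path is a placeholder):

```python
import pandas as pd

# Placeholder path; substitute your dataset.
df = pd.read_csv("data.csv")

# Row-level checks from the audit template above.
print(f"Total row count: {len(df)}")
print(f"Duplicate rows: {df.duplicated().sum()}")
print(f"Rows with any null: {df.isnull().any(axis=1).sum()}")
```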

CSV/Excel Processing with Python

```python
import pandas as pd

# Load data
df = pd.read_csv('data.csv')  # or pd.read_excel('data.xlsx')

# Basic exploration
print(df.shape)       # (rows, columns)
df.info()             # Column types and nulls (prints directly)
print(df.describe())  # Numeric statistics

# Data cleaning
df = df.drop_duplicates()
df['date'] = pd.to_datetime(df['date'])
df['amount'] = df['amount'].fillna(0)

# Analysis
summary = df.groupby('category').agg({
    'amount': ['sum', 'mean', 'count'],
    'quantity': 'sum'
}).round(2)

# Export
summary.to_csv('analysis_output.csv')
```

Common Pandas Operations

```python
# Filtering
filtered = df[df['status'] == 'active']
filtered = df[df['amount'] > 1000]
filtered = df[df['date'].between('2024-01-01', '2024-12-31')]

# Aggregation
by_category = df.groupby('category')['amount'].sum()
pivot = df.pivot_table(values='amount', index='month',
                       columns='category', aggfunc='sum')

# Window functions
df['running_total'] = df['amount'].cumsum()
df['pct_change'] = df['amount'].pct_change()
df['rolling_avg'] = df['amount'].rolling(window=7).mean()

# Merging
merged = pd.merge(df1, df2, on='id', how='left')
```

Chart Selection Guide

| Data Type | Best Chart | Use When |
|-----------|------------|----------|
| Trend over time | Line chart | Showing patterns/changes over time |
| Category comparison | Bar chart | Comparing discrete categories |
| Part of whole | Pie/Donut | Showing proportions (≤5 categories) |
| Distribution | Histogram | Understanding data spread |
| Correlation | Scatter plot | Relationship between two variables |
| Many categories | Horizontal bar | Ranking or comparing many items |
| Geographic | Map | Location-based data |
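
The guide also works as a simple lookup. A sketch of one way to encode it; the helper name and keys are illustrative, not part of the package:

```python
# Illustrative encoding of the chart-selection table above.
CHART_GUIDE = {
    "trend over time": "line chart",
    "category comparison": "bar chart",
    "part of whole": "pie/donut (≤5 categories)",
    "distribution": "histogram",
    "correlation": "scatter plot",
    "many categories": "horizontal bar",
    "geographic": "map",
}

def suggest_chart(data_type: str) -> str:
    """Return the suggested chart type, falling back to a safe default."""
    return CHART_GUIDE.get(data_type.lower(), "bar chart")

print(suggest_chart("Correlation"))  # scatter plot
```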

Python Visualization with Matplotlib/Seaborn

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Set style
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette("husl")

# Line chart (trends)
plt.figure(figsize=(10, 6))
plt.plot(df['date'], df['value'], marker='o')
plt.title('Trend Over Time')
plt.xlabel('Date')
plt.ylabel('Value')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('trend.png', dpi=150)

# Bar chart (comparisons)
plt.figure(figsize=(10, 6))
sns.barplot(data=df, x='category', y='amount')
plt.title('Amount by Category')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('comparison.png', dpi=150)

# Heatmap (correlations)
plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm', center=0)
plt.title('Correlation Matrix')
plt.tight_layout()
plt.savefig('correlation.png', dpi=150)
```

ASCII Charts (Quick Terminal Visualization)

When you can't generate images, use ASCII:

```
Revenue by Month (in $K)
========================
Jan: ████████████████ 160
Feb: ██████████████████ 180
Mar: ████████████████████████ 240
Apr: ██████████████████████ 220
May: ██████████████████████████ 260
Jun: ████████████████████████████ 280
```
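
Bars like these are easy to generate programmatically. A minimal sketch; the revenue figures are the illustrative ones from the chart above:

```python
# Illustrative data matching the chart above.
revenue_k = {"Jan": 160, "Feb": 180, "Mar": 240, "Apr": 220, "May": 260, "Jun": 280}

def ascii_bar_chart(data: dict, title: str, scale: float = 0.1) -> str:
    """Render a horizontal ASCII bar chart; each block represents 1/scale units."""
    lines = [title, "=" * len(title)]
    for label, value in data.items():
        lines.append(f"{label}: {'█' * round(value * scale)} {value}")
    return "\n".join(lines)

print(ascii_bar_chart(revenue_k, "Revenue by Month (in $K)"))
```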

Standard Report Template

```markdown
# [Report Name]

**Period:** [Date range]
**Generated:** [Date]
**Author:** [Agent/Human]

## Executive Summary
[2-3 sentences with key findings]

## Key Metrics
| Metric | Current | Previous | Change |
|--------|---------|----------|--------|
| [Metric] | [Value] | [Value] | [+/-X%] |

## Detailed Analysis
### [Section 1]
[Analysis with supporting data]
### [Section 2]
[Analysis with supporting data]

## Visualizations
[Insert charts]

## Insights
1. **[Insight]**: [Supporting evidence]
2. **[Insight]**: [Supporting evidence]

## Recommendations
1. [Actionable recommendation]
2. [Actionable recommendation]

## Methodology
- Data source: [Source]
- Date range: [Range]
- Filters applied: [Filters]
- Known limitations: [Limitations]

## Appendix
[Supporting data tables]
```
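
The Key Metrics rows are mechanical to produce once current and previous values are known. A minimal sketch, with placeholder metric names and values:

```python
# Placeholder metrics; in practice these come from your queries.
metrics = {
    "Revenue ($K)": (280, 260),
    "Orders": (1240, 1180),
}

rows = [
    "| Metric | Current | Previous | Change |",
    "|--------|---------|----------|--------|",
]
for name, (current, previous) in metrics.items():
    change = (current - previous) / previous * 100
    rows.append(f"| {name} | {current} | {previous} | {change:+.1f}% |")

print("\n".join(rows))
```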

Automated Report Script

```bash
#!/bin/bash
# generate-report.sh

# Pull latest data
python scripts/extract_data.py --output data/latest.csv

# Run analysis
python scripts/analyze.py --input data/latest.csv --output reports/

# Generate report
python scripts/format_report.py --template weekly --output reports/weekly-$(date +%Y-%m-%d).md

echo "Report generated: reports/weekly-$(date +%Y-%m-%d).md"
```

Descriptive Statistics

| Statistic | What It Tells You | Use Case |
|-----------|-------------------|----------|
| Mean | Average value | Central tendency |
| Median | Middle value | Robust to outliers |
| Mode | Most common | Categorical data |
| Std Dev | Spread around mean | Variability |
| Min/Max | Range | Data boundaries |
| Percentiles | Distribution shape | Benchmarking |

Quick Stats with Python

```python
# Full descriptive statistics
stats = df['amount'].describe()
print(stats)

# Additional stats
print(f"Median: {df['amount'].median()}")
print(f"Mode: {df['amount'].mode()[0]}")
print(f"Skewness: {df['amount'].skew()}")
print(f"Kurtosis: {df['amount'].kurtosis()}")

# Correlation
correlation = df['sales'].corr(df['marketing_spend'])
print(f"Correlation: {correlation:.3f}")
```

Statistical Tests Quick Reference

| Test | Use Case | Python |
|------|----------|--------|
| T-test | Compare two means | `scipy.stats.ttest_ind(a, b)` |
| Chi-square | Categorical independence | `scipy.stats.chi2_contingency(table)` |
| ANOVA | Compare 3+ means | `scipy.stats.f_oneway(a, b, c)` |
| Pearson | Linear correlation | `scipy.stats.pearsonr(x, y)` |
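
A minimal end-to-end example of the first test, using illustrative samples and the conventional p < 0.05 threshold:

```python
from scipy import stats

# Illustrative samples, e.g. order values for two user segments.
group_a = [102, 98, 110, 95, 105, 99, 101]
group_b = [120, 118, 125, 115, 122, 119, 121]

# Independent two-sample t-test: do the two group means differ?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference in means is statistically significant at the 5% level.")
```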

Standard Analysis Process

  1. Define the Question – What are we trying to answer? What decisions will this inform?
  2. Understand the Data – What data is available? What's the structure and quality?
  3. Clean and Prepare – Handle missing values, fix data types, remove duplicates.
  4. Explore – Descriptive statistics, initial visualizations, identify patterns.
  5. Analyze – Deep dive into findings, statistical tests if needed, validate hypotheses.
  6. Communicate – Clear visualizations, actionable insights, recommendations.

Analysis Request Template

```markdown
# Analysis Request

## Question
[What are we trying to answer?]

## Context
[Why does this matter? What decision will it inform?]

## Data Available
- [Dataset 1]: [Description]
- [Dataset 2]: [Description]

## Expected Output
- [Deliverable 1]
- [Deliverable 2]

## Timeline
[When is this needed?]

## Notes
[Any constraints or considerations]
```

data-init.sh

Initialize your data analysis workspace.

query.sh

Quick SQL query execution.

```bash
# Run query from file
./scripts/query.sh --file queries/daily-report.sql

# Run inline query
./scripts/query.sh "SELECT COUNT(*) FROM users"

# Save output to file
./scripts/query.sh --file queries/export.sql --output data/export.csv
```

analyze.py

Python analysis toolkit.

```bash
# Basic analysis
python scripts/analyze.py --input data/sales.csv

# With specific analysis type
python scripts/analyze.py --input data/sales.csv --type cohort

# Generate report
python scripts/analyze.py --input data/sales.csv --report weekly
```

With Other Skills

| Skill | Integration |
|-------|-------------|
| Marketing | Analyze campaign performance, content metrics |
| Sales | Pipeline analytics, conversion analysis |
| Business Dev | Market research data, competitor analysis |

Common Data Sources

  • Databases: PostgreSQL, MySQL, SQLite
  • Warehouses: BigQuery, Snowflake, Redshift
  • Spreadsheets: Google Sheets, Excel, CSV
  • APIs: REST endpoints, GraphQL
  • Files: JSON, Parquet, XML
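
Most of these load into pandas in a line or two. A minimal sketch; the file names, table name, and SQLite path are placeholders:

```python
import sqlite3
import pandas as pd

# Database (SQLite shown; placeholder file and table names).
conn = sqlite3.connect("app.db")
users = pd.read_sql_query("SELECT * FROM users LIMIT 10", conn)

# Spreadsheets and flat files (placeholder paths).
sales = pd.read_csv("sales.csv")
budget = pd.read_excel("budget.xlsx")  # requires openpyxl

# Semi-structured files.
events = pd.read_json("events.json")
logs = pd.read_parquet("logs.parquet")  # requires pyarrow or fastparquet
```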

Best Practices

  • Start with the question – Know what you're trying to answer
  • Validate your data – Garbage in = garbage out
  • Document everything – Queries, assumptions, decisions
  • Visualize appropriately – Right chart for the right data
  • Show your work – Methodology matters
  • Lead with insights – Not just data dumps
  • Make it actionable – "So what?" → "Now what?"
  • Version your queries – Track changes over time

Common Mistakes

โŒ Confirmation bias โ€” Looking for data to support a conclusion โŒ Correlation โ‰  causation โ€” Be careful with claims โŒ Cherry-picking โ€” Using only favorable data โŒ Ignoring outliers โ€” Investigate before removing โŒ Over-complicating โ€” Simple analysis often wins โŒ No context โ€” Numbers without comparison are meaningless

License

License: MIT – use freely, modify, distribute.

"The goal is to turn data into information, and information into insight." – Carly Fiorina

Category context

Data access, storage, extraction, analysis, reporting, and insight generation.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
2 scripts, 1 doc
  • SKILL.md – primary doc
  • scripts/data-init.sh – script
  • scripts/query.sh – script