Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Data visualization, report generation, SQL queries, and spreadsheet automation. Transform your AI agent into a data-savvy analyst that turns raw data into actionable insights.
This item's download entry currently bounces back to a listing or homepage instead of returning a package file. Use the source page and any available docs to guide a manual install.
I tried to install a skill package from Yavira, but the item currently does not return a direct package file. Inspect the source page and any extracted docs, then tell me what you can confirm and any manual steps still required.
I tried to upgrade a skill package from Yavira, but the item currently does not return a direct package file. Compare the source page and any extracted docs with my current installation, then summarize what changed and what manual follow-up I still need.
Turn your AI agent into a data analysis powerhouse. Query databases, analyze spreadsheets, create visualizations, and generate insights that drive decisions.
- SQL Queries: write and execute queries against databases
- Spreadsheet Analysis: process CSV, Excel, and Google Sheets data
- Data Visualization: create charts, graphs, and dashboards
- Report Generation: automated reports with insights
- Data Cleaning: handle missing data, outliers, and formatting
- Statistical Analysis: descriptive stats, trends, correlations
Basic Data Exploration

```sql
-- Row count
SELECT COUNT(*) FROM table_name;

-- Sample data
SELECT * FROM table_name LIMIT 10;

-- Column statistics
SELECT
    column_name,
    COUNT(*) AS count,
    COUNT(DISTINCT column_name) AS unique_values,
    MIN(column_name) AS min_val,
    MAX(column_name) AS max_val
FROM table_name
GROUP BY column_name;
```

Time-Based Analysis

```sql
-- Daily aggregation
SELECT
    DATE(created_at) AS date,
    COUNT(*) AS daily_count,
    SUM(amount) AS daily_total
FROM transactions
GROUP BY DATE(created_at)
ORDER BY date DESC;

-- Month-over-month comparison
SELECT
    DATE_TRUNC('month', created_at) AS month,
    COUNT(*) AS count,
    LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at)) AS prev_month,
    (COUNT(*) - LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at)))
        / NULLIF(LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', created_at)), 0) * 100 AS growth_pct
FROM transactions
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;
```

Cohort Analysis

```sql
-- User cohort by signup month
SELECT
    DATE_TRUNC('month', u.created_at) AS cohort_month,
    DATE_TRUNC('month', o.created_at) AS activity_month,
    COUNT(DISTINCT u.id) AS users
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY cohort_month, activity_month
ORDER BY cohort_month, activity_month;
```

Funnel Analysis

```sql
-- Conversion funnel
WITH funnel AS (
    SELECT
        COUNT(DISTINCT CASE WHEN event = 'page_view' THEN user_id END) AS views,
        COUNT(DISTINCT CASE WHEN event = 'signup' THEN user_id END) AS signups,
        COUNT(DISTINCT CASE WHEN event = 'purchase' THEN user_id END) AS purchases
    FROM events
    WHERE date >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
    views,
    signups,
    ROUND(signups * 100.0 / NULLIF(views, 0), 2) AS signup_rate,
    purchases,
    ROUND(purchases * 100.0 / NULLIF(signups, 0), 2) AS purchase_rate
FROM funnel;
```
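The funnel pattern can be exercised end to end without a warehouse. The sketch below runs the same CASE-based funnel against an in-memory SQLite database; the sample events and the dropped date filter (SQLite has no `INTERVAL` literal) are illustrative assumptions, not part of the skill.

```python
import sqlite3

# Hypothetical in-memory events table to exercise the funnel query pattern.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "page_view"), (2, "page_view"), (3, "page_view"), (4, "page_view"),
     (1, "signup"), (2, "signup"),
     (1, "purchase")],
)

# Same CASE-based funnel as the Postgres version; the 30-day date
# filter is omitted because SQLite lacks INTERVAL arithmetic.
row = conn.execute("""
    WITH funnel AS (
        SELECT
            COUNT(DISTINCT CASE WHEN event = 'page_view' THEN user_id END) AS views,
            COUNT(DISTINCT CASE WHEN event = 'signup'    THEN user_id END) AS signups,
            COUNT(DISTINCT CASE WHEN event = 'purchase'  THEN user_id END) AS purchases
        FROM events
    )
    SELECT views, signups,
           ROUND(signups * 100.0 / NULLIF(views, 0), 2)     AS signup_rate,
           purchases,
           ROUND(purchases * 100.0 / NULLIF(signups, 0), 2) AS purchase_rate
    FROM funnel
""").fetchone()

print(row)  # (4, 2, 50.0, 1, 50.0)
```

Four viewers, two signups, one purchase yields a 50% rate at each step, which matches the NULLIF-guarded percentages in the query.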
| Issue | Detection | Solution |
|---|---|---|
| Missing values | IS NULL or empty string | Impute, drop, or flag |
| Duplicates | GROUP BY with HAVING COUNT(*) > 1 | Deduplicate with rules |
| Outliers | Z-score > 3 or IQR method | Investigate, cap, or exclude |
| Inconsistent formats | Sample and pattern match | Standardize with transforms |
| Invalid values | Range checks, referential integrity | Validate and correct |
```sql
-- Find duplicates
SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

-- Find nulls
SELECT
    COUNT(*) AS total,
    SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END) AS null_emails,
    SUM(CASE WHEN name IS NULL THEN 1 ELSE 0 END) AS null_names
FROM users;

-- Standardize text
UPDATE products SET category = LOWER(TRIM(category));

-- Remove outliers (IQR method)
WITH stats AS (
    SELECT
        PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY value) AS q1,
        PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY value) AS q3
    FROM data
)
SELECT *
FROM data, stats
WHERE value BETWEEN q1 - 1.5 * (q3 - q1) AND q3 + 1.5 * (q3 - q1);
```
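The IQR outlier filter can also be reproduced in pure Python, which is handy when the data lives in a list rather than a database. This is a minimal sketch; `values` is made-up sample data, and `statistics.quantiles` uses the default exclusive quartile method, so the cutoffs can differ slightly from PERCENTILE_CONT.

```python
import statistics

# Illustrative sample with one obvious outlier (120).
values = [10, 12, 11, 13, 12, 11, 14, 120]

q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only values inside the IQR fences, preserving order.
clean = [v for v in values if low <= v <= high]
print(clean)  # 120 is excluded
```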
```python
import pandas as pd

# Load data
df = pd.read_csv('data.csv')  # or pd.read_excel('data.xlsx')

# Basic exploration
print(df.shape)   # (rows, columns)
df.info()         # Column types and nulls (prints directly)
print(df.describe())  # Numeric statistics

# Data cleaning
df = df.drop_duplicates()
df['date'] = pd.to_datetime(df['date'])
df['amount'] = df['amount'].fillna(0)

# Analysis
summary = df.groupby('category').agg({
    'amount': ['sum', 'mean', 'count'],
    'quantity': 'sum'
}).round(2)

# Export
summary.to_csv('analysis_output.csv')
```
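When pandas is unavailable, the groupby/agg step can be emulated with the standard library. A minimal sketch, with made-up rows and column names:

```python
from collections import defaultdict

# Illustrative rows standing in for a loaded CSV.
rows = [
    {"category": "books", "amount": 30.0, "quantity": 2},
    {"category": "books", "amount": 20.0, "quantity": 1},
    {"category": "toys",  "amount": 50.0, "quantity": 5},
]

# Bucket rows by category, then aggregate each bucket.
groups = defaultdict(list)
for row in rows:
    groups[row["category"]].append(row)

summary = {
    cat: {
        "amount_sum": round(sum(r["amount"] for r in rs), 2),
        "amount_mean": round(sum(r["amount"] for r in rs) / len(rs), 2),
        "amount_count": len(rs),
        "quantity_sum": sum(r["quantity"] for r in rs),
    }
    for cat, rs in groups.items()
}

print(summary["books"])
# {'amount_sum': 50.0, 'amount_mean': 25.0, 'amount_count': 2, 'quantity_sum': 3}
```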
```python
# Filtering
filtered = df[df['status'] == 'active']
filtered = df[df['amount'] > 1000]
filtered = df[df['date'].between('2024-01-01', '2024-12-31')]

# Aggregation
by_category = df.groupby('category')['amount'].sum()
pivot = df.pivot_table(values='amount', index='month',
                       columns='category', aggfunc='sum')

# Window functions
df['running_total'] = df['amount'].cumsum()
df['pct_change'] = df['amount'].pct_change()
df['rolling_avg'] = df['amount'].rolling(window=7).mean()

# Merging
merged = pd.merge(df1, df2, on='id', how='left')
```
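The window operations map to short stdlib equivalents, which make the semantics explicit. A sketch with illustrative amounts:

```python
from itertools import accumulate

amounts = [100.0, 110.0, 99.0, 120.0]  # illustrative series

# cumsum: running total of the series
running_total = list(accumulate(amounts))

# pct_change: period-over-period relative change; first value undefined
pct_change = [None] + [(b - a) / a for a, b in zip(amounts, amounts[1:])]

def rolling_mean(xs, window):
    """Trailing moving average; None until a full window is available."""
    return [
        sum(xs[i - window + 1 : i + 1]) / window if i >= window - 1 else None
        for i in range(len(xs))
    ]

print(running_total)           # [100.0, 210.0, 309.0, 429.0]
print(rolling_mean(amounts, 2))  # [None, 105.0, 104.5, 109.5]
```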
| Data Type | Best Chart | Use When |
|---|---|---|
| Trend over time | Line chart | Showing patterns/changes over time |
| Category comparison | Bar chart | Comparing discrete categories |
| Part of whole | Pie/Donut | Showing proportions (≤5 categories) |
| Distribution | Histogram | Understanding data spread |
| Correlation | Scatter plot | Relationship between two variables |
| Many categories | Horizontal bar | Ranking or comparing many items |
| Geographic | Map | Location-based data |
```python
import matplotlib.pyplot as plt
import seaborn as sns

# Set style
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette("husl")

# Line chart (trends)
plt.figure(figsize=(10, 6))
plt.plot(df['date'], df['value'], marker='o')
plt.title('Trend Over Time')
plt.xlabel('Date')
plt.ylabel('Value')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('trend.png', dpi=150)

# Bar chart (comparisons)
plt.figure(figsize=(10, 6))
sns.barplot(data=df, x='category', y='amount')
plt.title('Amount by Category')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('comparison.png', dpi=150)

# Heatmap (correlations)
plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm', center=0)
plt.title('Correlation Matrix')
plt.tight_layout()
plt.savefig('correlation.png', dpi=150)
```
When you can't generate images, use ASCII:

```
Revenue by Month (in $K)
========================
Jan: ████████████████ 160
Feb: ██████████████████ 180
Mar: ████████████████████████ 240
Apr: ██████████████████████ 220
May: ██████████████████████████ 260
Jun: ████████████████████████████ 280
```
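A chart like this can be generated programmatically. A minimal sketch; the helper name, scale, and sample data are made up for illustration:

```python
def ascii_bars(data, scale=10, char="█"):
    """Render (label, value) pairs as ASCII bars, one unit per `scale`."""
    width = max(len(label) for label, _ in data)
    return [
        f"{label.ljust(width)}: {char * round(value / scale)} {value}"
        for label, value in data
    ]

lines = ascii_bars([("Jan", 160), ("Feb", 180), ("Mar", 240)])
print("\n".join(lines))
```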
```bash
#!/bin/bash
# generate-report.sh

# Pull latest data
python scripts/extract_data.py --output data/latest.csv

# Run analysis
python scripts/analyze.py --input data/latest.csv --output reports/

# Generate report
python scripts/format_report.py --template weekly --output reports/weekly-$(date +%Y-%m-%d).md

echo "Report generated: reports/weekly-$(date +%Y-%m-%d).md"
```
| Statistic | What It Tells You | Use Case |
|---|---|---|
| Mean | Average value | Central tendency |
| Median | Middle value | Robust to outliers |
| Mode | Most common | Categorical data |
| Std Dev | Spread around mean | Variability |
| Min/Max | Range | Data boundaries |
| Percentiles | Distribution shape | Benchmarking |
```python
# Full descriptive statistics
stats = df['amount'].describe()
print(stats)

# Additional stats
print(f"Median: {df['amount'].median()}")
print(f"Mode: {df['amount'].mode()[0]}")
print(f"Skewness: {df['amount'].skew()}")
print(f"Kurtosis: {df['amount'].kurtosis()}")

# Correlation
correlation = df['sales'].corr(df['marketing_spend'])
print(f"Correlation: {correlation:.3f}")
```
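For quick checks without a DataFrame, the same descriptive statistics are available in Python's standard `statistics` module. A sketch with illustrative data:

```python
import statistics

amounts = [10, 12, 12, 15, 18, 21, 30]  # illustrative sample

print(statistics.mean(amounts))            # average value
print(statistics.median(amounts))          # middle value, robust to outliers
print(statistics.mode(amounts))            # most common value
print(statistics.stdev(amounts))           # sample standard deviation
print(statistics.quantiles(amounts, n=4))  # quartiles for benchmarking
```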
| Test | Use Case | Python |
|---|---|---|
| T-test | Compare two means | `scipy.stats.ttest_ind(a, b)` |
| Chi-square | Categorical independence | `scipy.stats.chi2_contingency(table)` |
| ANOVA | Compare 3+ means | `scipy.stats.f_oneway(a, b, c)` |
| Pearson | Linear correlation | `scipy.stats.pearsonr(x, y)` |
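As a sanity check on library output, Pearson's r is simple enough to compute by hand. A minimal sketch with made-up inputs (the helper name is ours, not scipy's):

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance over the product of std deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear: ~1.0
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly inverse: ~-1.0
```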
1. Define the Question: What are we trying to answer? What decisions will this inform?
2. Understand the Data: What data is available? What's the structure and quality?
3. Clean and Prepare: Handle missing values, fix data types, remove duplicates.
4. Explore: Descriptive statistics, initial visualizations, identify patterns.
5. Analyze: Deep dive into findings, statistical tests if needed, validate hypotheses.
6. Communicate: Clear visualizations, actionable insights, recommendations.
Initialize your data analysis workspace.
Quick SQL query execution.

```bash
# Run query from file
./scripts/query.sh --file queries/daily-report.sql

# Run inline query
./scripts/query.sh "SELECT COUNT(*) FROM users"

# Save output to file
./scripts/query.sh --file queries/export.sql --output data/export.csv
```
Python analysis toolkit.

```bash
# Basic analysis
python scripts/analyze.py --input data/sales.csv

# With specific analysis type
python scripts/analyze.py --input data/sales.csv --type cohort

# Generate report
python scripts/analyze.py --input data/sales.csv --report weekly
```
| Skill | Integration |
|---|---|
| Marketing | Analyze campaign performance, content metrics |
| Sales | Pipeline analytics, conversion analysis |
| Business Dev | Market research data, competitor analysis |
- Databases: PostgreSQL, MySQL, SQLite
- Warehouses: BigQuery, Snowflake, Redshift
- Spreadsheets: Google Sheets, Excel, CSV
- APIs: REST endpoints, GraphQL
- Files: JSON, Parquet, XML
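Whatever the source, analysis is simpler once everything is normalized to one shape. A stdlib sketch that merges CSV and JSON inputs into a list of dicts (the inlined file contents are illustrative):

```python
import csv
import io
import json

# Stand-ins for file contents read from disk or an API.
csv_text = "id,amount\n1,10\n2,20\n"
json_text = '[{"id": 3, "amount": 30}]'

# Normalize both sources to a list of dicts.
rows = list(csv.DictReader(io.StringIO(csv_text)))
rows += json.loads(json_text)

# CSV values arrive as strings; coerce to numbers before aggregating.
total = sum(float(r["amount"]) for r in rows)
print(total)  # 60.0
```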
- Start with the question: know what you're trying to answer
- Validate your data: garbage in = garbage out
- Document everything: queries, assumptions, decisions
- Visualize appropriately: right chart for right data
- Show your work: methodology matters
- Lead with insights: not just data dumps
- Make it actionable: "So what?" → "Now what?"
- Version your queries: track changes over time
- Confirmation bias: looking for data to support a conclusion
- Correlation ≠ causation: be careful with claims
- Cherry-picking: using only favorable data
- Ignoring outliers: investigate before removing
- Over-complicating: simple analysis often wins
- No context: numbers without comparison are meaningless
License: MIT (use freely, modify, distribute).

"The goal is to turn data into information, and information into insight." — Carly Fiorina
Data access, storage, extraction, analysis, reporting, and insight generation.
Largest current source with strong distribution and engagement signals.