Requirements
- Target platform: OpenClaw
- Install method: Manual import
- Extraction: Extract archive
- Prerequisites: OpenClaw
- Primary doc: SKILL.md
Analyze, transform, and clean DataFrames with efficient patterns for filtering, grouping, merging, and pivoting.
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
On first use, create ~/pandas/ and read setup.md for initialization. User preferences are stored in ~/pandas/memory.md — users can view or edit this file anytime.
User needs to work with tabular data in Python. Agent handles DataFrame operations, data cleaning, aggregations, merges, pivots, and exports.
Memory lives in `~/pandas/`. See memory-template.md for structure.

```
~/pandas/
├── memory.md    # User preferences and common patterns
└── snippets/    # Saved code patterns (optional)
```
| Topic | File |
| --- | --- |
| Setup process | setup.md |
| Memory template | memory-template.md |
- NEVER iterate with for loops over DataFrame rows
- Use `.apply()` only when vectorized alternatives don't exist
- Prefer `df['col'].str.method()` over `apply(lambda x: x.method())`
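The last rule in a toy comparison (the frame and column name here are hypothetical, not from the skill's examples):

```python
import pandas as pd

# Hypothetical toy frame to illustrate the rule above
df = pd.DataFrame({"name": ["alice", "bob", "carol"]})

# apply() calls a Python lambda once per element
upper_slow = df["name"].apply(lambda x: x.upper())

# The vectorized .str accessor does the same work in one pass
upper_fast = df["name"].str.upper()

assert upper_slow.equals(upper_fast)
```

Both produce identical results; the `.str` version stays inside pandas instead of round-tripping through Python for every row.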
```python
# Good: method chaining
result = (df
    .query('age > 30')
    .groupby('city')
    .agg({'salary': 'mean'})
    .reset_index())

# Bad: intermediate variables everywhere
filtered = df[df['age'] > 30]
grouped = filtered.groupby('city')
result = grouped.agg({'salary': 'mean'}).reset_index()
```
- Always check `df.isna().sum()` before analysis
- Choose a strategy: `dropna()`, `fillna()`, or interpolation
- Document WHY missing values exist before removing them
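A minimal sketch of that workflow (column names and fill strategy are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [25.0, np.nan, 40.0],
                   "city": ["NYC", "LA", None]})

# 1. Quantify missingness before touching anything
missing = df.isna().sum()  # age: 1, city: 1

# 2. Apply a per-column strategy: fill the numeric gap,
#    drop rows that lack the key field
df["age"] = df["age"].fillna(df["age"].median())
df = df.dropna(subset=["city"])
```

The point of step 1 is that the choice in step 2 depends on how much is missing and why, not on a default.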
```python
# Memory savings for columns with few unique values
df['status'] = df['status'].astype('category')
df['country'] = df['country'].astype('category')
```
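A quick way to verify the savings on sample data (the frame below is hypothetical):

```python
import pandas as pd

# 3,000 rows but only 3 distinct values: ideal for the category dtype
df = pd.DataFrame({"status": ["active", "inactive", "pending"] * 1000})

before = df["status"].memory_usage(deep=True)
df["status"] = df["status"].astype("category")
after = df["status"].memory_usage(deep=True)

assert after < before  # integer codes are far smaller than repeated strings
```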
```python
# Always specify how and validate
result = pd.merge(
    df1, df2,
    on='id',
    how='left',
    validate='m:1'  # Many-to-one: catch unexpected duplicates
)
```
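To see what `validate='m:1'` buys you, a sketch with hypothetical frames where the right side violates the contract:

```python
import pandas as pd

orders = pd.DataFrame({"id": [1, 1, 2], "amount": [10, 20, 5]})
users = pd.DataFrame({"id": [1, 2], "name": ["Ann", "Ben"]})

# m:1 holds: each order id maps to exactly one user row
ok = pd.merge(orders, users, on="id", how="left", validate="m:1")

# Duplicate ids on the right break the contract and raise immediately,
# instead of silently duplicating order rows
dup_users = pd.DataFrame({"id": [1, 1], "name": ["Ann", "Ann (dup)"]})
try:
    pd.merge(orders, dup_users, on="id", how="left", validate="m:1")
    raised = False
except pd.errors.MergeError:
    raised = True
```

Without `validate`, the second merge would quietly grow the result; with it, the bug surfaces at the merge call.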
```python
# Readable
df.query('age > 30 and city == "NYC" and salary < 100000')

# Hard to read
df[(df['age'] > 30) & (df['city'] == 'NYC') & (df['salary'] < 100000)]
```
```python
# Faster lookups, cleaner merges
df = df.set_index('user_id')
user_data = df.loc[12345]  # O(1) lookup
```
- SettingWithCopyWarning → Use `.loc[]` for assignment: `df.loc[mask, 'col'] = value`
- Slow loops → Replace `iterrows()` with vectorized ops or `apply()`
- Memory explosion → Use `dtype` in `read_csv()`: `pd.read_csv(f, dtype={'id': 'int32'})`
- Silent data loss → Check shape before/after merge: `print(f"Before: {len(df1)}, After: {len(result)}")`
- Index confusion → Use `reset_index()` after `groupby()` to get a clean DataFrame
- Chained indexing → `df['a']['b']` fails silently; use `df.loc[:, ['a', 'b']]`
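For the first gotcha, a minimal before/after (frame contents are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"age": [25, 35, 45], "status": ["junior"] * 3})

# Chained indexing like df[df["age"] > 30]["status"] = "senior"
# assigns into a temporary copy, so the change is lost.
# A single .loc call targets the original frame:
df.loc[df["age"] > 30, "status"] = "senior"
```

One indexing operation per assignment is the rule: mask and column go into the same `.loc[]` call.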
Data storage:
- User preferences stored in `~/pandas/memory.md`
- All DataFrame operations run locally
- No data is sent externally

This skill does NOT:
- Upload data to any service
- Access files outside `~/pandas/` and the working directory
- Modify source data files without explicit instruction

User control:
- View stored preferences: `cat ~/pandas/memory.md`
- Clear all data: `rm -rf ~/pandas/`
Install with `clawhub install <slug>` if user confirms:
- data-analysis — general data analysis patterns
- csv — CSV file handling
- sql — database queries
- excel-xlsx — Excel file operations
- If useful: `clawhub star pandas`
- Stay updated: `clawhub sync`