# Send Tabstack Extractor to your agent
Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.
## Fast path
- Download the package from Yavira.
- Extract it into a folder your agent can access.
- Paste one of the prompts below and point your agent at the extracted folder.
## Suggested prompts
### New install

```text
I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.
```
### Upgrade existing

```text
I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.
```
## Machine-readable fields
```json
{
  "schemaVersion": "1.0",
  "item": {
    "slug": "tabstack-extractor",
    "name": "Tabstack Extractor",
    "source": "tencent",
    "type": "skill",
    "category": "开发工具",
    "sourceUrl": "https://clawhub.ai/noblepayne/tabstack-extractor",
    "canonicalUrl": "https://clawhub.ai/noblepayne/tabstack-extractor",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadUrl": "/downloads/tabstack-extractor",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=tabstack-extractor",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "packageFormat": "ZIP package",
    "primaryDoc": "SKILL.md",
    "includedAssets": [
      "SKILL.md",
      "references/api_reference.md",
      "references/job_schema.json",
      "references/news_schema.json",
      "references/schema_guide.md",
      "references/simple_article.json"
    ],
    "downloadMode": "redirect",
    "sourceHealth": {
      "source": "tencent",
      "slug": "tabstack-extractor",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-05-09T16:13:46.584Z",
      "expiresAt": "2026-05-16T16:13:46.584Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=tabstack-extractor",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=tabstack-extractor",
        "contentDisposition": "attachment; filename=\"tabstack-extractor-0.1.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null,
        "slug": "tabstack-extractor"
      },
      "scope": "item",
      "summary": "Item download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this item.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/tabstack-extractor"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    }
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/tabstack-extractor",
    "downloadUrl": "https://openagent3.xyz/downloads/tabstack-extractor",
    "agentUrl": "https://openagent3.xyz/skills/tabstack-extractor/agent",
    "manifestUrl": "https://openagent3.xyz/skills/tabstack-extractor/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/tabstack-extractor/agent.md"
  }
}
```
## Documentation

### Overview

This skill enables structured data extraction from websites using the Tabstack API. It's ideal for web scraping tasks where you need consistent, schema-based data extraction from job boards, news sites, product pages, or any structured content.

### 1. Install Babashka (if needed)

# Option A: From GitHub (recommended for sharing)
curl -s https://raw.githubusercontent.com/babashka/babashka/master/install | bash

# Option B: From Nix
nix-shell -p babashka

# Option C: From Homebrew
brew install borkdude/brew/babashka

### 2. Set up API Key

Option A: Environment variable (recommended)

export TABSTACK_API_KEY="your_api_key_here"

Option B: Configuration file

mkdir -p ~/.config/tabstack
echo '{:api-key "your_api_key_here"}' > ~/.config/tabstack/config.edn

Get an API key: Sign up at Tabstack Console

### 3. Test Connection

bb scripts/tabstack.clj test

### 4. Extract Markdown (Simple)

bb scripts/tabstack.clj markdown "https://example.com"

### 5. Extract JSON (Start Simple)

# Start with simple schema (fast, reliable)
bb scripts/tabstack.clj json "https://example.com" references/simple_article.json

# Try more complex schemas (may be slower)
bb scripts/tabstack.clj json "https://news.site" references/news_schema.json

### 6. Advanced Features

# Extract with retry logic (3 retries, 1s delay)
bb scripts/tabstack.clj json-retry "https://example.com" references/simple_article.json

# Extract with caching (24-hour cache)
bb scripts/tabstack.clj json-cache "https://example.com" references/simple_article.json

# Batch extract from URLs file
echo "https://example.com" > urls.txt
echo "https://example.org" >> urls.txt
bb scripts/tabstack.clj batch urls.txt references/simple_article.json

### 1. Markdown Extraction

Extract clean, readable markdown from any webpage. Useful for content analysis, summarization, or archiving.

When to use: When you need the textual content of a page without the HTML clutter.

Example use cases:

Extract article content for summarization
Archive webpage content
Analyze blog post content

### 2. JSON Schema Extraction

Extract structured data using JSON schemas. Define exactly what data you want and get it in a consistent format.

When to use: When scraping job listings, product pages, news articles, or any structured data.

Example use cases:

Scrape job listings from BuiltIn/LinkedIn
Extract product details from e-commerce sites
Gather news articles with consistent metadata

### 3. Schema Templates

Pre-built schemas for common scraping tasks. See references/ directory for templates.

Available schemas:

Job listing schema (see references/job_schema.json)
News article schema
Product page schema
Contact information schema

### Workflow: Job Scraping Example

Follow this workflow to scrape job listings:

Identify target sites - BuiltIn, LinkedIn, company career pages
Choose or create schema - Use references/job_schema.json or customize
Test extraction - Run a single page to verify schema works
Scale up - Process multiple URLs
Store results - Save to database or file

Example job schema:

{
  "type": "object",
  "properties": {
    "title": {"type": "string"},
    "company": {"type": "string"},
    "location": {"type": "string"},
    "description": {"type": "string"},
    "salary": {"type": "string"},
    "apply_url": {"type": "string"},
    "posted_date": {"type": "string"},
    "requirements": {"type": "array", "items": {"type": "string"}}
  }
}

### Combine with Web Search

Use web_search to find relevant URLs
Use Tabstack to extract structured data from those URLs
Store results in Datalevin (future skill)

### Combine with Browser Automation

Use browser tool to navigate complex sites
Extract page URLs
Use Tabstack for structured extraction

### Error Handling

Common issues and solutions:

Authentication failed - Check TABSTACK_API_KEY environment variable
Invalid URL - Ensure URL is accessible and correct
Schema mismatch - Adjust schema to match page structure
Rate limiting - Add delays between requests

### scripts/

tabstack.clj - Main API wrapper in Babashka (recommended, has retry logic, caching, batch processing)
tabstack_curl.sh - Bash/curl fallback (simple, no dependencies)
tabstack_api.py - Python API wrapper (requires requests module)

### references/

job_schema.json - Template schema for job listings
api_reference.md - Tabstack API documentation

### Best Practices

Start small - Test with single pages before scaling
Respect robots.txt - Check site scraping policies
Add delays - Avoid overwhelming target sites
Validate schemas - Test schemas on sample pages
Handle errors gracefully - Implement retry logic for failed requests

### Teaching Focus: How to Create Schemas

This skill is designed to teach agents how to use Tabstack API effectively. The key is learning to create appropriate JSON schemas for different websites.

### Learning Path

Start Simple - Use references/simple_article.json (4 basic fields)
Test Extensively - Try schemas on multiple page types
Iterate - Add fields based on what the page actually contains
Optimize - Remove unnecessary fields for speed

See Schema Creation Guide for detailed instructions and examples.

### Common Mistakes to Avoid

Over-complex schemas - Start with 2-3 fields, not 20
Missing fields - Don't require fields that don't exist on the page
No testing - Always test with example.com first, then target sites
Ignoring timeouts - Complex schemas take longer (45s timeout)

### Babashka Advantages

Using Babashka for this skill provides:

Single binary - Easy to share/install (GitHub releases, brew, nix)
Fast startup - No JVM warmup, ~50ms startup time
Built-in HTTP client - No external dependencies
Clojure syntax - Familiar to you (Wes), expressive
Retry logic & caching - Built into the skill
Batch processing - Parallel extraction for multiple URLs

### Example User Requests

For this skill to trigger:

"Scrape job listings from Docker careers page"
"Extract the main content from this article"
"Get structured product data from this e-commerce page"
"Pull all the news articles from this site"
"Extract contact information from this company page"
"Batch extract job listings from these 20 URLs"
"Get cached results for this page (avoid API calls)"
## Trust
- Source: tencent
- Verification: Indexed source record
- Publisher: noblepayne
- Version: 0.1.0
## Source health
- Status: healthy
- Item download looks usable.
- Yavira can redirect you to the upstream package for this item.
- Health scope: item
- Reason: direct_download_ok
- Checked at: 2026-05-09T16:13:46.584Z
- Expires at: 2026-05-16T16:13:46.584Z
- Recommended action: Download for OpenClaw
## Links
- [Detail page](https://openagent3.xyz/skills/tabstack-extractor)
- [Send to Agent page](https://openagent3.xyz/skills/tabstack-extractor/agent)
- [JSON manifest](https://openagent3.xyz/skills/tabstack-extractor/agent.json)
- [Markdown brief](https://openagent3.xyz/skills/tabstack-extractor/agent.md)
- [Download page](https://openagent3.xyz/downloads/tabstack-extractor)