Traktor - web asset extraction skill v2.0
Follow these steps exactly when extracting website assets.
Trigger
/traktor <url1> [url2] [url3]...
Step 1: Parse URLs from arguments
Extract all URLs from the user's command. Store them in a list.
URLS = [all URLs provided after /traktor]
SITE_COUNT = number of URLs
PROJECT_DIR = current working directory
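The three working variables above can be sketched as shell variables (illustrative only; the skill tracks these internally, and the two URLs are examples):

```shell
# Illustrative: Step 1 state as shell variables, with two example URLs.
URLS=("https://stripe.com" "https://vercel.com")
SITE_COUNT=${#URLS[@]}
PROJECT_DIR=$(pwd)
echo "$SITE_COUNT sites, output under $PROJECT_DIR"
```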
Step 2: Create folder structure
If SITE_COUNT == 1
RUN THIS BASH COMMAND:
mkdir -p assets/{logos,icons,images,videos,fonts,backgrounds} full-site/{pages,assets,styles}
If SITE_COUNT > 1
For each URL, extract site name (e.g., "stripe.com" -> "stripe-com") and RUN:
# Replace {site-name} with actual site name for each URL
mkdir -p assets/{site-name}/{logos,icons,images,videos,fonts,backgrounds}
mkdir -p full-sites/{site-name}/{pages,assets,styles}
Example for 3 sites:
mkdir -p assets/{stripe-com,vercel-com,linear-app}/{logos,icons,images,videos,fonts,backgrounds}
mkdir -p full-sites/{stripe-com,vercel-com,linear-app}/{pages,assets,styles}
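The site-name derivation and per-site folder creation above can be sketched as a small bash helper plus a loop (the site_name function is a hypothetical illustration, not part of the skill; assumes bash for brace expansion):

```shell
# Hypothetical helper: derive a folder-safe site name from a URL,
# e.g. "https://stripe.com" -> "stripe-com".
site_name() {
  echo "$1" | sed -E 's#^[a-z]+://##; s#/.*$##; s#^www\.##; s#\.#-#g'
}

# SITE_COUNT > 1 case: create the per-site folder layout for each URL.
for url in "https://stripe.com" "https://vercel.com" "https://linear.app"; do
  site=$(site_name "$url")
  mkdir -p "assets/$site"/{logos,icons,images,videos,fonts,backgrounds}
  mkdir -p "full-sites/$site"/{pages,assets,styles}
done
```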
Step 3: Spawn extraction agents
For EACH URL, call the Task tool with these EXACT parameters:
Tool: Task
Parameters:
- subagent_type: "general-purpose"
- run_in_background: true
- description: "Extract {site-name} assets"
- prompt: [USE THE AGENT PROMPT BELOW - replace variables]
Agent prompt (copy this exactly, replace {variables})
- NOTE: This agent prompt uses mcp__claude-in-chrome__* tools which require the
- claude-in-chrome browser extension MCP server. These tools will not be available
- in environments without the browser extension installed.
- Extract ALL assets from {URL} with paranoid-level thoroughness. Miss nothing.
- OUTPUT DIRECTORIES
- Assets: {PROJECT_DIR}/assets/{site-name}/ (or {PROJECT_DIR}/assets/ if single site)
- Full site: {PROJECT_DIR}/full-sites/{site-name}/ (or {PROJECT_DIR}/full-site/ if single site)
- PHASE 1: Browser setup and navigation
- 1. Call mcp__claude-in-chrome__tabs_context_mcp to get browser context
- 2. Call mcp__claude-in-chrome__tabs_create_mcp to create new tab
- 3. Call mcp__claude-in-chrome__navigate with url="{URL}" and the new tabId
- 4. Call mcp__claude-in-chrome__computer with action="screenshot" to verify page loaded
- 5. Scroll full page to trigger lazy loading:
- - mcp__claude-in-chrome__computer action="scroll" scroll_direction="down" scroll_amount=10
- - Repeat 3-5 times with 1 second waits
- PHASE 2: Asset discovery (JavaScript extraction)
- Call mcp__claude-in-chrome__javascript_tool with this code:
- (() => {
- const assets = {
- images: [...document.querySelectorAll('img')].map(img => ({
- src: img.src,
- srcset: img.srcset,
- dataSrc: img.dataset.src,
- alt: img.alt
- })).filter(i => i.src || i.srcset || i.dataSrc),
- videos: [...document.querySelectorAll('video, video source')].map(v => ({
- src: v.src || v.currentSrc,
- poster: v.poster,
- type: v.type
- })).filter(v => v.src),
- svgsInline: [...document.querySelectorAll('svg')].map((svg, i) => ({
- id: svg.id || 'svg-' + i,
- class: svg.className?.baseVal || '',
- html: svg.outerHTML
- })),
- backgrounds: [...document.querySelectorAll('*')].map(el => {
- const bg = getComputedStyle(el).backgroundImage;
- if (bg && bg !== 'none' && bg.includes('url(')) {
- return bg.match(/url\(['"]?([^'"]+)['"]?\)/)?.[1];
- }
- return null;
- }).filter(Boolean),
- favicons: [...document.querySelectorAll('link[rel*="icon"]')].map(l => ({
- href: l.href,
- rel: l.rel,
- sizes: l.sizes?.value
- })),
- ogImages: (() => {
- const og = document.querySelector('meta[property="og:image"]');
- const twitter = document.querySelector('meta[name="twitter:image"]');
- return [og?.content, twitter?.content].filter(Boolean);
- })(),
- fonts: (() => {
- const fonts = [];
- for (const sheet of document.styleSheets) {
- try {
- for (const rule of sheet.cssRules) {
- if (rule.type === 5) { // FONT_FACE_RULE
- const src = rule.style.getPropertyValue('src');
- const urls = src.match(/url\(['"]?([^'"]+)['"]?\)/g);
- if (urls) fonts.push(...urls.map(u => u.match(/url\(['"]?([^'"]+)['"]?\)/)?.[1]));
- }
- }
- } catch (e) {}
- }
- return [...new Set(fonts)];
- })()
- };
- return JSON.stringify(assets, null, 2);
- })()
- Store the result as DISCOVERED_ASSETS.
- PHASE 3: Content extraction
- Call mcp__claude-in-chrome__javascript_tool with this code:
- (() => {
- const content = {
- url: window.location.href,
- title: document.title,
- extractedAt: new Date().toISOString().split('T')[0],
- meta: {
- description: document.querySelector('meta[name="description"]')?.content,
- ogTitle: document.querySelector('meta[property="og:title"]')?.content,
- ogDescription: document.querySelector('meta[property="og:description"]')?.content,
- ogImage: document.querySelector('meta[property="og:image"]')?.content,
- favicon: document.querySelector('link[rel*="icon"]')?.href
- },
- navigation: {
- header: [...document.querySelectorAll('header a, nav a')].slice(0, 20).map(a => ({
- text: a.textContent?.trim(),
- href: a.href
- })),
- footer: [...document.querySelectorAll('footer a')].slice(0, 20).map(a => ({
- text: a.textContent?.trim(),
- href: a.href
- }))
- },
- headings: [...document.querySelectorAll('h1, h2, h3')].slice(0, 30).map(h => ({
- level: h.tagName,
- text: h.textContent?.trim()
- })),
- buttons: [...document.querySelectorAll('button, a.btn, [class*="button"]')].slice(0, 20).map(b => ({
- text: b.textContent?.trim(),
- href: b.href || null
- }))
- };
- return JSON.stringify(content, null, 2);
- })()
- Store the result as PAGE_CONTENT.
- PHASE 4: Download assets
- For each asset URL discovered, download it with curl and error handling.
- If curl fails (non-zero exit), retry once after a 2-second delay (see ERROR HANDLING); if it fails again, log the URL and continue to the next asset.
- # Logos (favicon, og:image, header logos)
- curl -sfLo "{output_dir}/logos/{site-name}-favicon.ico" "{favicon_url}" || echo "FAIL: {favicon_url}"
- curl -sfLo "{output_dir}/logos/{site-name}-og-image.png" "{og_image_url}" || echo "FAIL: {og_image_url}"
- # Images
- curl -sfLo "{output_dir}/images/{site-name}-{descriptive-name}.{ext}" "{image_url}" || echo "FAIL: {image_url}"
- # SVGs - Write inline SVGs to files using Write tool
- # Videos
- curl -sfLo "{output_dir}/videos/{site-name}-{name}.mp4" "{video_url}" || echo "FAIL: {video_url}"
- # Fonts
- curl -sfLo "{output_dir}/fonts/{font-name}.woff2" "{font_url}" || echo "FAIL: {font_url}"
- NAMING CONVENTION: {site-prefix}-{descriptive-name}.{ext}
- Use alt text or context for descriptive names
- Example: stripe-hero-illustration.svg, vercel-logo-white.svg
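One way to turn alt text or context into the descriptive-name part is a small slug helper (hypothetical, shown for illustration):

```shell
# Hypothetical slug helper: lowercase, collapse non-alphanumeric runs
# to "-", trim leading/trailing "-".
slug() {
  echo "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]+/-/g; s/^-+|-+$//g'
}
echo "stripe-$(slug 'Hero Illustration (dark)').svg"   # stripe-hero-illustration-dark.svg
```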
- PHASE 5: Save JSON files
- 1. Save page content:
- Use Write tool to save PAGE_CONTENT to:
- {full-site-dir}/pages/homepage.json
- 2. Save asset URLs catalog:
- Use Write tool to save DISCOVERED_ASSETS to:
- {full-site-dir}/asset-urls.json
- PHASE 6: Report results
- When complete, report:
- Total assets downloaded (count by type)
- Total size (e.g. from du -sh on the output directories; curl -s prints no size info)
- Any failed downloads (list URLs)
- Output paths
- ERROR HANDLING
- If site unreachable: Report error, skip
- If asset download fails: Retry once with 2s delay, then log URL and continue
- If browser tab crashes: Call tabs_create_mcp again, continue
- If rate limited: Add 2 second delays between requests
- CONSTRAINTS
- Skip files > 50MB (log URL for manual download)
- Skip duplicate URLs
- Maximum 100 assets per category
- 5 minute timeout for entire extraction
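The retry-once rule and the 50MB cap can be combined into one download helper (a sketch; curl's --max-filesize is only enforced when the server reports a size, and the log filename is an assumption):

```shell
# Sketch: download with retry-once (2s delay) and a 50MB size cap.
# On a second failure, log the URL and continue.
fetch() {
  url="$1"; out="$2"
  curl -sfL --max-filesize 52428800 -o "$out" "$url" \
    || { sleep 2; curl -sfL --max-filesize 52428800 -o "$out" "$url"; } \
    || echo "FAIL: $url" >> failed-downloads.log
}
```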
Step 4: Monitor agents
After spawning all agents:
Report to user:
Traktor v2.0 - Extraction started
Folder structure created
Spawned {N} extraction agents:
- Agent 1: {site1} [running in background]
- Agent 2: {site2} [running in background]
...
Agents working in background. You'll be notified as they complete.
As agents complete, collect their results.
Step 5: Generate final manifest
After ALL agents complete, create manifest file:
Use Write tool to create asset-manifest.json:
{
"generated_at": "{current_datetime}",
"tool": "traktor v2.0",
"project_dir": "{PROJECT_DIR}",
"sites_extracted": ["{site1}", "{site2}"],
"total_assets": "{total_count}",
"by_type": {
"logos": "{count}",
"icons": "{count}",
"images": "{count}",
"videos": "{count}",
"fonts": "{count}",
"svgs": "{count}"
},
"naming_convention": "{site-prefix}-{descriptive-name}.{ext}",
"output_structure": {
"assets": "./assets/",
"full_sites": "./full-sites/"
}
}
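The per-type counts for the manifest can be derived from the output folders (a sketch, single-site layout assumed):

```shell
# Sketch: count downloaded files per category (single-site layout).
for type in logos icons images videos fonts backgrounds; do
  n=$(find "assets/$type" -type f 2>/dev/null | wc -l)
  printf '%s: %s\n' "$type" "$n"
done
```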
Step 6: Final report
Display to user:
Traktor extraction complete!
Summary:
Sites: {N}
Total assets: {count} ({size_mb} MB)
By type:
- Logos: {n}
- Images: {n}
- Videos: {n}
- SVGs: {n}
- Fonts: {n}
Output:
- Assets: ./assets/
- Full sites: ./full-sites/
- Manifest: ./asset-manifest.json
Failed downloads: {list or "None"}
Quick reference
| Step | Action | Tool |
|------|--------|------|
| 1 | Parse URLs | Internal |
| 2 | Create folders | Bash |
| 3 | Spawn agents | Task (background) |
| 4 | Monitor | Wait for notifications |
| 5 | Create manifest | Write |
| 6 | Report | Output to user |
Example execution
User: /traktor https://0g.ai
Claude executes:
Parse: URLS = ["https://0g.ai"], SITE_COUNT = 1
Bash: mkdir -p assets/{logos,icons,images,videos,fonts,backgrounds} full-site/{pages,assets,styles}
Task: Spawn 1 agent with exact prompt above
Report: "Traktor started, 1 agent running..."
Wait for agent completion
Write: asset-manifest.json
Report: Final summary
User: /traktor https://stripe.com https://vercel.com https://linear.app
Claude executes:
Parse: 3 URLs
Bash: Create 3 site folders in assets/ and full-sites/
Task: Spawn 3 parallel background agents
Report: "Traktor started, 3 agents running..."
Collect results as agents finish
Write: Combined manifest
Report: Combined summary