{
  "schemaVersion": "1.0",
  "item": {
    "slug": "kirk-content-pipeline",
    "name": "Kirk Content Pipeline",
    "source": "tencent",
    "type": "skill",
    "category": "Developer Tools",
    "sourceUrl": "https://clawhub.ai/lukerspace/kirk-content-pipeline",
    "canonicalUrl": "https://clawhub.ai/lukerspace/kirk-content-pipeline",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/kirk-content-pipeline",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=kirk-content-pipeline",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md",
      "references/citrini7-style.md",
      "references/jukan05-patterns.md",
      "references/kirk-voice.md",
      "references/serenity-style.md",
      "references/zephyr-patterns.md"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=kirk-content-pipeline",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=kirk-content-pipeline",
        "contentDisposition": "attachment; filename=\"kirk-content-pipeline-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/kirk-content-pipeline"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/kirk-content-pipeline",
    "agentPageUrl": "https://openagent3.xyz/skills/kirk-content-pipeline/agent",
    "manifestUrl": "https://openagent3.xyz/skills/kirk-content-pipeline/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/kirk-content-pipeline/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Kirk Content Pipeline",
        "body": "Create Twitter content from analyst research PDFs, validated against KSVC holdings."
      },
      {
        "title": "Pipeline Steps (MANDATORY)",
        "body": "1a.   Scan PDFs (Explore agents for broad screening)\n1b.   Extract insights (RLM for deep extraction - text, tables, AND charts)\n1c.   Cross-doc synthesis (rlm-multi for insights across sources)\n2.    Check KSVC holdings (preliminary - with known tickers)\n3.    Write content (data backbone, Serenity-heavy)\n4a.   AUDIT (verify draft claims against source PDFs with RLM)\n4a.5. GEMINI CROSS-VALIDATION (web-verify FAIL/UNSOURCED inferences)\n4b.   Final Holdings Verification (check ALL 7 models with discovered tickers)\n4c.   Stylize (invoke kirk-mode skill for voice/character)\n4d.   Humanize (remove AI patterns)\n5.    Save draft for approval\n6.    Chart decision & generation (after draft crystallizes thesis)\n7.    PUBLISH to final folder (clean version for posting)\n\nNever skip steps 4a-4d. Use 1a for multi-PDF screening, 1b for deep extraction, 1c for cross-doc synthesis, 4a for verification, 4a.5 for web cross-validation, 4b for final holdings check, 4c for character voice, 4d for AI pattern removal.\n\n⚠️ CRITICAL: Step 1b extracts data. Step 1c synthesizes across docs. Step 4a VERIFIES the written content. Step 4a.5 CROSS-VALIDATES inferences.\n\n1b: \"What does each PDF say?\" (per-doc extraction)\n1c: \"What patterns emerge across PDFs?\" (cross-doc synthesis)\n4a: \"Does my draft accurately reflect the sources?\" (source-locked verification)\n4a.5: \"Are the flagged inferences valid per public sources?\" (web cross-validation)\n4c: \"Which Kirk mode fits this situation?\" (character voice)"
      },
      {
        "title": "Subagent Permissions (CRITICAL)",
        "body": "Subagents CANNOT Read files outside the project directory. PDFs in /Users/Shared/ksvc/pdfs/ are blocked. The fix: symlink PDFs into the project directory before spawning subagents.\n\nThe main agent MUST create a symlink before Step 1a:\n\nln -sf \"/Users/Shared/ksvc/pdfs/YYYYMMDD\" \".claude/pdfs-scan\"\n\nThen subagents Read from .claude/pdfs-scan/filename.pdf — this works because the path resolves inside the project.\n\n| Access Method | /Users/Shared/ path | Symlinked project path |\n| --- | --- | --- |\n| Subagent Read tool (PDF) | ❌ Auto-denied | ✅ Works |\n| Subagent Read tool (images) | ❌ Auto-denied | ✅ Works |\n| Main agent Read tool | ✅ User approves | ✅ Works |\n| Bash → RLM | ✅ Any path | ✅ Any path |\n\nDiscovered 2026-02-07: Subagents fail with \"Permission to use Read has been auto-denied (prompts unavailable)\" on /Users/Shared/ paths. Symlink into project dir = full Read access. Tested: 19 PDFs, medium thoroughness, 125k tokens, zero errors."
      },
      {
        "title": "Content Types & Voice Blends",
        "body": "Full guide: references/kirk-voice.md — Read this for templates and examples.\n\nKirk voice = Serenity's data + Citrini7's wit + Jukan's skepticism + Zephyr's energy.\n\n| Type | When | Blend | Key Element |\n| --- | --- | --- | --- |\n| Long Thread | Deep dive, multi-source | Serenity + Jukan | TLDR + skepticism |\n| Quick Take | Single insight, one report | Citrini7 + Serenity | Punchy + one number |\n| Breaking News | Just dropped | Zephyr + Jukan | Reaction word + number |\n| Shitpost | Market absurdity | Citrini7 + Zephyr | Meme format |\n| Personal Commentary | Opinion, question | Pure Jukan | First-person + uncertainty |\n| Victory Lap | KSVC call worked | Pure Zephyr | Entry/Now + thesis |"
      },
      {
        "title": "Quick Formulas",
        "body": "Long Thread: Hook → TLDR → Numbers → Skepticism → Position\n\nQuick Take: Headline number → Context → \"If you're looking now...\"\n\nBreaking News: \"Huge.\" / \"Well well well...\" → Key number → Source\n\nVictory Lap: \"$TICKER up X% since KSVC added it\" → Entry/Now → Thesis validated"
      },
      {
        "title": "Step 1a: Scan PDFs with Explore Agents",
        "body": "Use Explore agents for broad screening when you have many PDFs to review. This is faster than RLM for initial discovery."
      },
      {
        "title": "Step 1a.0: Check Published Threads (MANDATORY - DO FIRST)",
        "body": "⚠️ Before scanning any PDFs, check what Kirk has already posted.\n\n# List all published threads\nls /Users/Shared/ksvc/threads/\n\n# Read recent thread.md files to understand what topics are covered\n\nFor each published thread, note:\n\nTopic (what was the thesis?)\nSource PDFs used (check _metadata.md)\nDate (how recent?)\n\nThen when selecting a topic after scanning, REJECT any topic that:\n\nUses the same primary source PDF as a published thread\nCovers the same thesis/angle (even if from different sources)\nWould read as a repeat to Kirk's followers\n\nAcceptable overlap:\n\nA follow-up/update to a previous thread with NEW data (e.g., earnings confirm the thesis)\nA different angle on the same sector (e.g., posted about ABF shortage, now posting about specific company earnings)\nExplicitly framed as \"update: here's what changed since my last post on X\"\n\nWhy this exists (Case Study — ABF Substrate, 2026-02-07):\n\nKirk published a 10-tweet thread on Feb 5 covering Goldman's ABF shortage report (10%→21%→42%, Kinsus/NYPCB/Unimicron). On Feb 7, the pipeline picked the same Goldman report and produced a 3-tweet quick take with the same numbers, same companies, same angle. We didn't check published threads first, so we wasted a pipeline run on duplicate content when 10 other fresh topic angles were available."
      },
      {
        "title": "When to Use",
        "body": "Screening 10+ PDFs to find relevant ones\nFinding cross-document connections\nBuilding a thesis from multiple sources\nDon't know which PDFs matter yet"
      },
      {
        "title": "How to Scan",
        "body": "1. Check published threads (Step 1a.0 above)\n\n2. List recent PDF folders and count PDFs\n   ls /Users/Shared/ksvc/pdfs/ | tail -5\n   ls /Users/Shared/ksvc/pdfs/YYYYMMDD/ | wc -l\n\n3. Symlink PDFs into project directory (REQUIRED for subagent access)\n   ln -sf \"/Users/Shared/ksvc/pdfs/YYYYMMDD\" \".claude/pdfs-scan\"\n\n4. Split PDFs into groups and spawn parallel Explore agents\n   TARGET: ~5 PDFs per agent. Spawn ALL agents in a single message.\n   - Each agent gets a specific list of filenames to scan\n   - All agents run simultaneously → total time = slowest agent\n   - Haiku is cheap — more agents = faster with no meaningful cost increase"
      },
      {
        "title": "Agent Sizing",
        "body": "| PDFs | Agents | PDFs/Agent | Expected Time |\n| --- | --- | --- | --- |\n| ≤5 | 1 | all | ~25s |\n| 6-10 | 2 | ~5 each | ~25s |\n| 11-15 | 3 | ~5 each | ~25s |\n| 16-20 | 4 | ~5 each | ~25s |\n| 21-30 | 5-6 | ~5 each | ~30s |\n\nWhy ~5 PDFs per agent? Sweet spot for speed. Each PDF takes ~4-8s to Read + summarize. 5 PDFs ≈ 25s per agent. Adding more PDFs per agent saves nothing (same total tokens) but makes wall-clock time worse.\n\nCost: Haiku is cheap. 4 agents × 5 PDFs × ~4k tokens = ~80k input tokens total — same as 1 agent doing all 20. Parallelism is free.\n\nCross-doc synthesis trade-off: Each agent only sees its batch, so cross-batch themes are the main agent's job. This is fine — the main agent merges all results anyway."
      },
      {
        "title": "Example: Spawn Explore Agents",
        "body": "Step 1: Main agent creates symlink and lists PDFs:\n\nln -sf \"/Users/Shared/ksvc/pdfs/20260205\" \".claude/pdfs-scan\"\n/bin/ls \".claude/pdfs-scan/\"\n\nStep 2: Split filenames into groups and spawn agents in parallel (single message, multiple Task calls):\n\n# Agent 1 — first batch\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n**THOROUGHNESS: medium**\n\nScan these specific PDFs for content angles:\n- file1.pdf\n- file2.pdf\n- file3.pdf\n- file4.pdf\n- file5.pdf\n- file6.pdf\n- file7.pdf\n\nFor each PDF, Read enough pages to understand the full thesis (use judgment — some need 1-2 pages, others 1-5):\n\nRead(file_path=\"/Users/dydo/Documents/agent/ksvc-intern/.claude/pdfs-scan/FILENAME.pdf\", pages=\"1-5\")\n\nFor each PDF extract:\n- Company/sector, ticker, rating, price target\n- Key thesis and supporting numbers\n- Supply chain connections\n- Potential content angles\n\nAfter scanning your batch, provide:\n1. Per-PDF summary (2-3 sentences each)\n2. Cross-document themes within your batch\n3. Which PDFs are most relevant for deep extraction\n\"\"\")\n\n# Agent 2 — second batch (SPAWN IN SAME MESSAGE as Agent 1)\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n... same prompt with file8.pdf through file14.pdf ...\n\"\"\")\n\n# Agent 3 — third batch (SPAWN IN SAME MESSAGE)\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n... same prompt with file15.pdf through file20.pdf ...\n\"\"\")\n\nStep 3: Main agent synthesizes results from all agents:\nAfter all agents return, the main agent:\n\nMerges per-PDF summaries\nIdentifies cross-agent themes (patterns Agent 1 found + patterns Agent 2 found)\nPicks top 3 content angles across all PDFs\nSelects 2-5 PDFs for Step 1b deep extraction"
      },
      {
        "title": "Output: Identify Which PDFs Matter",
        "body": "After scanning, you'll know:\n\nWhich reports have the best data\nCross-document connections (e.g., \"3 reports confirm memory shortage\")\nThesis recommendations (2-3 angles to explore)\nWhich to deep-extract with RLM\n\n⚠️ WARNING: Explore agents can hallucinate specific numbers. Treat all numbers from Explore summaries as \"unverified claims\" until RLM grep confirms them. Component counts, percentages, and market sizing are especially prone to errors.\n\nCapacity (tested 2026-02-07): Single Explore agent (haiku) handled 19 PDFs at medium thoroughness in 83 seconds, using 125k tokens (~4k tokens/PDF for pages 1-5). 3 agents in parallel = ~30-40s for the same batch."
      },
      {
        "title": "Step 1b: Deep Extract with RLM",
        "body": "Use RLM for deep extraction from specific PDFs you've identified in Step 1a.\n\nMANDATORY for any number you'll publish. Explore agents summarize; RLM verifies."
      },
      {
        "title": "When to Use",
        "body": "You know which 2-5 PDFs matter most\nNeed specific numbers, charts, tables\nBuilding cross-document verification tables\nExtracting technical details (fabs, yields, WPM)"
      },
      {
        "title": "Single PDF",
        "body": "cd ~/.claude/skills/rlm-repl/scripts\npython3 rlm_repl.py init \"/Users/Shared/ksvc/pdfs/YYYYMMDD/file.pdf\" --extract-images\npython3 rlm_repl.py exec -c \"print(grep('revenue|growth|target|price', max_matches=20, window=200))\""
      },
      {
        "title": "Multiple PDFs (synthesis)",
        "body": "cd ~/.claude/skills/rlm-repl-multi/scripts\npython3 rlm_repl.py init \"/path/to/report1.pdf\" --name report1 --extract-images\npython3 rlm_repl.py init \"/path/to/report2.pdf\" --name report2 --extract-images\npython3 rlm_repl.py exec -c \"results = grep_all('keyword', max_matches_per_context=20)\""
      },
      {
        "title": "View Extracted Charts/Images",
        "body": "# List images from a context\npython3 rlm_repl.py exec --name report1 -c \"print(list_images())\"\n\n# Get image path, then use Read tool to view\npython3 rlm_repl.py exec --name report1 -c \"print(get_image(0))\"\n\nCharts often contain key data (P/B trends, margin history, capacity timelines) that text extraction misses."
      },
      {
        "title": "Extraction Validation (MANDATORY)",
        "body": "⚠️ After EVERY rlm_repl.py init, validate the extraction actually worked.\n\nRLM reports chars_extracted after init. A multi-page analyst report should yield thousands of chars. If you get suspiciously few, the PDF is likely image-based and RLM only extracted metadata/headers.\n\nValidation rule:\n\n| Chars Extracted | Expected Report Type | Action |\n| --- | --- | --- |\n| > 5,000 | Multi-page report | ✅ Proceed with grep |\n| 1,000 - 5,000 | Short note / partial | ⚠️ Check list_images() — if many images, trigger fallback |\n| < 1,000 | Image-based PDF | ❌ MUST use Read tool fallback |\n\nThe threshold is context-dependent. A 20-page Goldman Sachs report yielding 666 chars is obviously broken. A 1-page pricing table yielding 800 chars might be fine. Use judgment, but when in doubt, fallback.\n\nMandatory Fallback when RLM extraction is low:\n\n# Step 1: RLM init (always try first)\npython3 rlm_repl.py init \"/path/to/report.pdf\" --extract-images\n# Output: \"Extracted 666 chars from 15 pages, saved 9 images\"\n\n# Step 2: Check - is 666 chars enough for a 15-page report? NO.\n# → Trigger fallback\n\n# Step 3: Check extracted images first (they may contain the data)\npython3 rlm_repl.py exec -c \"print(list_images())\"\n# View extracted images with Read tool\n# Read(file_path=\"/path/to/extracted/image-0.png\")\n\n# Step 4: Read the PDF directly (use symlinked path for subagents)\n# Read(file_path=\".claude/pdfs-scan/report.pdf\", pages=\"1-10\")\n# Read(file_path=\".claude/pdfs-scan/report.pdf\", pages=\"11-20\")\n\n⚠️ Path rule: Subagents must Read PDFs via the symlinked project path (.claude/pdfs-scan/), NOT from /Users/Shared/. See \"Subagent Permissions\" section above.\n\nWhy this exists (Case Study — ABF Substrate Shortage, 2026-02-07):\n\nGoldman Sachs published two reports: a main ABF upcycle report (71K chars, extracted fine) and a Kinsus upgrade report (15 pages, but only 666 chars extracted). We skipped the Kinsus PDF because \"the main report had everything we needed.\" It didn't. The Kinsus report had unique data (company-specific capacity plans, margin guidance, order book details) that would have strengthened the thread. Skipping it was lazy — the Read tool fallback takes 30 seconds and would have recovered the data.\n\nRules:\n\nNever skip a relevant PDF just because RLM extraction was low. Use the fallback.\nCheck extracted images. RLM with --extract-images often saves chart/table images even when text extraction fails. View them with Read tool.\nLog the fallback. In the extraction cache, note \"extraction_method\": \"read_fallback\" so audit knows the data source.\nIf fallback also fails (corrupted PDF, DRM), document it and move on. But you must TRY."
      },
      {
        "title": "RLM Cache: Include Visual Data",
        "body": "When extracting, capture all data types for potential chart generation later:\n\n| Source Type | What to Extract | Cache Format |\n| --- | --- | --- |\n| Text numbers | Exact quotes with page refs | {\"value\": 5.3, \"unit\": \"B\", \"source\": \"p.3\", \"quote\": \"規模約53億美元\"} |\n| Tables | Full table as structured JSON | {\"columns\": [...], \"rows\": [...], \"source\": \"p.20\"} |\n| Charts | Data points + source image path | {\"data\": {...}, \"source_image\": \"pdf-3-1.png\", \"page\": 3} |\n\nWhy cache visual data? Step 6 (chart generation) needs this. If you only cache text, you'll lose table structures and chart data points that make great visualizations."
      },
      {
        "title": "Cross-Document Reasoning",
        "body": "Build thesis by triangulating claims across multiple reports:\n\n# Find where multiple reports discuss the same topic\npython3 rlm_repl.py exec -c \"results = grep_all('DRAM.*price|ASP', max_matches_per_context=5)\"\n\n# Compare forecasts across sources\npython3 rlm_repl.py exec -c \"results = grep_all('2026|2027|growth|demand', max_matches_per_context=5)\"\n\nUse cross-doc to verify:\n\nDo multiple sources agree on price forecasts?\nAre supply constraint timelines consistent?\nAny contradictions between reports?"
      },
      {
        "title": "Step 1b.5: Build Extraction Cache (MANDATORY)",
        "body": "⚠️ Why this step exists: RLM creates state.pkl during extraction, but the writing phase (Step 3) doesn't access it. Without a persistent cache, writers rely on memory, leading to errors like wrong product types, missing time periods, or source attribution mistakes.\n\nWhat this does: Extracts from state.pkl (RLM's internal format) into structured JSON with context labels that the writing phase can reference."
      },
      {
        "title": "When to Run",
        "body": "After Step 1b (RLM extraction) and before Step 3 (writing).\n\n| Workflow | When to Cache |\n| --- | --- |\n| Single PDF (rlm-repl) | After rlm_repl.py init completes |\n| Multiple PDFs (rlm-repl-multi) | After all init commands complete |"
      },
      {
        "title": "How to Build Cache",
        "body": "New in v2: Auto-generates source tags and attribution map from PDF filenames!\n\nSingle PDF (rlm-repl):\n\ncd ~/.claude/skills/kirk-content-pipeline/scripts\n\n# Auto-extracts from default rlm-repl state location\npython3 build_extraction_cache.py \\\n  --output /path/to/draft-assets/rlm-extraction-cache.json\n\nMultiple PDFs (rlm-repl-multi):\n\ncd ~/.claude/skills/kirk-content-pipeline/scripts\n\n# Use --multi flag to load from rlm-repl-multi state\npython3 build_extraction_cache.py \\\n  --multi \\\n  --output /path/to/draft-assets/rlm-extraction-cache.json\n\nWith Cross-Doc Synthesis (Optional):\n\n# Add manual synthesis descriptions for cross-doc insights\npython3 build_extraction_cache.py \\\n  --multi \\\n  --output /path/to/draft-assets/rlm-extraction-cache.json \\\n  --synthesis /path/to/cross-doc-synthesis.json\n\nSynthesis format (optional, for complex multi-source threads):\n\n{\n  \"dual_squeeze_thesis\": {\n    \"description\": \"Memory shortage (1Q26) + ABF substrate shortage (2H26) = compounding AI server bottleneck\",\n    \"components\": [\n      {\"topic\": \"Memory Pricing\", \"source\": \"gfhk_memory\", \"timeframe\": \"1Q26\"},\n      {\"topic\": \"Abf Shortage\", \"source\": \"goldman_abf\", \"timeframe\": \"2H26-2028\"}\n    ]\n  }\n}\n\nWhat auto-generates:\n\n✅ Source tags from PDF filenames (\"GFHK - Memory.pdf\" → tag: \"GFHK\")\n✅ Topics with primary_source, key_metrics, source_context\n✅ Extraction entries with full context labels (product_type, time_period, units, scope)"
      },
      {
        "title": "Cache Format",
        "body": "The cache includes context labels and attribution map to prevent common errors:\n\n{\n  \"cache_version\": \"1.0\",\n  \"generated_at\": \"2026-02-05T14:00:00\",\n  \"sources\": [\n    {\n      \"source_id\": \"gfhk_memory\",\n      \"pdf_path\": \"/Users/Shared/ksvc/pdfs/20260204/GFHK - Memory.pdf\",\n      \"pdf_name\": \"GFHK - Memory price impact.pdf\",\n      \"tag\": \"GFHK\",\n      \"chars_extracted\": 13199\n    },\n    {\n      \"source_id\": \"goldman_abf\",\n      \"pdf_path\": \"/Users/Shared/ksvc/pdfs/20260204/Goldman ABF shortage.pdf\",\n      \"pdf_name\": \"Goldman Sachs ABF shortage report.pdf\",\n      \"tag\": \"Goldman Sachs\",\n      \"chars_extracted\": 25000\n    }\n  ],\n  \"extractions\": [\n    {\n      \"entry_id\": \"mem_001\",\n      \"source_id\": \"gfhk_memory\",\n      \"figure\": \"Figure 2\",\n      \"page\": 3,\n      \"metric\": \"Total BOM\",\n      \"product_type\": \"HGX B300 8-GPU server\",\n      \"time_period\": \"3Q25 → 1Q26E\",\n      \"units\": \"dollars per server\",\n      \"scope\": \"single HGX B300 8-GPU server\",\n      \"values\": {\n        \"before\": \"$369k\",\n        \"after\": \"$408k\",\n        \"change\": \"+$39k\"\n      },\n      \"context\": \"Memory price impact on AI server BOM\",\n      \"source_quote\": \"Figure 2: HGX B300 8-GPU server BOM...\",\n      \"verification\": \"RLM grep + visual inspection\"\n    }\n  ],\n  \"source_attribution_map\": {\n    \"topics\": {\n      \"Memory Pricing\": {\n        \"primary_source\": \"gfhk_memory\",\n        \"tag\": \"GFHK\",\n        \"key_metrics\": [\"HBM3e ASP\", \"DDR5-6400 (128GB)\", \"NVMe SSD (3.84TB)\", \"Total BOM\"],\n        \"source_context\": \"Figures: Figure 2; Time periods: 3Q25 → 1Q26E\",\n        \"notes\": \"4 extractions from this source\"\n      },\n      \"Abf Shortage\": {\n        \"primary_source\": \"goldman_abf\",\n        \"tag\": \"Goldman Sachs\",\n        \"key_metrics\": [\"ABF shortage ratio\", \"Kinsus PT\", \"NYPCB PT\", \"Unimicron PT\"],\n        \"source_context\": \"Time periods: 2H26, 2027, 2028\",\n        \"notes\": \"5 extractions from this source\"\n      }\n    },\n    \"cross_doc_synthesis\": {\n      \"dual_squeeze_thesis\": {\n        \"description\": \"Memory shortage (1Q26) + ABF substrate shortage (2H26) = compounding AI server bottleneck\",\n        \"components\": [\n          {\"topic\": \"Memory Pricing\", \"source\": \"gfhk_memory\", \"timeframe\": \"1Q26\"},\n          {\"topic\": \"Abf Shortage\", \"source\": \"goldman_abf\", \"timeframe\": \"2H26-2028\"}\n        ]\n      }\n    }\n  }\n}\n\nKey fields that prevent errors:\n\nproduct_type: Prevents \"GB300 rack\" when source says \"HGX B300 server\"\ntime_period: Prevents missing \"3Q25 → 1Q26E\" context\nsource_id: Prevents \"Goldman's BOM\" when data is from GFHK\ntag: Auto-extracted from PDF filename for quick attribution\nunits: Prevents \"22.5B racks\" when source means \"22.5bn dollars\"\nscope: Prevents \"per rack\" when source means \"per server\"\n\nAttribution map benefits:\n\ntopics: Topic-level mapping showing which source is primary authority\nkey_metrics: Quick lookup of what each source covers\nsource_context: Summary of figures, time periods covered\ncross_doc_synthesis: Manual insights connecting multiple sources"
      },
      {
        "title": "Integration with Step 3 (Writing)",
        "body": "MANDATORY: Reference the cache when writing.\n\nStep 3a: Load cache and attribution map:\n\ncache = load_json('rlm-extraction-cache.json')\nattr_map = cache['source_attribution_map']\n\n# Get topic attribution\ntopic = \"Memory Pricing\"\nsource_tag = attr_map['topics'][topic]['tag']  # \"GFHK\"\nkey_metrics = attr_map['topics'][topic]['key_metrics']\n\nStep 3b: Write using cache labels and attribution:\n\n## Content\n\n3/ Memory squeeze is already here. GFHK's BOM breakdown (3Q25 → 1Q26E):\n- HBM3e ASP: $3,756 → $4,378 (+17%)\n- DDR5-6400 (128GB): $563 → $1,920 (+241%)\n- HGX B300 8-GPU server BOM: $369k → $408k\n\nSource: rlm-extraction-cache.json, entry mem_001, mem_002, mem_003\n\nContext labels from cache:\n\nProduct type: HGX B300 8-GPU server (not GB300 rack)\nTime period: 3Q25 → 1Q26E (quarterly change)\nSource: GFHK Figure 2 (via attribution map tag)\n\nAttribution map usage:\n\nUsed topics[\"Memory Pricing\"][\"tag\"] → \"GFHK\"\nVerified metrics against key_metrics list\nCross-doc synthesis: See dual_squeeze_thesis for memory + ABF connection"
      },
      {
        "title": "Enforcement",
        "body": "Before saving draft (Step 5), verify:\n\n- [ ] Every published number has a cache entry\n- [ ] Product types match cache labels\n- [ ] Time periods included from cache\n- [ ] Source attributions match cache source_id and attribution map tag\n- [ ] Units match cache (dollars vs racks, per server vs per datacenter)\n- [ ] Cross-doc claims reference cross_doc_synthesis if applicable\n\nRed flags - stop if you notice:\n\nWriting numbers from memory instead of cache\nProduct type differs from cache (product_type field)\nMissing time period when cache has time_period\nAttributing to wrong source vs cache source_id\nUsing wrong tag (e.g., \"Goldman\" for GFHK data)\nMissing cross-doc synthesis when connecting multiple sources"
      },
      {
        "title": "Manual Cache Building",
        "body": "If automatic extraction fails, manually create cache entries:\n\n{\n  \"entry_id\": \"manual_001\",\n  \"source_id\": \"report_name\",\n  \"metric\": \"Component count\",\n  \"product_type\": \"Humanoid robot (dexterous hand)\",\n  \"values\": {\"count\": 22},\n  \"units\": \"DOF (degrees of freedom)\",\n  \"context\": \"Dexterous hand articulation\",\n  \"source_quote\": \"22自由度靈巧手\",\n  \"verification\": \"Manual extraction from p.15\",\n  \"notes\": \"Summed from finger joints (20) + wrist (2)\"\n}\n\nSee: ~/.claude/skills/kirk-content-pipeline/scripts/README-extraction-cache.md for full documentation."
      },
      {
        "title": "Step 1c: Cross-Doc Synthesis (RECOMMENDED)",
        "body": "Why this step exists: Steps 1a and 1b produce per-document facts. Without explicit synthesis, the pipeline gravitates toward single-source claims (\"KHGEARS P/E is 20x\") rather than cross-doc insights (\"Taiwan brokers are more bullish than Western analysts on humanoid robotics\")."
      },
      {
        "title": "When to Use",
        "body": "| Scenario | Use 1c? |\n| --- | --- |\n| Multiple PDFs on same topic | Yes |\n| Comparing broker views | Yes |\n| Finding consensus/disagreement | Yes |\n| Single PDF deep dive | No (skip to Step 2) |\n| Breaking news (speed matters) | No (skip to Step 2) |"
      },
      {
        "title": "What 1c Produces",
        "body": "| Output Type | Example | Audit Requirement |\n| --- | --- | --- |\n| Consensus claim | \"3 of 4 brokers see DRAM ASP rising in 2H26\" | Cross-doc (rlm-multi) |\n| Comparative insight | \"HIWIN at 38x vs KHGEARS at 20x - market pricing in certainty\" | Cross-doc (rlm-multi) |\n| Disagreement flag | \"MS says neutral, local brokers say buy - who's right?\" | Cross-doc (rlm-multi) |\n| Synthesized thesis | \"Taiwan supply chain undervalued vs China peers\" | Cross-doc (rlm-multi) |"
      },
      {
        "title": "How to Run Cross-Doc Synthesis",
        "body": "cd ~/.claude/skills/rlm-repl-multi/scripts\n\n# Initialize all relevant PDFs\npython3 rlm_repl.py init \"/path/to/broker1.pdf\" --name broker1\npython3 rlm_repl.py init \"/path/to/broker2.pdf\" --name broker2\npython3 rlm_repl.py init \"/path/to/broker3.pdf\" --name broker3\n\n# Ask synthesis questions (not just extraction)\npython3 rlm_repl.py exec -c \"\n# Question 1: Do they agree on market sizing?\nmarket_data = grep_all('market size|TAM|規模|billion|億', max_matches_per_context=10)\nprint('=== MARKET SIZE ACROSS SOURCES ===')\nprint(market_data)\n\"\n\npython3 rlm_repl.py exec -c \"\n# Question 2: Compare recommendations\nratings = grep_all('BUY|SELL|NEUTRAL|買進|賣出|中立|rating|recommendation', max_matches_per_context=10)\nprint('=== RATINGS COMPARISON ===')\nprint(ratings)\n\"\n\npython3 rlm_repl.py exec -c \"\n# Question 3: Find disagreements\npe_data = grep_all('P/E|PE|本益比|target price|目標價', max_matches_per_context=10)\nprint('=== VALUATION COMPARISON ===')\nprint(pe_data)\n\""
      },
      {
        "title": "Synthesis Questions to Ask",
        "body": "| Category | Questions |\n| --- | --- |\n| Consensus | Do sources agree on [market size / timeline / key risk]? |\n| Comparison | How does [broker A] view differ from [broker B]? |\n| Valuation | Are local vs foreign analysts pricing the same? |\n| Timeline | Do sources agree on [catalyst / inflection point]? |\n| Risk | What risks does one source mention that others miss? |"
      },
      {
        "title": "Output Format: Synthesis Cache",
        "body": "After running 1c, document synthesized insights for Step 3 (writing):\n\n## Cross-Doc Synthesis (Step 1c)\n\n**Sources:** broker1 (永豐), broker2 (MS), broker3 (Citi)\n\n### Consensus\n- Market size: All 3 agree on $5-6B (2025) → $30-35B (2029)\n- CAGR: 55-60% range across all sources\n\n### Disagreements\n- HIWIN: MS says NEUTRAL (38x too rich), 永豐 silent, Citi no coverage\n- Timeline: 永豐 more bullish on 2026 ramp, MS cautious until 2027\n\n### Comparative Insights (use in thread)\n- \"Taiwan brokers (永豐) bullish on KHGEARS; Western analysts (MS) more cautious on HIWIN\"\n- \"Local coverage sees 2026 inflection; foreign houses waiting for 2027 proof points\"\n\n### Audit Flag\nThese synthesized claims require cross-doc verification in Step 4a:\n- [ ] \"3 sources agree on market size\" → verify all 3 sources\n- [ ] \"Local vs foreign view divergence\" → verify specific ratings from each"
      },
      {
        "title": "Integration with Audit (Step 4a)",
        "body": "⚠️ CRITICAL: Synthesized claims from Step 1c MUST be flagged for cross-doc audit in Step 4a.\n\nIn the audit manifest, mark these claims with cross-doc: true:\n\n## Claims to Verify\n\n| # | Claim | Type | Source ID | Cross-Doc? |\n|---|-------|------|-----------|------------|\n| 1 | KHGEARS P/E 20x | P/E | src1 | No |\n| 2 | Market consensus $5.3B | Consensus | src1, src2, src3 | **Yes** |\n| 3 | Local vs foreign view divergence | Synthesis | src1, src2 | **Yes** |\n\nCross-doc claims use rlm-repl-multi for verification, not parallel single-doc agents."
      },
      {
        "title": "Extract with Technical Specificity",
        "body": "Go beyond surface numbers. Extract:\n\nWafer capacity (WPM)\nFab names (M15X, P4L, X2)\nYield percentages\nProcess nodes (1b, 1c)\nComponent counts per unit\n\nQuestionExtractWhatOne-sentence summaryWhyWhy readers should careWhoCompanies/tickers affectedWhenTimeline (specific quarters)WhereFab locations, geographyHowMechanism with technical detail"
      },
      {
        "title": "Step 2: Check KSVC Holdings (Initial)",
        "body": "⚠️ CRITICAL: This is a preliminary check. You MUST run Step 4c (Final Holdings Verification) after writing content to catch any tickers discovered during extraction."
      },
      {
        "title": "All Models (7 Total)",
        "body": "US Models: usa-model1 ~ usa-model5 (5 models)\nTaiwan (TWSE) Models: twse-model1 ~ twse-model2 (2 models)"
      },
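      {
        "title": "Example: Loop Over All 7 Models (Sketch)",
        "body": "A minimal sketch, assuming the kicksvc.online endpoint naming used throughout this skill (usa-model1..5, twse-model1..2); the jq filter here is illustrative - swap in whatever query you need:\n\nfor m in usa-model1 usa-model2 usa-model3 usa-model4 usa-model5 twse-model1 twse-model2; do\n  echo \"=== $m ===\"\n  curl -s \"https://kicksvc.online/api/$m\" | jq '.tradebook | length'\ndone"
      },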
      {
        "title": "Step 2a: Identify All Possible Tickers",
        "body": "Before querying the API, identify ALL possible identifiers for the company:\n\n# Example: Global Unichip Corp\n# Identifiers to search:\n# - US ticker: N/A (not US-listed)\n# - Taiwan ticker: 3443\n# - Chinese name: 創意 or 全球晶圓科技\n# - English name: Global Unichip, GUC\n# - Stock code: 3443 TW (TWSE format)\n\n# For Taiwan stocks, verify ticker via TWSE API first:\ncurl -s \"https://www.twse.com.tw/en/api/codeQuery?query=3443\"\n# Returns: {\"query\":\"3443\",\"suggestions\":[\"3443\\tGUC\"]}\n\nRules:\n\nUS stocks: Search by ticker only (e.g., \"MU\", \"AMD\", \"NVDA\")\nTaiwan stocks: Search by stock code (e.g., \"3443\") - may appear as \"3443 創意\" in API\nIf unsure: Check both US and TWSE models"
      },
      {
        "title": "Step 2b: Query All 7 Models",
        "body": "NEVER assume a stock isn't held without checking ALL 7 models.\n\nRECOMMENDED: Use tradebook for accurate entry prices and current status\n\n# FASTEST METHOD: Check tradebook for entry price + status\n# (Works for all models - US and TWSE)\ncurl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.tradebook[] | select(.ticker == 6285 or .ticker == 3491) |\n  {ticker, enterDate, enterPrice, todayPrice, profitPercent, exitDate}'\n\n# Returns:\n# {\n#   \"ticker\": 6285,\n#   \"enterDate\": \"Wed, 28 Jan 2026 00:00:00 GMT\",\n#   \"enterPrice\": 162.0,\n#   \"todayPrice\": 207.5,  # ⚠️ May be stale! Use Yahoo Finance for current\n#   \"profitPercent\": 28.09,  # ⚠️ Based on stale todayPrice\n#   \"exitDate\": null  # null = still holding\n# }\n\n⚠️ CRITICAL: API's todayPrice and profitPercent can be STALE (hours or days old). Always verify current price with Yahoo Finance API (Step 2d).\n\nFALLBACK: Check equitySeries (slower, less data)\n\n# Check ALL 5 US models\nfor i in 1 2 3 4 5; do\n  echo \"=== USA-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/usa-model$i\" | \\\n    jq --arg t \"MU\" '.equitySeries[0].series[] | select(.Ticker == $t) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\n# Check ALL 2 TWSE models (search by stock code)\nfor i in 1 2; do\n  echo \"=== TWSE-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/twse-model$i\" | \\\n    jq '.equitySeries[0].series[] | select(.Ticker | contains(\"3443\")) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\nWhy still use equitySeries?\n\nHistorical tracking: Shows return % evolution over time (.data[] array)\nVerification: Confirms position is still active\nFallback: If tradebook is unavailable or empty\nEntry date discovery: First data point (return ≈ 0) indicates entry date\n\nExample: Finding entry date from equitySeries\n\n# Get all data points to find entry date\ncurl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq 
'.equitySeries[0].series[] | select(.Ticker | contains(\"6285\")) | .data[0]'\n# Returns: {\"date\": \"2026-01-28 00:00:00\", \"value\": 0}\n# Entry date: Jan 28, 2026"
      },
      {
        "title": "Step 2c: Verification and Fallback Strategy",
        "body": "Use all three data sources for robustness:\n\nData SourceWhen to UseWhat It ShowsLimitationtradebookPrimaryEntry date, entry price, exit statustodayPrice may be staleequitySeriesVerificationReturn % over time, position statusNo entry price/datefilledOrdersFallbackActual trade orders, pricesEmpty if model didn't reset recently\n\nRecommended workflow:\n\n# 1. PRIMARY: Get entry details from tradebook\nTRADEBOOK=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.tradebook[] | select(.ticker == 6285)')\n\n# 2. VERIFY: Cross-check with equitySeries\nEQUITY=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.equitySeries[0].series[] | select(.Ticker | contains(\"6285\"))')\n\n# 3. FALLBACK: If tradebook empty, check filledOrders\nif [ -z \"$TRADEBOOK\" ]; then\n  curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n    jq '.filledOrders[] | select(.ticker | contains(\"6285\"))'\nfi\n\nCross-verification example:\n\n# Check if tradebook and equitySeries agree on position status\nTRADEBOOK_HELD=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.tradebook[] | select(.ticker == 6285 and .exitDate == null) | .ticker')\n\nEQUITY_HELD=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.equitySeries[0].series[] | select(.Ticker | contains(\"6285\")) | .Ticker')\n\n# If both show position, high confidence\n# If only one shows position, investigate discrepancy\n\nFallback: Check filledOrders (if tradebook empty)\n\nIf equitySeries is empty OR tradebook is empty (rare, but possible after model reset):\n\n# Check ALL US models - filledOrders\nfor i in 1 2 3 4 5; do\n  echo \"=== USA-Model $i filledOrders ===\"\n  curl -s \"https://kicksvc.online/api/usa-model$i\" | \\\n    jq '.filledOrders[] | select(.ticker == \"MU\") | {ticker, price, quantity}'\ndone\n\n# Check ALL TWSE models - filledOrders\nfor i in 1 2; do\n  echo \"=== TWSE-Model $i filledOrders ===\"\n  curl -s 
\"https://kicksvc.online/api/twse-model$i\" | \\\n    jq '.filledOrders[] | select(.ticker | contains(\"3443\")) | {ticker, price, quantity}'\ndone\n\nWhen data sources disagree:\n\nScenarioActiontradebook shows position, equitySeries doesn'tTrust tradebook (equitySeries may lag)equitySeries shows position, tradebook doesn'tInvestigate - check filledOrdersfilledOrders shows buy but no current positionPosition was closed - check tradebook.exitDateAll three emptyPosition not held in this model"
      },
      {
        "title": "Step 2e: Document Holdings with Accurate Returns",
        "body": "CRITICAL: Always calculate actual returns using:\n\nEntry price from tradebook.enterPrice\nCurrent price from Yahoo Finance API (NOT KSVC API's stale todayPrice)\n\nOutput format (with accurate data):\n\n**KSVC Holdings Check:**\n- ✅ WNC (6285.TW) - Held in TWSE Model 2\n  - Entry: Jan 28, 2026 @ NT$162\n  - Current: NT$187 (Yahoo Finance)\n  - Gain: +15.4% (actual, not API's stale 28%)\n- ✅ UMT (3491.TWO) - Held in TWSE Model 2\n  - Entry: Jan 28, 2026 @ NT$1,120\n  - Current: NT$1,280 (Yahoo Finance)\n  - Gain: +14.3% (actual, not API's stale 23%)\n- ❌ Not held in TWSE Model 1 or USA Models 1-5\n\n**Note:** API's equitySeries and tradebook.todayPrice can lag hours/days behind market.\nAlways use Yahoo Finance for current prices.\n\nIf NOT held in any model:\n\n**KSVC Holdings Check:**\n- ❌ Not held in any of 7 models (checked USA 1-5, TWSE 1-2)\n- Content angle: Industry analysis / Market observation"
      },
      {
        "title": "Integration Strategies",
        "body": "SituationApproachExampleHeld (US)Call out position\"KSVC Model1 holds $MU at $412 entry\"Held (TW)Call out position\"KSVC台股Model1持有台積電 (2330)\"Not heldIndustry framing\"Memory cycle benefits $MU, SK Hynix\"WinVictory lap\"$MU +15% since Model1 added it\""
      },
      {
        "title": "Step 2d: Current Price Check (Yahoo Finance API - REQUIRED)",
        "body": "⚠️ CRITICAL: ALWAYS use Yahoo Finance for current prices. KSVC API's todayPrice can be stale.\n\nUS stocks:\n\n# Get current price\nTICKER=\"MU\"\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/$TICKER?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta.regularMarketPrice'\n\n# Get full market data\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/$TICKER?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta | {symbol, regularMarketPrice, currency, regularMarketTime}'\n\nTaiwan stocks (use .TW or .TWO suffix):\n\n# WNC (6285.TW)\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/6285.TW?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta | {symbol, regularMarketPrice, currency, regularMarketTime}'\n\n# UMT (3491.TWO - OTC stocks use .TWO)\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/3491.TWO?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta | {symbol, regularMarketPrice, currency, regularMarketTime}'\n\nTaiwan ticker suffixes:\n\n.TW - Listed on Taiwan Stock Exchange (TWSE)\n.TWO - Listed on Taipei Exchange (TPEx/OTC)\n\nCalculate actual gain (not API's stale profit%):\n\n# Example: WNC\nTICKER=\"6285.TW\"\nENTRY=162  # From tradebook.enterPrice\nCURRENT=$(curl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/$TICKER?interval=1d&range=1d\" | jq '.chart.result[0].meta.regularMarketPrice')\necho \"$TICKER: NT\\$$CURRENT | Entry: NT\\$$ENTRY | Gain: $(awk \"BEGIN {printf \\\"%.1f\\\", ($CURRENT - $ENTRY) / $ENTRY * 100}\")%\"\n\n# Output: 6285.TW: NT$187 | Entry: NT$162 | Gain: +15.4%\n\nComplete workflow (tradebook + Yahoo Finance):\n\n# 1. Get entry price from tradebook\nENTRY=$(curl -s \"https://kicksvc.online/api/twse-model2\" | jq '.tradebook[] | select(.ticker == 6285) | .enterPrice')\n\n# 2. 
Get current price from Yahoo Finance\nCURRENT=$(curl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/6285.TW?interval=1d&range=1d\" | jq '.chart.result[0].meta.regularMarketPrice')\n\n# 3. Calculate actual gain\necho \"Entry: NT\\$$ENTRY | Current: NT\\$$CURRENT | Gain: $(awk \"BEGIN {printf \\\"%.1f\\\", ($CURRENT - $ENTRY) / $ENTRY * 100}\")%\""
      },
      {
        "title": "Step 3: Write Content",
        "body": "See references/kirk-voice.md for full templates and examples."
      },
      {
        "title": "Thread Numbering Convention",
        "body": "FormatWhen to UseNo number on Tweet 1Recommended - cleaner hook, stands alone if quoted/shared2/, 3/, etc.Standard thread format - signals \"2 of N\"1/ on first tweetOptional - explicit \"thread incoming\" signal\n\nWhy skip number on first tweet:\n\nHook tweet often gets shared standalone\n\"1/\" makes it look incomplete out of context\nCleaner visual presentation\n\nFormat preference: Use / not ) - it's the established Twitter thread convention.\n\n✅ Recommended:\nHumanoid robots going from science fair to factory floor. Taiwan supply chain getting interesting.\n\n2/ TLDR:\n- Market: $5.3B (2025) to $32.4B (2029)...\n\n❌ Avoid:\n1/ Humanoid robots going from science fair..."
      },
      {
        "title": "Pick Content Type",
        "body": "What kind of content? (Thread / Quick Take / Breaking / Shitpost / Commentary / Victory Lap)\nLook up the formula in kirk-voice.md\nApply the blend"
      },
      {
        "title": "Technical Specificity",
        "body": "❌ Vague: \"NAND supply is tight\"\n\n✅ Specific: \"YMTC adding 135k WPM at Wuhan Fab 3. Still won't close the gap - Samsung X2 conversion delayed to Q2.\"\n\n❌ Vague: \"HBM margins are good\"\n\n✅ Specific: \"SK Hynix HBM yields at 80-90%. Samsung stuck at 60% on 1c DRAM.\"\n\nAlways include: specific numbers, time frames, fab names, comparisons."
      },
      {
        "title": "Referential Clarity (Learned 2026-02-08)",
        "body": "Never use vague pronouns or shorthand when the referent hasn't been introduced.\n\nIn thread format, each tweet may be read semi-independently. If earlier tweets discuss a concept as a category (e.g., \"ASIC revenue\"), don't suddenly refer to it as \"the project\" in a later tweet — the reader has no antecedent for \"the project.\"\n\n❌ Vague: \"MS thinks the project is the 3nm Google TPU\"\n(What project? The thread never introduced \"a project.\")\n\n✅ Clear: \"MS thinks the main client/program is the 3nm Google TPU\"\n(Names what MS is identifying — who's buying and what they're building.)\n\nRule: When a shorthand (\"the project\", \"this deal\", \"the play\") saves words but costs clarity, it's not saving anything. Name the thing directly. A few extra words that prevent the reader from pausing to re-read are always worth it.\n\nWhen shifting from category to specific: If the thread discusses an abstract category (ASIC revenue, memory supply) and then pivots to a specific entity (Google TPU, Samsung fab), bridge the transition. Don't assume the reader already knows which specific thing drives the category."
      },
      {
        "title": "Step 4a: Audit (MANDATORY — MUST USE SUBAGENTS)",
        "body": "⚠️ WHY THIS STEP EXISTS: We learned that RLM extraction (Step 1b) is not the same as verification. Explore agents hallucinate numbers. Writers make inferences. This step catches errors BEFORE publishing.\n\n⚠️ STRUCTURAL GATE: You (the main agent) are the WRITER. You cannot also be the AUDITOR. You MUST delegate audit to fresh-context subagents. See the \"WARM STATE TRAP\" section in the audit-content skill for why."
      },
      {
        "title": "Step 4a Process (3 actions, in order)",
        "body": "Action 1: Generate audit manifest\n\nWrite audit-manifest.md with all claims, sources, and search hints.\nThis is the handoff document for the audit agents.\n\nAction 2: Spawn Explore agents (MANDATORY — do NOT skip this)\n\nSpawn 1 Explore agent per source PDF via Task tool.\nEach agent gets: the manifest + its assigned PDF path + claim list.\nEach agent returns: JSON with PASS/FAIL/UNSOURCED per claim.\n\n⚠️ WARM STATE TRAP: If RLM is already loaded from Step 1b, you WILL be tempted to \"just grep it yourself.\" DO NOT. The audit-content skill explains why: you wrote the draft, so you already \"know\" the answers. Self-auditing is confirmation bias, not verification.\n\nSelf-check: If you are about to type rlm_repl.py exec during Step 4a, STOP. You are skipping the gate.\n\nAction 3: Collect results and write audit report\n\nAggregate agent results into audit-report.md.\nMUST include audit_agent_ids from the Task tool responses.\nIf audit_agent_ids is empty, the audit is invalid."
      },
      {
        "title": "Invoke the audit-content skill for full process details:",
        "body": "/audit-content"
      },
      {
        "title": "What Gets Verified",
        "body": "Claim TypeExampleHow to VerifyCompany names\"KHGEARS\"RLM grep + TWSE APITicker formats\"4571 TW\"TWSE APINumbers\"62 harmonic reducers\"RLM grep exact countPercentages\"19% cost\"RLM grep in sourceP/E ratios\"20x\"RLM grep analyst targetRatings\"BUY\"RLM grep recommendationTimelines\"2H27\"RLM grep + verify contextAttributions\"shipping to X\"Must be explicit in source, not inferred"
      },
      {
        "title": "When to Proceed",
        "body": "All PASS: Save draft (Step 5)\nAny FAIL: Fix the claim, re-audit\nUNSOURCED: Either remove, add caveat (\"reportedly\"), or find source\n\nDo NOT save draft with FAIL status. UNSOURCED claims need explicit decision."
      },
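      {
        "title": "Example: Audit Gate Check (Sketch)",
        "body": "A hedged sketch of the save gate above, assuming the audit report uses literal PASS/FAIL/UNSOURCED status strings (filename illustrative):\n\nif grep -q 'FAIL' audit-report.md; then\n  echo 'FAIL claims present - fix and re-audit before saving draft'\nelif grep -q 'UNSOURCED' audit-report.md; then\n  echo 'UNSOURCED claims present - remove, caveat, or find a source first'\nelse\n  echo 'All claims PASS - proceed to Step 5'\nfi"
      },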
      {
        "title": "Step 4a.5: Gemini Web Cross-Validation (RECOMMENDED)",
        "body": "⚠️ WHY THIS STEP EXISTS: RLM audit (Step 4a) is source-locked — it only checks claims against the cited PDF. This over-flags reasonable inferences that go beyond one report but are well-documented publicly. Step 4a.5 gives flagged claims a second chance via web-grounded search.\n\nCase Study (Old Memory Squeeze, 2026-02-07):\n\nDraft said \"capacity getting cannibalized for HBM and DDR5\"\nRLM audit FAIL: MS report says \"exiting DDR4\" / \"cannibalization\" but doesn't name HBM/DDR5 as destination\nGemini confirmed: TrendForce, DigiTimes, The Elec all document the DDR4→HBM/DDR5 shift\nResult: Claim restored with dual attribution (MS + public sources)\n\nSame thread: \"Samsung, Kioxia, Micron all reducing MLC NAND\" — MS only confirmed Samsung, said Kioxia/Micron \"could\" reduce. Gemini confirmed all three are actively reducing per TrendForce (41.7% YoY MLC NAND capacity decrease)."
      },
      {
        "title": "When to Use",
        "body": "RLM Audit ResultUse Gemini?WhyFAIL — wrong numberNoNumber errors need source correction, not web searchFAIL — inference beyond sourceYesInference may be valid per public sourcesFAIL — misattributionMaybeCheck if correct attribution exists publiclyUNSOURCED — claim not in cited PDFYesClaim may be common industry knowledgePASSNoAlready verified"
      },
      {
        "title": "How to Run",
        "body": "# For each FAIL/UNSOURCED claim that looks like a reasonable inference:\ngemini -p \"Search the web: [specific factual question about the inference].\nI need external industry sources (TrendForce, DigiTimes, The Elec, Reuters,\ncompany earnings calls) from 2025-2026 confirming or denying this.\"\n\nKey: Ask Gemini to search the web explicitly. Without \"search the web\", Gemini may read local files instead."
      },
      {
        "title": "Decision Matrix",
        "body": "Gemini ResultActionConfirmed with public sourcesRestore claim, add dual attribution (source + Gemini web)Partially confirmedSoften language to match what's confirmedNot confirmed / contradictedKeep as FAIL, fix or remove claimGemini unsure / no sources foundKeep as FAIL (conservative default)"
      },
      {
        "title": "Audit Report Format",
        "body": "Update claims restored via Gemini:\n\n| # | Claim | Source | Status |\n|---|-------|--------|--------|\n| 5 | DDR4 capacity → HBM/DDR5 | ms_old_memory + Gemini web | PASS (restored) |\n| 8 | All three reducing MLC NAND | ms_old_memory + Gemini web | PASS (restored) |\n\nInclude Gemini's sources in the audit resolution log:\n\n### Claim 5 (RESTORED via Gemini)\n- **RLM flagged:** Source doesn't name HBM/DDR5 as destination\n- **Gemini confirmed:** TrendForce, DigiTimes, The Elec document capacity conversion\n- **Resolution:** Restored with corrected framing"
      },
      {
        "title": "Guidelines",
        "body": "Only use for inferences and industry knowledge claims, NOT for number verification\nGemini's web search is a second opinion, not the final word — present both perspectives\nIf Gemini contradicts both the source and common sense, flag for human review\nKeep Gemini queries specific and focused — one claim per query"
      },
      {
        "title": "Step 4b: Final Holdings Verification (MANDATORY)",
        "body": "⚠️ CRITICAL: This is the FINAL holdings check. You MUST run this after writing content because:\n\nTickers/stock codes may be discovered during extraction (Step 1b)\nCompany names may be clarified during audit (Step 4b)\nStep 2 was a preliminary check with limited information"
      },
      {
        "title": "Why This Step Exists",
        "body": "Problem: You might learn the correct ticker late in the pipeline.\n\nExample - GUC Case:\n\nStep 1b: Learn company is \"Global Unichip (GUC)\"\nStep 1b: Extract ticker \"3443 TW\" from report\nStep 2: ❌ Assumed \"not held\" without actually checking TWSE models\nStep 3: Wrote \"I don't have a position here\"\nStep 4c: ✅ Discovered GUC IS held in TWSE Model 1 (+2.22%)\nResult: Had to rewrite content to reflect actual position"
      },
      {
        "title": "Step 4c Process",
        "body": "1. Extract ALL tickers/identifiers from the draft:\n\n# Read the draft and extract tickers\ngrep -E \"[0-9]{4}|\\\\$[A-Z]{2,5}\" draft.md\n\n# Example output:\n# - 3443 TW (Taiwan stock)\n# - $MU (US stock)\n# - AMD, NVDA (US stocks)\n\n2. For EACH ticker, check ALL 7 models:\n\n# Taiwan stock example: 3443\nfor i in 1 2; do\n  echo \"=== TWSE-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/twse-model$i\" | \\\n    jq '.equitySeries[0].series[] | select(.Ticker | contains(\"3443\")) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\n# US stock example: MU\nfor i in 1 2 3 4 5; do\n  echo \"=== USA-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/usa-model$i\" | \\\n    jq --arg t \"MU\" '.equitySeries[0].series[] | select(.Ticker == $t) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\n3. Compare Step 2 vs Step 4c results:\n\n**Holdings Verification:**\n\nStep 2 (Initial): Claimed \"Not held\"\nStep 4c (Final): ✅ Found in TWSE Model 1 (+2.22%)\n\n**Action Required:** Update draft to reflect actual position\n\n4. If holdings status changed, update draft:\n\n# Before (Step 2):\n\"I don't have a position here, but watching...\"\n\n# After (Step 4c):\n\"KSVC holds GUC in TWSE Model 1 (+2.22% since entry). Watching...\""
      },
      {
        "title": "Decision Matrix",
        "body": "Step 2Step 4cActionNot heldNot held✅ No change neededNot heldHELD❌ UPDATE DRAFT - change content angleHeldHeld✅ Verify return % is currentHeldNot held❌ UPDATE DRAFT - position was closed"
      },
      {
        "title": "Output Format",
        "body": "**Step 4c: Final Holdings Verification**\n\n✅ Verified ALL 7 models (USA 1-5, TWSE 1-2)\n\n**Tickers checked:**\n- 3443 (GUC): ✅ Found in TWSE Model 1 (+2.22%)\n- $MU: ❌ Not held\n- $AMD: ✅ Found in USA Model 3 (+12.5%)\n\n**Changes required:**\n- Update draft line 32: Add KSVC position note for GUC\n- Update draft line 45: Add KSVC position note for AMD"
      },
      {
        "title": "Step 4c: Stylize (MANDATORY)",
        "body": "Why this step exists: The data backbone (Step 3) is Serenity-heavy - precise, comprehensive, verified facts. Step 4c transforms it into Kirk's authentic voice with emotional range and character.\n\nInvoke the kirk-mode skill:\n\n/kirk-mode"
      },
      {
        "title": "What Kirk Mode Does",
        "body": "Transforms verified data into Kirk's voice by:\n\nMode selection - Matches Kirk's emotional mode to situation (Analytical, Sarcastic, Emo, Shitpost, Degen, GIF Master)\nVoice elements - Adds discovery moments (\"Wait though\"), reactions (\"wayyy bigger\"), first-person thesis\nMeme culture - Integrates fintwit slang (ngmi, wagmi, brother, probably nothing) strategically\nAnti-formula - Rotates structure to prevent templating (varies TLDR → \"ok so\" → question)\nCredibility balance - Online enough to relate, credible enough to trust"
      },
      {
        "title": "When to Use Each Mode",
        "body": "SituationKirk ModeExampleDeep fundamental diveAnalytical\"ok so\", \"Wait though\", data-heavy with reactionsMarket absurditySarcastic\"brother Elon literally applied for 1M satellites\"Positions downEmo\"honestly getting wrecked\", vulnerable lowercaseQuick reaction/memeShitpostme/also me format, nobody: formatHigh-conviction risky playDegen\"sir this is a casino\", YOLO energyVictory lapGIF Master + AnalyticalPerfect GIFs + receipts\n\nMost natural: Mix modes in single post (Analytical + Sarcastic + maybe GIF)"
      },
      {
        "title": "Workflow",
        "body": "Assess situation - What's happening? (Deep dive, absurd market, position down, quick reaction)\nSelect mode(s) - Use kirk-mode decision tree or mix modes naturally\nApply voice toolkit - Discovery moments, strategic \"wayyy\", emphasis markers\nCheck meme integration - Would slang/GIF enhance or distract from analysis?\nVerify authenticity - Read aloud: sounds like intern at bar or ChatGPT report?\n\nOutput: Transformed content with Kirk's character voice - ready for humanizer pass.\n\nSee kirk-mode skill for:\n\nComplete mode descriptions with examples\nMeme vocabulary and format templates\nAnti-formula principles\nCredibility boundaries"
      },
      {
        "title": "Step 4d: Humanize (MANDATORY)",
        "body": "Note: Humanizer runs AFTER stylize to remove any AI patterns that slipped through during transformation.\n\nInvoke the humanizer skill:\n\n/humanizer"
      },
      {
        "title": "Patterns to Remove",
        "body": "PatternFix\"Full stop.\"\"Simple as.\" or just deleteEm-dashes (—)Periods, commas\"It's not X. It's Y.\"\"The play is Y, not X.\"Perfect parallelismVary structureRule of threeBreak the patternOver-confidenceAdd skepticism phrase"
      },
      {
        "title": "AI Words to Remove",
        "body": "Additionally, crucial, delve, emphasize, testament, enhance, foster, landscape, showcase, tapestry, underscore, vibrant, pivotal, key (adj), interplay"
      },
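      {
        "title": "Example: Scan Draft for AI Words (Sketch)",
        "body": "A quick sketch for flagging the words above before the humanizer pass (word list trimmed and filename illustrative):\n\ngrep -Ein 'additionally|crucial|delve|emphasize|testament|enhance|foster|landscape|showcase|tapestry|underscore|vibrant|pivotal|interplay' draft.md"
      },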
      {
        "title": "Soul to Add",
        "body": "Skepticism: \"I might be wrong\" / \"Not sure about this\"\nReactions: \"That number is wild\" / \"Interesting\"\nFirst person: \"I keep thinking about...\"\nMixed feelings: \"Impressive but also kind of unsettling\"\nQuestions: Ask the audience"
      },
      {
        "title": "File Organization Convention",
        "body": "CRITICAL: Use assets folder structure for all drafts.\n\ncontent-pipeline/draft/\n└── YYYY-MM-DD-topic-assets/\n    ├── README.md                           # Inventory, traceability, verification log\n    ├── YYYY-MM-DD-topic.md                 # Original draft\n    ├── YYYY-MM-DD-topic-citrini7.md        # Tone rewrites (if applicable)\n    ├── YYYY-MM-DD-topic-audit-manifest.md  # Audit claims list\n    ├── YYYY-MM-DD-topic-audit-report.md    # Audit verification results\n    ├── YYYY-MM-DD-topic-audit-final.md     # Final audit with corrections\n    ├── chart1_*.png                        # Generated charts\n    ├── chart2_*.png\n    └── source_*.png                        # Source images for traceability\n\nExample: 2026-02-05-guc-valuation-debate-assets/"
      },
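      {
        "title": "Example: Scaffold Assets Folder (Sketch)",
        "body": "A minimal sketch for creating the structure above (topic name is illustrative):\n\nTOPIC=\"$(date +%F)-humanoid-robots\"\nmkdir -p \"content-pipeline/draft/${TOPIC}-assets\"\ntouch \"content-pipeline/draft/${TOPIC}-assets/README.md\"\ntouch \"content-pipeline/draft/${TOPIC}-assets/${TOPIC}.md\""
      },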
      {
        "title": "Draft Content Format",
        "body": "Save main content as: YYYY-MM-DD-topic-assets/YYYY-MM-DD-topic.md\n\n# [Topic] [Type] Draft\n\n**Date:** YYYY-MM-DD\n**Source:** [Report name, date]\n**Type:** Thread | Quick Take | Reaction\n**Status:** PENDING APPROVAL\n**Process:** RLM extraction → KSVC check → Humanizer pass\n\n---\n\n## Content\n\n[Content here]\n\n---\n\n## Source Citations\n- [List sources]\n\n## Notes\n- [KSVC holdings: $TICKER at $PRICE entry]\n- [Technical details verified via RLM]\n- [Any caveats or uncertainties]"
      },
      {
        "title": "README.md Template",
        "body": "Create README.md in the assets folder to document the work:\n\n# [Topic] Assets\n\n**Date:** YYYY-MM-DD\n**Type:** [Thread/Quick Take/etc]\n**Topic:** [Brief description]\n\n---\n\n## Content Files\n\n| File | Description | Status |\n|------|-------------|--------|\n| YYYY-MM-DD-topic.md | Original draft | ✅ APPROVED |\n| YYYY-MM-DD-topic-citrini7.md | Citrini7 rewrite | ✅ APPROVED |\n\n---\n\n## Charts (Original Work - OK to Publish)\n\n| File | Description | Data Source |\n|------|-------------|-------------|\n| chart1_*.png | [Description] | [Source PDF + page] |\n| chart2_*.png | [Description] | [Source PDF + page] |\n\n**Theme:** ocean_depths\n\n---\n\n## Data Verification Log\n\n### [Claim Category 1]\n\\```\n[Claim]: [Value]\n- Source: [PDF name, page]\n- RLM verified: [grep results or calculation]\n\\```\n\n### [Claim Category 2]\n\\```\n[Claim]: [Value]\n- Source: [PDF name, page]\n- Verified: [evidence]\n\\```\n\n---\n\n## Audit Reports\n\n| File | Purpose |\n|------|---------|\n| YYYY-MM-DD-topic-audit-manifest.md | Claims to verify |\n| YYYY-MM-DD-topic-audit-report.md | Initial audit results |\n| YYYY-MM-DD-topic-audit-final.md | Final audit with corrections |\n\n**Audit result:** X/Y claims verified\n\n---\n\n## KSVC Holdings\n\n\\```bash\n# Verification command\ncurl -s \"https://kicksvc.online/api/[model]\" | jq '...'\n\nResult: [Holdings status]\n\\```\n\n---\n\n## Source Documents\n\n| Source | Path | Used For |\n|--------|------|----------|\n| [Report name] | /Users/Shared/ksvc/pdfs/YYYYMMDD/file.pdf | [What data] |\n\n---\n\n## Corrections Made\n\n1. [Correction 1]\n2. [Correction 2]\n\n---\n\n## Lessons Learned\n\n1. [Lesson 1]\n2. [Lesson 2]"
      },
      {
        "title": "Step 6: Chart Decision & Generation",
        "body": "Timing: After draft is complete. The draft crystallizes the thesis - then you see which claims benefit from visualization."
      },
      {
        "title": "When to Make Charts",
        "body": "Content TypeChart Likely?WhyLong ThreadYesMultiple data points, trendsQuick TakeMaybeOne key number might not need visualBreaking NewsRarelySpeed > polishVictory LapMaybeEntry vs Now comparison"
      },
      {
        "title": "Chart-Tweet Pairing",
        "body": "Principle: Put the most eye-catching visual early (Tweet 1-3) to hook engagement.\n\nChart TypeBest Tweet PositionWhyMarket size / growth barTweet 2 (TLDR)Pairs with market numbers, shows scaleComponent breakdown pieTweet 3-4Pairs with component discussionCompany comparison tableTweet 5-6Pairs with company analysisTimeline / roadmapTweet 7-8Pairs with forward-looking content\n\nPairing logic:\n\nMatch chart to the tweet that contains the same data\nHook tweet (Tweet 1) can go either way:\n\nText-only: Clean, curiosity-driven, lets words land first\nWith chart: Visual stop, data-forward, shows you have receipts\n\n\nVisuals work best on data-heavy tweets, not opinion tweets\nFinal tweet (watchlist/conclusion) usually doesn't need a chart\n\nExample pairing (humanoid robotics thread):\n\nTweet 1: Hook (optional: market_size_bar.png for visual hook)\nTweet 2: TLDR + market_size_bar.png ← $5.3B→$32.4B numbers\nTweet 3: Component counts (optional: component_pie.png)\nTweet 5: Taiwan names + taiwan_companies_table.png ← KHGEARS/HIWIN/AIRTAC"
      },
      {
        "title": "Decision Process",
        "body": "Review draft - identify \"chartable moments\"\n\nTime series data (market growth, price trends)\nComponent breakdowns (pie charts)\nCompany comparisons (tables)\n\n\n\nCheck RLM cache - do we have the data?\n\nText numbers → bar/line charts\nTables → comparison tables\nSource charts → reference or recreate\n\n\n\nDECLARE SOURCE (MANDATORY) - before any chart generation\n\"I am charting [METRIC] from [SOURCE] page [X]\"\n\"Source contains these exact values: [list them]\"\n\n\n\nGenerate with chart-factory\n/chart-factory"
      },
      {
        "title": "Chart Generation Workflow",
        "body": "Draft complete → identify chartable claims\n                        ↓\n              Pull data from RLM cache (NOT from draft text)\n                        ↓\n              ⚠️ DECLARE SOURCE (state metric + page + exact values)\n                        ↓\n              Save source image FIRST (before generating)\n                        ↓\n              Generate with chart-factory (use theme-factory)\n                        ↓\n              Verify with verification agent\n                        ↓\n              Save to assets folder"
      },
      {
        "title": "Source Declaration (LEARNED FROM MISTAKE)",
        "body": "⚠️ Why this exists: We once created a \"component count\" chart but saved a \"cost %\" source image. The metrics didn't match, making the source invalid for verification.\n\nBefore generating ANY chart, you MUST:\n\nStepActionExample1. State\"I am charting [METRIC] from [SOURCE]\"\"I am charting hardware cost % from 永豐 p.20\"2. ShowScreenshot the exact source table/chartSave as source_hardware_cost_p20.png3. Confirm\"Source contains: [exact values]\"\"19%, 16%, 13%, 52%\"4. FlagIf transforming data, justify it\"I am NOT transforming - using values as-is\"\n\nRed flags - STOP if you notice:\n\nSource shows % but you're charting counts (metric mismatch)\nSource has 15 items but chart has 5 (cherry-picking)\nSource image doesn't contain your chart's numbers (wrong source)\nCompany name romanized/guessed from Chinese (fabricated data)\nTicker suffix assumed without checking (TT vs TW)"
      },
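      {
        "title": "Declaration Check (Illustrative)",
        "body": "A minimal sketch of a metric-mismatch guard for the declaration above. The function name and argument shapes are ours for illustration, not part of chart-factory:\n\ndef source_matches_chart(declared_values, chart_data, tol=0.01):\n    \"\"\"Every value about to be charted must appear among the declared source values.\"\"\"\n    missing = [v for v in chart_data.values()\n               if not any(abs(v - d) <= tol for d in declared_values)]\n    return len(missing) == 0, missing\n\n# Hardware-cost example from the declaration table\nok, missing = source_matches_chart([19, 16, 13, 52], {'A': 19, 'B': 16, 'C': 13, 'D': 52})\n# Any missing value is a metric mismatch - STOP and re-check the source"
      },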
      {
        "title": "Company & Ticker Verification",
        "body": "⚠️ LEARNED FROM MISTAKE: We fabricated \"Chuing\" for 祺驊 (4571). Official name is \"KHGEARS\".\n\n# Always verify Taiwan company names via TWSE API\ncurl -s \"https://www.twse.com.tw/en/api/codeQuery?query=4571\"\n# Returns: {\"query\":\"4571\",\"suggestions\":[\"4571\\tKHGEARS\"]}\n\nNever romanize Chinese names (祺驊 ≠ \"Chuing\")\nUse TW suffix for general audience (TT = Bloomberg only)"
      },
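      {
        "title": "TWSE Response Parsing (Illustrative)",
        "body": "A minimal sketch that parses the codeQuery response from the curl example above instead of eyeballing it. The endpoint and response shape come from this guide; the helper name is ours:\n\nimport json\n\ndef official_name(response_text, query):\n    \"\"\"Extract the official English name from a TWSE codeQuery response.\"\"\"\n    payload = json.loads(response_text)\n    for suggestion in payload.get('suggestions', []):\n        code, _, name = suggestion.partition('\\t')\n        if code == query:\n            return name\n    return None\n\nresp = '{\"query\":\"4571\",\"suggestions\":[\"4571\\\\tKHGEARS\"]}'\n# official_name(resp, '4571') -> 'KHGEARS' (never a romanized guess)"
      },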
      {
        "title": "Using chart-factory",
        "body": "from chart_factory import create_bar_chart, create_pie_chart, create_table_chart\n\n# Market size bar chart\ncreate_bar_chart(\n    data={'2025': 5.3, '2026': 8.3, '2027': 13.0},\n    title=\"Global Humanoid Robot Market\",\n    theme=\"ocean_depths\",\n    annotations={\"type\": \"cagr\", \"value\": \"57%\"}\n)\n\n# Component pie chart\ncreate_pie_chart(\n    data={'Reducers': 62, 'Motors': 30, 'Screws': 48},\n    title=\"Component Breakdown\",\n    theme=\"ocean_depths\",\n    explode_largest=True\n)\n\n# Company comparison table\ncreate_table_chart(\n    columns=['Company', 'Ticker', 'P/E', 'Rating'],\n    data=[['Chuing', '4571', '24x', 'BUY'], ...],\n    title=\"Taiwan Supply Chain\",\n    theme=\"ocean_depths\"\n)"
      },
      {
        "title": "Verification (MANDATORY)",
        "body": "After generating, spawn Explore agent with thoroughness: quick for focused verification:\n\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n**THOROUGHNESS: quick**\n\n**CONTEXT ISOLATION: You have NO external conversation history. Work ONLY from this prompt.**\n\nCHART VERIFICATION TASK\n\nChart: /path/to/chart.png\nType: bar\n\nSource Data (expected):\n{\"2025\": 5.3, \"2026\": 8.3, \"2027\": 13.0}\n\nSource Context:\n永豐 p.3 - \"2025年全球人型機器人規模約53億美元\"\n\nTask:\n1. Read the chart image\n2. Extract numbers from visual\n3. Compare to expected data\n4. Check for unit consistency (B vs M, % formatting)\n\nReturn ONLY JSON:\n{\n  \"verified\": true/false,\n  \"numbers_in_chart\": [...],\n  \"numbers_in_source\": [...],\n  \"discrepancies\": [...],\n  \"notes\": \"...\"\n}\n\"\"\")\n\nVerification checks data → chart integrity. Source accuracy is RLM's responsibility (Step 4a).\n\nThoroughness = quick: Single-pass verification, focused on specific data points. Fast visual-to-data check."
      },
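      {
        "title": "Verdict Handling (Illustrative)",
        "body": "A minimal sketch of acting on the JSON the verification agent returns. Field names match the schema above; the gate function is ours for illustration:\n\ndef gate_chart(verdict):\n    \"\"\"Ship only charts the agent verified; otherwise surface what to fix.\"\"\"\n    if verdict.get('verified') and not verdict.get('discrepancies'):\n        return 'SHIP'\n    return 'FIX: ' + '; '.join(verdict.get('discrepancies') or ['unverified'])\n\nverdict = {'verified': True, 'numbers_in_chart': [5.3, 8.3, 13.0],\n           'numbers_in_source': [5.3, 8.3, 13.0], 'discrepancies': [], 'notes': 'ok'}\n# gate_chart(verdict) -> 'SHIP'"
      },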
      {
        "title": "Save Charts",
        "body": "Save to: draft/YYYY-MM-DD-topic-assets/\n\nInclude:\n\nGenerated charts (chart1_.png, chart2_.png)\nSource images from PDF (for traceability)\ngenerate_charts.py script (reproducibility)"
      },
      {
        "title": "Step 7: Publish to Final Folder",
        "body": "After approval, publish clean version to /Users/Shared/ksvc/threads/."
      },
      {
        "title": "File Organization Convention",
        "body": "CRITICAL: Flat folder structure, one folder per post.\n\n/Users/Shared/ksvc/threads/\n├── 2026-02-03-humanoid-robotics/\n│   ├── thread.md                         # Clean content (ready to post)\n│   ├── _metadata.md                      # Internal reference (not for posting)\n│   ├── chart1_market_size.png\n│   ├── chart2_component_breakdown.png\n│   └── chart3_taiwan_companies.png\n└── 2026-02-05-guc-valuation-debate/\n    ├── thread.md                         # Clean content (ready to post)\n    ├── _metadata.md                      # Internal reference (not for posting)\n    ├── guc-eps-comparison.png\n    └── guc-pt-comparison.png\n\nRules:\n\n✅ Flat structure: YYYY-MM-DD-topic/ at root level (not nested in 2026-02/)\n✅ Charts directly in folder (not in charts/ subfolder)\n✅ thread.md = clean content only (no metadata header)\n✅ _metadata.md = internal reference (sources, audit, not for posting)"
      },
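      {
        "title": "Folder Scaffold (Illustrative)",
        "body": "A minimal shell sketch of scaffolding one post folder per the rules above (flat structure, charts at root). The topic slug is a placeholder:\n\nPOST=/Users/Shared/ksvc/threads/2026-02-03-humanoid-robotics\nmkdir -p \"$POST\"\ntouch \"$POST/thread.md\" \"$POST/_metadata.md\"\n# Charts go directly in $POST - no charts/ subfolder, no 2026-02/ nesting"
      },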
      {
        "title": "thread.md Format",
        "body": "Clean version with just the tweets - no metadata header:\n\n# [Topic Title]\n\n1/ [First tweet]\n\n2/ [Second tweet]\n- bullet point\n- bullet point\n\n3/ [Third tweet]\n\n..."
      },
      {
        "title": "_metadata.md Format",
        "body": "Internal reference file (prefixed with _ to indicate not for posting):\n\n# Metadata (not for posting)\n\n**Date:** YYYY-MM-DD\n**Type:** Long Thread (10 tweets) | Quick Take | etc.\n**Status:** READY TO POST\n\n## Sources\n- [Source 1 PDF name] ([Date])\n- [Source 2 PDF name] ([Date])\n\n## KSVC Holdings Check\n- ✅ Held in [Model name] (+X.X% since entry) OR\n- ❌ Not held in any of 7 models (checked USA 1-5, TWSE 1-2)\n- Integration strategy: [Personal stakes | Industry framing | Victory lap]\n\n## Audit Log\n- [Key claim verified via RLM grep]\n- [Correction made: old → new]\n- [Methodology improvement discovered]\n\n## Charts\n- chart1_*.png - [Description] ([Data source])\n- chart2_*.png - [Description] ([Data source])\n\n## Notes\n- [Special handling notes]\n- [Lessons learned]\n\nExample: See /Users/Shared/ksvc/threads/2026-02-05-guc-valuation-debate/_metadata.md"
      },
      {
        "title": "Publish Workflow",
        "body": "# 1. Create publish folder (flat structure)\nmkdir -p /Users/Shared/ksvc/threads/YYYY-MM-DD-topic\n\n# 2. Copy clean content as thread.md\ncp draft/YYYY-MM-DD-topic-assets/YYYY-MM-DD-topic-citrini7.md \\\n   /Users/Shared/ksvc/threads/YYYY-MM-DD-topic/thread.md\n\n# 3. Copy charts directly into folder (not subfolder)\ncp draft/YYYY-MM-DD-topic-assets/chart*.png \\\n   /Users/Shared/ksvc/threads/YYYY-MM-DD-topic/\n\n# 4. Create _metadata.md from draft notes\n# (Document sources, audit log, holdings, charts)\n\nResult:\n\n/Users/Shared/ksvc/threads/YYYY-MM-DD-topic/\n├── thread.md               # Ready to post\n├── _metadata.md            # Internal reference\n├── chart1_*.png\n└── chart2_*.png"
      },
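      {
        "title": "Publish Sanity Check (Illustrative)",
        "body": "A minimal shell sketch for the final Step 7 check (verify all files present before announcing ready to post). Paths are placeholders:\n\nPOST=/Users/Shared/ksvc/threads/YYYY-MM-DD-topic\nfor f in thread.md _metadata.md; do\n  [ -f \"$POST/$f\" ] || { echo \"MISSING: $f\"; exit 1; }\ndone\nls \"$POST\"/chart*.png >/dev/null 2>&1 || echo \"WARNING: no charts found\"\necho \"Ready to post\""
      },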
      {
        "title": "When to Publish",
        "body": "StatusActionDraft approvedPublish to /Users/Shared/ksvc/threads/Needs revisionStay in content-pipeline/draft/Posted to XMove to /Users/Shared/ksvc/threads/archive/ (optional)"
      },
      {
        "title": "Quality Checklist",
        "body": "Extraction (Step 1a/1b):\n\n⚠️ Checked published threads (/Users/Shared/ksvc/threads/) before topic selection\n Topic does NOT duplicate a recently published thread (same source + same angle = reject)\n Scanned recent PDF folders (at least 3) with Explore agents\n Identified cross-document connections\n Deep extracted key reports with RLM\n Charts/images extracted and reviewed (use --extract-images)\n ⚠️ Extraction validation: Every PDF's chars_extracted checked against expected size\n ⚠️ Read tool fallback used for any PDF with < 1000 chars (or suspiciously low for page count)\n Key numbers verified via RLM grep (not just Explore summary)\n\nCross-Doc Synthesis (Step 1c):\n\nUsed rlm-repl-multi to compare across sources (if multiple PDFs)\n Asked synthesis questions (consensus, comparison, disagreement)\n Documented synthesized insights in cache\n Flagged cross-doc claims for audit in Step 4b\n Identified unique insights that single-source extraction would miss\n\nContent:\n\nAll published numbers have RLM grep confirmation\n Technical specifics included (fabs, yields, WPM)\n Time frames clear (Q1 2026, 2027e)\n Sources cited (multiple reports for cross-doc)\n Cross-doc reasoning: claims triangulated across multiple reports\n Unique insight that connects dots others miss\n\nKSVC:\n\nequitySeries checked (all 5 US + 2 TWSE models)\n filledOrders fallback checked (all 7 models) if equitySeries shows 0%\n Entry prices noted for victory lap potential\n Integration strategy clear (held vs industry framing)\n\nVoice:\n\nAppropriate type (thread vs quick take vs reaction)\n Skepticism included where uncertain\n Energy for high-conviction points\n Not over-polished\n\nHumanizer (Step 4d):\n\nNo AI patterns (em-dashes, \"Full stop\", etc.)\n Has personality/voice\n Shows thinking process, not just conclusions\n\nAudit (Step 4a):\n\nAll factual claims extracted from draft\n Each claim verified via RLM grep against source\n Taiwan company names verified via 
TWSE API\n No FAIL status claims remain\n UNSOURCED claims either removed, caveated, or sourced\n Audit report generated and attached to draft\n\nCharts (Step 6):\n\nIdentified chartable claims in draft\n Data pulled from RLM cache (not draft text)\n ⚠️ SOURCE DECLARED before generating (metric + page + exact values)\n ⚠️ Source image saved FIRST (before chart generation)\n ⚠️ Source image contains same metric as chart (not transformed)\n If data transformed, transformation documented and justified\n Used chart-factory with theme-factory theme\n Verification agent confirmed data→chart integrity\n generate_charts.py script included (reproducibility)\n\nPublish (Step 7):\n\nDraft approved for posting\n Created folder in /Users/Shared/ksvc/threads/YYYY-MM-DD-topic/\n thread.md contains clean tweets only (no metadata header)\n _metadata.md contains sources, audit log, chart descriptions\n Charts copied to final folder\n Verified all files present before announcing ready to post"
      },
      {
        "title": "PDF Location",
        "body": "Research PDFs: /Users/Shared/ksvc/pdfs/\n\nls -la /Users/Shared/ksvc/pdfs/ | tail -5"
      },
      {
        "title": "References",
        "body": "references/kirk-voice.md - PRIMARY - Unified voice guide with all content types, formulas, and templates\nreferences/serenity-style.md - Deep dive: data-heavy thread patterns\nreferences/citrini7-style.md - Deep dive: punchy quick take patterns\nFull creator studies: ksvc-intern/content-pipeline/creator-studies/"
      }
    ],
    "body": "Kirk Content Pipeline\n\nCreate Twitter content from analyst research PDFs, validated against KSVC holdings.\n\nPipeline Steps (MANDATORY)\n1a.   Scan PDFs (Explore agents for broad screening)\n1b.   Extract insights (RLM for deep extraction - text, tables, AND charts)\n1c.   Cross-doc synthesis (rlm-multi for insights across sources)\n2.    Check KSVC holdings (preliminary - with known tickers)\n3.    Write content (data backbone, Serenity-heavy)\n4a.   AUDIT (verify draft claims against source PDFs with RLM)\n4a.5. GEMINI CROSS-VALIDATION (web-verify FAIL/UNSOURCED inferences)\n4b.   Final Holdings Verification (check ALL 7 models with discovered tickers)\n4c.   Stylize (invoke kirk-mode skill for voice/character)\n4d.   Humanize (remove AI patterns)\n5.    Save draft for approval\n6.    Chart decision & generation (after draft crystallizes thesis)\n7.    PUBLISH to final folder (clean version for posting)\n\n\nNever skip steps 4a-4d. Use 1a for multi-PDF screening, 1b for deep extraction, 1c for cross-doc synthesis, 4a for verification, 4a.5 for web cross-validation, 4b for final holdings check, 4c for character voice, 4d for AI pattern removal.\n\n⚠️ CRITICAL: Step 1b extracts data. Step 1c synthesizes across docs. Step 4a VERIFIES the written content. Step 4a.5 CROSS-VALIDATES inferences.\n\n1b: \"What does each PDF say?\" (per-doc extraction)\n1c: \"What patterns emerge across PDFs?\" (cross-doc synthesis)\n4a: \"Does my draft accurately reflect the sources?\" (source-locked verification)\n4a.5: \"Are the flagged inferences valid per public sources?\" (web cross-validation)\n4c: \"Which Kirk mode fits this situation?\" (character voice)\nSubagent Permissions (CRITICAL)\n\nSubagents CANNOT Read files outside the project directory. PDFs in /Users/Shared/ksvc/pdfs/ are blocked. 
The fix: symlink PDFs into the project directory before spawning subagents.\n\nThe main agent MUST create a symlink before Step 1a:\n\nln -sf \"/Users/Shared/ksvc/pdfs/YYYYMMDD\" \".claude/pdfs-scan\"\n\n\nThen subagents Read from .claude/pdfs-scan/filename.pdf — this works because the path resolves inside the project.\n\nAccess Method\t/Users/Shared/ path\tSymlinked project path\nSubagent Read tool (PDF)\t❌ Auto-denied\t✅ Works\nSubagent Read tool (images)\t❌ Auto-denied\t✅ Works\nMain agent Read tool\t✅ User approves\t✅ Works\nBash → RLM\t✅ Any path\t✅ Any path\n\nDiscovered 2026-02-07: Subagents fail with \"Permission to use Read has been auto-denied (prompts unavailable)\" on /Users/Shared/ paths. Symlink into project dir = full Read access. Tested: 19 PDFs, medium thoroughness, 125k tokens, zero errors.\n\nContent Types & Voice Blends\n\nFull guide: references/kirk-voice.md — Read this for templates and examples.\n\nKirk voice = Serenity's data + Citrini7's wit + Jukan's skepticism + Zephyr's energy.\n\nType\tWhen\tBlend\tKey Element\nLong Thread\tDeep dive, multi-source\tSerenity + Jukan\tTLDR + skepticism\nQuick Take\tSingle insight, one report\tCitrini7 + Serenity\tPunchy + one number\nBreaking News\tJust dropped\tZephyr + Jukan\tReaction word + number\nShitpost\tMarket absurdity\tCitrini7 + Zephyr\tMeme format\nPersonal Commentary\tOpinion, question\tPure Jukan\tFirst-person + uncertainty\nVictory Lap\tKSVC call worked\tPure Zephyr\tEntry/Now + thesis\nQuick Formulas\n\nLong Thread: Hook → TLDR → Numbers → Skepticism → Position\n\nQuick Take: Headline number → Context → \"If you're looking now...\"\n\nBreaking News: \"Huge.\" / \"Well well well...\" → Key number → Source\n\nVictory Lap: \"$TICKER up X% since KSVC added it\" → Entry/Now → Thesis validated\n\nStep 1a: Scan PDFs with Explore Agents\n\nUse Explore agents for broad screening when you have many PDFs to review. 
This is faster than RLM for initial discovery.\n\nStep 1a.0: Check Published Threads (MANDATORY - DO FIRST)\n\n⚠️ Before scanning any PDFs, check what Kirk has already posted.\n\n# List all published threads\nls /Users/Shared/ksvc/threads/\n\n# Read recent thread.md files to understand what topics are covered\n\n\nFor each published thread, note:\n\nTopic (what was the thesis?)\nSource PDFs used (check _metadata.md)\nDate (how recent?)\n\nThen when selecting a topic after scanning, REJECT any topic that:\n\nUses the same primary source PDF as a published thread\nCovers the same thesis/angle (even if from different sources)\nWould read as a repeat to Kirk's followers\n\nAcceptable overlap:\n\nA follow-up/update to a previous thread with NEW data (e.g., earnings confirm the thesis)\nA different angle on the same sector (e.g., posted about ABF shortage, now posting about specific company earnings)\nExplicitly framed as \"update: here's what changed since my last post on X\"\n\nWhy this exists (Case Study — ABF Substrate, 2026-02-07):\n\nKirk published a 10-tweet thread on Feb 5 covering Goldman's ABF shortage report (10%→21%→42%, Kinsus/NYPCB/Unimicron). On Feb 7, the pipeline picked the same Goldman report and produced a 3-tweet quick take with the same numbers, same companies, same angle. We didn't check published threads first, so we wasted a pipeline run on duplicate content when 10 other fresh topic angles were available.\n\nWhen to Use\nScreening 10+ PDFs to find relevant ones\nFinding cross-document connections\nBuilding a thesis from multiple sources\nDon't know which PDFs matter yet\nHow to Scan\n1. Check published threads (Step 1a.0 above)\n\n2. List recent PDF folders and count PDFs\n   ls /Users/Shared/ksvc/pdfs/ | tail -5\n   ls /Users/Shared/ksvc/pdfs/YYYYMMDD/ | wc -l\n\n3. Symlink PDFs into project directory (REQUIRED for subagent access)\n   ln -sf \"/Users/Shared/ksvc/pdfs/YYYYMMDD\" \".claude/pdfs-scan\"\n\n4. 
Split PDFs into groups and spawn parallel Explore agents\n   TARGET: ~5 PDFs per agent. Spawn ALL agents in a single message.\n   - Each agent gets a specific list of filenames to scan\n   - All agents run simultaneously → total time = slowest agent\n   - Haiku is cheap — more agents = faster with no meaningful cost increase\n\nAgent Sizing\nPDFs\tAgents\tPDFs/Agent\tExpected Time\n≤5\t1\tall\t~25s\n6-10\t2\t~5 each\t~25s\n11-15\t3\t~5 each\t~25s\n16-20\t4\t~5 each\t~25s\n21-30\t5-6\t~5 each\t~30s\n\nWhy ~5 PDFs per agent? Sweet spot for speed. Each PDF takes ~4-8s to Read + summarize. 5 PDFs ≈ 25s per agent. Adding more PDFs per agent saves nothing (same total tokens) but makes wall-clock time worse.\n\nCost: Haiku is cheap. 4 agents × 5 PDFs × ~4k tokens = ~80k input tokens total — same as 1 agent doing all 20. Parallelism is free.\n\nCross-doc synthesis trade-off: Each agent only sees its batch, so cross-batch themes are the main agent's job. This is fine — the main agent merges all results anyway.\n\nExample: Spawn Explore Agents\n\nStep 1: Main agent creates symlink and lists PDFs:\n\nln -sf \"/Users/Shared/ksvc/pdfs/20260205\" \".claude/pdfs-scan\"\n/bin/ls \".claude/pdfs-scan/\"\n\n\nStep 2: Split filenames into groups and spawn agents in parallel (single message, multiple Task calls):\n\n# Agent 1 — first batch\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n**THOROUGHNESS: medium**\n\nScan these specific PDFs for content angles:\n- file1.pdf\n- file2.pdf\n- file3.pdf\n- file4.pdf\n- file5.pdf\n- file6.pdf\n- file7.pdf\n\nFor each PDF, Read enough pages to understand the full thesis (use judgment — some need 1-2 pages, others 1-5):\n\nRead(file_path=\"/Users/dydo/Documents/agent/ksvc-intern/.claude/pdfs-scan/FILENAME.pdf\", pages=\"1-5\")\n\nFor each PDF extract:\n- Company/sector, ticker, rating, price target\n- Key thesis and supporting numbers\n- Supply chain connections\n- Potential content angles\n\nAfter scanning your batch, provide:\n1. 
Per-PDF summary (2-3 sentences each)\n2. Cross-document themes within your batch\n3. Which PDFs are most relevant for deep extraction\n\"\"\")\n\n# Agent 2 — second batch (SPAWN IN SAME MESSAGE as Agent 1)\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n... same prompt with file8.pdf through file14.pdf ...\n\"\"\")\n\n# Agent 3 — third batch (SPAWN IN SAME MESSAGE)\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n... same prompt with file15.pdf through file20.pdf ...\n\"\"\")\n\n\nStep 3: Main agent synthesizes results from all agents: After all agents return, the main agent:\n\nMerges per-PDF summaries\nIdentifies cross-agent themes (patterns Agent 1 found + patterns Agent 2 found)\nPicks top 3 content angles across all PDFs\nSelects 2-5 PDFs for Step 1b deep extraction\nOutput: Identify Which PDFs Matter\n\nAfter scanning, you'll know:\n\nWhich reports have the best data\nCross-document connections (e.g., \"3 reports confirm memory shortage\")\nThesis recommendations (2-3 angles to explore)\nWhich to deep-extract with RLM\n\n⚠️ WARNING: Explore agents can hallucinate specific numbers. Treat all numbers from Explore summaries as \"unverified claims\" until RLM grep confirms them. Component counts, percentages, and market sizing are especially prone to errors.\n\nCapacity (tested 2026-02-07): Single Explore agent (haiku) handled 19 PDFs at medium thoroughness in 83 seconds, using 125k tokens (~4k tokens/PDF for pages 1-5). 3 agents in parallel = ~30-40s for the same batch.\n\nStep 1b: Deep Extract with RLM\n\nUse RLM for deep extraction from specific PDFs you've identified in Step 1a.\n\nMANDATORY for any number you'll publish. 
Explore agents summarize; RLM verifies.\n\nWhen to Use\nYou know which 2-5 PDFs matter most\nNeed specific numbers, charts, tables\nBuilding cross-document verification tables\nExtracting technical details (fabs, yields, WPM)\nSingle PDF\ncd ~/.claude/skills/rlm-repl/scripts\npython3 rlm_repl.py init \"/Users/Shared/ksvc/pdfs/YYYYMMDD/file.pdf\" --extract-images\npython3 rlm_repl.py exec -c \"print(grep('revenue|growth|target|price', max_matches=20, window=200))\"\n\nMultiple PDFs (synthesis)\ncd ~/.claude/skills/rlm-repl-multi/scripts\npython3 rlm_repl.py init \"/path/to/report1.pdf\" --name report1 --extract-images\npython3 rlm_repl.py init \"/path/to/report2.pdf\" --name report2 --extract-images\npython3 rlm_repl.py exec -c \"results = grep_all('keyword', max_matches_per_context=20)\"\n\nView Extracted Charts/Images\n# List images from a context\npython3 rlm_repl.py exec --name report1 -c \"print(list_images())\"\n\n# Get image path, then use Read tool to view\npython3 rlm_repl.py exec --name report1 -c \"print(get_image(0))\"\n\n\nCharts often contain key data (P/B trends, margin history, capacity timelines) that text extraction misses.\n\nExtraction Validation (MANDATORY)\n\n⚠️ After EVERY rlm_repl.py init, validate the extraction actually worked.\n\nRLM reports chars_extracted after init. A multi-page analyst report should yield thousands of chars. If you get suspiciously few, the PDF is likely image-based and RLM only extracted metadata/headers.\n\nValidation rule:\n\nChars Extracted\tExpected Report Type\tAction\n> 5,000\tMulti-page report\t✅ Proceed with grep\n1,000 - 5,000\tShort note / partial\t⚠️ Check list_images() — if many images, trigger fallback\n< 1,000\tImage-based PDF\t❌ MUST use Read tool fallback\n\nThe threshold is context-dependent. A 20-page Goldman Sachs report yielding 666 chars is obviously broken. A 1-page pricing table yielding 800 chars might be fine. 
Use judgment, but when in doubt, fallback.\n\nMandatory Fallback when RLM extraction is low:\n\n# Step 1: RLM init (always try first)\npython3 rlm_repl.py init \"/path/to/report.pdf\" --extract-images\n# Output: \"Extracted 666 chars from 15 pages, saved 9 images\"\n\n# Step 2: Check - is 666 chars enough for a 15-page report? NO.\n# → Trigger fallback\n\n# Step 3: Check extracted images first (they may contain the data)\npython3 rlm_repl.py exec -c \"print(list_images())\"\n# View extracted images with Read tool\n# Read(file_path=\"/path/to/extracted/image-0.png\")\n\n# Step 4: Read the PDF directly (use symlinked path for subagents)\n# Read(file_path=\".claude/pdfs-scan/report.pdf\", pages=\"1-10\")\n# Read(file_path=\".claude/pdfs-scan/report.pdf\", pages=\"11-20\")\n\n\n⚠️ Path rule: Subagents must Read PDFs via the symlinked project path (.claude/pdfs-scan/), NOT from /Users/Shared/. See \"Subagent Permissions\" section above.\n\nWhy this exists (Case Study — ABF Substrate Shortage, 2026-02-07):\n\nGoldman Sachs published two reports: a main ABF upcycle report (71K chars, extracted fine) and a Kinsus upgrade report (15 pages, but only 666 chars extracted). We skipped the Kinsus PDF because \"the main report had everything we needed.\" It didn't. The Kinsus report had unique data (company-specific capacity plans, margin guidance, order book details) that would have strengthened the thread. Skipping it was lazy — the Read tool fallback takes 30 seconds and would have recovered the data.\n\nRules:\n\nNever skip a relevant PDF just because RLM extraction was low. Use the fallback.\nCheck extracted images. RLM with --extract-images often saves chart/table images even when text extraction fails. View them with Read tool.\nLog the fallback. In the extraction cache, note \"extraction_method\": \"read_fallback\" so audit knows the data source.\nIf fallback also fails (corrupted PDF, DRM), document it and move on. 
But you must TRY.\nRLM Cache: Include Visual Data\n\nWhen extracting, capture all data types for potential chart generation later:\n\nSource Type\tWhat to Extract\tCache Format\nText numbers\tExact quotes with page refs\t{\"value\": 5.3, \"unit\": \"B\", \"source\": \"p.3\", \"quote\": \"規模約53億美元\"}\nTables\tFull table as structured JSON\t{\"columns\": [...], \"rows\": [...], \"source\": \"p.20\"}\nCharts\tData points + source image path\t{\"data\": {...}, \"source_image\": \"pdf-3-1.png\", \"page\": 3}\n\nWhy cache visual data? Step 6 (chart generation) needs this. If you only cache text, you'll lose table structures and chart data points that make great visualizations.\n\nCross-Document Reasoning\n\nBuild thesis by triangulating claims across multiple reports:\n\n# Find where multiple reports discuss the same topic\npython3 rlm_repl.py exec -c \"results = grep_all('DRAM.*price|ASP', max_matches_per_context=5)\"\n\n# Compare forecasts across sources\npython3 rlm_repl.py exec -c \"results = grep_all('2026|2027|growth|demand', max_matches_per_context=5)\"\n\n\nUse cross-doc to verify:\n\nDo multiple sources agree on price forecasts?\nAre supply constraint timelines consistent?\nAny contradictions between reports?\nStep 1b.5: Build Extraction Cache (MANDATORY)\n\n⚠️ Why this step exists: RLM creates state.pkl during extraction, but the writing phase (Step 3) doesn't access it. 
Without a persistent cache, writers rely on memory, leading to errors like wrong product types, missing time periods, or source attribution mistakes.\n\nWhat this does: Extracts from state.pkl (RLM's internal format) into structured JSON with context labels that the writing phase can reference.\n\nWhen to Run\n\nAfter Step 1b (RLM extraction) and before Step 3 (writing).\n\nWorkflow\tWhen to Cache\nSingle PDF (rlm-repl)\tAfter rlm_repl.py init completes\nMultiple PDFs (rlm-repl-multi)\tAfter all init commands complete\nHow to Build Cache\n\nNew in v2: Auto-generates source tags and attribution map from PDF filenames!\n\nSingle PDF (rlm-repl):\n\ncd ~/.claude/skills/kirk-content-pipeline/scripts\n\n# Auto-extracts from default rlm-repl state location\npython3 build_extraction_cache.py \\\n  --output /path/to/draft-assets/rlm-extraction-cache.json\n\n\nMultiple PDFs (rlm-repl-multi):\n\ncd ~/.claude/skills/kirk-content-pipeline/scripts\n\n# Use --multi flag to load from rlm-repl-multi state\npython3 build_extraction_cache.py \\\n  --multi \\\n  --output /path/to/draft-assets/rlm-extraction-cache.json\n\n\nWith Cross-Doc Synthesis (Optional):\n\n# Add manual synthesis descriptions for cross-doc insights\npython3 build_extraction_cache.py \\\n  --multi \\\n  --output /path/to/draft-assets/rlm-extraction-cache.json \\\n  --synthesis /path/to/cross-doc-synthesis.json\n\n\nSynthesis format (optional, for complex multi-source threads):\n\n{\n  \"dual_squeeze_thesis\": {\n    \"description\": \"Memory shortage (1Q26) + ABF substrate shortage (2H26) = compounding AI server bottleneck\",\n    \"components\": [\n      {\"topic\": \"Memory Pricing\", \"source\": \"gfhk_memory\", \"timeframe\": \"1Q26\"},\n      {\"topic\": \"Abf Shortage\", \"source\": \"goldman_abf\", \"timeframe\": \"2H26-2028\"}\n    ]\n  }\n}\n\n\nWhat auto-generates:\n\n✅ Source tags from PDF filenames (\"GFHK - Memory.pdf\" → tag: \"GFHK\")\n✅ Topics with primary_source, key_metrics, source_context\n✅ 
Extraction entries with full context labels (product_type, time_period, units, scope)\nCache Format\n\nThe cache includes context labels and attribution map to prevent common errors:\n\n{\n  \"cache_version\": \"1.0\",\n  \"generated_at\": \"2026-02-05T14:00:00\",\n  \"sources\": [\n    {\n      \"source_id\": \"gfhk_memory\",\n      \"pdf_path\": \"/Users/Shared/ksvc/pdfs/20260204/GFHK - Memory.pdf\",\n      \"pdf_name\": \"GFHK - Memory price impact.pdf\",\n      \"tag\": \"GFHK\",\n      \"chars_extracted\": 13199\n    },\n    {\n      \"source_id\": \"goldman_abf\",\n      \"pdf_path\": \"/Users/Shared/ksvc/pdfs/20260204/Goldman ABF shortage.pdf\",\n      \"pdf_name\": \"Goldman Sachs ABF shortage report.pdf\",\n      \"tag\": \"Goldman Sachs\",\n      \"chars_extracted\": 25000\n    }\n  ],\n  \"extractions\": [\n    {\n      \"entry_id\": \"mem_001\",\n      \"source_id\": \"gfhk_memory\",\n      \"figure\": \"Figure 2\",\n      \"page\": 3,\n      \"metric\": \"Total BOM\",\n      \"product_type\": \"HGX B300 8-GPU server\",\n      \"time_period\": \"3Q25 → 1Q26E\",\n      \"units\": \"dollars per server\",\n      \"scope\": \"single HGX B300 8-GPU server\",\n      \"values\": {\n        \"before\": \"$369k\",\n        \"after\": \"$408k\",\n        \"change\": \"+$39k\"\n      },\n      \"context\": \"Memory price impact on AI server BOM\",\n      \"source_quote\": \"Figure 2: HGX B300 8-GPU server BOM...\",\n      \"verification\": \"RLM grep + visual inspection\"\n    }\n  ],\n  \"source_attribution_map\": {\n    \"topics\": {\n      \"Memory Pricing\": {\n        \"primary_source\": \"gfhk_memory\",\n        \"tag\": \"GFHK\",\n        \"key_metrics\": [\"HBM3e ASP\", \"DDR5-6400 (128GB)\", \"NVMe SSD (3.84TB)\", \"Total BOM\"],\n        \"source_context\": \"Figures: Figure 2; Time periods: 3Q25 → 1Q26E\",\n        \"notes\": \"4 extractions from this source\"\n      },\n      \"Abf Shortage\": {\n        \"primary_source\": \"goldman_abf\",\n        
\"tag\": \"Goldman Sachs\",\n        \"key_metrics\": [\"ABF shortage ratio\", \"Kinsus PT\", \"NYPCB PT\", \"Unimicron PT\"],\n        \"source_context\": \"Time periods: 2H26, 2027, 2028\",\n        \"notes\": \"5 extractions from this source\"\n      }\n    },\n    \"cross_doc_synthesis\": {\n      \"dual_squeeze_thesis\": {\n        \"description\": \"Memory shortage (1Q26) + ABF substrate shortage (2H26) = compounding AI server bottleneck\",\n        \"components\": [\n          {\"topic\": \"Memory Pricing\", \"source\": \"gfhk_memory\", \"timeframe\": \"1Q26\"},\n          {\"topic\": \"Abf Shortage\", \"source\": \"goldman_abf\", \"timeframe\": \"2H26-2028\"}\n        ]\n      }\n    }\n  }\n}\n\n\nKey fields that prevent errors:\n\nproduct_type: Prevents \"GB300 rack\" when source says \"HGX B300 server\"\ntime_period: Prevents missing \"3Q25 → 1Q26E\" context\nsource_id: Prevents \"Goldman's BOM\" when data is from GFHK\ntag: Auto-extracted from PDF filename for quick attribution\nunits: Prevents \"22.5B racks\" when source means \"22.5bn dollars\"\nscope: Prevents \"per rack\" when source means \"per server\"\n\nAttribution map benefits:\n\ntopics: Topic-level mapping showing which source is primary authority\nkey_metrics: Quick lookup of what each source covers\nsource_context: Summary of figures, time periods covered\ncross_doc_synthesis: Manual insights connecting multiple sources\nIntegration with Step 3 (Writing)\n\nMANDATORY: Reference the cache when writing.\n\nStep 3a: Load cache and attribution map:\n\ncache = load_json('rlm-extraction-cache.json')\nattr_map = cache['source_attribution_map']\n\n# Get topic attribution\ntopic = \"Memory Pricing\"\nsource_tag = attr_map['topics'][topic]['tag']  # \"GFHK\"\nkey_metrics = attr_map['topics'][topic]['key_metrics']\n\n\nStep 3b: Write using cache labels and attribution:\n\n## Content\n\n3/ Memory squeeze is already here. 
GFHK's BOM breakdown (3Q25 → 1Q26E):\n- HBM3e ASP: $3,756 → $4,378 (+17%)\n- DDR5-6400 (128GB): $563 → $1,920 (+241%)\n- HGX B300 8-GPU server BOM: $369k → $408k\n\n\nSource: rlm-extraction-cache.json, entry mem_001, mem_002, mem_003\n\nContext labels from cache:\n\nProduct type: HGX B300 8-GPU server (not GB300 rack)\nTime period: 3Q25 → 1Q26E (quarterly change)\nSource: GFHK Figure 2 (via attribution map tag)\n\nAttribution map usage:\n\nUsed topics[\"Memory Pricing\"][\"tag\"] → \"GFHK\"\nVerified metrics against key_metrics list\nCross-doc synthesis: See dual_squeeze_thesis for memory + ABF connection\nEnforcement\n\nBefore saving draft (Step 5), verify:\n\n Every published number has a cache entry\n Product types match cache labels\n Time periods included from cache\n Source attributions match cache source_id and attribution map tag\n Units match cache (dollars vs racks, per server vs per datacenter)\n Cross-doc claims reference cross_doc_synthesis if applicable\n\nRed flags - stop if you notice:\n\nWriting numbers from memory instead of cache\nProduct type differs from cache (product_type field)\nMissing time period when cache has time_period\nAttributing to wrong source vs cache source_id\nUsing wrong tag (e.g., \"Goldman\" for GFHK data)\nMissing cross-doc synthesis when connecting multiple sources\nManual Cache Building\n\nIf automatic extraction fails, manually create cache entries:\n\n{\n  \"entry_id\": \"manual_001\",\n  \"source_id\": \"report_name\",\n  \"metric\": \"Component count\",\n  \"product_type\": \"Humanoid robot (dexterous hand)\",\n  \"values\": {\"count\": 22},\n  \"units\": \"DOF (degrees of freedom)\",\n  \"context\": \"Dexterous hand articulation\",\n  \"source_quote\": \"22自由度靈巧手\",\n  \"verification\": \"Manual extraction from p.15\",\n  \"notes\": \"Summed from finger joints (20) + wrist (2)\"\n}\n\n\nSee: ~/.claude/skills/kirk-content-pipeline/scripts/README-extraction-cache.md for full documentation.\n\nStep 1c: Cross-Doc 
Synthesis (RECOMMENDED)\n\nWhy this step exists: Steps 1a and 1b produce per-document facts. Without explicit synthesis, the pipeline gravitates toward single-source claims (\"KHGEARS P/E is 20x\") rather than cross-doc insights (\"Taiwan brokers are more bullish than Western analysts on humanoid robotics\").\n\nWhen to Use\nScenario\tUse 1c?\nMultiple PDFs on same topic\tYes\nComparing broker views\tYes\nFinding consensus/disagreement\tYes\nSingle PDF deep dive\tNo (skip to Step 2)\nBreaking news (speed matters)\tNo (skip to Step 2)\nWhat 1c Produces\nOutput Type\tExample\tAudit Requirement\nConsensus claim\t\"3 of 4 brokers see DRAM ASP rising in 2H26\"\tCross-doc (rlm-multi)\nComparative insight\t\"HIWIN at 38x vs KHGEARS at 20x - market pricing in certainty\"\tCross-doc (rlm-multi)\nDisagreement flag\t\"MS says neutral, local brokers say buy - who's right?\"\tCross-doc (rlm-multi)\nSynthesized thesis\t\"Taiwan supply chain undervalued vs China peers\"\tCross-doc (rlm-multi)\nHow to Run Cross-Doc Synthesis\ncd ~/.claude/skills/rlm-repl-multi/scripts\n\n# Initialize all relevant PDFs\npython3 rlm_repl.py init \"/path/to/broker1.pdf\" --name broker1\npython3 rlm_repl.py init \"/path/to/broker2.pdf\" --name broker2\npython3 rlm_repl.py init \"/path/to/broker3.pdf\" --name broker3\n\n# Ask synthesis questions (not just extraction)\npython3 rlm_repl.py exec -c \"\n# Question 1: Do they agree on market sizing?\nmarket_data = grep_all('market size|TAM|規模|billion|億', max_matches_per_context=10)\nprint('=== MARKET SIZE ACROSS SOURCES ===')\nprint(market_data)\n\"\n\npython3 rlm_repl.py exec -c \"\n# Question 2: Compare recommendations\nratings = grep_all('BUY|SELL|NEUTRAL|買進|賣出|中立|rating|recommendation', max_matches_per_context=10)\nprint('=== RATINGS COMPARISON ===')\nprint(ratings)\n\"\n\npython3 rlm_repl.py exec -c \"\n# Question 3: Find disagreements\npe_data = grep_all('P/E|PE|本益比|target price|目標價', max_matches_per_context=10)\nprint('=== VALUATION COMPARISON 
===')\nprint(pe_data)\n\"\n\nSynthesis Questions to Ask\nCategory\tQuestions\nConsensus\tDo sources agree on [market size / timeline / key risk]?\nComparison\tHow does [broker A] view differ from [broker B]?\nValuation\tAre local vs foreign analysts pricing the same?\nTimeline\tDo sources agree on [catalyst / inflection point]?\nRisk\tWhat risks does one source mention that others miss?\nOutput Format: Synthesis Cache\n\nAfter running 1c, document synthesized insights for Step 3 (writing):\n\n## Cross-Doc Synthesis (Step 1c)\n\n**Sources:** broker1 (永豐), broker2 (MS), broker3 (Citi)\n\n### Consensus\n- Market size: All 3 agree on $5-6B (2025) → $30-35B (2029)\n- CAGR: 55-60% range across all sources\n\n### Disagreements\n- HIWIN: MS says NEUTRAL (38x too rich), 永豐 silent, Citi no coverage\n- Timeline: 永豐 more bullish on 2026 ramp, MS cautious until 2027\n\n### Comparative Insights (use in thread)\n- \"Taiwan brokers (永豐) bullish on KHGEARS; Western analysts (MS) more cautious on HIWIN\"\n- \"Local coverage sees 2026 inflection; foreign houses waiting for 2027 proof points\"\n\n### Audit Flag\nThese synthesized claims require cross-doc verification in Step 4a:\n- [ ] \"3 sources agree on market size\" → verify all 3 sources\n- [ ] \"Local vs foreign view divergence\" → verify specific ratings from each\n\nIntegration with Audit (Step 4a)\n\n⚠️ CRITICAL: Synthesized claims from Step 1c MUST be flagged for cross-doc audit in Step 4a.\n\nIn the audit manifest, mark these claims with cross-doc: true:\n\n## Claims to Verify\n\n| # | Claim | Type | Source ID | Cross-Doc? |\n|---|-------|------|-----------|------------|\n| 1 | KHGEARS P/E 20x | P/E | src1 | No |\n| 2 | Market consensus $5.3B | Consensus | src1, src2, src3 | **Yes** |\n| 3 | Local vs foreign view divergence | Synthesis | src1, src2 | **Yes** |\n\n\nCross-doc claims use rlm-repl-multi for verification, not parallel single-doc agents.\n\nExtract with Technical Specificity\n\nGo beyond surface numbers. 
Extract:\n\nWafer capacity (WPM)\nFab names (M15X, P4L, X2)\nYield percentages\nProcess nodes (1b, 1c)\nComponent counts per unit\nQuestion\tExtract\nWhat\tOne-sentence summary\nWhy\tWhy readers should care\nWho\tCompanies/tickers affected\nWhen\tTimeline (specific quarters)\nWhere\tFab locations, geography\nHow\tMechanism with technical detail\nStep 2: Check KSVC Holdings (Initial)\n\n⚠️ CRITICAL: This is a preliminary check. You MUST run Step 4b (Final Holdings Verification) after writing content to catch any tickers discovered during extraction.\n\nAll Models (7 Total)\nUS Models: usa-model1 ~ usa-model5 (5 models)\nTaiwan (TWSE) Models: twse-model1 ~ twse-model2 (2 models)\nStep 2a: Identify All Possible Tickers\n\nBefore querying the API, identify ALL possible identifiers for the company:\n\n# Example: Global Unichip Corp\n# Identifiers to search:\n# - US ticker: N/A (not US-listed)\n# - Taiwan ticker: 3443\n# - Chinese name: 創意 or 創意電子\n# - English name: Global Unichip, GUC\n# - Stock code: 3443 TW (TWSE format)\n\n# For Taiwan stocks, verify ticker via TWSE API first:\ncurl -s \"https://www.twse.com.tw/en/api/codeQuery?query=3443\"\n# Returns: {\"query\":\"3443\",\"suggestions\":[\"3443\\tGUC\"]}\n\n\nRules:\n\nUS stocks: Search by ticker only (e.g., \"MU\", \"AMD\", \"NVDA\")\nTaiwan stocks: Search by stock code (e.g., \"3443\") - may appear as \"3443 創意\" in API\nIf unsure: Check both US and TWSE models\nStep 2b: Query All 7 Models\n\nNEVER assume a stock isn't held without checking ALL 7 models.\n\nRECOMMENDED: Use tradebook for accurate entry prices and current status\n\n# FASTEST METHOD: Check tradebook for entry price + status\n# (Works for all models - US and TWSE)\ncurl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.tradebook[] | select(.ticker == 6285 or .ticker == 3491) |\n  {ticker, enterDate, enterPrice, todayPrice, profitPercent, exitDate}'\n\n# Returns:\n# {\n#   \"ticker\": 6285,\n#   \"enterDate\": \"Wed, 28 Jan 2026 00:00:00 
GMT\",\n#   \"enterPrice\": 162.0,\n#   \"todayPrice\": 207.5,  # ⚠️ May be stale! Use Yahoo Finance for current\n#   \"profitPercent\": 28.09,  # ⚠️ Based on stale todayPrice\n#   \"exitDate\": null  # null = still holding\n# }\n\n\n⚠️ CRITICAL: API's todayPrice and profitPercent can be STALE (hours or days old). Always verify current price with Yahoo Finance API (Step 2d).\n\nFALLBACK: Check equitySeries (slower, less data)\n\n# Check ALL 5 US models\nfor i in 1 2 3 4 5; do\n  echo \"=== USA-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/usa-model$i\" | \\\n    jq --arg t \"MU\" '.equitySeries[0].series[] | select(.Ticker == $t) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\n# Check ALL 2 TWSE models (search by stock code)\nfor i in 1 2; do\n  echo \"=== TWSE-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/twse-model$i\" | \\\n    jq '.equitySeries[0].series[] | select(.Ticker | contains(\"3443\")) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\n\nWhy still use equitySeries?\n\nHistorical tracking: Shows return % evolution over time (.data[] array)\nVerification: Confirms position is still active\nFallback: If tradebook is unavailable or empty\nEntry date discovery: First data point (return ≈ 0) indicates entry date\n\nExample: Finding entry date from equitySeries\n\n# Get all data points to find entry date\ncurl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.equitySeries[0].series[] | select(.Ticker | contains(\"6285\")) | .data[0]'\n# Returns: {\"date\": \"2026-01-28 00:00:00\", \"value\": 0}\n# Entry date: Jan 28, 2026\n\nStep 2c: Verification and Fallback Strategy\n\nUse all three data sources for robustness:\n\nData Source\tWhen to Use\tWhat It Shows\tLimitation\ntradebook\tPrimary\tEntry date, entry price, exit status\ttodayPrice may be stale\nequitySeries\tVerification\tReturn % over time, position status\tNo entry price/date\nfilledOrders\tFallback\tActual trade orders, prices\tEmpty if model didn't 
reset recently\n\nRecommended workflow:\n\n# 1. PRIMARY: Get entry details from tradebook\nTRADEBOOK=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.tradebook[] | select(.ticker == 6285)')\n\n# 2. VERIFY: Cross-check with equitySeries\nEQUITY=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.equitySeries[0].series[] | select(.Ticker | contains(\"6285\"))')\n\n# 3. FALLBACK: If tradebook empty, check filledOrders\nif [ -z \"$TRADEBOOK\" ]; then\n  curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n    jq '.filledOrders[] | select(.ticker | contains(\"6285\"))'\nfi\n\n\nCross-verification example:\n\n# Check if tradebook and equitySeries agree on position status\nTRADEBOOK_HELD=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.tradebook[] | select(.ticker == 6285 and .exitDate == null) | .ticker')\n\nEQUITY_HELD=$(curl -s \"https://kicksvc.online/api/twse-model2\" | \\\n  jq '.equitySeries[0].series[] | select(.Ticker | contains(\"6285\")) | .Ticker')\n\n# If both show position, high confidence\n# If only one shows position, investigate discrepancy\n\n\nFallback: Check filledOrders (if tradebook empty)\n\nIf equitySeries is empty OR tradebook is empty (rare, but possible after model reset):\n\n# Check ALL US models - filledOrders\nfor i in 1 2 3 4 5; do\n  echo \"=== USA-Model $i filledOrders ===\"\n  curl -s \"https://kicksvc.online/api/usa-model$i\" | \\\n    jq '.filledOrders[] | select(.ticker == \"MU\") | {ticker, price, quantity}'\ndone\n\n# Check ALL TWSE models - filledOrders\nfor i in 1 2; do\n  echo \"=== TWSE-Model $i filledOrders ===\"\n  curl -s \"https://kicksvc.online/api/twse-model$i\" | \\\n    jq '.filledOrders[] | select(.ticker | contains(\"3443\")) | {ticker, price, quantity}'\ndone\n\n\nWhen data sources disagree:\n\nScenario\tAction\ntradebook shows position, equitySeries doesn't\tTrust tradebook (equitySeries may lag)\nequitySeries shows position, tradebook doesn't\tInvestigate - check 
filledOrders\nfilledOrders shows buy but no current position\tPosition was closed - check tradebook.exitDate\nAll three empty\tPosition not held in this model\nStep 2e: Document Holdings with Accurate Returns\n\nCRITICAL: Always calculate actual returns using:\n\nEntry price from tradebook.enterPrice\nCurrent price from Yahoo Finance API (NOT KSVC API's stale todayPrice)\n\nOutput format (with accurate data):\n\n**KSVC Holdings Check:**\n- ✅ WNC (6285.TW) - Held in TWSE Model 2\n  - Entry: Jan 28, 2026 @ NT$162\n  - Current: NT$187 (Yahoo Finance)\n  - Gain: +15.4% (actual, not API's stale 28%)\n- ✅ UMT (3491.TWO) - Held in TWSE Model 2\n  - Entry: Jan 28, 2026 @ NT$1,120\n  - Current: NT$1,280 (Yahoo Finance)\n  - Gain: +14.3% (actual, not API's stale 23%)\n- ❌ Not held in TWSE Model 1 or USA Models 1-5\n\n**Note:** API's equitySeries and tradebook.todayPrice can lag hours/days behind market.\nAlways use Yahoo Finance for current prices.\n\n\nIf NOT held in any model:\n\n**KSVC Holdings Check:**\n- ❌ Not held in any of 7 models (checked USA 1-5, TWSE 1-2)\n- Content angle: Industry analysis / Market observation\n\nIntegration Strategies\nSituation\tApproach\tExample\nHeld (US)\tCall out position\t\"KSVC Model1 holds $MU at $412 entry\"\nHeld (TW)\tCall out position\t\"KSVC台股Model1持有台積電 (2330)\"\nNot held\tIndustry framing\t\"Memory cycle benefits $MU, SK Hynix\"\nWin\tVictory lap\t\"$MU +15% since Model1 added it\"\nStep 2d: Current Price Check (Yahoo Finance API - REQUIRED)\n\n⚠️ CRITICAL: ALWAYS use Yahoo Finance for current prices. 
KSVC API's todayPrice can be stale.\n\nUS stocks:\n\n# Get current price\nTICKER=\"MU\"\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/$TICKER?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta.regularMarketPrice'\n\n# Get full market data\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/$TICKER?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta | {symbol, regularMarketPrice, currency, regularMarketTime}'\n\n\nTaiwan stocks (use .TW or .TWO suffix):\n\n# WNC (6285.TW)\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/6285.TW?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta | {symbol, regularMarketPrice, currency, regularMarketTime}'\n\n# UMT (3491.TWO - OTC stocks use .TWO)\ncurl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/3491.TWO?interval=1d&range=1d\" | \\\n  jq '.chart.result[0].meta | {symbol, regularMarketPrice, currency, regularMarketTime}'\n\n\nTaiwan ticker suffixes:\n\n.TW - Listed on Taiwan Stock Exchange (TWSE)\n.TWO - Listed on Taipei Exchange (TPEx/OTC)\n\nCalculate actual gain (not API's stale profit%):\n\n# Example: WNC (note %+.1f so the sign is printed, matching the output below)\nTICKER=\"6285.TW\"\nENTRY=162  # From tradebook.enterPrice\nCURRENT=$(curl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/$TICKER?interval=1d&range=1d\" | jq '.chart.result[0].meta.regularMarketPrice')\necho \"$TICKER: NT\\$$CURRENT | Entry: NT\\$$ENTRY | Gain: $(awk \"BEGIN {printf \\\"%+.1f\\\", ($CURRENT - $ENTRY) / $ENTRY * 100}\")%\"\n\n# Output: 6285.TW: NT$187 | Entry: NT$162 | Gain: +15.4%\n\n\nComplete workflow (tradebook + Yahoo Finance):\n\n# 1. Get entry price from tradebook\nENTRY=$(curl -s \"https://kicksvc.online/api/twse-model2\" | jq '.tradebook[] | select(.ticker == 6285) | .enterPrice')\n\n# 2. 
Get current price from Yahoo Finance\nCURRENT=$(curl -s -A \"Mozilla/5.0\" \"https://query1.finance.yahoo.com/v8/finance/chart/6285.TW?interval=1d&range=1d\" | jq '.chart.result[0].meta.regularMarketPrice')\n\n# 3. Calculate actual gain\necho \"Entry: NT\\$$ENTRY | Current: NT\\$$CURRENT | Gain: $(awk \"BEGIN {printf \\\"%.1f\\\", ($CURRENT - $ENTRY) / $ENTRY * 100}\")%\"\n\nStep 3: Write Content\n\nSee references/kirk-voice.md for full templates and examples.\n\nThread Numbering Convention\nFormat\tWhen to Use\nNo number on Tweet 1\tRecommended - cleaner hook, stands alone if quoted/shared\n2/, 3/, etc.\tStandard thread format - signals \"2 of N\"\n1/ on first tweet\tOptional - explicit \"thread incoming\" signal\n\nWhy skip number on first tweet:\n\nHook tweet often gets shared standalone\n\"1/\" makes it look incomplete out of context\nCleaner visual presentation\n\nFormat preference: Use / not ) - it's the established Twitter thread convention.\n\n✅ Recommended:\nHumanoid robots going from science fair to factory floor. Taiwan supply chain getting interesting.\n\n2/ TLDR:\n- Market: $5.3B (2025) to $32.4B (2029)...\n\n❌ Avoid:\n1/ Humanoid robots going from science fair...\n\nPick Content Type\nWhat kind of content? (Thread / Quick Take / Breaking / Shitpost / Commentary / Victory Lap)\nLook up the formula in kirk-voice.md\nApply the blend\nTechnical Specificity\n\n❌ Vague: \"NAND supply is tight\"\n\n✅ Specific: \"YMTC adding 135k WPM at Wuhan Fab 3. Still won't close the gap - Samsung X2 conversion delayed to Q2.\"\n\n❌ Vague: \"HBM margins are good\"\n\n✅ Specific: \"SK Hynix HBM yields at 80-90%. Samsung stuck at 60% on 1c DRAM.\"\n\nAlways include: specific numbers, time frames, fab names, comparisons.\n\nReferential Clarity (Learned 2026-02-08)\n\nNever use vague pronouns or shorthand when the referent hasn't been introduced.\n\nIn thread format, each tweet may be read semi-independently. 
If earlier tweets discuss a concept as a category (e.g., \"ASIC revenue\"), don't suddenly refer to it as \"the project\" in a later tweet — the reader has no antecedent for \"the project.\"\n\n❌ Vague: \"MS thinks the project is the 3nm Google TPU\" (What project? The thread never introduced \"a project.\")\n\n✅ Clear: \"MS thinks the main client/program is the 3nm Google TPU\" (Names what MS is identifying — who's buying and what they're building.)\n\nRule: When a shorthand (\"the project\", \"this deal\", \"the play\") saves words but costs clarity, it's not saving anything. Name the thing directly. A few extra words that prevent the reader from pausing to re-read are always worth it.\n\nWhen shifting from category to specific: If the thread discusses an abstract category (ASIC revenue, memory supply) and then pivots to a specific entity (Google TPU, Samsung fab), bridge the transition. Don't assume the reader already knows which specific thing drives the category.\n\nStep 4a: Audit (MANDATORY — MUST USE SUBAGENTS)\n\n⚠️ WHY THIS STEP EXISTS: We learned that RLM extraction (Step 1b) is not the same as verification. Explore agents hallucinate numbers. Writers make inferences. This step catches errors BEFORE publishing.\n\n⚠️ STRUCTURAL GATE: You (the main agent) are the WRITER. You cannot also be the AUDITOR. You MUST delegate audit to fresh-context subagents. 
See the \"WARM STATE TRAP\" section in the audit-content skill for why.\n\nStep 4a Process (3 actions, in order)\n\nAction 1: Generate audit manifest\n\nWrite audit-manifest.md with all claims, sources, and search hints.\nThis is the handoff document for the audit agents.\n\n\nAction 2: Spawn Explore agents (MANDATORY — do NOT skip this)\n\nSpawn 1 Explore agent per source PDF via Task tool.\nEach agent gets: the manifest + its assigned PDF path + claim list.\nEach agent returns: JSON with PASS/FAIL/UNSOURCED per claim.\n\n\n⚠️ WARM STATE TRAP: If RLM is already loaded from Step 1b, you WILL be tempted to \"just grep it yourself.\" DO NOT. The audit-content skill explains why: you wrote the draft, so you already \"know\" the answers. Self-auditing is confirmation bias, not verification.\n\nSelf-check: If you are about to type rlm_repl.py exec during Step 4a, STOP. You are skipping the gate.\n\nAction 3: Collect results and write audit report\n\nAggregate agent results into audit-report.md.\nMUST include audit_agent_ids from the Task tool responses.\nIf audit_agent_ids is empty, the audit is invalid.\n\nInvoke the audit-content skill for full process details:\n/audit-content\n\nWhat Gets Verified\nClaim Type\tExample\tHow to Verify\nCompany names\t\"KHGEARS\"\tRLM grep + TWSE API\nTicker formats\t\"4571 TW\"\tTWSE API\nNumbers\t\"62 harmonic reducers\"\tRLM grep exact count\nPercentages\t\"19% cost\"\tRLM grep in source\nP/E ratios\t\"20x\"\tRLM grep analyst target\nRatings\t\"BUY\"\tRLM grep recommendation\nTimelines\t\"2H27\"\tRLM grep + verify context\nAttributions\t\"shipping to X\"\tMust be explicit in source, not inferred\nWhen to Proceed\nAll PASS: Save draft (Step 5)\nAny FAIL: Fix the claim, re-audit\nUNSOURCED: Either remove, add caveat (\"reportedly\"), or find source\n\nDo NOT save draft with FAIL status. 
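The collect-and-report step (Action 3) can be sketched in Python. This is a minimal sketch, not the pipeline's actual implementation: the per-agent JSON shape (`{claim_id, status, ...}`) and the result file names are assumptions to be adapted to the real agent output.

```python
import json
from pathlib import Path

def aggregate_audit(result_files):
    """Merge per-agent verdict files into one status map.

    Assumes each audit agent wrote a JSON list of
    {"claim_id": ..., "status": "PASS"|"FAIL"|"UNSOURCED", ...}
    (hypothetical shape; adapt to the real agent output).
    """
    by_status = {"PASS": [], "FAIL": [], "UNSOURCED": []}
    for path in result_files:
        for verdict in json.loads(Path(path).read_text()):
            by_status[verdict["status"]].append(verdict["claim_id"])
    return by_status
```

A draft is save-ready only when the FAIL bucket is empty and every UNSOURCED claim has been removed, caveated, or re-sourced.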
UNSOURCED claims need explicit decision.\n\nStep 4a.5: Gemini Web Cross-Validation (RECOMMENDED)\n\n⚠️ WHY THIS STEP EXISTS: RLM audit (Step 4a) is source-locked — it only checks claims against the cited PDF. This over-flags reasonable inferences that go beyond one report but are well-documented publicly. Step 4a.5 gives flagged claims a second chance via web-grounded search.\n\nCase Study (Old Memory Squeeze, 2026-02-07):\n\nDraft said \"capacity getting cannibalized for HBM and DDR5\"\nRLM audit FAIL: MS report says \"exiting DDR4\" / \"cannibalization\" but doesn't name HBM/DDR5 as destination\nGemini confirmed: TrendForce, DigiTimes, The Elec all document the DDR4→HBM/DDR5 shift\nResult: Claim restored with dual attribution (MS + public sources)\n\nSame thread: \"Samsung, Kioxia, Micron all reducing MLC NAND\" — MS only confirmed Samsung, said Kioxia/Micron \"could\" reduce. Gemini confirmed all three are actively reducing per TrendForce (41.7% YoY MLC NAND capacity decrease).\n\nWhen to Use\nRLM Audit Result\tUse Gemini?\tWhy\nFAIL — wrong number\tNo\tNumber errors need source correction, not web search\nFAIL — inference beyond source\tYes\tInference may be valid per public sources\nFAIL — misattribution\tMaybe\tCheck if correct attribution exists publicly\nUNSOURCED — claim not in cited PDF\tYes\tClaim may be common industry knowledge\nPASS\tNo\tAlready verified\nHow to Run\n# For each FAIL/UNSOURCED claim that looks like a reasonable inference:\ngemini -p \"Search the web: [specific factual question about the inference].\nI need external industry sources (TrendForce, DigiTimes, The Elec, Reuters,\ncompany earnings calls) from 2025-2026 confirming or denying this.\"\n\n\nKey: Ask Gemini to search the web explicitly. 
Without \"search the web\", Gemini may read local files instead.\n\nDecision Matrix\nGemini Result\tAction\nConfirmed with public sources\tRestore claim, add dual attribution (source + Gemini web)\nPartially confirmed\tSoften language to match what's confirmed\nNot confirmed / contradicted\tKeep as FAIL, fix or remove claim\nGemini unsure / no sources found\tKeep as FAIL (conservative default)\nAudit Report Format\n\nUpdate claims restored via Gemini:\n\n| # | Claim | Source | Status |\n|---|-------|--------|--------|\n| 5 | DDR4 capacity → HBM/DDR5 | ms_old_memory + Gemini web | PASS (restored) |\n| 8 | All three reducing MLC NAND | ms_old_memory + Gemini web | PASS (restored) |\n\n\nInclude Gemini's sources in the audit resolution log:\n\n### Claim 5 (RESTORED via Gemini)\n- **RLM flagged:** Source doesn't name HBM/DDR5 as destination\n- **Gemini confirmed:** TrendForce, DigiTimes, The Elec document capacity conversion\n- **Resolution:** Restored with corrected framing\n\nGuidelines\nOnly use for inferences and industry knowledge claims, NOT for number verification\nGemini's web search is a second opinion, not the final word — present both perspectives\nIf Gemini contradicts both the source and common sense, flag for human review\nKeep Gemini queries specific and focused — one claim per query\nStep 4b: Final Holdings Verification (MANDATORY)\n\n⚠️ CRITICAL: This is the FINAL holdings check. 
You MUST run this after writing content because:\n\nTickers/stock codes may be discovered during extraction (Step 1b)\nCompany names may be clarified during audit (Step 4a)\nStep 2 was a preliminary check with limited information\nWhy This Step Exists\n\nProblem: You might learn the correct ticker late in the pipeline.\n\nExample - GUC Case:\n\nStep 1b: Learn company is \"Global Unichip (GUC)\"\nStep 1b: Extract ticker \"3443 TW\" from report\nStep 2: ❌ Assumed \"not held\" without actually checking TWSE models\nStep 3: Wrote \"I don't have a position here\"\nStep 4b: ✅ Discovered GUC IS held in TWSE Model 1 (+2.22%)\nResult: Had to rewrite content to reflect actual position\nStep 4b Process\n\n1. Extract ALL tickers/identifiers from the draft:\n\n# Read the draft and extract tickers\ngrep -E \"[0-9]{4}|\\\\$[A-Z]{2,5}\" draft.md\n\n# Example output:\n# - 3443 TW (Taiwan stock)\n# - $MU (US stock)\n# - AMD, NVDA (US stocks)\n\n\n2. For EACH ticker, check ALL 7 models:\n\n# Taiwan stock example: 3443\nfor i in 1 2; do\n  echo \"=== TWSE-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/twse-model$i\" | \\\n    jq '.equitySeries[0].series[] | select(.Ticker | contains(\"3443\")) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\n# US stock example: MU\nfor i in 1 2 3 4 5; do\n  echo \"=== USA-Model $i ===\"\n  curl -s \"https://kicksvc.online/api/usa-model$i\" | \\\n    jq --arg t \"MU\" '.equitySeries[0].series[] | select(.Ticker == $t) |\n    {ticker: .Ticker, return: .data[-1].value}'\ndone\n\n\n3. Compare Step 2 vs Step 4b results:\n\n**Holdings Verification:**\n\nStep 2 (Initial): Claimed \"Not held\"\nStep 4b (Final): ✅ Found in TWSE Model 1 (+2.22%)\n\n**Action Required:** Update draft to reflect actual position\n\n\n4. If holdings status changed, update draft:\n\n# Before (Step 2):\n\"I don't have a position here, but watching...\"\n\n# After (Step 4b):\n\"KSVC holds GUC in TWSE Model 1 (+2.22% since entry). 
Watching...\"\n\nDecision Matrix\nStep 2\tStep 4b\tAction\nNot held\tNot held\t✅ No change needed\nNot held\tHELD\t❌ UPDATE DRAFT - change content angle\nHeld\tHeld\t✅ Verify return % is current\nHeld\tNot held\t❌ UPDATE DRAFT - position was closed\nOutput Format\n**Step 4b: Final Holdings Verification**\n\n✅ Verified ALL 7 models (USA 1-5, TWSE 1-2)\n\n**Tickers checked:**\n- 3443 (GUC): ✅ Found in TWSE Model 1 (+2.22%)\n- $MU: ❌ Not held\n- $AMD: ✅ Found in USA Model 3 (+12.5%)\n\n**Changes required:**\n- Update draft line 32: Add KSVC position note for GUC\n- Update draft line 45: Add KSVC position note for AMD\n\nStep 4c: Stylize (MANDATORY)\n\nWhy this step exists: The data backbone (Step 3) is Serenity-heavy - precise, comprehensive, verified facts. Step 4c transforms it into Kirk's authentic voice with emotional range and character.\n\nInvoke the kirk-mode skill:\n\n/kirk-mode\n\nWhat Kirk Mode Does\n\nTransforms verified data into Kirk's voice by:\n\nMode selection - Matches Kirk's emotional mode to situation (Analytical, Sarcastic, Emo, Shitpost, Degen, GIF Master)\nVoice elements - Adds discovery moments (\"Wait though\"), reactions (\"wayyy bigger\"), first-person thesis\nMeme culture - Integrates fintwit slang (ngmi, wagmi, brother, probably nothing) strategically\nAnti-formula - Rotates structure to prevent templating (varies TLDR → \"ok so\" → question)\nCredibility balance - Online enough to relate, credible enough to trust\nWhen to Use Each Mode\nSituation\tKirk Mode\tExample\nDeep fundamental dive\tAnalytical\t\"ok so\", \"Wait though\", data-heavy with reactions\nMarket absurdity\tSarcastic\t\"brother Elon literally applied for 1M satellites\"\nPositions down\tEmo\t\"honestly getting wrecked\", vulnerable lowercase\nQuick reaction/meme\tShitpost\tme/also me format, nobody: format\nHigh-conviction risky play\tDegen\t\"sir this is a casino\", YOLO energy\nVictory lap\tGIF Master + Analytical\tPerfect GIFs + receipts\n\nMost natural: Mix modes in 
single post (Analytical + Sarcastic + maybe GIF)\n\nWorkflow\nAssess situation - What's happening? (Deep dive, absurd market, position down, quick reaction)\nSelect mode(s) - Use kirk-mode decision tree or mix modes naturally\nApply voice toolkit - Discovery moments, strategic \"wayyy\", emphasis markers\nCheck meme integration - Would slang/GIF enhance or distract from analysis?\nVerify authenticity - Read aloud: sounds like intern at bar or ChatGPT report?\n\nOutput: Transformed content with Kirk's character voice - ready for humanizer pass.\n\nSee kirk-mode skill for:\n\nComplete mode descriptions with examples\nMeme vocabulary and format templates\nAnti-formula principles\nCredibility boundaries\nStep 4d: Humanize (MANDATORY)\n\nNote: Humanizer runs AFTER stylize to remove any AI patterns that slipped through during transformation.\n\nInvoke the humanizer skill:\n\n/humanizer\n\nPatterns to Remove\nPattern\tFix\n\"Full stop.\"\t\"Simple as.\" or just delete\nEm-dashes (—)\tPeriods, commas\n\"It's not X. 
It's Y.\"\t\"The play is Y, not X.\"\nPerfect parallelism\tVary structure\nRule of three\tBreak the pattern\nOver-confidence\tAdd skepticism phrase\nAI Words to Remove\n\nAdditionally, crucial, delve, emphasize, testament, enhance, foster, landscape, showcase, tapestry, underscore, vibrant, pivotal, key (adj), interplay\n\nSoul to Add\nSkepticism: \"I might be wrong\" / \"Not sure about this\"\nReactions: \"That number is wild\" / \"Interesting\"\nFirst person: \"I keep thinking about...\"\nMixed feelings: \"Impressive but also kind of unsettling\"\nQuestions: Ask the audience\nStep 5: Save Draft\nFile Organization Convention\n\nCRITICAL: Use assets folder structure for all drafts.\n\ncontent-pipeline/draft/\n└── YYYY-MM-DD-topic-assets/\n    ├── README.md                           # Inventory, traceability, verification log\n    ├── YYYY-MM-DD-topic.md                 # Original draft\n    ├── YYYY-MM-DD-topic-citrini7.md        # Tone rewrites (if applicable)\n    ├── YYYY-MM-DD-topic-audit-manifest.md  # Audit claims list\n    ├── YYYY-MM-DD-topic-audit-report.md    # Audit verification results\n    ├── YYYY-MM-DD-topic-audit-final.md     # Final audit with corrections\n    ├── chart1_*.png                        # Generated charts\n    ├── chart2_*.png\n    └── source_*.png                        # Source images for traceability\n\n\nExample: 2026-02-05-guc-valuation-debate-assets/\n\nDraft Content Format\n\nSave main content as: YYYY-MM-DD-topic-assets/YYYY-MM-DD-topic.md\n\n# [Topic] [Type] Draft\n\n**Date:** YYYY-MM-DD\n**Source:** [Report name, date]\n**Type:** Thread | Quick Take | Reaction\n**Status:** PENDING APPROVAL\n**Process:** RLM extraction → KSVC check → Humanizer pass\n\n---\n\n## Content\n\n[Content here]\n\n---\n\n## Source Citations\n- [List sources]\n\n## Notes\n- [KSVC holdings: $TICKER at $PRICE entry]\n- [Technical details verified via RLM]\n- [Any caveats or uncertainties]\n\nREADME.md Template\n\nCreate README.md in the assets folder to 
document the work:\n\n# [Topic] Assets\n\n**Date:** YYYY-MM-DD\n**Type:** [Thread/Quick Take/etc]\n**Topic:** [Brief description]\n\n---\n\n## Content Files\n\n| File | Description | Status |\n|------|-------------|--------|\n| YYYY-MM-DD-topic.md | Original draft | ✅ APPROVED |\n| YYYY-MM-DD-topic-citrini7.md | Citrini7 rewrite | ✅ APPROVED |\n\n---\n\n## Charts (Original Work - OK to Publish)\n\n| File | Description | Data Source |\n|------|-------------|-------------|\n| chart1_*.png | [Description] | [Source PDF + page] |\n| chart2_*.png | [Description] | [Source PDF + page] |\n\n**Theme:** ocean_depths\n\n---\n\n## Data Verification Log\n\n### [Claim Category 1]\n\\```\n[Claim]: [Value]\n- Source: [PDF name, page]\n- RLM verified: [grep results or calculation]\n\\```\n\n### [Claim Category 2]\n\\```\n[Claim]: [Value]\n- Source: [PDF name, page]\n- Verified: [evidence]\n\\```\n\n---\n\n## Audit Reports\n\n| File | Purpose |\n|------|---------|\n| YYYY-MM-DD-topic-audit-manifest.md | Claims to verify |\n| YYYY-MM-DD-topic-audit-report.md | Initial audit results |\n| YYYY-MM-DD-topic-audit-final.md | Final audit with corrections |\n\n**Audit result:** X/Y claims verified\n\n---\n\n## KSVC Holdings\n\n\\```bash\n# Verification command\ncurl -s \"https://kicksvc.online/api/[model]\" | jq '...'\n\nResult: [Holdings status]\n\\```\n\n---\n\n## Source Documents\n\n| Source | Path | Used For |\n|--------|------|----------|\n| [Report name] | /Users/Shared/ksvc/pdfs/YYYYMMDD/file.pdf | [What data] |\n\n---\n\n## Corrections Made\n\n1. [Correction 1]\n2. [Correction 2]\n\n---\n\n## Lessons Learned\n\n1. [Lesson 1]\n2. [Lesson 2]\n\nStep 6: Chart Decision & Generation\n\nTiming: After draft is complete. 
The draft crystallizes the thesis - then you see which claims benefit from visualization.\n\nWhen to Make Charts\nContent Type\tChart Likely?\tWhy\nLong Thread\tYes\tMultiple data points, trends\nQuick Take\tMaybe\tOne key number might not need visual\nBreaking News\tRarely\tSpeed > polish\nVictory Lap\tMaybe\tEntry vs Now comparison\nChart-Tweet Pairing\n\nPrinciple: Put the most eye-catching visual early (Tweet 1-3) to hook engagement.\n\nChart Type\tBest Tweet Position\tWhy\nMarket size / growth bar\tTweet 2 (TLDR)\tPairs with market numbers, shows scale\nComponent breakdown pie\tTweet 3-4\tPairs with component discussion\nCompany comparison table\tTweet 5-6\tPairs with company analysis\nTimeline / roadmap\tTweet 7-8\tPairs with forward-looking content\n\nPairing logic:\n\nMatch chart to the tweet that contains the same data\nHook tweet (Tweet 1) can go either way:\nText-only: Clean, curiosity-driven, lets words land first\nWith chart: Visual stop, data-forward, shows you have receipts\nVisuals work best on data-heavy tweets, not opinion tweets\nFinal tweet (watchlist/conclusion) usually doesn't need a chart\n\nExample pairing (humanoid robotics thread):\n\nTweet 1: Hook (optional: market_size_bar.png for visual hook)\nTweet 2: TLDR + market_size_bar.png ← $5.3B→$32.4B numbers\nTweet 3: Component counts (optional: component_pie.png)\nTweet 5: Taiwan names + taiwan_companies_table.png ← KHGEARS/HIWIN/AIRTAC\n\nDecision Process\n\nReview draft - identify \"chartable moments\"\n\nTime series data (market growth, price trends)\nComponent breakdowns (pie charts)\nCompany comparisons (tables)\n\nCheck RLM cache - do we have the data?\n\nText numbers → bar/line charts\nTables → comparison tables\nSource charts → reference or recreate\n\nDECLARE SOURCE (MANDATORY) - before any chart generation\n\n\"I am charting [METRIC] from [SOURCE] page [X]\"\n\"Source contains these exact values: [list them]\"\n\n\nGenerate with chart-factory\n\n/chart-factory\n\nChart Generation 
Workflow\nDraft complete → identify chartable claims\n                        ↓\n              Pull data from RLM cache (NOT from draft text)\n                        ↓\n              ⚠️ DECLARE SOURCE (state metric + page + exact values)\n                        ↓\n              Save source image FIRST (before generating)\n                        ↓\n              Generate with chart-factory (use theme-factory)\n                        ↓\n              Verify with verification agent\n                        ↓\n              Save to assets folder\n\nSource Declaration (LEARNED FROM MISTAKE)\n\n⚠️ Why this exists: We once created a \"component count\" chart but saved a \"cost %\" source image. The metrics didn't match, making the source invalid for verification.\n\nBefore generating ANY chart, you MUST:\n\n| Step | Action | Example |\n|------|------|------|\n| 1. State | \"I am charting [METRIC] from [SOURCE]\" | \"I am charting hardware cost % from 永豐 p.20\" |\n| 2. Show | Screenshot the exact source table/chart | Save as source_hardware_cost_p20.png |\n| 3. Confirm | \"Source contains: [exact values]\" | \"19%, 16%, 13%, 52%\" |\n| 4. Flag | If transforming data, justify it | \"I am NOT transforming - using values as-is\" |\n\nRed flags - STOP if you notice:\n\n- Source shows % but you're charting counts (metric mismatch)\n- Source has 15 items but chart has 5 (cherry-picking)\n- Source image doesn't contain your chart's numbers (wrong source)\n- Company name romanized/guessed from Chinese (fabricated data)\n- Ticker suffix assumed without checking (TT vs TW)\n\nCompany & Ticker Verification\n\n⚠️ LEARNED FROM MISTAKE: We fabricated \"Chuing\" for 祺驊 (4571). 
Official name is \"KHGEARS\".\n\n# Always verify Taiwan company names via TWSE API\ncurl -s \"https://www.twse.com.tw/en/api/codeQuery?query=4571\"\n# Returns: {\"query\":\"4571\",\"suggestions\":[\"4571\\tKHGEARS\"]}\n\n- Never romanize Chinese names (祺驊 ≠ \"Chuing\")\n- Use the TW suffix for a general audience (TT = Bloomberg only)\n\nUsing chart-factory\n\nfrom chart_factory import create_bar_chart, create_pie_chart, create_table_chart\n\n# Market size bar chart\ncreate_bar_chart(\n    data={'2025': 5.3, '2026': 8.3, '2027': 13.0},\n    title=\"Global Humanoid Robot Market\",\n    theme=\"ocean_depths\",\n    annotations={\"type\": \"cagr\", \"value\": \"57%\"}\n)\n\n# Component pie chart\ncreate_pie_chart(\n    data={'Reducers': 62, 'Motors': 30, 'Screws': 48},\n    title=\"Component Breakdown\",\n    theme=\"ocean_depths\",\n    explode_largest=True\n)\n\n# Company comparison table (use the TWSE-verified name, never a romanization)\ncreate_table_chart(\n    columns=['Company', 'Ticker', 'P/E', 'Rating'],\n    data=[['KHGEARS', '4571', '24x', 'BUY'], ...],\n    title=\"Taiwan Supply Chain\",\n    theme=\"ocean_depths\"\n)\n\nVerification (MANDATORY)\n\nAfter generating, spawn an Explore agent with thoroughness: quick for focused verification:\n\nTask(subagent_type=\"Explore\", prompt=\"\"\"\n**THOROUGHNESS: quick**\n\n**CONTEXT ISOLATION: You have NO external conversation history. Work ONLY from this prompt.**\n\nCHART VERIFICATION TASK\n\nChart: /path/to/chart.png\nType: bar\n\nSource Data (expected):\n{\"2025\": 5.3, \"2026\": 8.3, \"2027\": 13.0}\n\nSource Context:\n永豐 p.3 - \"2025年全球人型機器人規模約53億美元\" (2025 global humanoid robot market ≈ US$5.3B)\n\nTask:\n1. Read the chart image\n2. Extract numbers from visual\n3. Compare to expected data\n4. Check for unit consistency (B vs M, % formatting)\n\nReturn ONLY JSON:\n{\n  \"verified\": true/false,\n  \"numbers_in_chart\": [...],\n  \"numbers_in_source\": [...],\n  \"discrepancies\": [...],\n  \"notes\": \"...\"\n}\n\"\"\")\n\nVerification checks data → chart integrity. 
Source accuracy is RLM's responsibility (Step 4a).\n\nThoroughness = quick: single-pass verification, focused on specific data points. Fast visual-to-data check.\n\nSave Charts\n\nSave to: draft/YYYY-MM-DD-topic-assets/\n\nInclude:\n\n- Generated charts (chart1_*.png, chart2_*.png)\n- Source images from PDF (for traceability)\n- generate_charts.py script (reproducibility)\n\nStep 7: Publish to Final Folder\n\nAfter approval, publish the clean version to /Users/Shared/ksvc/threads/.\n\nFile Organization Convention\n\nCRITICAL: Flat folder structure, one folder per post.\n\n/Users/Shared/ksvc/threads/\n├── 2026-02-03-humanoid-robotics/\n│   ├── thread.md                         # Clean content (ready to post)\n│   ├── _metadata.md                      # Internal reference (not for posting)\n│   ├── chart1_market_size.png\n│   ├── chart2_component_breakdown.png\n│   └── chart3_taiwan_companies.png\n└── 2026-02-05-guc-valuation-debate/\n    ├── thread.md                         # Clean content (ready to post)\n    ├── _metadata.md                      # Internal reference (not for posting)\n    ├── guc-eps-comparison.png\n    └── guc-pt-comparison.png\n\nRules:\n\n✅ Flat structure: YYYY-MM-DD-topic/ at root level (not nested in 2026-02/)\n✅ Charts directly in folder (not in charts/ subfolder)\n✅ thread.md = clean content only (no metadata header)\n✅ _metadata.md = internal reference (sources, audit, not for posting)\n\nthread.md Format\n\nClean version with just the tweets - no metadata header:\n\n# [Topic Title]\n\n1/ [First tweet]\n\n2/ [Second tweet]\n- bullet point\n- bullet point\n\n3/ [Third tweet]\n\n...\n\n_metadata.md Format\n\nInternal reference file (prefixed with _ to indicate not for posting):\n\n# Metadata (not for posting)\n\n**Date:** YYYY-MM-DD\n**Type:** Long Thread (10 tweets) | Quick Take | etc.\n**Status:** READY TO POST\n\n## Sources\n- [Source 1 PDF name] ([Date])\n- [Source 2 PDF name] ([Date])\n\n## KSVC Holdings Check\n- ✅ Held in [Model name] (+X.X% since 
entry) OR\n- ❌ Not held in any of 7 models (checked USA 1-5, TWSE 1-2)\n- Integration strategy: [Personal stakes | Industry framing | Victory lap]\n\n## Audit Log\n- [Key claim verified via RLM grep]\n- [Correction made: old → new]\n- [Methodology improvement discovered]\n\n## Charts\n- chart1_*.png - [Description] ([Data source])\n- chart2_*.png - [Description] ([Data source])\n\n## Notes\n- [Special handling notes]\n- [Lessons learned]\n\nExample: See /Users/Shared/ksvc/threads/2026-02-05-guc-valuation-debate/_metadata.md\n\nPublish Workflow\n\n# 1. Create publish folder (flat structure)\nmkdir -p /Users/Shared/ksvc/threads/YYYY-MM-DD-topic\n\n# 2. Copy clean content as thread.md\ncp draft/YYYY-MM-DD-topic-assets/YYYY-MM-DD-topic-citrini7.md \\\n   /Users/Shared/ksvc/threads/YYYY-MM-DD-topic/thread.md\n\n# 3. Copy charts directly into folder (not subfolder)\ncp draft/YYYY-MM-DD-topic-assets/chart*.png \\\n   /Users/Shared/ksvc/threads/YYYY-MM-DD-topic/\n\n# 4. Create _metadata.md from draft notes\n# (Document sources, audit log, holdings, charts)\n\nResult:\n\n/Users/Shared/ksvc/threads/YYYY-MM-DD-topic/\n├── thread.md               # Ready to post\n├── _metadata.md            # Internal reference\n├── chart1_*.png\n└── chart2_*.png\n\nWhen to Publish\n\n| Status | Action |\n|------|------|\n| Draft approved | Publish to /Users/Shared/ksvc/threads/ |\n| Needs revision | Stay in content-pipeline/draft/ |\n| Posted to X | Move to /Users/Shared/ksvc/threads/archive/ (optional) |\n\nQuality Checklist\n\nExtraction (Step 1a/1b):\n\n- [ ] ⚠️ Checked published threads (/Users/Shared/ksvc/threads/) before topic selection\n- [ ] Topic does NOT duplicate a recently published thread (same source + same angle = reject)\n- [ ] Scanned recent PDF folders (at least 3) with Explore agents\n- [ ] Identified cross-document connections\n- [ ] Deep extracted key reports with RLM\n- [ ] Charts/images extracted and reviewed (use --extract-images)\n- [ ] ⚠️ Extraction validation: every PDF's chars_extracted checked against expected size\n- [ ] ⚠️ Read tool fallback used for any PDF with < 1000 chars (or suspiciously low for page count)\n- [ ] Key numbers verified via RLM grep (not just Explore summary)\n\nCross-Doc Synthesis (Step 1c):\n\n- [ ] Used rlm-repl-multi to compare across sources (if multiple PDFs)\n- [ ] Asked synthesis questions (consensus, comparison, disagreement)\n- [ ] Documented synthesized insights in cache\n- [ ] Flagged cross-doc claims for audit in Step 4b\n- [ ] Identified unique insights that single-source extraction would miss\n\nContent:\n\n- [ ] All published numbers have RLM grep confirmation\n- [ ] Technical specifics included (fabs, yields, WPM)\n- [ ] Time frames clear (Q1 2026, 2027e)\n- [ ] Sources cited (multiple reports for cross-doc)\n- [ ] Cross-doc reasoning: claims triangulated across multiple reports\n- [ ] Unique insight that connects dots others miss\n\nKSVC:\n\n- [ ] equitySeries checked (all 5 US + 2 TWSE models)\n- [ ] filledOrders fallback checked (all 7 models) if equitySeries shows 0%\n- [ ] Entry prices noted for victory lap potential\n- [ ] Integration strategy clear (held vs industry framing)\n\nVoice:\n\n- [ ] Appropriate type (thread vs quick take vs reaction)\n- [ ] Skepticism included where uncertain\n- [ ] Energy for high-conviction points\n- [ ] Not over-polished\n\nHumanizer (Step 4d):\n\n- [ ] No AI patterns (em-dashes, \"Full stop\", etc.)\n- [ ] Has personality/voice\n- [ ] Shows thinking process, not just conclusions\n\nAudit (Step 4a):\n\n- [ ] All factual claims extracted from draft\n- [ ] Each claim verified via RLM grep against source\n- [ ] Taiwan company names verified via TWSE API\n- [ ] No FAIL status claims remain\n- [ ] UNSOURCED claims either removed, caveated, or sourced\n- [ ] Audit report generated and attached to draft\n\nCharts (Step 6):\n\n- [ ] Identified chartable claims in draft\n- [ ] Data pulled from RLM cache (not draft text)\n- [ ] ⚠️ SOURCE DECLARED before generating (metric + page + exact values)\n- [ ] ⚠️ Source image saved FIRST (before chart generation)\n- [ ] ⚠️ Source image contains same metric as chart (not transformed)\n- [ ] If data transformed, transformation documented and justified\n- [ ] Used chart-factory with theme-factory theme\n- [ ] Verification agent confirmed data→chart integrity\n- [ ] generate_charts.py script included (reproducibility)\n\nPublish (Step 7):\n\n- [ ] Draft approved for posting\n- [ ] Created folder in /Users/Shared/ksvc/threads/YYYY-MM-DD-topic/\n- [ ] thread.md contains clean tweets only (no metadata header)\n- [ ] _metadata.md contains sources, audit log, chart descriptions\n- [ ] Charts copied to final folder\n- [ ] Verified all files present before announcing ready to post\n\nPDF Location\n\nResearch PDFs: /Users/Shared/ksvc/pdfs/\n\nls -la /Users/Shared/ksvc/pdfs/ | tail -5\n\nReferences\n\n- references/kirk-voice.md - PRIMARY - Unified voice guide with all content types, formulas, and templates\n- references/serenity-style.md - Deep dive: data-heavy thread patterns\n- references/citrini7-style.md - Deep dive: punchy quick take patterns\n- Full creator studies: ksvc-intern/content-pipeline/creator-studies/"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/lukerspace/kirk-content-pipeline",
    "publisherUrl": "https://clawhub.ai/lukerspace/kirk-content-pipeline",
    "owner": "lukerspace",
    "version": "1.0.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/kirk-content-pipeline",
    "downloadUrl": "https://openagent3.xyz/downloads/kirk-content-pipeline",
    "agentUrl": "https://openagent3.xyz/skills/kirk-content-pipeline/agent",
    "manifestUrl": "https://openagent3.xyz/skills/kirk-content-pipeline/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/kirk-content-pipeline/agent.md"
  }
}