{
  "schemaVersion": "1.0",
  "item": {
    "slug": "windows-control",
    "name": "Windows Control",
    "source": "tencent",
    "type": "skill",
    "category": "开发工具",
    "sourceUrl": "https://clawhub.ai/Spliff7777/windows-control",
    "canonicalUrl": "https://clawhub.ai/Spliff7777/windows-control",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/windows-control",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=windows-control",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "package.json",
      "SKILL.md",
      "scripts/click.py",
      "scripts/click_element.py",
      "scripts/click_text.py",
      "scripts/close_window.py"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-05-07T17:22:31.273Z",
      "expiresAt": "2026-05-14T17:22:31.273Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=afrexai-annual-report",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=afrexai-annual-report",
        "contentDisposition": "attachment; filename=\"afrexai-annual-report-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/windows-control"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/windows-control",
    "agentPageUrl": "https://openagent3.xyz/skills/windows-control/agent",
    "manifestUrl": "https://openagent3.xyz/skills/windows-control/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/windows-control/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Windows Control Skill",
        "body": "Full desktop automation for Windows. Control mouse, keyboard, and screen like a human user."
      },
      {
        "title": "Quick Start",
        "body": "All scripts are in skills/windows-control/scripts/"
      },
      {
        "title": "Screenshot",
        "body": "py screenshot.py > output.b64\n\nReturns base64 PNG of entire screen."
      },
      {
        "title": "Click",
        "body": "py click.py 500 300              # Left click at (500, 300)\npy click.py 500 300 right        # Right click\npy click.py 500 300 left 2       # Double click"
      },
      {
        "title": "Type Text",
        "body": "py type_text.py \"Hello World\"\n\nTypes text at current cursor position (10ms between keys)."
      },
      {
        "title": "Press Keys",
        "body": "py key_press.py \"enter\"\npy key_press.py \"ctrl+s\"\npy key_press.py \"alt+tab\"\npy key_press.py \"ctrl+shift+esc\""
      },
      {
        "title": "Move Mouse",
        "body": "py mouse_move.py 500 300\n\nMoves mouse to coordinates (smooth 0.2s animation)."
      },
      {
        "title": "Scroll",
        "body": "py scroll.py up 5      # Scroll up 5 notches\npy scroll.py down 10   # Scroll down 10 notches"
      },
      {
        "title": "Window Management (NEW!)",
        "body": "py focus_window.py \"Chrome\"           # Bring window to front\npy minimize_window.py \"Notepad\"       # Minimize window\npy maximize_window.py \"VS Code\"       # Maximize window\npy close_window.py \"Calculator\"       # Close window\npy get_active_window.py               # Get title of active window"
      },
      {
        "title": "Advanced Actions (NEW!)",
        "body": "# Click by text (No coordinates needed!)\npy click_text.py \"Save\"               # Click \"Save\" button anywhere\npy click_text.py \"Submit\" \"Chrome\"    # Click \"Submit\" in Chrome only\n\n# Drag and Drop\npy drag.py 100 100 500 300            # Drag from (100,100) to (500,300)\n\n# Robust Automation (Wait/Find)\npy wait_for_text.py \"Ready\" \"App\" 30  # Wait up to 30s for text\npy wait_for_window.py \"Notepad\" 10    # Wait for window to appear\npy find_text.py \"Login\" \"Chrome\"      # Get coordinates of text\npy list_windows.py                    # List all open windows"
      },
      {
        "title": "Read Window Text",
        "body": "py read_window.py \"Notepad\"           # Read all text from Notepad\npy read_window.py \"Visual Studio\"     # Read text from VS Code\npy read_window.py \"Chrome\"            # Read text from browser\n\nUses Windows UI Automation to extract actual text (not OCR). Much faster and more accurate than screenshots!"
      },
      {
        "title": "Read UI Elements (NEW!)",
        "body": "py read_ui_elements.py \"Chrome\"               # All interactive elements\npy read_ui_elements.py \"Chrome\" --buttons-only  # Just buttons\npy read_ui_elements.py \"Chrome\" --links-only    # Just links\npy read_ui_elements.py \"Chrome\" --json          # JSON output\n\nReturns buttons, links, tabs, checkboxes, dropdowns with coordinates for clicking."
      },
      {
        "title": "Read Webpage Content (NEW!)",
        "body": "py read_webpage.py                     # Read active browser\npy read_webpage.py \"Chrome\"            # Target Chrome specifically\npy read_webpage.py \"Chrome\" --buttons  # Include buttons\npy read_webpage.py \"Chrome\" --links    # Include links with coords\npy read_webpage.py \"Chrome\" --full     # All elements (inputs, images)\npy read_webpage.py \"Chrome\" --json     # JSON output\n\nEnhanced browser content extraction with headings, text, buttons, and links."
      },
      {
        "title": "Handle Dialogs (NEW!)",
        "body": "# List all open dialogs\npy handle_dialog.py list\n\n# Read current dialog content\npy handle_dialog.py read\npy handle_dialog.py read --json\n\n# Click button in dialog\npy handle_dialog.py click \"OK\"\npy handle_dialog.py click \"Save\"\npy handle_dialog.py click \"Yes\"\n\n# Type into dialog text field\npy handle_dialog.py type \"myfile.txt\"\npy handle_dialog.py type \"C:\\path\\to\\file\" --field 0\n\n# Dismiss dialog (auto-finds OK/Close/Cancel)\npy handle_dialog.py dismiss\n\n# Wait for dialog to appear\npy handle_dialog.py wait --timeout 10\npy handle_dialog.py wait \"Save As\" --timeout 5\n\nHandles Save/Open dialogs, message boxes, alerts, confirmations, etc."
      },
      {
        "title": "Click Element by Name (NEW!)",
        "body": "py click_element.py \"Save\"                    # Click \"Save\" anywhere\npy click_element.py \"OK\" --window \"Notepad\"   # In specific window\npy click_element.py \"Submit\" --type Button    # Only buttons\npy click_element.py \"File\" --type MenuItem    # Menu items\npy click_element.py --list                    # List clickable elements\npy click_element.py --list --window \"Chrome\"  # List in specific window\n\nClick buttons, links, menu items by name without needing coordinates."
      },
      {
        "title": "Read Screen Region (OCR - Optional)",
        "body": "py read_region.py 100 100 500 300     # Read text from coordinates\n\nNote: Requires Tesseract OCR installation. Use read_window.py instead for better results."
      },
      {
        "title": "Workflow Pattern",
        "body": "Read window - Extract text from specific window (fast, accurate)\nRead UI elements - Get buttons, links with coordinates\nScreenshot (if needed) - See visual layout\nAct - Click element by name or coordinates\nHandle dialogs - Interact with popups/save dialogs\nRead window - Verify changes"
      },
      {
        "title": "Screen Coordinates",
        "body": "Origin (0, 0) is top-left corner\nYour screen: 2560x1440 (check with screenshot)\nUse coordinates from screenshot analysis"
      },
      {
        "title": "Open Notepad and type",
        "body": "# Press Windows key\npy key_press.py \"win\"\n\n# Type \"notepad\"\npy type_text.py \"notepad\"\n\n# Press Enter\npy key_press.py \"enter\"\n\n# Wait a moment, then type\npy type_text.py \"Hello from AI!\"\n\n# Save\npy key_press.py \"ctrl+s\""
      },
      {
        "title": "Click in VS Code",
        "body": "# Read current VS Code content\npy read_window.py \"Visual Studio Code\"\n\n# Click at specific location (e.g., file explorer)\npy click.py 50 100\n\n# Type filename\npy type_text.py \"test.js\"\n\n# Press Enter\npy key_press.py \"enter\"\n\n# Verify new file opened\npy read_window.py \"Visual Studio Code\""
      },
      {
        "title": "Monitor Notepad changes",
        "body": "# Read current content\npy read_window.py \"Notepad\"\n\n# User types something...\n\n# Read updated content (no screenshot needed!)\npy read_window.py \"Notepad\""
      },
      {
        "title": "Text Reading Methods",
        "body": "Method 1: Windows UI Automation (BEST)\n\nUse read_window.py for any window\nUse read_ui_elements.py for buttons/links with coordinates\nUse read_webpage.py for browser content with structure\nGets actual text data (not image-based)\n\nMethod 2: Click by Name (NEW)\n\nUse click_element.py to click buttons/links by name\nNo coordinates needed - finds elements automatically\nWorks across all windows or target specific window\n\nMethod 3: Dialog Handling (NEW)\n\nUse handle_dialog.py for popups, save dialogs, alerts\nRead dialog content, click buttons, type text\nAuto-dismiss with common buttons (OK, Cancel, etc.)\n\nMethod 4: Screenshot + Vision (Fallback)\n\nTake full screenshot\nAI reads text visually\nSlower but works for any content\n\nMethod 5: OCR (Optional)\n\nUse read_region.py with Tesseract\nRequires additional installation\nGood for images/PDFs with text"
      },
      {
        "title": "Safety Features",
        "body": "pyautogui.FAILSAFE = True (move mouse to top-left to abort)\nSmall delays between actions\nSmooth mouse movements (not instant jumps)"
      },
      {
        "title": "Requirements",
        "body": "Python 3.11+\npyautogui (installed ✅)\npillow (installed ✅)"
      },
      {
        "title": "Tips",
        "body": "Always screenshot first to see current state\nCoordinates are absolute (not relative to windows)\nWait briefly after clicks for UI to update\nUse ctrl+z friendly actions when possible\n\nStatus: ✅ READY FOR USE (v2.0 - Dialog & UI Elements)\nCreated: 2026-02-01\nUpdated: 2026-02-02"
      }
    ],
    "body": "Windows Control Skill\n\nFull desktop automation for Windows. Control mouse, keyboard, and screen like a human user.\n\nQuick Start\n\nAll scripts are in skills/windows-control/scripts/\n\nScreenshot\npy screenshot.py > output.b64\n\n\nReturns base64 PNG of entire screen.\n\nClick\npy click.py 500 300              # Left click at (500, 300)\npy click.py 500 300 right        # Right click\npy click.py 500 300 left 2       # Double click\n\nType Text\npy type_text.py \"Hello World\"\n\n\nTypes text at current cursor position (10ms between keys).\n\nPress Keys\npy key_press.py \"enter\"\npy key_press.py \"ctrl+s\"\npy key_press.py \"alt+tab\"\npy key_press.py \"ctrl+shift+esc\"\n\nMove Mouse\npy mouse_move.py 500 300\n\n\nMoves mouse to coordinates (smooth 0.2s animation).\n\nScroll\npy scroll.py up 5      # Scroll up 5 notches\npy scroll.py down 10   # Scroll down 10 notches\n\nWindow Management (NEW!)\npy focus_window.py \"Chrome\"           # Bring window to front\npy minimize_window.py \"Notepad\"       # Minimize window\npy maximize_window.py \"VS Code\"       # Maximize window\npy close_window.py \"Calculator\"       # Close window\npy get_active_window.py               # Get title of active window\n\nAdvanced Actions (NEW!)\n# Click by text (No coordinates needed!)\npy click_text.py \"Save\"               # Click \"Save\" button anywhere\npy click_text.py \"Submit\" \"Chrome\"    # Click \"Submit\" in Chrome only\n\n# Drag and Drop\npy drag.py 100 100 500 300            # Drag from (100,100) to (500,300)\n\n# Robust Automation (Wait/Find)\npy wait_for_text.py \"Ready\" \"App\" 30  # Wait up to 30s for text\npy wait_for_window.py \"Notepad\" 10    # Wait for window to appear\npy find_text.py \"Login\" \"Chrome\"      # Get coordinates of text\npy list_windows.py                    # List all open windows\n\nRead Window Text\npy read_window.py \"Notepad\"           # Read all text from Notepad\npy read_window.py \"Visual Studio\"     # Read text from VS Code\npy read_window.py \"Chrome\"            # Read text from browser\n\n\nUses Windows UI Automation to extract actual text (not OCR). Much faster and more accurate than screenshots!\n\nRead UI Elements (NEW!)\npy read_ui_elements.py \"Chrome\"               # All interactive elements\npy read_ui_elements.py \"Chrome\" --buttons-only  # Just buttons\npy read_ui_elements.py \"Chrome\" --links-only    # Just links\npy read_ui_elements.py \"Chrome\" --json          # JSON output\n\n\nReturns buttons, links, tabs, checkboxes, dropdowns with coordinates for clicking.\n\nRead Webpage Content (NEW!)\npy read_webpage.py                     # Read active browser\npy read_webpage.py \"Chrome\"            # Target Chrome specifically\npy read_webpage.py \"Chrome\" --buttons  # Include buttons\npy read_webpage.py \"Chrome\" --links    # Include links with coords\npy read_webpage.py \"Chrome\" --full     # All elements (inputs, images)\npy read_webpage.py \"Chrome\" --json     # JSON output\n\n\nEnhanced browser content extraction with headings, text, buttons, and links.\n\nHandle Dialogs (NEW!)\n# List all open dialogs\npy handle_dialog.py list\n\n# Read current dialog content\npy handle_dialog.py read\npy handle_dialog.py read --json\n\n# Click button in dialog\npy handle_dialog.py click \"OK\"\npy handle_dialog.py click \"Save\"\npy handle_dialog.py click \"Yes\"\n\n# Type into dialog text field\npy handle_dialog.py type \"myfile.txt\"\npy handle_dialog.py type \"C:\\path\\to\\file\" --field 0\n\n# Dismiss dialog (auto-finds OK/Close/Cancel)\npy handle_dialog.py dismiss\n\n# Wait for dialog to appear\npy handle_dialog.py wait --timeout 10\npy handle_dialog.py wait \"Save As\" --timeout 5\n\n\nHandles Save/Open dialogs, message boxes, alerts, confirmations, etc.\n\nClick Element by Name (NEW!)\npy click_element.py \"Save\"                    # Click \"Save\" anywhere\npy click_element.py \"OK\" --window \"Notepad\"   # In specific window\npy click_element.py \"Submit\" --type Button    # Only buttons\npy click_element.py \"File\" --type MenuItem    # Menu items\npy click_element.py --list                    # List clickable elements\npy click_element.py --list --window \"Chrome\"  # List in specific window\n\n\nClick buttons, links, menu items by name without needing coordinates.\n\nRead Screen Region (OCR - Optional)\npy read_region.py 100 100 500 300     # Read text from coordinates\n\n\nNote: Requires Tesseract OCR installation. Use read_window.py instead for better results.\n\nWorkflow Pattern\nRead window - Extract text from specific window (fast, accurate)\nRead UI elements - Get buttons, links with coordinates\nScreenshot (if needed) - See visual layout\nAct - Click element by name or coordinates\nHandle dialogs - Interact with popups/save dialogs\nRead window - Verify changes\nScreen Coordinates\nOrigin (0, 0) is top-left corner\nYour screen: 2560x1440 (check with screenshot)\nUse coordinates from screenshot analysis\nExamples\nOpen Notepad and type\n# Press Windows key\npy key_press.py \"win\"\n\n# Type \"notepad\"\npy type_text.py \"notepad\"\n\n# Press Enter\npy key_press.py \"enter\"\n\n# Wait a moment, then type\npy type_text.py \"Hello from AI!\"\n\n# Save\npy key_press.py \"ctrl+s\"\n\nClick in VS Code\n# Read current VS Code content\npy read_window.py \"Visual Studio Code\"\n\n# Click at specific location (e.g., file explorer)\npy click.py 50 100\n\n# Type filename\npy type_text.py \"test.js\"\n\n# Press Enter\npy key_press.py \"enter\"\n\n# Verify new file opened\npy read_window.py \"Visual Studio Code\"\n\nMonitor Notepad changes\n# Read current content\npy read_window.py \"Notepad\"\n\n# User types something...\n\n# Read updated content (no screenshot needed!)\npy read_window.py \"Notepad\"\n\nText Reading Methods\n\nMethod 1: Windows UI Automation (BEST)\n\nUse read_window.py for any window\nUse read_ui_elements.py for buttons/links with coordinates\nUse read_webpage.py for browser content with structure\nGets actual text data (not image-based)\n\nMethod 2: Click by Name (NEW)\n\nUse click_element.py to click buttons/links by name\nNo coordinates needed - finds elements automatically\nWorks across all windows or target specific window\n\nMethod 3: Dialog Handling (NEW)\n\nUse handle_dialog.py for popups, save dialogs, alerts\nRead dialog content, click buttons, type text\nAuto-dismiss with common buttons (OK, Cancel, etc.)\n\nMethod 4: Screenshot + Vision (Fallback)\n\nTake full screenshot\nAI reads text visually\nSlower but works for any content\n\nMethod 5: OCR (Optional)\n\nUse read_region.py with Tesseract\nRequires additional installation\nGood for images/PDFs with text\nSafety Features\npyautogui.FAILSAFE = True (move mouse to top-left to abort)\nSmall delays between actions\nSmooth mouse movements (not instant jumps)\nRequirements\nPython 3.11+\npyautogui (installed ✅)\npillow (installed ✅)\nTips\nAlways screenshot first to see current state\nCoordinates are absolute (not relative to windows)\nWait briefly after clicks for UI to update\nUse ctrl+z friendly actions when possible\n\nStatus: ✅ READY FOR USE (v2.0 - Dialog & UI Elements) Created: 2026-02-01 Updated: 2026-02-02"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/Spliff7777/windows-control",
    "publisherUrl": "https://clawhub.ai/Spliff7777/windows-control",
    "owner": "Spliff7777",
    "version": "1.0.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/windows-control",
    "downloadUrl": "https://openagent3.xyz/downloads/windows-control",
    "agentUrl": "https://openagent3.xyz/skills/windows-control/agent",
    "manifestUrl": "https://openagent3.xyz/skills/windows-control/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/windows-control/agent.md"
  }
}