{
  "schemaVersion": "1.0",
  "item": {
    "slug": "llmfit",
    "name": "llmfit",
    "source": "tencent",
    "type": "skill",
    "category": "AI 智能",
    "sourceUrl": "https://clawhub.ai/AlexsJones/llmfit",
    "canonicalUrl": "https://clawhub.ai/AlexsJones/llmfit",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/llmfit",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=llmfit",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/llmfit"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/llmfit",
    "agentPageUrl": "https://openagent3.xyz/skills/llmfit/agent",
    "manifestUrl": "https://openagent3.xyz/skills/llmfit/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/llmfit/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "llmfit-advisor",
        "body": "Hardware-aware local LLM advisor. Detects your system specs (RAM, CPU, GPU/VRAM) and recommends models that actually fit, with optimal quantization and speed estimates."
      },
      {
        "title": "When to use (trigger phrases)",
        "body": "Use this skill immediately when the user asks any of:\n\n\"what local models can I run?\"\n\"which LLMs fit my hardware?\"\n\"recommend a local model\"\n\"what's the best model for my GPU?\"\n\"can I run Llama 70B locally?\"\n\"configure local models\"\n\"set up Ollama models\"\n\"what models fit my VRAM?\"\n\"help me pick a local model for coding\"\n\nAlso use this skill when:\n\nThe user wants to configure models.providers.ollama or models.providers.lmstudio\nThe user mentions running models locally and you need to know what fits\nA model recommendation is needed and the user has local inference capability (Ollama, vLLM, LM Studio)"
      },
      {
        "title": "Detect hardware",
        "body": "llmfit --json system\n\nReturns JSON with CPU, RAM, GPU name, VRAM, multi-GPU info, and whether memory is unified (Apple Silicon)."
      },
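      {
        "title": "Example: reading the system JSON (illustrative)",
        "body": "A quick way to pull individual fields out of the system report, assuming the jq CLI is installed (jq is not part of llmfit; field names follow the System JSON section below):\n\n# how much VRAM did llmfit detect?\nllmfit --json system | jq '.system.gpu_vram_gb'\n\n# one-line hardware summary\nllmfit --json system | jq -r '.system | \"\\(.gpu_name): \\(.gpu_vram_gb) GB VRAM, \\(.total_ram_gb) GB RAM\"'"
      },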
      {
        "title": "Get top recommendations",
        "body": "llmfit recommend --json --limit 5\n\nReturns the top 5 models ranked by a composite score (quality, speed, fit, context) with optimal quantization for the detected hardware."
      },
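      {
        "title": "Example: extracting the top pick (illustrative)",
        "body": "A minimal sketch for pulling the highest-scoring model out of the recommendation output, assuming the jq CLI is installed and relying on the models array documented under Recommendation JSON below:\n\n# name, fit level, and suggested quantization of the top-ranked model\nllmfit recommend --json --limit 5 | jq -r '.models[0] | \"\\(.name) (\\(.fit_level), \\(.best_quant))\"'"
      },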
      {
        "title": "Filter by use case",
        "body": "llmfit recommend --json --use-case coding --limit 3\nllmfit recommend --json --use-case reasoning --limit 3\nllmfit recommend --json --use-case chat --limit 3\n\nValid use cases: general, coding, reasoning, chat, multimodal, embedding."
      },
      {
        "title": "Filter by minimum fit level",
        "body": "llmfit recommend --json --min-fit good --limit 10\n\nValid fit levels (best to worst): perfect, good, marginal."
      },
      {
        "title": "System JSON",
        "body": "{\n  \"system\": {\n    \"cpu_name\": \"Apple M2 Max\",\n    \"cpu_cores\": 12,\n    \"total_ram_gb\": 32.0,\n    \"available_ram_gb\": 24.5,\n    \"has_gpu\": true,\n    \"gpu_name\": \"Apple M2 Max\",\n    \"gpu_vram_gb\": 32.0,\n    \"gpu_count\": 1,\n    \"backend\": \"Metal\",\n    \"unified_memory\": true\n  }\n}"
      },
      {
        "title": "Recommendation JSON",
        "body": "Each model in the models array includes:\n\nFieldMeaningnameHuggingFace model ID (e.g. meta-llama/Llama-3.1-8B-Instruct)providerModel provider (Meta, Alibaba, Google, etc.)params_bParameter count in billionsscoreComposite score 0–100 (higher is better)score_componentsBreakdown: quality, speed, fit, context (each 0–100)fit_levelPerfect, Good, Marginal, or TooTightrun_modeGPU, CPU+GPU Offload, or CPU Onlybest_quantOptimal quantization for the hardware (e.g. Q5_K_M, Q4_K_M)estimated_tpsEstimated tokens per secondmemory_required_gbVRAM/RAM needed at this quantizationmemory_available_gbAvailable VRAM/RAM detectedutilization_pctHow much of available memory the model usesuse_caseWhat the model is designed forcontext_lengthMaximum context window"
      },
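      {
        "title": "Example entry (illustrative values)",
        "body": "An entry shaped by the field table above; the values below are invented for illustration and are not real llmfit output:\n\n{\n  \"name\": \"Qwen/Qwen2.5-Coder-7B-Instruct\",\n  \"provider\": \"Alibaba\",\n  \"params_b\": 7,\n  \"score\": 87,\n  \"score_components\": { \"quality\": 90, \"speed\": 85, \"fit\": 95, \"context\": 75 },\n  \"fit_level\": \"Perfect\",\n  \"run_mode\": \"GPU\",\n  \"best_quant\": \"Q5_K_M\",\n  \"estimated_tps\": 45,\n  \"memory_required_gb\": 6.1,\n  \"memory_available_gb\": 32.0,\n  \"utilization_pct\": 19,\n  \"use_case\": \"coding\",\n  \"context_length\": 32768\n}"
      },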
      {
        "title": "Fit levels explained",
        "body": "Perfect: Model fits comfortably with room to spare. Ideal choice.\nGood: Model fits but uses most available memory. Will work well.\nMarginal: Model barely fits. May work but expect slower performance or reduced context.\nTooTight: Model does not fit. Do not recommend."
      },
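      {
        "title": "Example: filtering out TooTight models (illustrative)",
        "body": "The \"do not recommend TooTight\" rule can be enforced mechanically before presenting results; a sketch assuming the jq CLI:\n\n# keep only models that actually fit\nllmfit recommend --json --limit 10 | jq '.models | map(select(.fit_level != \"TooTight\"))'\n\nThe --min-fit flag described above is the built-in way to apply the same kind of cutoff."
      },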
      {
        "title": "Run modes explained",
        "body": "GPU: Full GPU inference. Fastest. Model weights loaded entirely into VRAM.\nCPU+GPU Offload: Some layers on GPU, rest in system RAM. Slower than pure GPU.\nCPU Only: All inference on CPU using system RAM. Slowest but works without GPU."
      },
      {
        "title": "Configuring OpenClaw with results",
        "body": "After getting recommendations, configure the user's local model provider."
      },
      {
        "title": "For Ollama",
        "body": "Map the HuggingFace model name to its Ollama tag. Common mappings:\n\nllmfit nameOllama tagmeta-llama/Llama-3.1-8B-Instructllama3.1:8bmeta-llama/Llama-3.3-70B-Instructllama3.3:70bQwen/Qwen2.5-Coder-7B-Instructqwen2.5-coder:7bQwen/Qwen2.5-72B-Instructqwen2.5:72bdeepseek-ai/DeepSeek-Coder-V2-Lite-Instructdeepseek-coder-v2:16bdeepseek-ai/DeepSeek-R1-Distill-Qwen-32Bdeepseek-r1:32bgoogle/gemma-2-9b-itgemma2:9bmistralai/Mistral-7B-Instruct-v0.3mistral:7bmicrosoft/Phi-3-mini-4k-instructphi3:minimicrosoft/Phi-4-mini-instructphi4-mini\n\nThen update openclaw.json:\n\n{\n  \"models\": {\n    \"providers\": {\n      \"ollama\": {\n        \"models\": [\"ollama/<ollama-tag>\"]\n      }\n    }\n  }\n}\n\nAnd optionally set as default:\n\n{\n  \"agents\": {\n    \"defaults\": {\n      \"model\": {\n        \"primary\": \"ollama/<ollama-tag>\"\n      }\n    }\n  }\n}"
      },
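      {
        "title": "Worked Ollama example (illustrative)",
        "body": "A worked instance of the mapping above, picking Qwen/Qwen2.5-Coder-7B-Instruct as the chosen model; the tag comes from the mapping table and the config shape is the one shown above:\n\nollama pull qwen2.5-coder:7b\n\nThen in openclaw.json:\n\n{\n  \"models\": {\n    \"providers\": {\n      \"ollama\": {\n        \"models\": [\"ollama/qwen2.5-coder:7b\"]\n      }\n    }\n  }\n}"
      },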
      {
        "title": "For vLLM / LM Studio",
        "body": "Use the HuggingFace model name directly as the model identifier with the appropriate provider prefix (vllm/ or lmstudio/)."
      },
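      {
        "title": "Example identifiers (illustrative)",
        "body": "For instance, the Llama 3.1 model from the mapping table would be referenced as one of the following, per the prefix rule above; whether a given backend can actually serve the model still depends on the local setup:\n\nvllm/meta-llama/Llama-3.1-8B-Instruct\nlmstudio/meta-llama/Llama-3.1-8B-Instruct"
      },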
      {
        "title": "Workflow example",
        "body": "When a user asks \"what local models can I run?\":\n\nRun llmfit --json system to show hardware summary\nRun llmfit recommend --json --limit 5 to get top picks\nPresent the recommendations with scores and fit levels\nIf the user wants to configure one, map it to the appropriate Ollama/vLLM/LM Studio tag\nOffer to update openclaw.json with the chosen model\n\nWhen a user asks for a specific use case like \"recommend a coding model\":\n\nRun llmfit recommend --json --use-case coding --limit 3\nPresent the coding-specific recommendations\nOffer to pull via Ollama and configure"
      },
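      {
        "title": "Example: workflow as a shell sequence (illustrative)",
        "body": "A compact sketch of the coding-model workflow above, assuming the jq CLI; the tag pulled in the last step is the mapping-table example, not something llmfit emits:\n\n# 1. hardware summary\nllmfit --json system | jq '.system'\n\n# 2. top coding picks as name / fit / quant\nllmfit recommend --json --use-case coding --limit 3 | jq -r '.models[] | \"\\(.name)\\t\\(.fit_level)\\t\\(.best_quant)\"'\n\n# 3. pull the chosen tag (see the For Ollama section)\nollama pull qwen2.5-coder:7b"
      },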
      {
        "title": "Notes",
        "body": "llmfit detects NVIDIA GPUs (via nvidia-smi), AMD GPUs (via rocm-smi), and Apple Silicon (unified memory).\nMulti-GPU setups aggregate VRAM across cards automatically.\nThe best_quant field tells you the optimal quantization — higher quant (Q6_K, Q8_0) means better quality if VRAM allows.\nSpeed estimates (estimated_tps) are approximate and vary by hardware and quantization.\nModels with fit_level: \"TooTight\" should never be recommended to users."
      }
    ],
    "body": "llmfit-advisor\n\nHardware-aware local LLM advisor. Detects your system specs (RAM, CPU, GPU/VRAM) and recommends models that actually fit, with optimal quantization and speed estimates.\n\nWhen to use (trigger phrases)\n\nUse this skill immediately when the user asks any of:\n\n\"what local models can I run?\"\n\"which LLMs fit my hardware?\"\n\"recommend a local model\"\n\"what's the best model for my GPU?\"\n\"can I run Llama 70B locally?\"\n\"configure local models\"\n\"set up Ollama models\"\n\"what models fit my VRAM?\"\n\"help me pick a local model for coding\"\n\nAlso use this skill when:\n\nThe user wants to configure models.providers.ollama or models.providers.lmstudio\nThe user mentions running models locally and you need to know what fits\nA model recommendation is needed and the user has local inference capability (Ollama, vLLM, LM Studio)\nQuick start\nDetect hardware\nllmfit --json system\n\n\nReturns JSON with CPU, RAM, GPU name, VRAM, multi-GPU info, and whether memory is unified (Apple Silicon).\n\nGet top recommendations\nllmfit recommend --json --limit 5\n\n\nReturns the top 5 models ranked by a composite score (quality, speed, fit, context) with optimal quantization for the detected hardware.\n\nFilter by use case\nllmfit recommend --json --use-case coding --limit 3\nllmfit recommend --json --use-case reasoning --limit 3\nllmfit recommend --json --use-case chat --limit 3\n\n\nValid use cases: general, coding, reasoning, chat, multimodal, embedding.\n\nFilter by minimum fit level\nllmfit recommend --json --min-fit good --limit 10\n\n\nValid fit levels (best to worst): perfect, good, marginal.\n\nUnderstanding the output\nSystem JSON\n{\n  \"system\": {\n    \"cpu_name\": \"Apple M2 Max\",\n    \"cpu_cores\": 12,\n    \"total_ram_gb\": 32.0,\n    \"available_ram_gb\": 24.5,\n    \"has_gpu\": true,\n    \"gpu_name\": \"Apple M2 Max\",\n    \"gpu_vram_gb\": 32.0,\n    \"gpu_count\": 1,\n    \"backend\": \"Metal\",\n    \"unified_memory\": true\n  }\n}\n\nRecommendation JSON\n\nEach model in the models array includes:\n\nField\tMeaning\nname\tHuggingFace model ID (e.g. meta-llama/Llama-3.1-8B-Instruct)\nprovider\tModel provider (Meta, Alibaba, Google, etc.)\nparams_b\tParameter count in billions\nscore\tComposite score 0–100 (higher is better)\nscore_components\tBreakdown: quality, speed, fit, context (each 0–100)\nfit_level\tPerfect, Good, Marginal, or TooTight\nrun_mode\tGPU, CPU+GPU Offload, or CPU Only\nbest_quant\tOptimal quantization for the hardware (e.g. Q5_K_M, Q4_K_M)\nestimated_tps\tEstimated tokens per second\nmemory_required_gb\tVRAM/RAM needed at this quantization\nmemory_available_gb\tAvailable VRAM/RAM detected\nutilization_pct\tHow much of available memory the model uses\nuse_case\tWhat the model is designed for\ncontext_length\tMaximum context window\nFit levels explained\nPerfect: Model fits comfortably with room to spare. Ideal choice.\nGood: Model fits but uses most available memory. Will work well.\nMarginal: Model barely fits. May work but expect slower performance or reduced context.\nTooTight: Model does not fit. Do not recommend.\nRun modes explained\nGPU: Full GPU inference. Fastest. Model weights loaded entirely into VRAM.\nCPU+GPU Offload: Some layers on GPU, rest in system RAM. Slower than pure GPU.\nCPU Only: All inference on CPU using system RAM. 
Slowest but works without GPU.\nConfiguring OpenClaw with results\n\nAfter getting recommendations, configure the user's local model provider.\n\nFor Ollama\n\nMap the HuggingFace model name to its Ollama tag. Common mappings:\n\nllmfit name\tOllama tag\nmeta-llama/Llama-3.1-8B-Instruct\tllama3.1:8b\nmeta-llama/Llama-3.3-70B-Instruct\tllama3.3:70b\nQwen/Qwen2.5-Coder-7B-Instruct\tqwen2.5-coder:7b\nQwen/Qwen2.5-72B-Instruct\tqwen2.5:72b\ndeepseek-ai/DeepSeek-Coder-V2-Lite-Instruct\tdeepseek-coder-v2:16b\ndeepseek-ai/DeepSeek-R1-Distill-Qwen-32B\tdeepseek-r1:32b\ngoogle/gemma-2-9b-it\tgemma2:9b\nmistralai/Mistral-7B-Instruct-v0.3\tmistral:7b\nmicrosoft/Phi-3-mini-4k-instruct\tphi3:mini\nmicrosoft/Phi-4-mini-instruct\tphi4-mini\n\nThen update openclaw.json:\n\n{\n  \"models\": {\n    \"providers\": {\n      \"ollama\": {\n        \"models\": [\"ollama/<ollama-tag>\"]\n      }\n    }\n  }\n}\n\n\nAnd optionally set as default:\n\n{\n  \"agents\": {\n    \"defaults\": {\n      \"model\": {\n        \"primary\": \"ollama/<ollama-tag>\"\n      }\n    }\n  }\n}\n\nFor vLLM / LM Studio\n\nUse the HuggingFace model name directly as the model identifier with the appropriate provider prefix (vllm/ or lmstudio/).\n\nWorkflow example\n\nWhen a user asks \"what local models can I run?\":\n\nRun llmfit --json system to show hardware summary\nRun llmfit recommend --json --limit 5 to get top picks\nPresent the recommendations with scores and fit levels\nIf the user wants to configure one, map it to the appropriate Ollama/vLLM/LM Studio tag\nOffer to update openclaw.json with the chosen model\n\nWhen a user asks for a specific use case like \"recommend a coding model\":\n\nRun llmfit recommend --json --use-case coding --limit 3\nPresent the coding-specific recommendations\nOffer to pull via Ollama and configure\nNotes\nllmfit detects NVIDIA GPUs (via nvidia-smi), AMD GPUs (via rocm-smi), and Apple Silicon (unified memory).\nMulti-GPU setups aggregate VRAM across cards automatically.\nThe best_quant field tells you the optimal quantization — higher quant (Q6_K, Q8_0) means better quality if VRAM allows.\nSpeed estimates (estimated_tps) are approximate and vary by hardware and quantization.\nModels with fit_level: \"TooTight\" should never be recommended to users."
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/AlexsJones/llmfit",
    "publisherUrl": "https://clawhub.ai/AlexsJones/llmfit",
    "owner": "AlexsJones",
    "version": "0.2.2",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/llmfit",
    "downloadUrl": "https://openagent3.xyz/downloads/llmfit",
    "agentUrl": "https://openagent3.xyz/skills/llmfit/agent",
    "manifestUrl": "https://openagent3.xyz/skills/llmfit/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/llmfit/agent.md"
  }
}