{
  "schemaVersion": "1.0",
  "item": {
    "slug": "senior-ml-engineer",
    "name": "Senior Ml Engineer",
    "source": "tencent",
    "type": "skill",
    "category": "AI 智能",
    "sourceUrl": "https://clawhub.ai/alirezarezvani/senior-ml-engineer",
    "canonicalUrl": "https://clawhub.ai/alirezarezvani/senior-ml-engineer",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/senior-ml-engineer",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=senior-ml-engineer",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md",
      "references/llm_integration_guide.md",
      "references/mlops_production_patterns.md",
      "references/rag_system_architecture.md",
      "scripts/ml_monitoring_suite.py",
      "scripts/model_deployment_pipeline.py"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=network",
        "contentDisposition": "attachment; filename=\"network-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/senior-ml-engineer"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/senior-ml-engineer",
    "agentPageUrl": "https://openagent3.xyz/skills/senior-ml-engineer/agent",
    "manifestUrl": "https://openagent3.xyz/skills/senior-ml-engineer/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/senior-ml-engineer/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Senior ML Engineer",
        "body": "Production ML engineering patterns for model deployment, MLOps infrastructure, and LLM integration."
      },
      {
        "title": "Table of Contents",
        "body": "Model Deployment Workflow\nMLOps Pipeline Setup\nLLM Integration Workflow\nRAG System Implementation\nModel Monitoring\nReference Documentation\nTools"
      },
      {
        "title": "Model Deployment Workflow",
        "body": "Deploy a trained model to production with monitoring:\n\nExport model to standardized format (ONNX, TorchScript, SavedModel)\nPackage model with dependencies in Docker container\nDeploy to staging environment\nRun integration tests against staging\nDeploy canary (5% traffic) to production\nMonitor latency and error rates for 1 hour\nPromote to full production if metrics pass\nValidation: p95 latency < 100ms, error rate < 0.1%"
      },
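      {
        "title": "ONNX Export Sketch",
        "body": "The export step above can be sketched for PyTorch (a minimal illustration; the model variable and input shape are placeholders for your own model, not part of this package):\n\nimport torch\n\nmodel.eval()\ndummy_input = torch.randn(1, 3, 224, 224)\ntorch.onnx.export(\n    model,\n    dummy_input,\n    \"model.onnx\",\n    input_names=[\"input\"],\n    output_names=[\"output\"],\n    dynamic_axes={\"input\": {0: \"batch\"}},\n)"
      },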
      {
        "title": "Container Template",
        "body": "FROM python:3.11-slim\n\nCOPY requirements.txt .\nRUN pip install --no-cache-dir -r requirements.txt\n\nCOPY model/ /app/model/\nCOPY src/ /app/src/\n\nHEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1\n\nEXPOSE 8080\nCMD [\"uvicorn\", \"src.server:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8080\"]"
      },
      {
        "title": "Serving Options",
        "body": "OptionLatencyThroughputUse CaseFastAPI + UvicornLowMediumREST APIs, small modelsTriton Inference ServerVery LowVery HighGPU inference, batchingTensorFlow ServingLowHighTensorFlow modelsTorchServeLowHighPyTorch modelsRay ServeMediumHighComplex pipelines, multi-model"
      },
      {
        "title": "MLOps Pipeline Setup",
        "body": "Establish automated training and deployment:\n\nConfigure feature store (Feast, Tecton) for training data\nSet up experiment tracking (MLflow, Weights & Biases)\nCreate training pipeline with hyperparameter logging\nRegister model in model registry with version metadata\nConfigure staging deployment triggered by registry events\nSet up A/B testing infrastructure for model comparison\nEnable drift monitoring with alerting\nValidation: New models automatically evaluated against baseline"
      },
      {
        "title": "Feature Store Pattern",
        "body": "from feast import Entity, Feature, FeatureView, FileSource\n\nuser = Entity(name=\"user_id\", value_type=ValueType.INT64)\n\nuser_features = FeatureView(\n    name=\"user_features\",\n    entities=[\"user_id\"],\n    ttl=timedelta(days=1),\n    features=[\n        Feature(name=\"purchase_count_30d\", dtype=ValueType.INT64),\n        Feature(name=\"avg_order_value\", dtype=ValueType.FLOAT),\n    ],\n    online=True,\n    source=FileSource(path=\"data/user_features.parquet\"),\n)"
      },
      {
        "title": "Retraining Triggers",
        "body": "TriggerDetectionActionScheduledCron (weekly/monthly)Full retrainPerformance dropAccuracy < thresholdImmediate retrainData driftPSI > 0.2Evaluate, then retrainNew data volumeX new samplesIncremental update"
      },
      {
        "title": "LLM Integration Workflow",
        "body": "Integrate LLM APIs into production applications:\n\nCreate provider abstraction layer for vendor flexibility\nImplement retry logic with exponential backoff\nConfigure fallback to secondary provider\nSet up token counting and context truncation\nAdd response caching for repeated queries\nImplement cost tracking per request\nAdd structured output validation with Pydantic\nValidation: Response parses correctly, cost within budget"
      },
      {
        "title": "Provider Abstraction",
        "body": "from abc import ABC, abstractmethod\nfrom tenacity import retry, stop_after_attempt, wait_exponential\n\nclass LLMProvider(ABC):\n    @abstractmethod\n    def complete(self, prompt: str, **kwargs) -> str:\n        pass\n\n@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))\ndef call_llm_with_retry(provider: LLMProvider, prompt: str) -> str:\n    return provider.complete(prompt)"
      },
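      {
        "title": "Structured Output Validation Sketch",
        "body": "The Pydantic validation step in the workflow above can be sketched as follows (the Summary model and its fields are hypothetical examples, not part of this package):\n\nfrom pydantic import BaseModel, ValidationError\n\nclass Summary(BaseModel):\n    title: str\n    key_points: list[str]\n\ndef parse_response(raw_json: str) -> Summary | None:\n    try:\n        return Summary.model_validate_json(raw_json)\n    except ValidationError:\n        return None"
      },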
      {
        "title": "Cost Management",
        "body": "ProviderInput CostOutput CostGPT-4$0.03/1K$0.06/1KGPT-3.5$0.0005/1K$0.0015/1KClaude 3 Opus$0.015/1K$0.075/1KClaude 3 Haiku$0.00025/1K$0.00125/1K"
      },
      {
        "title": "RAG System Implementation",
        "body": "Build retrieval-augmented generation pipeline:\n\nChoose vector database (Pinecone, Qdrant, Weaviate)\nSelect embedding model based on quality/cost tradeoff\nImplement document chunking strategy\nCreate ingestion pipeline with metadata extraction\nBuild retrieval with query embedding\nAdd reranking for relevance improvement\nFormat context and send to LLM\nValidation: Response references retrieved context, no hallucinations"
      },
      {
        "title": "Vector Database Selection",
        "body": "DatabaseHostingScaleLatencyBest ForPineconeManagedHighLowProduction, managedQdrantBothHighVery LowPerformance-criticalWeaviateBothHighLowHybrid searchChromaSelf-hostedMediumLowPrototypingpgvectorSelf-hostedMediumMediumExisting Postgres"
      },
      {
        "title": "Chunking Strategies",
        "body": "StrategyChunk SizeOverlapBest ForFixed500-1000 tokens50-100General textSentence3-5 sentences1 sentenceStructured textSemanticVariableBased on meaningResearch papersRecursiveHierarchicalParent-childLong documents"
      },
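      {
        "title": "Fixed Chunking Sketch",
        "body": "The fixed strategy in the table above can be sketched as (word-based splitting stands in for real token counting, which would normally use a tokenizer such as tiktoken):\n\ndef chunk_fixed(words, size=500, overlap=50):\n    # Slide a window of `size` words, advancing by size - overlap each step.\n    step = size - overlap\n    return [words[i:i + size] for i in range(0, max(len(words) - overlap, 1), step)]"
      },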
      {
        "title": "Model Monitoring",
        "body": "Monitor production models for drift and degradation:\n\nSet up latency tracking (p50, p95, p99)\nConfigure error rate alerting\nImplement input data drift detection\nTrack prediction distribution shifts\nLog ground truth when available\nCompare model versions with A/B metrics\nSet up automated retraining triggers\nValidation: Alerts fire before user-visible degradation"
      },
      {
        "title": "Drift Detection",
        "body": "from scipy.stats import ks_2samp\n\ndef detect_drift(reference, current, threshold=0.05):\n    statistic, p_value = ks_2samp(reference, current)\n    return {\n        \"drift_detected\": p_value < threshold,\n        \"ks_statistic\": statistic,\n        \"p_value\": p_value\n    }"
      },
      {
        "title": "Alert Thresholds",
        "body": "MetricWarningCriticalp95 latency> 100ms> 200msError rate> 0.1%> 1%PSI (drift)> 0.1> 0.2Accuracy drop> 2%> 5%"
      },
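      {
        "title": "PSI Calculation Sketch",
        "body": "The PSI thresholds above assume a population stability index computed roughly like this (bin edges come from the reference window; eps guards against empty bins):\n\nimport numpy as np\n\ndef psi(reference, current, bins=10, eps=1e-6):\n    edges = np.histogram_bin_edges(reference, bins=bins)\n    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference) + eps\n    cur_pct = np.histogram(current, bins=edges)[0] / len(current) + eps\n    return float(np.sum((ref_pct - cur_pct) * np.log(ref_pct / cur_pct)))"
      },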
      {
        "title": "MLOps Production Patterns",
        "body": "references/mlops_production_patterns.md contains:\n\nModel deployment pipeline with Kubernetes manifests\nFeature store architecture with Feast examples\nModel monitoring with drift detection code\nA/B testing infrastructure with traffic splitting\nAutomated retraining pipeline with MLflow"
      },
      {
        "title": "LLM Integration Guide",
        "body": "references/llm_integration_guide.md contains:\n\nProvider abstraction layer pattern\nRetry and fallback strategies with tenacity\nPrompt engineering templates (few-shot, CoT)\nToken optimization with tiktoken\nCost calculation and tracking"
      },
      {
        "title": "RAG System Architecture",
        "body": "references/rag_system_architecture.md contains:\n\nRAG pipeline implementation with code\nVector database comparison and integration\nChunking strategies (fixed, semantic, recursive)\nEmbedding model selection guide\nHybrid search and reranking patterns"
      },
      {
        "title": "Model Deployment Pipeline",
        "body": "python scripts/model_deployment_pipeline.py --model model.pkl --target staging\n\nGenerates deployment artifacts: Dockerfile, Kubernetes manifests, health checks."
      },
      {
        "title": "RAG System Builder",
        "body": "python scripts/rag_system_builder.py --config rag_config.yaml --analyze\n\nScaffolds RAG pipeline with vector store integration and retrieval logic."
      },
      {
        "title": "ML Monitoring Suite",
        "body": "python scripts/ml_monitoring_suite.py --config monitoring.yaml --deploy\n\nSets up drift detection, alerting, and performance dashboards."
      },
      {
        "title": "Tech Stack",
        "body": "CategoryToolsML FrameworksPyTorch, TensorFlow, Scikit-learn, XGBoostLLM FrameworksLangChain, LlamaIndex, DSPyMLOpsMLflow, Weights & Biases, KubeflowDataSpark, Airflow, dbt, KafkaDeploymentDocker, Kubernetes, TritonDatabasesPostgreSQL, BigQuery, Pinecone, Redis"
      }
    ],
    "body": "Senior ML Engineer\n\nProduction ML engineering patterns for model deployment, MLOps infrastructure, and LLM integration.\n\nTable of Contents\nModel Deployment Workflow\nMLOps Pipeline Setup\nLLM Integration Workflow\nRAG System Implementation\nModel Monitoring\nReference Documentation\nTools\nModel Deployment Workflow\n\nDeploy a trained model to production with monitoring:\n\nExport model to standardized format (ONNX, TorchScript, SavedModel)\nPackage model with dependencies in Docker container\nDeploy to staging environment\nRun integration tests against staging\nDeploy canary (5% traffic) to production\nMonitor latency and error rates for 1 hour\nPromote to full production if metrics pass\nValidation: p95 latency < 100ms, error rate < 0.1%\nContainer Template\nFROM python:3.11-slim\n\nCOPY requirements.txt .\nRUN pip install --no-cache-dir -r requirements.txt\n\nCOPY model/ /app/model/\nCOPY src/ /app/src/\n\nHEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1\n\nEXPOSE 8080\nCMD [\"uvicorn\", \"src.server:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8080\"]\n\nServing Options\nOption\tLatency\tThroughput\tUse Case\nFastAPI + Uvicorn\tLow\tMedium\tREST APIs, small models\nTriton Inference Server\tVery Low\tVery High\tGPU inference, batching\nTensorFlow Serving\tLow\tHigh\tTensorFlow models\nTorchServe\tLow\tHigh\tPyTorch models\nRay Serve\tMedium\tHigh\tComplex pipelines, multi-model\nMLOps Pipeline Setup\n\nEstablish automated training and deployment:\n\nConfigure feature store (Feast, Tecton) for training data\nSet up experiment tracking (MLflow, Weights & Biases)\nCreate training pipeline with hyperparameter logging\nRegister model in model registry with version metadata\nConfigure staging deployment triggered by registry events\nSet up A/B testing infrastructure for model comparison\nEnable drift monitoring with alerting\nValidation: New models automatically evaluated against baseline\nFeature Store Pattern\nfrom feast import 
Entity, Feature, FeatureView, FileSource\n\nuser = Entity(name=\"user_id\", value_type=ValueType.INT64)\n\nuser_features = FeatureView(\n    name=\"user_features\",\n    entities=[\"user_id\"],\n    ttl=timedelta(days=1),\n    features=[\n        Feature(name=\"purchase_count_30d\", dtype=ValueType.INT64),\n        Feature(name=\"avg_order_value\", dtype=ValueType.FLOAT),\n    ],\n    online=True,\n    source=FileSource(path=\"data/user_features.parquet\"),\n)\n\nRetraining Triggers\nTrigger\tDetection\tAction\nScheduled\tCron (weekly/monthly)\tFull retrain\nPerformance drop\tAccuracy < threshold\tImmediate retrain\nData drift\tPSI > 0.2\tEvaluate, then retrain\nNew data volume\tX new samples\tIncremental update\nLLM Integration Workflow\n\nIntegrate LLM APIs into production applications:\n\nCreate provider abstraction layer for vendor flexibility\nImplement retry logic with exponential backoff\nConfigure fallback to secondary provider\nSet up token counting and context truncation\nAdd response caching for repeated queries\nImplement cost tracking per request\nAdd structured output validation with Pydantic\nValidation: Response parses correctly, cost within budget\nProvider Abstraction\nfrom abc import ABC, abstractmethod\nfrom tenacity import retry, stop_after_attempt, wait_exponential\n\nclass LLMProvider(ABC):\n    @abstractmethod\n    def complete(self, prompt: str, **kwargs) -> str:\n        pass\n\n@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))\ndef call_llm_with_retry(provider: LLMProvider, prompt: str) -> str:\n    return provider.complete(prompt)\n\nCost Management\nProvider\tInput Cost\tOutput Cost\nGPT-4\t$0.03/1K\t$0.06/1K\nGPT-3.5\t$0.0005/1K\t$0.0015/1K\nClaude 3 Opus\t$0.015/1K\t$0.075/1K\nClaude 3 Haiku\t$0.00025/1K\t$0.00125/1K\nRAG System Implementation\n\nBuild retrieval-augmented generation pipeline:\n\nChoose vector database (Pinecone, Qdrant, Weaviate)\nSelect embedding model based on quality/cost tradeoff\nImplement 
document chunking strategy\nCreate ingestion pipeline with metadata extraction\nBuild retrieval with query embedding\nAdd reranking for relevance improvement\nFormat context and send to LLM\nValidation: Response references retrieved context, no hallucinations\nVector Database Selection\nDatabase\tHosting\tScale\tLatency\tBest For\nPinecone\tManaged\tHigh\tLow\tProduction, managed\nQdrant\tBoth\tHigh\tVery Low\tPerformance-critical\nWeaviate\tBoth\tHigh\tLow\tHybrid search\nChroma\tSelf-hosted\tMedium\tLow\tPrototyping\npgvector\tSelf-hosted\tMedium\tMedium\tExisting Postgres\nChunking Strategies\nStrategy\tChunk Size\tOverlap\tBest For\nFixed\t500-1000 tokens\t50-100\tGeneral text\nSentence\t3-5 sentences\t1 sentence\tStructured text\nSemantic\tVariable\tBased on meaning\tResearch papers\nRecursive\tHierarchical\tParent-child\tLong documents\nModel Monitoring\n\nMonitor production models for drift and degradation:\n\nSet up latency tracking (p50, p95, p99)\nConfigure error rate alerting\nImplement input data drift detection\nTrack prediction distribution shifts\nLog ground truth when available\nCompare model versions with A/B metrics\nSet up automated retraining triggers\nValidation: Alerts fire before user-visible degradation\nDrift Detection\nfrom scipy.stats import ks_2samp\n\ndef detect_drift(reference, current, threshold=0.05):\n    statistic, p_value = ks_2samp(reference, current)\n    return {\n        \"drift_detected\": p_value < threshold,\n        \"ks_statistic\": statistic,\n        \"p_value\": p_value\n    }\n\nAlert Thresholds\nMetric\tWarning\tCritical\np95 latency\t> 100ms\t> 200ms\nError rate\t> 0.1%\t> 1%\nPSI (drift)\t> 0.1\t> 0.2\nAccuracy drop\t> 2%\t> 5%\nReference Documentation\nMLOps Production Patterns\n\nreferences/mlops_production_patterns.md contains:\n\nModel deployment pipeline with Kubernetes manifests\nFeature store architecture with Feast examples\nModel monitoring with drift detection code\nA/B testing infrastructure with traffic 
splitting\nAutomated retraining pipeline with MLflow\nLLM Integration Guide\n\nreferences/llm_integration_guide.md contains:\n\nProvider abstraction layer pattern\nRetry and fallback strategies with tenacity\nPrompt engineering templates (few-shot, CoT)\nToken optimization with tiktoken\nCost calculation and tracking\nRAG System Architecture\n\nreferences/rag_system_architecture.md contains:\n\nRAG pipeline implementation with code\nVector database comparison and integration\nChunking strategies (fixed, semantic, recursive)\nEmbedding model selection guide\nHybrid search and reranking patterns\nTools\nModel Deployment Pipeline\npython scripts/model_deployment_pipeline.py --model model.pkl --target staging\n\n\nGenerates deployment artifacts: Dockerfile, Kubernetes manifests, health checks.\n\nRAG System Builder\npython scripts/rag_system_builder.py --config rag_config.yaml --analyze\n\n\nScaffolds RAG pipeline with vector store integration and retrieval logic.\n\nML Monitoring Suite\npython scripts/ml_monitoring_suite.py --config monitoring.yaml --deploy\n\n\nSets up drift detection, alerting, and performance dashboards.\n\nTech Stack\nCategory\tTools\nML Frameworks\tPyTorch, TensorFlow, Scikit-learn, XGBoost\nLLM Frameworks\tLangChain, LlamaIndex, DSPy\nMLOps\tMLflow, Weights & Biases, Kubeflow\nData\tSpark, Airflow, dbt, Kafka\nDeployment\tDocker, Kubernetes, Triton\nDatabases\tPostgreSQL, BigQuery, Pinecone, Redis"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/alirezarezvani/senior-ml-engineer",
    "publisherUrl": "https://clawhub.ai/alirezarezvani/senior-ml-engineer",
    "owner": "alirezarezvani",
    "version": "2.1.1",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/senior-ml-engineer",
    "downloadUrl": "https://openagent3.xyz/downloads/senior-ml-engineer",
    "agentUrl": "https://openagent3.xyz/skills/senior-ml-engineer/agent",
    "manifestUrl": "https://openagent3.xyz/skills/senior-ml-engineer/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/senior-ml-engineer/agent.md"
  }
}