{
  "schemaVersion": "1.0",
  "item": {
    "slug": "time-series-analysis",
    "name": "time-series-analysis",
    "source": "tencent",
    "type": "skill",
    "category": "Data analysis",
    "sourceUrl": "https://clawhub.ai/dubnium0/time-series-analysis",
    "canonicalUrl": "https://clawhub.ai/dubnium0/time-series-analysis",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/time-series-analysis",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=time-series-analysis",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=time-series-analysis",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=time-series-analysis",
        "contentDisposition": "attachment; filename=\"time-series-analysis-1.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/time-series-analysis"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/time-series-analysis",
    "agentPageUrl": "https://openagent3.xyz/skills/time-series-analysis/agent",
    "manifestUrl": "https://openagent3.xyz/skills/time-series-analysis/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/time-series-analysis/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Time Series Data Science - Complete Guide",
        "body": "Expert time series data scientist specializing in forecasting, sequential prediction, and competition-winning strategies. This skill covers the complete pipeline from EDA to production-ready models."
      },
      {
        "title": "Key Lessons from Winning Solutions",
        "body": "Feature Engineering > Model Complexity\n\nFocus on 5-10 most predictive features, not all available\nLag, rolling, and EWM features are often more valuable than the raw data\nInteraction features between top predictors can be game-changers\n\nTime-Based Validation is Critical\n\nNEVER use random splits for time series\nTrain on past, validate on future (e.g., ts_index <= threshold)\nLeakage from future data will destroy real-world performance\n\nWeights Matter in Scoring\n\nIf weights are provided, use them directly in training\nHigh-weight samples disproportionately affect score\nSample weighting in model.fit() is better than custom loss\n\nMulti-Seed Ensemble for Robustness\n\nTrain same model with different random seeds\nAveraging predictions reduces variance\nCommon seeds: 42, 2024, or any fixed set"
      },
      {
        "title": "1. Lag Features",
        "body": "GROUP_COLS = ['entity_id', 'category', 'horizon']\n\nfor lag in [1, 3, 5, 10]:\n    df[f'{col}_lag{lag}'] = df.groupby(GROUP_COLS)[col].shift(lag)"
      },
      {
        "title": "2. Rolling Statistics",
        "body": "for window in [5, 10, 20]:\n    df[f'{col}_roll_mean{window}'] = df.groupby(GROUP_COLS)[col].transform(\n        lambda x: x.rolling(window, min_periods=1).mean()\n    )\n    df[f'{col}_roll_std{window}'] = df.groupby(GROUP_COLS)[col].transform(\n        lambda x: x.rolling(window, min_periods=1).std()\n    )"
      },
      {
        "title": "3. Exponential Weighted Mean (EWM)",
        "body": "for span in [5, 10]:\n    df[f'{col}_ewm{span}'] = df.groupby(GROUP_COLS)[col].transform(\n        lambda x: x.ewm(span=span, adjust=False).mean()\n    )"
      },
      {
        "title": "4. Difference Features",
        "body": "df[f'{col}_diff1'] = df.groupby(GROUP_COLS)[col].diff(1)\ndf[f'{col}_diff_pct'] = df.groupby(GROUP_COLS)[col].pct_change(1)"
      },
      {
        "title": "5. Interaction Features",
        "body": "# Difference between related features\ndf['feat_diff'] = df['feature_a'] - df['feature_b']\n\n# Ratio between features\ndf['feat_ratio'] = df['feature_a'] / (df['feature_b'] + 1e-7)\n\n# Product interactions\ndf['feat_product'] = df['feature_a'] * df['feature_b']"
      },
      {
        "title": "6. Target Encoding (for categories)",
        "body": "# Compute on training data only (ts_index <= threshold)\ntrain_only = df[df.ts_index <= VAL_THRESHOLD]\n\nenc_stats = {\n    'category': train_only.groupby('category')['target'].mean().to_dict(),\n    'global_mean': train_only['target'].mean()\n}\n\n# Apply to all data\ndf['category_enc'] = df['category'].map(enc_stats['category']).fillna(enc_stats['global_mean'])"
      },
      {
        "title": "7. Temporal Signals",
        "body": "# Cyclical encoding for periodicity\ndf['t_cycle'] = np.sin(2 * np.pi * df['ts_index'] / period)\ndf['t_cycle_cos'] = np.cos(2 * np.pi * df['ts_index'] / period)\n\n# Normalized time position\ndf['ts_normalized'] = df['ts_index'] / df['ts_index'].max()\n\n# Time bins\ndf['ts_bin'] = pd.cut(df['ts_index'], bins=10, labels=False)"
      },
      {
        "title": "LightGBM Configuration (Competition-Tested)",
        "body": "lgb_cfg = {\n    'objective': 'regression',\n    'metric': 'rmse',\n    'learning_rate': 0.015,\n    'n_estimators': 4000,\n    'num_leaves': 80,\n    'min_child_samples': 200,\n    'feature_fraction': 0.6,\n    'bagging_fraction': 0.7,\n    'bagging_freq': 5,\n    'lambda_l1': 0.1,\n    'lambda_l2': 10.0,\n    'verbosity': -1\n}"
      },
      {
        "title": "Multi-Seed Ensemble Training",
        "body": "val_pred = np.zeros(len(y_val))\ntest_pred = np.zeros(len(X_test))\n\nfor seed in [42, 2024]:\n    model = lgb.LGBMRegressor(**lgb_cfg, random_state=seed)\n    \n    model.fit(\n        X_train, y_train,\n        sample_weight=w_train,  # Use weights directly\n        eval_set=[(X_val, y_val)],\n        eval_sample_weight=[w_val],\n        callbacks=[lgb.early_stopping(200, verbose=False)]\n    )\n    \n    val_pred += model.predict(X_val) / 2\n    test_pred += model.predict(X_test) / 2"
      },
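      {
        "title": "Multi-Seed Ensemble: Generalized Averaging (sketch)",
        "body": "# Sketch only: generalizes the hardcoded '/ 2' above so the divisor\n# always matches the seed list; the seed values here are illustrative.\nseeds = [42, 2024, 7]\nval_pred = np.zeros(len(y_val))\ntest_pred = np.zeros(len(X_test))\n\nfor seed in seeds:\n    model = lgb.LGBMRegressor(**lgb_cfg, random_state=seed)\n    model.fit(X_train, y_train, sample_weight=w_train)\n    val_pred += model.predict(X_val)\n    test_pred += model.predict(X_test)\n\nval_pred /= len(seeds)\ntest_pred /= len(seeds)"
      },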
      {
        "title": "Horizon-Specific Models",
        "body": "# Train separate model per forecast horizon\nfor horizon in [1, 3, 10, 25]:\n    train_h = df[df.horizon == horizon]\n    test_h = test_df[test_df.horizon == horizon]\n    \n    # Build features, train model\n    model = train_model(train_h, test_h)\n    predictions[horizon] = model.predict(test_h)"
      },
      {
        "title": "Time-Based Split",
        "body": "VAL_THRESHOLD = int(df['ts_index'].max() * 0.85)\n\ntrain_mask = df['ts_index'] <= VAL_THRESHOLD\nval_mask = df['ts_index'] > VAL_THRESHOLD\n\nX_train = df.loc[train_mask, feature_cols]\nX_val = df.loc[val_mask, feature_cols]"
      },
      {
        "title": "Expanding Window Cross-Validation",
        "body": "from sklearn.model_selection import TimeSeriesSplit\n\ntscv = TimeSeriesSplit(n_splits=5)\nfor train_idx, val_idx in tscv.split(df):\n    # Train on expanding window\n    pass"
      },
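      {
        "title": "Expanding Window CV: Per-Fold Scoring (sketch)",
        "body": "# Sketch filling in the loop above; feature_cols, target_col, and the\n# LightGBM config are stand-ins for your own pipeline.\nimport numpy as np\nfrom sklearn.model_selection import TimeSeriesSplit\n\ntscv = TimeSeriesSplit(n_splits=5)\nscores = []\nfor train_idx, val_idx in tscv.split(df):\n    X_tr, y_tr = df.iloc[train_idx][feature_cols], df.iloc[train_idx][target_col]\n    X_va, y_va = df.iloc[val_idx][feature_cols], df.iloc[val_idx][target_col]\n    model = lgb.LGBMRegressor(**lgb_cfg, random_state=42)\n    model.fit(X_tr, y_tr)\n    rmse = float(np.sqrt(np.mean((y_va.values - model.predict(X_va)) ** 2)))\n    scores.append(rmse)\nprint(f'Mean CV RMSE: {np.mean(scores):.4f}')"
      },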
      {
        "title": "Custom Metrics",
        "body": "def weighted_rmse_score(y_true, y_pred, weights):\n    \"\"\"Weighted RMSE skill score (higher is better)\"\"\"\n    denom = np.sum(weights * y_true**2)\n    if denom <= 0:\n        return 0.0\n    numer = np.sum(weights * (y_true - y_pred)**2)\n    ratio = numer / denom\n    return float(np.sqrt(1.0 - np.clip(ratio, 0.0, 1.0)))"
      },
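      {
        "title": "Custom Metrics: Sanity Check (sketch)",
        "body": "# Toy check of weighted_rmse_score (illustrative values, not real data):\n# perfect predictions score 1.0, an all-zero baseline scores 0.0.\nimport numpy as np\n\ny_true = np.array([1.0, 2.0, 3.0])\nweights = np.ones(3)\n\nassert weighted_rmse_score(y_true, y_true, weights) == 1.0\nassert weighted_rmse_score(y_true, np.zeros(3), weights) == 0.0"
      },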
      {
        "title": "EDA Checklist",
        "body": "Target Analysis\n\nDistribution by time period\nDistribution by category/horizon\nTrend and seasonality detection\n\nMissing Values\n\nPattern analysis (random vs systematic)\nGroup-based imputation strategy\n\nWeight Distribution\n\nConcentration analysis\nImpact on scoring metric\n\nFeature Correlations\n\nCorrelation with target\nMulticollinearity between features\n\nTemporal Patterns\n\nStationarity tests\nRolling statistics visualization"
      },
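      {
        "title": "EDA: Stationarity Check (sketch)",
        "body": "# Minimal stationarity test for the checklist above, assuming statsmodels\n# is installed and `series` is one entity's target as a pandas Series.\nfrom statsmodels.tsa.stattools import adfuller\n\nstat, pvalue = adfuller(series.dropna())[:2]\nprint(f'ADF statistic: {stat:.3f}, p-value: {pvalue:.3f}')\n# p-value < 0.05 suggests stationarity; otherwise consider differencing\n# (see Difference Features) before modeling."
      },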
      {
        "title": "Common Pitfalls to Avoid",
        "body": "Pitfall\tSolution\nRandom train/test split\tUse time-based split\nUsing future data for encoding\tCompute stats on train only\nIgnoring sample weights\tUse sample_weight in fit()\nToo many features\tFocus on top 5-10 predictors\nSingle model\tMulti-seed ensemble\nOverfitting validation\tLarge early stopping patience"
      },
      {
        "title": "Competition Workflow",
        "body": "graph TD\n    A[Load Data] --> B[Compute Encoding Stats on Train]\n    B --> C[Build Features]\n    C --> D[Time-Based Split]\n    D --> E{For Each Horizon}\n    E --> F[Train Multi-Seed Ensemble]\n    F --> G[Validate & Score]\n    G --> H[Generate Predictions]\n    H --> I[Aggregate & Submit]"
      },
      {
        "title": "Quick Reference Commands",
        "body": "# Run complete pipeline\npython train_winning.py\n\n# Generate submission\npython generate_submission.py\n\n# Validate submission format\npython -c \"\nimport pandas as pd\nsub = pd.read_csv('submission.csv')\nprint(f'Rows: {len(sub)}, Cols: {list(sub.columns)}')\nprint(sub.head())\n\""
      },
      {
        "title": "Integration with Other Workflows",
        "body": "Use with /data-analyst for comprehensive EDA\nUse with /data-scientist for advanced feature engineering\nUse with /fintech-engineer for financial risk analysis\nCombine predictions with /quant-analyst for portfolio strategies"
      }
    ],
    "body": "Time Series Data Science - Complete Guide\n\nExpert time series data scientist specializing in forecasting, sequential prediction, and competition-winning strategies. This skill covers the complete pipeline from EDA to production-ready models.\n\nCore Principles\nKey Lessons from Winning Solutions\n\nFeature Engineering > Model Complexity\n\nFocus on 5-10 most predictive features, not all available\nLag, rolling, and EWM features are often more valuable than the raw data\nInteraction features between top predictors can be game-changers\n\nTime-Based Validation is Critical\n\nNEVER use random splits for time series\nTrain on past, validate on future (e.g., ts_index <= threshold)\nLeakage from future data will destroy real-world performance\n\nWeights Matter in Scoring\n\nIf weights are provided, use them directly in training\nHigh-weight samples disproportionately affect score\nSample weighting in model.fit() is better than custom loss\n\nMulti-Seed Ensemble for Robustness\n\nTrain same model with different random seeds\nAveraging predictions reduces variance\nCommon seeds: 42, 2024, or any fixed set\nFeature Engineering Toolkit\n1. Lag Features\nGROUP_COLS = ['entity_id', 'category', 'horizon']\n\nfor lag in [1, 3, 5, 10]:\n    df[f'{col}_lag{lag}'] = df.groupby(GROUP_COLS)[col].shift(lag)\n\n2. Rolling Statistics\nfor window in [5, 10, 20]:\n    df[f'{col}_roll_mean{window}'] = df.groupby(GROUP_COLS)[col].transform(\n        lambda x: x.rolling(window, min_periods=1).mean()\n    )\n    df[f'{col}_roll_std{window}'] = df.groupby(GROUP_COLS)[col].transform(\n        lambda x: x.rolling(window, min_periods=1).std()\n    )\n\n3. Exponential Weighted Mean (EWM)\nfor span in [5, 10]:\n    df[f'{col}_ewm{span}'] = df.groupby(GROUP_COLS)[col].transform(\n        lambda x: x.ewm(span=span, adjust=False).mean()\n    )\n\n4. Difference Features\ndf[f'{col}_diff1'] = df.groupby(GROUP_COLS)[col].diff(1)\ndf[f'{col}_diff_pct'] = df.groupby(GROUP_COLS)[col].pct_change(1)\n\n5. Interaction Features\n# Difference between related features\ndf['feat_diff'] = df['feature_a'] - df['feature_b']\n\n# Ratio between features\ndf['feat_ratio'] = df['feature_a'] / (df['feature_b'] + 1e-7)\n\n# Product interactions\ndf['feat_product'] = df['feature_a'] * df['feature_b']\n\n6. Target Encoding (for categories)\n# Compute on training data only (ts_index <= threshold)\ntrain_only = df[df.ts_index <= VAL_THRESHOLD]\n\nenc_stats = {\n    'category': train_only.groupby('category')['target'].mean().to_dict(),\n    'global_mean': train_only['target'].mean()\n}\n\n# Apply to all data\ndf['category_enc'] = df['category'].map(enc_stats['category']).fillna(enc_stats['global_mean'])\n\n7. Temporal Signals\n# Cyclical encoding for periodicity\ndf['t_cycle'] = np.sin(2 * np.pi * df['ts_index'] / period)\ndf['t_cycle_cos'] = np.cos(2 * np.pi * df['ts_index'] / period)\n\n# Normalized time position\ndf['ts_normalized'] = df['ts_index'] / df['ts_index'].max()\n\n# Time bins\ndf['ts_bin'] = pd.cut(df['ts_index'], bins=10, labels=False)\n\nModel Training Patterns\nLightGBM Configuration (Competition-Tested)\nlgb_cfg = {\n    'objective': 'regression',\n    'metric': 'rmse',\n    'learning_rate': 0.015,\n    'n_estimators': 4000,\n    'num_leaves': 80,\n    'min_child_samples': 200,\n    'feature_fraction': 0.6,\n    'bagging_fraction': 0.7,\n    'bagging_freq': 5,\n    'lambda_l1': 0.1,\n    'lambda_l2': 10.0,\n    'verbosity': -1\n}\n\nMulti-Seed Ensemble Training\nval_pred = np.zeros(len(y_val))\ntest_pred = np.zeros(len(X_test))\n\nfor seed in [42, 2024]:\n    model = lgb.LGBMRegressor(**lgb_cfg, random_state=seed)\n    \n    model.fit(\n        X_train, y_train,\n        sample_weight=w_train,  # Use weights directly\n        eval_set=[(X_val, y_val)],\n        eval_sample_weight=[w_val],\n        callbacks=[lgb.early_stopping(200, verbose=False)]\n    )\n    \n    val_pred += model.predict(X_val) / 2\n    test_pred += model.predict(X_test) / 2\n\nHorizon-Specific Models\n# Train separate model per forecast horizon\nfor horizon in [1, 3, 10, 25]:\n    train_h = df[df.horizon == horizon]\n    test_h = test_df[test_df.horizon == horizon]\n    \n    # Build features, train model\n    model = train_model(train_h, test_h)\n    predictions[horizon] = model.predict(test_h)\n\nValidation Strategies\nTime-Based Split\nVAL_THRESHOLD = int(df['ts_index'].max() * 0.85)\n\ntrain_mask = df['ts_index'] <= VAL_THRESHOLD\nval_mask = df['ts_index'] > VAL_THRESHOLD\n\nX_train = df.loc[train_mask, feature_cols]\nX_val = df.loc[val_mask, feature_cols]\n\nExpanding Window Cross-Validation\nfrom sklearn.model_selection import TimeSeriesSplit\n\ntscv = TimeSeriesSplit(n_splits=5)\nfor train_idx, val_idx in tscv.split(df):\n    # Train on expanding window\n    pass\n\nCustom Metrics\ndef weighted_rmse_score(y_true, y_pred, weights):\n    \"\"\"Weighted RMSE skill score (higher is better)\"\"\"\n    denom = np.sum(weights * y_true**2)\n    if denom <= 0:\n        return 0.0\n    numer = np.sum(weights * (y_true - y_pred)**2)\n    ratio = numer / denom\n    return float(np.sqrt(1.0 - np.clip(ratio, 0.0, 1.0)))\n\nEDA Checklist\n\nTarget Analysis\n\nDistribution by time period\nDistribution by category/horizon\nTrend and seasonality detection\n\nMissing Values\n\nPattern analysis (random vs systematic)\nGroup-based imputation strategy\n\nWeight Distribution\n\nConcentration analysis\nImpact on scoring metric\n\nFeature Correlations\n\nCorrelation with target\nMulticollinearity between features\n\nTemporal Patterns\n\nStationarity tests\nRolling statistics visualization\nCommon Pitfalls to Avoid\nPitfall\tSolution\nRandom train/test split\tUse time-based split\nUsing future data for encoding\tCompute stats on train only\nIgnoring sample weights\tUse sample_weight in fit()\nToo many features\tFocus on top 5-10 predictors\nSingle model\tMulti-seed ensemble\nOverfitting validation\tLarge early stopping patience\nCompetition Workflow\ngraph TD\n    A[Load Data] --> B[Compute Encoding Stats on Train]\n    B --> C[Build Features]\n    C --> D[Time-Based Split]\n    D --> E{For Each Horizon}\n    E --> F[Train Multi-Seed Ensemble]\n    F --> G[Validate & Score]\n    G --> H[Generate Predictions]\n    H --> I[Aggregate & Submit]\n\nQuick Reference Commands\n# Run complete pipeline\npython train_winning.py\n\n# Generate submission\npython generate_submission.py\n\n# Validate submission format\npython -c \"\nimport pandas as pd\nsub = pd.read_csv('submission.csv')\nprint(f'Rows: {len(sub)}, Cols: {list(sub.columns)}')\nprint(sub.head())\n\"\n\nIntegration with Other Workflows\nUse with /data-analyst for comprehensive EDA\nUse with /data-scientist for advanced feature engineering\nUse with /fintech-engineer for financial risk analysis\nCombine predictions with /quant-analyst for portfolio strategies"
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/dubnium0/time-series-analysis",
    "publisherUrl": "https://clawhub.ai/dubnium0/time-series-analysis",
    "owner": "dubnium0",
    "version": "1.0.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/time-series-analysis",
    "downloadUrl": "https://openagent3.xyz/downloads/time-series-analysis",
    "agentUrl": "https://openagent3.xyz/skills/time-series-analysis/agent",
    "manifestUrl": "https://openagent3.xyz/skills/time-series-analysis/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/time-series-analysis/agent.md"
  }
}