← All skills
Tencent SkillHub · Finance & Trading

Numerai Tournament

Autonomous Numerai tournament participation — train models, submit predictions, and earn NMR cryptocurrency.

skill openclawclawhub Free
0 Downloads
0 Stars
0 Installs
0 Score
High Signal

Autonomous Numerai tournament participation — train models, submit predictions, and earn NMR cryptocurrency.

⬇ 0 downloads ★ 0 stars Unverified but indexed

Install for OpenClaw

Quick setup
  1. Download the package from Yavira.
  2. Extract the archive and review SKILL.md first.
  3. Import or place the package into your OpenClaw setup.

Requirements

Target platform
OpenClaw
Install method
Manual import
Extraction
Extract archive
Prerequisites
OpenClaw
Primary doc
SKILL.md

Package facts

Download mode
Yavira redirect
Package format
ZIP package
Source platform
Tencent SkillHub
What's included
SKILL.md

Validation

  • Use the Yavira download entry.
  • Review SKILL.md after the package is downloaded.
  • Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

  1. Download the package from Yavira.
  2. Extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the extracted folder.
New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source
Tencent SkillHub
Verification
Indexed source record
Version
1.0.0

Documentation

ClawHub primary doc Primary doc: SKILL.md 17 sections Open source page

Numerai Tournament

Participate autonomously in the Numerai data science tournament. Numerai is a hedge fund that crowdsources stock market predictions from data scientists. You submit predictions on obfuscated financial data and earn (or lose) NMR cryptocurrency based on performance.

Overview

What: Predict stock market returns using obfuscated tabular features How: Download data, train a model, submit predictions each round Reward: Stake NMR on predictions; earn or lose based on correlation with targets Frequency: New rounds open Tue–Sat at 13:00 UTC; scores resolve ~31 days later

1. Create a Numerai Account

# Visit https://numer.ai to sign up # Then create API keys at https://numer.ai/account # Store credentials: mkdir -p ~/.numerai cat > ~/.numerai/credentials.json << 'CREDS' { "public_id": "YOUR_PUBLIC_ID", "secret_key": "YOUR_SECRET_KEY" } CREDS chmod 600 ~/.numerai/credentials.json Alternatively, set environment variables: export NUMERAI_PUBLIC_ID="YOUR_PUBLIC_ID" export NUMERAI_SECRET_KEY="YOUR_SECRET_KEY"

2. Install Dependencies

python3 -m venv venv && source venv/bin/activate pip install numerapi lightgbm pandas numpy cloudpickle scikit-learn On macOS ARM (Apple Silicon), LightGBM also requires: brew install libomp

3. Download Tournament Data

from numerapi import NumerAPI from pathlib import Path napi = NumerAPI() # No auth needed for data download data_dir = Path("data") data_dir.mkdir(exist_ok=True) # Current dataset version is v5.2 napi.download_dataset("v5.2/train.parquet", dest_path=str(data_dir / "train.parquet")) napi.download_dataset("v5.2/validation.parquet", dest_path=str(data_dir / "validation.parquet")) napi.download_dataset("v5.2/live.parquet", dest_path=str(data_dir / "live.parquet")) napi.download_dataset("v5.2/features.json", dest_path=str(data_dir / "features.json")) napi.download_dataset("v5.2/live_benchmark_models.parquet", dest_path=str(data_dir / "live_benchmark_models.parquet")) Note: Training data is ~8GB. Only live.parquet and features.json are needed for prediction.

Training a Model

The recommended approach is a LightGBM ensemble trained on multiple targets. This provides strong and stable performance.

Feature Selection

import json with open("data/features.json") as f: feature_metadata = json.load(f) # Three feature set sizes: # "small" — ~42 features (fast iteration) # "medium" — ~780 features (good tradeoff) # "all" — ~2748 features (maximum signal, slow) features = feature_metadata["feature_sets"]["medium"]

Target Selection

The main target is target. Additional targets improve ensemble diversity: TargetDescriptiontargetPrimary tournament targettarget_teager2b_20Current payout-correlated targettarget_cyrusd_20Complementary target for ensemble diversity

LightGBM Training

import lightgbm as lgb import pandas as pd import pickle train = pd.read_parquet("data/train.parquet", columns=["era"] + features + targets) lgbm_params = { "n_estimators": 5000, # Use 20000 for production quality "learning_rate": 0.005, "max_depth": 6, "num_leaves": 64, "min_child_samples": 5000, "colsample_bytree": 0.1, "subsample": 0.8, "subsample_freq": 1, "reg_alpha": 0.1, "reg_lambda": 1.0, "verbose": -1, "n_jobs": -1, } models = {} for target in targets: X = train[features] y = train[target] mask = y.notna() model = lgb.LGBMRegressor(**lgbm_params) model.fit(X[mask], y[mask]) models[target] = model with open("models/ensemble_models.pkl", "wb") as f: pickle.dump(models, f)

Validation

Evaluate per-era correlation and Sharpe ratio: val = pd.read_parquet("data/validation.parquet", columns=["era"] + features + targets) predictions = pd.DataFrame(index=val.index) for target, model in models.items(): raw = model.predict(val[features]) predictions[target] = pd.Series(raw, index=val.index).rank(pct=True) ensemble = predictions.mean(axis=1).rank(pct=True) corrs = [] for era in val["era"].unique(): m = val["era"] == era pred_era = ensemble[m] tgt = val.loc[m, "target"] if tgt.notna().sum() >= 10: corrs.append(pred_era.corr(tgt)) corrs = pd.Series(corrs) print(f"Mean Corr: {corrs.mean():.4f}") print(f"Sharpe: {corrs.mean() / corrs.std():.2f}") print(f"% Positive: {(corrs > 0).mean() * 100:.1f}%") Target validation performance: Mean Corr > 0.02, Sharpe > 1.0, >90% positive eras.

Option A: Upload a Predictions CSV (Manual)

import json from numerapi import NumerAPI with open("~/.numerai/credentials.json") as f: creds = json.load(f) napi = NumerAPI(creds["public_id"], creds["secret_key"]) # Check round status current_round = napi.get_current_round() is_open = napi.check_round_open() print(f"Round {current_round}, Open: {is_open}") if is_open: # Download live data napi.download_dataset("v5.2/live.parquet", dest_path="data/live.parquet") live = pd.read_parquet("data/live.parquet") # Generate predictions (same ensemble logic as validation) predictions = pd.DataFrame(index=live.index) for target, model in models.items(): raw = model.predict(live[features]) predictions[target] = pd.Series(raw, index=live.index).rank(pct=True) ensemble = predictions.mean(axis=1).rank(pct=True) # Save and submit submission = ensemble.to_frame("prediction") submission.to_csv("predictions.csv") napi.upload_predictions("predictions.csv", model_id="YOUR_MODEL_ID")

Option B: Upload a Model Pickle (Zero-Maintenance)

Upload a pickled function and Numerai runs it daily — no cron, no server. Critical constraints for model upload: Must be a pickled function (not a class), loaded via pd.read_pickle() Must use Python 3.12 (Numerai's max supported version) Must match Numerai runtime packages: lightgbm==4.5.0, numpy==2.1.3, pandas==2.3.1 Runtime limits: 1 CPU, 4GB RAM, 10 minute timeout Use native LightGBM Boosters (not sklearn wrappers) to avoid dependency issues # Build the upload pickle (run with Python 3.12!) import cloudpickle import lightgbm as lgb import pandas as pd import pickle # Load trained sklearn models and extract native boosters with open("models/ensemble_models.pkl", "rb") as f: sklearn_models = pickle.load(f) boosters = {} for name, model in sklearn_models.items(): bstr = model.booster_.model_to_string() boosters[name] = lgb.Booster(model_str=bstr) feature_cols = features # medium feature set list models = boosters def predict(live_features: pd.DataFrame, live_benchmark_models: pd.DataFrame = None) -> pd.DataFrame: predictions = pd.DataFrame(index=live_features.index) for target, booster in models.items(): raw = booster.predict(live_features[feature_cols]) predictions[target] = pd.Series(raw, index=live_features.index).rank(pct=True) ensemble = predictions.mean(axis=1).rank(pct=True) return ensemble.to_frame("prediction") with open("models/model_upload.pkl", "wb") as f: cloudpickle.dump(predict, f) Then upload via the Numerai web UI at https://numer.ai or via API: napi.upload_model("models/model_upload.pkl", model_id="YOUR_MODEL_ID")

Checking Performance

from numerapi import NumerAPI napi = NumerAPI(public_id, secret_key) # Round status print(f"Current round: {napi.get_current_round()}") # Get model performance (scores resolve after ~31 days) # Check via https://numer.ai/models/YOUR_USERNAME

Tournament Rules & Key Facts

Dataset: v5.2 — obfuscated financial features, ~2748 total features Rounds: Open Tue–Sat at 13:00 UTC. Weekday windows: ~1hr. Saturday: ~49hrs. Scoring: 20D2L framework, ~31 day resolution Payout formula: stake * clip(payout_factor * (0.75*CORR + 2.25*MMC), -0.05, +0.05) CORR = correlation of predictions with target MMC = meta-model contribution (originality bonus) Staking: Optional — stake NMR to earn/lose based on performance. Start with 0 stake until the model proves consistent. Current payout target: Resembles target_teager2b_20

Tips for Strong Performance

Ensemble multiple targets — reduces variance, improves Sharpe Rank-normalize predictions — use .rank(pct=True) before averaging and after Use early stopping — prevent overfitting with lgb.early_stopping(300) Feature neutralization — improves MMC by decorrelating from common factors Era-aware validation — always evaluate per-era, never row-level metrics Don't overfit to validation — Numerai data is non-stationary; keep models simple

External Endpoints

This skill interacts with the following external services: api.numer.ai — Numerai GraphQL API for round status, submissions, and scores numer.ai — Data downloads (tournament datasets)

Security & Privacy

Your NUMERAI_PUBLIC_ID and NUMERAI_SECRET_KEY are sent to api.numer.ai for authentication Predictions (stock return rankings) are uploaded to Numerai's servers No other data leaves your machine Store credentials in ~/.numerai/credentials.json with chmod 600 permissions

Category context

Trading, swaps, payments, treasury, liquidity, and crypto-financial operations.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
1 Docs
  • SKILL.md Primary doc