{
  "schemaVersion": "1.0",
  "item": {
    "slug": "memory-bench-pioneer",
    "name": "Memory Bench Pioneer",
    "source": "tencent",
    "type": "skill",
    "category": "AI Intelligence",
    "sourceUrl": "https://clawhub.ai/globalcaos/memory-bench-pioneer",
    "canonicalUrl": "https://clawhub.ai/globalcaos/memory-bench-pioneer",
    "targetPlatform": "OpenClaw"
  },
  "install": {
    "downloadMode": "redirect",
    "downloadUrl": "/downloads/memory-bench-pioneer",
    "sourceDownloadUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=memory-bench-pioneer",
    "sourcePlatform": "tencent",
    "targetPlatform": "OpenClaw",
    "installMethod": "Manual import",
    "extraction": "Extract archive",
    "prerequisites": [
      "OpenClaw"
    ],
    "packageFormat": "ZIP package",
    "includedAssets": [
      "SKILL.md",
      "scripts/collect.py",
      "scripts/rate.py",
      "scripts/submit.sh",
      "scripts/test_metrics.py",
      "scripts/testset.json"
    ],
    "primaryDoc": "SKILL.md",
    "quickSetup": [
      "Download the package from Yavira.",
      "Extract the archive and review SKILL.md first.",
      "Import or place the package into your OpenClaw setup."
    ],
    "agentAssist": {
      "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
      "steps": [
        "Download the package from Yavira.",
        "Extract it into a folder your agent can access.",
        "Paste one of the prompts below and point your agent at the extracted folder."
      ],
      "prompts": [
        {
          "label": "New install",
          "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
        },
        {
          "label": "Upgrade existing",
          "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
        }
      ]
    },
    "sourceHealth": {
      "source": "tencent",
      "status": "healthy",
      "reason": "direct_download_ok",
      "recommendedAction": "download",
      "checkedAt": "2026-04-30T16:55:25.780Z",
      "expiresAt": "2026-05-07T16:55:25.780Z",
      "httpStatus": 200,
      "finalUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=memory-bench-pioneer",
      "contentType": "application/zip",
      "probeMethod": "head",
      "details": {
        "probeUrl": "https://wry-manatee-359.convex.site/api/v1/download?slug=memory-bench-pioneer",
        "contentDisposition": "attachment; filename=\"memory-bench-pioneer-2.0.0.zip\"",
        "redirectLocation": null,
        "bodySnippet": null
      },
      "scope": "source",
      "summary": "Source download looks usable.",
      "detail": "Yavira can redirect you to the upstream package for this source.",
      "primaryActionLabel": "Download for OpenClaw",
      "primaryActionHref": "/downloads/memory-bench-pioneer"
    },
    "validation": {
      "installChecklist": [
        "Use the Yavira download entry.",
        "Review SKILL.md after the package is downloaded.",
        "Confirm the extracted package contains the expected setup assets."
      ],
      "postInstallChecks": [
        "Confirm the extracted package includes the expected docs or setup files.",
        "Validate the skill or prompts are available in your target agent workspace.",
        "Capture any manual follow-up steps the agent could not complete."
      ]
    },
    "downloadPageUrl": "https://openagent3.xyz/downloads/memory-bench-pioneer",
    "agentPageUrl": "https://openagent3.xyz/skills/memory-bench-pioneer/agent",
    "manifestUrl": "https://openagent3.xyz/skills/memory-bench-pioneer/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/memory-bench-pioneer/agent.md"
  },
  "agentAssist": {
    "summary": "Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.",
    "steps": [
      "Download the package from Yavira.",
      "Extract it into a folder your agent can access.",
      "Paste one of the prompts below and point your agent at the extracted folder."
    ],
    "prompts": [
      {
        "label": "New install",
        "body": "I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete."
      },
      {
        "label": "Upgrade existing",
        "body": "I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run."
      }
    ]
  },
  "documentation": {
    "source": "clawhub",
    "primaryDoc": "SKILL.md",
    "sections": [
      {
        "title": "Memory Bench",
        "body": "Collect, assess, and submit anonymized memory system statistics for the ENGRAM and CORTEX research papers."
      },
      {
        "title": "1. Assess Retrieval Quality",
        "body": "Run the standard test set (30 queries across 4 types × 3 difficulty levels) with LLM-as-judge:\n\n# Full assessment with GPT-4o-mini judge + ablation (recommended)\npython3 scripts/rate.py --queries 30 --judge openai --ablation\n\n# Without OpenAI key: local embedding judge (weaker, marked in output)\npython3 scripts/rate.py --queries 30 --judge local --ablation\n\n# Custom test set\npython3 scripts/rate.py --testset path/to/queries.json --judge openai\n\nWhat it measures:\n\nRAR (Recall Accuracy Ratio), MRR (Mean Reciprocal Rank)\nnDCG@5, MAP@5, Precision@5, Hit Rate\nAll metrics include 95% bootstrap confidence intervals\nAblation: runs with AND without spreading activation to isolate its contribution\n\nJudge methods:\n\nopenai — GPT-4o-mini rates each (query, result) pair 1-5. Independent from retrieval system. ~$0.01 per run.\nlocal — Embedding cosine similarity. Weaker, marked as such in output. Zero cost.\n\nStandard test set (scripts/testset.json): 30 queries stratified across semantic/episodic/procedural/strategic types and easy/medium/hard difficulty. No lexical overlap with stored memories. All deployments run the same queries for cross-site comparability."
      },
      {
        "title": "2. Collect Statistics",
        "body": "python3 scripts/collect.py --contributor GITHUB_USER --days 14 --output /tmp/memory-bench-report.json\n\nCollected (anonymized): Memory counts/types/ages, strength/importance histograms, association graph size, hierarchy levels, consolidation history, retrieval metrics (RAR/MRR/nDCG/MAP with CIs), ablation results, judge method, algorithm version, embedding coverage. Instance ID is a random UUID (not reversible).\n\nNever collected: Memory content, queries, file paths, usernames, hostnames."
      },
      {
        "title": "3. Submit as PR",
        "body": "scripts/submit.sh /tmp/memory-bench-report.json GITHUB_USERNAME\n\nForks, branches, places report, updates INDEX.json, opens PR. Requires gh CLI."
      },
      {
        "title": "Validation Protocol",
        "body": "For peer-review-ready data, contributors should:\n\nRun rate.py --ablation --judge openai (minimum N=30 queries)\nCollect at least 2 reports from the same instance, ≥7 days apart (longitudinal)\nReport the algorithm version (auto-captured from git)"
      },
      {
        "title": "Test Set Format",
        "body": "Custom test sets are JSON arrays:\n\n[\n  {\n    \"id\": \"T01\",\n    \"query\": \"...\",\n    \"category\": \"semantic|episodic|procedural|strategic\",\n    \"difficulty\": \"easy|medium|hard\"\n  }\n]"
      },
      {
        "title": "Agent Workflow",
        "body": "When asked to submit benchmarks: run rate.py --ablation --judge openai, then collect.py, review summary, then submit.sh. Share the PR link."
      }
    ],
    "body": "Memory Bench\n\nCollect, assess, and submit anonymized memory system statistics for the ENGRAM and CORTEX research papers.\n\nThree-Step Pipeline\n1. Assess Retrieval Quality\n\nRun the standard test set (30 queries across 4 types × 3 difficulty levels) with LLM-as-judge:\n\n# Full assessment with GPT-4o-mini judge + ablation (recommended)\npython3 scripts/rate.py --queries 30 --judge openai --ablation\n\n# Without OpenAI key: local embedding judge (weaker, marked in output)\npython3 scripts/rate.py --queries 30 --judge local --ablation\n\n# Custom test set\npython3 scripts/rate.py --testset path/to/queries.json --judge openai\n\n\nWhat it measures:\n\nRAR (Recall Accuracy Ratio), MRR (Mean Reciprocal Rank)\nnDCG@5, MAP@5, Precision@5, Hit Rate\nAll metrics include 95% bootstrap confidence intervals\nAblation: runs with AND without spreading activation to isolate its contribution\n\nJudge methods:\n\nopenai — GPT-4o-mini rates each (query, result) pair 1-5. Independent from retrieval system. ~$0.01 per run.\nlocal — Embedding cosine similarity. Weaker, marked as such in output. Zero cost.\n\nStandard test set (scripts/testset.json): 30 queries stratified across semantic/episodic/procedural/strategic types and easy/medium/hard difficulty. No lexical overlap with stored memories. All deployments run the same queries for cross-site comparability.\n\n2. Collect Statistics\npython3 scripts/collect.py --contributor GITHUB_USER --days 14 --output /tmp/memory-bench-report.json\n\n\nCollected (anonymized): Memory counts/types/ages, strength/importance histograms, association graph size, hierarchy levels, consolidation history, retrieval metrics (RAR/MRR/nDCG/MAP with CIs), ablation results, judge method, algorithm version, embedding coverage. Instance ID is a random UUID (not reversible).\n\nNever collected: Memory content, queries, file paths, usernames, hostnames.\n\n3. Submit as PR\nscripts/submit.sh /tmp/memory-bench-report.json GITHUB_USERNAME\n\n\nForks, branches, places report, updates INDEX.json, opens PR. Requires gh CLI.\n\nValidation Protocol\n\nFor peer-review-ready data, contributors should:\n\nRun rate.py --ablation --judge openai (minimum N=30 queries)\nCollect at least 2 reports from the same instance, ≥7 days apart (longitudinal)\nReport the algorithm version (auto-captured from git)\n\nTest Set Format\n\nCustom test sets are JSON arrays:\n\n[\n  {\n    \"id\": \"T01\",\n    \"query\": \"...\",\n    \"category\": \"semantic|episodic|procedural|strategic\",\n    \"difficulty\": \"easy|medium|hard\"\n  }\n]\n\nAgent Workflow\n\nWhen asked to submit benchmarks: run rate.py --ablation --judge openai, then collect.py, review summary, then submit.sh. Share the PR link."
  },
  "trust": {
    "sourceLabel": "tencent",
    "provenanceUrl": "https://clawhub.ai/globalcaos/memory-bench-pioneer",
    "publisherUrl": "https://clawhub.ai/globalcaos/memory-bench-pioneer",
    "owner": "globalcaos",
    "version": "2.0.0",
    "license": null,
    "verificationStatus": "Indexed source record"
  },
  "links": {
    "detailUrl": "https://openagent3.xyz/skills/memory-bench-pioneer",
    "downloadUrl": "https://openagent3.xyz/downloads/memory-bench-pioneer",
    "agentUrl": "https://openagent3.xyz/skills/memory-bench-pioneer/agent",
    "manifestUrl": "https://openagent3.xyz/skills/memory-bench-pioneer/agent.json",
    "briefUrl": "https://openagent3.xyz/skills/memory-bench-pioneer/agent.md"
  }
}