
Jarvis Voice

Turn your AI into JARVIS. Voice, wit, and personality — the complete package. Humor cranked to maximum.



Install for OpenClaw

Known issue

This item's download link currently redirects to a listing or homepage instead of returning a package file.

Quick setup
  1. Open the source page and manually confirm that a package download exists.
  2. Review SKILL.md if you can obtain the files.
  3. Treat this source as manual setup until the download is verified.

Requirements

Target platform
OpenClaw
Install method
Manual import
Extraction
Extract archive
Prerequisites
OpenClaw
Primary doc
SKILL.md

Package facts

Download mode
Manual review
Package format
ZIP package
Source platform
Tencent SkillHub
What's included
SKILL.md, _meta.json, templates/HUMOR.md, templates/SESSION.md, templates/VOICE.md

Validation

  • Open the source listing and confirm there is a real package or setup artifact available.
  • Review SKILL.md before asking your agent to continue.
  • Treat this source as manual setup until the upstream download flow is fixed.

Install with your agent

Agent handoff

Because the item currently does not return a direct package file, use the source page and any available docs to guide the install.

  1. Open the source page via Open source listing.
  2. If you can obtain the package, extract it into a folder your agent can access.
  3. Paste one of the prompts below and point your agent at the source page and extracted files.
New install

I tried to install a skill package from Yavira, but the item currently does not return a direct package file. Inspect the source page and any extracted docs, then tell me what you can confirm and any manual steps still required.

Upgrade existing

I tried to upgrade a skill package from Yavira, but the item currently does not return a direct package file. Compare the source page and any extracted docs with my current installation, then summarize what changed and what manual follow-up I still need.

Trust & source

Release facts

Source
Tencent SkillHub
Verification
Indexed source record
Version
2.2.1

Documentation

Primary doc: SKILL.md (14 sections)

Your AI just got a voice. And the wit to use it.

Remember JARVIS in the Iron Man films? Not just the voice — the personality. The bone-dry observations while Tony was mid-crisis. "I do appreciate your concern, sir, but the suit is quite capable of—" [explosion] "—as I was saying." That effortless, understated humor that made you forget you were listening to software.

That's what this skill gives your OpenClaw agent. The voice — offline text-to-speech using sherpa-onnx (British Alan voice) with metallic audio processing via ffmpeg. And the humor — four research-backed comedy patterns (dry wit, self-aware AI, alien observer, literal idiom play) calibrated to make your agent sound like it's been running your life for years and is quietly amused by the experience.

The humor isn't bolted on. It's baked in. Because a JARVIS that speaks without wit is just Siri with better reverb.

📄 The research behind the humor: LIMBIC — Computational Humor via Bisociation & Embedding Distances

⚠️ CRITICAL: Do NOT use the tts tool

The built-in tts tool uses Edge TTS (cloud, wrong voice, no effects). Always use the jarvis shell command instead.

How to Speak

Every response that warrants voice output must include BOTH:

  1. Audio execution FIRST — run the jarvis command in the background BEFORE writing the reply:
     exec(command='jarvis "Your spoken text here."', background=true)
     This fires immediately — the user hears the voice BEFORE the text appears on screen.
  2. Visible transcript — a bold Jarvis: prefix followed by the spoken text:
     **Jarvis:** *Your spoken text here.*
     The webchat UI has custom CSS + JS that automatically detects **Jarvis:** and renders the following text in purple italic (.jarvis-voice class, color #9b59b6). You just write the markdown — the styling is automatic.

This is called hybrid output: the user hears the voice first, then sees the transcript.

Note: The server-side triggerJarvisAutoTts hook is DISABLED (no-op). It fired too late (after text render). Voice comes exclusively from the exec call.
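The hybrid-output pattern above can be sketched as a small shell helper. This is a hypothetical illustration, not part of the package: `jarvis_reply` is an invented name, and it assumes the `jarvis` script described below is on PATH (the audio step is simply skipped when it is not).

```shell
#!/bin/sh
# Hypothetical helper illustrating hybrid output:
# fire audio in the background first, then emit the visible transcript.
jarvis_reply() {
  text="$1"
  # Audio first, non-blocking — assumes `jarvis` is on PATH; skip if absent
  if command -v jarvis >/dev/null 2>&1; then
    jarvis "$text" &
  fi
  # Visible transcript — the webchat styles this line purple automatically
  printf '**Jarvis:** *%s*\n' "$text"
}

jarvis_reply "Good evening. All systems nominal."
```

The ordering matters: the backgrounded audio call returns immediately, so the transcript line appears while speech is already playing.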

Command Reference

jarvis "Hello, this is a test"

Backend: sherpa-onnx offline TTS (Alan voice, British English, en_GB-alan-medium)
Speed: 2x (--vits-length-scale=0.5)
Effects chain (ffmpeg):
  • Pitch up 5% — tighter AI feel
  • Flanger — metallic sheen
  • 15ms echo — robotic ring
  • Highpass 200Hz + treble boost +6dB — crisp HUD clarity
Output: plays via aplay to the default audio device, then cleans up temp files
Language: English ONLY. The Alan model cannot handle other languages.

Rules

  • Always background: true — never block the response waiting for audio playback.
  • Always include the text transcript — the purple Jarvis: line IS the user's visual confirmation.
  • Keep spoken text ≤ 1500 characters to avoid truncation.
  • One jarvis call per response — don't stack multiple calls.
  • English only — for non-English content, translate or summarize in English for voice.
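The 1500-character rule can be enforced mechanically before text reaches the voice pipeline. The sketch below is a hypothetical `clip_spoken_text` helper (not shipped with the package) that passes short text through unchanged and truncates anything longer:

```shell
#!/bin/sh
# Hypothetical pre-flight clip for the ≤1500-character spoken-text rule.
clip_spoken_text() {
  text="$1"
  limit=1500
  if [ "${#text}" -le "$limit" ]; then
    printf '%s\n' "$text"
  else
    # Keep limit-3 characters, then append "..." so the result is exactly 1500
    printf '%s...\n' "$(printf '%s' "$text" | cut -c1-$((limit - 3)))"
  fi
}
```

In practice the clipped result would be what gets handed to the voice command, e.g. `jarvis "$(clip_spoken_text "$reply")"`.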

When to Speak

  • Session greetings and farewells
  • Delivering results or summaries
  • Responding to direct conversation
  • Any time the user's last message included voice/audio

When NOT to Speak

  • Pure tool/file operations with no conversational element
  • HEARTBEAT_OK responses
  • NO_REPLY responses

Webchat Purple Styling

The OpenClaw webchat has built-in support for Jarvis voice transcripts:

  • ui/src/styles/chat/text.css — the .jarvis-voice class renders purple italic (#9b59b6 dark theme, #8e44ad light theme)
  • ui/src/ui/markdown.ts — a post-render hook auto-wraps text after <strong>Jarvis:</strong> in a <span class="jarvis-voice"> element

This means you just write **Jarvis:** *text* in markdown and the webchat handles the purple rendering. No extra markup needed. For non-webchat surfaces (WhatsApp, Telegram, etc.), the bold/italic markdown renders natively — no purple, but still visually distinct.

Installation (for new setups)

Requires:

  • sherpa-onnx runtime at ~/.openclaw/tools/sherpa-onnx-tts/
  • Alan medium model at ~/.openclaw/tools/sherpa-onnx-tts/models/vits-piper-en_GB-alan-medium/
  • ffmpeg installed system-wide
  • aplay (ALSA) for audio playback
  • The jarvis script at ~/.local/bin/jarvis (or in PATH)
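A quick way to verify these prerequisites before the first session is a small check script. This is a hedged sketch, not part of the package: `check_jarvis_prereqs` is an invented name, and it assumes the default install paths listed above (pass a different base directory as the first argument if yours differs).

```shell
#!/bin/sh
# Hypothetical prerequisite check for the Jarvis voice stack.
# Prints one line per missing item; prints nothing when everything is present.
check_jarvis_prereqs() {
  base="${1:-$HOME/.openclaw/tools/sherpa-onnx-tts}"
  [ -x "$base/bin/sherpa-onnx-offline-tts" ] || echo "missing: sherpa-onnx runtime ($base/bin)"
  [ -d "$base/models/vits-piper-en_GB-alan-medium" ] || echo "missing: Alan medium model"
  command -v ffmpeg >/dev/null 2>&1 || echo "missing: ffmpeg"
  command -v aplay  >/dev/null 2>&1 || echo "missing: aplay (ALSA)"
  command -v jarvis >/dev/null 2>&1 || echo "missing: jarvis script on PATH"
}

check_jarvis_prereqs
```

Silence means you are ready; any "missing:" line points at the component to install first.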

The jarvis script

```bash
#!/bin/bash
# Jarvis TTS - authentic JARVIS-style voice
# Usage: jarvis "Hello, this is a test"

export LD_LIBRARY_PATH=$HOME/.openclaw/tools/sherpa-onnx-tts/lib:$LD_LIBRARY_PATH

RAW_WAV="/tmp/jarvis_raw.wav"
FINAL_WAV="/tmp/jarvis_final.wav"

# Generate speech
$HOME/.openclaw/tools/sherpa-onnx-tts/bin/sherpa-onnx-offline-tts \
  --vits-model=$HOME/.openclaw/tools/sherpa-onnx-tts/models/vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx \
  --vits-tokens=$HOME/.openclaw/tools/sherpa-onnx-tts/models/vits-piper-en_GB-alan-medium/tokens.txt \
  --vits-data-dir=$HOME/.openclaw/tools/sherpa-onnx-tts/models/vits-piper-en_GB-alan-medium/espeak-ng-data \
  --vits-length-scale=0.5 \
  --output-filename="$RAW_WAV" \
  "$@" >/dev/null 2>&1

# Apply JARVIS metallic processing
if [ -f "$RAW_WAV" ]; then
  ffmpeg -y -i "$RAW_WAV" \
    -af "asetrate=22050*1.05,aresample=22050,\
flanger=delay=0:depth=2:regen=50:width=71:speed=0.5,\
aecho=0.8:0.88:15:0.5,\
highpass=f=200,\
treble=g=6" \
    "$FINAL_WAV" -v error
  if [ -f "$FINAL_WAV" ]; then
    aplay -D plughw:0,0 -q "$FINAL_WAV"
    rm "$RAW_WAV" "$FINAL_WAV"
  fi
fi
```

WhatsApp Voice Notes

For WhatsApp, output must be OGG/Opus format instead of speaker playback:

```bash
sherpa-onnx-offline-tts --vits-length-scale=0.5 --output-filename=raw.wav "text"
ffmpeg -i raw.wav \
  -af "asetrate=22050*1.05,aresample=22050,flanger=delay=0:depth=2:regen=50:width=71:speed=0.5,aecho=0.8:0.88:15:0.5,highpass=f=200,treble=g=6" \
  -c:a libopus -b:a 64k output.ogg
```

The Full JARVIS Experience

jarvis-voice gives your agent a voice. Pair it with ai-humor-ultimate and you give it a soul — dry wit, contextual humor, the kind of understated sarcasm that makes you smirk at your own terminal. This pairing is part of a 12-skill cognitive architecture we've been building — voice, humor, memory, reasoning, and more. Research papers included, because we're that kind of obsessive.

👉 Explore the full project: github.com/globalcaos/tinkerclaw — Clone it. Fork it. Break it. Make it yours.

Setup: Workspace Files

For voice to work consistently across new sessions, copy the templates to your workspace root:

```bash
cp {baseDir}/templates/VOICE.md ~/.openclaw/workspace/VOICE.md
cp {baseDir}/templates/SESSION.md ~/.openclaw/workspace/SESSION.md
cp {baseDir}/templates/HUMOR.md ~/.openclaw/workspace/HUMOR.md
```

  • VOICE.md — injected every session, enforces voice output rules (like SOUL.md)
  • SESSION.md — session bootstrap that includes voice greeting requirements
  • HUMOR.md — humor configuration at maximum frequency with four pattern types (dry wit, self-aware AI, alien observer, literal idiom)

All three files are auto-loaded by OpenClaw's workspace injection. The agent will speak from the very first reply of every session.

Included Files

| File | Purpose |
| --- | --- |
| bin/jarvis | The TTS + effects script (portable, uses $SHERPA_ONNX_TTS_DIR) |
| templates/VOICE.md | Voice enforcement rules (copy to workspace root) |
| templates/SESSION.md | Session start with voice greeting (copy to workspace root) |
| templates/HUMOR.md | Humor config — four patterns, frequency 1.0 (copy to workspace root) |

Category context

Agent frameworks, memory systems, reasoning layers, and model-native orchestration.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package
4 Docs · 1 Config
  • SKILL.md Primary doc
  • templates/HUMOR.md Docs
  • templates/SESSION.md Docs
  • templates/VOICE.md Docs
  • _meta.json Config