Tencent SkillHub · Communication & Collaboration

whatsappVoiceOpenSkill

Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.

skill · openclaw · clawhub · Free

0 Downloads

0 Stars

0 Installs

0 Score

High Signal

Install for OpenClaw

Quick setup

Download the package from Yavira.
Extract the archive and review SKILL.md first.
Import or place the package into your OpenClaw setup.

Requirements

Target platform: OpenClaw
Install method: Manual import
Extraction: Extract archive
Prerequisites: OpenClaw
Primary doc: SKILL.md

Package facts

Download mode: Yavira redirect
Package format: ZIP package
Source platform: Tencent SkillHub
What's included: COMMUNITY-NOTES.md, example-custom-intents.js, package.json, requirements.txt, SKILL.md, scripts/transcribe.py

Validation

Use the Yavira download entry.
Review SKILL.md after the package is downloaded.
Confirm the extracted package contains the expected setup assets.

Install with your agent

Agent handoff

Hand the extracted package to your coding agent with a concrete install brief instead of figuring it out manually.

Download the package from Yavira.
Extract it into a folder your agent can access.
Paste one of the prompts below and point your agent at the extracted folder.

New install

I downloaded a skill package from Yavira. Read SKILL.md from the extracted folder and install it by following the included instructions. Tell me what you changed and call out any manual steps you could not complete.

Upgrade existing

I downloaded an updated skill package from Yavira. Read SKILL.md from the extracted folder, compare it with my current installation, and upgrade it while preserving any custom configuration unless the package docs explicitly say otherwise. Summarize what changed and any follow-up checks I should run.

Trust & source

Release facts

Source: Tencent SkillHub
Verification: Indexed source record
Version: 1.0.0

Provenance

Publisher: syedateebulislam

Documentation

Primary doc: SKILL.md (20 sections)

WhatsApp Voice Talk

Turn WhatsApp voice messages into real-time conversations. This skill provides a complete pipeline: voice → transcription → intent detection → response generation → text-to-speech.

Perfect for:

Voice assistants on WhatsApp
Hands-free command interfaces
Multi-lingual chatbots
IoT voice control (drones, smart home, etc.)

1. Install Dependencies

pip install openai-whisper soundfile numpy

2. Process a Voice Message

    const { processVoiceNote } = require('./scripts/voice-processor');
    const fs = require('fs');

    // CommonJS modules do not allow top-level await, so wrap the call in an async function.
    async function main() {
      // Read a voice message (OGG, WAV, MP3, etc.)
      const buffer = fs.readFileSync('voice-message.ogg');

      // Process it
      const result = await processVoiceNote(buffer);
      console.log(result);
      // {
      //   status: 'success',
      //   response: "Current weather in Delhi is 19°C, haze. Humidity is 56%.",
      //   transcript: "What's the weather today?",
      //   intent: 'weather',
      //   language: 'en',
      //   timestamp: 1769860205186
      // }
    }

    main().catch(console.error);

3. Run Auto-Listener

For automatic processing of incoming WhatsApp voice messages:

    node scripts/voice-listener-daemon.js

This polls ~/.clawdbot/media/inbound/ every 5 seconds and processes new voice files.
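The polling behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the contents of voice-listener-daemon.js: the names findNewVoiceFiles, startPolling, and AUDIO_EXTENSIONS are hypothetical, and the list of extensions is an assumption.

```javascript
const fs = require('fs');
const path = require('path');

// Assumption: which extensions count as voice files (not taken from the skill's source).
const AUDIO_EXTENSIONS = new Set(['.ogg', '.opus', '.wav', '.mp3', '.flac']);

// Return full paths of audio files in `dir` that have not been seen before,
// and record their names in `seen` so the next poll skips them.
function findNewVoiceFiles(dir, seen) {
  if (!fs.existsSync(dir)) return [];
  const fresh = [];
  for (const name of fs.readdirSync(dir).sort()) {
    const ext = path.extname(name).toLowerCase();
    if (!AUDIO_EXTENSIONS.has(ext) || seen.has(name)) continue;
    seen.add(name);
    fresh.push(path.join(dir, name));
  }
  return fresh;
}

// Poll `dir` every `intervalMs` milliseconds and pass each new file to `onFile`.
// Returns the timer so the caller can stop polling with clearInterval().
function startPolling(dir, onFile, intervalMs = 5000) {
  const seen = new Set();
  return setInterval(() => {
    for (const file of findNewVoiceFiles(dir, seen)) onFile(file);
  }, intervalMs);
}
```

Tracking seen file names in memory keeps the sketch short, but it means files are reprocessed after a restart; a real daemon would likely persist that state or move processed files elsewhere.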

How It Works

Incoming Voice Message
↓
Transcribe (Whisper, run locally)
↓
"What's the weather?"
↓
Detect Language & Intent
↓
Match against INTENTS
↓
Execute Handler
↓
Generate Response
↓
Convert to TTS
↓
Send back via WhatsApp
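The flow above can be read as a chain of stages. The sketch below shows one way to wire them together; it is an illustration under assumptions, not the skill's actual processVoiceNote. The function runPipeline, the stages object, and the 'unknown_intent' status are all hypothetical, handlers are keyed by intent name for brevity, and the TTS and WhatsApp send steps are left out.

```javascript
// Hypothetical orchestration of the stages in the flow above.
// Each stage is injected so it can be swapped out or stubbed.
async function runPipeline(audioBuffer, stages) {
  // 1. Speech to text
  const transcript = await stages.transcribe(audioBuffer);

  // 2. Language and intent detection on the transcript
  const language = stages.detectLanguage(transcript);
  const intent = stages.detectIntent(transcript);

  // 3. Look up and run the handler for the matched intent
  const handler = stages.handlers[intent];
  if (!handler) {
    // Assumed fallback shape; the real skill may respond differently.
    return { status: 'unknown_intent', transcript, intent, language };
  }
  const { response } = await handler(language, transcript);

  // 4. Return a result shaped like the processVoiceNote example output
  return { status: 'success', response, transcript, intent, language, timestamp: Date.now() };
}
```

Passing the stages in as arguments keeps the transcription backend, intent matcher, and handlers independent, so each can be tested without audio files or a running Whisper model.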

Key Features

✅ Zero Setup Complexity - No FFmpeg, no complex dependencies. Uses soundfile + Whisper.
✅ Multi-Language - Automatic English/Hindi detection. Extend easily.
✅ Intent-Driven - Define custom intents with keywords and handlers.
✅ Real-Time Processing - 5-10 seconds per message (after first model load).
✅ Customizable - Add weather, status, commands, or anything else.
✅ Production Ready - Built from real usage in Clawdbot.

Weather Bot

    // User says: "What's the weather in Delhi?"
    // Response: "Current weather in Delhi is 19°C..."
    // (Built-in intent, just enable it)

Smart Home Control

    // User says: "Turn on the lights"
    // Handler: Sends signal to smart home API
    // Response: "Lights turned on"

Task Manager

    // User says: "Add milk to shopping list"
    // Handler: Adds to database
    // Response: "Added milk to your list"

Status Checker

    // User says: "Is the system running?"
    // Handler: Checks system status
    // Response: "All systems online"

Add a Custom Intent

Edit voice-processor.js:

Add to the INTENTS map:

    const INTENTS = {
      'shopping': {
        keywords: ['shopping', 'list', 'buy', 'खरीद'],
        handler: 'handleShopping'
      }
    };

Add the handler:

    const handlers = {
      async handleShopping(language = 'en') {
        return {
          status: 'success',
          response: language === 'en'
            ? "What would you like to add to your shopping list?"
            : "आप अपनी शॉपिंग लिस्ट में क्या जोड़ना चाहते हैं?"
        };
      }
    };
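To show how keywords in the INTENTS map could select an intent, here is a minimal keyword-matching sketch. It assumes simple case-insensitive substring matching, which may differ from what voice-processor.js actually does; detectIntent is a hypothetical name, and the weather entry and its keywords are invented for illustration.

```javascript
// Assumed shape, mirroring the INTENTS map above. The weather keywords are illustrative.
const INTENTS = {
  weather: { keywords: ['weather', 'temperature', 'मौसम'], handler: 'handleWeather' },
  shopping: { keywords: ['shopping', 'list', 'buy', 'खरीद'], handler: 'handleShopping' },
};

// Return the name of the first intent that has a keyword contained
// in the transcript, or null when nothing matches.
function detectIntent(transcript, intents = INTENTS) {
  const text = transcript.toLowerCase();
  for (const [name, { keywords }] of Object.entries(intents)) {
    if (keywords.some((kw) => text.includes(kw.toLowerCase()))) return name;
  }
  return null;
}

console.log(detectIntent('Add milk to shopping list')); // shopping
console.log(detectIntent('मुझे दूध खरीदना है'));          // shopping
console.log(detectIntent('Hello there'));               // null
```

Substring matching is easy to extend but also matches inside longer words (for example, "list" inside "listen"), so short keywords may need word-boundary checks in practice.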

Support More Languages

Update detectLanguage() for your language's Unicode range:

    const urduChars = /[\u0600-\u06FF]/g; // Add this

Add the language code to the handler returns:

    return language === 'ur' ? 'Urdu response' : 'English response';

Set the language in transcribe.py:

    result = model.transcribe(data, language="ur")
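For illustration, a script-based detector along these lines might look like the sketch below. It is an assumption about how detectLanguage() could work, not the skill's actual code, and SCRIPT_RANGES is a hypothetical name.

```javascript
// Hypothetical mapping from language code to the Unicode block of its script.
const SCRIPT_RANGES = {
  hi: /[\u0900-\u097F]/g, // Devanagari
  ur: /[\u0600-\u06FF]/g, // Arabic block (used for Urdu)
};

// Return the language whose script has the most characters in the text,
// or 'en' when none of the listed scripts appear at all.
function detectLanguage(text) {
  let best = 'en';
  let bestCount = 0;
  for (const [code, pattern] of Object.entries(SCRIPT_RANGES)) {
    // String.prototype.match with a /g pattern returns every match (or null).
    const count = (text.match(pattern) || []).length;
    if (count > bestCount) {
      best = code;
      bestCount = count;
    }
  }
  return best;
}

console.log(detectLanguage("What's the weather?")); // en
console.log(detectLanguage('आज मौसम कैसा है'));       // hi
console.log(detectLanguage('آج موسم کیسا ہے'));       // ur
```

Script ranges only identify a writing system: the Arabic block is shared by Urdu, Arabic, and Persian, so a detector like this cannot tell those apart without additional signals such as the language Whisper reports.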

Change Transcription Model

In transcribe.py, load exactly one of the following models:

    # model = whisper.load_model("tiny")    # Fastest, ~75MB download (39M parameters)
    model = whisper.load_model("base")      # Default, ~140MB
    # model = whisper.load_model("small")   # More accurate, ~466MB
    # model = whisper.load_model("medium")  # Most accurate of these, ~1.5GB

Architecture

Scripts:

transcribe.py - Whisper transcription (Python)
voice-processor.js - Core logic (intent parsing, handlers)
voice-listener-daemon.js - Auto-listener watching for new messages

References:

SETUP.md - Installation and configuration
API.md - Detailed function documentation

Integration with Clawdbot

If running as a Clawdbot skill, hook into message events:

    // In your Clawdbot handler
    const { processVoiceNote } = require('skills/whatsapp-voice-talk/scripts/voice-processor');

    message.on('voice', async (audioBuffer) => {
      const result = await processVoiceNote(audioBuffer, message.from);

      // Send the response back as text
      await message.reply(result.response);

      // Or send it as voice instead (requires TTS)
      // await sendVoiceMessage(result.response);
    });

Performance

First run: ~30 seconds (downloads Whisper model, ~140MB)
Typical: 5-10 seconds per message
Memory: ~1.5GB (base model)
Languages: English, Hindi (easily extended)

Supported Audio Formats

OGG (Opus), WAV, FLAC, MP3, CAF, AIFF, and more via libsndfile. WhatsApp uses Opus-encoded OGG by default, which works out of the box.

Troubleshooting

"No module named 'whisper'"

    pip install openai-whisper

"No module named 'soundfile'"

    pip install soundfile

Voice messages not processing?

Check: clawdbot status (is it running?)
Check: ~/.clawdbot/media/inbound/ (files arriving?)
Run daemon manually: node scripts/voice-listener-daemon.js (see logs)

Slow transcription?

Use a smaller model: whisper.load_model("tiny") if you are on "base", or "base" if you are on a larger model.

License

MIT - Use freely, customize, contribute back! Built for real-world use in Clawdbot. Battle-tested with multiple languages and use cases.

Category context

Messaging, meetings, inboxes, CRM, and teammate communication surfaces.

Source: Tencent SkillHub

Largest current source with strong distribution and engagement signals.

Package contents

Included in package

2 Docs, 2 Scripts, 1 Config, 1 File

SKILL.md Primary doc
COMMUNITY-NOTES.md Docs
example-custom-intents.js Scripts
scripts/transcribe.py Scripts
package.json Config
requirements.txt Files