How to Build a Local Voice Satellite for Hermes or OpenClaw

A Nest Mini-style ESP32-S3 mic and speaker puck for LAN agent chats

ESP32 | Smart Home | Intermediate | 60 minutes | 6 components


What you'll build

This guide builds a small voice satellite: an ESP32-S3 board with an INMP441 I2S microphone, a MAX98357A I2S amplifier, a compact speaker, a push-to-talk button, and a simple status LED. The finished device feels like a tiny desk speaker puck, but it stays honest about what runs where: the ESP32 captures a short WAV clip and plays back audio, while a local bridge computer handles speech-to-text, calls your Hermes or OpenClaw agent over the LAN, and returns a synthesized WAV reply.

The starter sketch records three seconds of 16 kHz mono audio, posts it to a bridge endpoint such as http://192.168.1.50:8787/voice, then streams the returned WAV through the I2S amplifier. On the bridge side, Hermes can be reached through its OpenAI-compatible API server when the gateway is enabled on port 8642; OpenClaw exposes OpenAI-compatible chat and responses endpoints from its gateway, commonly on port 18789. Keep those agent credentials on the bridge machine rather than baking them into the microcontroller.
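The size of the uploaded clip follows directly from those numbers. As a minimal sketch, assuming 16-bit PCM samples (the guide gives the sample rate and channel count but not the sample width), three seconds of 16 kHz mono audio is 96,000 bytes of sample data behind a 44-byte WAV header. The constant and function names below are hypothetical and are not taken from the Schematik sketch.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <array>

// Recording parameters from the guide: three seconds of 16 kHz mono audio.
constexpr uint32_t kSampleRateHz = 16000;
constexpr uint16_t kChannels = 1;
constexpr uint16_t kBitsPerSample = 16;  // ASSUMPTION: sample width is not stated in the guide
constexpr uint32_t kRecordSeconds = 3;

constexpr uint32_t kBytesPerSecond = kSampleRateHz * kChannels * (kBitsPerSample / 8);
constexpr uint32_t kPcmBytes = kBytesPerSecond * kRecordSeconds;  // 96000 bytes
constexpr size_t kWavHeaderBytes = 44;

// Write the low `byteCount` bytes of `value` in little-endian order.
static void putLE(uint8_t* dst, uint32_t value, size_t byteCount) {
  for (size_t i = 0; i < byteCount; ++i) {
    dst[i] = static_cast<uint8_t>((value >> (8 * i)) & 0xFF);
  }
}

// Build the canonical 44-byte PCM WAV header for `pcmBytes` of sample data.
std::array<uint8_t, kWavHeaderBytes> makeWavHeader(uint32_t pcmBytes) {
  std::array<uint8_t, kWavHeaderBytes> h{};
  const uint16_t blockAlign = static_cast<uint16_t>(kChannels * (kBitsPerSample / 8));
  memcpy(h.data() + 0, "RIFF", 4);
  putLE(h.data() + 4, 36 + pcmBytes, 4);   // RIFF chunk size: rest of file after this field
  memcpy(h.data() + 8, "WAVE", 4);
  memcpy(h.data() + 12, "fmt ", 4);
  putLE(h.data() + 16, 16, 4);             // fmt chunk size for plain PCM
  putLE(h.data() + 20, 1, 2);              // audio format 1 = PCM
  putLE(h.data() + 22, kChannels, 2);
  putLE(h.data() + 24, kSampleRateHz, 4);
  putLE(h.data() + 28, kBytesPerSecond, 4);
  putLE(h.data() + 32, blockAlign, 2);
  putLE(h.data() + 34, kBitsPerSample, 2);
  memcpy(h.data() + 36, "data", 4);
  putLE(h.data() + 40, pcmBytes, 4);       // size of the sample data that follows
  return h;
}
```

On the device, the header from makeWavHeader(kPcmBytes) would be sent first, followed by the recorded samples, so the request body totals 96,044 bytes under these assumptions. A buffer of that size is worth checking against the RAM or PSRAM available on your particular board.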

This is a practical base for a local voice interface, not an always-listening commercial assistant clone. Start with push-to-talk, a trusted LAN or Tailnet, and a visible recording LED. Once the bridge is reliable, you can add wake-word detection, a nicer enclosure, acoustic echo handling, longer recordings, or provider-specific STT/TTS choices such as local faster-whisper, Edge-style TTS, Piper, ElevenLabs, OpenAI, or whatever your agent host already uses.
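Push-to-talk works best when contact bounce on the button cannot start or stop a recording by accident. The class below is one hypothetical way to debounce it, assuming a 30 ms settle time; the class name, the default delay, and the method signature are illustrative and do not come from the Schematik sketch.

```cpp
#include <stdint.h>

// Hypothetical debouncer: a raw reading must stay unchanged for
// `debounceMs` before it is accepted as the new button state.
class DebouncedButton {
 public:
  explicit DebouncedButton(uint32_t debounceMs = 30) : debounceMs_(debounceMs) {}

  // rawPressed: true while the button currently reads as pressed.
  // nowMs: monotonic milliseconds, for example millis() on Arduino.
  // Returns the debounced (stable) pressed state.
  bool update(bool rawPressed, uint32_t nowMs) {
    if (rawPressed != lastRaw_) {
      // The raw level changed, so restart the settle timer.
      lastRaw_ = rawPressed;
      lastChangeMs_ = nowMs;
    }
    // Unsigned subtraction keeps this correct across millis() rollover.
    if (rawPressed != stable_ && (nowMs - lastChangeMs_) >= debounceMs_) {
      stable_ = rawPressed;
    }
    return stable_;
  }

 private:
  uint32_t debounceMs_;
  uint32_t lastChangeMs_ = 0;
  bool lastRaw_ = false;
  bool stable_ = false;
};
```

If the button connects PTT_PIN to ground and the pin uses an internal pull-up (an assumption about the wiring), a call such as button.update(digitalRead(PTT_PIN) == LOW, millis()) in loop() would give a clean signal for starting a recording and lighting the status LED.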

Wiring diagram

[Interactive wiring diagram. The pin assignments are listed as constants in the Code section.]

Components needed

Component                     Type      Qty
ESP32-S3 development board    board     1
INMP441 I2S microphone        sensor    1
MAX98357A I2S amplifier       actuator  1
Compact 8Ω speaker            actuator  1
Push-to-talk button           other     1
Status LED                    actuator  1

Assembly

1. Wire the voice satellite

Connect the listed modules according to the pin constants in the Code section, keeping all grounds common and checking each module's voltage markings before powering the board.

2. Upload the sketch

Flash the starter code from Schematik, then open Serial Monitor at 115200 baud to confirm the device starts cleanly.

3. Test the behaviour

Exercise the main input/output path: press the button, record a clip, and listen for the reply. Adjust the pin constants only if your board revision uses different labels.
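When testing, it can help to think of one exchange as a small state machine: idle, recording, uploading to the bridge, then playing the reply. The sketch below is a hypothetical model of that path; the state and event names are illustrative and are not taken from the Schematik code.

```cpp
#include <stdint.h>

// Hypothetical states for one push-to-talk exchange.
enum class SatState : uint8_t { Idle, Recording, Uploading, Playing };

// Hypothetical events raised by the button, recorder, HTTP client, and player.
enum class SatEvent : uint8_t {
  ButtonPressed,
  ClipCaptured,
  ReplyReceived,
  PlaybackDone,
  Error
};

// Return the next state. Events that do not apply to the current state
// leave it unchanged, and any error drops back to Idle.
SatState nextState(SatState s, SatEvent e) {
  if (e == SatEvent::Error) return SatState::Idle;
  switch (s) {
    case SatState::Idle:
      return e == SatEvent::ButtonPressed ? SatState::Recording : s;
    case SatState::Recording:
      return e == SatEvent::ClipCaptured ? SatState::Uploading : s;
    case SatState::Uploading:
      return e == SatEvent::ReplyReceived ? SatState::Playing : s;
    case SatState::Playing:
      return e == SatEvent::PlaybackDone ? SatState::Idle : s;
  }
  return SatState::Idle;
}

// The status LED is lit only while the microphone is recording.
bool ledShouldBeOn(SatState s) { return s == SatState::Recording; }
```

Printing the current state to Serial Monitor on each transition is one way to see where a failed exchange stops, for example a device stuck in Uploading when the bridge address is wrong.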

Code

Arduino C++
#include <Arduino.h>

// voice satellite starter. Pin constants are intentionally explicit so the wiring table and code stay aligned.
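// INMP441 microphone (I2S input)
#define MIC_BCLK_PIN 4   // bit clock
#define MIC_WS_PIN 5     // word select
#define MIC_DATA_PIN 6   // serial data from the microphone
// MAX98357A amplifier (I2S output)
#define AMP_BCLK_PIN 7   // bit clock
#define AMP_WS_PIN 15    // word select
#define AMP_DATA_PIN 16  // serial data to the amplifier
// GPIO 0 is a boot strapping pin on the ESP32-S3: avoid holding the button during reset.
#define PTT_PIN 0
#define STATUS_LED_PIN 2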

void setup() {
  Serial.begin(115200);
  delay(200);
  Serial.println("Starting voice satellite");
}

void loop() {
  Serial.println("voice satellite running");
  delay(1000);
}

Libraries: ESP32 I2S
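Because every signal has its own pin constant, a quick sanity check can catch a pin that was accidentally reused after an edit. The helper below is a hypothetical example, not part of the Schematik sketch; the array simply repeats the GPIO numbers from the #define lines above.

```cpp
#include <stddef.h>

// GPIO numbers copied from the starter sketch's #define lines:
// mic BCLK, mic WS, mic DATA, amp BCLK, amp WS, amp DATA, PTT, status LED.
const int kSatellitePins[] = {4, 5, 6, 7, 15, 16, 0, 2};
const size_t kSatellitePinCount =
    sizeof(kSatellitePins) / sizeof(kSatellitePins[0]);

// Hypothetical helper: returns true when no GPIO number appears twice.
bool allPinsUnique(const int* pins, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    for (size_t j = i + 1; j < count; ++j) {
      if (pins[i] == pins[j]) return false;
    }
  }
  return true;
}
```

Calling allPinsUnique(kSatellitePins, kSatellitePinCount) once in setup() and printing a warning when it returns false would flag a conflict before any I2S peripheral is started.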

Ready to build this?

Open this project in Schematik to get the full wiring diagram, pin assignments, and deployable code for the Hermes Voice Satellite.
