How to Build an ESP32 Bluetooth Speaker
Stream phone audio to real speakers with an I2S DAC and XH-A232 amp

What you'll build
Build a proper Bluetooth speaker around an ESP32 DevKit v1. The ESP32 acts as a Bluetooth Classic A2DP audio sink, sends digital audio over I2S to a PCM5102A DAC, and the DAC feeds line-level left/right audio into an XH-A232 TPA3110 stereo amplifier. A 12 V adapter powers the amplifier directly, while an MP1584 buck converter steps the same supply down to 5 V for the ESP32. An SSD1306 OLED shows a small beat-reactive smiley and bar visualizer while music is playing.
This is not the tiny “play a tone from a buzzer” version of a speaker. The PCM5102A keeps the ESP32 out of the analog-audio path, and the XH-A232 gives enough output for real 4–8 Ω passive speakers. The important bit is the power split: 12 V goes only to the amp and buck input; the ESP32 gets 5 V on VIN from the buck; all grounds are shared so the audio reference is stable.
The firmware has four jobs: make the ESP32 discoverable as SmartSpeaker, stream Bluetooth A2DP audio out through I2S on GPIO 26/25/22, keep the PCM5102A unmuted through XSMT on GPIO 27, and update the OLED visualizer from a read-only audio callback. If your PCM5102A breakout labels the data pin as DATA instead of DIN, wire it exactly where this guide says DIN — it is the same I2S data input.
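The read-only audio callback receives the stream as raw bytes rather than samples. The sketch below shows one way to turn such a buffer into mono samples for a visualizer. It assumes 16-bit little-endian interleaved stereo (4 bytes per frame), which is how the listing's callback treats the data; `downmixToMono` is a hypothetical helper name, not part of the ESP32-A2DP library.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Downmix interleaved 16-bit stereo PCM (L0 R0 L1 R1 ...) to mono.
// `len` is a byte count, so each stereo frame occupies 4 bytes.
// Assumes the host byte order matches the stream (little-endian, as on the ESP32).
std::vector<int16_t> downmixToMono(const uint8_t *data, uint32_t len) {
    const uint32_t frames = len / 4;
    std::vector<int16_t> mono(frames);
    for (uint32_t i = 0; i < frames; i++) {
        int16_t left = 0;
        int16_t right = 0;
        std::memcpy(&left, data + i * 4, sizeof(left));
        std::memcpy(&right, data + i * 4 + 2, sizeof(right));
        // Widen before adding so two loud samples cannot overflow int16_t.
        const int32_t sum = static_cast<int32_t>(left) + right;
        mono[i] = static_cast<int16_t>(sum / 2);
    }
    return mono;
}
```

Because the function only reads the buffer, the same bytes can still go to the DAC untouched.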
Upload and calibrate
Open the starter in Schematik and deploy it to an ESP32 DevKit v1. Open Serial Monitor at 115200 baud. A healthy boot prints a Bluetooth-ready message and the board appears as SmartSpeaker in your phone or laptop Bluetooth list within a few seconds.
Pair with SmartSpeaker, start music at low volume, then raise the volume from the phone or laptop. The XH-A232 does the speaker driving; the ESP32 and PCM5102A only handle the digital and line-level stages. The OLED should show the smiley and bars reacting to audio. If the OLED is blank but audio works, re-check the display address and confirm that SCL is wired to GPIO 23 rather than the ESP32's usual GPIO 22, which carries I2S data in this build.
Set the phone volume low before the first power-on. The XH-A232 can get loud quickly on efficient speakers, and a wiring mistake on the analog input is easier to catch before the amplifier is working hard.
Troubleshooting
- SmartSpeaker does not appear in Bluetooth scans. Reset the ESP32 and watch Serial Monitor. The firmware starts Bluetooth Classic A2DP after setup; if it reboots repeatedly, check for a power dip from the buck converter or a wrong library install.
- Compile says BluetoothA2DPSink.h is missing. Use the GitHub install URL for the ESP32-A2DP library, not the short package name. The starter declares the library from https://github.com/pschatzmann/ESP32-A2DP for that reason.
- PCM5102A has DATA, not DIN. Wire DATA to ESP32 GPIO 22. On these DAC modules DATA and DIN are two labels for the same I2S data input.
- Audio plays but there is high-pitched noise. Shorten the I2S and analog wires, keep the 12 V amplifier wiring away from the DAC outputs, and confirm AGND and GND join at the shared ground.
- OLED stays blank. Confirm VCC is on 3V3, SDA is GPIO 21, SCL is GPIO 23, and the module address is 0x3C. Some boards use 0x3D.
- ESP32 resets when music gets loud. The amplifier is pulling the 12 V supply down or injecting noise into the ground. Use a supply with more current headroom, twist the power leads, and keep the buck output wiring short.
Going further
Once the basic speaker is solid, move it from breadboard to perfboard or a small enclosure. Add a physical power switch on the 12 V input, strain relief for speaker wires, and a front panel for the OLED. If you want cleaner audio at higher volume, use shielded cable between the PCM5102A and XH-A232 and add local decoupling near the amplifier power input.
Wiring diagram
Components needed
| Component | Type | Qty | Buy |
|---|---|---|---|
| MP1584EN 3A Adjustable Buck Converter Module | power | 1 | Buy |
| SSD1306 OLED | display | 1 | Buy |
| Adafruit PCM5102 I2S DAC with Line Level Output | other | 1 | $4.95 |
| Power Supply, Power Adapter, 12V/2A, DC Jack Output, ORD-PSU-12V2A-5.5-2.1 | power | 1 | |
| Left Speaker | actuator | 1 | |
| Right Speaker | actuator | 1 | |
| XH-A232 TPA3110 Stereo Amplifier Board | other | 1 | Buy |
Supplier links, prices, and availability are shown as a guide and may change. Schematik may earn a commission from purchases made through affiliate links.
Assembly
Gather all parts
Collect the ESP32 DevKit v1, PCM5102A I2S DAC module, SSD1306 I2C OLED module, XH-A232 TPA3110 amplifier board, MP1584 buck converter module (pre-set to 5V output), 12V/2A barrel-jack adapter, two 4-8 Ω passive speakers (≥15W each), and a breadboard or perfboard for connections.
- Verify your buck converter is set to 5.0V output BEFORE connecting the ESP32 — use a multimeter with only the 12V supply attached.
Set up the 12V power rail
Wire the 12V barrel-jack adapter output to two places: (a) the XH-A232 VCC (+) and GND (−) screw terminals, and (b) the MP1584 buck converter VIN (+) and GND (−) input pads.
- Check polarity carefully — reverse voltage permanently damages both the TPA3110 and the MP1584.
- Do NOT plug the 12V adapter in yet.
Adjust and verify the buck converter to 5V
Connect ONLY the 12V adapter (no ESP32 yet). Power on, measure the MP1584 VOUT with a multimeter, and turn its trim potentiometer until VOUT reads 5.0V. Power off before proceeding.
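- On most MP1584 modules, turning the pot clockwise raises the voltage and counter-clockwise lowers it. Turn slowly and watch the meter; if the reading moves the wrong way, reverse direction.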
Power the ESP32 from the buck converter
Connect the MP1584 VOUT (+5V) to the ESP32 DevKit VIN pin, and MP1584 GND to ESP32 GND.
- Use the VIN pin on the ESP32, not the 3V3 pin — VIN accepts up to 12V and feeds the on-board regulator.
Connect the SSD1306 OLED (I2C)
Wire the display to the ESP32: • ESP32 3V3 → OLED VCC • ESP32 GND → OLED GND • GPIO21 → OLED SDA • GPIO23 → OLED SCL
- SCL sits on GPIO 23 in this build because GPIO 22 carries I2S data to the DAC.
Connect ESP32 to PCM5102A (I2S)
Wire the I2S bus between the ESP32 and the PCM5102A module: • GPIO26 → PCM5102A BCK • GPIO25 → PCM5102A LCK (LRCLK) • GPIO22 → PCM5102A DIN Also connect: • GPIO27 → PCM5102A XSMT (soft mute; the firmware drives it HIGH while audio is streaming) • ESP32 3V3 → PCM5102A VCC • ESP32 GND → PCM5102A GND and AGND
- Keep I2S wires as short as possible to avoid noise. 10 cm or less is ideal.
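Short wires matter because the I2S bit clock runs in the megahertz range. The sketch below estimates it, assuming a 44.1 kHz, 16-bit stereo stream, which is common for A2DP; the function name is hypothetical, and a driver may clock more bits per slot than the sample width, which raises the figure.

```cpp
#include <cstdint>

// Minimum I2S bit clock: one clock per bit, for every channel of every sample.
constexpr uint32_t i2sBitClockHz(uint32_t sampleRateHz, uint32_t bitsPerSample,
                                 uint32_t channels) {
    return sampleRateHz * bitsPerSample * channels;
}

// 44.1 kHz, 16-bit stereo works out to 1,411,200 Hz on the BCK line.
static_assert(i2sBitClockHz(44100, 16, 2) == 1411200, "BCK for CD-rate stereo");
```

At roughly 1.4 MHz with fast edges, long jumper wires act as small antennas, and the resulting ringing can show up as clicks or hiss.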
Connect PCM5102A analog output to XH-A232 input
Connect the PCM5102A line-level outputs to the amplifier input terminals: • PCM5102A AOUTL → XH-A232 L_IN (Left input) • PCM5102A AOUTR → XH-A232 R_IN (Right input) • PCM5102A AGND → XH-A232 AGND (signal ground) Use short shielded wire or twisted pairs to minimise hum.
- Share a common GND between the PCM5102A and XH-A232 to eliminate ground loops.
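- PCM5102A full-scale output is about 2.1 Vrms, which is above the XH-A232's ~0.775 V input sensitivity, so the amplifier reaches full power before the source volume is at maximum. Back the source volume off if the sound distorts.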
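The comparison between the DAC's line level and the amplifier's input sensitivity is easier to reason about in decibels. This is a minimal sketch with a hypothetical helper name; plug in whatever levels apply to your modules.

```cpp
#include <cmath>

// Difference between a source level and an input sensitivity, in dB.
// Positive: the source can overdrive the input at full scale.
// Negative: the amplifier cannot reach full output from this source.
double levelDifferenceDb(double sourceVrms, double sensitivityVrms) {
    return 20.0 * std::log10(sourceVrms / sensitivityVrms);
}
```

A result of 0 dB means the source at full scale exactly drives the amplifier to full output, and every doubling of source voltage adds about 6 dB. A positive result is the amount of source volume you can give up before the amplifier stops reaching full power.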
Connect the speakers
Connect each speaker to the XH-A232 screw terminals: • Left speaker + → XH-A232 L+ • Left speaker − → XH-A232 L− • Right speaker + → XH-A232 R+ • Right speaker − → XH-A232 R−
- Never short the speaker output terminals together — it will damage the TPA3110.
- Ensure speakers are rated 4–8 Ω impedance and at least 15W.
Final check and power-on
Double-check all connections against the wiring diagram. Plug in the 12V adapter. The ESP32 should boot, Serial Monitor prints a Bluetooth-ready message, and the OLED shows 'Waiting for BT...'. Pair your phone or laptop with the Bluetooth device named 'SmartSpeaker' and start playing audio.
- If you hear hum, make sure the PCM5102A AGND reaches the amplifier through a single wire with no second ground path alongside it, or add a 100 Ω resistor in series on each audio input line.
- Volume is controlled entirely by your source device.
Pin assignments
| Pin | Connection | Type |
|---|---|---|
| VIN | buck_5v VOUT (5 V) | POWER |
| GND | buck_5v GND | GROUND |
| 3V3 | oled VCC | POWER |
| GND | oled GND | GROUND |
| GPIO 21 | oled SDA | I2C |
| GPIO 23 | oled SCL | I2C |
| 3V3 | pcm5102 VCC | POWER |
| GND | pcm5102 GND | GROUND |
| GND | pcm5102 AGND | GROUND |
| GPIO 26 | pcm5102 BCK | DATA |
| GPIO 25 | pcm5102 LCK | DATA |
| GPIO 22 | pcm5102 DIN | DATA |
| GPIO 27 | pcm5102 XSMT | DIGITAL |
| EXT | pcm5102 AOUTL → XH-A232 TPA3110 Stereo Amplifier Board L_IN | ANALOG |
| EXT | pcm5102 AOUTR → XH-A232 TPA3110 Stereo Amplifier Board R_IN | ANALOG |
| EXT | pcm5102 AGND → XH-A232 TPA3110 Stereo Amplifier Board AGND | GROUND |
| EXT | psu_12v +12V → MP1584 Buck Converter VIN | POWER |
| EXT | psu_12v GND → MP1584 Buck Converter GND | GROUND |
| EXT | psu_12v +12V → XH-A232 TPA3110 Stereo Amplifier Board VCC | POWER |
| EXT | psu_12v GND → XH-A232 TPA3110 Stereo Amplifier Board GND | GROUND |
| EXT | xha232 L+ → Left Speaker SP+ | ANALOG |
| EXT | xha232 L- → Left Speaker SP- | ANALOG |
| EXT | xha232 R+ → Right Speaker SP+ | ANALOG |
| EXT | xha232 R- → Right Speaker SP- | ANALOG |
Code
#include <Arduino.h>
#include "BluetoothA2DPSink.h"
#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>
#include <math.h>
// ── Pins ──────────────────────────────────────────────────────────────────────
#define I2S_BCK_PIN 26
#define I2S_LCK_PIN 25
#define I2S_DIN_PIN 22
#define XSMT_PIN 27
#define OLED_SDA_PIN 21
#define OLED_SCL_PIN 23
// ── OLED ──────────────────────────────────────────────────────────────────────
#define SCREEN_WIDTH 128
#define SCREEN_HEIGHT 64
#define OLED_ADDR 0x3C
// ── Visualizer ────────────────────────────────────────────────────────────────
#define NUM_BARS 16
#define BAR_MAX_H 16
#define BUF_SAMPLES 512
// ── Globals ───────────────────────────────────────────────────────────────────
void audio_data_callback(const uint8_t *data, uint32_t len);
void connection_state_changed(esp_a2d_connection_state_t state, void *ptr);
void audio_state_changed(esp_a2d_audio_state_t state, void *ptr);
void drawCuteSmiley(int cx, int cy, int r, bool excited);
void drawArms(int cx, int cy, int r, float e, uint32_t t, int move);
void updateBars();
void drawVisualizer();
void displayTask(void *param);
static int16_t audioBuf[BUF_SAMPLES];
static volatile uint16_t audioBufLen = 0;
static SemaphoreHandle_t bufMutex;
static float bars[NUM_BARS];
static volatile float energy = 0.0f;
static volatile float energyFast = 0.0f;
static volatile bool isStreaming = false;
// Beat detection for triggering dance moves
static volatile float beatAvg = 0.0f;
static volatile bool beatHit = false;
Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, -1);
BluetoothA2DPSink a2dp_sink;
// ── Audio tap: read-only hook. ───────────────────────────────────────────────
// The second argument to set_stream_reader() (i2s_output = true) means the
// library STILL writes the data to I2S itself; this callback only inspects it.
// It does NOT replace/disable the I2S output path.
void audio_data_callback(const uint8_t *data, uint32_t len) {
const int16_t *src = (const int16_t *)data;
uint32_t stereo = len / 4;
uint32_t n = stereo < BUF_SAMPLES ? stereo : BUF_SAMPLES;
float sum = 0;
for (uint32_t i = 0; i < n; i++) {
float s = (((int32_t)src[i*2] + src[i*2+1]) / 2.0f) / 32768.0f;
sum += s * s;
}
float rms = (n > 0) ? sqrtf(sum / n) : 0.0f;
// Perceptual (sqrt) curve + higher gain: quiet passages get boosted much
// more than loud ones, so the animation reacts well at low volume without
// the loud parts clipping flat at 1.0.
float e = fminf(sqrtf(rms * 22.0f), 1.0f);
// Instant attack, quick decay — strong beat response
energyFast = (e > energyFast) ? e : energyFast * 0.82f;
energy = energy * 0.7f + e * 0.3f;
// Simple beat detector: flag when energy spikes above a slow running average.
// Slower average (0.98) makes transient kicks stand out; lower threshold and
// floor so even quiet music reliably triggers move changes.
beatAvg = beatAvg * 0.98f + e * 0.02f;
if (e > beatAvg * 1.22f && e > 0.06f) beatHit = true;
if (xSemaphoreTake(bufMutex, 0) == pdTRUE) {
for (uint32_t i = 0; i < n; i++)
audioBuf[i] = (int16_t)(((int32_t)src[i*2] + src[i*2+1]) >> 1);
audioBufLen = (uint16_t)n;
xSemaphoreGive(bufMutex);
}
}
// ── BT callbacks ──────────────────────────────────────────────────────────────
void connection_state_changed(esp_a2d_connection_state_t state, void *ptr) {
Serial.println(state == ESP_A2D_CONNECTION_STATE_CONNECTED
? "[BT] Connected" : "[BT] Disconnected");
}
void audio_state_changed(esp_a2d_audio_state_t state, void *ptr) {
if (state == ESP_A2D_AUDIO_STATE_STARTED) {
isStreaming = true;
digitalWrite(XSMT_PIN, HIGH);
Serial.println("[BT] Streaming started");
} else {
isStreaming = false;
digitalWrite(XSMT_PIN, LOW);
Serial.println("[BT] Streaming stopped");
for (int i = 0; i < NUM_BARS; i++) bars[i] = 0;
energy = 0; energyFast = 0;
}
}
// ── Cute happy smiley ────────────────────────────────────────────────────────
// Big round head, large shiny eyes, rosy cheeks, real upward grin.
void drawCuteSmiley(int cx, int cy, int r, bool excited) {
// Head — double-stroke for a bold look
display.drawCircle(cx, cy, r, SSD1306_WHITE);
display.drawCircle(cx, cy, r - 1, SSD1306_WHITE);
int eo = r / 3 + 2; // eye spacing
int ey = cy - r / 5; // eyes in upper half
int er = max(r / 4, 3); // eye radius
if (excited) {
// Happy "^ ^" closed eyes on a strong beat: each arch peaks in the middle.
// Screen +Y is DOWN, so the peak needs ey - dy (ey + dy would draw "u u").
for (int dx = -er; dx <= er; dx++) {
int dy = (int)(sqrtf((float)(er*er - dx*dx)) * 0.7f);
display.drawPixel(cx - eo + dx, ey - dy, SSD1306_WHITE);
display.drawPixel(cx - eo + dx, ey - dy + 1, SSD1306_WHITE);
display.drawPixel(cx + eo + dx, ey - dy, SSD1306_WHITE);
display.drawPixel(cx + eo + dx, ey - dy + 1, SSD1306_WHITE);
}
} else {
// Big shiny round eyes
display.fillCircle(cx - eo, ey, er, SSD1306_WHITE);
display.fillCircle(cx + eo, ey, er, SSD1306_WHITE);
display.fillCircle(cx - eo + 1, ey - 1, 1, SSD1306_BLACK); // shine
display.fillCircle(cx + eo + 1, ey - 1, 1, SSD1306_BLACK);
}
// Rosy cheeks
int ck = r * 3 / 4;
int cky = cy + r / 5;
display.drawCircle(cx - ck, cky, 2, SSD1306_WHITE);
display.drawCircle(cx + ck, cky, 2, SSD1306_WHITE);
// ── Upward grin ── on screen +Y is DOWN, so a smile dips in the MIDDLE
int mw = r * 3 / 5; // mouth half-width
int my = cy + r / 5; // mouth corners baseline
int depth = excited ? (r * 3 / 5) : (r * 2 / 5);
for (int dx = -mw; dx <= mw; dx++) {
float t = (float)dx / mw; // -1..1
int dy = (int)((1.0f - t * t) * depth); // center dips DOWN = smile
display.drawPixel(cx + dx, my + dy, SSD1306_WHITE);
display.drawPixel(cx + dx, my + dy + 1, SSD1306_WHITE);
}
}
// ── Dancing arms — move = current dance move (0 wave, 1 raise-the-roof, 2 sway) ─
void drawArms(int cx, int cy, int r, float e, uint32_t t, int move) {
int sx = r;
int sy = cy + r / 3;
float osc = sinf((float)t / 220.0f);
float beat = fminf(e * 26.0f, 22.0f);
int lx2, ly2, rx2, ry2;
if (move == 1) {
// "Raise the roof" — both hands punch straight up on the beat
int lift = (int)(8 + beat);
lx2 = cx - sx - 4; ly2 = sy - lift;
rx2 = cx + sx + 4; ry2 = sy - lift;
} else if (move == 2) {
// Side sway — both arms swing the same way, flipping with the oscillator
int swing = (int)(osc * (10 + beat * 0.5f));
lx2 = cx - sx - 12 + swing; ly2 = sy + 2;
rx2 = cx + sx + 12 + swing; ry2 = sy + 2;
} else {
// Wave — arms alternate up/down (classic dance)
lx2 = cx - sx - 12; ly2 = (int)(sy - beat + osc * 5.0f);
rx2 = cx + sx + 12; ry2 = (int)(sy - beat - osc * 5.0f);
}
display.drawLine(cx - sx, sy, lx2, ly2, SSD1306_WHITE);
display.fillCircle(lx2, ly2, 2, SSD1306_WHITE);
display.drawLine(cx + sx, sy, rx2, ry2, SSD1306_WHITE);
display.fillCircle(rx2, ry2, 2, SSD1306_WHITE);
}
// ── Bars ──────────────────────────────────────────────────────────────────────
void updateBars() {
static int16_t snap[BUF_SAMPLES];
uint16_t snapLen = 0;
if (xSemaphoreTake(bufMutex, pdMS_TO_TICKS(4)) == pdTRUE) {
snapLen = audioBufLen;
memcpy(snap, audioBuf, snapLen * sizeof(int16_t));
xSemaphoreGive(bufMutex);
}
if (snapLen == 0) return;
int spb = max(1, (int)snapLen / NUM_BARS);
for (int b = 0; b < NUM_BARS; b++) {
int start = b * spb;
int end = min(start + spb, (int)snapLen);
float sum = 0; int cnt = 0;
for (int i = start; i < end; i++) {
float s = snap[i] / 32768.0f;
sum += s * s; cnt++;
}
float rms = (cnt > 0) ? sqrtf(sum / cnt) : 0.0f;
// Same perceptual sqrt curve as the energy meter so quiet bars still
// rise to a visible height instead of sitting flat at the bottom.
float target = fminf(sqrtf(rms * 22.0f), 1.0f);
bars[b] = (target > bars[b]) ? target : bars[b] * 0.78f;
}
}
// ── Main draw (core 0) ───────────────────────────────────────────────────────
void drawVisualizer() {
static int danceMove = 0;
static uint32_t lastSwitch = 0;
display.clearDisplay();
uint32_t now = millis();
if (!isStreaming) {
drawCuteSmiley(64, 26, 16, false);
display.setTextSize(1);
display.setTextColor(SSD1306_WHITE);
display.setCursor(16, 52);
display.print("Waiting for BT...");
display.display();
return;
}
updateBars();
// Cycle through the dance moves:
// - advance on a detected beat (min 700ms apart so a move is visible), OR
// - advance on a 2.5s timeout so it ALWAYS keeps alternating even when the
// beat detector is quiet. Each move gets at least one full bar to show.
bool beat = beatHit;
beatHit = false;
if ((beat && now - lastSwitch > 700) || (now - lastSwitch > 2500)) {
danceMove = (danceMove + 1) % 3;
lastSwitch = now;
}
float ef = energyFast;
bool exc = ef > 0.35f; // big excited face on strong beats
int r = 16 + (int)(ef * 8.0f); // 16..24 — clearly grows
int bcy = 24 - (int)(ef * 6.0f); // bounces up on beat
drawArms(64, bcy, r, ef, now, danceMove);
drawCuteSmiley(64, bcy, r, exc);
// Full-width spectrum bars across the bottom
int barW = SCREEN_WIDTH / NUM_BARS;
for (int b = 0; b < NUM_BARS; b++) {
int x = b * barW;
int h = (int)(bars[b] * BAR_MAX_H);
if (h > 0) display.fillRect(x, SCREEN_HEIGHT - h, barW - 1, h, SSD1306_WHITE);
}
display.display();
}
// ── Display task pinned to core 0 (keeps I2C off the BT/I2S core) ────────────
void displayTask(void *param) {
for (;;) {
drawVisualizer();
vTaskDelay(pdMS_TO_TICKS(33));
}
}
// ── Setup ─────────────────────────────────────────────────────────────────────
void setup() {
Serial.begin(115200);
Serial.println("SCHEMATIK SPEAKER starting...");
pinMode(XSMT_PIN, OUTPUT);
digitalWrite(XSMT_PIN, LOW);
// I2C bus recovery before Wire.begin (clears a stuck SSD1306)
pinMode(OLED_SCL_PIN, OUTPUT);
pinMode(OLED_SDA_PIN, OUTPUT);
digitalWrite(OLED_SDA_PIN, HIGH);
for (int i = 0; i < 9; i++) {
digitalWrite(OLED_SCL_PIN, HIGH); delayMicroseconds(5);
digitalWrite(OLED_SCL_PIN, LOW); delayMicroseconds(5);
}
digitalWrite(OLED_SDA_PIN, LOW);
digitalWrite(OLED_SCL_PIN, HIGH); delayMicroseconds(5);
digitalWrite(OLED_SDA_PIN, HIGH); delayMicroseconds(5);
delay(50);
Wire.begin(OLED_SDA_PIN, OLED_SCL_PIN);
Wire.setClock(100000);
if (!display.begin(SSD1306_SWITCHCAPVCC, OLED_ADDR)) {
Serial.println("[OLED] Init failed — check wiring/address");
} else {
display.clearDisplay();
drawCuteSmiley(64, 26, 16, false);
display.display();
Serial.println("[OLED] Ready");
}
bufMutex = xSemaphoreCreateMutex();
// Tell the library which I2S pins to drive (legacy I2S API).
i2s_pin_config_t pin_config = {
.bck_io_num = I2S_BCK_PIN,
.ws_io_num = I2S_LCK_PIN,
.data_out_num = I2S_DIN_PIN,
.data_in_num = I2S_PIN_NO_CHANGE
};
a2dp_sink.set_pin_config(pin_config);
a2dp_sink.set_on_connection_state_changed(connection_state_changed);
a2dp_sink.set_on_audio_state_changed(audio_state_changed);
// IMPORTANT: pass true as the second argument so the library keeps
// installing and writing I2S itself; the callback is only a read-only tap.
a2dp_sink.set_stream_reader(audio_data_callback, true);
// Advertise under the name the guide tells you to look for when pairing.
a2dp_sink.start("SmartSpeaker");
Serial.println("Bluetooth ready: pair with \"SmartSpeaker\"");
xTaskCreatePinnedToCore(displayTask, "display", 4096, NULL, 1, NULL, 0);
}
void loop() {
vTaskDelay(pdMS_TO_TICKS(1000));
}
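The energy and beat logic inside `audio_data_callback` can be tried on a desktop without any hardware. The sketch below is a host-side approximation that reuses the constants from the listing (gain 22, decay 0.82, average 0.98/0.02, threshold 1.22, floor 0.06). `EnergyTracker` is a hypothetical wrapper name, and it takes already-downmixed mono samples rather than the raw Bluetooth buffer.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Host-side copy of the listing's energy envelope and beat detector.
struct EnergyTracker {
    float energyFast = 0.0f;  // instant attack, decays by 0.82 per block
    float beatAvg = 0.0f;     // slow running average of block energy

    // Feed one block of mono samples; returns true when the block counts as a beat.
    bool update(const int16_t *mono, size_t n) {
        float sum = 0.0f;
        for (size_t i = 0; i < n; i++) {
            const float s = mono[i] / 32768.0f;
            sum += s * s;
        }
        const float rms = (n > 0) ? std::sqrt(sum / n) : 0.0f;
        // Square-root curve lifts quiet passages; clamp to 1.0.
        const float e = std::fmin(std::sqrt(rms * 22.0f), 1.0f);
        energyFast = (e > energyFast) ? e : energyFast * 0.82f;
        beatAvg = beatAvg * 0.98f + e * 0.02f;
        // Beat: energy clearly above the running average and above a noise floor.
        return e > beatAvg * 1.22f && e > 0.06f;
    }
};
```

Feeding it a silent block followed by a loud one should report a beat on the loud block, after which `energyFast` decays by 0.82 for each quiet block. Tuning the constants this way is quicker than reflashing the board for every change.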