Talkie part 1: the ESP32 Speech Synthesiser

Speech synthesis has come a long way. If you have a Google Home or Alexa, you know how good it sounds when the device talks to you.

Back in the old days speech synthesis was magic. I caught a glimpse of it on an Apple II computer, but I never used it myself. I had a Commodore 64, and speech software was not widespread in the Netherlands. There was a speech add-on board for the Commodore 64, but it was as expensive as the computer itself: around 500 euro.

Back in those days a computer with speech synthesis sounded like a computer that talked. Maybe you remember some videos in which Professor Stephen Hawking gave a talk about the universe. He had a speech synthesizer that spoke with a computer voice.

Someone has ported such a system to the ESP32, and it is called Talkie. Talkie is a software implementation of the Texas Instruments speech synthesis architecture (Linear Predictive Coding) from the late 1970s / early 1980s, as used in several popular applications. According to its website, it was used by Texas Instruments, Acorn Computers, Atari arcade games, the Apple ][ and the IBM PS/2 Speech Adapter.

Talkie has a rather limited vocabulary. It is, however, enough to make some fun projects, and that is exactly what I wanted to do. So let's have a look at it.

Hardware.

To get speech synthesis working on the ESP32 we have to add some hardware. The ESP32 cannot supply enough current from its IO pins to drive a speaker directly. So we have two possibilities.

First, we can attach a speaker with a built-in amplifier. These are the same speakers you would attach to your computer or Raspberry Pi. I have a few sets of them: some are powered by a dedicated power supply, and some can be powered by a USB power supply or even a power bank.

The second method is to connect an amplifier to the ESP32 and attach a speaker to that amplifier. I have bought some small, very cheap amplifiers from my Chinese suppliers. They are called GF1002. It is a stereo amplifier that supplies about 3 watts per channel and works from a 5 volt power supply, so we can feed it from a USB power supply. It has a volume knob with an on/off function and is breadboard friendly. Just solder on some headers and you are ready to go. It is ideal for this kind of work and gives enough volume to hear the speech across a room.

So an ESP32, an amplifier and a speaker are basically all we need. Add a button with a resistor and we have everything for the first project I am presenting here: a talking clock.
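Because the vocabulary is limited, a talking clock has to build every time out of a small set of number words. The sketch below shows one way that could be done in plain C++. It is only an illustration under my own assumptions: the function names numberWords and timeWords are hypothetical, and the word strings are placeholders that are not taken from the Talkie vocabulary files. A real sketch would map each word to whatever vocabulary entry the library actually provides.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Talkie): split a number 0..59 into
// the separate words a small vocabulary would need to speak it.
std::vector<std::string> numberWords(int n) {
    static const char* ones[] = {
        "zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
    static const char* tens[] = {"", "", "twenty", "thirty", "forty", "fifty"};

    std::vector<std::string> out;
    if (n < 20) {
        // 0..19 each have their own word.
        out.push_back(ones[n]);
        return out;
    }
    // 20..59: a tens word, followed by a ones word unless it is a round number.
    out.push_back(tens[n / 10]);
    if (n % 10 != 0) {
        out.push_back(ones[n % 10]);
    }
    return out;
}

// Hypothetical helper: build the word sequence for a 24-hour time.
// Assumes 0 <= hour <= 23 and 0 <= minute <= 59.
std::vector<std::string> timeWords(int hour, int minute) {
    std::vector<std::string> out = numberWords(hour);
    if (minute == 0) {
        out.push_back("o'clock");
        return out;
    }
    if (minute < 10) {
        // Speak 14:05 as "fourteen oh five".
        out.push_back("oh");
    }
    for (const std::string& w : numberWords(minute)) {
        out.push_back(w);
    }
    return out;
}
```

With these assumptions, timeWords(14, 5) gives the words "fourteen", "oh", "five", and the clock would speak them one after another when the button is pressed.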