Voice-controlled assistants that respond to human speech and commands are no longer new. An estimated one in four adults in America owns a smart speaker, such as Google Home or Amazon's Echo. These devices act as a point of communication between you and your connected devices, and they add convenience by automating certain functions.
In this tutorial, we'll cover how you can control home or office devices using voice recognition and the Message Queuing Telemetry Transport (MQTT) protocol. MQTT is an OASIS standard messaging protocol for the Internet of Things (IoT).
For this project, we use Google's Speech API from Python to convert speech to text, and we check the resulting text against a set of specific commands. We then use MQTT to send control signals to the device.
Required components
Tools
- Arduino IDE
- Python 2.7
Libraries
Circuit Diagram
The board is connected to a light switch using a relay circuit. It is also possible to use the Arduino UNO board instead of a custom one.
Technical considerations
For this project, we write the code in Python 2.7 and use the computer's microphone to record what is said.
Speech is converted to text in Python using the SpeechRecognition library (imported as speech_recognition). The text is converted to the computer's “voice” using the pyttsx library. Python's paho-mqtt library handles the MQTT communication.
Block diagram
Note: The computer receives speech through your microphone and converts it to text so that it can recognize any command. The computer then publishes control signals over MQTT, through your router, to an online broker. The controlling device is connected to this same MQTT broker.
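To make the data flow in the note above concrete, here is a minimal in-memory sketch of publish/subscribe routing. It is not an MQTT client and does not talk to a real broker. The class name TinyBroker is hypothetical, and it only supports exact topic matches (no wildcards, QoS, or retained messages).

```python
class TinyBroker(object):
    """Toy stand-in for an MQTT broker: routes payloads by exact topic name."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        # Register a callback that is invoked for every message on `topic`.
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        # Deliver the payload to every subscriber of this exact topic.
        for callback in self._subscribers.get(topic, []):
            callback(topic, payload)


# The control device listens on "ts/light"; the computer publishes to it.
broker = TinyBroker()
light_log = []
broker.subscribe("ts/light", lambda topic, payload: light_log.append(payload))
broker.publish("ts/light", "ON")        # delivered to the device
broker.publish("ts/report", "ignored")  # no subscriber on this topic here
print(light_log)  # prints ['ON']
```

The point of the sketch is that the publisher and the subscriber never address each other directly. They only agree on a topic name, and the broker does the routing.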
How it works
- The control device, which is inside the board, must first be connected to the Wi-Fi router in your home or office. It will wait for any command signal on the “ts/light” topic.
- When we start the recognizer script, the device starts “listening”. Anything spoken will be recorded and converted to text.
- The recognized text is saved in a string, and this string is compared with a set of specific commands. If a command matches, a control signal is sent over MQTT. For example, if a speaker says “lights on”, the speech is converted to a string. This string is then checked for the words “lights on”. When the words are found, the script sends an “ON” signal to the MQTT broker.
- Another device – the one inside the switchboard socket – receives the command and turns on the light.
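The matching step described above can be sketched as a small lookup from spoken phrase to MQTT payload. The phrases and payloads come from this tutorial. The names COMMANDS and command_to_payload are hypothetical, and lowercasing the text before matching is an assumption about how you might want to handle capitalization in the recognizer's output.

```python
# Phrase -> payload pairs taken from the tutorial's two commands.
COMMANDS = (
    ("lights on", "ON"),
    ("lights off", "OFF"),
)


def command_to_payload(text):
    """Return the payload for the first known phrase found in text, else None."""
    text = text.lower()  # assumption: match regardless of capitalization
    for phrase, payload in COMMANDS:
        if phrase in text:
            return payload
    return None


print(command_to_payload("please turn the Lights On"))
print(command_to_payload("what time is it"))
```

Keeping the commands in one table means a new command is a one-line addition rather than another if block.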
The code
The code can be divided into two parts.
1. Voice recognition
When turned on, the system creates a connection with the MQTT broker: “broker.hivemq.com”.
client.connect(mqtt_broker, mqtt_port, 60)
A function starts recording any words picked up by the microphone and converts them to text.
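# Inside recordAudio(): convert the captured audio to text
data = r.recognize_google(audio)

# Main loop: keep listening and pass each result on for analysis
while 1:
    data = recordAudio()
    assti(data)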
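The tutorial does not show the full recording function, so the following is only a sketch of what one might look like with the SpeechRecognition library. The function name record_audio and its sr parameter are hypothetical. The parameter exists so that the function can be exercised with a stand-in module when no microphone is available. Returning an empty string on failure is also an assumption, not the author's documented behavior.

```python
import importlib


def record_audio(sr=None):
    """Listen once on the default microphone and return the recognized text.

    Returns an empty string if the speech could not be understood or the
    recognition service could not be reached.
    """
    if sr is None:
        # Third-party package: pip install SpeechRecognition
        sr = importlib.import_module("speech_recognition")

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)  # blocks until a phrase is captured

    try:
        # Sends the audio to Google's speech API and returns a string.
        return recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:
        return ""  # speech was unintelligible
    except sr.RequestError:
        return ""  # the API was unreachable or rejected the request
```

Called with no arguments, record_audio() uses the real library and your microphone. Passing a fake module makes it possible to check the control flow without any hardware.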
A sequence of data (which includes the spoken words) is passed into the “assti(data)” function, where the speech is analyzed for any specific commands.
if "lights on" in data:
    send("ON")
    speak("the light is on!")
    print("The light is on!")
if "lights off" in data:
    send("OFF")
If the condition is matched (i.e. the words match the commands), a signal is sent using the “send(msg)” function to MQTT.
def send(message):
    # Publish one message to the control topic, then disconnect
    publish.single(mqtt_publish_topic, message, hostname=mqtt_broker)
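If you want to try the publishing step without a live broker, one option is to make the publish call swappable. The sketch below assumes the topic “ts/light” and the broker “broker.hivemq.com” from this tutorial. Port 1883 is an assumption (the standard unencrypted MQTT port), because the tutorial does not state the value of mqtt_port. The publish_single parameter is hypothetical and exists only for dry runs.

```python
MQTT_BROKER = "broker.hivemq.com"
MQTT_PORT = 1883              # assumption: default unencrypted MQTT port
MQTT_PUBLISH_TOPIC = "ts/light"


def send(message, publish_single=None):
    """Publish one control message; use paho-mqtt unless a stand-in is given."""
    if publish_single is None:
        # Third-party package: pip install paho-mqtt
        from paho.mqtt import publish
        publish_single = publish.single
    publish_single(MQTT_PUBLISH_TOPIC, message,
                   hostname=MQTT_BROKER, port=MQTT_PORT)


def dry_run(topic, payload, **kwargs):
    # Stand-in publisher that prints instead of touching the network.
    print("would publish %r to %r via %s" % (payload, topic, kwargs["hostname"]))


send("ON", publish_single=dry_run)
```

Calling send("ON") with no second argument goes through paho-mqtt's publish.single, which connects, publishes one message, and disconnects.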
2. Network communication
The ESP8266 subscribes to one topic and publishes to another.
const char* topicSubscribe = "ts/light";
const char* topicPublish = "ts/report";
To access the network, we use the ESP8266 Wi-Fi chip alongside the ATmega328, a single-chip microcontroller. Anything the ESP8266 receives from the ATmega328 over serial is published directly to the “ts/report” topic (topicPublish).
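if (Serial.available() > 0) {
  // Read everything waiting on the serial port
  String receivedData = Serial.readString();
  temp_str = receivedData;
  // Copy into a char buffer, leaving room for the terminating null
  char temp[temp_str.length() + 2];
  temp_str.toCharArray(temp, temp_str.length() + 1);
  client.publish(topicPublish, temp);
}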
The ESP8266 callback handles the other direction. Anything received over MQTT is written to the ESP's serial port, where the ATmega328P reads it.
void Received_data(char* topic, byte* payload, unsigned int length) {
  data_r_in_string = "";
  // Build a String from the raw payload bytes
  for (unsigned int i = 0; i < length; i++) {
    data_r_in_string += (char)payload[i];
    //Serial.print((char)payload[i]);
  }
  // Forward the message over serial
  Serial.print(data_r_in_string);
}
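To check the receive path on a desktop before flashing the board, the callback's byte-by-byte conversion can be mirrored in Python. This is only a simulation. The function names are hypothetical, and the rule that “ON” closes the relay and “OFF” opens it is an assumption, since the ATmega328 code that drives the relay is not shown in this tutorial.

```python
def payload_to_text(payload):
    """Join the payload bytes into a string, one character per byte."""
    return "".join(chr(b) for b in bytearray(payload))


def relay_state(text):
    """Return True for "ON", False for "OFF", None otherwise (assumed rule)."""
    text = text.strip().upper()
    if text == "ON":
        return True
    if text == "OFF":
        return False
    return None


print(relay_state(payload_to_text(b"ON")))
print(relay_state(payload_to_text(b"OFF\n")))
```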
The recognizer is a small Python script that converts speech to text. You can extend it with additional commands or functionality, such as AI, so that it works as a personal virtual assistant.