Speech Synthesis: Text To Speech in SwiftUI

DevTechie Inc

Jun 17, 2022

Speech synthesis is the process of producing human speech artificially. A system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products.

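Apple provides a speech synthesis API as part of the AVFoundation framework.

Speech synthesis requires two classes: AVSpeechUtterance, which holds the text to speak along with settings such as voice, pitch and rate, and AVSpeechSynthesizer, which turns utterances into audio.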

Let’s start with a simple string, which will work as the input value for our utterance. We will create this string as a State variable so we can later bind it to a TextField.

@State private var inputString = "Hello world! My name is Dev Techie"

Next, we will create the UI with SwiftUI.

import SwiftUI
import AVFoundation

struct TextToSpeechExample: View {
    
    @State private var inputString = "Hello world! My name is Dev Techie" 
    
    var body: some View {
        VStack {
            TextField("Enter text", text: $inputString)
                .textFieldStyle(.roundedBorder)
            
            Button("Text to speech") {
                // add utterance here
            }
        }.padding()
        
    }
}

Our view will be very simple: a text field with a button below it.

Once we have the UI, we will add an action to the button. This action will turn the text typed into the TextField into speech.

First, we will create an AVSpeechUtterance instance and initialize it with our inputString, as shown below:

let utterance = AVSpeechUtterance(string: inputString)

We will also set the voice on the utterance object. The voice property is of type AVSpeechSynthesisVoice, whose initializer takes a BCP 47 language code as a parameter. For this case, I will use en-US for US English.

utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
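AVSpeechSynthesisVoice(language:) is a failable initializer, so it returns nil when no voice exists for the given code. The sketch below shows one possible way to fall back to a related voice. The function names bestLanguageCode and voice(preferring:) are hypothetical helpers made up for this illustration; only speechVoices() and the language property come from AVFoundation.

```swift
#if canImport(AVFoundation)
import AVFoundation
#endif

/// Hypothetical helper: picks the best match for `preferred` (a BCP 47 code
/// such as "en-US") from `available`. An exact match wins; otherwise the first
/// code that shares the same language part ("en") is used; otherwise nil.
func bestLanguageCode(preferred: String, available: [String]) -> String? {
    if available.contains(preferred) { return preferred }
    let languagePart = preferred.split(separator: "-").first.map { String($0) } ?? preferred
    return available.first { $0 == languagePart || $0.hasPrefix(languagePart + "-") }
}

#if canImport(AVFoundation)
/// Hypothetical helper: returns a voice for the preferred language,
/// or nil when no available voice matches it.
func voice(preferring code: String) -> AVSpeechSynthesisVoice? {
    let available = AVSpeechSynthesisVoice.speechVoices().map { $0.language }
    guard let match = bestLanguageCode(preferred: code, available: available) else {
        return nil
    }
    return AVSpeechSynthesisVoice(language: match)
}
#endif
```

With this in place, the assignment could read utterance.voice = voice(preferring: "en-US"). Leaving voice as nil makes the utterance use the system default voice.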

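We can also set the pitch and rate of speech with the following properties on the utterance. pitchMultiplier accepts values from 0.5 to 2.0 (the default is 1.0), and rate accepts values from 0.0 to 1.0 (the default is 0.5).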

utterance.pitchMultiplier = 2.0
utterance.rate = 0.3
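If pitch and rate come from user input, such as a slider, it may be worth clamping them to the supported ranges before assigning them. Here is a minimal sketch of that idea. The clamp function and the apply(pitch:rate:) extension are hypothetical names introduced for this example; the rate limits use the AVSpeechUtteranceMinimumSpeechRate and AVSpeechUtteranceMaximumSpeechRate constants from AVFoundation.

```swift
#if canImport(AVFoundation)
import AVFoundation
#endif

/// Hypothetical helper: limits `value` to the closed range `limits`.
func clamp(_ value: Float, to limits: ClosedRange<Float>) -> Float {
    return min(max(value, limits.lowerBound), limits.upperBound)
}

#if canImport(AVFoundation)
extension AVSpeechUtterance {
    /// Hypothetical convenience: sets pitch and rate after clamping them
    /// to the ranges the framework documents.
    func apply(pitch: Float, rate: Float) {
        pitchMultiplier = clamp(pitch, to: 0.5...2.0)
        self.rate = clamp(
            rate,
            to: AVSpeechUtteranceMinimumSpeechRate...AVSpeechUtteranceMaximumSpeechRate
        )
    }
}
#endif
```

For example, utterance.apply(pitch: 3.0, rate: 0.3) would end up with a pitchMultiplier of 2.0 and a rate of 0.3.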

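Next, we will create an AVSpeechSynthesizer and call its speak function, which takes the utterance as a parameter. The system does not retain the synthesizer for us, so it has to stay alive until speech finishes. For that reason we keep it in a State property on the view instead of creating it inside the button action:

// Property on the view
@State private var synthesizer = AVSpeechSynthesizer()

// Inside the button action
synthesizer.speak(utterance)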
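One way to make sure the synthesizer outlives the button tap is to wrap it in a small class that owns it. The sketch below is only an illustration under that assumption: Speaker is a hypothetical class name, and stopping any speech already in progress is a design choice made for this example, not something the article requires.

```swift
import AVFoundation

/// Hypothetical helper that owns the synthesizer so it stays alive
/// for as long as the Speaker instance does.
final class Speaker {
    private let synthesizer = AVSpeechSynthesizer()

    func speak(_ text: String, language: String = "en-US") {
        // Cut off anything still being spoken before starting the new utterance.
        if synthesizer.isSpeaking {
            synthesizer.stopSpeaking(at: .immediate)
        }
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: language)
        synthesizer.speak(utterance)
    }
}
```

A view could hold it as @State private var speaker = Speaker() and call speaker.speak(inputString) from the button action.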

And your text to speech view is ready 😃

Here is the full code. I would recommend changing the values for language, pitch and rate to have some fun with this.

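import SwiftUI
import AVFoundation

struct TextToSpeechExample: View {
    
    @State private var inputString = "Hello world! My name is Dev Techie"
    // Keep the synthesizer alive for the lifetime of the view. A synthesizer
    // created inside the button action can be deallocated before it speaks.
    @State private var synthesizer = AVSpeechSynthesizer()
    
    var body: some View {
        VStack {
            TextField("Enter text", text: $inputString)
                .textFieldStyle(.roundedBorder)
            
            Button("Text to speech") {
                let utterance = AVSpeechUtterance(string: inputString)
                utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
                utterance.pitchMultiplier = 2.0 // valid range: 0.5...2.0
                utterance.rate = 0.3            // valid range: 0.0...1.0
                synthesizer.speak(utterance)
            }
        }
        .padding()
    }
}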

With that, we have reached the end of this article. Thank you once again for reading. Don’t forget to subscribe to our weekly newsletter at https://www.devtechie.com