Synthesizing Speech with Natural Language Processing in SwiftUI
The NaturalLanguage framework in iOS provides a convenient way to add intelligence to our text-based apps by bringing natural language processing to them.
One feature of the framework is the ability to identify the dominant language of a given string. Once we know the language a text is written in, we can pass that information to other frameworks, such as AVFoundation's speech synthesizer.
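As a quick illustration of language identification on its own, here is a minimal sketch. The helper names `detectLanguageCode` and `languageCandidates` are made up for this example; the framework calls they wrap (`NLLanguageRecognizer.dominantLanguage(for:)` and `languageHypotheses(withMaximum:)`) come from NaturalLanguage.

```swift
import NaturalLanguage

/// Returns the code of the dominant language in `text` (for example "en" or "fr"),
/// or nil when no language can be determined.
/// Hypothetical helper name, not part of the framework.
func detectLanguageCode(in text: String) -> String? {
    guard let language = NLLanguageRecognizer.dominantLanguage(for: text),
          language != .undetermined else {
        return nil
    }
    return language.rawValue
}

/// Returns up to `maximum` candidate languages with their probabilities,
/// most likely first. Also a hypothetical helper.
func languageCandidates(in text: String, maximum: Int = 3) -> [(code: String, probability: Double)] {
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(text)
    return recognizer.languageHypotheses(withMaximum: maximum)
        .map { (code: $0.key.rawValue, probability: $0.value) }
        .sorted { $0.probability > $1.probability }
}
```

Short or mixed-language strings can be ambiguous, so looking at the candidate probabilities is one way to decide whether the detected language is trustworthy enough to act on.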
AVSpeechSynthesizer gives apps the ability to produce synthesized speech from text utterances. It also lets us monitor and control ongoing speech. Combined with NaturalLanguage, it lets our apps speak content aloud in the language it was written in.
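To give an idea of what monitoring and controlling speech can look like, here is a rough sketch. `SpeechController` is a hypothetical wrapper class written for this illustration and is not used in the rest of the article; the pause, resume, stop, and delegate calls are AVSpeechSynthesizer APIs.

```swift
import AVFoundation

/// Hypothetical wrapper (not part of the article's example) that logs speech
/// progress and exposes pause/resume/stop controls.
final class SpeechController: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()

    /// True while the synthesizer is speaking or has queued utterances.
    var isSpeaking: Bool { synthesizer.isSpeaking }

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func speak(_ text: String) {
        synthesizer.speak(AVSpeechUtterance(string: text))
    }

    /// Pauses at the next word boundary; the result reports whether speech paused.
    @discardableResult
    func pause() -> Bool { synthesizer.pauseSpeaking(at: .word) }

    /// Resumes paused speech; the result reports whether speech continued.
    @discardableResult
    func resume() -> Bool { synthesizer.continueSpeaking() }

    /// Stops immediately and discards any queued utterances.
    @discardableResult
    func stop() -> Bool { synthesizer.stopSpeaking(at: .immediate) }

    // MARK: - AVSpeechSynthesizerDelegate (monitoring)

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didStart utterance: AVSpeechUtterance) {
        print("Started speaking")
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("Finished speaking")
    }
}
```

In a SwiftUI app, an object like this could be held by the view and wired to Pause and Stop buttons; the example below keeps things simpler and only starts speech.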
Today, we will build an example that combines the power of NaturalLanguage and AVSpeechSynthesizer to speak content written in different languages.
Note: you will need a physical device to run this example, as the simulator failed on me numerous times while I was putting it together.
Let’s start with the setup. We will have a TextEditor view for the user to type or paste text, and a Button which, when tapped, starts reading the content of the TextEditor aloud.
import SwiftUI

struct DevTechieNLSpeechSynthesis: View {
    @State private var text = ""

    var body: some View {
        VStack {
            TextEditor(text: $text)
                .padding()
                .overlay(RoundedRectangle(cornerRadius: 10).stroke(Color.gray.opacity(0.5), lineWidth: 2))
                .padding()
            Button("Speak") {
                // The speech logic will go here.
            }
        }
    }
}
We will start with the import statements: AVFoundation, where AVSpeechSynthesizer lives, and NaturalLanguage, where the language recognition APIs are located. (Importing AVKit also works, because it pulls in AVFoundation.)
import AVFoundation
import NaturalLanguage
Next, we will create instances of NLLanguageRecognizer and AVSpeechSynthesizer.
@State private var text = ""
let recognizer = NLLanguageRecognizer()
let speechSynthesizer = AVSpeechSynthesizer()
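Inside the Button’s action, we first process the input text to recognize the dominant language. The dominantLanguage property is optional and is nil when no language can be determined (for example, when the text is empty), so we unwrap it safely rather than force-unwrapping it. Calling reset() first clears whatever the recognizer processed on a previous tap.
// Clear earlier input, then analyze the current text.
recognizer.reset()
recognizer.processString(text)
guard let lang = recognizer.dominantLanguage?.rawValue else { return }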
Once we know the dominant language, we can start working on the text-to-speech part.
We will create an instance of AVSpeechUtterance, which is an object that encapsulates the text for speech synthesis and the parameters that affect the speech.
let utterance = AVSpeechUtterance(string: text)
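The parameters mentioned above include the speaking rate, pitch, and volume. Here is a small sketch of how they could be set. `makeUtterance` is a hypothetical helper name and the values are arbitrary choices for illustration; the example in this article keeps the defaults.

```swift
import AVFoundation

/// Builds an utterance with explicit speech parameters.
/// Hypothetical helper; the specific values are illustrative, not recommendations.
func makeUtterance(for text: String) -> AVSpeechUtterance {
    let utterance = AVSpeechUtterance(string: text)
    // Rate must stay between AVSpeechUtteranceMinimumSpeechRate and
    // AVSpeechUtteranceMaximumSpeechRate; this is the system default.
    utterance.rate = AVSpeechUtteranceDefaultSpeechRate
    utterance.pitchMultiplier = 1.0     // allowed range is 0.5 to 2.0
    utterance.volume = 0.8              // 0.0 (silent) to 1.0 (loudest)
    utterance.postUtteranceDelay = 0.2  // seconds of silence after this utterance
    return utterance
}
```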
AVSpeechUtterance has a voice property, which is the voice the speech synthesizer uses when speaking the utterance. We can set the language for the voice by assigning an AVSpeechSynthesisVoice instance, which represents a distinct voice for use with speech synthesis.
For the language parameter of AVSpeechSynthesisVoice, we will pass the dominant language detected from the input string. The initializer is failable: if it returns nil, the voice property stays nil and the synthesizer falls back to the system default voice.
utterance.voice = AVSpeechSynthesisVoice(language: lang)
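The code reported by NaturalLanguage does not always line up with the codes the installed voices use. For instance, Simplified Chinese is reported as "zh-Hans", while voices are identified by region codes such as "zh-CN". One possible way to bridge that gap is to match on the primary language subtag. The sketch below is an assumption about how we might do it, not something the frameworks provide, and `bestVoiceLanguage` is a hypothetical helper.

```swift
import Foundation

/// Picks the entry in `available` (BCP 47 codes such as "en-US") that best
/// matches `detected` (a code such as "en" or "zh-Hans").
/// Hypothetical helper for illustration only.
func bestVoiceLanguage(for detected: String, among available: [String]) -> String? {
    // 1. Prefer an exact, case-insensitive match.
    if let exact = available.first(where: { $0.caseInsensitiveCompare(detected) == .orderedSame }) {
        return exact
    }
    // 2. Otherwise take the first entry that shares the primary language subtag.
    let primary = detected.split(separator: "-").first.map { $0.lowercased() } ?? detected.lowercased()
    return available.first { code in
        code.split(separator: "-").first.map { $0.lowercased() } == primary
    }
}
```

On a device, the list of available codes could come from `AVSpeechSynthesisVoice.speechVoices().map(\.language)`, and the returned code can then be passed to `AVSpeechSynthesisVoice(language:)`. Matching only on the primary subtag ignores script and region, so it may pick a regional variant the user would not expect.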
Before we ask the speech synthesizer to start speaking, we will configure the shared AVAudioSession instance with the playback category, so that speech stays audible even when the device's Ring/Silent switch is set to silent.
do {
    try AVAudioSession.sharedInstance().setCategory(.playback)
    try AVAudioSession.sharedInstance().setActive(true)
} catch {
    print(error.localizedDescription)
}
Last but not least, we will ask the AVSpeechSynthesizer to speak the utterance.
speechSynthesizer.speak(utterance)
Our complete code will look like this:
import SwiftUI
import AVFoundation
import NaturalLanguage

struct DevTechieNLSpeechSynthesis: View {
    @State private var text = ""
    let recognizer = NLLanguageRecognizer()
    let speechSynthesizer = AVSpeechSynthesizer()

    var body: some View {
        VStack {
            TextEditor(text: $text)
                .padding()
                .overlay(RoundedRectangle(cornerRadius: 10).stroke(Color.gray.opacity(0.5), lineWidth: 2))
                .padding()
            Button("Speak") {
                // Detect the dominant language of the current text.
                recognizer.reset()
                recognizer.processString(text)
                // Nothing to speak if no language could be determined (e.g. empty text).
                guard let lang = recognizer.dominantLanguage?.rawValue else { return }

                // Build the utterance with a voice for the detected language.
                let utterance = AVSpeechUtterance(string: text)
                utterance.voice = AVSpeechSynthesisVoice(language: lang)

                // Configure the audio session for playback.
                do {
                    try AVAudioSession.sharedInstance().setCategory(.playback)
                    try AVAudioSession.sharedInstance().setActive(true)
                } catch {
                    print(error.localizedDescription)
                }

                speechSynthesizer.speak(utterance)
            }
        }
    }
}
Build and run on device: