Dominant Language Recognizer in SwiftUI using Natural Language Processing

DevTechie Inc
Jul 4, 2023

Apple includes machine learning frameworks on all of its platforms to make it easy for app developers to leverage the power of AI and ML while building their apps.

One such framework is the Natural Language framework, which provides a variety of Natural Language Processing (NLP) features. The NLP features in Apple’s Natural Language framework support many different languages and scripts.

In this article, we will explore a way to detect the language in which a piece of text is written. Language identification automatically detects the language and script of a piece of text. We can also combine the Natural Language framework with Create ML to train and deploy custom natural language models.

Let’s get started with an example. We will begin by creating a simple UI with a text editor, where we can paste a piece of text, and a button that, when pressed, recognizes the dominant language of the entered text.

import SwiftUI

struct DevTechieNLExplorer: View {
    @State private var text = ""
    @State private var result = ""
    var body: some View {
        NavigationStack {
            VStack {
                TextEditor(text: $text)
                    .overlay(
                        RoundedRectangle(cornerRadius: 5)
                            .stroke(Color.gray.opacity(0.3), lineWidth: 1)
                    )
                Button("Process") {
                    // process text here
                }
                .buttonStyle(.bordered)
                Spacer()
                Text("Dominant language: \(result)")
                    .font(.title3)
            }
            .padding()
            .navigationTitle("DevTechie")
        }
    }
}

For NLP, we will start by creating an instance of NLLanguageRecognizer.

An NLLanguageRecognizer object automatically detects the language of a piece of text. It performs language identification by:

Identifying the dominant script of a piece of text. Some languages have a unique script (like Greek), but others share the same script (like English, French, and German, which all share the Latin script).

Identifying the language itself.

Note: Don’t use an instance of NLLanguageRecognizer from more than one thread simultaneously.

import SwiftUI
import NaturalLanguage

struct DevTechieNLExplorer: View {
    @State private var text = ""
    @State private var result = ""
    let recognizer = NLLanguageRecognizer()
    
    var body: some View {
        NavigationStack {
            VStack {
                TextEditor(text: $text)
                    .overlay(
                        RoundedRectangle(cornerRadius: 5)
                            .stroke(Color.gray.opacity(0.3), lineWidth: 1)
                    )
                Button("Process") {
                    // process text here
                }
                .buttonStyle(.bordered)
                Spacer()
                Text("Dominant language: \(result)")
                    .font(.title3)
            }
            .padding()
            .navigationTitle("DevTechie")
        }
    }
}

We will use the NLLanguageRecognizer instance to process the text with its processString(_:) method. After processing the string, we can fetch the dominant language by reading the recognizer's dominantLanguage property. This property returns an optional value, so we will safely unwrap it with an if let statement.

struct DevTechieNLExplorer: View {
    @State private var text = ""
    @State private var result = ""
    let recognizer = NLLanguageRecognizer()
    
    var body: some View {
        NavigationStack {
            VStack {
                TextEditor(text: $text)
                    .overlay(
                        RoundedRectangle(cornerRadius: 5)
                            .stroke(Color.gray.opacity(0.3), lineWidth: 1)
                    )
                Button("Process") {
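                    guard !text.isEmpty else { return }
                    // Clear state left over from any previously processed text.
                    recognizer.reset()
                    recognizer.processString(text)
                    if let dominantLanguage = recognizer.dominantLanguage {
                        // rawValue is the language code, for example "hi" or "es".
                        result = dominantLanguage.rawValue
                    } else {
                        // Avoid showing a stale result from an earlier run.
                        result = "undetermined"
                    }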
                }
                .buttonStyle(.bordered)
                Spacer()
                Text("Dominant language: \(result)")
                    .font(.title3)
            }
            .padding()
            .navigationTitle("DevTechie")
        }
    }
}
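
For a one-off check outside a view, the same steps can be wrapped in a small function. The sketch below is only one way to structure it and is not part of the example above: the name detectLanguageCode(in:) is hypothetical, and it assumes we are happy to create a fresh recognizer on every call. That choice also sidesteps the threading restriction mentioned in the note, because no instance is ever shared.

```swift
import NaturalLanguage

// Hypothetical helper (not a framework API): returns the code of the
// dominant language, or nil when no language can be determined.
func detectLanguageCode(in text: String) -> String? {
    guard !text.isEmpty else { return nil }
    // A new recognizer per call means no state is shared between calls.
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(text)
    return recognizer.dominantLanguage?.rawValue
}

// The framework also offers a class-level shortcut for the same task.
let shortcut = NLLanguageRecognizer.dominantLanguage(for: "Where is the nearest train station?")
print(shortcut?.rawValue ?? "undetermined")
print(detectLanguageCode(in: "Where is the nearest train station?") ?? "undetermined")
```

The instance-based version is the one to extend when hypotheses, constraints, or hints are needed; the class method only reports the dominant language.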

Build and run. We will paste proverbs in two different languages: the first one in Hindi and the second one in Spanish.
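
The label shows the rawValue of the detected NLLanguage, which is a short language code rather than a name. If a readable name is preferred, one option is to look the code up through Foundation's Locale. The sketch below uses two placeholder sentences (not the proverbs pasted into the app), and the helper name languageName(for:in:) is hypothetical.

```swift
import Foundation
import NaturalLanguage

// Hypothetical helper: maps an NLLanguage to a human-readable name in the
// given locale (US English by default), falling back to the raw code.
func languageName(for language: NLLanguage,
                  in locale: Locale = Locale(identifier: "en_US")) -> String {
    locale.localizedString(forIdentifier: language.rawValue) ?? language.rawValue
}

let samples = [
    "नमस्ते, आप कैसे हैं? आज मौसम बहुत अच्छा है।",  // Hindi placeholder text
    "Hola, ¿cómo estás? Hoy hace muy buen tiempo."  // Spanish placeholder text
]

for sample in samples {
    if let language = NLLanguageRecognizer.dominantLanguage(for: sample) {
        print("\(language.rawValue): \(languageName(for: language))")
    } else {
        print("Language not recognized")
    }
}
```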


Getting the list of possible languages

NLLanguageRecognizer recognizes languages with good accuracy, but it can struggle to tell apart languages that share a script or are otherwise closely related. We can ask the recognizer for its candidate predictions with the languageHypotheses(withMaximum:) method. This method takes an integer for the maximum number of hypotheses to generate and returns a dictionary that maps each candidate language to its probability.

struct DevTechieNLExplorer: View {
    @State private var text = ""
    @State private var result = ""
    let recognizer = NLLanguageRecognizer()
    
    var body: some View {
        NavigationStack {
            VStack {
                TextEditor(text: $text)
                    .overlay(
                        RoundedRectangle(cornerRadius: 5)
                            .stroke(Color.gray.opacity(0.3), lineWidth: 1)
                    )
                Button("Process") {
                    guard !text.isEmpty else { return }
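                    // Clear state left over from any previously processed text.
                    recognizer.reset()
                    recognizer.processString(text)
                    // Sort by probability, since dictionary order is not guaranteed.
                    result = recognizer.languageHypotheses(withMaximum: 2)
                        .sorted { $0.value > $1.value }
                        .map { "\($0.key.rawValue) " + String(format: "%.0f%%", $0.value * 100) }
                        .joined(separator: ", ")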
                }
                .buttonStyle(.bordered)
                Spacer()
                Text("Language prediction: \(result)")
                    .font(.title3)
            }
            .padding()
            .navigationTitle("DevTechie")
        }
    }
}
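
Because languageHypotheses(withMaximum:) returns a dictionary, the order in which its entries are iterated is not guaranteed. One possible approach is to move the formatting into a small function that sorts the entries by probability first. The sketch below assumes that design; the name formatHypotheses(_:) is hypothetical, and the sample sentence is only a placeholder.

```swift
import Foundation
import NaturalLanguage

// Hypothetical helper: turns hypotheses into text such as "en 90%, de 10%",
// ordered from most to least probable.
func formatHypotheses(_ hypotheses: [NLLanguage: Double]) -> String {
    hypotheses
        .sorted { $0.value > $1.value }
        .map { entry -> String in
            let percent = String(format: "%.0f%%", entry.value * 100)
            return "\(entry.key.rawValue) \(percent)"
        }
        .joined(separator: ", ")
}

let recognizer = NLLanguageRecognizer()
recognizer.processString("This is a test, mein Freund.")
print(formatHypotheses(recognizer.languageHypotheses(withMaximum: 2)))
```

Keeping the formatting separate from the recognizer also makes it easy to check with hand-written dictionaries.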

We will use a Latin proverb as an example to test our code.


Constraining the language identification

We can help NLLanguageRecognizer by providing information we already know about the text we want to identify. For example, if we are building an app that targets a particular region and supports only a few languages, we can specify this through the languageConstraints and languageHints properties. The languageConstraints property limits the set of languages the recognizer may return, while languageHints supplies a known probability for each language.

For our example, we will constrain the recognizer to Japanese only, so when we paste English text it will not make any predictions.

struct DevTechieNLExplorer: View {
    @State private var text = ""
    @State private var result = ""
    let recognizer = NLLanguageRecognizer()
    
    var body: some View {
        NavigationStack {
            VStack {
                TextEditor(text: $text)
                    .overlay(
                        RoundedRectangle(cornerRadius: 5)
                            .stroke(Color.gray.opacity(0.3), lineWidth: 1)
                    )
                Button("Process") {
                    guard !text.isEmpty else { return }
                    
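                    // Clear state left over from any previously processed text.
                    recognizer.reset()
                    // Only consider Japanese, and hint that it is the expected language.
                    recognizer.languageConstraints = [.japanese]
                    recognizer.languageHints = [.japanese: 1]
                    
                    recognizer.processString(text)
                    // Sort by probability, since dictionary order is not guaranteed.
                    result = recognizer.languageHypotheses(withMaximum: 2)
                        .sorted { $0.value > $1.value }
                        .map { "\($0.key.rawValue) " + String(format: "%.0f%%", $0.value * 100) }
                        .joined(separator: ", ")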
                }
                .buttonStyle(.bordered)
                Spacer()
                Text("Language prediction: \(result)")
                    .font(.title3)
            }
            .padding()
            .navigationTitle("DevTechie")
        }
    }
}
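
In an app that supports several languages, the constraint list would usually contain all of them, with the hints carrying rough prior probabilities. The sketch below assumes an app limited to English, French, and German. The language list, the hint values, and the function name detectLanguage(in:supported:hints:) are all illustrative assumptions, not values taken from the example above.

```swift
import NaturalLanguage

// Hypothetical helper: returns language hypotheses for `text`, considering
// only the languages in `supported`. Hint values are illustrative priors.
func detectLanguage(in text: String,
                    supported: [NLLanguage],
                    hints: [NLLanguage: Double] = [:]) -> [NLLanguage: Double] {
    let recognizer = NLLanguageRecognizer()
    recognizer.languageConstraints = supported  // restrict candidate languages
    recognizer.languageHints = hints            // prior likelihood per language
    recognizer.processString(text)
    return recognizer.languageHypotheses(withMaximum: supported.count)
}

let hypotheses = detectLanguage(
    in: "Das ist ein kurzer Satz.",
    supported: [.english, .french, .german],
    hints: [.english: 0.5, .french: 0.2, .german: 0.3]
)

// Print the candidates from most to least probable.
for (language, probability) in hypotheses.sorted(by: { $0.value > $1.value }) {
    print(language.rawValue, probability)
}
```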