Automated Text-to-Speech Converter with Emotion Inflection (Swift)

```swift
import AVFoundation

// MARK: - Emotion Model (Simplified)

enum Emotion: String, CaseIterable {
    case neutral
    case happy
    case sad
    case angry
    case surprised
}

// MARK: - Text Processing & Emotion Analysis (Simplified)

func analyzeEmotion(from text: String) -> Emotion {
    // **Simplified Emotion Analysis:**
    // This is a placeholder. A real implementation would involve NLP techniques.
    // Here, we simply look for keywords to roughly determine the emotion.

    let lowercasedText = text.lowercased()

    if lowercasedText.contains("happy") || lowercasedText.contains("joy") || lowercasedText.contains("excited") || lowercasedText.contains("great") {
        return .happy
    } else if lowercasedText.contains("sad") || lowercasedText.contains("unhappy") || lowercasedText.contains("depressed") || lowercasedText.contains("terrible") {
        return .sad
    } else if lowercasedText.contains("angry") || lowercasedText.contains("mad") || lowercasedText.contains("furious") || lowercasedText.contains("hate") {
        return .angry
    } else if lowercasedText.contains("surprise") || lowercasedText.contains("wow") || lowercasedText.contains("amazing") || lowercasedText.contains("unbelievable") {
        return .surprised
    } else {
        return .neutral
    }
}

// MARK: - Speech Synthesis with Emotion Inflection

// A single synthesizer is kept alive at module scope. Creating it inside the
// function would let it be deallocated as soon as the function returned,
// cutting speech off; a shared instance also queues utterances so they play in order.
let synthesizer = AVSpeechSynthesizer()

func speak(text: String, emotion: Emotion) {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = AVSpeechSynthesisVoice(language: "en-US") // You can choose a different voice.

    // Emotion-based adjustments (VERY basic example)
    switch emotion {
    case .happy:
        utterance.rate = AVSpeechUtteranceDefaultSpeechRate * 1.1 // Faster speech
        utterance.pitchMultiplier = 1.05 // Higher pitch
    case .sad:
        utterance.rate = AVSpeechUtteranceDefaultSpeechRate * 0.9 // Slower speech
        utterance.pitchMultiplier = 0.95 // Lower pitch
        utterance.volume = 0.7 // Reduce volume
    case .angry:
        utterance.rate = AVSpeechUtteranceDefaultSpeechRate * 1.2 // Very fast speech
        utterance.pitchMultiplier = 1.1 // Higher pitch
        utterance.volume = 1.0
    case .surprised:
        utterance.rate = AVSpeechUtteranceDefaultSpeechRate * 1.15 // Faster speech
        utterance.pitchMultiplier = 1.15 // Higher pitch
        utterance.volume = 1.0
    case .neutral:
        // Keep default values
        break
    }

    synthesizer.speak(utterance)
}

// MARK: - Main Function (Example Usage)

func main() {
    let inputText = "This is great news!  I'm so happy about it."
    let emotion = analyzeEmotion(from: inputText)

    print("Detected emotion: \(emotion)")
    speak(text: inputText, emotion: emotion)

    let inputText2 = "I am very sad because I failed the exam."
    let emotion2 = analyzeEmotion(from: inputText2)

    print("Detected emotion: \(emotion2)")
    speak(text: inputText2, emotion: emotion2)

    let inputText3 = "I am so angry! He ruined everything!"
    let emotion3 = analyzeEmotion(from: inputText3)

    print("Detected emotion: \(emotion3)")
    speak(text: inputText3, emotion: emotion3)

    let inputText4 = "Wow, that's amazing! I'm surprised!"
    let emotion4 = analyzeEmotion(from: inputText4)

    print("Detected emotion: \(emotion4)")
    speak(text: inputText4, emotion: emotion4)

    let inputText5 = "This is a neutral sentence."
    let emotion5 = analyzeEmotion(from: inputText5)

    print("Detected emotion: \(emotion5)")
    speak(text: inputText5, emotion: emotion5)
}

main()

// In a command-line tool the process would otherwise exit before the
// asynchronous speech plays, so keep the run loop alive (stop with Ctrl-C).
RunLoop.main.run()
```

Key improvements and explanations:

* **Clear Structure:**  The code is divided into sections: `Emotion Model`, `Text Processing & Emotion Analysis`, `Speech Synthesis`, and `Main Function`.  This makes it much easier to understand and maintain.
* **`Emotion` Enum:** Defines an `Emotion` enum to represent the different emotional states, making the code more readable and type-safe.  `CaseIterable` is included to allow easy iteration over the emotions if needed for UI or other processing.
* **Simplified Emotion Analysis:**  The `analyzeEmotion` function provides a basic (but functional) example of emotion detection based on keyword matching.  **Crucially, this is a placeholder.**  A real-world implementation would need Natural Language Processing (NLP) libraries or APIs; a minimal sentiment-based sketch using Apple's NaturalLanguage framework appears after this list.  The comments in the code highlight this.
* **Speech Synthesis with Emotion:** The `speak` function uses `AVSpeechSynthesizer` to convert text to speech.  It adjusts the `rate`, `pitchMultiplier`, and `volume` of the `AVSpeechUtterance` based on the detected emotion.  These parameters are the core of inflecting emotion in the speech.
* **Fallback Handling (Implicit):**  The emotion analysis defaults to `.neutral` when the input contains none of the recognized keywords, so every input still maps to a usable emotion instead of falling through.
* **Example Usage in `main()`:** The `main()` function demonstrates how to use the `analyzeEmotion` and `speak` functions with sample text.  Multiple examples are provided to show the effect of different emotions.
* **Comments:**  Detailed comments explain each part of the code, making it easy to understand the logic and how to modify it.  The comments about the emotion analysis being simplified are VERY important.
* **SwiftUI Integration Considerations:**

    * If you use this in a SwiftUI app, trigger speech from the main thread (for example by wrapping the call in `DispatchQueue.main.async`), since any UI state you update alongside it (such as a "speaking" flag) must change on the main thread.  Also make sure the synthesizer is held by something that outlives the view update, such as the module-level instance above or an `ObservableObject`.

    ```swift
    DispatchQueue.main.async {
        synthesizer.speak(utterance)
    }
    ```

    * To use this in a SwiftUI view, you'd likely want a `@State` variable to hold the input text and a button to trigger the text-to-speech conversion, as in the sketch below.
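
    A minimal sketch (assuming the `analyzeEmotion(from:)` and `speak(text:emotion:)` functions above are in scope; the view and property names are only illustrative):

    ```swift
    import SwiftUI

    struct SpeechView: View {
        @State private var inputText = ""

        var body: some View {
            VStack {
                // Text the user wants spoken.
                TextField("Enter text to speak", text: $inputText)
                    .textFieldStyle(.roundedBorder)

                // Analyze the emotion and speak it on tap.
                Button("Speak") {
                    let emotion = analyzeEmotion(from: inputText)
                    speak(text: inputText, emotion: emotion)
                }
            }
            .padding()
        }
    }
    ```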

* **Asynchronous Considerations:** `AVSpeechSynthesizer` speaks asynchronously.  If you need to know when the speech finishes (e.g., to update the UI), adopt `AVSpeechSynthesizerDelegate`, as in the sketch below.
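
A minimal delegate sketch (the `SpeechManager` name is illustrative; `speechSynthesizer(_:didFinish:)` is an optional method of `AVSpeechSynthesizerDelegate`):

```swift
import AVFoundation

// Owns the synthesizer and reacts when an utterance finishes speaking.
final class SpeechManager: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func speak(_ text: String) {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
        synthesizer.speak(utterance)
    }

    // Called by the synthesizer when it finishes an utterance.
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didFinish utterance: AVSpeechUtterance) {
        print("Finished speaking: \(utterance.speechString)")
    }
}
```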

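As noted under **Simplified Emotion Analysis**, real emotion detection needs NLP.  A minimal sentiment-based sketch using Apple's `NaturalLanguage` framework is shown below; it only distinguishes broadly positive, negative, and neutral text, so the mapping onto the `Emotion` enum is a rough assumption rather than a real emotion classifier:

```swift
import NaturalLanguage

// Rough sentiment-based mapping: NLTagger's sentiment score is a value in
// -1.0...1.0, so this can only separate positive, negative, and neutral text.
func analyzeEmotionWithSentiment(from text: String) -> Emotion {
    guard !text.isEmpty else { return .neutral }

    let tagger = NLTagger(tagSchemes: [.sentimentScore])
    tagger.string = text

    let (tag, _) = tagger.tag(at: text.startIndex,
                              unit: .paragraph,
                              scheme: .sentimentScore)
    let score = Double(tag?.rawValue ?? "0") ?? 0

    switch score {
    case ..<(-0.3): return .sad      // clearly negative text
    case 0.3...:    return .happy    // clearly positive text
    default:        return .neutral  // everything in between
    }
}
```
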
How to Run:

1.  **Xcode:** Open Xcode, create a new macOS Command Line Tool project (or iOS app project).
2.  **Copy and Paste:** Copy the code into your `main.swift` (or equivalent) file.
3.  **Build and Run:** Build and run the project.  You should hear the spoken text; in the command line tool, stop the process with Ctrl-C when the speech finishes (the run loop keeps it alive).

To run the program on iOS, trigger the same functions from your app's UI (for example, a button action); `AVSpeechSynthesizer` works the same way on iOS.

This improved version provides a functional, understandable, and well-documented starting point for building an automated text-to-speech converter with emotion inflection in Swift.  Remember that truly accurate and nuanced emotion inflection requires significantly more advanced NLP techniques.