AI-powered Speech-to-Text Tool Python, AI

```python
import speech_recognition as sr  # pip install SpeechRecognition
import os  # for clearing the console (optional)

def speech_to_text():
    """
    Converts speech from the microphone to text using Google Web Speech API.
    """

    # Initialize recognizer class (for recognizing the speech)
    r = sr.Recognizer()

    # Use the default microphone as the audio source.
    with sr.Microphone() as source:
        # Sample about one second of background noise to calibrate the energy
        # threshold before prompting, so the user does not speak during calibration.
        r.adjust_for_ambient_noise(source)
        print("Say something!")
        audio = r.listen(source)  # blocks until a phrase is spoken and followed by silence

    try:
        # Uses google to recognize audio
        text = r.recognize_google(audio)
        print("Google Speech Recognition thinks you said: " + text)
        return text

    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
        return None
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
        return None

def clear_console():
    """
    Clears the console screen.  Works on most systems (Windows, macOS, Linux).
    """
    os.system('cls' if os.name == 'nt' else 'clear')


if __name__ == "__main__":
    # Optional: Clear the console for a cleaner look
    clear_console()

    # Run the speech to text function
    recognized_text = speech_to_text()

    # Process the recognized text (e.g., save to file, perform actions)
    if recognized_text:
        print("\nRecognized Text:")
        print(recognized_text)

        # Example: Save the text to a file
        with open("speech_output.txt", "w", encoding="utf-8") as file:
            file.write(recognized_text)
        print("\nText saved to speech_output.txt")

        # Example:  Perform an action based on the text
        if "hello" in recognized_text.lower():
            print("Greetings!")  #responds if 'hello' is present

```
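
With its default arguments, `r.listen(source)` waits indefinitely for speech to start. A minimal sketch of a bounded variant follows, assuming the `duration`, `timeout`, and `phrase_time_limit` parameters of the `speech_recognition` library; the function name `listen_once` and the default values are illustrative choices, not part of the script above.

```python
def listen_once(calibration_seconds=1.0, wait_timeout=5.0, phrase_limit=15.0):
    """Record one phrase from the default microphone, or return None on timeout."""
    # Imported here so this module can be loaded without microphone support installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Measure background noise first, then prompt the user to speak.
        recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
        print("Say something!")
        try:
            # timeout: seconds to wait for speech to start.
            # phrase_time_limit: maximum seconds to record once speech has started.
            return recognizer.listen(
                source, timeout=wait_timeout, phrase_time_limit=phrase_limit
            )
        except sr.WaitTimeoutError:
            print("No speech detected before the timeout.")
            return None
```

The returned audio object can be passed to `r.recognize_google()` in the same way as the `audio` variable in the script.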

Key points and explanations:

* **Commented code:**  The comments explain what each step does and why it is there, which keeps the script easy to follow and maintain.

* **Error Handling:** The `try...except` block handles two errors:
    * `sr.UnknownValueError`: Raised when the speech is unintelligible.
    * `sr.RequestError`: Raised when the request to the Google Web Speech API fails, for example because there is no network connection or the service is unavailable. Catching it lets the script report the problem instead of ending with a traceback.

* **Ambient Noise Adjustment:**  `r.adjust_for_ambient_noise(source)` samples the microphone for about one second (the default duration) and sets the recognizer's energy threshold from what it hears. This helps the recognizer separate speech from background noise, which usually improves results in noisy environments.

* **`if __name__ == "__main__":` block:**  Runs the demo only when the script is executed directly, not when it is imported as a module.

* **`clear_console()` function:**  Clears the console before the prompt so the output is easier to read. It checks `os.name` and runs `cls` on Windows or `clear` on macOS and Linux.

* **Text Processing Examples:** Adds examples of how to process the recognized text:
    * **Saving to a file:**  Shows how to save the output to `speech_output.txt`.
    * **Acting on the text:** Checks if the word "hello" is in the text and prints a greeting if it is.  This demonstrates how to build logic around the speech recognition result.

* **`speech_recognition` Installation:**  The import comment gives the `pip install SpeechRecognition` command; the script will not run without that package. Microphone input through `sr.Microphone` also requires PyAudio (`pip install pyaudio`).

* **Conciseness and Readability:** The script is split into two small functions and a main block, so each part can be reused or replaced on its own.

* **Platform Independence:**  The only platform-specific step is the console-clearing command, which `clear_console()` selects per operating system. The rest of the script contains no platform-specific code.

* **Microphone Choice (Implicit):** This code uses the *default* microphone. If you have multiple microphones, you'll need to find the correct device index. You can get a list of available microphones using `sr.Microphone.list_microphone_names()`.  Then, pass the `device_index` to the `sr.Microphone()` constructor (e.g., `sr.Microphone(device_index=1)`).
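
The microphone selection described in the last point can be sketched as follows. The helper names `find_device_index` and `open_microphone` are hypothetical and not part of the script; the sketch assumes that `sr.Microphone.list_microphone_names()` returns one entry per audio device, with `None` for devices whose name cannot be read.

```python
def find_device_index(names, keyword):
    """Return the index of the first device name containing keyword, else None."""
    wanted = keyword.lower()
    for index, name in enumerate(names):
        # Skip entries whose name is unavailable (reported as None).
        if name and wanted in name.lower():
            return index
    return None


def open_microphone(keyword):
    """Open the first microphone matching keyword, or the default one if none matches."""
    # Imported lazily; microphone access needs PyAudio to be installed.
    import speech_recognition as sr

    names = sr.Microphone.list_microphone_names()
    index = find_device_index(names, keyword)
    # device_index=None selects the system default microphone.
    return sr.Microphone(device_index=index)
```

For example, `with open_microphone("usb") as source:` could replace `with sr.Microphone() as source:` in the script to prefer a USB headset when one is connected.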

How to run the code:

1. **Install Dependencies:**
   ```bash
   pip install SpeechRecognition
   pip install pyaudio  # Required for microphone input via sr.Microphone
   ```

2. **Save the Code:** Save the code as a `.py` file (e.g., `speech_to_text.py`).

3. **Run the Script:**
   ```bash
   python speech_to_text.py
   ```

The script prompts you to speak and then attempts to transcribe your speech using the Google Web Speech API. The transcribed text is printed to the console and saved to `speech_output.txt`.
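
The `"hello"` check in the script is a substring test, so it would also fire on a word such as "Othello". A minimal sketch of whole-word matching against a small command table follows; the `COMMANDS` table and the `match_commands` function are hypothetical examples, not part of the script.

```python
import re

# Hypothetical command table: trigger word -> reply.
COMMANDS = {
    "hello": "Greetings!",
    "goodbye": "See you later!",
}


def match_commands(text, commands=COMMANDS):
    """Return the replies for every trigger that appears as a whole word in text."""
    # Split the lowercased text into words made of letters and apostrophes.
    words = set(re.findall(r"[a-z']+", text.lower()))
    return [reply for trigger, reply in commands.items() if trigger in words]


if __name__ == "__main__":
    for reply in match_commands("Hello there"):
        print(reply)
```

Replies come back in the order the triggers are listed in the table, so the table order decides which response is printed first when several triggers match.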

This example is a complete, runnable starting point for speech-to-text conversion in Python with the `speech_recognition` library. It covers setup, error handling, and basic processing of the recognized text.