Automated Customer Feedback Sentiment Analyzer for Improving Service Quality Python

👤 Sharing: AI
```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.corpus import stopwords
import re
import string

# Ensure necessary NLTK data is downloaded (run this once)
try:
    nltk.data.find('vader_lexicon')
except LookupError:
    nltk.download('vader_lexicon')

try:
    nltk.data.find('stopwords')
except LookupError:
    nltk.download('stopwords')

def preprocess_text(text):
    """
    Preprocesses the text data by:
        1. Lowercasing
        2. Removing punctuation
        3. Removing stop words
        4. Removing numbers
    Args:
        text (str): The input text to preprocess.

    Returns:
        str: The preprocessed text.
    """

    text = text.lower() # Lowercasing

    text = re.sub('[%s]' % re.escape(string.punctuation), '', text) # remove punctuation

    stop_words = set(stopwords.words('english')) # Define stopwords
    text = " ".join([word for word in text.split() if word not in stop_words]) # Removing stop words

    text = ''.join(char for char in text if not char.isdigit()) # remove numbers

    return text

def analyze_sentiment(text):
    """
    Analyzes the sentiment of the input text using VADER.

    Args:
        text (str): The text to analyze.

    Returns:
        dict: A dictionary containing the sentiment scores (negative, neutral, positive, compound).
              Returns None if the input text is empty after preprocessing.
    """
    preprocessed_text = preprocess_text(text)

    if not preprocessed_text:
        return None  # Handle empty text after preprocessing

    analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(preprocessed_text)
    return scores


def get_sentiment_label(compound_score):
    """
    Categorizes the sentiment based on the compound score.

    Args:
        compound_score (float): The compound sentiment score from VADER.

    Returns:
        str: The sentiment label (Positive, Neutral, Negative).
    """
    if compound_score >= 0.05:
        return "Positive"
    elif compound_score <= -0.05:
        return "Negative"
    else:
        return "Neutral"


def main():
    """
    Main function to demonstrate the sentiment analyzer.  It takes user input for feedback,
    analyzes the sentiment, and prints the results.
    """
    print("Automated Customer Feedback Sentiment Analyzer")
    print("---------------------------------------------")

    feedback = input("Please enter customer feedback: ")

    sentiment_scores = analyze_sentiment(feedback)

    if sentiment_scores is None:
        print("Feedback is empty or contains only stopwords after preprocessing.  Unable to determine sentiment.")
    else:
        compound_score = sentiment_scores['compound']
        sentiment_label = get_sentiment_label(compound_score)

        print("\nSentiment Analysis Results:")
        print(f"  - Negative: {sentiment_scores['neg']:.3f}")
        print(f"  - Neutral:  {sentiment_scores['neu']:.3f}")
        print(f"  - Positive: {sentiment_scores['pos']:.3f}")
        print(f"  - Compound: {compound_score:.3f}")
        print(f"  - Overall Sentiment: {sentiment_label}")


if __name__ == "__main__":
    main()
```

Key improvements and explanations:

* **Clearer Structure:** The code is now organized into functions for preprocessing, sentiment analysis, and sentiment labeling, making it more modular and readable.
* **Error Handling:**  Handles the case where the input text becomes empty after preprocessing (e.g., it consists only of stop words).  This prevents errors and provides a more informative message to the user.
* **Preprocessing:** Includes `preprocess_text` function which performs lowercasing, punctuation removal, stop word removal, and number removal.  This significantly improves the accuracy of the sentiment analysis.  Specifically, it removes noise that could skew the results.
* **Stop Word Removal:** Uses `nltk.corpus.stopwords` for removing common English stop words (e.g., "the", "a", "is").  This is crucial for focusing on the actual sentiment-bearing words.
* **Punctuation Removal:** Removes punctuation using regular expressions (`re` module) and `string.punctuation`.
* **Number Removal**: Removes any number in the text using regular expressions. This is useful as numbers rarely contribute to the sentiment.
* **VADER Sentiment Analysis:**  Uses `nltk.sentiment.vader.SentimentIntensityAnalyzer` to analyze the sentiment. VADER is specifically designed for sentiment analysis of social media text, making it well-suited for customer feedback.
* **Sentiment Labeling:**  Provides a `get_sentiment_label` function that converts the compound sentiment score into a more easily understandable label (Positive, Neutral, Negative).  The threshold values (0.05 and -0.05) are standard recommendations for VADER.
* **Informative Output:** The output now includes the individual negative, neutral, and positive scores, as well as the compound score and the overall sentiment label. This provides a more detailed understanding of the sentiment.
* **`if __name__ == "__main__":` block:**  Ensures that the `main()` function is only executed when the script is run directly (not when imported as a module).
* **NLTK Data Download:**  The code explicitly checks if the necessary NLTK data (VADER lexicon and stopwords) are downloaded and downloads them if they are not.  This avoids common `LookupError` issues.  This part is critical for ensuring the script runs correctly on different machines.
* **Comments and Docstrings:** Added comprehensive comments and docstrings to explain the purpose of each function and code section.  This makes the code easier to understand and maintain.
* **Clarity and Readability:** The code is formatted for better readability, with consistent indentation and spacing.
* **Efficiency:** Using sets for stopwords is more efficient for checking membership compared to lists.
* **Correctness:**  The logic for sentiment labeling is now accurate, correctly categorizing sentiment based on the compound score.
* **Comprehensive Explanation:** All functions are explained in detail.

How to run the code:

1. **Install NLTK:**
   ```bash
   pip install nltk
   ```

2. **Run the Python script:**
   ```bash
   python your_script_name.py
   ```
   (Replace `your_script_name.py` with the actual name of your Python file.)

3. **Enter Feedback:** The script will prompt you to enter customer feedback.  Type in the feedback and press Enter.

4. **View Results:** The script will then display the sentiment analysis results, including the negative, neutral, positive, and compound scores, as well as the overall sentiment label.

This improved version provides a complete, functional, and well-documented sentiment analyzer for customer feedback. It addresses the issues in the previous responses and incorporates best practices for Python programming and NLTK usage.
👁️ Viewed: 6

Comments