Automated Customer Feedback Sentiment Analyzer for Improving Service Quality Python
👤 Sharing: AI
```python
import re
import string

import nltk
from nltk.corpus import stopwords
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Ensure the necessary NLTK data is available (downloaded on first run only).
# nltk.data.find() expects the resource path including its category folder.
try:
    nltk.data.find('sentiment/vader_lexicon.zip')
except LookupError:
    nltk.download('vader_lexicon')
try:
    nltk.data.find('corpora/stopwords')
except LookupError:
    nltk.download('stopwords')


def preprocess_text(text):
    """
    Preprocesses the text data by:
    1. Lowercasing
    2. Removing punctuation
    3. Removing numbers
    4. Removing stop words

    Args:
        text (str): The input text to preprocess.

    Returns:
        str: The preprocessed text.
    """
    text = text.lower()  # Lowercasing
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)  # Remove punctuation
    text = ''.join(char for char in text if not char.isdigit())  # Remove numbers
    stop_words = set(stopwords.words('english'))  # Set gives fast membership checks
    # Remove stop words; split()/join() also collapses whitespace left by removed digits
    text = " ".join(word for word in text.split() if word not in stop_words)
    return text


def analyze_sentiment(text):
    """
    Analyzes the sentiment of the input text using VADER.

    Args:
        text (str): The text to analyze.

    Returns:
        dict: A dictionary containing the sentiment scores (negative, neutral, positive, compound).
        Returns None if the input text is empty after preprocessing.
    """
    preprocessed_text = preprocess_text(text)
    if not preprocessed_text:
        return None  # Handle empty text after preprocessing
    analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(preprocessed_text)
    return scores


def get_sentiment_label(compound_score):
    """
    Categorizes the sentiment based on the compound score.

    Args:
        compound_score (float): The compound sentiment score from VADER.

    Returns:
        str: The sentiment label (Positive, Neutral, Negative).
    """
    if compound_score >= 0.05:
        return "Positive"
    elif compound_score <= -0.05:
        return "Negative"
    else:
        return "Neutral"


def main():
    """
    Main function to demonstrate the sentiment analyzer. It takes user input for feedback,
    analyzes the sentiment, and prints the results.
    """
    print("Automated Customer Feedback Sentiment Analyzer")
    print("---------------------------------------------")
    feedback = input("Please enter customer feedback: ")
    sentiment_scores = analyze_sentiment(feedback)
    if sentiment_scores is None:
        print("Feedback is empty or contains only stop words, punctuation, or numbers "
              "after preprocessing. Unable to determine sentiment.")
    else:
        compound_score = sentiment_scores['compound']
        sentiment_label = get_sentiment_label(compound_score)
        print("\nSentiment Analysis Results:")
        print(f" - Negative: {sentiment_scores['neg']:.3f}")
        print(f" - Neutral: {sentiment_scores['neu']:.3f}")
        print(f" - Positive: {sentiment_scores['pos']:.3f}")
        print(f" - Compound: {compound_score:.3f}")
        print(f" - Overall Sentiment: {sentiment_label}")


if __name__ == "__main__":
    main()
```
Key improvements and explanations:
* **Clearer Structure:** The code is organized into separate functions for preprocessing, sentiment analysis, and sentiment labeling, making it modular and readable.
* **Error Handling:** Handles the case where the input text becomes empty after preprocessing (e.g., it consists only of stop words). This avoids scoring an empty string and gives the user a more informative message.
* **Preprocessing:** The `preprocess_text` function performs lowercasing, punctuation removal, stop word removal, and number removal to strip noise from the input. Note that VADER itself treats capitalization, punctuation such as exclamation marks, and negation words as sentiment cues, so this cleanup does not necessarily improve accuracy; compare scores on raw and preprocessed text before relying on it.
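* **Stop Word Removal:** Uses `nltk.corpus.stopwords` to remove common English stop words (e.g., "the", "a", "is") so that the remaining words are mostly the sentiment-bearing ones. Be aware that NLTK's English list also contains negations such as "not" and "no", which do carry sentiment.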
* **Punctuation Removal:** Removes punctuation using regular expressions (`re` module) and `string.punctuation`.
* **Number Removal:** Removes digits from the text by filtering out every character for which `str.isdigit()` is true. Numbers rarely contribute to the sentiment.
* **VADER Sentiment Analysis:** Uses `nltk.sentiment.vader.SentimentIntensityAnalyzer` to analyze the sentiment. VADER is specifically designed for sentiment analysis of social media text, making it well-suited for customer feedback.
* **Sentiment Labeling:** Provides a `get_sentiment_label` function that converts the compound sentiment score into a more easily understandable label (Positive, Neutral, Negative). The threshold values (0.05 and -0.05) are standard recommendations for VADER.
* **Informative Output:** The output now includes the individual negative, neutral, and positive scores, as well as the compound score and the overall sentiment label. This provides a more detailed understanding of the sentiment.
* **`if __name__ == "__main__":` block:** Ensures that the `main()` function is only executed when the script is run directly (not when imported as a module).
* **NLTK Data Download:** The code explicitly checks if the necessary NLTK data (VADER lexicon and stopwords) are downloaded and downloads them if they are not. This avoids common `LookupError` issues. This part is critical for ensuring the script runs correctly on different machines.
* **Comments and Docstrings:** Added comprehensive comments and docstrings to explain the purpose of each function and code section. This makes the code easier to understand and maintain.
* **Clarity and Readability:** The code is formatted for better readability, with consistent indentation and spacing.
* **Efficiency:** Using sets for stopwords is more efficient for checking membership compared to lists.
* **Correctness:** Sentiment labeling maps the compound score to a label with inclusive thresholds: scores of 0.05 or more are Positive, scores of -0.05 or less are Negative, and everything in between is Neutral.
* **Comprehensive Explanation:** All functions are explained in detail.
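One caveat on the stop word bullet above: NLTK's English stop-word list includes negations such as "not" and "no", so dropping every stop word can reverse the meaning of a comment. The following is a minimal sketch of a preprocessing variant that keeps negations. It is not part of the script above: `BASIC_STOP_WORDS`, `NEGATIONS`, and `preprocess_keep_negations` are hypothetical names, and the tiny stop-word set is a stand-in so the example runs without NLTK data.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word set for illustration only;
# in the script above this would come from stopwords.words('english').
BASIC_STOP_WORDS = {"the", "a", "an", "is", "was", "it", "this", "not", "no"}

# Negation words to keep, because they reverse the sentiment that follows.
NEGATIONS = {"not", "no", "never", "nor"}

_PUNCT_RE = re.compile("[%s]" % re.escape(string.punctuation))


def preprocess_keep_negations(text):
    """Lowercase, strip punctuation and digits, drop stop words except negations."""
    text = _PUNCT_RE.sub("", text.lower())
    text = re.sub(r"\d+", "", text)
    removable = BASIC_STOP_WORDS - NEGATIONS
    return " ".join(word for word in text.split() if word not in removable)


print(preprocess_keep_negations("The service was NOT good, 2 stars!"))
# prints: service not good stars
```

With the full NLTK list, the same idea would be `set(stopwords.words('english')) - NEGATIONS`. Whether this helps on a given feedback set is worth checking against scores computed on the raw text.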
How to run the code:
1. **Install NLTK:**
```bash
pip install nltk
```
2. **Run the Python script:**
```bash
python your_script_name.py
```
(Replace `your_script_name.py` with the actual name of your Python file.)
3. **Enter Feedback:** The script will prompt you to enter customer feedback. Type in the feedback and press Enter.
4. **View Results:** The script will then display the sentiment analysis results, including the negative, neutral, positive, and compound scores, as well as the overall sentiment label.
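The steps above score one comment per run. To track service quality over time, it is usually more useful to label many comments and look at the distribution. A minimal sketch, assuming the compound scores have already been computed (for example with `analyze_sentiment`); `summarize_labels` is a hypothetical helper and the sample scores are made up for illustration:

```python
from collections import Counter


def get_sentiment_label(compound_score):
    """Same thresholds as the script above: inclusive at 0.05 and -0.05."""
    if compound_score >= 0.05:
        return "Positive"
    elif compound_score <= -0.05:
        return "Negative"
    return "Neutral"


def summarize_labels(compound_scores):
    """Count how many compound scores fall under each sentiment label."""
    counts = Counter(get_sentiment_label(score) for score in compound_scores)
    # Always report all three labels, even when a count is zero.
    return {label: counts.get(label, 0) for label in ("Positive", "Neutral", "Negative")}


# Made-up compound scores, including the two boundary values.
sample_scores = [0.72, -0.41, 0.0, 0.05, -0.05, 0.03]
print(summarize_labels(sample_scores))
# prints: {'Positive': 2, 'Neutral': 2, 'Negative': 2}
```

A rising share of Negative labels across batches is the kind of signal a service team would then investigate by reading the underlying comments.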
This version provides a complete, functional, and documented sentiment analyzer for customer feedback, following common practices for Python programming and NLTK usage.