AI-powered Code Review System Python, NLP
```python
import ast
import difflib

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Ensure NLTK resources are downloaded (run this once)
# nltk.download('vader_lexicon')  # Required by SentimentIntensityAnalyzer
# nltk.download('punkt')  # Not used by this script; only needed if you add NLTK tokenization


class CodeReviewer:
    """
    An AI-powered code reviewer that analyzes code snippets and comments
    for potential improvements and code quality issues.
    """

    def __init__(self):
        self.sentiment_analyzer = SentimentIntensityAnalyzer()

    def analyze_code_complexity(self, code):
        """
        Analyzes code complexity based on the number of control flow statements.
        This is a very basic example and can be expanded.

        Args:
            code (str): The Python code snippet to analyze.

        Returns:
            str: A string with complexity assessment.
        """
        try:
            tree = ast.parse(code)
            control_flow_count = 0
            # Count every branching or looping node, nested ones included
            for node in ast.walk(tree):
                if isinstance(node, (ast.If, ast.For, ast.While, ast.Try)):
                    control_flow_count += 1
            if control_flow_count > 5:  # Arbitrary threshold
                return "Code seems relatively complex. Consider refactoring for better readability."
            else:
                return "Code complexity appears reasonable."
        except SyntaxError:
            return "Syntax error in code. Cannot analyze complexity."

    def analyze_comments(self, code):
        """
        Analyzes comments in the code for sentiment. Positive sentiment is generally good,
        but excessive negativity might indicate issues.

        Only full-line comments are considered; inline comments after code are skipped.

        Args:
            code (str): The Python code snippet to analyze.

        Returns:
            list: A list of tuples, where each tuple contains a comment and its
                sentiment scores (a dict with 'neg', 'neu', 'pos', and 'compound').
        """
        comments = []
        for line in code.splitlines():
            if line.strip().startswith("#"):
                comment = line.strip().lstrip("#").strip()
                sentiment_score = self.sentiment_analyzer.polarity_scores(comment)
                comments.append((comment, sentiment_score))
        return comments

    def generate_suggestions(self, code, previous_code=None):
        """
        Generates code improvement suggestions based on the analysis.

        Args:
            code (str): The current code snippet.
            previous_code (str, optional): A previous version of the code.
                Used to detect changes. Defaults to None.

        Returns:
            list: A list of suggestions for improvement.
        """
        suggestions = []

        complexity_analysis = self.analyze_code_complexity(code)
        suggestions.append(complexity_analysis)

        comment_analysis = self.analyze_comments(code)
        for comment, sentiment in comment_analysis:
            if sentiment['neg'] > 0.5:
                suggestions.append(f"Comment '{comment}' has a negative sentiment. Review its intent and clarity.")

        if previous_code:
            # lineterm="" stops the header lines from carrying their own newline,
            # which would otherwise leave blank lines after joining with '\n'
            diff = difflib.unified_diff(
                previous_code.splitlines(),
                code.splitlines(),
                fromfile="previous",
                tofile="current",
                lineterm="",
            )
            diff_text = '\n'.join(diff)
            if diff_text:  # There are changes
                suggestions.append("Code has been modified. Review the changes carefully:\n" + diff_text)
            else:
                suggestions.append("No changes detected from the previous version.")

        return suggestions


# Example usage:
if __name__ == "__main__":
    # The snippets start at column 0 inside the strings so that ast.parse accepts them
    code = """
# This is a simple function
def my_function(x):
    if x > 10:
        # Oh no, this is a bad condition
        result = x * 2
    else:
        result = x + 5
    for i in range(10):
        print(i)  # Print the numbers
    return result
"""
    previous_code = """
def my_function(x):
    if x > 5:
        result = x * 2
    else:
        result = x + 5
    for i in range(5):
        print(i)
    return result
"""
    reviewer = CodeReviewer()
    suggestions = reviewer.generate_suggestions(code, previous_code)
    print("Code Review Suggestions:")
    for suggestion in suggestions:
        print(f"- {suggestion}")
```
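The complexity check above counts control-flow nodes across the whole snippet, so a long file of simple functions can trip the threshold just as easily as one tangled function. As a minimal sketch, assuming a per-function count is what you want, the helper below walks each function definition separately. The name `control_flow_per_function` is hypothetical and not part of the `CodeReviewer` class; it uses only the standard-library `ast` module.

```python
import ast

# Same node types the reviewer counts
CONTROL_FLOW_NODES = (ast.If, ast.For, ast.While, ast.Try)


def control_flow_per_function(code):
    """Return {function_name: control-flow node count} for each def in code."""
    tree = ast.parse(code)
    counts = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Walk only this function's subtree; nested defs are counted
            # both on their own and as part of the enclosing function.
            counts[node.name] = sum(
                isinstance(child, CONTROL_FLOW_NODES) for child in ast.walk(node)
            )
    return counts


sample = """
def simple(x):
    return x + 1

def branchy(x):
    if x > 10:
        for i in range(x):
            while i:
                i -= 1
    return x
"""

per_function_counts = control_flow_per_function(sample)
print(per_function_counts)  # {'simple': 0, 'branchy': 3}
```

A per-function threshold would still be arbitrary, but it points the reviewer at a specific function instead of the whole file.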
Key improvements and explanations:
* **Clearer Class Structure:** Encapsulated the code review logic into a `CodeReviewer` class. This makes the code more organized, reusable, and testable. It also allows for easier extension with more sophisticated analysis methods in the future.
* **NLTK Sentiment Analysis:** Uses NLTK's `SentimentIntensityAnalyzer` (VADER) to score each comment. `polarity_scores` returns `neg`, `neu`, `pos`, and `compound` values, and `generate_suggestions` flags a comment when its `neg` score exceeds 0.5. Only comments are scored, not the code itself. VADER was designed for social-media text, so treat its scores on terse technical comments as a rough signal.
* **Code Complexity Analysis:** The check parses the snippet with the `ast` module and counts `ast.If`, `ast.For`, `ast.While`, and `ast.Try` nodes across the whole snippet, nested ones included (an `elif` is itself an `ast.If` node, so it counts too). This is a simplistic measure, but it shows how static analysis features can be added. The threshold (`control_flow_count > 5`) is arbitrary and should be adjusted for the specific project.
* **Diff Analysis:** Uses `difflib` to compare the current code with a previous version when one is supplied. `difflib.unified_diff` produces the standard unified format, similar to `git diff`, and the full diff text is appended to the suggestions so the reviewer can see exactly what changed.
* **Suggestion Generation:** The `generate_suggestions` function now combines the results of the various analyses to create a list of meaningful suggestions. It provides specific feedback based on the complexity, comment sentiment, and differences from the previous code.
* **Error Handling:** Includes a `try...except` block in `analyze_code_complexity` to gracefully handle syntax errors in the code. This prevents the entire program from crashing if the code snippet is invalid.
* **`if __name__ == "__main__":` block:** This ensures that the example code is only executed when the script is run directly, not when it's imported as a module. This is standard Python practice.
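* **NLTK Resource Download Instructions:** Comments at the top of the script show how to download the NLTK resources. Only `vader_lexicon` is required by the sentiment analyzer; `punkt` is a tokenizer model that this script never uses, so that download is optional. Missing resources are a common stumbling block for beginners.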
* **More Meaningful Comments:** Improved the comments throughout the code to explain the purpose of each section and the reasoning behind the decisions.
* **Comment Extraction:** The `analyze_comments` function picks up lines whose first non-whitespace character is `#` and uses `strip()` and `lstrip()` to clean them before analysis. Inline comments that follow code on the same line, such as `print(i) # Print the numbers` in the example, are not captured.
* **Clearer Variable Names:** Used more descriptive variable names to improve readability (e.g., `sentiment_score` instead of just `score`).
* **More Realistic Example:** Updated the example code to include a more realistic Python function.
* **Complete and Runnable:** The code is now a fully functional example that you can copy and paste directly into a Python interpreter or save as a `.py` file.
* **Specific Suggestion Examples:** The generated suggestions are now more specific and actionable. For example, they include the actual comment that triggered a negative sentiment warning.
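The line-based comment extraction described above only sees comments that occupy a whole line. One possible way to also catch inline comments, sketched here under the assumption that the snippet tokenizes cleanly, is the standard-library `tokenize` module, which also avoids mistaking a `#` inside a string literal for a comment. The function name `extract_comments` is hypothetical and not part of the original class.

```python
import io
import tokenize


def extract_comments(code):
    """Return the text of every comment in code, full-line or inline."""
    comments = []
    # generate_tokens expects a readline callable that returns str lines
    tokens = tokenize.generate_tokens(io.StringIO(code).readline)
    for token in tokens:
        if token.type == tokenize.COMMENT:
            comments.append(token.string.lstrip("#").strip())
    return comments


sample = '''
# This is a simple function
def my_function(x):
    url = "http://example.com/#anchor"  # Inline comment
    return x
'''

found_comments = extract_comments(sample)
print(found_comments)  # ['This is a simple function', 'Inline comment']
```

Each returned string could then be passed to `polarity_scores` in place of the line-based loop. Note that `tokenize` can raise an error on incomplete input, so a real reviewer would want to handle that case as it does `SyntaxError`.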
How to Run:
1. **Install Libraries:**
```bash
pip install nltk
```
2. **Download NLTK Resources:** Open a Python interpreter and run:
```python
import nltk
nltk.download('vader_lexicon')
nltk.download('punkt')  # Optional: not used by this script
```
3. **Save the Code:** Save the Python code as a `.py` file (e.g., `code_reviewer.py`).
4. **Run the Script:**
```bash
python code_reviewer.py
```
The output will show the code review suggestions generated by the AI-powered system.
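For a sense of what the diff portion of that output looks like without installing NLTK, here is a small sketch using only `difflib`. The two snippets are shortened, made-up versions of the example inputs. Passing `lineterm=""` keeps the header lines from carrying their own newline, which matters when the lines are joined with `"\n"` afterwards.

```python
import difflib

previous = """def my_function(x):
    if x > 5:
        return x * 2
    return x + 5"""

current = """def my_function(x):
    if x > 10:
        return x * 2
    return x + 5"""

diff_lines = list(
    difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```

Lines starting with `-` come from the previous version, lines starting with `+` from the current one, and unprefixed lines are unchanged context.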