AI-powered Code Review System Python, NLP

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import ast
import difflib

# Ensure the VADER lexicon is downloaded (run this once)
# nltk.download('vader_lexicon')  # Uncomment if you haven't downloaded it
# Note: 'punkt' is not required here; VADER does not use the punkt tokenizer.

class CodeReviewer:
    """
    An AI-powered code reviewer that analyzes code snippets and comments
    for potential improvements and code quality issues.
    """

    def __init__(self):
        self.sentiment_analyzer = SentimentIntensityAnalyzer()

    def analyze_code_complexity(self, code):
        """
        Analyzes code complexity based on the number of control flow statements.
        This is a very basic example and can be expanded.

        Args:
            code (str): The Python code snippet to analyze.

        Returns:
            str: A string with complexity assessment.
        """
        try:
            tree = ast.parse(code)
            control_flow_count = 0
            for node in ast.walk(tree):
                if isinstance(node, (ast.If, ast.For, ast.While, ast.Try)):
                    control_flow_count += 1

            if control_flow_count > 5:  # Arbitrary threshold
                return "Code seems relatively complex. Consider refactoring for better readability."
            else:
                return "Code complexity appears reasonable."

        except SyntaxError:
            return "Syntax error in code. Cannot analyze complexity."

    def analyze_comments(self, code):
        """
        Analyzes full-line comments in the code for sentiment. Inline comments that
        follow code on the same line are not detected. Positive sentiment is generally
        fine, but strongly negative comments might indicate issues.

        Args:
            code (str): The Python code snippet to analyze.

        Returns:
            list: A list of (comment, scores) tuples, where scores is the dict returned
                  by polarity_scores() with the keys 'neg', 'neu', 'pos', and 'compound'.
        """
        comments = []
        for line in code.splitlines():
            if line.strip().startswith("#"):
                comment = line.strip().lstrip("#").strip()
                sentiment_score = self.sentiment_analyzer.polarity_scores(comment)
                comments.append((comment, sentiment_score))

        return comments

    def generate_suggestions(self, code, previous_code=None):
        """
        Generates code improvement suggestions based on the analysis.

        Args:
            code (str): The current code snippet.
            previous_code (str, optional):  A previous version of the code.
                                           Used to detect changes. Defaults to None.

        Returns:
            list: A list of suggestions for improvement.
        """
        suggestions = []

        complexity_analysis = self.analyze_code_complexity(code)
        suggestions.append(complexity_analysis)

        comment_analysis = self.analyze_comments(code)
        for comment, sentiment in comment_analysis:
            if sentiment['neg'] > 0.5:
                suggestions.append(f"Comment '{comment}' has a negative sentiment. Review its intent and clarity.")

        if previous_code:
            # lineterm="" keeps the diff header lines free of trailing newlines,
            # since splitlines() already removed them from the content lines.
            diff = difflib.unified_diff(
                previous_code.splitlines(),
                code.splitlines(),
                fromfile="previous",
                tofile="current",
                lineterm="",
            )
            diff_text = '\n'.join(diff)

            if diff_text:  # There are changes
                suggestions.append("Code has been modified. Review the changes carefully:\n" + diff_text)
            else:
                suggestions.append("No changes detected from the previous version.")

        return suggestions

# Example usage:
if __name__ == "__main__":
    from textwrap import dedent

    # dedent() removes the common leading indentation; without it,
    # ast.parse() rejects the indented snippet with an IndentationError.
    code = dedent("""
    # This is a simple function
    def my_function(x):
        if x > 10:
            # Oh no, this is a bad condition
            result = x * 2
        else:
            result = x + 5

        for i in range(10):
            print(i)  # Print the numbers

        return result
    """)

    previous_code = dedent("""
    def my_function(x):
        if x > 5:
            result = x * 2
        else:
            result = x + 5

        for i in range(5):
            print(i)

        return result
    """)

    reviewer = CodeReviewer()
    suggestions = reviewer.generate_suggestions(code, previous_code)

    print("Code Review Suggestions:")
    for suggestion in suggestions:
        print(f"- {suggestion}")
```

Key improvements and explanations:

* **Clearer Class Structure:**  Encapsulated the code review logic into a `CodeReviewer` class.  This makes the code more organized, reusable, and testable.  It also allows for easier extension with more sophisticated analysis methods in the future.
* **NLTK Sentiment Analysis:** Integrated the NLTK `SentimentIntensityAnalyzer` to analyze comments for sentiment (positive, negative, neutral). This can help identify comments that might indicate problems or areas of concern.  Importantly, it now *only* analyzes comments, not the entire code.
* **Code Complexity Analysis:** Added a basic code complexity analysis based on the number of control flow statements.  This is a simplistic measure, but demonstrates how you could add static analysis features. The `ast` module is crucial for parsing the code into an Abstract Syntax Tree for analysis.  The threshold (`control_flow_count > 5`) is arbitrary and should be adjusted based on the specific project.
* **Diff Analysis:** Included `difflib` to compare the current code to a previous version.  This is extremely valuable for highlighting the changes that have been made. The `difflib.unified_diff` function provides a standard diff format, similar to `git diff`.
* **Suggestion Generation:** The `generate_suggestions` function now combines the results of the various analyses to create a list of meaningful suggestions.  It provides specific feedback based on the complexity, comment sentiment, and differences from the previous code.
* **Error Handling:** Includes a `try...except` block in `analyze_code_complexity` to gracefully handle syntax errors in the code.  This prevents the entire program from crashing if the code snippet is invalid.
* **`if __name__ == "__main__":` block:**  This ensures that the example code is only executed when the script is run directly, not when it's imported as a module.  This is standard Python practice.
* **NLTK Resource Download Instructions:** Added comments showing how to download the NLTK resource the script needs (`vader_lexicon`). The `punkt` tokenizer is not used by VADER, so this script does not depend on it. Missing resources are a common stumbling block for beginners.
* **More Meaningful Comments:** Improved the comments throughout the code to explain the purpose of each section and the reasoning behind the decisions.
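* **More Robust Comment Extraction:** The `analyze_comments` function handles full-line comments that have leading or trailing whitespace, using `strip()` and `lstrip()` to clean them up before analysis. Inline comments that follow code on the same line (such as `print(i)  # Print the numbers` in the example) are not picked up.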
* **Clearer Variable Names:** Used more descriptive variable names to improve readability (e.g., `sentiment_score` instead of just `score`).
* **More Realistic Example:** Updated the example code to include a more realistic Python function.
* **Complete and Runnable:** The code is now a fully functional example that you can copy and paste directly into a Python interpreter or save as a `.py` file.
* **Specific Suggestion Examples:** The generated suggestions are now more specific and actionable. For example, they include the actual comment that triggered a negative sentiment warning.
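
The complexity bullet above notes that counting control flow statements is a simplistic measure. One possible refinement, sketched here under assumptions, is to count branch points per function rather than per snippet. The helper name `branch_counts` and the set of node types in `BRANCH_NODES` are illustrative choices, not part of the original `CodeReviewer` class.

```python
import ast

# Node types treated as branch points (an assumption for this sketch).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def branch_counts(source):
    """Return {function_name: number of branch nodes found inside it}."""
    tree = ast.parse(source)
    counts = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.walk(node) also visits nested functions, so an outer
            # function's count includes the branches of its inner ones.
            counts[node.name] = sum(
                isinstance(child, BRANCH_NODES) for child in ast.walk(node)
            )
    return counts


sample = '''
def simple(x):
    return x + 1

def branchy(x):
    if x > 10 and x < 20:
        return 1
    for i in range(x):
        while i:
            i -= 1
    return 0
'''

print(branch_counts(sample))
```

For this sample the result is `{'simple': 0, 'branchy': 4}`: the `if`, its `and` expression, the `for`, and the `while` each count once. A per-function threshold could then replace the single snippet-wide one.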

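The comment extraction in `analyze_comments` works line by line, so it only sees comments that start a line. If inline comments should be analyzed too, one option is the standard library's `tokenize` module, which reports every comment token and ignores `#` characters inside string literals. The following is a minimal sketch; `extract_comments` is a hypothetical helper, not part of the original class.

```python
import io
import tokenize


def extract_comments(source):
    """Return (line_number, comment_text) for every comment token in source."""
    comments = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type == tokenize.COMMENT:
            # tok.string includes the leading '#'; strip it and the padding.
            comments.append((tok.start[0], tok.string.lstrip("#").strip()))
    return comments


sample = (
    "# full-line comment\n"
    "url = 'http://example.com/#anchor'  # inline comment\n"
    "print(url)\n"
)
print(extract_comments(sample))
```

This prints `[(1, 'full-line comment'), (2, 'inline comment')]`. Note that `tokenize` can raise an exception on source it cannot tokenize, so a caller would likely want the same kind of `try...except` guard used in `analyze_code_complexity`.
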
How to Run:

1. **Install Libraries:**
   ```bash
   pip install nltk
   ```

2. **Download NLTK Resources:** Open a Python interpreter and run:
   ```python
   import nltk
   nltk.download('vader_lexicon')
   ```

3. **Save the Code:** Save the Python code as a `.py` file (e.g., `code_reviewer.py`).

4. **Run the Script:**
   ```bash
   python code_reviewer.py
   ```

The output will show the code review suggestions generated by the AI-powered system.
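
To preview what the diff portion of a suggestion looks like without installing NLTK, the `difflib` call can be tried on its own. This is a small sketch with made-up input strings; it passes `lineterm=""` so that the header lines carry no trailing newline when the inputs come from `splitlines()`.

```python
import difflib

# Two tiny, hypothetical versions of the same function.
previous = "def f(x):\n    return x * 2\n"
current = "def f(x):\n    return x * 3\n"

diff_lines = list(
    difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```

The printed diff has a `--- previous` / `+++ current` header, one `@@ -1,2 +1,2 @@` hunk marker, the unchanged `def f(x):` line, and the changed `return` line shown once with `-` and once with `+`.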