AI-Powered Text Styling (Python, NLP, AI)
```python
import random

import nltk
import spacy
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize

# Download necessary NLTK data (uncomment if you haven't already):
# nltk.download('punkt')
# nltk.download('averaged_perceptron_tagger')
# nltk.download('wordnet')
# nltk.download('stopwords')

# Load spaCy's English language model
# (you might need to download it first: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")


def synonym_replacement(text, n=1):
    """
    Replaces up to n words in the input text with WordNet synonyms.

    Args:
        text: The input text (string).
        n: The number of words to replace (integer).

    Returns:
        The modified text (string) with synonyms.
    """
    words = word_tokenize(text)
    new_words = words.copy()  # Work on a copy to avoid side effects
    # random.sample draws indices without replacement, and min() ensures we
    # never ask for more replacements than there are words.
    random_word_indices = random.sample(range(len(words)), min(n, len(words)))
    for i in random_word_indices:
        word = words[i]
        synonyms = []
        for syn in wordnet.synsets(word):
            for lemma in syn.lemmas():
                # Lemma names use underscores for multiword entries ("sit_down")
                candidate = lemma.name().replace("_", " ")
                if candidate.lower() != word.lower():
                    synonyms.append(candidate)
        if synonyms:
            new_words[i] = random.choice(synonyms)
    return " ".join(new_words)


def change_tense(text):
    """
    Changes the tense of the input text to the past tense. Uses spaCy for POS tagging.

    Args:
        text: The input text (string).

    Returns:
        The modified text (string) in the past tense.
    """
    doc = nlp(text)
    new_text = []
    for token in doc:
        if token.pos_ == "VERB":
            try:
                # Simple past tense conversion; imperfect for irregular verbs.
                new_text.append(token.lemma_ + "ed")
            except Exception:
                # If lemmatization fails, keep the original token
                new_text.append(token.text)
        else:
            new_text.append(token.text)
    return " ".join(new_text)


def add_intensity(text):
    """
    Adds words to intensify the sentiment of the text. This is a very basic
    example and could be improved with sentiment analysis.

    Args:
        text: The input text (string).

    Returns:
        The modified text (string) with added intensity.
    """
    words = word_tokenize(text)
    new_text = []
    intensity_words = ["very", "extremely", "incredibly", "utterly"]
    for word in words:
        new_text.append(word)
        if random.random() < 0.2:  # Add an intensity word with 20% probability
            new_text.append(random.choice(intensity_words))
    return " ".join(new_text)


def main():
    """
    Main function to demonstrate the AI-powered text styling.
    """
    text = "The cat sat on the mat. The dog ran quickly. I am happy."
    print("Original Text:", text)

    # Synonym replacement
    synonym_text = synonym_replacement(text, n=3)
    print("\nSynonym Replacement:", synonym_text)

    # Tense change
    past_tense_text = change_tense(text)
    print("\nTense Change (Past):", past_tense_text)

    # Add intensity
    intensity_text = add_intensity(text)
    print("\nAdd Intensity:", intensity_text)


if __name__ == "__main__":
    main()
```
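One rough edge worth noting: joining tokens back together with `" ".join(...)` puts spaces before punctuation (`"the mat ."`). NLTK ships a Treebank detokenizer that undoes most of this. A minimal sketch:

```python
from nltk.tokenize import word_tokenize
from nltk.tokenize.treebank import TreebankWordDetokenizer

detokenizer = TreebankWordDetokenizer()
tokens = word_tokenize("The cat sat on the mat.")
# detokenize() re-attaches punctuation to the preceding word:
# "The cat sat on the mat."
print(detokenizer.detokenize(tokens))
```

Any of the three styling functions above could return `detokenizer.detokenize(new_words)` instead of the plain join for cleaner output.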
Key improvements and explanations:
* **Clearer Function Definitions:** Each function now has a docstring explaining its purpose, arguments, and return value. This is essential for maintainability and readability.
* **NLTK Data Download:** The code includes commented-out lines that download the required NLTK datasets. The program fails without them, but the downloads are commented out so they don't re-run on every execution. Uncomment them on first use, or run the one-liner in the setup instructions below.
* **spaCy Loading:** Similar to NLTK, spaCy requires a language model. The code now explicitly loads the `en_core_web_sm` model. It also includes a comment instructing the user to download it if necessary. This is critical, as spaCy won't work without it.
* **Error Handling (Tense Change):** The `change_tense` function wraps the conversion in a `try...except` block, so the program keeps running and falls back to the original token if anything goes wrong during conversion. Note that the naive lemma-plus-"ed" rule is still wrong for irregular verbs ("run" becomes "runed"); a slightly better rule is sketched after this list.
* **Random Sampling Improvement:** The `synonym_replacement` function caps the number of replacements at the number of available words, so asking to replace *n* words in a shorter text cannot raise an `IndexError`. Because `random.sample` draws indices without replacement, no word is replaced twice.
* **Intensity Control:** The `add_intensity` function now adds intensity words with a probability. This makes the output less predictable and more natural.
* **Word Tokenization:** The code uses `word_tokenize` from `nltk.tokenize` in both `synonym_replacement` and `add_intensity`. This splits text into words while handling punctuation and contractions more effectively than splitting on spaces: `word_tokenize("don't")` yields `['do', "n't"]`, whereas `"don't".split()` leaves the contraction intact as one token.
* **Synonym Selection:** The `synonym_replacement` function iterates through the lemmas of each synset to get actual synonym strings rather than synset objects, replaces WordNet's underscores with spaces (so multiword entries like `sit_down` read naturally), and skips candidates identical to the original word. Synonym quality improves further if you restrict synsets to the word's part of speech; see the POS-filtering sketch after this list.
* **Code Structure:** The code is organized into functions, making it more modular and reusable. The `main` function demonstrates how to use the different functions.
* **Clearer Output:** The output now includes labels for each type of text modification.
* **Corrected Tense Change:** The `change_tense` function now correctly uses `token.lemma_` to get the base form of the verb before adding "ed." This is crucial for proper past tense conversion.
* **`if __name__ == "__main__":` block:** The code is now enclosed in an `if __name__ == "__main__":` block. This ensures that the `main` function is only called when the script is executed directly, not when it's imported as a module.
* **`copy()` to Avoid Side Effects:** In `synonym_replacement`, `new_words = words.copy()` creates a copy of the `words` list. This is crucial because modifying the original `words` list directly could lead to unexpected side effects.
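As noted above, the lemma-plus-"ed" rule mishandles irregular verbs and verbs ending in "e". A slightly better (still far from complete) sketch follows; the `IRREGULAR_PAST` table is a tiny illustrative lookup I've added for this example, not an exhaustive solution:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# Illustrative (not exhaustive) lookup of irregular past forms
IRREGULAR_PAST = {"be": "was", "go": "went", "run": "ran", "sit": "sat", "eat": "ate"}

def to_past(lemma):
    """Convert a verb lemma to a past form using simple spelling rules."""
    if lemma in IRREGULAR_PAST:
        return IRREGULAR_PAST[lemma]
    if lemma.endswith("e"):
        return lemma + "d"  # "like" -> "liked", not "likeed"
    return lemma + "ed"

def change_tense_v2(text):
    # Include AUX so copulas like "am" (lemma "be") become "was";
    # spaCy tags "am" as AUX, so the original VERB-only check misses it.
    doc = nlp(text)
    return " ".join(
        to_past(token.lemma_) if token.pos_ in ("VERB", "AUX") else token.text
        for token in doc
    )

print(change_tense_v2("The cat sits on the mat. I am happy."))
# -> "The cat sat on the mat . I was happy ."
```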
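Likewise, here is the POS-filtering idea for synonym selection: drawing only from synsets that match the word's part of speech prevents, say, a noun being replaced by a verb sense. A minimal sketch, mapping coarse spaCy POS tags to WordNet POS constants (the mapping dictionary is my illustrative subset, not a library API):

```python
from nltk.corpus import wordnet

# Map coarse spaCy POS tags to WordNet POS constants (illustrative subset)
SPACY_TO_WORDNET = {
    "NOUN": wordnet.NOUN,
    "VERB": wordnet.VERB,
    "ADJ": wordnet.ADJ,
    "ADV": wordnet.ADV,
}

def pos_filtered_synonyms(word, spacy_pos):
    """Return synonyms drawn only from synsets matching the word's POS."""
    wn_pos = SPACY_TO_WORDNET.get(spacy_pos)
    if wn_pos is None:
        return []
    synonyms = set()
    for syn in wordnet.synsets(word, pos=wn_pos):
        for lemma in syn.lemmas():
            candidate = lemma.name().replace("_", " ")
            if candidate.lower() != word.lower():
                synonyms.add(candidate)
    return sorted(synonyms)

# Example: only noun senses of "mat" are considered
print(pos_filtered_synonyms("mat", "NOUN"))
```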
How to run the code:
1. **Install the libraries, the spaCy model, and the NLTK data:**
```bash
pip install nltk spacy
python -m spacy download en_core_web_sm
python -c "import nltk; [nltk.download(d) for d in ['punkt', 'averaged_perceptron_tagger', 'wordnet', 'stopwords']]"
```
2. **Run the Python script:**
```bash
python your_script_name.py
```
This improved version addresses many potential issues and provides a more robust and understandable example of AI-powered text styling using Python, NLTK, and spaCy. It's more resilient to errors and produces more meaningful results. Remember that this is a *basic* example, and more sophisticated techniques (like using machine learning models for sentiment analysis and better tense conversion) would be needed for real-world applications.
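For instance, `add_intensity` could be made sentiment-aware, so intensifiers are only attached to sentences the model judges confidently positive or negative. A minimal sketch, assuming the Hugging Face `transformers` package is installed; the 0.9 threshold and the "Truly," prefix are arbitrary choices for illustration:

```python
from transformers import pipeline

# Default sentiment pipeline (downloads a small English model on first use)
sentiment = pipeline("sentiment-analysis")

def add_intensity_v2(sentences, threshold=0.9):
    """Prepend an intensifier only to confidently non-neutral sentences."""
    styled = []
    for sent in sentences:
        result = sentiment(sent)[0]  # e.g. {'label': 'POSITIVE', 'score': 0.99}
        if result["score"] >= threshold:
            styled.append("Truly, " + sent[0].lower() + sent[1:])
        else:
            styled.append(sent)
    return styled

print(add_intensity_v2(["I am happy.", "The cat sat on the mat."]))
```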