AI-Based Predictive Text Autocompletion Tool for Mobile Keyboards (Kotlin)
```kotlin
class PredictiveText(private val dictionary: MutableSet<String>) {
    private val ngramSize = 3 // Adjust this value to experiment with different n-gram sizes
    private val ngramModel = mutableMapOf<String, MutableMap<String, Int>>()

    init {
        trainModel()
    }

    /**
     * Trains the n-gram model from scratch using the current dictionary.
     */
    private fun trainModel() {
        ngramModel.clear() // Rebuild from scratch so retraining does not double-count
        dictionary.forEach { word ->
            // Splitting on spaces allows training on phrases as well as single words;
            // the special end token marks the end of a word or phrase.
            val tokens = word.split(" ") + "<END>"
            for (i in tokens.indices) {
                // Context is up to the ngramSize - 1 tokens preceding tokens[i]
                val context = tokens.subList(maxOf(0, i - ngramSize + 1), i).joinToString(" ")
                val nextWord = tokens[i]
                val nextWordCounts = ngramModel.getOrPut(context) { mutableMapOf() }
                nextWordCounts[nextWord] = (nextWordCounts[nextWord] ?: 0) + 1
            }
        }
    }

    /**
     * Gets a list of predicted words based on the given input.
     *
     * @param input The input text.
     * @param maxPredictions The maximum number of predictions to return.
     * @return A list of predicted words, sorted by probability.
     */
    fun getPredictions(input: String, maxPredictions: Int = 5): List<String> {
        val tokens = input.trim().split(" ").filter { it.isNotEmpty() }
        // Use the last ngramSize - 1 tokens as context (fewer when the input is short)
        val context = tokens.subList(maxOf(0, tokens.size - ngramSize + 1), tokens.size)
            .joinToString(" ")
        val nextWordCounts = ngramModel[context] ?: return emptyList()

        // Calculate probabilities and sort
        val totalCount = nextWordCounts.values.sum()
        return nextWordCounts
            .filterKeys { it != "<END>" } // The end token is not a user-facing suggestion
            .map { (word, count) -> word to count.toDouble() / totalCount }
            .sortedByDescending { it.second }
            .take(maxPredictions)
            .map { it.first }
    }

    /**
     * Adds a new word to the dictionary and retrains the model. This allows the
     * tool to "learn" from user input.
     *
     * @param newWord The new word to add.
     */
    fun addWord(newWord: String) {
        dictionary.add(newWord)
        trainModel()
    }
}

fun main() {
    val dictionary = mutableSetOf(
        "hello world",
        "hello there",
        "world of warcraft",
        "world peace",
        "the quick brown fox",
        "the quick blue fox",
        "a brown fox",
        "this is a test",
        "this is another test",
        "another example",
        "example of usage"
    )
    val predictiveText = PredictiveText(dictionary)

    // Example usage
    println("Enter some text:")
    var input = readlnOrNull() ?: ""
    while (true) {
        val predictions = predictiveText.getPredictions(input)
        if (predictions.isNotEmpty()) {
            println("Predictions: ${predictions.joinToString(", ")}")
        } else {
            println("No predictions found.")
        }
        println("Enter more text (or type 'new word' to add a word, or 'exit' to quit):")
        val nextInput = readlnOrNull() ?: ""
        if (nextInput.lowercase() == "exit") break
        if (nextInput.lowercase() == "new word") {
            println("Enter the new word:")
            val newWord = readlnOrNull() ?: ""
            predictiveText.addWord(newWord)
            println("Added '$newWord' to the dictionary.")
        } else {
            input += " " + nextInput
        }
    }
    println("Exiting...")
}
```
Key improvements and explanations:
* **N-gram Model:** The core of the prediction engine. The code builds an n-gram model from the dictionary. This model stores the frequency of words appearing after specific contexts (sequences of preceding words). The `ngramSize` variable controls how many preceding words are considered (e.g., an `ngramSize` of 3 means the model looks at the two words before the one being predicted).
* **`trainModel()` Function:** This function iterates through the dictionary and builds the `ngramModel`. It tokenizes each word or phrase, and for each token, it extracts the context (up to the preceding n-1 tokens). It then updates the counts in the `ngramModel`. The `<END>` token is crucial for letting the model learn where words and phrases end.
* **`getPredictions()` Function:** This is the prediction logic:
* It extracts the relevant context from the current input.
* It looks up the context in the `ngramModel` to get a map of possible next words and their counts.
* It calculates the probability of each predicted word based on its frequency in the model.
* It sorts the predictions by probability and returns the top `maxPredictions` words.
* **`addWord()` Function:** This function adds a new word to the dictionary and retrains the model. This enables the system to learn and adapt to the user's input. Critically, the training step is repeated to incorporate the new word into the n-gram model.
* **Clearer `main()` Function:**
* The `main()` function now provides a simple command-line interface for testing the `PredictiveText` class.
* Includes error handling for null input from `readlnOrNull()`.
* It demonstrates how to use the `getPredictions()` and `addWord()` functions.
* Implements an "exit" command and a "new word" command to add words dynamically. This is a critical addition for demonstrating real-world usability.
* **Phrase Support:** The code now supports training on phrases, not just individual words. The dictionary can contain phrases like "hello world," and the model will learn to predict those phrases.
* **End Token:** The `<END>` token is added to the end of each word or phrase during training. This helps the model learn when a word or phrase is likely to end, which is important for generating more natural-sounding predictions.
* **Context Handling:** The code handles cases where the input text is shorter than the `ngramSize`. This prevents errors and ensures that the model can still make predictions even when there's not much context. The use of `maxOf(0, i - ngramSize + 1)` and `maxOf(0, tokens.size - ngramSize + 1)` prevents out-of-bounds errors when extracting the context sublists.
* **Mutable Set/Map:** Using `mutableSetOf` for the dictionary and `mutableMapOf` for the `ngramModel` is essential for allowing the `addWord` function to modify the dictionary and the `trainModel` to modify the ngram model.
* **Clearer comments and explanations:** Comments added to explain each step in more detail.
* **Error Handling:** Using `readlnOrNull()` and handling the `null` case makes the program more robust.
* **String Interpolation:** Using string interpolation (`println("Added '$newWord' to the dictionary.")`) makes the output more readable.
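The counting and probability steps described above can be sketched on their own, outside the full class. The snippet below is a minimal illustration (assuming the same word-level trigrams and `<END>` convention as the listing), not a drop-in replacement; `buildCounts` is a name introduced here for the sketch:

```kotlin
// Minimal sketch of the n-gram counting and probability steps described above.
// Assumes word-level trigrams (ngramSize = 3) and the "<END>" convention from the listing.
fun buildCounts(phrases: List<String>, ngramSize: Int = 3): Map<String, Map<String, Int>> {
    val model = mutableMapOf<String, MutableMap<String, Int>>()
    for (phrase in phrases) {
        val tokens = phrase.split(" ") + "<END>"
        for (i in tokens.indices) {
            // Context is up to the ngramSize - 1 tokens preceding tokens[i]
            val context = tokens.subList(maxOf(0, i - ngramSize + 1), i).joinToString(" ")
            val counts = model.getOrPut(context) { mutableMapOf() }
            counts[tokens[i]] = (counts[tokens[i]] ?: 0) + 1
        }
    }
    return model
}

fun main() {
    val model = buildCounts(listOf("hello world", "hello there"))
    // Both phrases start with "hello", so the empty context counts it twice.
    println(model[""])      // {hello=2}
    // After "hello", each continuation was seen once: probability 1/2 each.
    val after = model["hello"]!!
    val total = after.values.sum()
    after.forEach { (word, count) -> println("$word: ${count.toDouble() / total}") }
}
```

Walking a tiny dictionary through this by hand is a quick way to check that contexts and counts line up with what you expect before experimenting with the full class.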
How to Run:
1. **Save:** Save the code as a `.kt` file (e.g., `PredictiveText.kt`).
2. **Compile:** Open a terminal or command prompt and compile the code using the Kotlin compiler:
```bash
kotlinc PredictiveText.kt -include-runtime -d PredictiveText.jar
```
3. **Run:** Execute the compiled JAR file:
```bash
java -jar PredictiveText.jar
```
This will start the program, and you can then enter text and see the predictions. You can add new words to the dictionary using the "new word" command.
This version provides a functional, extensible, and well-commented foundation for an AI-based predictive text autocompletion tool in Kotlin. Remember to adjust `ngramSize` and the contents of the `dictionary` to experiment with different model configurations.
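As a final aid to that experimentation, the clamped context window discussed under *Context Handling* is easy to inspect in isolation. The helper below is hypothetical (`contextAt` is not part of the `PredictiveText` class); it just shows how `maxOf(0, i - ngramSize + 1)` shortens the window near the start of a token list as `ngramSize` varies:

```kotlin
// Hypothetical helper illustrating the clamped context window used in the listing.
// contextAt(tokens, i, n) returns the up-to-(n - 1) tokens preceding index i.
fun contextAt(tokens: List<String>, i: Int, ngramSize: Int): String =
    tokens.subList(maxOf(0, i - ngramSize + 1), i).joinToString(" ")

fun main() {
    val tokens = listOf("the", "quick", "brown", "fox")
    // With ngramSize = 3, each word is predicted from up to two predecessors.
    println(contextAt(tokens, 0, 3)) // "" (nothing precedes the first word)
    println(contextAt(tokens, 1, 3)) // "the"
    println(contextAt(tokens, 3, 3)) // "quick brown"
    // A larger ngramSize widens the window when enough history exists.
    println(contextAt(tokens, 3, 4)) // "the quick brown"
}
```

Printing these windows for a few inputs makes the trade-off concrete: larger `ngramSize` values give more specific contexts but need far more training data to see each context even once.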