Scans pantry with image AI & builds optimized shopping lists Python
👤 Sharing: AI
```python
import cv2
import numpy as np
from PIL import Image
import pytesseract
import requests
from io import BytesIO

# Install dependencies:
#   pip install opencv-python pillow pytesseract requests
# Important: the Tesseract OCR engine must also be installed on your system.
# - Windows: download the installer from https://github.com/UB-Mannheim/tesseract/wiki, install it,
#   and then set `pytesseract.pytesseract.tesseract_cmd` to the path of the `tesseract.exe`
#   executable (e.g., `pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'`).
# - Linux (Debian/Ubuntu): `sudo apt install tesseract-ocr`
# - macOS: `brew install tesseract`

# Configuration: adjust these as needed for your setup.
PYTESSERACT_PATH = r'C:\Program Files\Tesseract-OCR\tesseract.exe'  # Only takes effect when passed to set_tesseract_path()
CONFIDENCE_THRESHOLD = 0.6  # Reserved for OCR confidence filtering; not used by the code below
SIMILARITY_THRESHOLD = 0.8  # Minimum similarity between an OCR line and a master-list item
MASTER_GROCERY_LIST = ["milk", "eggs", "bread", "cheese", "cereal", "pasta", "rice", "chicken", "beef", "fish", "vegetables", "fruits", "yogurt", "juice", "coffee", "tea", "sugar", "salt", "pepper", "oil", "flour", "tomatoes", "onions", "potatoes"]  # Add more items


def set_tesseract_path(path):
    """Sets the path to the Tesseract executable."""
    global PYTESSERACT_PATH
    PYTESSERACT_PATH = path
    # pytesseract reads the executable path from its inner `pytesseract` module
    pytesseract.pytesseract.tesseract_cmd = PYTESSERACT_PATH


def load_image(image_path):
    """Loads an image from a file path or URL."""
    try:
        if image_path.startswith(("http://", "https://")):
            # A timeout keeps the script from hanging on an unresponsive server
            response = requests.get(image_path, timeout=10)
            response.raise_for_status()  # Raise an exception for bad status codes
            image = Image.open(BytesIO(response.content))
        else:
            image = Image.open(image_path)
        return image
    except Exception as e:
        print(f"Error loading image: {e}")
        return None


def preprocess_image(image):
    """Preprocesses the image to improve OCR accuracy."""
    # Normalize to 3-channel RGB first so RGBA, palette, or grayscale inputs convert cleanly
    img_cv = cv2.cvtColor(np.array(image.convert("RGB")), cv2.COLOR_RGB2BGR)
    gray = cv2.cvtColor(img_cv, cv2.COLOR_BGR2GRAY)
    # Adaptive thresholding (inverted: text becomes white on a black background)
    thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
    # Dilation to connect nearby text
    kernel = np.ones((3, 3), np.uint8)
    dilate = cv2.dilate(thresh, kernel, iterations=2)
    return Image.fromarray(dilate)


def extract_text_from_image(image):
    """Extracts text from the image using Tesseract OCR."""
    try:
        text = pytesseract.image_to_string(image)
        return text
    except Exception as e:
        print(f"Error during OCR: {e}")
        return ""


def fuzzy_match(text, master_list, threshold=SIMILARITY_THRESHOLD):
    """
    Performs a fuzzy match against a master list of grocery items.

    Args:
        text: The text to match.
        master_list: The list of items to compare against.
        threshold: The minimum similarity score required for a match.

    Returns:
        The best matching item from the master list, or None if no match is found.
    """
    from difflib import SequenceMatcher

    best_match = None
    best_score = 0
    for item in master_list:
        score = SequenceMatcher(None, text.lower(), item.lower()).ratio()
        if score > best_score:
            best_score = score
            best_match = item
    if best_score >= threshold:
        return best_match
    else:
        return None


def analyze_pantry(image_path):
    """Analyzes the pantry image, extracts text, and builds a shopping list."""
    image = load_image(image_path)
    if image is None:
        print("Failed to load image.")
        return []
    preprocessed_image = preprocess_image(image)
    extracted_text = extract_text_from_image(preprocessed_image)
    if not extracted_text:
        print("No text extracted from the image.")
        return []
    # Split the extracted text into lines and process each line
    items_to_buy = []
    for line in extracted_text.splitlines():
        line = line.strip()  # Remove whitespace
        if not line:
            continue  # Skip empty lines
        # Use fuzzy matching to identify items in the pantry
        matched_item = fuzzy_match(line, MASTER_GROCERY_LIST)
        if matched_item:
            items_to_buy.append(matched_item)
            print(f"Detected: {line} -> Added to shopping list as: {matched_item}")  # Print more details for debugging
        else:
            print(f"Detected: {line} -> No match found in master list.")
    # Remove duplicates to get a clean shopping list
    shopping_list = list(set(items_to_buy))
    return shopping_list


# Example Usage
if __name__ == "__main__":
    # 1. OPTIONALLY: Set the path to your Tesseract executable.
    #    If Tesseract is in your system's PATH, you can skip this.
    # set_tesseract_path(r'C:\Program Files\Tesseract-OCR\tesseract.exe')  # Uncomment if needed and set to your path
    # 2. Specify the image path (local file or URL). Replace with your image!
    image_path = "pantry.jpg"  # Replace with your image path or URL
    # 3. Run the pantry analysis and get the shopping list.
    shopping_list = analyze_pantry(image_path)
    # 4. Print the generated shopping list.
    if shopping_list:
        print("\nShopping List:")
        for item in shopping_list:
            print(f"- {item}")
    else:
        print("No items found to add to the shopping list.")
```
Key improvements and explanations:
* **Clear Installation Instructions:** The code includes specific instructions on how to install Tesseract OCR on Windows, Linux, and macOS, and how to point pytesseract at the executable (the attribute to set is `pytesseract.pytesseract.tesseract_cmd`). This is *crucial* because Tesseract is an external dependency that `pip` does not install.
* **Error Handling:** Added `try...except` blocks in `load_image` and `extract_text_from_image` to gracefully handle potential errors during image loading or OCR processing. This prevents the program from crashing. The `response.raise_for_status()` line in `load_image` specifically catches HTTP errors (like 404 Not Found).
* **Image Loading from URL:** The `load_image` function now supports loading images from both local file paths and URLs using the `requests` library. It checks if the path starts with "http://" or "https://" and uses `requests.get()` accordingly.
* **Image Preprocessing:** Implemented the `preprocess_image` function using OpenCV to enhance OCR accuracy. This function converts the image to grayscale, applies adaptive thresholding and dilation. These techniques make the text more prominent and easier for Tesseract to recognize.
* **Fuzzy Matching:** Implemented fuzzy string matching using the `difflib` library (specifically `SequenceMatcher`). This allows for more robust matching of detected text to items in the `MASTER_GROCERY_LIST`, even if the OCR is not perfect. The `fuzzy_match` function now includes a `threshold` parameter to control the sensitivity of the matching.
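* **Confidence Threshold:** `CONFIDENCE_THRESHOLD` is defined but never used. The script calls `pytesseract.image_to_string`, which returns plain text without confidence scores, so `SIMILARITY_THRESHOLD` in `fuzzy_match` is the only filter: OCR lines that do not resemble any master-list item are discarded. Per-word confidences are available from `pytesseract.image_to_data` if you want to filter on them as well.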
* **Master Grocery List:** Included a starter `MASTER_GROCERY_LIST` of 24 common items that can be easily extended. This is the "brain" of the system: the program can only recognize items that appear in it.
* **Duplicate Removal:** The code now removes duplicate items from the shopping list using `list(set(items_to_buy))`.
* **Comments and Documentation:** Added detailed comments to explain each step of the code.
* **Clear Example Usage:** The `if __name__ == "__main__":` block provides a runnable example of how to use the code. It reminds the user to call `set_tesseract_path` if Tesseract is not on the system PATH and to replace `image_path` with their own image.
* **Debugging Output:** Added `print` statements to show what the program is detecting and how it's being matched against the master list. This is essential for debugging and understanding the program's behavior. The comments also suggest how to adjust thresholds for optimal performance.
* **PIL for Image Handling:** Uses `PIL` (Pillow) for initial image loading and format conversion. Pillow is a widely used third-party library for image manipulation in Python, and its images convert to and from the NumPy arrays that OpenCV works with.
* **Modular Design:** The code is broken down into functions, making it more organized, readable, and maintainable.
* **Adaptive Thresholding:** Uses adaptive thresholding in the preprocessing to handle images with varying lighting conditions.
* **Dilation:** Uses dilation to connect broken characters in the image, improving OCR accuracy.
* **Tesseract Path Setting:** Includes a function to explicitly set the Tesseract path, handling a common point of failure for users. The global variable `PYTESSERACT_PATH` ensures consistency.
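The confidence filtering mentioned in the list above could be sketched roughly as follows. This is a minimal sketch, not part of the original script: `confident_words` is a hypothetical helper, and it assumes its input has the shape of the dictionary returned by `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)`, with parallel lists under the `text` and `conf` keys. Tesseract reports confidence on a 0 to 100 scale, so the script's `CONFIDENCE_THRESHOLD = 0.6` would correspond to `min_conf=60` here. The sample data is hand-written so the block runs without Tesseract installed.

```python
def confident_words(ocr_data, min_conf=60.0):
    """Return words whose OCR confidence (0-100) is at least min_conf.

    ocr_data is assumed to look like the dict produced by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT):
    parallel lists under the "text" and "conf" keys.
    """
    words = []
    for text, conf in zip(ocr_data["text"], ocr_data["conf"]):
        try:
            score = float(conf)  # conf may arrive as int, float, or str
        except (TypeError, ValueError):
            continue
        # Rows that carry no recognized word are reported with conf -1
        if score >= min_conf and text.strip():
            words.append(text.strip())
    return words


# Hand-written sample standing in for real OCR output (an assumption)
sample = {
    "text": ["", "Milk", "Eg9s", "Bread"],
    "conf": ["-1", "96", "41", 88.5],
}
print(confident_words(sample))  # → ['Milk', 'Bread']
```

Words that pass this filter could then be handed to `fuzzy_match` in place of raw OCR lines.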
How to Run:
1. **Install Dependencies:**
```bash
pip install opencv-python pillow pytesseract requests
```
2. **Install Tesseract OCR:** Follow the instructions in the code comments for your operating system. *This is essential*.
3. **Set Tesseract Path (if needed):** If Tesseract is not in your system's PATH environment variable, uncomment the `set_tesseract_path` line in the `if __name__ == "__main__":` block and set it to the correct path. This is the most common reason why the code might fail.
4. **Replace `image_path`:** Change the value of the `image_path` variable to the path of your pantry image (either a local file path or a URL).
5. **Run the Script:**
```bash
python your_script_name.py
```
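Because a missing Tesseract executable is the failure that step 3 warns about, it may help to check for it before running the script. The sketch below uses only the standard library; `find_tesseract` is a hypothetical helper, and the Windows path is the example path from the code comments rather than a guaranteed install location.

```python
import shutil


def find_tesseract(candidates=("tesseract",)):
    """Return the first usable Tesseract executable, or None.

    Each candidate is either a command name looked up on PATH or a
    full path to the executable.
    """
    for candidate in candidates:
        resolved = shutil.which(candidate)
        if resolved:
            return resolved
    return None


path = find_tesseract(("tesseract", r"C:\Program Files\Tesseract-OCR\tesseract.exe"))
if path:
    print(f"Tesseract found at: {path}")
else:
    print("Tesseract not found; install it or pass its path to set_tesseract_path().")
```

If a path is found outside the system PATH, that value is what you would pass to `set_tesseract_path`.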
This version handles common failure cases (missing files, HTTP errors, OCR errors) and provides a foundation for a pantry scanning and shopping list generation application. Adjust `SIMILARITY_THRESHOLD` and `MASTER_GROCERY_LIST` to fine-tune the results for your specific pantry images and needs; `CONFIDENCE_THRESHOLD` has no effect unless you add confidence filtering yourself. The debugging output shows which OCR lines matched and which were rejected, which makes this tuning easier.
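To get a feel for how `SIMILARITY_THRESHOLD` behaves before tuning it, you can print the raw similarity scores for a few sample strings. The sketch below uses the same `difflib.SequenceMatcher` ratio as `fuzzy_match`; the helper names (`similarity`, `best_match`, `best_word_match`) and the sample strings are illustrative assumptions, and `best_match` expects a non-empty master list. It also shows one possible variation, scoring each word of a line separately, which is not what the original script does.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(text, master_list):
    """Return (item, score) for the master-list item most similar to text."""
    scored = ((item, similarity(text, item)) for item in master_list)
    return max(scored, key=lambda pair: pair[1])


def best_word_match(line, master_list):
    """Like best_match, but scores each whitespace-separated word of line."""
    candidates = [best_match(word, master_list) for word in line.split()]
    if not candidates:
        return (None, 0.0)
    return max(candidates, key=lambda pair: pair[1])


master = ["milk", "eggs", "bread"]
for sample in ["milk", "mi1k", "Organic Whole Milk 1L"]:
    item, score = best_match(sample, master)
    word_item, word_score = best_word_match(sample, master)
    print(f"{sample!r}: line -> {item} ({score:.2f}), word -> {word_item} ({word_score:.2f})")
```

Two things stand out in the scores. A single misread character in a four-letter word ("mi1k") scores 0.75, which falls below the default threshold of 0.8. A full label line such as "Organic Whole Milk 1L" scores only 0.32 against "milk" when compared as a whole, while the word-level variant finds an exact match. Both suggest lowering the threshold or matching per word, at the cost of more false positives.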