Data Validation with Pydantic

Data validation is the process of ensuring that data conforms to specific rules and constraints before it is used or stored. This is crucial for maintaining data integrity, preventing errors, improving security, and ensuring the reliability of applications. Without proper validation, applications can crash, business logic can become flawed, and security vulnerabilities can emerge from malformed or malicious input.

Pydantic is a powerful Python library that facilitates data validation and settings management using Python type hints. It allows developers to define data schemas as Python classes, inheriting from `pydantic.BaseModel`, and automatically validates incoming data against these schemas. Pydantic goes beyond simple type checking; it also performs data parsing, converting input values to the specified types where possible (e.g., converting a string "123" to an integer 123 for an `int` field).
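
The parsing behavior described above can be sketched as follows. The `Item` model and its `quantity` field are hypothetical names chosen for illustration, and the sketch assumes Pydantic's default (non-strict) validation mode.

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):  # hypothetical model, for illustration only
    quantity: int


# A numeric string is parsed into the declared type.
item = Item(quantity="123")
print(type(item.quantity).__name__, item.quantity)

# A string that cannot be parsed as an integer is rejected.
try:
    Item(quantity="abc")
except ValidationError as exc:
    print(f"{len(exc.errors())} validation error(s)")
```

The stored value is a real `int`, so downstream code does not need to convert it again.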

Key features and benefits of using Pydantic:

1. Leverages Type Hints: Pydantic builds upon standard Python type hints (`str`, `int`, `List`, `Optional`, `Union`, etc.), making your data schemas self-documenting and easy to understand. This also provides excellent IDE support for auto-completion and static analysis.
2. Automatic Validation & Parsing: When you create an instance of a Pydantic model with input data, it automatically validates all fields against their defined types and constraints. If the data is invalid, it raises a `ValidationError` with detailed information about what went wrong. It also attempts to parse data into the correct types.
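3. Default Values and Optional Fields: You can define default values for fields, and allow `None` as a value using `Optional` from the `typing` module. In Pydantic v2, a field annotated `Optional[...]` is still required unless it also has a default such as `None`.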
4. Field Constraints: Pydantic allows defining rich constraints on fields, such as minimum/maximum values (`gt`, `lt`, `ge`, `le`), string lengths (`min_length`, `max_length`), and regular expression patterns using `pydantic.Field`.
5. Custom Validators: For more complex validation logic, Pydantic supports custom validator functions that check an individual field (`@validator` in Pydantic v1, `@field_validator` in v2) or several fields together (`@root_validator` in v1, `@model_validator` in v2).
6. Nested Models: Pydantic models can be nested within each other, enabling the definition of complex, hierarchical data structures.
7. Serialization and Deserialization: Models can easily be converted to Python dictionaries (`.model_dump()` in Pydantic v2, `.dict()` in v1) and JSON strings (`.model_dump_json()` in v2, `.json()` in v1), and built back from such data, making them ideal for APIs (e.g., with FastAPI) and data storage.
8. Reduced Boilerplate: By handling much of the validation logic automatically, Pydantic significantly reduces the amount of boilerplate code needed for data validation, leading to cleaner, more maintainable code.
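
Items 5 and 7 above can be sketched together, assuming Pydantic v2. The `Booking` model and its fields are hypothetical; the example pairs a cross-field check written with `@model_validator(mode="after")` with a JSON round trip through `model_dump_json()` and `model_validate_json()`.

```python
from datetime import date

from pydantic import BaseModel, ValidationError, model_validator


class Booking(BaseModel):  # hypothetical model, not part of the examples below
    guest: str
    check_in: date
    check_out: date

    @model_validator(mode="after")
    def check_dates(self):
        # Runs after per-field validation, so both values are already date objects.
        if self.check_out <= self.check_in:
            raise ValueError("check_out must be after check_in")
        return self


booking = Booking(guest="Alice", check_in="2024-05-01", check_out="2024-05-03")

# Round trip: model -> JSON string -> model.
payload = booking.model_dump_json()
restored = Booking.model_validate_json(payload)
print(restored == booking)

# Dates in the wrong order fail the cross-field check.
try:
    Booking(guest="Bob", check_in="2024-05-03", check_out="2024-05-01")
except ValidationError as exc:
    print(exc.errors()[0]["msg"])
```

An "after" validator is used here because it receives the fully validated model, which keeps the comparison free of manual parsing.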

In essence, Pydantic simplifies the process of defining, validating, and managing data structures in Python applications, making them more robust and less prone to data-related errors.

Example Code

from typing import List, Optional
from pydantic import BaseModel, ValidationError, Field, field_validator

# 1. Basic Pydantic model
class User(BaseModel):
    id: int
    name: str = "Anonymous"  # Field with a default value
    email: Optional[str] = None  # Optional field (can be None)
    is_active: bool = True

# 2. Models with nested structures, lists, and field constraints
class Address(BaseModel):
    street: str
    city: str
    zip_code: str = Field(pattern=r'^\d{5}(-\d{4})?$')  # ZIP or ZIP+4, checked by regex

class Product(BaseModel):
    name: str
    price: float = Field(gt=0)  # Price must be greater than 0
    description: Optional[str] = None

class Order(BaseModel):
    order_id: int
    customer_id: int
    products: List[Product]  # List of nested Product models
    shipping_address: Address  # Nested Address model
    status: str = "pending"

    # Custom validator for the 'status' field (Pydantic v2 API; v1 used @validator)
    @field_validator('status')
    @classmethod
    def validate_status(cls, v):
        if v not in ['pending', 'shipped', 'delivered', 'cancelled']:
            raise ValueError(f'Invalid order status: {v}. Must be one of pending, shipped, delivered, cancelled.')
        return v

# --- Demonstrating Usage and Validation ---

print("\n--- Valid Data Examples ---")
# 1. Valid data for User
try:
    user1 = User(id=1, name="Alice", email="alice@example.com")
    print(f"Valid User 1: {user1.model_dump()}")  # .model_dump() is recommended over .dict() for Pydantic v2+

    user2 = User(id=2)  # Uses default name and is_active, email is None
    print(f"Valid User 2 (with defaults): {user2.model_dump()}")

except ValidationError as e:
    print(f"User Validation Error: {e}")

print("\n" + "-" * 30 + "\n")

# 2. Valid data for Order
try:
    order_data = {
        "order_id": 101,
        "customer_id": 5001,
        "products": [
            {"name": "Laptop", "price": 1200.50},
            {"name": "Mouse", "price": 25.00, "description": "Wireless"}
        ],
        "shipping_address": {
            "street": "123 Main St",
            "city": "Anytown",
            "zip_code": "12345"
        },
        "status": "pending"
    }
    order = Order(**order_data)  # Unpack the dict into keyword arguments
    print(f"Valid Order: {order.model_dump_json(indent=2)}")  # .model_dump_json() returns a JSON string
    print(f"Order Status after validation: {order.status}")

except ValidationError as e:
    print(f"Order Validation Error: {e}")

print("\n--- Invalid Data Examples ---")

# 3. Invalid data for User (missing required 'id')
try:
    print("\nAttempting to create User with missing 'id':")
    User(name="Bob") 
except ValidationError as e:
    print(f"User Validation Error (missing id): {e.errors()}")

print("\n" + "-" * 30 + "\n")

# 4. Invalid data for Order (invalid price, invalid status, missing street, invalid zip code format)
try:
    print("Attempting to create Order with multiple invalid fields:")
    invalid_order_data = {
        "order_id": 102,
        "customer_id": 5002,
        "products": [
            {"name": "Keyboard", "price": -50.00}  # Invalid price (violates gt=0)
        ],
        "shipping_address": {
            "city": "Otherville",  # Missing 'street' field
            "zip_code": "67890-123"  # Invalid zip format: the suffix after the hyphen needs four digits
        },
        "status": "unknown_status"  # Invalid status (custom validator)
    }
    Order(**invalid_order_data)
except ValidationError as e:
    print(f"Invalid Order Data Errors: {e.errors()}")

print("\n" + "-" * 30 + "\n")

try:
    print("Attempting to create Order with invalid zip code format:")
    invalid_zip_order_data = {
        "order_id": 103,
        "customer_id": 5003,
        "products": [
            {"name": "Monitor", "price": 300.00}
        ],
        "shipping_address": {
            "street": "456 Oak Ave",
            "city": "Zipville",
            "zip_code": "ABCDE"  # Invalid zip code format (regex pattern)
        },
        "status": "delivered"
    }
    Order(**invalid_zip_order_data)
except ValidationError as e:
    print(f"Invalid Zip Code Format Error: {e.errors()}")
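
The error lists printed above are plain Python lists of dictionaries, so they can be reshaped for logs or API responses. The following is a minimal sketch of one way to do that; the `Point` model and the `summarize_errors` helper are hypothetical names introduced only for this illustration, and the sketch relies on each entry carrying `loc` and `msg` keys.

```python
from pydantic import BaseModel, ValidationError


class Point(BaseModel):  # hypothetical model, for illustration only
    x: int
    y: int


def summarize_errors(exc: ValidationError) -> list[str]:
    """Turn each error entry into a 'field.path: message' string."""
    lines = []
    for err in exc.errors():
        # 'loc' is a tuple of keys/indices leading to the offending value.
        path = ".".join(str(part) for part in err["loc"])
        lines.append(f"{path}: {err['msg']}")
    return lines


summary = []
try:
    Point(x="not a number")  # 'x' has the wrong type, 'y' is missing
except ValidationError as exc:
    summary = summarize_errors(exc)
    for line in summary:
        print(line)
```

For nested models such as `Order`, the same join produces dotted paths that include list indices, which makes it easier to point a caller at the exact field that failed.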