GDPR Compliance Checker Python

👤 Sharing: AI
```python
import re
import datetime

def check_email_format(email):
    """
    Checks if an email address has a valid format.

    Args:
        email (str): The email address to check.

    Returns:
        bool: True if the email is valid, False otherwise.
    """
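    # fullmatch requires the whole string to match; re.match with "$" would
    # also accept an address followed by a trailing newline.
    pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
    return bool(re.fullmatch(pattern, email))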

def check_age(date_of_birth_str):
    """
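    Checks if a person is at least 16 years old based on their date of birth.
    Under Article 8 GDPR, consent-based processing of a child's data by online services
    needs parental authorisation below age 16; member states may lower this threshold to as low as 13.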

    Args:
        date_of_birth_str (str): The date of birth as a string in YYYY-MM-DD format.

    Returns:
        bool: True if the person is 16 or older, False otherwise.  Returns None if the date format is invalid.
    """
    try:
        date_of_birth = datetime.datetime.strptime(date_of_birth_str, "%Y-%m-%d").date()
        today = datetime.date.today()
        age = today.year - date_of_birth.year - ((today.month, today.day) < (date_of_birth.month, date_of_birth.day))
        return age >= 16
    except ValueError:
        print("Invalid date format. Please use YYYY-MM-DD.")
        return None


def check_consent_format(consent_text):
    """
    Performs a basic check to see if the consent text appears to meet the requirements of GDPR.
    This is a simplified example, and a real-world implementation would be much more sophisticated.
    It checks for keywords and basic structure.

    Args:
        consent_text (str): The text of the consent.

    Returns:
        bool: True if the text contains the expected keywords, False otherwise.
              A True result is only a keyword match, not evidence of compliance.
    """
    consent_text = consent_text.lower()

    # Basic keywords to look for
    keywords = ["consent", "purpose", "data", "privacy", "rights", "withdrawal", "policy", "control"]

    # Check if the consent text mentions the purpose of data collection
    has_purpose = any(keyword in consent_text for keyword in ["purpose", "why", "collecting"])

    # Check if the consent text mentions the types of data collected
    has_data_types = any(keyword in consent_text for keyword in ["data", "information", "personal", "details"])

    # Check if the consent text mentions the rights of the data subject
    has_rights = any(keyword in consent_text for keyword in ["rights", "access", "delete", "modify", "portability"])

    # Check if consent withdrawal is mentioned.
    has_withdrawal = any(keyword in consent_text for keyword in ["withdrawal", "revoke", "cancel", "withdraw"])


    # Check if the consent text mentions the data controller/organization
    has_controller = any(keyword in consent_text for keyword in ["controller", "company", "organization"])


    # Check that at least half of the general keywords are present
    num_keywords_present = sum(1 for keyword in keywords if keyword in consent_text)
    keyword_threshold = len(keywords) // 2  # At least half the keywords should be present.

    # Return True only if all required elements are present *and* sufficient keywords are mentioned
    return has_purpose and has_data_types and has_rights and has_withdrawal and has_controller and (num_keywords_present >= keyword_threshold)


def check_data_retention_policy(retention_period_days, data_type):
    """
    Simulates checking a data retention policy against GDPR principles of data minimization and storage limitation.
    This is a simplified example. Actual retention policies are complex and vary greatly by data type and jurisdiction.

    Args:
        retention_period_days (int): The number of days the data is retained.
        data_type (str): The type of data being retained (e.g., "email address", "purchase history").

    Returns:
        str: A message indicating whether the retention period is likely compliant or non-compliant.
    """

    # Very simple, example-based retention rules.  In reality, this would pull from a database/config.
    reasonable_retention = {
        "email address": 365, # 1 year
        "purchase history": 1825,  # 5 years
        "website cookies": 90 # 3 months
    }


    if data_type in reasonable_retention:
        if retention_period_days <= reasonable_retention[data_type]:
            return f"Data retention for {data_type} within reasonable limits ({retention_period_days} days)."
        else:
            return f"WARNING: Data retention for {data_type} exceeds reasonable limits ({retention_period_days} days). Review your policy."
    else:
        return f"Data type '{data_type}' not recognized.  Please define a retention policy for this data type."



# --- Main Program ---
if __name__ == "__main__":
    print("GDPR Compliance Checker")

    # Example 1: Email format check
    email = input("Enter an email address: ")
    if check_email_format(email):
        print("Email address format is valid.")
    else:
        print("Email address format is invalid.")

    # Example 2: Age check
    date_of_birth = input("Enter your date of birth (YYYY-MM-DD): ")
    age_check_result = check_age(date_of_birth)
    if age_check_result is True:
        print("You are 16 years or older.")
    elif age_check_result is False:
        print("You are under 16. Parental consent may be required.")
    else:
        #The function already prints the error message.
        pass


    # Example 3: Consent check
    print("\nEnter the consent text:")
    consent_text = input()
    if check_consent_format(consent_text):
        print("Consent text appears to meet basic GDPR requirements (keyword check).")
    else:
        print("WARNING: Consent text may not meet GDPR requirements. Review carefully.")


    # Example 4: Data retention policy check
    data_type = input("\nEnter the data type (e.g., email address, purchase history): ")
    try:
        retention_period_days = int(input("Enter the data retention period in days: "))
        retention_message = check_data_retention_policy(retention_period_days, data_type)
        print(retention_message)
    except ValueError:
        print("Invalid retention period. Please enter a number.")
```
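The consent check in the listing relies on substring tests such as `keyword in consent_text`, which also match inside longer words (for example, "access" inside "inaccessible"). A minimal sketch of whole-word matching follows. The helper name `contains_any_word` is hypothetical and not part of the listing, and this is only one possible approach.

```python
import re

def contains_any_word(text, words):
    """Return True if any entry of `words` occurs in `text` as a whole word."""
    if not words:
        return False
    # \b anchors each alternative at word boundaries; re.escape keeps
    # keywords containing regex metacharacters literal.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# A plain substring test would report a hit here; whole-word matching does not.
print(contains_any_word("The archive is inaccessible.", ["access", "delete"]))         # False
print(contains_any_word("You may access or delete your data.", ["access", "delete"]))  # True
```

The trade-off is that inflected forms no longer match implicitly: with whole-word matching, "withdraw" does not match "withdrawal", so each variant has to be listed explicitly.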

Key points and explanations:

* **Clear Function Definitions:** Each check (email, age, consent, retention) is encapsulated in its own function for better organization and reusability.
* **Email Format Validation:** Uses a regular expression to check the basic shape of the address (`local@domain.tld`). It does not verify that the address exists.
* **Age Verification:** `check_age` parses the date with `datetime` and computes the age in full years, accounting for whether this year's birthday has already passed. On an invalid date format it prints an error message and returns `None`.
* **Consent Format Inspection:** `check_consent_format` does a very basic check to see if certain keywords are present in the consent text.  This is *not* a replacement for legal review, but it provides a basic example of how to programmatically analyze consent.
    * Includes checks for "purpose", "data", "rights", "withdrawal", and "controller" to cover essential elements.
    * Calculates the number of keywords present and requires a minimum threshold to be met.
* **Data Retention Policy Simulation:** `check_data_retention_policy` simulates a data retention policy check, comparing the provided retention period against a reasonable limit for the data type.
    * Uses a dictionary `reasonable_retention` to store example retention periods.  This would normally be loaded from a config file or database.
    * Handles the case where the data type is not recognized.
* **Error Handling:** Includes `try...except` blocks to handle potential errors such as invalid date formats or invalid retention period inputs.
* **GDPR Caveats:**  Includes comments emphasizing that these checks are simplified examples and should not be used as a substitute for legal advice or a comprehensive GDPR compliance program.  The code emphasizes that the consent checking is superficial.
* **Modularity:**  The code is structured in a modular way, making it easier to add or modify the checks as needed.
* **Comments:** Docstrings and inline comments explain the purpose of each function and the logic behind each check.
* **User-Friendly Input:**  Uses `input()` to get data from the user, making the program interactive.
* **Clear Output:** The output provides informative messages about the results of each check.
* **`if __name__ == "__main__":` block:**  The main program logic is enclosed in this block, which ensures that the code is only executed when the script is run directly (not when it is imported as a module).
* **`bool()` casting:** The regex call returns a match object or `None`; wrapping it in `bool()` makes `check_email_format` return a plain `True` or `False`, as its docstring states.
* **Uses `datetime.date` objects:**  Working with dates requires `date` objects for proper comparisons, rather than just strings.
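The retention limits are hard-coded in the listing, although the notes above say they would normally come from a config file or database. One way this could look with a JSON config is sketched below. The names `EXAMPLE_CONFIG`, `load_retention_limits`, and `is_within_limit` are hypothetical, and the limits are the same illustrative values as in the listing, not legal guidance.

```python
import json

# Hypothetical config contents; in practice this text would be read from a file.
EXAMPLE_CONFIG = '{"email address": 365, "purchase history": 1825, "website cookies": 90}'

def load_retention_limits(config_text):
    """Parse a JSON object mapping data types to a maximum retention in days."""
    limits = json.loads(config_text)
    if not isinstance(limits, dict):
        raise ValueError("Retention config must be a JSON object.")
    for data_type, days in limits.items():
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(days, bool) or not isinstance(days, int) or days <= 0:
            raise ValueError(f"Invalid retention limit for {data_type!r}: {days!r}")
    return limits

def is_within_limit(limits, data_type, retention_period_days):
    """Return True or False for known data types, None for unknown ones."""
    if data_type not in limits:
        return None
    return retention_period_days <= limits[data_type]

limits = load_retention_limits(EXAMPLE_CONFIG)
print(is_within_limit(limits, "website cookies", 120))  # False
print(is_within_limit(limits, "email address", 365))    # True
print(is_within_limit(limits, "ip address", 30))        # None
```

Returning `True`, `False`, or `None` instead of a formatted message leaves the wording of warnings to the caller and makes the result easier to test.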

How to run the code:

1.  **Save:** Save the code as a `.py` file (e.g., `gdpr_checker.py`).
2.  **Run:** Open a terminal or command prompt and navigate to the directory where you saved the file. Then, run the script using `python gdpr_checker.py`.
3.  **Follow Prompts:** The program will prompt you to enter an email address, date of birth, consent text, data type, and retention period.  Provide the requested information.
4.  **Review Results:** The program will display the results of each GDPR check.
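The prompts make the script awkward to test automatically, and `check_age` depends on the current date. A possible variant that can be exercised without prompts is sketched below. It assumes a hypothetical function `meets_consent_age` that takes the reference date and the minimum age as parameters; the `min_age` parameter reflects that Article 8 GDPR lets member states set the threshold between 13 and 16. Treating a future date of birth as invalid is an added assumption, not behaviour of the original function.

```python
import datetime

def meets_consent_age(date_of_birth_str, reference_date, min_age=16):
    """Return True/False for a valid YYYY-MM-DD date, None if it is invalid."""
    try:
        dob = datetime.datetime.strptime(date_of_birth_str, "%Y-%m-%d").date()
    except ValueError:
        return None
    if dob > reference_date:
        return None  # Assumption: a date of birth in the future is treated as invalid.
    # Subtract one year if this year's birthday has not yet occurred.
    before_birthday = (reference_date.month, reference_date.day) < (dob.month, dob.day)
    age = reference_date.year - dob.year - before_birthday
    return age >= min_age

# A fixed reference date keeps the results reproducible.
ref = datetime.date(2024, 6, 15)
print(meets_consent_age("2008-06-15", ref))              # True  (turns 16 on the reference date)
print(meets_consent_age("2008-06-16", ref))              # False (still 15)
print(meets_consent_age("2010-01-01", ref, min_age=13))  # True  (14 years old)
print(meets_consent_age("15/06/2008", ref))              # None  (wrong format)
```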

Remember that this is a simplified example and does not cover all aspects of GDPR compliance. Always consult with legal professionals and data privacy experts for comprehensive guidance.