Build multi-stop itineraries from scraped flight deals and user preferences (Python)

```python
import requests
from bs4 import BeautifulSoup
import re  # For regular expressions (e.g., cleaning up prices)

def scrape_flight_deals(origin, destination, date):
    """
    Scrapes flight deals from a hypothetical airline website.  (Replace with actual URL)

    Args:
        origin (str):  Departure airport code (e.g., "JFK").
        destination (str): Arrival airport code (e.g., "LAX").
        date (str): Departure date (e.g., "2024-01-15").

    Returns:
        list: A list of dictionaries, where each dictionary represents a flight deal.
              Returns an empty list if no deals are found or if an error occurs.
    """
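    url = "https://example.com/flights"  # Replace with actual website
    params = {"origin": origin, "destination": destination, "date": date}
    try:
        # Let requests URL-encode the query string; the timeout keeps a stalled server from hanging the script.
        response = requests.get(url, params=params, timeout=10)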
        response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)

        soup = BeautifulSoup(response.content, 'html.parser')

        #  *** REPLACE THESE SELECTORS WITH THE CORRECT SELECTORS FOR THE WEBSITE YOU ARE SCRAPING ***
        #  This is the most crucial part - you'll need to inspect the website's HTML
        #  to find the correct CSS selectors or XPath expressions to extract the data.
        flight_elements = soup.find_all('div', class_='flight-card')  # Example selector - adjust as needed

        flight_deals = []
        for flight in flight_elements:
            try:
                #  Extract data from each flight card. Again, customize these selectors.
                airline = flight.find('span', class_='airline-name').text.strip()  # Example
                price_string = flight.find('span', class_='flight-price').text.strip() # Example
                # Clean up the price string: Remove currency symbols, commas, etc.
                price = float(re.sub(r'[^\d\.]', '', price_string))  # Keep only digits and decimal point

                departure_time = flight.find('span', class_='departure-time').text.strip() # Example
                arrival_time = flight.find('span', class_='arrival-time').text.strip() # Example

                flight_deals.append({
                    'airline': airline,
                    'price': price,
                    'departure_time': departure_time,
                    'arrival_time': arrival_time,
                    'origin': origin,
                    'destination': destination,
                    'date': date
                })
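            except (AttributeError, ValueError) as e:
                # AttributeError: a selector matched nothing; ValueError: the price text had no parsable number.
                print(f"Error extracting data from a flight card: {e}")  # Debugging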

        return flight_deals

    except requests.exceptions.RequestException as e:
        print(f"Error during request: {e}")
        return []
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return []


def get_user_preferences():
    """
    Gets user preferences for travel.

    Returns:
        dict: A dictionary containing user preferences.
    """
    preferences = {}
    preferences['budget'] = float(input("Enter your budget: "))
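    # Strip whitespace and drop empty entries so "Delta, United" yields "Delta" and "United".
    preferences['preferred_airlines'] = [a.strip() for a in input("Enter preferred airlines (comma-separated): ").split(',') if a.strip()]
    preferences['max_stops'] = int(input("Enter maximum number of stops: "))
    # Lowercase interests so "Beach" matches the 'beach' check in find_potential_destinations.
    preferences['interests'] = [i.strip().lower() for i in input("Enter interests (comma-separated): ").split(',') if i.strip()]  # e.g., "beach,hiking,museums"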
    preferences['start_date'] = input("Enter start date (YYYY-MM-DD): ")
    preferences['end_date'] = input("Enter end date (YYYY-MM-DD): ")
    return preferences


def find_potential_destinations(interests):
    """
    A very basic example of suggesting destinations based on interests.
    In a real-world application, this would use a database or API.

    Args:
        interests (list): A list of interests.

    Returns:
        list: A list of potential destinations.
    """
    destinations = []
    if 'beach' in interests:
        destinations.append("HNL")  # Honolulu
        destinations.append("MIA")  # Miami
    if 'hiking' in interests:
        destinations.append("DEN")  # Denver
        destinations.append("SEA")  # Seattle
    if 'museums' in interests:
        destinations.append("NYC")  # New York City (IATA city code, not a single airport)
        destinations.append("PAR")  # Paris (IATA city code; example of adding international)

    return destinations

def build_itinerary(origin, potential_destinations, start_date, end_date, budget, max_stops, preferred_airlines):
    """
    Builds a multi-stop itinerary based on flight deals and user preferences.

    Args:
        origin (str): Origin airport code.
        potential_destinations (list): A list of potential destination airport codes.
        start_date (str): Start date for the trip (YYYY-MM-DD).
        end_date (str): End date for the trip (YYYY-MM-DD).
        budget (float): The budget for the entire trip.
        max_stops (int): The maximum number of stops allowed.
        preferred_airlines (list): A list of preferred airlines.

    Returns:
        list: A list of itinerary segments (dictionaries), or an empty list if no suitable itinerary is found.
    """

    itinerary = []
    total_cost = 0
    current_location = origin
    current_date = start_date

    for i, destination in enumerate(potential_destinations):
        # Example:  Travel to each destination sequentially.  A more sophisticated
        #  algorithm might explore different permutations of destinations.

        deals = scrape_flight_deals(current_location, destination, current_date)

        # Filter deals based on preferences:
        affordable_deals = [deal for deal in deals if deal['price'] + total_cost <= budget]
        preferred_airline_deals = [deal for deal in affordable_deals if deal['airline'] in preferred_airlines]
        if preferred_airline_deals:
            best_deal = min(preferred_airline_deals, key=lambda x: x['price']) # Cheapest of the preferred
        elif affordable_deals:
            best_deal = min(affordable_deals, key=lambda x: x['price'])  # Cheapest overall
        else:
            print(f"No affordable flights from {current_location} to {destination} found. Skipping.")
            continue  # Skip to the next destination

        itinerary.append(best_deal)
        total_cost += best_deal['price']
        current_location = destination

        # Set the departure date for the next flight to a few days after the arrival at the current destination
        # (This is very basic - you'd likely want more sophisticated logic to determine stay duration).
        from datetime import datetime, timedelta
        arrival_date = datetime.strptime(best_deal['date'], '%Y-%m-%d').date()
        next_departure_date = arrival_date + timedelta(days=3) # Stay 3 days
        current_date = next_departure_date.strftime('%Y-%m-%d')


    # Look for a return flight to the origin within the remaining budget.
    # Skipped when no outbound leg was booked. end_date is not enforced here:
    # the search uses current_date (3 days after the last leg).
    if itinerary:
        return_deals = scrape_flight_deals(current_location, origin, current_date)
        affordable_return_deals = [deal for deal in return_deals if deal['price'] + total_cost <= budget]
        if affordable_return_deals:
            best_return_deal = min(affordable_return_deals, key=lambda x: x['price'])
            itinerary.append(best_return_deal)
            total_cost += best_return_deal['price']
        else:
            print("Couldn't find an affordable return flight.")


    if itinerary:
        print("Itinerary Found:")
        print(f"Total Cost: ${total_cost:.2f}")
        return itinerary
    else:
        print("No suitable itinerary found.")
        return []


# Main execution block
if __name__ == "__main__":
    # Get user preferences
    user_prefs = get_user_preferences()
    origin_airport = input("Enter your origin airport code (e.g., JFK): ")  # Get origin

    # Find potential destinations based on interests
    potential_destinations = find_potential_destinations(user_prefs['interests'])

    # Build the itinerary
    itinerary = build_itinerary(
        origin_airport,
        potential_destinations,
        user_prefs['start_date'],
        user_prefs['end_date'],
        user_prefs['budget'],
        user_prefs['max_stops'],
        user_prefs['preferred_airlines']
    )

    if itinerary:
        for leg in itinerary:
            print(f"  {leg['origin']} -> {leg['destination']} on {leg['date']} ({leg['airline']}): ${leg['price']:.2f}")
```
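
The loop in `build_itinerary` visits destinations in the order they are given, and the code comment notes that a more sophisticated algorithm might explore different permutations. The following is a minimal sketch of that idea, not part of the script above. It assumes a hypothetical `fare(origin, destination)` callable that returns the cheapest price for a leg or `None` when no flight exists (in the script, this could wrap `scrape_flight_deals`). The function name `cheapest_route` is also hypothetical, and dates are ignored to keep the example short.

```python
from itertools import permutations
from typing import Callable, List, Optional, Tuple

# Hypothetical fare lookup: (origin, destination) -> cheapest price, or None if no flight.
FareLookup = Callable[[str, str], Optional[float]]


def cheapest_route(origin: str, destinations: List[str], budget: float,
                   max_stops: int, fare: FareLookup) -> Tuple[List[str], float]:
    """Return (route, cost) for the affordable round trip that visits the most stops.

    Among routes with the same number of stops, the cheapest one wins.
    Returns ([], 0.0) when no round trip fits the budget.
    """
    limit = min(max_stops, len(destinations))
    for n in range(limit, 0, -1):  # prefer visiting more stops
        best_route: List[str] = []
        best_cost = 0.0
        for order in permutations(destinations, n):
            route = [origin, *order, origin]
            cost = 0.0
            for a, b in zip(route, route[1:]):
                price = fare(a, b)
                if price is None:
                    break  # missing leg: this ordering is not flyable
                cost += price
                if cost > budget:
                    break  # over budget: abandon this ordering
            else:
                # Every leg exists and the total fits the budget.
                if not best_route or cost < best_cost:
                    best_route, best_cost = route, cost
        if best_route:
            return best_route, best_cost
    return [], 0.0


# Usage with a small in-memory fare table (made-up prices).
fares = {
    ("JFK", "MIA"): 100.0, ("MIA", "DEN"): 150.0, ("DEN", "JFK"): 120.0,
    ("JFK", "DEN"): 200.0, ("DEN", "MIA"): 90.0, ("MIA", "JFK"): 110.0,
}
route, cost = cheapest_route("JFK", ["MIA", "DEN"], budget=500.0, max_stops=2,
                             fare=lambda a, b: fares.get((a, b)))
print(route, cost)
```

The number of orderings grows factorially with the number of destinations, so this brute-force search is only reasonable for a handful of stops.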

Key points and explanations:

* **Error Handling:**  `try...except` blocks handle network failures (`requests.exceptions.RequestException`), unexpected HTML structure (`AttributeError` when a selector finds no element), and any other exception, printing a message instead of crashing.  On a request failure, `scrape_flight_deals` returns an empty list so the rest of the program can continue.
* **Realistic Scraping:**
    * **Placeholders:** I've used placeholder values (`"https://example.com/flights..."`,  `flight-card`, `airline-name`, etc.).  **YOU MUST REPLACE THESE WITH THE ACTUAL VALUES FROM THE WEBSITE YOU ARE SCRAPING.** Inspect the website's HTML structure to find the correct CSS selectors or XPath expressions.
    * **`response.raise_for_status()`:** Checks the HTTP status code of the response.  If it's an error (4xx or 5xx), it raises an exception, preventing the program from trying to parse invalid HTML.
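    * **Price Cleaning:** The `re.sub()` call in `scrape_flight_deals` strips every character that isn't a digit or a decimal point, so `"$1,234.56"` becomes `1234.56` before conversion to a float.  It assumes a dot as the decimal separator: a comma-decimal price such as `"1.234,56"` would be misread, and a string with no digits makes `float()` raise `ValueError`.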
* **User Preferences:**  The `get_user_preferences()` function collects the budget, preferred airlines, max stops, interests, and start/end dates.  `build_itinerary` applies the budget, preferred airlines, and start date; `max_stops` and `end_date` are passed in but not yet enforced.
* **Destination Suggestions:**  The `find_potential_destinations()` function provides a rudimentary way to suggest destinations based on user interests.  This would be replaced by a more sophisticated system in a real application (e.g., using a database of destinations and their attractions).
* **Itinerary Building Logic:**
    * **Filtering by Preferences:** `build_itinerary()` keeps only the deals that fit the remaining budget, then prefers the user's preferred airlines, falling back to any airline when none of them has an affordable flight.
    * **Cheapest Deal:**  Selects the cheapest flight deal that meets the criteria.
    * **Basic Multi-Stop Logic:** The code iterates through the potential destinations in order and tries to find a flight to each one, skipping a destination when nothing affordable is found.
    * **Date Handling:** Uses `datetime` and `timedelta` to schedule each next flight a fixed 3 days after the previous one.  This needs to be significantly improved in a production system.
* **Return Flight:** The program attempts to find a return flight from the last destination back to the origin.
* **Readable Output:** Prints the origin, destination, date, airline, and price for each leg, along with the total cost of the itinerary.
* **Modularity:**  The code is divided into functions, making it more organized and easier to maintain.
* **Comments:**  Includes detailed comments explaining each part of the code.
* **Main Execution Block:** The `if __name__ == "__main__":` block ensures that the main code only runs when the script is executed directly (not when it's imported as a module).
* **Type Hinting (Optional):**  For even better code clarity, you could add type hints:

   ```python
   from typing import List, Dict

   def scrape_flight_deals(origin: str, destination: str, date: str) -> List[Dict]:
       ...
   def get_user_preferences() -> Dict:
       ...
   def find_potential_destinations(interests: List[str]) -> List[str]:
       ...
   def build_itinerary(origin: str, potential_destinations: List[str], start_date: str, end_date: str, budget: float, max_stops: int, preferred_airlines: List[str]) -> List[Dict]:
       ...
   ```
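
The price-cleaning step described above keeps only digits and dots, which works for prices like `$1,234.56` but not for comma-decimal formats. As a sketch of a more defensive approach, the hypothetical helper `parse_price` below (not part of the script) guesses the decimal separator and returns `None` instead of raising when no number is found. The separator heuristic is an assumption and will not suit every site; a lone `1.234` is read as a decimal number here.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_price(text: str) -> Optional[Decimal]:
    """Extract a price from scraped text, or return None if no number is found.

    Heuristic (an assumption, not a standard): when both '.' and ',' appear,
    whichever comes last is the decimal separator; a lone ',' is a decimal
    separator only when exactly two digits follow it.
    """
    match = re.search(r"\d[\d.,]*", text)
    if not match:
        return None
    number = match.group().rstrip(".,")  # drop trailing punctuation such as "450."
    last_dot, last_comma = number.rfind("."), number.rfind(",")
    if last_dot != -1 and last_comma != -1:
        if last_comma > last_dot:
            # European style: 1.234,56
            number = number.replace(".", "").replace(",", ".")
        else:
            # US style: 1,234.56
            number = number.replace(",", "")
    elif last_comma != -1:
        head, _, tail = number.rpartition(",")
        if len(tail) == 2:
            number = head.replace(",", "") + "." + tail  # 12,50 -> 12.50
        else:
            number = number.replace(",", "")  # 2,500 -> 2500
    elif number.count(".") > 1:
        number = number.replace(".", "")  # 1.234.567 -> 1234567
    try:
        return Decimal(number)
    except InvalidOperation:
        return None


for sample in ["$1,234.56", "1.234,56 €", "USD 89", "Sold out"]:
    print(repr(sample), "->", parse_price(sample))
```

`Decimal` avoids binary floating-point rounding when prices are summed. Call `float()` on the result if the rest of the code expects floats.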

**How to Use:**

1. **Install Libraries:**
   ```bash
   pip install requests beautifulsoup4
   ```

2. **Replace Placeholders:**  **THIS IS THE MOST IMPORTANT STEP.** Open the code and carefully replace the placeholder URLs, CSS selectors, and data extraction logic in the `scrape_flight_deals` function with the correct values for the actual airline website you want to scrape.  Use your browser's developer tools (inspect element) to examine the HTML structure.

3. **Run the Script:**
   ```bash
   python your_script_name.py
   ```

4. **Enter Preferences:** The script will prompt you to enter your travel preferences (budget, airlines, dates, etc.).

**Important Considerations and Next Steps:**

* **Website Structure:**  Websites change their HTML structure frequently.  Your scraper will break if the website's layout is updated.  You'll need to monitor the website and update your selectors accordingly.
* **Legal and Ethical Concerns:**
    * **Terms of Service:**  Read the website's terms of service to ensure that scraping is allowed.  Many websites prohibit scraping.
    * **robots.txt:**  Check the website's `robots.txt` file to see which parts of the site are disallowed for bots.
    * **Respectful Scraping:**  Don't overload the website with requests.  Implement delays between requests to avoid being blocked.  Use `time.sleep()` to add delays.
    * **User-Agent:** Set a descriptive User-Agent header in your requests so the website can identify your scraper (and potentially contact you if there's an issue).
* **More Robust Destination Suggestions:** Use a database or API to suggest destinations based on user interests, budget, and travel dates.
* **More Sophisticated Itinerary Planning:**
    * **Optimization:** Use optimization algorithms (e.g., genetic algorithms, simulated annealing) to find the best itinerary based on multiple criteria (price, travel time, number of stops, preferred airlines).
    * **API Integration:** Integrate with airline APIs (if available) to get real-time flight data.  This is generally more reliable than scraping.
* **Database Storage:**  Store flight data and user preferences in a database (e.g., SQLite, PostgreSQL) for persistence and efficient querying.
* **User Interface:** Create a user interface (e.g., using Flask or Django) to make the program more user-friendly.
* **Handling Stops:**  The current code doesn't explicitly handle the number of stops.  You'd need to add logic to limit the number of connecting flights in the itinerary.
* **Accommodation and Activities:** Extend the program to include booking accommodation and activities at each destination.  This would involve scraping hotel booking websites or using travel APIs.
* **Caching:** Implement caching to store scraped data and reduce the number of requests to the website.  Use libraries like `requests-cache`.
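
The `robots.txt` and request-delay points above can be sketched with the standard library alone. This is an illustrative sketch rather than a complete crawler: the `Throttle` class, the `allowed_by_robots` function, and the `USER_AGENT` string are hypothetical names, and the `robots.txt` content is passed in as text so the example runs without network access.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical User-Agent; use a name and contact address that identify your scraper.
USER_AGENT = "itinerary-planner-demo/0.1 (contact: you@example.com)"


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None    # time of the previous request, if any

    def wait(self):
        """Sleep if the previous request was too recent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay:
                self._sleep(delay)
        self._last = now + delay
        return delay


def allowed_by_robots(robots_txt, url, user_agent=USER_AGENT):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Usage with an example robots.txt (made-up rules).
rules = "User-agent: *\nDisallow: /account/\n"
print(allowed_by_robots(rules, "https://example.com/flights?origin=JFK"))
print(allowed_by_robots(rules, "https://example.com/account/settings"))
```

In the scraper, `throttle.wait()` would be called before each `requests.get(...)`, and the User-Agent would be sent with `headers={"User-Agent": USER_AGENT}`. For a live site, `RobotFileParser.set_url()` followed by `read()` downloads the real `robots.txt` instead of parsing a string.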

This example is a starting point for a multi-stop itinerary planner. Adapt the scraping logic to the specific website you are targeting, and respect that website's terms of service.