Workflow: Splitout Converttofile Send

Workflow Details

{
    "id": "DswhuYzoemjA6iNN",
    "meta": {
        "instanceId": "a1ae5c8dc6c65e674f9c3947d083abcc749ef2546dff9f4ff01de4d6a36ebfe6",
        "templateCredsSetupCompleted": true
    },
    "name": "Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV",
    "tags": [
        {
            "id": "TlcNkmb96fUfZ2eA",
            "name": "Tutorials",
            "createdAt": "2025-04-15T17:02:00.249Z",
            "updatedAt": "2025-04-15T17:02:00.249Z"
        }
    ],
    "nodes": [
        {
            "id": "2e4f64a5-353c-4dd3-9822-62df795d4940",
            "name": "Convert to CSV File",
            "type": "n8n-nodes-base.convertToFile",
            "position": [
                1640,
                340
            ],
            "parameters": {
                "options": []
            },
            "typeVersion": 1.100000000000000088817841970012523233890533447265625
        },
        {
            "id": "472442d3-a691-4310-93f8-019579d0c473",
            "name": "Extract all books from the page",
            "type": "n8n-nodes-base.html",
            "position": [
                760,
                340
            ],
            "parameters": {
                "options": [],
                "operation": "extractHtmlContent",
                "dataPropertyName": "content",
                "extractionValues": {
                    "values": [
                        {
                            "key": "books",
                            "cssSelector": ".row > li",
                            "returnArray": true,
                            "returnValue": "html"
                        }
                    ]
                }
            },
            "typeVersion": 1.1999999999999999555910790149937383830547332763671875
        },
        {
            "id": "92765257-d64d-47c9-bd57-50914342138b",
            "name": "Sort by price",
            "type": "n8n-nodes-base.sort",
            "position": [
                1420,
                340
            ],
            "parameters": {
                "options": [],
                "sortFieldsUi": {
                    "sortField": [
                        {
                            "order": "descending",
                            "fieldName": "price"
                        }
                    ]
                }
            },
            "typeVersion": 1
        },
        {
            "id": "efc2f33f-1bef-4906-b3b7-b02868080a54",
            "name": "Extract individual book price",
            "type": "n8n-nodes-base.html",
            "position": [
                1200,
                340
            ],
            "parameters": {
                "options": [],
                "operation": "extractHtmlContent",
                "dataPropertyName": "books",
                "extractionValues": {
                    "values": [
                        {
                            "key": "title",
                            "attribute": "title",
                            "cssSelector": "h3 > a",
                            "returnValue": "attribute"
                        },
                        {
                            "key": "price",
                            "cssSelector": ".price_color"
                        }
                    ]
                }
            },
            "typeVersion": 1.1999999999999999555910790149937383830547332763671875
        },
        {
            "id": "74c7c3af-d63c-4b6c-95a0-15f45b19134b",
            "name": "Send CSV via e-mail",
            "type": "n8n-nodes-base.gmail",
            "position": [
                1860,
                340
            ],
            "webhookId": "40f2d609-52ed-40bf-b190-1f1cebbe3fb7",
            "parameters": {
                "sendTo": "",
                "message": "Hey, here's the scraped data from the online bookstore!",
                "options": {
                    "attachmentsUi": {
                        "attachmentsBinary": [
                            []
                        ]
                    }
                },
                "subject": "bookstore csv",
                "emailType": "text"
            },
            "credentials": {
                "gmailOAuth2": {
                    "id": "j70r3RTMED1pgN3R",
                    "name": "Gmail account 2"
                }
            },
            "typeVersion": 2.100000000000000088817841970012523233890533447265625
        },
        {
            "id": "95c7998b-ece0-4dea-b99e-97ac22fb8a59",
            "name": "Sticky Note3",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                -260
            ],
            "parameters": {
                "width": 619,
                "height": 297,
                "content": "### Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV\n\n\ud83d\udccc This workflow scrapes book data from a website, turns it into a CSV, saves it, and sends it by email.\n\n\ud83d\udd27 It starts from a Google Sheets trigger, fetches the page using DumplingAI, extracts books, sorts by price, and emails the CSV.\n\n\u2705 Make sure APIs for Gmail, Sheets & Drive are enabled in Google Cloud. Update the URL in the \"Fetch website content\" node.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "f599028a-49a9-4b85-b484-5abf1229e373",
            "name": "Sticky Note",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                60
            ],
            "parameters": {
                "color": 4,
                "width": 900,
                "height": 300,
                "content": "### \ud83d\udd01 Trigger to Raw Book HTML\n\n1. **Google Sheets Trigger**  \n   Watches a sheet for new row entries. Once a new URL is added, the workflow starts.\n\n2. **Fetch Website Content (Dumpling AI)**  \n   Makes an HTTP POST request to Dumpling AI to scrape and return the full HTML of the target URL.\n\n3. **Extract All Books**  \n   Uses CSS selectors to isolate the list items (`li.row > li`) containing book entries.\n\n4. **Split Out Node**  \n   Breaks the array of book HTML blocks into individual items, so each book can be processed separately in the next steps.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "bc6ab72c-de03-4e79-9da0-ca12ddf31811",
            "name": "Sticky Note1",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                1140,
                60
            ],
            "parameters": {
                "color": 6,
                "width": 840,
                "height": 300,
                "content": "### \ud83d\udce6 Parse, Sort, Export & Email\n\n5. **Extract Individual Book Data**  \n   From each book, extract the title (`<h3>a` title attribute) and price (`.price_color` content).\n\n6. **Sort by Price**  \n   Organizes the extracted data in descending order using the price field.\n\n7. **Convert to CSV File**  \n   Transforms the sorted JSON data into a downloadable CSV file format.\n\n8. **Send CSV via Gmail**  \n   Automatically sends an email with the CSV file attached to the predefined address.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "a1246b4e-212f-4bd3-970b-b0ff8db2f834",
            "name": "Trigger- Watches For new URL in Spreadsheet",
            "type": "n8n-nodes-base.googleSheetsTrigger",
            "position": [
                320,
                340
            ],
            "parameters": {
                "event": "rowAdded",
                "options": [],
                "pollTimes": {
                    "item": [
                        {
                            "mode": "everyMinute"
                        }
                    ]
                },
                "sheetName": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit#gid=0",
                    "cachedResultName": "Sheet1"
                },
                "documentId": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit?usp=drivesdk",
                    "cachedResultName": "URLs"
                }
            },
            "credentials": {
                "googleSheetsTriggerOAuth2Api": {
                    "id": "qDzHSzTkclwDHpSR",
                    "name": "Google Sheets Trigger account"
                }
            },
            "typeVersion": 1
        },
        {
            "id": "b19aa287-3be4-4e16-908d-b0cb484519e3",
            "name": "Scrape Website Content with Dumpling AI",
            "type": "n8n-nodes-base.httpRequest",
            "position": [
                540,
                340
            ],
            "parameters": {
                "url": "https:\/\/app.dumplingai.com\/api\/v1\/scrape",
                "method": "POST",
                "options": {
                    "allowUnauthorizedCerts": true
                },
                "jsonBody": "={\n  \"url\": \"{{ $('Trigger- Watches For new URL in Spreadsheet')}}\", \n  \"format\": \"html\",\n  \"cleaned\": \"True\"\n  }",
                "sendBody": true,
                "sendHeaders": true,
                "specifyBody": "json",
                "authentication": "genericCredentialType",
                "genericAuthType": "httpHeaderAuth",
                "headerParameters": {
                    "parameters": [
                        {
                            "name": "Content-Type",
                            "value": "application\/json"
                        }
                    ]
                }
            },
            "credentials": {
                "httpBasicAuth": {
                    "id": "mznexGH3YDtrUTAk",
                    "name": "Unnamed credential"
                },
                "httpHeaderAuth": {
                    "id": "xamyMqCpAech5BeT",
                    "name": "Header Auth account"
                }
            },
            "typeVersion": 4.0999999999999996447286321199499070644378662109375
        },
        {
            "id": "02cbc6f9-bdcb-45fc-9973-ded42346ffbc",
            "name": "Split HTML Array into Individual Books",
            "type": "n8n-nodes-base.splitOut",
            "position": [
                980,
                340
            ],
            "parameters": {
                "options": [],
                "fieldToSplitOut": "books"
            },
            "typeVersion": 1
        }
    ],
    "active": false,
    "pinData": [],
    "settings": {
        "executionOrder": "v1"
    },
    "versionId": "264412ff-9d74-443c-a2ff-69be1e042a82",
    "connections": {
        "Sort by price": {
            "main": [
                [
                    {
                        "node": "Convert to CSV File",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Convert to CSV File": {
            "main": [
                [
                    {
                        "node": "Send CSV via e-mail",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract individual book price": {
            "main": [
                [
                    {
                        "node": "Sort by price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract all books from the page": {
            "main": [
                [
                    {
                        "node": "Split HTML Array into Individual Books",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Split HTML Array into Individual Books": {
            "main": [
                [
                    {
                        "node": "Extract individual book price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Scrape Website Content with Dumpling AI": {
            "main": [
                [
                    {
                        "node": "Extract all books from the page",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Trigger- Watches For new URL in Spreadsheet": {
            "main": [
                [
                    {
                        "node": "Scrape Website Content with Dumpling AI",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        }
    }
}
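
A note on the "Sort by price" step: the `price` field is extracted as text from `.price_color`, so a descending sort on it may compare strings rather than amounts. The Python sketch below is not part of the workflow; it only illustrates the difference and one way to normalize prices before sorting and CSV export. The sample titles and the `£51.77`-style price format are assumptions made for illustration, and the helper names (`parse_price`, `sort_books_by_price`, `books_to_csv`) are hypothetical.

```python
import csv
import io
import re
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Pull the first decimal number out of a price string such as '£51.77'."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    if match is None:
        raise ValueError(f"no numeric price in {text!r}")
    return Decimal(match.group())


def sort_books_by_price(books: list[dict]) -> list[dict]:
    """Return the books ordered from most to least expensive, by numeric price."""
    return sorted(books, key=lambda book: parse_price(book["price"]), reverse=True)


def books_to_csv(books: list[dict]) -> str:
    """Serialize title/price items to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(books)
    return buffer.getvalue()


# Hypothetical items shaped like the output of "Extract individual book price".
sample = [
    {"title": "Book A", "price": "£9.99"},
    {"title": "Book B", "price": "£51.77"},
    {"title": "Book C", "price": "£100.00"},
]

# A plain text sort would put '£9.99' first in descending order;
# sorting on the parsed amount puts '£100.00' first instead.
text_sorted = sorted(sample, key=lambda book: book["price"], reverse=True)
ordered = sort_books_by_price(sample)
csv_text = books_to_csv(ordered)
```

In n8n itself, the same idea could be applied by adding a numeric price field before the Sort node and sorting on that field; the exact node setup is left open here.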
