Workflow: Splitout Converttofile Send

Workflow Details

{
    "id": "DswhuYzoemjA6iNN",
    "meta": {
        "instanceId": "a1ae5c8dc6c65e674f9c3947d083abcc749ef2546dff9f4ff01de4d6a36ebfe6",
        "templateCredsSetupCompleted": true
    },
    "name": "Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV",
    "tags": [
        {
            "id": "TlcNkmb96fUfZ2eA",
            "name": "Tutorials",
            "createdAt": "2025-04-15T17:02:00.249Z",
            "updatedAt": "2025-04-15T17:02:00.249Z"
        }
    ],
    "nodes": [
        {
            "id": "2e4f64a5-353c-4dd3-9822-62df795d4940",
            "name": "Convert to CSV File",
            "type": "n8n-nodes-base.convertToFile",
            "position": [
                1640,
                340
            ],
            "parameters": {
                "options": []
            },
            "typeVersion": 1.100000000000000088817841970012523233890533447265625
        },
        {
            "id": "472442d3-a691-4310-93f8-019579d0c473",
            "name": "Extract all books from the page",
            "type": "n8n-nodes-base.html",
            "position": [
                760,
                340
            ],
            "parameters": {
                "options": [],
                "operation": "extractHtmlContent",
                "dataPropertyName": "content",
                "extractionValues": {
                    "values": [
                        {
                            "key": "books",
                            "cssSelector": ".row > li",
                            "returnArray": true,
                            "returnValue": "html"
                        }
                    ]
                }
            },
            "typeVersion": 1.1999999999999999555910790149937383830547332763671875
        },
        {
            "id": "92765257-d64d-47c9-bd57-50914342138b",
            "name": "Sort by price",
            "type": "n8n-nodes-base.sort",
            "position": [
                1420,
                340
            ],
            "parameters": {
                "options": [],
                "sortFieldsUi": {
                    "sortField": [
                        {
                            "order": "descending",
                            "fieldName": "price"
                        }
                    ]
                }
            },
            "typeVersion": 1
        },
        {
            "id": "efc2f33f-1bef-4906-b3b7-b02868080a54",
            "name": "Extract individual book price",
            "type": "n8n-nodes-base.html",
            "position": [
                1200,
                340
            ],
            "parameters": {
                "options": [],
                "operation": "extractHtmlContent",
                "dataPropertyName": "books",
                "extractionValues": {
                    "values": [
                        {
                            "key": "title",
                            "attribute": "title",
                            "cssSelector": "h3 > a",
                            "returnValue": "attribute"
                        },
                        {
                            "key": "price",
                            "cssSelector": ".price_color"
                        }
                    ]
                }
            },
            "typeVersion": 1.1999999999999999555910790149937383830547332763671875
        },
        {
            "id": "74c7c3af-d63c-4b6c-95a0-15f45b19134b",
            "name": "Send CSV via e-mail",
            "type": "n8n-nodes-base.gmail",
            "position": [
                1860,
                340
            ],
            "webhookId": "40f2d609-52ed-40bf-b190-1f1cebbe3fb7",
            "parameters": {
                "sendTo": "",
                "message": "Hey, here's the scraped data from the online bookstore!",
                "options": {
                    "attachmentsUi": {
                        "attachmentsBinary": [
                            {}
                        ]
                    }
                },
                "subject": "bookstore csv",
                "emailType": "text"
            },
            "credentials": {
                "gmailOAuth2": {
                    "id": "j70r3RTMED1pgN3R",
                    "name": "Gmail account 2"
                }
            },
            "typeVersion": 2.100000000000000088817841970012523233890533447265625
        },
        {
            "id": "95c7998b-ece0-4dea-b99e-97ac22fb8a59",
            "name": "Sticky Note3",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                -260
            ],
            "parameters": {
                "width": 619,
                "height": 297,
                "content": "### Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV\n\n\ud83d\udccc This workflow scrapes book data from a website, turns it into a CSV, saves it, and sends it by email.\n\n\ud83d\udd27 It starts from a Google Sheets trigger, fetches the page using DumplingAI, extracts books, sorts by price, and emails the CSV.\n\n\u2705 Make sure APIs for Gmail, Sheets & Drive are enabled in Google Cloud. Update the URL in the \"Fetch website content\" node.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "f599028a-49a9-4b85-b484-5abf1229e373",
            "name": "Sticky Note",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                60
            ],
            "parameters": {
                "color": 4,
                "width": 900,
                "height": 300,
                "content": "### \ud83d\udd01 Trigger to Raw Book HTML\n\n1. **Google Sheets Trigger**  \n   Watches a sheet for new row entries. Once a new URL is added, the workflow starts.\n\n2. **Fetch Website Content (Dumpling AI)**  \n   Makes an HTTP POST request to Dumpling AI to scrape and return the full HTML of the target URL.\n\n3. **Extract All Books**  \n   Uses CSS selectors to isolate the list items (`li.row > li`) containing book entries.\n\n4. **Split Out Node**  \n   Breaks the array of book HTML blocks into individual items, so each book can be processed separately in the next steps.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "bc6ab72c-de03-4e79-9da0-ca12ddf31811",
            "name": "Sticky Note1",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                1140,
                60
            ],
            "parameters": {
                "color": 6,
                "width": 840,
                "height": 300,
                "content": "### \ud83d\udce6 Parse, Sort, Export & Email\n\n5. **Extract Individual Book Data**  \n   From each book, extract the title (`<h3>a` title attribute) and price (`.price_color` content).\n\n6. **Sort by Price**  \n   Organizes the extracted data in descending order using the price field.\n\n7. **Convert to CSV File**  \n   Transforms the sorted JSON data into a downloadable CSV file format.\n\n8. **Send CSV via Gmail**  \n   Automatically sends an email with the CSV file attached to the predefined address.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "a1246b4e-212f-4bd3-970b-b0ff8db2f834",
            "name": "Trigger- Watches For new URL in Spreadsheet",
            "type": "n8n-nodes-base.googleSheetsTrigger",
            "position": [
                320,
                340
            ],
            "parameters": {
                "event": "rowAdded",
                "options": [],
                "pollTimes": {
                    "item": [
                        {
                            "mode": "everyMinute"
                        }
                    ]
                },
                "sheetName": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit#gid=0",
                    "cachedResultName": "Sheet1"
                },
                "documentId": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit?usp=drivesdk",
                    "cachedResultName": "URLs"
                }
            },
            "credentials": {
                "googleSheetsTriggerOAuth2Api": {
                    "id": "qDzHSzTkclwDHpSR",
                    "name": "Google Sheets Trigger account"
                }
            },
            "typeVersion": 1
        },
        {
            "id": "b19aa287-3be4-4e16-908d-b0cb484519e3",
            "name": "Scrape Website Content with Dumpling AI",
            "type": "n8n-nodes-base.httpRequest",
            "position": [
                540,
                340
            ],
            "parameters": {
                "url": "https:\/\/app.dumplingai.com\/api\/v1\/scrape",
                "method": "POST",
                "options": {
                    "allowUnauthorizedCerts": true
                },
                "jsonBody": "={\n  \"url\": \"{{ $('Trigger- Watches For new URL in Spreadsheet')}}\", \n  \"format\": \"html\",\n  \"cleaned\": \"True\"\n  }",
                "sendBody": true,
                "sendHeaders": true,
                "specifyBody": "json",
                "authentication": "genericCredentialType",
                "genericAuthType": "httpHeaderAuth",
                "headerParameters": {
                    "parameters": [
                        {
                            "name": "Content-Type",
                            "value": "application\/json"
                        }
                    ]
                }
            },
            "credentials": {
                "httpBasicAuth": {
                    "id": "mznexGH3YDtrUTAk",
                    "name": "Unnamed credential"
                },
                "httpHeaderAuth": {
                    "id": "xamyMqCpAech5BeT",
                    "name": "Header Auth account"
                }
            },
            "typeVersion": 4.0999999999999996447286321199499070644378662109375
        },
        {
            "id": "02cbc6f9-bdcb-45fc-9973-ded42346ffbc",
            "name": "Split HTML Array into Individual Books",
            "type": "n8n-nodes-base.splitOut",
            "position": [
                980,
                340
            ],
            "parameters": {
                "options": [],
                "fieldToSplitOut": "books"
            },
            "typeVersion": 1
        }
    ],
    "active": false,
    "pinData": [],
    "settings": {
        "executionOrder": "v1"
    },
    "versionId": "264412ff-9d74-443c-a2ff-69be1e042a82",
    "connections": {
        "Sort by price": {
            "main": [
                [
                    {
                        "node": "Convert to CSV File",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Convert to CSV File": {
            "main": [
                [
                    {
                        "node": "Send CSV via e-mail",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract individual book price": {
            "main": [
                [
                    {
                        "node": "Sort by price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract all books from the page": {
            "main": [
                [
                    {
                        "node": "Split HTML Array into Individual Books",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Split HTML Array into Individual Books": {
            "main": [
                [
                    {
                        "node": "Extract individual book price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Scrape Website Content with Dumpling AI": {
            "main": [
                [
                    {
                        "node": "Extract all books from the page",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Trigger- Watches For new URL in Spreadsheet": {
            "main": [
                [
                    {
                        "node": "Scrape Website Content with Dumpling AI",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        }
    }
}
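
For readers who want to see the data flow outside n8n, the following is a minimal Python sketch of the last four steps of the workflow: extracting a title and price from each book fragment, sorting by price in descending order, and producing CSV text. It is an illustration under stated assumptions, not what n8n runs internally. The names `BookParser`, `parse_book`, `price_value`, and `books_to_csv` are hypothetical. The sketch matches any `a` inside an `h3` (a looser match than the `h3 > a` selector), and it sorts on the numeric part of the price, whereas the workflow sorts on the raw `price` field.

```python
import csv
import io
import re
from html.parser import HTMLParser


class BookParser(HTMLParser):
    """Collects the title and price from one book's HTML fragment."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.price = None
        self._in_h3 = False
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3 and self.title is None:
            # Title comes from the link's "title" attribute, as in the workflow.
            self.title = attrs.get("title")
        elif "price_color" in (attrs.get("class") or "").split():
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False
        self._in_price = False

    def handle_data(self, data):
        # Price comes from the text content of the .price_color element.
        if self._in_price and self.price is None:
            self.price = data.strip()


def parse_book(fragment):
    """Return {"title": ..., "price": ...} for one book fragment."""
    parser = BookParser()
    parser.feed(fragment)
    return {"title": parser.title, "price": parser.price}


def price_value(price):
    """Assumption: use the first number in the price text as the sort key."""
    match = re.search(r"\d+(?:\.\d+)?", price or "")
    return float(match.group()) if match else 0.0


def books_to_csv(fragments):
    """Parse the fragments, sort by price descending, and return CSV text."""
    rows = [parse_book(fragment) for fragment in fragments]
    rows.sort(key=lambda row: price_value(row["price"]), reverse=True)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `books_to_csv` with a list of `<li>` fragments, such as the `books` array that the Split Out node receives, returns CSV text with a `title,price` header row followed by one row per book.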