Workflow: Splitout Converttofile Send

Workflow Details

{
    "id": "DswhuYzoemjA6iNN",
    "meta": {
        "instanceId": "a1ae5c8dc6c65e674f9c3947d083abcc749ef2546dff9f4ff01de4d6a36ebfe6",
        "templateCredsSetupCompleted": true
    },
    "name": "Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV",
    "tags": [
        {
            "id": "TlcNkmb96fUfZ2eA",
            "name": "Tutorials",
            "createdAt": "2025-04-15T17:02:00.249Z",
            "updatedAt": "2025-04-15T17:02:00.249Z"
        }
    ],
    "nodes": [
        {
            "id": "2e4f64a5-353c-4dd3-9822-62df795d4940",
            "name": "Convert to CSV File",
            "type": "n8n-nodes-base.convertToFile",
            "position": [
                1640,
                340
            ],
            "parameters": {
                "options": {}
            },
            "typeVersion": 1.1
        },
        {
            "id": "472442d3-a691-4310-93f8-019579d0c473",
            "name": "Extract all books from the page",
            "type": "n8n-nodes-base.html",
            "position": [
                760,
                340
            ],
            "parameters": {
                "options": {},
                "operation": "extractHtmlContent",
                "dataPropertyName": "content",
                "extractionValues": {
                    "values": [
                        {
                            "key": "books",
                            "cssSelector": ".row > li",
                            "returnArray": true,
                            "returnValue": "html"
                        }
                    ]
                }
            },
            "typeVersion": 1.2
        },
        {
            "id": "92765257-d64d-47c9-bd57-50914342138b",
            "name": "Sort by price",
            "type": "n8n-nodes-base.sort",
            "position": [
                1420,
                340
            ],
            "parameters": {
                "options": {},
                "sortFieldsUi": {
                    "sortField": [
                        {
                            "order": "descending",
                            "fieldName": "price"
                        }
                    ]
                }
            },
            "typeVersion": 1
        },
        {
            "id": "efc2f33f-1bef-4906-b3b7-b02868080a54",
            "name": "Extract individual book price",
            "type": "n8n-nodes-base.html",
            "position": [
                1200,
                340
            ],
            "parameters": {
                "options": {},
                "operation": "extractHtmlContent",
                "dataPropertyName": "books",
                "extractionValues": {
                    "values": [
                        {
                            "key": "title",
                            "attribute": "title",
                            "cssSelector": "h3 > a",
                            "returnValue": "attribute"
                        },
                        {
                            "key": "price",
                            "cssSelector": ".price_color"
                        }
                    ]
                }
            },
            "typeVersion": 1.2
        },
        {
            "id": "74c7c3af-d63c-4b6c-95a0-15f45b19134b",
            "name": "Send CSV via e-mail",
            "type": "n8n-nodes-base.gmail",
            "position": [
                1860,
                340
            ],
            "webhookId": "40f2d609-52ed-40bf-b190-1f1cebbe3fb7",
            "parameters": {
                "sendTo": "",
                "message": "Hey, here's the scraped data from the online bookstore!",
                "options": {
                    "attachmentsUi": {
                        "attachmentsBinary": [
                            {}
                        ]
                    }
                },
                "subject": "bookstore csv",
                "emailType": "text"
            },
            "credentials": {
                "gmailOAuth2": {
                    "id": "j70r3RTMED1pgN3R",
                    "name": "Gmail account 2"
                }
            },
            "typeVersion": 2.1
        },
        {
            "id": "95c7998b-ece0-4dea-b99e-97ac22fb8a59",
            "name": "Sticky Note3",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                -260
            ],
            "parameters": {
                "width": 619,
                "height": 297,
                "content": "### Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV\n\n\ud83d\udccc This workflow scrapes book data from a website, turns it into a CSV, and sends it by email.\n\n\ud83d\udd27 It starts from a Google Sheets trigger, fetches the page using Dumpling AI, extracts the books, sorts them by price, and emails the CSV.\n\n\u2705 Make sure the APIs for Gmail, Sheets & Drive are enabled in Google Cloud. Update the URL in the \"Scrape Website Content with Dumpling AI\" node.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "f599028a-49a9-4b85-b484-5abf1229e373",
            "name": "Sticky Note",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                140,
                60
            ],
            "parameters": {
                "color": 4,
                "width": 900,
                "height": 300,
                "content": "### \ud83d\udd01 Trigger to Raw Book HTML\n\n1. **Google Sheets Trigger**  \n   Watches a sheet for new rows. When a new URL is added, the workflow starts.\n\n2. **Scrape Website Content (Dumpling AI)**  \n   Makes an HTTP POST request to Dumpling AI to scrape and return the full HTML of the target URL.\n\n3. **Extract All Books**  \n   Uses a CSS selector (`.row > li`) to isolate the list items containing book entries.\n\n4. **Split Out Node**  \n   Breaks the array of book HTML blocks into individual items, so each book can be processed separately in the next steps.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "bc6ab72c-de03-4e79-9da0-ca12ddf31811",
            "name": "Sticky Note1",
            "type": "n8n-nodes-base.stickyNote",
            "position": [
                1140,
                60
            ],
            "parameters": {
                "color": 6,
                "width": 840,
                "height": 300,
                "content": "### \ud83d\udce6 Parse, Sort, Export & Email\n\n5. **Extract Individual Book Data**  \n   From each book, extracts the title (the `title` attribute of `h3 > a`) and the price (the text of `.price_color`).\n\n6. **Sort by Price**  \n   Orders the extracted items by the price field, descending.\n\n7. **Convert to CSV File**  \n   Converts the sorted JSON items into a CSV file.\n\n8. **Send CSV via Gmail**  \n   Sends an email with the CSV file attached to the address set in the node.\n"
            },
            "typeVersion": 1
        },
        {
            "id": "a1246b4e-212f-4bd3-970b-b0ff8db2f834",
            "name": "Trigger- Watches For new URL in Spreadsheet",
            "type": "n8n-nodes-base.googleSheetsTrigger",
            "position": [
                320,
                340
            ],
            "parameters": {
                "event": "rowAdded",
                "options": {},
                "pollTimes": {
                    "item": [
                        {
                            "mode": "everyMinute"
                        }
                    ]
                },
                "sheetName": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit#gid=0",
                    "cachedResultName": "Sheet1"
                },
                "documentId": {
                    "__rl": true,
                    "mode": "list",
                    "value": "",
                    "cachedResultUrl": "https:\/\/docs.google.com\/spreadsheets\/d\/1pb4WLqv2EruLM1z9-utehcINolSj0vlUqZionyLoRUs\/edit?usp=drivesdk",
                    "cachedResultName": "URLs"
                }
            },
            "credentials": {
                "googleSheetsTriggerOAuth2Api": {
                    "id": "qDzHSzTkclwDHpSR",
                    "name": "Google Sheets Trigger account"
                }
            },
            "typeVersion": 1
        },
        {
            "id": "b19aa287-3be4-4e16-908d-b0cb484519e3",
            "name": "Scrape Website Content with Dumpling AI",
            "type": "n8n-nodes-base.httpRequest",
            "position": [
                540,
                340
            ],
            "parameters": {
                "url": "https:\/\/app.dumplingai.com\/api\/v1\/scrape",
                "method": "POST",
                "options": {
                    "allowUnauthorizedCerts": true
                },
                "jsonBody": "={\n  \"url\": \"{{ $('Trigger- Watches For new URL in Spreadsheet')}}\", \n  \"format\": \"html\",\n  \"cleaned\": \"True\"\n  }",
                "sendBody": true,
                "sendHeaders": true,
                "specifyBody": "json",
                "authentication": "genericCredentialType",
                "genericAuthType": "httpHeaderAuth",
                "headerParameters": {
                    "parameters": [
                        {
                            "name": "Content-Type",
                            "value": "application\/json"
                        }
                    ]
                }
            },
            "credentials": {
                "httpBasicAuth": {
                    "id": "mznexGH3YDtrUTAk",
                    "name": "Unnamed credential"
                },
                "httpHeaderAuth": {
                    "id": "xamyMqCpAech5BeT",
                    "name": "Header Auth account"
                }
            },
            "typeVersion": 4.1
        },
        {
            "id": "02cbc6f9-bdcb-45fc-9973-ded42346ffbc",
            "name": "Split HTML Array into Individual Books",
            "type": "n8n-nodes-base.splitOut",
            "position": [
                980,
                340
            ],
            "parameters": {
                "options": {},
                "fieldToSplitOut": "books"
            },
            "typeVersion": 1
        }
    ],
    "active": false,
    "pinData": {},
    "settings": {
        "executionOrder": "v1"
    },
    "versionId": "264412ff-9d74-443c-a2ff-69be1e042a82",
    "connections": {
        "Sort by price": {
            "main": [
                [
                    {
                        "node": "Convert to CSV File",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Convert to CSV File": {
            "main": [
                [
                    {
                        "node": "Send CSV via e-mail",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract individual book price": {
            "main": [
                [
                    {
                        "node": "Sort by price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Extract all books from the page": {
            "main": [
                [
                    {
                        "node": "Split HTML Array into Individual Books",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Split HTML Array into Individual Books": {
            "main": [
                [
                    {
                        "node": "Extract individual book price",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Scrape Website Content with Dumpling AI": {
            "main": [
                [
                    {
                        "node": "Extract all books from the page",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        },
        "Trigger- Watches For new URL in Spreadsheet": {
            "main": [
                [
                    {
                        "node": "Scrape Website Content with Dumpling AI",
                        "type": "main",
                        "index": 0
                    }
                ]
            ]
        }
    }
}
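
The extract, sort, and CSV steps of the workflow above can be sketched outside n8n. This is a minimal illustration, not the workflow's own implementation. The sample markup, the `BookParser` class, and the `price_value` and `books_to_csv` helpers are all hypothetical, and the parser only approximates the `.row > li`, `h3 > a`, and `.price_color` selectors. It also assumes the scraped price is text with a currency prefix, in which case a plain string comparison may order values differently from a numeric one, so the sketch converts the price to a number before sorting.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample markup shaped to match the selectors used in the workflow.
SAMPLE_HTML = """
<ol class="row">
  <li><article><h3><a href="a.html" title="Book A">Book A</a></h3>
      <p class="price_color">£9.99</p></article></li>
  <li><article><h3><a href="b.html" title="Book B">Book B</a></h3>
      <p class="price_color">£51.77</p></article></li>
  <li><article><h3><a href="c.html" title="Book C">Book C</a></h3>
      <p class="price_color">£10.50</p></article></li>
</ol>
"""


class BookParser(HTMLParser):
    """Collects one {title, price} dict per <li> (a simplification of `.row > li`)."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._current = None
        self._in_h3 = False
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            self._current = {"title": "", "price": ""}
        elif tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3 and self._current is not None:
            # Mirrors `h3 > a` with returnValue "attribute" (title).
            self._current["title"] = attrs.get("title", "")
        elif "price_color" in (attrs.get("class") or "").split():
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False
        elif tag == "p":
            self._in_price = False
        elif tag == "li" and self._current is not None:
            self.books.append(self._current)
            self._current = None

    def handle_data(self, data):
        # Mirrors `.price_color` text extraction.
        if self._in_price and self._current is not None:
            self._current["price"] += data.strip()


def price_value(price):
    """Drops the currency symbol so prices compare as numbers, not strings."""
    return float("".join(ch for ch in price if ch.isdigit() or ch == "."))


def books_to_csv(html):
    parser = BookParser()
    parser.feed(html)
    # Descending by numeric price, as in the "Sort by price" step.
    books = sorted(parser.books, key=lambda b: price_value(b["price"]), reverse=True)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(books)
    return buffer.getvalue()


csv_text = books_to_csv(SAMPLE_HTML)
print(csv_text)
```

With the sample data, sorting the raw strings in descending order would put `£9.99` first, while the numeric sort puts `£51.77` first. If the same ordering matters in the workflow, one option is to convert the price to a number before the sort step.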