Puppeteer Web Scraper

Puppeteer is a Node.js library that controls Chrome/Chromium through the DevTools Protocol in headless mode. When scraping websites, always review and comply with the site's terms of service and policies to ensure ethical and legal use of the data.

Scrape One URL

  1. (Optional) Connect a Text Splitter.

  2. Enter the URL to be scraped.
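To illustrate what the optional Text Splitter contributes, here is a minimal sketch that breaks scraped text into fixed-size chunks with overlap. It is a simplified stand-in, not the splitter node itself; the names `splitText`, `chunkSize`, and `chunkOverlap` are assumptions made for this example.

```typescript
// Simplified stand-in for a Text Splitter: fixed-size chunks with overlap.
// splitText, chunkSize, and chunkOverlap are illustrative names only.
function splitText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("chunkOverlap must be in [0, chunkSize)");
  }

  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap; // how far each chunk start advances

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) {
      break; // this chunk already reached the end of the text
    }
  }
  return chunks;
}
```

For example, `splitText("abcdefghij", 4, 1)` returns `["abcd", "defg", "ghij"]`: each chunk repeats the last character of the previous one, so text near a chunk boundary keeps some context.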

Crawl & Scrape Multiple URLs

Visit the Web Crawl guide to enable scraping of multiple pages.
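As a rough idea of what crawling involves, the sketch below collects the unique same-origin links from a page's HTML, which a crawler could then visit one by one. This is an assumption about the general technique, not a description of how the Web Crawl guide implements it; `extractSameOriginLinks` and `tryResolve` are hypothetical helpers, and the regular expression is only adequate for simple markup.

```typescript
// Resolve an href against a base URL; return null for malformed values.
function tryResolve(href: string, baseHref: string): URL | null {
  try {
    return new URL(href, baseHref);
  } catch {
    return null;
  }
}

// Collect unique links that share the origin of baseUrl (hypothetical helper).
function extractSameOriginLinks(html: string, baseUrl: string): string[] {
  const base = new URL(baseUrl);
  const links: string[] = [];
  // Naive pattern for quoted href attributes on <a> tags; a sketch, not an HTML parser.
  const hrefPattern = /<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi;

  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const resolved = tryResolve(match[1], base.href);
    if (resolved === null) {
      continue; // skip hrefs that cannot be parsed
    }
    resolved.hash = ""; // treat #fragment variants as the same page
    if (resolved.origin === base.origin && links.indexOf(resolved.href) === -1) {
      links.push(resolved.href);
    }
  }
  return links;
}
```

Restricting the result to the starting page's origin keeps a crawl from wandering onto unrelated sites, which also makes it easier to stay within a single site's terms of service.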

Output

Loads the URL content as a Document.
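The following sketch shows roughly how a URL could be loaded into a Document with Puppeteer. Several things here are assumptions: the `ScrapedDocument` shape (a `pageContent` string plus a `metadata.source` field), the function name `loadUrlAsDocument`, and the small `LauncherLike`, `BrowserLike`, and `PageLike` interfaces, which are meant to mirror the subset of the Puppeteer API used (`launch`, `newPage`, `goto`, `content`, `close`) so the function can be tried with a fake browser.

```typescript
// Assumed Document shape for illustration: page text plus source metadata.
interface ScrapedDocument {
  pageContent: string;
  metadata: { source: string };
}

// Structural subsets of the Puppeteer API used below (assumed to match the
// real library closely enough that the `puppeteer` export can be passed in).
interface PageLike {
  goto(
    url: string,
    options?: { waitUntil?: "load" | "domcontentloaded" | "networkidle0" | "networkidle2" }
  ): Promise<unknown>;
  content(): Promise<string>;
}
interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
interface LauncherLike {
  launch(options?: { headless?: boolean }): Promise<BrowserLike>;
}

// Open the URL in a headless browser and wrap the rendered HTML in a Document.
async function loadUrlAsDocument(launcher: LauncherLike, url: string): Promise<ScrapedDocument> {
  const browser = await launcher.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "domcontentloaded" });
    const html = await page.content(); // full serialized HTML of the page
    return { pageContent: html, metadata: { source: url } };
  } finally {
    await browser.close(); // always release the browser, even on errors
  }
}
```

In a real project, one would likely `import puppeteer from "puppeteer"` and call `loadUrlAsDocument(puppeteer, "https://example.com")`. Passing the launcher in as a parameter is a design choice for this sketch only, so the function can run without a browser installed.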

