Node.js Web Scraping with Powerful Features for Server-Side Crawling and More

Nov 8, 2017

Summary of my bookmarked GitHub repositories from Nov 8th, 2017

bda-research/node-crawler
node-crawler is a widely used Node.js package for server-side crawling and scraping in JavaScript. It offers server-side DOM manipulation, automatic jQuery insertion, a configurable pool size and retry count, rate limit control, and a priority queue for requests. It is compatible with Node.js version 4.x or newer. The package also lets you choose between Cheerio and JSDOM for parsing, and supports sending requests directly and working with HTTP/2. Together, these features cover most of what is needed to crawl and scrape web content from a Node.js process.
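To make the pool size, retry, and priority queue features more concrete, here is a minimal TypeScript sketch of how such a scheduler could work. It is not code from bda-research/node-crawler and does not use its API: the class name TinyScheduler, its method names, and the rule that a lower priority number runs first are all assumptions made for illustration. Rate limiting is left out to keep the example short.

```typescript
// Illustrative sketch only: none of these names come from node-crawler.
type Task<T> = () => Promise<T>;

interface QueuedJob<T> {
  task: Task<T>;
  priority: number; // assumption of this sketch: lower number runs first
  retriesLeft: number;
  resolve: (value: T) => void;
  reject: (reason: unknown) => void;
}

class TinyScheduler<T> {
  private pending: QueuedJob<T>[] = [];
  private active = 0;
  private poolSize: number;
  private retries: number;

  // poolSize: maximum number of tasks running at once
  // retries: how many extra attempts a failing task gets
  constructor(poolSize: number, retries: number) {
    this.poolSize = poolSize;
    this.retries = retries;
  }

  // Adds a task to the queue and returns a promise for its final result.
  enqueue(task: Task<T>, priority: number = 5): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.insert({ task, priority, retriesLeft: this.retries, resolve, reject });
      this.drain();
    });
  }

  // Keeps the queue ordered by priority; Array.prototype.sort is stable,
  // so jobs with equal priority keep their insertion order.
  private insert(job: QueuedJob<T>): void {
    this.pending.push(job);
    this.pending.sort((a, b) => a.priority - b.priority);
  }

  // Starts queued jobs until the pool is full or the queue is empty.
  private drain(): void {
    while (this.active < this.poolSize && this.pending.length > 0) {
      const job = this.pending.shift()!;
      this.active++;
      Promise.resolve()
        .then(() => job.task())
        .then(
          (value) => {
            this.active--;
            job.resolve(value);
            this.drain();
          },
          (err) => {
            this.active--;
            if (job.retriesLeft > 0) {
              // Put the failed job back in the queue for another attempt.
              job.retriesLeft--;
              this.insert(job);
            } else {
              job.reject(err);
            }
            this.drain();
          }
        );
    }
  }
}
```

In this sketch, a scheduler created with new TinyScheduler(10, 3) would run at most ten tasks at a time and give each failing task up to three further attempts before rejecting its promise. A real crawler would wrap an HTTP request in each task and add a delay between requests to respect a rate limit.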