Concurrent web scraper
Requirements
This crawler requires Python 3.5 or later, because it uses the async/await syntax (introduced in Python 3.5) together with asyncio.
Install required modules:
pip install -r requirements.txt
Run:
python crawler.py -u https://urltocrawl.com [-c 100]
Flags:
- -u/--url https://url.com
  - Required. The base URL to crawl.
- -c/--concurrency 100
  - Optional. The concurrency value (defaults to 100).
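As an illustration of how the two flags above could be handled, here is a minimal sketch using argparse and an asyncio.Semaphore. It is not taken from crawler.py: the function names `parse_args` and `run_bounded` are hypothetical, and treating the concurrency value as a cap on the number of tasks in flight at once is an assumption about what the flag controls.

```python
import argparse
import asyncio


def parse_args(argv=None):
    """Parse the -u/--url and -c/--concurrency flags described above."""
    parser = argparse.ArgumentParser(description="Concurrent web scraper")
    parser.add_argument("-u", "--url", required=True,
                        help="base URL to crawl (required)")
    parser.add_argument("-c", "--concurrency", type=int, default=100,
                        help="concurrency value (defaults to 100)")
    return parser.parse_args(argv)


async def run_bounded(items, worker, limit):
    """Run worker(item) for every item, with at most `limit` running at once.

    Assumption: this is one common way to apply a concurrency limit;
    the real crawler may do it differently.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await worker(item)

    # gather() returns results in the same order as the input items.
    return await asyncio.gather(*(guarded(item) for item in items))
```

With this sketch, `parse_args(["-u", "https://url.com"])` yields a concurrency of 100, and passing `-c 10` overrides it. The `worker` argument stands in for whatever coroutine fetches and parses a single page.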
Results
The resulting sitemap is written to the root of this directory as sitemap.html.
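The exact layout of sitemap.html is not described here. Purely as a sketch, assuming the sitemap is a flat, de-duplicated list of crawled URLs rendered as links, the output step might look like the following. The names `render_sitemap` and `write_sitemap` are hypothetical and the markup is an assumption, not the crawler's actual output.

```python
import html


def render_sitemap(urls):
    """Return an HTML page listing each unique URL once, in sorted order."""
    items = "\n".join(
        # Escape the URL so characters such as & and " cannot break the markup.
        '    <li><a href="{0}">{0}</a></li>'.format(html.escape(url, quote=True))
        for url in sorted(set(urls))
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head><title>Sitemap</title></head>\n<body>\n"
        "  <ul>\n" + items + "\n  </ul>\n"
        "</body>\n</html>\n"
    )


def write_sitemap(urls, path="sitemap.html"):
    """Write the rendered sitemap to `path` (relative to the working directory)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_sitemap(urls))
```

Keeping rendering separate from writing makes the markup easy to check without touching the filesystem.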