diff --git a/notes.md b/notes.md
index 101af1d..071c64b 100644
--- a/notes.md
+++ b/notes.md
@@ -18,39 +18,9 @@
 * better exception handling
 * randomise output filename
 
-### Async bits
+### talking points
 
-in `__main__`:
-
-```python
-loop = asyncio.get_event_loop()
-try:
-    loop.run_until_complete(main())
-finally:
-    loop.close()
-```
-
- * initialises loop and runs it to completion
- * needs to handle errors (try/except/finally)
-
-```python
-async def run(args=None):
-    tasks = []
-
-    for url in pool:
-        tasks.append(url)
-    # for i in range(10):
-    #     tasks.append(asyncio.ensure_future(myCoroutine(i)))
-
-    # gather completed tasks
-    await asyncio.gather(*tasks)
-```
-
-Getting the contents of the page needs to be async too
-
-```python
-async def get_source():
-    blah
-    blah
-    await urlopen(url)
-```
\ No newline at end of file
+ - token bucket algo to enforce n requests per second
+ - read up on bucket algo types
+ - re-structuring AsyncCrawler to be more testable
+ - use exponential backoff algo?
\ No newline at end of file
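
The "token bucket algo to enforce n requests per second" talking point could be sketched roughly as below. This is only an illustration, not code from the repository: the `TokenBucket` class, its `rate`, `capacity` and `clock` parameters, and the `try_acquire`/`acquire` methods are all hypothetical names. The clock is injectable on the assumption that this helps with the testability goal also listed in the notes.

```python
import asyncio
import time


class TokenBucket:
    """Hypothetical limiter: about `rate` acquisitions per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens the bucket can hold
        self.tokens = float(capacity)    # start full so an initial burst is allowed
        self.clock = clock               # injectable so tests can control time
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return True on success, False otherwise."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    async def acquire(self):
        """Wait (without blocking the event loop) until a token can be taken."""
        while not self.try_acquire():
            # Sleep roughly until the missing fraction of a token has accumulated.
            await asyncio.sleep((1 - self.tokens) / self.rate)
```

Under these assumptions, a crawler coroutine would call `await bucket.acquire()` immediately before each request, with one shared bucket per host or per crawl.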