Python Concurrency with asyncio

A week ago, I was working on a project that involved calling a REST API endpoint 32 million times to retrieve a certain type of document. The input to the API was a presigned URL that was valid for only a few days, so I did not have the luxury of doing things sequentially. A rough calculation of the time a simple for loop would take made me realize that the task was a nice little use case for parallelization. That’s when I started looking at asyncio. In my first attempt at the task, I went with the standard approach of using multithreading in Python. However, there was always an itch to see if I could get better performance by combining asyncio and multithreading. The book “Python Concurrency with asyncio” by Matthew Fowler helped me understand the basics of concurrent and parallel computing with asyncio. Subsequently, I went back and redid the task of calling the API 32 million times to retrieve 32 million JSON documents, this time using asyncio and multithreading. In this post, I will summarize a few chapters that I found useful for getting my work done.
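
The rough calculation mentioned above can be sketched in a few lines of Python. The 32 million figure comes from the task itself; the per-request latency and the number of requests in flight are purely hypothetical values chosen for illustration, since the actual numbers are not stated here. The function names are also made up for this sketch.

```python
SECONDS_PER_DAY = 86_400


def sequential_days(num_calls: int, seconds_per_call: float) -> float:
    """Days needed if every call waits for the previous one to finish."""
    return num_calls * seconds_per_call / SECONDS_PER_DAY


def concurrent_days(num_calls: int, seconds_per_call: float, in_flight: int) -> float:
    """Idealized estimate with `in_flight` requests always running at once.

    Ignores rate limits, retries, and client-side overhead.
    """
    return sequential_days(num_calls, seconds_per_call) / in_flight


NUM_CALLS = 32_000_000      # from the task description
ASSUMED_LATENCY = 0.2       # seconds per request (assumption, not measured)
ASSUMED_IN_FLIGHT = 100     # concurrent requests (assumption)

print(f"sequential: {sequential_days(NUM_CALLS, ASSUMED_LATENCY):.1f} days")
print(f"concurrent: {concurrent_days(NUM_CALLS, ASSUMED_LATENCY, ASSUMED_IN_FLIGHT):.2f} days")
```

Under these assumed numbers, a plain for loop would need roughly 74 days, far longer than a URL that expires in a few days allows, while keeping 100 requests in flight would bring the idealized estimate to under a day.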