Python provides three concurrency models: threads (I/O-bound, GIL limits CPU parallelism), multiprocessing (true CPU parallelism, separate memory spaces), and asyncio (cooperative single-threaded concurrency via an event loop). Choosing the right model — and knowing why the GIL exists — is fundamental to Python performance.

Key Points

  • GIL (Global Interpreter Lock): CPython mutex that ensures only one thread executes Python bytecode at a time — threads don't achieve CPU parallelism for pure Python code
  • Threads (threading module): useful for I/O-bound work — GIL is released during I/O; use ThreadPoolExecutor for pool management
  • Multiprocessing: spawns separate Python processes each with their own GIL — true CPU parallelism; use ProcessPoolExecutor
  • asyncio: single-threaded event loop with cooperative multitasking; a coroutine runs until it awaits pending I/O, then the loop switches to another ready coroutine
  • async/await: async def defines a coroutine; await suspends and yields control to the event loop
  • asyncio.gather(): run multiple coroutines concurrently — like Promise.all() in JavaScript
  • asyncio.create_task(): schedule coroutine without waiting — background task
  • aiohttp, httpx, asyncpg, aiofiles: async-native libraries for HTTP, PostgreSQL, files
  • Mixing sync and async: loop.run_in_executor() runs blocking code in a thread pool without blocking the event loop

Model              | Use case                                     | GIL                | Memory   | Complexity
threading          | I/O-bound (HTTP, DB, files)                  | Limited by GIL     | Shared   | Low
multiprocessing    | CPU-bound (compute, ML)                      | Bypassed           | Separate | Medium
asyncio            | High-concurrency I/O (1000s of connections)  | Single-threaded    | Shared   | Medium
concurrent.futures | Pool abstraction for threads/processes       | Same as underlying | Both     | Low
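
The cooperative scheduling described in the key points can be observed with the standard library alone. The following is a minimal sketch, not a benchmark: `worker`, `main`, and the `log` list are hypothetical names chosen for illustration, and `asyncio.sleep(0)` stands in for a real I/O wait.

```python
import asyncio


async def worker(name, steps, log):
    """Append one entry per step, yielding to the event loop after each."""
    for i in range(steps):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)  # stand-in for an I/O wait; lets other tasks run


async def main():
    log = []
    # create_task() schedules the coroutine right away; it gets its first
    # turn the next time main() awaits something.
    background = asyncio.create_task(worker("b", 2, log))
    log.append("main-start")  # recorded before the background task runs
    # gather() waits for both; the workers take turns at each await.
    await asyncio.gather(worker("a", 2, log), background)
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Entries from `a` and `b` interleave in the log because each `await` hands control back to the event loop; nothing here runs in parallel.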

Python concurrency: asyncio gather, run_in_executor for sync code, ThreadPoolExecutor, ProcessPoolExecutor

import asyncio

import aiohttp  # third-party: pip install aiohttp
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

# asyncio — fetch many URLs concurrently
async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def fetch_all(urls):
    async with aiohttp.ClientSession() as session:
        tasks = [asyncio.create_task(fetch(session, url)) for url in urls]
        return await asyncio.gather(*tasks)   # concurrent, not parallel

# urls is assumed to be a list of URL strings defined elsewhere
results = asyncio.run(fetch_all(urls))       # entry point

# Mixing sync and async — run blocking code without blocking event loop
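async def process(path):
    # get_running_loop() is the preferred call inside a coroutine
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool;
    # read_large_file is a blocking function defined elsewhere
    data = await loop.run_in_executor(None, read_large_file, path)
    return data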

# ThreadPoolExecutor — I/O-bound parallelism
def download(url): ...

with ThreadPoolExecutor(max_workers=10) as pool:
    futures = [pool.submit(download, url) for url in urls]
    results = [f.result() for f in futures]

# ProcessPoolExecutor — CPU-bound parallelism
import math
def compute_primes(limit): ...

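if __name__ == "__main__":   # guard needed where workers start via spawn (Windows, macOS)
    with ProcessPoolExecutor() as pool:
        # one task per limit, distributed across worker processes
        results = list(pool.map(compute_primes, [10**6, 10**7, 10**8]))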

# Async context manager + generator
async def stream_rows(conn, query):
    async with conn.transaction():
        async for row in conn.cursor(query):
            yield row                         # async generator
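
For a version of the run_in_executor pattern that runs without files or third-party packages, the sketch below may help. It rests on assumptions: `blocking_read` is a hypothetical function in which `time.sleep` stands in for slow synchronous I/O, and `heartbeat` is a hypothetical coroutine that counts how often the event loop gets a turn while the blocking call is in progress.

```python
import asyncio
import time


def blocking_read(delay):
    """Hypothetical blocking function; time.sleep stands in for slow sync I/O."""
    time.sleep(delay)
    return "payload"


async def heartbeat(stop, interval=0.01):
    """Count wake-ups to show the event loop stays responsive."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(interval)
    return ticks


async def main(delay=0.2):
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(stop))
    # None selects the loop's default ThreadPoolExecutor.
    data = await loop.run_in_executor(None, blocking_read, delay)
    stop.set()          # tell the heartbeat to finish
    ticks = await beat  # collect how many times it ran
    return data, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If `blocking_read` were called directly inside `main`, the heartbeat would not advance until the call returned; handing it to the executor keeps the loop free to run other coroutines.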

Real-World Example

A FastAPI server with async def endpoints can keep 10,000+ concurrent requests in flight on a single process because each await yields control to the event loop while the handler waits for the database or an HTTP call. The same logic written synchronously and served from a thread pool is limited by the thread count (typically 200–500). Python 3.13 introduced an optional, experimental free-threaded build in which the GIL can be disabled (PEP 703); the default build still has the GIL, and asyncio remains the idiomatic choice for I/O-bound servers.
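
The single-process behaviour can be approximated without FastAPI. The sketch below is a simulation under stated assumptions, not a server: `handle_request` and `serve_many` are hypothetical names, and `asyncio.sleep` stands in for the database or HTTP wait.

```python
import asyncio
import time


async def handle_request(request_id, latency):
    """Hypothetical handler; asyncio.sleep stands in for a database or HTTP wait."""
    await asyncio.sleep(latency)
    return request_id


async def serve_many(n, latency=0.1):
    """Run n simulated requests concurrently on one thread and time them."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(handle_request(i, latency) for i in range(n))
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(serve_many(10_000))
    print(f"{len(results)} requests in {elapsed:.2f}s")
```

Handled one after another, 10,000 waits of 0.1 s would take about 1,000 s; run concurrently on the event loop they finish in roughly the time of one wait plus scheduling overhead.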