Gunicorn Worker Class
Gunicorn has a worker_class setting. Some possible values are:
sync
gthread
gevent
Definitions from Luis Sena’s nice blog
sync
This is the default worker class. Each process will handle 1 request at a time and you can use the parameter -w to set workers.
The recommendation for the number of workers is 2–4 x $(NUM_CORES), although it will depend on how your application works.
gthread
If you use gthread, Gunicorn will allow each worker to have multiple threads. In this case, the Python application is loaded once per worker, and each of the threads spawned by the same worker shares the same memory space.
Those threads will be at the mercy of the GIL, but it’s still useful for when you have some I/O blocking happening. It will allow you to handle more concurrency without increasing your memory too much.
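As a rough illustration of why threads still help with blocking I/O despite the GIL, here is a minimal standard-library sketch. It is not Gunicorn code: `fake_io_request` and `IO_WAIT_SECONDS` are hypothetical names, and `time.sleep` stands in for a blocking network or database call, since it releases the GIL while waiting.

```python
import time
from concurrent.futures import ThreadPoolExecutor

IO_WAIT_SECONDS = 0.2  # assumed latency of one blocking I/O call


def fake_io_request(request_id: int) -> int:
    """Hypothetical handler: waits on simulated I/O, then returns its id."""
    time.sleep(IO_WAIT_SECONDS)  # the GIL is released while sleeping
    return request_id


def handle_with_threads(num_requests: int, num_threads: int) -> tuple:
    """Run the handlers on a thread pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(fake_io_request, range(num_requests)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = handle_with_threads(num_requests=4, num_threads=4)
    print(f"handled {len(results)} requests in {elapsed:.2f}s")
```

With four threads the four waits overlap, so the total time should be close to one wait rather than four. If the handler were CPU-bound instead, the GIL would serialize the threads and the gain would disappear.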
gevent
Eventlet and gevent make use of “green threads” or “pseudo threads” and are based on greenlet.
In practice, if your application work is mainly I/O bound, it will allow it to scale to potentially thousands of concurrent requests on a single process.
Even with the rise of async frameworks (fastapi, sanic, etc), this is still relevant today since it allows you to optimize for I/O without having the extra code complexity.
The way they manage to do it is by “monkey patching” your code, mainly replacing blocking parts with compatible cooperative counterparts from gevent package.
It uses epoll or kqueue or libevent for highly scalable non-blocking I/O. Coroutines ensure that the developer uses a blocking style of programming that is similar to threading, but provide the benefits of non-blocking I/O.
This is usually the most efficient way to run your django/flask/etc web application, since most of the time the bulk of the latency comes from I/O related work.
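To make the "monkey patching" idea concrete, here is a small standard-library sketch of the mechanism only. It is not how gevent is implemented: in a real application the patching is done by `gevent.monkey.patch_all()`, and the replacements yield to gevent's event loop rather than recording calls. `library_code` and `cooperative_sleep` are hypothetical names used for illustration.

```python
import time


def library_code() -> str:
    """Stands in for third-party code written in a blocking style."""
    time.sleep(60)  # would block for a minute with the real time.sleep
    return "done"


calls = []


def cooperative_sleep(seconds: float) -> None:
    """Hypothetical replacement: records the call instead of blocking."""
    calls.append(seconds)


original_sleep = time.sleep
time.sleep = cooperative_sleep  # the monkey patch: swap the attribute at runtime
try:
    result = library_code()  # unchanged code now picks up the replacement
finally:
    time.sleep = original_sleep  # always restore the original function
```

The point is that `library_code` is never edited: it looks up `time.sleep` at call time and gets whatever is installed there. This is also why patching has to happen before other modules grab their own references to the blocking functions.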
Workers Value
When using gevent workers, the thread count is not configured. The documentation for the threads setting says that threads are only relevant for gthread workers. So with gevent we only have workers, and each worker is a separate operating system process as far as I know. By the rule of thumb above, the worker count would be 8 to 16 for a 4-core machine.
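The arithmetic and the settings involved can be sketched as a config fragment. Gunicorn config files are plain Python, and `worker_class`, `workers`, and `worker_connections` are real Gunicorn settings; `NUM_CORES`, `min_workers`, `max_workers`, and `max_concurrent_clients` are helper names introduced here only for illustration, and the choice of the lower bound for `workers` is an assumption.

```python
# gunicorn.conf.py (sketch)
NUM_CORES = 4  # assumption: the 4-core machine from the question

# The "2 to 4 x cores" rule of thumb quoted above
min_workers = 2 * NUM_CORES
max_workers = 4 * NUM_CORES

worker_class = "gevent"
workers = min_workers       # each worker is a separate OS process
worker_connections = 1000   # max simultaneous clients per worker (Gunicorn's default)

# Rough upper bound on clients being served at once across all workers
max_concurrent_clients = workers * worker_connections
```

The rough command-line equivalent would be `gunicorn -k gevent -w 8 --worker-connections 1000 app:app`, where `app:app` is a placeholder for the actual application module. Note that there is no thread count anywhere: with gevent the per-worker concurrency limit is `worker_connections`, not `threads`.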
Where does the performance gain come from
So are there really no threads being executed? If there are no threads, how do gevent workers gain performance? There should be some pseudo threads that are executed concurrently: while one pseudo thread is waiting on I/O, that should be detected and another pseudo thread should run. So where is this other pseudo thread? Is it in the other worker processes, or does Gunicorn create pseudo threads within a worker process?
Let's break down Gunicorn worker classes, focusing on the `gevent` worker and where its performance gains come from.

**Understanding the Core Concepts**

* **Processes vs. Threads:**
    * **Processes:** A process is an independent execution environment with its own memory space. Each Gunicorn worker, regardless of the `worker_class`, is a separate process. This provides isolation: if one worker crashes, it ideally doesn't take down the entire application.
    * **Threads:** Threads exist *within* a process and share the same memory space. This makes them lighter weight than processes but also introduces the risk of thread-safety issues (race conditions, etc.) if not managed carefully.
* **Concurrency vs. Parallelism:**
    * **Concurrency:** Making progress on multiple tasks over the same period, without necessarily executing them at the same instant. Think of a single chef juggling multiple orders.
    * **Parallelism:** Executing multiple tasks *simultaneously*. Think of multiple chefs each working on a different order. Parallelism requires multiple CPU cores.
* **I/O Bound vs. CPU Bound:**
    * **I/O Bound:** The application spends most of its time waiting for I/O operations (database queries, network requests, file system access). Web applications are typically I/O bound.
    * **CPU Bound:** The application spends most of its time performing calculations.

**Gunicorn Worker Classes: A Recap**

* **`sync`:** The simplest. Each worker handles one request at a time. A reasonable fit for CPU-bound work, since each worker is its own process with its own interpreter and GIL.
* **`gthread`:** Each worker process spawns multiple threads. The threads share the same memory space within the process. Useful for I/O-bound applications, but Python code in those threads is still serialized by the GIL.
* **`gevent`:** Uses greenlets (lightweight "pseudo-threads") within each *worker* process. Heavily optimized for I/O-bound applications.

**The `gevent` Worker: Where the Performance Gains Come From**

You are correct that the `gevent` worker doesn't use operating-system-level threads configured via the `--threads` setting. The performance gains come from the following:

1. **Greenlets (Cooperative Multitasking):**
    * `gevent` (and Eventlet) uses greenlets, which are lightweight, user-space threads. These are *not* the same as OS-level threads. Think of them as coroutines or cooperative multitasking.
    * A greenlet runs until it explicitly yields control (e.g., when it's waiting for I/O).

2. **Monkey Patching and Non-Blocking I/O:**
    * `gevent` *monkey patches* standard library functions (e.g., socket operations) to make them non-blocking and cooperative. Monkey patching means dynamically modifying existing code at runtime. This can be a bit risky, so be sure you understand the implications and test thoroughly.
    * Instead of blocking while waiting for a network connection, `gevent`'s patched `socket` library will yield control to another greenlet.

3. **Event Loop:**
    * `gevent` uses an event loop (based on `libev` or `libuv`) to monitor file descriptors such as sockets for I/O readiness.
    * The event loop efficiently manages which greenlet is ready to run next. When a greenlet's I/O operation completes, the event loop schedules it to resume.

**Putting It All Together**

Imagine a `gevent` worker handling multiple web requests. Here's a simplified view:

1. The worker receives a new request.
2. It creates a new greenlet to handle that request.
3. The greenlet starts executing, making calls to, say, a database.
4. Thanks to monkey patching, the database call does not block the worker (assuming the database driver does its I/O through the patched `socket` module). The greenlet yields control back to the `gevent` event loop.
5. The event loop looks for other greenlets that are ready to run (whose I/O operations have completed).
6. It finds another greenlet (perhaps one waiting for a different database query) and resumes its execution.
7. This continues until the first greenlet's database query completes. The event loop then schedules it to resume processing the original request.

**Key Points:**

* **Concurrency within a Worker:** `gevent` allows *concurrent* handling of multiple requests *within a single worker process*. It achieves this through cooperative multitasking and non-blocking I/O.
* **No OS Threads (Typically):** The `gevent` worker generally doesn't use the OS-level threads that are configured with `--threads`. The concurrency comes from greenlets.
* **Workers Still Matter:** You still use the `-w` flag to specify the number of Gunicorn *worker processes*. Each worker process then has its own `gevent` event loop and set of greenlets. More workers allow you to better utilize multiple CPU cores, but each worker also consumes memory.
* **Not True Parallelism:** All greenlets in a worker run on a single OS thread, so only one greenlet executes Python code at a time *within a given worker process*. A greenlet doing long CPU-bound work without yielding stalls every other request in that worker. Because most request time is spent waiting on I/O, this limitation is often less significant than it would be with a `sync` worker, and parallelism across cores comes from running several worker processes.
* **The Sweet Spot:** The `gevent` worker excels at handling a large number of concurrent I/O-bound requests efficiently.

**Answers to your specific questions:**

* "Are there really no threads being executed?" Correct, the `gevent` worker class typically doesn't rely on OS-level threads created via the `--threads` option. It uses greenlets.
* "If there are no threads, how do gevent workers gain performance? There should be some pseudo threads that are executed concurrently." Exactly! The pseudo-threads *are* the greenlets.
* "While one pseudo thread is waiting on I/O, that should be detected and another pseudo thread should run. So where is this other pseudo thread? Is it in the other worker processes, or does Gunicorn create pseudo threads within a worker process?" Gunicorn creates pseudo threads (greenlets) within *each* worker process. The `gevent` event loop within each worker manages these greenlets. The other *worker processes* are separate processes and don't directly interact in terms of greenlet scheduling. They are, however, handling other incoming requests.

**Recommendation**

For most I/O-bound web applications, the `gevent` worker is often a good choice. It provides high concurrency without the per-thread overhead of OS threads. You can experiment with the number of workers to find the optimal balance between resource utilization and performance.

Before deploying with gevent, make sure to test your application thoroughly, especially if it uses libraries that might not be fully compatible with monkey patching. Using an async framework is also an option, at the cost of extra code complexity.
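The request flow described above can be sketched with a toy scheduler. This is only an analogy built from Python generators and a logical clock, not gevent's actual implementation: `handle_request` and `run_event_loop` are hypothetical names, each generator plays the role of a greenlet, and `yield` marks the point where a patched I/O call would hand control back to the loop.

```python
from collections import deque


def handle_request(name, io_ticks, log):
    """Toy 'greenlet': blocking-style handler that yields while waiting on I/O."""
    log.append(f"{name}: start")
    yield io_ticks  # hand control back to the loop until the fake I/O finishes
    log.append(f"{name}: done")


def run_event_loop(handlers):
    """Toy single-threaded loop: runs whichever handler is ready, one at a time."""
    ready = deque(handlers)
    waiting = []  # (wake_tick, handler) pairs for handlers blocked on I/O
    tick = 0
    while ready or waiting:
        # Move handlers whose simulated I/O has completed back to the ready queue
        still_waiting = []
        for wake_tick, handler in waiting:
            if wake_tick <= tick:
                ready.append(handler)
            else:
                still_waiting.append((wake_tick, handler))
        waiting = still_waiting

        while ready:
            handler = ready.popleft()
            try:
                io_ticks = next(handler)  # run until the handler yields for I/O
                waiting.append((tick + io_ticks, handler))
            except StopIteration:
                pass  # the handler finished its request
        tick += 1
    return tick


log = []
run_event_loop([
    handle_request("A", io_ticks=3, log=log),
    handle_request("B", io_ticks=1, log=log),
])
print(log)  # ['A: start', 'B: start', 'B: done', 'A: done']
```

Request B starts and finishes while A is still waiting on its slower I/O, all on one thread in one process. That overlap of waiting time is where the gain comes from; nothing in the loop ever runs two handlers at the same instant.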