Scaling Python and Jupyter with ZeroMQ
What if you wanted this async ability, plus many more message exchange patterns like pub/sub, but wanted to do zero of that server work? Then you should check out ZeroMQ.
ZeroMQ is to queuing what Flask is to web apps: a powerful and simple framework for building just what you need. You're almost certain to learn some new networking patterns and capabilities in this episode, where our guest Min Ragan-Kelley discusses using ZeroMQ from Python as well as how ZeroMQ is central to the internals of Jupyter Notebooks.
Episode Deep Dive
Guest introduction and background
Min Ragan-Kelley is a seasoned contributor to the Jupyter and IPython ecosystem with over a decade of experience. He started out in physics, doing computational simulations of plasmas, and eventually discovered Python’s flexibility for scientific work. That path led him deep into IPython (now Jupyter) development where he helped build advanced features such as IPython Parallel. Min’s current focus includes maintaining PyZMQ (the Python bindings for ZeroMQ) and working on tools that power distributed and interactive computing in Python.
What to Know If You're New to Python
If you’re new to Python but curious about the concepts in this episode, here are some ideas and resources to help you get the most out of the discussion:
- Understand Basic Python Scripting: You’ll want to be comfortable writing and running simple Python scripts before jumping into distributed or asynchronous code.
- Familiarity with Package Installation (pip / venv): ZeroMQ and PyZMQ are external libraries, so practice installing packages in a virtual environment.
- High-level Grasp of Concurrency: Even knowing basic async or threading will help you follow the messaging topics in ZeroMQ.
- Be Aware of Data Structures: ZeroMQ uses messages (rather than raw socket streams), so a little knowledge of Python lists, dictionaries, and byte arrays can be valuable.
Key points and takeaways
- ZeroMQ as a Central Player in Scalable, Asynchronous Messaging
ZeroMQ is a C++ library that abstracts away low-level networking details and focuses on messaging patterns rather than raw TCP/UDP connections. By using sockets with defined patterns (e.g., publish-subscribe or request-reply), developers can build robust, concurrent applications without setting up a separate broker service. This removes much of the overhead and complexity found in typical async solutions like Celery paired with Redis or RabbitMQ. Moreover, ZeroMQ is highly portable across many languages, making it well-suited for polyglot or microservice architectures.
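To make the brokerless request-reply idea concrete, here is a minimal sketch using PyZMQ. It assumes `pyzmq` is installed; the endpoint name and the function names are made up for illustration, and both sides run in one process over the `inproc` transport only to keep the example self-contained. In a real deployment the two sockets would usually live in separate processes and use `tcp://` endpoints.

```python
import threading

import zmq

ENDPOINT = "inproc://echo-demo"  # hypothetical endpoint name


def echo_server(ctx, ready):
    """Answer exactly one request, then shut down."""
    sock = ctx.socket(zmq.REP)
    sock.bind(ENDPOINT)
    ready.set()  # let the client know the endpoint exists
    request = sock.recv()
    sock.send(b"echo: " + request)
    sock.close()


def request_once(payload):
    """Send one request to the echo server and return its reply."""
    ctx = zmq.Context()  # a context may be shared by threads; sockets may not
    ready = threading.Event()
    server = threading.Thread(target=echo_server, args=(ctx, ready))
    server.start()
    ready.wait()

    client = ctx.socket(zmq.REQ)
    client.connect(ENDPOINT)
    client.send(payload)
    reply = client.recv()

    client.close()
    server.join()
    ctx.term()
    return reply


if __name__ == "__main__":
    print(request_once(b"hello"))
```

Note that no broker process appears anywhere: one side binds, the other connects, and the library handles the connection details.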
- How Jupyter Relies on ZeroMQ
Jupyter notebooks use ZeroMQ sockets at their core to communicate between the “client” (the notebook interface) and the “kernel” (where code runs). For instance, they employ publish-subscribe channels for sending real-time output to all connected clients, and request-reply channels for executing code and returning results. This design allows multiple clients to connect to the same kernel without the kernel itself needing to handle each peer individually.
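As a rough illustration of those channels: a running kernel describes its sockets in a small JSON "connection file". The sketch below turns such a file into ZeroMQ endpoint strings. The field names follow the Jupyter connection-file format as documented by jupyter_client; the port numbers, the key, and the helper name `channel_endpoints` are invented for this example, and only the `tcp` transport is handled.

```python
import json

# Made-up sample values; a real file is written by the kernel launcher.
SAMPLE_CONNECTION_FILE = """
{
  "transport": "tcp",
  "ip": "127.0.0.1",
  "shell_port": 53794,
  "iopub_port": 53795,
  "stdin_port": 53796,
  "control_port": 53797,
  "hb_port": 53798,
  "signature_scheme": "hmac-sha256",
  "key": "not-a-real-key"
}
"""

# Channel names used in the connection file.
# shell/control carry requests and replies, iopub broadcasts output,
# stdin handles input prompts, hb is a heartbeat.
CHANNELS = ("shell", "iopub", "stdin", "control", "hb")


def channel_endpoints(info):
    """Map each channel name to the endpoint a client would connect to."""
    if info["transport"] != "tcp":
        raise ValueError("this sketch only handles the tcp transport")
    base = f"tcp://{info['ip']}"
    return {name: f"{base}:{info[name + '_port']}" for name in CHANNELS}


if __name__ == "__main__":
    for name, endpoint in channel_endpoints(json.loads(SAMPLE_CONNECTION_FILE)).items():
        print(f"{name:8s} {endpoint}")
```

A client would connect a SUB socket to the `iopub` endpoint to watch output and a DEALER socket to the `shell` endpoint to submit code; in practice the jupyter_client package does this for you.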
- PyZMQ: Pythonic Bindings for ZeroMQ
PyZMQ is the Python library that provides the bridge between ZeroMQ’s C/C++ code and Python. It wraps native functionality—like atomic message sending, non-blocking IO, and multiple socket types—while staying relatively simple to integrate. PyZMQ is foundational to Jupyter but is equally relevant for custom microservices or distributed Python solutions.
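One way to see the non-blocking side of PyZMQ is `zmq.Poller`, which reports which sockets have messages waiting so a program never has to block on `recv`. The following is a small sketch, assuming `pyzmq` is installed; the endpoint name and function name are hypothetical, and both sockets share one process purely for demonstration.

```python
import zmq


def poll_then_receive():
    """Poll a socket before and after a message is sent to it."""
    ctx = zmq.Context()
    sender = ctx.socket(zmq.PAIR)
    sender.bind("inproc://poll-demo")  # hypothetical endpoint name
    receiver = ctx.socket(zmq.PAIR)
    receiver.connect("inproc://poll-demo")

    poller = zmq.Poller()
    poller.register(receiver, zmq.POLLIN)

    # Nothing has been sent yet, so this returns after 50 ms with no events.
    idle_events = dict(poller.poll(timeout=50))

    sender.send(b"ping")

    # Now the receiver is reported as readable.
    ready_events = dict(poller.poll(timeout=1000))
    message = receiver.recv(zmq.NOBLOCK) if receiver in ready_events else None

    sender.close()
    receiver.close()
    ctx.term()
    return len(idle_events), message


if __name__ == "__main__":
    print(poll_then_receive())
```

The same poller can watch many sockets at once, which is how a single thread can service several channels.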
- Understanding and Choosing Socket Patterns
ZeroMQ offers several core patterns, each corresponding to a specific socket type. The main ones mentioned are:
- Publish-Subscribe (Pub-Sub) for broadcasting messages to many receivers.
- Push-Pull (Ventilator-Worker) for work queues distributing tasks across multiple workers.
- Request-Reply (Dealer-Router) for classic client-server interactions (but more flexible than raw HTTP).
Developers can mix and match these patterns without rewriting large swaths of code; only the socket pattern and who “binds” versus “connects” typically change.
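As an illustration of the publish-subscribe pattern in particular, here is a sketch with one publisher and two subscribers that filter by topic prefix. It assumes `pyzmq` is installed. The endpoint, topics, and function name are invented. An XPUB socket is used instead of a plain PUB socket only so the demo can wait until each subscription has arrived; otherwise early messages could be dropped before the subscribers are fully joined.

```python
import zmq

ENDPOINT = "inproc://pubsub-demo"  # hypothetical endpoint name


def broadcast(messages, prefixes=(b"weather", b"sports")):
    """Publish messages and return what each prefix subscriber received.

    The prefixes must be distinct: by default XPUB reports only new,
    unique subscriptions, and the loop below waits for one report each.
    """
    ctx = zmq.Context()
    pub = ctx.socket(zmq.XPUB)
    pub.bind(ENDPOINT)

    subscribers = {}
    for prefix in prefixes:
        sub = ctx.socket(zmq.SUB)
        sub.connect(ENDPOINT)
        sub.setsockopt(zmq.SUBSCRIBE, prefix)  # filter on message prefix
        pub.recv()  # block until this subscription reaches the publisher
        subscribers[prefix] = sub

    for message in messages:
        pub.send(message)  # messages nobody subscribed to are discarded

    received = {}
    for prefix, sub in subscribers.items():
        got = []
        while sub.poll(timeout=100):  # stop once nothing arrives for 100 ms
            got.append(sub.recv())
        received[prefix] = got
        sub.close()

    pub.close()
    ctx.term()
    return received


if __name__ == "__main__":
    print(broadcast([b"weather sunny", b"sports 2-1", b"news nothing"]))
```

Swapping to another pattern would mostly mean changing the socket types (for example PUSH and PULL) while the bind and connect calls stay the same shape.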
- Async vs. Threading vs. Multiprocessing vs. Messaging
Many people default to Python’s threading or multiprocessing libraries for parallel or asynchronous tasks, but message-based frameworks like ZeroMQ can often be more scalable and simpler to reason about. Instead of shared-state concurrency, you have explicit message-passing, which can reduce locking headaches. ZeroMQ’s asynchronous sending/receiving also interacts nicely with Python’s asyncio or frameworks such as Tornado.
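For the asyncio integration, PyZMQ ships a `zmq.asyncio` module whose sockets return awaitables from `send` and `recv`. Below is a small sketch, assuming `pyzmq` is installed; the endpoint and coroutine names are hypothetical. A PUSH socket feeds numbers to a PULL socket, and the consumer runs as an ordinary asyncio task with no threads or locks involved.

```python
import asyncio

import zmq
import zmq.asyncio


async def consume(sock, count):
    """Receive `count` integer messages and return their sum."""
    total = 0
    for _ in range(count):
        message = await sock.recv()  # yields to the event loop while waiting
        total += int(message)
    return total


async def sum_over_sockets(numbers):
    """Send numbers through a PUSH/PULL pair and sum them on the far side."""
    ctx = zmq.asyncio.Context()
    pull = ctx.socket(zmq.PULL)
    pull.bind("inproc://async-demo")  # hypothetical endpoint name
    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://async-demo")

    consumer = asyncio.create_task(consume(pull, len(numbers)))
    for number in numbers:
        await push.send(str(number).encode("ascii"))
    total = await consumer

    push.close()
    pull.close()
    ctx.term()
    return total


if __name__ == "__main__":
    print(asyncio.run(sum_over_sockets([1, 2, 3])))
```

State is passed only through messages here, which is the "explicit message-passing" contrast to sharing variables between threads.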
- Serialization and Zero-Copy Transfers
Large data transfers, such as NumPy arrays, can be sent efficiently with ZeroMQ’s zero-copy features. This means you avoid extra memory copying in Python—great for real-time or high-throughput applications. The Jupyter protocol itself often exploits multi-part messages, sending text (JSON) plus binary buffers in separate frames.
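The JSON-plus-binary layout can be sketched with a two-frame multipart message: a small JSON header followed by the raw buffer. This example assumes `pyzmq` is installed and uses the standard library's `array` module instead of NumPy so it has no extra dependency; the header fields, endpoint, and function names are invented and are not the Jupyter wire format. Passing `copy=False` asks PyZMQ to avoid copying the buffer, although PyZMQ may still copy small messages where that is cheaper than tracking the buffer.

```python
import array
import json

import zmq


def send_array(sock, values):
    """Send an array.array as [JSON header, raw bytes]."""
    header = json.dumps({"typecode": values.typecode}).encode("utf-8")
    payload = memoryview(values).cast("B")  # byte view, no copy made here
    sock.send_multipart([header, payload], copy=False)


def recv_array(sock):
    """Receive a message produced by send_array and rebuild the array."""
    header_frame, data_frame = sock.recv_multipart(copy=False)
    header = json.loads(header_frame.bytes)
    result = array.array(header["typecode"])
    # This step copies once into the new array. With NumPy, something like
    # numpy.frombuffer(data_frame.buffer, ...) could wrap the bytes instead.
    result.frombytes(data_frame.buffer)
    return result


def roundtrip(values):
    """Send values through an in-process socket pair and return the result."""
    ctx = zmq.Context()
    left = ctx.socket(zmq.PAIR)
    left.bind("inproc://buffers-demo")  # hypothetical endpoint name
    right = ctx.socket(zmq.PAIR)
    right.connect("inproc://buffers-demo")

    send_array(left, values)
    result = recv_array(right)

    left.close()
    right.close()
    ctx.term()
    return result


if __name__ == "__main__":
    print(roundtrip(array.array("d", [1.0, 2.5, 3.0])))
```

Keeping metadata in a text frame and bulk data in its own frame means the large buffer never has to be encoded into JSON.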
- IPython Parallel and ZeroMQ
An extension of the IPython project, IPython Parallel uses ZeroMQ to manage distributed or parallel computation across many kernels (called “engines”). Rather than manually handling cluster management, IPython Parallel abstracts the complexity through ZeroMQ’s identity-based message routing. It’s a powerful tool for scientific workloads that need to scale code across multiple machines.
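Identity-based routing can be shown at the plain ZeroMQ level with ROUTER and DEALER sockets. This is only a sketch of the underlying mechanism, not IPython Parallel's actual code; it assumes `pyzmq` is installed, and the engine names, endpoint, and function name are made up. Each "engine" is a DEALER socket with an explicit identity, and the ROUTER addresses a message to one engine by putting that identity in the first frame.

```python
import zmq

ENGINE_NAMES = (b"engine-0", b"engine-1")  # hypothetical identities


def route_task(target, task):
    """Send `task` to one engine by identity; return what each engine got."""
    ctx = zmq.Context()
    router = ctx.socket(zmq.ROUTER)
    router.bind("inproc://routing-demo")  # hypothetical endpoint name

    engines = {}
    for name in ENGINE_NAMES:
        dealer = ctx.socket(zmq.DEALER)
        dealer.setsockopt(zmq.IDENTITY, name)  # must be set before connect
        dealer.connect("inproc://routing-demo")
        dealer.send(b"ready")  # announce this engine to the router
        engines[name] = dealer

    # The ROUTER prepends the sender's identity to every incoming message.
    known = set()
    while len(known) < len(ENGINE_NAMES):
        identity, _message = router.recv_multipart()
        known.add(identity)

    # The first frame selects the destination; it is stripped on delivery.
    router.send_multipart([target, task])

    received = {}
    for name, dealer in engines.items():
        received[name] = dealer.recv() if dealer.poll(timeout=100) else None
        dealer.close()

    router.close()
    ctx.term()
    return received


if __name__ == "__main__":
    print(route_task(b"engine-1", b"square 7"))
```

A scheduler built this way can hand work to a specific engine, or pick any idle one, without opening a separate connection per engine.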
- Use Cases in Microservices
For distributed applications with many small services, ZeroMQ can serve as a high-speed communication backbone. It supports ephemeral scaling: new instances can appear and connect with minimal changes to the overall architecture. ZeroMQ’s approach is brokerless, so any service can directly connect to any other—an appealing model for container-based environments where ephemeral services come and go quickly.
- Tools:
- Docker / Kubernetes (referenced conceptually in the conversation)
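A minimal sketch of that scaling model uses the PUSH/PULL pattern: one stable component binds, and any number of workers connect to it. It assumes `pyzmq` is installed; the endpoint and function name are hypothetical, and the workers are sockets in the same process only so the example runs on its own. With real services each worker would be its own process or container connecting over `tcp://`.

```python
import zmq


def distribute(tasks, n_workers=2):
    """Push tasks to however many workers are connected.

    Returns a dict mapping worker index to the tasks that worker received.
    """
    ctx = zmq.Context()
    source = ctx.socket(zmq.PUSH)
    source.bind("inproc://tasks-demo")  # hypothetical endpoint name

    # Adding capacity means connecting one more PULL socket; the sending
    # code below does not change with the number of workers.
    poller = zmq.Poller()
    workers = []
    for _ in range(n_workers):
        worker = ctx.socket(zmq.PULL)
        worker.connect("inproc://tasks-demo")
        poller.register(worker, zmq.POLLIN)
        workers.append(worker)

    for task in tasks:
        source.send(task)  # ZeroMQ spreads messages across connected peers

    received = {index: [] for index in range(n_workers)}
    remaining = len(tasks)
    while remaining:
        events = dict(poller.poll(timeout=1000))
        if not events:
            break  # give up instead of waiting forever
        for index, worker in enumerate(workers):
            if worker in events:
                received[index].append(worker.recv())
                remaining -= 1

    for worker in workers:
        worker.close()
    source.close()
    ctx.term()
    return received


if __name__ == "__main__":
    print(distribute([b"t0", b"t1", b"t2", b"t3"]))
```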
- Challenges Building PyZMQ (C Extensions) for Cross-Platform
Min discussed overcoming the complexities of distributing PyZMQ wheels. Before Python “wheels,” Windows users often had to compile from source or rely on non-official installers. Tools like cibuildwheel and delvewheel have simplified cross-platform builds, enabling PyZMQ to ship prebuilt binaries for major OSes and architectures.
- Real-Time Interaction with Scientific Simulations
Min’s background in plasma physics led him to see the benefit of hooking simulations into Python and adjusting parameters in real time. ZeroMQ’s async, message-based design allowed a feedback loop, like turning physical knobs, without rewriting the entire simulation logic. This concept extends well beyond physics, making dynamic, interactive data processing feasible in other domains too.
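The steering idea can be sketched as a simulation loop that checks a socket for parameter updates on every step without ever blocking on it. This is a toy illustration, not Min's actual setup: it assumes `pyzmq` is installed, the "simulation" is a running sum, and the endpoint, message format, and function name are invented. The controller socket lives in the same process only to keep the demo self-contained; in practice it would be a separate program, such as a notebook.

```python
import zmq


def run_simulation(steps, updates):
    """Run a toy simulation whose growth rate can be changed mid-run.

    `updates` maps a step number to the new rate the controller sends
    just before that step.
    """
    ctx = zmq.Context()
    sim_sock = ctx.socket(zmq.PAIR)
    sim_sock.bind("inproc://steering-demo")  # hypothetical endpoint name
    control_sock = ctx.socket(zmq.PAIR)
    control_sock.connect("inproc://steering-demo")

    rate = 1.0
    value = 0.0
    history = []
    for step in range(steps):
        # Controller side: turn the knob at the requested steps.
        if step in updates:
            control_sock.send_json({"rate": updates[step]})

        # Simulation side: pick up a new parameter if one is waiting,
        # otherwise carry on immediately.
        try:
            rate = sim_sock.recv_json(zmq.NOBLOCK)["rate"]
        except zmq.Again:
            pass  # no update this step

        value += rate
        history.append(value)

    control_sock.close()
    sim_sock.close()
    ctx.term()
    return history


if __name__ == "__main__":
    print(run_simulation(4, {2: 10.0}))
```

Because the check is non-blocking, the simulation runs at full speed when nobody is steering and reacts within one step when someone is.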
Interesting quotes and stories
Steering a 5-day simulation in half a day: Min described how, in his physics research, it used to take multiple 5-day runs to tune the parameters for certain simulations. By incorporating Python and ZeroMQ, he was able to steer the simulation mid-run, drastically cutting down total runtime to about half a day.
“Publish-Subscribe is basically sending messages to all who can keep up”: This quote highlights how ZeroMQ is unconcerned about how many subscribers connect or disconnect. It simply broadcasts data without requiring the publisher to store or track individual peers.
Key definitions and terms
- ZeroMQ Context: A container for IO threads that handle the actual network communication. All sockets in a process typically share a context.
- Pub-Sub: Publish-subscribe messaging pattern; one sender, multiple receivers.
- Dealer / Router: Sockets used for request-reply patterns, offering flexible routing of messages.
- Zero-Copy: Transferring data between locations (e.g., memory buffers) without copying it in user space, improving performance.
- IPython Parallel: A Python library for parallel or distributed computing built on ZeroMQ, allowing multiple engines to execute code concurrently.
Learning resources
Here are some recommended resources if you want to dive deeper into the topics from this episode:
- ZeroMQ Guide: In-depth tutorials and best practices for using ZeroMQ.
- PyZMQ GitHub Repo: Source code, issues, and documentation for Python’s ZeroMQ bindings.
- Python for Absolute Beginners: Excellent foundation if you’re brand new to Python.
- Async Techniques and Examples in Python: Learn how to leverage Python’s async features, threading, and multiprocessing—pairs well with messaging approaches like ZeroMQ.
Overall takeaway
ZeroMQ’s strength lies in its combination of simplicity and power. By designing around messaging rather than raw sockets, it greatly simplifies building distributed systems and underpins core projects like Jupyter. Whether you are looking to scale out microservices, add asynchronous interactivity to scientific simulations, or just want an alternative to traditional HTTP for backend communication, ZeroMQ (and PyZMQ in particular) is a valuable tool in Python’s ecosystem.
Links from the show
Simula Lab: simula.no
Talk Python Binder episode: talkpython.fm/256
The ZeroMQ Guide: zguide.zeromq.org
Binder: mybinder.org
IPython for parallel computing: ipyparallel.readthedocs.io
Messaging in Jupyter: jupyter-client.readthedocs.io
delvewheel package: pypi.org
cibuildwheel: pypi.org
YouTube Live Stream: youtube.com
PyCon Ticket Contest: talkpython.fm/pycon2021
Episode transcripts: talkpython.fm