Ask HN: Who's building on Python NoGIL?
I am interested in knowing the things the community is building specifically taking into account the NoGIL aspect of python. Like is someone building frameworks around using Threads instead of Async?
I am interested in knowing the things the community is building specifically taking into account the NoGIL aspect of python. Like is someone building frameworks around using Threads instead of Async?
This is going to be bananas for libpython-clj[1]. One of the biggest limiting factors right now is that you can't mix Java/Clojure concurrency with Python concurrency, you need to have a really clear separation of concurrency models. But with this, you will be able to freely mix Clojure and Python concurrency. Just from a compositional standpoint, Clojure atoms and core.async with Python functions will be fantastic. More practically, this will unlock a lot of performance gains with PyTorch and Tensorflow which historically we've had to lock to single threaded mode. Yay!
[1]: https://github.com/clj-python/libpython-clj
I’m here for it :) love the Clojure approach to symbiosis. Parens consume all the things!
PyO3 0.23.0 was a big release I’ve been tinkering with extensively. Support for “free-threaded Python” is a headline feature, and I imagine NoGIL Python will be extremely nice for Rust interoperability, so there is definitely interest in that crate. Also could be huge for queueing data for GPUs, api servers, and bulk data fetching.
For whatever reason (maybe post 2to3 PTSD), Python community seems not extremely eager to jump on latest versions of Python and it often takes a long time for popular libraries to support the latest and greatest, so I’d recommend patience and baby steps
https://github.com/PyO3/pyo3/releases/tag/v0.23.0
> For whatever reason (maybe post 2to3 PTSD), Python community seems not extremely eager to jump on latest versions of Python
Well, you'd think after the 2 to 3 debacle, python might take backwards compatibility more seriously, but they don't.
Follow semver, and stop breaking things on 3.x. If it's deprecated in 3.x, don't remove it until 4.
I don't think there's even a plan for a v4. The fallout from 2 to 3 was that bad. So to keep improvements going takes deprecating something several versions before removing, and research is done to find how popular that particular thing is to determine its candidacy for removal. Thus it's best practice to pin all dependencies, and read the release notes before doing a version update.
Yeah it'd be better if the name was understood as python 3 instead of just python to avoid mixing in semver
What have they broken on 3.x? Genuine question as I haven't followed python's development super closely
Removal of setuptools in 3.12 broke a ton of legacy builds. Basically created a wall of forced package upgrades for a huge amount of packages in PypI where end users have to bump a bunch of stuff if they want to migrate from < 3.12 to 3.12+
setuptools was never part of Python's standard library. I think you're thinking of distutils which was removed from Python in the 3.12 release. You can easily access distutils again by installing a package from PyPI.
Some seldom used standard modules have been deprecated and later removed. Like recently I revisited a project I initially made using v3.6, but it broke on v3.13 due to an indirect dependency no longer present in the stdlib. It was a simple fix though as a quick search identified the issue and pointed to the removed module in a package on PyPI.
Python 3.6 is from 2016 and 3.13 is from 2024. Similar things happen on most platforms on this timescale, eg on the Java side[1], you'd be going from Java 8 to Java 23.
Clojure is pretty good even on that timescale though.
[1] See eg https://stackoverflow.com/a/50445603 up until 2021
Yep it's totally understandable, and OK by me as these changes are documented in the release docs and the fix a pip install away.
Yeah, it's nothing crazy, but it makes upgrades a lot more unpredictable. It's harder to communicate to management why the 3.x update took a day and the 3.y upgrade took a whole quarter.
It's harder to upgrade services in a central way with any amount of leverage, and generally requires more coordination overhead, and moving more carefully.
Compare with, say, golang, where it's pretty much a non-issue. My experience with Ruby was a lot better too, until Ruby 3, but hey, that was a major version bump!
I've always used threads despite the GIL. I haven't tried NoGIL and am waiting to find out how many bugs it surfaces. I do get the impression that multi-threaded Python code is full of hazards that the GIL covers up. There will have to be locks inserted all over the place. CPython should have simply been retired as part of the 2 to 3 transition. It was great in its day, on 1-core machines with the constraints of that era. I have a feeling of tragedy that this didn't happen and now it can never be repaired. I probably wouldn't use Python for web projects these days. I haven't done anything in Elixir yet but it looks like about the best option. (I've used Erlang so I think I have a decent idea of what I'd be getting into with Elixir).
In a strange way, Python being so bad at interpreting bytecodes and limited by GIL, plus being good at interfacing with C cheaply (unlike Go and Java,) induced a programming style that is extremely suited for data-parallel computing which is the way to efficiently scale compute in today's SIMD/GPU world. If you wanted to be efficient, you had to prepare your data ahead of time and hand it off. Any intermediate interaction with that data would ruin your performance. That's mostly how efficient Python libraries and ecosystem are built.
Weakness may have turned into a strength.
Why would NoGIL change that, though? It's not like large data-parallel operations can suddenly be done efficiently in Python if you remove the GIL. The problem with GIL afaik is mostly latency problems in interactive applications.
Oh, I didn't mean to imply something will change. I am simply concurring with the parent while observing that limitation turned into a strength by established a certain ecosystem early on that fits the modern architectural developments well. I don't think that is going to change now, but had nogil been the original, it could have led to a different style of libraries being designed.
Can you elaborate more on the data-parallel computing part?
Pretty much the entire Data Science/Machine Learning landscape from numpy, etc. to tensorflow and alike are thin wrappers over C code and if you want performance, you better batch structure your operation beforehand and minimize back and forth from Python.
He's probably talking about libraries like PySpark or PyFlink which are used a lot
Pyflink seems promising, I love vanilla flink but as soon as you need to debug your pyflink job pyflink becomes a hurdle. That translation layer between Python and Java can be opaque.
So what would the alternative history have been if CPython was retired after Python 3 came out in 2008, what would we be using now? IronPython or GraalPy?
It would have suffered the same fate as Perl 6 and we'd all be on 2.1x.
Its merged into CPython 3.13 but labeled as experimental.
Single threaded cpu bound workloads suffer in benchmarks (vs i/o workloads) till they put back the specializing adaptive interpreter (PEP 659) in 3.14. Docs say a 40% hit now, target is 10% at next release.
C extensions will have to be re-built and ported to support free threaded mode.
Some interesting and impactful bits of open source work for those with a c++ multithreading background.
I have the same question! I love Python and asynchronous stuff, and I do not know too much about threading.
Is threading potentially better for IO bound tasks than async?
Potentially, but probably not. The benefit of a parallel-enabled interpreter would be two CPU cores executing bytecode instructions at the same time in the same interpreter. So, you could have one python thread working on one set of data, and another python thread working on another set of data, and the two threads would not interfere with each other much or at all. Today, with the global interpreter lock, only one of those threads can be executing bytecode at a time.
> Today, with the global interpreter lock, only one of those threads can be executing bytecode at a time.
Yes, but Python now has a version without GIL, which prompted this post in the first place. So my question is: Now, if I use a version of Python 3.13 without GIL, can a threaded Flask app do better than an AIOHTTP server.
I think it will depend on which task we are talking about. For For CPU-bound tasks (i.e heavy computation, data processing), The No-GIL Flask with threading would likely perform better than AIOHTTP since it can truly parallelize computation across cores.Now for I/O-bound tasks (database queries, API calls or file operations) then AIOHTTP would still likely be more efficient due to its lower overhead and memory.
So for your original question
> Is threading potentially better for IO bound tasks than async?
async will be better in general, potentially due to Async co-routines using far less memory than threads and being better under high concurrency. But that's all will depend on the details of implementation of the server.
Started working on a no-gil gRPC Python implementation but super low priority.
Has anyone started deploying nogil at scale in prod?
No, I am not personally aware of anyone using it prod.
Waiting for CFFI or pywin32 free-threaded support since I don't have time to work on CFFI myself