Nuitka Release 1.3
This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.
This release contains a large amount of performance work that should be especially useful on Windows, but also helps in general. Some scalability work has been done, and as usual there are many bug fixes and small improvements, many of which were already released in hotfixes.
Bug Fixes
macOS: Framework builds of PySide6 were not properly supporting the use of WebEngine. This requires including frameworks and resources in new ways, and actually some duplication of files, making the bundle big, but this seems to be unavoidable to keep the signature intact.
Standalone: Added workaround for dotenv. Do not insist on compiled package directories that may not be there in case of no data files. Fixed in 1.2.1 already.
Python3.8+: Fix, the ctypes.CDLL node attributed the winmode argument to Python2, which is wrong, it was actually added with 3.8. Fixed in 1.2.1 already.
Windows: Attempt to detect corrupt object files in MSVC linking. These might be produced by cl.exe crashes or clcache bugs. When these are reported by the linker, it now suggests using --clean-cache=ccache, which will remove them; otherwise there would be no way to cure it. Added in 1.2.1 already.
Standalone: Added data files for folium package. Added in 1.2.1 already.
Standalone: Added data files for branca package. Added in 1.2.1 already.
Fix, some forms of try that had exiting finally branches were incorrectly tracing values only assigned in the try block. Fixed in 1.2.2 already.
Alpine: Fix, also include libstdc++ for Alpine to not use the system one which is required by its other binaries, much like we already do for Anaconda. Fixed in 1.2.2 already.
Standalone: Added support for latest pytorch. One of our workarounds no longer applies. Fixed in 1.2.2 already.
Standalone: Added support for webcam on Windows with opencv-python. Fixed in 1.2.3 already.
Standalone: Added support for pytorch_lightning, it was not finding metadata for rich package. Fixed in 1.2.4 already. For the release we found that pytorch_lightning may not find rich installed. Need to guard version() checks in our package configuration.
Standalone: Added data files for dash package. Fixed in 1.2.4 already.
Windows: Retry replacing the clcache entry after a delay; this works around virus scanners giving access denied while they are checking the file. Naturally you ought to disable those for your build space, but new users often don’t have this. Fixed in 1.2.4 already.
Standalone: Added support for scipy 1.9.2 changes. Fixed in 1.2.4 already.
Catch corrupt object file outputs from gcc as well and suggest cleaning the cache there too. This has been observed to happen at least on Windows and should help resolve the ccache situation there.
Windows: In case clcache fails to acquire the global lock, simply ignore that. This happens sporadically and is rarely a real locking issue, since that would require two compilations at the same time, and for that it largely works.
Compatibility: Classes should have the f_locals set to the actual mapping used in their frame. This makes Nuitka usable with the multidispatch package which tries to find methods there while the class is building.
Anaconda: Fix, newer Anaconda versions have TCL and Tk in new places, breaking the tk-inter automatic detection. This was fixed in 1.2.6 already.
Windows 7: Fix, onefile was not working anymore, a new API usage was not done in a compatible fashion. Fixed in 1.2.6 already.
Standalone: Added data files for lark package. Fixed in 1.2.6 already.
Fix, pkgutil.iter_modules without arguments was giving wrong compiled package names. Fixed in 1.2.6 already.
Standalone: Added support for newer clr DLLs changes. Fixed in 1.2.7 already.
Standalone: Added workarounds for tensorflow.compat namespace not being available. Fixed in 1.2.7 already.
Standalone: Added support for tkextrafont. Fixed in 1.2.7 already.
Python3: Fix, locals dict test code testing if a variable was present in a mapping could leak references. Fixed in 1.2.7 already.
Standalone: Added support for timm package. Fixed in 1.2.7 already.
Plugins: Add tls to list of sensible plugins. This enables at least pyqt6 plugin to do networking with SSL encryption.
Standalone: Added implicit dependencies of sklearn.cluster.
FreeBSD: Fix, fcopyfile is no longer available on newest OS version, and include files for sendfile have changed.
MSYS2: Add back support for MSYS Posix variant. Now onefile works there too.
Fix, when picking up data files from command line and plugins, different exclusions were applied. This has been unified to better avoid including DLLs and the like as data files. DLLs are not data files and must be dealt with differently after all.
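The f_locals compatibility item in this list can be illustrated with a small sketch. The register decorator and the _registry name are hypothetical and only stand in for the kind of lookup a dispatch package might perform; they are not taken from any real library. The point is that code running inside a class body expects the frame's f_locals to be the very mapping the class is being built in, so that changes made through it become class attributes.

```python
import sys


def register(func):
    """Hypothetical decorator that records methods while the class body runs."""
    # Frame 1 is the caller, i.e. the class body currently being executed.
    # Its f_locals is expected to be the mapping the class is built in.
    class_namespace = sys._getframe(1).f_locals
    class_namespace.setdefault("_registry", []).append(func.__name__)
    return func


class Shape:
    @register
    def area(self):
        return 0

    @register
    def perimeter(self):
        return 0


# The entries written through f_locals show up as a class attribute.
print(Shape._registry)
```

If f_locals were a detached copy rather than the actual mapping, the class would end up without the _registry attribute, which is the kind of incompatibility this fix addresses.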
New Features
UI: Added new option --disable-cache for disabling caches selectively. It accepts all and cache names like ccache, bytecode, and on Windows, dll-dependencies.
Note: The clcache is implied in ccache for simplicity.
UI: With the same values as --disable-cache, Nuitka may now also be called with --clean-cache, in a compilation or without a filename argument, and then it will erase the current data of those caches before making a compilation.
macOS: Added --macos-app-mode option for application bundles that should run in the background (background) or are only a UI element (ui-element).
Plugins: In the Nuitka package configuration files, the when condition now allows checking if a plugin is active. This allowed us to limit console warnings to only packages whose plugin was activated.
Plugins: Can now mark a plugin as the responsible GUI toolkit, with the consequence that other toolkit detector plugins are all disabled, so when using tk-inter you will no longer be asked about the PySide6 plugin, as that is apparently not what you are using.
Plugins: Generalized the GUI toolkit detection to include tk-inter as well, so it will now point out that wx and the Qt bindings should be removed for best results, if they are included in the compilation.
Plugins: Added ability to provide data files for macOS Resources folder of application bundles.
macOS: Fix, Qt WebEngine was not working for framework using Python builds, like the ones from PyPI. This adds support for both PySide2 and PySide6 to distribute those as well.
MSYS2: When asking a CPython installation to compress from the POSIX Python, it crashed on the main filename being not the same.
Scons: Fix, need to preserve environment attached modes when switching to winlibs gcc on Windows. This was observed with MSYS2, but might have effects in other cases too.
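As a rough illustration of the new cache options, the following sketch only assembles argument lists and does not run anything. The helper build_nuitka_command and the script name main.py are made up for this example, and the --disable-cache=NAME spelling is assumed by analogy with the --clean-cache=ccache form mentioned in the bug fix notes.

```python
import sys

# Cache names as listed in the notes above; "all" selects every cache.
KNOWN_CACHES = ("all", "ccache", "bytecode", "dll-dependencies")


def build_nuitka_command(main_script=None, clean=(), disable=()):
    """Assemble an argument list for Nuitka; nothing is executed here."""
    for name in (*clean, *disable):
        if name not in KNOWN_CACHES:
            raise ValueError("unknown cache name: %s" % name)

    command = [sys.executable, "-m", "nuitka"]
    command += ["--clean-cache=%s" % name for name in clean]
    command += ["--disable-cache=%s" % name for name in disable]
    # Cleaning caches is described as possible without a filename argument.
    if main_script is not None:
        command.append(main_script)
    return command


# Only erase the C compiler cache, no compilation.
clean_only = build_nuitka_command(clean=["ccache"])

# Compile a (hypothetical) script without using the bytecode cache.
compile_uncached = build_nuitka_command("main.py", disable=["bytecode"])
```

The resulting lists could be handed to subprocess.run, but checking the real option spelling with --help first is advisable.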
Optimization
Python3.10+: When creating dictionaries, lists, and tuples, we use the newly exposed dictionary free list. This can speed up code that repeatedly allocates and releases dictionaries by a lot.
Python3.6+: Added fast path to dictionary copy. Compact dictionaries have their keys and values copied directly. This is inspired by a Python3.10 change, but it is applicable to older Python as well, and so we applied it there too.
Python3.9+: Faster compiled object creation, esp. on Python platforms that use a DLL for libpython, which is a given on Windows. This makes up for core changes that went unnoticed so far and should regain relative speedups to standard Python.
Python3.10+: Faster float operations, we use the newly exposed float free list. This can speed up all kinds of float operations that are not doable in-place by a lot.
Python3.8+: On Windows, faster object tracking is now available. This previously had to go through a DLL call, which is now avoided, as was so far only the case on non-Windows.
Python3.7+: On non-Windows, faster object tracking is now used. This had regressed when adding support for this version, becoming as slow as it was on Windows at the time. However, we now managed to restore it.
Optimization: Faster deep copy of mutable tuples and list constants, these were already faster, but e.g. went up from 137% gain factor to 201% on Python3.10 as a result. We now use a guided deep copy, which has the information what types it is going to copy, removing the need to check through a dictionary.
Optimization: Also have own allocator function for fixed size objects. This accelerates allocation of compiled cells, dictionaries, some iterators, and list objects.
More efficient code for object initialization, avoiding one DLL call to set up our compiled objects.
Have our own PyObject_Size variant, that will be slightly faster and avoids DLL usage for len and size hints, e.g. in container creations.
Avoid using non-optimal malloc related macros and functions of Python, and instead use the fastest form generally. This avoids Python DLL calls that on Windows can be particularly slow.
Scalability: Generated child mixins are now used for the generated package metadata hard import nodes calls, and for all instances of single child tuple containers. These are more efficient for creation and traversal of the tree, directly improving the Python compile time.
Scalability: Slightly more efficient compile time constant property detections. For frozenset there was no need to check for hashable values, and some branches could be replaced with e.g. defining our own EllipsisType for use in short paths.
Windows: When using MSVC and LTO, the linking stage was done with only one thread, we now use the proper options to use all cores. This is controlled by --jobs much like C compilation already is. For large programs this will give big savings in overall execution time. Added in 1.2.7 already.
Anti-Bloat: Remove the use of pytest for dash package compilation.
Anti-Bloat: Remove the use of IPython for dotenv, pyvista, python_utils, and trimesh package compilation.
Anti-Bloat: Remove IPython usage in rdkit improving compile time for standalone by a lot. Fixed in 1.2.7 already.
Anti-Bloat: Avoid keras testing framework when using that package.
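The idea behind the guided deep copy of mutable constants can be sketched in Python, although the real work happens in C code inside Nuitka and will look different. In this analogy, build_copier (a name invented here) inspects a constant once and returns a copier that already knows which types sit where, so the per-copy work does not need a type lookup through a dispatch dictionary. Treating everything other than list, tuple, and dict as immutable is a simplification made for the sketch.

```python
def build_copier(template):
    """Return a function that copies values shaped like ``template``."""
    kind = type(template)
    if kind is list:
        item_copiers = [build_copier(item) for item in template]
        return lambda value: [copy(item) for copy, item in zip(item_copiers, value)]
    if kind is tuple:
        item_copiers = [build_copier(item) for item in template]
        return lambda value: tuple(copy(item) for copy, item in zip(item_copiers, value))
    if kind is dict:
        value_copiers = {key: build_copier(item) for key, item in template.items()}
        return lambda value: {key: value_copiers[key](item) for key, item in value.items()}
    # Simplification: anything else is treated as immutable and shared as-is.
    return lambda value: value


# The type analysis happens once, ahead of time.
CONSTANT = [1, (2, [3, 4]), {"key": [5]}]
copy_constant = build_copier(CONSTANT)

# Each use gets a fresh copy, so mutation cannot leak into the constant.
fresh = copy_constant(CONSTANT)
fresh[1][1].append(99)
```

The copier is only valid for values with the same shape as the template, which holds for constants because their structure is known at compile time.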
Organisational
Plugins: The numpy plugin functionality was moved to Nuitka package configuration, and as a result, the plugin is now deprecated and devoid of functionality. On non-Windows, this removes unused duplications of the numpy.core DLLs.
User Manual: Added information about macOS entitlements and Windows console. These features are supported very well by Nuitka, but needed documentation.
UI: Remove alternative options from --help output. These are there often only for historic reasons, e.g. when an option was renamed. They should not bother users reading them.
Plugins: Expose the mnemonics option to plugin warnings function, and use it for pyside2 and pyqt5 plugins.
Quality: Detect trailing/leading spaces in Nuitka package configuration description values during their automatic check.
UI: Detect the CPython official flavor on Windows by comparing to registry paths and consider real prefixes, when being used in virtualenv more often, e.g. when checking for CPython on Apple.
UI: Enhanced --version output to include the C compiler selection. It is doing that respecting your other options, e.g. --clang, etc. so it will be helpful in debugging setup issues.
UI: Some error messages were using %r rather than '%s' to output file paths, but that escaped backslashes on Windows, making them look worse, so we changed away from this.
UI: Document more clearly what --output-dir actually controls.
macOS: Added options hint that the Foundation module requires bundle mode to be usable.
UI: Allow using both --follow-imports and --nofollow-imports on command line rather than erroring out. Simply use what was given last, this allows overriding what was given in project options tests should the need arise.
Reports: Include plugin reasons for pre and post load modules provided. This solves a TODO and makes it easier to debug plugins.
UI: Handle --include-package-data before compilation, removing the ability to use pattern. This makes it easier to recognize mistakes without a long compilation and plugins can know them this way too.
GitHub: Migrated workflows to use newer actions for Python and checkout. Also use newer Ubuntu LTS for Linux test runner.
UI: Catch user error of running Nuitka with the pythonw binary on Windows.
UI: Make it clear that MSYS2 defaults to --mingw64 mode. It had been like this, but the --help output didn’t say so.
GitHub: Updated contribution guidelines for better readability.
GitHub: Use organisation URLs everywhere, some were still pointing to the personal rather than the organisation one. While these are redirected, it is not good to have them like this.
Mastodon: Added link to https://fosstodon.org/@kayhayen to the PyPI package and User Manual.
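The difference between %r and '%s' for file paths mentioned in this list is easy to see in plain Python. The path below is a made-up example.

```python
# Hypothetical Windows path; the doubled backslashes are only Python source
# escapes, the string itself contains single backslashes.
windows_path = "C:\\Users\\me\\project\\main.py"

with_repr = "Cannot open %r" % windows_path
with_quotes = "Cannot open '%s'" % windows_path

print(with_repr)    # Cannot open 'C:\\Users\\me\\project\\main.py'
print(with_quotes)  # Cannot open 'C:\Users\me\project\main.py'
```

Both variants put quotes around the path, but only the second one shows it the way a user would type it.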
Cleanups
Nodes for hard import calls of package meta data now have their base classes fully automatically created, replacing what was previously manual code. This aims at making them more consistent and easier to add.
When adding the new Scons file for C compiler version output, more values that are needed for both onefile and backend compilation were moved to centralized code, simplifying these somewhat again.
Remove unused main_module tag. It cannot happen that a module name matches, and still thinks of itself as __main__ during compilation, so that idea was unnecessary.
Generate the dictionary copy variants from template code rather than having manual duplications. For dict.copy(), for deep copy (needed e.g. when there are escaping mutable keyword argument constant values in say a function call), and for **kw value preparation in the called function (checking argument types), we have had diverged copies, that are now unified in a single Jinja2 template for optimization.
Plugins: Also allow providing generators for providing extra DLLs much like we already do for data files.
Naming of basic tests now makes sure to use a Test suffix, so in Visual Code selector they are more distinct from Nuitka code modules.
Rather than populating empty dictionaries, helper code now uses factory functions to create them, passing keys and values, and allowing values to be optional, removing noisy if branches at call side.
Removed remaining PyDev annotations, we have not needed those for a long time already.
Cleanup, avoid list objects for ctypes defined functions and their arglist, actually tuples are sufficient and naturally better to use.
Spelling cleanups were resumed, as an ongoing action.
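The dictionary factory cleanup concerns helper code inside Nuitka, so the following is only a Python analogy of the pattern, not the actual helper. The names make_dict_from_pairs and describe_module_* are invented for the sketch, and representing an absent optional value as None is an assumption.

```python
def make_dict_from_pairs(keys, values):
    """Hypothetical factory: build a dict, skipping keys whose value is None."""
    return {key: value for key, value in zip(keys, values) if value is not None}


def describe_module_old(name, filename=None, package=None):
    # Before: populate an empty dictionary, with one branch per optional value.
    result = {}
    result["name"] = name
    if filename is not None:
        result["filename"] = filename
    if package is not None:
        result["package"] = package
    return result


def describe_module_new(name, filename=None, package=None):
    # After: the factory deals with optional values in one place.
    return make_dict_from_pairs(
        ("name", "filename", "package"), (name, filename, package)
    )
```

Both functions return the same dictionary for the same arguments; the second just moves the branching out of every call site.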
Tests
Added construct test that demonstrates the mutable constant argument passing for lists to see the performance gains in this area too.
Made construct runner --diff output usable for interactive usage.
Repaired Nuitka Speedcenter, but it’s not yet too usable for general consumption. More work will be needed there, esp. to make comparisons more accessible for the general public.
Summary
The major achievement of this release was the removal of the long lived numpy plug-in, replacing it with package based configuration, that is even more optimal and works perfectly across all platforms on both important package installation flavors.
This release has a lot of consolidation efforts, but also, as a result of 3.11 investigations, addresses a lot of issues that have crept in over time with Python3 releases since 3.7, where each time something had not been noticed. There will be more need for investigation of relative performance losses, but this should address the most crucial ones, and it also takes advantage of optimization that had become available with 3.10 already.
There are also some initial results from cleanups with the composite node tree structure, and how it is managed. Generated “child(ren) having” mixins allow for faster traversal of the node tree.
Some technical things also improved in Scons. Using multiple cores in LTO with MSVC will help a lot, although for big compilations --lto=no probably has to be recommended still.
More anti-bloat work on more packages rounds up the work.
For macOS specifically, the WebEngine support is crucial to some users, and the new --macos-app-mode with more GUI friendly default resolves long standing problems in this area.
And for MSYS2 and FreeBSD, support has been re-activated, so now 4 OSes work extremely well (others too likely), and on those, most Python flavors work well.
The performance and scalability improvements are going to be crucial. It’s a pity that 3.11 is not yet supported, but we will be getting there.