Approximately one-third of the software packages in the Python Package Index (PyPi) are vulnerable to a design feature that lets an attacker's code execute automatically as soon as a package is downloaded to a computer.
The findings, uncovered by Checkmarx and published on Friday, underscore how open source software repositories like PyPi are increasingly being targeted and exploited by malicious actors. The company said that “a lot of the malicious packages we find in the wild use this code execution feature during installation to achieve higher infection rates.”
According to Tzachi Zorenshtain, Head of Supply Chain Security at Checkmarx, when developers install a software package from repositories like PyPi, most understand that there is also a risk of installing any accompanying malicious code.
“When we examine the behavior and look for new attack vectors, we found that if you download a malicious package, just download it, it will automatically run on your computer,” he told SC Media in an interview from Israel. “So we try to understand why, because for us the word download doesn’t necessarily mean that the code will run automatically.”
But for PyPi, it does. The commands for both downloading and installing use a tool called pip, which in turn runs another file called setup.py. That file is designed to provide a data structure that tells the package manager how to handle the package. It is also ordinary Python code that executes when it is run, which means an attacker can insert malicious code into it and have that code execute on the device of anyone who downloads the package.
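The mechanism can be illustrated with a minimal, harmless sketch. It does not reproduce pip's internals. It assumes only that a setup.py is executed like any other Python file, so every top-level statement runs before any metadata is returned. The file contents, the `PACKAGE_METADATA` name and the `run_setup_script` helper are hypothetical, and a harmless file write stands in for an attacker's payload.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical stand-in for a package's setup.py. A real one would call
# setuptools.setup(); here a harmless top-level statement plays the role
# of code an attacker might add.
FAKE_SETUP = '''
from pathlib import Path

# Top-level statement: runs as soon as the file is executed.
Path(__file__).with_name("side_effect.txt").write_text("ran while reading metadata")

PACKAGE_METADATA = {"name": "example-package", "version": "0.0.1"}
'''


def run_setup_script(directory: Path) -> dict:
    """Write the fake setup.py and execute it top to bottom, like any Python file."""
    script = directory / "setup.py"
    script.write_text(FAKE_SETUP)
    namespace = runpy.run_path(str(script))
    return namespace["PACKAGE_METADATA"]


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    metadata = run_setup_script(tmp_path)
    # The caller only asked for metadata, yet the script also touched the disk.
    side_effect_happened = (tmp_path / "side_effect.txt").exists()
```

In this sketch the caller only wanted the package name and version, but obtaining them meant running the whole script, side effect included.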
In fact, this specific issue was raised on GitHub in 2014 but was never directly addressed, because the flaw is less a bug than a feature of how software is commonly downloaded and installed from the repository, and it cannot be directly patched.
“It is an unfortunate fact of the Python packaging ecosystem that anything related to packaging always involves the execution of arbitrary code (referring to setup.py),” a GitHub user wrote in July 2014.
In recent years, PyPi has introduced a new wheel file type (.whl) that eliminates the need to run the setup.py command altogether, but for compatibility reasons still allows contributors to choose their preferred format. That means that many packages on PyPi, up to a third according to Checkmarx, still use the vulnerable tar.gz format, and obviously malicious actors would intentionally choose the older format to spread their malicious code.
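The distinction between the two formats can be sketched as a simple filename check. This is an illustrative assumption-laden helper, not part of pip or of Checkmarx's tooling: the function name `may_run_setup_py` and the list of source-archive suffixes are hypothetical, and the check looks only at the file extension rather than the archive's contents.

```python
# Suffixes treated as source archives in this sketch (an assumption;
# the article itself only names the tar.gz format).
SOURCE_ARCHIVE_SUFFIXES = (".tar.gz", ".zip")


def may_run_setup_py(filename: str) -> bool:
    """Return True if the file looks like a source archive that may ship a setup.py.

    Wheels (.whl) are prebuilt, so handling them does not involve setup.py.
    """
    lowered = filename.lower()
    if lowered.endswith(".whl"):
        return False
    if lowered.endswith(SOURCE_ARCHIVE_SUFFIXES):
        return True
    raise ValueError(f"unrecognized distribution file: {filename!r}")


# Hypothetical file names used purely for illustration.
wheel_is_risky = may_run_setup_py("example_package-1.0-py3-none-any.whl")
sdist_is_risky = may_run_setup_py("example-package-1.0.tar.gz")
```

As a related precaution, pip's `--only-binary :all:` option tells pip not to use source packages at all, although packages that publish no wheel will then fail to install.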
There are other solutions, such as downloading the package through your browser, that can avoid using the setup.py process altogether. Beyond that, Zorenshtain expects the vulnerability to be exploited in packages using the older file format in the next few years.
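Once an archive has been obtained without pip, its setup.py can be read as plain text rather than executed. The following is a minimal sketch of that idea using only the standard library. The helper names and the tiny in-memory demo archive are hypothetical, and reading a file this way is no substitute for a proper review of its contents.

```python
import io
import tarfile
from pathlib import PurePosixPath


def read_setup_py(archive_bytes: bytes) -> str:
    """Return the text of the first setup.py in a .tar.gz archive without running it."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and PurePosixPath(member.name).name == "setup.py":
                extracted = archive.extractfile(member)
                return extracted.read().decode("utf-8", errors="replace")
    raise FileNotFoundError("no setup.py found in archive")


def build_demo_archive() -> bytes:
    """Build a tiny in-memory .tar.gz so the sketch needs no network access."""
    source = b"from setuptools import setup\nsetup(name='demo')\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="demo-0.0.1/setup.py")
        info.size = len(source)
        archive.addfile(info, io.BytesIO(source))
    return buffer.getvalue()


# The script's source is available for inspection; nothing in it was executed.
setup_source = read_setup_py(build_demo_archive())
```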
“The most alarming thing for us is that this is not a vulnerability that will be fixed easily,” Zorenshtain said, later adding: “If we magically changed all formats and everything was re-sent and archived in the new format, then it would be easy to remove this behavior. We understand that this behavior will probably be with us for a while, so at least [building] awareness is what was important to us.”
A request for comment and questions sent to the Python Software Foundation, which runs PyPi as a free community resource, were not answered at the time of publication.