Devs unknowingly use “malicious” modules snuck into official Python repository


Modified code packages were uploaded to the Python Package Index, often abbreviated as PyPI, and were subsequently incorporated into software multiple times from June through this month, Slovakia’s National Security Authority said in an advisory published Thursday. The unidentified people who made the code packages available gave them names that closely resembled those used for packages found in the standard Python library. The packages contained exactly the same code as the upstream libraries except for an installation script, which was changed to include “malicious (but relatively benign) code.”

“Such packages may have been downloaded by unwitting developer[s] or administrator[s] by various means, including the popular ‘pip’ utility (pip install urllib),” Thursday’s advisory stated. “There is evidence that the fake packages have indeed been downloaded and incorporated into software multiple times between June 2017 and September 2017.”

Officials with the Slovak authority said they recently notified PyPI administrators of the activity, and all identified packages were taken down immediately. Removal of the infected libraries, however, does nothing to purge them from servers that installed them. The authority advised developers and administrators to check whether any of their servers are relying on the tainted packages. The advisory provided the specific commands that can be used to perform the check. In the event infected packages are found, administrators should remove them immediately and replace them with the proper package.
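The advisory's own commands are not reproduced here. As a rough illustration of the kind of check it describes, the following Python sketch compares the packages installed in the current environment against a list of suspect names. The list contains only `urllib`, the one name quoted above; the helper names are hypothetical, and administrators would substitute the full list from the advisory.

```python
import re
from importlib import metadata

# Assumption: only "urllib" is named in the text above. Replace this set
# with the complete list of package names from the advisory.
SUSPECT_NAMES = {"urllib"}


def normalize(name):
    """Lowercase a package name and unify '-', '_' and '.' separators."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_suspects(installed_names, suspects=SUSPECT_NAMES):
    """Return the sorted, normalized installed names that are on the suspect list."""
    wanted = {normalize(s) for s in suspects}
    return sorted({normalize(n) for n in installed_names} & wanted)


def installed_distribution_names():
    """List the names of all distributions visible to this interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            names.append(name)
    return names


if __name__ == "__main__":
    hits = find_suspects(installed_distribution_names())
    if hits:
        print("Suspect packages installed:", ", ".join(hits))
    else:
        print("No packages from the suspect list were found.")
```

The script only inspects the interpreter it runs under, so it would need to be repeated for every virtual environment and Python installation on a server.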

Shortly after Thursday’s advisory went live, researcher and activist Benjamin Bach and freelance journalist Hanno Böck reported that they were able to seed PyPI with more than 20 libraries bearing the names of modules that are part of the Python standard library. They, too, modified the package installation files, in this case with a script that caused developers’ machines to briefly connect to a server that recorded each developer’s IP address. Within minutes, the server reported the libraries were being installed. Results the pair published showed the packages were downloaded almost 7,000 times over a two-day period.

A case of mistaken identity

The problem is that packages in the standard Python library should originate only from their official source, rather than being downloaded from third-party repositories that store packages developed by non-official sources. Thursday’s advisory and the results published on Friday demonstrate that a significant number of developers are ignoring this best practice and, in the process, could be jeopardizing the security of the resulting software. For instance, if a developer were to accidentally use a rogue pseudo-random number generator instead of Python’s official secrets module, an app’s cryptographic functions might be easy for attackers to defeat.
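One way to catch this mistake before running an install is to test whether the requested name is already a standard-library module, which ships with the interpreter and never needs to come from PyPI. The sketch below is a minimal illustration, not a feature of pip: `shadows_stdlib` is a hypothetical helper, and it relies on `sys.stdlib_module_names`, which exists only in Python 3.10 and later (a tiny hand-written fallback set stands in on older versions).

```python
import sys

# Assumption: this short fallback is illustrative and far from complete.
# It is used only on interpreters older than Python 3.10.
_FALLBACK_NAMES = frozenset({"urllib", "secrets", "random", "os", "sys", "json"})

# On Python 3.10+ this is the interpreter's own list of stdlib module names.
STDLIB_NAMES = getattr(sys, "stdlib_module_names", _FALLBACK_NAMES)


def shadows_stdlib(package_name):
    """Return True if the name matches a standard-library module."""
    return package_name.lower().replace("-", "_") in STDLIB_NAMES


if __name__ == "__main__":
    for requested in ("urllib", "secrets", "requests"):
        if shadows_stdlib(requested):
            print(f"{requested}: part of the standard library, do not install it")
        else:
            print(f"{requested}: not a standard-library name")
```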

“It’s a very easy way to compromise many systems in a short time,” Böck told Ars on Friday. “Ultimately, this comes down to the problem that everyone can upload to PyPI. Right now, this problem is completely ignored by the Python+PyPI people. We need at least to start a discussion about what the best solution should be.”

By Saturday morning, PyPI administrators had removed the top 20 most-downloaded packages posted by Bach and Böck. It wasn’t clear if PyPI was preventing new packages from using those names. Attempts to reach PyPI administrators weren’t immediately successful.

The incidents closely resemble an attack carried out last year in a research experiment by a college student in Germany. As part of his bachelor thesis, University of Hamburg student Nikolai Philipp Tschacher uploaded packages to PyPI and two other repositories. The packages used names that were similar to widely used packages already submitted by other users. They also contained code that tracked the developers. Over a span of several months, his imposter code was executed more than 45,000 times on more than 17,000 separate domains, and more than half the time his code was given all-powerful administrative rights. Two of the affected domains ended in .mil, an indication that people inside the US military had run his script.
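Look-alike names of this kind can be flagged mechanically by comparing a requested name against a list of well-known packages. The following sketch uses the standard library's `difflib` for the comparison; the list of popular names, the 0.8 similarity cutoff, and the `likely_typosquat` helper are all assumptions chosen for illustration, not values taken from Tschacher's thesis.

```python
import difflib

# Assumption: a tiny illustrative list. A real check would use a much
# larger list of widely used package names.
POPULAR_PACKAGES = ["urllib3", "requests", "numpy"]


def likely_typosquat(name, known=POPULAR_PACKAGES, cutoff=0.8):
    """Return the popular name that `name` closely resembles, or None.

    An exact match is treated as legitimate and returns None.
    """
    candidate = name.lower()
    if candidate in known:
        return None
    matches = difflib.get_close_matches(candidate, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for requested in ("urlib3", "numpy", "flask"):
        target = likely_typosquat(requested)
        if target:
            print(f"{requested}: looks like a misspelling of {target}")
        else:
            print(f"{requested}: no close match found")
```

A similarity cutoff is a blunt tool: set too low, it flags legitimate packages with similar names; set too high, it misses creative misspellings.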

Bach and Böck started their project after discovering that many of the package names used in Tschacher’s experiment had since become available again on PyPI, freeing the way for anyone else to offer malicious packages that used the same names.

“Benjamin [Bach] tried to tell both the Python security team and the PyPI devs about it and got no reaction,” Böck said.

The problem is ultimately the result of developers and administrators who fail to inspect packages thoroughly. Adding to the insecurity, the widely used pip package management system, which most Python developers rely on, doesn’t require a cryptographic signature before executing code when a package is installed. Böck said PyPI is currently blocking use of the packages he and Bach used but that a more comprehensive solution still needs to be worked out.
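In the absence of signature checks, a partial safeguard is to pin the SHA-256 hash of each package archive and refuse anything that doesn't match. Pip offers a hash-checking mode through its `--require-hashes` option. The sketch below shows the underlying idea in plain Python; the function names are hypothetical, and hash pinning is a different technique from signature verification: it confirms that a file is the one previously vetted, not who published it.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, expected_hex):
    """Return True only if the file's digest equals the pinned digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A developer would record the digest of an archive after inspecting it, then call `matches_pinned_hash` on every later download before installing it.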
