PyPI open-source code repository deals with manic malware maelstrom

Public source code repositories, from SourceForge to GitHub, from the Linux Kernel Archives to ReactOS.org, from PHP Packagist to the Python Package Index, better known as PyPI, are a fantastic source (sorry!) of free operating systems, applications, programming libraries, and developers’ toolkits that have done computer science and software engineering a world of good.

Most software projects need “helper” code that isn’t a fundamental part of the problem that the project itself is trying to solve, such as utility functions for writing to the system log, producing colourful output, uploading status reports to a web service, creating backup archives of old data, and so on.

In cases like that, you can save time (and benefit for free from other people’s expertise) by searching for a package that already exists in one of the many available repositories, and hooking that external package into your own tree of source code.

In the other direction, if you’re working on a project of your own that includes some useful utilities you couldn’t find anywhere else, you might feel inclined to offer something to the community in return by packaging up your code and making it available for free to everyone else.

The cost of free

As you’re no doubt aware, however, community source code repositories bring with them a number of cybersecurity challenges:

  • Popular packages that suddenly vanish. Sometimes, packages that a well-meaning programmer has donated to the community become so popular that they end up as a critical part of thousands or even hundreds of thousands of bigger projects that take them for granted. But if the original programmer decides to withdraw from the community and to delete their projects (which they have every right to do if they have no formal contractual obligations to anyone who’s chosen to rely on them), the side-effects can be temporarily disastrous, as other people’s projects suddenly “update” to a state in which a necessary part of their code is missing.
  • Projects that get actively hijacked for evil. Cybercriminals who guess, steal or buy passwords to other people’s projects can inject malware into the code, and anyone who already trusts the once-innocent package will unwittingly infect themselves (and perhaps their own customers) with malware if they download the rogue “update” automatically. Crooks can even take over old projects using social engineering trickery, by joining the project and being really helpful for a while, until the original maintainer decides to trust them with upload access.
  • Rogue packages that masquerade as innocent ones. Crooks regularly upload packages that have names that are sufficiently close to well-known projects that other users download and use them by mistake, in an attack jocularly known as typosquatting. (The same trick works for websites, hoping that a user who mistypes a URL even slightly will end up on a bogus look-alike site instead.) The crooks generally clone the genuine package first, so it still performs all the functions of the original, but with some additional malicious behaviour buried deep in the code.
  • Petulant behaviour by so-called “researchers”. We’ve sadly had to write about this sort of probably-legal-but-ethically-dubious behaviour several times. Examples include a US PhD student and their supervisor who deliberately uploaded fake patches to the Linux kernel as part of an unauthorised experiment that the core Linux team were left to sort out, and a self-serving “expert” with the nickname Supply Chain Risks who uploaded a booby-trapped fake project to the PyPI repository as a reminder of the risk of so-called supply chain attacks. SC Risks then followed up their proof-of-concept “research” package with a further 3950 packages, leaving the PyPI team to find and delete them all.
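
The typosquatting risk described above can be partly checked mechanically. What follows is a minimal sketch, assuming that you keep your own list of the package names you really intend to use. The KNOWN_PACKAGES set and the look_alikes() helper are hypothetical names invented for this illustration; the similarity check itself comes from Python’s standard difflib module.

```python
import difflib

# Hypothetical allowlist: names your project genuinely depends on.
KNOWN_PACKAGES = {"requests", "numpy", "colorama", "urllib3"}

def look_alikes(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return trusted names that `name` closely resembles but does not equal."""
    candidate = name.lower()
    if candidate in known:
        return []  # exact match: nothing suspicious about the name itself
    # get_close_matches() keeps names whose similarity ratio (0..1) >= cutoff
    return difflib.get_close_matches(candidate, sorted(known), n=3, cutoff=cutoff)

for requested in ("requests", "reqeusts", "colourama", "leftpad"):
    hits = look_alikes(requested)
    if hits:
        print(f"{requested}: WARNING, resembles {hits}")
    else:
        print(f"{requested}: no near-miss against the allowlist")
```

A check like this only catches names that are close to ones you already trust. It says nothing about whether an exact-match package is safe, so it complements careful review rather than replacing it.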

Rogue uploaders

Unfortunately, PyPI seems to have been hammered by a bunch of rogue, automated uploads over the past weekend.

The team has, perhaps understandably, not yet given any details of how the attack was carried out, but the site temporarily blocked anyone new from joining up, and blocked existing users from creating new projects:

New user and new project name registration on PyPI is temporarily suspended. The volume of malicious users and malicious projects being created on the index in the past week has outpaced our ability to respond to it in a timely fashion, especially with multiple PyPI administrators on leave.

While we re-group over the weekend, new user and new project registration is temporarily suspended. [2023-05-20T16:02:00Z]

We’re guessing that the attackers were using automated tools to flood the site with rogue packages, presumably hoping that if they tried hard enough, some of the malicious content would escape notice and get left behind even after the site’s cleanup efforts, thus completing what you might call a Security Bypass Attack…

…or perhaps that the site administrators would feel compelled to take the entire site offline to sort it out, thus causing a Denial of Service Attack, or DoS.

The good news is that in just over 24 hours, the team got on top of the problem, and was able to announce, “Suspension has been lifted.”

In other words, even though PyPI was not 100% functional over the weekend, there was no true denial of service against the site or its millions of users.

What to do?

  • Don’t choose a repository package just because the name looks right. Check that you really are downloading the right module from the right publisher. Even legitimate modules sometimes have names that clash, compete or confuse.
  • Don’t blindly download package updates into your own development or build systems. Test and review everything you download before you approve it for use. Remember that packages typically include update-time scripts that run when you do the update, so malware infections could be delivered via the update process itself, not as part of the package source code that gets left behind afterwards.
  • Don’t make it easy for attackers to get into your own packages. Choose proper passwords, use 2FA whenever you can, and don’t blindly trust newcomers to your project as soon as they start angling to get maintainer access, no matter how keen you are to hand the reins to someone else.
  • Don’t be a you-know-what. As this story reminds us all, volunteers in the open source community have enough trouble with genuine cybercriminals without having to deal with “researchers” who conduct proof-of-concept attacks for their own benefit, whether for academic purposes or for bragging rights (or both).
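
One way to act on the “test and review before you approve” advice above is to record a cryptographic hash of the exact file you reviewed, and to refuse anything that doesn’t match it later. This is a minimal sketch using only Python’s standard library; the function names and the demo filename are hypothetical. (The pip tool offers a hash-checking mode, enabled with --require-hashes, that applies the same idea to a requirements file.)

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_reviewed_copy(path, approved_hex):
    """True only if the file is byte-for-byte what was reviewed earlier."""
    return hmac.compare_digest(sha256_of(path), approved_hex.lower())

# Demo with a stand-in file; in practice this would be a downloaded archive.
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "example-1.0.tar.gz")  # hypothetical filename
    with open(pkg, "wb") as f:
        f.write(b"reviewed package contents")
    approved = sha256_of(pkg)  # recorded at review time

    print(matches_reviewed_copy(pkg, approved))  # file unchanged since review

    with open(pkg, "ab") as f:
        f.write(b" plus something unexpected")  # simulate tampering
    print(matches_reviewed_copy(pkg, approved))  # file modified after review
```

Note that a matching hash proves only that the file hasn’t changed since you reviewed it, not that it was safe in the first place, so the review step still matters.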
