In a new twist on software supply chain attacks, researchers have discovered a Python package hiding malware inside of compiled code, allowing it to evade ordinary detection measures.
On April 17, researchers reported a package called “fshec2” to the administrators of the Python open source repository PyPI. Malicious packages aren’t new, or particularly rare, in PyPI, but unlike most of them, fshec2 contained all of its malicious functionality inside of its compiled code, making it hard to spot as bad news.
The PyPI admins immediately removed the package. In so doing, they “also recognized this type of attack as interesting and acknowledged that it had not been previously seen,” Karlo Zanki, reverse engineer at ReversingLabs, wrote in a report published June 1.
“This behavior is a bit more sophisticated, and it shows that the attackers are evolving and paying attention to the better detections that are being rolled out,” says Ashlee Benge, director of threat intelligence advocacy at ReversingLabs, adding that “we’re probably going to continue to see this kind of attack increase in the future.”
Unpacking fshec2
The genius of fshec2 is in how it dispenses with basic conventions of good hacker hygiene.
For example, bad guys tend not to distribute overt malware out on the Web — that’d be ham-handed. Instead, they plant tools which, upon hooking into a target computer, connect back to their C2 servers and trigger the download of malicious code — that can be ransomware, an infostealer, you name it.
Furthermore, to hide their true intentions, hackers will often use code obfuscation, taking any clues the good guys might pick up on and turning them into spaghetti.
By contrast, fshec2 front-loaded its malicious functionality and didn’t rely on obfuscation tools at all.
The package contained three files — two unexceptional source code files, and a third, more interesting file, “full.pyc.” Within full.pyc was a method called “get_path,” which, the researchers explained, “performs some of the common malicious functions observed in other malicious PyPI packages we have analyzed,” including collecting usernames, hostnames, and directory listings, and downloading commands from a remote server.
How did fshec2 manage to hide the maliciousness of get_path? The crucial bit here is that PYC files contain not source code, but compiled bytecode.
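To make the distinction concrete, here is a minimal sketch of what a PYC file holds. It compiles a harmless stand-in module (the name get_path is borrowed from the article, but the body is a placeholder, not the fshec2 code) and reads the result back. The 16-byte header assumes Python 3.7 or later, and the helper name load_code_from_pyc is made up for illustration.

```python
import importlib.util
import marshal
import py_compile
import tempfile
import types
from pathlib import Path

# Harmless stand-in source. Only the function name mirrors the article.
SOURCE = "def get_path():\n    return 'harmless placeholder'\n"


def load_code_from_pyc(pyc_path):
    """Return (magic, code_object) from a .pyc written by this interpreter.

    Assumes the 16-byte header used by Python 3.7+:
    4 bytes magic, 4 bytes flags, 8 bytes of timestamp/size or source hash.
    """
    data = Path(pyc_path).read_bytes()
    magic, payload = data[:4], data[16:]
    # marshal is not safe against hostile input; do this only in a sandbox
    # when the file comes from an untrusted package.
    return magic, marshal.loads(payload)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "full.py"
    src.write_text(SOURCE)
    pyc = py_compile.compile(str(src), cfile=str(Path(tmp) / "full.pyc"))
    raw = Path(pyc).read_bytes()
    magic, module_code = load_code_from_pyc(pyc)

print("magic matches this interpreter:", magic == importlib.util.MAGIC_NUMBER)
print("payload type:", type(module_code).__name__)
print("source line present in file:", b"def get_path" in raw)
```

The payload is a marshalled code object rather than text, so a tool that only reads source files has nothing to parse here.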
The Problem With Bytecode
Bytecode is a representation of Python source, compiled into a set of instructions for the Python Virtual Machine. In a simplified sense, it sits somewhere between source code and a machine binary.
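For a sense of what those instructions look like, the standard library's dis module can list them. The sketch below uses a trivial, made-up function named add; the exact instruction names vary between Python versions.

```python
import dis


def add(a, b):
    return a + b


# Each entry is one instruction for the Python Virtual Machine.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# Human-readable listing with offsets and arguments.
dis.dis(add)
```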
Raw bytecode isn’t friendly to the human eye: the code of get_path, for example, can’t be found in readable form anywhere inside fshec2. That also lets it slip past software scanners.
PyPI doesn’t yet account for malware hidden in bytecode, Benge explains, because “over the last decade, these files have gotten increasingly more complicated and huge. It’s really slow, often, to try to scan such a big file. So it creates this dilemma: how much lag do you want your user to experience?”
As for third-party security software, she adds, “another problem is file type. I couldn’t even tell you how many file types there are at this point — there are all sorts of obscure ones. And oftentimes, security solutions don’t actually have the capabilities to look at the kinds less commonly seen.”
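As an illustration of what looking inside bytecode could involve, the sketch below walks a code object and collects the function names, referenced names, and string constants it contains, then checks them against a small watchlist. This is an assumption-laden toy, not how ReversingLabs, PyPI, or any particular product works. The sample source, the WATCHLIST contents, and the helper names are all hypothetical, and the sample is compiled but never executed.

```python
import types

# Hypothetical sample that gathers a username and hostname, loosely echoing
# the behaviors described in the article. It is compiled, never run.
SAMPLE = """
import getpass, socket

def get_path():
    return getpass.getuser(), socket.gethostname()
"""

# Hypothetical list of module names a reviewer might want to look at twice.
WATCHLIST = {"getpass", "socket", "subprocess", "urllib"}


def iter_code_objects(code):
    """Yield a code object and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def summarize(code):
    """Collect function names, referenced names, and string constants."""
    functions, names, strings = set(), set(), set()
    for obj in iter_code_objects(code):
        if obj.co_name != "<module>":
            functions.add(obj.co_name)
        names.update(obj.co_names)
        strings.update(c for c in obj.co_consts if isinstance(c, str))
    return functions, names, strings


module_code = compile(SAMPLE, "sample.py", "exec")
functions, names, strings = summarize(module_code)
flagged = names & WATCHLIST

print("functions:", sorted(functions))
print("flagged names:", sorted(flagged))
```

A name match like this is only a hint for a human reviewer. Plenty of legitimate packages import socket, and a determined attacker can avoid obvious names.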
PyPI: Fighting Back Against Cyberattackers
“Right now, poor PyPI is really under fire,” Benge remarks. “There’s been a huge increase in this type of attack generally, where we’re seeing malicious Python libraries be leveraged to serve malware.”
In the past year especially, threat actors have been inventing new kinds of malicious Python packages, and even openly advertising their evil goods on the repo. The bad news has come so often for PyPI that, earlier this month, the security community interpreted a routine shutdown as a massive cyberattack.
In response, the Python Software Foundation has been investing in security more than ever, creating new roles for dedicated security experts. And last week, PyPI announced that by year’s end, users who upload and maintain projects and organizations will be required to protect their accounts with two-factor authentication.
Benge is optimistic about these developments. “Now it becomes a little more difficult than just pushing a malicious library out to PyPI and waiting for someone to download it. Now we’re seeing that these guys have to work a little harder.”