Vulnerabilities in System Management Mode (SMM) and more general UEFI applications/drivers (DXE) are receiving increased attention from security researchers. Over the last 12 months, the Binarly efiXplorer team disclosed 107 high-impact vulnerabilities related to SMM and DXE firmware components.
However, newer platforms have significantly increased the runtime mitigations in the UEFI firmware execution environment (including SMM), and the new Intel platform firmware runtime mitigations reshaped the attack surface for SMM/DXE with new Intel Hardware Shield technologies applied below-the-OS.
The complexity of the modern platform security features is growing every year. The general security promises of the platform consist of many different layers defining their own security boundaries. In many cases, these layers introduce inconsistencies in mitigation technologies and create room for breaking general security promises, allowing for successful attacks.
In this presentation, Alex Matrosov explores recent changes in the UEFI firmware security runtime using one of the most recent Intel CPUs as an example. The presentation covers the evolution of firmware mitigations in SMM/DXE on x86-based CPUs and discusses new attacks on the Intel Platform Properties Assessment Module (PPAM), which is often used in tandem with the Intel SMI Transfer Monitor (STM).
These topics have never been publicly discussed from the offensive security research perspective.
Breaking Firmware Trust From The Other Side: Exploiting Early Boot Phases
I think after system internals, we need to dive deeper into firmware internals, and we have some very interesting stuff today. We’ll dive even deeper than just System Management Mode. We will be talking about the pre-EFI firmware: how it actually works, where the breaking point is, how it can be broken, what the attack surfaces are and, of course, new vulnerabilities. All right.
I will probably just make it short. So it’s the Binarly research team which has been involved. Google remembers all my research better than me. You can just Google my name. All right.
So this is a short agenda for today. I think we will start with some introduction and why these topics are important, and then dive deeper into the problems and features that actually cause these vulnerabilities. Right.
So first of all, the complexity of modern firmware is increasing every year. And to be honest, your laptop is running firmware which is bigger than the ntos kernel nowadays. It’s a whole operating system, but most of the endpoint solutions don’t look in there, and repeatable problems also happen in there. So basically all the vulnerabilities we are discovering are not new attack vectors. They have been known before. Probably pre-EFI is new stuff, but System Management Mode memory corruption bugs come from the early 2000s and are still a thing.
All these vulnerabilities have actually been discovered by the Binarly team. And as you can see, as of today we have disclosed 68 vulnerabilities in a bit more than a year, and 50-plus are still under disclosure. That’s a lot. What we can deduce is that it’s been a failure when HP did not patch the bugs almost two months after the Black Hat talk. Of course, not all of them, but some are still left unpatched. And think about it: these are high-severity issues, like 8.5, 8.2 in CVSS score, and they are left unpatched. All the documentation is available, the Black Hat recording is available, and it can actually cause problems to your infrastructure.
And today. So basically we make one more transition. To be honest, I wanted to discuss ten bugs, but three bugs actually have not been patched, because they cause more serious problems than these seven and take more time.
But to be honest, I want to thank Insyde’s product security team, which made it happen in less than 90 days, which is a very, very rare thing for firmware vulnerabilities. Usually the disclosure cycle goes six months and up. Sometimes it actually takes more than a year. As an example, from the previous Black Hat talk, it took 13 months for Intel to patch the BSSA DFT issue. And today’s vulnerabilities, which we will see on the next slides, are actually an industry-wide thing.
So we are talking not just about a single vulnerability impacting one vendor. It’s actually reference code which is used by many devices, not just a single device, and that can actually cause a lot of problems.
So quickly on the attacker model. Usually such vulnerabilities are used for the second stage, when the attacker wants to gain additional persistence on the device and wants to actually sit there for months. And if you think about all these recent discoveries, MoonBounce, CosmicStrand, and others, it comes from 2014, seven years of persistence. Wow. That’s kind of a thing for the firmware problems.
But let’s recap a bit some of the basics on the firmware problems. So what is NVRAM, what is NVRAM persistence, and why can it cause problems? NVRAM is the storage for some of the data which is used during the boot, let’s just call it like that. It’s stored on the flash memory, and basically this memory is not protected by any security features like Intel Boot Guard and others, because it’s used in early boot. And some of these things are persistent. So this data is used during the boot. What can it cause? Of course, bugs. Right. So if you modify some of these variables and it can cause code execution, these persistent variables can also be used for persistence, right? So consistent exploitation, and it actually gives an attacker some advantage.
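To make the idea of OS-visible persistent variables concrete, here is a minimal Python sketch, assuming a Linux system that exposes UEFI variables through efivarfs. On that filesystem each variable file begins with a 4-byte little-endian attribute field followed by the variable data. The helper names (`parse_efivarfs_blob`, `is_persistent_and_os_visible`, `read_variable`) are made up for illustration and are not taken from the talk.

```python
import struct

# Attribute bits as defined by the UEFI specification.
EFI_VARIABLE_NON_VOLATILE = 0x00000001
EFI_VARIABLE_BOOTSERVICE_ACCESS = 0x00000002
EFI_VARIABLE_RUNTIME_ACCESS = 0x00000004


def parse_efivarfs_blob(raw: bytes):
    """Split the contents of an efivarfs file into (attributes, data)."""
    if len(raw) < 4:
        raise ValueError("blob is shorter than the 4-byte attribute header")
    (attrs,) = struct.unpack_from("<I", raw, 0)
    return attrs, raw[4:]


def is_persistent_and_os_visible(attrs: int) -> bool:
    """True if the variable survives reboots and is reachable at OS runtime."""
    wanted = EFI_VARIABLE_NON_VOLATILE | EFI_VARIABLE_RUNTIME_ACCESS
    return (attrs & wanted) == wanted


def read_variable(name: str, guid: str, root: str = "/sys/firmware/efi/efivars"):
    """Read one variable from efivarfs (hypothetical helper, needs privileges)."""
    with open(f"{root}/{name}-{guid}", "rb") as handle:
        return parse_efivarfs_blob(handle.read())
```

A variable that is both non-volatile and runtime-accessible is the interesting case here: a privileged user can change it from the OS, and the firmware will read it again on the next boot.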
So here is the NVRAM attack surface, which shows you that this type of variable is used during the PEI phase and the DXE phase, and is actually available during operating system runtime.
And of course from the operating system you have read-write access to there with a privileged user. And an important reminder: the vulnerabilities we are covering today will not be detected by health attestation and will not be detected by PCR registers from the TPM, because PCRs are not extended at runtime. So they measure what is known to be measured, not new things.
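The PCR point can be illustrated with the TPM extend operation, where the new PCR value is the hash of the old value concatenated with the digest of the measured data. The sketch below is a simplified SHA-256 model in Python, not a real TPM interface, and the event contents are invented for the example.

```python
import hashlib


def pcr_extend(pcr: bytes, measured_data: bytes) -> bytes:
    """Model of a SHA-256 PCR extend: new = SHA256(old || SHA256(data))."""
    digest = hashlib.sha256(measured_data).digest()
    return hashlib.sha256(pcr + digest).digest()


def replay(events) -> bytes:
    """Replay a list of measured blobs starting from an all-zero PCR."""
    pcr = bytes(32)
    for event in events:
        pcr = pcr_extend(pcr, event)
    return pcr


# Hypothetical boot: only these two blobs are ever measured.
boot_events = [b"firmware volume", b"boot loader"]
pcr_at_boot = replay(boot_events)

# Data that is modified after the last extend (or never measured at all)
# cannot change the PCR value, so attestation sees an identical result.
unmeasured_variable = b"attacker-modified NVRAM contents"
pcr_after_tampering = replay(boot_events)
```

In this model `pcr_at_boot` and `pcr_after_tampering` are equal, because nothing in the chain of extends ever covered the tampered data.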
Attack surface for pre-EFI. And actually what is interesting about PEI mode is that it enables a lot of security features for later stages, and a lot of things are just not enabled yet. So you can actually reflect or disable things from a security feature standpoint, and this is a completely new thing, because everybody thinks, oh, SMM is bad, SMM rootkits are bad, but nobody thinks about what’s happening before. So basically pre-EFI is a security boundary which is trusted by default, because all of System Management Mode is initialized at that phase. So this is actually a potential proof of concept of how the payload delivery can survive from the platform initialization phase, the pre-EFI phase, to System Management Mode. So basically, by consistently enabling some of the key mechanisms, we can create an anchor between these different phases. We can survive. But of course I have a PoC for that.
So basically. It works like a charm. So you exploit the bug, which is actually in the pre-EFI phase, but it’s also available for exploitation from the operating system level and boom, you found the keys. Cool, right? And working very fast. It’s a recent Ubuntu server.
But let’s talk more about the attack surface on pre-EFI and what kind of problems it can cause. So basically, the Intel Platform Properties Assessment Module, which is part of Hardware Shield. The short abbreviation is PPAM. So in the perfect world it looks like that. It’s how Intel imagines it will be designed on the systems. But complexity increases. Design issues remain forever, right?
So basically, initially PPAM got inspired by the SMI Transfer Monitor, its earlier concept. I would say it comes from 2013-14, when Intel was experimenting with these things and actually open sourced a lot of documentation about that. But the thing is, it was heavily impacting performance, and I don’t see much of this actually implemented on real systems, except some specific systems. And Intel came up with a simpler concept. It’s not really related, but it also tries to separate different policies on SMM drivers, and this concept is called PPAM.
So basically we have this very high-level picture of how it is initialized. But if you look at that, we actually have PiSmmCpuDxeSmm, which is actually the driver which is being initialized during the pre-EFI phase.
So boom, of course we can influence from earlier boot to the later boot, and we can actually attack the data and some of the configuration for such a security feature. And we found one bug in an HP machine. It’s actually the most recent HP book, which is used a lot on enterprise users’ networks. And this bug also exists on some other HP systems. So basically you can simply modify one flag in memory, and then this feature will not be enabled at runtime.
But of course the Intel reference implementation, by the way, doesn’t have this bug. Why does HP have this bug? Because the problem is you need to have remote management across your entire enterprise infrastructure, right? You want to enable this feature by policy on many laptops simultaneously. Right? And you need such mechanisms to be in place and, boom, here is a bug. So basically, quickly, as I mentioned, it’s just the data which is getting modified, and then it causes this problem.
But I need to move a bit faster, and I will show you the other problem with this laptop. So basically, this policy engine has a manifest, and the manifest is signed by a certificate. And what we discovered is that this manifest is actually signed by an outdated certificate, and we started figuring out why. Actually, we checked many systems. The certificate is issued by Intel, and the thing is, it’s outdated and has never been in use on any laptops which are actually enabling PPAM.
So there has been no validation or attestation for the PPAM manifest. And actually you can just disable PPAM as a security feature and nobody will notice. If there were attestation in place, of course it would be noticed, but unfortunately that’s not the case. And a lot of machines in the field have been using outdated certificates. We just checked enterprise vendors like Lenovo, HP, Dell and many others, Fujitsu, and that’s been our result. So the most recent one had been outdated by five days before our Black Hat talk, which is a cool coincidence.
But let’s look at the attack surface more broadly. And let’s talk about these repeatable problems happening in the firmware. These seven bugs I mentioned that we discovered recently, we reported in June. They didn’t make it into our Black Hat talk but made it to LABScon. We have three other bugs which couldn’t make it, but they will make it, with some others, to the Ekoparty talk, which is cool. And if you look over all of these vulnerabilities, most of them are memory corruption vulnerabilities. And these vulnerabilities are actually happening not because these companies don’t use static analysis tools, and not because their developers don’t understand what buffer overflows actually look like.
The problem is that the tools which exist don’t fit the firmware world correctly. So basically they can find your bugs on the operating system level, when memcpy potentially can cause problems, but there is no memcpy in the firmware world. In UEFI firmware it’s called CopyMem. So basically all the CopyMems will be skipped. That’s what happens if you have not tuned your static analysis tools properly, if you have not customized them. So basically it will not be working, or: we use static analysis tooling, it doesn’t find any bugs, so we’re good to go for release. It’s how it works.
Funny thing, it’s not a thing for enterprise vendors, but with some gaming stations. Actually, I reported the bugs to some very famous gaming machine developer from Taiwan and they said, ‘Oh, we have the product security team validating all the firmware for bugs and malicious software.’ And I said, what are you using? ‘Oh, we use Trend Micro antivirus, which scans every firmware image before the release.’ Wow, that’s cool. But antivirus doesn’t find the bugs, though somebody thinks it does. Anyway.
So let’s look at the findings. And I like the name of the driver, used by legacy controllers. Right. But it still exists on the newest generation of the machines. I think this particular bug was found on a 12th-generation Intel CPU on one of the laptops, and actually all these bugs were found in one of our customer’s devices which we’ve been working with, and reported to Insyde.
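The memcpy versus CopyMem point can be shown with a deliberately naive sink scanner. This Python sketch is a toy that greps C source text for call names; real analyzers work on syntax trees or binaries, and the sample handler below is invented for the example. It only shows how a sink list tuned for libc stays silent on UEFI code.

```python
import re

# Copy routines an OS-focused rule set typically flags.
LIBC_SINKS = {"memcpy", "strcpy"}
# UEFI equivalents (CopyMem is the EDK II / boot services copy routine).
UEFI_SINKS = {"CopyMem"}

# Hypothetical SMI handler that copies an attacker-influenced size.
SAMPLE_SOURCE = """
EFI_STATUS Handler(VOID *CommBuffer, UINTN *CommBufferSize) {
  UINT8 Local[16];
  CopyMem(Local, CommBuffer, *CommBufferSize);
  return EFI_SUCCESS;
}
"""


def find_sink_calls(source: str, sinks):
    """Return (line_number, sink_name) for every call to a listed sink."""
    names = "|".join(re.escape(name) for name in sorted(sinks))
    pattern = re.compile(r"\b(" + names + r")\s*\(")
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


libc_only = find_sink_calls(SAMPLE_SOURCE, LIBC_SINKS)
firmware_aware = find_sink_calls(SAMPLE_SOURCE, LIBC_SINKS | UEFI_SINKS)
```

With only the libc list the scan reports nothing for this handler, while adding the UEFI name surfaces the copy, which is the "no bugs found, good to go for release" failure mode described above.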
But this is my favorite. You see this function: it has two bugs. Actually two memory disclosure bugs. The thing is, a single function, two memory disclosure bugs. Cool. So similar pattern, different name of the function. So basically, to be honest, I just need to rename this to a CopyMem function, but it’s actually a similar pattern.
Anyway, moving forward. So another memory corruption. Also pretty straightforward, when we have a pointer controlled by the attacker and it’s an NVRAM variable, right? That’s why I mentioned NVRAM variables before. NVRAM is data storage, and modification of this data can cause a lot of problems because it’s actually used during runtime. Not all developers check that this data is valid, because they trust this storage.
Another two pretty similar problems. And you can see how in a single function there can again be two problems. But let’s move further. And let’s talk about how we can find these problems. So by the way, we’ve been mostly focused on x86, but next week we are releasing efiXplorer with support for ARM, for 32-bit and 64-bit. And basically, as you can see, the semantic annotations from efiXplorer now work on ARM, which is cool. I think Mark started a good tradition of releasing tools on the LABScon stage. So I am continuing, and next week efiXplorer for ARM will be in your IDA Pro.
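As an illustration of treating NVRAM contents as untrusted input, here is a small Python sketch that validates a variable blob before using it. The record layout (a 2-byte little-endian length followed by a payload) and the 32-byte limit are assumptions made up for the example; they are not the format of any variable discussed in the talk, and the bug shown on the slide involves a pointer rather than a length.

```python
import struct

# Hypothetical size of the fixed buffer the firmware would copy into.
DESTINATION_SIZE = 32


def load_untrusted_record(blob: bytes) -> bytes:
    """Validate a length-prefixed record read from an NVRAM variable."""
    if len(blob) < 2:
        raise ValueError("record is missing its length header")
    (declared_length,) = struct.unpack_from("<H", blob, 0)
    payload = blob[2:]
    # The declared length must match what is really stored...
    if declared_length > len(payload):
        raise ValueError("declared length exceeds stored payload")
    # ...and must fit the destination the consumer will copy into.
    if declared_length > DESTINATION_SIZE:
        raise ValueError("declared length exceeds destination buffer")
    return payload[:declared_length]
```

The same reasoning applies to any field read back from flash: a length, an offset or a pointer stored in a variable is attacker-controlled once a privileged user can rewrite that variable from the OS.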
So the next thing is how we can automate the detection at scale. Right? And I think one of the problems, as I mentioned, is when the static analysis tooling just doesn’t work. And we can actually improve that in a lot of different directions. One of them is to use some SMT solvers and lightweight, unconstrained checkers. So basically here is a blog where we documented how we implemented these techniques. And here is a demo. It found a memory corruption bug, which is caused by a Comm Buffer and leads to a classical stack overflow. So we check the SMI handler pointer. It’s actually tainted. And. Boom. Here’s a bug.
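The "tainted pointer reaches a copy" idea from the demo can be sketched as a tiny taint pass. This Python toy runs over an invented straight-line instruction list and uses no SMT solver, so it is a simplified stand-in for the checkers described above, not their implementation. The operation names (`source`, `assign`, `check`, `sink`) are hypothetical.

```python
def find_tainted_sinks(program):
    """Report sinks whose argument derives from an untrusted source.

    Each instruction is a tuple:
      ("source", var)          var receives attacker-controlled data
      ("assign", dst, src)     dst takes the taint state of src
      ("check", var)           var is validated (for example bounds-checked)
      ("sink", name, var)      var is used by a dangerous call
    """
    tainted = set()
    findings = []
    for index, instruction in enumerate(program):
        operation = instruction[0]
        if operation == "source":
            tainted.add(instruction[1])
        elif operation == "assign":
            _, destination, origin = instruction
            if origin in tainted:
                tainted.add(destination)
            else:
                tainted.discard(destination)
        elif operation == "check":
            tainted.discard(instruction[1])
        elif operation == "sink":
            _, name, argument = instruction
            if argument in tainted:
                findings.append((index, name, argument))
        else:
            raise ValueError(f"unknown operation: {operation}")
    return findings


# Hypothetical SMI handler: size comes from the Comm Buffer, then is copied.
vulnerable_handler = [
    ("source", "comm_size"),
    ("assign", "n", "comm_size"),
    ("sink", "CopyMem", "n"),
]
patched_handler = [
    ("source", "comm_size"),
    ("assign", "n", "comm_size"),
    ("check", "n"),
    ("sink", "CopyMem", "n"),
]
```

Running `find_tainted_sinks` on the first list flags the copy, and on the second list it reports nothing because the size is validated first. A checker built for one bug class like this finds instances of a known pattern, which matches the "known unknowns" framing in the Q&A.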
So all the advisories for these vulnerabilities are up. They are available online on the vendor side and on the Binarly side with all the technical details. But one of the most important things is how we can detect it at scale. Right. And detecting unknown vulnerabilities and known vulnerabilities are two different problems. So we created FwHunt, which is a cool concept where we can actually implement the detection on a semantic level and detect these bugs more effectively than other tools. And it’s available, it’s open sourced, and all the detection rules for the bugs which were disclosed at Black Hat and in all of our disclosures, including today’s, are available there, and you can try FwHunt online if you don’t want to install any tools. And actually it is finding some bugs. This one is from Lenovo. Anyway.
As a conclusion. The firmware is a whole operating system below your usual operating system, and it really needs much more attention than we think. And not only on x86 systems, everywhere. If you have, like, a Tesla, think about what operating system is inside your Tesla. Is it just a super secure operating system? Or is it something which can have some bugs and allow the attacker to open some interesting ways to play with your car? So I leave this question open. But another thing is charging stations. They are just workstations on the street. And all this complexity of the ecosystem is just growing and growing and growing, and we really need to pay much more attention. And I’m really happy to see how it’s happening on the policy side, where we are gaining more and more regulations in that direction. And for defenders, I think it’s very important to start learning more about forensics and how we can dig and dive deeper into such a landscape. So thank you very much all, and hopefully I made it in time. We have a few minutes for questions, and for the good questions I will leave this signed book with you.
We have time for two questions, if anyone. Of course.
How are you effectively constraining your symbolic execution?
Can you elaborate a bit about effectiveness?
I mean, you said it was like a lightweight static analysis, and obviously you are not exploring all possible paths. So what heuristics are you baking into it? What’s your approach?
Yeah. The key here is that we don’t use symbolic execution to find all the shades of the bugs. We are creating checkers for particular classes of bugs. So we use these unconstrained solvers to follow just a particular class of bugs, and it’s multiple solvers which have been built precisely on top of the knowledge base. So basically, let’s say it’s not finding completely new problems, it’s finding known unknowns, right?
So Perri gets a book. Who else needs a copy of the book? Come on, guys. Don’t. Don’t. More questions. All right. All right. You guys can catch Alex on the side and grill him on this. Alex, thank you very much. I have a speaker gift here for you.
Oh, thank you. Thank you.
About the Presenter
Alex Matrosov is CEO and co-founder of Binarly Inc., where he builds an AI-powered platform to protect devices against emerging firmware threats. Alex has more than two decades of experience with reverse engineering, advanced malware analysis, firmware security, and exploitation techniques. He served as Chief Offensive Security Researcher at Nvidia and Intel Security Center of Excellence (SeCoE).
This presentation was featured live at LABScon 2022, an immersive 3-day conference bringing together the world’s top cybersecurity minds, hosted by SentinelOne’s research arm, SentinelLabs.