Quantifying the Security Benefits of Debloating Web Applications
The main idea of software debloating is to reduce software's attack surface by removing pieces of code that are not required by users. In this study, we analyze four popular PHP applications (phpMyAdmin, MediaWiki, Magento and WordPress) and show that a reduction of up to 71% in Logical Lines of Code, up to 60% in historic CVEs and up to 100% in Object Injection Gadgets is possible. Essentially, by removing unused functionality and libraries from the application we can significantly reduce the attack surface.
The paper is available at https://www.securitee.org/files/debloating_usec2019.pdf
This system consists of three Docker images:
Mapping of vulnerabilities to source code: You can separately download the mapping of vulnerabilities to the source code of the applications. To ensure that the mappings point to the correct lines, use the same application versions that we used (links are available below).
Mapped CVEs to vulnerable files and lines (CSV file).
Mapped CVEs to vulnerable functions (CSV file).
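As a rough illustration of how the line-level mapping could be consumed, the sketch below groups vulnerable lines by CVE and file. The column names `cve`, `file` and `line` and the sample rows are assumptions made for this example; check the header of the downloaded CSV file and adjust them accordingly.

```python
import csv
import io
from collections import defaultdict


def group_lines_by_cve(csv_text):
    """Return {cve_id: {file_path: sorted list of line numbers}}.

    Assumes (hypothetically) that the CSV has the columns
    "cve", "file" and "line"; the real files may use other names.
    """
    mapping = defaultdict(lambda: defaultdict(set))
    for row in csv.DictReader(io.StringIO(csv_text)):
        mapping[row["cve"]][row["file"]].add(int(row["line"]))
    return {
        cve: {path: sorted(lines) for path, lines in files.items()}
        for cve, files in mapping.items()
    }


# Placeholder rows, not taken from the real mapping.
sample = (
    "cve,file,line\n"
    "CVE-0000-0001,app/a.php,10\n"
    "CVE-0000-0001,app/a.php,7\n"
    "CVE-0000-0002,app/b.php,3\n"
)
print(group_lines_by_cve(sample))
```

The grouped result can then be compared against the files or functions that debloating removed, provided the application version matches the one used for the mapping.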
Pre-configured applications: Four pre-configured Docker images host the versions of the web applications used in our study, along with their file-level and function-level debloated variants. They are accessible below:
The current configuration expects each web application to reside directly in the "web" directory. To test a file-level or function-level debloated variant, move its directory up from the file_debloating or function_debloating directory into web.
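The move described above could be scripted along these lines. This is a minimal sketch that assumes the file_debloating and function_debloating directories sit inside the web directory and that the variant's directory has the same name as the application it replaces; the function name `promote_variant` and the `.orig` backup suffix are choices made for this example, not part of the project.

```python
import shutil
from pathlib import Path

LEVELS = ("file_debloating", "function_debloating")


def promote_variant(web_root, level, app_dir):
    """Move web_root/<level>/<app_dir> to web_root/<app_dir>.

    A directory that already exists at the destination is kept
    as <app_dir>.orig so that the original application is not lost.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown debloating level: {level}")
    web_root = Path(web_root)
    source = web_root / level / app_dir
    target = web_root / app_dir
    if not source.is_dir():
        raise FileNotFoundError(source)
    if target.exists():
        backup = web_root / (app_dir + ".orig")
        if backup.exists():
            raise FileExistsError(backup)
        shutil.move(str(target), str(backup))
    shutil.move(str(source), str(target))
    return target
```

For instance, `promote_variant("web", "function_debloating", "some_app")` would swap in the function-level variant of a hypothetical `some_app` directory; adjust the paths to match the layout of the image you downloaded.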
The source code is hosted on GitHub at https://github.com/silverfoxy/PHPDebloating.
* The docker and docker-compose packages are required to run this code.
Running the container: To build and run the Docker containers, download the template you want, navigate to its root directory, and run:
docker-compose up
You can then access the web applications at http://localhost:8085 and the reports at http://localhost:8086/admin.
Running the usage profiles: In the paper, we describe the following usage profiles: tutorials mapped through Selenium scripts, monkey tests based on Gremlins.js, and the Spider and Scanner from Burp Suite.
Debloating a new web application: To add a new web application, follow these steps:
Mapping web applications to versions in the database: The debloating engine needs to know which version of each web application is being used. You can pass this information to the back end through either cookies or environment variables.
- Cookie: Setting the following cookies tells the back end which application is being tested.
We are a team of security researchers at PragSec Lab, Stony Brook University (https://securitee.org).
For any questions, contact Babak Amin Azad at [email protected]
- What does a debloated application look like in action?
Unless you trigger a removed part of the application, everything looks identical. If a removed feature is executed, a custom error message is displayed. Here's a video of an exploit attempt on CVE-2016-4010 failing on the debloated version of Magento 2.0.5. The attempt fails because the gadget chain relies on the Credis_Client class (the Redis client for Magento) being present, and that class is removed by debloating.
- How are Object Injection Gadgets extracted?
Our current study is based on known gadget chains from popular PHP packages, as reported in the PHPGGC library. After debloating the web applications, we check whether these gadgets (either the whole file or the magic functions) have been removed.
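The check described above can be approximated with a short script. This is only a sketch, not the tooling used in the study: it assumes that a removed magic function no longer appears in the source at all. If the debloater instead leaves the function signature in place and replaces its body with a stub, the check would have to look for that stub. The list of magic method names is a sample chosen for this example.

```python
import re
from pathlib import Path

# A sample of PHP magic methods commonly used in gadget chains.
MAGIC_METHODS = ("__destruct", "__wakeup", "__toString", "__call", "__get")


def defined_magic_methods(php_source, names=MAGIC_METHODS):
    """Return the magic methods that php_source still defines.

    This is a plain text search, so a definition that appears only
    inside a comment or string is also counted.
    """
    found = set()
    for name in names:
        pattern = r"\bfunction\s+&?\s*" + re.escape(name) + r"\s*\("
        # PHP function names are case-insensitive.
        if re.search(pattern, php_source, flags=re.IGNORECASE):
            found.add(name)
    return found


def gadget_removed(app_root, relative_path, names=MAGIC_METHODS):
    """True if the gadget file is gone or defines none of the magic methods."""
    path = Path(app_root) / relative_path
    if not path.is_file():
        return True
    source = path.read_text(encoding="utf-8", errors="replace")
    return not defined_magic_methods(source, names)
```

Running `gadget_removed` over the files that make up a gadget chain, once for the original application and once for the debloated one, shows which chains lose a required link.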
- Can I run this software in a production environment?
XDebug, the PHP extension used to record line-level code coverage, hooks into the Zend engine and has a noticeable overhead. In our setup, across the four applications we tested, page load time increased by a factor of 2 to 4. Depending on your setup, this overhead may be unacceptable. One solution is to record coverage for only a subset of users by means of load balancing. Another solution is to use record-and-replay proxies to record real traffic to the application and replay it offline.
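The first of these options, recording coverage for only a subset of users, could be driven by a deterministic routing rule such as the one sketched below. This is an illustration under assumptions: the function name, the use of a session identifier as the routing key, and the percentage-based split are all hypothetical, and a real deployment would implement the rule in the load balancer itself.

```python
import hashlib


def routes_to_coverage_instance(session_id, percent):
    """Decide whether a session goes to the coverage-enabled instance.

    The session ID is hashed so that the same session always gets
    the same answer and roughly `percent` percent of sessions are
    selected.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent
```

Because the decision depends only on the session ID, a user stays on the same instance for the whole session, which keeps the slower page loads confined to the sampled group.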
A third solution is to optimize XDebug's code coverage engine and make it more efficient for our needs. Currently, it overloads over 47 opcodes. Because our architecture only needs to detect which files or functions are covered, we can extend this module or use in-house implementations that decorate functions with coverage-recording code to achieve an acceptable overhead.
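To make the idea of decorating functions with coverage-recording code concrete, here is a small Python analogue. It is not the PHP implementation and only illustrates the principle: each function records its own name when it is called, so no line-level instrumentation is needed. The names `record_coverage`, `render_page` and `export_report` are invented for this example.

```python
import functools

# Names of all functions that were called at least once.
covered_functions = set()


def record_coverage(func):
    """Wrap func so that every call records its qualified name."""
    name = f"{func.__module__}.{func.__qualname__}"

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        covered_functions.add(name)
        return func(*args, **kwargs)

    return wrapper


@record_coverage
def render_page(title):
    return f"<h1>{title}</h1>"


@record_coverage
def export_report():
    return "report"
```

After a usage profile has run, any decorated function whose name is missing from `covered_functions` is a candidate for removal. The per-call cost is a single set insertion, which is the kind of lightweight bookkeeping the paragraph above has in mind.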