ROOTS 2018: Library and Function Identification by Optimized Pattern Matching on Compressed Databases – Maximilian von Tschirschnitz
[Editor’s note: This article belongs to the Reversing and Offensive-oriented Trends Symposium 2018 (ROOTS). It was misplaced, so we publish it today. Maximilian’s talk was recorded and can be watched on Vimeo.]
The goal of library and function identification is to find the original library and function for a given machine-code snippet. These snippets commonly arise from penetration tests attacking a remote executable, from static malware analysis, or from an IP infringement investigation. While there are several tools designed for this task, all of them seem to rely on some variant of signature-based identification. In this work, the author argues that this approach is not sufficient in many cases and proposes a design and implementation for a multitool called KISS. KISS uses lossless compression and highly optimized pattern matching algorithms to create a very compact but substantial database of library versions. In practice, KISS compresses the database to below 30 percent of its original size while still allowing for extremely fast snippet identification with high success rates.
Finally, the author also argues that this approach improves on the security of existing techniques, as the design relies fully on complete function body verification, which prevents analysis-resilient malware from disguising itself as external and trusted library code. This has recently been shown to be a problem for malware analysis with existing identification solutions.
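To make the difference between signature-based identification and complete function body verification concrete, here is a minimal sketch. It is not KISS's implementation: the function names, the byte values, the four-byte prefix signature, and the use of `zlib` as the lossless compressor are all assumptions chosen for illustration only.

```python
import zlib

# Hypothetical toy "library": function name -> machine-code bytes (made-up values).
LIBRARY = {
    "memcpy_like": bytes.fromhex("554889e5") + bytes(range(16)),
    "strlen_like": bytes.fromhex("4883ec10") + bytes(range(32, 44)),
}


def build_database(functions):
    """Losslessly compress every function body (zlib is a stand-in compressor)."""
    return {name: zlib.compress(body) for name, body in functions.items()}


def identify_by_signature(snippet, functions, sig_len=4):
    """Signature-style matching: compare only the first sig_len bytes."""
    return [
        name
        for name, body in functions.items()
        if snippet[:sig_len] == body[:sig_len]
    ]


def identify_by_full_body(snippet, database):
    """Full-body verification: decompress and compare every byte."""
    return [
        name
        for name, blob in database.items()
        if zlib.decompress(blob) == snippet
    ]


DATABASE = build_database(LIBRARY)

genuine = LIBRARY["memcpy_like"]
# A disguised snippet keeps the first four bytes but replaces the rest of the body.
disguised = genuine[:4] + b"\xcc" * (len(genuine) - 4)

signature_hits = identify_by_signature(disguised, LIBRARY)
full_body_hits = identify_by_full_body(disguised, DATABASE)
genuine_hits = identify_by_full_body(genuine, DATABASE)
```

In this toy setup the prefix signature accepts the disguised snippet as `memcpy_like`, while the full-body comparison rejects it and still identifies the genuine function. Because the compression is lossless, the stored bodies can be recovered byte for byte, which is what makes the exact comparison possible.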
Maximilian von Tschirschnitz works as a prototype engineer and researcher for Intel Corporation in Germany. In parallel, he is studying Informatics at the TU Munich. His current research topics cover IT security and high-precision positioning methods. His further professional interests include theoretical informatics, image feature recognition and computer graphics.