Introduction
ProDisFuzz is a fuzzing program: it is designed to test a target application for potential vulnerabilities or unexpected behavior while the application processes input data. A distinguishing feature of fuzzing is that it is a (semi-)automated testing process that usually generates a large number of test cases.
The whole fuzzing process can be divided into seven phases:
1. Target identification
2. Interface identification
3. Fuzz data generation
4. Data execution
5. Monitoring for flaws
6. Impact analysis
7. Documentation
ProDisFuzz can help you in phases 3–5 and 7 and hence reduce the effort a tester has to invest while testing an application.
Generic Features
- Support of stateless TCP: Currently only stateless protocols based on TCP, such as HTTP or Modbus, are supported. Support for UDP-based protocols and for stateful protocols is planned for the future.
- Support of unknown proprietary protocols: ProDisFuzz can “learn” the basic protocol structure by examining two or more sample communication captures. This is done by applying several bioinformatics algorithms. For more information, see the section “Bioinformatics Features”.
- XML description of protocol structure: The structure of the protocol under test is transformed into an XML format and can be stored on disk for later use. From this XML file ProDisFuzz can then generate fuzz data to test a target server application.
- Generation-based fuzz data: Because the protocol is described in XML, ProDisFuzz can generate new fuzz data that follows the basic protocol structure. To do so, it separates fixed and variable data blocks. Data that is present in all recorded sample captures is considered “fixed” data that is necessary in every communication of the protocol. Data that differs between the recorded packets is combined into variable data blocks. Fuzz data is injected only into variable blocks, leaving fixed data untouched.
- Injection of library-based and random data: The user can choose between injecting random fuzz data and fuzz data from a library. The latter makes it possible to use data designed specifically for the target application.
- Remote monitoring: ProDisFuzz monitors the server application for any sign of a denial of service (DoS). When a crash is detected this way, ProDisFuzz interrupts the testing process and keeps trying to reconnect to the server until the user has restarted the application. ProDisFuzz then automatically continues the testing process.
- Report generation: At the end of the testing process an HTML report with information about detected crashes is generated. It contains the testing parameters and all sent messages that caused a DoS on the target server.
- GUI: All features of ProDisFuzz are accessible through a GUI, which makes the fuzzing process as easy as possible, even for inexperienced users.
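The idea of generation-based fuzz data, with injection into variable blocks only, can be illustrated with a minimal Python sketch. It assumes a hypothetical in-memory block list instead of the actual XML file, whose schema is not described here. The names BLOCKS, LIBRARY, random_bytes and fuzz_messages are illustrative and not part of ProDisFuzz. The sketch also assumes that one variable block is fuzzed at a time, which may differ from what ProDisFuzz does.

```python
import random

# Hypothetical block layout: (kind, sample bytes). This is an illustration,
# not ProDisFuzz's actual XML schema.
BLOCKS = [
    ("fixed", b"GET /"),
    ("variable", b"index.html"),
    ("fixed", b" HTTP/1.1\r\n\r\n"),
]

# A tiny example library of hand-picked fuzz values (assumed for this sketch).
LIBRARY = [b"A" * 64, b"%n%n%n%n", b"../../../../etc/passwd"]


def random_bytes(rng, max_len=32):
    """Return between 1 and max_len random bytes."""
    length = rng.randint(1, max_len)
    return bytes(rng.randrange(256) for _ in range(length))


def fuzz_messages(blocks, library=None, count=3, seed=0):
    """Yield messages in which exactly one variable block is replaced.

    If a library is given, every library value is injected once per variable
    block; otherwise `count` random payloads are injected per variable block.
    Fixed blocks are always copied unchanged.
    """
    rng = random.Random(seed)
    variable_positions = [i for i, (kind, _) in enumerate(blocks) if kind == "variable"]
    for target in variable_positions:
        if library is not None:
            payloads = list(library)
        else:
            payloads = [random_bytes(rng) for _ in range(count)]
        for payload in payloads:
            parts = [payload if i == target else data
                     for i, (_, data) in enumerate(blocks)]
            yield b"".join(parts)
```

Calling list(fuzz_messages(BLOCKS, LIBRARY)) yields one message per library value, each starting with the fixed prefix and ending with the fixed suffix. Passing library=None switches to random payloads.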
Bioinformatics Features
To learn a protocol structure, ProDisFuzz uses algorithms that are usually found in bioinformatics. The mathematical problem of learning a protocol structure from a few sample communication captures is equivalent to the problem of aligning (sub)sequences of proteins (for details see http://en.wikipedia.org/wiki/Bioinformatics and http://packetstormsecurity.com/sniffers/PI.tgz).
ProDisFuzz uses the following algorithms:
- Needleman-Wunsch: This algorithm is the basis for aligning two captures to identify fixed and variable protocol parts. It can precisely separate different data blocks and is a good starting point for learning unknown structures in strings. Its downside is its enormous memory consumption, which grows with the product of the two sequence lengths. For example, aligning two 50 kByte sequences requires a score matrix of 50,000 × 50,000 = 2.5 billion cells, which is about 10 GByte when assuming four bytes per cell. ProDisFuzz therefore needs an additional algorithm.
- Hirschberg: This algorithm is a variant of the Needleman-Wunsch algorithm that drastically reduces the memory needed for aligning protocol sequences. It is the main algorithm ProDisFuzz uses for protocol learning. Like the Needleman-Wunsch algorithm, however, it can align only two sequences. So what if more than two captures describe the protocol?
- Neighbor joining: ProDisFuzz uses a technique called neighbor joining to cluster the communication sequences. This makes it possible to align more than two sequences through a series of pairwise alignments. Although neighbor joining does not guarantee an optimal alignment, its solutions are sufficiently good.
- N-grams: N-grams are used to support the neighbor joining and to find similar sequences. ProDisFuzz uses trigrams to estimate the best clustering.
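To make the alignment idea concrete, here is a minimal Python sketch of the plain Needleman-Wunsch algorithm with the full score matrix, followed by a helper that splits the alignment into fixed and variable blocks. It is not the Hirschberg variant and not ProDisFuzz's code. The scoring values (match 1, mismatch -1, gap -1) and the names needleman_wunsch and split_blocks are assumptions made for this illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two byte strings.

    Returns two lists of equal length containing byte values,
    with None marking a gap.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Full (n+1) x (m+1) matrix: this is the quadratic memory cost.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1]


def split_blocks(aligned_a, aligned_b):
    """Group aligned columns into ('fixed', bytes) and ('variable', b'') blocks.

    Columns where both sequences agree are fixed; mismatches and gaps are
    variable. Variable blocks carry no data in this sketch.
    """
    blocks = []  # list of [kind, bytearray]
    for x, y in zip(aligned_a, aligned_b):
        kind = "fixed" if (x is not None and x == y) else "variable"
        if not blocks or blocks[-1][0] != kind:
            blocks.append([kind, bytearray()])
        if kind == "fixed":
            blocks[-1][1].append(x)
    return [(kind, bytes(data)) for kind, data in blocks]
```

For the two sample messages b"USER alice\r\n" and b"USER bob\r\n", the alignment keeps "USER " and the line ending as fixed blocks and marks the user name in between as variable.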
The process of learning a protocol with the help of these algorithms is executed as follows:
1. ProDisFuzz collects and reads all communication captures.
2. It chooses the two communication sequences with the minimal average distance by applying neighbor joining and n-grams.
3. It aligns the two sequences with the Hirschberg algorithm.
4. It removes the two sequences from the set of sequences and adds the new aligned sequence to it.
5. It repeats steps 2–4 until only one aligned sequence is left.
6. It cleans up inconsistencies in the remaining sequence.
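The loop above can be sketched in a few lines of Python. This is a simplified illustration under several assumptions, not ProDisFuzz's implementation: the distance is a Jaccard distance on trigram sets, the pair selection is a plain greedy closest-pair search rather than true neighbor joining, and difflib.SequenceMatcher from the Python standard library stands in for the Hirschberg alignment. The cleanup step here only collapses adjacent wildcards, since the source does not specify what the real cleanup does. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

WILDCARD = None  # marks a variable position in a merged sequence


def trigrams(seq):
    """Return the set of all length-3 windows of a sequence."""
    return {tuple(seq[i:i + 3]) for i in range(len(seq) - 2)}


def distance(a, b):
    """Jaccard distance between trigram sets (0 = same set, 1 = disjoint)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def merge(a, b):
    """Stand-in for the alignment step: keep matching runs and
    replace every differing run with a single wildcard."""
    merged = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, _, _ in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(a[i1:i2])
        elif not merged or merged[-1] is not WILDCARD:
            merged.append(WILDCARD)
    return merged


def learn(captures):
    """Reduce a list of byte strings to one sequence of bytes and wildcards."""
    if not captures:
        return []
    sequences = [list(c) for c in captures]                # step 1
    while len(sequences) > 1:                              # step 5
        a, b = min(combinations(sequences, 2),
                   key=lambda pair: distance(*pair))       # step 2
        merged = merge(a, b)                               # step 3
        sequences = [s for s in sequences
                     if s is not a and s is not b]         # step 4
        sequences.append(merged)
    cleaned = []                                           # step 6
    for item in sequences[0]:
        if item is WILDCARD and cleaned and cleaned[-1] is WILDCARD:
            continue  # collapse adjacent wildcards
        cleaned.append(item)
    return cleaned
```

For example, learn([b"USER alice\r\n", b"USER bob\r\n", b"USER carol\r\n"]) reduces the three captures to the bytes of "USER ", one wildcard, and the line ending.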
License
Currently ProDisFuzz is released under the WTFPL. As the program is more a proof of concept than a complete fuzzing framework, the code can be used without any restrictions: study it, copy it, modify it.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004 Sam Hocevar <sam@hocevar.net>
Everyone is permitted to copy and distribute verbatim or modified copies of this license document, and changing it is allowed as long as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
Usage
Just start the jar file, e.g. by double-clicking it.
ProDisFuzz has two modes, both of which are self-explanatory to use:
- The learning mode is used to create the protocol structure from a given set of captures.
- The fuzzing mode uses the structure generated in the previous step to test a remote server.