Faxploit: Sending Fax Back to the Dark Ages
Research By: Eyal Itkin and Yaniv Balmas
Fax, the brilliant technology that lifted mankind out of the dark ages of mail delivery when only the postal service and carrier pigeons were used to deliver a physical message from a sender to a receiver.
Technology-wise, however, that was a long time ago. Today we are light years away from those dark days. In its place we have email, chat messengers, mobile communication channels, web-services, satellites using quantum messaging and more. So fax today is surely nothing but a relic that has been cast aside to the museum of old technologies, right?
Wrong. Fax is surprisingly still widely used even today. With over 300 million fax numbers in use, according to a simple Google search, it seems like we are still far from seeing fax be a thing of the past.
With this in mind, Check Point Research decided to take a deeper look into this old-fashioned form of communication and see if fax, other than being a loud, noisy beeper and a bureaucratic burden, is also a major network security risk.
Taking Over a Network Using Just a Fax Number
To provide some background, fax today is widely used in all-in-one printer devices by many industries worldwide. These all-in-one printers are connected to internal home or corporate networks through their Ethernet, WiFi, Bluetooth and other interfaces. In addition, they are also connected to a PSTN phone line in order to support the fax functionality that they include.
Our research set out to ask whether an attacker, with merely a phone line at his disposal and equipped with nothing more than his target's fax number, could attack an all-in-one printer by sending a malicious fax to it. If the answer was ‘yes’, then he could potentially gain complete control over the printer and possibly infiltrate the rest of the network connected to this printer.
So, after long and tedious research, we finally succeeded in this mission.
In fact, we found several critical vulnerabilities in all-in-one printers which
allowed us to ‘faxploit’ the all-in-one printer and take complete control over
it by sending a maliciously crafted fax.
From that point on, anything was possible. We decided the best way to showcase this control would be to use Eternal Blue in order to exploit any PC connected to the same network, and use that PC in order to exfiltrate data back to the attacker by sending…a fax.
Still have questions? Skip the Technical Analysis and head straight to the Q&A.
Want a deeper look into this attack? Read on for our full technical research paper.
Below is a video demonstration of the Faxploit attack:
Technical Details
Reversing the Firmware
Recon Phase
The first step in reverse engineering the firmware, once we loaded it into IDA,
was to figure out what is being executed, and in what environment. After a quick
recon phase, we found out these details:
Architecture
The firmware is loaded to and executed by an ARM 32bit CPU, running in Big
Endian mode. The main CPU uses a shared memory region to communicate with an MCU
that controls the LCD screen.
Figure 1: Printer Architecture
Operating System
The Operating System is a ThreadX-based [ref. 3] real-time Operating System by
Green Hills [ref. 4]. It uses a flat memory model in which there are many tasks
that run in Kernel-Mode, all sharing the same virtual address space. Since this
is a flat memory model, we would expect the tasks to communicate with each other
over a message queue (a FIFO). In addition, the virtual address space is fixed,
and no ASLR-based mechanisms are deployed.
DSID Values
When we started to analyse the T.30 state machine task (“tT30”), though, we
stumbled upon many traces that used seemingly unique IDs. A deeper investigation
found that these IDs are also used in several lists of strings that start with
the “DSID_” prefix. And indeed, the strings seem to match the logic near these
traces, giving us important reversing hints. We built an Enum from all of the
different DSIDs lists, giving us textual descriptions for many traces throughout
the task.
Figure 2: using DSID values in the trace function.
Figure 3: DSID list used for the T.30 state machine.
Figure 4: using the DSID Enum together with the trace function.
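The idea of turning the DSID string lists into an enum can be sketched in a few lines of Python. This is only an illustration: the numeric IDs and the names in `DSID_TABLE` are hypothetical placeholders, and the real pairs would come from the firmware's “DSID_” string lists.

```python
from enum import IntEnum

# Hypothetical ID/name pairs, for illustration only. The real values would be
# extracted from the "DSID_" string lists found in the firmware.
DSID_TABLE = {
    0x1001: "DSID_T30_PHASE_B_START",
    0x1002: "DSID_T30_DIS_RECEIVED",
    0x1003: "DSID_T30_PHASE_C_START",
}

# Build an enum so that numeric trace IDs can be shown by name.
Dsid = IntEnum("Dsid", {name: value for value, name in DSID_TABLE.items()})


def describe_trace(trace_id: int) -> str:
    """Return the DSID name for a trace ID, or a marked-unknown fallback."""
    try:
        return Dsid(trace_id).name
    except ValueError:
        return f"UNKNOWN_DSID_0x{trace_id:04X}"
```

With such a mapping in place, every call to the trace function can be annotated with a readable description instead of a bare number.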
Gaps Between Tasks
When reversing the T.30 state machine, and later reversing the task that handles
the HDLC modem (“tFaxModem”), it seemed that there were several function pointer
tables that we were missing. We found two common code patterns that looked like
some allocation / deallocation routines. These functions are used in each module
in order to receive information from the previous module, and may also be used to
dispatch the buffers to the next module. An example is shown in figure 5.
Figure 5: Receiving a frame from another task using some function table.
If we were not able to locate these functions we would not have been able to see
how the data flowed inside the firmware, therefore limiting our understanding of
the firmware. Since we could not trace most of the function pointers to their
initialization, we needed to take a more dynamic approach. We therefore needed
a debugger.
Building a Debugger
The Serial Debugging Interface
Figure 6: Connecting JTAGULATOR to the printer’s serial debugger.
After a few attempts to use the serial debugger we found that the debugging
interface was limited by default:
Figure 7: The serial debugger refuses to obey our commands.
It seemed that we would need to elevate our privileges, and so we needed a
vulnerability.
Searching for 1-Days
When trying to exploit a given firmware, it is always useful to check which open
source components are being used and compare their versions to known CVEs. In many cases
1-Days are enough, and they are surely good enough for debugging purposes. There
are two main ways of identifying the open source components in use:
Use a string search in the firmware, and identify key strings from popular open
sources
You can search for CVE details that match the relevant library version
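A string search of this kind can be sketched as follows. The signature patterns below are assumptions chosen for illustration (typical banner strings that such libraries embed), not the list used in this research.

```python
import re

# Hypothetical signatures: banner-like byte patterns that popular libraries
# tend to leave in a binary. Extend or replace them for a real firmware.
SIGNATURES = {
    "gSOAP": re.compile(rb"gSOAP[ /]?(\d+\.\d+(?:\.\d+)?)?"),
    "OpenSSL": re.compile(rb"OpenSSL (\d+\.\d+\.\d+[a-z]?)"),
    "zlib": re.compile(rb"(?:inflate|deflate) (\d+\.\d+\.\d+) Copyright"),
}


def find_libraries(blob: bytes) -> dict:
    """Map each detected library name to its version string (or None)."""
    hits = {}
    for name, pattern in SIGNATURES.items():
        match = pattern.search(blob)
        if match:
            version = match.group(1)
            hits[name] = version.decode() if version else None
    return hits
```

The detected versions can then be compared by hand against published CVE details for each library.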
During our research we had already identified the gSOAP library, and when we saw
reports regarding “Devil’s Ivy” on Twitter, we immediately checked it out. The
code of CVE-2017-9765 [ref.5], a.k.a. “Devil’s Ivy”, can be seen in figure 8.
Figure 8: Decompiled code of the “Devil’s Ivy” vulnerability.
We could reach this vulnerability by sending a huge XML (> 2GB) to the printer
over TCP port 53048 thus triggering a stack-based buffer overflow. Exploiting
this vulnerability would give us full control over the printer, meaning that we
could use this as a debugging vulnerability.
There were, however, two main drawbacks with this plan:
Sending the exploit over the network would take a considerable amount of time,
but with some optimizations we would be able to reduce the transmission time to
around seven minutes.
The vulnerability gave us a controllable stack-based buffer overflow, with some
limitations over our chars. The forbidden chars were:
Unprintable: 0x00 – 0x19
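A character restriction like this is easy to check mechanically before sending a payload. The sketch below assumes the forbidden set reported in this write-up (the unprintable range 0x00 – 0x19 and ‘?’, 0x3F); the function name is hypothetical.

```python
# Forbidden bytes as reported in the text: 0x00-0x19 inclusive, plus '?' (0x3F).
FORBIDDEN = set(range(0x00, 0x1A)) | {0x3F}


def forbidden_offsets(payload: bytes) -> list:
    """Return the offsets of all bytes that would not survive the overflow."""
    return [i for i, value in enumerate(payload) if value in FORBIDDEN]
```

An empty result means the payload passes the filter as is. Any reported offset marks a byte that has to be encoded or avoided, which is one reason a decoding stage is useful in such an exploit.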
An important limitation we had to bear in mind when exploiting in an embedded
environment (not on an Intel CPU) is the fact that the CPU has several caches.
Our received packet would be stored in the Data Cache (D-Cache), while
instructions were executed from the Instruction Cache (I-Cache). This means that
even though there is no NX bit support, we could not simply return to our
payload and execute it directly from the stack buffer, as the CPU fetches code
through the I-Cache, which would not yet reflect the bytes we had just written.
To bypass all of the different limitations, we had to use a bootstrapping
exploit that consists of the following parts:
Basic ROP that flushes the D-Cache and I-Cache.
Scout Debugger
Our debugger is an instruction-based network debugger. It supports basic memory
read / write requests and can be extended to support firmware-specific
instructions as well. We used this debugger to extract memory dumps from the
printer, and later on we extended it to test some of the features we used in our
demonstration.
Once the debugger is configured with the addresses of the firmware’s API
functions (such as memcpy, sleep, and send) it can be loaded to any address as
it is fully position independent (PIC). We uploaded our “Scout Debugger” to our
Github, and it can be found here [ref.6].
ITU T.30 – Fax Protocol
When an all-in-one printer supports fax capabilities it means that it supports
Group 3 (G3) fax protocols, which conform to the ITU T.30 standard [ref.2]. This
standard defines the basic capabilities required from the sender and the
receiver, while also outlining the different phases of the protocol, as can be
seen in figure 9.
Figure 9: Diagram as taken from the ITU T.30 standard.
We will focus on Phase B and Phase C of the protocol. Phase B is responsible for
the capability negotiation (handshake) between the sender and the receiver,
while Phase C includes the transmission of the data frames according to the
negotiated specifications.
The frames themselves are sent over the phone line using HDLC frames, as can be
seen in figure 10.
Figure 10: Diagram as taken from the ITU T.30 standard.
Searching for attack vectors
Sending TIFFs
It is a common misconception that faxes simply send TIFF files. In actual fact
though, the T.30 protocol sends pages, while phase B negotiates parameters such
as page height and page width, and phase C is used to transport the page’s data
lines. This means that the final output will be a .tiff file that contains IFD
tags that were built using the meta-data from the handshake. The .tiff file will
later contain the page lines just as they were received over the phone line.
Although there are many vulnerabilities in .tiff parsers, these vulnerabilities
are mostly found in code that parses IFD tags, and in our case these tags are
built by the printer itself. The only processing that will be done to our page
content is opening its compression during the printing process.
TIFF Compression
Unfortunately for us, there are multiple names for the compression schemes used
by the .tiff file format, and we had to work them out. Here is the basic
mapping, as we understood it using [ref.8 and ref.9]:
TIFF Compression Type 2 = G3 without End-Of-Line (EOL) markers
We checked the decompression code for T.4 and T.6 and couldn’t find any
interesting vulnerabilities there.
T.30 Extensions
During phase B the modems exchange their capabilities, so they can decide on
the best supported transmission method. We wrote a simple script to parse
these messages using the ITU T.30 standard, and we found the interesting
result shown in figure 11:
Figure 11: Parsing the DIS capabilities of the target.
It seemed that our printer supported the ITU T.81 (JPEG) format [ref.10],
declared in Annex E of the ITU T.4 standard [ref.11], and in short, it meant we
could send colourful faxes. When we examined the code that handles the colourful
faxes we made another good finding: the received data is stored to a .jpg
file as is. In contrast to the .tiff case in which the headers are built by the
receiver, in the .jpg case we control the entire file.
We checked this behaviour with the standard and found out that since the JPEG
format is complex, the headers (called markers [ref.12]) are indeed sent over
the phone line, and the receiver should process them and decide what to keep.
Some of the markers might not be supported by the receiver, and should be
dropped, and other markers (such as the COM marker) should always be skipped. In
our firmware, and in open sources that we checked, the received content is
always dumped to a file without any filtering, giving an attacker a great
starting point.
Printing the coloured fax
So, to recap: when the target printer receives a colour fax it simply dumps its
content into a .jpg file (“%s/jfxp_temp%d_%d.jpg”, to be precise), without any
sanitization checks. However, receiving the fax is only the first step, as it now
should be printed. The printer module needs first to verify the width and height
of the received document, so it sends it for a basic parsing round.
The JPEG Parser
For some unknown reason, firmware developers tend to re-implement modules that
are already implemented in major popular open sources. This means that instead
of using libjpeg [ref.13], the developers implemented their own JPEG parser.
From an attacker’s point of view this is a jackpot, as finding a vulnerability
in a complex file format parser looks very promising.
The parser itself is quite simple, and works like this:
Check that the file starts with a Start Of Image (SOI) marker: 0xFFD8
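For readers unfamiliar with the JPEG layout, a bounds-checked marker walk that extracts the image dimensions might look like the sketch below. It is not the firmware's parser: it follows the JPEG standard's segment layout (a 2-byte marker followed by a 2-byte big-endian length that includes itself) and only understands the baseline SOF0 marker.

```python
import struct

SOI = 0xFFD8   # Start Of Image
SOF0 = 0xFFC0  # Start Of Frame (baseline), carries height and width
SOS = 0xFFDA   # Start Of Scan, entropy-coded data follows
EOI = 0xFFD9   # End Of Image


def parse_dimensions(data: bytes):
    """Return (width, height) from the first SOF0 segment, or None."""
    if len(data) < 2 or struct.unpack(">H", data[:2])[0] != SOI:
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        marker, length = struct.unpack(">HH", data[pos:pos + 4])
        if marker >> 8 != 0xFF:
            raise ValueError("not a marker")
        if marker in (SOS, EOI):
            break
        # The length field counts itself, so it must be at least 2 and the
        # whole segment must fit inside the file.
        if length < 2 or pos + 2 + length > len(data):
            raise ValueError("segment length out of bounds")
        if marker == SOF0:
            if length < 7:
                raise ValueError("SOF0 segment too short")
            _precision, height, width = struct.unpack(
                ">BHH", data[pos + 4:pos + 9])
            return width, height
        pos += 2 + length  # skip markers we do not care about (e.g. COM)
    return None
```

The bounds checks on every length field are the important design choice here: each segment is validated against the file size before anything is read from it.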
According to the standard, a COM marker (0xFFFE) is a variable-sized text field
representing a text comment. This was our first candidate for finding a parsing
vulnerability, and ironically this marker was supposed to be dropped by the fax
receiver according to the standard.
And indeed, we found the following vulnerability, as shown in Figure 12:
Figure 12: decompiled code for the COM marker vulnerability.
The parsing module parses a 2-byte (Little Endian) length field and runs in a
loop that copies data from our file into some global array. It looks like each
entry in the array is of size 2100 bytes, while our length field can be as high
as 64KB, granting us a massive controllable buffer-overflow.
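The size of this overflow follows directly from the two numbers above. A small sketch of the arithmetic, assuming the 2100-byte entry size stated in the text and ignoring whether the firmware subtracts the two bytes of the length field itself:

```python
ENTRY_SIZE = 2100          # size of one global array entry, per the text
MAX_LENGTH_FIELD = 0xFFFF  # largest value a 2-byte length field can hold


def overflow_bytes(length_field: int, entry_size: int = ENTRY_SIZE) -> int:
    """Number of bytes written past the end of the entry for a given length."""
    if not 0 <= length_field <= MAX_LENGTH_FIELD:
        raise ValueError("length field must fit in 2 bytes")
    return max(0, length_field - entry_size)


def clamped_copy_length(length_field: int, entry_size: int = ENTRY_SIZE) -> int:
    """What a bounds-checked copy would use instead of the raw length."""
    return min(length_field, entry_size)
```

With the maximum length field, more than 60KB of attacker-controlled data lands beyond the destination entry.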
CVE-2018-5924 – Stack-Based Buffer-Overflow while Parsing DHT Markers
Since the first vulnerability was located in a marker that shouldn’t be
supported by standard-compliant implementations, we chose to keep on looking for
vulnerabilities in additional markers. The DHT marker (0xFFC4) defines a special
Huffman Table that should be used when decoding the data frames of the file.
This function was even simpler than the previous one, as shown in Figure 13:
Figure 13: Decompiled code for the DHT marker vulnerability.
We can see that there is an initial parsing loop that reads 16 bytes, and
because each byte represents a length field, all of the bytes are accumulated
into an overall length variable.
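The accumulated length can be modelled in a few lines. The sketch below assumes the 256-byte local stack buffer reported for this function in this write-up; the helper names are hypothetical.

```python
STACK_BUFFER_SIZE = 256  # local stack buffer size reported for this function


def dht_total_length(counts: bytes) -> int:
    """Sum the 16 one-byte length fields that open a DHT segment."""
    if len(counts) != 16:
        raise ValueError("a DHT table starts with exactly 16 count bytes")
    return sum(counts)


def fits_in_buffer(counts: bytes) -> bool:
    """The check a safe parser would perform before copying the table data."""
    return dht_total_length(counts) <= STACK_BUFFER_SIZE
```

Sixteen bytes of 0xFF give a total of 16 * 255 = 4080, far beyond what the buffer can hold, so a parser that copies without this check overflows the stack.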
Building an Exploit
We chose to exploit the DHT vulnerability as it was the easiest to exploit. If
we recall, our debugging exploit also used a stack-based buffer overflow
vulnerability, meaning we only needed to perform minor modifications to our
debugging exploit.
Autonomous Payload – Implementing a Turing Machine
We could have used the same network-based loader that we used for our debugging
exploit; however our current attack vector had a major advantage: our full
payload can be stored inside the sent “JPEG” file. Relying on the fact that no
one performs any sanitization checks on our fax’s content, we could store our
entire payload inside the sent document, without worrying about it not being a
legal JPEG document.
Using this fact, together with the fact that the file’s file descriptor (fd) is
stored in an accessible global variable, we wrote a file-based loader. The
loader reads the payload from the file and loads it to memory. Later on, every
time the payload wants to perform a task using some input, it reads the input
from the same file and acts upon the instructions in it. Effectively, we
implemented a basic Turing Machine that reads input from the tape (the sent fax)
and acts accordingly.
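The tape-reading idea can be illustrated with a toy interpreter. Everything below is hypothetical: the opcodes, the record layout (a 1-byte opcode followed by a 2-byte big-endian operand size) and the harmless actions are invented for this sketch and do not describe the actual payload.

```python
import io
import struct

# Hypothetical opcodes for the toy tape format.
OP_END, OP_PRINT, OP_STORE = 0, 1, 2


def run_tape(tape, memory: bytearray) -> list:
    """Read records from a file-like tape and act on each one in order."""
    log = []
    while True:
        header = tape.read(3)
        if len(header) < 3:
            break  # end of tape
        opcode, size = struct.unpack(">BH", header)
        operand = tape.read(size)
        if len(operand) != size:
            raise ValueError("truncated record")
        if opcode == OP_END:
            break
        if opcode == OP_PRINT:
            log.append(operand.decode("ascii", "replace"))
        elif opcode == OP_STORE:
            if size < 2:
                raise ValueError("store record needs an offset")
            (offset,) = struct.unpack(">H", operand[:2])
            data = operand[2:]
            if offset + len(data) > len(memory):
                raise ValueError("store outside of memory")
            memory[offset:offset + len(data)] = data
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return log
```

For example, a tape built with `io.BytesIO` that holds a print record, a store record and an end record is consumed sequentially, and anything after the end record is ignored.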
Spreading Throughout the Network
Simply taking over a printer would be nice, but we wanted to do more. Indeed, if
we could take over the entire computer network that the printer is part of, we
could achieve a much bigger impact. So, knowing that one of the members of our
Vulnerability Research team knows Eternal Blue quite well [ref.14] and that our
Malware Research team did similar research on Double Pulsar [ref.15], we
decided to implement both NSA tools by using our file-based Turing Machine.
And so, our payload implemented the following features:
Taking over the printer’s LCD screen – demonstrating full control over the
printer itself.
Wrapping it all together
When we started our research, our goal was to show that the fax machine, which
is now mostly embedded in all-in-one printers, poses a security risk that was
yet to be considered by the research community. In our research we presented the
ITU T.30 fax protocol, including some of its extensions, such as Annex E that
defines how to send colourful faxes. These protocols, defined in the 90s, use
complex state machines, complicated compressions and several hard to implement
extensions.
Using the HP Officejet Pro 6830 all-in-one printer as a test case, we were able
to demonstrate the security risk that lies in a modern implementation of the fax
protocol. Using nothing but a phone line, we were able to send a fax that could
take full control over the printer, and later spread our payload inside the
computer network accessible to the printer.
We believe that this security risk should be given special attention by the
community, changing the way that modern network architectures treat network
printers and fax machines. From now on, a fax machine should be treated as a
possible infiltration vector into the corporate network.
Disclosure Timeline
The responsible disclosure process was coordinated with HP Inc, which was very
helpful and responsive during the process.
1 May 2018 – Vulnerabilities were disclosed to HP Inc.
At first we analysed the board, searching for a serial debugging port. And soon
enough, our (now broken) printer was connected to the serial debugger.
Search the vendor’s website for open source licenses of the products
In addition, identifying useful vulnerabilities in these open sources can be
done in many ways:
If you are already familiar with several vulnerabilities, simply check if they
are relevant
Stay tuned – US CERT distributes a weekly e-mail containing the newly published
CVEs
gSOAP debugging vulnerability – Devil’s Ivy
We would need to develop this exploit using only IDA and the basic serial dumps
that would be generated on each failed attempt.
Exploiting Devil’s Ivy
‘?’: 0x3F
A major advantage in this vulnerability was that our overflow was practically
unlimited, enabling us to send the entire exploit chain to be stored on the
target’s stack.
Decoded shellcode that loads our debugger’s network loader.
The full debugger will be sent to the loader over the network.
We leave the task of constructing the full exploit chain as an exercise to the
reader.
TIFF Compression Type 3 = G3 = ITU T.30 Compression T.4 = CCITT 1-D
TIFF Compression Type 4 = G4 = ITU T.30 Compression T.6 = CCITT 2-D
The compression scheme is basically a Run-Length-Encoding (RLE) scheme using
fixed Huffman tables for white codes, and black codes, as faxes are black and
white.
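The run-length half of such a scheme can be sketched as follows. This shows only how one bilevel scan line becomes alternating white and black run lengths (starting with a white run, which may be zero-length). The fixed Huffman code tables that turn each run length into a code word are deliberately left out.

```python
def run_lengths(pixels):
    """Convert a scan line of 0 (white) / 1 (black) pixels into run lengths.

    The result alternates white, black, white, ... and always starts with a
    white run, which has length 0 if the line begins with a black pixel.
    """
    runs = []
    current = 0  # lines start with a white run
    count = 0
    for pixel in pixels:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current = pixel
            count = 1
    runs.append(count)
    return runs


def expand(runs):
    """Inverse of run_lengths: rebuild the pixel list from the run lengths."""
    pixels = []
    colour = 0
    for run in runs:
        pixels.extend([colour] * run)
        colour ^= 1
    return pixels
```

In the real formats each run length is then replaced by a code word from the white or black table, which is where most of the decoder's complexity lies.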
Run in a loop and parse each of the supported markers
When finished, return the relevant data to the caller
CVE-2018-5925 – Buffer-Overflow While Parsing COM Markers
A local stack buffer of size 256 bytes is prepared for use – filled with zeros.
A second for loop uses the previous length field, and copies data from our file
into the local stack buffer.
A simple calculation could point out the vulnerability in this code: 16 * 255 =
4080 > 256. We have a controllable stack-based buffer overflow without any
limitations on our used chars; we couldn’t hope for a better vulnerability.
Checking if the printer’s network cable is connected.
Using Eternal Blue and Double Pulsar to attack a victim computer in the network,
taking full control over it.
To our knowledge, we now had the first (publicly documented) printer capable of
using Eternal Blue and Double Pulsar to autonomously spread an attacker’s
payload over a computer network.
1 May 2018 – HP Inc acknowledged our submission and started working on a patch.
May – June 2018 – Coordinated effort to recreate the PoC and patch the vulnerabilities.
2-3 July 2018 – Face to Face meeting with HP Inc:
The vulnerabilities were demonstrated and discussed.
The patches by HP Inc were tested and approved by both parties.
23 July 2018 – The vulnerabilities were flagged as Critical.
1 August 2018 – HP Inc published the patched firmware on their site [ref.1].
12 August 2018 – Official public disclosure during DEFCON 26.