Digital Investigation 6 (2010) 147–167
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/diin
Windows Mobile advanced forensics☆
C. Klaver*
Netherlands Forensic Institute, Dept. Digital Technology and Biometrics, Digital Technology Group, Postbus 24044,
2490 AA Den Haag, The Netherlands
Article info
Article history:
Received 31 December 2009
Received in revised form
9 February 2010
Accepted 10 February 2010
Keywords:
Windows mobile
NAND flash
TFAT file system
Live forensics
Heap
CEDB/EDB database
Logical/physical acquisition
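☆ The Netherlands government is authorized to reproduce and distribute reprints of this paper for governmental purposes notwithstanding any copyright notation thereon.
* Tel.: +31 (0)70 888 6423; fax: +31 (0)70 888 6559. E-mail address: [email protected]
1742-2876/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.diin.2010.02.001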
Abstract
Windows CE (at this moment sold as Windows Mobile) has been on the market for more than 10
years. In the third quarter of 2009, Microsoft reached a market share of 8.8% of the
more than 41 million mobile phones shipped worldwide in that quarter. This makes it
a relevant subject for the forensic community. Most commercially available forensic tools
supporting Windows CE deliver logical acquisition, yielding active data only. The
possibilities for physical acquisition are increasing as some tool vendors are starting to
implement forms of physical acquisition. This paper introduces the forensic application of freely
available tools and describes how known methods of physical acquisition can be applied to
Windows CE devices. Furthermore, it introduces a method to investigate isolated Windows
CE database volume files for both active and deleted data.
© 2010 Elsevier Ltd. All rights reserved.
1. Introduction
With Windows CE on the market for more than 10 years now,
Microsoft has a market share that makes it a relevant subject
for the forensic community. The first versions of Windows CE
were not very successful on the hand-held electronics market.
However, with the release of Windows Mobile 6, based on
Windows CE 5.2 (Herrera, 2009), Microsoft gained a market
share of 13.6% of the nearly 40 million mobile phones shipped
worldwide in the third quarter of 2008, although its share appears to be
falling in 2009 (Canalys, 2009).
Currently most commercial forensic tools that support
Windows CE (WCE) acquire data from the device through the
standard Remote Application Programming Interface (RAPI).
This results in the acquisition of only the active data. The
capturing of deleted data is not possible using just this method.
In 2005, PDA Seizure was one of the first tools that supported
logical acquisition of WCE devices. Nowadays, other tools like
MSAB’s .XRY and Cellebrite’s UFED support logical acquisition
of WCE devices. In Ayers et al. (2005), a comprehensive
overview of forensic tools for mobile devices is given.
MSAB is implementing physical acquisition of WCE devices
in its tool XACT (MSAB). Cellebrite is supporting physical
acquisition for Windows CE devices in their Physical-Pro
version of UFED (Cellebrite). Since 2003 Hengeveld (2009) has been
publishing his open source XDA tools. With this toolset,
among other things, an acquisition of RAM and flash memory
inside WCE devices can be done. All these tools assume a WCE
device that is not locked by a handset security code.
Revealing or circumventing security codes is beyond the scope
of this paper, but physical acquisition methods like chip
extraction, or the use of JTAG or a boot loader, work around
handset security codes. More advanced protection of a smart
phone would encrypt user data, imposing a new challenge to
forensic examination of such a mobile device. This is also
beyond the scope of this paper.
This paper takes forensic examination of WCE devices
beyond logical acquisition with commercial, off-the-shelf
forensic tools. In section 2 relevant aspects of the typical
hardware of a WCE device are described and physical loca-
tions that can contain user data are identified. Section 3
describes software components in a WCE device that are
involved in storing user data or can be used in a forensic
acquisition. Section 4 describes the process of acquiring
a forensic duplicate of data on a WCE based device. Section 5
covers methods for performing a physical acquisition of
a WCE device. Section 6 presents tools and techniques for
analyzing results of a physical acquisition. Section 7 discusses
results and future work is identified in section 8.
2. Typical WCE hardware
This section describes hardware elements in a typical WCE
device that can be relevant for a forensic examination of such
a device. Only a general overview will be given of aspects of
the processor, flash memory and RAM. Description of other,
more specialized hardware components falls outside the scope
of this paper.
2.1. Processor
With WCE Microsoft intends to deliver an Operating System
(OS) that can run on a range of hardware platforms. Currently
four families of processor cores are supported: ARM, MIPS, SH4
and x86 (Microsoft 1). Of these, ARM currently is most common
in consumer electronics like smart phones, PDAs and naviga-
tion devices. This paper focuses on ARM based devices.
The ARM processors used in WCE devices come from
various vendors. Examples that we have come across in
recent years include the Intel PXA2x0/PXA30x XScale family of processors
(Intel sold its activities in this field to Marvell (Intel, 2006)),
the Texas Instruments OMAP series (Texas Instruments) and
the Samsung S3Cxxxx range (Samsung).
One of the interesting aspects of these
processors is that nearly all peripheral devices needed to build
a smart phone are integrated into one chip. Such processors
are also referred to as System on Chip (SoC). This means that
much of the information that might be needed for physical
data extraction on a device powered by such a SoC must come
from the datasheets of this SoC. For example, to make a copy of the
NOR flash memory in the device using the Boundary Scan
technique, it might be necessary to know the memory layout of
the smart phone, that is, which chip select lines are connected to which
type of memory chip. Finding datasheets however is getting
harder because many SoCs are designed specifically for
mobile handset builders and distribution of datasheets is
strictly controlled. Where these datasheets are not
available, the necessary information can only be gained by
reverse engineering an exemplar device of the same make and
model as the evidence. The latest SoCs even contain the RAM
and flash dies inside one package, which makes physical
extraction of memory even more challenging. Besides that,
SoCs might contain special secure memory for storing data
like cryptographic keys.
2.2. Flash memory
Flash memory is widely used for non-volatile storage of data.
There are two main types of flash memory, NOR and NAND
flash (Knijff). Flash has specific properties that have forensic
relevance. For instance, as data cannot be updated in place in
flash memory, the data first has to be copied from flash to
RAM, changed and then copied back to a different, empty
location in flash. The data as it was before the change might remain
available through physical acquisition for quite
a while after the change (Breeuwsma et al., 2007).
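The out-of-place update behaviour described above can be illustrated with a toy model. This is a minimal sketch and not a real flash translation layer: the class `ToyFlash`, its page granularity and the sample strings are hypothetical and only serve to show why an expired copy can still appear in a physical dump while the OS sees only the latest version.

```python
"""Toy model of out-of-place updates in flash (not a real FTL)."""

ERASED = None  # stands in for an erased page; real flash reads back as all 1s


class ToyFlash:
    def __init__(self, num_pages):
        self.pages = [ERASED] * num_pages   # physical page contents
        self.mapping = {}                   # logical page -> physical page

    def write(self, logical, data):
        """Store data for a logical page in the next erased physical page."""
        physical = self.pages.index(ERASED)  # raises ValueError when full
        self.pages[physical] = data
        self.mapping[logical] = physical     # the previous copy is now expired

    def read(self, logical):
        """What the OS, and thus a logical acquisition, sees."""
        return self.pages[self.mapping[logical]]

    def physical_dump(self):
        """What a physical acquisition sees: every page that is not erased."""
        return [p for p in self.pages if p is not ERASED]


flash = ToyFlash(num_pages=8)
flash.write(0, b"SMS: meet at 9")
flash.write(0, b"SMS: (edited)")
print(flash.read(0))          # b'SMS: (edited)'
print(flash.physical_dump())  # [b'SMS: meet at 9', b'SMS: (edited)']
```

In this model the expired copy stays readable until the page holding it is erased, which is the window in which physical acquisition can recover it.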
2.2.1. NOR flash
This type of flash memory has a RAM-like interface; it has
a data bus, an address bus and control lines. NOR flash is
mapped in the processor’s memory map and processor code
can be executed directly from it (this is called ‘execute in
place’; XIP). NOR flash can also be used as storage location for
user data. Many older WCE devices have a single folder in the
root directory that is mapped to a section in NOR flash. With
a special driver, like Intel’s Persistent Storage Manager
(Intel, 2005) the part of the NOR flash memory that is not used
for code can be used for user data. In a forensic investigation,
this folder should not be overlooked. This folder is for
example very suitable for storing system backups and
because it resides in flash, deleted data can persist. When
a device with a completely drained battery makes a full
system reset, this folder might still contain a recent backup
of all user data.
2.2.2. NAND flash
NAND flash can be regarded as the solid state equivalent of
a hard disk. It has an interface with an I/O bus and control
lines connecting the memory chip to the processor. Over this
I/O bus, commands, addresses and data are sent. As NAND
flash memory is not mapped in the memory space of the
processor, code stored in a NAND flash chip cannot be
executed directly, but has to be loaded into RAM first, again
much like a hard disk.
After reset, boot loader code is loaded into RAM through
some mechanism that is dependent on the type of flash
memory used. Some flash memory types are capable of pre-
senting a boot block of flash memory through a NOR flash
interface, allowing the processor to boot from this block. The
code in this block will contain instructions to access the rest of
the blocks, in order to load the OS into RAM. Typical behavior
of WCE smart phones is that after the OS is loaded, it will
detect whether it is a cold reset. In that case, it will install
customization .cab files from the customization flash parti-
tion, often TFAT. After these files are installed, the device is
rebooted and it is ready for use.
Flash memory must be erased to all 1s before reuse, and
flash can only be erased in fairly large blocks, typically
between 128 and 512 kB. An erase block that mainly contains
expired pages can be made fully expired by copying away the
last few active pages, after which it can be erased. This means
that inactive data will be wiped beyond recovery by the
system itself, even in ‘quiescent’ state. Flash memory is worn
out by erasing. To minimize this effect, flash manufacturers
supply so called wear leveling algorithms with their
hardware. Wear leveling is the process in a flash memory
manager that spreads erase
actions evenly across the whole flash memory range.
Fig. 1 – Simplified memory layout of a WCE process (address range 0x0000-0000 to 0x01FF-FFFF; regions shown: code, heap, stack, free space, RAM based dlls).
2.3. RAM
RAM in WCE devices can contain various types of data that
can have forensic relevance. As modern devices can have
hundreds of MB of RAM, it is essential to know where to look
for relevant data. In WCE versions prior to 6.0 a process has
a virtual¹ address space of 32 MB (Boling, 2002). Fig. 1 shows
a simplified diagram of how various items are located within
the 32 MB address space. From the bottom up, first the code of
the process itself is loaded. From the highest address down,
dlls needed by the process are loaded. In between are the
stack and the heap for the process. The stack and the heap
are the locations where variables are stored during the life-
time of the process. From WCE 6.0 on, memory management
has changed quite drastically. For instance the addressable
Virtual Memory space for a process is no longer restricted to
32 MB (Microsoft 2).
2.4. Other
Besides NAND flash and RAM, a WCE device might have additional
memory for special purposes. Inside the processor, for
instance, special registers can hold
cryptographic data. A register holding a unique number might
function as a master key for encryption of data. For instance,
the Texas Instruments (2009) OMAP35x processors have a 128
bit CONTROL_DIE_ID at address 0x4830A218.
3. Typical WCE software components
This section describes software components in a WCE device
that are involved in storing user data or can be used in
a forensic context. An important notion when looking at WCE
devices is the difference between the kernel that the WCE
device is based on, and the version name of the retail OS
(Herrera). For instance, all Windows Mobile 6 versions are
based on WCE 5.2.
3.1. Bootloader
In some WCE devices the bootloader can be used as a tool to
get a physical image of memory in a WCE device. Some
bootloaders already have capabilities for this, although these are
sometimes barred by a security mechanism like a password
or the insertion of a special memory card. Other devices would
need an adapted bootloader to provide the functionality
needed for creating a physical image.
A reset on an ARM platform forces the processor to execute
the code at the reset vector. The reset vector is the (physical)
address from where the first instruction is fetched. For ARM
this address generally is 0x0000:0000. At the reset vector there
is normally an unconditional jump to the address where the
actual bootloader code is located. Bootloaders sometimes
have the functionality to copy various types of memory from
the device to external media, but the functionality is not
always accessible. Because it could facilitate SIM unlocking or
other forms of hacking, handset manufacturers make it
difficult to access this functionality.
¹ For a brief introduction to virtual memory, see www.windowsfordevices.com/c/a/Windows-For-Devices-Articles/What-is-virtual-memory/.
If a bootloader has accessible functionality to read memory,
then this is a very safe and fast way of obtaining memory
copies. If the bootloader functionality to read memory is not
accessible, one can replace the bootloader (risky), patch the
bootloader (risky), or find out why the functionality is not
accessible, for instance because of a password block, and find a way around
this. All of these steps are time consuming and labor intensive.
When applying this method with the objective to search
for deleted data, one must be sure to avoid booting into the OS.
Booting might enable the OS to reorganize flash memory, erasing
deleted data beyond recovery.
3.2. Heap
A heap is a portion of memory reserved for an application to
allocate and free memory on a per-byte basis (Microsoft 3).
The heap holds variables that are created with OS functions
like ‘malloc()’². Functions like this return a pointer to the
memory chunk offered by the OS, if the requested amount of
memory is available.
Investigating the heap of a process can yield very inter-
esting data. Often buffers for various purposes are located on
the heap. Think for instance of a buffer for receiving data from
other devices like NMEA data from an external GPS receiver,
a buffer to hold text that is to be printed on the screen or
a scratch pad for email composition.
Once a process no longer needs the memory it has
allocated, it will (when well programmed) return the memory to
the OS by calling an OS function like ‘free()’, with a pointer to
the memory it wants to return to the OS.
On the heap itself however, the data is not changed just
because its location is freed to the OS. The only thing that
happens is that the memory region is made available to the OS
again. In the heap the status of a memory block (active or free)
can be detected.
² The ‘C’ version of this function is used. Other high level languages might use the heap in other ways. This example is only illustrative.
The software managing the heap will try to keep the heap
as clean as possible. When N bytes are requested through
a call to ‘malloc(N)’, the heap manager will try to find a free
block (or contiguous free blocks) capable of holding at least
these N bytes. It might try to merge small free blocks into
a bigger free block by rearranging the heap. This is comparable
to the defragmentation process that can be applied to a hard
disk. How well the heap manager succeeds in fitting requested
blocks in free blocks and how well it defragments will influ-
ence the lifetime of data in ‘free’ blocks. In any case it is
possible to find data on the heap, either active or deleted, that
is otherwise not available to the user.
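The persistence of data in freed heap blocks can be shown with a small simulation. This is a minimal sketch of a first-fit allocator and not the WCE heap manager: the class `ToyHeap`, its bookkeeping and the NMEA-like sample string are hypothetical. It shows that freeing only changes a block's status and that a later, smaller allocation overwrites only part of the old content.

```python
"""Toy heap: freeing a block only flips its status, the bytes stay put."""


class ToyHeap:
    def __init__(self, size):
        self.memory = bytearray(size)
        self.blocks = []   # each entry: [offset, length, in_use]
        self.top = 0       # next offset that was never handed out

    def malloc(self, n):
        """Return the offset of a block of at least n bytes, or None."""
        for block in self.blocks:           # first fit among freed blocks
            if not block[2] and block[1] >= n:
                block[2] = True
                return block[0]
        if self.top + n > len(self.memory):
            return None
        offset = self.top
        self.blocks.append([offset, n, True])
        self.top += n
        return offset

    def free(self, offset):
        """Mark the block as free; its content is not wiped."""
        for block in self.blocks:
            if block[0] == offset:
                block[2] = False

    def store(self, offset, data):
        self.memory[offset:offset + len(data)] = data

    def free_block_contents(self):
        """Contents of blocks marked free, as a memory dump would show them."""
        return [bytes(self.memory[o:o + n])
                for o, n, used in self.blocks if not used]


heap = ToyHeap(64)
buf = heap.malloc(16)
heap.store(buf, b"$GPGGA,123519")      # e.g. a GPS receive buffer
heap.free(buf)
leftover = heap.free_block_contents()  # old buffer content is still present

small = heap.malloc(4)                 # reuses the freed block
heap.store(small, b"abcd")
print(bytes(heap.memory[:13]))         # b'abcdGA,123519'
```

The last line shows a partially overwritten buffer: how long such remnants survive depends on how the real heap manager reuses and merges free blocks.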
3.3. File systems
Most modern WCE devices are equipped with flash memory
hosting (T)FAT partitions for user data or firmware exten-
sions, and binary partitions with firmware and bootloader
code (Rogers et al., 2005), see Fig. 2. File systems are usually
not stored in NAND flash directly. The OS interfaces with a so
called Flash Translation Layer (FTL), which takes care of
storing File System blocks in NAND flash (Knijff, 2010, p. 390).
When analyzing the storage devices at file system level on
a WCE device, both binary partitions as well as File System
partitions can be found. Under normal use, the only partition
interesting for forensic analysis is the partition containing the
user file system. This partition usually contains a FAT or a TFAT
file system. TFAT is a Transaction Safe variant of FAT (Microsoft
4). As TFAT is transaction safe, sudden power loss, or other
interruptions of changes to the file system, will not lead to
a corrupt file system.
When looking at a TFAT file system image with a forensic
tool like EnCase or FTK, one can notice that the root directory
the user sees on the WCE device itself is often not the root
directory of the file system. WCE can create a directory called
‘__TFAT_HIDDEN_ROOT_DIR__’ and inside this directory all
files and directories are stored that are seen by the user
(Microsoft 5). This means that the call
CreateFile("\temp\myfile.dat")
resolves to
CreateFile("__TFAT_HIDDEN_ROOT_DIR__\temp\myfile.dat")
Another noticeable artifact is the presence of many
(deleted) file entries called ‘DONT_DELnnn’, where nnn is
a number starting at 000 and counting up. The reason for the
presence of these files is not exactly clear at this moment.
They might be there to make sure that flash erase blocks that
contain parts of File Allocation Tables only contain FATs, and
not parts of regular files. If regular files share erase blocks with
FATs, changes to such a file will lead to copying that file and
possible reallocation of the FATs to other erase blocks,
possibly causing performance loss of the file system.
Fig. 2 – Flash memory in WCE devices (elements shown: Flash Translation Layer (FTL), TFAT user data fs, TFAT custom fs, firmware, bootloader).
The user on a WCE device doesn’t see any of these files or
directories, because the file system drivers hide them from the
user. When analyzing a WCE TFAT file system image with
forensic tools, one can safely ignore the
‘__TFAT_HIDDEN_ROOT_DIR__’ directory entry itself (its contents are the user-visible files) and the ‘DONT_DELnnn’ entries.
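When comparing paths reported by the device with paths found in a TFAT image, the mapping described in this section can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and the pattern for ‘DONT_DELnnn’ assumes exactly three digits, as the numbering starting at 000 suggests.

```python
import re

HIDDEN_ROOT = "__TFAT_HIDDEN_ROOT_DIR__"
# Assumption: nnn is always three decimal digits (000, 001, ...).
DONT_DEL = re.compile(r"^DONT_DEL\d{3}$")


def user_to_volume_path(user_path):
    """Map a path as seen on the device to its path inside the TFAT image."""
    return HIDDEN_ROOT + "\\" + user_path.lstrip("\\")


def volume_to_user_path(volume_path):
    """Inverse mapping; returns None for entries outside the hidden root."""
    prefix = HIDDEN_ROOT + "\\"
    if volume_path.startswith(prefix):
        return "\\" + volume_path[len(prefix):]
    return None


def is_tfat_housekeeping(name):
    """True for directory entries created by the file system driver itself."""
    return name == HIDDEN_ROOT or DONT_DEL.match(name) is not None


print(user_to_volume_path("\\temp\\myfile.dat"))
# __TFAT_HIDDEN_ROOT_DIR__\temp\myfile.dat
```

The sketch only applies to volumes where the hidden root directory is actually present, which according to the text is often but not always the case.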
3.4. Databases
In WCE versions earlier than 4.0, all user data was stored in the
so called ‘object store’. The object store is a database con-
taining the file system, the databases and the registry. The
object store lived in RAM; when power failed, all user data was
lost. From WCE 4.0 on, the roles are reversed; the file system is
now hosting the databases and the registry files. In devices
where this file system is based on flash memory, user data is
less dependent on battery life.
Flash based file systems also allow for easier imaging of the
file system, compared to RAM based storage. After a flash
based file system image has been created from the WCE device,
the databases containing user data can be extracted from the
image. This can be done with normal forensic tools supporting
TFAT; as TFAT is compatible with FAT, most tools will load
TFAT images without problems. Once loaded, the two most
interesting databases are cemail.vol and pim.vol, both located
in the root directory of the file system, as seen by the user.
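Once the files of an image have been exported to a working directory, locating the database volumes can be automated. The following is a minimal sketch under assumptions not found in the text: the export directory layout is hypothetical, and recording a SHA-256 digest per file is a common forensic habit added here for illustration.

```python
import hashlib
from pathlib import Path

DATABASE_VOLUMES = {"cemail.vol", "pim.vol"}


def find_database_volumes(export_root):
    """Return {path: SHA-256 hex digest} for database volumes under export_root."""
    found = {}
    for path in Path(export_root).rglob("*"):
        # File names are compared case-insensitively, as FAT is not case sensitive.
        if path.is_file() and path.name.lower() in DATABASE_VOLUMES:
            found[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return found
```

Calling `find_database_volumes("export")` on a hypothetical export directory returns one entry per volume file found, which can then be handed to a database analysis step.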
4. WCE forensic investigation
When found during a criminal investigation, a WCE device has
to be treated just like any other mobile phone. In most cases, the first
goal is to avoid further changes to the phone as much as
possible. Phone data can be changed for instance by incoming
calls, received text messages, connecting to WiFi/Bluetooth
networks, recorded GPS data and depleted batteries. In order
to avoid these changes, the phone should be isolated from
GSM and other networks and from GPS signals, and it should be
powered by an external power supply. A discussion on this subject
can be found in Jansen et al. (2007), chapters 5.3 and 6.
Another cause of changes in phone data lies in the phone
itself. While on, either in active or quiescent state, the phone’s
OS is active. The OS might be trying to manage the various
types of memory in the phone. The flash file system might
rearrange flash pages and erase flash blocks that only contain
expired pages. The heap manager might be trying to rearrange
the heap structure to join small free items into bigger ones. To
stop these processes, the phone has to be powered down, but
this is not always wanted, for instance because it might
activate the handset security code, hindering logical acquisition of the
phone, or trigger memory rearranging or garbage collection.
Once connection to networks is properly prevented, data
on the WCE device can be acquired. As mentioned already,
two types of acquisition can be distinguished: physical
acquisition and logical acquisition. Depending on the
investigation, it has to be decided which of the two has to be done
first, because each acquisition type has its own
advantages and disadvantages.
Table 1 – Relative risks for data during logical and physical acquisition.
Risks          Physical acquisition                                               Logical acquisition
               Chip extraction    JTAG                         Bootloader
               (damaged chip)     (damaged PCB, ‘bricking’)    (‘bricking’)
Active data    High               High                         High               Low
Deleted data   High               High                         High               –
Physical acquisition can be a destructive operation. True
physical acquisition can mean physically removing
memory from the device, using hardware techniques like
JTAG to extract data from the device, or using an (adapted)
bootloader to gain low level access to the device. Most of the
physical data extraction methods hold some risk of destroying
data, the device or both.
For WCE devices there are ways to do an acquisition that is
somewhere in between a physical acquisition and a logical
acquisition. A copy can be made of the flash file system over an
ActiveSync connection. This requires a dedicated dll to be
loaded into the system under investigation, thus overwriting
RAM and possibly flash memory. The result however is an image
at file system level and not at flash hardware level. Because of
this, only unallocated clusters that reside in active flash pages
are in the image. Expired flash blocks that are no longer part of
the file system but still might contain data will not be copied. In
this paper, this approach is referred to as pseudo physical acquisition.
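To make concrete what "unallocated clusters in the image" means, the sketch below lists the clusters that a FAT16 volume marks as free, since these are the places in a file system level image where deleted data may still reside. It is a minimal illustration that assumes a plain FAT16 on-disk layout with the boot sector at offset 0 of `image`; it reads only the first FAT copy and ignores any TFAT specifics. The function name is hypothetical.

```python
import struct


def fat16_free_clusters(image):
    """Return (cluster_number, byte_offset) for clusters marked free in the first FAT."""
    # BIOS parameter block fields at their standard offsets in the boot sector.
    bytes_per_sector, sectors_per_cluster, reserved, num_fats, root_entries = \
        struct.unpack_from("<HBHBH", image, 11)
    total16, = struct.unpack_from("<H", image, 19)
    fat_sectors, = struct.unpack_from("<H", image, 22)
    total32, = struct.unpack_from("<I", image, 32)
    total_sectors = total16 or total32

    # The fixed-size root directory sits between the FATs and the data area.
    root_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    first_data_sector = reserved + num_fats * fat_sectors + root_sectors
    cluster_count = (total_sectors - first_data_sector) // sectors_per_cluster

    fat_offset = reserved * bytes_per_sector
    free = []
    for cluster in range(2, cluster_count + 2):   # clusters 0 and 1 are reserved
        entry, = struct.unpack_from("<H", image, fat_offset + 2 * cluster)
        if entry == 0x0000:                        # 0 marks a free cluster
            sector = first_data_sector + (cluster - 2) * sectors_per_cluster
            free.append((cluster, sector * bytes_per_sector))
    return free
```

The returned byte offsets can be used to extract the content of each free cluster for keyword searching or carving.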
Logical acquisition is generally safer for active data. It does
not carry the risk, inherent in physical extraction, of losing
all active data. However, setting up an
ActiveSync connection to do a logical acquisition can change
data related to the ActiveSync connection itself. Another
downside is that during logical acquisition deleted data, that
still resides in the system, might be erased beyond recovery.
Because logical acquisition uses the system that is being
investigated, the processes in the phone that are used during
acquisition are using memory, RAM and possibly flash.
Another cause of permanent loss of deleted data is active
Wear Leveling and Garbage Collection in a working system.
Garbage Collection is a phenomenon that occurs in RAM,
where blocks of data that are no longer referred to by pointers
are freed by the OS and made available for reuse.
Sometimes logical acquisition is not possible, for instance
when the device is broken beyond repair, or when the device
does not have a standard interface to do the logical acquisition
over.
In cases where active data might be enough for the
investigation, doing a logical acquisition and a pseudo physical
acquisition on a WCE device before doing a physical acquisition
is the safest way to go. The risk of changing or destroying some
deleted data due to logical acquisition is then regarded as lower than
the risk of losing all data in a physical acquisition. Table 1
shows relative risks for active and deleted data in physical and
logical acquisition. When executed by an experienced
investigator, and when a reference model is available, the
absolute risks of physical extraction are acceptable.
Physical acquisition might be the only option in cases where
there is an active phone lock or a non functioning phone.
Physical acquisition methods generally work at a low level and
are not hindered by the phone lock. The NAND flash of a broken
phone might still be working, allowing physical chip extraction.
There might be other situations where it is necessary to first
do a physical acquisition: when there is a strong indication that
the essential evidence is in deleted data, the risk of overwriting
this evidence by switching on the phone and doing a logical
acquisition might be considered higher than the risk of losing
the evidence through a failed physical acquisition.
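The considerations in this section can be summarized as a small decision helper. This is a simplified sketch and no substitute for case-by-case judgment: the function name and its four boolean inputs are hypothetical, and the order of the steps that follow a physical-first acquisition is an assumption, as the text only states which acquisition comes first.

```python
def acquisition_order(device_works, phone_locked, has_standard_interface,
                      evidence_likely_deleted):
    """Return a suggested order of acquisition steps as a list of strings."""
    if not device_works or phone_locked or not has_standard_interface:
        # Logical acquisition is not possible; physical methods work at a
        # low level and are not hindered by the phone lock.
        return ["physical"]
    if evidence_likely_deleted:
        # Switching the phone on and syncing could overwrite deleted data.
        # Assumption: the remaining steps follow afterwards in the usual order.
        return ["physical", "logical", "pseudo physical"]
    # Active data is probably sufficient: start with the safest steps.
    return ["logical", "pseudo physical", "physical"]


print(acquisition_order(True, False, True, False))
# ['logical', 'pseudo physical', 'physical']
```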
5. Physical acquisition
In this section, several methods for getting a forensic image
from a WCE device are described. The methods are described
in order of forensic soundness. One has to realize that the
success of the described methods greatly depends on the
experience the investigator has with applying these methods.
Incorrectly applying these methods may destroy the WCE
device, the data in it, or both.
5.1. Physical chip extraction
In a WCE device, the investigation of the file system residing in
flash memory is best done by accessing the flash memory
directly. This method ensures that the OS does not interfere with
the data in memory. However, this type of acquisition might not
be feasible due to lack of necessary equipment. Breeuwsma et al. (2007),
Section III-C, describe how to remove a BGA
memory chip from a PCB and subsequently read the content of
the memory device. Desoldering the flash memory chip from
a WCE device might be an option in the following cases:
• Every risk of losing deleted data has to be eliminated
• The device is not working anymore
• No (known) possibility for access through JTAG
This method has some downsides:
• TSOP/BGA rework equipment is required
• Memory reader equipment is required
• The memory reader tool might not support the target chip
• The datasheet of the chip might not be available
Fig. 3 – Making a file system image of an M-Systems DoC through JTAG Technologies’ tools. Normal situation for ‘dimage’: dimage and Tffs.dll on PC hardware that hosts the DoC. ‘dimage’ connects to WCE DoC through JTAG: dimage and Tffs.dll on PC hardware, connected through JTAG to the DoC in the WCE hardware.
5.1.1. Case example
In an investigation, police seized a Fiat 500 equipped with
a Blue&Me multimedia set. Blue&Me is an “in-car
communication system”, based on WCE for Automotive
(BusinessWeek, 2006; Microsoft 6). The investigation required
the examination of the content of the device, as it could
contain information on GSM handsets paired to the Blue&Me (B&M) unit,
SMS messages received with the handset or MP3 files played
with it. As time was limited it was decided not to look at RAM
data and to focus only on flash memory. Three options for
accessing the flash data were identified:
• Acquisition through the USB port on the board
• Acquisition through JTAG
• Desoldering the flash chip
From a Fiat dealer several scrap units were obtained as
exemplar devices. On these it was established that the flash
chips were of a well known type (Samsung K9F5608U0D-JIBO)
and that, because of component placement, it was rather
easy to do physical extraction as described in Breeuwsma
et al. (2007). Furthermore, no information could be
obtained in reasonable time on how to get access through
USB, nor on the JTAG Test Access Points (TAPs) in the device.
This led to the decision that desoldering and subsequently
reading the memory chip with the NFI Memory Toolkit was
the quickest and most sound way to obtain a copy of the
flash chip.
This procedure was first tested on the exemplar. The file
system of the exemplar was reconstructed from the memory
copy and it appeared to contain:
- Bluetooth MAC addresses from devices connected to the
B&M set
- Full pathnames of MP3 files played
- Contact lists from paired phones
- Call history
Then the exhibit was processed the same way. It appeared
that none of the phones found earlier in the investigation had
been paired to the Blue&Me kit, so no further investigation
was necessary. As usual, new knowledge produces new
questions: is the above list complete? Probably not. For
example, the device is able to read out loud incoming SMS
messages, which indicates that received SMS messages will
probably be stored in the B&M unit.
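⁴ M-Systems announced this chip End Of Life (EOL) in October 2005, see www.sandisk.com.tw/Assets/File/OEM/Manuals/eol/mdoc/EOL-DOC-0505.pdf, but the chip is still found in older devices.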
5.2. JTAG
In Breeuwsma (2006), a method is described on how to find
and use JTAG Test Access Points to obtain copies of
memory in JTAG enabled digital devices. In that paper,
a WCE device, the HP iPaq h1930, is investigated. It was
shown that it is possible to access SDRAM and flash
memory in this device.
5.2.1. Case example
In a case we received an HP Hx2790, of which we wanted to
acquire an image of the internal flash memory, an M-Systems³
DiskOnChip (DoC) G3 type MD4331-d1 G-V3Q18X.⁴ After
having identified the JTAG pins on the device, and the
configuration of the JTAG chain, we searched for tools that
would be able to read the DoC. We found a tool set from
a Dutch company called JTAG Technologies. Their tool set
provides a mechanism to let M-Systems’ own utility ‘dimage’,
designed to make an image of a DoC hosted by a PC,
communicate with a DoC in another device through the JTAG
protocol, as if the DoC is on the same PC as the dimage utility
itself (JTAG). In this setup a file system image of the DoC in
a WCE device can be made. The Flash Translation Layer is in
the tffs⁵ library. It is crucial that the right version of the tffs
software is used (Fig. 3). Details of the flash translation
apparently change even between minor versions. A 6.3 tffs dll
is not capable of reading a 6.2 formatted DoC. As this method
offers a file system level image, expired pages within the DoC
are not found in the image, although these pages might
contain relevant information.
³ M-Systems was acquired by SanDisk in 2006, see www.sandisk.com/about-sandisk/press-room/press-releases/2006/2006-11-19-sandisk-completes-acquisition-of-msystems.
The result is shown in Fig. 4. The first 0x30 bytes show data
indicating this is a dump from an M-Systems device. We also
recognize the TFFS version 6.2.20 in this part. Then at offset
0x10C0 a Master Boot Record can be recognized. At offset
0x12C0 the boot sector of a TFAT16 file system is recognized.
Loading the image in EnCase is still problematic; this is
currently being researched further.
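One way to work with an image in which the Master Boot Record does not sit at offset 0 is to parse the partition table at the observed offset and carve the partition into a separate file. The sketch below illustrates this under assumptions that the text does not confirm: 512-byte sectors, and partition start values that are relative to the position of the MBR (which would be consistent with the MBR at 0x10C0 and the boot sector at 0x12C0 being 0x200 bytes apart). The function names are hypothetical.

```python
import struct

SECTOR = 512  # assumed sector size


def parse_mbr(image, mbr_offset):
    """Return the non-empty partition entries of the MBR found at mbr_offset."""
    mbr = image[mbr_offset:mbr_offset + SECTOR]
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR signature at offset 0x%X" % mbr_offset)
    partitions = []
    for i in range(4):
        # 16-byte entry: status, CHS start, type, CHS end, LBA start, sector count.
        _status, _chs_first, ptype, _chs_last, lba_start, sectors = \
            struct.unpack_from("<B3sB3sII", mbr, 446 + 16 * i)
        if ptype != 0:
            partitions.append({
                "type": ptype,
                "lba_start": lba_start,
                "sectors": sectors,
                # Assumption: LBA values count from the MBR, not from offset 0.
                "offset": mbr_offset + lba_start * SECTOR,
            })
    return partitions


def carve_partition(image, partition):
    """Return the raw bytes of one partition, for loading as a separate image."""
    start = partition["offset"]
    return image[start:start + partition["sectors"] * SECTOR]
```

A carved partition that starts with a valid boot sector can then be offered to a forensic tool as a stand-alone volume image.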
5.3. Bootloader
An example of a bootloader that has been reverse engineered
by people at xda-developers.com is the HTC Hermes
bootloader (XDA Developer, 2008). Another example is a process
where the bootloader is replaced by one with capabilities to
copy the TFAT file system (XDA Developer, 2007). The
author claims that the command ‘fat2sd 3’ will copy the
internal NAND based file system to SD at file system
level. Judging by output lines like ‘Nand2SDReorder
start.’, the flash translation is presumably executed by
the bootloader. The output presented on this site looks like a valid
Windows CE TFAT root directory, including cemail.vol and
pim.vol files.
More research is needed to explore the possibilities of
these techniques on recent WCE devices.
2005, see www.sandisk.com.tw/Assets/File/OEM/Manuals/eol/mdoc/EOL-DOC-0505.pdf, but the chip is still found in olderdevices.
5 TrueFFS (tffs) is the flash file system developed by M-Systems.
Fig. 4 – Three sections of the image of an M-Systems DiskOnChip from an HP HX2790.
Fig. 5 – Software architecture of RAPI tools. (Diagram labels: PC, Windows CE device, ActiveSync, RAPI server, itsutils.dll, command line shell.)
5.4. Pseudo physical acquisition
There are several tools available for doing pseudo physical
acquisition. In this paper, the focus is on RAPI tools. XACT is
well documented and is not covered in depth here. The
RAPI tools are not specifically designed for forensic acquisi-
tion, so the use of these tools in a forensic context requires
special care.
5.4.1. XACT
As of version 3.3, XACT supports the acquisition of the WCE file
system, but for this it needs to load an ‘agent’ onto the device
under investigation. The user has the option to store the agent
on an external storage card, avoiding unnecessary changes to
the device file system. The agent then needs to be loaded into
RAM so that it can be called by the ActiveSync server, thus
overwriting unallocated RAM.
The result of an acquisition with XACT is a file system level
copy of the device.
5.4.2. RAPI tools
Another set of tools that can be used to obtain images from
a live WCE device is the so-called RAPI tools, developed by
Hengeveld (2009). This toolset is a collection of some 30
command line programs that are executed on a PC and
operate on the WCE device over an ActiveSync connection.
All commands communicate with the RAPI server, which
is running on the WCE device. Some tools only use the native
API that the RAPI server provides; other tools need
more advanced access and use a helper library called
‘itsutils.dll’. This library is copied onto the WCE device
and loaded into memory by the RAPI server process. The tool
can then access specialized functions in the helper library.
Fig. 5 shows this: the RAPI server interacts with the WCE
device directly through the API functions (dotted arrow), and
through the helper dll (dashed arrow).
In early versions of the RAPI tools, the dll was always copied
into the directory \Windows. As of RAPI tools version 080731,
the location on the WCE device where the helper library is
copied to can be changed by adding a key to the PC’s registry:
HKEY_CURRENT_USER\Software\itsutils
devicedllpath = "\Storage Card\itsutils.dll"
Also, itsutils.dll can write messages to a log file. By default,
logging is off; when it is on, the log file is written to the root directory.
Adding another key sets the log file destination and switches
logging on or off:
HKEY_CURRENT_USER\Software\itsutils
devicelogpath = "\Storage Card\itsutils.log"
logtype = dword:00000002
The log type has the following meaning: 0: no logging, 1: kernel log,
2: file.
Fig. 6 – Screen capture of WCE 5.2 Task Manager on an HTC
Blackstone100.
The above allows ‘itsutils.dll’ to be copied and the
log file to be written to a memory card instead of the internal flash memory
of the WCE device, thus avoiding overwriting unallocated flash
pages in the WCE device.
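As an illustration, the two registry settings above could be prepared as a .reg file that is imported on the acquisition PC before the tools are run. The Python sketch below only builds the file text. It assumes that the path values are plain string values and that logtype is a DWORD, as the listings above suggest; the function name render_itsutils_reg is hypothetical and not part of the RAPI tools.

```python
def render_itsutils_reg(dll_path, log_path, log_type=2):
    """Return .reg file text for the itsutils settings described above."""
    if log_type not in (0, 1, 2):        # 0: no logging, 1: kernel log, 2: file
        raise ValueError("log_type must be 0, 1 or 2")

    def quoted(value):
        # String values in a .reg file need backslashes and quotes escaped
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_CURRENT_USER\Software\itsutils]",
        '"devicedllpath"=' + quoted(dll_path),
        '"devicelogpath"=' + quoted(log_path),
        '"logtype"=dword:%08x' % log_type,
        "",
    ]
    return "\r\n".join(lines)


# Keep both the helper dll and the log file on the storage card
reg_text = render_itsutils_reg(r"\Storage Card\itsutils.dll",
                               r"\Storage Card\itsutils.log")
print(reg_text)
```

Keeping the settings in a file makes it easy to document, as part of the case notes, exactly which values were active during an acquisition.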
Any unsigned executable on a WCE device will only run
after permission by the user. So for the helper dll to be loaded,
the user has to give permission on the screen. Furthermore,
a WCE device can be configured so that there are restrictions on the
execution of code through RAPI calls. To change restricting
policies, one value in a registry key has to be changed. Several
options are available for this. One option is to use the RAPI tool
‘prapi’ with the command line option -p 4097 1. This will set
the registry value 0x1001 (4097 decimal) in [HKLM\Security\Policies\Policies]
to 1 (Hengeveld). Some devices do not allow this value
to be set through the RAPI. In that case a registry editor on the
WCE device itself could be used to change it. If one
does not want to install a full-blown registry editor, a small
command line program could be created that just opens the
registry key ‘‘Security\Policies\Policies’’ in HKLM by calling
‘‘RegCreateKeyEx’’ and subsequently sets registry value
0x1001 to 1 by calling ‘‘RegSetValueEx’’. This program could
then be loaded from an SD memory card, minimizing changes
to RAM usage.
Fig. 7 – pps executed on same device.
Whichever method is chosen, some data on the target
device will be changed. This might be violating the rule that
‘‘No actions performed by investigators should change data
contained on digital devices or storage media that may
subsequently be relied upon in court’’ (ACPO). But as there
often is no feasible alternative, the evidentiary implications of
the changes should be evaluated first (maybe data related to
the ActiveSync connection is not relied upon in court) and
only after accepting the implications, the method can be
applied.
The following sections discuss some useful RAPI tools.
5.4.2.1. pps. With the pps tool, all processes in the WCE
device can be listed. This is particularly interesting because
the native WCE Task manager does not show all processes. As
shown in Figs. 6 and 7, pps shows a complete list of all
processes running on the WCE device, whereas Task Manager
only shows a few.
Fig. 8 – Making a copy of the working memory of the tmail.exe process on a WCE device.
5.4.2.2. pmemdump. With the exact names of the processes
in the listing in Fig. 7, a dump of the working memory of
a process can be made. The tool has several options for this.
The most straightforward way is to make a complete dump.
Processes in WCE versions before 6.0 have a process space of 32 MB. Not all of
this space is actually backed by physical memory, but one can
make a 32 MB dump of the processes in the pps list. Fig. 8 shows
how a copy is made of the RAM used by the process tmail.exe.
The copy is 32 MB in size and stored in a file tmail.exe.bin. One
of the interesting parts inside a dump like this is the heap. In
Section 6.2, it will be shown how to find the heap inside
a memory dump and how to analyze the heap.
5.4.2.3. pdocread. pdocread can be used to make a copy of
partitions on storage devices inside a WCE device. Originally it
was aimed at copying the M-Systems DiskOnChip managed
NAND flash chips that were found in many smart phones. The
program grew into a versatile tool to make copies of managed
NAND flash of manufacturers like Samsung and Qualcomm.
The first step is to find out what partitions are present on the
WCE device. This can be done by running ‘pdocread -l’. In Fig. 9
this command is given to an HTC S730.
In Fig. 9 one can see that there are 4 partitions in this
particular device; these are the partitions pointed out in Section
3.3. We are mainly interested in the partition containing user
data. The file system of the partition will be TFAT or FAT. As the
file system type is stored in the boot sector of the partition, let’s
look at the first 0x200 bytes of each partition. The easiest way is
to use the handle references #0 through #3, listed in the four rows
right below ‘STRG handles’.
Fig. 9 – Output of pdocread, listing all partitions in a WCE device.
In Fig. 10 we see three attempts to read the first page. The
first attempt fails because in this case the tool tries to read
a DiskOnChip memory, which apparently is not present on this
WCE device. The second attempt also fails: with the ‘-w’ option, the
tool now uses the generic Windows file system API instead of the
DoC API, but the block size is still not specified correctly. The
third attempt succeeds; here the block size is set to 0x800
bytes, which is the correct value here and a very common value
in many WCE devices. In these first 0x200 bytes, a regular boot
sector can be found. Notice that the file system type here is
TFAT. In the listing of the partitions in Fig. 9, we can see that the
partition under handle #0 has a size of 133.00 MB, which is
0x8500000 bytes. This size will be used to make a full copy of
this partition.
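The manual check of the first 0x200 bytes and the size calculation can be sketched in Python as follows. The sketch assumes that a TFAT volume keeps the standard FAT boot sector layout (bytes per sector at offset 0x0B, file system type text at 0x36 for FAT12/16 or 0x52 for FAT32, signature 0x55 0xAA at 0x1FE) and that the type text contains the string "FAT"; the exact label "TFAT32" in the synthetic sector is an assumption, and the function names are hypothetical.

```python
import struct

def describe_boot_sector(sector):
    """Extract a few standard FAT boot sector fields from 0x200 bytes."""
    if len(sector) < 0x200:
        raise ValueError("need at least 0x200 bytes")
    bytes_per_sector = struct.unpack_from("<H", sector, 0x0B)[0]
    # FAT12/16 keep the type text at 0x36, FAT32 at 0x52 (8 ASCII bytes each)
    labels = [sector[off:off + 8].decode("ascii", "replace").strip()
              for off in (0x36, 0x52)]
    fs_type = next((label for label in labels if "FAT" in label), None)
    return {
        "signature_ok": sector[0x1FE:0x200] == b"\x55\xaa",
        "bytes_per_sector": bytes_per_sector,
        "fs_type": fs_type,
    }

def partition_size_bytes(size_mb):
    """Convert the size in MB as listed by the tool into bytes."""
    return int(size_mb * 1024 * 1024)

# Synthetic boot sector standing in for the first 0x200 bytes of handle #0
sector = bytearray(0x200)
struct.pack_into("<H", sector, 0x0B, 0x800)
sector[0x52:0x5A] = b"TFAT32  "
sector[0x1FE:0x200] = b"\x55\xaa"

info = describe_boot_sector(bytes(sector))
size = partition_size_bytes(133.00)
print(info)
print(hex(size), "bytes =", hex(size // 0x800), "blocks of 0x800 bytes")
```

For the 133.00 MB partition this gives 0x8500000 bytes, the value used for the full copy, which corresponds to 0x10A00 blocks of 0x800 bytes.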
In Fig. 11 the next three partitions are checked. In none of
these a regular file system is recognized, so they will be left alone.
In Fig. 12 finally a full image of the partition under handle
#0 is made. The output is sent to the file htc_wing220_h0.bin.
In Fig. 13 the dump is shown. Here we have a TFAT32 partition,
with a sector size of 0x800 bytes.
6. Forensic analysis of the physical image
In this section we are going to analyze the images and dumps
from various sources that we have found in the previous section.
Fig. 10 – Output of pdocread, trying to dump the first 0x200 bytes of partition #0.
6.1. Flash
6.1.1. Reconstructing the file system
Breeuwsma et al. (2007), section IV-A, describes how to
reconstruct the file system from a physical acquisition of
a NAND flash chip. This principle is also applicable to WCE
devices. In the WCE devices that we have come across, the file
system could be reconstructed rather easily. Data in NAND
flash is organized in pages. We have come across page sizes of
0x210 and 0x840 bytes. When the page size is 0x210, usually
the last 0x10 bytes form the spare area. In the spare area, bytes 0–3
indicate the Logical Block Number (LBN), and byte 6 indicates
the state of the page. The state can be free (0xff), busy
(0xf9) or expired (0xf8). With the following pseudo code the
pages can be reordered to form a valid TFAT image.
- While possible:
  - Get the file offset
  - Read 0x210 bytes
  - State = byte at 0x206
  - LBN = bytes at 0x200 through 0x203
  - If state is 0xf9, store (file offset, LBN) -> active pages list
  - If state is 0xf8, store (file offset, LBN) -> expired pages list
  - If state is 0xff, store (file offset, LBN) -> free pages list
- Sort active pages list by LBN
- While pages in active pages list:
  - Goto offset of page
  - Read 0x200 bytes
  - Gap = LBN - previous LBN
  - If gap > 1:
    - While gap > 1:
      - Write empty page to output
      - Decrease gap by 1
  - Write 0x200 bytes to output
- While pages in expired pages list:
  - Goto offset of page
  - Read 0x200 bytes
  - Write 0x200 bytes to output
Fig. 11 – Output of pdocread, dumping the first 0x200 bytes of partition #1 through #3.
Fig. 12 – Making a full image of partition #0.
6 This API function is obsolete and should be replaced by CeOpenDatabaseEx2 (database).
When the file system is reconstructed this way, it can be
loaded into tools like EnCase or FTK.
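The reordering pseudo code can be sketched in Python as follows. This is a minimal sketch, not the implementation used for the cases described here. It assumes a page size of 0x210 bytes, a little-endian LBN, zero-filled pages for gaps, and a first logical block number of 0; none of these details are stated above. The expired pages are returned separately so that they can be saved to their own file.

```python
import struct

PAGE_SIZE, DATA_SIZE = 0x210, 0x200
STATE_FREE, STATE_BUSY, STATE_EXPIRED = 0xFF, 0xF9, 0xF8

def classify_pages(dump):
    """Collect (file offset, LBN) pairs per page state from a raw dump."""
    pages = {STATE_BUSY: [], STATE_EXPIRED: [], STATE_FREE: []}
    for offset in range(0, len(dump) - PAGE_SIZE + 1, PAGE_SIZE):
        # Assumption: the 4 byte LBN at 0x200 is stored little-endian
        lbn = struct.unpack_from("<I", dump, offset + 0x200)[0]
        state = dump[offset + 0x206]
        if state in pages:
            pages[state].append((offset, lbn))
    return pages

def rebuild_image(dump):
    """Return (file system image, expired page data) for a raw dump."""
    pages = classify_pages(dump)
    image = bytearray()
    previous_lbn = -1          # assumption: logical blocks are counted from 0
    for offset, lbn in sorted(pages[STATE_BUSY], key=lambda page: page[1]):
        gap = lbn - previous_lbn
        # Write one empty (zero-filled) page for every missing logical block
        image += b"\x00" * (DATA_SIZE * max(gap - 1, 0))
        image += dump[offset:offset + DATA_SIZE]
        previous_lbn = lbn
    expired = b"".join(dump[offset:offset + DATA_SIZE]
                       for offset, _ in pages[STATE_EXPIRED])
    return bytes(image), expired
```

A dump read from a file can be passed in directly, for instance with rebuild_image(open(path, "rb").read()), where path is the location of the raw chip image.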
6.1.2. Where can active data be found
The typical WCE file system and registry are very similar to the
Windows desktop equivalents. With tools like EnCase a file
system image can be investigated. One of the challenges is
that much less is known about forensically interesting
artifacts of applications on WCE devices than about artifacts on
desktop OSes. A lot of research still has to be done to fill this
knowledge gap.
Besides file system and registry, there is generally a set
of databases for Messaging and Personal Information
Management (PIM) data on a WCE device. These databases
are grouped into two volume files, cemail.vol for messaging
related databases and pim.vol for PIM related data. The
cemail.vol volume is in CEDB format, pim.vol is in the
newer EDB format (Microsoft 7). Both formats are proprie-
tary and little formal documentation can be found on the
internals. WCE databases can be decoded through the
following steps:
- Extract cemail.vol from a file system image, for instance with EnCase
- Use ‘cedb400.dll’ to open cemail.vol and read all items, see
6.1.2.1
- Use the EDB API on a Device Emulator to read pim.vol, see 6.1.2.2
6.1.2.1. cemail.vol. When a cemail.vol database is extracted
from a WCE device image, the database can be read ‘in isola-
tion’ by using a library that comes with the Windows CE
development environment called Platform Builder (PB). With
PB come a number of tools and utilities (Microsoft 8) that can
be useful for forensic purposes. For instance, when installed
on a PC, in the folder ‘WINCE520\PUBLIC\COMMON\OAK\BIN\
I386’ the library ‘cedb400.dll’ can be found. This dll contains all
functions to read a CEDB database. The API of ‘cedb400.dll’ is
shown in Table 2.
With the following pseudo code, all data can be read from
a CEDB volume file.
- CeMountDBVol(file name)
- CeFindFirstDatabaseEx
- While database = CeFindNextDatabaseEx:
  - CeOidGetInfoEx(database)
  - CeOpenDatabaseEx(database)
  - While record = CeReadRecordPropsEx(database) (footnote 6):
    - While property in record:
      - Print property
      - Update MD5
    - Print MD5 of record
A tool was written, called xpdumpcedb.exe, following this
algorithm. The tool reads a cemail.vol (or any other CEDB
formatted volume file) and produces an XML file containing all
active data in the volume file. Fig. 14 shows a part of the
output when processing a sample cemail.vol file. The table
shown is the Inbox from the SMS root folder.
In the appendix, Table 5 lists some of the databases found
in the cemail.vol file. The meaning of some fields in
cemail.vol can be found in header files in the Platform Builder
directory; for instance, property IDs from 0x3000 to 0x3FFF are
defined in WINCE500\PUBLIC\IE\SDK\INC\wabtags.h and
WINCE500\PUBLIC\DATASYNC\SDK\INC\addrmapi.h. The
meaning of some of the other fields in these databases has
been identified by reverse engineering. There are still other
fields of which the meaning has not yet been determined.
Furthermore, unlike in a regular database, each record within
a table can have a different number of fields, which makes it
hard to determine when a database has been fully understood
while reverse engineering this volume.
6.1.2.2. pim.vol. A PC targeted equivalent of ‘cedb400.dll’ for
EDB formatted volume files has not been found. An alternative
to reading an extracted EDB formatted volume file like pim.vol
in isolation is running a tool like xpdumpcedb on a WCE device,
or preferably, a WCE emulator. Microsoft provides a WCE
emulator that is suitable for running a decoder program similar
to that described in 6.1.2.1. The emulator can use a directory on
the host PC as a shared folder. This shared folder can be used to
store the EDB volume and the decoder tool. A tool called
‘wmdumpedb.exe’ was created to read an EDB file and produce
an XML file containing all active data in the edb volume.
In the appendix, Table 6 lists some of the databases found in
the pim.vol file. The meaning of some of the fields in pim.vol
can be found in header files in the Platform Builder directory,
for instance WINCE500\PUBLIC\SERVERS\SDK\SAMPLES\OBEX\SRVRMODS\VUTILS\pegmapi.h
and WINCE500\PUBLIC\SERVERS\SDK\SAMPLES\OBEX\SRVRMODS\VUTILS\splustag.h.
The meaning of some of the other fields has been
identified by reverse engineering. There are still other fields of
which the meaning has not yet been established. It has not been
investigated whether the pim.vol volume (an EDB
formatted database) shares the property of a CEDB database
that not all records have the same fields.
Fig. 13 – A dump of an HTC S730 made with pdocread.
6.1.3. Where deleted data can be found
As in any file system, deleted data can often be recovered from
a TFAT image from a WCE device. Data from deleted files can be
found in unallocated clusters and in file slack. As mentioned
earlier, TFAT is compatible with FAT, so an image is easily
loaded into various forensic toolkits.
On WCE devices, the databases storing message and PIM
data can also be explored for deleted data, as databases can
contain deleted information as well (Stahlberg et al., 2007).
Data deleted from the database volume file might not be found
in unallocated clusters at all. This section deals with finding
deleted data in cemail.vol.
Table 2 – Functions exposed by cedb400.dll.
Function name            Address    Ordinal
CeCreateDatabaseEx       10003B5E   1
CeCreateDatabaseEx2      100058E2   2
CeFindFirstDatabaseEx    100044FE   3
CeFindNextDatabaseEx     100046CA   4
CeMountDBVol             10005C82   5
CeOidGetInfoEx           10003D72   6
CeOidGetInfoEx2          10003C57   7
CeOpenDatabaseEx         10003C1E   8
CeOpenDatabaseEx2        10004A41   9
CeReadRecordPropsEx      10007BDD   10
CeSeekDatabase           10003EAB   11
CeSeekDatabaseEx         10005103   12
CeUnmountDBVol           10005D60   13
CeWriteRecordProps       10007EE7   14
CloseDBFindHandle        100049A5   15
CloseDBHandle            10005025   16
NTCreateDatabaseEx       10003C05   17
NTReserveOID             10005E64   18
NTSetFlags               10005FA3   19
DllEntryPoint            10003EC6
When analyzing an unknown embedded system, one can
work ‘top down’ or ‘bottom up’. Working ‘top down’ means
trying to understand the way data is stored in the system from
coarse to fine and finally understand the way in which user
data is stored in raw format. Somewhere in between, one will
find a mechanism with which the system deals with deleting
data and freeing up memory space occupied by deleted data.
In this way one might find data that is still present in the
system, but no longer available through the API and for logical
acquisition. The Windows CE Object Store was examined with
this method in Eide et al. (2006).
When working ‘bottom up’, one tries to identify the smallest
entities containing the user data, and tries to carve and decode all
those entities. All data, both active and deleted, can be found
in this way, but as the mechanism to distinguish between active
and deleted data is (yet) unknown, the data is at first not classified
as active or deleted. By comparing the output of a logical acquisition,
yielding active data, with that of the physical acquisition, yielding
all data, one can determine the deleted
data as the difference between the two sets.
Reverse engineering showed that it is rather straightforward
to find the location of individual database records in
the cemail.vol file. First one needs to know that there are 9
data types in a CEDB database, see Table 3.
A property of CEDB databases is that each record in a data-
base can have its own set of fields. This requires that each
record stores a list of fields that are in the record. Another
property of CEDB databases is that the data block of the record
can be split into odd and even bytes. Each of these two separate
byte streams might be compressed. Often Unicode text in
a record appears as plain ASCII in a binary dump of cemail.vol.
This happens when the even stream is stored uncompressed
and the odd stream, which contains only zero values for Latin
characters, is compressed. The general structure of a CEDB
record is shown in Fig. 15.
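The effect of the odd/even split on Unicode text can be illustrated with a few lines of Python. The sketch only shows the splitting and re-interleaving of the two byte streams; the compression of the streams is not covered, and the function names are hypothetical.

```python
def split_streams(data):
    """Split a data block into its even-offset and odd-offset bytes."""
    return data[0::2], data[1::2]

def merge_streams(even, odd):
    """Interleave the even and odd byte streams back into one block."""
    if not 0 <= len(even) - len(odd) <= 1:
        raise ValueError("stream lengths do not match")
    merged = bytearray(len(even) + len(odd))
    merged[0::2] = even
    merged[1::2] = odd
    return bytes(merged)

text = "Hallo wereld"
block = text.encode("utf-16-le")     # Unicode text as stored in a record
even, odd = split_streams(block)
print(even)                          # the Latin characters, readable as ASCII
print(set(odd))                      # only zero bytes for Latin characters
assert merge_streams(even, odd).decode("utf-16-le") == text
```

For Latin text the even stream is the readable part, while the odd stream consists of zero bytes only. A stream of identical bytes compresses very well, which would fit the observation that it is usually the odd stream that is stored compressed.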
The (partial) record header always starts at a four byte
boundary. Next follows a list of field type indicators, each 4
bytes long (Table 4).
When analyzing the cemail.vol database volume file,
potential single records can be found using the following
pseudo code:
1. On a four byte boundary, search for a DWORD (footnote 7) with the
following structure: *, *, [0|1], [0x0B|0x41|0x05|0x40|0x02|0x03|0x1F|0x12|0x13]
2. Go back 8 bytes and read header structure
3. In 1, if byte three == 1, this is the last type indicator, go to 6
4. Next DWORD should have the following structure: *, *,
[0|1], [0x0B|0x41|0x05|0x40|0x02|0x03|0x1F|0x12|0x13]
5. In 4, if byte three == 0, this is not the last type indicator, go
to 4
6. Ready, potential (partial) record header found
Fig. 14 – Partial output of xpdumpcedb.exe, showing SMS Inbox content.
When a potential record header is found, checks can be
done to eliminate false positives. For instance, before and
after the property ID list, there are fields with data on
compression type and sizes (compressed and deflated) of the
record. With this data most false positives can be eliminated.
Also, it is very unlikely that decompression will yield sensible
data from the random data found after a false hit.
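The search steps can be sketched in Python as follows. This is an illustrative sketch and not the cedbexplorer.py script itself. It treats the four bytes of a DWORD in file order as byte one through four, performs no false positive checks, and reports every complete list of type indicators together with the offset 8 bytes before it. The function names are hypothetical.

```python
CEDB_TYPES = {0x0B, 0x41, 0x05, 0x40, 0x02, 0x03, 0x1F, 0x12, 0x13}

def is_type_indicator(data, pos):
    """True if the 4 bytes at pos look like: *, *, [0|1], [known type]."""
    return (pos + 4 <= len(data)
            and data[pos + 2] in (0, 1)
            and data[pos + 3] in CEDB_TYPES)

def find_record_headers(data):
    """Yield (header offset, field count) for each potential record header."""
    pos = 8                      # leave room for the 8 byte header (step 2)
    while pos + 4 <= len(data):
        end, complete = pos, False
        while is_type_indicator(data, end):
            end += 4
            if data[end - 2] == 1:       # byte three set: last indicator
                complete = True
                break
        if complete:
            yield pos - 8, (end - pos) // 4
            pos = end                    # continue after this indicator list
        else:
            pos += 4                     # stay on four byte boundaries


# Synthetic example: padding, an 8 byte header, three indicators, some data
sample = (b"\xff" * 16
          + bytes([0, 0, 0, 0, 0x7C, 0x41, 0x10, 0x00])
          + bytes([0x01, 0x00, 0x00, 0x1F])
          + bytes([0x02, 0x00, 0x00, 0x40])
          + bytes([0x03, 0x00, 0x01, 0x13])
          + b"\xaa" * 8)
print(list(find_record_headers(sample)))
```

Each hit is only a candidate; the size and compression fields around the indicator list still have to be checked before a hit is accepted as a record.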
A Python (footnote 8) script called cedbexplorer.py was written to find
cedb records as described above. Once the location of
a database record is known, the records that are
compressed need to be decompressed. To this end, ‘cedb400.dll’
was reverse engineered to find out how decompression in cedb
records works. The decompression is implemented in the
same Python script. The script also calculates the MD5 hash
value over each field in a record and over the record as a whole.
With this MD5 hash, the script can search the output of
xpdumpcedb.exe for the corresponding MD5. If it is found, the
record is identified and marked as ‘active’. If it is not found, the
record is assumed to be ‘deleted’. The script also produces a bookmark
file for Hex Workshop (footnote 9), so that the records found can be
checked manually in Hex Workshop.
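The MD5 based classification can be sketched as follows. How the field values are serialized before hashing is not described here, so the sketch simply assumes that every field is available as raw bytes and that both tools hash the fields in the same order; the function names and the sample records are hypothetical.

```python
import hashlib

def record_md5(fields):
    """MD5 over all field values of one record, in field order."""
    digest = hashlib.md5()
    for value in fields:
        digest.update(value)
    return digest.hexdigest()

def classify(carved_records, active_md5s):
    """Label a carved record 'active' if its MD5 occurs in the logical dump."""
    result = []
    for fields in carved_records:
        md5 = record_md5(fields)
        result.append((md5, "active" if md5 in active_md5s else "deleted"))
    return result


# Records read through the API (logical) and records carved from the volume
logical = [[b"Inbox", b"first message"]]
carved = logical + [[b"Inbox", b"message removed by the user"]]

active_md5s = {record_md5(fields) for fields in logical}
labels = [state for _, state in classify(carved, active_md5s)]
print(labels)
```

A record that is decoded incorrectly produces a different hash and would therefore be reported as deleted, so the 'deleted' label is best read as 'not matched to an active record'.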
7 4 bytes.
8 www.python.org.
9 www.hexworkshop.com.
A screenshot of a cemail.vol file opened with Hex Work-
shop and the corresponding bookmark file opened is shown in
Fig. 16. Below is an explanation by field:
1. The bytes 0xB380-0xB387 make up (a part of) the record
header.
a. The first 4 bytes are always 0x00.
b. The next 2 bytes are the external size in bytes after
decompression, here 0x17c. The two highest bits are
used as a flag, here 0x40, to indicate compression of the
record data.
c. The next 2 bytes are often the internal size of the record
body. In this field, there is again a flag in the two highest
bits.
2. The bytes 0xB388-0xB3BF list 14 DWORDs, those are the
type identifiers of the 14 fields in this record. With the
values in Table 3 the data types of each individual field can
be decoded. In the last type identifier, byte 3 is 0x01, indi-
cating end of list.
3. In 0xB3C0-0xB3C1 the highest bit is a flag: 0x80 means not
compressed, 0x00 means compressed. When uncom-
pressed, the rest indicates the length of the record in double
bytes.
4. When it is compressed, the next 6 bytes (0xB3C2-0xB3C7)
are the length of the even bytes stream: the first 3 bytes for the
uncompressed size, the next 3 bytes for the compressed size.
5. The next item is the compressed even byte stream. The text
is somewhat readable (in Dutch), but it also contains other
items, like time, that cannot be interpreted at all in this way,
because they are spread over the even and odd streams.
6. The next 6 bytes (0xB479-0xB47E) indicate the length of the
odd bytes stream. The first 3 bytes for uncompressed size,
the next 3 bytes contain the compressed size. As a check,
the uncompressed size for even and odd bytes should be
equal, and equal to length described in point 1.b multiplied
by two.
Table 3 – Data types in CEDB databases.
Name            Type                                Numerical ID
CEVT_BOOL       Boolean value                       0x0B
CEVT_CEBLOB     Binary object                       0x41
CEVT_R8         8-byte floating-point value         0x05
CEVT_FILETIME   Time and date data                  0x40
CEVT_I2         2-byte signed integer               0x02
CEVT_I4         4-byte signed integer               0x03
CEVT_LPWSTR     Long pointer to a Unicode string    0x1F
CEVT_UI2        2-byte unsigned integer             0x12
CEVT_UI4        4-byte unsigned integer             0x13
Table 4 – CEDB field type indicators.
Byte   Indicator       Description
1, 2   Data type       For instance ‘‘subject’’, ‘‘receive date’’, ‘‘message is opened’’.
3      Last field      Always ‘0’, except in the last field it is ‘1’
4      Variable type   See Table 3
In Fig. 17 the output of cedbexplorer.py is shown, decoding
this same record. Notice how the MD5 of this record matches
the MD5 of record 6 in Fig. 14. This means that record number
43 is actually message 6 in the SMS inbox. When this message
is deleted by the user, cedbexplorer.py will still find it in
the volume file, as long as it is not actively erased by the
messaging application.
(Fig. 15 diagram labels: record header (8 bytes); list of record ‘field type indicators’ (N * 4 bytes); record body, consisting of an even byte stream and an odd byte stream, each in a [compressed] data container.)
6.2. RAM
In 5.4.2.2 it was shown how to make a dump of the working
memory of a process. A first examination of such a memory
dump can be done with knowledge that can be found in the
source code that comes with Platform Builder. In heap.h (footnote 10),
constants and structures are defined that can help to dissect
a RAM copy of a process.
The first step is to find the start of the heap within the
memory copy. The heap starts at a 64 kB boundary and has
a marker ‘HeaP’. Starting at the ‘HeaP’ marker, a structure
occupying 0x30 bytes is the heap header. Then follows a so
called region. The regions contain the actual heap items. The
region header also contains a field that can point to the next
region. If there is no next region, this pointer is 0 (Fig. 18).
The first heap item starts at offset 0x58. Each heap item has
a header of 8 bytes: four bytes indicate the length of the item
(including the item header) and four bytes hold a pointer to the heap itself. A
positive length indicates that the item is in use, a negative
length indicates a free item.
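The heap layout described above can be walked with a short Python sketch. It is not the heapdigger.py script, and several details are assumptions: the marker is taken to be the first four bytes of the heap, the item header is read as a signed little-endian length followed by the heap pointer, only the first region is followed, and walking stops at the first length that is zero or does not fit. The function names are hypothetical.

```python
import struct

HEAP_MARKER = b"HeaP"
ALIGNMENT = 0x10000          # heaps start on a 64 kB boundary
FIRST_ITEM = 0x58            # offset of the first heap item
ITEM_HEADER = 8              # size of a heap item header

def find_heaps(dump):
    """Return the offsets of 64 kB boundaries that carry the heap marker."""
    return [offset for offset in range(0, len(dump), ALIGNMENT)
            if dump[offset:offset + 4] == HEAP_MARKER]

def walk_items(dump, heap_offset):
    """Yield (offset, size, in_use) for the items of the first region."""
    pos = heap_offset + FIRST_ITEM
    while pos + ITEM_HEADER <= len(dump):
        # Assumption: signed little-endian length first, heap pointer second
        length, _heap_pointer = struct.unpack_from("<iI", dump, pos)
        size = abs(length)
        if size < ITEM_HEADER or pos + size > len(dump):
            break                        # no sensible item, stop walking
        yield pos, size, length > 0      # positive: in use, negative: free
        pos += size


# Synthetic dump: one heap at 0x10000 with an in-use item and a free item
dump = bytearray(0x20000)
dump[0x10000:0x10004] = HEAP_MARKER
struct.pack_into("<iI", dump, 0x10058, 0x20, 0x10000)
struct.pack_into("<iI", dump, 0x10078, -0x118, 0x10000)

for heap in find_heaps(dump):
    for offset, size, in_use in walk_items(dump, heap):
        print(hex(offset), hex(size), "busy" if in_use else "free")
```

Free items are reported as well, since their raw data is exactly where released but not yet overwritten content may still be found.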
A Python script called heapdigger.py was written to dissect
the heap of a WCE process. This script searches for the heap
marker in an image and decodes the heap headers and subsequent
heap items. With this script a first step can be made to
analyze the way a program stores and leaves data on the
heap.
While simulating an SMS message being written but
cancelled before sending it, acquisitions of the process
memory of tmail.exe were made with pmemdump.exe. Fig. 19
shows an example of the output of the script. Heap item 823,
at address 0x1E0820 and a length of 0x118, is ‘free’ and avail-
able for reuse. In the raw data of this item one can read the
cancelled email in HTML format.
10 See shared source code that comes with Platform Builder. The file heap.h is in the directory WINCE500\PRIVATE\WINCEOS\COREOS\CORE\LMEM\.
7. Discussion
In section 2, NOR and NAND flash were identified as important
sources of user data. In modern WCE devices, NAND flash is
becoming more important than NOR flash. RAM can also
hold information that can be of forensic relevance. WCE
devices may also have special memory locations, for instance
in the processor itself. These locations might hold items like
unique numbers that can be used for encryption.
Section 3 identified software components that are either
relevant to storage of user data, or useful in making copies of
that user data. A TFAT file system hosted on flash memory can
contain user data like text documents, pictures, videos, but
also database files for messaging and PIM data. The heap,
located in RAM, can contain relevant information related to
processes running on the WCE device. Examples are naviga-
tion software having an NMEA reception buffer or email
clients maintaining a scratch pad.
In section 4 the order of acquisition of the components
holding potential evidence was explored. Depending on what
kind of evidence is looked for, the risks of different kinds of
acquisitions are compared against the likelihood of successful
recovery of the evidence. For example, if there is reason to
believe that the essential evidence will be deleted video, the
risk of damaging the physical memory chip during chip
extraction might be regarded as smaller than the risk of erasing
deleted data by switching the phone on to do a pseudo physical
acquisition with tools like pdocread or XACT. Likewise,
when it is suspected that evidence might be in the RAM that is
occupied by a navigation application which is still active, one
might decide to make a copy of RAM using JTAG or, if that is
not feasible, using pmemdump. Finding out how to apply
JTAG on a specific device is not only a very time
consuming task; applying JTAG also holds a risk of a system crash,
making a reset necessary. This will obviously put
deleted data in both RAM and flash memory at risk. On the other
hand, when using pmemdump the OS is still active, so there is
a risk that deleted data in flash and RAM will be erased beyond
recovery.
Fig. 15 – Structure of a CEDB record body.
Fig. 16 – cemail.vol opened with HexWorkshop and corresponding bookmark file.
Section 5 described methods for physical and pseudo
physical acquisition. For two physical acquisition methods
a case example was given; physical memory chip removal and
JTAG have been shown to be feasible methods to make
a physical acquisition of a WCE device. Furthermore the use of
several RAPI tools is described; it was shown how to deter-
mine the list of running processes on a WCE device. With this
list, the name of a running process could be determined and
used to make a dump of the RAM occupied by this process.
Next it was shown how to make a pseudo physical acquisition
of a flash based file system in a WCE device.
When using the RAPI tools, or any other tool that is doing
pseudo physical acquisition on a WCE device, one has to take
into account that these tools make use of dedicated software
that has to be loaded on the running machine. Running this
software on the device will at least overwrite unused RAM that
might hold evidence. If the software is transferred to the WCE
device, it has to be stored on external media like an SD card
first, before it is loaded into RAM. If the software is loaded onto
the internal file system, it might cause expired flash blocks to
be erased and reused, thus erasing potential evidence.
In section 6 tools were presented to analyze acquired data.
An algorithm was presented to reconstruct a TFAT file system
from a physical acquisition. This reconstruction yields a
TFAT image file containing flash pages belonging to the latest
version of the file system. This image can be further investigated
with COTS file system analysis tools like EnCase or FTK.
It also produces a file containing flash pages that no longer
belong to the latest version of the file system. This file can
also be loaded into analysis tools as unallocated clusters.
A tool called xpdumpcedb was presented. This tool runs
under Windows XP. With this tool, a cedb database volume file
(for instance exported from EnCase) can be read completely.
All fields in all records of all databases in the volume are
outputted in xml format. The meaning of fields within data-
base records can sometimes be found in header files in the
Platform Builder source code, sometimes though, the meaning
has to be established by reverse engineering. Furthermore,
a tool called wmdumpedb was presented. This tool runs under
WCE, for instance on a WCE Emulator on an XP machine. This
tool reads an EDB formatted database volume file. It writes all
fields in all records of all databases in the volume to a file in
XML format. This tool has the disadvantage that it can only
run on a WCE device or a WCE device emulator. No possibility
has yet been found to make a similar tool that runs natively on a
desktop OS.
Fig. 17 – Output of cedbexplorer.py.
Fig. 18 – Structure of a WCE heap. (Diagram labels: heap header at 0x00-0x2F; region header with pointer to the next region at 0x30-0x57; followed by heap item headers and heap items.)
Furthermore, some aspects of low level structures of
a WCE cedb database was explained. With knowledge of this
structures, a python script called cedbexplorer.py was
developed. It was presented here and it was shown that the
script can find and decompress (if necessary) individual cedb
records. Because the tool does not look at the status of the
individual records, it also finds deleted records that are still
present in a cedb volume file. On the bases of MD5 hash
values calculated over the record fields, the tool can deter-
mine whether a record is active (already found by xpdump-
cedb) or deleted (not found by xpcedbdump). Because the
decompression algorithm is not yet fully understood,
Fig. 19 – Part of the output of heap analysis tool.
d i g i t a l i n v e s t i g a t i o n 6 ( 2 0 1 0 ) 1 4 7 – 1 6 7 163
sometimes active records are not decompressed correctly,
so they are not found in the xpdumpcedb results and
are subsequently falsely marked as deleted. This effect is noticed
mainly in big records. The script carves the records out of the
database file and uses an indirect way to distinguish between
active and deleted records. This indirect method is not the
most efficient, but its advantage is that it can be used
independently of the higher level structures of cemail.vol, so
that, for instance, expired flash memory blocks and
unallocated clusters can be carved as well. In these locations
both active and deleted database records can be found
(Stahlberg).
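The indirect active/deleted distinction described above can be sketched as follows. This is a minimal illustration and not the code of cedbexplorer.py: the way field values are serialized before hashing, the function names and the sample records are all assumptions made for this example.

```python
import hashlib


def record_digest(fields):
    """Return an MD5 hex digest over a record's field values, in order.

    The serialization (UTF-8 text, 4-byte length prefix) is an assumption;
    the paper does not specify how the fields are fed to MD5.
    """
    md5 = hashlib.md5()
    for value in fields:
        data = value if isinstance(value, bytes) else str(value).encode("utf-8")
        # The length prefix keeps ("ab", "c") distinct from ("a", "bc").
        md5.update(len(data).to_bytes(4, "little"))
        md5.update(data)
    return md5.hexdigest()


def classify(carved_records, active_records):
    """Label a carved record 'active' if its digest also occurs in the
    logical dump (active_records), otherwise 'deleted'."""
    active_digests = {record_digest(r) for r in active_records}
    return [
        ("active" if record_digest(r) in active_digests else "deleted", r)
        for r in carved_records
    ]


# Hypothetical sample data: two records from the logical dump,
# three records carved from the volume file.
active = [["Inbox", 0x31000026], ["Outbox", 0x31000027]]
carved = [["Inbox", 0x31000026], ["Old draft", 0x31000030], ["Outbox", 0x31000027]]
labels = [status for status, _ in classify(carved, active)]
```

The sketch also makes the stated weakness visible: a carved record whose fields were decompressed incorrectly yields a different digest and is therefore labelled deleted even when it is active.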
Finally, a script called heapdigger.py was presented. With
this script a dump of the RAM occupied by a process can be
searched for the presence of a heap. If a heap is found, it is
analyzed and the sections in the heap are written to an XML file. In
this way busy and free heap items can be studied. These items
can contain relevant evidence, as processes can use the heap
to allocate buffers. Examples are a receive buffer for GPS
NMEA data, or a buffer to hold an email while it is written.
Such items can still be found on the heap even after they have been
released for reuse. This tool is a proof of concept. Experiments
are needed to find out what information is left on the heap by
popular WCE applications; only then can its actual value in a forensic
investigation be established.
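As an illustration of the kind of evidence mentioned above, the following sketch searches a raw memory dump for GPS NMEA sentences and keeps only those whose checksum is valid. It is not part of heapdigger.py and does not parse the heap structure; it treats the dump as an opaque byte string. The sample payload and the surrounding junk bytes are hypothetical. The checksum rule used (XOR of all characters between '$' and '*') is the standard NMEA 0183 rule.

```python
import re

# '$', 1..79 printable characters (no '$' or '*'), '*', two hex checksum digits.
NMEA_PATTERN = re.compile(rb"\$([^*$\x00-\x1f\x7f-\xff]{1,79})\*([0-9A-Fa-f]{2})")


def nmea_checksum(payload: bytes) -> int:
    """XOR of all bytes between '$' and '*'."""
    checksum = 0
    for byte in payload:
        checksum ^= byte
    return checksum


def carve_nmea(dump: bytes):
    """Return (offset, sentence) pairs for checksum-valid NMEA sentences."""
    hits = []
    for match in NMEA_PATTERN.finditer(dump):
        payload, stated = match.group(1), int(match.group(2), 16)
        if nmea_checksum(payload) == stated:
            hits.append((match.start(), match.group(0).decode("ascii")))
    return hits


# Hypothetical dump: junk, one valid sentence, one with a wrong checksum.
payload = b"GPGLL,5204.00,N,00418.00,E"
sentence = b"$" + payload + b"*" + format(nmea_checksum(payload), "02X").encode("ascii")
dump = b"\x00\x10junk" + sentence + b"\r\n\xff\xfe" + b"$GPGLL,bad*00" + b"\x00"
hits = carve_nmea(dump)
```

Validating the checksum reduces false positives when carving from released heap items, where partially overwritten sentences are to be expected.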
8. Future work
This paper is mainly a starting point for further investigation
of Windows CE based devices. Many questions came up
during case-driven research of WCE devices, but often the
research was stopped because the required data had been recovered
or the case was closed. As a result, many
interesting issues remain to be studied.
Besides the heap, the stack is a place to look for data. Functions
store local variables on the stack. Data stored by functions
with a relatively long lifetime might stay on the stack for a relatively
long time. This has not been explored from a forensic point
of view. Memory mapped files might also be interesting in a forensic
context. First, ways of recovering those files have to be established.
Then their forensic relevance can be determined.
Important aspects of the format of cedb database records are
clear, but the decompression of these records presented in
this paper is not yet perfect and has to be improved. The
format of the cedb successor, edb, however, is not yet known. It
is likely that edb databases will also contain deleted records
until the moment of a database clean-up.
Deleted data is found in cedb databases. It would be
interesting to establish to what extent deleted data will
remain in the volume files and compare cedb databases to the
databases investigated by Stahlberg.
Finally, knowledge of forensically interesting artifacts
of popular software on desktop Windows versions is not directly
transferable to WCE platforms. Research should be
conducted to find out about similarities and differences in the
evidence left by applications on both platforms.
Acknowledgement
The author would like to thank colleagues at the Netherlands
Forensic Institute for support in investigating the Windows CE
platform and writing this paper. Furthermore the author
would like to thank the reviewers for very useful suggestions.
Appendix.
Table 5. Databases and field types found in cemail.vol.
Database name: pmailFolders
This database holds the folder structure of the messaging system. Generally there are several root folders for the various messaging methods: SMS, ActiveSync, Hotmail, and some POP/SMTP email accounts. Each of these root folders has subfolders: Inbox, Outbox, Sent items, Drafts and Deleted items.
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 3001001F | String | PR_DISPLAY_NAME, name of folder. When c2 and c3 are equal, this is a ‘messaging method’ (SMS, ActiveSync, Hotmail or some SMTP/POP email account). When c2 and c3 are not equal, this is the ‘message box type’ (inbox, outbox, drafts, sent items or deleted) |
| 2 | 80010013 | Uint32 | Database for this folder. The name of the database is ‘fldrX’, where X is the hexadecimal value of this field. If for example this field has the value 31000026, messages in this folder can be found in the database with the name fldr31000026 |
| 3 | 80050013 | Uint32 | ‘Messaging method’ link. The ‘messaging methods’ and the ‘message box types’ are grouped together by having this number equal |
|   | 8119001F | String | Signature used when composing a message in this channel. Empty when c2 ≠ c3 |
| 4 | 8117001F | String | Protocol. ‘SMS’ and ‘SMTP’ are values found |
|   | 820F0040 | DateTime | Unknown |
| 5 | 82160040 | DateTime | Unknown. Maybe last time sync’d |
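The relation between c2 and c3 described in the pmailFolders table can be used to rebuild the folder tree. The following is a minimal sketch under stated assumptions: the records are assumed to be already extracted into dictionaries with the hypothetical keys 'name' (c1), 'db' (c2) and 'link' (c3), the sample values are invented, and the letter case used for the hexadecimal part of ‘fldrX’ is a guess.

```python
from collections import defaultdict


def folder_database_name(c2: int) -> str:
    """'fldrX', with X the hexadecimal value of c2 (letter case is an assumption)."""
    return "fldr" + format(c2, "x")


def group_folders(records):
    """Group pmailFolders records by c3; the record with c2 == c3 is the
    messaging method, the others are its message boxes."""
    groups = defaultdict(lambda: {"method": None, "boxes": []})
    for rec in records:
        group = groups[rec["link"]]
        if rec["db"] == rec["link"]:
            group["method"] = rec["name"]
        else:
            group["boxes"].append((rec["name"], folder_database_name(rec["db"])))
    return dict(groups)


# Hypothetical sample records.
sample = [
    {"name": "SMS", "db": 0x31000020, "link": 0x31000020},
    {"name": "Inbox", "db": 0x31000026, "link": 0x31000020},
    {"name": "Outbox", "db": 0x31000027, "link": 0x31000020},
]
tree = group_folders(sample)
```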
Database name: fldr31000026
This is an example of a message folder database, for instance an Inbox. For each messaging method there are at least five subfolders. The user can also create extra folders.
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 0E060040 | DateTime | Receive date and time |
| 3 | 0C1A001F | String | ‘File as’ name |
| 4 | 0C1F001F | String | ‘From’ name |
| 5 | 003D001F | String | Subject prefix, like ‘Re:’ or ‘Fwd:’ |
| 7 | 0E1B0013 | Uint32 | If > 0 then there is an attachment to this message |
| 8 | 80050013 | Uint32 | Attachment id. In the attachment database, there is a field ‘81000013’. The attachment to this message can be found in the record where the value in ‘81000013’ is equal to the value in ‘80050013’ in this database |
Database name: pmailAttach
This database is used to link the file holding the attachment to the message the attachment is attached to.
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 81000013 | Uint32 | Attachment ID. Links to field 80050013 in message folders named ‘fldrX’ |
| 2 | 370E001F | String | Mime type of attachment |
| 3 | 3704001F | String | Original name of file attached |
| 4 | 81000013 | Uint32 | First set of 8 digits of the name of the file in which the attachment is actually stored on the WCE device |
| 5 | 80010013 | Uint32 | Second set of 8 digits of the name of the file in which the attachment is stored, joined with c4 by a ‘-’. If c4 contains 13002345 and c5 contains 23450012, then the attachment is stored in: \Windows\Messaging\Attachments\13002345-23450012.att |
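The file name rule in c4 and c5 can be written down as a small helper. This is a sketch only: the function name is hypothetical, and it assumes that both fields are printed as eight hexadecimal digits (the example in the table contains only decimal digits, so zero padding and letter case are assumptions).

```python
def attachment_path(c4: int, c5: int) -> str:
    """Build the attachment file path from the two Uint32 fields c4 and c5."""
    return "\\Windows\\Messaging\\Attachments\\{:08x}-{:08x}.att".format(c4, c5)


# Values taken from the example in the table.
example = attachment_path(0x13002345, 0x23450012)
```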
Database name: pmailMsgs
This database holds additional data on sent messages. Depending on the messaging method and the type of message (sent/received/draft or deleted), different data is stored.
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 81000013 | Uint32 | Attachment ID. Links to the value in field 80050013 in message folders named ‘fldrX’ |
| 2 | 0E090013 | Uint32 | Originating folder. If this contains 31000026, the message is in ‘fldr31000026’ |
| 3 | 851F0040 | DateTime | Related to the message: received date/time, stored date/time, deleted date/time |
| 4 | 800F001F | String | Dependent on message type. Can be ‘File as’ name |
| 5 | 800C001F | String | Dependent on message type. Can be ‘From’ name |
| 6 | 800E0041 | Blob | Contains data on the message, often data already found in other fields. Seen to contain Protocol type, ‘File as’ name, From, Number, email address |
| 7 | 80010013 | Uint32 | Message number. For email messages: points to the file holding the message body, including email header. Example: if this field holds 34000135, the email message is stored in a file \Windows\Messaging\35340001[postfix].mpb (the value is rotated right one byte). The [postfix] is seen to have one of the values 8242001e, 8241001f, 1013001e, 1000001f, 1000001e and 81030102, but this list is probably incomplete. These values seem to indicate the format of the email body: html without header, smtp header, empty. An email message does not necessarily have only one email body file; it can have more, for different storage format types. |
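The byte rotation described for c7 can be checked with a few lines of code. The sketch below is an illustration, not the author's tooling: the function names are hypothetical, the postfix list is the (probably incomplete) list from the table, and lower-case hexadecimal formatting of the file name stem is an assumption.

```python
# Postfix values observed in the table above; the list is probably incomplete.
KNOWN_POSTFIXES = ("8242001e", "8241001f", "1013001e",
                   "1000001f", "1000001e", "81030102")


def rotate_right_one_byte(value: int) -> int:
    """Rotate a 32-bit value right by 8 bits."""
    value &= 0xFFFFFFFF
    return (value >> 8) | ((value & 0xFF) << 24)


def candidate_body_files(message_number: int, postfixes=KNOWN_POSTFIXES):
    """List the .mpb file names that may hold the body of this message."""
    stem = format(rotate_right_one_byte(message_number), "08x")
    return ["\\Windows\\Messaging\\" + stem + postfix + ".mpb"
            for postfix in postfixes]


# Example from the table: 34000135 becomes 35340001.
candidates = candidate_body_files(0x34000135)
```

Because a message can have more than one body file, an examiner would check which of the candidate names actually exist in the file system image.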
Database name: MessageThreadsDB
This database holds messages. Why messages are stored separately in this database has not yet been investigated.
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 00010040 | DateTime | Date/time (exact meaning not looked at yet) |
| 2 | 0002001F | String | Email subject or SMS body text |
| 3 | 0004001F | String | ‘From’ name |
| 4 | 0005001F | String | Originating from (phone number or email address) |
Table 6. Databases and field types found in pim.vol.
Database name: Appointments Database
Holds appointment items
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 10420040 | DateTime | Due date/time |
| 2 | 00520040 | DateTime | Some other date/time (exact meaning not looked at yet) |
| 3 | 0020001F | String | Subject |
| 4 | 0041001F | String | Location |
| 5 | 0051001F | String | Organizer |
| 6 | 0029001F | String | Type of appointment (exact meaning not looked at yet) |
Database name: Contacts database
Holds contact items
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 0080001F | String | Name 1 |
| 2 | 0082001F | String | Name 2 |
| 3 | 0096001F | String | Number |
Database name: Tasks database
Holds tasks items
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 0020001F | String | Subject |
| 2 | 0029001F | String | Type of task (exact meaning not looked at yet) |
| 3 | 00620040 | DateTime | Date/time 1 (exact meaning not looked at yet) |
| 4 | 00630040 | DateTime | Date/time 2 (exact meaning not looked at yet) |
| 5 | 00640040 | DateTime | Date/time 3 (exact meaning not looked at yet) |
| 6 | 00660040 | DateTime | Date/time 4 (exact meaning not looked at yet) |
Database name: Clog
Holds the call log of this phone
| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 00020040 | DateTime | Date/time 1 (exact meaning not looked at yet) |
| 2 | 00030040 | DateTime | Date/time 2 (exact meaning not looked at yet) |
| 3 | 0006001F | String | ‘File as’ name |
| 4 | 0007001F | String | Tbd |
| 5 | 000A001F | String | Tbd |
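Across Tables 5 and 6, the last four hexadecimal digits of every type identifier agree with the listed data type: 001F with String, 0013 with Uint32, 0040 with DateTime and 0041 with Blob. The sketch below splits an identifier on that observation. Reading the upper 16 bits as a property number is an interpretation of the tables and is labelled as an assumption; the function name and the mapping dictionary are introduced here for illustration only.

```python
# Mapping derived from the rows of Tables 5 and 6 (not from a format specification).
TYPE_NAMES = {0x001F: "String", 0x0013: "Uint32", 0x0040: "DateTime", 0x0041: "Blob"}


def split_type_identifier(identifier: str):
    """Split an 8-digit hex type identifier into (upper 16 bits, data type name).

    Treating the upper 16 bits as a property number is an assumption.
    """
    value = int(identifier, 16)
    prop_id, type_code = value >> 16, value & 0xFFFF
    return prop_id, TYPE_NAMES.get(type_code, "unknown (0x%04X)" % type_code)


examples = [split_type_identifier(i) for i in ("3001001F", "80010013", "0E060040", "800E0041")]
```

Such a helper can be used to sanity-check carved records: a field whose value does not fit the data type implied by its identifier may indicate a decompression error.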
References
Association of Chief Police Officers (ACPO). Good practice guide for computer-based electronic evidence. Online, cryptome.org/acpo-guide.htm.
Ayers R, Jansen W, Cilleros N, Daniellou R. Cell Phone Forensic Tools: An Overview and Analysis. Online. National Institute of Standards and Technology, <csrc.nist.gov/publications/nistir/nistir-7250.pdf>; October 2005.
Boling D. Windows CE.NET advanced memory management. Online, msdn.microsoft.com/en-us/library/ms836325.aspx; August 2002.
Breeuwsma M, Jongh M de, Klaver C, Knijff R van der, Roeloffs M. Forensic data recovery from flash memory. Small Scale Digital Forensics Journal June 2007;1(1).
Breeuwsma M. Forensic imaging of embedded systems using JTAG (boundary-scan). Digital Investigation March 2006;3:32–42.
BusinessWeek, Fiat and Microsoft Launch Blue&Me. Online, www.businessweek.com/autos/content/feb2006/bw20060202_986426.htm; February 2006.
Canalys. Smart phone market shows modest growth in Q3. Online, www.canalys.com/pr/2009/r2009112.html; November 2009.
Cellebrite, UFED Physical Pro. Online, www.cellebrite.com/UFED-Physical-Pro.html.
Eide J, Skogheim Olsen JO. Forensic analysis of an unknown embedded device. Online, ntnu.diva-portal.org/smash/get/diva2:121991/FULLTEXT01; June 2006.
Hengeveld W. Smartphone-policies. Online, www.xs4all.nl/~itsme/projects/xda/smartphone-policies.html.
Hengeveld W. xda tools. Online, www.xs4all.nl/~itsme/projects/xda/tools.html; November 2009.
Herrera C de. Windows CE/Windows Mobile Versions. Online, www.pocketpcfaq.com/wce/versions.htm; October 2009.
Intel® persistent storage manager user’s guide. Online, www.developers.net/filestore2/download/2613; September 2005.
Intel. Marvell to purchase Intel’s communications and application processor business for $600 Million. Online, www.intel.com/pressroom/archive/releases/2006/20060627corp.htm; June 2006.
Jansen W, Ayers R. Guidelines on cell phone forensics. Online. Recommendations of the National Institute of Standards and Technology, <csrc.nist.gov/publications/nistpubs/800-101/SP800-101.pdf>; May 2007.
JTAG, In-system flash programming support. Online, www.jtag.com/en/Support/Device_support/Flash.
Knijff R van der. Embedded systems analysis. In: Casey E, editor. Handbook of digital forensics and investigation; 2010.
Marvell, communications processors. Online, www.marvell.com/products/processors/communications/pxa_90/.
Microsoft 1, supported processors. Online, msdn.microsoft.com/en-us/windowsembedded/ce/aa714536.aspx#ARM.
Microsoft 2, virtual memory layout: Windows CE 5.0 vs. Windows Embedded CE 6.0. Online, msdn.microsoft.com/en-us/library/aa914933.aspx.
Microsoft 3, heaps. Online, msdn.microsoft.com/en-us/library/aa450550.aspx.
Microsoft 4, TFAT overview. Online, msdn.microsoft.com/en-us/library/aa915463.aspx.
Microsoft 5, TFAT File naming limitations. Online, msdn.microsoft.com/en-us/library/ms892402.aspx.
Microsoft 6, driving connectivity. Online, download.microsoft.com/download/6/5/0/6505FA0E-1F39-4A34-BDC9-A655A5D3D2DB/MicrosoftAutoOverview.pdf.
Microsoft 7, databases. Online, msdn.microsoft.com/en-us/library/ms885343.aspx.
Microsoft 8, Windows embedded CE 6.0 evaluation edition. Online, www.microsoft.com/downloads/details.aspx?familyid=7E286847-6E06-4A0C-8CAC-CA7D4C09CB56&displaylang=en.
MSAB, XACT datasheet. Online, www.msab.com/fileadmin/user_upload/media/Documents/Product_Sgeets/XACT.pdf.
Rogers A, Glaum J, Tonkelowitz M. Creating file systems within an image file in a storage technology-abstracted manner, www.freepatentsonline.com/EP1544732.pdf; June 2005.
Samsung, application processor. Online, www.samsung.com/global/business/semiconductor/products/mobilesoc/Products_ApplicationProcessor.html.
Stahlberg P, Miklau G, Levine B. Threats to privacy in the forensic analysis of database systems. In: Proc. ACM SIGMOD/PODS; June 2007.
Texas Instruments, wireless handset solutions: overview. Online, focus.ti.com/general/docs/wtbu/wtbugencontent.tsp?templateId=6123&navigationId=11988&contentId=4638.
Texas Instruments. OMAP35x applications processor technical reference manual. Online, focus.ti.com/lit/ug/spruf98d/spruf98d.pdf; October 2009.
Xda-developers, wings SSPL and HardSPL. Online, forum.xda-developers.com/showthread.php?t=356295; May 2007.
Xda-developers, hermes bootloader. Online, wiki.xda-developers.com/index.php?pagename=Hermes_BootLoader; November 2008.