Smartphone storage virtualization: Fast, secure & reliable

Harvey Tuch, Cyprien Laplace, Ken Barr – VMware Horizon Mobile
Bi Wu – Duke University

© 2010 VMware Inc. All rights reserved

Context: Mobile Virtualization

§  Enterprise Bring Your Own Device (BYOD)
•  Employee-owned devices on corporate network
•  Single device for personal and work use
•  Secure corporate assets from adversarial environment (e.g. Android market apps)
•  Write once, run anywhere

§  Where does the VM live?
•  Type-2 host hypervisor
•  Personal host, work guest

§  Talk focus
•  Providing performant, secure, robust virtual storage to guest

[Image: One Device, Two Phones (CORPORATE and PERSONAL)]

Contributions

§  Identifies storage challenges on mobile platforms
•  Characterization of performance of mobile storage media
•  Characterization of performance requirements of mobile workloads
•  Security and reliability challenges

§  Storage virtualization for mobile media
•  High performance log-structured VM image format for SD cards
•  5x-17x speedup over linear VM image formats
•  Resistant to battery failure, phone drops, host crashes
•  Secure against malicious host applications

§  Improvements to mobile storage stack
•  How can OEMs, silicon vendors, SD card manufacturers, OS and virtualization vendors + researchers improve both native and virtualized performance/security/reliability?

Mobile storage challenges

Smartphone storage devices

§  Internal storage
•  NAND flash devices
•  Software Flash Translation Layer (FTL)
•  Limited size (256MB – several GB)
•  Kernel, application code, libraries, middleware

§  External storage
•  microSD cards
•  Hardware FTL
•  Up to 32GB today (2TB future)
•  Economics of semiconductor scaling
•  Optimized for cost, media workloads
•  FAT filesystem for compatibility
•  Application data (and some code)

MVP Storage Architecture Layers

§  Software
•  Guest, user: Application, libc
•  Guest, kernel: VFS, ext3, Block layer, Paravirtualized block driver
•  Host, user: LBS storage threads, libc
•  Host, kernel: VFS, VFAT, Block layer, SD card layer, SD card driver

§  Hardware
•  Physical SD card: Flash translation layer (FTL), NAND flash memory

[Diagram legend: "Our additions" marks the components added by this work]

VM image storage on SD cards

§  Size matters
•  Storage footprint of guest may be several GB (including checkpoint images)

§  Goals
•  Hardware diversity – must support lowest common denominator
•  Performance – guest user experience should match host
•  Non-perturbation – existing data shouldn’t be lost, no reformatting
•  Reliability – survive power loss, dropped phone
•  Security – hostile host environment (Angry Birds)

§  Challenges
•  SD card Flash Translation Layer performance
•  FAT filesystem limitations

SD card Flash Translation Layer performance

§  FTL optimized for cost, media workloads
•  Raw NAND sequential/random I/O gap is narrow
•  Minimize hardware FTL cost (SRAM page mapping tables)
•  High latency bus transactions
•  Optimized for sequential media workloads (MP3, videos, photos), large sequential transfers (SD card speed class)
•  VM workloads exhibit many small, non-sequential I/O, barrier operations

[Diagram: microSD card containing the FTL and NAND]

VM image storage on SD cards - performance

SD cards are optimized for cost, compatibility and the I/O mixture expected from the transfer of large sequential files such as MP3s, photos and videos. As a result, they are formatted with the FAT filesystem and have simple flash translation layer (FTL) controllers (utilizing a minimum of costly SRAM) that perform extremely poorly with small random writes [1, 6]. The I/O mixture from the guest is far less sequential than that of the media workloads that SD cards are intended for. In addition, FAT does not support Unix permissions and does not provide robustness guarantees in the event of a host upset. Users cannot be expected to reformat the SD card due to the non-perturbation requirement. These challenges motivate a new VM backing store and checkpoint storage system capable of meeting the constraints outlined above while performing the bulk of storage on FAT-formatted microSD cards.

Our contributions in this paper are as follows:

• An empirical characterization of the unique storage characteristics of SD cards and Android VM workloads.

• A storage architecture and block storage format, which we refer to as the logging block store (LBS), capable of providing the desired impedance matching between our enterprise VMs and low cost consumer-grade SD card storage.

• Experimental evaluation of LBS and a performance characterization.

• Potential optimizations at other levels of the I/O stack capable of improving VM performance if adopted in mobile platforms.

While the individual techniques we employ in LBS are familiar, to the best of our knowledge this is the first system to combine them to bridge the gap between the high performance/reliability/security requirements of a VM and the characteristics of the low-cost solid state storage on mobile devices.
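The general idea behind a log-structured block store can be illustrated with a toy example. The sketch below is not the LBS format, whose details are not given in this excerpt; `TinyLogStore`, `BLOCK_SIZE` and `demo` are hypothetical names. It only shows how appending every guest write to the tail of one file turns scattered logical block writes into sequential host writes, with an in-memory map recording where the latest copy of each block lives. Persistence of the map, crash recovery and space reclamation are deliberately left out.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed logical block size, for illustration only


class TinyLogStore:
    """Toy log-structured block store: every write is appended to one file."""

    def __init__(self, path):
        self.log = open(path, "w+b")
        self.block_map = {}  # logical block number -> byte offset in the log
        self.tail = 0        # next append position

    def write_block(self, lbn, data):
        assert len(data) == BLOCK_SIZE
        self.log.seek(self.tail)
        self.log.write(data)             # always lands at the current tail
        self.block_map[lbn] = self.tail  # newest copy supersedes older ones
        self.tail += BLOCK_SIZE

    def read_block(self, lbn):
        offset = self.block_map.get(lbn)
        if offset is None:
            return bytes(BLOCK_SIZE)     # never-written blocks read as zeros
        self.log.seek(offset)
        return self.log.read(BLOCK_SIZE)

    def close(self):
        self.log.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "image.log")
    store = TinyLogStore(path)
    # Scattered logical blocks, including a rewrite of block 3.
    for lbn in (907, 3, 512, 3):
        store.write_block(lbn, bytes([lbn % 256]) * BLOCK_SIZE)
    live_offsets = sorted(store.block_map.values())
    latest = store.read_block(3)[0]
    size = store.tail
    store.close()
    return live_offsets, latest, size
```

In this toy the four writes occupy four consecutive 4 KB slots of the log regardless of their logical block numbers, and the stale first copy of block 3 stays in the file as garbage that a real design would eventually have to reclaim.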

In the rest of the paper, we first show how device performance characteristics (Section 2) and virtual machine workload characteristics (Section 3) motivate the design of LBS. This is followed by the design and implementation details for LBS in Section 4 and evaluation in Section 5. The paper concludes with suggested optimizations in Section 6, related work in Section 7 and future directions in Section 8.

2. SD card performance characteristics

An SD card is composed of NAND devices, providing the raw storage media, organized by an FTL into a logical block structure that is exported across an SD card bus connector. The FTL performs wear leveling, error detection and the remapping of bad blocks. The limiting storage performance characteristics are hence dictated by the FTL, NAND read/write/erase times and page/erase block organization. For cost reasons, the FTLs are optimized for simplicity of implementation and minimization of SRAM, distinguishing SD cards from their richer cousins, solid state disks, which have more significant resources available for the FTL. The random access property of NAND is as a result constrained by the FTL, with the internal data structures utilized by simple FTLs being optimized for sequential write patterns and coarse block operations [6].

SD cards are rated by speed classes, e.g. Class 2, Class 10, indicating the expected minimum sequential I/O bandwidth (MB/s) in the presence of zero fragmentation [26]. Unfortunately, this rating provides no guarantee of random or fragmented I/O performance. We present some illustrative examples of these characteristics below, gathered on an HTC Nexus One smartphone by a synthetic tool, sdperf, designed to characterize SD cards. sdperf opens a file or raw block device and performs read or write I/O of specific sizes to the target file. Sequential, strided, partitioned and random patterns are supported.

Manufacturer     Capacity   Class   Alloc. unit   FAT cluster
SanDisk™         4 GB       4       4 MB          32 KB
SanDisk™ (WP7)   8 GB       4       4 MB          32 KB
Kingston™        4 GB       4       4 MB          32 KB
ADATA™           8 GB       6       4 MB          32 KB
PNY™             16 GB      10      4 MB          32 KB

Table 1. SD card details.

[Plot: bandwidth (KB/s) versus block size (1 KB to 8 MB); series: Seq Read, Seq Write, Rand Read, Rand Write]

Figure 1. 8 GB ADATA Class 6 SD card I/O bandwidth as a function of block size and I/O ordering.

[Plot: sequential:random write bandwidth ratio (1 to 512) versus block size (4 KB to 8 MB); series: adata-8gb, sandisk-4gb, kingston-4gb, pny-16gb, sandisk-8gb]

Figure 2. Sequential:random write bandwidth ratio as a function of block size.

Below we describe the results of various I/O read/write patterns within a preallocated 128 MB file, intended to be representative of a VM disk image file. The page cache layer in the Linux kernel was bypassed with O_DIRECT to avoid interference. Five SD cards from different manufacturers and with different speed class ratings were analyzed; card specific details are provided in Table 1. The 8 GB SanDisk card packaging was labeled as being Windows Phone 7 compliant, indicating potential improved support for random read/write operations [29].
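To make the access patterns concrete, here is a minimal sketch of how a tool in the spirit of sdperf might generate file offsets for a run. It is not the sdperf source; `pattern_offsets` and its parameters are hypothetical. Only the sequential, strided and random patterns are covered, since the partitioned pattern is not defined in this excerpt, and the actual I/O (including the aligned buffers that O_DIRECT requires) is omitted.

```python
import random

MB = 1024 * 1024


def pattern_offsets(pattern, file_size, block_size, count,
                    stride_blocks=1, seed=0):
    """Return `count` block-aligned byte offsets inside a file of `file_size`."""
    blocks = file_size // block_size
    if pattern == "sequential":
        idx = [i % blocks for i in range(count)]
    elif pattern == "strided":
        # Successive accesses are stride_blocks apart and wrap at the file size.
        idx = [(i * stride_blocks) % blocks for i in range(count)]
    elif pattern == "random":
        rng = random.Random(seed)  # seeded so that runs are repeatable
        idx = [rng.randrange(blocks) for _ in range(count)]
    else:
        raise ValueError("unknown pattern: " + pattern)
    return [i * block_size for i in idx]


# 4 KB accesses into a 128 MB file, 1024 blocks (4 MB) apart.
print(pattern_offsets("strided", 128 * MB, 4096, 3, stride_blocks=1024))
```

A measurement loop would then issue one read or write of `block_size` bytes at each offset and divide the bytes transferred by the elapsed time.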

[Plot: bandwidth (KB/s) versus stride blocks (1 to 32768); series: 4KB Write, 256KB Write]

Figure 3. 8 GB ADATA Class 6 SD card write bandwidth as a function of inter-write stride distance and block size.

[Plot: bandwidth (KB/s) versus number of interleaved sequential workloads (1 to 16); series: adata-8gb, sandisk-4gb, kingston-4gb, pny-16gb, sandisk-8gb]

Figure 4. Write bandwidth as a function of the number of interleaved sequential workloads, separated by 2 AU, at 256 KB block size.

[Plot: bandwidth (KB/s) versus write percentage (0 to 100); series: Seq 4KB, Seq 256KB, Rand 4KB, Rand 256KB]

Figure 5. 8 GB ADATA Class 6 SD card I/O bandwidth as a function of write percentage in I/O mixture and I/O ordering.

[Plot: bandwidth (KB/s) versus block size (1 KB to 8 MB); series: Frag Read, Non-frag Read, Frag Write, Non-frag Write]

Figure 6. 8 GB ADATA Class 6 SD card sequential I/O bandwidth as a function of block size and fragmentation.

Figure 1 provides the observed I/O bandwidth as a function of the block size and access pattern on the 8 GB Class 6 ADATA SD card. There is little difference in these examples between sequential and random read performance, but a marked distinction on writes, in particular at small block sizes. Figure 2 provides the sequential-to-random write performance ratio for all five cards. A similar random write penalty can be observed across the tested cards, with the exception of the 8 GB SanDisk. The card exhibited comparable sequential and random write performance at 4 KB block sizes but behaved similarly otherwise to its peers past 16 KB. We use the 8 GB ADATA card as a running example in the rest of the paper since its key characteristics are similar to other cards we have examined.

The penalty for a non-sequential write is not uniform; it depends to some extent on the location and distance between the two writes, as well as the history of previous writes. Figure 3 provides the write bandwidth at the 4 KB and 256 KB block sizes when a stride takes place between writes. At a stride of 1 block, we have the sequential case, and performance drops until writes are an allocation unit (AU) apart. The AU is a logical unit provided by the SD card at which erase operations are preferred and speed class calculations performed. For a given card it has a fixed size, dictated by the NAND erase block size and card internal organization (4 MB for the card in the figure). Writes at a smaller granularity can involve a read-modify-write operation. When the stride becomes sufficiently large, we might expect to see a change in performance when once again only a single AU is in use, as strides wrap around at the file size, 128 MB. A performance improvement occurs earlier however, at 32 MB, likely due to an FTL implementation that supports efficient interleaving of writes to multiple AUs as long as sequentiality is maintained within each stream [19].

This effect is visible in Figure 4, where we simulate interleaved sequential writers, with the writes occurring at a distance of 2 AU. Several of the cards show good performance with up to four sequential writers. The PNY card is best with a single writer, but supports 2-4 writers with mid-range performance. The Kingston card supports only a single writer. While it may be tempting to exploit these patterns, they are card and distance specific: with only 5 cards we were able to identify 3 behaviors. We assume the non-sequential write penalty to be high for the rest of this paper, since we are aiming to provide a portable solution where the SD card is unknown.
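The relationship between stride, allocation units and interleaved writers can be checked with a little arithmetic. The sketch below uses the 4 MB AU and 128 MB file from the text; the helper names (`strided_aus`, `interleaved_offsets`, `open_aus`) are hypothetical, and the code only counts which AUs a pattern touches. It says nothing about how a particular FTL behaves.

```python
MB = 1024 * 1024
KB = 1024
AU = 4 * MB  # allocation unit size of the card used in Figure 3


def strided_aus(stride_blocks, block_size, file_size, count, au_size=AU):
    """Number of distinct AUs touched by `count` strided writes that wrap."""
    blocks = file_size // block_size
    return len({((i * stride_blocks) % blocks) * block_size // au_size
                for i in range(count)})


def interleaved_offsets(writers, block_size, rounds, gap_aus=2, au_size=AU):
    """Round-robin offsets for sequential streams started gap_aus AUs apart."""
    return [w * gap_aus * au_size + r * block_size
            for r in range(rounds)
            for w in range(writers)]


def open_aus(offsets, au_size=AU):
    """Number of distinct AUs that a list of byte offsets falls into."""
    return len({off // au_size for off in offsets})


# 4 KB writes one AU apart cycle through every AU of the 128 MB file ...
print(strided_aus(1024, 4 * KB, 128 * MB, 64))
# ... while a stride equal to the file size wraps back onto a single AU.
print(strided_aus(32768, 4 * KB, 128 * MB, 64))
# Four interleaved 256 KB writers, each filling exactly one AU.
print(open_aus(interleaved_offsets(4, 256 * KB, 16)))
```

Under these assumptions the one-AU stride keeps 32 AUs in play, the file-sized stride keeps one, and four interleaved writers keep four, which is the situation Figure 4 probes.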

Even a small number of write accesses in an I/O mixture can drive overall performance towards the write performance curve, as indicated in Figure 5, where write accesses were inserted at random in the I/O mixture at differing percentages.

[Plot: block index (0x02710 to 0x11170) versus time (49 to 50 s); series: Data, Inode, Metadata, Journal]

Figure 7. 1 second sample of browsing session writes on ext3.


3. Virtual machine I/O mixture

The I/O mixture from the perspective of the host is non-sequential for three reasons in our system: guest filesystem design, opportunistic checkpointing and application behavior.

The guest uses ext3 and FAT filesystems over paravirtualized block storage devices. With the simpler FAT filesystem, non-sequentiality can arise from application access patterns and fragmentation. The journaling ext3 filesystem introduces additional non-sequential writes, jumping between ordinary data and the journal. Figure 7 shows a sample of a block write trace to an ext3 partition during an Android 2.2 browsing session. Non-sequential writes can be observed as the application accesses four different regions of data (the four “stripes” below 0x09c40). In addition, accesses to metadata, inodes and the journal interrupt data access with additional non-sequentiality.
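One simple way to quantify how non-sequential such a trace is would be to count the writes that start exactly where the previous write ended. The sketch below is an illustration only: `sequential_fraction` is a hypothetical helper, and the sample trace is invented to mimic data writes interrupted by journal and inode updates. It is not taken from the measurements behind Figure 7.

```python
def sequential_fraction(trace):
    """Fraction of writes that begin right after the previous write ends.

    `trace` is a list of (start_block, block_count) tuples in issue order.
    """
    if len(trace) < 2:
        return 1.0
    sequential = 0
    for (prev_start, prev_len), (start, _) in zip(trace, trace[1:]):
        if start == prev_start + prev_len:
            sequential += 1
    return sequential / (len(trace) - 1)


# Invented example: a data stream interrupted by journal and inode writes.
sample = [
    (1000, 8),   # data
    (1008, 8),   # data, contiguous with the previous write
    (50000, 2),  # journal
    (1016, 8),   # data again, but no longer adjacent to the last write issued
    (50002, 2),  # journal
    (30, 1),     # inode table
]
print(sequential_fraction(sample))
```

Only one of the five transitions in this sample is sequential, even though the data writes on their own form a perfectly sequential stream.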

Product also supports virtual machine checkpointing. It is possible to save all virtual machine state on the host storage and to restore it later. A virtual machine’s state is composed of the following parts:

• The virtual platform, including CPU registers and virtual device state (approximately 200 KB).

• The storage, maintained in persistent images by the paravirtualized block storage devices.

• The memory, which may require on the order of 512 MB of data to be written on checkpoint.

[Plot: page index (0k to 90k) versus time (520 to 700 s); series: Writebacks]

Figure 8. 180 second sample of background cold page writebacks.

Of these components, saving the VM’s large memory dominates the space and time required to save a checkpoint. To shorten the duration of checkpoint creation, unused memory is written to persistent storage proactively in the background. An adapted Clock-Pro [14] working set estimation algorithm is used to select blocks to write. Cold blocks are preferred with the assumption that they are the least likely to change prior to explicit checkpoint creation time. Unfortunately, this working set driven selection often leads to non-sequential ordering of page writebacks, as shown in the sample given in Figure 8.
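Why coldness-driven selection scatters writebacks can be seen with a much simpler policy than Clock-Pro. The sketch below substitutes a plain "longest idle first" ordering for the adapted Clock-Pro algorithm mentioned above, so it is not the selection logic used in the system; `pick_cold_pages` and the timestamps are hypothetical.

```python
def pick_cold_pages(last_access, budget):
    """Return up to `budget` page indices, longest-idle pages first.

    `last_access` maps page index -> time of last access (larger = more recent).
    Ties are broken by page index so that the result is deterministic.
    """
    by_idle = sorted(last_access, key=lambda page: (last_access[page], page))
    return by_idle[:budget]


# Invented access times for six guest pages.
last_access = {0: 90, 1: 5, 2: 80, 3: 7, 4: 95, 5: 1}
print(pick_cold_pages(last_access, 3))
```

The coldest pages here are 5, 1 and 3, in that order, so writing them back in selection order produces a descending, non-contiguous sequence of page indices rather than a sequential sweep.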

Application behavior in the presence of the guest buffer cache introduces another source of non-sequential I/O. We examined several mobile workloads by generating traces of their I/O (details of the tracing procedure are given in Section 5).

1. Android Boot. The initial boot of an Android OS. The trace ends when Android issues its BOOT_COMPLETED intent. A considerable number of writes occur during Android’s optimization of application bytecode.

2. Contacts Database. Import 2000 contacts into the Android Contacts application. Search for and delete 40 contacts.

3. Mail Client. Use the Android Mail 2.2.1 client to access an IMAP mailbox. The mailbox is 24 MB and contains 356 messages and 3 folders with lengths and attachments generated by the SPECmail2009 benchmark [27] initialization script to reflect size distributions of a large corporation.

4. Slideshow. Browse through 52 NASA images [18] using the Astro file browser [17]. Astro creates thumbnails and scales photos from their original size to fit on the device’s 800x480 pixel screen.

5. Web Browsing. A one second sample of web browsing activity using the Android 2.2 browser.

The Android guest that produced the traces has 5 partitions as shown in Table 2. Note that squashfs partitions are not writable and are managed by flat files in our implementation rather than LBS.

Page 21

VM image storage on SD cards - performance

[Figure 7: block write trace. x-axis: Time (s), 49 to 50; y-axis: Block index, 0x02710 to 0x11170; series: Data, Inode, Metadata, Journal.]

Figure 7. 1 second sample of browsing session writes on ext3.

in the I/O mixture at differing percentages. Beyond 10% there is little difference between a mixed and pure write workload.

Fragmentation within an AU will result in a decrease in sequential write performance. In the above experiments, the image file had zero measured fragmentation, with logically contiguous blocks placed on contiguous FAT clusters. Figure 6 presents a comparison of sequential I/O performance in the non-fragmented and worst case fragmentation cases. Worst case fragmentation here is simulated by placing files of 1 MB in size until the card is filled, and then releasing a single FAT cluster from each file prior to image file allocation. This results in the image file being spread across the maximum number of AUs. The curves representing performance of sequential I/O on a non-fragmented filesystem are the same as those in Figure 1. When the file system is severely fragmented, data is only contiguous within a 32 KB FAT cluster, so read performance is limited by the speed of 32 KB reads. Sequential write performance devolves to that of 32 KB random writes.

Beyond the above observations, there are implementation details that apply to specific cards. For example, the initial blocks backing the file allocation table are optimized for the smaller, non-sequential writes that occur in the region [16]. These card specific details are also not relied upon in the LBS design presented in Section 4.

3. Virtual machine I/O mixture

The I/O mixture from the perspective of the host is non-sequential for three reasons in our system: guest filesystem design, opportunistic checkpointing and application behavior.

The guest uses ext3 and FAT filesystems over paravirtualized block storage devices. With the simpler FAT filesystem, non-sequentiality can arise from application access patterns and fragmentation. The journaling ext3 filesystem introduces additional non-sequential writes, jumping between ordinary data and the journal. Figure 7 shows a sample of a block write trace to an ext3 partition during an Android 2.2 web browsing session. Non-sequential writes can be observed as the application accesses four different regions of data (the four “stripes” below 0x09c40). Accesses to metadata, inodes and the journal interrupt data access with additional non-sequentiality.
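One simple way to quantify how non-sequential such a trace is: count the writes that begin exactly where the previous write ended. The sketch below assumes a trace given as (start block, block count) pairs; the function name and the sample trace are illustrative and are not taken from the tracing procedure of Section 5.

```python
def sequential_fraction(trace):
    """Fraction of writes that start exactly where the previous write ended.

    `trace` is a list of (start_block, block_count) tuples in issue order.
    """
    if len(trace) < 2:
        return 1.0
    sequential = 0
    prev_end = trace[0][0] + trace[0][1]
    for start, count in trace[1:]:
        if start == prev_end:
            sequential += 1
        prev_end = start + count
    return sequential / (len(trace) - 1)


# Hypothetical trace: a run of data writes interrupted by one journal write
# located far away on the partition.
trace = [(100, 8), (108, 8), (40000, 1), (116, 8), (124, 8)]
print(sequential_fraction(trace))
```

A single distant journal write breaks two transitions here: the jump to the journal and the jump back to the data region.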



Page 22

FAT filesystem limitations

§ Reliability
• Dropped phone, battery depletion, host kernel crash
• No journalling on FAT, meta-data loss and corruption possible
• Data loss, no barrier semantics

§ Security
• Threat model
  • Physical attacks
  • Malicious host applications
• No access control on FAT
• Confidentiality and integrity concerns
  • Replay attacks
  • Randomization attacks

Page 23

Mobile storage virtualization

Page 24

Logging Block Store (LBS)

[Diagram labels: Data (FAT), Meta-data (ext4)]

§ Log structured VM image format
• Sequentialize guest I/O op mixture
• Write buffering to increase SD card I/O sizes

§ Reliability
• Barrier entries inserted in log when guest kernel issues hardware drain reqs
• Rollback to last barrier on recovery
• Capture intended application semantics (e.g. SQLite transactions)

§ Security
• Split data (SD card, FAT) + meta-data logs (internal, ext4)
• XTS-AES block encryption of data log
• Fletcher-32 or SHA-256 integrity checksums

controllers (utilizing a minimum of costly SRAM) that perform extremely poorly with small random writes [1, 6]. The I/O mixture from the guest is far less sequential than that of the media workloads that SD cards are intended for. In addition, FAT does not support Unix permissions and does not provide robustness guarantees in the event of a host upset. Users cannot be expected to reformat the SD card due to the non-perturbation requirement. These challenges motivate a new VM backing store and checkpoint storage system capable of meeting the constraints outlined above while performing the bulk of storage on FAT-formatted microSD cards.

Our contributions in this paper are as follows:

• An empirical characterization of the unique storage characteristics of SD cards and Android VM workloads.

• A storage architecture and block storage format, which we refer to as the logging block store (LBS), capable of providing the desired impedance matching between our enterprise VMs and low cost consumer-grade SD card storage.

• Experimental evaluation of LBS and a performance characterization.

• Potential optimizations at other levels of the I/O stack capable of improving VM performance if adopted in mobile platforms.

While none of the techniques we employ in LBS are particularly novel, to the best of our knowledge this is the first system to bridge the gap between the high performance/reliability/security requirements of a VM and the characteristics of the low-cost solid state storage on mobile devices.

In the rest of the paper, we first show how device performance characteristics (Section 2) and virtual machine workload characteristics (Section 3) motivate the design of LBS. This is followed by the design and implementation details for LBS in Section 4 and evaluation in Section 5. The paper concludes with suggested optimizations in Section 6, related work in Section 7 and future directions in Section 8.

2. SD card performance characteristics

An SD card is composed of NAND devices, providing the raw storage media, organized by an FTL into a logical block structure that is exported across an SD card bus connector. The FTL performs wear leveling, error detection and the remapping of bad blocks. The limiting storage performance characteristics are hence dictated by the FTL, NAND read/write/erase times and page/erase block organization. For cost reasons, the FTLs are optimized for simplicity of implementation and minimization of SRAM, distinguishing SD cards from their richer cousins, solid state disks, which have more significant resources available for the FTL. The random access property of NAND is as a result constrained by the FTL, with the internal data structures utilized by simple FTLs being optimized for sequential write patterns and coarse block operations [6].

SD cards are rated by speed classes, e.g. Class 2, Class 10, indicating the expected minimum sequential I/O bandwidth (MB/s) in the presence of zero fragmentation [26]. Unfortunately, this rating provides no guarantee of random or fragmented I/O performance. We present some illustrative examples of these characteristics below, gathered on an HTC Nexus One smartphone by a synthetic tool, sdperf, designed to characterize SD cards. sdperf opens a file or raw block device and performs read or write I/O of specific sizes to the target file. Sequential, strided, partitioned and random patterns are supported.

Below we describe the results of various I/O read/write patterns within a preallocated 128 MB file, intended to be representative of a VM disk image file. The page cache layer in the Linux kernel was bypassed with O_DIRECT to avoid interference. Five SD cards from different manufacturers and with different speed class ratings were analyzed; card specific details are provided in Table 1. The 8 GB SanDisk card packaging was labeled as being Windows Phone 7 compliant, indicating potential improved support for random read/write operations [29].

Manufacturer     Capacity  Class  Alloc. unit  FAT cluster
SanDisk™         4 GB      4      4 MB         32 KB
SanDisk™ (WP7)   8 GB      4      4 MB         32 KB
Kingston™        4 GB      4      4 MB         32 KB
ADATA™           8 GB      6      4 MB         32 KB
PNY™             16 GB     10     4 MB         32 KB

Table 1. SD card details.

Figure 1. 8 GB ADATA Class 6 SD card I/O bandwidth as a function of block size and I/O ordering.

Figure 2. Sequential:random write bandwidth ratio as a function of block size.

Figure 1 provides the observed I/O bandwidth as a function of the block size and access pattern on the 8 GB Class 6 ADATA SD card. There is little difference in these examples between sequential and random read performance, but a marked distinction on writes, in particular at small block sizes. Figure 2 provides the sequential-to-random write performance ratio for all five cards. A similar random

Page 25

LBS Meta-data structure

LBS header

Logical block index Physical block index

Zero block? & Run length (n) Physical block index

Checksum & Timestamp [0] Checksum & Timestamp [...]

Checksum & Timestamp [n-1] Barrier magic

Meta-data checksum

Page 26

Barrier magic :

Checksums :

LBS Meta-data check : File is correct

1) Parse meta-data to find barrier magic

2) Compute and compare meta-data checksums

3) Repeat for each meta-data barrier entry
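The three steps above can be sketched as a scan over a byte string of fixed-size meta-data entries. Everything concrete here is an assumption made for illustration: the 8-byte entry size, the `LBSB` magic value, and CRC-32 standing in for the Fletcher-32 or SHA-256 checksums named earlier. A real format would also need to guard against a mapping entry that happens to begin with the magic bytes.

```python
import struct
import zlib

BARRIER_MAGIC = b"LBSB"   # hypothetical 4-byte barrier marker
ENTRY_SIZE = 8            # hypothetical fixed entry size in bytes


def append_barrier(meta: bytearray, start: int) -> int:
    """Seal meta[start:] with a barrier entry; return the offset after it."""
    checksum = zlib.crc32(bytes(meta[start:]))
    meta += BARRIER_MAGIC + struct.pack("<I", checksum)
    return len(meta)


def last_valid_offset(meta: bytes) -> int:
    """Return the offset just past the last barrier whose checksum matches."""
    valid_end = 0
    pos = 0
    while pos + ENTRY_SIZE <= len(meta):
        entry = meta[pos:pos + ENTRY_SIZE]
        if entry[:4] == BARRIER_MAGIC:                  # 1) find barrier magic
            (stored,) = struct.unpack("<I", entry[4:])
            if zlib.crc32(meta[valid_end:pos]) != stored:
                break                                   # 2) checksum mismatch
            valid_end = pos + ENTRY_SIZE
        pos += ENTRY_SIZE                               # 3) repeat
    return valid_end


meta = bytearray()
meta += struct.pack("<II", 5, 0)      # logical block 5 -> physical block 0
meta += struct.pack("<II", 9, 1)
end1 = append_barrier(meta, 0)
meta += struct.pack("<II", 7, 2)
end2 = append_barrier(meta, end1)

intact = last_valid_offset(bytes(meta))
truncated = last_valid_offset(bytes(meta[:36]))   # final barrier cut short
damaged = bytearray(meta)
damaged[25] ^= 0xFF                               # flip bits in the last mapping
corrupted = last_valid_offset(bytes(damaged))
print(intact, truncated, corrupted)
```

In both failure cases the scan reports the end of the first barrier entry, so meta-data written after the previous valid barrier is ignored.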

Page 27

LBS Meta-data corruption : corrupted file

Barrier magic :

Checksums :

Last meta-data checksum does not match barrier checksum.

Page 28

LBS Meta-data corruption: corrupted file

Barrier magic :

Checksums :

Last meta-data checksum does not match barrier checksum.

Ignore meta-data after previous valid barrier entry.

Page 29

Performance results

[Figure 11: two panels. x-axis: Block size, 1 KB to 8 MB; y-axis: Bandwidth Ratio (LBS/Flat). Read panel: series Seq, Read and Rand, Read, ticks 0.60 to 1.00. Write panel: series Seq, Write and Rand, Write, ticks 1, 10, 100, 1000.]

Figure 11. 8 GB ADATA Class 6 SD card I/O bandwidth: Ratio between LBS and Flat file format.

of bars for each partition that had non-negligible traffic. 160 MB was set aside to provide LBS with sufficient free blocks such that garbage collection does not occur; the impact of garbage collection is shown in Section 5.3.

Recall that the Contact Database workload is almost completely small writes with little sequentiality and a relatively small number of barriers. Thus, it exhibits a dramatic 17× improvement over the flat virtual disk due to LBS transforming the I/O into large, sequential writes. The other write-heavy workloads benefit as well. The email workload is nearly 5× faster with LBS than it is with a flat file; the reason it is not better may be due to it having the highest percentage of barriers within its I/O trace. Each barrier forces a flush of the LBS write buffer, and smaller writes are less efficient.

5.3 Write amplification

As described in Section 4.4, a garbage collection thread runs when space is low to provide free clusters for future writes. While this causes additional write operations to occur, their cost can be mitigated for two reasons. First, if no other write activity is required, the additional writes can occur in the background. Second, the writes will be to contiguous blocks which are relatively inexpensive on SD cards.

In Table 5, we show this write amplification due to LBS for the two workloads that triggered garbage collection. The table aggregates the writes from all of the virtual disk partitions. In this context, we ignore the effect of hardware-level writes that may occur depending on the implementation of the SD card’s flash translation layer. Column 1 lists the number of 1 KB writes generated by the guest block driver. Note that these writes have been filtered by the guest OS buffer cache at the time of trace capture.

Column 2 shows the number of writes that actually occur, measured in units of both 1 KB blocks and 256 KB LBS clusters. These counts were gathered by setting aside 100% of the virtual disk’s space as a buffer for the garbage collector so that the workload never triggers the GC threshold, and no GC writes occur. A small reduction in writes occurs in the Email Client workload due to overwriting stale data in the LBS write buffer.

trace     requested writes  LBS writes without GC        LBS writes with GC
Contacts  219124 blocks     219124 blocks, 855 clusters  223593 blocks, 873 clusters (802 contiguous)
Email     216057 blocks     215972 blocks, 843 clusters  220160 blocks, 860 clusters (813 contiguous)

Table 5. Software-level write amplification due to garbage collection. 12% additional storage used for garbage collection.

Column 3 reflects both the reduction in writes due to buffering and the write amplification due to garbage collection (the LBS data file had 12% extra space for GC). While the number of blocks that must be written has increased, recall that blocks belong to larger clusters, many of which are contiguous. The large size and high contiguity make LBS efficient.
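The amplification implied by Table 5 can be worked out directly from its block counts. The sketch below only restates the table's numbers; the dictionary layout and function name are illustrative, and the ratio is taken relative to the requested writes in column 1.

```python
CLUSTER_BLOCKS = 256  # 256 KB cluster / 1 KB block

# Block counts copied from Table 5.
table5 = {
    "Contacts": {"requested": 219124, "no_gc": 219124, "with_gc": 223593},
    "Email":    {"requested": 216057, "no_gc": 215972, "with_gc": 220160},
}


def write_amplification(row):
    """Blocks written with GC enabled per block requested by the guest."""
    return row["with_gc"] / row["requested"]


for name, row in table5.items():
    saved = row["requested"] - row["no_gc"]          # absorbed by the buffer
    clusters = row["with_gc"] // CLUSTER_BLOCKS      # whole clusters written
    print(f"{name}: amplification {write_amplification(row):.3f}, "
          f"{saved} blocks saved by buffering, {clusters} clusters with GC")
```

Both workloads write roughly 2% more blocks than the guest requested, and integer division by 256 reproduces the cluster counts listed in the table.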

5.4 Cost of encryption and integrity checking

Figure 12 also shows how encryption and integrity checking affect performance. The cost of encryption varies for each trace from as little as 2% for the Email Client workload, with its high barrier percentage, to as much as 35% for Android Boot. The relative cost of encryption increases as block size increases. For larger blocks, fixed I/O overhead represents a smaller fraction of overall time, and encryption overhead (which is proportional to block size) has a greater relative impact. Investigating the use of hardware encryption offload engines is left as future work.

As mentioned in Section 4.2, data blocks can be protected by the fast Fletcher checksum or a slower-to-compute, cryptographically strong SHA-256 checksum. The incremental cost of integrity checking with Fletcher-32 ranges from 1%–6%. Unfortunately, cryptographic hashes such as SHA-256 are relatively expensive to compute on each block. Using a C implementation of Fletcher-32 and the standard Android 2.2 OpenSSL implementation on a Nexus One device (in ARM assembly language), the following throughput figures (with 1 KB block sizes) represent the speed at which memory contents can be checksummed. SHA-256 requires nearly 7.7× longer than Fletcher-32 to process data in these tests.

SHA-1               46.83 MB/s
SHA-256             31.26 MB/s
Fletcher (32 bits)  240.81 MB/s

Corporate policies may demand cryptographically strong hash algorithms such as SHA-256. In our tests, sdperf results were degraded by approximately 4–8% for block sizes less than 4 KB when SHA-256 was used instead of Fletcher-32. Overhead of the more expensive hash function increases with larger I/O as the cost of computation becomes a larger portion of the total I/O cost. While this is non-negligible, even the worst-case sdperf result (17% reduction in bandwidth for 256 KB reads, 13% reduction for writes) is much better than the 7.7× slowdown observed in memory-only tests.
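For readers unfamiliar with the two checksum choices, the sketch below computes both for a 1 KB block. The Fletcher-32 variant shown (little-endian 16-bit words, sums modulo 65535, zero padding for odd lengths) is the common textbook form and is assumed here; whether it matches the C implementation measured above is not stated in the text. SHA-256 comes from Python's standard `hashlib`.

```python
import hashlib


def fletcher32(data: bytes) -> int:
    """Textbook Fletcher-32 over little-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                    # pad odd-length input
    s1 = s2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        s1 = (s1 + word) % 65535
        s2 = (s2 + s1) % 65535
    return (s2 << 16) | s1


block = bytes(range(256)) * 4              # one 1 KB block of sample data
weak = fletcher32(block)                   # cheap, detects accidental damage
strong = hashlib.sha256(block).hexdigest() # expensive, cryptographically strong

# Ratio of the memory-only throughput figures quoted above.
slowdown = 240.81 / 31.26
print(f"fletcher32={weak:#010x} sha256={strong[:16]}... slowdown={slowdown:.1f}x")
```

The slowdown printed at the end is just the quotient of the two throughput figures, which is where the 7.7× in the text comes from. Pure Python timings of these two functions would not be representative of the C and assembly implementations measured on the device.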

6. Optimizing the mobile I/O stack

LBS effectively bridges the gap between the available storage media characteristics and VM I/O requirements. Enhancements in the guest, host and SD card have potential to further improve LBS performance and/or simplify the VM image format. These optimizations may prove beneficial even on non-virtualized systems.

Page 30

Performance results

[Figure 12: bar chart. x-axis: trace and partition: android boot (/data), contacts (/data), email (/data), browser (/data), slideshow (/data), slideshow (/sdcard); y-axis: Speedup vs Flat, ticks 1 to 20; series: LBS, LBS + checksum, LBS + encryption, LBS + checksum and encryption.]

Figure 12. Performance of I/O traces without garbage collection.

6.1 Guest

The sequential vs. random and block size write asymmetry may also be addressed at the filesystem level in the guest, rather than with the block storage virtualization layer, with a suitable guest log structured filesystem. The YAFFS and JFFS families of flash filesystems employ a log structure and appear natural candidates on Android. Beyond scalability concerns, the chief drawback in a virtualized setting is the increased complexity at the guest-VMM surface, where a device model capable of emulating NAND devices is required, since these filesystems leverage the spare area associated with pages in NAND. MVP currently presents a simple paravirtualized block device to the kernel’s block layer, minimizing both guest driver and VMM complexity and overhead. In addition, such filesystems are optimized for direct NAND control and some translation will be required to bridge between the FTL-abstracted NAND on the SD card and the virtual NAND devices. Other Linux log structured filesystems, e.g. LogFS and NILFS, are experimental as of 2.6.35 and are unsuitable for enterprise VMs as a result.

A stable implementation of a Linux log structured filesystem operating on the block layer will open up the possibility of deployment within MVP guests, lessening the need for the log structure of LBS, but not the security and reliability aspects. The guest memory working set behavior and resulting checkpoint subsystem writebacks cannot be modified as simply as the filesystem, since these are properties of the application and kernel and not under our control. As a result, the host filesystem is still presented with a non-sequential I/O stream, and a system such as LBS will continue to be advantageous for VM checkpointing on SD cards, even if a log structured filesystem is in use in the guest.

6.2 Host

The fundamental assumption guiding a hosted mobile hypervisor architecture is that the host OS and hardware are the purview of OEMs, silicon and mobile OS vendors. Below we provide several suggestions for these entities that should prove beneficial for VMs or applications that make extensive use of the SD card:

• As mentioned in Section 4, the LBS data files are allocated on the FAT filesystem at initialization time, reducing fragmentation and ensuring availability of space at runtime. This is a slow operation for large images: since FAT does not support sparse files or extents, space must be reserved by allocating each block of the file (e.g., with zeroes). MVP provides a host kernel patch that allocates filesystem structures for an LBS file and omits the block zeroing. OEMs may optionally apply this patch to improve the speed of VM provisioning.

• Similar to the guest in Section 6.1, formatting the host’s SD card with alternate filesystems could lessen the requirements on the virtualization layer in terms of I/O reordering, security and reliability. This would come at the cost of limiting the inter-operability of the SD card with other devices expecting a FAT filesystem or via USB mass storage. In addition, SD card FTLs are optimized for FAT and the use of other filesystems will require careful tuning [5] (or cooperation from SD card manufacturers to design hardware to suit a new filesystem standard).

  There is also a trend towards the use of eSD/eMMC chips on recent phones for internal storage. These devices share the problems of microSD cards discussed above, with the exception that they can use a custom filesystem and often employ ext3/ext4. The relationship between these filesystems, the more general I/O mixture from Android applications and middleware and the FTLs on the devices is an interesting area for exploration.

• SD card access control granularity can be improved without modifying the filesystem, for example by the use of loopback mounted encrypted images on the SD card with dm-crypt [24], with mounts restricted to specific applications or capabilities. This approach has been supported for read-only application code since Android 2.2, but not for application data, which is where the VM images are located. In addition, integrity checking is currently lacking from solutions based around these mechanisms on the Android and Linux platforms.

• The TRIM command for solid-state disks provides a means to detect when a block is no longer in use. TRIM commands are supported by LBS, which marks given blocks as free in the meta-data. This can reduce the garbage collector overhead and increase the write bandwidth.

  An example of where we greatly benefit from TRIM in LBS GC is the interaction between memory ballooning [30], used to balance memory between the host and guest, and the checkpoint subsystem’s continuous writeback. It is common for the balloon to release large amounts of guest memory, which translates directly to the discarding of the released pages from the checkpoint image.
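The effect of TRIM on garbage collection cost can be sketched by tracking which blocks in each cluster are still live. This is an illustrative model only: the class and method names are hypothetical, the 256-block cluster uses the 1 KB block and 256 KB cluster sizes from Section 5.3, and the blocks passed to `trim` are assumed to have already been translated from logical to physical indices.

```python
BLOCKS_PER_CLUSTER = 256  # 1 KB blocks in a 256 KB cluster


class LiveBlockMap:
    """Tracks live physical blocks per cluster for a toy garbage collector."""

    def __init__(self):
        self.live = {}  # cluster index -> set of live physical blocks

    def mark_written(self, physical_block):
        cluster = physical_block // BLOCKS_PER_CLUSTER
        self.live.setdefault(cluster, set()).add(physical_block)

    def trim(self, physical_blocks):
        # Blocks the guest no longer needs are simply dropped from the map.
        for block in physical_blocks:
            cluster = block // BLOCKS_PER_CLUSTER
            self.live.get(cluster, set()).discard(block)

    def gc_copy_cost(self, cluster):
        """Blocks the collector must copy elsewhere before reusing `cluster`."""
        return len(self.live.get(cluster, ()))


blocks = LiveBlockMap()
for b in range(BLOCKS_PER_CLUSTER):       # fill cluster 0 with checkpoint pages
    blocks.mark_written(b)
before = blocks.gc_copy_cost(0)
blocks.trim(range(200))                   # balloon releases 200 of those pages
after = blocks.gc_copy_cost(0)
print(before, after)
```

Without the TRIM, reclaiming the cluster would mean copying every block in it. After the discard only the remaining live blocks need to move.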

Page 31

Related and future work

Page 32

Lots of SSD and Flash papers, but little SD card-specific work

§ SD Card characterization
• Arnd Bergmann. Optimizing Linux with cheap flash drives. Linux Weekly News, February 2011. / Linaro Project.
  • Most inspirational of related work.
• Luc Bouganim, Björn Þór Jónsson, and Philippe Bonnet. uFLIP: Understanding flash IO patterns. In Conference on Innovative Data Systems Research, January 2009.
  • Very thorough set of microbenchmarks
• Hyojun Kim, Nitin Agrawal, and Cristian Ungureanu. Revisiting Storage for Smartphones. In USENIX FAST, February 2012.
  • Very good characterization of impact on end-user applications

Page 33

Comparison of log-structured filesystems

Name Block level

Current / Mature / Stable / Popular

Encrypting Compressing

Sprite LFS [1,2] Cloudburst [3] JFFS2 [4] YAFFS2 [5] NANDFS [6] LogFS [7] NILFS2 [8] LBS

Page 34

Mobile virtualization stack wishlist

§ Guest filesystem
• YAFFS/JFFS increase complexity (NAND device virtualization)
• Log structured filesystems for block devices (LogFS, NILFS)

§ SD card
• FTL and filesystems that match emerging application use cases
• Ratings beyond sequential SD card speed classes
• Embedded SD (eSD/eMMC)

§ Host
• Hardware acceleration of cryptographic operations
• Hardware support for secure key storage
• Loopback support for encrypted (or journalled, logged) filesystems

Page 35

Wrapping up

§ Identified storage challenges on mobile platforms
• Characterization of performance of mobile storage media and workloads
• Security and reliability challenges

§ Storage virtualization for mobile media
• High performance log-structured VM image format for SD cards
• 5x-17x speedup over linear VM image formats
• Resistant to battery failure, phone drops, host crashes
• Secure against malicious host applications

§ Improvements to mobile storage stack
• Both native and virtualized

Page 36

Log structured filesystems. None was a perfect match for us.

§ Introduced in Sprite
1. J. Ousterhout and F. Douglis, "Beating the I/O Bottleneck: A Case for Log-Structured File Systems," Operating Systems Review, Vol. 23, No. 1, January 1989.
2. Mendel Rosenblum and John K. Ousterhout. The design and implementation of a log-structured file system. In ACM Symp. on Operating Systems Principles, October 1991.

§ Adopted for flash
3. Gretta Bartels and Timothy Mann. Cloudburst: A compressing, log-structured virtual disk for flash memory. Tech. Report 2001-001, Compaq SRC, Feb. ‘01.
4. David Woodhouse. JFFS: The journaling flash file system. In Ottawa Linux Symposium, July 2001.
5. Charles Manning. YAFFS: the NAND-specific flash file system - Introductory Article. http://linuxdevices.org, September 2002.
6. Aviad Zuck, Ohad Barzilay, and Sivan Toledo. NANDFS: a flexible flash file system for ram-constrained systems. In ACM Int’l Conf. on Embedded Software, October 2009.

§ Among emerging Linux filesystems (popularity of SSDs, desire for snapshots)
7. LogFS. http://logfs.org (defunct?)
8. NILFS: http://www.nilfs.org

Page 37

LBS Meta-data structure

LBS header

Logical block index Physical block index

Zero block? & Run length (n) Physical block index

Checksum & Timestamp [0] Checksum & Timestamp [...]

Checksum & Timestamp [n-1] Barrier magic

Meta-data checksum

Page 38

LBS Meta-data check : barrier magics & checksums

Barrier magic :

Checksums :

Page 39

LBS Meta-data check : barriers magic & checksums

Barrier magic :

Checksums :

1) Parse meta-data to find barrier magic

Page 40

Barrier magic :

LBS Meta-data check : barriers magic & checksums

Checksums :

1) Parse meta-data to find barrier magic

2) Compute and compare meta-data checksums

Page 41

Barrier magics :

Checksums :

LBS Meta-data check : File is correct

1) Parse meta-data to find barrier magic

2) Compute and compare meta-data checksums

3) Repeat for each meta-data barrier entry

Page 42

Barrier magics :

Checksums :

LBS Meta-data corruption #1 : truncated file

Use case : on power loss, meta-data writes haven't been completed.

Page 43

Barrier magics :

Checksums :

LBS Meta-data corruption #1 : truncated file

Last barrier magic is not found.

Page 44

Barrier magics :

Checksums :

LBS Meta-data corruption #1 : truncated file

Last barrier magic is not found.

Ignore meta-data after previous valid barrier entry.

Page 45

LBS Meta-data corruption #2 : corrupted file

Barrier magics :

Checksums :

Use case : on power loss, not all meta-data have been written to disk.

Page 46

LBS Meta-data corruption #2 : corrupted file

Barrier magics :

Checksums :

Last meta-data checksum does not match barrier checksum.

Page 47

LBS Meta-data corruption #2 : corrupted file

Barrier magics :

Checksums :

Last meta-data checksum does not match barrier checksum.

Ignore meta-data after previous valid barrier entry.

Page 48

Garbage Collection: Cluster Selection Heuristics

§ Write position
• If we are writing cluster 21, getting 22 freed would be a huge bonus

§ Emptiness
• Cluster size – Used Blocks

[Diagrams: two rows of clusters 22 to 27; label: Preferred cluster]

Page 49

Garbage Collection: Cluster Selection Heuristics

§ Left Empty
• Cluster 24 is a good candidate because it frees up contiguous clusters (23 and 24)

§ Outlier correction
• Reclaiming cluster 8 increases the ability to form a larger run of clusters with subsequent application of heuristics.

[Diagrams: one row of clusters 22 to 27 and one row of clusters 5 to 10; label: Preferred cluster]
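A rough sketch of how the first three heuristics might combine into one score per cluster. The weights, the 256-block cluster size, the function names and the sample occupancy figures are all assumptions made for illustration; the slides do not say how LBS weighs the heuristics, and outlier correction is not modelled.

```python
CLUSTER_SIZE = 256  # blocks per cluster (assumed)


def score(cluster, used, write_pos, next_bonus=128, left_bonus=64):
    """Higher score means a better cluster to reclaim."""
    value = CLUSTER_SIZE - used[cluster]   # emptiness: cluster size - used blocks
    if cluster == write_pos + 1:           # write position: next cluster to write
        value += next_bonus
    if used.get(cluster - 1) == 0:         # left empty: extends a free run
        value += left_bonus
    return value


def pick_cluster(used, write_pos, **weights):
    """Choose the best non-empty cluster other than the one being written."""
    candidates = [c for c, n in used.items() if n > 0 and c != write_pos]
    return max(candidates, key=lambda c: score(c, used, write_pos, **weights))


# Hypothetical used-block counts for clusters 22 to 27; cluster 23 is free.
used = {22: 200, 23: 0, 24: 180, 25: 250, 26: 190, 27: 256}

print(pick_cluster(used, write_pos=21))   # writer is just before cluster 22
print(pick_cluster(used, write_pos=10))   # writer is far away
```

With the writer at cluster 21, the write-position bonus makes cluster 22 the choice. With the writer elsewhere, cluster 24 wins because freeing it joins the already empty cluster 23 into a contiguous run.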