Improving MeeGo boot-up time
hiroshi-doyu
Outline: Preface, Background, Handset Boot-Up status, My experiment, Further optimization idea, Q & A
Preface
Inspired by QuickBoot
Ubiquitous QuickBoot
http://www.ubiquitous.co.jp/En/products/middleware/quickboot
Embedded Linux Wiki
Boot Time - eLinux.org
http://elinux.org/Boot_Time
Tribute to Tim Bird
Improving Android Boot-Up Time
Background
Impact of boot-up time
For consumer client devices
  User experience
    TV, IVI, camera: immediate action is preferable right after power-on.
    Tablet, netbook, handset: is a cold start really necessary?
    More complicated S/W stacks consume more memory.
  Mass production test
    The more time a device spends on the production line, the more expensive it is.
Boot-Up time definition
Until when?
  When the login prompt appears.
  When the desktop shows up.
  When the network is available.
  When the browser is ready.
  When it can take a picture.
  When the CPU goes into idle.
This depends on:
  Your H/W configuration.
  Your S/W configuration.
  Your system requirements.
The shortest isn’t always the best.
Measurement method(kernel)
printk timestamps
  show_delta: linux-2.6/scripts/show_delta, a Python script
initcall debugging
  dmesg -s 256000 | grep "initcall" | \
    sed "s/\(.*\)after\(.*\)/\2 \1/g" | sort -r -n
bootgraph
  dmesg | \
    linux-2.6.git/scripts/bootgraph.pl > output.svg
ftrace
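The initcall pipeline above can also be written as a small script. This is a minimal sketch, assuming the kernel was booted with initcall_debug and prints lines of the form "initcall NAME+0x.../0x... returned N after M usecs"; the function name slowest_initcalls is hypothetical and not part of any kernel tool.

```python
import re

# Assumed line format (kernel booted with initcall_debug), for example:
# "[    0.512345] initcall foo_init+0x0/0x40 returned 0 after 976 usecs"
INITCALL_RE = re.compile(
    r"initcall\s+(\S+).*?returned\s+(-?\d+)\s+after\s+(\d+)\s+usecs"
)


def slowest_initcalls(dmesg_text, top=10):
    """Return (usecs, name) pairs for the slowest initcalls, slowest first."""
    results = []
    for line in dmesg_text.splitlines():
        match = INITCALL_RE.search(line)
        if match:
            # Drop the "+0x0/0x40" offset so only the function name remains.
            name = match.group(1).split("+")[0]
            results.append((int(match.group(3)), name))
    results.sort(reverse=True)
    return results[:top]
```

Feeding it the saved output of dmesg gives the same ranking as the sed/sort pipeline, with the duration as the first element of each pair.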
Measurement method(userland)
uptime
  / # cat /proc/uptime
  18.73 14.24
  / # cat /proc/uptime
  20.55 16.05
bootchart
  A newer version is released in MeeGo.
  No additional tool is needed to create the SVG; it is created directly.
Entire measurement, including bootloader, kernel and userland
  grabserial
  show_delta, again
oprofile
ETM (Embedded Trace Macrocell), H/W assisted
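For the uptime method, the two numbers in /proc/uptime are the seconds since boot and the idle time accumulated across all CPUs. A minimal sketch for comparing two samples follows; the helper names are hypothetical.

```python
def parse_uptime(text):
    """Parse the content of /proc/uptime into (uptime_sec, idle_sec)."""
    up, idle = text.split()[:2]
    return float(up), float(idle)


def uptime_delta(before, after):
    """Elapsed and idle seconds between two /proc/uptime samples."""
    up0, idle0 = parse_uptime(before)
    up1, idle1 = parse_uptime(after)
    # /proc/uptime reports two decimals, so round away float noise.
    return round(up1 - up0, 2), round(idle1 - idle0, 2)
```

With the two samples shown above, uptime_delta("18.73 14.24", "20.55 16.05") gives 1.82 seconds elapsed, of which 1.81 seconds were idle.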
Existing Optimization techniques
kernel optimization
  asynchronous initcall
  asynchronous resume/suspend
  misc: preset lpj, no probe, no console, deferred module loading
userland optimization
  initscript: upstart or systemd. Do it in parallel.
  readahead
  prelink
hibernation based optimization
  snapshot boot
  InstantBoot
  Warp2
  QuickBoot
BIOS/bootloader assisted.
Is cold start still necessary?
Do we need a cold start so often?
  Flashing a hibernation image in advance could reduce the time spent on the production line.
Optimization may depend on the product-specific parts:
  S/W configuration
  H/W configuration
  Your system requirements
Wouldn’t hibernation be ok in most cases?
Handset Boot-Up status
Handset requirement
Responsiveness of device/applications
  Quick response could improve UX, especially on handsets.
  One touch can choose a friend from the "contact list".
  One touch can start the camera, the same as a digital camera.
  One touch can start web browsing.
  A call has to be processed within a short time, as required by operator specs.
Resolving dynamic libraries takes more time than swapping in pages.
  All major applications can be started but kept invisible, then made visible upon request.
  RAM is occupied with started applications/daemons.
Handset Boot-Up time
N900 boot-up takes ~40 sec until the desktop shows up.
Number of applications: 137
Swap status
Handset Bootgraph
Handset BootChart
Handset memuse
N900 Boot time breakdown
Bootloader: 0.44 sec
Kernel: 2.68 sec, with serial console.
  Could be shorter without serial console.
Desktop: 39.03 sec
My experiment
Target spec
OMAP3 based reference board, similar to N900
512MB RAM
MeeGo Handset
  Number of applications: ~161
  ~120 sec until all applications have finished booting
Swap status
No hibernation support for ARM
There was no hibernation support for ARM.
Picked up an old patch and upgraded it to v2.6.35.
Rejected by RMK because:
  It needs to be synchronized with suspend-to-ram
  Lack of PXA support
  Coprocessor differences between ARM versions
    mrc p15, 0, %0, c2, c0, 0
At least, it works! Let’s proceed.
Which hibernation method to use?
Three implementations of hibernation
1. swsusp: included in the mainline kernel as default.
2. uswsusp: userland implementation
3. TuxOnIce: out of kernel, but with many features
  Compression of images
  multiple thread I/O
  readahead
  LVM support
Start with swsusp
To start hibernation
  echo disk > /sys/power/state
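Reading /sys/power/state lists the sleep states the running kernel offers, so a script can check for "disk" before writing it. The following is a minimal sketch under that assumption; the function names are hypothetical, and the path is a parameter only so the logic can be tried against an ordinary file.

```python
from pathlib import Path


def supported_sleep_states(state_file="/sys/power/state"):
    """Return the sleep states listed by the kernel, e.g. ['mem', 'disk']."""
    return Path(state_file).read_text().split()


def hibernate(state_file="/sys/power/state"):
    """Write 'disk' to the state file, as 'echo disk > /sys/power/state' does."""
    if "disk" not in supported_sleep_states(state_file):
        raise RuntimeError("hibernation ('disk') is not offered by this kernel")
    # On a real system this needs root and does not return until resume.
    Path(state_file).write_text("disk")
```

On a kernel without hibernation support, as was the case for ARM here before the patch, "disk" is missing from the list and hibernate() raises instead of writing.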
swsusp/eMMC
Use mtdblock rather than eMMC
mtdblock is much faster than eMMC.
  mtdblock ~23 MB/sec/READ
  eMMC ~20 MB/sec/READ
  ~15 MB/sec/READ
This is a HACK since:
  mtdblock itself is bogus without wear-leveling support.
  mtdswap is *volatile*.
    Good performance
    But cannot be used for hibernation.
  Need non-volatile mtdswap!!
swsusp/MTD
Port TuxOnIce on ARM
TuxOnIce has many optimization features:
  Compression of images
  multiple threaded I/O
  readahead
  LVM support
To drop pagecache
  echo -2 > /sys/power/tuxonice/image_size_limit
To start hibernation
  echo disk > /sys/power/tuxonice/do_hibernation
TuxOnIce/MTD
Shrink memory before hibernation
Reclaim as much memory as possible right before hibernation.
  echo 10000 > /sys/power/shrink_mem
TuxOnIce/MTD/shrink_mem
What is the bottleneck?
The less RAM consumed, the shorter the boot time.
But it cannot be squeezed any further below a certain size.
  In our case: size: ~110 MB
~70% of boot time is spent on (compressed) image restoration.
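A rough lower bound on restoration time follows from image size and storage read throughput. The sketch below assumes the ~110 MB figure is the amount of data read back from storage and ignores decompression and other overhead, so it only illustrates the order of magnitude; restore_seconds is a hypothetical helper.

```python
def restore_seconds(image_mb, read_mb_per_sec):
    """Estimate the time to read a hibernation image at a given throughput."""
    if read_mb_per_sec <= 0:
        raise ValueError("read throughput must be positive")
    return image_mb / read_mb_per_sec
```

Under these assumptions, restore_seconds(110, 23) is about 4.8 seconds for mtdblock and restore_seconds(110, 20) is 5.5 seconds for eMMC, using the read rates quoted earlier.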
meminfo/shrink_mem
What occupies RAM?
Who uses lots of memory? MeeGo "memuse" can identify them.
Why unevictable?
Recent SoCs have smart coprocessors: GPU, DSP and H/W accelerators.
They may have an IOMMU. More memory could be shared with coprocessors.
http://en.wikipedia.org/wiki/IOMMU
Why does IOMMU have an effect?
Pages have to be DMA'able.
Shared pages have to be pinned.
They shouldn't be swapped out.
Unevictable
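The amount of unevictable memory can be read from the "Unevictable" line of /proc/meminfo on kernels that report it. This is a minimal sketch with hypothetical function names.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            # Most values are in kB; a few fields are plain counts.
            fields[key.strip()] = int(parts[0])
    return fields


def unevictable_share(fields):
    """Fraction of total RAM that is currently unevictable."""
    return fields["Unevictable"] / fields["MemTotal"]
```

Calling parse_meminfo(open("/proc/meminfo").read()) before hibernating shows how much of the image is pinned memory that shrinking cannot remove.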
Further optimization idea
Linearity of hibernation method
Linux VM tries to occupy as much RAM as possible (ex: page cache).
RAM consumption can be squeezed only down to a certain point.
The boot time increases in proportion to the size of unevictable memory.
For further optimization, we need something more!
Proposals
1. To increase read performance of storage
  Faster storage?
    mtd gets shorter boot-up time than eMMC
    faster mtd gets shorter boot-up time than slower mtd
  non-volatile mtdswap driver
  LVM swap to improve disk performance by raid-0
2. Still to decrease image size
  Kill & restart bloated apps if possible.
    Maybe a bit brutal, but it certainly works.
  Swap out unevictable pages
    How to ensure that those pages exist when they are needed?
  page coloring
  memory cgroup: which process's pages can be swapped out
3. Lazy image/page loading
Aren’t we forgetting system responsiveness?
Example: Ubiquitous QuickBoot
Can be considered as "Lazy image/page loading":
http://www.ubiquitous.co.jp/En/products/middleware/quickboot