I/O Stack Optimization for Smartphones


Sooman Jeong¹, Kisung Lee², Seongjin Lee¹, Seoungbum Son², and Youjip Won¹

¹ Dept. of Electronics and Computer Engineering, Hanyang University
² Samsung Electronics

2013 USENIX Annual Technical Conference (ATC’13)


SAN JOSE, CA, USA, JUNE 26~28, 2013

Outline

Motivation
Background
Analysis of the Android I/O Stack
Optimizations of the Android I/O Stack
    Using the optimal journaling mode in SQLite
    Alternative Filesystems
    Eliminating unnecessary metadata flushes
    External journaling
    Using polling based I/O
Evaluations
Demo


Sooman Jeong et al.

USENIX ATC'13, SAN JOSE, CA, June 26~28, 2013



Motivation

Smartphones are everywhere! [KPCB Internet Trends 2013]

Storage I/O is the performance bottleneck in Android.


Android Platform

Apps
Application Framework: Window, Package, Telephony, Contact
Libraries: SQLite, libc
Android Runtime: Core lib, Dalvik VM
Linux Kernel: Storage, Display, WiFi, Filesystem, Power MM, Block Device driver, Audio, Key PAD


I/O stack of Android Platform

SQLite (insert/update/delete)
EXT4 (read/write)
Block Device Driver (CFQ, interrupt-driven I/O)




I/O characteristics of Android Apps (GS3, ICS)

File Types     SQLite > 90%
Block Types    Metadata & Journal > 40%
I/O Modes      Synchronous > 70%
Locality       Random > 80%
I/O Size       4KB I/O > 64%
IRQs           IRQ for eMMC > 18%


SQLite > 90% !!!

Metadata & Journal > 40% !

Synchronous Write > 70% !!!



Journaling in SQLite (Delete Mode)

Insert a database entry (steps in time order; SQLite calls fsync() twice):

Create journal.
Record the data to journal.
Put commit mark to journal.
Insert entry to DB.
Delete journal.
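The lifecycle of the rollback journal in Delete mode can be observed from user space. The following is a minimal sketch using Python's standard sqlite3 module; the scratch database name, the table `t`, and the helper `journal_lifecycle` are hypothetical. It only checks whether the `-journal` file exists during and after a transaction and does not observe the fsync() calls themselves.

```python
import os
import sqlite3
import tempfile


def journal_lifecycle():
    """Return (journal exists mid-transaction, journal exists after commit)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")   # hypothetical scratch database
        journal_path = db_path + "-journal"      # name SQLite gives the rollback journal
        conn = sqlite3.connect(db_path)
        try:
            conn.execute("PRAGMA journal_mode=DELETE").fetchall()
            conn.execute("CREATE TABLE t (v TEXT)")
            conn.commit()
            # The INSERT opens a write transaction, so the journal is created.
            conn.execute("INSERT INTO t VALUES ('x')")
            during = os.path.exists(journal_path)
            # In Delete mode the commit removes the journal file again.
            conn.commit()
            after = os.path.exists(journal_path)
            return during, after
        finally:
            conn.close()


if __name__ == "__main__":
    print(journal_lifecycle())
```

Each create and delete of the journal file is a filesystem metadata change, which is what makes this mode interact with the filesystem's own journaling.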



Journaling in EXT4 (ordered mode)

write(fd, )

[Diagram labels: metadata, data, journal]


SQLite and EXT4

[Timeline across the SQLite and EXT4 layers, with two fsync() calls]

Write SQLite journal to storage.
Write EXT4 journal (descriptor, metadata) to storage.
Write EXT4 journal (commit) to storage.
Write SQLite DB to storage.




Summary

[Timeline of one insert() across the SQLite and EXT4 layers, with two fsync() calls]

9 random writes to eMMC!!!




Journaling of Journal

SQLite maintains DB journal.
+
EXT4 maintains filesystem journal.
=
EXT4 journals SQLite journal file.

EXT4 journals SQLite journaling activity.
70% of the writes are purely for managerial purposes!


Optimize Android I/O stack!

SQLite
EXT4
Block Device Driver


SQLite Journaling mode

SQLite: DELETE, TRUNCATE, PERSIST, WAL
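As a minimal sketch of how an application could select one of these modes, the snippet below uses Python's standard sqlite3 module and SQLite's `PRAGMA journal_mode`. The scratch database path and the helper names `set_journal_mode` and `try_all_modes` are hypothetical, and the mode SQLite reports back may differ from the one requested on some storage setups.

```python
import os
import sqlite3
import tempfile


def set_journal_mode(db_path, mode):
    """Request a journaling mode and return the mode SQLite reports as active."""
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA journal_mode returns a single row holding the mode in effect.
        (active,) = conn.execute(f"PRAGMA journal_mode={mode}").fetchone()
        return active.upper()
    finally:
        conn.close()


def try_all_modes():
    """Switch a scratch database through the four modes named on the slide."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")  # hypothetical scratch database
        for mode in ("DELETE", "TRUNCATE", "PERSIST", "WAL"):
            results[mode] = set_journal_mode(db_path, mode)
    return results


if __name__ == "__main__":
    print(try_all_modes())
```

Comparing the requested and reported mode is a cheap way to notice when a mode such as WAL could not be enabled.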



Eliminating unnecessary metadata flushes

fsync() vs. fdatasync()

Layers involved: SQLite, EXT4


Alternative Filesystems

EXT4, XFS, NILFS2, F2FS, BTRFS


Interrupt vs. Polling

Block Device Driver: interrupt vs. polling


External Journaling

EXT4

vs.



SQLite Journaling Modes

Delete (GS3, ICS)
[Figure: I/O for one insert, steps numbered 1 to 8; legend: SQLite Journaling, SQLite DB ops]
2 fsync() and 9 writes for one insert()!

Truncate (GS3, ICS)
[Figure: I/O for one insert, steps numbered 1 to 6; legend: SQLite Journaling, SQLite DB ops; label: truncate(.db-journal)]
2 fsync() and 8 writes.

Persist (GS3, ICS)
[Figure: I/O for one insert, steps numbered 1 to 9; legend: SQLite Journaling, SQLite DB ops]
3 fsync() and 12 writes. The worst mode!

WAL Mode (GS3, ICS)
[Figure: I/O for one insert, steps numbered 1 to 3; legend: SQLite Journaling, SQLite DB ops]
Only 1 fsync() and 3 writes. The best mode!

SQLite Journaling Mode Summary

SQLite Journaling Mode          DELETE    TRUNCATE    PERSIST    WAL
Number of fsync() calls         2         2           3          1
Number of IOs                   9         8           12         3
EXT4 Journal size (metadata)    24 KB     16 KB       8 KB       16 KB
Total IO Volume                 72 KB     64 KB       72 KB      36 KB
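To make the comparison concrete, the figures in the summary can be turned into relative reductions. This is a small sketch over the numbers above; the dictionary layout and the helper `reduction_vs` are illustrative, and the choice of baseline mode is left as a parameter.

```python
# Figures copied from the journaling mode summary (per single insert).
SUMMARY = {
    "DELETE":   {"fsync_calls": 2, "ios": 9,  "total_kb": 72},
    "TRUNCATE": {"fsync_calls": 2, "ios": 8,  "total_kb": 64},
    "PERSIST":  {"fsync_calls": 3, "ios": 12, "total_kb": 72},
    "WAL":      {"fsync_calls": 1, "ios": 3,  "total_kb": 36},
}


def reduction_vs(baseline, mode, metric):
    """Percent reduction of `metric` for `mode` relative to `baseline`.

    A negative result means `mode` is worse than the baseline.
    """
    base = SUMMARY[baseline][metric]
    return 100.0 * (base - SUMMARY[mode][metric]) / base


if __name__ == "__main__":
    for metric in ("fsync_calls", "ios", "total_kb"):
        for baseline in ("DELETE", "TRUNCATE"):
            pct = reduction_vs(baseline, "WAL", metric)
            print(f"WAL vs {baseline}, {metric}: {pct:.1f}% lower")
```

Relative to DELETE, for example, WAL halves both the fsync() calls and the total I/O volume and cuts the number of I/Os by two thirds.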



Filesystems



[Timeline of one insert() across the SQLite and EXT4 layers, with two fsync() calls]

write() followed by fsync() is the essence of the Android I/O.
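The write() followed by fsync() pattern can be sketched in a few lines. The example below is an illustration for a POSIX system, not the benchmark used in the talk: the helper `random_sync_writes`, the file size, the write count, and the seed are all hypothetical choices.

```python
import os
import random
import tempfile

BLOCK = 4096  # 4 KB, the I/O size the talk reports as dominant


def random_sync_writes(path, file_blocks=64, count=16, seed=0):
    """Issue `count` 4 KB writes at random block offsets, each followed by fsync()."""
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, file_blocks * BLOCK)  # set the file size up front
        for _ in range(count):
            offset = rng.randrange(file_blocks) * BLOCK
            os.pwrite(fd, os.urandom(BLOCK), offset)
            os.fsync(fd)  # block until data and metadata are reported flushed
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(random_sync_writes(os.path.join(tmp, "scratch.bin")))
```

Because every write waits for its own fsync(), throughput is bounded by how quickly the filesystem and the device can complete each small synchronous flush.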



EXT4

4 KB write() followed by fsync()

BTRFS

NILFS2

XFS

F2FS (Flash Friendly Filesystem)

Summary

write() followed by fsync(): BTRFS, NILFS2, XFS, F2FS


fsync() vs. fdatasync()


Eliminating Unnecessary Metadata Flushes

[Diagram: which dirty items of a file (size, data, atime, mtime) move from the page cache to disk under fsync(fd0) and fsync(fd1), compared with fdatasync(fd0) and fdatasync(fd1)]
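A minimal sketch of the substitution, assuming a POSIX system: the helper `overwrite_and_flush` overwrites an existing file in place and flushes it with fdatasync() where Python exposes it, falling back to fsync() otherwise. The helper names and the 4 KB file are hypothetical. Note that fdatasync() still flushes metadata that is needed to read the data back, such as a changed file size, so the saving applies mainly to in-place overwrites.

```python
import os
import tempfile


def overwrite_and_flush(path, payload, data_only=True):
    """Overwrite the start of an existing file and flush it; return the call used."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, payload)
        if data_only and hasattr(os, "fdatasync"):
            os.fdatasync(fd)  # data only; timestamps need not reach storage
            return "fdatasync"
        os.fsync(fd)          # data plus all file metadata
        return "fsync"
    finally:
        os.close(fd)


def demo():
    """Create a 4 KB file, overwrite it in place, and report the call and final size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.bin")  # hypothetical scratch file
        with open(path, "wb") as f:
            f.write(b"\0" * 4096)
        used = overwrite_and_flush(path, b"x" * 4096)
        return used, os.path.getsize(path)


if __name__ == "__main__":
    print(demo())
```

The file size is unchanged by the overwrite, which is the case where skipping the metadata flush is possible.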



External Journaling



4K random write() followed by fsync()

[Figure labels: Random, Sequential, Random!]


External journaling

Journal on separate partition
FTL can exploit the locality of I/O!

[Figure label: sequential]


Interrupt driven I/O vs. Polling based I/O



Hardware trend

Multi-core on smartphones: the number of CPU cores ↑
[Figure: cores (Dual, Quad, Octa) by year, 2009 to 2015]

Performance of mobile flash storage: I/O latency of eMMC ↓

Interrupt driven I/O
mmcqd: Send I/O request → Sleep() → IRQ handler → Complete I/O request
(context switches occur on this path)

Polling based I/O
mmcqd: Send I/O request → Busy wait → Complete I/O request

Polling can reduce context switching overhead!
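The difference between the two completion paths can be illustrated with a user-space analogy. This is only a toy model with Python threads, not the mmcqd kernel thread or an eMMC driver: `fake_device`, the 10 ms latency, and both wait helpers are hypothetical stand-ins.

```python
import threading
import time


def fake_device(done, latency_s):
    """Stand-in for the storage device: signals completion after latency_s."""
    time.sleep(latency_s)
    done.set()


def wait_interrupt_style(latency_s=0.01):
    """Submit a request, then sleep until the completion signal wakes the waiter."""
    done = threading.Event()
    worker = threading.Thread(target=fake_device, args=(done, latency_s))
    worker.start()
    done.wait()  # the waiter is descheduled here and woken on completion
    worker.join()
    return done.is_set()


def wait_polling_style(latency_s=0.01):
    """Submit a request, then spin on the completion flag without sleeping."""
    done = threading.Event()
    worker = threading.Thread(target=fake_device, args=(done, latency_s))
    worker.start()
    spins = 0
    while not done.is_set():  # busy wait: uses CPU but skips the sleep/wake path
        spins += 1
    worker.join()
    return done.is_set(), spins


if __name__ == "__main__":
    print(wait_interrupt_style())
    print(wait_polling_style()[0])
```

The trade-off is the one the slide describes: the sleeping waiter pays for being switched out and back in, while the spinning waiter avoids that cost at the price of CPU time.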


Experiment



Implementation

Galaxy S3 (ICS 4.0.4, Linux 3.0.15)

Component           Specification
CPU                 Exynos 4412 1.4 GHz Quad-core
RAM                 2 GB
Internal Storage    32 GB eMMC
External Storage    16 GB Transcend u-SD Card


SQLite performance: journaling modes

SQLite Insert

TRUNCATE (default) → WAL: 116% up
TRUNCATE, EXT4 (default) → WAL, F2FS: 281% up

SQLite performance: journaling modes

SQLite Update

TRUNCATE (default) → WAL: 232% up
TRUNCATE, EXT4 (default) → WAL, F2FS: 348% up



fsync() vs. fdatasync()

SQLite Insert

fsync() → fdatasync(): 17% up
fsync() → fdatasync() and F2FS: 126% up

fsync() vs. fdatasync()

SQLite Update

fsync() → fdatasync(): 53% up
fsync() → fdatasync() and F2FS: 250% up

External journaling

[Figure: SQLite Insert and SQLite Update; labels: 30%, 37%, 39%, 20%]


Polling

4 KB random write+fsync()

# of threads    Metric     Idle (base)    Idle (poll)    HD Record (base)    HD Record (poll)
1               KIOPS      1002           981            667                 756
1               CPU (%)    7.5            10.9           26.4                30.2
10              KIOPS      2609           2705           2136                2351
10              CPU (%)    11.1           12.9           30.1                33.1

Marginal gain (1~2%) when the CPU is idle.
13% gain when we record HD video in the background.


Real Workload

Replay Twitter and Facebook by Mobigen

[Figure labels: Twitter, Facebook; -71%, -58%]


Combining All the Improvements

SQLite Insert

Legend: B: Base, P: Polling, E: External Journaling, F: fdatasync(), W: WAL mode

fdatasync(), Ext. J, Polling, WAL: 134% up

[Figure labels: 300%, 134%]

Finally,

SQLite, EXT4, Block Device Driver
Polling, Ext. J, fdatasync, F2FS, WAL

39 → 157 ins/sec: 300% up!!
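The headline figure follows directly from the two throughput numbers. A short worked check, with the helper name `percent_gain` chosen for illustration:

```python
def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


if __name__ == "__main__":
    base, optimized = 39, 157  # inserts per second, from the slide
    print(f"{percent_gain(base, optimized):.1f}% up")
    print(f"{optimized / base:.2f}x the baseline throughput")
```

Going from 39 to 157 inserts per second is a gain of roughly 300%, or about four times the baseline throughput.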



Conclusion

The Android I/O stack is a collection of unorchestrated layers.

Journaling of Journal (JOJ) lies at the core of the problem.

We optimize the Android I/O stack with WAL mode in SQLite, F2FS, fdatasync(), external journaling, and polling-based I/O.

What we achieved is…

With legacy EXT4, SQLite performance improves by 134%.

With F2FS, SQLite performance improves by 300%,

solely via software modification on an existing smartphone!
Thank you…

77smart@hanyang.ac.kr
