Latency in Cloud Computing and Recent Research on IDC in Xen


Latency in Cloud Computing and
Recent Research on IDC in Xen
Sisu Xi
Network Seminar on 03/04/2013
Latency Matters to Services
Amazon: revenue drops by 1% of sales for every additional 100 ms of latency
http://highscalability.com/blog/2009/7/25/latency-is-everywhere-and-it-costs-you-sales-how-to-crush-it.html
Google: slowing the search results page down by 100 ms to 400 ms has a measurable impact on the number of searches per user, of -0.2% to -0.6%
http://googleresearch.blogspot.com/2009/06/speed-matters.html
Firefox: a 2.2-second faster page response increased Firefox installer downloads by 15.4% (about 10.28 million additional downloads per year)
http://blog.mozilla.org/metrics/2010/04/05/firefox-page-load-speed---part-ii/
2
Into the Virtualized World
Infrastructure as a Service (IaaS)
Amazon EC2: Amazon Elastic Compute Cloud
http://aws.amazon.com/media-sharing/
Microsoft Azure: use your own OS, language, database, and tools
http://www.windowsazure.com/en-us/
Google Compute Engine: run your large-scale computing workloads
https://cloud.google.com/products/compute-engine
Question:
Can these services guarantee network latency to the end user?
3
Outline
Current services in cloud computing

Microsoft Azure

Google Compute Engine

Amazon EC2
Networking in Xen

Para‐virtualization network architecture
IDC in Xen

Why is it important?

Three Shared‐Memory Approaches

Our Approach: RTCA
Summary
4
Microsoft Azure
5
Windows Azure Pricing
http://www.windowsazure.com/en-us/pricing/calculator/?scenario=virtual-machines
Google Compute Engine
6
Google Compute Engine Pricing
https://cloud.google.com/pricing/compute-engine
Amazon EC2
7
Amazon EC2 Instance Types
http://aws.amazon.com/ec2/instance-types/
Current Services in Cloud Computing
CPU and memory resources can be dedicated, which provides the highest level of isolation
Network resources are usually shared
no mechanism for rate control, let alone priority
only a coarse-grained indicator (low/medium/large)
on Amazon, you can pay more for dedicated network resources
Recall:
Can these services guarantee network latency to the end user?
8
Amazon EC2 in Action
9
The Impact of Virtualization on Network Performance of Amazon EC2 Data Center
Guohui Wang, T. S. Eugene Ng, INFOCOM, 2010
Enabling Technologies -- Xen
10
Scheduling I/O in Virtual Machine Monitors
Diego Ongaro, Alan L. Cox, Scott Rixner, VEE, 2008
Outline
Current services in cloud computing

Microsoft Azure

Google Compute Engine

Amazon EC2
Networking in Xen

Para‐virtualization network architecture
IDC in Xen

Why is it important?

Three Shared‐Memory Approaches

Our Approach: RTCA
Summary
11
Xen Overview
12
[Figure: the Xen para-virtualized architecture. Domain 1 and Domain 2 each have a VCPU, a netfront driver, and applications A and B; Domain 0 has a VCPU, the netback driver, softnet_data, and the NIC driver; the VMM scheduler multiplexes all VCPUs, and packets ultimately leave through the NIC.]
Outline
Current services in cloud computing

Microsoft Azure

Google Compute Engine

Amazon EC2
Networking in Xen

Para‐virtualization network architecture
IDC in Xen

Why is IDC important?

Three Shared‐Memory Approaches

Our Approach: RTCA
Summary
13
IDC in Xen
14
[Figure: inter-domain communication in Xen. Application A in Domain 1 talks to application B in Domain 2; packets go from Domain 1's netfront through Domain 0's netback and softnet_data to Domain 2's netfront, with all VCPUs multiplexed by the VMM scheduler and without touching the NIC.]
IDC: Inter-Domain Communication
Why is IDC Important
15
Hardware

Now: Intel Xeon E7-8870, 10 cores, 20 threads
http://ark.intel.com/products/53580/Intel-Xeon-Processor-E7-8870-30M-Cache-2_40-GHz-6_40-GTs-Intel-QPI
Future: Intel's 80-core CPU running at 5.7 GHz
http://news.softpedia.com/news/Intel-039-s-80-Core-CPU-Running-at-5-7-GHz-46881.shtml
System Administrator

“By optimizing the placement of VMs on host machines, traffic 
patterns among VMs can be better aligned with the 
communication distance between them, e.g. VMs with large 
mutual bandwidth usage are assigned to host machines in close 
proximity.”
Improving the Data Center Networks with Traffic-aware Virtual Machine Placement
Xiaoqiao Meng et al., INFOCOM, 2010
Why is IDC Important
16
Embedded Systems

Integrated Modular Avionics

ARINC 653 Standard

Honeywell claims that IMA design can save 350 pounds of weight on 
a narrow‐body jet: equivalent to two adults

http://www.artist-embedded.org/docs/Events/2007/IMA/Slides/ARTIST2_IMA_WindRiver_Wilson.pdf
Can IDC provide guaranteed network latency to the end user?
Full Virtualization based ARINC 653 partition
Sanghyun Han, Digital Avionics Systems Conference (DASC), 2011
ARINC 653 Hypervisor
VanderLeest S.H., Digital Avionics Systems Conference (DASC), 2010
Xen Virtual Network
17
[Figure: the standard Xen virtual network path. An unmodified app in Domain-U opens ordinary sockets, e.g. socket(AF_INET, SOCK_DGRAM, 0) or socket(AF_INET, SOCK_STREAM, 0), and calls sendto(...) / recvfrom(...). Packets traverse the guest's INET/UDP/TCP/IP stack and netfront driver, cross the VMM, and pass through the netback driver and the INET/UDP/TCP/IP stack in Domain-0.]
Pros: Transparent, Isolation, General, Migration
Cons: Performance, Data Integrity, Multicast
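As a concrete illustration of the transparency of this path, here is a minimal sketch of an ordinary POSIX UDP sender. Nothing in it is Xen-specific, and the destination address 10.0.0.2 and port 5001 are made-up example values rather than anything from the talk.

/* Minimal sketch of an unmodified UDP sender running in a guest (Domain-U).
 * Nothing here is Xen-specific; in a guest, sendto() simply descends through
 * the guest's UDP/IP stack and netfront, then Domain 0's netback/softnet_data. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);        /* ordinary UDP socket */
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(5001);                      /* example port */
    inet_pton(AF_INET, "10.0.0.2", &dst.sin_addr);   /* example peer, e.g. another domain */

    const char msg[] = "hello";
    if (sendto(fd, msg, sizeof(msg), 0,
               (struct sockaddr *)&dst, sizeof(dst)) < 0)
        perror("sendto");

    close(fd);
    return 0;
}

The same binary runs unchanged on bare metal, which is exactly what the shared-memory approaches on the next slides give up to varying degrees.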
XenSocket
18
[Figure: XenSocket architecture. The app in Domain-U keeps the standard INET/UDP/TCP/IP stack and netfront driver, but a new AF_Xen socket family is added beside them for inter-domain transfers over shared memory.]
Pros: Performance
Cons: not transparent, one-way communication, must patch the guest OS
XenSocket: A High-Throughput Interdomain Transport for Virtual Machines
Xiaolan Zhang et al., IBM, Middleware, 2007
XWAY
19
[Figure: XWAY architecture. Inside the Domain-U kernel an XWAY switch sits below TCP, choosing between the normal IP/netfront path and the XWAY protocol and XWAY driver; UDP still goes through the normal path.]
Pros: Performance, Dynamic Create/Destroy, Live Migration
Cons: patch the guest OS, migration (channel teardown overhead), no UDP, complicated
XWAY: Inter-domain Socket Communications Supporting High Performance and Full Binary Compatibility on Xen
Kangho Kim et al., VEE, 2008
XenLoop
20
[Figure: XenLoop architecture. The app in Domain-U still uses ordinary calls such as socket(AF_INET, SOCK_DGRAM, 0), socket(AF_INET, SOCK_STREAM, 0), sendto(...) and recvfrom(...); below the guest's INET/UDP/TCP/IP stack a XenLoop module sits alongside the netfront driver and short-circuits traffic between co-resident domains.]
Pros: Transparent, Performance, Migration
Cons: kernel module in the guest OS, Domain 0 co-operation, migration (channel teardown overhead)
XenLoop: A Transparent High Performance Inter-VM Network Loopback
Jian Wang et al., HPDC, 2008
Summary for Shared Memory in IDC
All require modification to the guest OS
XenSocket requires recompiling the guest kernel and modifying the application
XWAY requires recompiling the guest kernel
XenLoop requires loading a kernel module in the guest and co-operation with Domain 0
Issues with migration
XenSocket does not support migration
XWAY and XenLoop require dynamically tearing down the channels between two domains, which incurs extra overhead
21
Recall: Xen Network Architecture
22
[Figure: the Xen network architecture again. Application A in Domain 1 and application B in Domain 2 communicate through their netfront drivers and Domain 0's netback and softnet_data, with the VMM scheduler multiplexing all VCPUs.]
VMM Scheduler: Evaluation
23
VMM Scheduler: RT-Xen vs. Credit
[Setup: six cores C0-C5; Dom 0 runs Linux 3.4.2 and gets 100% of a CPU; a packet is sent every 10 ms; 5,000 data points per experiment.]
When Domain 0 is not busy, the VMM scheduler dominates the IDC performance for higher-priority domains.
VMM Scheduler: Enough???
24
[Setup: the same six cores C0-C5 with Dom 0 at 100% CPU; the question is whether the VMM scheduler alone is enough.]
Domain 0: Background
25
[Figure: inside the original Domain 0. Each guest domain (Domain 1, Domain 2, ..., Domain m, Domain n) has a netfront connected to a netif (TX/RX) in Domain 0, and a single netback thread, netback[0] { rx_action(); tx_action(); }, serves all of them in front of softnet_data.]
Packets are fetched from the netifs in round-robin order
All domains share one queue in softnet_data
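To make the round-robin behavior concrete, here is a small user-space model (not the actual netback code; the domain count and per-domain backlog are invented for illustration). It shows how packets from every domain are interleaved into the single shared queue, so a high-priority domain's packet can wait behind bulk traffic from every other domain.

/* Toy model of the stock Domain 0 policy: one netback thread visits every
 * domain's netif in round-robin order and pushes everything into a single
 * shared softnet_data queue. */
#include <stdio.h>

#define NDOM    4     /* number of guest domains (illustrative) */
#define BACKLOG 3     /* packets pending per domain (illustrative) */

int main(void)
{
    int pending[NDOM] = { BACKLOG, BACKLOG, BACKLOG, BACKLOG };
    int slot = 0;     /* position in the single shared queue */
    int remaining = NDOM * BACKLOG;

    while (remaining > 0) {
        for (int dom = 0; dom < NDOM; dom++) {   /* round-robin over netifs */
            if (pending[dom] > 0) {
                pending[dom]--;
                remaining--;
                printf("shared-queue slot %2d <- packet from domain %d\n",
                       slot++, dom);
            }
        }
    }
    return 0;
}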
Domain 0: RTCA
26
Packets are fetched by priority, up to a batch size
[Figure: RTCA inside Domain 0. Each guest domain's netfront still connects to a netif (TX/RX) served by the single netback thread (rx_action() / tx_action()), but the netifs are now served in priority order.]
Queues are separated by priority in softnet_data
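A corresponding user-space model of the RTCA policy (again not the kernel code; the queue contents and batch size are illustrative): the highest non-empty priority queue is always served next, and at most BATCH packets are moved before priorities are re-checked, so with a batch size of 1 a newly arrived high-priority packet waits behind at most one low-priority packet.

/* Toy model of the RTCA fetch policy: per-priority queues in softnet_data,
 * served highest priority first, up to BATCH packets at a time. */
#include <stdio.h>

#define NPRIO 3
#define BATCH 1                         /* batch size 1 performed best in the talk */

int main(void)
{
    int pending[NPRIO] = { 2, 0, 6 };   /* packets queued at priority 0 (high) .. 2 (low) */
    int step = 0;

    for (;;) {
        int p = 0;
        while (p < NPRIO && pending[p] == 0)
            p++;                        /* find the highest non-empty priority */
        if (p == NPRIO)
            break;                      /* nothing left to fetch */

        int burst = pending[p] < BATCH ? pending[p] : BATCH;
        pending[p] -= burst;            /* fetch up to BATCH packets, then re-check */
        printf("step %2d: fetched %d packet(s) from priority-%d queue\n",
               ++step, burst, p);
    }
    return 0;
}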
RTCA: Evaluation Setup
27
[Setup: six cores C0-C5; Dom 0 gets 100% of a CPU; the original Domain 0 is compared against RTCA under four levels of interference (Base, Light, Medium, Heavy); a packet is sent every 10 ms and 5,000 data points are collected.]
RTCA: Latency
28
When there is no interference, IDC performance is comparable across the configurations
The original Domain 0 performs poorly under all levels of interference
• due to priority inversion within Domain 0
RTCA with batch size 1 performs best
• it eliminates most of the priority inversions
RTCA with larger batch sizes performs worse under IDC interference
By reducing priority inversion in Domain 0, RTCA can effectively mitigate the impact of low-priority traffic on the latency of high-priority IDC
IDC latency between Domain 1 and Domain 2 in the presence of low-priority IDC (us)
RTCA: Throughput
29
[Figure: iPerf throughput between Dom 1 and Dom 2 (Gbits/s, 0 to 12) under Base, Light, Medium, and Heavy interference, comparing RTCA with batch sizes 1, 64, and 238 against the original Domain 0.]
A small batch size leads to a significant reduction in high-priority IDC latency and improved IDC throughput under interfering traffic.
Summary
Current services in cloud computing

Microsoft Azure

Google Compute Engine

Amazon EC2
Networking in Xen

Para-virtualization network architecture
IDC in Xen

Why is it important?

Three Shared‐Memory Approaches

Our Approach: RTCA
Summary
30
Backup Slides
31
Motivation
32
Multiple computing elements
Cost!   Weight!   Power!
Communicate via dedicated networks or real-time networks
Use fewer computing platforms to integrate independently developed systems
Physically Isolated Hosts -> Common Computing Platforms
Network Communication -> Local Inter-Domain Communication
Preserve Real-Time Properties with Virtualization???
System Model
We focus on:

Xen as the underlying virtualization software

Single core for each virtual machine on a multi‐core platform

Local Inter‐Domain Communication (IDC)

No modification to the guest domain besides the Xen patch
Future work:

Multi‐core for each virtual machine (domain)

Integrating with the Network Interface Card (NIC)
33
Background – Xen Overview
34
[Figure: the Xen overview diagram again: Domain 1 and Domain 2 with VCPUs, netfront drivers, and applications A and B; Domain 0 with a VCPU, the NIC driver, softnet_data, and netback; the VMM scheduler and the NIC underneath.]
Part I – VMM Scheduler: Limitations
Default Credit scheduler:
schedules VCPUs in round-robin order
RT-Xen scheduler:
schedules VCPUs by priority
However:
if the execution time is < 0.5 ms, the VCPU budget is not consumed
Solution:
dual quanta: milliseconds for scheduling, microseconds for time accounting (see the sketch after the references below)
35
"Realizing Compositional Scheduling through Virtualization",
Real-Time and Embedded Technology and Applications Symposium (RTAS), 2012
"RT-Xen: Towards Real-Time Hypervisor Scheduling in Xen",
ACM International Conference on Embedded Software (EMSOFT), 2011
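A toy illustration of the dual-quanta idea referenced above (this is not RT-Xen code; the budget value and run lengths are invented): scheduling decisions still happen on a millisecond quantum, but each VCPU is charged in microseconds for the time it actually ran, so executions shorter than the old accounting granularity are no longer free.

/* Budget accounting in microseconds, independent of the ms scheduling quantum. */
#include <stdio.h>

struct vcpu { const char *name; long budget_us; };

/* Charge a VCPU for an actual run of ran_us microseconds. */
static void account(struct vcpu *v, long ran_us)
{
    v->budget_us -= ran_us;
    printf("%s ran %4ld us, budget left %5ld us\n", v->name, ran_us, v->budget_us);
}

int main(void)
{
    struct vcpu v = { "dom1.vcpu0", 4000 };   /* 4 ms of budget per period (example) */

    account(&v, 300);    /* with ms-granularity accounting these short runs    */
    account(&v, 300);    /* would cost nothing; with us-granularity accounting */
    account(&v, 300);    /* they consume 900 us of the budget                  */
    account(&v, 1000);   /* one full 1 ms scheduling quantum                   */
    return 0;
}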
Conclusion
36
[Figure: the Xen stack: hardware, VMM scheduler, Domain 0 with netback and softnet_data, and guest Domains 1 and 2 with VCPUs and netfront drivers.]
The VMM scheduler alone cannot guarantee real-time IDC
RTCA: Real-Time Communication Architecture
RTCA plus a real-time VMM scheduler reduces high-priority IDC latency from milliseconds to microseconds in the presence of low-priority IDC
https://sites.google.com/site/realtimexen/
End‐to‐End Task Performance
37
[Setup: six cores C0-C5; Credit vs. RT-Xen as the VMM scheduler and the original Domain 0 vs. RTCA; Dom 0 gets 100% of a CPU; Dom 1 and Dom 2 get 60% CPU each and run the task pairs T1(10, 2) & T2(20, 2), T1(10, 2) & T3(20, 2), and T1(10, 2) & T4(30, 2); Dom 3 to Dom 10 get 10% CPU each and form 4 pairs bouncing packets to generate Base, Light, Medium, and Heavy interference.]
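As a rough sanity check on the task set above, assuming the usual reading of Ti(p, e) as period p ms and execution time e ms (the slide does not state the units), each task pair uses well under the 60% CPU share given to Dom 1 and Dom 2:

U_{T1,T2} = \frac{2}{10} + \frac{2}{20} = 0.30, \qquad
U_{T1,T3} = \frac{2}{10} + \frac{2}{20} = 0.30, \qquad
U_{T1,T4} = \frac{2}{10} + \frac{2}{30} \approx 0.27

So the domains are not overloaded by computation alone; any missed deadlines would have to come from scheduling and communication interference.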
End‐to‐End Task Performance
38
By combining the RT-Xen VMM scheduler and the RTCA Domain 0 kernel, we can deliver end-to-end real-time performance to tasks involving both computation and communication.
Backup – Baseline
39
[Figure: baseline latency in microseconds (0 to 70) as packet size varies (0 to 1500 bytes), comparing IDC against a local thread.]
[Figure: a Domain 0 design with multiple netback kernel threads. Each guest domain's netfront connects through netifs to a TX/RX netback instance; the netback work is handled by multiple kthreads scheduled by priority (priority kthreads, highest priority served first), in front of softnet_data and the NIC driver.]