X-Trace: A Network Tracing Framework - HPTS

UC Berkeley

Scalable Structured
Data Storage for Web 2.0

Michael Armbrust, David Patterson

October, 2007

RAD Lab 5-year Mission


Today’s Internet systems are complex, fragile, manually
managed, and rapidly evolving


To scale eBay, must build an eBay-sized company


“Moon shot” mission statement:

Enable a single person to Develop, Assess, Deploy, and
Operate the next-generation IT service


“The Fortune 1 Million” by enabling rapid innovation


Create core technology to enable vision via synergy
across systems, networking, and Statistical Machine
Learning


Making datacenter easier to manage enables vision of
single person to analyze, deploy and operate a
scalable IT service


If the datacenter is the computer…


What is the programming language?


What are the libraries?


How do we trace/monitor programs?


What is the simulator?


What is Computer Aided Design?


What is the Operating System?


What is the Database System?

Storage Status Quo


Current status of data storage for Web 2.0
apps


Large relational databases running on
expensive hardware


Manual horizontal and vertical partitioning of
data


Problem: Requires redesign at each
scaling milestone


Goal: Scalable structured data storage
for Web 2.0
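
To make the redesign problem concrete, here is a minimal sketch that assumes a hand-rolled partitioning scheme placing each key on shard hash(key) mod N. The slides do not say which scheme existing sites use; the `shard_for` and `moved_fraction` helpers and the `user:` key format are hypothetical. The sketch measures how many keys would have to be migrated when one server is added.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash and modulo placement."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys
        if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)

# Hypothetical workload: 10,000 user keys, growing from 4 to 5 shards.
keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_shards=4, new_shards=5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 shards")
```

Under this assumed scheme most keys change shards whenever the server count changes, which is one way a manually partitioned design ends up being reworked at each scaling milestone.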

Web 2.0 App Characteristics


Need to scale to YouTube or MySpace
sizes


Require geographic replication


Short transactions


No ad-hoc queries


Willing to relax consistency in exchange for
scalability and availability


Photos, not financials

Relaxed Consistency


Some things can be updated lazily


Eventual consistency is often acceptable


However, users should see their own
writes immediately


Need to provide simple choices to
developers
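
One way to picture the combination of lazy updates and "users see their own writes" is the toy store below. It is a sketch under stated assumptions, not the project's design: the `LazyReplicatedStore` class, its session bookkeeping, and the explicit `sync()` step are all hypothetical stand-ins for whatever replication mechanism the real system would use.

```python
class LazyReplicatedStore:
    """Toy store: writes reach the replica only when sync() runs."""

    def __init__(self):
        self.replica = {}         # what other users read from
        self.pending = []         # writes not yet applied to the replica
        self.session_writes = {}  # session id -> {key: value} not yet replicated

    def write(self, session, key, value):
        # The write is queued for lazy replication and remembered per session.
        self.pending.append((key, value))
        self.session_writes.setdefault(session, {})[key] = value

    def read(self, session, key):
        # Read-your-writes: a session sees its own unreplicated writes first.
        own = self.session_writes.get(session, {})
        if key in own:
            return own[key]
        # Everyone else sees the (possibly stale) replica.
        return self.replica.get(key)

    def sync(self):
        # Lazy propagation: apply queued writes in order, then forget them.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()
        self.session_writes.clear()


store = LazyReplicatedStore()
store.write("alice", "photo:1:caption", "Sunset")
own_view = store.read("alice", "photo:1:caption")    # alice sees her write
other_view = store.read("bob", "photo:1:caption")    # bob does not, yet
store.sync()
later_view = store.read("bob", "photo:1:caption")    # bob sees it after sync
```

The developer-facing choice here is small: each read is either "my own latest write" or "whatever has propagated so far", which matches the photos-not-financials workload described earlier.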

Our Idea


Large scale distributed database underneath


Runs on 1000+ shared-nothing commodity
servers


ActiveRecord-like layer in Ruby on Rails vs. SQL


Provides simple relationships and consistency
guarantees between models


has_many


belongs_to


searchable_by (for full-text search)


Pre-compute joins for quick reads
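
The declarations above could drive join pre-computation roughly as follows. This is an illustrative sketch in Python rather than the Ruby layer the slide describes, and every name in it (`Store`, `Model`, `children_of`, `search`, the `User` and `Photo` models) is hypothetical. It assumes that a belongs_to declaration makes each write also update a per-parent list, so the has_many read is a single lookup, and that searchable_by maintains a simple word index.

```python
from collections import defaultdict

class Store:
    """In-memory stand-in for the distributed database (hypothetical)."""
    def __init__(self):
        self.rows = {}                      # (model, id) -> attribute dict
        self.joins = defaultdict(list)      # (parent, parent id, child) -> child ids
        self.text_index = defaultdict(set)  # (model, word) -> ids

class Model:
    belongs_to = {}      # foreign-key field -> parent model name
    searchable_by = ()   # fields to index for full-text search

    @classmethod
    def create(cls, store, row_id, **attrs):
        store.rows[(cls.__name__, row_id)] = attrs
        # Pre-compute the join at write time: add this row to its parent's list.
        for field, parent in cls.belongs_to.items():
            store.joins[(parent, attrs[field], cls.__name__)].append(row_id)
        # Maintain a word -> ids index for each searchable field.
        for field in cls.searchable_by:
            for word in attrs[field].lower().split():
                store.text_index[(cls.__name__, word)].add(row_id)

    @classmethod
    def children_of(cls, store, parent, parent_id):
        # has_many read: one lookup of the pre-computed list, no scan.
        ids = store.joins[(parent, parent_id, cls.__name__)]
        return [store.rows[(cls.__name__, i)] for i in ids]

    @classmethod
    def search(cls, store, word):
        return sorted(store.text_index[(cls.__name__, word.lower())])

class User(Model):
    pass

class Photo(Model):
    belongs_to = {"user_id": "User"}
    searchable_by = ("caption",)


store = Store()
User.create(store, 1, name="Ada")
Photo.create(store, 10, user_id=1, caption="Sunset over Berkeley")
Photo.create(store, 11, user_id=1, caption="Campus at night")
captions = [p["caption"] for p in Photo.children_of(store, "User", 1)]
hits = Photo.search(store, "sunset")
```

The trade-off in this sketch is extra work on every write in exchange for reads that touch one key, which fits an application with short transactions and no ad-hoc queries.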

Related Work (we know of)


G. DeCandia, D. Hastorun, et al. Dynamo: Amazon’s highly available
key-value store. In SOSP, 2007.


M. Stonebraker and U. Cetintemel. One size fits all: an idea whose time
has come and gone. pp. 2-11, 2005.


M. Stonebraker, S. R. Madden, et al. The end of an architectural era
(it’s time for a complete rewrite). In VLDB, Vienna, Austria, 2007.


D. J. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach. Scalable
semantic web data management using vertical partitioning. In VLDB,
Vienna, Austria, 2007.


F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M.
Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A
distributed storage system for structured data. In OSDI ’06: Seventh
Symposium on Operating System Design and Implementation,
November 2006.