Accelerating Data Movement To Support Lattice QCD Computations

PERFORMANCE
Below are results from one Lattice researcher’s use of Globus Online over the past few months:
0RYHG*%ÀOHVLQPLQXWHV

²,PSURYHGWUDQVIHUUDWHIURP0EVHFWR0EVHF

²6DPHWUDQVIHUZRXOGKDYHWDNHQRYHUGD\VZLWKVFS
0RYHG*%LQPLQXWHV
0RYHGa7%LQKRXUV
0RYHG*%LQPLQXWHV
7RGHPRQVWUDWHKRZZHOO*OREXV2QOLQH·VSHUIRUPDQFHRSWLPL]DWLRQORJLFSHUIRUPVLQSUDFWLFDO
VLWXDWLRQVZHFRPSDUHG*OREXV2QOLQHSHUIRUPDQFHZLWKWKDWDFKLHYHGZKHQXVLQJVFSDQG
WKHJOREXVXUOFRS\*8&FOLHQW)LJXUHVKRZVWKHSHUIRUPDQFHRIYDULRXVWUDQVIHUPHFKDQLVPV
IRU WUDQVIHUV RYHU D KLJKVSHHG ZLGH DUHD QHWZRUN (61HW EHWZHHQ WZR KLJKSHUIRUPDQFH
SDUDOOHOVWRUDJHV\VWHPVDW$/&)DQG1(56&,WKDVWREHQRWHGWKDWWKHSHUIRUPDQFHRI*OREXV
2QOLQHLVEHWWHUWKDQRUVLPLODUWRWKHSHUIRUPDQFHRIWXQHG*8&WXQHGE\DGDWDPRYHPHQW
H[SHUW $FWXDOO\ WKH FRPSDULVRQ ZLWK WKH XQWXQHG *8& DQG 6&3 VKRZV WKH SHUIRUPDQFH
JDLQVWKDWPDQ\XVHUVFDQH[SHFWWRJDLQIURP*OREXV2QOLQH
Figure […]: Data transfer performance between ALCF and NERSC for various file sizes and transfer technologies
[Plot: Rate (Mbits/sec) versus File Size (bytes); series: go, guc, scp, tuned guc]
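
As a rough illustration of how a transfer rate relates to data volume and elapsed time, the sketch below computes an average rate in Mb/sec. The function name and the decimal unit convention (1 GB = 10^9 bytes, 1 Mb = 10^6 bits) are assumptions made for this example. The worked input uses only the figures quoted by the researcher in this brochure: 100 files of 7.3 GB each, and "over 3 days" with scp.

```python
def average_rate_mbps(total_bytes: float, seconds: float) -> float:
    """Return the average transfer rate in megabits per second (decimal units)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_bytes * 8 / seconds / 1e6


GB = 1e9  # assumption: decimal gigabytes

# Quoted scenario: 100 files of 7.3 GB each, taking over 3 days with scp.
total_bytes = 100 * 7.3 * GB
three_days_in_seconds = 3 * 24 * 3600

# Because the scp transfer took *more* than 3 days, this is an upper bound.
scp_rate_upper_bound = average_rate_mbps(total_bytes, three_days_in_seconds)
print(f"scp averaged at most about {scp_rate_upper_bound:.1f} Mb/sec")
```

Under these assumptions the scp transfer averaged no more than roughly 22.5 Mb/sec. The same function can be applied to any of the size and time pairs listed above once their values are known.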
ARCHITECTURE
Globus Online comprises:


$VHWRIXVHUJDWHZD\VZKLFKH[LVWWRSURYLGHLQWHUDFWLRQEHWZHHQXVHUVDQGWKHV\VWHP
YLDWKH:HE&/,DQG5(67LQWHUIDFHV


$VHWRIRQHRUPRUHZRUNHUVZKLFKH[LVWWRRUFKHVWUDWHGDWDWUDQVIHUVDQGSHUIRUP
RWKHUWDVNVVXFKDVQRWLI\XVHUVRIFKDQJHVLQVWDWH


$SURÀOHVDQGVWDWHGDWDEDVHXVHGWRPDLQWDLQXVHUSURÀOHVDQGÀOHWUDQVIHU

VWDWHLQIRUPDWLRQ
3HUVRQDOORFDOPDFKLQHVDUHFRQQHFWHGWR*OREXV2QOLQHXVLQJ*OREXV&RQQHFWDRQHFOLFN
GRZQORDGDQGLQVWDOODSSOLFDWLRQIRU0DF26/LQX[DQG:LQGRZV*OREXV&RQQHFWFRPSULVHV
D*ULG)73VHUYHUWKDWUXQVDVDQRUPDOXVHUUDWKHUWKDQDVURRWIURPLQHWGOLNHDW\SLFDO*ULG)73
VHUYHU DQG D JVLRSHQVVK FOLHQW FRQÀJXUHG WR HVWDEOLVK DQ DXWKHQWLFDWHG FRQQHFWLRQ WR D
*OREXV2QOLQHUHOD\VHUYHULQRUGHUWRWXQQHOWKH*ULG)73FRQWUROFKDQQHOUHTXHVWVIURP*OREXV
2QOLQH*OREXV&RQQHFWRQO\HVWDEOLVKHVRXWERXQGFRQQHFWLRQVDQGWKXVFDQZRUNEHKLQGD
ÀUHZDOORURWKHUQHWZRUNLQWHUIDFHGHYLFHWKDWGRHVQRWDOORZIRULQERXQGFRQQHFWLRQVF
ront-end architecture and performance

:HEUHQGHULQJHQJLQHDQGVXSSRUWLQJFRUHVHUYLFHVDUHGHYHORSHGLQ3\WKRQXVLQJ
WKH3\UDPLGZHEVHUYLFHVWDFN

*OREXV2QOLQHVHUYLFHVDUHOLJKWZHLJKWKLJKO\VFDODEOHVHUYLFHVWKDWSUHVHQW5(67IXO
LQWHUIDFHVOHYHUDJLQJ+773VHPDQWLFVIRUHDVHRIXVHSHUIRUPDQFHDQGVFDODELOLW\


8VHUGDWDLVVWRUHGLQ&DVVDQGUDDKLJKO\VFDODEOHVHFRQGJHQHUDWLRQGLVWULEXWHG
GDWDEDVHRULJLQDOO\GHYHORSHGE\)DFHERRNDQGQRZVXSSRUWHGDV
DQ$SDFKHSURMHFW

85/UXOHVDQGSUR[\ORJLFDUHKDQGOHGE\1*,1;DKLJKSHUIRUPDQFH+773VHUYHU
DQGUHYHUVHSUR[\

:HESDJHVDUHGHÀQHGZLWKDVLPSOH\HWSRZHUIXOGHFODUDWLYHPRGHOWKDW

HQDEOHVDVVHPEO\WREHSHUIRUPHGFOLHQWVLGHRUVHUYHUVLGH

:HEZLGJHWVDUHGHYHORSHGLQSXUH'+70/1RSUHVHQWDWLRQORJLFLVSHUIRUPHG
VHUYHUVLGH


Security and robustness

$OOZHEVLWHWUDIÀFLVKDQGOHGRYHU+7736WRHQVXUHVHFXULW\RIVHQVLWLYHGDWDDQGFRRNLHV

6XSSRUWIRUÀQHJUDLQHGDFFHVVFRQWUROWRXVHULQIRUPDWLRQ

$OOXVHUSURÀOHLQIRUPDWLRQLQFOXGLQJSHUVRQDOGDWDDQGVHFXULW\FUHGHQWLDOVLVVWURQJO\
HQFU\SWHGLQVWRUDJHWRIXUWKHUSURWHFWDJDLQVWXQDXWKRUL]HGDFFHVVR
eliability and scalability


*OREXV2QOLQHLVKRVWHGRQ$PD]RQ·V(ODVWLF&RPSXWH&ORXG(&IRURSWLPXP
SHUIRUPDQFHDQGVFDODELOLW\


&RUHVHUYLFHVDUHKRVWHGRQWKUHHVHSDUDWH$PD]RQGDWDFHQWHUVIRUUHGXQGDQF\


*OREXV2QOLQHHDVLO\VFDOHVZLWKWKHDGGLWLRQRIQHZ(&LQVWDQFHV


7UDIÀFWR:HEVLWHLVORDGEDODQFHGE\$PD]RQ(ODVWLF/RDG%DODQFHU(/%IRU

VFDODEOHUHOLDELOLW\
SYSTEM OVERVIEW
*OREXV2QOLQHLVDKRVWHGÀOHWUDQVIHUV\VWHPWKDWSURYLGHVVLPSOHUHOLDEOHGDWDPRYHPHQWIRU
YLUWXDOO\DQ\UHVHDUFKHUDQGIDFLOLW\
*OREXV2QOLQHPDNHVKLJKSHUIRUPDQFHÀOHWUDQVIHUFDSDELOLWLHV²WUDGLWLRQDOO\DYDLODEOHRQO\
RQH[SHQVLYHVSHFLDOSXUSRVHVRIWZDUHV\VWHPV²DFFHVVLEOHWRDQ\UHVHDUFKHUZLWKDQ,QWHUQHW
FRQQHFWLRQDQGDODSWRS8VHUVVLPSO\VLJQXSIRUWKHVHUYLFHORJLQDFFHVVWKHLUGDWDXVLQJ
H[LVWLQJFUHGHQWLDOVDQGWKHQFOLFNRULVVXHDVLPSOHFRPPDQGWRWUDQVIHUÀOHV
Key features:
INTERFACES
*OREXV2QOLQHIHDWXUHVDZHELQWHUIDFHDFRPPDQGOLQHLQWHUIDFHDQGD5(67$3,IRU

PRYLQJÀOHVVHFXUHO\DQGUHOLDEO\
Web UI: Using just a web browser, files or entire directories from one endpoint can be selected and transferred to another endpoint.
CLI Commands: For those more comfortable at a command prompt, the 'scp' and 'transfer' commands provide additional options for creating and controlling file transfer requests. You may use a familiar scp-like interface for moving files between two endpoints, or take advantage of the Globus Online transfer command for increased flexibility in specifying a list of individual files or directories for transfer between endpoints.
Transfer REST API: This “RESTful” Globus Online interface is designed for developers who need to deliver solutions to a community of users or integrate with an external website or application. The Transfer REST API makes it possible to integrate reliable data movement into users' HPC workflows, such as integrating with Java clients or Web-based portals, with no requirement for specialized software.
[Diagram labels, Amazon Web Services hosting: Amazon Elastic Compute Cloud (EC2), instances distributed across multiple data centers; Amazon Elastic Load Balancer (ELB), traffic load balanced with EC2; Amazon Simple Storage Service (S3), data is securely archived on S3; three Globus Online EC2 instances, each showing Ubuntu Linux Server, NGINX, Pyramid, Cassandra, and DHTML. Each Cassandra instance is configured to replicate data on […] or more instances for increased durability. Data is stored encrypted to prevent unauthorized access.]
USER SCENARIO: LATTICE QCD DATA MOVEMENT
6FLHQWLVWVDWWKH0,0'/DWWLFH&RPSXWDWLRQ0,/&5HVHDUFK
&ROODERUDWLRQDUHXVLQJ*OREXV2QOLQHWRVLPSOLI\WKLVSURFHVV
RI PRYLQJ ODUJH ÀOHV DURXQG WR GLIIHUHQW VXSHUFRPSXWLQJ
FHQWHUV LQFOXGLQJ 7HUD*ULG PDFKLQHV OLNH.UDNHQ $WKHQD
/LQFROQDQG/RQJKRUQRU'2(FRPSXWHUVOLNH)UDQNOLQ,QWUHSLG
DQG +RSSHU RQ D IUHTXHQW EDVLV $FFRUGLQJ WR 6WHYHQ
*RWWOLHE'LVWLQJXLVKHG 3URIHVVRU,QGLDQD 8QLYHUVLW\ DQG D
VHQLRUPHPEHURIWKH0,/&&ROODERUDWLRQWKHUHVXOWLQJVSHHGXSKDVEHHQ´YHU\LPSUHVVLYHµ
ZLWKRQRQHRFFDVLRQRQHKXQGUHGJLJDE\WHÀOHVWUDQVIHUUHGLQMXVWPLQXWHVDSURFHVV
WKDWZRXOGKDYHWDNHRYHUGD\VZLWKVFS7KHUHGXFWLRQLQGHOD\VDQGPDQXDOLQWHUYHQWLRQ
KDV´PDGHDELJGLIIHUHQFHµLQWKHFRQYHQLHQFHRIPRYLQJSURMHFWVEHWZHHQFHQWHUV
Globus Online
Reliable File Transfer. No IT Required.
CHALLENGE
)LOH WUDQVIHU LV ERWK D FULWLFDO DQG IUXVWUDWLQJ DVSHFW RI FRPSXWH
LQWHQVLYH UHVHDUFK VXFK DV/DWWLFH TXDQWXP FKURPRG\QDPLFV
)RUDUHODWLYHO\PXQGDQHWDVNPRYLQJWHUDE\WHVRIGDWDUHOLDEO\
DQG HIÀFLHQWO\ FDQ EH VXUSULVLQJO\ FRPSOLFDWHG RQH PXVW GHDO
ZLWK GHWHUPLQLQJ DYDLODEOH SURWRFROV QHJRWLDWLQJ ÀUHZDOOV DQG
DXWKHQWLFDWLRQ SUHVWDJLQJ ÀOHV IRU WUDQVIHU GHWHFWLQJ DQG
UHVSRQGLQJ WR IDLOXUHV GLDJQRVLQJ QHWZRUN PLVFRQÀJXUDWLRQV
FRQÀJXULQJ VRIWZDUH HWF 2IWHQ/DWWLFH VFLHQWLVWV XVH
VXSHUFRPSXWHUVZKHUHWLPHKDVEHHQDOORFDWHGVXFKDV7HUD*ULG
RU'2(PDFKLQHV²EXWXVHRIWKHVHIDFLOLWLHVLVFRVWO\DQGFDQ
EH WLPHFRQVXPLQJ VLQFH VFLHQWLVWV PXVW GHDO ZLWK PDQDJLQJ
ÀOH WUDQVIHUV WR DQG IURP WKHVH PDFKLQHV EDVHG RQ ZKHUHYHU
FRPSXWLQJWLPHKDSSHQVWREHDYDLODEOH
,IÀOHWUDQVIHUFRXOGEHDFFHOHUDWHGDQGVLPSOLÀHGVFLHQWLVWVZRXOGEHQHÀWQRWRQO\IURPWLPH
DQGFRVWVDYLQJVEXWDOVRIURPWKHHOLPLQDWLRQRIWHGLRXVUHSHWLWLYHWDVNVWKDWWDNHWKHLUDWWHQWLRQ
DZD\IURPWKHLUFRUHUHVHDUFK

“Globus Online frees up my time to do more creative work rather than typing scp commands or devising scripts to initiate and monitor progress.”
Figure […]: Hosting on Amazon Web Services provides scalability for core services
Figure […]: Schematic of the Globus Online architecture
[Diagram labels: User (four instances), Request collector, Worker, Profiles & State, Notification target, GridFTP server (two instances)]
“I moved 100 7.3 GB files in about […] hours. The same transfer would have taken over 3 days with scp.”
[Diagram labels: User; Globus Online; Globus Connect on “MyDesktop”; GridFTP server at “SiteA”. Steps shown: (2) User makes request to Globus Online, e.g., “Transfer data from MyDesktop to SiteA”; (3) Globus Online forwards requests to Globus Connect; (4) Globus Connect establishes data channel connection to SiteA and transfers data]
Figure […]: Globus Connect: high-level representation of transfer flow

Reliability through automatic fault recovery and integrity checking

Integrated management of transfers across multiple security domains with multiple user identities

Optimized performance by auto-tuning transfers based on number of files and file sizes

Performance monitoring, fine-grained activity logging, and status reporting

Conditional file synchronization based on attributes such as size, timestamp, or checksum mismatch

Simplified transfers from mass storage using pre-staging of data and automatic retries

Transfer from machines behind firewalls and NATs using a lightweight client

Simple creation and maintenance of personal endpoints using Globus Connect

New features immediately available to scientists as a result of the Software-as-a-Service approach
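
Conditional synchronization, as listed among the features above, amounts to deciding per file whether a copy is needed. The following is a minimal local-filesystem sketch of such a decision, assuming three comparison levels named `size`, `mtime`, and `checksum`. These level names, the function names, and the choice of SHA-256 are assumptions for illustration and do not describe how Globus Online performs the comparison.

```python
import hashlib
import os


def file_checksum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(source: str, destination: str, level: str = "checksum") -> bool:
    """Decide whether `source` should be copied over `destination`."""
    if not os.path.exists(destination):
        return True  # nothing at the destination yet
    src, dst = os.stat(source), os.stat(destination)
    if level == "size":
        return src.st_size != dst.st_size
    if level == "mtime":
        return src.st_mtime > dst.st_mtime  # source is newer
    if level == "checksum":
        return file_checksum(source) != file_checksum(destination)
    raise ValueError(f"unknown sync level: {level!r}")
```

The levels trade cost for certainty: comparing sizes is cheap but misses same-size changes, while comparing checksums reads both files in full but catches any content difference.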
[Diagram label, Globus Connect transfer flow: (1) Register Globus Connect instance with Globus Online]
“Globus Online is the most beneficial grid technology I have ever seen.”