Utilizing GPUs to improve the performance of dynamics analysis ...


Heru Suhartanto

Faculty of Computer Science, Universitas Indonesia

E-mail: heru@cs.ui.ac.id


Presented at the University of YARSI General Course on 27 April 2011

A revised version of the presentation at ICACSIS 2010, http://icacsis2010.cs.ui.ac.id/

The presentation will soon be available at http://hsuhartanto.wordpress.com




Resource-hungry problems that need supercomputing resources (examples and types)

Why Grid and Cloud computing (definition, structure, ….)

Some past and current works:

The development of the first Indonesian Grid infrastructure;

Parallel molecular dynamics in drug design based on typical Indonesian plants, on a cluster environment;

IndoEdu-Grid, a design for Indonesian e-learning resources based on Grid computing.

Prospects for the future and some proposals to overcome the challenges, including cloud computing

Upcoming works

Resource-Hungry Applications [Ref: Hai Jin and Raj Buyya]

Solving grand challenge applications using computer modeling, simulation, and analysis:

Life Sciences

CAD/CAM

Aerospace

Military Applications

Digital Biology

Internet & Ecommerce


Information simulation: compute dominated

Information repository: storage dominated

Information access: communication dominated

Information integration: system of systems

These applications cannot be solved with ordinary computing resources.


There are 3 ways to improve performance, each with a computer analogy:

Work harder: use faster hardware

Work smarter: use optimized algorithms and techniques to solve computational tasks

Get help: use multiple computers to solve a particular task


Improve the operating speed of processors & other
components


constrained by the speed of light, thermodynamic laws, &
the high financial costs for processor fabrication



Connect multiple processors together & coordinate their
computational efforts


parallel computers


allow the sharing of a computational task among multiple
processors


Ref: Buyya



Supercomputer ?

Cluster Computing ?

Grid Computing ?

Cloud Computing?


We need to ‘collect’ these resources and share them among the people who need them.

This leads to the Grid computing concept.


http://www.pragma-grid.net/

The Pacific Rim Application and Grid Middleware Assembly (PRAGMA) was formed in 2002 to establish sustained collaborations and advance the use of grid technologies in applications among a community of investigators working with leading institutions around the Pacific Rim.

Four working groups focus our activities in the areas of:

* Resources and Data

* Biosciences

* Telescience

* Global Earth Observatory (GEO)

Members have been doing a combination of the following:

- joining their resources with the PRAGMA grid
http://goc.pragma-grid.net/pragma-doc/userguide/join.html
http://goc.pragma-grid.net/pragma-doc/computegrid.html

- running grid applications in the PRAGMA grid
http://goc.pragma-grid.net/pragma-doc/userguide/pragma_user_guide.html
http://goc.pragma-grid.net/wiki/index.php/Applications

- developing, integrating, enhancing, implementing, and sharing software in the PRAGMA grid
http://goc.pragma-grid.net/wiki/index.php/Main_Page#Middleware

Our recent focus is virtualization. Some sites have been actively working together on VM technology.
http://goc.pragma-grid.net/wiki/index.php/Virtualization


Pipeline damage detection: inspecting 100 km of pipe with a diameter of 50 inches collects 280 terabytes (2.8 x 10^{14} bytes) of data, at a transfer rate of 2.8 Gb. This can only be processed with Grid computing resources [ref: inspektionmolch: http://www.hpe.fzk.de/projekt/molch/, accessed 27 Sep 08]



Analysis of brain activity data collected from the MEG (magnetoencephalography) instrument is a very important research topic because it supports doctors in identifying the symptoms of disease. This is a collaboration between the Grid Lab at the University of Melbourne, the Nimrod-G Project at Monash University, and the MEG project at Osaka University [ref: http://www.gridbus.org/neurogrid/, accessed 27 Sep 08]


The Novartis Institute for Biomedical Research needed 6 years of processing time on a supercomputer, but with a PC Grid of 3,700 desktop PCs the processing took only 12 hours. This saved about 200 million dollars over three years, and the computing power achieved was more than 5 teraflops [Ian Foster, www.globus.org]
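The size of that gain is easier to see as a ratio. The short sketch below only redoes the arithmetic from the figures quoted on the slide (6 years, 12 hours, 3,700 desktop PCs); the 365-day year and the function name `speedup` are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 365-day years, leap days ignored

def speedup(baseline_hours, new_hours):
    """Ratio of the baseline run time to the new run time."""
    return baseline_hours / new_hours

supercomputer_hours = 6 * HOURS_PER_YEAR  # "6 years" from the slide
pc_grid_hours = 12                        # "12 hours" from the slide
desktops = 3700                           # PCs in the grid, from the slide

ratio = speedup(supercomputer_hours, pc_grid_hours)
print(f"wall-clock ratio: {ratio:.0f}x")
print(f"ratio per desktop: {ratio / desktops:.2f}")
```

Under these assumptions the wall-clock ratio is about 4380, slightly more than one unit of speedup per desktop, which suggests that the quoted figures are rounded rather than exact measurements.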






Grid computing is the combination of computer resources from multiple administrative domains to reach a common goal. The Grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files.



A computing infrastructure that provides large-scale access to computing resources that are geographically distributed yet interconnected into a single facility. These resources include, among others, supercomputers, storage systems, data sources, and instruments.

Grid computing physical structure [Ian Foster]


Grid Architecture [GridBus]


Thailand


ThaiGrid


Started in 2002


Funding : $ 6M (3 years)


10 univ., Weather Forecast Services, NECTEC


158 CPUs


Singapore


NGP (National Grid Project)


Started September 2002


3 univ., 5 ministries (MOE, MOH, MITA, MINDEF, MTI)


Malaysia


Proposal “National Technology Roadmap for Grid Computing” submitted to MOSTI (initiator: MIMOS Berhad, 2005)


Regional forums:


SEA Grid Forum (3 countries)


ApGrid (14 countries)


If we ask others to provide these resources and users consume them as services, then Grid computing functions as Cloud computing.

Services in the Cloud

Software as a Service (SaaS)

Platform as a Service (PaaS)

Infrastructure as a Service (IaaS)



SaaS: can take the form of applications such as CRM (customer relationship management) and email

PaaS: platforms, including programming languages, APIs, and development environments

IaaS:

Virtualization: provisioning, virtualization, billing

Hardware: memory, computation, storage

Colocation: the data center owner rents out floor space and provides power and cooling as well as a network connection

Some cloud vendors: Amazon

aws.amazon.com: Amazon Web Services (AWS) offers a large number of cloud services. The focus here is on the Elastic Compute Cloud (EC2) and its supplementary storage services.

EC2 offers the user a choice of virtual machine templates that can be instantiated in a shared and virtualized environment.

Each virtual machine template is called an Amazon Machine Image (AMI). The customer can use pre-packaged AMIs from Amazon and third parties, or they can build their own.


Appian - www.appian.com

Offers management software to design and deploy business processes. The tool is available as a web portal for both business process designers and users,

the design is facilitated by a graphical user interface that maps processes to web forms,

End users are then able to access the functionality through a dashboard of forms,

Executives and managers can access the same web site for bottleneck analysis, real-time visibility, and aggregated high-level analysis

Google:

apps.google.com, appengine.google.com

Google App Engine is a platform service. It provides a basic run-time environment and eliminates many of the system administration and development challenges involved in building applications that scale to millions of users,

Another infrastructure service, used primarily by Google applications themselves, is Google Bigtable. It is a fast and extremely large-scale DBMS designed to scale into the petabyte range across “hundreds or thousands of machines”

On the SaaS side, Google offers some free and competitively priced services, including Gmail, Google Calendar, Talk, Docs, and Sites.

Cloud computing services by Indonesians?

Free: Esfindo (SaaS), InGrid (IaaS), ……

Paid: telkomcloud, web hosting, colocation, ….


Over 20 definitions:

http://cloudcomputing.sys-con.com/read/612375_p.htm

Buyya’s definition:

“A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualised computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers.”

Keywords: Virtualisation (VMs), Dynamic Provisioning (negotiation and SLAs), and Web 2.0 access interface

All data management needs on the Internet, met with resources prepared by a provider. [H. Suhartanto, 2011]

Private/Enterprise Clouds: Cloud computing model run within a company’s own data center or infrastructure for internal and/or partner use.

Public/Internet Clouds: third-party, multi-tenant Cloud infrastructure and services, available on a subscription basis (pay as you go).

Hybrid/Mixed Clouds: mixed usage of private and public Clouds, leasing public cloud services when private cloud capacity is insufficient.


No upfront infrastructure investment


No procuring hardware, setup, hosting, power, etc..


On demand access


Lease what you need and when you need..


Efficient Resource Allocation


Globally shared infrastructure, can always be kept busy by serving
users from different time zones/regions...


Nice Pricing


Based on Usage, QoS, Supply and Demand, Loyalty, …


Application Acceleration

Parallelism for large-scale data analysis, what-if scenario studies…

High Availability, Scalability, and Energy Efficiency

Supports Creation of 3rd Party Services & Seamless Offering

Builds on infrastructure and follows a similar business model to the Cloud



Some previous research works are available

The development of internet infrastructure among universities;

Some related courses are offered in universities

National network infrastructure provided by the telecommunication industry

Combining terrestrial and satellite connections

Terrestrial: optical fiber, copper, digital microwave (wireless and wired)

Internet users: 40 million

Mobile phone subscribers: 105 million

Nizam, Aptikom presentation, 2011

[Figure: Higher-education zone configuration (Konfigurasi Zona Perguruan Tinggi): JarDikNas, topology of “INHERENT” in 2010. The map shows numbered nodes in cities across Indonesia, from Banda Aceh to Jayapura, joined by links of 155, 16, 8, 4, 2, and 1 Mbps. Note: 41 terrestrial links and 12 VSAT links, 53 links in total.]

Nizam
, 2011 at APTIKOM meeting


Number of connections

82 state universities (PTN), 32 of them acting as Local Nodes

224 private universities (PTS)

12 Kopertis (coordinating offices for private higher education)

SEAMEO-Seamolec

Bandwidth capacity

Advanced: 155 Mbps

Medium: 8 Mbps

Basic: 2 Mbps

Self-funded: (leased line 512 to 1 M; wireless 11-55 M)

Network configuration: scale-free network

Future aspiration: a Higher Education super corridor on dark fiber, so that connections between universities run at a minimum of 1 GBps with a national backbone of 10 GBps (in Thailand, inter-university links already run at 1-10 GBps)

[Figure: inGRID architecture. Labels: users, the inGRID portal and a custom portal, INHERENT, Globus head nodes (at UI, I*, and U*), and Linux/Sparc, Linux/x86, Windows/x86, and Solaris/x86 clusters.]


inGRID Portal

SUN Fire X2100, AMD Opteron Processor (2.4 GHz, dual core), 2 GB Memory, 80 GB Disk, 2 10/100/1000 Mbps NICs, DVD-ROM Drive

Globus Head Node

SUN Fire X2100, AMD Opteron Processor (2.2 GHz, dual core), 1 GB Memory, 80 GB Disk, 2 10/100/1000 Mbps NICs, DVD-ROM Drive

Linux Cluster (16 nodes)

SUN Fire X2100, AMD Opteron Processor (2.2 GHz, dual core), 1 GB Memory, 80 GB Disk, 2 10/100/1000 Mbps NICs

Storage Server

Dual Xeon Processor (3.0 GHz), 2 GB Memory, 1 TB Disk


User Interface: UCLA Grid Portal

Middleware: Globus Toolkit

Job Scheduler: Sun Grid Engine (SGE)

Programming: C, Java

Parallel: MPICH

Applications:

Chemistry: GROMACS

Biology: BLAST

Computer Graphics: POV-Ray

Utilities: matrix multiplication, sort, Octave (Matlab-like)


Ari Wibisono, Heru Suhartanto, Arry Yanuar, Performance Analysis of
Curcumin Molecular Dynamics Simulation using GROMACS on Cluster
Computing Environment, this conference.


Muhammad Hilman, Heru Suhartanto, Arry Yanuar, Performance
Analysis of Embarrassingly Parallel Application on Cluster Computer
Environment : A Case Study of Virtual Screening with Autodock Vina
1.1 on Hastinapura Cluster, this conference.


Molecular dynamics simulation is used to study the solvation of proteins, the interaction of DNA-protein complexes and lipid systems, and the ligand binding and folding of proteins.

It produces a trajectory of molecules over a finite time period, where each of the molecules in the simulation has position and momentum parameters.

It can be used to assist drug discovery. Computers offer an in silico method that complements the in vitro and in vivo methods commonly used in drug discovery. The term in silico, by analogy with in vitro and in vivo, refers to the use of computers in drug discovery studies.

GROMACS is used in the simulation.
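To make the idea of a trajectory of positions and momenta concrete, here is a minimal sketch of one common integration scheme, velocity Verlet, applied to a single particle on a spring. It is an illustration only, not how GROMACS or the cited study is implemented; the spring constant, mass, time step, and all function names are hypothetical.

```python
def velocity_verlet(x, p, force, mass, dt, steps):
    """Integrate one particle and return its trajectory as (x, p) pairs."""
    trajectory = [(x, p)]
    f = force(x)
    for _ in range(steps):
        p_half = p + 0.5 * dt * f    # half-step momentum update
        x = x + dt * p_half / mass   # full-step position update
        f = force(x)                 # force at the new position
        p = p_half + 0.5 * dt * f    # second half-step momentum update
        trajectory.append((x, p))
    return trajectory

K = 1.0     # hypothetical spring constant
MASS = 1.0  # hypothetical particle mass

def spring_force(x):
    """Restoring force of a harmonic spring."""
    return -K * x

def total_energy(x, p):
    """Kinetic plus potential energy of the particle."""
    return p * p / (2.0 * MASS) + 0.5 * K * x * x

traj = velocity_verlet(x=1.0, p=0.0, force=spring_force,
                       mass=MASS, dt=0.01, steps=1000)
e0 = total_energy(*traj[0])
e1 = total_energy(*traj[-1])
print(f"energy drift after {len(traj) - 1} steps: {abs(e1 - e0):.2e}")
```

A real molecular dynamics run applies the same kind of update to every atom at every step, with forces computed from a molecular force field, which is why the workload benefits from clusters and GPUs.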



Molecular docking is a computational procedure that attempts to predict the noncovalent binding of macromolecules.

The goal is to predict the bound conformations and the binding affinity.

The prediction process is based on information that is embedded in the chemical bonds of the substances.

Autodock Vina is used in the simulation.
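Virtual screening is described in the cited paper as an embarrassingly parallel application: each ligand can be docked independently of the others. The sketch below shows only that pattern. The function `dock_ligand` is a hypothetical placeholder that returns a made-up score; it does not call Autodock Vina, and the ligand names are invented. Threads are used to keep the example self-contained, whereas a cluster run would spread the jobs over separate processes or nodes.

```python
from concurrent.futures import ThreadPoolExecutor

def dock_ligand(ligand):
    """Placeholder for one independent docking run.

    A real run would invoke a docking program on `ligand`; this stand-in
    returns a made-up score so that the sketch is runnable.
    """
    fake_score = -float(len(ligand))  # hypothetical, NOT a binding affinity
    return ligand, fake_score

def screen(ligands, workers=4):
    """Dock every ligand independently, then rank by score (lowest first)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(dock_ligand, ligands))
    return sorted(results, key=lambda item: item[1])

ligands = ["lig_a", "lig_bb", "lig_cccc"]  # hypothetical ligand names
ranking = screen(ligands)
print(ranking[0])
```

Because no job waits on another, adding workers shortens the wall-clock time until the number of workers reaches the number of ligands or the hardware limit.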


No   Time Step   Number of Processors
                 2      3      4      5
1    200 ps      1.85   2.64   3.07   3.74
2    400 ps      1.84   2.46   3.13   3.73
3    600 ps      1.83   2.42   3.04   3.69
4    800 ps      2.03   2.47   3.09   3.76
5    1000 ps     1.87   2.51   3.14   3.82
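The slide does not label what the table entries measure. If they are read as speedups relative to a single processor, which is an assumption and not something the slide states, the parallel efficiency follows by dividing each entry by its processor count. The sketch below does this for the 200 ps row; the function name is hypothetical.

```python
# Row "200ps" of the table above, keyed by processor count.
# Assumption: the entries are speedups relative to one processor.
row_200ps = {2: 1.85, 3: 2.64, 4: 3.07, 5: 3.74}

def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup that is achieved."""
    return speedup / processors

efficiency = {p: parallel_efficiency(s, p) for p, s in row_200ps.items()}
for p, e in efficiency.items():
    print(f"{p} processors: efficiency {e:.3f}")
```

Under that reading, efficiency falls from about 0.93 on 2 processors to about 0.75 on 5, the usual pattern when communication overhead grows with the processor count.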



This work discusses the design and simulation of an e-learning computer network topology, based on Grid computing technology, for Indonesian schools, called the Indonesian Education Grid (abbreviated as IndoEdu-Grid).

Establishing such a network without Grid computing capabilities would leave idle resources redundant.

We proposed scenarios that have different network topologies based on their router and link configurations. Each scenario is run in the simulator using two packet scheduling algorithms: a FIFO (First In First Out) scheduler and an SCFQ (Self-Clocked Fair Queuing) scheduler.

The processing time of the jobs’ packets is evaluated to determine the most effective network topology for IndoEdu-Grid.
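The difference between the two schedulers can be sketched in a few lines. This is a simplified illustration, not GridSim's implementation: it assumes that all packets are already queued before service starts (so the SCFQ virtual time is 0 for every arrival), and it uses strictly positive per-flow weights, a different convention from the ToS weights 0 and 1 used in the simulation. The flow names, lengths, and weights are hypothetical.

```python
import heapq
from collections import defaultdict

def fifo_order(packets):
    """Serve packets strictly in arrival order."""
    return list(range(len(packets)))

def scfq_order(packets, weights):
    """Serve packets in increasing order of SCFQ finish tag.

    packets: list of (flow, length) in arrival order, all queued before
             service starts, so the virtual time is 0 for every arrival.
    weights: positive share per flow; a larger weight gives smaller tags.
    """
    last_tag = defaultdict(float)  # finish tag of each flow's previous packet
    heap = []
    for index, (flow, length) in enumerate(packets):
        tag = last_tag[flow] + length / weights[flow]
        last_tag[flow] = tag
        heapq.heappush(heap, (tag, index))  # index breaks ties by arrival
    return [heapq.heappop(heap)[1] for _ in range(len(packets))]

packets = [("B", 100), ("B", 100), ("A", 100), ("A", 100)]
weights = {"A": 2.0, "B": 1.0}
print(fifo_order(packets))
print(scfq_order(packets, weights))
```

With these inputs FIFO serves the packets as they arrived, while SCFQ moves the first packet of the heavier-weighted flow A to the front and interleaves the rest, so no single flow monopolizes the link.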



The entities of our design are resources, users, and jobs (Gridlets).

Resource entities are responsible for performing the computation on job entities, in the form of Gridlets sent by one or more users, and for sending the results back to the user. Our work uses one resource for each province; each resource consists of one Machine, and each Machine consists of 4 PEs (processing elements).

Users are entities responsible for submitting jobs, in the form of Gridlet objects, to the resources. The users are programmed to send jobs to a particular resource at the same time, so we can learn more about the performance of the Grid system at peak load, when all the users access the resource at once.

Jobs in GridSim are represented as objects of the class Gridlet provided by GridSim. In our work, each user creates three Gridlets of different lengths: 5000 MI (million instructions), 3000 MI, and 1000 MI. This is meant to simulate the real situation in which a user sends not just one job but several jobs of different sizes and computing needs.
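The workload described above can be sketched as plain data. The code below is not GridSim (which is a Java toolkit) and does not use its API; the class and function names are hypothetical, and the MIPS rating per PE is an assumed value because the slides do not give one. Only the three Gridlet lengths and the 4 PEs per machine come from the text.

```python
from dataclasses import dataclass

GRIDLET_LENGTHS_MI = (5000, 3000, 1000)  # from the slide
PES_PER_MACHINE = 4                      # from the slide
ASSUMED_MIPS_PER_PE = 100                # hypothetical rating, not in the slides

@dataclass
class Gridlet:
    user_id: int
    gridlet_id: int
    length_mi: int

def create_gridlets(num_users):
    """Each user creates one Gridlet per length, as in the described setup."""
    return [Gridlet(user, gid, length)
            for user in range(num_users)
            for gid, length in enumerate(GRIDLET_LENGTHS_MI)]

def ideal_execution_seconds(gridlet, mips_per_pe=ASSUMED_MIPS_PER_PE):
    """Time on one dedicated PE, ignoring network delay and queueing."""
    return gridlet.length_mi / mips_per_pe

def makespan_lower_bound(gridlets, pes=PES_PER_MACHINE,
                         mips_per_pe=ASSUMED_MIPS_PER_PE):
    """Best case if the total work were split evenly over all PEs."""
    return sum(g.length_mi for g in gridlets) / (pes * mips_per_pe)

jobs = create_gridlets(num_users=2)
print(len(jobs), ideal_execution_seconds(jobs[0]), makespan_lower_bound(jobs))
```

These ideal times ignore the network entirely; the simulation in the slides measures processing time including packet scheduling over the proposed topologies, which is where the scenarios differ.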



The first scenario divides the whole territory of Indonesia into three main sections: the western, central, and eastern parts of Indonesia. Each of these three sections is subdivided into smaller units: the islands and/or archipelagos.

The second scenario divides the whole territory of Indonesia directly into island and/or archipelago units. These units are then divided into provinces.


Hardware

Intel® Core™ 2 Duo T5800 processor with 2.0 GHz clock speed, 800 MHz FSB (Front Side Bus), and 2 MB L2 cache.

2048 MB RAM (Random Access Memory), shared dynamically with the Mobile Intel® Graphics Media Accelerator 4500MHD.

320 GB Fujitsu MHZ2320BH G2 SATA hard disk with 5400 rpm rotation speed.

Software

32-bit Microsoft Windows Vista™ Business operating system.

JDK (Java Development Kit) version 1.6.0_05 with Java™ Runtime Environment 1.6.0_05-b13.

GridSim version 5.0 beta.

The simulation was run 10 times in each scenario to increase the validity of the simulation results, and then the results were averaged.



With the SCFQ scheduling algorithm, even-numbered users are set to weight 1, indicating that they have a higher priority, while odd-numbered users are set to weight 0, indicating that they have normal priority. This weighting determines the type of service (ToS) of the packets sent by the users.

With the FIFO scheduling algorithm, all users are set to weight 0 by default, so all sent packets have the same ToS.

Processing Time (in Simulation Seconds)

Scheduling Algorithm   Scenario     Gridlet#0   Gridlet#1   Gridlet#2
FIFO                   Scenario 1   239.76471   184.89620   124.45739
FIFO                   Scenario 2   240.23045   185.26774   124.11812
SCFQ                   Scenario 1   235.50311   180.73233   124.67395
SCFQ                   Scenario 2   235.78695   181.59782   124.05540

Average Simulation Results Data for the Entire Provinces per Gridlet Using
FIFO and SCFQ Scheduling Algorithm



Job = Gridlet, which simulates the job packets that contain information about the length of the job in units of MI (million instructions), the length of the input and output files in bytes, the starting and finishing execution times, and the owner of the job.

The three Gridlets #0, #1, and #2 have different lengths: 5000 MI (million instructions), 3000 MI, and 1000 MI, respectively.
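To put the differences between the schedulers in proportion, the sketch below computes the relative reduction in processing time of SCFQ against FIFO for Gridlet#0, using only the averages reported in the table. The function name is hypothetical, and the calculation says nothing about statistical significance.

```python
# Gridlet#0 processing times in simulation seconds, copied from the table.
processing_time = {
    ("FIFO", "Scenario 1"): 239.76471,
    ("FIFO", "Scenario 2"): 240.23045,
    ("SCFQ", "Scenario 1"): 235.50311,
    ("SCFQ", "Scenario 2"): 235.78695,
}

def relative_reduction(baseline, candidate):
    """Fraction by which `candidate` is smaller than `baseline`."""
    return (baseline - candidate) / baseline

for scenario in ("Scenario 1", "Scenario 2"):
    r = relative_reduction(processing_time[("FIFO", scenario)],
                           processing_time[("SCFQ", scenario)])
    print(f"{scenario}: SCFQ reduces Gridlet#0 time by {r:.2%}")
```

For the largest Gridlet the reduction is a little under 2 percent in both scenarios, while the two scenarios differ from each other by far less than that under either scheduler.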



More people are becoming interested in shared computing facilities,

Many free-of-charge grid development tools are available,

Develop a strong unit that is capable of building the Grid infrastructure, but this needs commitment and dedication from at least the university level and the government, or

INHERENT can be improved, which will open up more collaboration among universities,

Nusantara Super Highway to be completed in 2015: "The Nusantara Super Highway, based on an optical network, continues Telkom's aspiration to unite Indonesia through the Nusantara 21 vision, which began in 2001 with satellite-based technology," http://www.detikinet.com/read/2011/04/19/143116/1620709/328/nusantara-super-highway-rampung-di-2015?i991101105


Unreliable electricity supplies

No coordination at the national level for ICT research and development programs that involve both government and private organizations

Reliance on grant funds, which leads to other negative effects, such as:

Most Indonesian funding sources do not allow investment in hardware (computers); only spare parts are allowed

Permanent human resources to manage the Grid,

Maintenance of the grid to keep up with current technology developments.

Many organizations are “very protective” of their computing resources; only a few are willing to share them.

Challenges (cont.)

Only a few (maybe one or two) faculties teach cluster, cloud, and grid computing, so only a few people master and understand them.

Perhaps Cloud computing is an alternative solution in one way; however ……….the cloud itself has some challenges

Uhm, I am not quite clear… Yet another complex IT paradigm?

Billing

Utility & Risk Management

Scalability

Reliability

Software Eng. Complexity

Programming Env. & Application Dev.


More bioinformatics, medical informatics, image analysis, and finance work in a GPU computing environment,

Indonesian e-Gov Grid services

Indonesian Archeology and Culture-Grid services

Indonesian Health-Grid services


References

ABCGrid, http://abcgrid.cbi.pku.edu.cn (accessed 3 October 2008); see also Ying Sun, Shuqi Zhao, Huashan Yu, Ge Gao, and Jingchu Luo (2007), ABCGrid: Application for Bioinformatics Computing Grid, Bioinformatics

Rajkumar Buyya, www.gridbus.org/megha; www.buyya.com; www.manjrasoft.com

GCIC, http://www.gridcomputing.com/, accessed 25 Sep 2008

Globus, http://www.globus.org, accessed 25 Sep 2008

Gridbus Application, http://www.gridbus.org/applications.html, accessed 25 Sep 2008

Gridbus Middleware, http://www.gridbus.org/middleware/, accessed 25 Sep 2008

GridGain, http://www.gridgain.com, accessed 15 Sep 2008

Ivo Bahar, Heru Suhartanto, Design and Simulation of Indonesian Education Grid Topology using Gridsim Toolkit, to appear in Asian Journal of Information Technology, 2010

H. Suhartanto, Kajian Perangkatbantu Komputasi tersebar berbasis Message Passing, Makara Teknologi, Vol 10, No 2, 2006, pages 72-81

H. Suhartanto, Peluang dan tantangan Aplikasi Grid Computing di Indonesia, professorial inauguration speech, 2008

InGrid, https://grid.ui.ac.id/gridsphere/gridsphere, accessed 28 Sep 2008

Jardiknas, http://jardiknas.diknas.go.id/, accessed 28 Sep 2008

John Rhoton, Cloud Computing Explained, 2nd ed., Recursive Press, 2010



Molecular Docking, http://grid.apac.edu.au/OurUsers/MolecularDocking, accessed 27 Sep 2008

Molecular Docking Definition, http://en.wikipedia.org/wiki/Docking_(molecular), accessed 3 October 2008

MultimediaGrid, http://www.gridbus.org/papers/MultimediaGrid-MJCS2007.pdf, accessed 27 Sep 2008

NeuroGrid, http://www.gridbus.org/neurogrid/, accessed 27 Sep 2008

Paul Coddington, Distributed and High Performance Computing course, University of Adelaide, 2002, UK national HPC service, http://www.csar.cfs.ac.uk/user_information/grid/grid-middleware.shtml

Pipeline Inspektionmolch: http://www.hpe.fzk.de/projekt/molch/, accessed 27 Sep 2008

Top500, http://www.top500.org, accessed 14 September 2008

Wahid Chrabakh, Computational Grid Computing: Application Viewpoint, Computer Science, Major Exams, UCSB, ppt file

Zlatev, Z. and Berkowicz, R. (1988), Numerical treatment of large-scale air pollutant models, Comput. Math. Applic., 16, 93-109