A Novice Enters the World of GPU Parallel Processing

2 Dec 2013

Ipsos - Nobody's Unpredictable

Oliver Will, Ph.D.
Ipsos Open Thinking Exchange

1

Ipsos

Top 10 global market research firm founded in 1975

Over 1 billion in revenues in 2010

1000s of employees. Lots of offices in lots of countries

Main focus is survey research. Wide variety of industries

I belong to a subdivision called the Ipsos Open Thinking Exchange

I analyze data from surveys

No experience with GPUs until January 2011, when I met with Dustin and Charles for drinks at Lucky Baldwins

2

SoCal GPGPU Polls

Who is in the bottom third? Middle third? Top third?

Who has access to a GPU? Who has programmed a GPU with CUDA? Who has programmed a GPU with OpenCL? Who has programmed a GPU directly? (In Assembly? I guess?)

Who is affiliated with UCLA? Who is not a professor? Who works in the private sector?

[Figure labels: Oliver Will, Marc Suchard, Individual Experience]

3

Outline

Discuss the cMaps problem

Installation of CUDA

The cMaps solution

4

cMaps Product

cMaps stands for Consumer Maps

Anthropological Facts

Human culture is defined by the people with whom one interacts (a social network)

Humans like to have things

Theories and observational evidence from the past 30 years indicate that humans may treat things like culture

I am defined by the things I have

Assumption: Brands are abstractions that can be treated the same independent of category

For example: Tide, Facebook, Miley Cyrus and The Social Network behave similarly

Goal: Find social networks of brands based on what people like and dislike

5

cMaps Product

Quarterly online survey with around 5,000 participants per wave

Sample has quotas set to the gender, age, geographic location and ethnicity information from the 2000 census

Currently the survey has 3,371 brands across CPG, celebrity and entertainment categories

Each respondent is asked to evaluate a random subset of 250 brands on an 8-point scale

0  Don’t know
1  Hate
2  Dislike strongly
3  Dislike
4  Neither like nor dislike
5  Like
6  Like a lot
7  Love

6

cMaps Product

In each of the next 3 weeks, the respondent is invited back to evaluate another random subset of 250 brands

A respondent could have evaluated 1000 brands on the 8-point scale by the end

Most come back and evaluate 1000 brands. It’s a fun survey

Data for 5 quarters



7

cMaps Data

Brand rating columns shown: _10canerum, _16_pregnant, _1800, _1800contacts, _1800flowers (blank cells are not preserved, so each respondent's ratings are listed without their column positions)

Respondent  QuarterlyWave  Age  Gender  ETH    EDU                 EMP          HHI         Ratings
701         q5             51   female  white  somecollege         fulltime     r50k_74k    5
702         q5             54   male    white  somecollege         parttime     noanswer    0, 4
703         q5             66   female  white  somecollege         retired      r35k_49k    0
704         q5             58   male    white  collegegraduate     retired      r150k_plus
705         q5             49   female  white  associatesdegree    fulltime     r100k_149k  0
706         q5             34   female  black  somecollege         fulltime     r50k_74k    4
707         q5             59   female  white  highschoolgraduate  notemployed  r25k_34k    4
708         q5             41   female  white  collegegraduate     homemaker    r75k_99k
709         q5             45   male    white  collegegraduate     fulltime     r50k_74k
710         q5             49   female  white  highschoolgraduate  homemaker    r15k_24k    0
711         q5             27   male    white  highschoolgraduate  fulltime     r50k_74k    4
712         q5             24   female  white  collegegraduate     parttime     r75k_99k
713         q5             51   female  white  collegegraduate     fulltime     r100k_149k  6

8

BAC Score

Among Those Who Were Exposed to Hallmark Channel & Avon (n=635)…
Love Hallmark: 30%
Don't Love Hallmark: 70%

Among Those Who Love Hallmark (n=186)…
Also Love Avon: 48%
Don't Love Avon: 52%

Among Those Who Don't Love Hallmark (n=449)…
Love Avon: 19%
Don't Love Avon: 81%

48% - 19% = BAC of 29

Brand Affinity Connection

9
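The 48% - 19% arithmetic on this slide can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the R code used for cMaps: the name `bac_score`, the use of `None` for a brand the respondent was not shown, and the reading of "Love" as a rating of 7 are all hypothetical choices. The counts 89 and 85 are back-calculated from the rounded percentages on the slide and are illustrative only.

```python
LOVE = 7  # assumed: "Love" is the top box of the 8-point scale


def bac_score(ratings_a, ratings_b, love=LOVE):
    """BAC of brand B given brand A, in percentage points.

    Each argument holds one rating per respondent; None marks a brand
    the respondent was not shown (an assumed encoding).
    """
    lovers = lovers_love_b = others = others_love_b = 0
    for a, b in zip(ratings_a, ratings_b):
        if a is None or b is None:
            continue  # keep only respondents exposed to both brands
        if a >= love:
            lovers += 1
            lovers_love_b += b >= love
        else:
            others += 1
            others_love_b += b >= love
    if lovers == 0 or others == 0:
        return None  # undefined when either group is empty
    return 100.0 * (lovers_love_b / lovers - others_love_b / others)


# Illustrative data shaped like the Hallmark/Avon slide (186 + 449 = 635).
hallmark = [7] * 186 + [4] * 449
avon = [7] * 89 + [4] * 97 + [7] * 85 + [4] * 364

if __name__ == "__main__":
    print(round(bac_score(hallmark, avon)))
```

With these illustrative counts the score rounds to 29, in line with the slide. Swapping the two arguments generally gives a different value, because the conditioning group changes.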

BAC Score

Among Those Who Were Exposed to Hallmark Channel & Lifetime (n=532)…
Love Hallmark: 30%
Don't Love Hallmark: 70%

Among Those Who Love Hallmark (n=157)…
Also Love Lifetime: 68%
Don't Love Lifetime: 32%

Among Those Who Don't Love Hallmark (n=375)…
Love Lifetime: 19%
Don't Love Lifetime: 81%

68% - 19% = BAC of 49

10

BAC Computational Challenge

Convert an n x m respondent-by-brand matrix into an m x m brand-by-brand matrix. Not symmetric. n = 27,785 and m = 3,371

Fit the entire matrix into memory

Use the apply function in R to compute the BAC score for one column against all the other columns

Don’t have enough memory for mapply

Requires 1.5 minutes to compute a column. Hence, need roughly three and a half days (3,371 columns x 1.5 minutes) to compute the entire matrix

Want to compute the brand-by-brand matrix on subsets of rows (e.g. males under 35)

Subsets not pre-specified and vary by client

11
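The conversion described on this slide, including the row subsets, can be sketched on a toy data set. This is a plain CPU reference under stated assumptions rather than the existing R solution: the names `bac_matrix` and `keep`, the `(demo, ratings)` layout, the `None` encoding for brands not shown, and "Love" meaning a rating of 7 are all hypothetical.

```python
def bac_matrix(respondents, n_brands, keep=lambda demo: True, love=7):
    """Return an n_brands x n_brands list of BAC scores (row = brand A).

    respondents: iterable of (demo, ratings) pairs, where demo is a dict
    and ratings has one entry per brand (None = brand not shown).
    """
    # counts[a][b] = [lovers of A, lovers of A who love B,
    #                 non-lovers of A, non-lovers of A who love B]
    counts = [[[0, 0, 0, 0] for _ in range(n_brands)] for _ in range(n_brands)]
    for demo, ratings in respondents:
        if not keep(demo):
            continue  # demographic filter chosen at call time
        for a, rating_a in enumerate(ratings):
            if rating_a is None:
                continue
            base = 0 if rating_a >= love else 2
            for b, rating_b in enumerate(ratings):
                if rating_b is None:
                    continue
                counts[a][b][base] += 1
                counts[a][b][base + 1] += rating_b >= love
    scores = [[None] * n_brands for _ in range(n_brands)]
    for a in range(n_brands):
        for b in range(n_brands):
            lovers, lovers_b, others, others_b = counts[a][b]
            if lovers and others:
                scores[a][b] = 100.0 * (lovers_b / lovers - others_b / others)
    return scores


# Toy respondents: three brands, five people (made-up values).
respondents = [
    ({"gender": "male", "age": 30}, [7, 7, None]),
    ({"gender": "male", "age": 28}, [4, 7, 7]),
    ({"gender": "female", "age": 40}, [4, 4, 4]),
    ({"gender": "male", "age": 50}, [4, 4, 7]),
    ({"gender": "female", "age": 22}, [4, 7, None]),
]

full = bac_matrix(respondents, 3)
young_men = bac_matrix(
    respondents, 3, keep=lambda d: d["gender"] == "male" and d["age"] < 35
)
```

On the toy data, `full[0][1]` and `full[1][0]` differ, which illustrates the "not symmetric" point. Each cell depends only on its own four counts, so the cells can be computed independently of one another, which is what makes the problem a candidate for parallel hardware.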

BAC Computational Challenge

Current solution

Input

Brand Filter
                                 WaveAdded  Status  IconCategory
Brand A  drphilshow     856
Brand B  X_16_pregnant  2        All        All     Personality

Demo Filter
QuarterlyWave  Age  Gender
All            All  All

Data File: #RError

Output

   Brand                  MI    BACI Top 1  BACI Top 2  BACI Top 3  Presence for row top 2 box for A
1  drphilshow             2.70  1000        1000        1000        10%
2  drphil                 1.05  678         692         772         3%
3  theoprahwinfreyshow    0.29  302         402         409         3%
4  thedrozshow            0.29  400         452         383         5%
5  rachaelrayshow         0.25  202         261         288         4%
6  oprahwinfrey           0.24  368         453         373         3%
7  own                    0.23  291         276         265         3%
8  livewithregisandkelly  0.23  220         216         314         3%

12

BAC Computational Challenge

Software currently used to solve the problem

Excel 2007

R version 2.12.2

RExcel

Many roads to Rome

4 CPUs on my computer. Access to web servers with lots of CPUs

Big data/parallel packages in R (ff, bigmemory, Revolution R Enterprise, snowball, etc.)
http://cran.r-project.org/web/views/HighPerformanceComputing.html

Amazon/Hadoop solutions (Cloud computing)

Parallel processing on a GPU


13

Work Resources

HP xw8600 Workstation

14


CUDA (Compute Unified Device Architecture)

C/C++ bindings and a driver for an Nvidia graphics card

OpenCL is the open-standard competitor

Good free support from Nvidia
http://www.nvidia.com/object/cuda_home_new.html

The Wikipedia page is clearer
http://en.wikipedia.org/wiki/CUDA

Navigate on this menu

15

CUDA

[Figure: taken from the Wikipedia page]

16

CUDA Installation (Windows)

CUDA-enabled GPU (anything better than a GeForce 8 with 256MB of local graphics memory)

Microsoft Windows XP, Vista or 7 or Windows Server 2003 or 2008

Most current Nvidia graphics card driver

CUDA software (CUDA Toolkit 3.2 and SDK)

Microsoft Visual Studio 2005 or 2008, or the corresponding versions of Microsoft Visual C++

Updating the Nvidia GPU drivers and installing the CUDA Toolkit is easy in Windows. Download the appropriate executables, double click and restart your computer a few times

266.45-Quadro-winxp-32bit-english-whql.exe
cudatoolkit_3.2.16_win_32.msi
gpucomputingsdk_3.2.16_win_32.exe

17
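The 256MB floor in the requirements above invites a quick check of whether the cMaps respondent-by-brand matrix (n = 27,785, m = 3,371) would fit in graphics memory at all. The sketch below is back-of-the-envelope arithmetic only. It assumes 256MB means 256 MiB, ignores everything else that has to live on the card (the output matrix, the display), and says nothing about the memory on any particular workstation. The helper names are hypothetical.

```python
N_RESPONDENTS = 27_785  # n from the BAC Computational Challenge slide
N_BRANDS = 3_371        # m from the same slide
GPU_BYTES = 256 * 1024 ** 2  # assumed: the 256MB floor read as 256 MiB


def matrix_bytes(rows, cols, bytes_per_cell):
    """Size of a dense rows x cols matrix in bytes."""
    return rows * cols * bytes_per_cell


def fits_on_gpu(bytes_per_cell, budget=GPU_BYTES):
    """True if the full respondent-by-brand matrix fits in the budget."""
    return matrix_bytes(N_RESPONDENTS, N_BRANDS, bytes_per_cell) <= budget


if __name__ == "__main__":
    for label, width in (
        ("1-byte integer", 1),
        ("4-byte integer or float", 4),
        ("8-byte double", 8),
    ):
        size = matrix_bytes(N_RESPONDENTS, N_BRANDS, width)
        print(f"{label}: {size / 1024 ** 2:.0f} MiB, fits: {fits_on_gpu(width)}")
```

Ratings from 0 to 7 fit in a single byte, and at that width the 93,663,235 cells take about 89 MiB. At 4 or 8 bytes per cell the matrix no longer fits under the assumed 256 MiB budget, so the storage type matters before any kernel is written.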




CUDA Installation

18




CUDA Installation

19




CUDA Example

20

R Packages for CUDA

gputools: http://brainarray.mbni.med.umich.edu/brainarray/rgpgpu/
CULA linear algebra library for the GPU
Windows binary not available

cudaBayesreg: Windows version of cudaBayesregData
GPU-enabled hierarchical regression models with Gibbs sampling
Windows binary not available

magma: http://icl.cs.utk.edu/magma/
Another linear algebra system
Windows binary not available

gcbd: GPU benchmarking for the Debian system
Windows binary not available

rgpu: Bioinformatics support. Not even in the CRAN repository

Why no Windows support?
http://cran.r-project.org/bin/windows/contrib/r-release/ReadMe

21

cMaps Solution

1. Don’t use R. Code everything in C/C++ (C/C#?). Link to Excel through VB
2. Compile gputools using Visual Studio. Use R and Excel
3. Install Linux. Use R. Front end unknown
4. Implement a C routine that R can call
5. Give up and hit the bar for a beer

Has anyone done this?

Poll: Which solution did I try?

22

cMaps Solution

23

No peeking . . .

cMaps Solution

PyCUDA
http://mathema.tician.de/software/pycuda

Installation process

Python(x,y): Windows binary installer
http://code.google.com/p/pythonxy/

PyCUDA: Windows binary installer
http://www.lfd.uci.edu/~gohlke/pythonlibs/

Pytools: Needed for PyCUDA. Not discovered until I ran a demo program.
Go to the folder with easy_install.exe on the command line (on my computer, C:\Python26\Scripts). Type “easy_install pytools”

Add the folder that contains “cl.exe” to your computer’s PATH variable (on my computer, C:\Program Files\Microsoft Visual Studio 9.0\VC\bin)

Another Linux person

24
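Two of the installation steps above (the missing pytools module and cl.exe not being on PATH) only surface when a demo program fails, so a small pre-flight check may save a round trip. The sketch below is one hypothetical way to do it with the standard library. The function name `check_pycuda_prereqs` is made up for illustration, and the calls used (`importlib.util.find_spec`, `shutil.which`) assume Python 3, not the Python 2.6 install shown on the slide.

```python
import importlib.util
import shutil


def check_pycuda_prereqs(modules=("pycuda", "pytools"), compiler="cl"):
    """Report which Python modules and which compiler can be found.

    Returns a dict mapping each name to True (found) or False (missing).
    Nothing is imported or executed; the check only looks things up.
    """
    report = {}
    for name in modules:
        report[name] = importlib.util.find_spec(name) is not None
    # On Windows, shutil.which("cl") also matches cl.exe via PATHEXT.
    report[compiler] = shutil.which(compiler) is not None
    return report


if __name__ == "__main__":
    for name, found in check_pycuda_prereqs().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

A "MISSING" line for `cl` points at the PATH step, and one for `pytools` points at the easy_install step.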



PyCUDA

25

Unfortunately, I ran out of time . . .



CUDA Exercise

26



CUDA Exercise

27

Graphic from: http://news.softpedia.com/news/CUDA-Software-Tools-Updated-Download-Here-92312.shtml


CUDA Exercise: Key Code

int N = 50000;

// Device code
__global__ void VecAdd(const float* A, const float* B, float* C, int N)
{
    // Global index of this thread across all blocks
    int i = blockDim.x * blockIdx.x + threadIdx.x;

    // Guard: 196 blocks x 256 threads = 50,176 threads, more than N = 50,000
    if (i < N) // Shouldn't N be controlled? tmpN = i+threadsPerBlock
        C[i] = A[i] + B[i];
}

// Invoke kernel
int threadsPerBlock = 256;
// Round up so that every element gets a thread
int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;
VecAdd<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_C, N);
// 196 blocks

28
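The index arithmetic in the key code can be checked without a GPU. The sketch below is a sequential Python emulation of the launch, not CUDA itself: the loops stand in for blocks and threads that would run in parallel on the device, and the names `launch_geometry` and `vec_add_emulated` are hypothetical.

```python
def launch_geometry(n, threads_per_block=256):
    """Return (blocks_per_grid, total_threads) for n elements."""
    blocks = (n + threads_per_block - 1) // threads_per_block  # round up
    return blocks, blocks * threads_per_block


def vec_add_emulated(a, b, threads_per_block=256):
    """Emulate the VecAdd launch one thread at a time.

    Returns the result vector and the number of threads that fell
    outside the data and were stopped by the `i < n` guard.
    """
    n = len(a)
    blocks, _ = launch_geometry(n, threads_per_block)
    c = [0.0] * n
    idle = 0
    for block_idx in range(blocks):
        for thread_idx in range(threads_per_block):
            i = threads_per_block * block_idx + thread_idx
            if i < n:
                c[i] = a[i] + b[i]
            else:
                idle += 1  # without the guard this would write past the array
    return c, idle


if __name__ == "__main__":
    n = 50000
    print(launch_geometry(n))
    result, idle = vec_add_emulated([1.0] * n, [2.0] * n)
    print(idle, result[0], result[-1])
```

For N = 50,000 the rounding-up division gives 196 blocks, as in the slide's comment, and the last block contains 176 threads with nothing to do. That is the reason for the `if (i < N)` test in the kernel.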

Helpful Resources

Intro videos: http://developer.nvidia.com/object/cuda_training.html#1

Book: CUDA by Example

Easy to register as a CUDA developer on the Nvidia website. I have not looked at the forums yet

Dr. Dobb’s CUDA articles: http://www.drdobbs.com/high-performance-computing/207200659;jsessionid=O3SEFNQ2NTT0BQE1GHPSKHWATMY32JVN

Theano: http://deeplearning.net/software/theano/

29

Conclusions and Further Topics

RExcel is a great R package

Linux appears to be where CUDA is at

Key to programming is to understand the two-dimensional grid on slide 27 (Map/Reduce functions, SQL calculus)

Memory management is still unknown to me. Step one is to copy the cMaps data into the GPU memory. Beyond that, there are different ways to handle the memory transfer

GPUs can handle concurrency

OpenCL?

Contact emails:
Personal: owill4 AT yahoo DOT com
Work: oliver DOT will AT ipsos DOT com

30