How to download and load CSO data into PostgreSQL

outstandingmaskΔιαχείριση Δεδομένων

29 Νοε 2012 (πριν από 4 χρόνια και 10 μήνες)

263 εμφανίσεις





Lab 8
b


This note describes how to load the
cso
cars
.csv

and csoedu.csv

file
s

into
dublin_eds table
.

You should only use this lab on your own machine.

Do
not

do this lab in the college labs.


Note the data from he
cso
cars
.csv

does

not contain geometry
. C
olumns

from the
cso
cars
.csv

need

be loa
ded into
dublin_
eds

(which must already be
loaded into
PostgreSQL
)
.

If
dublin_
eds

is not loaded, then load it with the following
command:




The same basic technique is used to load other CSO files.




CREATE TAB
LE cso
cars

(geographic_area character varying(50),

no_motor_car integer
,

one_motor_car

integer,

two_motor_cars
integer,

three_or_more_motor_cars integer,

total
integer);



COPY
csocars

FROM

E'
\
C:
\
\
Program Fil
es
\
\
PostgreSQL
\
\
8.4
\
\
bin
\
\
csocars
.csv'

WITH CS
V;


If you are using Windows
-
7 or Vista

and you install
ed PostgreSQL/PostGIS in
Program Files (x86)

then should replace
Program Files

in the path
above with
Program Files (x86)
.
For the difference between "
Program
Files
" folder & the "
Program Files (x86)
"
folder see:

http://answers.microsoft.com/en
-
us/windows/forum/windows_7
-
files/programs
-
files
-
vs
-
program
-
files
-
x86/7d631676
-
e688
-
472a
-
a1bb
-
eefccfeca6d7


If you are using Windows
-
7 or Vista

you could copy the CSV file in the table as
follows:

COPY csoedu FROM

E'
\
C:
\
\
Program Files (x86)
\
\
PostgreSQL
\
\
8.4
\
\
bin
\
\
csoedu.csv'

WITH CSV;


The following updates cop
y the car data from

the

csocars

t
able t
o

the

dublin_eds

table.


update dublin_eds set no_car15_1 = ( select no_motor_car from csocars
where saps_label = geographic_area);


update dublin_eds set one_car15_ = ( select one_motor_car from csocars where
saps_
label = geographic_area);


update dublin_eds set two_cars15


= ( select two_motor_cars from csocars where saps_label = geographic_area);



update dublin_eds set three_or_m


= ( select three_or_more_motor_cars from csocars where saps_label = geographic_area
);









You can do the same for
the education data from the CSO
. The names in the
dublin_eds

table and the csoedu.csv are listed below:

Names in dublin_eds DB table

Names in
CSOEDU
.CSV

saps_label

formal_edu


primary_ed


lower_seco


upper_seco


technic
al1


upper_s_01


non_degree


primary_de


profession


both_degre


postgradua


postgra_01


doctorate1


not_stated



Geographic Area

No formal education

Primary education

Lower secondary education

Upper secondary

Techni
cal or vocational qualification

Upper se
condary and technical or

vocational

Non
-
degree

Primary degree

Professiona
l qualification (degree status)

Both degree

and professional qualification

Post
-
graduate certificate or diploma

Post
-
graduate degree (masters)

Doctorate (PhD)

Not stated

Total




N
ote to make a database table from csoedu.csv you will need to remove the brackets
and replace spaces with underscores as in the cars example above.

To help speed
things up I have supplied the details of adding the education
al

data to
PostgreSQL/PostGIS. Th
e procedure is the same as for the
cars

above.


1)

C
reate an empty table for the CSO educational data called
csoedu
.


2)

L
oad the data from the CSV (comma
separated

values) file called csoedu.csv
(which was download
ed

from the CSO web site and is availa
ble on the course web
page).

We use the SQL COPY command for this.


3)
Move the data in the
csoedu

table to the

dublin_eds

table.


Here are the details. You should be able to copy and paste the following SQL.



1) Create an empty
csoedu

table

CREATE TABLE

csoedu (

geo
graphic_area character varying(8
0),

no_formal_education integer,

primary_education integer,

lower_secondary_education integer,

upper_secondary integer,

technical_or_vocational_qualification integer,

upper_secondary_and_technical_or_vocationa
l integer,

non_degree integer,

primary_degree integer,

professional_qualification_degree_status integer,

both_degree_and_professional_qualification integer,

post_graduate_certificate_or_diploma integer,

post_graduate_degree_masters integer,

doctorat
e_phd integer,

not_stated integer
,

total integer
);


2
) C
opy the data from
csoedu.csv

file to the

csoedu

table

COPY
csoedu

FROM

E'
\
C:
\
\
Program Fil
es
\
\
PostgreSQL
\
\
8.4
\
\
bin
\
\
csoedu
.csv'

WITH CSV;

Check that the data has loaded

select geographic_area,prima
ry_degree from csoedu;

selec
t geographic_area,total

from csoedu;





2
) C
opy the data from the

csoedu

table

to the dublin_eds table. This can ne
done one column at a time (there are quicker ways)
. The table above was used to
match the column names in the
CSO data with those in the
dublin_eds

table.


update dublin_eds set
formal_edu

= (
select no_formal_education from csoedu

where
saps_label = geographic_area);


update dublin_eds set primary_ed = ( select primary_education from csoedu where
saps_label = geo
graphic_area);


update dublin_eds set
lower_seco

= ( select
lower_secondary_education

from csoedu
where saps_label = geographic_area);


update dublin_eds set upper_seco = ( select upper_secondary from csoedu where
saps_label = geographic_area);


update dub
lin_eds set technical1 = ( select technical_or_vocational_qualification from
csoedu where saps_label = geographic_area);



update dublin_eds set upper_s_01

= ( select upper_secondary_and_technical_or_vocational from csoedu where saps_label =
geographic_are
a);


update dublin_eds set non_degree

= ( select
non_degree

from csoedu where saps_label = geographic_area);


update dublin_eds set primary_de

= ( select primary_degree from csoedu where saps_label = geographic_area);


update dublin_eds set
profession
= (
select
professional_qualification_degree_status

from csoedu where saps_label = geographic_area);



update dublin_eds set both_degre = ( select both_degree_and_professional_qualification
from csoedu where saps_label = geographic_area);


update dublin_eds se
t postgradua = ( select post_graduate_certificate_or_diploma from
csoedu where saps_label = geographic_area);



update dublin_eds set postgra_01


= ( select post_graduate_degree_masters from csoedu where saps_label =
geographic_area);


update dublin_eds se
t
doctorate1


= ( select
doctorate_phd

from csoedu where saps_label = geographic_area);


update dublin_eds set not_stated


= ( select not_stated from csoedu where saps_label = geographic_area);