
PSZK | H-1149 Budapest, Buzogány utca 10-12. | Phone: (+36-1) 469-6692 | Fax: (+36-1) 469-6627 | www.bgf.hu





Department of Business Informatics (Közgazdasági Informatikai Intézeti Tanszék)

SYLLABUS


Data Science


Course title: Data Science
Course code:
Status:
Contact hours: 1 lecture + 2 practice per week
Credits: 5
Prerequisites:
Course unit leader:
Tutor(s): Attila Petróczi (petrocziattila@gmail.com)


Aims and objectives

To introduce the basics of the modern, data-driven business decision-making process. To arouse interest in system-theoretic thinking and in a systems approach to knowledge and analysis processes. To acquaint students with the advantages of software-supported analysis in a business environment, and to discuss the advantages and disadvantages of data-mining models. Because decision making in real business life is usually based on data analysis, students should be familiar with data gathering, data cleansing, data storage and data analysis methods and software.

The aim is to provide students with usable and applicable data analysis methods and data-mining software skills.


Learning outcomes

The student will be able to solve complicated analysis tasks independently. The student will have appropriate knowledge of the analysis process and a toolset for solving problems in business situations.




PSZK | H-1149 Budapest, Buzogány utca 10-12. | Telefon: (+36-1) 469-6600 | Fax: (+36-1) 469-6610 | www.bgf.hu






Methodology


This subject underlines the distinctive role of lectures, which will be held each week during the term. These lectures are intended to convey basic theoretical knowledge. Lectures are followed by practice sessions, where students gain practical knowledge, IT usage competences and problem-solving experience in business analysis and data-mining. These sessions also develop competence in using data-mining models and software.




Course schedule

Consultations (semester weeks) | Topic

1st-2nd
Lecture: Elements: Business Intelligence, Data-mining, Statistics, Data Science, Business Analyst, Data Scientist.
Practice: Introduction to the programming environment, Unix commands, AWK scripting language.

3rd-4th
Lecture: Analysis processes and methods. Data structures, Streaming API, Data Cleansing.
Practice: Text file processing and cleansing in Python. JSON objects in practice.

5th-6th
Lecture: Database, Data Warehouse, NoSQL. Relational algebra and SQL. Map-Reduce and Hadoop basics.
Practice: The connection between relational algebra and SQL. Map-Reduce implementation in Python.

7th-8th
Lecture: Basics of Data-mining, data-mining software and frameworks. Characterizing data.
Practice: Characterizing data using software: descriptive statistics, charts, statistical tests.

9th-10th
Lecture: Data preparation.
Practice: Data preparation with data-mining software.

11th-12th
Lecture: Data-mining models (clustering, classification, association). Case study.
Practice: Data-mining models in practice.










Course policies

Students are expected to attend lectures and carry out tasks during practice lessons.


Assignments

Exam requirements: the seminar grade is based on two compulsory written end-term exam papers, two end-term practice exams and one team project task.

To obtain a passing seminar grade, a student must achieve at least 61% of the combined score of the four end-term exams and at least 50% on the team project task.

The condition for signature: attending seminars and writing the two end-term papers.


Assessment and grading

The final mark will be composed of the five above-mentioned assignments.
Grading: the points (percentages) corresponding to marks from 1 to 5.


0-60%: Fail (1)
61-69%: Pass (2)
70-79%: Fair (3)
80-89%: Good (4)
90-100%: Excellent (5)
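
As a small illustration of the scale above, the mapping from a percentage to a mark could be sketched in Python as follows. The function name `mark_for` and the treatment of fractional percentages (each band starts at its listed lower bound, and anything below 61 counts as a fail) are assumptions made for this sketch, not part of the course regulations.

```python
# Hypothetical helper: band lower bounds are taken from the grading scale.
BANDS = [
    (90, 5, "Excellent"),
    (80, 4, "Good"),
    (70, 3, "Fair"),
    (61, 2, "Pass"),
]


def mark_for(percentage):
    """Return (mark, label) for a score between 0 and 100 percent."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Bands are ordered from highest to lowest, so the first match wins.
    for lower_bound, mark, label in BANDS:
        if percentage >= lower_bound:
            return mark, label
    # Assumption: every score under 61% is a fail.
    return 1, "Fail"
```

For example, `mark_for(75)` returns `(3, "Fair")` and `mark_for(60)` returns `(1, "Fail")`.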


Set readings

- The material for lectures and seminars.

Recommended readings

- Larose, Daniel T., Discovering Knowledge in Data: An Introduction to Data Mining, Wiley-Interscience, 2004.
- Larose, Daniel T., Data Mining Methods and Models, Wiley-Interscience, 2006.