Slide 1


Feb 16, 2014


PROJECT

Topics


Theoretical:


Error Performance Analysis for Partitioned Sketch Data
Structures


Survey:


Security and Privacy for Big Data: A Survey and Future
Directions


Experiments:


Citizen Behavior of 7-21 Storm in Beijing, 2012


Music Knowledge Mining


Hadoop for Video Streaming on the Web


MapReduce Jobs For Video Conversion


Your proposed one…

1. Error Performance Analysis for Partitioned Sketch Data Structures


We talked about the time complexity already (in terms of
update time)


TASK:


What about error performance?


How to optimally allocate the depth of each sketch (Zipfian)?


Start by learning how the CM sketch paper analyzes its error
performance (Theorem 1 and the like)


http://dimacs.rutgers.edu/~graham/pubs/papers/cm-full.pdf


Learn about P(d)-CU


http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6574663

How to determine this?

Result


Analysis (e.g., mathematical derivations)


Some initial simulation (correctness)
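
As a starting point for the initial simulation, here is a minimal sketch of a plain (unpartitioned) Count-Min sketch checked against the bound in Theorem 1 of the CM sketch paper: with width w = ceil(e/epsilon) and depth d = ceil(ln(1/delta)), the estimate never undercounts, and it overcounts by more than epsilon times the stream length with probability at most delta. The class name, the hash family, the Zipfian stream generator and all parameter values below are assumptions chosen for illustration; they are not taken from the course material or from the P(d)-CU paper.

```python
import math
import random


class CountMinSketch:
    """Plain Count-Min sketch: `depth` rows of `width` counters each."""

    def __init__(self, epsilon, delta, seed=0):
        # Sizes from the standard analysis: w = ceil(e / eps), d = ceil(ln(1 / delta)).
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        self.prime = (1 << 61) - 1  # large prime for the (a * x + b) mod p hash family
        rng = random.Random(seed)
        self.hash_params = [
            (rng.randrange(1, self.prime), rng.randrange(0, self.prime))
            for _ in range(self.depth)
        ]
        self.table = [[0] * self.width for _ in range(self.depth)]
        self.total = 0  # stream length, i.e. the L1 norm of the count vector

    def _bucket(self, row, item):
        a, b = self.hash_params[row]
        return ((a * item + b) % self.prime) % self.width

    def update(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._bucket(row, item)] += count
        self.total += count

    def estimate(self, item):
        # The minimum over rows is the least-contaminated counter.
        return min(
            self.table[row][self._bucket(row, item)] for row in range(self.depth)
        )


def zipf_stream(n_items, length, skew, seed=1):
    """Draw `length` integer items with probability proportional to 1 / rank**skew."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** skew) for rank in range(1, n_items + 1)]
    return rng.choices(range(n_items), weights=weights, k=length)


def run_experiment(epsilon=0.01, delta=0.01):
    """Return (min error, max error, eps * N bound, fraction of items above the bound)."""
    sketch = CountMinSketch(epsilon, delta)
    truth = {}
    for item in zipf_stream(n_items=1000, length=20000, skew=1.1):
        sketch.update(item)
        truth[item] = truth.get(item, 0) + 1
    errors = [sketch.estimate(item) - count for item, count in truth.items()]
    bound = epsilon * sketch.total
    violations = sum(1 for err in errors if err > bound)
    return min(errors), max(errors), bound, violations / len(errors)
```

Calling run_experiment() with different epsilon and delta values gives a first empirical check of the derived error bound. A partitioned variant could reuse the same class with one instance per partition and a different depth for each, which is where the depth-allocation question comes in.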

2. Survey


Write a good survey in English on


Security and Privacy for Big Data: A Survey and Future
Directions


Cite at least 40 references (IEEE Xplore and ACM Digital Library)


Paper organization


Classify these works in different categories, from different
angles


Extensive comparisons


Identify future directions (i.e., what are the missing
pieces?)

Some Materials


http://www-03.ibm.com/security/solution/intelligence-big-data/


https://ssl.www8.hp.com/ww/en/secure/pdf/4aa4-4051enw.pdf


http://www.emc.com/collateral/industry-overview/big-data-fuels-intelligence-driven-security-io.pdf


http://www.isaca.org/Groups/Professional-English/big-data/GroupDocuments/Big_Data_Top_Ten_v1.pdf


http://www.trendmicro.com/cloud-content/us/pdfs/business/white-papers/wp_addressing-big-data-security-challenges.pdf


http://scholarlycommons.law.northwestern.edu/njtip/vol11/iss5/1/



Think about:


Storage


Analysis


Applications


Cloud, Internet-of-Things

3. Analyze Citizen Behaviors of 7-21 Storm in Beijing, 2012


The Power of Social Networks and Public Crowd


http://v.youku.com/v_show/id_XNDM5NjY1Mzc2.html


Use social network APIs such as Sina Weibo's


open.weibo.com/wiki


Use the keyword search to retrieve all related data


#望京人赴机场免费救援# (Wangjing residents offering free rescue rides to the airport)

#双闪车队# (hazard-light convoy) (100+)


菠菜X6

@望京网
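
Once posts have been retrieved, the keyword step can be sketched as below: group posts by the hashtags listed above and count posts per user. This is only an illustration on records that are already downloaded. The record fields "user" and "text" are hypothetical and do not reflect the actual Weibo API response format, which should be checked at open.weibo.com/wiki.

```python
from collections import Counter

# Hashtags taken from the slide; extend this list as new keywords are found.
KEYWORDS = ["#望京人赴机场免费救援#", "#双闪车队#"]


def match_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in a post's text."""
    return [kw for kw in keywords if kw in text]


def group_by_keyword(posts, keywords=KEYWORDS):
    """Map each keyword to the list of posts that mention it.

    `posts` is an iterable of dicts with hypothetical "user" and "text" fields.
    """
    grouped = {kw: [] for kw in keywords}
    for post in posts:
        for kw in match_keywords(post["text"], keywords):
            grouped[kw].append(post)
    return grouped


def posts_per_user(posts):
    """Count how many posts each user contributed."""
    return Counter(post["user"] for post in posts)
```

Grouping first and counting second keeps the per-keyword data available for later analysis, for example time series of posts or repost chains.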


4. Music Knowledge Mining


Million Song Dataset


http://labrosa.ee.columbia.edu/millionsong


Example: calculating music density


http://musicmachinery.com/2011/09/04/how-to-process-a-million-songs-in-20-minutes/



YOUR TASK: Predict which songs a user will listen to


http://www.kaggle.com/c/msdchallenge
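
A simple baseline for the prediction task is to recommend the globally most popular songs that a user has not listened to yet. The sketch below assumes the listening data is available as (user, song, play count) triplets. The function names, and the choice to measure popularity by the number of distinct listeners, are illustrative assumptions and not part of the challenge definition.

```python
from collections import Counter, defaultdict


def build_history(triplets):
    """Build per-user listening sets and a global popularity count.

    `triplets` is an iterable of (user, song, play_count) tuples.
    Popularity is the number of distinct users who played a song.
    """
    history = defaultdict(set)
    popularity = Counter()
    for user, song, _play_count in triplets:
        if song not in history[user]:
            history[user].add(song)
            popularity[song] += 1
    return history, popularity


def recommend(user, history, popularity, k=3):
    """Return up to k of the most popular songs the user has not heard yet."""
    seen = history.get(user, set())
    ranked = [song for song, _ in popularity.most_common() if song not in seen]
    return ranked[:k]
```

A baseline like this gives a reference score that any more elaborate method, such as collaborative filtering, should beat in the performance evaluation.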


5. Video Streaming on the Web


Store your video as chunks in HDFS


Case: the user suddenly jumps to a specific part of the video


Seek in the file to position the cursor at a specific location


HDFS can only be accessed through a Hadoop client, and the Apache
web server is not one.


Apache/FUSE: all file system operations (dir browsing, file
opening and content access) are enabled over HDFS content
through the FUSE interface.


http://internetmemory.org/en/index.php/synapse/using_hadoop_for_video_streaming/
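
When HDFS is mounted through FUSE, a video file can be read with ordinary file operations, so the seek case above reduces to positioning the cursor and reading a byte range. The following is a minimal sketch under that assumption. The mapping from playback time to byte offset assumes a constant bitrate, which is a simplification: a real player would seek to a keyframe using the container's index. Both function names are hypothetical.

```python
import os


def read_range(path, offset, length):
    """Read `length` bytes starting at byte `offset` of the file at `path`."""
    with open(path, "rb") as f:
        f.seek(offset)  # position the cursor at the requested location
        return f.read(length)


def offset_for_time(path, seconds, duration_seconds):
    """Approximate the byte offset for a playback time.

    Assumes a constant bitrate, so the offset is proportional to the time.
    The fraction is clamped to [0, 1] to stay inside the file.
    """
    size = os.path.getsize(path)
    fraction = min(max(seconds / duration_seconds, 0.0), 1.0)
    return int(size * fraction)
```

For example, with a hypothetical mount point, read_range("/mnt/hdfs/videos/demo.flv", offset, 65536) would fetch one 64 KB piece starting at the computed offset.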


Result


A demo


Choose at least one video format (e.g., flv)


A client to play video


A web server (Apache with FUSE)


HDFS to store your videos

6. MapReduce For Video Conversion


Convert a huge number of video files from one format
to another.


Use the open-source video converter FFmpeg
(http://ffmpeg.org/download.html).


Data stored on HDFS


Create an app that does this (running on Google App Engine)
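
One possible shape for the conversion job is a map-only task in the style of Hadoop Streaming: each input line names one video file, and the mapper runs FFmpeg on it and emits a tab-separated status record. The sketch below rests on several assumptions: the input paths are readable as local files (for example through a FUSE mount of HDFS), the ffmpeg binary is installed on every worker, and the output directory name and target format are placeholders. All function names are hypothetical.

```python
import os
import subprocess


def output_path(input_path, target_ext, out_dir):
    """Derive the converted file's path from the input file name."""
    base = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(out_dir, base + "." + target_ext)


def build_command(input_path, target_ext, out_dir):
    """Build the FFmpeg command line; -y overwrites an existing output file."""
    return ["ffmpeg", "-y", "-i", input_path,
            output_path(input_path, target_ext, out_dir)]


def run_mapper(lines, target_ext="mp4", out_dir="converted"):
    """Convert one video per input line and print 'path<TAB>status'."""
    os.makedirs(out_dir, exist_ok=True)
    for line in lines:
        path = line.strip()
        if not path:
            continue  # skip blank input lines
        result = subprocess.run(
            build_command(path, target_ext, out_dir),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        status = "ok" if result.returncode == 0 else "failed"
        print(path + "\t" + status)
```

As a streaming mapper, the script would call run_mapper(sys.stdin) after importing sys. Letting FFmpeg infer the output format from the file extension keeps the command short; explicit codec options can be added to build_command if the defaults are not suitable.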

Mechanism


Work in groups: 3-5 students, with clear roles


Email me (ase_bit@yahoo.com) by this Friday (Nov 22)


Team leader, Team members


Topic


Deadline: 28 December 2013!


Deliverable: project report in Chinese


Introduction (motivation, WHY?)


Related Work (What others have done)


Your proposal (HOW?)


Performance Evaluation


Conclusion


Presentation

Suggested Arrangement


Week-1: Define your roles and start literature research


Week-2 and 3: Propose solutions


Week-4 and 5: Implement and obtain results


Week-6: Write report