Sand Box Assignment 3 Mini Project 2 - Python, Twitter and Data visualisation

peanutunderwear, Software & Software Engineering

7 Nov 2013 (4 years and 5 months ago)

The following assignment is based upon the Sandbox guest lectures on Python, Twitter and data
visualisation. This is a programming and analysis assignment. As such, you are expected to comment
your code liberally to describe what you are doing at each step, how you identify the sender of each
tweet, and how you avoid naming each tweet sender more than once.

You will be required to run a demonstration of your code and explain how it works as part of your

Issue Date 19 December 2012

Submission Date 22 January 2012

Resources for this assignment can be found on my webpage


Using the Twitter API and a Python script or otherwise, search for up to 1500 recent tweets
around a particular event or topic-based hashtag. (Try to identify a hashtag that has a "small"
number of participants (10s to 100s) who send tweets to each other at least some of the time.)

Examples might be hashtags on the following subjects:

Politics, Football fans, Online Debates, fan Reviews, events.
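As a rough illustration of collecting a capped sample, the sketch below gathers results page by page until 1500 tweets have been collected or the results run out. It does not call the Twitter API: `fetch_page` is a hypothetical callable standing in for whatever client you use, and `fake_fetch_page`, the `"sender"` and `"text"` keys, and the `#example` hashtag are invented purely so the sketch can run.

```python
MAX_TWEETS = 1500  # upper limit stated in the assignment


def collect_tweets(fetch_page, limit=MAX_TWEETS):
    """Gather tweets page by page until the limit is hit or results run out.

    fetch_page is a hypothetical callable: given a page number (starting
    at 1) it returns a list of tweets, or an empty list when there are no
    more results. Replace it with a real API call in your own script.
    """
    collected = []
    page = 1
    while len(collected) < limit:
        batch = fetch_page(page)
        if not batch:
            break  # no more results available
        collected.extend(batch)
        page += 1
    # The last page may overshoot the limit, so trim the list.
    return collected[:limit]


def fake_fetch_page(page):
    """Stand-in data source (assumption): three pages of 100 fake tweets."""
    if page > 3:
        return []
    start = (page - 1) * 100
    return [
        {"sender": "user%d" % (i % 7), "text": "tweet %d #example" % i}
        for i in range(start, start + 100)
    ]


tweets = collect_tweets(fake_fetch_page)
print(len(tweets))
```

Keeping the paging loop separate from the API call means the same loop can be exercised with fake data before it is pointed at a live service.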



Write a Python routine that will loop through each of the search results, identify the distinct users
who sent the tweets and a count of the number of tweets sent by each of them in the sample you

15 Marks
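One possible way to approach this counting step is sketched below using `collections.Counter` from the standard library. The sample data is made up, and the `"sender"` and `"text"` keys are assumptions; the field names in real search results will depend on the API response you receive.

```python
from collections import Counter

# Hypothetical sample: each search result is assumed to be a dict with
# "sender" and "text" keys. Real API responses will use other field names.
sample_tweets = [
    {"sender": "alice", "text": "@bob great match #example"},
    {"sender": "bob", "text": "@alice agreed #example"},
    {"sender": "alice", "text": "Half time already #example"},
    {"sender": "carol", "text": "@alice who scored? #example"},
]


def count_tweets_by_sender(tweets):
    """Return a Counter mapping each distinct sender to their tweet count."""
    counts = Counter()
    for tweet in tweets:
        # Each sender is a single key in the Counter, so a user who tweets
        # several times is still named only once.
        counts[tweet["sender"]] += 1
    return counts


tweet_counts = count_tweets_by_sender(sample_tweets)
for sender, count in sorted(tweet_counts.items()):
    print(sender, count)
```

Because the sender name is the dictionary key, the number of distinct users is simply `len(tweet_counts)`.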


Extend your program to print out (or save to a file) a sorted league table showing the top 10
tweeters, along with their rank position by tweet volume and a count of the number of tweets they

15 Marks
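A minimal sketch of such a league table follows, assuming the per-sender counts are already held in a `Counter`. The counts shown are invented, and the function names `league_table` and `format_table` are illustrative only.

```python
from collections import Counter

# Hypothetical per-sender tweet counts (made-up data for illustration).
tweet_counts = Counter({"alice": 7, "bob": 4, "carol": 4, "dave": 1})


def league_table(counts, top_n=10):
    """Return (rank, sender, count) rows, highest tweet volume first."""
    # Sort by count descending, then by name so ties appear in a fixed order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, sender, count)
        for rank, (sender, count) in enumerate(ordered[:top_n], start=1)
    ]


def format_table(rows):
    """Render the rows as aligned plain text, one tweeter per line."""
    lines = ["{:<6}{:<20}{}".format("Rank", "Tweeter", "Tweets")]
    for rank, sender, count in rows:
        lines.append("{:<6}{:<20}{}".format(rank, sender, count))
    return "\n".join(lines)


rows = league_table(tweet_counts)
print(format_table(rows))
```

In this sketch tied tweeters receive consecutive ranks rather than a shared rank; that is a design choice, and a shared-rank scheme would be equally defensible. To save the table instead of printing it, the string returned by `format_table` can be written to a file opened in text mode.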


Visualise this data using or a spreadsheet programme such as Google Sheets to
produce a bar chart showing the top 10 tweeters along with the number of tweets each of them

10 Marks
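If you take the spreadsheet route, one way to move the data across is to export the top tweeters as CSV and import that file into the spreadsheet programme to draw the chart. The sketch below assumes the league table is a list of (tweeter, count) pairs; the data and the column headings are invented for illustration.

```python
import csv
import io

# Hypothetical top tweeters as (name, tweet count) pairs.
top_tweeters = [("alice", 7), ("bob", 4), ("carol", 4)]


def write_chart_csv(rows, handle):
    """Write a header row and one row per tweeter to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["tweeter", "tweets"])
    writer.writerows(rows)


# An in-memory buffer is used here so the sketch leaves no file behind.
buffer = io.StringIO()
write_chart_csv(top_tweeters, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a real file, pass a handle from `open("top_tweeters.csv", "w", newline="")` instead of the buffer (the file name is only an example).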


Optional extension: Tweets that start with a Twitter ID may be thought of as "conversational",
specifically tweets directed at one user by another, who starts the tweet with that particular
user's name. For each "conversational" tweet, extract who sent the tweet and to whom, and
produce two more league tables:


showing an ordered ranking of who sent the most conversational tweets


showing an ordered ranking of who received the most conversational tweets.

20 Marks
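The extraction of sender and receiver could be sketched as follows, using a regular expression that matches an `@name` at the very start of the tweet text. The sample tweets and the `"sender"` and `"text"` keys are assumptions. Names are lower-cased before counting because Twitter usernames are not case sensitive.

```python
import re
from collections import Counter

# Matches "@name" only when it is the first thing in the tweet text.
LEADING_MENTION = re.compile(r"^@(\w+)")

# Hypothetical sample tweets (made-up data and field names).
sample_tweets = [
    {"sender": "alice", "text": "@bob great match #example"},
    {"sender": "bob", "text": "@alice agreed #example"},
    {"sender": "alice", "text": "Half time already #example"},
    {"sender": "carol", "text": "@alice who scored? #example"},
    {"sender": "alice", "text": "@Bob see you next week #example"},
]


def conversational_pairs(tweets):
    """Return a (sender, receiver) tuple for each conversational tweet."""
    pairs = []
    for tweet in tweets:
        match = LEADING_MENTION.match(tweet["text"])
        if match:  # tweets that do not start with @name are skipped
            pairs.append((tweet["sender"].lower(), match.group(1).lower()))
    return pairs


pairs = conversational_pairs(sample_tweets)
sent_counts = Counter(sender for sender, _ in pairs)
received_counts = Counter(receiver for _, receiver in pairs)

print("Most conversational tweets sent:", sent_counts.most_common(10))
print("Most conversational tweets received:", received_counts.most_common(10))
```

A mention that appears later in the text is deliberately ignored here, since the brief defines a conversational tweet as one that starts with the user's name.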


Identify how many unique (sender, receiver) pairings there are in your sample (a weak measure of
conversational diversity), and produce an ordered league table that displays how many times each
conversational pairing was observed in the sampled dataset.

20 Marks
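Counting the pairings might look like the sketch below, assuming the (sender, receiver) tuples have already been extracted. The pairs listed are invented. The sketch treats a pairing as directed, so (alice, bob) and (bob, alice) count as two different pairings; that is an assumption rather than something the brief specifies.

```python
from collections import Counter

# Hypothetical (sender, receiver) pairs taken from conversational tweets.
pairs = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
    ("alice", "bob"),
]

# Tuples are hashable, so each distinct pairing becomes one Counter key.
pair_counts = Counter(pairs)
unique_pairings = len(pair_counts)

# Order by frequency (highest first), then alphabetically to settle ties.
ordered_pairs = sorted(pair_counts.items(), key=lambda item: (-item[1], item[0]))

print("Unique pairings:", unique_pairings)
for rank, ((sender, receiver), count) in enumerate(ordered_pairs, start=1):
    print(rank, sender, "->", receiver, count)
```

If direction should not matter, each pair could be stored as `tuple(sorted(pair))` before counting, which would merge the two directions into one pairing.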