Throwing the Net and Catching
Table of contents
Amazon EC2 Cloud Computing
The Onion Router
The project's goal was to discover new attack trends via an application-level honeypot. In order to achieve this goal, we decided on two paths:
- Create a TOR server coupled with a sniffer, capturing all outgoing HTTP traffic.
- Deliberately infect a machine with a botnet Trojan, and log all traffic from the infected machine.
In order to cope with analyzing all the data we expected to collect, we devised an automatic tool to tag and filter the log files using a host-based filter, and display the results as HTML, allowing easier processing of the data.
“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources.” (American National Institute of Standards and Technology)
The Cloud Computing technology is based on five main principles:
- On-demand self-service: A user may at any time, without other human intervention, use available resources from the Cloud, such as data storage or processing.
- Broad network access: Access to the Cloud is not limited to a single ISP or limited by geographic location, but is available globally and from any device (e.g., mobile phones, laptops, and PDAs).
- Resource pooling: The user need not have any technical knowledge of the Cloud's implementation or even its geographic location, and the Cloud's resources are pooled to serve multiple users. Examples of resources include storage, processing, memory, network bandwidth, and virtual machines.
- Rapid elasticity: Computing resources must be quickly deployable, with the user having the option of quickly modifying the computing power received from the cloud. To the average user, the Cloud resources often appear unlimited.
- Measured service: Cloud systems have monitoring services, allowing users to pay only for consumed Cloud resources, such as computing power or storage.
[Figure: Abstract Cloud Diagram]
Amazon EC2 Cloud Computing
“Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud.”
For our project we needed a means of creating a honeypot, with the option of creating multiple honeypots for added visibility. This led us to implement our honeypot on Amazon's EC2 service.
By creating our probes in a cloud, we enjoyed the following benefits:
- No physical server is needed; everything is in the cloud.
- As we were handling Trojans and prone to hacker attacks, we could rapidly terminate and re-create our instances.
- By creating a 'snapshot' of the instance, we can easily back up our work.
- With the same 'snapshot' feature we can deploy as many probes as we like, determining their geographic location as well.
Through the EC2 management console we can browse our running instances, launch new instances, terminate them, configure security permissions, connect to them through Remote Desktop Protocol (RDP), and view their traffic usage.
Amazon Machine Images (AMIs) are snapshots of running instances. This feature allows us to save an instance's state, either as a backup or to be redeployed whenever required. We used this feature to save all our progress for future projects. To create a new AMI, a machine needs to be 'bundled'.
As well as instances, a storage device may be leased from Amazon. This 'removable' storage acts as an external hard drive which can be 'attached' to or 'detached' from any of your instances, in effect allowing easy data transfer between instances. This feature solved an interesting problem: when we infected an instance with a Trojan, the Trojan blocked all RDP connections to the instance, in effect denying us any contact with the machine. This stopped us from accessing the log files resulting from the instance. The workaround we found was redirecting Wireshark's log files to a volume; after a while we detached the volume and re-attached it to a different instance in order to read the log files.
EC2 allows you to determine the security settings for every instance, a feature not unlike a firewall with an IP table. The user may define several security groups, and under each group define the IP rules. Through the instance screen, the security groups are assigned to the different instances. This allows us to strengthen the likelihood of an attack on our honeypot.
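As an illustration, the kind of rule set a honeypot's security group might hold can be sketched as follows; the group contents, ports, and admin address range are our assumptions, not the project's actual configuration:

```python
# Hypothetical security-group rules for a honeypot instance.
# A honeypot deliberately leaves services reachable, so the rules
# are far more permissive than a production firewall would allow.
honeypot_rules = [
    {"proto": "tcp", "port": 3389, "cidr": "203.0.113.0/24"},  # RDP: admin subnet only
    {"proto": "tcp", "port": 80,   "cidr": "0.0.0.0/0"},       # HTTP: open to everyone
    {"proto": "tcp", "port": 9001, "cidr": "0.0.0.0/0"},       # Tor ORPort: open to everyone
]

def is_open_to_world(rules, proto, port):
    """Return True if any rule exposes the given protocol/port to all addresses."""
    return any(r["proto"] == proto and r["port"] == port and r["cidr"] == "0.0.0.0/0"
               for r in rules)
```

Keeping management ports (RDP) restricted while exposing the bait services is the balance such a rule set aims for.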
The Onion Router
Tor is a network of virtual tunnels that allows people and groups to improve their
privacy and security on the Internet.
When an internet user wants to hide his identity while surfing the web, he may use
the TOR client software.
Instead of taking a direct route from source to destination, data packets on the Tor network take a random pathway through several relays that cover their tracks, so no observer at any single point can tell where the data came from or where it's going.
The underlying assumption of our project is that amongst legitimate users, TOR is being used by nefarious groups or individuals in order to commit cybercrime while staying anonymous.

We set up a TOR server on Amazon's EC2 cloud computing service and installed Wireshark on it in order to capture all outgoing traffic, i.e. traffic that used our server as an exit node in the TOR network.
We wrote a batch script that utilizes several executables and XSLT files in order to automatically export the log files, first into an XML format and then into an HTML format, making the analysis process much easier.
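The export pipeline described above can be sketched as follows. We assume TShark's PDML export (`tshark -T pdml`) and the `xsltproc` processor; the stylesheet names are placeholders, not the project's actual file names:

```python
# Sketch of the log-export pipeline: capture file -> PDML -> XML -> HTML.
# Stylesheet names are illustrative placeholders.
def build_pipeline(capture_file):
    """Return the sequence of commands a batch script would run."""
    pdml = capture_file + ".pdml"
    xml = capture_file + ".xml"
    html = capture_file + ".html"
    return [
        # 1. Export the capture to PDML, TShark's XML packet format.
        ["tshark", "-r", capture_file, "-T", "pdml"],  # stdout redirected to the .pdml file
        # 2. Transform PDML into a simpler XML of 'packet' nodes.
        ["xsltproc", "-o", xml, "pdml2packets.xsl", pdml],
        # 3. Transform the 'packet' XML into a browsable HTML report.
        ["xsltproc", "-o", html, "packets2html.xsl", xml],
    ]
```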
The password for this instance is: h2YlscaDXcG.
This machine was also set up with Wireshark, and infected with a botnet.
The password for this instance is: oXFBdHBIxrY.
garbage: Compares an XML file with 'packet' nodes to a filter list and adds a 'valid' attribute to the 'host' field of the 'packet' nodes accordingly. Output is written into a new file named after the original file.
Usage: garbage XML_file filter_list
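A minimal sketch of the tagging step in Python; the 'packet' and 'host' node names come from the description above, while the direction of the tagging (hosts on the list are marked valid) is our assumption:

```python
import xml.etree.ElementTree as ET

def tag_packets(xml_text, filter_hosts):
    """Add a valid="true"/"false" attribute to each packet's 'host' field,
    depending on whether the host appears in the filter list (assumed semantics)."""
    root = ET.fromstring(xml_text)
    for packet in root.iter("packet"):
        host = packet.find("host")
        if host is not None:
            host.set("valid", "true" if host.text in filter_hosts else "false")
    return ET.tostring(root, encoding="unicode")
```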
The filtering executable iterates over an XML file with 'packet' nodes, and creates a new file that holds only the 'packet' nodes whose hosts passed the filter. Output is written into a new file named after the original file.
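The filtering step can be sketched similarly; we assume the 'packet' nodes are direct children of the root and that this step consumes the valid attribute added by the tagging step:

```python
import xml.etree.ElementTree as ET

def filter_packets(xml_text):
    """Keep only 'packet' nodes whose 'host' field was tagged valid="true"."""
    root = ET.fromstring(xml_text)
    for packet in list(root.findall("packet")):  # copy the list before removing
        host = packet.find("host")
        if host is None or host.get("valid") != "true":
            root.remove(packet)
    return ET.tostring(root, encoding="unicode")
```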
fix_xml: Fixes the encoding declaration in the given XML file so that the file can be parsed correctly. Output is written into a new file named after the original file. NOTE: this executable is a workaround for a bug in earlier versions of TShark; if you are using the latest version of TShark, this executable is superfluous and can (and should) be omitted.
Usage: fix_xml XML_file
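The essence of the encoding fix can be sketched as a one-line substitution; the assumption (based on the note above) is that old TShark wrote an XML declaration whose encoding attribute did not match the bytes it actually emitted:

```python
import re

def fix_encoding_declaration(xml_text, real_encoding="utf-8"):
    """Rewrite the encoding attribute of the XML declaration so that it
    matches the file's real encoding (assumed nature of the TShark bug)."""
    return re.sub(r'encoding="[^"]*"',
                  'encoding="%s"' % real_encoding,
                  xml_text, count=1)
```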
XSLT files used:
- One stylesheet converts the PDML files created by TShark to XML files that store the captured packets in the form of 'packet' nodes. Written by Amichai Shulman.
- A second stylesheet converts XML files with 'packet' nodes to an HTML file containing a table of links to the URLs that appeared in the 'packet' nodes, sorted by IP address.
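The effect of the second stylesheet can be sketched in Python rather than XSLT; the 'ip' and 'url' field names inside each 'packet' node are our assumptions:

```python
import xml.etree.ElementTree as ET

def packets_to_html(xml_text):
    """Render 'packet' nodes as an HTML table of URL links, sorted by IP."""
    root = ET.fromstring(xml_text)
    rows = sorted((p.findtext("ip", default=""), p.findtext("url", default=""))
                  for p in root.iter("packet"))  # lexicographic sort by IP string
    cells = "\n".join('<tr><td>%s</td><td><a href="%s">%s</a></td></tr>' % (ip, url, url)
                      for ip, url in rows)
    return "<table>\n%s\n</table>" % cells
```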
We noticed in log file 41 access to guestbooks of legitimate sites. The guestbooks were filled with comments, where each comment was actually a list of hyperlinks with tags and a comment about porn. The links led to pages where a script had uploaded a link to a “porn movie”; such links appear in forum posts or fake social profiles. When the “movie” is played, the browser is referred to another site where you are either told you need to download a codec to play the video (a Trojan), or, using scareware, you are told you have a virus on your machine and must install a new anti-virus.
Using a disassembler we took a close look at the Trojan file. Most of the code was encrypted and/or obfuscated; we saw a reference to crypt32.dll, and the code was mostly unreadable. The only readable part of the code was a request for [...].

Using a different EC2 instance, we infected the machine with the Trojan and used Wireshark to log the traffic. Analyzing the log files, we noted our machine contacted an IP address known as a malware server, and our Trojan was listed as a "fast-flux rogue anti-virus".
After contacting the IP above, our machine began contacting various sites in the US (Manhattan, San Francisco, etc.) and downloading encrypted data.
In another log file we found systematic downloading of complete real estate sites. This phenomenon is known as scraping. Scraping is the unauthorized extraction of information from another website. Automated bots can collect all of the exposed information on a site far faster than a manual copy-and-paste operation. Realtors can then post the data on their own sites and link the properties to their agents, as if they had been hired to market the properties. Essentially, scraping is copying the listings from one company and then displaying them as your own on your website.
In log file 15 we found systematic downloading of physician names as well as other details from state medical boards. This is the first step in an identity fraud scheme intended to defraud Medicare.
For further reading
Yahoo! brute force attack
One of the things apparent in virtually every log file was a widespread brute-force attack against Yahoo email users, which aimed to obtain login credentials and then use the hijacked accounts for spamming.
Yahoo Mail's main
login page utilizes a number of
security mechanisms to protect against brute force
attacks, including providing a generic "error" page that does not reveal whether it
was the username or password that the user got wrong. Also, Yahoo tracks the
number of failed login attempts and requires
that users solve a CAPTCHA if they
have exceeded a certain number of incorrect tries. Attackers have apparently found
a web service application used to authenticate Yahoo users that does not contain the
same security mechanisms. The application is an API used to authenticate ISP business partners of Yahoo.
The application gives detailed error messages when someone enters a wrong username and password, noting which was incorrect. Also, it does not utilize any CAPTCHA, enabling attackers to guess an unlimited number of times until they come up with the right credentials.
We managed to identify the botnet we had used for infection as belonging to the Zeus botnet, at the time infecting around 1,000 computers worldwide. Our machine failed to contact its C&C, probably due to a takedown of Zeus servers a month before we used the botnet.
One of the initial goals of our project was to deploy several TOR probes, allowing truer visibility of TOR usage. From our single TOR probe we accumulated ~10MB of data every ten minutes, meaning ~1.4GB of data per day. The enormous amount of data we collected made sieving through it for attack vectors very difficult. Even after applying our filtering tool, the process remained time-consuming.

We suggest a different approach in order to circumvent this problem. The idea is to limit the scope. Instead of looking at all traffic while excluding specific sites, a better idea might be to exclude all sites except for a chosen few in which we are interested. This way, we can deploy multiple probes while collecting only pertinent data.
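The suggested change amounts to flipping the host filter from a blacklist to a whitelist; the function names below are illustrative:

```python
def keep_packet_blacklist(host, excluded_hosts):
    """Original approach: keep everything except known-irrelevant hosts."""
    return host not in excluded_hosts

def keep_packet_whitelist(host, interesting_hosts):
    """Suggested approach: keep only traffic to a chosen few sites."""
    return host in interesting_hosts
```

With the whitelist variant, each additional probe adds only traffic for the sites under study, so the per-day volume stays bounded even as probes multiply.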