
Throwing the Net and Catching Hackers

Final Report

Winter 2009-2010


Submitted by:
Shalev Mintz
Ori Rezen

Supervisor:
Amichai Shulman

Table of contents

Overview
Cloud Computing - Introduction
Amazon EC2 Cloud Computing
EC2 - Main Features
TOR - The Onion Router
Project Architecture
Results
Conclusion

Overview

Our project's goal was to discover new attack trends via an application-level honey-pot. In order to achieve this goal, we decided on two paths:

• Create a TOR server coupled with a sniffer, capturing all outgoing HTTP traffic.

• Deliberately infect a machine with a botnet Trojan, and log all traffic from the machine.

In order to cope with analyzing all the data we expected to collect, we devised an automatic tool to tag and filter the log files with a host-based filter and display the results as HTML pages, allowing easier processing of the data.



Cloud Computing - Introduction

“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources.”
- U.S. National Institute of Standards and Technology (NIST)

The Cloud Computing technology is based on five main principles:

1. On-demand self-service - A user may at any time, without other human intervention, use available resources from the Cloud, such as data storage or server time.

2. Broad network access - Access to the Cloud is not limited to a single ISP or limited by geographic location, but is available globally and from any device (e.g., mobile phones, laptops, and PDAs).

3. Resource pooling - The user need not have any technical knowledge of the Cloud’s implementation or even its geographic location, and the Cloud’s resources are pooled to serve multiple users. Examples of resources include storage, processing, memory, network bandwidth, and virtual machines.

4. Rapid elasticity - Computing resources must be quickly deployable, with the user having the option of quickly modifying the computing power received from the cloud. To the average user, the Cloud resources often appear unlimited.

5. Measured Service - Cloud systems have monitoring services, allowing users to pay only for consumed Cloud resources, such as by computing power or storage size.

[Figure: Abstract Cloud Diagram]



Amazon EC2 Cloud Computing


“Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud.”
- aws.amazon.com/ec2/

For our project we needed a means of creating a honey-pot, with the option of creating multiple honey-pots for added visibility. This led us to implement our honey-pot on Amazon’s EC2 service.

By creating our probes in a cloud, we enjoyed the following benefits:

• No physical server is needed; everything is in the cloud.

• As we were handling Trojans and were prone to hacker attacks, we could rapidly terminate and re-create our instances.

• By creating a ‘snapshot’ of the instance, we can easily back up our work.

• With the same ‘snapshot’ feature we can deploy as many probes as we like, determining their geographic location as well (a minimal launch sketch is shown below).
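
For illustration only, a probe like ours could be launched from a saved AMI with a few API calls. The sketch below uses the modern boto3 Python SDK, which is not what we used (we worked through the EC2 web interface described in the next section); the region, AMI, and security group identifiers are hypothetical placeholders.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region chosen per probe location

# Launch one new probe from a previously bundled AMI (all IDs below are hypothetical).
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",            # the bundled probe image
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    SecurityGroupIds=["sg-0123456789abcdef0"],  # a pre-defined security group
)
print("Launched probe:", response["Instances"][0]["InstanceId"])

Choosing the region per call is how the geographic location mentioned above would be determined.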


[Figure: EC2 Interface]




EC2 - Main Features

Instances

Through this menu we can browse our running instances and launch new instances,
terminate them, configure security permissions, connect to them through Remote
Desktop Protocol (RDP), and view their traffic usage.


AMI

Amazon Machine Images (AMI) are snapshots of running instances. This feature allows us to save an instance’s state, either as backup or to be redeployed whenever required. We used this feature to save all our progress for future projects. To create a new AMI, a machine needs to be ‘bundled’.


Volume

As well as instances, a storage device may be leased from Amazon. This ‘removable’ storage acts as an external HD which can be ‘attached’ to or ‘detached’ from any of your instances, allowing, in effect, easy data transfer between instances.

This feature solved an interesting problem: when we infected an instance with a Trojan, the Trojan blocked all RDP connections to the instance, in effect denying us any contact with the machine. This stopped us from accessing the log files Wireshark was creating on the instance. The workaround we found was redirecting Wireshark’s log files to a volume; after a while we detached the volume and re-attached it to a different instance in order to read the log files.
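
In API terms, that workaround amounts to a detach followed by an attach. The project performed these steps manually; the boto3 calls below are only an illustration, and the volume, instance IDs, and device name are hypothetical.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

VOLUME_ID = "vol-0123456789abcdef0"        # volume holding the Wireshark log files
INFECTED_INSTANCE = "i-0aaaaaaaaaaaaaaaa"  # instance we could no longer reach over RDP
CLEAN_INSTANCE = "i-0bbbbbbbbbbbbbbbb"     # instance used to read the logs

# Detach the log volume from the unreachable (infected) instance...
ec2.detach_volume(VolumeId=VOLUME_ID, InstanceId=INFECTED_INSTANCE, Force=True)
ec2.get_waiter("volume_available").wait(VolumeIds=[VOLUME_ID])

# ...and re-attach it to a clean instance, where the log files can be read.
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=CLEAN_INSTANCE, Device="/dev/sdf")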


Security Groups

EC2 allows you to determine the security policy for every instance, a feature very much like a firewall with an IP table. The user may define several security groups, and under each group define the IP rules. Through the instance screen, the security groups are assigned to the different instances. This helps reduce the likelihood of an instance ‘hijack’.


TOR - The Onion Router

“Tor is a network of virtual tunnels that allows people and groups to improve their privacy and security on the Internet.” - www.torproject.org

When an internet user wants to hide his identity while surfing the web, he may use the TOR client software. Instead of taking a direct route from source to destination, data packets on the Tor network take a random pathway through several relays that cover one’s tracks, so no observer at any single point can tell where the data came from or where it’s going.

The underlying assumption of our project is that amongst legitimate users, TOR is also being used by nefarious groups or individuals in order to commit cybercrime while staying anonymous.


Project Architecture

We set up a TOR server on Amazon's EC2 cloud computing service and installed Wireshark on it in order to capture all outgoing traffic, i.e., traffic that used our server as an exit node in the TOR network.

We wrote a batch script that utilizes several executables and XSLT files in order to automatically export the log files, first into an XML format and then to HTML format, making the analysis process much easier. The password for this instance is: h2YlscaDXcG.
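
The report does not reproduce the exact capture and export commands. As a sketch, assuming T-shark is installed and the capture file is named capture.pcap (both names are placeholders), the export to PDML, the XML form that the XSLT files described below expect, could look like this:

import subprocess

# Export a Wireshark capture to PDML. File names are placeholders; the actual
# batch script and file names used in the project are not reproduced here.
with open("capture.pdml", "w", encoding="utf-8") as pdml:
    subprocess.run(["tshark", "-r", "capture.pcap", "-T", "pdml"],
                   stdout=pdml, check=True)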


Another virtual machine was also set up with Wireshark, and infected with a botnet. The password for this instance is: oXFBdHBIxrY.



Executables used



• Filter-garbage.exe - Compares an xml file with 'packet' nodes to a pre-defined filter-list and adds a valid attribute to the 'host' field of the 'packet' nodes accordingly. Output is written into a new file named XML_file_filtered, where XML_file is the name of the original file. (A Python sketch of this filtering logic follows this list.)
Usage: Filter-garbage XML_file filter_list

• remove-non-valid.exe - Iterates over an XML with 'packet' nodes, and creates a new file that holds only the 'packet' nodes whose hosts passed the filter-list. Output is written into a new file named XML_file_only_valid.xml, where XML_file is the name of the original file.
Usage: remove-non-valid XML_file

• fix_xml.exe - Fixes the encoding definition in the given xml file, to comply with UTF-8. Output is written into a new file named XML_file_fixed, where XML_file is the name of the original file.
NOTE: this executable is a workaround for a bug in earlier versions of T-shark. If you are using the latest version of T-shark, this executable is superfluous and can (and should) be omitted.
Usage: fix_xml XML_file
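
For clarity, the host-filtering step performed by Filter-garbage and remove-non-valid can be sketched in Python roughly as follows. The XML layout (a 'host' child directly under each 'packet' node, and 'packet' nodes directly under the root) and the filter-list format (one unwanted host per line) are assumptions based on the descriptions above, not on the executables themselves.

import xml.etree.ElementTree as ET

def filter_garbage(xml_file, filter_list):
    """Mark each packet's 'host' field valid/invalid against a list of unwanted hosts."""
    unwanted = {line.strip() for line in open(filter_list) if line.strip()}
    tree = ET.parse(xml_file)
    for packet in tree.iter("packet"):
        host = packet.find("host")
        if host is not None:
            host.set("valid", "false" if (host.text or "") in unwanted else "true")
    out = xml_file + "_filtered"          # output naming follows the description above
    tree.write(out, encoding="utf-8")
    return out

def remove_non_valid(xml_file):
    """Keep only 'packet' nodes whose host passed the filter-list."""
    tree = ET.parse(xml_file)
    root = tree.getroot()
    # Assumes 'packet' nodes are direct children of the root element.
    for packet in list(root.findall("packet")):
        host = packet.find("host")
        if host is None or host.get("valid") != "true":
            root.remove(packet)
    out = xml_file + "_only_valid.xml"
    tree.write(out, encoding="utf-8")
    return out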





XSLT files used



• wireshark-basic-v2.xslt - converts the pdml files created by T-shark to xml files that store the captured packets in the form of 'packet' nodes. Written by Amichai Shulman.

• tor-to-html.xslt - converts xml files with 'packet' nodes to html files that contain a table of links to the URLs that appeared in the 'packet' nodes, sorted by IP address. (A sketch of the complete export pipeline follows this list.)
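
Putting the pieces together, the export pipeline described in the Project Architecture section can be sketched as follows. The use of Python and lxml for the XSLT steps, and all file names, are assumptions; the project drove the same steps from a Windows batch script.

import subprocess
from lxml import etree

# 1. PDML (T-shark output) -> XML of 'packet' nodes, via wireshark-basic-v2.xslt.
to_packets = etree.XSLT(etree.parse("wireshark-basic-v2.xslt"))
with open("packets.xml", "w", encoding="utf-8") as out:
    out.write(str(to_packets(etree.parse("capture.pdml"))))

# 2. Tag hosts against the filter-list, then drop the non-valid packets
#    (usage strings as documented above; output names follow those descriptions).
subprocess.run(["Filter-garbage", "packets.xml", "filter_list.txt"], check=True)
subprocess.run(["remove-non-valid", "packets.xml_filtered"], check=True)

# 3. Surviving 'packet' nodes -> an HTML table of URLs per IP, via tor-to-html.xslt.
to_html = etree.XSLT(etree.parse("tor-to-html.xslt"))
with open("report.html", "w", encoding="utf-8") as out:
    out.write(str(to_html(etree.parse("packets.xml_filtered_only_valid.xml"))))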




Results

Trojan attack

We noticed in log file 41 access to guest-books of legitimate sites. The guest-books were filled with comments, where each comment was actually a list of hyperlinks with xml-like tags and a comment about porn. The links lead to non-pornographic sites, where a script had uploaded a link to a “porn movie”. These links are in forum comments or fake social profiles. When the “movie” is played, the browser is referred to another site where you are either told you need to download a codec to play the video (Trojan), or, using scareware, you are told you have a virus on your Windows OS and you must install a new anti-virus (Trojan).

Using a disassembler we took a close look at the Trojan file. Most of the code was encrypted and/or obfuscated, as we saw a reference to crypt32.dll and the code was mostly unreadable. The only readable part of the code was a request for administrator privileges.

Using a different EC2 instance, we infected the machine with the Trojan and used Wireshark to log the traffic. Analyzing the log files, we noted our machine contacted the IP address 195.5.161.117. This IP is a known malware server, and our Trojan was listed as a “fast-flux rogue anti-virus.” After contacting the IP above, our machine began contacting various sites in the US (Manhattan, San Francisco, etc.) and downloading encrypted data.



Scraping

In many log files we saw systematic downloading of complete real estate sites. This phenomenon is known as scraping. Scraping is the unauthorized extraction of information from another website. Automated bots can collect all of the exposed information on a site, through a crude copy-and-paste operation. Realtors can then post the data on their own sites and link the properties to their agents, as if they had been hired to market the properties. Essentially, scraping is copying the listings from one company and then displaying them as your own on your website.




Physician identity theft

In log file 15 we found systematic downloading of physician names, as well as other details, from several state medical board Web sites. This is the first step in an identity fraud scheme intended to defraud Medicare. For further reading on the subject, click here.


Yahoo! brute force attack

One of the things that were apparent in virtually every log file was a widespread brute-force attack against Yahoo email users, which aimed to obtain login credentials and then use the hijacked accounts for spamming.

Yahoo Mail's main login page utilizes a number of security mechanisms to protect against brute force attacks, including providing a generic "error" page that does not reveal whether it was the username or the password that the user got wrong. Also, Yahoo tracks the number of failed login attempts and requires that users solve a CAPTCHA if they have exceeded a certain number of incorrect tries. Attackers have apparently found a web service application used to authenticate Yahoo users that does not contain the same security mechanisms. The application - /config/isp_verify_user - is an API used to authenticate ISP business partners of Yahoo. The application gives detailed error messages when someone enters the wrong username and password, noting which was incorrect. Also, it does not utilize any CAPTCHA on the error page, enabling attackers to guess an unlimited number of times until they come up with the right credentials.
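
As an illustration of how this pattern stands out in the exported logs, one can simply tally requests to the vulnerable endpoint per source address. The field names inside each 'packet' node ('ip_src', 'uri') are assumptions here; the actual layout is whatever wireshark-basic-v2.xslt produces.

from collections import Counter
import xml.etree.ElementTree as ET

hits = Counter()
for packet in ET.parse("packets.xml").iter("packet"):
    uri = packet.findtext("uri", default="")
    if "/config/isp_verify_user" in uri:
        hits[packet.findtext("ip_src", default="?")] += 1  # attempts per source IP

for src, count in hits.most_common(10):
    print(src, count, "login attempts")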


Zeus Botnet

We managed to identify the botnet we had used for infection as belonging to the Zeus botnet, at the time infecting around 100,000 computers world-wide. Our machine failed to contact its C&C, probably due to a takedown on Zeus servers a month before we used the botnet.




Conclusion


One of the initial goals of our project was to deploy several TOR probes, allowing truer visibility of TOR usage. From our single TOR probe we accumulated ~10MB of HTTP traffic per minute, meaning ~1.4GB of data per day. The enormous amount of data we collected made sieving through the data for attack vectors very difficult. Even after applying our filtering tool, the process remained Sisyphean.

We suggest a different approach in order to circumvent this problem. The idea is to limit the scope. Instead of looking at all traffic while excluding specific sites, a better idea might be to exclude all sites except for a chosen few in which we are interested. This way, we can deploy multiple probes, while getting only pertinent data.
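
A minimal sketch of this whitelist approach, reusing the same assumed 'packet'/'host' layout as the earlier sketches; the sites of interest are hypothetical placeholders.

import xml.etree.ElementTree as ET

INTERESTING = {"forum.example.com", "webmail.example.com"}  # hypothetical chosen sites

tree = ET.parse("packets.xml")
root = tree.getroot()
# Keep only packets whose host is one of the chosen sites; assumes 'packet'
# nodes are direct children of the root element.
for packet in list(root.findall("packet")):
    if packet.findtext("host", default="") not in INTERESTING:
        root.remove(packet)
tree.write("packets_whitelisted.xml", encoding="utf-8")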