PhishLurk: A Mechanism for Classifying and Preventing Phishing Websites


Nov 3, 2013





Master Project Proposal


PhishLurk: A Mechanism for Classifying and Preventing Phishing Websites


by:
Mohammed Alqahtani


1. Committee Members and Signatures:


Approved by Date



__________________________________ _____________


Advisor: Dr. Edward Chow



__________________________________ _____________


Committee member: Dr. Albert Glock



__________________________________ _____________


Committee member: Dr. Chuan Yue


















Introduction


Phishing is a cybercrime in which a person or organization attempts to steal highly sensitive information such as usernames, passwords, and credit card details. Phishing attacks mostly come in two forms: emails and web pages that spoof a legitimate party or lure the user into entering sensitive information. In other words, phishing directs users to fraudulent websites in order to obtain sensitive information. Users increasingly rely on the Internet for daily tasks such as paying bills, banking, and socializing. As a result, more and more personal information is used online for different purposes, which expands the attack surface for phishing.



Sample of a phishing website (source: www.phishtank.com)


Phishing has been a major concern in IT security. In the U.S., companies lose more than $2 billion every year as a result of phishing attacks [6]. Phishing works for many reasons; one of the most common is users’ carelessness and their ignorance of how to tell whether a website is phishing or not [1]. Moreover, there are long lists of websites that are hard to detect.

Many research efforts have focused on anti-phishing, using different methods of filtering and detection such as blacklists, plug-ins, extensions, and toolbars for browsers [2]. Desktop browser developers try hard to provide solid protection, for example by displaying a warning message box if a website is potentially a phishing website or has an invalid or expired SSL certificate. In most cases a third party and a blacklist are involved in identifying and flagging phishing websites [3].




Recently, users have gained more ways to access the Internet, for example notebooks, gaming devices, handhelds, and smartphones. However, this growing variety of devices, each with different capabilities and features, complicates the task of providing full protection, especially against phishing attacks. As yet, there is no complete protection.

Smartphones are among the most widely used devices. According to a survey by comScore, Inc., the number of smartphone subscribers increased 60 percent in 2010 compared to 2009 [4]. Another report, by The Nielsen Company, indicates that by 2011 half of cell phone users would be using smartphones [5].


Figure: The rapid global growth of the smartphone market, 2009-2010


Users have started to use these types of access for their activities and tasks because of the advantages they provide; smartphones, for instance, are preferred for their ease of use, flexibility, and mobility. Some activities, such as online banking, paying bills, online shopping, and emailing [5], require users to enter sensitive information to complete the authentication and authorization process; such information could be credit card numbers, passwords, and usernames. In fact, having many types of devices to access the Internet expands the attack surface for phishing and complicates protection.



Related Work

PhishTank is a non-profit project that aims to build a dependable database of phishing websites [7]; the project collects, verifies, tracks, and shares phishing data. In order to report a phishing link, a user has to be registered as a member, so that the administrators can learn and judge each member's contribution. Phishing websites can be reported and submitted via email or through PhishTank’s website. The data are verified by a committee after they are submitted by members. PhishTank’s database can be shared via an API. The links in the original database are classified only as “phishing” or “unknown”. We will classify the phishing sites in the PhishTank database into more precise categories and use them in the proposed project.
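As a rough illustration of how a PhishTank-style blacklist could feed the proposed classification, the sketch below builds a lookup table from a list of records. The field names (`url`, `verified`), the helper names, and the sample records are assumptions made for this example, not a statement of PhishTank's actual API or data format. Python is used purely for illustration; the prototype itself is planned in PHP.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to a lowercase host plus path, so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    return host + path

def build_index(records):
    """Map each normalized URL to 'phishing' (verified) or 'unknown' (not yet verified)."""
    index = {}
    for rec in records:
        status = "phishing" if rec.get("verified") == "yes" else "unknown"
        index[normalize(rec["url"])] = status
    return index

# Hypothetical sample records; real data would come from the blacklist provider.
SAMPLE_RECORDS = [
    {"url": "http://paypa1-login.example.com/signin/", "verified": "yes"},
    {"url": "http://bank-update.example.net/verify", "verified": "no"},
]

index = build_index(SAMPLE_RECORDS)
print(index.get(normalize("HTTP://PAYPA1-LOGIN.example.com/signin")))  # phishing
```

Normalizing before lookup is one possible design choice: it keeps a reported link from slipping past the blacklist because of letter case or a trailing slash.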

PhishTank has been working effectively to fight phishing attacks: thousands of phishing links are detected and verified as valid phishing sites every month [9], using the public’s effort and contributions to build a trustworthy and dependable database that is open for everyone to use and share. As a result, several well-known organizations and browsers, such as Yahoo! Mail, Opera, McAfee, and Mozilla Firefox, have started using the PhishTank database [10]. In my prototype, I use PhishTank as the provider of phishing URLs.


In the paper titled “Large-Scale Automatic Classification of Phishing Pages” [2], Colin Whittaker, Brian Ryner, and Marria Nazif proposed an automatic classifier to detect phishing websites. The classifier maintains Google’s phishing blacklist automatically and analyzes millions of pages a day, examining the URL and the contents of each page to determine whether it is phishing or not. The classifier operates automatically on a large-scale system, maintaining a false positive rate below 0.1% and reducing the lifetime of phishing pages. The authors used machine learning techniques to analyze web page content.

In my project, detection is based on PhishTank’s blacklist. My ultimate goal is not to determine whether a page is phishing or not, since PhishLurk relies on PhishTank’s blacklist for that determination, but to propose a new methodology for classifying phishing links that takes two factors into account: consuming as little memory and as little screen space as possible, which eventually improves the overall classification efficiency.


In the paper titled “PhishGuard: A Browser Plug-in for Protection from Phishing” [8], Y. Joshi, S. Saklikar, D. Das, and S. Saha proposed a mechanism to detect a forged website by submitting fake credentials before the actual credentials during the login process of a website; the server-side responses to those submissions are then analyzed to determine whether the website is phishing or not. The mechanism was implemented on the browser side (the user side) as a Mozilla Firefox plug-in. However, the mechanism only detects phishing during the login process of a single user: if another user logs in to the same phishing website, he or she goes through the same detection process. In my project, once a website is reported as a phishing site, the reported link is blocked and no other user can access the reported website.

In the paper titled “BogusBiter: A Transparent Protection Against Phishing Attacks” [17], Chuan Yue and Haining Wang proposed a client-side tool called BogusBiter that sends a large number of bogus credentials to suspected phishing sites and hides the real credentials from phishers. BogusBiter is unique in that it also helps legitimate websites detect stolen credentials in a timely manner, by having the phisher verify the credentials he has collected at the legitimate website. BogusBiter was implemented as a Firefox 2 extension; my project is different in that it uses the server side to provide the protection.

In the paper titled “The Battle Against Phishing: Dynamic Security Skins” [18], which is related to [1], Rachna Dhamija and J. D. Tygar proposed an anti-phishing tool that helps users distinguish whether or not they are interacting with a trusted site. This approach uses a shared cryptographic image that remote web servers use to prove their identities to users, in a way that is easy for humans to verify and hard for attackers to spoof. In my project, however, there is no dependency on the client side. The approach in [18] cannot provide protection when a user is on a public access machine, because it requires support from both the client side and the server side.


Most popular browsers provide a phishing filter that warns users about malicious websites, including phishing websites. These filters mainly depend on lists to detect malicious websites. IE7 used the “Phishing Filter”, which was improved into the SmartScreen Filter in later versions of IE because of the weak protection the Phishing Filter provided [15]. In IE 8 and IE 9, the SmartScreen Filter checks visited websites against a list of malicious websites that Microsoft created and updates continuously [11] [12].

Similar to IE, the Safari browser has filters that check websites against a list of phishing sites while the user is browsing. After PayPal warned its members that Safari was not safe for its service [13], Safari started to use Extended Validation certificates to support analyzing websites [14].

Earlier versions of Firefox took advantage of anti-phishing providers such as GeoTrust or PhishTank, using their lists to help identify malicious websites. The current version of Firefox has adopted Google's anti-phishing program to support its phishing protection.

Many research projects have proposed mechanisms that are implemented as browser plug-ins and toolbars against phishing attacks. The main problem with plug-ins and toolbars is the need for users’ cooperation. Users may not cooperate and install the tool. Some users occasionally prefer to turn their filter off to browse faster [16]. On some devices, plug-ins and toolbars may not be as effective as they are in desktop browsers because of limits on performance and screen space, as is the case with smartphones.

PhishLurk’s mechanism aims to use as little space and memory as possible on the client side, using the server side to provide the classification of, and protection from, phishing links. So even if phishing protection is disabled on the client side, PhishLurk still provides protected and classified links to the user.



The different phishing defense approaches can be further classified based on where the
alerts are generated:

• Browsers themselves: IE9, Firefox 5.

• Browser extensions or plug-ins: BogusBiter, PhishGuard.

• Anti-phishing search site: PhishLurk (my project).

• Proxy server: DansGuardian [20].

• Anti-phishing server: OpenDNS [19], GFI MailEssentials [21], and some browser extensions that partially use the server side, such as Dynamic Security Skins [18].



According to its official website [20], DansGuardian is an active web content filter that filters websites based on a number of criteria, including the website URL, words and phrases included in the page, file type, MIME type, and more. DansGuardian is used as a proxy server that controls, filters, and monitors all content, so its function goes beyond anti-phishing. To my knowledge, no existing project uses a proxy server specifically for anti-phishing, but it could be a really effective technique for classifying and preventing phishing websites.


Proposed Project

I propose a mechanism to protect users from phishing attacks. The mechanism assesses and classifies sites on the server side, based on PhishTank’s blacklist, and presents the result using a color scheme. The system uses little screen space and memory so that it works even on small devices. The mechanism classifies links into four types, using a color scheme that takes little space and requires little memory. I expanded the classification used in PhishTank as follows:



Phishing link (Red): a confirmed phishing link. The link is disabled, so even if the user is ignorant or surfing carelessly, as observed in the survey [1], there is no way to access the link.



Unknown link (Orange): a suspicious link that might potentially be a phishing link; for example, a link that contains the name, or part of the name, of a real company and asks the user to provide sensitive information. The link has been submitted as a phishing link but has not been verified yet. The user can click and access this type of link at their own risk. The user is warned before accessing the link.



Unlikely link (Gray): the same as an unknown link, except that the blacklist has received a report about a link that is unlikely to be a phishing link. For example, websites whose top-level domain (TLD) is .edu or .gov are unlikely to be used by attackers because these TLDs are reserved for the official use of organizations. The link keeps the unlikely status until it is verified by PhishTank. Note that someone might have reported the unlikely site in an attempt to denigrate the organization, or the site might actually have been compromised by a cross-site scripting or SQL injection attack; it is therefore fair to maintain the unlikely status until the link is verified and changed to a safe link.





Global Phishing Survey: Trends and Domain Name Use - April 2011

As the chart above shows, 60% of phishing attacks were launched from domains in the TLDs .COM, .NET, .TK, and .CC.



Safe link (Green): a safe link that is not phishing. The user can access the link without triggering any warning messages.

Providing the protection from the server side and using the color scheme for classification saves memory and screen space on the client side. The mechanism determines whether a website is phishing or not based on a blacklist of phishing websites that is updated periodically to achieve the maximum possible accuracy.
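The four categories described above can be sketched as a small decision function. This is only an illustrative sketch in Python, not the PHP prototype: the `Category` enum, the `classify` function, the shape of the blacklist (a dict mapping a reported URL to whether it has been verified), and the sample URLs are all assumptions introduced for the example.

```python
from enum import Enum
from urllib.parse import urlsplit

class Category(Enum):
    PHISHING = "red"     # verified phishing link, disabled
    UNKNOWN = "orange"   # reported but not yet verified
    UNLIKELY = "gray"    # reported, but the TLD makes phishing unlikely
    SAFE = "green"       # not on the blacklist

# Assumption: TLDs treated as "unlikely", following the .edu/.gov example in the text.
UNLIKELY_TLDS = (".edu", ".gov")

def classify(url, blacklist):
    """Classify a URL; blacklist maps reported URLs to True (verified) or False (unverified)."""
    if url not in blacklist:
        return Category.SAFE
    if blacklist[url]:
        return Category.PHISHING
    host = (urlsplit(url).hostname or "").lower()
    if host.endswith(UNLIKELY_TLDS):
        return Category.UNLIKELY
    return Category.UNKNOWN

# Hypothetical blacklist entries for demonstration only.
SAMPLE = {
    "http://login.example.com/": True,
    "http://pay.example.net/": False,
    "http://portal.example.edu/": False,
}

for link in list(SAMPLE) + ["http://www.example.org/"]:
    print(link, classify(link, SAMPLE).value)
```

In this sketch a verified report always wins, so a verified link under .edu or .gov is still red; the gray status applies only while a report is unverified.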


The Plan

In this project, I will develop an anti-phishing search website called “PhishLurk”, using PHP and CSS, that responds to users' search queries with classified, protected links. If a result is a phishing link, the engine classifies it as risky, disables it, and warns the user by rendering it as a red link.


If the link is classified as “unknown” or “suspicious”, the engine gives users the choice of whether or not to access the link and warns them about the possible consequences.



If the link is classified as “unlikely”, the engine gives the user the choice of whether or not to access the link at their own risk, and warns that although the link is unlikely to be phishing, the site might have been hacked or someone might be trying to denigrate the organization behind the website.


In the last case, when the link carries no risk or suspicion, the engine classifies it as a safe link. I use CSS to help classify the links because it does not consume a lot of screen resources or demand extensive computation. Besides processing the classification and providing safe results to the user, the PhishLurk system periodically reads and updates the blacklist from PhishTank.com to keep its results as up to date as possible.
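The behavior described in this plan could be rendered roughly as follows. The sketch is in Python for illustration (the prototype itself is planned in PHP and CSS), and the CSS class names, warning texts, and the `render_link` function are hypothetical names chosen for this example.

```python
from html import escape

# Hypothetical CSS rules: one class per category in the proposed color scheme.
CSS_RULES = """
.phish-red    { color: red; text-decoration: line-through; }
.phish-orange { color: orange; }
.phish-gray   { color: gray; }
.phish-green  { color: green; }
"""

# Warning texts are illustrative; they contain no quote characters on purpose,
# because they are embedded in a single-quoted JavaScript string below.
WARNINGS = {
    "orange": "This link was reported as phishing but is not verified yet. Continue?",
    "gray": "This link is unlikely to be phishing, but it was reported. Continue?",
}

def render_link(url, title, color):
    """Return an HTML fragment for one search result, styled by its category color."""
    safe_url, safe_title = escape(url), escape(title)
    if color == "red":
        # Disabled: no href is emitted, so the link cannot be followed.
        return f'<span class="phish-red" title="Blocked phishing link">{safe_title}</span>'
    if color in WARNINGS:
        # The user is warned and chooses whether to proceed.
        onclick = "return confirm('" + WARNINGS[color] + "')"
        return (f'<a class="phish-{color}" href="{safe_url}" '
                f'onclick="{onclick}">{safe_title}</a>')
    if color == "green":
        return f'<a class="phish-green" href="{safe_url}">{safe_title}</a>'
    raise ValueError(f"unknown category color: {color}")

print(render_link("http://bad.example.com/", "Bad site", "red"))
print(render_link("http://pay.example.net/", "Pay here", "orange"))
print(render_link("http://www.example.org/", "Example", "green"))
```

Because the classification is carried by a single CSS class on each link, the client receives only a few extra bytes per result and needs no plug-in or toolbar.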





PhishLurk’s Design



Metric for Evaluating the PhishLurk System

The proposed PhishLurk system can be evaluated by examining the effectiveness of its use by users and its processing overhead. We will conduct a survey on the usage of PhishLurk and summarize the feedback. Stress tests will be performed on the system to collect statistics on the average processing time overhead of classifying the URLs and modifying the links.
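One way the processing-time statistics could be collected is sketched below in Python. The `average_overhead` helper, the stand-in classifier, and the sample URLs are assumptions for illustration; an actual stress test would exercise the deployed system instead.

```python
import statistics
import time

def average_overhead(classify_fn, urls, repeats=5):
    """Return the mean and standard deviation (in seconds) of per-URL classification time."""
    samples = []
    for _ in range(repeats):
        for url in urls:
            start = time.perf_counter()
            classify_fn(url)
            samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

# Stand-in classifier and URLs, both hypothetical.
blacklist = {"http://bad.example.com/"}
urls = ["http://bad.example.com/", "http://ok.example.org/"] * 100
mean_s, stdev_s = average_overhead(lambda u: u in blacklist, urls)
print(f"mean {mean_s * 1e6:.2f} us, stdev {stdev_s * 1e6:.2f} us over {len(urls) * 5} calls")
```

Reporting the spread alongside the mean helps show whether the overhead stays stable as the number of classified links grows.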

Deliverables



• The working software prototype, PhishLurk, with a user guide and installation manual.

• A master's report documenting the design and implementation of PhishLurk, the implementation choices and their performance evaluation, and the lessons learned.





References:

1. Rachna Dhamija, J. D. Tygar, and Marti Hearst. 2006. Why phishing works. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '06), Rebecca Grinter, Thomas Rodden, Paul Aoki, Ed Cutrell, Robin Jeffries, and Gary Olson (Eds.). ACM, New York, NY, USA, 581-590. DOI=10.1145/1124772.1124861 http://doi.acm.org/10.1145/1124772.1124861

2. Aaron Blum, Brad Wardman, Thamar Solorio, and Gary Warner. 2010. Lexical feature based phishing URL detection using online learning. In Proceedings of the 3rd ACM Workshop on Artificial Intelligence and Security (AISec '10). ACM, New York, NY, USA, 54-60. DOI=10.1145/1866423.1866434 http://doi.acm.org/10.1145/1866423.1866434

3. Gross, Ben. "Smartphone Anti-Phishing Protection Leaves Much to Be Desired | Messaging News." Messaging News | The Technology of Email and Instant Messaging. 26 Feb. 2010. Web. <http://www.messagingnews.com/story/smartphone-anti-phishing-protection-leaves-much-be-desired>.

4. ComScore, Inc. "Smartphone Subscribers Now Comprise Majority of Mobile Browser and Application Users in U.S." ComScore, Inc. - Measuring the Digital World. ComScore, Inc, 1 Oct. 2010. <http://www.comscore.com/Press_Events/Press_Releases/2010/10/Smartphone_Subscribers_Now_Comprise_Majority_of_Mobile_Browser_and_Application_Users_in_U.S>.

5. Entner, Roger. "Smartphones to Overtake Feature Phones in U.S. by 2011." Http://www.nielsen.com. Nielsen Wire, 26 Mar. 2010. Web. <http://blog.nielsen.com/nielsenwire/consumer/smartphones-to-overtake-feature-phones-in-u-s-by-2011/>.

6. Kerstein, Paul L. "How Can We Stop Phishing and Pharming Scams?" CSO Online - Security and Risk. CSO Magazine - Security and Risk, 19 July 2005. Web. <http://www.csoonline.com/article/220491/how-can-we-stop-phishing-and-pharming-scams->.

7. OpenDNS, LLC. PhishTank: an Anti-phishing Site. [Online]. http://www.phishtank.com

8. Joshi, Y.; Saklikar, S.; Das, D.; Saha, S. "PhishGuard: A browser plug-in for protection from phishing," Internet Multimedia Services Architecture and Applications, 2008. IMSAA 2008. 2nd International Conference on, pp. 1-6, 10-12 Dec. 2008. doi: 10.1109/IMSAA.2008.4753929, URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4753929&isnumber=4753904

9. PhishTank - Statistics about phishing activity and PhishTank usage, http://www.phishtank.com/stats.php

10. PhishTank, Friends of PhishTank, http://www.phishtank.com/friends.php

11. "SmartScreen Filter: Frequently Asked Questions." Windows Home - Microsoft Windows. <http://windows.microsoft.com/en-US/windows7/SmartScreen-Filter-frequently-asked-questions-IE9>.

12. "SmartScreen Filter - Microsoft Windows." Windows Home - Microsoft Windows. Web. <http://windows.microsoft.com/en-US/internet-explorer/products/ie-9/features/smartscreen-filter>.

13. "Apple - Safari - Learn about the Features Available in Safari." Apple. <http://www.apple.com/ca/safari/features.html>.

14. TECH.BLORGE - Top Technology news, Paypal warns buyers to avoid Safari browser from Apple - <http://tech.blorge.com/Structure:%20/2008/02/28/paypal-warns-buyers-to-avoid-safari-browser-from-apple/>

15. "Firefox 2 Phishing Protection Effectiveness Testing." Home of the Mozilla Project. <http://www.mozilla.org/security/phishing-test.html>.

16. "AVIRA News - Anti-Virus Users Are Restless, Avira Survey Finds." Antivirus Software Solutions for Home and for Business. <http://www.avira.com/en/press-details/nid/482/>.

17. Chuan Yue and Haining Wang. 2010. BogusBiter: A transparent protection against phishing attacks. ACM Trans. Internet Technol. 10, 2, Article 6 (June 2010), 31 pages. DOI=10.1145/1754393.1754395 http://doi.acm.org/10.1145/1754393.1754395

18. Rachna Dhamija and J. D. Tygar. 2005. The battle against phishing: Dynamic Security Skins. In Proceedings of the 2005 Symposium on Usable Privacy and Security (SOUPS '05). ACM, New York, NY, USA, 77-88. DOI=10.1145/1073001.1073009 http://doi.acm.org/10.1145/1073001.1073009

19. OpenDNS | DNS-Based Web Security. <http://www.opendns.com/>.

20. DansGuardian - True Web Content Filtering for All. <http://dansguardian.org/>.

21. GFI - Web, Email and Network Security Solutions for SMBs on Premise and Hosted. <http://www.gfi.com/>.