Use of Distributed or Parallel Computing to Crack Windows Password Hashes

Dec 1, 2013

Shishir Jha

(Author)

Department of Computer Science

Hood College

Frederick, Maryland, USA

sj4@hood.edu



Abstract



Cracking Windows password hashes has been an inherently
single-process algorithm that requires extensive computing
resources and time. Since parallel or distributed computing can
address the need for computing resources by delegating the
processing to multiple nodes, this paper is an attempt to see
whether password cracking can be modified to take advantage of
multiple nodes, and to find the potential speedup.

Keywords - LM Hash, parallel, distributed, dictionary attack,
Windows password

I. INTRODUCTION


Passwords have been used for centuries as a method for
challenging the credentials of someone attempting to enter a
secured compound or access a private gathering. Computers are no
different and require one form or another of password to let
only authorized personnel access the information they hold. The
use of passwords in computers started in the 1960s and 1970s,
when a need for a more structured method of access control
emerged in the computing world. [1]



The first password schemes relied simply on a flat file, located
on disk or in memory, which contained the user names and
passwords. These password files were typically locked by the
operating system; however, some systems allowed any user with
the appropriate privileges to access the file.


When Windows 95 was released, user account information was
placed in a .pwl file, or password list file, and the password
file was encrypted using the password and the RC4 encryption
algorithm. When a user entered his or her credentials, the
password was encrypted and the checksum of the .pwl file was
compared against the checksum derived from the user credentials.
By the time Windows XP was released, the passwords had been
moved to a SAM (Security Account Manager) file. The SAM file was
encrypted with the SYSKEY. The passwords were still not placed
in the file directly; rather, a hashed version of the password
was saved. The hashing algorithms were based on DES (for the LM
hash) and MD4 (for the NTLM hash). Access to the computer
required the user to enter a password, which was hashed and
compared with the contents of the SAM file.


Hashes in the Windows SAM file are computed using either the LM
hash method or the NTLM hash method. Although it is based on DES
encryption, the LM hash is not a true one-way function. Because
of the way the LM hash function is implemented, it has several
weaknesses which allow careful programmers [2] to use different
algorithms to recover the password from the LM hashes.

II. BACKGROUND

A. LM Hash

Though NTLM has largely replaced the LM hash in the application
protocols used to authenticate remote users and to provide
session security when requested by the application, LM is still
used on the vast majority of Windows machines that are not part
of any remote domain or Active Directory based network. [3] In
addition, versions of Windows released before Vista still
compute and store the LM hash by default, for compatibility with
previous-generation clients that still use 16-bit applications
which require LM hash based authentication. This paper is an
attempt to attack this security hole in the Windows architecture
and gain access to users' authentication credentials.


To better understand the problem that this paper is attempting
to solve, it is necessary to first understand how the LM hash
encrypts the user credentials. The general steps in computing
the LM hash are as follows: [2]

1. The ASCII password entered by the user is converted to
   uppercase.

2. This password is then made 14 bytes long by null padding and
   split into two 7-byte halves.

3. These values are used to create two DES keys, one from each
   half. Each 7-byte half yields one 64-bit DES key.

4. Each of these keys is used to DES-encrypt the constant ASCII
   string "KGS!@#$%", resulting in two 8-byte ciphertext values.

5. These two ciphertext values are concatenated to produce the
   16-byte LM hash.
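The first three steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names `lm_prepare` and `expand_des_key` are hypothetical, dropping input beyond 14 bytes is an assumption, and the DES encryption of steps 4 and 5 is left out because the Python standard library does not include DES.

```python
from typing import Tuple


def lm_prepare(password: str) -> Tuple[bytes, bytes]:
    """Steps 1-2: uppercase, null-pad to 14 bytes, split into two 7-byte halves."""
    raw = password.upper().encode("ascii")[:14]  # assumption: extra input is dropped
    padded = raw.ljust(14, b"\x00")
    return padded[:7], padded[7:]


def expand_des_key(half: bytes) -> bytes:
    """Step 3: spread 56 key bits over 8 bytes, 7 key bits plus 1 parity bit each."""
    if len(half) != 7:
        raise ValueError("expected a 7-byte half")
    bits = int.from_bytes(half, "big")
    key = bytearray()
    for i in range(8):
        # Take the next 7 bits (most significant first) and leave the low bit free.
        byte = ((bits >> (49 - 7 * i)) & 0x7F) << 1
        # Set the low bit so that every key byte has odd parity.
        if bin(byte).count("1") % 2 == 0:
            byte |= 1
        key.append(byte)
    return bytes(key)


first, second = lm_prepare("password")
keys = [expand_des_key(first), expand_des_key(second)]
```

Each half is turned into a key without any reference to the other half, so the two halves of the hash can be attacked separately.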


The hashes produced by both NTLM and LM are stored in the
System32 directory of the Windows installation, in a SAM file
which is itself encrypted with the help of a separate file.


B. Different ways of extracting the password

Though the LM hash uses a DES encryption key, the methodology it
uses means that the encryption is not a true one-way function:
with some widely known hash attack algorithms and a fair amount
of time and computing resources, the hash can be reversed to the
user credentials. Simple arithmetic shows that the total number
of possible passwords is ideally about 2^92 (14 characters drawn
from the 95 printable ASCII characters), but because of steps
such as converting the characters to uppercase and splitting the
password into two halves, the real problem domain for any hash
attack algorithm is reduced to about 2^43.
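The arithmetic behind these estimates can be checked with a few lines of Python. The character counts are assumptions made for this sketch: 95 printable ASCII characters for the unrestricted case, and 69 once the 26 lowercase letters fold onto their uppercase forms.

```python
import math

PRINTABLE_ASCII = 95                    # assumption: any printable ASCII character
AFTER_UPPERCASE = PRINTABLE_ASCII - 26  # lowercase letters fold onto uppercase

full_space = PRINTABLE_ASCII ** 14      # one 14-character password, attacked whole
half_space = AFTER_UPPERCASE ** 7       # one uppercase 7-character half, attacked alone

print(f"full space: about 2^{math.log2(full_space):.1f}")
print(f"half space: about 2^{math.log2(half_space):.1f}")
```

The second figure comes out close to 2^43, which is the reduced search space an attacker actually faces for each half.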


The most common cracking method is the dictionary attack. In
essence, it does nothing more than compare the hash obtained
from the host computer with the hashes of the entries in a
pre-existing dictionary of commonly used passwords. This method
is a long shot and usually works only for weak and commonly used
passwords. The ability of the algorithm to crack the password
depends directly on how large and diverse the dictionary is.
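A dictionary attack of this kind might look like the sketch below. SHA-256 stands in for the LM hash only because it ships with Python; the names `stand_in_hash` and `dictionary_attack` and the word list are hypothetical.

```python
import hashlib
from typing import List, Optional


def stand_in_hash(candidate: str) -> str:
    """Placeholder for the LM hash (assumption: SHA-256 over the uppercased word)."""
    return hashlib.sha256(candidate.upper().encode("ascii")).hexdigest()


def dictionary_attack(target_hash: str, dictionary: List[str]) -> Optional[str]:
    """Return the first dictionary word whose hash equals target_hash, else None."""
    for word in dictionary:
        if stand_in_hash(word) == target_hash:
            return word
    return None


words = ["letmein", "password", "qwerty", "dragon"]  # hypothetical word list
target = stand_in_hash("qwerty")                     # hash taken from the host
found = dictionary_attack(target, words)
missing = dictionary_attack(stand_in_hash("Zx9!kP"), words)
```

Here `found` is the matching word, while `missing` is `None` because that password is not in the list, which is exactly the dependence on dictionary coverage described above.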


The next technique used in cracking hashes is the brute force
method. Essentially, a password generator produces all possible
combinations of characters until a match is found. Though bound
to work, its time complexity increases with the password length
and with the set of characters and symbols used to generate the
password.
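As a rough sketch of such a generator, the code below enumerates every string over a deliberately tiny alphabet and stops at the first match. SHA-256 again stands in for the LM hash, and the names `candidates` and `brute_force` are hypothetical.

```python
import hashlib
from itertools import product
from typing import Iterator, Optional


def stand_in_hash(candidate: str) -> str:
    """Placeholder for the LM hash (assumption)."""
    return hashlib.sha256(candidate.encode("ascii")).hexdigest()


def candidates(charset: str, max_len: int) -> Iterator[str]:
    """Yield every string over charset with length 1..max_len, shortest first."""
    for length in range(1, max_len + 1):
        for combo in product(charset, repeat=length):
            yield "".join(combo)


def brute_force(target_hash: str, charset: str, max_len: int) -> Optional[str]:
    """Hash each candidate in turn and return the first one that matches."""
    for guess in candidates(charset, max_len):
        if stand_in_hash(guess) == target_hash:
            return guess
    return None


CHARSET = "ABC"  # hypothetical three-letter alphabet, kept tiny on purpose
total = sum(len(CHARSET) ** n for n in range(1, 4))  # 3 + 9 + 27 candidates
recovered = brute_force(stand_in_hash("CAB"), CHARSET, 3)
```

The candidate count is a sum of powers of the alphabet size, so each extra character of length multiplies the work by the size of the alphabet.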


The third method, now widely used, is the rainbow table method,
implemented in a cracking utility by Zhu Shuanglei. His tool is
based on Philippe Oechslin's faster time-memory trade-off
technique. This method proposes a new way of pre-calculating the
data which reduces by a factor of two the number of calculations
necessary for cryptanalysis [4]. For cracking passwords, this
method is almost equivalent to a brute-force attack, except that
the rainbow table uses pre-calculated chains of words stored in
the table. Though cracking time decreases by a factor of
hundreds if not thousands using this method, its main pitfall is
the time investment required to build the tables. However,
thanks to the internet and the wide availability of tools and
resources under open licenses, there are numerous libraries
available for free on the internet which provide users with
pre-built tables.
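The idea of a pre-calculated chain can be illustrated with the simplified sketch below. It is not the construction used by any particular tool: SHA-256 stands in for the LM hash, and the reduction function, alphabet, password length and chain length are all assumptions chosen for brevity.

```python
import hashlib

CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # hypothetical password alphabet
LENGTH = 4                              # hypothetical fixed password length


def stand_in_hash(candidate: str) -> bytes:
    """Placeholder for the LM hash (assumption)."""
    return hashlib.sha256(candidate.encode("ascii")).digest()


def reduce_digest(digest: bytes, step: int) -> str:
    """Map a digest back into the password space; step varies the mapping per link."""
    n = int.from_bytes(digest[:8], "big") + step
    chars = []
    for _ in range(LENGTH):
        n, r = divmod(n, len(CHARSET))
        chars.append(CHARSET[r])
    return "".join(chars)


def build_chain(start: str, chain_len: int) -> str:
    """Alternate hash and reduce chain_len times and return the chain's end point."""
    candidate = start
    for step in range(chain_len):
        candidate = reduce_digest(stand_in_hash(candidate), step)
    return candidate


# Only end point -> start point is stored; the links in between are recomputed on demand.
starts = ["AAAA", "PASS", "WORD"]
table = {build_chain(s, 100): s for s in starts}
```

Storing two strings per chain instead of every intermediate link is the memory saving; building the chains in the first place is the up-front time cost mentioned above.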


III. PRESENT WORK


There has been widespread work in this field, as recovering
passwords for both legitimate and illegitimate purposes has been
a necessity ever since passwords came into use. Though there are
freely available tools that will attempt to crack a password
using one of the methods mentioned above, there is no guarantee
of results. This is primarily because of the potential time
complexity and the lack of proper utilization of the computing
resources available in today's multicore computers with
high-speed interfaces.

To fill the gap left by these free tools, there are commercially
available tools on the market that use specialized algorithms,
pre-hashed tables and rainbow tables to crack the password.
Furthermore, there are websites that offer services to crack
your password, provided you can give them the hash from the SAM
file. These sites use massive collections of rainbow tables,
ranging anywhere from 60 to 160 GB, to help their users crack
passwords.

Almost all the downloadable programs available on the internet
are inherently single-processor programs and are not optimized
to benefit from a multiprocessor environment. Though in reality
the general public might not have access to a massively parallel
system or distributed computing for daily use, the fact that
these programs cannot even use the processing power of the
general multicore processors in use today puts them at a
disadvantage. Because of this, and for other reasons outside the
scope of this paper, acceptance of such programs has not grown
over the years. From the presence of various commercial tools
and websites on the internet, it is evident that there is a need
for a tool which can run faster and utilize the computing
resources available in a general user's computer.

It is, however, worth mentioning that although there are
projects which have implemented crack tools in various parallel
programming paradigms that utilize multiple cores and other
recent advances in computing resources, not much material is
available for review.

Furthermore, it has to be noted that there are numerous
implementations of crack tools that utilize the massively
parallel capability of today's graphics cards to do the
calculations for cracking hashes generated by different
programs. Primarily programmed using CUDA and similar
interfaces, crack tools of this kind claim huge improvements in
cracking time over traditional single-processor and SMP-based
computers. But because such programs require a compatible
graphics card in the user's computer, using them is not always
feasible.

Since password cracking is one of many families of problems that
can use parallel computing techniques to speed up their
performance, this paper and the final project are an attempt at
measuring how much of a difference parallel and distributed
computing can make over traditional methods of cracking a
password.


IV. APPLYING PARALLEL COMPUTING TECHNIQUES


Applying parallel computing techniques to a problem that is
repeated over and over and can be broken into smaller pieces
seems a fairly intuitive thing to do. All crack tools
essentially take an approach in which the algorithm serially
compares the hash in hand with those that are either present in
a table or dictionary, or are generated according to the
constraints set by the user. As the data sets that the algorithm
works on are usually mutually exclusive, or can be made
exclusive of each other, there is a distinct possibility that
parallel or distributed computing, in which the data set that
the algorithm acts upon is divided, will speed up the process.

Though there are numerous constraints on how the algorithm works
in some of the methods explained above, the brute force attack
and the dictionary attack work on sets of data and constraints
that are pre-defined and do not depend on pre-formulated chains
as in the rainbow table attack. Hence, by dividing and
distributing either the dictionary in the dictionary attack or
the password generator constraints in the brute force attack, a
good speedup can theoretically be achieved by using parallel or
distributed computing. Since the overall hash calculation and
matching will be done in the cluster node or individual
processor, there is minimal communication and setup overhead,
which, compared to the processing time necessary, should have an
insignificant effect on overall computing time and so result in
a better speedup.
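One way to divide a dictionary among workers is sketched below. The function name `partition` and the choice of contiguous, near-equal slices are assumptions; any split works as long as the slices do not overlap and together cover the whole dictionary.

```python
from typing import List


def partition(words: List[str], workers: int) -> List[List[str]]:
    """Split words into `workers` contiguous, non-overlapping slices of near-equal size."""
    base, extra = divmod(len(words), workers)
    slices, start = [], 0
    for rank in range(workers):
        size = base + (1 if rank < extra else 0)  # the first `extra` workers take one more
        slices.append(words[start:start + size])
        start += size
    return slices


dictionary = [f"word{i}" for i in range(10)]  # hypothetical 10-entry dictionary
parts = partition(dictionary, 4)
```

Because no word appears in two slices, the workers never need to coordinate while they search, which is why the communication overhead stays small.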

The attempt here will be to parallelize or distribute the
dictionary attack algorithm for cracking the hash file, using an
MPI or OpenMP based parallel computing approach, by distributing
the hash file to multiple nodes that each run the algorithm
against a pre-assigned dictionary. On success, the node will
reply to the root node to indicate that a matching hash has been
found, and the elapsed time will be recorded.
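The control flow described above can be mocked up in Python, with threads standing in for MPI processes. This shows the structure only: in an MPI program each node would be a separate process and the reply would be a message to the root, and CPython threads are not expected to give a real speedup for this kind of work. SHA-256 stands in for the LM hash, and the names `node_search` and `root` are hypothetical.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def stand_in_hash(candidate: str) -> str:
    """Placeholder for the LM hash (assumption)."""
    return hashlib.sha256(candidate.encode("ascii")).hexdigest()


def node_search(target: str, my_words: List[str]) -> Optional[str]:
    """Worker body: scan only this node's pre-assigned part of the dictionary."""
    for word in my_words:
        if stand_in_hash(word) == target:
            return word  # on success, report the match back to the root
    return None


def root(target: str, dictionary: List[str], size: int) -> Optional[str]:
    """Root body: give every node the target hash and its own slice, then collect replies."""
    slices = [dictionary[rank::size] for rank in range(size)]  # assumption: strided split
    with ThreadPoolExecutor(max_workers=size) as pool:
        replies = pool.map(lambda words: node_search(target, words), slices)
        return next((word for word in replies if word is not None), None)


dictionary = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]  # hypothetical
started = time.perf_counter()
match = root(stand_in_hash("echo"), dictionary, size=3)
elapsed = time.perf_counter() - started  # the time that would be noted by the root
no_match = root(stand_in_hash("zulu"), dictionary, size=3)
```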



V. EXPECTED RESULT AND CONCLUSION


Though the expected speedup of the algorithm is ideally the
number of processes spawned by the MPI program, this is true
only for the worst-case scenario. However, even in the general
case, this simulation of parallel computing should yield
substantial speedup data that can be used to measure the
viability of using parallel processes to speed up programs that
have inherently been designed for single-processor computers.
Any parallel computing experiment needs a sizeable amount of
data to produce meaningful results. Although the data in the
test might be slightly limited because the processing is being
distributed among more than one node, the overall result should
produce data from which conclusions about this work can
reasonably be drawn.
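The worst-case claim can be illustrated by counting hash comparisons instead of measuring time. The sketch assumes contiguous, equal-sized slices and ignores all communication and startup cost; the function names and the figures of 1000 entries and 4 processes are hypothetical.

```python
def comparisons_serial(index: int) -> int:
    """Comparisons a single process makes before reaching the entry at `index`."""
    return index + 1


def comparisons_parallel(index: int, total: int, workers: int) -> int:
    """Comparisons made by the worker whose slice contains the entry at `index`."""
    slice_len = total // workers  # assumption: total divides evenly among workers
    return index % slice_len + 1


def speedup(index: int, total: int, workers: int) -> float:
    """Ratio of serial to parallel comparison counts for a match at `index`."""
    return comparisons_serial(index) / comparisons_parallel(index, total, workers)


TOTAL, WORKERS = 1000, 4                         # hypothetical sizes
worst_case = speedup(TOTAL - 1, TOTAL, WORKERS)  # match is the very last entry
first_entry = speedup(0, TOTAL, WORKERS)         # match is the very first entry
```

With the match in the last position the ratio equals the number of processes, while a match in the first position gains nothing from the extra processes, so measured speedups depend on where the password falls in the dictionary.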


REFERENCES


[1] M. Naylor, S. Jha, S. K. Mehta, “Hacking Windows Password,”
    unpublished.

[2] Microsoft Knowledgebase, www.microsoft.com

[3] Brian Wilson, “LM & MD5 Hash Security & Cracking,”

[4] Philippe Oechslin, “Making a Faster Cryptanalytic
    Time-Memory Trade-Off,”