Yara & Python

20 Nov 2013

Malware Identification and Classification


CarolinaCon 7

Michael Goffin

@mjxg

http://www.mgoff.in
Hey sir!

Why hello there!!


Rochester Institute of Technology

Computer Science House

Information Security Scientist/Engineer

What’s in store?


Malware


Yara


Python


Identification and Classification of Malware


Showing it all off


QQ session

Malware! Sonofa...

Methods of acquisition



downloads



compromised website content (ex: images)



attachments



links to compromised site content

You’ve been infiltrated!

Things to note:


You don’t know it yet, and might not for a while


You don’t know the scope of it


You don’t know the severity of it


But you eventually see something…

Start the cycle!

Management wants answers!

What do you do next?


Go into a panic!


Oh no! We should remove the known compromised host(s) from the network!


We should assess the compromise…somehow!


Oh geez, might be good to change passwords


let’s just have everyone do it just in case!


We need to go through logs and other hosts for signs of lateral movement

wait, what are we looking for?

Can we make firewall rules to block any IPs or domains?


Do we have any AV or IDS appliances?


Most importantly

You did get a copy of the malware to analyze, right?


…Right?


Get better at data mining!



Who is interested in this user or your company?



What are they trying to do with this malware (and what are they exploiting)?



When did this malware come in?



Where did it come from and where did it go to?



Why are they after your company, or this user?



How does this malware help them accomplish their goals?

What do we do with all the data?

Build a classification database over time!



Identify trends



Find commonalities
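
One way such a classification database might start is a pair of SQLite tables. This is only a sketch: the schema, the column names, and every record below are made up for illustration.

```python
import sqlite3

# Hypothetical schema: one row per sample, one row per indicator seen in it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE samples (
    sha256 TEXT PRIMARY KEY,
    first_seen TEXT,      -- ISO date the sample arrived
    family TEXT           -- label assigned after analysis
);
CREATE TABLE indicators (
    sha256 TEXT REFERENCES samples(sha256),
    kind TEXT,            -- e.g. 'domain', 'ip', 'string'
    value TEXT
);
""")

# Made-up records, for illustration only.
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", [
    ("aaa", "2013-01-10", "backdoor"),
    ("bbb", "2013-02-02", "keylogger"),
    ("ccc", "2013-02-20", "backdoor"),
])
conn.executemany("INSERT INTO indicators VALUES (?, ?, ?)", [
    ("aaa", "domain", "bad.example"),
    ("bbb", "domain", "bad.example"),
    ("ccc", "ip", "192.0.2.10"),
])

# Trend: how many samples have been filed under each family.
trends = conn.execute(
    "SELECT family, COUNT(*) FROM samples GROUP BY family ORDER BY family"
).fetchall()

# Commonality: indicators that show up in more than one sample.
shared = conn.execute(
    "SELECT kind, value, COUNT(DISTINCT sha256) FROM indicators "
    "GROUP BY kind, value HAVING COUNT(DISTINCT sha256) > 1"
).fetchall()

print(trends)
print(shared)
```

The same two queries keep working as more samples are filed, which is the point of building the database over time.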

Lots of action, now what?

Enter Yara

What does Yara do?

Identify and classify malware samples based on textual or binary patterns contained within those samples

MALWARE!


How does it do it?

Pretty basic:


Search for patterns


Use defined conditions to determine if the patterns are a positive match

Output matching rule content for consumption
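
The three steps above can be mimicked in a few lines of plain Python. This is a toy illustration of the idea, not how Yara is implemented; the pattern names, the byte strings, and the `scan` and `condition` helpers are all hypothetical.

```python
# Toy illustration of "search, evaluate condition, output" (not Yara itself).

# Hypothetical patterns: a name mapped to the bytes to look for.
PATTERNS = {
    "$mz": b"MZ",
    "$cmd": b"cmd.exe",
}


def condition(found):
    """Hypothetical condition: every pattern must be present."""
    return all(found.values())


def scan(data, patterns=PATTERNS):
    # Step 1: search for each pattern in the sample.
    found = {name: pattern in data for name, pattern in patterns.items()}
    # Step 2: apply the condition to decide whether this is a positive match.
    matched = condition(found)
    # Step 3: return something a person or another tool can consume.
    return {"matched": matched, "found": found}


sample = b"MZ\x90\x00 ... cmd.exe /c whoami"
result = scan(sample)
print(result)
```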

Yara and Python

Step 1:

% python

Step 2:

> import yara

> rules = yara.compile(signatures)

> matches = rules.match(filetoscan)

Step 3:

profit
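
A slightly fuller version of Step 2, as a sketch. It assumes the third-party yara-python package is installed, and it compiles from a rule string and scans bytes in memory instead of using file paths. The rule `suspicious_shell` and the helper `scan_bytes` are made up for this example.

```python
# Assumes the third-party yara-python package is installed
# (pip install yara-python).

# Hypothetical rule, made up for this example.
RULE_SOURCE = r'''
rule suspicious_shell
{
    meta:
        description = "example only"
    strings:
        $cmd = "cmd.exe"
    condition:
        $cmd
}
'''


def scan_bytes(data, rule_source=RULE_SOURCE):
    """Compile the rule text and return the names of the rules that match."""
    import yara  # imported here so this module loads even without yara

    rules = yara.compile(source=rule_source)
    matches = rules.match(data=data)
    return [match.rule for match in matches]


# Usage: scan_bytes(b"... cmd.exe /c whoami ...")
```

Each match object also carries `tags`, `meta`, and `strings`, which is the "matching rule content for consumption" part.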

As the old saying goes…

If it walks like a duck…

And it quacks like a duck…

It’s probably the DHA installing backdoors and keyloggers while xfil’ing your data.

Identification


Can we tease out specific characteristics about this piece of malware that can describe it both from a functional and fashionable perspective?


What does it attempt to touch?


What does it attempt to modify?


Is this type of malware stylish?


Etc.

Identification


Are there any quantitative or qualitative datasets about this malware that can help further describe its nature?


Functions used in other malware


Code style similar to other malware


IPs or domains used


Specific targets (files, processes, etc.)


End result of successful execution

Classification

Questions [1]:

Does an unknown malware instance belong to a known malware family or does it constitute a novel malware strain?

What behavioral features are discriminative for distinguishing instances of one malware family from those of other families?


Compare these to our Identification

Strains


Trojan


Rootkit


Backdoor


Xfil


Worms


Ransomware


Keylogger

Build Signatures


Generate conditions


Build rules for those conditions


Compile rules into a signature set


Develop a process to scan files using those signature sets

Generate alerts

Set human response expectations to these alerts!!
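
The compile, scan, and alert steps above might be wired together roughly like this. It is a sketch that assumes the yara-python package; the `.yar` extension, the directory layout, the alert dictionary shape, and all three helper names are assumptions.

```python
from pathlib import Path


def collect_rule_files(rule_dir):
    """Map a namespace (the file stem) to each .yar file in rule_dir."""
    return {path.stem: str(path) for path in sorted(Path(rule_dir).glob("*.yar"))}


def build_signature_set(rule_dir, compiled_path):
    """Compile every rule file into one signature set and save it to disk."""
    import yara  # third-party yara-python, imported lazily

    rules = yara.compile(filepaths=collect_rule_files(rule_dir))
    rules.save(compiled_path)
    return rules


def scan_directory(compiled_path, target_dir):
    """Scan each file under target_dir and yield one alert per rule hit."""
    import yara

    rules = yara.load(compiled_path)
    for path in Path(target_dir).rglob("*"):
        if path.is_file():
            for match in rules.match(str(path)):
                yield {"file": str(path), "rule": match.rule, "meta": match.meta}
```

What happens to each alert (ticket, email, page) is the human-response part and is deliberately left out.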

What a rule looks like

rule foo
{
    meta:
        key = "value"
    strings:
        $variable = something
    condition:
        logic_for_determining_positive_rule_match
}
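
To make the template concrete, here is a small Python helper that fills it in and returns the rule text. The helper `build_rule` and every value passed to it are hypothetical, and it assumes the values contain no quotes or backslashes that would need escaping.

```python
def build_rule(name, meta, strings, condition):
    """Render a Yara rule as text from plain Python values.

    Assumes values contain no double quotes or backslashes.
    """
    lines = [f"rule {name}", "{"]
    lines.append("    meta:")
    for key, value in meta.items():
        lines.append(f'        {key} = "{value}"')
    lines.append("    strings:")
    for identifier, text in strings.items():
        lines.append(f'        {identifier} = "{text}"')
    lines.append("    condition:")
    lines.append(f"        {condition}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical values, chosen only to fill in the template.
rule_text = build_rule(
    name="example_keylogger",
    meta={"author": "analyst", "family": "keylogger"},
    strings={"$api": "GetAsyncKeyState", "$log": "keylog.txt"},
    condition="$api and $log",
)
print(rule_text)
```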

Conditions

Some basic condition examples:


A string or value exists


A set of strings or values exist


Strings or values at certain offsets exist


The number of times a string or value occurs


File size restriction
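
Each condition type listed above has a direct form in Yara's condition syntax. The sketch below pairs them up and renders one small rule per example; the string names, string contents, thresholds, and rule names are made up.

```python
# Each slide bullet paired with strings and a condition in Yara's syntax.
# String names, contents, and thresholds are hypothetical.
EXAMPLES = [
    ("a string exists", {"$a": "keylog"}, "$a"),
    ("a set of strings exist", {"$a": "keylog", "$b": "cmd.exe"}, "all of them"),
    ("a string at a certain offset", {"$mz": "MZ"}, "$mz at 0"),
    ("number of occurrences", {"$a": "keylog"}, "#a > 3"),
    ("file size restriction", {"$a": "keylog"}, "$a and filesize < 200KB"),
]


def render_rule(index, strings, condition):
    """Render one rule; only strings used by the condition are declared."""
    lines = [f"rule condition_example_{index}", "{", "    strings:"]
    lines += [f'        {name} = "{text}"' for name, text in strings.items()]
    lines += ["    condition:", f"        {condition}", "}"]
    return "\n".join(lines)


rules_source = "\n\n".join(
    render_rule(i, strings, cond) for i, (_, strings, cond) in enumerate(EXAMPLES)
)
print(rules_source)
```

Every rule declares only the strings its condition uses, because Yara rejects a rule that defines a string and never references it.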


Let’s see Yara in action!


How to incorporate Yara

Web downloads


Web content


urllib


Email attachments


Honeypots


Grab files from AV and IDS appliances to scan!
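
For the web content case, a sketch of fetching a URL with the standard library and handing the bytes to an already compiled rule set. The helper names, the size cap, and the timeout are assumptions; `rules` is expected to be the object returned by `yara.compile`.

```python
from urllib.request import urlopen


def fetch_bytes(url, limit=10 * 1024 * 1024):
    """Download at most `limit` bytes from url (hypothetical size cap)."""
    with urlopen(url, timeout=30) as response:
        return response.read(limit)


def scan_url(url, rules):
    """Scan downloaded content with an already compiled rule set."""
    data = fetch_bytes(url)
    return [match.rule for match in rules.match(data=data)]
```

Scanning the bytes in memory avoids writing a possibly malicious file to disk first. The same `rules.match(data=...)` call would apply to an email attachment payload or a file pulled from a honeypot.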

Why Yara?


Supplement to additional applications (Snort, AV, detonation chambers)

MD5 of known malware only good if exact file is seen again

Detect future malware with similar identifiers that AV or IDS might not catch yet


Free
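
The MD5 point is easy to demonstrate with the standard library. Both "samples" below are made up, and the substring test only stands in for a Yara string match.

```python
import hashlib

# Two made-up "samples" that differ by a single trailing byte.
original = b"MZ...GetAsyncKeyState...keylog.txt\x00"
variant = b"MZ...GetAsyncKeyState...keylog.txt\x01"

# An exact-hash lookup only recognises the file it was computed from.
known_md5 = {hashlib.md5(original).hexdigest()}
hash_hit_original = hashlib.md5(original).hexdigest() in known_md5
hash_hit_variant = hashlib.md5(variant).hexdigest() in known_md5

# A content pattern (a plain substring test standing in for a Yara string)
# still matches both samples.
pattern = b"GetAsyncKeyState"
pattern_hit_original = pattern in original
pattern_hit_variant = pattern in variant

print(hash_hit_original, hash_hit_variant)
print(pattern_hit_original, pattern_hit_variant)
```

The hash lookup misses the variant while the pattern still hits it.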


The cooldown

http://code.google.com/p/yara-project/




Questions?

References


[1] Learning and Classification of Malware Behavior

Rieck, Holz, Willems, Dussel, Laskov

http://pi1.informatik.uni-mannheim.de/filepool/publications/malware-classification-dimva08.pdf