Example: Data Mining for the NBA - The University of Texas at Dallas



Digital Forensics

Dr. Bhavani Thuraisingham

The University of Texas at Dallas


Lecture #5

Forensics Systems


September 5, 2007

Outline


Some developments


Review of Lectures 3 and 4


Lecture 5

- Types of Computer Forensics Systems

- Objective: Identify issues in corporate planning for computer forensics


Tools for Digital Forensics


Assignment #1


Lab Tour


Some Developments


Internship positions available in computer forensics with
DFW area FBI and Law Enforcement


Guest lectures are being arranged to be given by DFW FBI
and Law Enforcement

- Dates to be given


Mid-term exam: week of October 9 or October 16





Review of Lectures 3 and 4


Lecture 3

- Forensics Technology
    - Military, Law Enforcement, Business Forensics

- Forensics Techniques
    - Finding Hidden Data, Spyware, Encryption, Data Protection, Tracing, Data Mining

- Security Technologies
    - Wireless, Firewalls, Biometrics

- APPENDIX: Data Mining


Lecture 4: Data Mining for Malicious Code Detection


Types of Computer Forensics Systems


Internet Security Systems


Intrusion Detection Systems


Firewall Security Systems


Storage Area Network Security Systems


Network disaster recovery systems


Public key infrastructure systems


Wireless network security systems


Satellite encryption security systems


Instant Messaging Security Systems


Net privacy systems


Identity management security systems


Identity theft prevention systems


Biometric security systems


Homeland security systems


Internet Security Systems


Security hierarchy

- Public, Private and Mission Critical data

- Unclassified, Confidential, Secret and Top Secret data


Security Policy

- Who gets access to what data

- Bell-LaPadula Security Policy, Noninterference Policy


Access Control

- Role-based access control, Usage control


Encryption

- Public/private keys

- Secret payment systems


Directions

- Smart cards
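
The Bell-LaPadula policy named on this slide is commonly summarized as "no read up, no write down" over an ordered set of security levels. The sketch below is only an illustration of those two checks using the four levels listed above; the function names `can_read` and `can_write` are made up for this example, and real systems also handle categories and discretionary controls, which are left out here.

```python
# Security levels from the slide, ordered from lowest to highest.
LEVELS = ["Unclassified", "Confidential", "Secret", "Top Secret"]
RANK = {name: position for position, name in enumerate(LEVELS)}


def can_read(subject_level, object_level):
    """Simple security property: a subject may not read above its level."""
    return RANK[subject_level] >= RANK[object_level]


def can_write(subject_level, object_level):
    """Star property: a subject may not write below its level."""
    return RANK[subject_level] <= RANK[object_level]


# A Secret-cleared subject against a Confidential document.
print("read allowed:", can_read("Secret", "Confidential"))
print("write allowed:", can_write("Secret", "Confidential"))
```

Writing down is refused so that a high-level subject cannot leak data into a lower-level object.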



Intrusion Detection Systems


An intrusion can be defined as “any set of actions that attempt to
compromise the integrity, confidentiality, or availability of a resource”.


Attacks are:

- Host-based attacks

- Network-based attacks


Intrusion detection systems are split into two groups:

- Anomaly detection systems

- Misuse detection systems


Use audit logs

- Capture all activities in network and hosts.

- But the amount of data is huge!
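
The two groups of intrusion detection systems can be contrasted with a small sketch over audit-log data. Misuse detection looks for patterns of known attacks, while anomaly detection flags behavior that deviates from a learned baseline. Everything below is assumed for illustration: the signature strings, the log lines, the per-user event counts, and the three-standard-deviation threshold are not from the lecture.

```python
from statistics import mean, stdev

# Hypothetical signatures of known attacks (misuse detection).
SIGNATURES = ["/etc/passwd", "cmd.exe", "' OR 1=1"]


def misuse_alerts(log_lines):
    """Return the audit-log lines that contain a known attack signature."""
    return [line for line in log_lines if any(sig in line for sig in SIGNATURES)]


def anomaly_alerts(baseline_counts, observed_counts, threshold=3.0):
    """Return users whose event count is far from the baseline mean."""
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)
    return [user for user, count in observed_counts.items()
            if abs(count - mu) > threshold * sigma]


# Made-up audit data for the example.
log = ["GET /index.html", "GET /../../etc/passwd"]
print(misuse_alerts(log))
print(anomaly_alerts([10, 12, 11, 9, 13], {"alice": 12, "bob": 90}))
```

Misuse detection cannot see attacks it has no signature for, while anomaly detection can raise false alarms on unusual but legitimate activity.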



Our Approach: Overview

[Figure: Hierarchical clustering with SVM flow chart. Recoverable labels: Training Data, Hierarchical Clustering (DGSOT), SVM Class Training, Testing Data, Testing, Class]

DGSOT: Dynamically growing self organizing tree

Our Approach: Hierarchical Clustering

Worm Detection: Introduction


What are worms?

- Self-replicating program; Exploits software vulnerability on a victim; Remotely infects other victims


Evil worms

- Severe effect; Code Red epidemic cost $2.6 Billion


Automatic signature generation possible

- EarlyBird System (S. Singh, UCSD); Autograph (H.-A. Kim, CMU)


Goals of worm detection

- Real-time detection


Issues

- Substantial Volume of Identical Traffic, Random Probing


Methods for worm detection

- Count number of sources/destinations; Count number of failed connection attempts


Worm Types

- Email worms, Instant Messaging worms, Internet worms, IRC worms, File-sharing network worms
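
The counting methods listed above (number of destinations contacted and number of failed connection attempts) can be sketched as a per-source tally. This is a minimal illustration, not any particular system: the record format `(source, destination, succeeded)`, the function name `suspicious_sources`, the thresholds, and the sample addresses are all assumptions made for the example.

```python
from collections import defaultdict


def suspicious_sources(connections, max_failed=3, max_destinations=3):
    """Flag sources that probe many destinations or fail many connections.

    connections: iterable of (source, destination, succeeded) tuples.
    """
    failed = defaultdict(int)
    destinations = defaultdict(set)
    for source, destination, succeeded in connections:
        destinations[source].add(destination)
        if not succeeded:
            failed[source] += 1
    return sorted(source for source in destinations
                  if failed[source] > max_failed
                  or len(destinations[source]) > max_destinations)


# Made-up traffic: one host probes five addresses and fails every time.
traffic = [("10.0.0.5", "10.0.1.%d" % i, False) for i in range(5)]
traffic.append(("10.0.0.9", "10.0.1.1", True))
print(suspicious_sources(traffic))
```

Random probing tends to hit unused addresses, which is why a high failure count is a useful signal.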

Email Worm Detection using Data Mining

[Figure: email worm detection pipeline. Recoverable labels: Outgoing Emails, Feature extraction, Training data, Machine Learning, The Model, Test data, Classifier, Clean or Infected?]

Task: given some training instances of both “normal” and “viral” emails,
induce a hypothesis to detect “viral” emails.


We used:

- Naïve Bayes

- SVM
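
To make the task concrete, here is a small Naïve Bayes sketch that learns from labeled emails and classifies a new one as "normal" or "viral". It is only an illustration of the technique: the lecture does not say which features were extracted, so plain word tokens are assumed, the training emails are made up, and the names `train` and `classify_email` are hypothetical. Laplace (add-one) smoothing is used so that unseen words do not zero out a class.

```python
import math
from collections import Counter


def train(examples):
    """Build the model from a list of (tokens, label) pairs."""
    label_counts = Counter(label for _, label in examples)
    token_counts = {label: Counter() for label in label_counts}
    vocabulary = set()
    for tokens, label in examples:
        token_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, token_counts, vocabulary


def classify_email(model, tokens):
    """Return the label with the highest log-posterior for the tokens."""
    label_counts, token_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denominator = sum(token_counts[label].values()) + len(vocabulary)
        for token in tokens:
            # Add-one smoothing for tokens unseen in this class.
            score += math.log((token_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training instances for the example.
training = [
    (["meeting", "agenda", "attached"], "normal"),
    (["lunch", "tomorrow"], "normal"),
    (["open", "attachment", "exe", "urgent"], "viral"),
    (["urgent", "exe", "click"], "viral"),
]
model = train(training)
print(classify_email(model, ["urgent", "exe"]))
```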

Firewall Security Systems


A firewall is a system or group of systems that enforces an
access control policy between two networks


Benefits

- Implements access control across networks

- Maintains logs that can be analyzed


Data mining for analyzing firewall logs and ensuring
policy consistency


Limitations

- No security within the network

- Difficult to implement content-based policies

- Difficult to protect against malicious code
    - Data-driven attacks

Traffic Mining


Traffic mining: analyze the traffic and the packet logs to bridge the gap
between what is written in the firewall policy rules and what is being
observed in the network


Network traffic trends may show that some rules are outdated or have not
been used recently




[Figure: traffic mining flow chart. Recoverable labels: Firewall Log File, Mining Log File Using Frequency, Filtering, Rule Generalization, Generic Rules, Identify Decaying & Dominant Rules, Edit Firewall Rules, Firewall Policy Rule]
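
The frequency step of traffic mining can be sketched as counting how often each policy rule is hit in the firewall log. This is a simplified illustration under stated assumptions: each log entry is assumed to already carry the number of the rule it matched, a "dominant" rule is taken to be one that accounts for at least half of the logged traffic, and a "decaying" rule is one with no hits in the log window. None of these definitions or names come from the lecture.

```python
from collections import Counter


def rule_usage(matched_rule_ids, all_rule_ids, dominant_share=0.5):
    """Split rules into dominant (heavily used) and decaying (unused)."""
    hits = Counter(matched_rule_ids)
    total = sum(hits.values())
    dominant = [rule for rule in all_rule_ids
                if total and hits[rule] / total >= dominant_share]
    decaying = [rule for rule in all_rule_ids if hits[rule] == 0]
    return dominant, decaying


# Made-up log: the rule number matched by each logged packet.
log = [2, 2, 2, 5, 2, 7]
dominant, decaying = rule_usage(log, range(1, 8))
print("dominant:", dominant)
print("decaying:", decaying)
```

Decaying rules are candidates for review or removal, and dominant rules are candidates for moving earlier in the policy.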

Traffic Mining Results

Anomaly Discovery Result

Rule 1, Rule 2: ==> GENERALIZATION

Rule 1, Rule 16: ==> CORRELATED

Rule 2, Rule 12: ==> SHADOWED

Rule 4, Rule 5: ==> GENERALIZATION

Rule 4, Rule 15: ==> CORRELATED

Rule 5, Rule 11: ==> SHADOWED

1: TCP,INPUT,129.110.96.117,ANY,*.*.*.*,80,DENY

2: TCP,INPUT,*.*.*.*,ANY,*.*.*.*,80,ACCEPT

3: TCP,INPUT,*.*.*.*,ANY,*.*.*.*,443,DENY

4: TCP,INPUT,129.110.96.117,ANY,*.*.*.*,22,DENY

5: TCP,INPUT,*.*.*.*,ANY,*.*.*.*,22,ACCEPT

6: TCP,OUTPUT,129.110.96.80,ANY,*.*.*.*,22,DENY

7: UDP,OUTPUT,*.*.*.*,ANY,*.*.*.*,53,ACCEPT

8: UDP,INPUT,*.*.*.*,53,*.*.*.*,ANY,ACCEPT

9: UDP,OUTPUT,*.*.*.*,ANY,*.*.*.*,ANY,DENY

10: UDP,INPUT,*.*.*.*,ANY,*.*.*.*,ANY,DENY

11: TCP,INPUT,129.110.96.117,ANY,129.110.96.80,22,DENY

12: TCP,INPUT,129.110.96.117,ANY,129.110.96.80,80,DENY

13: UDP,INPUT,*.*.*.*,ANY,129.110.96.80,ANY,DENY

14: UDP,OUTPUT,129.110.96.80,ANY,129.110.10.*,ANY,DENY

15: TCP,INPUT,*.*.*.*,ANY,129.110.96.80,22,ACCEPT

16: TCP,INPUT,*.*.*.*,ANY,129.110.96.80,80,ACCEPT

17: UDP,INPUT,129.110.*.*,53,129.110.96.80,ANY,ACCEPT

18: UDP,OUTPUT,129.110.96.80,ANY,129.110.*.*,53,ACCEPT
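
The three anomaly labels in the result list can be reproduced for the listed pairs with a pairwise comparison of rule fields. The sketch below is an assumption about how such a check might work, not the tool used to produce the slide. It uses a common reading of the terms for two rules with different actions: a later rule is shadowed when an earlier rule matches everything it matches, a later rule is a generalization when it matches everything an earlier rule matches, and two rules are correlated when their fields overlap but neither contains the other. The helper names are hypothetical, and only the eight rules involved in the listed pairs are copied in.

```python
# Rules copied from the list above:
# protocol,direction,src_ip,src_port,dst_ip,dst_port,action
RULES = {
    1: "TCP,INPUT,129.110.96.117,ANY,*.*.*.*,80,DENY",
    2: "TCP,INPUT,*.*.*.*,ANY,*.*.*.*,80,ACCEPT",
    4: "TCP,INPUT,129.110.96.117,ANY,*.*.*.*,22,DENY",
    5: "TCP,INPUT,*.*.*.*,ANY,*.*.*.*,22,ACCEPT",
    11: "TCP,INPUT,129.110.96.117,ANY,129.110.96.80,22,DENY",
    12: "TCP,INPUT,129.110.96.117,ANY,129.110.96.80,80,DENY",
    15: "TCP,INPUT,*.*.*.*,ANY,129.110.96.80,22,ACCEPT",
    16: "TCP,INPUT,*.*.*.*,ANY,129.110.96.80,80,ACCEPT",
}


def parse_rule(text):
    """Split a rule into its six match fields and its action."""
    *fields, action = text.split(",")
    return fields, action


def field_covers(general, specific):
    """True if `general` matches every value that `specific` matches."""
    if general == specific or general == "ANY":
        return True
    g, s = general.split("."), specific.split(".")
    if len(g) == 4 and len(s) == 4:  # dotted IP pattern with * wildcards
        return all(a == "*" or a == b for a, b in zip(g, s))
    return False


def classify_pair(earlier_id, later_id):
    """Label the anomaly between an earlier and a later rule, if any."""
    f1, a1 = parse_rule(RULES[earlier_id])
    f2, a2 = parse_rule(RULES[later_id])
    if a1 == a2:
        return None  # same action: redundancy is not handled in this sketch
    if all(field_covers(x, y) for x, y in zip(f1, f2)):
        return "SHADOWED"
    if all(field_covers(y, x) for x, y in zip(f1, f2)):
        return "GENERALIZATION"
    if all(field_covers(x, y) or field_covers(y, x) for x, y in zip(f1, f2)):
        return "CORRELATED"
    return None


for pair in [(1, 2), (1, 16), (2, 12), (4, 5), (4, 15), (5, 11)]:
    print("Rule %d, Rule %d: ==> %s" % (pair[0], pair[1], classify_pair(*pair)))
```

Running the check over all 18 rules would report further pairs under these definitions, so the slide's list should be read as a selection rather than a complete result.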

Storage Area Network Security Systems


High-performance networks that connect all the storage
systems

- After a disaster such as terrorism or a natural disaster
(9/11 or Katrina), the data has to be available

- Database systems are a special kind of storage system


Benefits include centralized management, scalability,
reliability, performance


Security attacks on multiple storage devices

- Secure storage is being investigated

Network Disaster Recovery Systems


Network disaster recovery is the ability to respond to an
interruption in network services by implementing a disaster
recovery plan


Policies and procedures have to be defined and subsequently
enforced


Which machines to shut down, which backup servers to use,
when law enforcement should be notified

Public Key Infrastructure Systems


A certificate authority that issues and verifies digital
certificates


A registration authority that acts as a verifier for the
certificate authority before a digital certificate is issued to a
requester


One or more directories where the certificates with their
public keys are held


A certificate management system


Digital Identity Management


Digital identity is the identity that a user has to access an
electronic resource


A person could have multiple identities

- A physician could have an identity to access medical
resources and another to access his bank accounts


Digital identity management is about managing the multiple
identities

- Manage databases that store and retrieve identities

- Resolve conflicts and heterogeneity

- Make associations

- Provide security


Ontology management for identity management is an
emerging research area

Digital Identity Management - II


Federated Identity Management

- Corporations work with each other across organizational
boundaries with the concept of federated identity

- Each corporation has its own identity and may belong to
multiple federations

- Individual identity management within an organization
and federated identity management across organizations


Technologies for identity management

- Database management, data mining, ontology
management, federated computing

Identity Theft Management



Need for secure identity management

- Ease the burden of managing numerous identities

- Prevent misuse of identity: preventing identity theft


Identity theft is stealing another person’s digital identity


Techniques for preventing identity theft include

- Access control, Encryption, Digital Signatures

- A merchant encrypts the data with the public key of the
recipient and signs it with the merchant's own private key

- Recipient decrypts with his private key
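
In the usual public-key arrangement, data is encrypted with the recipient's public key and decrypted with the recipient's private key, while a signature is produced with the signer's private key and checked with the signer's public key. The sketch below illustrates this with textbook RSA on tiny primes. It is a teaching toy under loud assumptions: the primes, the exponent, the message values, and the helper names are all made up, there is no padding, and nothing this small is secure. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib


def make_keypair(p, q, e=17):
    """Return (public_key, private_key) for toy primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (Python 3.8+)
    return (e, n), (d, n)


def apply_key(key, value):
    """Raise value to the key's exponent modulo n."""
    exponent, n = key
    return pow(value, exponent, n)


def digest(message, n):
    """Hash a byte string down to an integer smaller than n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


merchant_public, merchant_private = make_keypair(61, 53)
recipient_public, recipient_private = make_keypair(89, 97)

# Confidentiality: encrypt with the recipient's public key.
secret = 42
ciphertext = apply_key(recipient_public, secret)
print("decrypted correctly:", apply_key(recipient_private, ciphertext) == secret)

# Authenticity: sign with the merchant's private key.
order_hash = digest(b"order data", merchant_public[1])
signature = apply_key(merchant_private, order_hash)
print("signature valid:", apply_key(merchant_public, signature) == order_hash)
```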

Biometrics


Early Identification and Authentication (I&A) systems were
based on passwords


Recently physical characteristics of a person are being used
for identification

- Fingerprinting

- Facial features

- Iris scans

- Voice recognition

- Facial expressions


Biometrics techniques will provide access not only to
computers but also to buildings and homes


Systems are vulnerable to attack, e.g., fake biometrics

Homeland Security Systems


Border and Transportation Security

- RFID technologies?


Emergency preparedness

- After an attack happens what actions are to be taken?


Chemical, Biological, Radiological and Nuclear security

- Sensor technologies


Information analysis and Infrastructure protection

- Data mining, security technologies


Other Types of Systems


Wireless security systems

- Protecting PDAs and phones against denial of service
and related attacks


Satellite encryption systems

- Pretty Good Privacy (PGP), which uses RSA security


Instant messaging

- Deployment of instant messaging is usually not
controlled

- Should IM be blocked?


Net Privacy

- Can we ensure privacy on the networks and systems?

- Privacy preserving access?

Conclusion


We have discussed many types of forensics systems


These are systems that are secure, but can be
attacked


Security solutions include policy enforcement,
access control, encryption, and protection against
malicious code


How can these systems be compromised and what
are the actions that need to be taken?

Open Source and Related Tools


http://www.opensourceforensics.org/tools/index.html


http://www.cerias.purdue.edu/research/forensics/


http://www.digital-evidence.org/papers/opensrc_legal.pdf


http://digitalforensics.ch/nikkel05b.pdf


http://www.fukt.bth.se/~uncle/papers/master/thesis.pdf


http://www.vascan.org/webdocs/06confdocs/Day1-TechnicalTrack-DONE/CrimJesseDigital%20Forensics.pdf




Assignment #1


Four exercises at the end of Chapters 1, 2, 3 and 4


Due date: September 24, 2007


You can read the answers at the back, but please try to
produce your own answers



Lab Tour and possible Programming projects


SAIAL: Security Analysis and Information Assurance
Laboratory


Develop programs to monitor what your adversary is doing

- Will help our research a lot


Can you develop techniques that will put pieces of the deleted
files together to create the original file?


Use data analysis/mining for intrusion detection


Simulate an attack and use the open source tools


Analyze a disk image

- Will try to give you a disk image to work with