DNS Data and Metadata Extraction - US-Cert

5 Feb 2013

Techniques for DNS Analysis

Ryan Breed

Manager, Critical Infrastructure Security

1997-2005 Pen Testing, Risk Assessment, Infrastructure roll-outs

2005-2009 Operations, Incident Response, Network and Host forensics, Instrumentation development

2010- Manager, Critical Infrastructure Security:

Security Operations

Compliance Controls Implementation

Ryan Breed - ERCOT


Client requests and server responses tell you a lot about what is happening on endpoints


What servers clients are looking for


Services the client supports


Record format is compact: cheap to store


Normal operation is easy to weed out


Abnormal operation is easy to spot



Clients -> Local DNS Server

Local server lookups

Misconfigured search paths

Queries to Remote Domains


Local DNS Server -> Internet

Recursive Queries

Misconfigured Searches

Cache refreshes (sometimes)


Internet -> Local DNS Server

Referrals (Authority Records)

Authoritative Answers


Local DNS Server -> Clients

Authoritative Answers for local domains

Cached Responses

Non-Authoritative answers (from recursive lookups)


REF: RFC 1035

Two kinds of (normal) DNS traffic in transit:

Client -> Local DNS

Local DNS server -> Remote DNS

Anything else (e.g. client -> remote DNS server) is a tip-off
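
The two normal paths above can be turned into a simple flow classifier. The following is a minimal sketch, not part of the original talk; the resolver address and internal network range are hypothetical placeholders that would need to be replaced with local values.

```python
import ipaddress

# Hypothetical values for illustration only: substitute your own resolvers and ranges.
LOCAL_RESOLVERS = {"10.0.0.53"}
INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8")]


def is_internal(addr: str) -> bool:
    """Return True if the address falls inside one of the internal networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in INTERNAL_NETS)


def classify_flow(src: str, dst: str) -> str:
    """Label the direction of one DNS query as a normal path or a tip-off."""
    if is_internal(src) and src not in LOCAL_RESOLVERS and dst in LOCAL_RESOLVERS:
        return "client->local"
    if src in LOCAL_RESOLVERS and not is_internal(dst):
        return "local->remote"
    # Anything else, e.g. a client talking straight to a remote DNS server.
    return "suspicious"
```

For example, `classify_flow("10.1.2.3", "8.8.8.8")` falls through both normal cases and is labelled `"suspicious"`.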



RFC 1034 (concepts) RFC 1035 (specification)


Primarily used for resolving hostnames into IP addresses and vice versa


Really just a giant, distributed key-value store


Client requests a key (e.g. www.google.com)


Server returns a value (74.125.227.52)


Different servers are authoritative for subsections of the namespace


DNS Resource Records (RRs) can convey different types of information

NS - Authoritative Name Server

A - Hostname -> IP mapping

PTR - IP -> Hostname mapping

MX - Mail Exchanger

CNAME - Canonical Name (Alias)

REF: http://www.iana.org/assignments/dns-parameters





EASY TO PARSE

REF: RFC 1035



TYPE     Value and meaning
-----    -----------------
A        1 a host address
NS       2 an authoritative name server
MD       3 a mail destination (Obsolete - use MX)
MF       4 a mail forwarder (Obsolete - use MX)
CNAME    5 the canonical name for an alias
SOA      6 marks the start of a zone of authority
MB       7 a mailbox domain name (EXPERIMENTAL)
MG       8 a mail group member (EXPERIMENTAL)
MR       9 a mail rename domain name (EXPERIMENTAL)
NULL     10 a null RR (EXPERIMENTAL)
WKS      11 a well known service description
PTR      12 a domain name pointer
HINFO    13 host information
MINFO    14 mailbox or mail list information
MX       15 mail exchange
TXT      16 text strings
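
When parsing captured traffic, the table above is typically carried as a lookup from numeric code to mnemonic. A minimal sketch follows; the function name `rr_type_name` is made up for illustration, and the `TYPEnnn` fallback for codes outside this table follows the generic notation from RFC 3597.

```python
# TYPE codes copied from the RFC 1035 table above.
RR_TYPES = {
    1: "A", 2: "NS", 3: "MD", 4: "MF", 5: "CNAME", 6: "SOA", 7: "MB", 8: "MG",
    9: "MR", 10: "NULL", 11: "WKS", 12: "PTR", 13: "HINFO", 14: "MINFO",
    15: "MX", 16: "TXT",
}


def rr_type_name(code: int) -> str:
    """Map a numeric TYPE to its mnemonic, or TYPEnnn if it is not in the table."""
    return RR_TYPES.get(code, f"TYPE{code}")
```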



The query/reply (QR) flag indicates whether the message is a query (0) or a reply (1).

The authoritative (AA) flag is set in a reply message when a name server is an authoritative server for a queried name.

The truncation (TC) flag is set whenever the message is truncated.

The recursion-desired (RD) flag is set when a client (host or name server) wants the name server to perform recursion when it doesn't have the record.

The recursion available (RA) flag is set in a reply if the name server supports recursion.
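
These flags sit in the second 16-bit word of the fixed 12-byte header defined in RFC 1035, so they can be pulled out with a few bit masks. The sketch below is illustrative only; the function name and dictionary keys are arbitrary choices, not taken from the talk.

```python
import struct


def parse_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte DNS header (RFC 1035, section 4.1.1)."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # Six big-endian 16-bit fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack("!6H", packet[:12])
    return {
        "id": ident,
        "qr": bool(flags & 0x8000),      # 0 = query, 1 = reply
        "opcode": (flags >> 11) & 0xF,
        "aa": bool(flags & 0x0400),      # authoritative answer
        "tc": bool(flags & 0x0200),      # truncated
        "rd": bool(flags & 0x0100),      # recursion desired
        "ra": bool(flags & 0x0080),      # recursion available
        "rcode": flags & 0xF,
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }
```

A flags word of `0x8180`, common in recursive replies, decodes to QR, RD and RA set with AA and TC clear.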



Clients requesting blacklisted names (easy)

Could be a variety of reasons to see names in transit

DNS queries will typically pass where web requests are blocked

Monitoring systems may be resolving domain names

Anti-spam systems may be looking up names in RBLs







Low TTL names


Rotating, randomized IPs from a large pool


Difficult to block access with just a firewall


Algorithmically generated domain names


Used in botnet C2 infrastructure
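
One way to surface the low-TTL, rotating-IP pattern is to group A-record answers by name and count distinct addresses. This is a rough sketch under assumed thresholds (`max_ttl=300`, `min_ips=5`), which are illustrative guesses and not values from the talk; legitimate CDNs can also match, so treat the output as candidates only.

```python
from collections import defaultdict


def fast_flux_candidates(answers, max_ttl=300, min_ips=5):
    """Return names seen with a low TTL and many distinct IPs.

    answers: iterable of (name, ip, ttl) tuples taken from A-record responses.
    """
    ips = defaultdict(set)
    min_ttl = {}
    for name, ip, ttl in answers:
        key = name.lower().rstrip(".")
        ips[key].add(ip)
        min_ttl[key] = min(ttl, min_ttl.get(key, ttl))
    return sorted(
        name for name in ips
        if min_ttl[name] <= max_ttl and len(ips[name]) >= min_ips
    )
```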



DNS is a block query/response transfer protocol


Anything can be encapsulated


Basic tunneling


File transfer


Simple socket forwarding


Full-featured tunneling


Complete L3 transport stack


Optional transport encryption


Basic implementation theory


Break up datagrams into small chunks

Apply encapsulation and sequencing protocol in an existing query/rdata format

Maintain state to handle retransmissions/drops
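
To give analysts a feel for what chunked, sequenced payloads look like in query names, the steps above can be sketched in memory. This is an assumption-laden illustration and does not reproduce any specific tool: the base32 encoding, the `seq.label.domain` layout and the 30-byte chunk size are all invented for the example, and no retransmission state is kept.

```python
import base64

MAX_LABEL = 63  # RFC 1035 limit on the length of one label


def encode_chunks(payload: bytes, domain: str, chunk_size: int = 30) -> list:
    """Split payload into chunks and wrap each in a query name: seq.label.domain."""
    names = []
    for seq, start in enumerate(range(0, len(payload), chunk_size)):
        chunk = payload[start:start + chunk_size]
        label = base64.b32encode(chunk).decode("ascii").rstrip("=").lower()
        if len(label) > MAX_LABEL:
            raise ValueError("chunk_size too large for a single DNS label")
        names.append(f"{seq}.{label}.{domain}")
    return names


def decode_chunks(names: list) -> bytes:
    """Reassemble the payload, using the sequence number to restore order."""
    parts = {}
    for name in names:
        seq, label = name.split(".")[:2]
        padded = label.upper() + "=" * (-len(label) % 8)  # restore base32 padding
        parts[int(seq)] = base64.b32decode(padded)
    return b"".join(parts[i] for i in sorted(parts))
```

The resulting names carry long, random-looking labels, which is one reason name-length statistics are useful for spotting tunnels.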




OzymanDNS


Uses TXT records


easy to spot


Supports socket forwarding


DNS2TCP


NSTX


Heyoka


DNScat


Uses A and CNAME queries


Supports file transfer and tunneling via PPP


Iodine


Uses A, CNAME, TXT, NULL, SRV, or MX queries





Blacklisted Domains


Activity is only suspicious if it originates from client addresses

May see lots of activity from forwarding servers if you have monitoring systems

http://Malwaredomains.com

http://spamlinks.net/filter-dnsbl-lists.htm


Dynamic DNS Domains


Allows anyone to get a delegated subdomain

DynDNS, FreeDNS, No-IP


Low-TTL domains
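
The blacklist check described above, where lookups only count when they come from client addresses, might be sketched as follows. The blacklist entry and infrastructure addresses are hypothetical placeholders; a real list would be loaded from a feed such as the ones linked above.

```python
# Hypothetical data for illustration only.
BLACKLIST = {"bad.example"}
INFRASTRUCTURE = {"10.0.0.53", "10.0.0.25"}  # resolvers, monitoring, anti-spam hosts


def is_blacklisted(qname: str, blacklist=BLACKLIST) -> bool:
    """True if the name or any parent domain of it appears in the blacklist."""
    labels = qname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blacklist for i in range(len(labels)))


def suspicious_lookups(queries, blacklist=BLACKLIST, infrastructure=INFRASTRUCTURE):
    """Keep blacklisted lookups unless they come from known infrastructure hosts.

    queries: iterable of (source_ip, query_name) pairs.
    """
    return [
        (src, qname) for src, qname in queries
        if src not in infrastructure and is_blacklisted(qname, blacklist)
    ]
```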




Odd record types


E.g.: NULL, HINFO, MINFO, NXT


Ref: http://www.iana.org/assignments/dns-parameters


Odd record classes


HS (Hesiod)


CH (Chaos)


Unassigned
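
A check for the odd types and classes listed above can be a pair of small lookups. In this sketch the numeric codes come from the IANA DNS parameters registry (NXT is type 30, CH is class 3, HS is class 4); the function name and the wording of the reasons are illustrative assumptions.

```python
ODD_TYPES = {10: "NULL", 13: "HINFO", 14: "MINFO", 30: "NXT"}
ODD_CLASSES = {3: "CH", 4: "HS"}
EXPECTED_CLASSES = {1}  # IN (Internet)


def odd_reasons(qtype: int, qclass: int) -> list:
    """Return the reasons a question looks unusual (empty list if none)."""
    reasons = []
    if qtype in ODD_TYPES:
        reasons.append(f"odd type {ODD_TYPES[qtype]}")
    if qclass in ODD_CLASSES:
        reasons.append(f"odd class {ODD_CLASSES[qclass]}")
    elif qclass not in EXPECTED_CLASSES:
        reasons.append(f"unexpected class {qclass}")
    return reasons
```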



# Client-server bytes sent/received


# Client-server queries-responses


# of unique names resolved by client


σ of client name request length


μ of # requests/responses per packet
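
Some of the per-client measures above can be computed directly from (client, name) pairs. The sketch below covers query count, unique names and the standard deviation of name length; the input shape and key names are assumptions made for the example, and the population standard deviation is used so that a client with a single query yields 0.0.

```python
import statistics
from collections import defaultdict


def client_stats(queries):
    """Summarise query names per client.

    queries: iterable of (client_ip, query_name) pairs.
    """
    names = defaultdict(list)
    for client, qname in queries:
        names[client].append(qname.lower().rstrip("."))
    stats = {}
    for client, qnames in names.items():
        lengths = [len(n) for n in qnames]
        stats[client] = {
            "queries": len(qnames),
            "unique_names": len(set(qnames)),
            "name_len_stdev": statistics.pstdev(lengths),
        }
    return stats
```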




Ratio of unique (internet) name queries to unique (internet) http destinations


More DNS traffic than internet client traffic


Ratio of non-standard query types to standard query types


High number of CNAME/NULL/MX


Unique IPs resolved per name


Unique names resolved per client
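
Two of the ratios above are simple enough to sketch. What counts as a "standard" query type is a local judgment call; the set below (A, AAAA, PTR) is an assumption for illustration, and both functions guard against a zero denominator by treating it as 1.

```python
# Assumption: which query types count as "standard" for ordinary clients.
STANDARD_TYPES = {"A", "AAAA", "PTR"}


def dns_to_http_ratio(dns_names, http_hosts) -> float:
    """Unique names queried divided by unique HTTP destinations for one client."""
    return len(set(dns_names)) / max(len(set(http_hosts)), 1)


def nonstandard_ratio(qtypes) -> float:
    """Count of non-standard query types divided by count of standard ones."""
    qtypes = list(qtypes)
    standard = sum(1 for t in qtypes if t in STANDARD_TYPES)
    return (len(qtypes) - standard) / max(standard, 1)
```

A client whose DNS-to-HTTP ratio is far above its peers is resolving names it never browses to, which fits the "more DNS traffic than internet client traffic" indicator.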



Can use wireshark for interactive exploration


Can use tshark (parses most record types) + shell tools (cut/sort/awk/sed)


Use analytics developed during exploration


Use display filters to pull specific pieces of data out


http://www.wireshark.org/docs/dfref/d/dns.html


Develop IDS sigs for qualitative analytics


Use ETL framework or dnsdump to transform traffic into CSV, load into database


http://dns.measurement-factory.com/tools/dnsdump/
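
As a stand-in for the shell pipeline, tab-separated `tshark -T fields` output can be turned into CSV with a few lines of Python. This is a sketch only: the capture command in the comment is an assumed invocation (the field names come from the Wireshark DNS display filter reference), and the column names are arbitrary.

```python
import csv
import io

# Assumed capture command, producing one tab-separated line per DNS query:
#   tshark -r capture.pcap -Y "dns.flags.response == 0" -T fields \
#          -e frame.time_epoch -e ip.src -e dns.qry.name -e dns.qry.type
FIELDS = ["time_epoch", "src", "qname", "qtype"]


def tshark_to_csv(tshark_output: str) -> str:
    """Convert tab-separated tshark field output into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)
    for line in tshark_output.splitlines():
        cols = line.split("\t")
        if len(cols) != len(FIELDS) or not cols[2]:
            continue  # skip lines that are malformed or have no query name
        writer.writerow(cols)
    return out.getvalue()
```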




DSC - dns stats collector

http://dns.measurement-factory.com/tools/dsc/

Quantify query types

Quantify queries by node


DNStop

http://dns.measurement-factory.com/tools/dnstop/

Quantify top resolvers

Quantify top tld/2ld

Quantify record types


DNSSTAT

http://www.caida.org/tools/utilities/dnsstat/

Types/queries by client



Overkill at its finest:




Multiple sensor/parser engines


Multiple query processors


Can have dedicated ones per app


Replicated metadata store


Sharded data store


Replication sets for busy shards


Many cores or many boxes



Distributed


Multiple PCAP readers and/or sniffers


Schema-free fast database persistence (mongodb)


Can add attributes and indices on the fly


No need to rebuild DB or existing client code


Stores DNS queries and responses


Parses DNS records and indexes the results


Collects statistics about endpoints, queries, and responses


Simple query API (JSON/REST)



Passive DNS server


Collect and store A queries


Identify fast-fluxing domains


Identify IP information where only hostnames exist


Blacklist detection


Anomalous record type detection


Raw packet storage


Statistical analysis





Collection organization


Data partitioned into hour-level partitions


Makes cleaning up easier, queries run faster


Data sharded by source/dest IP address


Adds locality for queries


Reasonably distributed


Document organization


Jam everything into a single document


Parsed metadata


Raw packet
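
The layout above (one collection per hour, one document per packet holding both parsed metadata and raw bytes) can be illustrated without a database. The collection naming scheme and field names below are hypothetical and are not taken from the tool's actual schema.

```python
from datetime import datetime, timezone


def collection_name(ts: float, prefix: str = "dns") -> str:
    """Name of the hour-level partition for a packet timestamp (UTC)."""
    hour = datetime.fromtimestamp(ts, tz=timezone.utc)
    return f"{prefix}_{hour:%Y%m%d%H}"


def build_document(ts: float, src: str, dst: str, parsed: dict, raw: bytes) -> dict:
    """Put everything about one packet into a single document."""
    return {
        "ts": ts,
        "src": src,      # candidate shard key
        "dst": dst,
        "dns": parsed,   # parsed metadata (names, types, flags, ...)
        "raw": raw,      # raw packet bytes kept alongside the metadata
    }
```

With hourly collections, ageing out old data becomes a matter of dropping whole collections rather than deleting individual documents.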



FAST persistence/query times (2GB queries in 30s)


Backend DB supports auto-sharding, replication, failover


Can scale out rather than scale up


Cross-platform/language query API


Simplified data administration


Can use capped collections


Schemaless design allows flexibility in metadata extraction


Can change metadata extraction on the fly without DB changes


Map-reduce queries run asynchronously, distributed


Can work on or offline





Space vs. Speed (2-10x increase over pcap)


Single-threaded parser(s)


Pure-ruby parser is SLOW (20 minutes to process 5 minutes of traffic)


Pure-ruby serialization is SLOW


No integrated UI (must use shell)


Reporting done through outboard system (e.g.: Jasper) or hand-coded reports



16m records queried in 37s, random shard distribution



Need to have a full-content capture solution


No DNS/TCP support


Parsing DNS in pure ruby is SLOW


60 s of traffic can take 10 minutes to parse


Parsing and storage costs are made worse by duplicate packet storage


Need PCAP deduplication and reordering


Need multiple cores in processing pipeline


Need to develop a good sharding strategy


Front-end? Feh. Who needs a front-end?



Add canned reporting with Jasper


Canned js queries loaded server-side


Map-reduce queries for quantitative analytics


Switch pure-ruby parser over to java-based JRuby implementation (20x speedup)


Switch pure-ruby serialization over to java-based serialization


Age out older collections




If you reduce scope by only looking at what comes out of your internet pipe, you will lose track of which client was doing the lookup


If you only pay attention to what is coming from your recursive DNS servers, you will lose client request detail


Your security sensor and/or SIEM infrastructure will look up scary names


Those lookups will be transferred to your recursive DNS servers


This will light up your IDS and SIEM like a Christmas tree (unless you use suppression)


If you use dedicated recursive name servers for your security infrastructure (MX, SPAM/Virus filters, IDS, SIEM, etc.), you will reduce your baseline.


Clients tend not to look up a lot of unique names


Clients that look up lots of unique names (but within an order of magnitude of the general population) are generally surfing beaucoup web


Non-DNS servers tend not to look up many names at all
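
The "order of magnitude" rule of thumb above can be expressed as a comparison against the median client. This is a minimal sketch; using the median as the baseline and a factor of 10 as the cut-off are interpretations of the slide, not stated parameters.

```python
import statistics


def unusual_clients(unique_names: dict, factor: float = 10) -> list:
    """Clients whose unique-name count exceeds `factor` times the median client.

    unique_names: mapping of client address -> number of unique names resolved.
    """
    if not unique_names:
        return []
    baseline = max(statistics.median(unique_names.values()), 1)
    return sorted(c for c, n in unique_names.items() if n > factor * baseline)
```

Clients above the median but under the cut-off are left alone, matching the observation that heavy web users stay within an order of magnitude of everyone else.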




DNS is an underappreciated vector for security analysis


Funny things can happen over DNS that no IDS or firewall will be able to see


Spotting the trends can be hard if you are picking packets apart one at a time


DNS anomalies can happen well within the RFCs


Backdoors and exfiltration over DNS are no joke, but if you have bigger network security deficiencies, please prioritize your control strategies



My blog: http://securityanaly.st


My contact: blog@securityanaly.st


Twitter: @secanalystblog


Fork me on GitHub: https://github.com/ryanbreed/nominalyze

