
DETECTING RESOLVERS AT .NZ

Jing Qiao, Sebastian Castro
DNS-OARC 29, Amsterdam, October 2018

BACKGROUND


DNS TRAFFIC IS NOISY

• Despite general belief, not all the sources seen at an authoritative nameserver are resolvers
• Non-resolvers’ traffic could skew our Domain Popularity ranking result
• What if we could tell whether a source is a resolver?


OBJECTIVE

A classifier to predict the probability of a source being a resolver
It’s not a simple task:

1. Trail blazing
2. Uncertainty about resolvers’ pattern

“In the search of resolvers”, OARC 25

“Understanding traffic sources seen at .nz”, OARC 27


USING DNS KNOWLEDGE TO REDUCE THE SCOPE


REMOVE NOISE BY CRITERIA

4 weeks (28/08/2017 – 24/09/2017)

• 27.8% of sources only queried for 1 domain
• 45.5% of sources only queried for 1 query type
• 25% of sources only queried 1 of the 7 .nz servers
• 65.8% of sources sent no more than 10 queries per day

The address list is reduced from 2M to 550k
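As an illustration only, this kind of criteria-based filtering could be expressed with pandas over a per-source summary table; the file name and columns (`domains`, `qtypes`, `servers`, `avg_daily_queries`) are assumptions, not the actual pipeline:

```python
import pandas as pd

# Hypothetical per-source summary built from 4 weeks of .nz query logs;
# one row per source address, column names are illustrative only.
sources = pd.read_csv("source_summary.csv")

noise = (
    (sources["domains"] == 1)               # only queried 1 domain
    | (sources["qtypes"] == 1)              # only queried 1 query type
    | (sources["servers"] == 1)             # only queried 1 of the 7 .nz servers
    | (sources["avg_daily_queries"] <= 10)  # no more than 10 queries per day
)
candidates = sources[~noise]
print(f"{len(sources)} -> {len(candidates)} sources after noise removal")
```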


FOCUS ON ACTIVE SOURCES

• Active at least 75% of total hours
• Active at least 5/7 days per week

The address list is further reduced to 82k
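Continuing the same sketch, the activity filter could be a second pass over the remaining candidates (again with assumed columns `active_hours`, `total_hours`, `active_days_per_week`):

```python
# Keep only sources that stay active across the 4-week window.
active = candidates[
    (candidates["active_hours"] / candidates["total_hours"] >= 0.75)  # active in >= 75% of hours
    & (candidates["active_days_per_week"] >= 5)                       # active >= 5 of 7 days per week
]
print(f"{len(candidates)} -> {len(active)} active sources")
```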


ASSUMPTION: THE REST IS EITHER RESOLVER OR MONITOR.

THE KEY IS TO FIND DISCRIMINATING FEATURES.


DATASET

• .nz DNS traffic across 4 weeks
• 82k unique sources


KNOWN SAMPLES

2515 Resolvers
• ISP, Google DNS, OpenDNS, Education & Research, addresses collected from RIPE Atlas probes

106 Monitors
• ICANN, Pingdom, ThousandEyes, RIPE Atlas probes, RIPE Atlas anchors


NO CLEAR SPLIT OVER SINGLE FEATURE

Resolvers and monitors are distributed across overlapping ranges


TEMPORAL FEATURES

Query features: query type, query name, query rate, DNS flags, response code, ...
• Each feature can be described as a time series
• Summarized by mean, standard deviation, and percentiles (10th, 90th)
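A rough pandas sketch of turning one query feature into a per-source time series and summarizing it; the log layout and column names are assumed, not taken from the slides:

```python
import pandas as pd

# Assumed raw log: one row per query with timestamp, source, qtype, qname, rcode, ...
log = pd.read_parquet("nz_queries.parquet")
log["hour"] = log["timestamp"].dt.floor("h")

# Example feature: hourly query rate per source as a time series,
# summarized by mean, standard deviation and 10th/90th percentiles.
hourly_rate = log.groupby(["source", "hour"]).size()
temporal = hourly_rate.groupby(level="source").agg(
    mean="mean",
    std="std",
    p10=lambda s: s.quantile(0.10),
    p90=lambda s: s.quantile(0.90),
)
```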


ENTROPY FEATURES

Queries from a resolver should be more random than those from a monitor

Timing entropy:
• Time lag between successive queries
• Time lag between successive queries of the same query name and query type

Query name entropy:
• Similarity of the query names (Jaro-Winkler string distance) between successive queries
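One way to realize these features, assuming the `jellyfish` package for Jaro-Winkler similarity and a simple histogram-based Shannon entropy (the exact estimator used by the authors is not specified in the slides):

```python
import numpy as np
import jellyfish  # assumed third-party package for Jaro-Winkler similarity

def shannon_entropy(values, bins=20):
    """Histogram-based Shannon entropy of a sequence of values."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def timing_entropy(timestamps):
    """Entropy of time lags between successive queries from one source."""
    lags = np.diff(np.sort(np.asarray(timestamps, dtype=float)))
    return shannon_entropy(lags) if len(lags) else 0.0

def qname_entropy(qnames):
    """Entropy of Jaro-Winkler similarity between successive query names."""
    sims = [jellyfish.jaro_winkler_similarity(a, b) for a, b in zip(qnames, qnames[1:])]
    return shannon_entropy(sims) if sims else 0.0
```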


VARIABILITY

A monitor’s query flow is likely to be less variable than a resolver’s
• Finer aggregation across hour tiles
• Variance metrics:
  • Interquartile Range
  • Quartile Coefficient of Dispersion
  • Mean Absolute Difference
  • Median Absolute Deviation
  • Coefficient of Variation
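These are all standard dispersion statistics; a small numpy/scipy sketch over one source's hourly query counts (the aggregation into hour tiles is assumed to have happened upstream):

```python
import numpy as np
from scipy import stats

def variability_metrics(hourly_counts):
    """Dispersion metrics over one source's hourly query counts."""
    x = np.asarray(hourly_counts, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    return {
        "iqr": q3 - q1,                                          # Interquartile Range
        "qcd": (q3 - q1) / (q3 + q1) if (q3 + q1) else 0.0,      # Quartile Coefficient of Dispersion
        "mean_abs_diff": np.abs(x[:, None] - x[None, :]).mean(), # Mean Absolute Difference
        "mad": stats.median_abs_deviation(x),                    # Median Absolute Deviation
        "cv": x.std() / x.mean() if x.mean() else 0.0,           # Coefficient of Variation
    }
```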

FEATURE SELECTION

• Remove redundant features with correlation above 0.95
• Select the most relevant features by statistical tests, such as the F-test and Mutual Information (MI)
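A possible scikit-learn rendering of these two steps, assuming `X` is a pandas DataFrame of candidate features and `y` the resolver/monitor labels for the known samples; k=50 matches the final feature set size but is otherwise an assumption:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif

# 1. Drop one of each pair of features correlated above 0.95.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
redundant = [col for col in upper.columns if (upper[col] > 0.95).any()]
X_reduced = X.drop(columns=redundant)

# 2. Keep the most relevant features according to the F-test;
#    mutual_info_classif is the Mutual Information alternative.
selector = SelectKBest(score_func=f_classif, k=50)
X_selected = selector.fit_transform(X_reduced, y)
```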


FINAL FEATURE SET

50 features of different scales, normalized to comparable scales:
• Standardization: zero mean and unit variance
• Quantile Transformation: robust to outliers
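Both transforms are available in scikit-learn; a minimal sketch on the selected feature matrix from the previous step:

```python
from sklearn.preprocessing import StandardScaler, QuantileTransformer

# Standardization: rescale every feature to zero mean and unit variance.
X_std = StandardScaler().fit_transform(X_selected)

# Quantile transformation: map each feature onto a reference distribution,
# which makes the result robust to outliers.
X_quant = QuantileTransformer(output_distribution="normal").fit_transform(X_selected)
```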


VERIFY THE FEATURE SET BY CLUSTERING

Algorithms:
• K-Means
• Gaussian Mixture Model
• MeanShift
• DBSCAN
• Agglomerative Clustering

Evaluation metrics:
• Adjusted Rand Index
• Homogeneity Score
• Completeness Score
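A sketch of that verification loop with scikit-learn, assuming `X_known` holds the normalized features of the labelled samples and `y_known` their resolver/monitor labels; the cluster counts are illustrative (the slides only report 5 clusters for the best Gaussian Mixture):

```python
from sklearn.cluster import KMeans, MeanShift, DBSCAN, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score, homogeneity_score, completeness_score

models = {
    "kmeans": KMeans(n_clusters=5, n_init=10),
    "gmm": GaussianMixture(n_components=5),
    "meanshift": MeanShift(),
    "dbscan": DBSCAN(),
    "agglomerative": AgglomerativeClustering(n_clusters=5),
}

# Compare cluster assignments against the known resolver/monitor labels.
for name, model in models.items():
    labels = model.fit_predict(X_known)
    print(name,
          adjusted_rand_score(y_known, labels),
          homogeneity_score(y_known, labels),
          completeness_score(y_known, labels))
```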


CLUSTERING RESULT

The model with the best performance on the known samples: Gaussian Mixture with 5 clusters
• Adjusted Rand Index: 0.849789 (good)
• Homogeneity score: 0.960086 (good)
• Completeness score: 0.671609 (average, but acceptable as resolvers can be separated into a set of clusters different from monitors)


VERIFY WITH GROUND TRUTH

Most resolvers are separated from monitors; OpenDNS is a special case:
• Only 7 samples
• Specific behavior


CONCLUSION

Excluding samples from OpenDNS, 90% of resolvers (Clusters 0, 2, 4) are separated from 91% of monitors (Clusters 1, 3) using our feature set


SUPERVISED CLASSIFIER

Training data:
• 2515 positive (resolver)
• 106 negative (monitor)
• 50 features
• 4-week period


AUTOML

Challenges of training a classifier:
• Searching algorithms and hyperparameters
• Performance and efficiency

We use auto-sklearn to address these challenges:
• Off-the-shelf toolkit that automates the process of model selection and hyperparameter tuning
• Improves performance and efficiency using Bayesian optimization, meta-learning, and ensemble construction
• “Efficient and Robust Automated Machine Learning”, Feurer et al., Advances in Neural Information Processing Systems 28 (NIPS 2015)
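A minimal auto-sklearn usage sketch; the time budgets, split, and label encoding (1 = resolver) are assumptions, since the slides do not give the exact configuration behind the 28-model ensemble:

```python
import autosklearn.classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# X_std / y_known: normalized features and 0/1 labels of the known samples.
X_train, X_test, y_train, y_test = train_test_split(
    X_std, y_known, test_size=0.25, stratify=y_known)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=600,  # total search budget in seconds (assumed)
    per_run_time_limit=60,        # per-model time limit (assumed)
)
automl.fit(X_train, y_train)

y_pred = automl.predict(X_test)
print("accuracy ", accuracy_score(y_test, y_pred))
print("precision", precision_score(y_test, y_pred))
print("recall   ", recall_score(y_test, y_pred))
print("f1       ", f1_score(y_test, y_pred))
```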


PRELIMINARY RESULT

An ensemble of 28 models:
• Running time: 10 minutes
• Accuracy score: 0.991
• Precision score: 0.991
• Recall score: 1.000
• F1 score: 0.995


CURRENT STATUS

• Train the classifier regularly on a sliding window
  • Takes a few hours, mostly on feature generation
• Predict the type of a source against a given threshold (probability of being like a resolver)
  • We picked probabilities higher than 0.7 (see the sketch below)
• Integrate with the popularity ranking algorithm
  • Improved ranking for known domains
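The thresholded prediction could look like this sketch, reusing the trained `automl` classifier from the previous sketch; `X_all` (features of the unlabelled sources) is an assumed name:

```python
# Probability of the "resolver" class for every unlabelled source.
proba = automl.predict_proba(X_all)[:, 1]

# Treat a source as a resolver only when it looks resolver-like
# with probability higher than 0.7.
is_resolver = proba > 0.7
```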


CURRENT STATUS

• Out of 100k source addresses classified, 73k were detected as resolvers

• Following a request from SIDN, the model identified 96% of Illimunati probes as resolvers


POTENTIAL USES & FUTURE WORK

With adjustments to the feature set and training data, source classification from authoritative data (e.g. the root servers) can be used to measure:
• validating resolvers
• adoption of QNAME minimization in the wild
• etc.


ACKNOWLEDGEMENTS

Daniel Griggs for his input to reduce noise.
Sebastian Castro for the variability metrics.


REFERENCE

https://blog.nzrs.net.nz/source-address-clustering-feature-engineering/

https://blog.nzrs.net.nz/source-address-classification-clustering/


jing@internetnz.net.nz

sebastian@internetnz.net.nz

THANK YOU!