Use more than two DNS servers

Ryan Weal

July 27, 2017

Toward the end of 2016 I was visiting California for BadCamp and hosting a day-long training. That same day, "Internet of Things" (IoT) devices were wreaking havoc on the Internet by DDoS'ing DNS servers, causing an outage for major sites across much of North America.

I was slightly smug that day, as the night before I had emailed the trainees and encouraged them to download the materials in advance, and most had done so! Those who did not were able to use Pantheon's hosted environment, as their DNS provider seemed to be unaffected where we were.

We carried on. The training went smoothly.

Throughout the day I kept thinking to myself... I run a DNS. I should probably do a full review of the aftermath of this incident. So I did...

What should we be doing for DNS

Later on, while reading about the aftermath, I came across a comment about DNS servers. Most people have their domains served by two DNS servers. I did the same... no problems here, right?

However, there were problems... I first noticed them when travelling to South America a couple of years prior. The first request to my website on any network seemed to time out, and on the second try the DNS would resolve. At the time my servers were in Canada. So I moved them to New York. Problem solved. Or so I thought...

A couple of years later I was in India. Guess what happened? More or less the same thing as in South America, although on some networks it was possible to get a reply on the first hit. Whenever I did get a reply, it was slow to arrive.

Back to 2016 in California: I was surprised to learn that a lot of people "in the know" on these matters recommend you have at least THREE (3) authoritative name servers for your domains!

There was also a suggestion by some commentators during that huge DNS outage in 2016 that having diversity within those three servers would be a good idea as well, both geographic diversity and TLD diversity.
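As a rough illustration of that advice, here is a minimal Python sketch that checks a list of nameserver hostnames against the two rules of thumb: at least three servers, spread over more than one TLD. The function names and the hostnames (`ns1.example.com`, `ns4.example.global` and so on) are hypothetical, made up for this example, and geographic diversity is not checked at all.

```python
from collections import defaultdict


def group_by_tld(nameservers):
    """Group nameserver hostnames by their top-level domain (last label)."""
    groups = defaultdict(list)
    for host in nameservers:
        # "ns1.example.com." -> "com"
        tld = host.rstrip(".").rsplit(".", 1)[-1].lower()
        groups[tld].append(host)
    return dict(groups)


def check_diversity(nameservers, min_servers=3, min_tlds=2):
    """Return a list of warnings; an empty list means both checks pass."""
    warnings = []
    if len(nameservers) < min_servers:
        warnings.append(
            f"only {len(nameservers)} nameservers, want at least {min_servers}"
        )
    tlds = group_by_tld(nameservers)
    if len(tlds) < min_tlds:
        warnings.append(
            f"nameservers span {len(tlds)} TLD(s), want at least {min_tlds}"
        )
    return warnings


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    servers = ["ns1.example.com", "ns2.example.com"]
    for warning in check_diversity(servers):
        print(warning)
```

The thresholds are parameters rather than constants because the "three servers, two TLDs" figures are a recommendation, not a hard rule.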

Knowing all of this, I knew I could be doing better on DNS. I set a modest goal: improve the reachability of the site and increase its resilience to network failures.

Time for some tests

Simply testing connection speeds from where I live in Montréal to New York wasn't going to do the trick. Ping times are typically really fast. What I needed was a way to measure global latency. There are a few providers out there, but I settled on one that seemed to do what I needed, and its DNS tests were free.
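For a quick spot check from a single location (not a substitute for a global measurement service), a lookup against one authoritative server can be timed with nothing but the Python standard library. This is a minimal sketch: `build_query` and `time_query` are names invented for this example, the IP address in the usage section is a documentation placeholder, and the code does not validate the reply beyond receiving it.

```python
import socket
import struct
import time


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # Question: length-prefixed labels, a terminating zero byte,
    # then QTYPE=A (1) and QCLASS=IN (1).
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def time_query(server_ip, name, timeout=2.0):
    """Return the round-trip time in milliseconds, or None on timeout/error."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        try:
            sock.sendto(packet, (server_ip, 53))
            sock.recvfrom(512)  # the reply is not parsed, only awaited
        except OSError:  # includes socket.timeout
            return None
        return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    # Placeholder address and name: substitute your own nameserver and domain.
    elapsed = time_query("192.0.2.53", "example.com")
    print("timeout" if elapsed is None else f"{elapsed:.1f} ms")
```

Running the same script from a few different networks gives a crude version of the picture a global testing service draws.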

beginning graph

The tests confirmed my experiences travelling internationally. For people close to North America things are good, but once you start crossing oceans - and multiple oceans - the picture changes quite a lot.

Improvement 1: a third DNS server

The next step was to add a third DNS server. My existing servers were in NYC and San Francisco, so I added one in Europe (Frankfurt), as that is my next most visited place in real life.

now with 3 dns servers

Huge improvement already. Wow. 62 users are getting the response within 50ms (up from 53 users before), and now we have 88 people getting a response (up from 71 before).

Note that I'm not doing anything fancy here - no load balancing, no geo IP targeting, just simply adding an IP address to the pool and letting the magic of the Internet work the way it was designed.

Improvement 2: a fourth DNS server, on a different TLD

Following the advice of commentators, I thought I would set up a new domain (on the ".global" TLD in this case) to "diversify" my DNS in case the top-level domain that I normally use goes down. Simple enough to do... but I forgot how to add a custom DNS server at my registrar to "authorize" a DNS server running on that domain. Minor detail, but I have been running DNS for so long that I had forgotten about it.

Once I figured that out we were in business. I used MXtoolbox to check the results and made changes until everything synced up nicely.
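One thing "synced up" can mean in practice is that every authoritative server reports the same SOA serial for the zone. That is an assumption about what to check, not a description of what MXtoolbox does. Given serials collected by hand (for example with `dig +short SOA yourdomain @nameserver`), a minimal Python sketch for spotting a lagging server could look like this; the function name, hostnames and serials are all hypothetical.

```python
def find_stale_servers(serials):
    """Return the nameservers whose SOA serial is behind the highest one seen.

    `serials` maps a nameserver hostname to the SOA serial it reported.
    Plain integer comparison is used, so serial wrap-around is ignored.
    """
    if not serials:
        return []
    newest = max(serials.values())
    return sorted(ns for ns, serial in serials.items() if serial != newest)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    reported = {
        "ns1.example.com": 2017072701,
        "ns2.example.com": 2017072701,
        "ns4.example.global": 2017072601,
    }
    print(find_stale_servers(reported))
```

An empty result means every server reported the same serial.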

Time for more tests:

even more users getting replies

Wow! That is an amazing improvement. Roughly the same 62 people are getting the response within 50ms, but many more users are being reached overall: 99 people are now getting replies (the progression was 71->88->99).

I compared this result to one of my Cloudflare-powered sites, and although Cloudflare performs slightly better at this task, the chart looks very much the same. Not bad for a first attempt!

Huge win for very little effort. I won't have any timeouts on the first request when I travel anymore.

If you're still not willing to host your own DNS, use a service like Cloudflare or Fastly to speed things up. They will likely make you publish only 2 nameservers but they have other magic to achieve similar results.

When you're ready to go to the next level I recommend reading "Surviving the next DNS attack" to get more insight into tuning DNS.

Written by:
Ryan Weal @ryan_weal
Web developer based in Montréal. I run Kafei Interactive Inc. Node.js, Vue.js, Cassandra. Distributed data. Hire us to help with your data-driven projects.