Domain Name System
The Domain Name System (DNS) associates various information with domain names; most importantly, it serves as the "phone book" for the Internet by translating human-readable computer hostnames, e.g. www.example.com, into IP addresses, e.g. 208.77.188.166, which networking equipment needs to deliver information. It also stores other information, such as the list of mail servers that accept email for a given domain. In providing a worldwide keyword-based redirection service, the Domain Name System is an essential component of contemporary Internet use.
Uses
The most basic task of DNS is to translate hostnames to IP addresses. In very simple terms, it can be compared to a phone book. DNS also has other important uses. Above all, DNS makes it possible to assign Internet names to organizations (or concerns they represent) independent of the physical routing hierarchy represented by the numerical IP address. Because of this, hyperlinks and Internet contact information can remain the same, whatever the current IP routing arrangements may be, and can take a human-readable form (such as "example.com"), which is easier to remember than the IP address 208.77.188.166.
People take advantage of this when they recite meaningful URLs and e-mail addresses without caring how the machine will actually locate them. The Domain Name System distributes the responsibility for assigning domain names and mapping them to IP networks by allowing an authoritative name server for each domain to keep track of its own changes, avoiding the need for a central register to be continually consulted and updated.
History
The practice of using a name as a more human-legible abstraction of a machine's numerical address on the network predates even TCP/IP, and goes all the way back to the ARPAnet era. Back then, however, a different system was used, as DNS was invented only in 1983, shortly after TCP/IP was deployed. With the older system, each computer on the network retrieved a file called HOSTS.TXT from a computer at SRI (now SRI International). The HOSTS.TXT file mapped numerical addresses to names. A hosts file still exists on most modern operating systems, either by default or through configuration, and allows users to specify an IP address (e.g. 208.77.188.166) to use for a hostname (e.g. www.example.net) without checking DNS. Systems based on a hosts file have inherent limitations, because every time a given computer's address changes, every computer that seeks to communicate with it needs an update to its hosts file.
The growth of networking called for a more scalable system, one that recorded a change in a host's address in one place only. Other hosts would learn about the change dynamically through a notification system, thus completing a globally accessible network of all hosts' names and their associated IP addresses. At the request of Jon Postel, Paul Mockapetris invented the Domain Name System in 1983 and wrote the first implementation. The original specifications appear in RFC 882 and RFC 883. In November 1987, the publication of RFC 1034 and RFC 1035 updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several more recent RFCs have proposed various extensions to the core DNS protocols.
In 1984, four Berkeley students (Douglas Terry, Mark Painter, David Riggle and Songnian Zhou) wrote the first UNIX implementation, which was maintained by Ralph Campbell thereafter. In 1985, Kevin Dunlap of DEC significantly rewrote the DNS implementation and renamed it BIND (Berkeley Internet Name Domain, previously Berkeley Internet Name Daemon). Mike Karels, Phil Almquist and Paul Vixie have maintained BIND since then. BIND was ported to the Windows NT platform in the early 1990s. Due to BIND's long history of security issues and exploits, several alternative nameserver and resolver programs have been written and distributed in recent years.
How DNS works in theory
The domain name space
The domain name space consists of a tree of domain names. Each node or leaf in the tree has zero or more resource records, which hold information associated with the domain name. The tree sub-divides into zones beginning at the root zone. A DNS zone consists of a collection of connected nodes authoritatively served by an authoritative DNS nameserver. (Note that a single nameserver can host several zones.)
When a system administrator wants to let another administrator control a part of the domain name space within the first administrator’s zone of authority, control can be delegated to the second administrator. This splits off a part of the old zone into a new zone, which comes under the authority of the second administrator's nameservers. The old zone ceases to be authoritative for the new zone.
Parts of a domain name
A domain name usually consists of two or more parts (technically, labels), separated by dots, such as example.com. The rightmost label conveys the top-level domain (for example, the address www.example.com has the top-level domain com). Each label to the left specifies a subdivision, or subdomain, of the domain above it. Note: “subdomain” expresses relative dependence, not absolute dependence.
For example, example.com is a subdomain of the com domain, and www.example.com is a subdomain of the domain example.com. In theory, this subdivision can go down 127 levels. Each label can contain up to 63 characters, and the whole domain name may not exceed a total length of 255 characters. In practice, some domain registries may have shorter limits. A hostname is a domain name that has one or more associated IP addresses; i.e., the 'www.example.com' and 'example.com' domains are both hostnames, but the 'com' domain is not.
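The limits above can be checked mechanically. The following is a minimal sketch, not a complete validator: the function names are made up for illustration, the three constants are taken directly from the text, and the check counts plain characters (the on-the-wire encoding adds length bytes, so the practical limit on the dotted form is slightly lower).

```python
from typing import List

MAX_LABEL_LENGTH = 63   # per-label limit stated in the text
MAX_NAME_LENGTH = 255   # whole-name limit stated in the text
MAX_LEVELS = 127        # maximum depth stated in the text


def split_labels(name: str) -> List[str]:
    """Split a dotted name into labels, ignoring one trailing dot."""
    if name.endswith("."):
        name = name[:-1]
    return name.split(".")


def check_name(name: str) -> List[str]:
    """Return a list of problems found; an empty list means the name passes."""
    problems = []
    labels = split_labels(name)
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name is longer than 255 characters")
    if len(labels) > MAX_LEVELS:
        problems.append("name has more than 127 levels")
    for label in labels:
        if not label:
            problems.append("empty label")
        elif len(label) > MAX_LABEL_LENGTH:
            problems.append("label is longer than 63 characters")
    return problems


def top_level_domain(name: str) -> str:
    """The rightmost label is the top-level domain."""
    return split_labels(name)[-1]
```

For instance, check_name("www.example.com") returns an empty list and top_level_domain("www.example.com") returns "com", while a name containing a 64-character label is reported as a problem.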
DNS servers
The Domain Name System consists of a hierarchical set of DNS servers. Each domain or subdomain has one or more authoritative DNS servers that publish information about that domain and the name servers of any domains "beneath" it. The hierarchy of authoritative DNS servers matches the hierarchy of domains. At the top of the hierarchy stand the root nameservers: the servers to query when looking up (resolving) a top-level domain name (TLD).
DNS resolvers
A resolver looks up the resource record information associated with nodes. A resolver knows how to communicate with name servers by sending DNS queries and heeding DNS responses. A DNS query may be either a recursive query or a non-recursive query. A non-recursive query is one where the DNS server may provide a partial answer to the query (or give an error). DNS servers must support non-recursive queries. A recursive query is one where the DNS server will fully answer the query (or give an error).
DNS servers are not required to support recursive queries. The resolver (or another DNS server acting recursively on behalf of the resolver) negotiates use of recursive service using bits in the query headers. Resolving usually entails iterating through several name servers to find the needed information. However, some resolvers function simplistically and can communicate only with a single name server. These simple resolvers rely on a recursive query to a recursive name server to perform the work of finding information for them.
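The header bits used for this negotiation can be illustrated with a short sketch. It assumes the 12-byte header layout of RFC 1035, in which the client sets a "recursion desired" (RD) bit and the server reports "recursion available" (RA); the function names are hypothetical, and only the header is built, not a complete query.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired", set by the client in the query
RA_FLAG = 0x0080  # "recursion available", set by the server in the reply


def build_header(query_id: int, recursion_desired: bool) -> bytes:
    """Build a 12-byte DNS header for a query carrying one question."""
    flags = RD_FLAG if recursion_desired else 0
    # Fields: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0,
    # each a 16-bit unsigned integer in network byte order.
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)


def recursion_flags(header: bytes) -> tuple:
    """Return (recursion_desired, recursion_available) from a header."""
    _query_id, flags = struct.unpack("!HH", header[:4])
    return bool(flags & RD_FLAG), bool(flags & RA_FLAG)
```

A simple resolver of the kind described above would always call build_header with recursion_desired=True and then check the RA bit in the reply to learn whether the server was willing to do the work.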
Address resolution mechanism
In theory a full host name may have several name segments (e.g. ahost.ofasubnet.ofabiggernet.inadomain.example). In practice, in the experience of the majority of public users of Internet services, full host names will frequently consist of just three segments (ahost.inadomain.example, and most often www.inadomain.example). For querying purposes, software interprets the name segment by segment, from right to left, using an iterative search procedure. At each step along the way, the program queries a corresponding DNS server to provide a pointer to the next server which it should consult.
As originally envisaged, the process was as simple as:
1. The local system is pre-configured with the known addresses of the root servers in a file of root hints, which the local administrator must update periodically from a reliable source to keep up with the changes that occur over time.
2. The resolver queries one of the root servers to find the server authoritative for the next level down (so in the case of our simple hostname, a root server would be asked for the address of a server with detailed knowledge of the example top-level domain).
3. The resolver queries this second server for the address of a DNS server with detailed knowledge of the second-level domain (inadomain.example in our example).
4. The resolver repeats the previous step to progress down the name, until the final step, which returns the final address sought rather than the address of the next DNS server.
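The steps above can be sketched as a small simulation. Everything in it is hypothetical: the server names, the delegation table and the address 192.0.2.10 (from a range reserved for documentation) are invented for illustration, and real servers are reached over the network rather than looked up in a dictionary.

```python
# Hypothetical data: each "server" maps a name suffix either to the next
# server to ask (a referral) or to a final address (an answer).
SERVERS = {
    "root": {"example": ("referral", "example-tld-server")},
    "example-tld-server": {"inadomain.example": ("referral", "inadomain-server")},
    "inadomain-server": {"ahost.inadomain.example": ("answer", "192.0.2.10")},
}


def resolve_iteratively(name, servers=SERVERS):
    """Walk the name from right to left, following referrals from the root.

    Returns (address, list of servers consulted).
    """
    labels = name.split(".")
    server = "root"          # step 1: start from a pre-configured root server
    consulted = [server]
    for depth in range(1, len(labels) + 1):
        suffix = ".".join(labels[-depth:])   # one more label at each step
        entry = servers[server].get(suffix)
        if entry is None:
            raise LookupError(f"{server} knows nothing about {suffix}")
        kind, value = entry
        if kind == "answer":                 # final step: the address sought
            return value, consulted
        server = value                       # steps 2-4: follow the referral
        consulted.append(server)
    raise LookupError(f"no address found for {name}")
```

Resolving ahost.inadomain.example in this toy setup consults the root, then the server for example, then the server for inadomain.example, which returns the address. Note that every lookup starts at the root.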
The diagram illustrates this process for the real host www.wikipedia.org. The mechanism in this simple form has a difficulty: it places a huge operating burden on the root servers, with each and every search for an address starting by querying one of them. Because the root servers are so critical to the overall function of the system, such heavy use would create an insurmountable bottleneck for the trillions of queries placed every day. The section DNS in practice describes how this is addressed.
Circular dependencies and glue records
Name servers in delegations appear listed by name, rather than by IP address. This means that a resolving name server must issue another DNS request to find out the IP address of the server to which it has been referred. Since this can introduce a circular dependency if the nameserver referred to is under the domain for which it is authoritative, it is occasionally necessary for the nameserver providing the delegation to also provide the IP address of the next nameserver. This record is called a glue record.
For example, assume that the sub-domain en.wikipedia.org contains further sub-domains (such as something.en.wikipedia.org) and that the authoritative name server for these lives at ns1.something.en.wikipedia.org. A computer trying to resolve something.en.wikipedia.org will thus first have to resolve ns1.something.en.wikipedia.org. Since ns1 is also under the something.en.wikipedia.org subdomain, resolving ns1.something.en.wikipedia.org requires resolving something.en.wikipedia.org, which is exactly the circular dependency mentioned above.
The dependency is broken by the glue record in the nameserver of en.wikipedia.org that provides the IP address of ns1.something.en.wikipedia.org directly to the requestor, enabling it to bootstrap the process by figuring out where ns1.something.en.wikipedia.org is located.
In practice
When an application (such as a web browser) tries to find the IP address of a domain name, it doesn't necessarily follow all of the steps outlined in the Theory section above. We will first look at the concept of caching, and then outline the operation of DNS in "the real world."
Caching and time to live
Because of the huge volume of requests generated by a system like DNS, the designers wished to provide a mechanism to reduce the load on individual DNS servers. To this end, the DNS resolution process allows for caching (i.e. the local recording and subsequent consultation of the results of a DNS query) for a given period of time after a successful answer. How long a resolver caches a DNS response (i.e. how long a DNS response remains valid) is determined by a value called the time to live (TTL). The TTL is set by the administrator of the DNS server handing out the response. The period of validity may vary from just seconds to days or even weeks.
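The idea of a TTL-bounded cache can be shown in a few lines. This is a minimal sketch rather than how any particular resolver is implemented; the class name is invented, and the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Remember answers until their time to live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # name -> (value, expiry time)

    def put(self, name, value, ttl_seconds):
        """Store an answer together with the moment it stops being valid."""
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached value, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: the server must be asked again
            return None
        return value
```

A resolver built on such a cache would call get first and contact a DNS server only when it returns None, storing the fresh answer with the TTL supplied in the response.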
Caching time
As a noteworthy consequence of this distributed and caching architecture, changes to DNS do not always take effect immediately and globally. This is best explained with an example: If an administrator has set a TTL of 6 hours for the host www.wikipedia.org, and then changes the IP address to which www.wikipedia.org resolves at 12:01pm, the administrator must consider that a person who cached a response with the old IP address at 12:00pm will not consult the DNS server again until 6:00pm. The period between 12:01pm and 6:00pm in this example is called caching time, which is best defined as a period of time that begins when you make a change to a DNS record and ends after the maximum amount of time specified by the TTL expires. This essentially leads to an important logistical consideration when making changes to DNS: not everyone is necessarily seeing the same thing you're seeing. RFC 1537 helps to convey basic rules for how to set the TTL.
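The arithmetic in this example can be written out as a short sketch. The calendar date is arbitrary and the function names are invented; only the 6-hour TTL and the 12:00pm and 12:01pm times come from the example.

```python
from datetime import datetime, timedelta


def cached_copy_expires(cached_at: datetime, ttl: timedelta) -> datetime:
    """When a particular cached response stops being used."""
    return cached_at + ttl


def caching_time_upper_bound(changed_at: datetime, ttl: timedelta) -> datetime:
    """Latest moment at which any cache can still hold the old record.

    Worst case: someone cached the old record just before the change.
    """
    return changed_at + ttl


ttl = timedelta(hours=6)
cached_at = datetime(2024, 1, 1, 12, 0)    # 12:00pm, arbitrary date
changed_at = datetime(2024, 1, 1, 12, 1)   # 12:01pm, record is changed

print(cached_copy_expires(cached_at, ttl))
print(caching_time_upper_bound(changed_at, ttl))
```

The first value is 6:00pm, matching the person in the example; the second is 6:01pm, the point after which no cache that honours the TTL can still be serving the old address.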
Note that the term "propagation", although very widely used in this context, does not describe the effects of caching well. Specifically, it implies that when you make a DNS change, it somehow spreads to all other DNS servers (instead, other DNS servers check in with yours as needed), and that you do not have control over the amount of time the record is cached (you control the TTL values for all DNS records in your domain, except your NS records and any authoritative DNS servers that use your domain name).
Some resolvers may override TTL values, as the protocol supports caching for up to 68 years or no caching at all. Negative caching (caching the fact that a record does not exist) is governed by the name servers authoritative for a zone, which MUST include the Start of Authority (SOA) record when reporting that no data of the requested type exists. The MINIMUM field of the SOA record and the TTL of the SOA itself are used to establish the TTL for the negative answer.
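As a small illustration of how the two SOA values combine: RFC 2308 specifies that the negative answer takes the smaller of the two. The record type and function name below are invented for the sketch, and real SOA records carry more fields than the two shown.

```python
from typing import NamedTuple


class SoaRecord(NamedTuple):
    ttl: int       # TTL of the SOA record itself, in seconds
    minimum: int   # MINIMUM field of the SOA record, in seconds


def negative_ttl(soa: SoaRecord) -> int:
    """TTL to apply to a cached "no such data" answer."""
    return min(soa.ttl, soa.minimum)
```

With an SOA TTL of 3600 seconds and a MINIMUM of 300 seconds, for example, a resolver would remember the non-existence of a record for 300 seconds.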
Many people incorrectly refer to a mysterious 48-hour or 72-hour propagation time after a DNS change. When one changes the NS records for one's domain or the IP addresses for hostnames of authoritative DNS servers using one's domain (if any), there can be a lengthy period of time before all DNS servers use the new information. This is because those records are handled by the zone parent DNS servers (for example, the .com DNS servers if your domain is example.com), which typically cache those records for 48 hours. However, those DNS changes will be immediately available for any DNS servers that do not have them cached. Any DNS changes on your domain other than the NS records and authoritative DNS server names can be nearly instantaneous, if you choose for them to be (by lowering the TTL once or twice ahead of time, and waiting until the old TTL expires before making the change).
In the real world
Users generally do not communicate directly with a DNS resolver. Instead, DNS resolution takes place transparently in client applications such as web browsers, mail clients, and other Internet applications. When an application makes a request which requires a DNS lookup, such programs send a resolution request to the local DNS resolver in the local operating system, which in turn handles the communications required.
The DNS resolver will almost invariably have a cache (see above) containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program that made the request. If the cache does not contain the answer, the resolver will send the request to one or more designated DNS servers. In the case of most home users, the Internet service provider to which the machine connects will usually supply this DNS server: such a user will either have configured that server's address manually or allowed DHCP to set it; however, where systems administrators have configured systems to use their own DNS servers, their DNS resolvers point to separately maintained nameservers of the organization. In any event, the name server thus queried will follow the process outlined above, until it either successfully finds a result or does not. It then returns its results to the DNS resolver; assuming it has found a result, the resolver duly caches that result for future use, and hands the result back to the software which initiated the request.
Broken resolvers
An additional level of complexity emerges when resolvers violate the rules of the DNS protocol. A number of large ISPs have configured their DNS servers to violate rules (presumably to allow them to run on less expensive hardware than a fully compliant resolver), such as by disobeying TTLs, or by indicating that a domain name does not exist just because one of its name servers does not respond.
As a final level of complexity, some applications (such as web browsers) also have their own DNS cache, in order to reduce the use of the DNS resolver library itself. This practice can add extra difficulty when debugging DNS issues, as it obscures the freshness of data and which data comes from which cache. These caches typically use very short caching times, on the order of one minute. Internet Explorer is a notable exception: recent versions cache DNS records for half an hour.
Other applications
The system outlined above provides a somewhat simplified scenario. The Domain Name System includes several other functions. Hostnames and IP addresses do not necessarily match on a one-to-one basis. Many hostnames may correspond to a single IP address: combined with virtual hosting, this allows a single machine to serve many web sites. Alternatively, a single hostname may correspond to many IP addresses: this can facilitate fault tolerance and load distribution, and also allows a site to move physical location seamlessly. There are many uses of DNS besides translating names to IP addresses. For instance, mail transfer agents use DNS to find out where to deliver e-mail for a particular address. The domain to mail exchanger mapping provided by MX records accommodates another layer of fault tolerance and load distribution on top of the name to IP address mapping. Sender Policy Framework and DomainKeys, instead of creating their own record types, were designed to take advantage of another DNS record type, the TXT record. To provide resilience in the event of computer failure, multiple DNS servers are usually provided for coverage of each domain, and at the top level, thirteen very powerful root servers exist, with additional "copies" of several of them distributed worldwide via Anycast.
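The fault tolerance offered by MX records rests on each record carrying a preference number, with lower numbers tried first. The sketch below orders a set of hypothetical records accordingly; the host names are invented, and a real mail transfer agent would additionally randomise among records of equal preference, which is omitted here.

```python
def delivery_order(mx_records):
    """Order mail exchangers by preference, lowest number first.

    mx_records is an iterable of (preference, hostname) pairs.
    Ties are broken alphabetically here purely to keep the result stable.
    """
    return [host for _preference, host in sorted(mx_records)]


records = [
    (20, "backup.example.com"),
    (10, "mail.example.com"),
]
print(delivery_order(records))
```

A sender would try mail.example.com first and fall back to backup.example.com only if the first host cannot be reached.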
Protocol details
DNS primarily uses UDP on port 53 to serve requests. Almost all DNS queries consist of a single UDP request from the client followed by a single UDP reply from the server. TCP comes into play only when the response data size exceeds 512 bytes, or for such tasks as zone transfer. Some operating systems such as HP-UX are known to have resolver implementations that use TCP for all queries, even when UDP would suffice.
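How a client learns that it should switch to TCP can be sketched as follows. The sketch assumes the RFC 1035 header layout, in which a server that cannot fit its reply into a UDP message sets a "truncated" (TC) bit; the function names are hypothetical and no network traffic is involved.

```python
import struct

UDP_PAYLOAD_LIMIT = 512   # size limit for plain UDP replies stated in the text
TC_FLAG = 0x0200          # "truncated" bit in the header flags field


def fits_in_udp(response_size: int) -> bool:
    """Whether a reply of this size can be sent in a single plain UDP message."""
    return response_size <= UDP_PAYLOAD_LIMIT


def is_truncated(response: bytes) -> bool:
    """Whether the server marked this reply as truncated."""
    if len(response) < 4:
        raise ValueError("too short to contain a DNS header")
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)
```

A client receiving a reply for which is_truncated returns True would repeat the same query over TCP on port 53 to obtain the full answer.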
Extensions to DNS
EDNS is an extension of the DNS protocol which allows the transport over UDP of DNS replies exceeding 512 bytes, and adds support for expanding the space of request and response codes.

