The Domain Name Service
From Web (09.06.1997)
Every computer on the Internet has a unique 32-bit number assigned to it. This
number, called the IP address, is essential if you want to be considered live
on the Internet. If you don't have a number, you're not on the net. Most people
who surf the Internet have an IP address dynamically allocated to them when
they dial up and log into their ISP's service, and this address changes every
time they dial their Internet Service Provider. Of course, this doesn't really
matter to most people because they aren't running servers on their machines.
Only when you're running a commercial server do you need to worry about the
permanence of your IP address. After all, if your IP address keeps changing,
then it's going to be a wee bit difficult to reach your page! These IP
addresses, in the range zero to a little over four billion, are a mite
difficult to remember. If you had to memorize a number like 198.24.3.103 for
each and every site you like, then you'd run out of patience pretty fast. It
was to minimize this frustration and free up mental resources that the DNS was
created. The DNS, or Domain Name System (often called the Domain Name Service),
converts names to their corresponding IP addresses. It is only because of the
DNS that you can write www.microsoft.com instead of having to write its whole
IP address. It just makes the Internet so much easier to use.
The DNS originally started out as a text file on a central computer. It had two
columns, one for IP addresses and the other for the matching names. This
approach was very inefficient. If I wanted to update the listing, I had to go
to the computer and tell the operator to manually update the file. This was
fine for as long as the Internet was only a couple of hundred computers strong,
but when you start talking about a couple of hundred million computers, then
having one file is simply not enough. Besides, there was no redundancy in the
design. If the central computer went down, then no one could access the DNS.
Even if the main DNS was functional, the load on it would be phenomenal.
Anything will grind to a halt if a couple of million computers attempt
to use it! So the people who were designing the Internet decided to
metamorphose the DNS into something that would be scalable, distributed and
efficient, and the modern DNS was born.
The DNS distributes the load among a whole bunch of DNS servers. Every ISP has
its own DNS, yet you can find the IP address of any name through any DNS
server. This process of converting names to numbers is best described through
a specific example.
Type www.netscape.com. in your browser window. Always read domain names
from right to left rather than from left to right. The first thing you should
see is the '.' after 'com'. This trailing period is entirely optional and it
signifies the root of the DNS. The modern DNS is organized a bit like a tree.
There is a root with certain main branches, called Top Level Domains (TLDs),
and these branches divide and further sub-divide, becoming more intricate and
web-like. At the root are nine Root Servers. These servers know the addresses
of all the servers which handle individual TLDs like .com, .edu, .mil, .gov,
.in, .uk etc. The TLD servers are scattered all over the world and are said to
be authoritative for those TLDs. This means that they have the Start of
Authority or SOA for those domains. So, for example, the Root Servers know
that the .in domain is handled by a certain DNS server in India.
The Top Level Domain .com is handled by a server somewhere in the U.S.A.
The server handling .com knows the address of the server in charge of the
netscape domain. This server belongs to Netscape. The netscape server has the
SOA for the domain www. Nothing stops Netscape from adding another level
before www, making their address aaa.www.netscape.com. The aaa part of the
name will then be handled by www. In fact, it's not compulsory to have a
www.something.com. Tradition is the only reason www is used as the leftmost
label. There are many TLDs; besides the generic ones, every country has its
own. India's TLD is .in, so all Indian sites end with a .in. For example, a
friend of ours has a site called www.mafatlal.co.in. Here www is a domain
under mafatlal, which is a domain under .co, which in turn is a domain under
.in. The .co is short for commercial, like .com, which means that this is a
commercial site.
Domain names are assigned by a body called the InterNIC in America, under the
overall coordination of the IANA (headed by Jon Postel). Other regions and
countries have their own NICs; for example, the APNIC is for the Asia Pacific.
The structure of the domain names is going to start changing soon. There are
plans afoot to add more gTLDs (generic Top Level Domains). The ones in use
are:
.com - For Commercial sites.
.net - For sites emphasizing networking e.g. ISP's
.org - For non-profit organizations
.mil - For the U.S. military
.gov - For government sites
.edu - For educational sites
.int - For International organizations
Some of the proposed gTLD's are:
.firm - For businesses
.store - For businesses offering on-line purchase
.web - For entities concentrating on Internet related activities
.arts - For sites emphasizing cultural and artistic content
.nom - For personal sites
.info - For sites providing information e.g. Libraries
.rec - For sites providing entertainment.
The process of granting domain names is also being democratized with up to 28
new domain name handling agencies to be selected by lottery from applicants
world wide.
That's enough about the layout of the DNS; now to study the structure of the
actual DNS packets...
The DNS listens for UDP packets on port 53. The reason UDP is used instead of
TCP for the DNS is that UDP is much faster than TCP. TCP is slow not because
of any inherent weakness, but because it attempts to be reliable, and
reliability is very time consuming. UDP packets don't always reach the other
end, but when they do, they're really fast. The DNS will also respond over TCP
if it is contacted that way. The best-known DNS server program is called BIND,
for Berkeley Internet Name Domain. It was written at the University of
California, Berkeley, and its later versions are still used today.
When you install your TCP/IP stack, you're asked for the IP addresses of your
DNS servers. You're usually supposed to enter two, but even one will do. The
second DNS address is used as a backup in case the first server is down. When a
browser, or any other program which uses WinSock, calls the function
gethostbyname(), a UDP packet is sent to your ISP's DNS on port 53. This packet
asks the server if it knows the IP address of the site mentioned, e.g.
www.netscape.com. If another person had recently (usually less than a day ago)
asked for the IP address of the same site, then the DNS sends you the cached
copy of the reply and the response is instantaneous. Things proceed at a more
leisurely pace if the response is not cached. If the DNS does not find a match
in its cache, then it sends a query to one of the Root Servers asking for the
address of the server which has the SOA (Start of Authority) for the domain
.com. The Root Server will send your ISP's DNS the address of the server which
handles the .com domain. Our DNS server will now send a query to the .com
server, asking it if it knows the address of the authoritative DNS server for
netscape. The .com DNS server will reply with the appropriate IP address. Now
our DNS server will ask Netscape's DNS server for the IP address of www, and if
the operation is successful, then you'll get the correct IP address.
This step-by-step querying, where our ISP's DNS asks one server after another,
is called an Iterative Lookup. From our side it is a Recursive Lookup: our
ISP's DNS itself handles all the running around and all we do is wait for it
to send us the correct IP address. This complex processing is hidden from us
because all we do is call gethostbyname() and it does the rest.
To better understand the DNS, we'll write a program which sends a raw DNS query
packet to a DNS server and tries to elicit a response.
#include <windows.h>
#include <stdio.h>
unsigned char kk[1000],ll[1000];
void abc(char *p)
{
FILE *fp=fopen("z.txt","a+");
fprintf(fp,"%s\n",p);
fclose(fp);
}
int ii,dw,jj;
void abc1(unsigned char p)
{
FILE *fp=fopen("z.txt","a+");
fprintf(fp,"%x..%d..%c\n",p,p,p);
fclose(fp);
}
WNDCLASS a;HWND b;MSG c;char aa[200];SOCKET s;struct hostent h;
WSADATA ws;DWORD e;char bb[100];struct sockaddr_in sa;
long _stdcall zzz (HWND,UINT,WPARAM,LPARAM);
int _stdcall WinMain(HINSTANCE i,HINSTANCE j,char *k,int l)
{
a.lpszClassName="a1";
a.hInstance=i;
a.lpfnWndProc=zzz;
a.hbrBackground=GetStockObject(WHITE_BRUSH);
RegisterClass(&a);
b=CreateWindow("a1","time client",WS_OVERLAPPEDWINDOW,1,1,10,20,0,0,i,0);
ShowWindow(b,3);
while ( GetMessage(&c,0,0,0) )
DispatchMessage(&c);
return 1;
}
long _stdcall zzz (HWND w,UINT x,WPARAM y,LPARAM z)
{
if ( x == WM_LBUTTONDOWN)
{
e=WSAStartup(0x0101,&ws);
sprintf(aa,"WSAStartup e = %ld",e);
s = socket(PF_INET,SOCK_DGRAM,0);
sprintf(aa,"socket s = %ld",s);
sa.sin_family=AF_INET;
sa.sin_port=htons(53);
sa.sin_addr.s_addr = inet_addr("202.54.1.18");
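kk[0]=0;      // ID, high byte
kk[1]=1;      // ID, low byte (our ID is 1)
kk[2]=1;      // flags, high byte: only RD (Recursion Desired) is set
kk[3]=0;      // flags, low byte
kk[4]=0;      // number of questions, high byte
kk[5]=1;      // number of questions, low byte (one question)
kk[6]=0;      // number of answers (two bytes, zero in a query)
kk[7]=0;
kk[8]=0;      // number of authority records (two bytes)
kk[9]=0;
kk[10]=0;     // number of additional records (two bytes)
kk[11]=0;
kk[12]=3;     // length of the label "www"
kk[13]='w';
kk[14]='w';
kk[15]='w';
kk[16]=8;     // length of the label "netscape"
kk[17]='n';
kk[18]='e';
kk[19]='t';
kk[20]='s';
kk[21]='c';
kk[22]='a';
kk[23]='p';
kk[24]='e';
kk[25]=3;     // length of the label "com"
kk[26]='c';
kk[27]='o';
kk[28]='m';
kk[29]=0;     // a zero length ends the name
kk[30]=0;     // query type, high byte
kk[31]=1; // 1 for A, 2 for NS, 5 for CNAME, 13 for HINFO
kk[32]=0;     // query class, high byte
kk[33]=1;     // query class, low byte: 1 for the Internet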
e=sendto(s,(char *)kk,34,0,(struct sockaddr *)&sa,sizeof(sa));
sprintf(aa,"SendTo %ld",e);
dw = sizeof(sa);
ii=recvfrom(s,(char *)ll,1000,0,(struct sockaddr *)&sa,&dw);
sprintf(aa,"Recv from %d",ii);
abc(aa);
for (jj=0;jj<ii;jj++)
abc1(ll[jj]);
MessageBox(0,"Over","All",0);
}
if ( x == WM_DESTROY)
PostQuitMessage(0);
return DefWindowProc(w,x,y,z);
}
This program will seem pretty familiar to anyone who's read our WinSock
tutorials. We create a simple window and when you click in the window, the
callback zzz() is called. It's the callback which does all the work. When the
program ends, a MessageBox is displayed. We've used Visual C++ 4.2 for all our
programming needs. The files in the project are dnsc.c and wsock32.lib.
In the program we create a socket which uses UDP and will work on the Internet.
The port is set to 53 and the address of the DNS server is set to 202.54.1.18,
which is the address of our ISP's DNS server.
After having formed a socket, we initialize an array kk with some values. Have
no fear, all those numbers will be explained in due time. Since we're using
UDP, we use the functions sendto() and recvfrom() rather than the simple send()
and recv(). The parameters of sendto() are almost identical to those of send().
The third parameter, 34, is the number of bytes of kk we want to send. Using
recvfrom() we store the DNS response in an array ll and then, using our own
function abc1(), save the contents of ll in a file z.txt.
The array kk contains the raw bytes which constitute a DNS query packet. The
first two bytes are the ID of the packet. It can be any two byte number and
we've decided to use 01 as our ID. The response from the DNS server will also
carry the same ID number, to help you match different queries sent at the same
time to the different answers received later.
The next two bytes are the flags field. Each bit of these two bytes has a
special meaning. We'll discuss this in detail in just a while. The next two
bytes stand for the number of questions we wish to ask. Right now all we're
sending is one simple query, so we put a 01 here.
The structure of the DNS query and response packets is identical. Certain
parts of the packet are used when we are asking a question, other parts are
used when we're sending a response. The next two bytes, i.e. kk[6] and kk[7],
are supposed to contain the number of answers being sent. Since this is a
query packet, these bytes are set to zero. Then come two more bytes for the
number of Authority Records and another two bytes for the number of Additional
Records. Both these fields are set to zero.
Now comes the domain name we want to find the IP address of. Unlike in C, where
strings are NULL terminated, the strings in the DNS packet follow a different
format. Each label is preceded by a byte which holds the length of the
following label. So kk[12] holds 3, which means that the next three bytes, www
(kk[13], kk[14] and kk[15]), hold one label. kk[16] is 8, which means that the
label following it is 8 bytes long, and so on. The string ends with a 0. The
next two bytes are the Query Type, which can hold different values. So a 5
means we want to know the CNAME, or canonical name, of the domain name
www.netscape.com. 1 means we want that site's IP address. 13 is for Host
information and so on. The last two bytes are the Query Class and these bytes
specify the type of network we're using. Since we're using the Internet these
bytes will always be 01.
Flags      | QR | OP Code | AA | TC | RD | RA | Zeros | Rcode |
No.of Bits | 1  | 4       | 1  | 1  | 1  | 1  | 3     | 4     |
We'll start from the left and explain each field as we go.
The first flag is QR, which is a 1 bit field. If the DNS packet is a
Query, as ours is, it is set to zero. If the packet is a response, it is set to
1.
The next 4 bits are the OP code. The normal value here is 0 which stands
for Standard Query. 1 stands for Inverse Query and 2 means we're asking for the
status of the server.
The AA flag is set to one if the server responding is Authoritative for
the domain in question, i.e. it has the Start of Authority for that particular
domain.
The TC flag is turned on when the UDP packet has been truncated. This
means that the packet was larger than 512 bytes and thus only the first 512
bytes have arrived.
The RD bit is the only bit turned on in our packet above. It stands for
Recursion Desired. This means we're asking the server to go to each server down
the line and get us the information. So the query will have to be handled by
the server itself. After all, we don't want to have to go to the Root Server
ourselves and then on to more servers till we reach the authoritative one. If
the bit is turned off and the DNS does not have the SOA for the domain in
question, then it will simply return a list of servers for you to contact.
The RA flag is related to the RD flag. It stands for Recursion Available
and is usually part of a DNS response from a server. It is set to one if the
server we've contacted supports recursion.
The next 3 bits are always zero. Keep them that way!
The last 4 bits constitute the Rcode or the return code. A 0 means no
errors while a 3 means a Name error occurred.
That was the Query, now let's examine the answer we receive.
Hex | Dec | Char |
0 | 0 |
1 | 1 |
81 | 129 |
80 | 128 |
0 | 0 |
1 | 1 |
0 | 0 |
2 | 2 |
0 | 0 |
3 | 3 |
0 | 0 |
3 | 3 |
3 | 3 |
77 | 119 | w |
77 | 119 | w |
77 | 119 | w |
8 | 8 |
6e | 110 | n |
65 | 101 | e |
74 | 116 | t |
73 | 115 | s |
63 | 99 | c |
61 | 97 | a |
70 | 112 | p |
65 | 101 | e |
3 | 3 |
63 | 99 | c |
6f | 111 | o |
6d | 109 | m |
0 | 0 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
c0 | 192 |
c | 12 |
0 | 0 |
5 | 5 |
0 | 0 |
1 | 1 |
0 | 0 |
0 | 0 |
8 | 8 |
69 | 105 | i |
0 | 0 |
14 | 20 |
5 | 5 |
77 | 119 | w |
77 | 119 | w |
77 | 119 | w |
38 | 56 | 8 |
30 | 48 | 0 |
8 | 8 |
6e | 110 | n |
65 | 101 | e |
74 | 116 | t |
73 | 115 | s |
63 | 99 | c |
61 | 97 | a |
70 | 112 | p |
65 | 101 | e |
3 | 3 |
63 | 99 | c |
6f | 111 | o |
6d | 109 | m |
0 | 0 |
c0 | 192 |
2e | 46 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
0 | 0 |
0 | 0 |
8 | 8 |
69 | 105 | i |
0 | 0 |
4 | 4 |
c6 | 198 |
5f | 95 |
f9 | 249 |
4b | 75 | K |
8 | 8 |
4e | 78 | N |
45 | 69 | E |
54 | 84 | T |
53 | 83 | S |
43 | 67 | C |
41 | 65 | A |
50 | 80 | P |
45 | 69 | E |
c0 | 192 |
3d | 61 | = |
0 | 0 |
2 | 2 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
6b | 107 | k |
2f | 47 | / |
0 | 0 |
5 | 5 |
2 | 2 |
4e | 78 | N |
53 | 83 | S |
c0 | 192 |
52 | 82 | R |
c0 | 192 |
52 | 82 | R |
0 | 0 |
2 | 2 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
6b | 107 | k |
2f | 47 | / |
0 | 0 |
c | 12 |
2 | 2 |
4e | 78 | N |
53 | 83 | S |
3 | 3 |
4d | 77 | M |
43 | 67 | C |
49 | 73 | I |
3 | 3 |
4e | 78 | N |
45 | 69 | E |
54 | 84 | T |
0 | 0 |
c0 | 192 |
52 | 82 | R |
0 | 0 |
2 | 2 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
6b | 107 | k |
2f | 47 | / |
0 | 0 |
6 | 6 |
3 | 3 |
4e | 78 | N |
53 | 83 | S |
32 | 50 | 2 |
c0 | 192 |
52 | 82 | R |
c0 | 192 |
67 | 103 | g |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
d8 | 216 | Ø |
5d | 93 | ] |
0 | 0 |
4 | 4 | |
c6 | 198 |
5f | 95 | _ |
fb | 251 |
a | 10 |
c0 | 192 |
78 | 120 | x |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
0 | 0 |
2 | 2 |
66 | 102 | f |
aa | 170 |
0 | 0 |
4 | 4 |
cc | 204 |
46 | 70 | F |
80 | 128 |
1 | 1 |
c0 | 192 |
90 | 144 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
0 | 0 |
1 | 1 |
d8 | 216 |
5d | 93 | ] |
0 | 0 |
4 | 4 |
cd | 205 |
da | 218 |
9c | 156 |
2a | 42 | * |
The first two bytes are the ID bytes and because our query packet's ID was 01,
this packet too has the ID set to 01.
The next two bytes after that are 81 80. These, as before, are the flags. If
you refer to the description of the flags above, you'll realize that this
packet is a response, that recursion was desired and that the DNS server
supports recursion.
Flags      | QR | OP Code | AA | TC | RD | RA | Zeros | Rcode |
No.of Bits | 1  | 4       | 1  | 1  | 1  | 1  | 3     | 4     |
Value      | 1  | 0000    | 0  | 0  | 1  | 1  | 000   | 0000  |
Right after that are two more bytes which hold the number of questions; 01 in
this case. Then we have two more bytes which hold the number of answers the DNS
has sent us; 02 in this packet. Then we're informed about the number of
authority records for this query (03) and after that comes the number of
additional records provided (03).
Now comes the entire name of the site we wished information about, with a final
zero at the end to signify the termination of the string.
The next two bytes after the string hold the number 01, which means we've asked
for an IP address and the next two bytes also hold the number 01, which means
the site is on the Internet.
As you can see, a large portion of the query has been duplicated in the
response packet. This is to verify that the information sent was accurate. It's
only now that the real answer begins...
Now the DNS is almost always busy and some way had to be found to speed up its
work. So the DNS incorporates some simple compression tricks. The next byte of
data is C0, which if written out in binary would look like 11000000. When the
first two bits are on, it means that DNS compression is in use: the remaining
six bits, together with the next byte, form a pointer. The next byte is 0C (12
in decimal), so the pointer is an offset to the data 3www8netscape3com0, which
starts 12 bytes from the start of the packet. By using a pointer to the name,
the DNS avoids having to repeat information which has already been mentioned.
Neat!
After that we have two bytes (the record type), set to 05, which means that the
data following is the canonical name that www.netscape.com is an alias for. A
CNAME record maps an alias to the Canonical Name, i.e. the real name, of a
site. So, for example, if www.cnn.com were set up as an alias for cnn.com, then
cnn.com would be its canonical name. Right after that come two bytes for the
class, which is 01.
Now come four bytes for the Time to Live (TTL). The Time to Live is the length
of time, in seconds, that the querying machine (you or the DNS) can cache the
response. The usual time is a day or two; here the bytes 00 00 08 69 give 869
in hex, or 2153 seconds, which is roughly 36 minutes.
Next we have two bytes for the length of the data to follow, which is 14 in hex
(20 bytes in decimal). It's only now that we reach the meat of the matter, the
actual name of the server whose IP address we wish to know. We're told that
www.netscape.com is an alias for a server whose canonical name is
www80.netscape.com. This suggests that, in actuality, Netscape Inc. has a
number of servers running at one time to handle the load. When you type
www.netscape.com, you're shunted to one of them, presumably one that isn't too
busy. A neat trick and one you wouldn't have discovered without reading the
DNS packet!
After that we have C0 which as usual informs us that compression is on. The
byte after that, 2E (46 in decimal), is a pointer to the text
5www808netscape3com0. As before, the first 01 is the type (an IP address) and
the second 01 is the class (the Internet). After four bytes of the Time to Live
come two bytes, 00 04, which give the length of the data that follows. The next
four bytes hold the most important part of the packet: the IP address of
www80.netscape.com. The numbers C6.5F.F9.4B are in hex and if we convert them
to decimal, we get 198.95.249.75. So that's it! The real IP address of
www80.netscape.com, which is the real name of www.netscape.com. Just to double
check, ping the IP address 198.95.249.75 and see the name of the site.
Try and decipher the other parts of the message yourself. They all follow the
same format:
- Domain Name
- Type
- Class
- Time to Live
- Length of data
- The Data
That just about covers the DNS!
The above tutorial is a joint effort of
Mr. Vijay Mukhi
Mr. Arsalan Zaidi
Ms. Sonal Kotecha
Vijay Mukhi's Computer Institute
VMCI, B-13, Everest Building, Tardeo, Mumbai 400 034, India
Tel : 91-22-496 4335 /6/7/8/9
Fax : 91-22-307 28 59
e-mail : vmukhi@giasbm01.vsnl.net.in
http://www.neca.com/~vmis