BIND DLZ > Performance Tests

The goal of the DLZ project was to modify Bind to use a database instead of loading all data into memory at startup. This capability provides numerous benefits, such as reducing Bind's startup time and memory requirements while also simplifying DNS management. DLZ has been very successful and supports a wide variety of flexible database drivers. But how do DLZ and its various drivers perform compared to an unmodified version of Bind? Which driver is the fastest? How can you get the best performance from each of the available DLZ drivers?

The DLZ performance tests attempt to answer these questions and provide users with additional information so they can determine the best DLZ driver for their situation. This page documents the test environment used for testing all the DLZ drivers. More details on the configuration and tuning of each driver are available on the driver-specific pages. A summary for quick comparison of each driver's performance is also included.

Test Environment

The test environment consists of three computers.

Computer 1 - Test Server
Super Micro P4SCE motherboard (800 MHz FSB, SATA, dual 100 Mb Ethernet)
1 GB 400 MHz DDR ECC RAM
3.0 GHz Pentium 4 (800 MHz FSB, Hyper-Threading enabled)
Two Western Digital Raptor 36 GB 10K RPM SATA drives

Fedora Core 1 was used because it included the 2.4.22 kernel, which was necessary to support the SATA drives. Later, the 2.6.1 kernel was compiled and installed on the machine so DLZ could be tested with both kernels to see if there was any difference in performance. The drives were set up with separate partitions for /boot, / (root), /dns_data/ext3 and /dns_data/reiserfs. All of these partitions used the ext3 file system except /dns_data/reiserfs, which used reiserfs. The partitions were configured with Linux software RAID level 1. Additionally, a swap partition was created on each drive; the swap partitions were not mirrored.
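The mirrored layout described above could be created with mdadm roughly as follows. This is a sketch only: the article does not say which RAID tool was used on Fedora Core 1, and the device and partition names below are assumptions. The commands are printed rather than executed so the sketch is safe to run as-is.

```shell
#!/bin/sh
# Print the mdadm command that would mirror two partitions into one
# RAID-1 device (names are illustrative, not from the article).
raid1() {  # raid1 <md-device> <partition-a> <partition-b>
  echo "mdadm --create $1 --level=1 --raid-devices=2 $2 $3"
}

raid1 /dev/md2 /dev/sda3 /dev/sdb3   # /dns_data/ext3     (then: mkfs.ext3 /dev/md2)
raid1 /dev/md3 /dev/sda4 /dev/sdb4   # /dns_data/reiserfs (then: mkreiserfs /dev/md3)
```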

This system was used as the Bind server. When using DLZ, any necessary database was run on this machine alongside Bind-DLZ. When a database was not being used, it was shut down. That is to say, when running DLZ and testing the PostgreSQL driver, the other database servers on the machine, such as LDAP and MySQL, were not running. When running unmodified Bind, none of the database servers were running. Installed and running services were kept to a minimum. Things like the SSH server, however, were always running, as that is how I expect most "production" servers to be configured.

Computer 2 - Load client
I used my development system as the load client. This system is an HP Pavilion N5495 laptop with a 1.03 GHz Pentium III CPU, 512 MB of RAM and 100 Mb Ethernet. It runs Windows 2000, with Red Hat Linux 8.0 running as a virtual machine in VMware on top of Windows. The queryperf tool only works on UN*X machines, so I had to run it in the Red Hat virtual machine. Not an ideal situation, but you have to use what you have available. I am confident that running queryperf in the virtual machine did not have any serious effect on the performance testing of DLZ.

Computer 3 - Latency client
This system is just a low-end 233 MHz Pentium II PC. I used it to test the loaded and unloaded latency of the test server. Latency is the amount of time it takes for the test server to respond to a single query. Loaded latency tests are done by placing a heavy DNS query load on the DLZ test server by having computer #2 execute queryperf. Unloaded tests are done while the DLZ test server is idle and not loaded down by queryperf. These tests demonstrate that a busy server takes longer to answer each query than one that is not heavily loaded with queries. Latency can be a good indicator of whether a DLZ driver is I/O bound or CPU bound: a driver that is I/O bound (has an I/O bottleneck) will probably have a higher latency than one that is CPU bound, because I/O operations take much longer. Lastly, if a query response takes too long, the requestor will time out and think the query was never answered, so latency is important.
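A latency measurement of this kind can be sketched with dig, which reports its own query round-trip time in its statistics output. The server address and query name below are assumptions, not the actual test data; the averaging of three runs mirrors the methodology described later on this page.

```shell
#!/bin/sh
# Sketch of an unloaded-latency measurement. dig prints a line of the
# form ";; Query time: N msec" when statistics output is enabled.
measure_once() {  # measure_once <server-ip> <name>
  dig @"$1" "$2" A +noall +stats | awk '/Query time/ { print $4 }'
}

# Average several latency readings (integer milliseconds).
average_ms() {
  printf '%s\n' "$@" | awk '{ s += $1; n++ } END { printf "%d\n", s / n }'
}

# With a live server one would run (addresses are assumptions):
# t1=$(measure_once 192.168.0.1 www.example.com)
# t2=$(measure_once 192.168.0.1 www.example.com)
# t3=$(measure_once 192.168.0.1 www.example.com)
# average_ms "$t1" "$t2" "$t3"
average_ms 2 3 4    # prints 3
```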

This machine was also used to run another test, which determines how long it takes Bind to start and begin answering queries. Startup and shutdown times can be very important on a server with a large number of zones; normal Bind systems with an extremely large number of zones can take hours to start up or shut down.

Test Network
The test network is a dedicated private 100 Mb Ethernet switch. No other computers were attached to the switch while the tests were performed, so the only traffic on the switch was generated by these computers.

Testing Methodology

Before the performance testing was done, a great deal of time was spent determining the optimum configuration for each DLZ driver and its database. The optimum configuration of each driver is documented on its performance page. Keep in mind that these configurations were optimum for this particular hardware, and you should find the best configuration for your system as it is likely to be different from the configurations found here. These configurations are provided so someone can replicate these tests on similar hardware, or use them for pointers on how to tune the driver.

Each test began with a freshly booted system so that any cached data was wiped out and any memory that may have been leaked in a driver or database was reclaimed. I don't suspect there were any memory leaks at all. The system was allowed to sit for a few minutes after starting up so that any boot operations were fully completed. If a database was required for the test, it was started and a minute was allowed for the system to settle to an idle again. The system always settled within a second or two, so waiting the entire minute was overkill.

The first test performed was the startup test. With the startup command for Bind and the timeDnsRefresh utility already typed in, the Enter key on both systems was pressed simultaneously. For this test, differences of a few milliseconds or even a second are irrelevant; only large differences in startup times are important, so pressing the two Enter keys simultaneously is good enough.
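The idea behind the startup test can be sketched on a single host as follows. This is an approximation under stated assumptions: the original tests used the separate timeDnsRefresh utility on a second machine, and the config path and query name below are illustrative, not the actual test setup.

```shell
#!/bin/sh
# Sketch: time from launching named until it answers its first query.
elapsed() {  # seconds between two epoch timestamps
  echo $(( $2 - $1 ))
}

# With a real server one would run (paths/names are assumptions):
# start=$(date +%s)
# named -c /etc/named.conf
# until dig @127.0.0.1 www.example.com +time=1 +tries=1 >/dev/null 2>&1; do
#   :  # poll until the server responds
# done
# echo "startup took $(elapsed "$start" "$(date +%s)") seconds"
elapsed 100 642    # prints 542
```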

Next, a warm-up cycle was done: the queryperf tool was run for ten minutes so the DNS server could load any caches the drivers or databases may use. After the ten-minute warm-up, three unloaded latency tests were run and their results averaged to produce the numbers below. A load was then placed on the server by running queryperf, and three loaded latency tests were run; their results were also averaged. The queryperf tool was then stopped.

Last, the query speed tests were run. The queryperf utility was run three times with a duration of one minute each, and the averaged results are presented below.
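The query-speed runs above can be sketched as three one-minute queryperf invocations whose reported QPS figures are then averaged. The query data file and server address are assumptions; queryperf's -d (query data file), -s (server) and -l (run length in seconds) options are its standard ones.

```shell
#!/bin/sh
# Sketch of the query-speed test: three 60-second queryperf runs.
# With a live server one would run (file/address are assumptions):
# for run in 1 2 3; do
#   queryperf -d queries.txt -s 192.168.0.1 -l 60 |
#     awk '/Queries per second/ { print $NF }'
# done

# Average the three reported QPS figures to one decimal place.
avg_qps() {
  printf '%s\n' "$@" | awk '{ s += $1; n++ } END { printf "%.1f\n", s / n }'
}
avg_qps 16100 16120 16104    # prints 16108.0
```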

Performance Results Summary

The two tables below provide a quick summary of the performance results of each DLZ driver and an unmodified Bind. The first table is the performance results with a 2.4.22 Linux kernel, and the second is with the 2.6.1 kernel. The numbers represent the Queries Per Second (QPS) that Bind was able to answer using the associated driver. The best performance of each driver on each kernel is highlighted green for quick reference.

For drivers which connect to external databases (PostgreSQL, MySQL, LDAP), the number of database connections allocated was set to match the number of threads. The BTREE column is the original BDB driver using btree indexes. The HASH column is the original BDB driver using hash indexes. The driver is actually the same; the type of indexes used is the only difference.

The BDBHPT driver can operate in three modes: Transactional (HPT-T), Concurrent (HPT-C) and Private (HPT-P). The different modes provide varying degrees of read/write locking within Berkeley DB itself. Transactional mode provides the most locking and the safest concurrency for simultaneous reads and writes, but its performance suffers because of the locking overhead. Concurrent mode provides less locking and, as a result, operates faster. Private mode does no locking at all and thus provides the fastest results. For more information on the various modes of operation and the benefits of each, see the BDBHPT driver's documentation.
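For illustration, a DLZ driver is selected in named.conf with a dlz statement whose database string names the driver and its mode/arguments. The fragment below is only an illustrative sketch of that shape; the exact argument order and paths for the BDBHPT driver are given on its driver page, and the names here are assumptions.

```
// Illustrative named.conf fragment (argument details are assumptions):
dlz "bdbhpt zone" {
   // "T" = Transactional; "C" (Concurrent) or "P" (Private) select
   // the other two modes compared in the tables below.
   database "bdbhpt T /dns_data/ext3 dns.db";
};
```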

In the tables, QPS is queries per second; Loaded and Unloaded are the loaded and unloaded latency tests; Startup is the startup test. BTREE and HASH are the Berkeley DB driver with the corresponding index type, and HPT-T, HPT-C and HPT-P are the three BDBHPT modes.

Drivers - 2.4.22 kernel

Threads  Metric    Bind    Postgres  MySQL  LDAP  FileSystem*  BTREE  HASH   HPT-T  HPT-C  HPT-P
1        QPS       16,108  589       689    82    176          1,116  1,011  5,325  9,164  12,050
         Loaded    1       34        29     602   1,215        17     23     4      3      2
         Unloaded  1       3         2      86    1            2      2      1      1      1
         Startup   542     0         1      112   1            2      2      1      1      1
2        QPS       16,869  759       N/A    88    185          1,238  999    4,132  5,459  5,731
         Loaded    1       27        N/A    448   596          15     16     5      4      3
         Unloaded  1       3         N/A    81    1            2      1      1      1      1
         Startup   1,336   1         N/A    107   1            1      1      1      1      1

Drivers - 2.6.1 kernel

Threads  Metric    Bind    Postgres  MySQL  LDAP  FileSystem*  BTREE  HASH   HPT-T  HPT-C  HPT-P
1        QPS       16,470  552       618    90    271          1,057  956    4,285  5,890  8,488
         Loaded    1       36        35     442   1,436        23     22     5      4      3
         Unloaded  1       3         2      95    2            2      2      1      1      1
         Startup   647     1         1      483   2            2      1      0      0      0
2        QPS       15,569  673       N/A    91    244          1,247  1,117  4,081  4,749  5,178
         Loaded    1       29        N/A    434   1,722        16     16     5      4      4
         Unloaded  1       3         N/A    91    2            2      2      1      1      1
         Startup   1,441   2         N/A    192   1            1      1      1      1      1

* To get a "clean run," the query timeout for the file system driver had to be increased from its default of 5 seconds. For the 2.4.22 kernel, the timeout was set to 9 seconds for both one and two threads. With the 2.6.1 kernel, the timeout was set to 10 seconds with one thread and 12 seconds with two threads. I estimate about 5 percent of the queries would have been lost had the timeout values not been increased.