The goal of the DLZ project was to modify Bind to use a database instead of loading all zone data into memory at startup. This capability provides numerous benefits, such as reducing Bind's startup time and memory requirements while also simplifying DNS management. DLZ has been very successful and supports a wide variety of flexible database drivers. But how do DLZ and its various drivers perform compared to an unmodified version of Bind? Which driver is the fastest? How can you get the best performance from each of the available DLZ drivers?
The DLZ performance tests attempt to answer these questions and provide additional information so you can determine the best DLZ driver for your situation. This page documents the test environment used for testing all the DLZ drivers. More details on the configuration and tuning of each driver are available on the driver-specific pages. A summary for quick comparison of each driver's performance is also included.
The test environment consists of three computers.
Computer 1 - Test Server
Fedora Core 1 was used because it included the 2.4.22 kernel, which was necessary to support the SATA drives. Later, the 2.6.1 kernel was compiled and installed on the machine so DLZ could be tested with both kernels to see if there was any difference in performance. The drives were set up with separate partitions for /boot, / (root), /dns_data/ext3 and /dns_data/reiserfs. All of these partitions used the ext3 file system except /dns_data/reiserfs, which used ReiserFS. The partitions were configured with Linux software RAID level 1. Additionally, a swap partition was created on each drive; these swap partitions were not part of the RAID.
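For readers who want to replicate the disk layout, a sketch of the mirroring step follows. The device names are placeholders, and the mdadm syntax is from a later toolchain than Fedora Core 1 shipped with (the original layout was most likely created by the installer), so treat this as illustrative only:

```
# Mirror matching partitions from the two drives (RAID level 1);
# /dev/sdaN and /dev/sdbN are hypothetical device names.
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda4 /dev/sdb4

# ext3 for /dns_data/ext3, ReiserFS for /dns_data/reiserfs
mkfs.ext3 /dev/md2
mkreiserfs /dev/md3

# The swap partitions stay outside the RAID, one per drive.
mkswap /dev/sda2
mkswap /dev/sdb2
```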
This system was used as the Bind server. When testing DLZ, Bind-DLZ and any database it required ran on this machine. Databases not in use were shut down: when testing the PostgreSQL driver, for example, the other database servers on the machine, such as LDAP and MySQL, were not running. When running unmodified Bind, none of the database servers were running. Installed and running services were kept to a minimum. Services like the SSH server, however, were always running, as that is how I expect most "production" servers to be configured.
Computer 2 - Load client
Computer 3 - Latency client
This machine was also used to run another test, which determines how long it takes Bind to start and begin answering queries. Startup and shutdown times can be very important on a server with a large number of zones; a standard Bind installation with an extremely large number of zones can take hours to start up or shut down.
Before the performance testing was done, a great deal of time was spent determining the optimum configuration for each DLZ driver and its database. The optimum configuration of each driver is documented on its performance page. Keep in mind that these configurations were optimum for this particular hardware; the best configuration for your system is likely to be different. These configurations are provided so that someone can replicate these tests on similar hardware, or use them as pointers on how to tune each driver.
Each test began with a freshly booted system so that any cached data was wiped out and any memory that might have leaked in a driver or database was reclaimed (no memory leaks were actually suspected). The system was allowed to sit for a few minutes after booting so that all startup operations were fully completed. If a database was required for the test, it was started, and a minute was allowed for the system to settle back to idle. The system always settled within a second or two, so waiting the entire minute was overkill.
The first test performed was the startup test. With the startup command for Bind typed in on one system and the timeDnsRefresh utility typed in on the other, the enter keys of both systems were pressed simultaneously. For this test, differences of a few milliseconds, or even a second, are irrelevant; only large differences in startup times matter, so pressing the two enter keys simultaneously is accurate enough.
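timeDnsRefresh is the project's own utility and its interface is not described here. As a rough stand-in, a poll loop like the following sketch measures the whole seconds from launch until a probe command first succeeds; the dig probe in the comment is an assumption, not necessarily what the tool actually does:

```shell
#!/bin/sh
# Report how many whole seconds pass before the given probe command succeeds.
wait_until_up() {
    start=$(date +%s)
    until "$@" >/dev/null 2>&1; do
        sleep 1
    done
    echo $(( $(date +%s) - start ))
}

# In a real run the probe would be a query against the freshly started
# server, for example:
#   wait_until_up dig @127.0.0.1 example.com SOA +tries=1
# Here a trivially succeeding probe demonstrates the function:
elapsed=$(wait_until_up true)
echo "server answered after ${elapsed}s"
```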
Next, a warm-up cycle was run. The queryperf tool was run for 10 minutes so the DNS server could populate any caches the drivers or databases may use. After the ten-minute warm-up, three unloaded latency tests were run, and their results were averaged to produce the numbers below. A load was then placed on the server by running queryperf, and three loaded latency tests were run and averaged. The queryperf tool was then stopped.
Last, the query speed tests were run: the queryperf utility was run three times, for one minute each time, and the averaged results are presented below.
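The measured runs can be sketched as follows. The queryperf invocation is commented out because it needs a running server; its -d, -s and -l flags (query file, server address, run length in seconds) are standard queryperf options, but the file names are placeholders, and the averaging assumes result lines of the form shown:

```shell
#!/bin/sh
# Three one-minute timed runs against the server, e.g.:
#   for i in 1 2 3; do
#       queryperf -d query_file -s 127.0.0.1 -l 60 > run$i.out
#   done

# Average the "Queries per second" figure across the result files
# (field 4 of a line like "Queries per second:   220.7 qps").
avg_qps() {
    awk '/Queries per second/ { sum += $4; n++ }
         END { printf "%.1f\n", sum / n }' "$@"
}

# Demonstration with stand-in result files:
printf 'Queries per second:   100.0 qps\n' > run1.out
printf 'Queries per second:   200.0 qps\n' > run2.out
printf 'Queries per second:   300.0 qps\n' > run3.out
avg_qps run1.out run2.out run3.out   # prints 200.0
```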
Performance Results Summary
The two tables below provide a quick summary of the performance results of each DLZ driver and an unmodified Bind. The first table is the performance results with a 2.4.22 Linux kernel, and the second is with the 2.6.1 kernel. The numbers represent the Queries Per Second (QPS) that Bind was able to answer using the associated driver. The best performance of each driver on each kernel is highlighted in green for quick reference.
For drivers which connect to external databases (PostgreSQL, MySQL, LDAP), the number of database connections allocated was set to match the number of threads. The BTREE column is the original BDB driver using btree indexes. The HASH column is the original BDB driver using hash indexes. The driver is actually the same; the only difference is the type of index used.
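As an illustration of matching connections to threads, here is a hypothetical named.conf fragment in the style of the DLZ PostgreSQL driver's configuration syntax. The host, database name, user and query are invented, the query list is abbreviated (the driver requires several more query strings), and the details should be checked against the driver's own page:

```
// named started with two worker threads:  named -n 2
dlz "postgres zone" {
    // "2" asks the driver for two database connections,
    // one per Bind worker thread
    database "postgres 2
        {host=localhost dbname=dns_data user=bind}
        {select zone from dns_records where zone = '$zone$'}";
};
```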
The BDBHPT driver can operate in three modes: Transactional (HPT-T), Concurrent (HPT-C) and Private (HPT-P). The different modes of operation provide varying degrees of read/write locking within Berkeley DB itself. Transactional mode provides the most locking and the safest concurrency for simultaneous reads and writes, but its performance suffers because of the locking overhead. Concurrent mode provides less locking and, as a result, operates faster. Private mode does no locking at all, and thus provides the fastest results. For more information on the various modes of operation and the benefits of each, see the BDBHPT driver's documentation.
* To get a "clean run", the query timeout for the file system driver tests had to be increased from its default of 5 seconds. For the 2.4.22 kernel, the timeout was set to 9 seconds for both 1 and 2 threads. With the 2.6.1 kernel, the timeout was set to 10 seconds for one thread and 12 seconds for two threads. I estimate about 5 percent of the queries would have been lost if the timeout values had not been increased.