This driver allows you to use Berkeley DB as a database for storage of DNS data. It has been tested on Windows 2000 and Red Hat Linux 8.0, and should build properly on any UN*X system that both BIND and Berkeley DB support. Be sure to specify --with-dlz-bdb when running configure so that the BDBHPT driver is built with BIND. This option builds both the original BDB driver and the BDBHPT driver. By default, DLZ and its drivers are not built; when you specify a DLZ driver, the DLZ core is automatically built too.

BE SURE TO READ THE SPECIAL BUILD INSTRUCTIONS FOR THE BDB DRIVERS!

BDBHPT stands for Berkeley DB High Performance Text. This driver was developed during the performance testing and profiling of DLZ. Profiling the original BDB driver revealed that its cursor joins were very expensive and severely reduced the driver's performance. BDBHPT uses a different Berkeley DB database schema that eliminates the need for cursor joins, and as a result it is much faster than the original BDB driver. In addition, a bug in Berkeley DB itself causes applications that use the concurrent data store capability together with joins to deadlock when large volumes of reads and writes are done simultaneously. The original BDB driver suffers from this issue. The BDBHPT driver does not do ANY joins, so it does not suffer from the Berkeley DB bug.

This driver is very similar in configuration and function to the original BDB driver. There are, however, three important differences. First, be sure to specify "bdbhpt" in the DLZ configuration instead of "bdb" so that BIND uses this driver. Second, as mentioned before, the database schema is different, so the way data is stored in BDBHPT differs considerably from the original BDB driver. Third, BDBHPT does not have a utility program like dlzbdb, so you will need to use Berkeley DB's db_load program to construct a BDBHPT database and develop your own utilities for maintaining it.

Below is a sample of a proper dlz_bdbhpt_driver configuration. This configuration segment belongs in BIND's configuration file (named.conf) and is explained line by line below. While you are setting up your own BDBHPT driver, be sure to start BIND with the parameters "-g -d 1". The first parameter, "-g", tells BIND to run in the foreground and write all log messages to stderr instead of a log file. The second parameter, "-d 1", sets BIND's debug level to 1. The Berkeley DB driver outputs additional information when the debug level is at least 1, which can be very helpful while you are setting up the driver. The additional information is output only when BIND is trying to answer DNS queries, not when BIND loads, so run a few sample DNS queries in order to see the output.

dlz "bdbhpt zone" {
   database "bdbhpt T /dns-root dnsdata.db";
};

The first line: dlz "bdbhpt zone" {

This line tells BIND we want to use a DLZ driver. The word "dlz" is a new BIND keyword added by the DLZ patch. The next section "bdbhpt zone" is the label for this configuration segment. It is used in any error messages BIND displays while parsing its config file. The last piece "{" starts the DLZ configuration section in BIND's config file.

The second line: database "bdbhpt T /dns-root dnsdata.db";

This line is indented just to make the configuration file easier to read. The keyword "database" is the only parameter that can be specified in a DLZ configuration segment, and it is required. The double quote (") begins the command line that is passed to the DLZ driver, in this case the Berkeley DB High Performance Text driver. The command line could be broken over many lines, but that is not necessary here. The next piece is the word "bdbhpt". This is the official name of the DLZ Berkeley DB High Performance Text driver, and it tells BIND that we want to use the Berkeley DB HPT driver. The word "bdbhpt" is located at argv[0] of the command line array passed to the driver. The driver name must always be at argv[0]; it is not optional.

Next is the letter "T". This determines the "mode" the BDBHPT driver should use. Three modes are supported: T - transactional, C - concurrent and P - private. Modes are discussed later in the documentation.

Next is "/dns-root". It should NOT have a "/" at the end. This is the directory path where the Berkeley DB environment is stored. The BDBHPT driver can take advantage of several different Berkeley DB data store modes and does not use the old style "DBM" access mode; thus it requires a Berkeley DB environment. When operating in "P" (private) mode, Berkeley DB does not actually create any environment files, but this parameter is still required.

The next item "dnsdata.db" is the path and name of the database file in which the DNS data is stored. This path is relative to the environment path. So the configuration shown here uses a database file called "dnsdata.db" in the path "/dns-root", for a full path of "/dns-root/dnsdata.db". You can also use an absolute path for the database file if you don't want it stored in the same folder as, or relative to, the Berkeley DB environment. I highly recommend that you place the database file in the same folder as the database environment just for ease of administration.

The last characters on the line are the closing double quote and the semicolon (";). These characters complete and close the command line and are part of BIND's standard configuration file syntax.

The third line: };

This closes the DLZ configuration section in BIND's config file. It is part of BIND's standard configuration file syntax.
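To make the layout of the "database" command line concrete, here is a minimal Python sketch that splits the quoted string into the four argv entries described above and applies the rules from the text (driver name at argv[0], mode of T, C or P, no trailing "/" on the environment path). The names BdbhptArgs, parse_database_string and database_path are made up for this illustration and are not part of DLZ; the driver's own parsing may differ in detail.

```python
import posixpath
from typing import NamedTuple


class BdbhptArgs(NamedTuple):
    driver: str   # argv[0]: must be "bdbhpt"
    mode: str     # argv[1]: T, C or P
    env_dir: str  # argv[2]: Berkeley DB environment directory
    db_file: str  # argv[3]: database file name


def parse_database_string(value: str) -> BdbhptArgs:
    """Split the quoted "database" value into its four parts and check them."""
    argv = value.split()
    if len(argv) != 4:
        raise ValueError("expected: bdbhpt <mode> <env dir> <db file>")
    args = BdbhptArgs(*argv)
    if args.driver != "bdbhpt":
        raise ValueError("argv[0] must be the driver name 'bdbhpt'")
    if args.mode not in ("T", "C", "P"):
        raise ValueError("mode must be T, C or P")
    if len(args.env_dir) > 1 and args.env_dir.endswith("/"):
        raise ValueError("environment path must not end with '/'")
    return args


def database_path(args: BdbhptArgs) -> str:
    """Resolve the database file relative to the environment directory.

    posixpath.join returns db_file unchanged when it is already absolute.
    """
    return posixpath.join(args.env_dir, args.db_file)


args = parse_database_string("bdbhpt T /dns-root dnsdata.db")
print(database_path(args))  # /dns-root/dnsdata.db
```

A check like this can be handy in a deployment script, so that a typo in named.conf is caught before BIND is restarted.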

How the Berkeley DB HPT driver works:

The Berkeley database file actually has 4 "databases" in it. Berkeley DB and "DBM" / "NDBM" / "GDBM" style databases are only simple key = value databases. That is to say that each database can only store a key, and the corresponding value for that key. In order to perform more complex query operations using multiple keys you need to use multiple key = value databases. The four databases in the database file are:

dns_data

The DNS data. The key is zone_name(a space)host_name. For example: "example.com www" (without the quotes). It is used during the "lookup" operation in DLZ, and also during zone transfers. See below for the proper format of the value field. For the best performance this should be a HASH database.

dns_zone

The zone index for the DNS data. The key is a reversed zone name, the value is empty. For best performance this should be a BTREE database. Berkeley DB can take advantage of "locality of reference" to increase performance in BTREE databases.

When BIND uses DLZ, it must first determine which portion of the query is the zone and which portion is the host. So if the query was for "www.test.example.com", the zone name may be any of "com", "example.com", "test.example.com", or even "www.test.example.com". The beginnings of these strings are very different alphabetically, so as DLZ checks each one it has to move to very different places in the database, which takes time and reduces cache effectiveness.

Instead, BDBHPT reverses the string so the above examples would be: "moc", "moc.elpmaxe", "moc.elpmaxe.tset", "moc.elpmaxe.tset.www". With all the strings backwards, they are very close to each other alphabetically, and thus stored close to each other in the database. This drastically improves cache effectiveness and performance. Even though it does take some time to reverse the string, the performance of BDB is so much improved that overall the speed of the driver is increased.

The value in this database is empty because there is no need for a value. All we need is to check for a matching zone name. Keeping the database as small as possible also helps to increase performance.
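The effect of reversing the zone names can be seen in a few lines of Python. This is only a sketch of the idea: the function names reverse_zone and candidate_zones are invented for the example, and a sorted Python list stands in for the ordering of a BTREE database.

```python
from typing import List


def reverse_zone(zone: str) -> str:
    """Return the zone name reversed character by character (dns_zone key)."""
    return zone[::-1]


def candidate_zones(query: str) -> List[str]:
    """List every suffix of the query name that could be the zone."""
    labels = query.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


candidates = candidate_zones("www.test.example.com")
keys = [reverse_zone(z) for z in candidates]
print(keys)
# ['moc', 'moc.elpmaxe', 'moc.elpmaxe.tset', 'moc.elpmaxe.tset.www']

# Mixed in with unrelated zones, the reversed candidates still sort together.
all_keys = sorted(keys + [reverse_zone("example.org"), reverse_zone("aaa.net")])
print(all_keys)
# ['gro.elpmaxe', 'moc', 'moc.elpmaxe', 'moc.elpmaxe.tset',
#  'moc.elpmaxe.tset.www', 'ten.aaa']
```

The four candidate keys share the prefix "moc", so they end up next to each other in sorted order, which is the locality the BTREE database benefits from.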

dns_xfr

To do zone transfers we need to retrieve all of the DNS entries for a zone. For the best performance during standard DNS queries, the dns_data database uses both the zone and host names as its key. Berkeley DB only supports searching by equality on the whole key, not searching by sub-string, so the dns_data database cannot be used directly for zone transfers. There were two ways this could be solved.

Option one was to create another database that used the zone name as the key and had all the DNS data as the value. This would make the database very large because all the DNS data would be stored twice, and larger databases generally perform slower too. The purpose of the BDBHPT database was to be as efficient as a text database could be.

Option two (which BDBHPT uses) was to create a database that used the zone name as the key and the host name as the value. Only unique zone/host pairs are allowed in the database. So even if the host www in the zone example.com has multiple DNS data entries, the zone/host pair example.com/www is in the database only once, similar to how only unique zone/IP pairs are allowed in the dns_client database. Look at the dns_client information for an example. In this way, when a zone transfer needs to be performed, we can look up the zone in dns_xfr to retrieve a list of hosts. Then, for each host in that list, we can look up all the DNS data for that zone/host pair in the dns_data database. This may make zone transfers a little less speedy, but generally zone transfers don't have to be as fast or occur as often as normal DNS queries.
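The two-step zone transfer lookup can be sketched as follows. Plain Python dictionaries stand in for the dns_xfr and dns_data databases, and the class name ZoneStore and its methods are hypothetical, chosen only for this illustration; the sample values assume the field order given in the dns_data value table.

```python
from typing import Dict, List


class ZoneStore:
    """In-memory stand-in for the dns_data and dns_xfr databases."""

    def __init__(self) -> None:
        self.dns_data: Dict[str, List[str]] = {}  # "zone host" -> record values
        self.dns_xfr: Dict[str, List[str]] = {}   # zone -> unique host names

    def add_record(self, zone: str, host: str, value: str) -> None:
        # The dns_data key is zone_name, a space, then host_name.
        self.dns_data.setdefault(f"{zone} {host}", []).append(value)
        hosts = self.dns_xfr.setdefault(zone, [])
        if host not in hosts:  # keep zone/host pairs unique
            hosts.append(host)

    def zone_transfer(self, zone: str) -> List[str]:
        # Step 1: list the hosts of the zone. Step 2: fetch each host's data.
        records: List[str] = []
        for host in self.dns_xfr.get(zone, []):
            records.extend(self.dns_data[f"{zone} {host}"])
        return records


store = ZoneStore()
store.add_record("example.com", "www", "2 www 10 A 192.168.0.1")
store.add_record("example.com", "www", "7 www 10 MX 20 mail")
store.add_record("example.com", "mail", "3 mail 10 A 192.168.0.2")

print(store.dns_xfr["example.com"])  # ['www', 'mail']
print(len(store.zone_transfer("example.com")))  # 3
```

Note that www appears only once in dns_xfr even though it has two records, while the transfer still returns all three records.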

dns_client

This database is used to determine whether a client is allowed to perform zone transfers. The keys are zone names, and the values are IP addresses. The same key may appear more than once, and the same IP address may appear under more than one key, but each zone and IP address pair may appear only once. For example, you may have:

key            value
example.com    127.0.0.1
example.com    192.168.0.1
example2.com   127.0.0.1
example2.com   192.168.0.1
example.com    127.0.0.1      // not allowed!

All of the above would be allowed except the last line, because the key "example.com" already has the value "127.0.0.1".
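The uniqueness rule can be modelled with a set of (zone, IP) pairs. This is a rough sketch with a made-up class name, ClientAcl, and it assumes a client is allowed to transfer a zone exactly when its zone/IP pair is present in the database.

```python
from typing import Set, Tuple


class ClientAcl:
    """Stand-in for dns_client: a set of unique (zone, ip) pairs."""

    def __init__(self) -> None:
        self._pairs: Set[Tuple[str, str]] = set()

    def add(self, zone: str, ip: str) -> bool:
        """Add a pair; return False if that exact pair already exists."""
        pair = (zone, ip)
        if pair in self._pairs:
            return False
        self._pairs.add(pair)
        return True

    def allows_transfer(self, zone: str, ip: str) -> bool:
        return (zone, ip) in self._pairs


acl = ClientAcl()
print(acl.add("example.com", "127.0.0.1"))     # True
print(acl.add("example.com", "192.168.0.1"))   # True
print(acl.add("example2.com", "127.0.0.1"))    # True
print(acl.add("example.com", "127.0.0.1"))     # False, duplicate pair
print(acl.allows_transfer("example2.com", "192.168.0.1"))  # False
```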

Data in the "dns_data" value field must be specified in the following order. Exactly one space " " must separate each field, and records of DNS type TXT should include any necessary quoting. Fields that a record type does not need should be omitted.

Order  Name            Data Type     Description
1      replication_id  string        A unique alpha-numeric id for this record
2      host            string        DNS host name
3      ttl             string (num)  Time to live (string must convert to number)
4      type            string        DNS data type
5      mx_priority     string (num)  MX priority (only for MX DNS types)
6      data            string        IP address / host name / full domain name
7      primary_ns      string        Primary name server for SOA record (SOA ONLY)
8      resp_person     string        Responsible person for SOA record (SOA ONLY)
9      serial          string (num)  Serial # for SOA record (SOA ONLY)
10     refresh         string (num)  Refresh time for SOA record (SOA ONLY)
11     retry           string (num)  Retry time for SOA record (SOA ONLY)
12     expire          string (num)  Expire time for SOA record (SOA ONLY)
13     minimum         string (num)  Minimum time for SOA record (SOA ONLY)

Like most other DLZ drivers, the BDBHPT driver supports wildcard host names using "*" as the wildcard, and the character "@" is used as the host name at the zone apex.
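As a rough illustration of the value layout, the following Python sketch builds and parses dns_data values using the field order from the table (replication_id, host, ttl, type, then the type-specific remainder). The helper names format_dns_data and parse_dns_data are invented here, and the real driver's parsing may differ in detail.

```python
from typing import Dict, Union


def format_dns_data(replication_id: str, host: str, ttl: int,
                    rtype: str, *rest: str) -> str:
    """Join the fields with exactly one space between each."""
    return " ".join([replication_id, host, str(ttl), rtype, *rest])


def parse_dns_data(value: str) -> Dict[str, Union[str, int]]:
    """Split off the four leading fields; keep the remainder untouched."""
    parts = value.split(" ", 4)
    if len(parts) < 5:
        raise ValueError("expected: replication_id host ttl type data...")
    replication_id, host, ttl, rtype, rest = parts
    return {
        "replication_id": replication_id,
        "host": host,
        "ttl": int(ttl),  # raises ValueError if the ttl is not a number
        "type": rtype,
        "data": rest,     # mx_priority, data or SOA fields, by record type
    }


print(format_dns_data("5", "@", 10, "MX", "20", "mail"))  # 5 @ 10 MX 20 mail
print(parse_dns_data("2 www 10 A 192.168.0.1")["data"])   # 192.168.0.1
```

Because the split stops after the fourth space, a quoted TXT string with embedded spaces stays intact in the "data" entry.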

Some examples:

1 @ 10 SOA ns1.example.com. root.example.com. 2 2800 7200 604800 86400
2 www 10 A 192.168.0.1
3 mail 10 A 192.168.0.2
4 backup 10 A 192.168.0.3
5 @ 10 MX 20 mail
6 @ 10 MX 40 backup
7 www 10 MX 20 mail
8 www 10 MX 40 backup
9 ns1 10 A 192.168.0.4
10 ns2 10 A 192.168.0.5
11 @ 10 NS ns1.example.com.
12 @ 10 NS ns2.example.com.
13 @ 10 TXT "This is some sample text.  Notice it is inside of quotes"
14 @ 10 TXT "The quotes are required by BIND so that it returns the"
15 @ 10 TXT "string as a single entry.  Otherwise it will return each"
16 @ 10 TXT "word as an entry instead of the entire string."
17 @ 10 TXT "This text is actually 8 separate text records.  It was"
18 @ 10 TXT "broken up this way to be easier to read.  You may"
19 @ 10 TXT "specify long text entries if you want. i.e. this could"
20 @ 10 TXT "have all been one entry if we wanted to do that."
21 * 10 A 192.168.0.6

As no dlzbdb type utility is provided for BDBHPT yet, you will need to build your own or use the db_load program that comes with Berkeley DB. Sample db_load files can be generated by using the bdbhpt writer that is included with the DLZ Performance Tools since version 1.1.
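If you write your own generator instead, the sketch below shows one way to produce input for db_load's plain-text mode (-T), in which each key and its value occupy alternating lines and backslash is an escape character. The function names are made up for this example, the sample value assumes the field order from the dns_data table, and the db_load options needed to select the database type and the database name within the file are not shown; check the db_load documentation for those.

```python
from typing import Iterable, Tuple


def escape(text: str) -> str:
    """Escape backslashes and newlines for db_load text input."""
    return text.replace("\\", "\\\\").replace("\n", "\\0a")


def to_db_load_text(pairs: Iterable[Tuple[str, str]]) -> str:
    """Emit each key on one line followed by its value on the next."""
    lines = []
    for key, value in pairs:
        lines.append(escape(key))
        lines.append(escape(value))
    return "".join(line + "\n" for line in lines)


# dns_data: key is "zone host", value is the record string.
dns_data_text = to_db_load_text([
    ("example.com www", "2 www 10 A 192.168.0.1"),
])

# dns_zone: key is the reversed zone name, value is empty.
dns_zone_text = to_db_load_text([("example.com"[::-1], "")])

print(dns_data_text, end="")
print(repr(dns_zone_text))  # 'moc.elpmaxe\n\n'
```

Each of the four databases would get its own input file built this way and be loaded separately.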

BDBHPT Operating Modes

The BDBHPT driver supports three modes of operation. They are T - transactional, C - concurrent and P - private. The mode setting determines which BDB flags are used to create and open the BDB database environment. If you develop an application to operate on the BDBHPT database, be sure your application's environment flags match those in use by the DLZ driver. For a better understanding of how the BDBHPT driver uses the environment flags, look at the dlz_bdbhpt_driver.c source code.

The "T" mode is transactional. It allows multiple applications simultaneous read and write capability on the database, with support for committing and rolling back transactions. The "C" mode uses BDB's concurrent data store features. This allows multiple applications read and write capability on the database, but does not support commit or rollback operations. In both transactional (T) and concurrent (C) modes, Berkeley DB automatically handles locking internally.

The "P" mode is private. When using this mode of operation, Berkeley DB does not create any environment files and stores its environment in application memory. In this mode, no other application can safely make changes to the database while the BDBHPT driver is running. Since the BDBHPT driver itself does not do any writing to the database, none of BDB's locking sub-systems are used. This provides the highest possible speed because the overhead of locking, which can be significant, is eliminated. For more information about the different data store modes of Berkeley DB, see the documentation provided with that software package.

The entire purpose of DLZ is to allow dynamic updates to a database while the BIND DNS server is running, so why provide a mode that does not allow updates? Put simply, SPEED. The locking that is necessary to support safely updating the BDB database creates a lot of overhead in query processing, which reduces the overall performance of the BIND-DLZ server. However, the BDB database still provides other benefits, like nearly instantaneous startup times. We can take advantage of this feature to still meet most of DLZ's goals while also providing high performance.

Performing "updates" to a BDBHPT database in private mode is fairly simple. First, we make a copy of the database. This copy can be in the same directory or in another directory. Then we make updates to the copy of the BDB database using the transactional data store mode, so we can perform rollbacks if necessary. The BDB environment used for these updates SHOULD NOT be created in the same directory that the DLZ driver uses for its environment. When all the changes have been made, the database should be closed and the environment files removed. Then, using the UN*X rename function, the two copies (original and updated) of the BDB database file can be swapped, and an rndc reload can be executed. This causes BIND to reload its configuration and begin using the newly updated BDB database. Because of the nearly instantaneous startup times when using the BDBHPT driver, there will be minimal interruption of service by your DNS server. Of course, this process requires careful synchronization to prevent more than one application from trying to make updates simultaneously.
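The copy, swap and reload steps might be scripted along the following lines. This is only a sketch: the function names are invented, the step that actually edits the copy through a transactional Berkeley DB environment is left out, and the swap uses a temporary name ending in ".swap" (an arbitrary choice) because swapping two files takes three renames. Both files are assumed to be on the same filesystem.

```python
import os
import shutil
import subprocess
from typing import Callable, Optional


def make_working_copy(live_db: str, work_db: str) -> None:
    """Copy the live database file so updates can be made to the copy."""
    shutil.copy2(live_db, work_db)


def swap_databases(live_db: str, updated_db: str) -> None:
    """Swap the two files by renaming through a temporary name."""
    tmp = live_db + ".swap"
    os.rename(live_db, tmp)
    os.rename(updated_db, live_db)
    os.rename(tmp, updated_db)


def publish(live_db: str, updated_db: str,
            reload: Optional[Callable[[], object]] = None) -> None:
    """Swap in the updated database, then ask BIND to reload."""
    swap_databases(live_db, updated_db)
    if reload is None:
        # Default action: run "rndc reload" and fail loudly on error.
        def reload() -> object:
            return subprocess.run(["rndc", "reload"], check=True)
    reload()


# Typical use (paths are examples only):
#   make_working_copy("/dns-root/dnsdata.db", "/dns-work/dnsdata.db")
#   ... update /dns-work/dnsdata.db, close it, remove its environment files ...
#   publish("/dns-root/dnsdata.db", "/dns-work/dnsdata.db")
```

After the swap, the old database is left under the working copy's name, which makes it easy to swap back if the new data turns out to be wrong. A lock file or similar guard would still be needed so that only one updater runs at a time.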

The private mode of operation for the DLZ BDBHPT driver should only be used when server performance is the single most important criterion and there are few updates to the DNS data. The performance of the concurrent mode is fairly close, and it is much easier to use and implement.