Hello,
all!
I've just committed a big bunch of improvements to back-sql, contributed by Sam Drake and Raj Damani from TimesTen Performance Software. It is available through CVS (HEAD branch). Summary of changes is cited below. In brief - this eliminates most of known deficiencies of back-sql, and is expected to boost out-of-box performance by SEVERAL TIMES ! The patch will undergo some cosmetic changes later, when I return from vacation, but it should be ready to test. Rajen kindly agreed to answer all questions asked about this patch on openldap-devel and -software lists, and provide bugfixes (if any will be needed). All those will be committed when I return. Big thanks to Sam and Raj, WBW, Dmitry ----- Original Message ----- From: "Rajen Damani" <rdamani@timesten.com> To: <mitya@seismic.ru> Sent: Saturday, July 28, 2001 2:52 AM Subject: Back-sql improvements > Hello Dmitry, > My name is Raj Damani and I work with Sam Drake at TimesTen Performance > Software. Sam had made bunch of improvements to OpenLDAP a while ago. I > recently made the same changes to latest OpenLDAP 2.0.11 source. We would > like to send those changes to you. (email below describes all those > changes). > > I was wondering, how I should submit these changes. I can send you the > entire upgraded (back-sql) files that include all improvements via email. > If this does not work for you, please let me know how we can submit these > changes. > > Thanks. > -Raj Damani > TimesTen Performance Software > > -----Original Message----- > From: Sam Drake [mailto:drake@timesten.com] > Sent: Saturday, April 07, 2001 10:40 PM > To: 'mitya@seismic.ru' > Cc: openldap-devel@OpenLDAP.org > Subject: RE: Slapd frontend performance issues > > > FYI, here is a short description of the changes I made. I'll package up the > changes asap, but it may take a couple of days. > > The performance numbers quoted in this report were seen at my location with > a 100,000 object database ... the slower numbers I mentioned earlier were > reported by a customer with a 1,000,000 object database. > > I also can't explain the very poor performance I saw with OpenLDAP and LDBM > with a 100,000 object database. > > ...Sam Drake / TimesTen Performance Software > > ---------- > > Work Performed > > OpenLDAP 2.0.9, including back-sql, was built successfully on Solaris > 8 using gcc. The LDAP server itself, slapd, passed all tests bundled > with OpenLDAP. OpenLDAP was built using Sleepycat LDBM release 3.1.17 > as the "native" storage manager. > > The experimental back-sql facility in slapd was also built > successfully. It was built using Oracle release 8.1.7 and the Oracle > ODBC driver and ODBC Driver Manager from Merant. Rudimentary testing > was performed with the data and examples provided with back-sql, and > back-sql was found to be functional. > > Slapd and back-sql were then tested with TimesTen, using TimesTen > 4.1.1. Back-sql was not immediately functional with TimesTen due to a > number of SQL limitations in the TimesTen product. > > Functional issues encountered were: > > 1. Back-sql issued SELECT statements including the construct, > "UPPER(?)". While TimesTen supports UPPER, it does not support the > use of parameters as input to builtin functions. Back-sql was > modified to convert the parameter to upper case prior to giving it > to the underlying database ... a change that is appropriate for all > databases. > > 2. Back-sql issued SELECT statements using the SQL CONCAT function. > TimesTen does not support this function. Back-sql was modified to > concatentate the necessary strings itself (in "C" code) prior to > passing the parameters to SQL. This change is also appropriate for > all databases, not just TimesTen. > > Once these two issues were resolved, back-sql could successfully > process LDAP searches using the sample data and examples provided with > back-sql. > > While performance was not measured at this point, numerous serious > performance problems were observed with the back-sql code and the > generated SQL. In particular: > > 1. In the process of implementing an LDAP search, back-sql will > generate and execute a SQL query for all object classes stored in > back-sql. During the source of generating each SQL query, it is > common for back-sql to determine that a particular object class can > not possibly have any members satisfying the search. For example, > this can occur if the query searches an attribute of the LDAP > object that does not exist in the SQL schema. In this case, > back-sql would generate and issue the SQL query anyway, including a > clause such as "WHERE 1=0" in the generated SELECT. The overhead > of parsing, optimizing and executing the query is non-trivial, and > the answer (the empty set) is known in advance. Solution: Back-sql > was modified to stop executing a SQL query when it can be > predetermined that the query will return no rows. > > 2. Searches in LDAP are fundamentally case-insensitive ("abc" is equal > to "aBc"). However, in SQL this is not normally the case. > Back-sql thus generated SQL SELECT statements including clauses of > the form, "WHERE UPPER(attribute) = 'JOE'". Even if an index is > defined on the attribute in the relational database, the index can > not be used to satisfy the query, as the index is case sensitive. > The relational database then is forced to scan all rows in the > table in order to satisfy the query ... an expensive and > non-scalable proposition. Solution: Back-sql was modified to allow > the schema designer to add additional "upper cased" columns to the > SQL schema. These columns, if present, contain an upper cased > version of the "standard" field, and will be used preferentially > for searching. Such columns can be provided for all searchable > columns, some columns, or no columns. An application using > database "triggers" or similar mechanisms can automatically > maintain these upper cased columns when the standard column is > changed. > > 3. In order to implement the hierarchical nature of LDAP object > hierarchies, OpenLDAP uses suffix searches in SQL. For example, to > find all objects in the subtree "o=TimesTen,c=us", a SQL SELECT > statement of the form, "WHERE UPPER(dn) LIKE '%O=TIMESTEN,C=US'" > would be employed. Aside from the UPPER issue discussed above, a > second performance problem in this query is the use of suffix > search. In TimesTen (and most relational databases), indexes can > be used to optimize exact-match searches and prefix searches. > However, suffix searches must be performed by scanning every row in > the table ... an expensive and non-scalable proposition. Solution: > Back-sql was modified to optionally add a new "dn_ru" column to the > ldap_entries table. This additional column, if present, contains a > byte-reversed and upper cased version of the DN. This allows > back-sql to generate indexable prefix searches. This column is > also easily maintained automatically through the use of triggers. > > Results > > A simple database schema was generated holding the LDAP objects and > attributes specified by our customer. An application was written to > generate test databases. Both TimesTen and Oracle 8.1.7 were > populated with 100,000 entry databases. > > Load Times > > Using "slapadd" followed by "slapindex", loading and indexing 100,000 > entries in an LDBM database ran for 19 minutes 10 seconds. > > Using a C++ application that used ODBC, loading 100,000 entries into > a disk based RDBMS took 17 minutes 53 seconds. > > Using a C++ application that used ODBC, loading 100,000 entries into > TimesTen took 1 minute 40 seconds. > > Search Times > > The command, "timex timesearch.sh '(cn=fname210100*)'" was used to > test search times. This command issues the same LDAP search 4000 > times over a single LDAP connection. Both the client and server > (slapd) were run on the same machine. > > With TimesTen as the database, 4000 queries took 14.93 seconds, for a > rate of 267.9 per second. > > With a disk based RDBMS as the database, 4000 queries took 77.79 seconds, > for a > rate of 51.42 per second. > > With LDBM as the database, 1 query takes 76 seconds, or 0.076 per > second. Something is clearly broken. |