Hi,
I would like to bring up the problem with back-mdb and
alias-dereferening again. There is Bug ITS#7657 for this already, and
in
http://www.openldap.org/lists/openldap-technical/201504/msg00008.html
the problem was also addressed. Probably in other places as well, but
I did not find anything else in the archives.
I would like to reconfigure all my LDAP clusters to use back-mdb,
however, I can't: my LDAP tree has about 2 Mio entries, 600000 of
which are alias entries.
I observe that when using back-mdb and making search request with
search scope "sub" and deref=always, than this search request will
take seconds before a response... in contrast, with deref=never, the
same search request takes only millisecs. It does not matter whether
there are aliases in the search scope or not... with deref=always it
always takes seconds, each request keeping one CPU thread very busy.
So if many requests come in with sub/always (which unfortunately
happens a lot because many LDAP clients just set deref=always as a
default) it is a sure way to bring down an LDAP cluster configured with
back-mdb.
The problem has haunted me for a while (incidents, incidents :-), and
I have tried to find out where it comes from. I would like to share my
conjectures here.
A search request with sub/always eventually ends up in the function
search_aliases(..), located in back-mdb/search.c (code/linenumbers is
from v2.4.46 ):
157 /* Find all aliases in database */
158 MDB_IDL_ZERO( aliases );
159 rs->sr_err = mdb_filter_candidates( op, isc->mt, &af, aliases,
160 curscop, visited );
161 if (rs->sr_err != LDAP_SUCCESS || MDB_IDL_IS_ZERO( aliases )) {
162 return rs->sr_err;
163 }
164 oldsubs[0] = 1;
165 oldsubs[1] = e_id;
166
167 MDB_IDL_ZERO( visited );
168 MDB_IDL_ZERO( newsubs );
169
170 cursoro = 0;
171 ido = mdb_idl_first( oldsubs, &cursoro );
172
173 for (;;) {
174 /* Set curscop to only the aliases in the current scope. Start with
175 * all the aliases, then get the intersection with the scope.
176 */
177 rs->sr_err = mdb_idscope( op, isc->mt, e_id, aliases, curscop );
After adding a bit of extra debug code, I could see in the logs that
the time is actually spent in line 177, i.e. the mdb_idscope function.
What I believe is happening is the following: in line 159 the index
list of all aliases in the DB is constructed and put into the
"aliases" list. The list comes from the objectClass index, but since
there are so many alias entries, the index contains not an explicit
list, but a range... a range that covers basically *all* 2 Mio entries
in the DB. (this happens also with the hdb backend, so the fact that
"aliases" is a range is perhaps part, but not cause of the problem).
Then the for loop is entered in 173, and mdb_idscope in 177 is
called. The idea there is to compute the intersection of alias list
"aliases" and search scope, i.e. get all aliases that are in the
actual search scope.
This is done by walking through the 2 millions indexes represented by
the range in "aliases", and deciding whether the id is in the scope or
not. This, if I understand correctly, by walking up the LDAP tree from
leaf towards root (actually walking up the structure of the dn2id DB)
... if we encounter the search base on the way, the alias is part of
the search scope, if we reach the root, then not.
Computing this intersection is what costs time, and it does not
depend on whether the intersection is empty or not.
Apparently it is done differently in back_hdb and by extension,
back_bdb. I have not analysed that code there in detail but I believe
the intersection is computed by actually looking at 2 different ID
lists, one containing the aliases (range) again, the other one the
actual IDs of the search scope. Which, it appears, is more efficient.
So the problem seems to be to compute the relevant aliases to start
expanding from. The way it is done now the complexity seems to be
linear in the number of aliases, if not number of entries on the LDAP
tree.
For my own use cases at least it would be much more efficient to just
do a tree traversal of the search scope, pick up the relevant aliases
and put them in the list. I believe even for large trees this would
work quite well... if I search my whole tree for entries with
objectClass=alias (deref=never, of course), this is amazingly fast. I
am currently looking into how something like that could be
implemented, but I need to dive much deeper in the code before I could
come up with something worth considering. >
What are the views of the developers on this? Does the analysis sound
plausible? Could a fix be easily obtained?