Faster updates
Hello, List!
Asking for some advice on a strategy for speeding up batch updates to a
directory.
If you think my whole approach is "challenged" please feel free to
provide advice on that subject, too.
But first, a request: I'm only a very amateur techie, so be gentle.
Scenario:
I'm running an OpenLDAP directory (installed by someone else) that's
accessed via a Web page (so the LDAP client is the appserver, and that's
Tomcat). I'm programming in Java, using the JNDI API.
The basic requirement is to maintain a directory fed by batch (FTP)
updates from some 20 completely different organizations. The data
originate typically in Exchange, Notes or various DBMSes. To minimize
requirements for these orgs, we've set a policy that they send complete
replacement files for their orgs, in LDIF format. Most people seem to
be able to generate these from their systems without programming. We ask
only for PERSON entries, so we're responsible for building appropriate
parents (and removing sub-orgs that become empty, e.g. via name changes or
re-organization). We're also responsible for determining what's an add,
delete or modify.
Given that, I've coded an update program that:
(1) builds a "temporg" subtree in my directory from each new data file
(including required parents.)
(2) for each entry in this tree, looks for a matching entry in the
"permanent" tree. If not found, then add entry (and parent(s) if required.)
(3) If found, but attributes don't match
(!oldBasicAttributes.equals(newBasicAttributes)), replace old attributes.
(4) Then, recursing through the permanent subtree for the updating org,
look for a matching entry in the temp tree. If there is no match, then delete
the entry (children first, taking advantage of the recursive approach
of this method.)
(5) Finally, delete the temp subtree.
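For illustration, here is a rough sketch of the comparison in steps (2)
through (4) done in memory rather than against a temp subtree. It is not
the actual update program: it assumes both sets of entries have already
been read into maps keyed by normalized DN, and the names DirectoryDiff,
Plan, diff and depth are made up for the example. Counting DN components
by splitting on commas is a simplification that ignores escaped commas.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class DirectoryDiff {

    /** DNs to add, modify and delete, in an order that is safe to apply. */
    public static final class Plan {
        public final List<String> adds = new ArrayList<>();
        public final List<String> modifies = new ArrayList<>();
        public final List<String> deletes = new ArrayList<>();
    }

    /**
     * Both maps are keyed by DN (assumed already normalized); each value
     * maps an attribute name to its set of values.
     */
    public static Plan diff(Map<String, Map<String, Set<String>>> permanent,
                            Map<String, Map<String, Set<String>>> incoming) {
        Plan plan = new Plan();

        // Steps (2) and (3): walk the incoming entries in sorted DN order.
        for (Map.Entry<String, Map<String, Set<String>>> e
                : new TreeMap<>(incoming).entrySet()) {
            Map<String, Set<String>> old = permanent.get(e.getKey());
            if (old == null) {
                plan.adds.add(e.getKey());
            } else if (!old.equals(e.getValue())) {
                plan.modifies.add(e.getKey());
            }
        }

        // Step (4): anything in the permanent set but not incoming goes.
        for (String dn : permanent.keySet()) {
            if (!incoming.containsKey(dn)) {
                plan.deletes.add(dn);
            }
        }

        Comparator<String> byDepth = Comparator.comparingInt(DirectoryDiff::depth);
        Comparator<String> byName = Comparator.<String>naturalOrder();
        // Parents must exist before children are added ...
        plan.adds.sort(byDepth.thenComparing(byName));
        // ... and children must be removed before their parents.
        plan.deletes.sort(byDepth.reversed().thenComparing(byName));
        return plan;
    }

    /** Naive component count: assumes no escaped commas in the DN. */
    static int depth(String dn) {
        return dn.split(",").length;
    }
}
```

The resulting plan would still have to be applied through JNDI (or written
out as an LDIF change file), so this only removes the temp subtree and its
lookups, not the writes to the permanent tree.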
Works great (mostly <g>), but takes approximately forever.
So, I was thinking, maybe I could maintain a separate database for the
temp subtree, and then (a) just delete it to save time on step (5); (b)
take it offline and batch build it with slapadd (with some mods to my
program, to be sure!), or (c) both.
Also, can I assume that slapadd is LOTS faster than using the JNDI API?
I guess I'd give up the relatively flexible error-handling I have now,
right? Entries that failed for any reason would just fail and have to be
actually looked at.
Anyhow: your advice appreciated.
TIA,
Martin
PS--Why not just do a complete rebuild for every update rather than
check for A/D/M? Good question. It would certainly be faster. Only
reason is to preserve the option to have self-updateable attributes in
the database (favorite drink, as opposed to salary.)
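
To make the self-updateable idea concrete, here is a minimal sketch of how
a modify could take everything from the feed except the attributes a
person owns themselves. It assumes attribute names are already lowercased,
and the class name AttributeMerge, the method merge and the attribute
names in the usage note are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class AttributeMerge {

    /**
     * Returns the attributes to store for an entry: everything from the
     * feed, except that self-managed attributes keep whatever the
     * existing entry holds (and are dropped if it holds nothing).
     */
    public static Map<String, Set<String>> merge(Map<String, Set<String>> existing,
                                                 Map<String, Set<String>> fromFeed,
                                                 Set<String> selfManaged) {
        Map<String, Set<String>> result = new HashMap<>(fromFeed);
        for (String name : selfManaged) {
            Set<String> kept = existing.get(name);
            if (kept != null) {
                result.put(name, kept);   // the person's own value wins
            } else {
                result.remove(name);      // the feed may not set this attribute
            }
        }
        return result;
    }
}
```

For example, with "favouritedrink" listed as self-managed, an existing
value of "tea" survives an update whose feed changes only "mail". The
match test in step (3) would need to ignore the same attributes, or every
entry that carries one would be flagged as modified.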