RE: dissection of search latency
Fascinating. Assuming that both backends were built using the same version
of Berkeley DB, I'd guess the slowdown in back-bdb is due to transaction
management. Except, the search code is not transaction-protected, so it's
hard to say what the real issue is. I believe in both cases Berkeley DB is
used with the Btree access method, so again the difference in size and
access times is rather odd.
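
For reference, the search-path read should boil down to a plain,
non-transactional DB->get. Here's a minimal sketch of that shape against a
4.1-style Berkeley DB API (the file name and the 32-bit entry ID key are my
assumptions for illustration, not back-bdb's actual layout):

    #include <string.h>
    #include <db.h>

    /* Sketch only: a non-transactional Btree lookup, roughly the shape
     * of a search-path read.  File name and key layout are illustrative. */
    int
    fetch_entry(DB_ENV *env, u_int32_t id, DBT *data)
    {
        DB *dbp;
        DBT key;
        int rc;

        if ((rc = db_create(&dbp, env, 0)) != 0)
            return rc;
        /* NULL transaction handle: the open itself is not
         * transaction-protected. */
        rc = dbp->open(dbp, NULL, "id2entry.bdb", NULL,
                       DB_BTREE, DB_RDONLY, 0);
        if (rc != 0) {
            dbp->close(dbp, 0);
            return rc;
        }

        memset(&key, 0, sizeof(key));
        key.data = &id;
        key.size = sizeof(id);
        memset(data, 0, sizeof(*data));
        data->flags = DB_DBT_MALLOC;    /* library allocates the result */

        /* NULL txn again: the read runs outside any transaction, so no
         * transactional overhead should appear on this path. */
        rc = dbp->get(dbp, NULL, &key, data, 0);
        dbp->close(dbp, 0);
        return rc;
    }
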
Out of curiosity, how much RAM was available on the system during these
tests? Can I assume that swapping activity was zero at all times?
When the back-bdb database was built, were all of the log files
checkpointed and then removed? I'm not sure whether that has any relevance
to runtime performance, but it obviously consumes disk space. (I don't
recall whether ext2fs cares, but Berkeley Fast File System performance
always degraded once disk usage went above 90%.)
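
(Forcing a checkpoint and then removing the dead log files can be done with
the db_checkpoint and db_archive utilities, or programmatically; a minimal
sketch of the latter, assuming an environment opened with DB_INIT_TXN and
DB_INIT_LOG:)

    #include <db.h>

    /* Sketch only: force a checkpoint, then remove log files no longer
     * needed for recovery. */
    int
    trim_logs(DB_ENV *env)
    {
        int rc;

        /* DB_FORCE: checkpoint even if nothing has changed since the
         * last checkpoint. */
        if ((rc = env->txn_checkpoint(env, 0, 0, DB_FORCE)) != 0)
            return rc;

        /* DB_ARCH_REMOVE: delete the no-longer-needed log files instead
         * of returning their names. */
        return env->log_archive(env, NULL, DB_ARCH_REMOVE);
    }
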
A couple of the times are actually slower in the warm case, and it may not
just be a measurement anomaly since it appears for both backends. I wonder
why that is.
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support
> -----Original Message-----
> From: owner-openldap-devel@OpenLDAP.org
> [mailto:owner-openldap-devel@OpenLDAP.org] On Behalf Of Jonghyuk Choi
> Sent: Wednesday, November 14, 2001 1:55 PM
> To: openldap-devel@OpenLDAP.org
> Subject: dissection of search latency
>
>
> Below is a dissection of search latency.
>
> - Test Environment
>
> directory data : 10011 entries (from DirectoryMark dbgen tool)
> cache size : 10011 entries
> dbcache size : 200MB
> no indexing
> single search for an entry using the filter 'cn=Audy Flansburg'
> (s = seconds, m = milliseconds, u = microseconds)
> * cold run : all data from disk, warm run : all data from entry cache
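>
> (The per-phase numbers below come from instrumentation at microsecond
> resolution; a minimal sketch of that kind of timing, with illustrative
> names rather than slapd's actual code:)
>
>     #include <stdio.h>
>     #include <sys/time.h>
>
>     /* Sketch only: gettimeofday() pairs around each phase give the
>      * microsecond resolution used in the tables below. */
>     static long
>     elapsed_us(const struct timeval *t0, const struct timeval *t1)
>     {
>         return (t1->tv_sec - t0->tv_sec) * 1000000L
>              + (t1->tv_usec - t0->tv_usec);
>     }
>
>     void
>     timed_phase(void (*phase)(void), const char *name)
>     {
>         struct timeval t0, t1;
>
>         gettimeofday(&t0, NULL);
>         phase();
>         gettimeofday(&t1, NULL);
>         printf("%s: %ldu\n", name, elapsed_us(&t0, &t1));
>     }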
>
> - Latency Dissection Result
>
> 1. back-ldbm
>
>                           cold run        warm run
>   ASN.1 decoding          130u            179u
>   base entry retrieval    12u             20u
>   candidate selection     127m882u        598u
>   idl loop control        6m623u          6m057u
>   lock, time check        7m557u          7m293u
>   entry retrieval         2s400m668u      26m405u
>     cache_find_entry_id   12m644u         25m718u
>     ldbm_cache_open       80m335u         0
>     ldbm_cache_fetch      1s167m728u      0
>     str2entry             945m619u        0
>     cache_add_entry       106m888u        0
>     cache_entry_commit    5m749u          0
>     misc                  81m705u         687u
>   test filter             417m209u        415m602u
>   entry transmission      1m651u          1m570u
>   status transmission     5m249u          5m354u
>   total                   2s966m981u      463m078u
>
>
> 2. back-bdb
>
>                           cold run        warm run
>   ASN.1 decoding          128u            168u
>   base entry retrieval    128u            115u
>   candidate selection     12u             11u
>   idl loop control        6m301u          6m341u
>   lock, time check        7m890u          7m901u
>   entry retrieval         4s697m678u      606m488u
>     db->get               4s544m708u      468m798u
>     entry_decode          136m032u        136m186u
>     misc                  81m705u         1m504u
>   test filter             418m095u        418m163u
>   entry transmission      1m626u          1m634u
>   status transmission     8m995u          5m169u
>   total                   5s140m853u      1s045m990u
>
> - Preliminary Analysis
>
> From the above data, back-ldbm runs four times faster in the warm run
> than in the cold run, because of entry caching and BDB buffer pages.
> back-bdb also runs four times faster in the warm run than in the cold run.
> Even though back-bdb does not have an entry cache, the Linux page cache
> seems to prevent costly I/O operations in the warm run.
>
> The base entry retrieval latency of back-ldbm is significantly lower than
> that of back-bdb because the search base is already cached by the previous
> bind operation. There is no such effect in back-bdb, simply because there
> is no entry cache in back-bdb.
>
> The entry caching of back-ldbm proved extremely effective. There is no I/O
> overhead in back-ldbm once all entries are cached, and a cache hit is
> almost 100 times faster than a disk access. Thus, it may be desirable to
> have efficient entry caching in back-bdb as well, though further
> investigation is necessary.
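>
> (A minimal sketch of the kind of ID-keyed entry cache meant here -- a
> plain chained hash table; the real back-ldbm cache adds LRU replacement,
> reference counting, and locking, all omitted:)
>
>     #include <stdlib.h>
>
>     /* Sketch only: an ID-keyed entry cache as a fixed-size hash table
>      * with chaining.  Illustrative, not back-ldbm's actual code. */
>     #define CACHE_BUCKETS 1024
>
>     struct cached_entry {
>         unsigned long        id;
>         void                *entry;   /* decoded entry */
>         struct cached_entry *next;
>     };
>
>     static struct cached_entry *buckets[CACHE_BUCKETS];
>
>     void *
>     cache_find(unsigned long id)
>     {
>         struct cached_entry *ce;
>
>         for (ce = buckets[id % CACHE_BUCKETS]; ce; ce = ce->next)
>             if (ce->id == id)
>                 return ce->entry;  /* hit: no I/O, no decoding */
>         return NULL;               /* miss: fetch and decode from disk */
>     }
>
>     int
>     cache_add(unsigned long id, void *entry)
>     {
>         struct cached_entry *ce = malloc(sizeof(*ce));
>
>         if (ce == NULL)
>             return -1;
>         ce->id = id;
>         ce->entry = entry;
>         ce->next = buckets[id % CACHE_BUCKETS];
>         buckets[id % CACHE_BUCKETS] = ce;
>         return 0;
>     }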
>
> Howard's recent entry_decode routine in back-bdb also proved very
> effective: it is almost seven times faster than back-ldbm's str2entry.
> entry_decode can improve search performance in the case of a cache miss,
> and entry_encode seems able to improve update performance even when the
> corresponding cache entry is present in memory, because slapd updates both
> the in-memory cache entry and the on-disk database entry when the
> directory is modified.
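>
> (The difference is essentially binary decoding versus text re-parsing; a
> rough sketch of the two shapes -- the formats here are made up for
> illustration and are not the actual on-disk layouts:)
>
>     #include <string.h>
>
>     /* Sketch only.  A str2entry-style loader re-scans text and hunts
>      * for delimiters on every load; an entry_decode-style loader reads
>      * lengths written at encode time and just copies. */
>
>     /* Text shape: "cn: Audy Flansburg\n..." -- strchr() scans and
>      * per-line parsing on every load. */
>     char *
>     next_text_attr(char *p, char **type, char **value)
>     {
>         char *colon = strchr(p, ':');
>         char *nl = strchr(p, '\n');
>
>         if (colon == NULL || nl == NULL || nl < colon)
>             return NULL;
>         *colon = *nl = '\0';
>         *type = p;
>         *value = colon + 2;        /* skip ": " */
>         return nl + 1;
>     }
>
>     /* Binary shape: a length word followed by the bytes -- one read
>      * of the length, no scanning. */
>     unsigned char *
>     next_binary_val(unsigned char *p, unsigned char **value, size_t *len)
>     {
>         memcpy(len, p, sizeof(*len));
>         *value = p + sizeof(*len);
>         return *value + *len;
>     }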
>
> For back-ldbm warm runs, test_filter latency turns out to be 89.7% of the
> total latency and becomes a possible bottleneck. If test_filter's
> performance is improved further, cached search performance will improve
> significantly.
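>
> (For reference, an equality filter test is essentially a walk over every
> candidate entry's attribute values with a case-insensitive comparison per
> value; a minimal sketch with illustrative names, not slapd's actual code:)
>
>     #include <strings.h>
>
>     /* Sketch only: walk the entry's attributes, then each value,
>      * comparing case-insensitively.  Every cached candidate pays for
>      * these comparisons on every search, which is why this dominates
>      * once entries come from the cache. */
>     struct attr {
>         const char   *type;
>         const char  **vals;    /* NULL-terminated value list */
>         struct attr  *next;
>     };
>
>     int
>     test_equality_filter(const struct attr *attrs,
>                          const char *type, const char *value)
>     {
>         const struct attr *a;
>         const char **v;
>
>         for (a = attrs; a != NULL; a = a->next) {
>             if (strcasecmp(a->type, type) != 0)
>                 continue;
>             for (v = a->vals; *v != NULL; v++)
>                 if (strcasecmp(*v, value) == 0)
>                     return 1;  /* match */
>         }
>         return 0;              /* no match */
>     }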
>
> The latency of db->get in back-bdb is four times higher than that of
> ldbm_cache_fetch in back-ldbm, and the back-bdb directory file (id2entry)
> is twice as large as back-ldbm's.
> I'd appreciate it if anyone could explain this difference.
>
> Suggestions and comments for further evaluation directions are more than
> welcome.
>
> - Jong
>
> --------------
> Jong Hyuk Choi
> jongchoi@us.ibm.com
> IBM Thomas J. Watson Research Center
> Enterprise Linux Group
> Phone 914-945-3979
> Fax 914-945-4425
>