Hi,
Recently I reported that our slave instances (openldap-2.1.22 over
BerkeleyDB.4.1.25) were occasionally reaching 100% CPU usage.
I tried upgrading our test boxes to openldap-2.1.25 and BerkeleyDB
4.2.52-1 shortly afterwards. However, we ran into the same problem,
this time on the masters.
The interesting thing is that this only happened a while after we shut
down the slaves. The resulting unconsumed replogs became huge (several
hundred megabytes). Looking back, we noticed that the same replog growth
had occurred on our slaves when they hit 100% CPU usage (we replicate in
two stages: first from the master to the slaves, and then from the
slaves to a set of backup slaves).
Does anyone know what effect an oversized replog has on slapd? Is it
possible that once the replog gets big enough, slapd spends so much time
writing new data to it that it cannot service requests, or is there some
other behavior that would cause load to shoot up to 100%?