Re: (ITS#5342) DN and naming attribute mismatch
OK. I can reproduce in test, but I'm starting to not care due to the
circumstances.
My January 29 production database produces err=80 results when hit. The important
open question is what caused the *initial* corruption: the hardware (which
I note has been reliable since downgrading to 2.3.39), 2.3.37 (or possibly
even some earlier version), or 2.3.40. I can think of no way to answer
this question at this time, short of additional experimentation/data
collection, and it may be near-impossible to find out. (Unfortunate,
because it is the important question.)
A perverse side effect is that the same corrupted production database,
loaded against 2.3.39, happily accepts all changes. Now, .39 and .40 would
be expected to produce identical results. Which of the two is wrong in
this case? I find myself saying, again, that expectations during
"impossible situations" are a difficult subject.
Now, why don't I care about figuring out which one of them is right?
Because I can only instigate failure when something is in an "impossible
situation" at t=0. In the test environment, I gained the luxury of a
debugging procedure I couldn't afford in production: a full rm/slapadd of
the entire database. If I rm/slapadd using entirely 2.3.39, things work.
And if I rm/slapadd using entirely 2.3.40, things work: so far I cannot
make an initial corruption with 2.3.40, although corruption at t=0 can be
worsened by it.
Given that I observed database issues like #5262 in my own test
environment, I find it plausible that 2.3.37 corrupted my database on the
way out the door. I've also observed, in test, that I can work around this
with slapcat/rm/slapadd.
While all this has been going on, I've been stressing 2.3.40 with test008,
and it seems to be fine. I will likely try 2.3.40 again next week, but
plan on a slightly abnormal upgrade procedure that includes a
slapcat/rm/slapadd. This way, at least the sins of the past will be fully
purged prior to the first 2.3.40 start, so if any future issues arise
we'll have end-to-end 2.3.40 accountability.
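
For anyone wanting to follow along, the slapcat/rm/slapadd cycle can be sketched roughly as below. This is only a dry-run planner that builds the commands without running them. The function name plan_reload, the paths, and the choice to keep DB_CONFIG while removing every other file in the database directory are my assumptions for illustration, not something the tools require. It also assumes slapd is stopped first; which binaries run the slapcat and slapadd steps is left to the operator.

```python
import shlex
from pathlib import Path


def plan_reload(conf, db_dir, ldif, keep=("DB_CONFIG",)):
    """Return the commands for a slapcat/rm/slapadd cycle, in order.

    Nothing is executed here; the caller decides whether to run them.
    conf, db_dir, ldif and keep are illustrative parameters (assumptions).
    """
    db_dir = Path(db_dir)
    # 1. Dump the existing database to LDIF.
    steps = [["slapcat", "-f", str(conf), "-l", str(ldif)]]
    # 2. Remove the old database files, keeping tuning files (assumed: DB_CONFIG).
    victims = []
    if db_dir.is_dir():
        victims = sorted(
            str(p) for p in db_dir.iterdir()
            if p.is_file() and p.name not in keep
        )
    if victims:
        steps.append(["rm", "-f", *victims])
    # 3. Reload the LDIF into the now-empty database.
    steps.append(["slapadd", "-f", str(conf), "-l", str(ldif)])
    return steps


if __name__ == "__main__":
    # Hypothetical paths; substitute your own.
    for step in plan_reload("/etc/openldap/slapd.conf",
                            "/var/lib/ldap",
                            "/tmp/backup.ldif"):
        print(shlex.join(step))
```

Printing the plan before running anything is deliberate: the rm step is the dangerous one, and it is worth checking the LDIF from the first step before deleting the files it came from.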