Seems like I keep poking at this idea again every couple years.
If we use a<base,vector> we can cover a much wider range without losing any
precision. For a 32 bit machine, 32 bits per word would cover 5 bits worth
of
entry IDs. So 27 bits would comprise the base, and the next 32 bits would
represent up to 32 entry IDs.
To avoid wasting those 5 unneeded bits in the base, we could use them as a
run-length counter and use multiple vectors per base. But it might be better
to slide things over and use just a 24 bit base, and use the bottom 8 bits
of
the word as a bitmask to represent which following vectors are present. E.g.
......01 means 1 vector follows, representing base+ 0-31.
......02 means 1 vector follows, representing base+32-63.
......05 means 2 vectors follow, base+0-31 then base+64-95.
That allows us to represent up to 256 entryIDs in only 288 bits (instead of
the 16384 bits we currently need). That would allow us to track about 1.86M
entryIDs in the same space we currently use to track only 64K entryIDs.
It seems we would still need some kind of range representation to handle
IDLs
with more than 1.86M entries.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/