This 2.5MB number also doesn't take indexing into account. Each indexed
attribute gets a database file of its own, organized as a Hash structure.
Unlike the B-trees, where you only need to touch one data page to find an
entry of interest, an index lookup generally touches multiple keys, and
the whole point of a hash structure is that the keys are evenly distributed
across the data space. That means there's no convenient, compact subset of
the database that you can keep in the cache to ensure quick operation; you
can pretty much expect references to be scattered across the whole thing.
My strategy here would be to provide enough cache for at least 50% of all
of the hash data:

    (number of hash buckets + number of overflow pages
     + number of duplicate pages) * page size / 2
The objectClass index for my example database is 5.9MB and uses 3 hash
buckets and 656 duplicate pages. So (3 + 656) * 4KB / 2 =~ 1.3MB.
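To make the arithmetic concrete, here is a minimal Python sketch of that
sizing rule. The function and its parameter names are hypothetical, invented
just for illustration; the 4KB page size and the 50% factor come from the
text above, and in practice you would pull the bucket and page counts from
your database's statistics (for Berkeley DB, db_stat reports them).

    # Hypothetical helper; names are invented for illustration.
    def hash_index_cache_bytes(buckets, overflow_pages, dup_pages,
                               page_size=4096):
        # (hash buckets + overflow pages + duplicate pages) * page size / 2
        return (buckets + overflow_pages + dup_pages) * page_size // 2

    # The objectClass example above: 3 buckets, no overflow pages,
    # 656 duplicate pages, 4KB pages.
    size = hash_index_cache_bytes(3, 0, 656)
    print(size, "bytes =", round(size / 2**20, 1), "MB")
    # prints: 1349632 bytes = 1.3 MB

The example passes 0 overflow pages because the (3 + 656) arithmetic above
implies the objectClass index has none.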