RE: indexing & hardware questions -- again
All, thanks for responding. Here are some follow-up
clarifications/remarks:
> >Could somebody explain what information is put in the index files when
> >substring indexes are generated? Or, more generally, what OpenLDAP's
> >indexing strategy is?
>
> I don't know the answer to that one.
If anybody knows more about this, I'd be curious to hear the details.
> >I'm also concerned about the sheer size of the index files -- generating
> >"sub" indices increases the size of the index files by an order of
> >magnitude. My directory is quite large -- about 1,000,000 entries --
> >so I have to make sure that the machine running LDAP can handle the
> >memory requirements, which are in part determined by options like
> >dbcachesize.
> >Does anybody have some rough metrics for memory usage, e.g. with X
> >number of entries, Y number of attributes per entry, Z indexes on
> >these attributes, etc?
>
> What is the "average" object size? Are you storing binary information
> (photos, etc.) in the directory? I think the backend you use in this case
> is the most important component (in addition to sufficient resources).
All information is mere text, but there is still a lot of data -- e.g. I
believe that the LDIF could be 1 GB (based on a sample LDIF of 1.5
million entries).
Add index files to this, and you see why I'm worried about memory usage.
> >One option that I'm considering is to maintain different indices on
> >different slave directories. This way, performance on the server used
> >by my web application can be optimized for the search filters used by the
> >web application, and performance on the server used by the service and
> >support staff can degrade a bit more...
> >Has anybody else tried this? Or do you have a better suggestion?
>
> I haven't tried this. Sounds like an interesting idea.
Anybody? (To preclude references to the FAQ, I understand that you
should index based on both the attributes that are searched on and the
specific search filters used...)
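
To make the idea concrete, here is a minimal sketch of how the index settings might differ between two slaves. The attribute names and the choice of index types are placeholders picked for illustration, not my actual schema or filters; the lines would go in the database section of each slave's slapd.conf.

```
# Slave used by the web application (hypothetical attributes):
# index only what the application's filters actually use.
index   uid,mail                eq
index   cn,sn                   eq,sub

# Slave used by service and support staff (hypothetical attributes):
# broader substring indexing, accepting larger index files.
index   uid,mail                eq
index   cn,sn,telephoneNumber   eq,sub
```

Both slaves would still receive the same replicated data from the master; only the index directives, and therefore the index file sizes and search performance, would differ.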
> >Finally, is slurpd actually more efficient at handling modify
> >operations? While a directory is handling a modify operation from a
> >client, its performance is pretty bad. So, by "efficient" I mean, do
> >the slaves take significantly LESS of a performance hit if the slurpd
> >process sends modify operations than if a slave handles client
> >requests directly?
>
> slurpd doesn't perform modify operations. It mirrors such operations on a
> master slapd to the slave slapd. At least as I understand it. slapd still is
> the one performing the modify op.
>
My specific question is this:
I am concerned about the costs and benefits of processing modify
operations via slurpd, with respect to how modify operations hurt the
speed of searches.
ldapadd is much faster at importing a lot of data than some client
that I write would be. Is slurpd also more efficient?
Or, since presumably multiple modify operations can be sent by slurpd at
once (or in a row?), could slurpd actually cause longer blocks of time
during which the slave is less responsive to searches?
Finally, can you control how slurpd sleeps?