Re: >1024 connections in slapd / select->poll
Volker.Lendecke@SerNet.DE wrote:
> On Sun, Nov 14, 2004 at 09:04:04PM -0800, Howard Chu wrote:
>> Been there... I haven't been motivated to investigate poll() because it
>> really doesn't offer any scaling benefits vs select().
>
> So can I take this as a 'no' for this patch?

In its present form, I would say no. But I'm interested in the rest of
this discussion, and hearing more about benefits of these alternative
methods.

>> As for /dev/poll and epoll() - they sound nice, but I don't want to get too
>> bogged down in OS-specific special cases. I guess a decent abstraction layer
>> above it would be OK, but I definitely don't want to see a lot of #ifdef
>> HAVE_DEVPOLL/HAVE_EPOLL junk littered all over daemon.c.
>
> The slapd_add/remove and set_write & friends already should roughly map to at
> least epoll() on Linux. I don't know about /dev/poll. It would only be
> necessary to re-write the loops over readers and writers. Maybe with a
> first_reader/next_reader style interface? This might work for select/poll as
> well as epoll.

If it were only a discussion of select() vs poll(), I would ignore
poll() completely. It is a relic of System V Unix, and all modern SysV
variants support select(). Indeed, all systems with the notion of
sockets support select(), so for portability reasons select() is preferred.

From a performance standpoint, while both poll() and select() have a
linear scaling behavior, select() is more compact. For a 32-bit server
supporting 8192 file descriptors, each of select's descriptor sets is an
8192-bit bitmap, so the arg list is only 1KB for reads and 1KB for writes,
while poll needs an 8-byte struct pollfd per descriptor, or 64KB. Obviously
the overhead in copying data to/from user/kernel space is much higher
using poll, and I see this as the principal reason why poll scales so
poorly. Even though both calls require the server to iterate over a
large number of descriptors, and even though select's use of bitfields
requires shifts/masks, this is cheaper than copying so much memory
across the user/kernel boundary.

The example is a bit extreme, but it is only in such extreme
cases that performance bottlenecks really show up. The advantage I see
for /dev/poll and epoll is that you only need to specify descriptors of
interest once, so you don't need to be constantly passing huge argument
lists around. But on a heavily loaded server with thousands of active
sessions, the fact is that your list of interesting descriptors is not
static. Ultimately the server must iterate across all the thousands of
descriptors, because the majority of them are probably active, and the
server must insert and delete descriptors from the list continuously,
because client sessions tend to come and go continually.
--
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support