Re: Glibc regex concurrency issues (Was: (ITS#3932) regex/librewrite concurrency issue)
Pierangelo Masarati wrote:
While investigating some odd issue I recently had with librewrite
while running test039 on multiple CPU machines, I found that I was
getting odd coredumps when 2 threads accessed the same regex_t inside
librewrite. The problem occurred inside the internals of regexec(),
and my glibc (2.3.4) has no debug symbols, so I couldn't trace the
exact place. However, before resorting to compiling my own glibc, I
searched glibc's CVS and I found that there might be an issue,
although I couldn't find a precise bug notification. Looking at
<http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/posix/regexec.c.diff?r1=1.78&r2=1.79&cvsroot=glibc&f=h>
it appears that in HEAD some concurrency issue was recently solved.
This fix has been merged into released code between 2.3.4 and 2.3.5,
so I suspect that my glibc, and many before 2.3.5, may suffer from the
same problem.
Now the point is: since wrapping regex calls in mutexes seemed to
cure my problem, and since regex is used in librewrite, ACLs,
authz-regexp, and limits, should we provide a compile switch to
protect regex if one fears its implementation may not be thread safe?
I've prepared a patch for this (I note that the patch I submitted with
ITS#3932 is incomplete: the file <libraries/liblutil/regex.c> is
missing, so I'm posting it now). This problem is not OpenLDAP
specific, but given the impact of regex on OpenLDAP software, and the
impact of rebuilding glibc (other implementations of regex may suffer
from the same problem) on production systems which cannot undergo an
upgrade shortly, I'd prefer to provide a workaround for those that
require it.
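
For illustration only, here is a minimal sketch of the kind of workaround described above: a single global mutex around regexec(), enabled by a compile-time switch. This is not the patch from ITS#3932. The macro name SERIALIZE_REGEX and the function name safe_regexec() are hypothetical, and POSIX threads are assumed.

```c
#include <regex.h>
#include <stddef.h>

/* SERIALIZE_REGEX is a hypothetical switch name, not taken from the patch.
 * Build with -DSERIALIZE_REGEX to serialize all matching on one mutex. */
#ifdef SERIALIZE_REGEX
#include <pthread.h>
static pthread_mutex_t regex_mutex = PTHREAD_MUTEX_INITIALIZER;
#define REGEX_LOCK()   pthread_mutex_lock(&regex_mutex)
#define REGEX_UNLOCK() pthread_mutex_unlock(&regex_mutex)
#else
/* Without the switch the wrapper costs nothing extra. */
#define REGEX_LOCK()   ((void)0)
#define REGEX_UNLOCK() ((void)0)
#endif

/* Hypothetical drop-in replacement for regexec(): same arguments,
 * same return value, but only one thread at a time runs the matcher
 * when SERIALIZE_REGEX is defined. */
int safe_regexec(const regex_t *preg, const char *string,
                 size_t nmatch, regmatch_t pmatch[], int eflags)
{
    int rc;

    REGEX_LOCK();
    rc = regexec(preg, string, nmatch, pmatch, eflags);
    REGEX_UNLOCK();
    return rc;
}
```

A single global lock is the simplest form and serializes every match in the process; a per-regex_t mutex would reduce contention at the cost of carrying a lock alongside each compiled pattern.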
This sort of problem will have quite an impact on regexp performance.
Perhaps we should suggest they use an alternate regexp implementation
instead of the broken GNU version. I've been using the Henry Spencer
regexp on our Windows ports. There's no good reason why they've broken
re-entrancy like this, and there's no reason to use locking in the fix
as they have. All they had to do was wrap the regex_t in some other
structure in their internal calls so they could provide their per-match
state without mucking with the compiled state. So much for the quality
of glibc these days.
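
To make the wrapping idea concrete, here is a toy sketch. It is not glibc code and not a real regex engine: the "pattern" is just a literal substring, and all names (toy_regex, toy_match_ctx, toy_compile, toy_exec) are made up for illustration. The point is only the shape: the compiled object is passed as const and never written during a match, while everything that changes lives in a wrapper on the caller's stack.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "compiled pattern": read-only once toy_compile() returns. */
typedef struct toy_regex {
    const char *literal;   /* the toy pattern is a plain substring */
    size_t      len;
} toy_regex;

/* Per-match state: wraps a pointer to the compiled pattern and holds
 * everything that changes while matching. One instance per call. */
typedef struct toy_match_ctx {
    const toy_regex *re;
    const char      *input;
    size_t           input_len;
    size_t           pos;       /* current scan position */
} toy_match_ctx;

void toy_compile(toy_regex *re, const char *literal)
{
    re->literal = literal;
    re->len = strlen(literal);
}

/* Returns the offset of the first match, or -1 if there is none.
 * Takes the pattern as const: threads can share one toy_regex
 * without locking, because each call only writes to its own ctx. */
long toy_exec(const toy_regex *re, const char *input)
{
    toy_match_ctx ctx = { re, input, strlen(input), 0 };

    for (ctx.pos = 0; ctx.pos + ctx.re->len <= ctx.input_len; ctx.pos++) {
        if (memcmp(ctx.input + ctx.pos, ctx.re->literal, ctx.re->len) == 0)
            return (long)ctx.pos;
    }
    return -1;
}
```

Since nothing reachable through the const pointer is modified, no mutex is needed around toy_exec(), which is the property a wrapper structure would give a real matcher.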
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/