race condition in -lldap/openssl??
This is unrelated to my previous NetBSD problem (now fixed, my error).
Executive summary:
I'm having a problem where two of my many RHL 7.2 LDAP clients fail to
authenticate against an OpenLDAP server.
In openldap-2.0.21/libraries/libldap/tls.c line ~625
err = SSL_connect( ssl );
If the failing client is slightly bogged down (for example, by ltracing
the sshd process), then err == 1 (success). Otherwise err == 0 (failure),
and SSL_get_error returns SSL_ERROR_SYSCALL.
The man page says:
SSL_ERROR_SYSCALL
Some I/O error occurred. The OpenSSL error queue may contain
more information on the error. If the error queue is empty
(i.e. ERR_get_error() returns 0), ret can be used to find out
more about the error: If ret == 0, an EOF was observed that
violates the protocol.
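As a rough sketch of how that man page text could be turned into a check:
the helper below is hypothetical (explain_syscall_error is a made-up name,
not part of OpenSSL or OpenLDAP). It assumes the caller has already seen
SSL_ERROR_SYSCALL and passes in the SSL_connect() return value, the result
of ERR_get_error(), and errno saved right after the call. The ret == -1
branch follows the same man page's remark about consulting errno.

```c
/* Hypothetical helper, for illustration only.
 * ret         : return value of SSL_connect()
 * queued      : value returned by ERR_get_error() (0 means queue empty)
 * saved_errno : errno captured immediately after SSL_connect()
 * Call it only after SSL_get_error() has returned SSL_ERROR_SYSCALL. */
const char *explain_syscall_error(int ret, unsigned long queued, int saved_errno)
{
    /* A non-empty error queue takes priority over ret. */
    if (queued != 0)
        return "OpenSSL error queue has details";

    /* Queue empty: ret tells the two documented cases apart. */
    if (ret == 0)
        return "EOF observed that violates the protocol";
    if (ret == -1)
        return saved_errno != 0
            ? "underlying I/O error, consult errno"
            : "underlying I/O error, errno not set";

    return "unexpected return value";
}
```

With the values seen here (ret == 0), and assuming the error queue turns
out to be empty, this would report the EOF case.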
The box is an SMP dual Pentium III, running Red Hat Linux 7.2 fully
updated with all official errata, plus the latest pam/nss_ldap, OpenLDAP
2.0.21, and OpenSSL 0.9.6b. I'm also seeing what appears to be the same
problem on another box, which is a single-CPU AMD 1700+.
The Red Hat OpenSSL RPM was configured/built with:
./config no-asm 386 no-idea no-mdc2 no-rc5 shared
The OpenLDAP RPM configured/built with:
CPPFLAGS="-I/usr/kerberos/include"; export CPPFLAGS
CFLAGS="$CPPFLAGS $RPM_OPT_FLAGS -D_REENTRANT -DHAVE_KERBEROS_V -fPIC";
export CFLAGS
%configure \
--with-slapd --with-slurpd --without-ldapd \
--with-threads=posix --enable-shared --enable-static \
--enable-ldbm --with-ldbm-api=gdbm \
--enable-passwd \
--enable-shell \
\
--enable-local --enable-cldap --disable-rlookups \
\
--with-kerberos=k5only \
--with-tls \
--with-cyrus-sasl \
\
--enable-wrappers \
\
--enable-cleartext \
--enable-crypt \
--enable-kpasswd \
--enable-spasswd \
\
--libexecdir=%{_sbindir} \
--localstatedir=/%{_var}/run
Details:
I've tracked it down closely. I'm now officially "over my head" (tm).
Keep in mind, I'm just a Perl guy.
pam_ldap.so calls ldap_start_tls_s. I tracked that down to:
openldap-2.0.21/libraries/libldap/tls.c
Eventually the ldap_int_tls_connect function is called.
The important lines from this function are:
ssl = alloc_handle( ctx );
err = SSL_connect( ssl );
Then the existing code does:
if ( err <= 0 ) {
blah
I've modified it by adding this code right above it:
    if ( err == 0 ) {
        syslog (LOG_ERR, "SSL_connect returned 0\n");
        switch(SSL_get_error(ssl, err)) {
        case SSL_ERROR_NONE:
            syslog (LOG_ERR, "SSL_ERROR_NONE\n");
            break;
        case SSL_ERROR_ZERO_RETURN:
            syslog (LOG_ERR, "SSL_ERROR_ZERO_RETURN\n");
            break;
        case SSL_ERROR_WANT_READ:
            syslog (LOG_ERR, "SSL_ERROR_WANT_READ\n");
            break;
        case SSL_ERROR_WANT_WRITE:
            syslog (LOG_ERR, "SSL_ERROR_WANT_WRITE\n");
            break;
        case SSL_ERROR_WANT_CONNECT:
            syslog (LOG_ERR, "SSL_ERROR_WANT_CONNECT\n");
            break;
        case SSL_ERROR_WANT_X509_LOOKUP:
            syslog (LOG_ERR, "SSL_ERROR_WANT_X509_LOOKUP\n");
            break;
        case SSL_ERROR_SYSCALL:
            syslog (LOG_ERR, "SSL_ERROR_SYSCALL\n");
            break;
        case SSL_ERROR_SSL:
            syslog (LOG_ERR, "SSL_ERROR_SSL\n");
            break;
        default:
            syslog (LOG_ERR, "Error in reading SSL handle\n");
        }
    }
SSH attempt (successful, BTW) into the machine while slightly bogged down:
Feb 7 02:04:33 mooru sshd[17186]: SSL_connect returned 1
SSH attempt into the machine not bogged down:
Feb 7 02:12:18 mooru sshd[19396]: SSL_connect returned 0
Feb 7 02:12:18 mooru sshd[19396]: SSL_ERROR_SYSCALL
Feb 7 02:12:18 mooru sshd[19396]: TLS: can't connect. (other debug I added)
Feb 7 02:12:18 mooru sshd[19396]: pam_ldap: ldap_starttls_s: Connect error
At this point, I am at a loss as to how to debug or diagnose this further.
I'm more than happy to test out patches, though.
Dax Kelson