[Date Prev][Date Next] [Chronological] [Thread] [Top]

SSL strangeness



Hello,

I'm seeing some really weird behaviour when using ldaps:// on an openldap-2.3.43 server. It's a Gentoo Linux box with glibc-2.9_p20081201-r2 and openssl-0.9.8k. I have already recompiled the entire system with gcc-4.3.4 (twice to be sure), with no errors. First of all, ldapsearch -H ldaps://bussard.lih.rwth-aachen.de just hangs. The strange part: when I strace -f slapd, from the second retry on, it works.

So I went on by debugging with openssl s_client, which exhibits just the same behaviour. However it reveals that slapd falls silent in the middle of sending the certificates.

So if I do:

$ openssl s_client -connect bussard.lih.rwth-aachen.de:636 -state -status -CAfile /etc/openldap/ssl/rwth-dfn-tcom.crt
CONNECTED(00000003)
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
OCSP response: no response sent
SSL_connect:SSLv3 read server hello A
depth=3 /C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche Telekom Root CA 2
verify return:1
depth=2 /C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
verify return:1
depth=1 /C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
verify return:1
depth=0 /C=DE/O=RWTH Aachen/OU=Lehrstuhl fuer Ingenieur- und Hydrogeologie/CN=ldap.lih.rwth-aachen.de
verify return:1
SSL_connect:SSLv3 read server certificate A
^C

Now after I've done "strace -f -p `pidof slapd`" on the server, I get the same as above once. Then when I try a second time:

$ openssl s_client -connect bussard.lih.rwth-aachen.de:636 -state -CAfile /etc/openldap/ssl/rwth-dfn-tcom.crt
CONNECTED(00000003)
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:SSLv3 read server hello A
depth=3 /C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche Telekom Root CA 2
verify return:1
depth=2 /C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
verify return:1
depth=1 /C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
verify return:1
depth=0 /C=DE/O=RWTH Aachen/OU=Lehrstuhl fuer Ingenieur- und Hydrogeologie/CN=ldap.lih.rwth-aachen.de
verify return:1
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server certificate request A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client certificate A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A
---
Certificate chain
0 s:/C=DE/O=RWTH Aachen/OU=Lehrstuhl fuer Ingenieur- und Hydrogeologie/CN=ldap.lih.rwth-aachen.de
   i:/C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
 1 s:/C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
   i:/C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
 2 s:/C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
i:/C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche Telekom Root CA 2 3 s:/C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche Telekom Root CA 2 i:/C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche Telekom Root CA 2
---
Server certificate
-----BEGIN CERTIFICATE-----

... (lotsa stuff) ...

/C=ES/ST=Barcelona/L=Barcelona/O=IPS Internet publishing Services s.l./O=ips@mail.ips.es C.I.F. B-60929452/OU=IPS CA Timestamping Certification Authority/CN=IPS CA Timestamping Certification Authority/emailAddress=ips@mail.ips.es
---
SSL handshake has read 26305 bytes and written 480 bytes
---
New, TLSv1/SSLv3, Cipher is AES256-SHA
Server public key is 2048 bit
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : AES256-SHA
    Session-ID: 99...CB400
    Session-ID-ctx:
    Master-Key: 152...A2D
    Key-Arg   : None
    Start Time: 1256667603
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---

This time the SSL handshake works, just because I'm strace'ing slapd? This looks like some really weird race condition. It's driving me crazy. Should I talk to the openssl people about this? But when I make an openssl testbed with openssl s_server and s_client, everything works fine, so it shouldn't be an openssl issue.

Oh and maybe you'd like to see the strace output. This is what it looks like when the SSL client hangs:

# strace -f -p `pidof slapd`
Process 3339 attached with 3 threads - interrupt to quit
[pid  3339] futex(0x8977560, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  3338] time(NULL)                  = 1256668096
[pid  3338] epoll_wait(6,  <unfinished ...>
[pid  3328] futex(0xad15ebd8, FUTEX_WAIT, 3338, NULL <unfinished ...>
[pid 3338] <... epoll_wait resumed> {{EPOLLIN, {u32=143663400, u64=143663400}}}, 1024, 1798000) = 1 [pid 3338] accept(7, {sa_family=AF_INET, sin_port=htons(37192), sin_addr=inet_addr("137.226.164.160")}, [16]) = 14
[pid  3338] setsockopt(14, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
[pid  3338] setsockopt(14, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 3338] open("/etc/hosts.allow", O_RDONLY) = -1 ENOENT (No such file or directory) [pid 3338] open("/etc/hosts.deny", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  3338] time(NULL)                  = 1256668099
[pid  3338] fcntl64(14, F_GETFL)        = 0x2 (flags O_RDWR)
[pid  3338] fcntl64(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 3338] epoll_ctl(6, EPOLL_CTL_ADD, 14, {EPOLLIN, {u32=143700808, u64=143700808}}) = 0
[pid  3338] time(NULL)                  = 1256668099
[pid 3338] epoll_wait(6, {{EPOLLIN, {u32=143700808, u64=143700808}}}, 1024, 1795000) = 1
[pid  3338] time(NULL)                  = 1256668099
[pid  3338] time(NULL)                  = 1256668099
[pid  3338] read(14, "\200\214\1\3\1\0c\0\0\0 "..., 11) = 11
[pid 3338] read(14, "\0\0009\0\0008\0\0005\0\0\210\0\0\207\0\0\204\0\0\26\0\0\23\0\0\n\7\0\300\0\0003"..., 131) = 131
[pid  3338] time(NULL)                  = 1256668099
[pid  3338] time(NULL)                  = 1256668099
[pid  3338] time(NULL)                  = 1256668099
[pid 3338] write(14, "\26\3\1\0J\2\0\0F\3\1J\347;\303\310\247w\24<\206!\334\3345\304\327\321\344\36FG\37"..., 4096) = 4096 [pid 3338] write(14, "\"0\r\6\t*\206H\206\367\r\1\1\1\5\0\3\202\1\17\0000\202\1\n\2\202\1\1\0\253\v\243"..., 4096) = 4096 [pid 3338] write(14, "ootCA1\0;091\v0\t\6\3U\4\6\23\2FI1\0170\r\6\3U\4\n\23"..., 12928) = 6288 [pid 3338] write(14, "go1,0*\6\3U\4\v\23#Wells Fargo Certific"..., 6640) = -1 EAGAIN (Resource temporarily unavailable) [pid 3338] write(14, "go1,0*\6\3U\4\v\23#Wells Fargo Certific"..., 6640) = 6640 [pid 3338] write(14, "\26\3\1\24\1ck Halozatbiztonsagi Kft.1\0320"..., 5126) = 2048 [pid 3338] write(14, "\4\n\23\37Software in the Public Intere"..., 3078) = -1 EAGAIN (Resource temporarily unavailable)
[pid  3338] time(NULL)                  = 1256668099
[pid  3338] epoll_wait(6,

(Strg-C on the client...)

{{EPOLLIN, {u32=143700808, u64=143700808}}}, 1024, 1795000) = 1
[pid  3338] time(NULL)                  = 1256668102
[pid  3338] time(NULL)                  = 1256668102
[pid 3338] write(14, "\4\n\23\37Software in the Public Intere"..., 3078) = 3078
[pid  3338] read(14, ""..., 5)          = 0
[pid 3338] epoll_ctl(6, EPOLL_CTL_MOD, 14, {0, {u32=143700808, u64=143700808}}) = 0
[pid  3338] write(5, "0"..., 1)         = 1
[pid 3338] epoll_ctl(6, EPOLL_CTL_DEL, 14, {0, {u32=143700808, u64=143700808}}) = 0 [pid 3338] shutdown(14, 2 /* send and receive */) = -1 ENOTCONN (Transport endpoint is not connected)
[pid  3338] close(14)                   = 0
[pid  3338] time(NULL)                  = 1256668102
[pid 3338] epoll_wait(6, {{EPOLLIN, {u32=143700768, u64=143700768}}}, 1024, 1792000) = 1
[pid  3338] read(4, "0"..., 8192)       = 1
[pid  3338] time(NULL)                  = 1256668102
[pid  3338] epoll_wait(6,

and that's it. Now when I try for the second time (now I get the server cert alright), it looks like this:

# strace -f -p `pidof slapd`
Process 3354 attached with 4 threads - interrupt to quit
[pid  3339] futex(0x8977560, FUTEX_WAIT_PRIVATE, 42, NULL <unfinished ...>
[pid  3354] futex(0x8977560, FUTEX_WAIT_PRIVATE, 42, NULL <unfinished ...>
[pid  3338] time(NULL)                  = 1256668222
[pid  3338] epoll_wait(6,  <unfinished ...>
[pid  3328] futex(0xad15ebd8, FUTEX_WAIT, 3338, NULL <unfinished ...>
[pid 3338] <... epoll_wait resumed> {{EPOLLIN, {u32=143663400, u64=143663400}}}, 1024, 1672000) = 1 [pid 3338] accept(7, {sa_family=AF_INET, sin_port=htons(37195), sin_addr=inet_addr("137.226.164.160")}, [16]) = 15
[pid  3338] setsockopt(15, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
[pid  3338] setsockopt(15, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 3338] open("/etc/hosts.allow", O_RDONLY) = -1 ENOENT (No such file or directory) [pid 3338] open("/etc/hosts.deny", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] fcntl64(15, F_GETFL)        = 0x2 (flags O_RDWR)
[pid  3338] fcntl64(15, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 3338] epoll_ctl(6, EPOLL_CTL_ADD, 15, {EPOLLIN, {u32=143700812, u64=143700812}}) = 0
[pid  3338] time(NULL)                  = 1256668224
[pid 3338] epoll_wait(6, {{EPOLLIN, {u32=143700812, u64=143700812}}}, 1024, 1670000) = 1
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] read(15, "\200\214\1\3\1\0c\0\0\0 "..., 11) = 11
[pid 3338] read(15, "\0\0009\0\0008\0\0005\0\0\210\0\0\207\0\0\204\0\0\26\0\0\23\0\0\n\7\0\300\0\0003"..., 131) = 131
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] time(NULL)                  = 1256668224
[pid 3338] write(15, "\26\3\1\0J\2\0\0F\3\1J\347<@5\352%\335\336\264Q\2263\346\303\335\t\2\34\241\372Q"..., 4096) = 4096 [pid 3338] write(15, "\"0\r\6\t*\206H\206\367\r\1\1\1\5\0\3\202\1\17\0000\202\1\n\2\202\1\1\0\253\v\243"..., 4096) = 4096 [pid 3338] write(15, "ootCA1\0;091\v0\t\6\3U\4\6\23\2FI1\0170\r\6\3U\4\n\23"..., 12928) = 11584 [pid 3338] write(15, "\6\3U\4\6\23\2AU1\0230\21\6\3U\4\10\23\nQueensland1\0210"..., 4096) = 2896 [pid 3338] write(15, "ty1$0\"\6\3U\4\n\23\33Digital Signature Tr"..., 1200) = 1200 [pid 3338] write(15, "\26\31personal-basic@thawte.com\0\3210\201\3161"..., 2374) = 2374 [pid 3338] read(15, 0x8a08398, 5) = -1 EAGAIN (Resource temporarily unavailable)
[pid  3338] time(NULL)                  = 1256668224
[pid 3338] epoll_wait(6, {{EPOLLIN, {u32=143700812, u64=143700812}}}, 1024, 1670000) = 1
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] read(15, "\26\3\1\0\7"..., 5) = 5
[pid  3338] read(15, "\v\0\0\3\0\0\0"..., 7) = 7
[pid  3338] read(15, "\26\3\1\1\6"..., 5) = 5
[pid 3338] read(15, "\20\0\1\2\1\0~\246\237\364\202\0\217\345#|\241\273k\34\251\277\224X\346\274\361\300\373\1\24\226\334"..., 262) = 262
[pid  3338] read(15, "\24\3\1\0\1"..., 5) = 5
[pid  3338] read(15, "\1"..., 1)        = 1
[pid  3338] read(15, "\26\3\1\0000"..., 5) = 5
[pid 3338] read(15, "\36\337\371\314\260\5\246\233\17\31^P\3027\227\333\257\374\221F\\\20?1\316\207\201BJQ\337\264\224"..., 48) = 48 [pid 3338] write(15, "\24\3\1\0\1\1\26\3\1\0000\356\336\3673\3034w\344\3364e\264\10dP\302\205\3058\357\272c"..., 59) = 59
[pid  3338] time(NULL)                  = 1256668224
[pid  3338] epoll_wait(6,


Hope that someone can make sense of this. Just to be clear: ldapsearch behaves the same way as described above for openssl s_client.

Thank you very much for even reading so far.