
Re: (ITS#7655) segfault during initial mirror of multimaster delta replication



Hi,

Unfortunately I was not able to reproduce the exact segfault, but after a few updates
we still have the problem that, with replication enabled, slapd freezes during a write operation.

SETUP DESCRIPTION: 

OpenLDAP version 2.4.36
back-mdb (we have had these issues for quite a while, even when we were running on bdb)
 

All write and read requests are directed to the active node, so the passive 
node is replicating. 

So, unless I have misunderstood something, there are two threads involved: the main thread
and the one that is doing the replication.



Netstat of the TCP replication connections; the second one is initiated by the
passive system polling the active one:

tcp        0     53 10.169.127.13:389       10.169.126.13:43340     ESTABLISHED
tcp   1905336      0 10.169.127.13:52384     10.169.126.13:389       ESTABLISHED
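
For reference, assuming the standard netstat column layout (Proto, Recv-Q, Send-Q, local address, remote address, state), the second connection has about 1.9 MB sitting in Recv-Q on the local side, that is, data received by the kernel but not yet read by the process. A minimal Python sketch that picks out such connections from netstat output follows; the function names and the threshold parameter are made up for illustration.

```python
# Sketch: flag TCP connections whose Recv-Q is non-zero, i.e. data the
# kernel has received but the local process has not read yet.
# Assumes the usual column order of `netstat -tn` output.

def parse_netstat_line(line):
    """Split one netstat TCP line into named fields."""
    fields = line.split()
    return {
        "proto": fields[0],
        "recv_q": int(fields[1]),
        "send_q": int(fields[2]),
        "local": fields[3],
        "remote": fields[4],
        "state": fields[5],
    }

def backlogged(lines, threshold=0):
    """Return connections whose Recv-Q exceeds the (hypothetical) threshold."""
    conns = [parse_netstat_line(l) for l in lines if l.strip()]
    return [c for c in conns if c["recv_q"] > threshold]

# The two lines quoted in this mail.
sample = [
    "tcp        0     53 10.169.127.13:389       10.169.126.13:43340     ESTABLISHED",
    "tcp   1905336      0 10.169.127.13:52384     10.169.126.13:389       ESTABLISHED",
]

for conn in backlogged(sample):
    print(conn["local"], "->", conn["remote"], "unread bytes:", conn["recv_q"])
```

With the two lines above, only the connection from port 52384 to port 389 is reported.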


top -H of the LDAP Processes: 

 7767 ldap      20   0 84.4g 7.1g 6.9g S      1 10.1   1:02.13 slapd
 7768 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:54.44 slapd
 8023 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.31 slapd
 7766 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:00.00 slapd
 7769 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.81 slapd
 7770 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:44.94 slapd
 8024 ldap      20   0 84.4g 7.1g 6.9g t      0 10.1   0:32.53 slapd
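
To make the thread listing easier to read, here is a small Python sketch that parses the lines above, assuming the default top column order (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND) and a TIME+ value of the form minutes:seconds. The helper names are hypothetical. In the listing every thread is sleeping (S) except 8024, which is in state t (stopped under tracing), presumably because a debugger was attached to collect the backtraces; that last point is an assumption.

```python
# Sketch: summarise `top -H` thread lines by state and accumulated CPU time.
# Assumes the default top column order and a TIME+ format of minutes:seconds.

def cpu_seconds(time_plus):
    """Convert a TIME+ value such as '7:54.44' to seconds."""
    minutes, seconds = time_plus.split(":")
    return int(minutes) * 60 + float(seconds)

def parse_thread(line):
    """Extract thread id, state, %CPU and CPU time from one top line."""
    f = line.split()
    return {
        "tid": int(f[0]),
        "state": f[7],
        "cpu_pct": float(f[8]),
        "cpu_s": cpu_seconds(f[10]),
    }

# The thread lines quoted in this mail.
sample = """
 7767 ldap      20   0 84.4g 7.1g 6.9g S      1 10.1   1:02.13 slapd
 7768 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:54.44 slapd
 8023 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.31 slapd
 7766 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:00.00 slapd
 7769 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.81 slapd
 7770 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:44.94 slapd
 8024 ldap      20   0 84.4g 7.1g 6.9g t      0 10.1   0:32.53 slapd
"""

threads = [parse_thread(l) for l in sample.splitlines() if l.strip()]

# Threads ordered by accumulated CPU time, busiest first.
for t in sorted(threads, key=lambda t: t["cpu_s"], reverse=True):
    print(t["tid"], t["state"], t["cpu_pct"], round(t["cpu_s"], 2))

# Threads that are not simply sleeping.
not_sleeping = [t["tid"] for t in threads if t["state"] != "S"]
print("not sleeping:", not_sleeping)
```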

PASTEBIN: 

I have posted all the backtraces to Pastebin:

http://pastebin.com/vVGEqEUt


I hope this helps to track down the problem.


Kind regards

i.A. Hans Freitag
» Linux Administrator

ENTIRETEC AG . Pforzheimer Strasse 33 . 01189 Dresden . Germany
T: +49.351.41355.0 . M:  . F: +49.351.41355.99
E: hans.freitag@entiretec.com

ENTIRETEC | http://www.entiretec.com
Germany | Switzerland | United Arab Emirates | Malaysia | United States of America

ENTIRETEC AG
Executive Board: Thomas Herrmann (Chairman), Thomas Wetzel, Carsten Klemm . Chairwoman of the Supervisory Board: Dr. Jutta Horezky
Registered office: Dresden . Amtsgericht Dresden HRB 24915 . VAT ID DE227705033



> -----Original Message-----
> From: openldap-bugs-bounces@OpenLDAP.org [mailto:openldap-bugs-bounces@OpenLDAP.org]
> On Behalf Of quanah@zimbra.com
> Sent: Monday, August 5, 2013 05:15
> To: openldap-its@openldap.org
> Subject: Re: (ITS#7655) segfault during initial mirror of multimaster
> delta replication
> 
> --On Sunday, August 04, 2013 4:27 PM +0000 hans.freitag@entiretec.com
> wrote:
> 
> > Full_Name: Hans Freitag
> > Version: 2.4.35 and 33
> > OS: SLES 11SP2
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (193.200.138.3)
> >
> >
> > I have a multimaster delta replication setup here with bdb on an 18 GB
> > database.
> >
> > After a crash due to a full disk I created a new database on one node and
> > started over.
> >
> > The empty node started to replicate from the full one, but after a while
> > (approx. 2 GB) it crashed with a segfault:
> >
> > Aug  4 11:45:32 mhr-dd-lda-01 kernel: [52189.476209] slapd[10158]:
> > segfault at 20 ip 00007ff97ebfabc0 sp 00007ff6e57e6b38 error 4 in
> > libc-2.11.1.so[7ff97eb79000+155000]
> >
> > So I thought maybe it was not a good idea to install a package built for SP2
> > on a machine running SP1, so my first attempt at a fix was an upgrade. After
> > the upgrade I got this:
> >
> > Aug  4 12:46:29 mhr-dd-lda-01 kernel: [ 1414.757587] slapd[3704]:
> > segfault at 20 ip 00007fc82eee6182 sp 00007fc592e0acf0 error 4 in
> > slapd[7fc82ee7a000+1e6000]
> >
> > So I built a brand-new OpenLDAP 2.4.35 RPM to find out whether the problem
> > is related to the 2.4.33 version I am running. But that failed as well:
> >
> > Aug  4 13:47:19 mhr-dd-lda-01 kernel: [ 5063.074410] slapd[8749]:
> > segfault at 20 ip 00007fcbc1b537dc sp 00007fc92624fb88 error 4 in
> > slapd[7fcbc1ac8000+1ea000]
> >
> > For the moment I have deactivated access logging on the node, which seems to
> > work. I will know for sure in a few hours. ;-) I can try to reproduce
> > this on a backup node next week, when all the main nodes are up and
> > running again. :)
> 
> I would suggest you build with debugging symbols, enable core files, and
> provide a backtrace of the problem.  What you have provided does not give
> any useful information for debugging purposes.  You also fail to state the
> backend you are using (back-bdb or back-hdb).
> 
> For information on how to provide a backtrace:
> 
> <http://www.openldap.org/faq/data/cache/59.html>
> 
> Regards,
> Quanah
> 
> --
> 
> Quanah Gibson-Mount
> Lead Engineer
> Zimbra, Inc
> --------------------
> Zimbra ::  the leader in open source messaging and collaboration
>