Re: (delta-)syncrepl and nagios

To: Samuel Tran <stran@amnh.org>
Subject: Re: (delta-)syncrepl and nagios
From: Buchan Milne <bgmilne@staff.telkomsa.net>
Date: Fri, 10 Feb 2006 09:45:04 +0200
Cc: Aaron Richton <richton@nbcs.rutgers.edu>, Francis Swasey <Frank.Swasey@uvm.edu>, OpenLDAP software list <openldap-software@OpenLDAP.org>
In-reply-to: <1139507850.15133.7.camel@mistral>
Organization: Telkom Internet
References: <43E79BF1.1090002@uvm.edu> <Pine.SOL.4.58.0602061430480.956@toolbox.rutgers.edu> <1139507850.15133.7.camel@mistral>
User-agent: KMail/1.9.1

On Thursday 09 February 2006 19:57, Samuel Tran wrote:
> On Mon, 2006-02-06 at 14:41 -0500, Aaron Richton wrote:
> > That's been on my todo list for over a year now. (So I'll join in the
> > request for a copy if there is such a script!)
> >
> > If anybody does write this, it's important to note that something that
> > strictly compares contextcsns is likely useless (I think it would just be
> > a false positive disaster). Replication doesn't happen instantly; there
> > should be some sort of configurable threshold for "csns should be within
> > <time>".
> >
> >
> > I've been meaning to ask the list: how many of you check up on your
> > slaves from a consistency perspective? What do you do? (contextcsn is the
> > approach I've wanted to take. Every time I get annoyed enough to write a
> > nagios plugin, I notice that everything is in sync and defer it...)
>
> I wrote a very generic python script with exhaustive comments/debugging.
> It can be modified to be used as a Nagios script plugin.
>
> To view a description of the script:
> $ pydoc ldapSynchCheck
>
> To view the help:
> $ ./ldapSynchCheck.py -h
>

I guess you didn't look at the Perl extension script for BigBrother/Hobbit 
that I posted. It assumes that it can do the following anonymously:
1) read enough configuration information from cn=config to determine all the 
databases using syncrepl, and the master for each database, on any server
2) read the contextCSN for any database on any server
In return, it requires absolutely no configuration. For use with Hobbit, it 
just needs to be run on the Hobbit server, and any host in the bb-hosts file 
just needs 'ol'. Of course, the Hobbit server needs to be able to access all 
the LDAP servers involved.
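
As a rough illustration of step 1 (a Python sketch, not the code of the Perl 
script), the master for a database can be recovered from the olcSyncrepl value 
stored on that database's entry under cn=config. The function name 
parse_syncrepl and the sample value are made up for the example, and it 
assumes the value has already been fetched with an anonymous search.

```python
import re
import shlex


def parse_syncrepl(value):
    """Split an olcSyncrepl value into a dict of its key=value fields.

    Hypothetical helper: assumes the usual layout of space-separated
    key=value pairs, optionally quoted, with an optional {n} ordering
    prefix as stored in cn=config.
    """
    # Drop the {n} ordering prefix that cn=config puts on ordered values.
    value = re.sub(r"^\{\d+\}", "", value)
    fields = {}
    # shlex handles quoted values such as searchbase="dc=example,dc=com".
    for token in shlex.split(value):
        key, sep, val = token.partition("=")
        if sep:
            fields[key.lower()] = val
    return fields


if __name__ == "__main__":
    sample = ('{0}rid=001 provider=ldap://master.example.com '
              'searchbase="dc=example,dc=com" type=refreshAndPersist')
    info = parse_syncrepl(sample)
    print(info["provider"], info["searchbase"])
```

The provider field gives the URI of the master to query for its contextCSN, 
and searchbase gives the entry to read it from, so the user only has to name 
the server to check.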

You may want to take a look, so a user of your script doesn't need to provide 
the URIs, but instead can just provide the server to check.

http://www.zarb.org/~bgmilne/hobbit/

At present, it only goes yellow (not red), since there's no real way to 
determine whether the server being 3 months behind (i.e. you catch the 
30-second period it takes to replicate the first change to one database in 
3 months) is severe enough for an error. But it does show how far ahead (which 
could indicate checkpointing/recovery problems on the master) or behind the 
slave is, so you don't have to compare contextCSNs in your head.
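
The ahead/behind calculation could look roughly like the Python sketch below. 
It is only an illustration, not the Perl script's logic: the names csn_time 
and compare_csn and the 60-second default threshold are invented for the 
example, and it relies on the contextCSN value starting with a UTC timestamp 
followed by '#'. The threshold follows Aaron's suggestion of a configurable 
"csns should be within <time>" window.

```python
from datetime import datetime, timedelta, timezone


def csn_time(csn):
    """Return the timestamp at the start of a contextCSN as a UTC datetime."""
    # Everything before the first '#' is the time, e.g. 20060210074504Z.
    stamp = csn.split("#", 1)[0].rstrip("Z")
    # Some versions include fractional seconds in the timestamp.
    fmt = "%Y%m%d%H%M%S.%f" if "." in stamp else "%Y%m%d%H%M%S"
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)


def compare_csn(master_csn, slave_csn, threshold=timedelta(seconds=60)):
    """Return (status, message) for a slave compared with its master.

    The status is 'green' when the two timestamps are within the
    threshold and 'yellow' otherwise; the threshold is an assumption.
    """
    delta = csn_time(master_csn) - csn_time(slave_csn)
    if abs(delta) <= threshold:
        return "green", "slave is within %s of master" % threshold
    if delta > timedelta(0):
        return "yellow", "slave is %s behind master" % delta
    return "yellow", "slave is %s ahead of master" % abs(delta)


if __name__ == "__main__":
    master = "20060210074504Z#000000#00#000000"
    slave = "20060210074004Z#000000#00#000000"
    print(compare_csn(master, slave))
```

Comparing only the timestamp part keeps the check independent of the rest of 
the CSN, at the cost of treating two changes within the same second as equal.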

I could take a look at making it work for Nagios, but we're phasing Nagios 
out, and the only LDAP servers monitored for anything by Nagios don't use 
syncrepl.

Regards,
Buchan

-- 
Buchan Milne
ISP Systems Specialist
B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
