
(ITS#6648) Syncrepl cold refresh fails with mirrormode supplier



Full_Name: Andrew Findlay
Version: 2.4.23
OS: OpenSuSE 11.1
URL: ftp://ftp.openldap.org/incoming/ajf-syncrepl-config-20100915a.tgz
Submission from: (NULL) (88.97.25.132)


When a read-only consumer server uses one peer of a mirrormode pair as its
supplier, there is a case where the initial refresh phase of the synchronisation
loses deletions.

This is very similar to the case where a single master server has its serverID
changed between taking a slapcat backup to LDIF and bringing up a consumer
server from that LDIF file. See this thread for some background:
http://www.openldap.org/lists/openldap-technical/201009/msg00193.html

The problem occurs when a deletion was originated on a server whose serverID
does not match the serverID of the supplier when a new consumer is brought up.
It is masked by the volatile syncprov-sessionlog if the deletion is still in the
provider's log, which is why the instructions below stop and start the servers
so much.

I have uploaded the configs and data files for this to ftp.openldap.org.


Case 1: serverID of supplier server changes during the creation of a consumer:

people.ldif contains entries #1, #2, #3, #4 under dc=people,dc=example,dc=org

1)      Load people.ldif into empty master server
2)      Start master                              
3)      Stop master                               
4)      Dump master DB using slapcat > m1.ldif    
5)      Start master                              
6)      Delete entry#1                            
7)      Stop master                               
8)      change master serverID                    
9)      Start master                              
10)     Dump master DB using slapcat > m2.ldif    
11)     Delete entry#2                            
11a)            Stop master                       
11b)            Start master                      
12)     Load m1.ldif into empty consumer server   
13)     Start consumer                            
14)     Check ContextCSN matches on each server   
15)     Check for Entry#1 Entry#2 Entry#3         

If 11a/b are run:
        Entry#1 and Entry#2 are still found on the consumer

If 11a/b are not run:
        Entry#1 is still found on the consumer
        Entry#2 is deleted - presumably because it was in the volatile
sessionlog
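
The check in step 15 can be automated by comparing slapcat dumps taken from the
supplier and the consumer. What follows is only a rough sketch, not part of the
uploaded test kit: the function names are made up for illustration, the LDIF
parsing is deliberately simplified (it handles folded "dn: " lines but not
base64 "dn:: " values, and it compares DNs case-insensitively as plain
strings), and any DNs fed to it are whatever your own dumps contain.

```python
def ldif_dns(ldif_text):
    """Return the set of DNs found in an LDIF dump (simplified parser)."""
    dns = set()
    current = None  # DN being assembled, including folded continuation lines
    for line in ldif_text.splitlines():
        if line.startswith("dn: "):
            current = line[4:]
        elif line.startswith(" ") and current is not None:
            # LDIF folds long lines; a continuation starts with one space
            current += line[1:]
        else:
            if current is not None:
                dns.add(current.lower())
                current = None
    if current is not None:
        dns.add(current.lower())
    return dns


def stale_on_consumer(supplier_ldif, consumer_ldif):
    """DNs still present on the consumer that no longer exist on the supplier."""
    return sorted(ldif_dns(consumer_ldif) - ldif_dns(supplier_ldif))
```

A non-empty result from stale_on_consumer() after the refresh has completed
corresponds to the "still found on the consumer" outcome described above.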


Case 2: mirrormode pair supplying read-only consumer:

1)      Load people.ldif into empty peer server with serverID 1
2)      Start peer 1                                            
3)      Start peer 2 and allow it to sync up                    
4)      Stop peer 1                                             
4a)     Dump peer 1 DB using slapcat > p1.ldif
5)      Start peer 1
6)      Delete entry#1 on peer 1
7)      Delete entry#2 on peer 2
8)      Verify that both entries have gone on both servers
9)      Stop both peer servers (this clears the volatile syncprov-sessionlog)
10)     Start peer 1 and peer 2
11)     Load p1.ldif into empty consumer server (which is configured to sync
from peer 1)
12)     Start consumer
13)     Check ContextCSN matches on each server
14)     Check for Entry#1 Entry#2 Entry#3 on all servers

Entries #1 and #2 have gone on both mirrormode peers
Entries #1 and #2 are still present on the consumer
The consumer only has a ContextCSN value from peer 1

15)     Restart the consumer server

The consumer now has both ContextCSN values but it still has entries #1 and #2
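
To see which serverIDs are represented in a server's ContextCSN values, the SID
field of each CSN can be pulled out and compared between servers. This is a
minimal sketch, assuming the usual 2.4 CSN layout of
timestamp#count#SID#mod with a three-digit hexadecimal SID; the function names
are hypothetical, and the values passed in would be whatever contextCSN values
you read from the suffix entry on each server.

```python
import re

# Assumed CSN layout: YYYYmmddHHMMSS.uuuuuuZ#cccccc#SID#mmmmmm (SID is 3 hex digits)
CSN_RE = re.compile(
    r"^(\d{14}\.\d{6}Z)#([0-9a-fA-F]{6})#([0-9a-fA-F]{3})#([0-9a-fA-F]{6})$"
)


def csn_by_sid(values):
    """Map each serverID to the newest CSN string seen for it."""
    newest = {}
    for raw in values:
        csn = raw.strip()
        match = CSN_RE.match(csn)
        if match is None:
            raise ValueError("unrecognised CSN: %r" % csn)
        sid = int(match.group(3), 16)
        # Fields are fixed width, so string order matches CSN order per SID
        if sid not in newest or csn > newest[sid]:
            newest[sid] = csn
    return newest


def missing_sids(supplier_values, consumer_values):
    """ServerIDs that have a CSN on the supplier but none on the consumer."""
    return sorted(set(csn_by_sid(supplier_values)) - set(csn_by_sid(consumer_values)))
```

Applied to the state after step 14, missing_sids() would be expected to report
the serverID of peer 2; after the restart in step 15 it would be expected to
report nothing, even though entries #1 and #2 remain.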

16)     Delete Entry #3 on peer 1
17)     Check for entry #3 on all servers

Correctly deleted

18)     Delete Entry #4 on peer 2
19)     Check for entry #4 on all servers

Correctly deleted

Entries #1 and #2 are still present on the consumer


Andrew