Cisco Cucm DB Rep
Created on: Oct 25, 2010 9:29 AM by Lang Hardison- Last Modified: Feb 23, 2011 9:28 AM by Lang Hardison
CUCM database replication is an area in which Cisco customers and partners have asked for more in-depth training, so that they can properly assess a replication problem and potentially resolve it without involving TAC. This document discusses the basics needed to effectively troubleshoot and resolve replication issues.
Replication Architecture
CUCM 5.x Replication Architecture
Communications Manager 5.x has a replication topology similar to CallManager 4.x. Both follow a hub and spoke topology. The publisher establishes a connection to every server in the cluster, and each subscriber establishes a connection to its local database and to the publisher only. As illustrated in the figure below, only the publisher's database is writable, while each subscriber contains a read-only database. During normal operation the subscribers do not use their read-only copy of the database; they use the publisher's database for all read and write operations. If the publisher goes down or becomes inaccessible, the subscribers use their local copy of the database. Since the subscriber's database is read-only and the publisher's database is inaccessible, no changes to the database are permitted during the failover period. Later versions change the architecture to address this limitation.
converted by Web2PDFConvert.com
IDS replication is configured so that each server is a "root" node in the replication network. Each server maintains its own queue of changes made on the local server to send to the other servers in the replication network. A root node will not pass a replication change on to another root node. Thus, the only way for a change made on a particular server to reach the other servers is for that server to replicate it itself. In other words, a change made on "A" will be sent to "B" by "A", but "B" will not send that same change on to "C". Server "A" must send it to "C" and to all other nodes.
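The root-node behaviour described above can be illustrated with a small simulation. This is only a sketch of the idea, not how IDS is implemented: the Node class, its methods, and the change text are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class Node:
    """One server in the replication network, acting as a root node."""

    def __init__(self, name):
        self.name = name
        self.peers = []                # every other node in the mesh
        self.applied = []              # changes present in the local database
        self.sent = defaultdict(list)  # peer name -> changes this node sent

    def local_change(self, change):
        """Apply a change made on this server and send it to every peer."""
        self.applied.append(change)
        for peer in self.peers:
            self.sent[peer.name].append(change)
            peer.receive(change)

    def receive(self, change):
        """Apply a replicated change; a root node never forwards it onward."""
        self.applied.append(change)


def build_mesh(names):
    """Create the nodes and connect every node to every other node."""
    nodes = {name: Node(name) for name in names}
    for node in nodes.values():
        node.peers = [other for other in nodes.values() if other is not node]
    return nodes


mesh = build_mesh(["A", "B", "C"])
mesh["A"].local_change("add phone")

for name, node in mesh.items():
    print(name, "applied:", node.applied, "sent:", dict(node.sent))
```

After the change on "A", both "B" and "C" hold it, but only "A" has anything in its sent record, which matches the rule that "B" does not relay the change to "C".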
Value: 0
Meaning: Initialization State
Description: This state indicates that replication is in the process of trying to set up. Being in this state for longer than an hour could indicate a failure in setup.

Value: 1
Meaning: Number of Replicates not correct
Description: This state is rarely seen in 6.x and 7.x, but in 5.x it can indicate that replication is still in the setup process. Being in this state for longer than an hour could indicate a failure in setup.

Value: 2
Meaning: Replication is good
Description: Logical connections have been established and tables match the other servers on the cluster.

Value: 3
Meaning: Tables are
Description: Logical connections have been established but we are unsure if tables match. In 6.x and 7.x all servers could show state 3 if one server in the cluster is down. This can happen because the other servers are unsure whether there is an update to a user-facing feature that has not been passed from that sub to the other devices in the cluster.

Value: 4
Description: The server no longer has an active logical connection over which to receive database tables. No replication is occurring in this state.
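As a quick way to apply the state values above, the sketch below flags every node that is not in state 2. It assumes the per-node states have already been read from RTMT or the Database Status report and typed into a dictionary by hand; the node names and the one-line summaries are paraphrases chosen for illustration, not output from any Cisco tool.

```python
# One-line paraphrases of the state descriptions; only state 2 is healthy.
STATE_SUMMARY = {
    0: "replication is still trying to set up",
    1: "number of replicates not correct",
    2: "replication is good",
    3: "connections are up but tables may not match",
    4: "no active logical connection, no replication",
}


def nodes_needing_attention(states):
    """Return (node, state, summary) for each node not in state 2, sorted by node name."""
    report = []
    for node in sorted(states):
        value = states[node]
        if value != 2:
            report.append((node, value, STATE_SUMMARY.get(value, "unknown state")))
    return report


# Hypothetical cluster: node names are placeholders.
cluster = {"pub": 3, "sub1": 4, "sub2": 4, "sub3": 2}
for node, value, summary in nodes_needing_attention(cluster):
    print(f"{node}: state {value} ({summary})")
```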
The logical connections discussed above are the connections seen in the Topology Diagram at the beginning of this document. These logical connections can be viewed with cdr list serv (Cisco Database Replicator List of Server Connections).
Next Steps
Now that the state of replication has been identified, if the servers are in a state other than 2 it is necessary to identify what other information is needed before taking further action. Check the other replication requirements before attempting to solve the replication problem. Failure to complete the necessary problem assessment before attempting a solution could result in hours of wasted time and energy.
Server/Cluster Connectivity
Confirm the connectivity between nodes. In 5.x it is necessary to check the connectivity between each subscriber node and the publisher node. In 6.x and later, because of the fully meshed topology, it is necessary to check connectivity between every pair of nodes in the cluster. This is important to keep in mind if an upgrade has taken place from 5.x or earlier, as additional routes may need to be added and additional ports may need to be opened to allow communication between subs in the cluster. The documentation on checking connectivity is linked below. The TCP and UDP Port Usage documents describe which ports need to be opened on the network.
CUCM 8.5: http://www.cisco.com/en/US/docs/voice_ip_comm/cucm/port/8_5_1/portlist851.html
CUCM 8.0: http://www.cisco.com/en/US/docs/voice_ip_comm/cucm/port/8_0_2/portlist802.html
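To see how much the set of paths to verify grows after moving from 5.x to a fully meshed version, the following sketch lists the node pairs for each topology. The function name and the node names are hypothetical; it only enumerates pairs and does not test the network itself.

```python
from itertools import combinations


def connectivity_pairs(publisher, subscribers, full_mesh):
    """List the node pairs whose connectivity should be verified.

    full_mesh=False models the 5.x hub and spoke layout (each sub <-> pub).
    full_mesh=True models 6.x and later (every node <-> every other node).
    """
    if full_mesh:
        return list(combinations([publisher, *subscribers], 2))
    return [(publisher, sub) for sub in subscribers]


subs = ["sub1", "sub2", "sub3", "sub4"]
print(len(connectivity_pairs("pub", subs, full_mesh=False)), "pairs in hub and spoke")
print(len(connectivity_pairs("pub", subs, full_mesh=True)), "pairs in full mesh")
```

For a 5 node cluster this gives 4 pairs in the hub and spoke case and 10 in the full mesh case, which is why routes and ports between subscribers matter after an upgrade.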
Configuration Files
Check all the hosts files that will be used when setting up replication. These files play a role in what each server will do and which servers we will trust. The files we are referring to here are listed below.

File: /etc/hosts
Purpose: This file is used to locally resolve hostnames to IP addresses. It should include the hostname and IP address of all nodes in the cluster, including CUPS nodes.

File: /home/informix/.rhosts
Purpose: A list of hostnames which are trusted to make database connections.

File: $INFORMIXDIR/etc/sqlhosts
Purpose: Full list of CCM servers for replication. Servers here should have the correct hostname and node id (populated from the process node table). This is used to determine to which servers replicates are pushed.
/etc/hosts
Below is the /etc/hosts file as displayed in Unified Reporting. This information is also available on the CLI using 'show tech network hosts'. Cluster Manager populates this file, which is used for local name resolution.
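When comparing the hosts output from several nodes, it can help to check the text against the expected list of cluster nodes mechanically. The sketch below assumes the hosts output has been copied off the server as plain text in the usual "IP name [alias ...]" layout; the hostnames and the 192.0.2.x addresses are placeholders, and the function names are hypothetical.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name.lower()] = ip
    return mapping


def hosts_problems(text, expected):
    """Report expected nodes that are missing or mapped to the wrong IP."""
    mapping = parse_hosts(text)
    problems = []
    for hostname, ip in expected.items():
        found = mapping.get(hostname.lower())
        if found is None:
            problems.append(f"{hostname}: missing")
        elif found != ip:
            problems.append(f"{hostname}: expected {ip}, found {found}")
    return problems


# Placeholder data: replace with the real hosts text and node list.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
192.0.2.10 cucm-pub.example.com cucm-pub
192.0.2.11 cucm-sub1.example.com cucm-sub1
"""
EXPECTED = {
    "cucm-pub": "192.0.2.10",
    "cucm-sub1": "192.0.2.11",
    "cucm-sub2": "192.0.2.12",
}
print(hosts_problems(SAMPLE_HOSTS, EXPECTED))
```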
.rhosts file
sqlhosts
DNS (Optional)
If DNS is configured on a particular server, both forward and reverse DNS must resolve correctly. Informix uses DNS very frequently, and any failure or improper configuration in DNS can cause issues for replication.
Verifying DNS
The best command to verify DNS is utils diagnose test. This command can be run on each server to verify forward and reverse DNS under the validate network portion of the command (it will report failed dns if there is an error). In versions that do not yet have this command, use the command utils network [host ip/hostname] to check forward and reverse name resolution and see the failure.
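The forward and reverse requirement can also be spot-checked from an administrator workstation that uses the same DNS servers as the cluster. The sketch below is an assumption-laden illustration, not a replacement for utils diagnose test: it runs off-box, the function name is hypothetical, and it compares only the short hostname so that a fully qualified reverse answer still counts as a match.

```python
import socket


def dns_consistent(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return True when hostname resolves to an IP and that IP resolves back to it.

    The resolver functions are parameters so the check can be exercised
    with stand-ins instead of a live DNS server.
    """
    try:
        ip = forward(hostname)
        resolved_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, ips)
    except OSError:                     # covers socket.gaierror and socket.herror
        return False

    def short(name):
        return name.split(".")[0].lower()

    return short(resolved_name) == short(hostname)


# Demonstration with stand-in resolvers; the name and address are placeholders.
def fake_forward(name):
    return "192.0.2.10"


def fake_reverse(ip):
    return ("cucm-pub.example.com", [], [ip])


print(dns_consistent("cucm-pub", fake_forward, fake_reverse))
```

Calling dns_consistent("some-node") without the extra arguments uses the workstation's real resolver.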
Command: utils dbreplication dropadmindb
Function: configuration information from the syscdr database which forces the replicator to reread the configuration files. Later examples talk about identifying a corrupt syscdr database.

Command: utils dbreplication clusterreset
Function: Always run from the publisher node. Used to reset replication connections and do a broadcast of all tables. Error checking is ignored. Following this command, 'utils dbreplication reset all' should be run in order to get correct status information. Finally, after replication has returned to state 2, all subs in the cluster must be rebooted.

Command: utils dbreplication runtimestate
Function: Available in 7.X and later. This command shows the state of replication as well as the state of replication repairs and resets. It is a helpful tool for seeing which tables are replicating during the process.

Command: utils dbreplication forcedatasyncsub
Function: This command forces a subscriber to have its data restored from data on the publisher. Use this command only after the 'utils dbreplication repair' command has been run several times and the 'utils dbreplication status' output still shows non-dynamic tables out of sync.
Examples
Reestablish logical CDR connections to all servers for replication
On first getting onto the case, I would immediately check RTMT or the Unified Report to identify the current state of replication. I choose to ask for the Database Status report because the customer is on a version that has it available. In the report I find the following information:

The cluster is a 5 node cluster.
The publisher is in Replication State = 3.
All subscribers in the cluster are in Replication State = 4.

We check the Replication Server List in the report, and only the publisher shows local as connected. This confirms that, based on our descriptions, the servers are indeed in the states listed above. We now do some other checks to prepare to fix replication. We verify in the report that all of the hosts files look correct. We have also already verified in the link (LINKHERE) that all connectivity is good and that DNS is either not configured or is working correctly. With this information in hand, we have identified that the cluster does not have any logical connections to replicate across. The recommendation to the customer is therefore to follow the most basic process, which fixes about 50 percent of replication cases. The steps are below.

1. utils dbreplication stop on all subscribers. This command can be run on all subscribers at the same time, but it needs to complete on all subscribers before being run on the publisher. The amount of time this command takes to return is based on your cluster's repltimeout. This can be seen with the command show tech repltimeout, and by default it is 300 seconds (5 minutes). At the end of this document I will provide a calculation for determining what you should set your repltimeout to (via utils dbreplication setrepltimeout) in your cluster, and how this value affects replication. (If utils dbreplication stop all is available (7.X and later), then step 1 and step 2 can be accomplished with this one command.)
2. utils dbreplication stop on the publisher. Again, this can be done through utils dbreplication stop all if available. This also waits for the repltimeout, as described above.
3. utils dbreplication reset all. This command will take an hour or longer to complete, depending on your cluster. You can monitor the status through utils dbreplication runtimestate or through the procedure following the examples portion of this document.
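The ordering in the three steps above can be written down as a small plan builder, which is handy when preparing a change window for a larger cluster. This is a sketch only: it prints the commands in order and does not run anything. It assumes that utils dbreplication stop all and utils dbreplication reset all are issued on the publisher, which the text above does not state explicitly; the function name and node names are hypothetical.

```python
DEFAULT_REPLTIMEOUT_SECONDS = 300  # default repltimeout given in the text


def reset_plan(publisher, subscribers, stop_all_available):
    """Return the ordered (node, command) steps for the basic stop/reset procedure."""
    if stop_all_available:
        # 7.X and later: one command covers steps 1 and 2.
        # Assumption: it is issued on the publisher.
        steps = [(publisher, "utils dbreplication stop all")]
    else:
        # Step 1: stop on every subscriber (can run in parallel, must all finish).
        steps = [(sub, "utils dbreplication stop") for sub in subscribers]
        # Step 2: stop on the publisher only after the subscribers are done.
        steps.append((publisher, "utils dbreplication stop"))
    # Step 3: reset. Assumption: issued on the publisher.
    steps.append((publisher, "utils dbreplication reset all"))
    return steps


for node, command in reset_plan("pub", ["sub1", "sub2"], stop_all_available=False):
    print(f"{node}: {command}")
```

Each stop waits for the repltimeout before returning, so with the default of 300 seconds the two stop phases alone account for roughly ten minutes when run separately.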
Replication Steps
These steps are done automatically (by replication scripts) when the system is installed. When we do a utils dbreplication reset all they get done again.
List of steps
1. Define Pub - Set it up to start replicating
2. Define template on pub and realize it (Tells pub which tables to replicate)
3. Define each Sub
4. Realize Template on Each Sub (Tells sub which tables they will get/send data for)
5. Sync data using "cdr check" or "cdr sync" (older systems use sync)
It is possible to determine where replication setup is in this process using commands, log files, and the Database Status report. The command utils dbreplication runtimestate is the most helpful for identifying the current state of replication setup. In versions where it is not available, or as a supplement, here is how to follow replication using logs.
Publisher syscdr/define
Confirm that the publisher has brought its own syscdr back up. In Cisco Unified Reporting -> Database Status -> Replication Server List, confirm that you see the publisher's local connection up. This should occur within a few minutes of the reset. You can also look in the Informix log on that box to confirm this, using "file view activelog cm/log/informix/ccm.log" from the CLI.
Publisher Define
Subscriber define
You can also check the output of file list activelog cm/trace/dbl date detail. This should show corresponding defines for each subscriber in the cluster. The define is shown in white below.
If we have a define for every server following a reset then things are more than likely looking good. Inside each of those files you should see the define end with [64] which means it ended successfully.
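If the define files have been collected off the server, the "[64]" check can be applied to each one with a few lines of code. The sketch below assumes that "[64]" appears at the end of the last non-blank line of the file, which is one reading of "the define end with [64]"; the function name and the sample strings are placeholders and do not reproduce real log content.

```python
def define_succeeded(log_text):
    """Return True when the last non-blank line of the text ends with [64].

    Assumption: the success marker sits at the end of the final line.
    """
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    return bool(lines) and lines[-1].endswith("[64]")


# Placeholder text only; real define files will look different.
print(define_succeeded("placeholder define output\nplaceholder final line [64]\n"))
print(define_succeeded("placeholder define output\nplaceholder final line [37]\n"))
```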
After all subscribers have been defined, replication waits for the repltimeout (which can be checked with show tech repltimeout) and then performs a broadcast that actually pushes the replicates across. The broadcast is shown in yellow. The cdr_broadcast log lists which tables are being replicated and the result. Below is the list, followed by an excerpt from the cdr_broadcast log (broadcast shown in the yellow box).
With this you should be able to follow and fix replication cases. Below is additional information on how to estimate the repltimeout that you should configure on the cluster, as mentioned earlier in the document.
Comments (4)
CARLOS ENRIQUE GUTIERREZ CARRASCO Feb 10, 2011 11:22 AM Excellent contribution. 10 / 10 points. Very helpful
bmarkel123 Mar 8, 2011 12:59 AM Great guide! Saved me hours of extra work.
ivanobalice Jun 16, 2011 10:04 AM An excellent and comprehensive DB replication guide! It does a great job of explaining how to troubleshoot issues with DB rep beyond "just restart the servers and hope for 2's"
Mark VandeVere Feb 13, 2012 11:40 AM TAC engineer on a replication issue case referred me to this link as a helpful education resource. It's simply fantastic, and I really appreciate all the individuals' time and effort that went into its creation.
Postings may contain unverified user-created content and change frequently. The content is provided as-is and is not warrantied by Cisco.
1992-2012 Cisco Systems Inc. All rights reserved.