
MSP Center Plus Failover


Ensuring 100% availability of the Central Server is essential, but it is also difficult. The database can crash or the server can go down, and in either case monitoring halts. If the Central DB crashes, it takes hours (depending on the data size) to restore it and resume the service. Taking periodic backups of the Central Server is also complicated and time consuming, and both the complexity and the time required grow as the data grows. Failover is the recommended technique to overcome these issues and to achieve 100% availability of the Central Server.

MSP Center Plus' failover architecture is a master-to-master, two-way replication system: while the Primary Central Server is active, it periodically writes the data it collects to the Secondary Central Server's DB. When the Primary Central Server is unavailable, the Secondary Central Server takes over and periodically updates the Primary Central Server's DB. The data is therefore held redundantly on both servers.



Ensure that MySQL is always running on both the Primary and the Secondary (failover) Central Server.


Note:
How does failover affect desktop agents?
Desktop agents have no means of knowing about the Secondary Central Server, so a failover affects all agents. To overcome this, create a subdomain such as https://service.companyname.com and NAT it to either the Primary or the Secondary Central Server, whichever is currently active.

Failover steps for build 7100:

The FailoverConfig.bat file makes it easy to configure failover support for MSP Center Plus: execute the batch file and follow the on-screen instructions. The batch file has to be executed on both the Primary Central Server and the Secondary Central Server. When executed on the Primary Central Server, it asks for details about the Secondary Central Server; when executed on the Secondary Central Server, it asks for details about the Primary Central Server.

Prerequisites:
  1. Before executing the FailoverConfig.bat file on the Primary Central Server, stop the Central service and MySQL on that server.
  2. Before executing the batch file on the Secondary Central Server, stop the Central service and MySQL on that server. MySQL on the Primary Central Server, however, must be running.

Note: If you plan to move the database of either the Primary Central Server or the Secondary Central Server to a remote machine, first configure the failover settings and then move the database to the remote machine.

To configure failover support:

  1. Execute the batch file FailoverConfig.bat [available under <Central Home>\bin] from the command prompt.
  2. Follow the on-screen instructions [the image shown below is a sample of the failover setup wizard executed on the Primary Central Server].


Steps to be followed on the Secondary Central Server after executing FailoverConfig.bat
  1. Copy the server.keystore file under [Central-Home]\tomcat\conf\ on the Primary Central Server to the [Central-Home]\tomcat\conf folder on the Secondary Central Server.
  2. Copy the https.truststore file under [Central-Home]\conf on the Primary Central Server to the [Central-Home]\conf folder on the Secondary Central Server.
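
The two copy steps above can be scripted. The following is a minimal Python sketch, not part of the product: it assumes both Central home folders are reachable as paths from the machine running the script (for example through a network share), and the function name copy_failover_files is hypothetical. Only the two relative file locations are taken from the steps above.

```python
import shutil
from pathlib import Path

# Relative locations of the two files, taken from the steps above.
FILES_TO_COPY = [
    Path("tomcat") / "conf" / "server.keystore",
    Path("conf") / "https.truststore",
]


def copy_failover_files(primary_home, secondary_home):
    """Copy the keystore and truststore from the primary home to the secondary home.

    Both arguments are assumed to be paths to the respective [Central-Home]
    folders as seen from the machine running this script.
    """
    copied = []
    for relative in FILES_TO_COPY:
        source = Path(primary_home) / relative
        target = Path(secondary_home) / relative
        if not source.is_file():
            raise FileNotFoundError(f"missing on primary: {source}")
        # Create the destination folder if it does not exist yet.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        copied.append(target)
    return copied
```

Calling copy_failover_files with the two home folders returns the list of files written on the secondary side; an existing file at the destination is overwritten, so keep a backup if in doubt.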

Configurations to be done on the Probe
  1. Stop the Probe service.
  2. Edit the NOCServerDetails.xml file under [Probe-Home]\conf and add the following entries, in this order, immediately after the DMSID="probe-name" entry.

    StandByNOCServerName="hostname-or-ipaddress-of-the-secondary-central-server"
    StandByNOCServerPort="port-on-which-the-secondary-central-server-runs"
    SwitchOverInterval="3600"
    Retries="0"
    ReEstablishInterval="30"

    [ SwitchOverInterval - Time period after which the probe starts communicating with the Secondary Central Server
    Retries - Number of attempts to re-establish the connection with the Secondary Central Server
    ReEstablishInterval - Time period after which the probe tries to re-establish the connection with the Secondary Central Server once it has lost its connection ]
  3. Save the file.
  4. Start the Probe service.

After the Primary Central Server fails, the probe starts communicating with the Secondary Central Server based on the SwitchOverInterval set in the NOCServerDetails.xml file.
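
The edit to NOCServerDetails.xml in step 2 can also be sketched in Python. This is an illustration under assumptions: the real layout of NOCServerDetails.xml is not shown in this guide, so the SAMPLE document below (its element names and the NOCServerName attribute) is hypothetical, as are the function name and the host and port values. Only DMSID and the five standby attribute names and sample values come from the steps above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for NOCServerDetails.xml; only DMSID is from the guide.
SAMPLE = (
    '<NOCServerDetails>'
    '<NOCServer NOCServerName="primary-host" DMSID="probe-name"/>'
    '</NOCServerDetails>'
)


def add_standby_settings(xml_text, standby_host, standby_port,
                         switch_over_interval=3600, retries=0,
                         re_establish_interval=30):
    """Return xml_text with the standby attributes placed right after DMSID."""
    root = ET.fromstring(xml_text)
    new_attrs = {
        "StandByNOCServerName": standby_host,
        "StandByNOCServerPort": str(standby_port),
        "SwitchOverInterval": str(switch_over_interval),
        "Retries": str(retries),
        "ReEstablishInterval": str(re_establish_interval),
    }
    # Locate the first element that carries a DMSID attribute.
    target = next((el for el in root.iter() if "DMSID" in el.attrib), None)
    if target is None:
        raise ValueError("no element with a DMSID attribute found")
    rebuilt = {}
    for name, value in target.attrib.items():
        if name in new_attrs:
            continue  # drop stale copies; fresh values are added after DMSID
        rebuilt[name] = value
        if name == "DMSID":
            rebuilt.update(new_attrs)
    target.attrib.clear()
    target.attrib.update(rebuilt)
    return ET.tostring(root, encoding="unicode")
```

For example, add_standby_settings(SAMPLE, "secondary.example.com", 8443) returns the sample document with the five standby attributes added. The default ElementTree parser discards comments and the XML declaration, so back up the original file and compare the result before replacing it.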

Use Cases
Run through the following use cases on both the Primary Central Server and the Secondary Central Server to confirm that the failover configuration has been set up successfully:
  1. Connect to MySQL and execute the following command:
    show slave status\G

    The result should display "Yes" for the following fields:
    Slave_IO_Running
    Slave_SQL_Running


  2. Go to Central\Mysql\data, open the file <machine name>.err (e.g., opmsp-d620.err), and check for the replication started message.
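
The check in use case 1 can be automated by parsing the output of the command. The sketch below is a minimal Python example, assuming the vertical (\G) output has been captured as text, for instance by redirecting the mysql client output to a file. The function names are hypothetical, and SAMPLE_OUTPUT is an abridged, illustrative excerpt rather than output from a real server.

```python
# Abridged, illustrative excerpt of "show slave status\G" output.
SAMPLE_OUTPUT = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.0.2.10
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
"""


def parse_slave_status(text):
    """Turn 'Field: value' lines of the vertical output into a dict."""
    fields = {}
    for line in text.splitlines():
        if line.lstrip().startswith("*") or ":" not in line:
            continue  # skip the row separator and blank lines
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def replication_running(text):
    """Return True only if both replication threads report 'Yes'."""
    fields = parse_slave_status(text)
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")
```

Because the check has to pass on both servers, run it once against the output captured on the Primary Central Server and once against the output from the Secondary Central Server.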





