Eloquence Replication

 

Revision: 2008-05-05

This document describes the Eloquence replication functions that will be available with Eloquence B.08.00 and may be added to version B.07.10 with patches.


Introduction

This feature replicates database transactions to other database environments (local or remote). Replication is unidirectional and performed asynchronously. It is typically close to real-time if server performance and connection bandwidth allow.

On the master server, committed transactions are saved in Eloquence forward-log files. The dbrepl replication utility is used to transfer and apply committed transactions from the master to any replicated servers ("slave servers"). On the slave server(s), the databases are available for access in read-only mode.

The Eloquence replication functionality is not specific to a database but performs a replication of all changes to a database environment. It does not perform regular database calls but replicates transaction results. This improves performance and allows covering any changes that are described by internal transactions, including administrative changes such as creation of databases, structural changes and restoring single databases. Technically, the replication function performs a continuous incremental forward recovery on slave servers of committed transactions. Consequently, any information that is not maintained by the database server is outside the scope of the Eloquence replication functions.

Expected use of this function includes load sharing (e.g. use a replicated environment for reporting), hot standby (allow switching users to a running server instance in case of a problem) and making replicated data available in branch offices that are connected through WAN lines.

To use Eloquence replication, an additional "repl" license key is required.


Prerequisites

To use replication the following requirements must be met:

  • Eloquence B.07.10 or later must be installed.
  • If using the Eloquence B.07.10 version, the patch(es) enabling replication must be installed on both the master and the slave system. The patch mainly adds new functionality executed on the slave server but also implements limited changes to the code base executed on the master side.
  • The slave function is only enabled if a specific (repl) license key is present.
  • FwLog must be enabled on the master server; using EnableAudit=1 is optional (it is not required by the replication function). Enabling FwLog on the slave server is optional.
  • The master system needs sufficient disk space to hold any forward-log files that have not yet been replicated entirely to the slave servers.
  • Replication master and slave must share the same architecture, that is, the same byte order and alignment. These may depend on the CPU or the operating system.

For Eloquence B.07.10 the following patches must be installed:

B.07.10 patches:
http://eloquence.marxmeier.com/support/B0710/patch/B0710.html


Configuration

The Eloquence replication functions add new options for the server configuration file. The supplied default file /opt/eloquence6/newconfig/config/eloqdb6.cfg has been updated accordingly.

[Replication]
Role = Standalone | Master | Slave
RedirectWrite = server:service
IgnoreWrite = 0 | 1
TmpDir = /tmp

The Replication.Role configuration item defines the role of this server.
If not specified, it will default to Standalone. If set to Master this specifies a master server. If set to Slave this specifies a slave server.

Setting Role to Master disallows some operations that would cause replicated servers to become unsynchronized (such as bimport). It also causes the master server to write additional information about opened databases to the forward-log files, which allows the slave server to detect conflicts between replication and concurrent use.

Role must be set to Slave for a replicated server. If configured as a slave server, any write attempt is rejected (unless RedirectWrite or IgnoreWrite options are used) and the replication is enabled.

The RedirectWrite configuration item is only used on a slave server and may be used to specify the corresponding master server for a slave server. It specifies the server name or IP address and service name or port number, separated by a colon.
If RedirectWrite is defined on a slave server, some DBOPEN modes (modes 1, 3, and 4) are transparently redirected to the specified server (this will also add a note to the log file).

The IgnoreWrite configuration item is only used on a slave server. If set, opening a database in write mode on a slave server is accepted but internally converted into a read-only open mode. This way, a program that opens a database in write mode but only performs read operations may also run on a slave server.

Please note: If IgnoreWrite is set, RedirectWrite is implicitly disabled.

The TmpDir configuration item may be used on a slave server to specify a temporary directory that is used as a scratch storage for collecting and processing partial transaction information. It needs to provide sufficient disk space to hold the size of the largest transaction. It defaults to the /tmp directory.
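
How these items interact can be summarized in a short Python sketch. It is illustrative only and not part of Eloquence: the function name effective_replication is hypothetical, and it assumes the [Replication] section can be read with Python's configparser module, which may not hold for every real eloqdb6.cfg file.

```python
import configparser

VALID_ROLES = ("Standalone", "Master", "Slave")

def effective_replication(cfg_text):
    """Summarize how a [Replication] section is interpreted (sketch only)."""
    # Only "=" is a delimiter so that "host:port" values stay intact;
    # interpolation is off because values such as fw-%N contain "%".
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(cfg_text)
    section = parser["Replication"] if parser.has_section("Replication") else {}

    # Role defaults to Standalone when not specified.
    role = section.get("Role", "Standalone").strip().capitalize()
    if role not in VALID_ROLES:
        raise ValueError("unknown Role: " + role)

    redirect = None
    ignore_write = False
    if role == "Slave":
        # RedirectWrite and IgnoreWrite are only used on a slave server.
        ignore_write = section.get("IgnoreWrite", "0").strip() == "1"
        redirect = section.get("RedirectWrite")
        if ignore_write:
            # IgnoreWrite implicitly disables RedirectWrite.
            redirect = None

    if role != "Slave":
        writes = "allowed"
    elif ignore_write:
        writes = "opened read-only"
    elif redirect:
        # Applies to DBOPEN modes 1, 3 and 4 only.
        writes = "redirected to " + redirect.strip()
    else:
        writes = "rejected"
    return {"role": role, "writes": writes}
```

For a slave configuration that sets both RedirectWrite and IgnoreWrite = 1, the sketch reports write opens as "opened read-only", mirroring the note that IgnoreWrite takes precedence.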

Example configuration on a master server:

[Replication]
Role = Master

[ForwardLog]
FwLog = /fwlog/fw-%N

A master server needs to enable forward-logging to files (managed by the server process) and should define a Role=Master.

Example configuration on a slave server:

[Replication]
Role = Slave
RedirectWrite = 194.64.71.28:8202

A slave server needs to define Role=Slave and may define RedirectWrite to bounce DBOPEN calls in write mode to the master server.

Please note that the RedirectWrite configuration string (which may specify server or service by name) is resolved by the client library.

Also note that RedirectWrite only affects application programs using one of the dbopen modes 1, 3, or 4. Some database utilities, such as dberase, use special dbopen modes and will fail when used with a slave server.


dbrepl utility

The Eloquence dbrepl utility is used to replicate committed transactions from a master server to a slave server. This utility needs to be run with dba privileges.

The dbrepl utility reads the master server config file to obtain the location and naming convention of forward-log files. It then contacts the specified slave server and obtains the most recently synchronized checkpoint on the slave server. With this information the master server forward-log files are searched to locate a synchronization point. dbrepl then submits any enqueued transactions from this point to the slave server.

Once the slave server is up to date, any subsequently committed transactions should be replicated close to real-time, subject to communication bandwidth. Replication should only place minor load on the master and slave server once synchronization has been achieved.

As the dbrepl utility reads the master server forward-log files, it should be run on the same system as the master server and should have appropriate access rights to the forward-log files of the master server (read-only access is sufficient).

Usage:

dbrepl [options] [slave_server_addr]

options:
 -help        - show usage (this list)
 -c cfg       - master server configuration file
 -v           - verbose, display progress
 -u name      - user name (defaults to dba)
 -p pswd      - password
 -d flags     - debug flags
 -S           - synchronize on existing log, then exit
 -G           - process current log generation, then exit
 -b bps[k|m]  - limit bandwidth to bps [kilo|mega] bits per second
 -T timestamp - process until point in time (incl.)

timestamp formats:
 YYYY-MM-DD HH:MM:SS
 MM/DD/YYYY HH:MM:SS
 DD.MM.YYYY HH:MM:SS

The -c option is used to specify the master server config file.
If not present, it defaults to the default config file on the local system (/etc/opt/eloquence6/eloqdb6.cfg for HP-UX and Linux).

The slave_server_addr command line option specifies the slave server host name or IP address and service name or port number, separated by a colon (e.g. 194.64.71.28:8202). The host name or IP address may be omitted and defaults to localhost (127.0.0.1).

The EQ_DBSERVER environment variable may be used to specify the slave server address. The EQ_DBUSER and EQ_DBPASSWORD environment variables may be used as an alternative to the -u / -p options.
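
The address rules can be illustrated with a small Python sketch. It is not dbrepl code: the function name resolve_slave_addr is hypothetical, and two points are assumptions that the text above does not state, namely that a command line argument takes precedence over EQ_DBSERVER and that an address without a colon is rejected.

```python
def resolve_slave_addr(arg=None, environ=None):
    """Return (host, service) for the slave server (sketch only)."""
    environ = environ if environ is not None else {}
    # Assumption: a command line argument wins over EQ_DBSERVER.
    addr = arg if arg else environ.get("EQ_DBSERVER")
    if not addr:
        raise ValueError("no slave server address given")
    # Split at the last colon: host (name or IP) before, service after.
    host, sep, service = addr.rpartition(":")
    if not sep or not service:
        # Assumption: the sketch insists on the colon-separated form.
        raise ValueError("expected [host]:service, got " + repr(addr))
    # The host part may be omitted and then defaults to localhost.
    return (host or "localhost", service)
```

With this sketch, ":8202" resolves to localhost and service 8202, while "194.64.71.28:8202" keeps the given IP address.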

By default, dbrepl synchronizes all enqueued changes and then closely follows any on-going changes on the master server.

  • If the -S option is present, dbrepl exits once all enqueued changes are synchronized.
  • If the -G option is present, dbrepl exits if the volume generation changes (e.g. when the master server is restarted, or when an on-line backup or "dbctl forwardlog restart" is run).
  • If the -T option is present, dbrepl exits after processing all enqueued changes up to (and including) a given checkpoint date and time. The time defaults to 00:00:00 if not specified explicitly. Date and time can be separated by a space or another character; shell quoting might be required.
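
To make the -T timestamp rules concrete, the following Python sketch parses the three documented formats and applies the 00:00:00 default. It is only an illustration and does not reproduce dbrepl's actual parser: the function name parse_timestamp is hypothetical, and the sketch assumes zero-padded fields and exactly one non-digit separator character between date and time.

```python
import re
from datetime import datetime

# Date layouts listed in the usage text, with the order of their fields.
_DATE_PATTERNS = [
    (re.compile(r"^(\d{4})-(\d{2})-(\d{2})"), ("y", "m", "d")),    # YYYY-MM-DD
    (re.compile(r"^(\d{2})/(\d{2})/(\d{4})"), ("m", "d", "y")),    # MM/DD/YYYY
    (re.compile(r"^(\d{2})\.(\d{2})\.(\d{4})"), ("d", "m", "y")),  # DD.MM.YYYY
]
# Assumption: one non-digit separator, then HH:MM:SS.
_TIME_PATTERN = re.compile(r"^\D(\d{2}):(\d{2}):(\d{2})$")

def parse_timestamp(text):
    """Parse a -T style timestamp into a datetime (sketch only)."""
    for pattern, order in _DATE_PATTERNS:
        match = pattern.match(text)
        if match is None:
            continue
        fields = dict(zip(order, map(int, match.groups())))
        rest = text[match.end():]
        if rest == "":
            # The time defaults to 00:00:00 if not specified.
            hour = minute = second = 0
        else:
            time_match = _TIME_PATTERN.match(rest)
            if time_match is None:
                raise ValueError("invalid time part: " + repr(rest))
            hour, minute, second = map(int, time_match.groups())
        return datetime(fields["y"], fields["m"], fields["d"],
                        hour, minute, second)
    raise ValueError("unrecognized timestamp: " + repr(text))
```

In this sketch, "2006-04-18 18:54:17", "04/18/2006 18:54:17" and "18.04.2006 18:54:17" all denote the same point in time, and "2006-04-18" alone means midnight at the start of that day.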

For example:

$ dbrepl -c /etc/opt/eloquence6/eloqdb6.cfg -v -S :8202
R1: processing forward-log file: /fwlog/fw-1-1
R1: found synchronization point with slave server
...
R1: processing forward-log file: /fwlog/fw-12-1
R1: slave server is up-to-date until 2006-04-18 18:54:17

$
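
When dbrepl is started from a script, an invocation like the one above can be assembled as an argument list. The Python sketch below uses only options from the usage text; the function name build_dbrepl_argv and its parameter names are hypothetical.

```python
def build_dbrepl_argv(slave_addr, cfg=None, verbose=False,
                      sync_only=False, bandwidth=None, until=None):
    """Assemble a dbrepl command line as a list of arguments (sketch only)."""
    argv = ["dbrepl"]
    if cfg:
        argv += ["-c", cfg]        # master server configuration file
    if verbose:
        argv.append("-v")          # display progress
    if sync_only:
        argv.append("-S")          # synchronize on existing log, then exit
    if bandwidth:
        argv += ["-b", bandwidth]  # e.g. "2m" or "2048k"
    if until:
        argv += ["-T", until]      # process until point in time
    argv.append(slave_addr)        # [host]:service of the slave server
    return argv
```

Passing such a list to subprocess.run keeps a -T timestamp containing a space in a single argument, so no shell quoting is involved.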

The -b option can be used to limit the network bandwidth consumed by the replication. This throttling can be useful in case the network link between master and slave(s) is not dedicated to dbrepl and should not be saturated by the replication during high activity periods on the master. Note, however, that such throttling results in slave servers no longer being updated "as fast as possible".

The bandwidth limit is specified in bits, kilobits (k suffix) or megabits (m suffix) per second. Any suffix must directly follow the value. For example, the options "-b 2m", "-b 2048k" and "-b 2097152" each specify a limit of 2 megabits per second, which is equivalent to 256 kilobytes per second.
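
The arithmetic behind these equivalent values can be checked with a short Python sketch. It is an illustration, not dbrepl's parser: the function name parse_bandwidth is hypothetical, and the sketch assumes whole-number values and lowercase suffixes. The factor of 1024 follows from the example, where 2048k equals 2m equals 2097152.

```python
def parse_bandwidth(value):
    """Convert a -b style value such as "2m" to bits per second (sketch only)."""
    # k and m are multiples of 1024, consistent with 2m == 2048k == 2097152.
    multipliers = {"k": 1024, "m": 1024 * 1024}
    suffix = value[-1:]
    if suffix in multipliers:
        number, factor = value[:-1], multipliers[suffix]
    else:
        number, factor = value, 1
    # Assumption: only whole numbers are accepted.
    if not number.isdigit():
        raise ValueError("invalid bandwidth value: " + repr(value))
    return int(number) * factor

# 2 megabits per second expressed in bytes per second (8 bits per byte).
bytes_per_second = parse_bandwidth("2m") // 8
```

Here bytes_per_second is 262144, that is, 256 times 1024 bytes per second.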

To temporarily stop synchronization of a slave server, it is sufficient to stop the dbrepl utility. On its next start, dbrepl continues from the previous point.

Database locks are not replicated between master and slave servers, i.e. DBLOCK calls issued on the master have no effect on the slaves and DBLOCK calls issued on a slave have no effect on the master.
This might cause applications to behave differently when run on a slave server instead of the master server. For example, a report might use DBLOCK calls to prevent certain data from being changed concurrently by other programs. On a slave server, the DBLOCK calls by the report program would not prevent dbrepl from applying concurrent database updates because dbrepl does not use IMAGE calls.

Information about databases opened for writing on the master is recorded in the forward-log files and thus available to dbrepl and the slave server. A slave server uses this information to prevent conflicting modes of access by suspending replication temporarily or by rejecting the conflicting DBOPEN attempts on the slave. For example:

  • If a database is open in mode 8 on the slave server, replication is temporarily suspended as soon as the forward-log indicates a segment where the specific database was opened for writing on the master server. The slave resumes replication when the conflicting DBOPEN mode 8 ends.
    Please note that this suspended state does not only apply to the specific database, but to the slave server as a whole.

  • If a dberase is replicated to the slave while the target database is still in use on the slave server, replication is temporarily suspended until the required exclusive access to this specific database is possible.
    Please note that this suspended state does not only apply to the specific database, but to the slave server as a whole.

  • While replicating transactions that have been written on the master server using a DBOPEN in mode 3, the slave server will reject DBOPEN attempts for this specific database on the slave server.


dbctl utility

The dbctl utility may be used to obtain replication status information from a master or slave server. Output includes server role, forward-log status and most recently processed checkpoint. For a slave server it also shows whether replication is active or inactive.

For example:

$ dbctl replication status
Server is configured as MASTER
Last checkpoint is 12-1.1083 (2006-04-18 18:57:13)
Forward-logging is enabled.
Forward-log is '/fwlog/fw-12-1'.

$ dbctl -h slave_host -s slave_port replication status
Server is configured as SLAVE
Replication is active
Last checkpoint is 12-1.1083 (2006-04-18 18:57:13)
Forward-logging is enabled
Forward-log is '/slave/fwlog/fw-12-1'

Note that the slave server status displays checkpoint timestamps as received from the master server, i.e. when the respective checkpoint occurred on the master, not when it was finally applied on the slave. This information may be helpful in cases where replication has been stopped or is processing a backlog during "catch up" phases.
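
Because both servers report checkpoint times taken on the master, the two status outputs can be compared to estimate how far a slave is behind. The Python sketch below assumes the "Last checkpoint is ..." line has exactly the layout shown in the examples above; the function names are hypothetical, and the checkpoint identifier (such as 12-1.1083) is treated as an opaque string.

```python
import re
from datetime import datetime

# Matches e.g. "Last checkpoint is 12-1.1083 (2006-04-18 18:57:13)".
_CHECKPOINT = re.compile(
    r"^Last checkpoint is (\S+) \((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\)",
    re.MULTILINE,
)

def last_checkpoint(status_text):
    """Return (checkpoint_id, timestamp) from dbctl status output (sketch only)."""
    match = _CHECKPOINT.search(status_text)
    if match is None:
        raise ValueError("no checkpoint line found")
    stamp = datetime.strptime(match.group(2), "%Y-%m-%d %H:%M:%S")
    return match.group(1), stamp

def replication_lag(master_status, slave_status):
    """Difference between master and slave checkpoint times as a timedelta."""
    _, master_time = last_checkpoint(master_status)
    _, slave_time = last_checkpoint(slave_status)
    return master_time - slave_time
```

Since both timestamps come from the master's clock, the result is not affected by clock differences between the two systems. A lag of zero means the slave has applied the master's most recent checkpoint.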

The "dbctl replication stop" command may be used on a slave server to disconnect an active replication session. dba authorization is required. This is equivalent to performing a "dbctl killthread" on the TID of the respective replication session.


Starting (or re-synchronizing) a slave server

To set up a slave server, forward-logging needs to be configured on the master server.

Then a backup copy of the volume files of the master server is transferred to the slave environment. This could also be a previous backup if the forward-log files since this backup are present on the master server.

On the slave server, the server config file needs to specify Replication.Role = Slave. The slave server needs to be started.

On the master server, run the dbrepl utility to start replicating any committed changes to the slave server. After some time the slave server should become current and follow the master server closely.

Using forward-logging with a slave server

While the use of forward-logging is mandatory on the master server, it is optional on the slave server. If enabled, the slave server writes a replicated copy of the forward-logs in addition to maintaining the replicated copy of the databases. This may be helpful in disaster recovery or auditing situations; for example, when the master server is inaccessible.

In the eloqdb6 configuration file on a slave server, the only relevant parameter of the [ForwardLog] section is FwLog. Any other configuration parameters in the [ForwardLog] section (for example, FwMaxSize or EnableAudit) are ignored by the slave server because they are defined by the master server.

For slave server forward-logging, the FwLog parameter must be configured for automatic file management, that is, it must refer to a directory with sufficient disk space available and have the %N token in the filename part.
On the slave server, the sequence of the forward-log files equals the file sequence on the master server. In other words, all the information contained in the master forward-log files is copied to files with the same generation and sequence numbers on the slave server.
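
The FwLog requirement for a slave server can be expressed as a simple check. The Python sketch below is an illustration only: the function name check_slave_fwlog is hypothetical, and it inspects just the form of the value. It does not verify that the directory exists or has sufficient disk space.

```python
import posixpath

def check_slave_fwlog(fwlog):
    """Return a list of problems with a slave FwLog value (sketch only)."""
    directory, filename = posixpath.split(fwlog)
    problems = []
    if not directory:
        problems.append("FwLog should include a directory")
    # The %N token must be in the file name part, not in the directory.
    if "%N" not in filename:
        problems.append("file name part lacks the %N token")
    return problems
```

A value such as /slave/fwlog/fw-%N passes this check, while /slave/fwlog/fw.log is reported because the file name part has no %N token.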

If a replication is suspended and later resumed, the slave server locates the last checkpoint in the existing forward-log files and continues to append new replication actions at that point. Therefore, in the end the slave server forward-log files will always equal the master server files.

The dbctl forwardlog interface can be used to manually enable or disable forward-logging and to query the current forward-log status.

If forward-logging is disabled, the slave server immediately stops writing to its forward-log files. Any information that is replicated after forward-logging was disabled will never be written to the slave server forward-log files (i.e., the forward-log is interrupted at that point).

The disabled state is retained until forward-logging is manually enabled or the slave server is restarted.

When forward-logging is manually enabled the slave server forward-log will typically not be written immediately. Instead, the slave server starts to write the forward-log when a new replication segment begins (i.e., when the master server begins a new forward-log segment).

Changing roles of master and slave servers

The following procedure should be used to switch the roles of replicated servers:

  1. Stop the master server.
  2. Make sure replication has synced the most recent transaction. If the dbrepl utility is not active, consider running dbrepl with the -S option to make sure the most recent changes are replicated.
  3. Shut down the slave server.
  4. Change the master and slave configuration files (change the role from master to slave and vice versa) and restart the server processes. Make sure the former slave server has forward-logging enabled.

This procedure ensures an orderly handover and makes it possible to replicate changes from the new master server. In all other cases the new slave server must be re-synchronized from the new master server (see section above).
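
The configuration change in the last step of the procedure can be sketched in Python. This is a hypothetical helper, not an Eloquence tool: the name switch_role is an assumption, and the sketch assumes that a "Role = ..." line appears only in the [Replication] section and that an active (not commented out) "FwLog = ..." line means forward-logging is configured.

```python
import re

_ROLE_LINE = re.compile(r"^(\s*Role\s*=\s*)(Master|Slave)\s*$",
                        re.MULTILINE | re.IGNORECASE)
_FWLOG_LINE = re.compile(r"^\s*FwLog\s*=\s*\S+",
                         re.MULTILINE | re.IGNORECASE)

def switch_role(cfg_text):
    """Return cfg_text with Role flipped between Master and Slave (sketch only)."""
    match = _ROLE_LINE.search(cfg_text)
    if match is None:
        raise ValueError("no 'Role = Master' or 'Role = Slave' line found")
    new_role = "Slave" if match.group(2).lower() == "master" else "Master"
    # A former slave that becomes the master needs forward-logging.
    if new_role == "Master" and not _FWLOG_LINE.search(cfg_text):
        raise ValueError("the new master needs FwLog configured")
    # Replace only the role value and leave the rest of the file untouched.
    return cfg_text[:match.start(2)] + new_role + cfg_text[match.end(2):]
```

The check for FwLog reflects the requirement that the former slave server has forward-logging enabled before it takes over as master. Items such as RedirectWrite still need to be reviewed by hand after the switch.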

Using slave server volume files to recover the master

After the master server has failed, the recommended procedure is to change the role of the slave server to master. This allows operation to continue with minimal interruption. The former master server is eventually recovered from the former slave server and updated by replication as described above.

However, if changing server roles is not desired it is also possible to recover the master server from the slave server with the procedure described below:

  1. Make sure the dbrepl utility is not active on the master server
  2. Remove the fwlog files from the master server (or move them to a backup directory). If the master has failed, their content may not be consistent with the slave server.
  3. Copy the volume files from the slave server to the master server. The slave server may be running and on-line backup mode is not required.
  4. Start the master server process.
  5. Start dbrepl to replicate database changes to the slave.

Limitations

Any modification to a database environment that is not described through transactions will have the effect that the replication gets out of sync and any slave servers need to be re-synchronized.

This may be the result of using Eloquence off-line support utilities that could affect the database content while the server process is not active, for example:

  • dbfsck (when used to perform database low level repairs)
  • dbcfix (when used to repair database chain linkage)

Disabling forward-logging on the master server for any reason (e.g. due to lack of disk space or for administrative purposes) will have the effect that any slave servers need to be re-synchronized.

The Eloquence bimport function (for data migration) only partially uses transactions (for performance reasons, only metadata is covered by transactions). Consequently, the bimport function may not be used on a master server (this function is disabled when the server is configured as a master) because its use would require re-synchronizing any slave servers.

Any configuration change affecting the database volume files (e.g. adding a volume file or changing the volume file size limits) requires the same change to be applied to any replicated database environment. Future versions of the replication function should be able to detect this condition and stop replication unless the slave server is configured equivalently. However, the current release will result in an abort of the slave server.

In case the master server aborts (e.g. due to an internal problem or due to a system crash when SyncMode is enabled), the replication should usually be able to continue once the master server has been restarted and recovered successfully. During startup the master server (or a dblogreset) attempts to recover any possibly missing records from the transaction journal. If successful, the replication should be able to continue.

Modification of database content on a replicated server (slave server) is not allowed in any case. Using Eloquence off-line maintenance utilities (such as dbfsck in write mode) to modify the volume files content will either corrupt the slave environment or cause the replication to stop, which then requires re-synchronization.

A recovery of the master server from backup (restoring the volume files from the backup and running dbrecover) could corrupt a replicated server state or cause replication to fail unless the forward recovery continues beyond the point the slave server was most recently synchronized with. For example, restoring a backup and running dbrecover on a partial set of forward-log files and then starting the master server would have this effect.


 
 
Revision: 2010-02-16
  Copyright © 2006-2010 Marxmeier Software AG