<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Running Replication Manager in multiple processes</title>
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
<link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
<link rel="up" href="rep.html" title="Chapter 12. Berkeley DB Replication" />
<link rel="prev" href="rep_filename.html" title="Managing replication directories and files" />
<link rel="next" href="rep_replicate.html" title="Running replication using the db_replicate utility" />
</head>
<body>
<div xmlns="" class="navheader">
<div class="libver">
<p>Library Version 18.1.40</p>
</div>
<table width="100%" summary="Navigation header">
<tr>
<th colspan="3" align="center">Running Replication Manager in
multiple processes</th>
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href="rep_filename.html">Prev</a> </td>
<th width="60%" align="center">Chapter 12. Berkeley DB Replication </th>
<td width="20%" align="right"> <a accesskey="n" href="rep_replicate.html">Next</a></td>
</tr>
</table>
<hr />
</div>
<div class="sect1" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title" style="clear: both"><a id="rep_mgrmulti"></a>Running Replication Manager in
multiple processes</h2>
</div>
</div>
</div>
<div class="toc">
<dl>
<dt>
<span class="sect2">
<a href="rep_mgrmulti.html#idm140654537587456">One replication process and multiple subordinate
processes</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="rep_mgrmulti.html#idm140654537551776">Persistence of local site network address
configuration</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="rep_mgrmulti.html#idm140654537529360">Programming considerations</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="rep_mgrmulti.html#idm140654537569984">Handling failure</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="rep_mgrmulti.html#idm140654537528176">Other miscellaneous rules</a>
</span>
</dt>
</dl>
</div>
<p>
Replication Manager supports shared access to a database
environment from multiple processes or environment handles.
</p>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="idm140654537587456"></a>One replication process and multiple subordinate
processes</h3>
</div>
</div>
</div>
<p>
Each site in a replication group has just one network
address (TCP/IP host name and port number). This means
that only one process can accept incoming connections. At
least one application process must invoke the
<a href="../api_reference/C/repmgrstart.html" class="olink">DB_ENV->repmgr_start()</a> method to initiate communications and
management of the replication state. The process that does so
is called the <span class="emphasis"><em>replication process</em></span>:
it is the Replication Manager process
that is responsible for initiating and processing most
replication messages.
</p>
<p>
If it is convenient, multiple processes may issue calls
to the Replication Manager configuration methods, and
multiple processes may call <a href="../api_reference/C/repmgrstart.html" class="olink">DB_ENV->repmgr_start()</a>. Replication
Manager automatically opens the TCP/IP listening socket in
the first process to make that call, and that process becomes the
replication process. Replication Manager skips this step
in any subsequent processes, which are called <span class="emphasis"><em>subordinate
processes</em></span>.
</p>
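<p>
As a minimal sketch of how a process might find out which role it
was given, the function below starts Replication Manager on an
environment handle that has already been opened and configured. The
function name <code class="function">start_repmgr</code>, the thread count
and the start policy are illustrative assumptions. It also assumes that
<code class="methodname">DB_ENV-&gt;repmgr_start()</code> reports
<code class="literal">DB_REP_IGNORE</code> in a subordinate process;
check the API reference for your release before relying on this.
</p>
<pre class="programlisting">#include &lt;stdio.h&gt;
#include &lt;db.h&gt;

/*
 * Start Replication Manager on an opened, configured environment
 * handle and report which role this process received.
 * Returns 0 for either role, or a Berkeley DB error code.
 */
int
start_repmgr(DB_ENV *dbenv)
{
    int ret;

    /* Illustrative choices: 3 message threads, election at startup. */
    ret = dbenv-&gt;repmgr_start(dbenv, 3, DB_REP_ELECTION);
    if (ret == 0) {
        printf("this process is the replication process\n");
        return (0);
    }
    if (ret == DB_REP_IGNORE) {
        /* Assumption: another process already owns the listener. */
        printf("this process is a subordinate process\n");
        return (0);
    }
    dbenv-&gt;err(dbenv, ret, "DB_ENV-&gt;repmgr_start");
    return (ret);
}</pre>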
<p>
If the replication process quits and there are one or
more subordinate processes available, one subordinate
process will automatically take over as the replication
process. During an automatic takeover on the master site,
the rest of the replication group briefly delays elections
to prevent an unnecessary change of master. Automatic
takeover does not occur if the replication process fails
unexpectedly. Automatic takeover is most reliable when
<a href="../api_reference/C/repset_timeout.html#set_timeout_DB_REP_ACK_TIMEOUT" class="olink">DB_REP_ACK_TIMEOUT</a> is set to the same value on all sites in the
replication group.
</p>
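<p>
One way to keep the acknowledgement timeout identical across sites is
to set it from a single shared constant, as in the sketch below. The
constant <code class="literal">APP_ACK_TIMEOUT_USEC</code>, its value and the
function name are hypothetical; the timeout is expressed in
microseconds.
</p>
<pre class="programlisting">#include &lt;db.h&gt;

/* Hypothetical value shared by every site's configuration: 500 ms. */
#define APP_ACK_TIMEOUT_USEC 500000

/* Apply the shared acknowledgement timeout to this site's handle. */
int
configure_ack_timeout(DB_ENV *dbenv)
{
    return (dbenv-&gt;rep_set_timeout(dbenv,
        DB_REP_ACK_TIMEOUT, APP_ACK_TIMEOUT_USEC));
}</pre>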
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="idm140654537551776"></a>Persistence of local site network address
configuration</h3>
</div>
</div>
</div>
<p>
The local site network address is stored in shared
memory, and remains intact even when all processes close
their environment handles gracefully and terminate. A
process that opens an environment handle without running
recovery automatically inherits the existing local site
network address configuration. Such a process may not
change the local site address (although it is allowed to
redundantly specify a local site configuration matching
that which is already in effect).
</p>
<p>
In order to change the local site network address, the
application must run recovery. The application can then
specify a new local site address before restarting
Replication Manager. The application should also remove
the old local site address from the replication group if
it is no longer needed.
</p>
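<p>
The two helper functions below sketch that sequence under stated
assumptions. <code class="function">set_local_site</code> is meant to be
called before <code class="methodname">DB_ENV-&gt;repmgr_start()</code>, in
the process that runs recovery, and
<code class="function">remove_old_site</code> is meant to be called once
Replication Manager is running again. Both function names and their
parameters are hypothetical, and any other group membership
configuration the application needs is omitted.
</p>
<pre class="programlisting">#include &lt;db.h&gt;

/* Declare host:port as this environment's local site. */
int
set_local_site(DB_ENV *dbenv, const char *host, u_int port)
{
    DB_SITE *site;
    int ret, t_ret;

    if ((ret = dbenv-&gt;repmgr_site(dbenv, host, port, &amp;site, 0)) != 0)
        return (ret);
    ret = site-&gt;set_config(site, DB_LOCAL_SITE, 1);
    /* Always release the site handle, keeping the first error. */
    if ((t_ret = site-&gt;close(site)) != 0 &amp;&amp; ret == 0)
        ret = t_ret;
    return (ret);
}

/* Ask the replication group to drop the address that is no longer used. */
int
remove_old_site(DB_ENV *dbenv, const char *old_host, u_int old_port)
{
    DB_SITE *old_site;
    int ret;

    if ((ret = dbenv-&gt;repmgr_site(dbenv,
        old_host, old_port, &amp;old_site, 0)) != 0)
        return (ret);
    /* Do not touch old_site again after remove(), whatever it returns. */
    return (old_site-&gt;remove(old_site));
}</pre>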
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="idm140654537529360"></a>Programming considerations</h3>
</div>
</div>
</div>
<p>
Note that Replication Manager applications must follow
all the usual rules for Berkeley DB multi-threaded and/or
multi-process applications, such as ensuring that
recovery runs single-threaded, only once,
before any other threads or processes operate in the
environment. Because Replication Manager creates its own
background threads which operate on the environment, all
environment handles must be opened with the <a href="../api_reference/C/dbopen.html#open_DB_THREAD" class="olink">DB_THREAD</a>
flag, even if the application is otherwise single-threaded
per process.
</p>
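<p>
A minimal sketch of opening an environment handle for a Replication
Manager process follows. The function name
<code class="function">open_rep_env</code> and the
<code class="literal">run_recovery</code> parameter are illustrative; the
caller is assumed to pass a non-zero value in exactly one process,
before any other thread or process uses the environment.
</p>
<pre class="programlisting">#include &lt;db.h&gt;

/*
 * Create and open an environment handle suitable for Replication
 * Manager. On success, *dbenvp holds the open handle.
 */
int
open_rep_env(DB_ENV **dbenvp, const char *home, int run_recovery)
{
    DB_ENV *dbenv;
    u_int32_t flags;
    int ret;

    *dbenvp = NULL;
    if ((ret = db_env_create(&amp;dbenv, 0)) != 0)
        return (ret);

    /* DB_THREAD is needed even if the application has one thread. */
    flags = DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL |
        DB_INIT_REP | DB_INIT_TXN | DB_THREAD;
    if (run_recovery)
        flags |= DB_RECOVER;

    if ((ret = dbenv-&gt;open(dbenv, home, flags, 0)) != 0) {
        /* A handle must be closed even when open fails. */
        (void)dbenv-&gt;close(dbenv, 0);
        return (ret);
    }
    *dbenvp = dbenv;
    return (0);
}</pre>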
<p>
At the replication master site, each Replication
Manager process opens outgoing TCP/IP connections to all
clients in the replication group. It uses these direct
connections to send to clients any log records resulting
from update transactions that the process executes. All
other replication activity (message processing,
elections, and so on) takes place only in the
replication process.
</p>
<p>
Replication Manager notifies the application of certain
events, using the callback function configured with the
<a href="../api_reference/C/envevent_notify.html" class="olink">DB_ENV->set_event_notify()</a> method. These notifications occur only
in the process where the event itself occurred. Generally
this means that most notifications occur only in the
replication process. Currently only two
replication notifications can occur in a subordinate
process:
</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>
<a href="../api_reference/C/envevent_notify.html#event_notify_DB_EVENT_REP_PERM_FAILED" class="olink">DB_EVENT_REP_PERM_FAILED</a>
</p>
</li>
<li>
<p>
<a href="../api_reference/C/envevent_notify.html#event_notify_DB_EVENT_REP_AUTOTAKEOVER_FAILED" class="olink">DB_EVENT_REP_AUTOTAKEOVER_FAILED</a>
</p>
</li>
</ol>
</div>
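<p>
Because these two events can arrive in any process, every process
would typically install a callback that handles them. The sketch
below shows one possible shape; the function names and the messages
are illustrative, and a real application would decide for itself how
to react to each event.
</p>
<pre class="programlisting">#include &lt;stdio.h&gt;
#include &lt;db.h&gt;

/* Handle the replication events that a subordinate process can see. */
static void
event_callback(DB_ENV *dbenv, u_int32_t which, void *info)
{
    (void)dbenv;
    (void)info;

    switch (which) {
    case DB_EVENT_REP_PERM_FAILED:
        fprintf(stderr, "insufficient acknowledgements received\n");
        break;
    case DB_EVENT_REP_AUTOTAKEOVER_FAILED:
        fprintf(stderr, "automatic takeover failed\n");
        break;
    default:
        /* Other events are ignored in this sketch. */
        break;
    }
}

/* Register the callback on an environment handle. */
int
install_event_callback(DB_ENV *dbenv)
{
    return (dbenv-&gt;set_event_notify(dbenv, event_callback));
}</pre>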
<p>
Spawning a subprocess from a process that is running Replication
Manager is not supported. In an application where
a parent process spawns child subprocesses, Replication
Manager can run only in the spawned child subprocesses.
The following operations do not run Replication Manager
and can safely be performed in the parent process: opening
or closing an environment handle, running <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a>,
and running recovery. The following operations can implicitly
start a Replication Manager thread and should be performed only
in a child subprocess: opening or closing a database handle,
and performing transactions, gets, puts, or checkpoints.
</p>
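<p>
The division of work between parent and children can be sketched as
follows for a POSIX system. The parent only runs recovery and then
spawns children; everything that might start Replication Manager is
left to a caller-supplied <code class="function">child_main</code>
function. All names here are hypothetical, and error handling is
reduced to the minimum.
</p>
<pre class="programlisting">#include &lt;sys/types.h&gt;
#include &lt;sys/wait.h&gt;
#include &lt;errno.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
#include &lt;db.h&gt;

#define ENV_FLAGS (DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | \
    DB_INIT_MPOOL | DB_INIT_REP | DB_INIT_TXN | DB_THREAD)

/* Parent-side work: run recovery, then close the handle again. */
static int
parent_recover(const char *home)
{
    DB_ENV *dbenv;
    int ret, t_ret;

    if ((ret = db_env_create(&amp;dbenv, 0)) != 0)
        return (ret);
    ret = dbenv-&gt;open(dbenv, home, ENV_FLAGS | DB_RECOVER, 0);
    if ((t_ret = dbenv-&gt;close(dbenv, 0)) != 0 &amp;&amp; ret == 0)
        ret = t_ret;
    return (ret);
}

/*
 * Recover in the parent, then run child_main(home) in each child.
 * child_main is where database handles, transactions and
 * DB_ENV-&gt;repmgr_start() belong.
 */
int
run_with_children(const char *home, int nchildren,
    int (*child_main)(const char *))
{
    pid_t pid;
    int i, ret, status;

    if ((ret = parent_recover(home)) != 0)
        return (ret);
    for (i = 0; i &lt; nchildren; i++) {
        if ((pid = fork()) == -1) {
            ret = errno;
            break;
        }
        if (pid == 0)
            _exit(child_main(home) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
    }
    /* Wait for every child that was started. */
    while (wait(&amp;status) &gt; 0)
        ;
    return (ret);
}</pre>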
<p>
If your application is a group of related processes managed
by a single "watcher" process (as described in
<a class="xref" href="transapp_app.html" title="Architecting Transactional Data Store applications">Architecting Transactional Data
Store applications</a>),
the "watcher" process is the parent process and is subject to
the same restrictions.
</p>
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="idm140654537569984"></a>Handling failure</h3>
</div>
</div>
</div>
<p>
Multi-process Replication Manager applications should
handle failures in a manner consistent with the rules
described in <a class="xref" href="transapp_fail.html" title="Handling failure in Transactional Data Store applications">Handling failure in Transactional Data Store applications</a>. To summarize, there
are two ways to handle failure of a process:
</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>
The simple way is to kill all remaining
processes, run recovery, and then restart all
processes from the beginning. But this can be a
bit drastic.
</p>
</li>
<li>
<p>
Using the <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> method, it is sometimes
possible to leave surviving processes running, and
just restart the failed process.
</p>
<p>
Multi-process Replication Manager applications
using this technique must start a new process when
an old process fails. A subordinate process cannot
automatically take over as the replication process
if the previous replication process failed. If the
failed process happens to be the replication
process, then after a
<code class="methodname">failchk()</code> call, the
next process to call <a href="../api_reference/C/repmgrstart.html" class="olink">DB_ENV->repmgr_start()</a> will become the
new replication process.
</p>
</li>
</ul>
</div>
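<p>
The second approach can be sketched as below, assuming a POSIX system
and one Berkeley DB process per operating system process. The liveness
test looks only at the process ID and ignores the thread ID and flags,
which is a simplification; the thread count of 64 and the function
names are illustrative.
</p>
<pre class="programlisting">#include &lt;sys/types.h&gt;
#include &lt;errno.h&gt;
#include &lt;signal.h&gt;
#include &lt;db.h&gt;

/* Report whether the process that owns a Berkeley DB thread still exists. */
static int
is_alive(DB_ENV *dbenv, pid_t pid, db_threadid_t tid, u_int32_t flags)
{
    (void)dbenv;
    (void)tid;
    (void)flags;

    /* Signal 0 tests for existence; only ESRCH means "gone". */
    if (kill(pid, 0) == 0 || errno != ESRCH)
        return (1);
    return (0);
}

/* Call before DB_ENV-&gt;open() in every process. */
int
configure_failchk(DB_ENV *dbenv)
{
    int ret;

    if ((ret = dbenv-&gt;set_thread_count(dbenv, 64)) != 0)
        return (ret);
    return (dbenv-&gt;set_isalive(dbenv, is_alive));
}

/*
 * Call from a surviving (or watcher) process after a process dies.
 * A return of DB_RUNRECOVERY means the simple approach is required:
 * stop all processes and run recovery.
 */
int
check_after_failure(DB_ENV *dbenv)
{
    return (dbenv-&gt;failchk(dbenv, 0));
}</pre>
<p>
If <code class="function">check_after_failure</code> returns 0, the
application can start a replacement process; when the failed process
was the replication process, the next call to
<code class="methodname">DB_ENV-&gt;repmgr_start()</code> determines the
new replication process, as described above.
</p>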
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="idm140654537528176"></a>Other miscellaneous rules</h3>
</div>
</div>
</div>
<div class="orderedlist">
<ol type="1">
<li>
<p>
A database environment may not be shared
between a Replication Manager application process
and a Base API application process.
</p>
</li>
<li>
<p>
It is not possible to run multiple Replication
Manager processes during mixed-version live
upgrades from Berkeley DB versions prior to 4.8.
</p>
</li>
</ol>
</div>
</div>
</div>
<div class="navfooter">
<hr />
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left"><a accesskey="p" href="rep_filename.html">Prev</a> </td>
<td width="20%" align="center">
<a accesskey="u" href="rep.html">Up</a>
</td>
<td width="40%" align="right"> <a accesskey="n" href="rep_replicate.html">Next</a></td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Managing replication directories and files </td>
<td width="20%" align="center">
<a accesskey="h" href="index.html">Home</a>
</td>
<td width="40%" align="right" valign="top"> Running replication using the
db_replicate utility</td>
</tr>
</table>
</div>
</body>
</html>