jrepnotes.txt #10

p4jrep Release Notes
Version 0.91 (beta)

p4jrep is not supported by Perforce Software. p4jrep is a skunkworks project
authored by Michael Shields, who happens to be employed by Perforce Software
as the Performance Lab Manager. p4jrep was developed on my own time, and when
possible, will be supported and enhanced on my own time. If you have a problem
or suggestion regarding p4jrep, please email me at mshields@perforce.com.

p4jrep is provided "as is", without a warranty of any kind. All express or
implied representations and warranties, including any implied warranty of
merchantability, fitness for a particular purpose, or non-infringement,
are hereby excluded.

Definition of Terms
-------------------

source server: the server from which the live journal is replicated.

target server: the server into which the source server's journal is replayed.
The target server must not modify versioned files, that is, submits
must not occur in the target server.

user starting the source server: the user that owns the parent p4d process
of the source server.

user starting the target server: the user that owns the parent p4d process
of the target server. The user starting the target server needs to be
different from the user starting the source server to ensure that
the target server does not modify the versioned files.

source server's live journal: the journal that is actively being used by
the source server.

source server's rotated journal: the journal into which the source server's
live journal is renamed or copied as a result of executing p4d -jc,
p4d -jj, p4 admin checkpoint, or p4 admin journal.

Initial Setup
-------------

1. Ensure that the source server already exists and is functioning correctly,
including writing journal records as the metadata is updated.

2. Change the "Map" field of all local depots to use absolute paths rather
than relative paths. An absolute path in a depot definition
appears as:

$ p4 depot -o depot
...
Type: local

Map: /p4roots/source/depot/...

3. Ensure that all the versioned files are owned by the UID.GID of the user
starting the source server, and that all the versioned files have a
permissions mask of 644 or 755 (directories in the versioned files
tree should have a permissions mask of 755).

4. chmod the directory containing the source server's live journal so that
the directory is writable by the user starting the target server.

5. chmod the directory containing the source server's rotated journals so that
the directory is readable by the user starting the target server.

6. Make a checkpoint of the source server.

7. Log on as the user starting the target server.

8. Create the target server by replaying (using p4d -jr) the checkpoint
created in step 6. The source server and the target server
must run the same release of the server. The server build must
be 2002.2/48374 (or later build), 2003.1/48379 (or later build),
or any 2003.2 (or later release) build.

9. Start the target server.
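
The permission requirement in step 3 can be checked before the target
server is created. The following is a minimal sketch and is not part of
p4jrep; the function name check_versioned_modes and the path in the usage
line are hypothetical. It lists directories whose mode is not 755 and files
whose mode is neither 644 nor 755:

```shell
#!/bin/sh
# Hypothetical helper (not part of p4jrep): print every entry under the
# given versioned-files tree whose mode differs from what step 3 asks for.
check_versioned_modes() {
    # Directories should be exactly 755.
    find "$1" -type d ! -perm 755 -print
    # Files should be exactly 644 or 755.
    find "$1" -type f ! -perm 644 ! -perm 755 -print
}
```

For example, check_versioned_modes /p4roots/source/depot prints nothing when
every file and directory already has an acceptable mode. Ownership could be
checked in a similar way with find's -user and -group tests.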

Starting p4jrep
---------------

p4jrep must be started by the user starting the target server.

In the simplest case, the source server's live journal is named "journal" and
no prefix is used when rotating the journal. For this case, cd to the directory
where the source server's live journal is located and execute:

p4jrep p4d -r <target-root> -f -jr

The above assumes that both the correct p4jrep and p4d can be located using
the shell's path mechanism. If they cannot be located, provide the absolute
path for either or both as required.

The p4d in the p4jrep command line must be the same release of the server as
the source server and the target server. The server build of the p4d in the
p4jrep command line must be 2002.2/48374 (or later build), 2003.1/48379
(or later build), or any 2003.2 (or later release) build.

The -f flag is required so that the p4d -jr will continue when processing
a journal delete record for a record that has already been deleted from the
target server. This can occur, for example, when a client is deleted from
the target server and then later deleted from the source server.

Note that nothing follows the -jr flag; p4jrep provides the rest.

For example:

% cd /p4roots/source
% p4jrep p4d -r /p4roots/target -f -jr
Recovering from -...

As work is performed in the source server, p4jrep will, as needed, exec
p4d -r <target-root> -f -jr <journal-fragment>, resulting in the output:

Recovering from -...
Recovering from -...

p4jrep records in the <journal>.position file the offset following the last
journal entry successfully replicated from the source server to the target
server. Using this offset, p4jrep can be restarted, and will begin
replicating from where it left off.

Messages from p4jrep and the command executed by p4jrep can be captured in a
log file using the -L flag. For example:

% p4jrep -L log.p4jrep p4d -r /p4roots/target -f -jr

The log file is not held open by p4jrep and can therefore be rotated using
appropriate OS commands. Note that p4jrep's -L flag is separate from any
-L flag specified in the command executed by p4jrep.
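
Because the log file is not held open, renaming it is enough to rotate it.
The following is a minimal sketch. It assumes that p4jrep creates a new log
file under the original name the next time it writes a message, which has
not been verified here. The function name rotate_p4jrep_log is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: rename the p4jrep log with a timestamp suffix.
rotate_p4jrep_log() {
    log=$1
    # Nothing to rotate if the log does not exist or is empty.
    [ -s "$log" ] || return 0
    mv "$log" "$log.$(date +%Y%m%d%H%M%S)"
}
```

For example, rotate_p4jrep_log log.p4jrep could be run from cron in the
directory where p4jrep was started.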

If the source server's live journal is named something other than "journal",
or if you want to specify the source server's live journal as an absolute
path (which removes the need to change directory before starting p4jrep),
use p4jrep's -J flag. For example:

% p4jrep -J /p4roots/source/journal p4d -r /p4roots/target -f -jr

If the rotated journals are in a different directory than the source server's
live journal, use p4jrep's -j flag to specify the same prefix that is used
with p4d -jc, p4d -jj, p4 admin checkpoint, or p4 admin journal. This
will allow p4jrep to correctly replicate the tail of the journal when
the journal is rotated. For example, if a prefix of /p4backups/source
is used when rotating the journal, as in:

% cd /p4roots/source
% p4d -r /p4roots/source -jc /p4backups/source

then p4jrep should be started using the -j flag, as in:

% p4jrep -j /p4backups/source p4d -r /p4roots/target -f -jr

p4jrep should always be running when the source server's live journal is
rotated. If it is not running when the journal is rotated, the tail of the
rotated journal may not be correctly replicated to the target server.

For other options, see p4jrep -h.

Interesting Configurations
--------------------------

By default, p4jrep replicates everything. It's therefore possible for a client
on the target server to do things such as sync to the have list of a client on
the source server by executing p4 sync @<client-name>, where <client-name> is
maintained on the source server.

But if, for example, there is no need to reference the have list of clients on
the source server, the db.have journal entries from the source server can be
filtered out. This can be done by providing a script to p4jrep to filter out
journal entries. An example of such a script that filters out db.have journal
entries, and also filters out the journal entries for locked files, pending
integrations, and opened files on the source server, is as follows:

% cat egrepfilter.sh
#!/bin/sh

# Filter out named tables by excluding the journal entries.

egrep -v ' @db\.have@ | @db\.locks@ | @db\.resolve@ | @db\.working@ ' $1 | p4d -r /p4roots/target -f -jr -

The egrepfilter.sh script can be obtained from here:

ftp://public.perforce.com/guest/michael_shields/src/p4jrep/egrepfilter.sh

The script to filter out the journal entries can be used by starting
p4jrep as:

% p4jrep egrepfilter.sh

Journal entries for certain tables can span multiple journal records. Scripts
that either filter out or only replicate these tables must take this into
account. The tables with journal entries that can span multiple journal
records are:

db.bodtext db.changex db.domain
db.change db.desc db.job

An example of a script that takes into account that the journal entries can
span multiple journal records is awkfilter.sh, which only replicates
changelists from the source server. The awkfilter.sh script can be
obtained from here:

ftp://public.perforce.com/guest/michael_shields/src/p4jrep/awkfilter.sh

Some tables, if not replicated, can be deleted from the target server's root
following the initial load of the target server from the source server's
checkpoint. However, the following tables should not be deleted from
the target server's root:

db.counters db.domain
db.depot db.message

Limitations in this Release
---------------------------

p4jrep should always be running when the source server's live journal is
rotated. This release does not detect a journal rotation that occurred
while p4jrep was not running.

Limitations by Design
---------------------

The check for journal rotation first determines if the live journal's inode
was changed, which occurs if the journal was rotated by the source server
renaming the live journal. If the live journal's inode was not changed, the
check for journal rotation then compares the current size of the live journal
with the position last read from the journal. If the current size is less than
the position last read, then the journal was rotated by the source server
copying and truncating the live journal. These checks for journal rotation
should be sufficient for most production environments, but might not catch two
journal rotations by the source server copying and truncating the live journal
with nothing written to the journal between the two rotations.
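
The two checks can be illustrated in shell. This is a sketch of the logic
only, not p4jrep's implementation; the function names, and the idea of
passing in a previously saved inode number and position, are assumptions
made for the example:

```shell
#!/bin/sh
# Hypothetical helpers illustrating the rotation checks described above.

# Print the inode number of a file.
journal_inode() {
    ls -i "$1" | awk '{ print $1 }'
}

# Print the size of a file in bytes, without padding.
journal_size() {
    echo $(( $(wc -c < "$1") ))
}

# Print "renamed" if the inode changed, "truncated" if the journal is now
# smaller than the position last read, and "none" otherwise.
check_rotation() {
    journal=$1
    last_inode=$2
    last_pos=$3
    if [ "$(journal_inode "$journal")" != "$last_inode" ]; then
        echo renamed
    elif [ "$(journal_size "$journal")" -lt "$last_pos" ]; then
        echo truncated
    else
        echo none
    fi
}
```

With a position of 0 and an empty journal, check_rotation prints "none",
which mirrors the undetected case of two copy-and-truncate rotations with
nothing written between them.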

Since p4jrep replicates the tail of the source server's live journal after
the journal has been rotated based upon the offset following the last journal
entry replicated, the rotated journal needs to be an exact copy of what was
the source server's live journal. Therefore, the -z flag cannot be used on
commands that rotate the journal. The rotated journal can be compressed
using the gzip utility once p4jrep has finished replicating the tail
of the rotated journal.

Bidirectional Replication
-------------------------

Bidirectional replication can result in corruption of versioned files.
Do not attempt to replicate bidirectionally.

#	Change	User	Description
#12	7886	Michael Shields	Correctly handle 2010.2 journals, which have an @nx@ record as the first record in the journal after the journal has been rotated.
#11	7598	Michael Shields	Add note pointing to 'p4 replicate'.
#10	6439	Michael Shields	Updating for the 2008.1 p4d. The format of the @vv@ record changed in the 2008.1 p4d. The @vv@ record is the one and only journal record the format of which is important to p4jrep. This p4jrep can also be used with prior releases of p4d. Version string: p4jrep Version 0.91 (beta)
#9	6256	Michael Shields	Additional tidbits to bring these current as of the 2007.3 release.
#8	6155	Michael Shields	Updated for the 2007.3 release while maintaining compatibility with prior releases. 2007.3 and later servers might rotate the journal by renaming it rather than copying and truncating it. A renamed journal is now detected by comparing the device and inode returned from statting by the journal's file name and statting by the journal's file descriptor. This algorithm (suggested by J.T. Goldstone; thanks J.T.!) is faster than reopening the journal and seeking if the journal was not rotated (~2.0 seconds vs. ~2.7 seconds for 1,000,000 iterations on my laptop).
#7	5379	Michael Shields	Fix regression introduced in version 0.86. The regression was a side-effect of functionality added to keep the pipe open to the command executed by p4jrep. The regression caused the journal position to be erroneously updated when the forked process exited with a non-zero exit code. This could result in journal entries missed when restarting replication following a failure of the command executed by p4jrep. Credit (and my thanks) goes to Brian Moyers for finding and diagnosing the regression, and coding and testing the fix.
#6	5119	Michael Shields	Added -t delay in an attempt to ensure transactional atomicity. If no additional journal entries are written during this delay, the transaction is assumed complete, which closes the pipe, which terminates the command (which if includes a p4d -jr, releases the locks in the target server, allowing commands access to the replicated atomic transaction). By default, this delay is ten milliseconds. Locking the journal does not ensure transactional atomicity since the server locks the journal once for each journal entry written, not once per transaction. And we would like to avoid locking the journal since that would introduce a potential concurrency problem. Not all operations in the server are atomic transactions and therefore cannot be replicated atomically. For example, updating a client's have list as files are being synced is not an atomic transaction. But committing a submit is an atomic transaction, and this change (with perhaps some site-specific tuning of the -t delay) attempts to replicate the commit atomically.
#5	4839	Michael Shields	Pushing p4jrep source into the public depot.
#4	4275	Michael Shields	p4jrep Version 0.85 (beta) and cpipe Version 0.77 (beta) for solaris8sparc (sunultra).
#3	3626	Michael Shields	p4d used w/ p4jrep must be 2002.2/48374 (or later build) or 2003.1/48379 (or later build).
#2	3621	Michael Shields	p4jrep Version 0.80 (beta) and cpipe Version 0.76 (beta) for solaris26sparc (shucks).
#1	2443	Michael Shields	beta p4jrep and cpipe. See jrepnotes.txt for details.