cvs2p4 2.0	May 6, 2002

Release 2.0 of cvs2p4 includes a radically different approach to
importing CVS history into Perforce, in order to provide a much faster
conversion process. cvs2p4 1.x releases put data into Perforce by
"replaying" all of the CVS changes into a live Perforce server.  This
newer version works by directly generating Perforce metadata, and
linking (or copying) the RCS archives from the CVS repository
directory directly into the Perforce file archive.

If you have problems with this version, you can still get a copy of
the older (and slower, but time-tested) version at

  ftp://ftp.perforce.com/perforce/utils/cvs2p4/cvs2p4-1.3.3.tar


==== INTRODUCTION    

This small set of tools provides a means for importing a CVS module
into Perforce.

It was originally developed for use at Network Appliance, to convert
our product source code revision history from CVS into Perforce.

As such it sprouted some NetApp-specific features suited to our
special needs, but I have made an attempt to make these unobtrusive to
the general user.

Basically, it is patterned at a high level after the PVCS to
Perforce converter available on the Perforce web site, doing the
following steps during a conversion:

  - Scans the CVS repository to generate a metadata file;

  - Scans the metadata file to identify groups of RCS revisions
    that comprise Perforce changes;

  - Imports the revisions/log history into a Perforce depot, by
    directly generating Perforce metadata in "journal" format
    (driven by the output of the previous phase);

  - Finally (and optionally), generates a map of RCS revisions and
    the Perforce changes they belong to.

cvs2p4 tries to make the resultant Perforce depot look as if the work in
CVS had been going on in Perforce. In particular, it attempts to
create changes corresponding to the whole creation of new branches a
la

  p4 integrate //depot/branchA/... //depot/branchB/...

This is in contrast to rcstoperf.sh, which scattered the "integrates"
corresponding to the creation of files on new branches into many
changes (basically, according to when the file was actually first
changed in the new branch).

cvs2p4 also allows you to import only selected branches, and/or to map
some branch other than the CVS trunk to become the new "main"
branch in Perforce. See the notes in the template config file
("test/config") for more information on these features.

As of version 1.3, cvs2p4 will also import CVS symbolic version tags.

Note: A CVS tagged revision will make it into Perforce labels ONLY
when the revision is in fact present in the converted depot, subject
to the branches selected for import. (See the notes for the
"WANTLINES" variable in the config file).


==== MANIFEST

After unpacking the distribution archive, use the MANIFEST script to
verify that you have all of the pieces.  The output should go
something like this:

$ MANIFEST  
  MANIFEST
  Artistic
  README
  NEWS
  bin/genmetadata
  bin/genchanges
  bin/dochanges
  bin/dolabels
  bin/revmap
  lib/util.pl
  test/file,v
  test/dollar$file,v
  test/space file,v
  test/config
  test/runtest
  test/norm
  test/metadata.good
  test/lines.good
  test/changes.good
  test/p4_changes_-l.good
  test/p4_describe.good
  test/p4_filesat.good
  test/p4_labels.good

All ok


==== REQUIREMENTS    

This stuff should work on any Unix host that supports:

  - Perl 5.x, with working dbm support (i.e., dbmopen()/dbmclose()
    work). The scripts assume that perl will be found via $PATH. It
    must be a perl5! Some people have reported problems that seem to
    be related to dbm limitations with some perls when converting very
    large repositories. I like implementations based on Berkeley-DB.

  - Perforce release 2002.1.

    Later Perforce releases may work, but since this script generates
    journal-format metadata directly, it may need to be changed in order
    to work correctly with other Perforce releases.
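
Before converting, it can be worth confirming which Perforce release
is installed. The following is a minimal sketch, not part of cvs2p4.
It assumes that "p4d -V" (or "p4 -V") prints a line of the form
"Rev. P4D/<platform>/<release>/<change> (<date>)."; the function names
p4_release and release_ok are hypothetical.

```shell
# p4_release: read `p4d -V` (or `p4 -V`) output on stdin and print the
# release field of the "Rev." line (assumed layout, see above).
p4_release() {
  awk -F/ '/^Rev\. / { print $3; exit }'
}

# release_ok WANTED: succeed only if stdin reports exactly that release.
release_ok() {
  [ "$(p4_release)" = "$1" ]
}
```

For example, "/usr/local/bin/p4d -V | release_ok 2002.1" succeeds only
when the server binary reports release 2002.1.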


==== WHAT IT DOES

This converter will import a CVS module into Perforce, preserving the
branching structure seen in the RCS ,v files in the CVS repository, and
translating it into Perforce branches within the depot. As it
stands, it will only import RCS branches up through the highest
numbered revisions on branches that have branch tags referring to
them; thus, it will not necessarily bring *every* revision in the CVS
module into Perforce, but *will* bring in every revision leading up to
the current revision for every branch it imports. I think this is what
most people will want; if not, hack away.

Like the "rcstoperf.sh" converter available on the Perforce web site,
it applies heuristics to try and identify multiple changes in CVS that
are highly likely to comprise what would be seen as a single change in
Perforce, and makes them appear as a single Perforce change. (The
heuristics are: checked in by the same user, proximal in time, and
bearing an identical log message).
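
The grouping idea can be illustrated with a short sketch. This is not
the code in bin/genchanges. It assumes a hypothetical tab-separated
input (epoch seconds, user, log message, file), sorted by time, and a
caller-supplied time window in seconds; the name group_changes is made
up for this example.

```shell
# group_changes WINDOW < revisions.tsv
# Input (hypothetical): epoch<TAB>user<TAB>logmsg<TAB>file, sorted by epoch.
# Output: change-number<TAB>user<TAB>file
# A revision joins the current group for its (user, logmsg) pair when it
# falls within WINDOW seconds of the previous revision in that group;
# otherwise it starts a new change.
group_changes() {
  awk -F '\t' -v window="$1" '
    {
      key = $2 SUBSEP $3
      if (!(key in last) || $1 - last[key] > window) {
        group[key] = ++n          # start a new change for this key
      }
      last[key] = $1              # remember the latest checkin time
      print group[key] "\t" $2 "\t" $4
    }'
}
```

With a window of 300, two checkins by the same user with the same log
message 30 seconds apart land in one change, while a checkin with that
message made much later, or by another user, gets a change of its own.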

It deals correctly with files that are dead on the CVS trunk (i.e.,
where the RCS ,v files are in the "Attic/").

The converter attempts to leave converted files in Perforce with a
sensible Perforce file type (See `p4 help filetypes` for a description
of file types in Perforce) after the conversion. However, due to
limitations in RCS's notion of "file type" (the -k options,
controlling keyword expansion), cvs2p4 must currently decide to import
all "text" files as Perforce type "text" (text with no keyword
expansion) or "ktext" (text with keyword expansion). This is
controlled by the "$KTEXT" configuration option, which is on by
default.

Also note that binary files will be converted to Perforce type
"binary+D"; the (unusual) "+D" is there because the converter works by
using the existing RCS archive files directly; normally in Perforce,
filetype "binary" implies storage of complete revisions, rather than
as RCS archives. Rest assured that "binary+D" is correct.

The "UI" for the converter is not very slick, but for most people it's
a one-time kind of tool anyway. Feel free to improve it if you are so
inclined.

While I am currently a Perforce employee, please understand that this
is *not* presently officially supported by Perforce. It is supplied in
hopes that somebody will find it useful (Or perhaps only entertaining
:-).


==== TESTING

I have included a *very* rudimentary automated test "suite", in the
test/ directory. You can use this to verify that it seems to work in
your environment.

To run it:

  1. Edit test/config, and change the lines

       # p4 command location (If other than "/usr/local/bin/p4")
       #
       $P4             = "/usr/local/bin/p4";
       
       # p4 command location (If other than "/usr/local/bin/p4d")
       #
       $P4D            = "/usr/local/bin/p4d";
       
       # Perforce server we're using.
       #
       $P4PORT = "localhost:1680";
       
     to reflect the actual location of your "p4" and "p4d" commands,
     and the server port that you are using.

     *** Note: Previous versions of this tool allowed you to run the
     Perforce server on a different host than the one where the
     conversion tools were run. This is no longer the case; thus you
     should probably never change the "localhost" part of the P4PORT
     configuration setting, above.
     
  2. Run the tests with

       test/runtest

     This should run all of the conversion scripts on the test CVS
     module (just a few files), and then verify a few things by
     querying the Perforce server after the conversion is complete.

     If everything goes well, the end of the output should be

       runtest: ok

In this version, the converted CVS "module" consists of a very few files,
but it does have a carefully constructed branching structure, intended
to verify that the converter does the right stuff with respect to
branching.
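
When the test is run unattended, the check on the last line of output
can be scripted. A minimal sketch follows, assuming the final line is
exactly "runtest: ok" with no leading whitespace; the function name
run_selftest and the default log name runtest.log are hypothetical.

```shell
# run_selftest RUNTEST_CMD [LOGFILE]
# Runs the test driver, keeps its full output in LOGFILE, and succeeds
# only if the last line of that output is "runtest: ok".
run_selftest() {
  log=${2:-runtest.log}
  "$1" > "$log" 2>&1
  [ "$(tail -n 1 "$log")" = "runtest: ok" ]
}
```

For example: run_selftest test/runtest || echo "see runtest.log".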


==== USAGE

1. Make a directory to hold all the working files for the conversion,
   and create a config file, starting with test/config as a template:

     $ mkdir convdir; cp test/config convdir

   Edit the convdir/config file to reflect your locale and
   intent. (See the comments in the config file).

2. Run bin/genmetadata:

   It takes a single argument - the name of the directory where the
   "config" file resides. (It will create all intermediate, temp, and
   working files under this directory.)

     $ bin/genmetadata convdir
     genmetadata: rm -rf convdir/logmsgs.dir convdir/logmsgs.pag ...
     .
     . (filenames of each file in the CVS module, as they are scanned)
     .
     ===== Lines referenced:
     chupa
     curly
     ha         <- a list of branch tags encountered in the scan;
     larry         also saved to convdir/lines.
     shemp
     xxx

   This reads convdir/config to get its marching orders, then scans the
   CVS module for all ,v and Attic/,v files, creating:

     convdir/metadata      <- the extracted RCS/CVS metadata
     convdir/logmsgs.pag   <- An ndbm database
     convdir/logmsgs.dir   <-   of the log messages
     convdir/lines         <- A list of "codelines" (== branch tags)

   At this point, you may want to look at the list of branch tags
   encountered, (which was written to convdir/lines), edit the config
   file, setting WANTLINES to 1, and filling in the "<<LINES" here
   file with the names of the branches you want to import to Perforce;
   then, rerun bin/genmetadata to rescan and pick up only those
   revisions you care about.

   
4. Run bin/genchanges:

   Again, this takes a single argument - the name of the "conversion
   directory":

     rmg $ bin/genchanges convdir
     16354                    <- This counter spins as it's running.
                                 It will count up to the number of
                                 lines in the metadata file.

   This reads convdir/config and convdir/metadata, and writes
   convdir/changes.


5. Run bin/dochanges:

   You might want to save a copy of the output with "tee".
   The output will look something like:

     rmg $ bin/dochanges convdir 2>&1 | tee OUT
     dochanges> /bin/rm -f convdir/revmap.db ...
     dochanges> /bin/rm -f convdir/depotmap.db ...
     dochanges> /bin/rm -rf p4root && mkdir -p p4root
     dochanges> /bin/mkdir -p /home/rmg/web/richard_geiger/...
     dochanges> /bin/ln -s /home/rmg/web/richard_geiger/...
     ========== change group 1
     ========== change group 2
     ========== change group 3
      .
      .
      .
     ========== change group 17
     ========== change group 18
     dochanges> cd /home/rmg/web/richard_geiger/...
     Recovering from dbmeta...
     dochanges> cd /home/rmg/web/richard_geiger/...
     Dumping to checkpoint...
     
   Basically, that's it. When this command finishes, your CVS module
   has been imported to Perforce, in the Perforce server database
   identified by the $P4ROOT configuration variable. The state of the
   resultant database is saved in a checkpoint file named
   $P4ROOT/checkpoint.

   NOTE: cvs2p4 does not create new RCS-format archives (,v files)
   under $P4ROOT; rather, it uses the existing RCS archives in the CVS
   tree directly. By default, it does this by making a symbolic link
   named $P4ROOT/depot/IMPORT pointing to the $CVS_MODULE tree. If
   you'd rather have dochanges copy in the CVS module for you, set
   COPYIMPORT in the config file.


6. If you want to import labels from CVS tags, run

     $ bin/dolabels convdir
     make label: testlabel
     dolabels> cd p4root && /usr/local/bin/p4d -jr dblbls
     /home/rmg/web/richard_geiger/guest/richard_geiger/utils/cvs2p4_meta/p4root
     Recovering from dblbls...
     dolabels> cd p4root && rm -f checkpoint; /usr/local/bin/p4d -jd checkpoint
     /home/rmg/web/richard_geiger/guest/richard_geiger/utils/cvs2p4_meta/p4root
     Dumping to checkpoint...

   This step adds the symbolic tag information from the CVS archive
   (for "plain", non-branch tags) to the Perforce database identified
   by the $P4ROOT configuration variable.  The state of the resultant
   database is saved in a checkpoint file named $P4ROOT/checkpoint.
 
** NOTE: This version of cvs2p4 does *not* create new RCS archives in
** $P4ROOT/depot/...; rather, it creates a symbolic link
** "$P4ROOT/depot/IMPORT -> $CVS_MODULE"; i.e., the existing RCS
** archives from the CVS repository are used by the Perforce server,
** in place. If you'd rather have it make a _copy_ of the RCS archive
** files from your CVS repository, set "$COPYIMPORT = 1" in your
** config file.
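
The difference between the two modes can be pictured with a small
sketch. This only illustrates the link-versus-copy choice and is not
the code in bin/dochanges; the function name place_import and its
arguments are hypothetical.

```shell
# place_import CVS_MODULE P4ROOT [COPY]
# CVS_MODULE should be an absolute path, so that the symbolic link
# resolves no matter where the server is started from.
# COPY=1 copies the RCS archives; anything else links to them in place.
place_import() {
  cvs_module=$1 p4root=$2 copy=${3:-0}
  mkdir -p "$p4root/depot" || return 1
  if [ "$copy" = 1 ]; then
    cp -R "$cvs_module" "$p4root/depot/IMPORT"   # independent copy
  else
    ln -s "$cvs_module" "$p4root/depot/IMPORT"   # shares the CVS tree
  fi
}
```

With the link, the Perforce server reads the very files that CVS still
owns; with the copy, the CVS tree can be changed or removed afterwards
without affecting the depot.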
   
7. If you want the RCS revision-to-Perforce change map, run:

     $ bin/revmap convdir

   Or, for the reverse mapping:

     $ bin/revmap -map rrevmap convdir

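The steps above can be chained in a small wrapper once the config file
is in place. This is a minimal sketch under stated assumptions: each
script exits non-zero on failure (not confirmed here), and the
function name convert_module and its "labels" argument are made up for
this example.

```shell
# convert_module BINDIR CONVDIR [labels]
# Runs genmetadata, genchanges and dochanges in order, stopping at the
# first failure. Pass "labels" as the third argument to run dolabels too.
convert_module() {
  bindir=$1 convdir=$2
  [ -f "$convdir/config" ] || { echo "no config in $convdir" >&2; return 1; }
  for step in genmetadata genchanges dochanges; do
    echo "== $step"
    "$bindir/$step" "$convdir" || { echo "$step failed" >&2; return 1; }
  done
  if [ "${3:-}" = labels ]; then
    echo "== dolabels"
    "$bindir/dolabels" "$convdir" || { echo "dolabels failed" >&2; return 1; }
  fi
}
```

For example: convert_module bin convdir labels 2>&1 | tee OUT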

==== INCREMENTAL CONVERSIONS

At this time, the recommended procedure for doing "incremental"
conversions - i.e., combining multiple CVS repositories, or doing
subsets of the CVS modules in a repository one at a time - is to do
each as a new conversion (starting with change 1), and then to combine
them as desired using the "perfmerge2.pl" tool. (See:)

  ftp://ftp.perforce.com/perforce/r02.1/tools/server/perfmerge2.pl

This is also a useful pattern when you want to combine some new chunk
of CVS (or RCS) repository into an existing Perforce depot.


==== SUPPORT

I originally wrote and contributed this tool while working for Network
Appliance in 1997. 

I now work for Perforce, and, while I _am_ chartered with supporting
Open Source software (such as this) as part of my job, it must be
understood that Perforce Software still does not officially support
it.

I (and Perforce Software) can make absolutely no warranty that this
will be helpful or even nontoxic for you, nor make any guarantee that
I will be able to provide support.

On the other hand, I have been able to help many users in the past,
so it's worth a try!

  - Richard Geiger
  Open Source Engineer at Perforce   opensource@perforce.com

  Note: because of my role at Perforce, it would be helpful if
  questions or requests for help with cvs2p4 be sent to the
  "opensource@perforce.com" address, as shown above. Thanks!

  (revised May 6, 2002, release 2.0)