- This is a loose collection of useful information culled from various
- email messages. It's meant to save some time in reading the archives
- for prospective hackers.
-
- Adding Backends
- ===============
-
- Getting Started
- ---------------
-
- I'd copy from a "real" back end instead of the null backends. Choose
- based on which one is most like ab:
-
- - CVS branches in revision number space (foo#1.1.2.1 is a branch of
- foo#1.1), has no atomic changes, and VCP can read RCS files
- directly.
- - p4 is the nicest to work with. It branches in name space
- (foo/file can be a branch of bar/file) and captures enough
- metadata to allow us to reproduce a source repository fairly
- accurately. There is experimental support for P4::Client, an
- interface to the experimental p4 api library (the library is
- solid, mind you, but has some minor issues that keep us from using
- it reliably all the time).
- - There's an in-development svn back end out on the web, but it's
- still working through svn-specific and VCP internals issues (I
- refactored a lot and broke it :/).
- - VSS also branches in name space, but is awkward to work with for
- many reasons: missing metadata, poor command line tools, an
- inconsistent data model, and unstable operation.
-
- Backend Modules Footprint
- -------------------------
-
- The back ends have three parts, usually:
-
- - VCP::Dest::foo: handle_header(), handle_rev(), handle_footer() and
- several other functions. If the dest handles changes, then
- handle_rev() must accumulate changes until a rev for a different
- change arrives or handle_footer() is called.
-
- - VCP::Source::foo: handle_header(), copy_revs(), and
- handle_footer() scan the repository and emit metadata. get_file()
- retrieves files as needed by the downstream filters or dest.
-
- - VCP::Utils::foo: common infrastructure shared by the dest and
- source.
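-
- The accumulate-until-the-change-changes rule for handle_rev() is the
- part that tends to trip people up, so here is a minimal sketch of it.
- It is not the real VCP::Dest API: Sketch::Dest, the change_id and name
- keys, and the "committed" list are all made up for illustration, and
- revs are plain hashes rather than VCP::Rev objects.
-
```perl
use strict;
use warnings;

# Hypothetical stand-in for a change-aware dest; this is not the real
# VCP::Dest API, and revs are plain hashes rather than VCP::Rev objects.
package Sketch::Dest;

sub new {
    my ($class) = @_;
    return bless { pending => [], change_id => undef, committed => [] }, $class;
}

sub handle_header { my ( $self, $header ) = @_; $self->{header} = $header; return }

# Buffer revs; a rev for a different change means the buffered change is complete.
sub handle_rev {
    my ( $self, $rev ) = @_;
    my $current = $self->{change_id};
    $self->_flush if defined $current && $current ne $rev->{change_id};
    $self->{change_id} = $rev->{change_id};
    push @{ $self->{pending} }, $rev;
    return;
}

# The last change has no successor, so the footer has to flush it.
sub handle_footer { my ($self) = @_; $self->_flush; return }

sub _flush {
    my ($self) = @_;
    return unless @{ $self->{pending} };
    my @names = map { $_->{name} } @{ $self->{pending} };
    push @{ $self->{committed} }, { change_id => $self->{change_id}, names => \@names };
    $self->{pending}   = [];
    $self->{change_id} = undef;
    return;
}

package main;

my $dest = Sketch::Dest->new;
my @revs = (
    { name => 'a.c', change_id => 1 },
    { name => 'b.c', change_id => 1 },
    { name => 'a.c', change_id => 2 },
);

$dest->handle_header( {} );
$dest->handle_rev($_) for @revs;
$dest->handle_footer;

for my $change ( @{ $dest->{committed} } ) {
    print "change $change->{change_id}: ", join( ', ', @{ $change->{names} } ), "\n";
}
```
-
- A real dest would submit to the repository where this sketch pushes on
- to "committed".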
-
-
- Implementation Order and Test Suites
- ------------------------------------
-
- We try to build matched pairs of VCP::Source::foo and VCP::Dest::foo,
- both to allow reasonable testing and to keep VCP balanced. I don't want
- VCP to become a mechanism for one-way migration to a given repository
- type; it needs to be balanced where possible.
-
- We would need to do the least-common-denominator testing in
- t/90revml2foo_*.t (which needs VCP::Dest::foo) and t/91svn2revml.t
- (which needs VCP::Source::foo).
-
- t/95*.t is good for ensuring that a particular conversion works as
- expected, given one of the repositories generated in the t/90*.t tests.
-
- t/99*.t is good for testing that a conversion works with a special case
- repository. The other t/9*.t tests use just enough test data to make
- sure there is no fundamental breakage.
-
-
- Repository Interaction Patterns
- -------------------------------
-
- The command line is often a good place to start: it lets you use
- relatively debugged and well defined access points (in general; I'm not
- sure how stable or usable ab's CLI is). Other access methods are usually
- much faster:
-
- - Going direct to disk (if the repository has a supported /
- published external on-disk representation) is pretty speedy and,
- for CVS at least, takes about as much work as parsing the log file
- emitted by the cvs CLI.
-
- - Implement a direct-to-backend protocol like P4::Client. This has
- all the advantages of the CLI without the pain and suffering
- involved in repeatedly spawning child processes. It's more
- flexible than direct file reads because the repository can be
- located over the net.
-
- - Check to see if there's a backend available. Perl modules
- should be drop-in useful; you can also wrap C interfaces
- easily with the Inline:: modules, at least for prototyping and
- quite likely "for real".
-
- - More and more servers offer a web or WebDAV frontend; Perl's
- WWW and HTTP modules may be helpful here.
-
- - Read / generate a data dump for the backend. p4d can dump all
- metadata as a "checkpoint" and import checkpoints and "journal"
- files. We've not implemented this because it's in a proprietary
- format and we don't want to have to track every change to it.
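-
- For the command line route, the basic pattern is "spawn, read, parse".
- A minimal sketch, assuming a Unix-like system: run_cli() is a made-up
- helper, the "rev <id> <user>" output format is invented, and the perl
- interpreter itself ($^X) stands in for a real repository CLI so the
- example runs anywhere perl does.
-
```perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return its output lines.
sub run_cli {
    my @cmd = @_;
    open my $fh, '-|', @cmd or die "can't spawn $cmd[0]: $!";
    chomp( my @lines = <$fh> );
    close $fh or die "$cmd[0] failed (wait status $?)\n";
    return @lines;
}

# Stand-in for a repository CLI: a child perl that prints two fake log lines.
my @lines = run_cli( $^X, '-e', 'print "rev 1.1 alice\n", "rev 1.2 bob\n"' );

# Turn each line into a plain hash; the field names are for illustration only.
my @revs = map {
    my ( $id, $user ) = /^rev (\S+) (\S+)$/ or die "unparsable line: $_\n";
    +{ rev_id => $id, user_id => $user };
} @lines;

print "$_->{rev_id} by $_->{user_id}\n" for @revs;
```
-
- Every call pays for a fork and exec, which is the cost the faster
- access methods in the list above avoid.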
-
-
- Interactive User Interfaces
- ---------------------------
-
- > When I run 'perl -Ilib bin/vcp' I don't see my new backend in the list of
- > source and dest I get prompted for.
-
- You won't yet. Those are for production-ready backends, and they're
- hand-crafted in the ui_machines directory using state machines
- specified in XML to define the flow of the UI. Leave that until last;
- the command line and config files are far more appropriate for rapidly
- evolving backends.
-
-
- Internals Notes
- ===============
-
- TODO: most of this should move into code comments or POD.
-
-
- The VCP::Rev::*_info fields
- ---------------------------
-
- The intent of these fields in VCP::Rev is to capture source
- repository information that does not survive least common
- denominator processing, like p4 or CVS file modes.
-
- - Destinations and filters could then use this to convert from
- source flags like CVS's keyword expansion controls or p4's
- stored-compressed-or-not flags to the destination's.
-
- - These are not intended for internal processing, though you had
- no way of knowing that.
-
- - VCP::Revs may be serialized between filters, so storing refs
- in them is no longer a Good Idea.
-
- This should really be replaced by a more general mechanism like a
- source_info member that is a HASH keyed by plugin ID (Perl package name
- plus position in the plugins chain? Dunno), which then contains a HASH
- of data members in plugin-specific format. Or something...
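-
- To make the source_info idea concrete, here is one possible shape for
- it. Nothing here exists in VCP: the plain hash standing in for a
- VCP::Rev, the plugin_id() helper, the "package#position" key format and
- the keyword_expansion member are all assumptions for illustration.
-
```perl
use strict;
use warnings;

# Hypothetical rev: a plain hash standing in for a VCP::Rev.
my %rev = ( name => 'foo/file', source_info => {} );

# One possible plugin ID scheme: package name plus position in the plugin chain.
sub plugin_id {
    my ( $package, $position ) = @_;
    return "$package#$position";
}

# The source records plugin-specific data under its own ID. Only plain
# scalars are stored, so the structure is easy to serialize between filters.
my $id = plugin_id( 'VCP::Source::cvs', 0 );
$rev{source_info}{$id} = { keyword_expansion => 'kb' };

# A downstream dest that understands this plugin's format can translate it;
# anything else just leaves the entry alone.
my $info      = $rev{source_info}{$id} || {};
my $is_binary = ( $info->{keyword_expansion} || '' ) eq 'kb';
print "$rev{name} binary: ", ( $is_binary ? 'yes' : 'no' ), "\n";
```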
-
-
- Filter Working Set Data
- -----------------------
-
- Filters should store internal-use-only data off to the side in a data
- member or, for data sets that can get big, perhaps in a VCP::DB_File
- structure.
-
- Filter sets that need to share data should also store them off to the
- side and coordinate with each other.
-
- - If this is a need, VCP will need to provide some sort of rendezvous
- mechanism, probably using the feature negotiation mechanism that
- should also replace sort_filter() mentioned elsewhere.
-
-
- Cloned Revs and placeholders
- ----------------------------
-
- One type of placeholder has the action "clone". This is used (so far)
- for CVS branches that are given multiple branch tags, so a master branch
- is cloned on to several "clone" branches using placeholder revs that
- have a previous_id of the rev on the branch master. It is likely to
- be useful for VSS shares as well, but that's not implemented as of VCP
- v0.9.
-
- This diagram illustrates what happens when a rev 1.10 is branched on to
- physical branch 1.10.2 and that branch is given three branch tags
- "bt_1", "bt_2", and "bt_3". The arcs in the graph represent VCP::Rev
- instances. "B" revs have $r->is_branch_rev TRUE (action eq "branch"),
- while the "C" revs have $r->is_clone_rev TRUE (action eq "clone"). Both
- have $r->is_placeholder_rev TRUE.
-
- ||||||||||||||||||||||||||||||||||||||||
-   1.10
-    | \B
-    .  1.10.2.1<bt_1>----------------+
-    .   |   \C                        \C
-    .   |    1.10.2.1<bt_2>             1.10.2.1<bt_3>
-        |
-       1.10.2.2<bt_1>----------------+
-        |   \C                        \C
-        |    1.10.2.2<bt_2>             1.10.2.2<bt_3>
-        |
-       1.10.2.3<bt_1>----------------+
-        |   \C                        \C
-        |    1.10.2.3<bt_2>             1.10.2.3<bt_3>
-        |
- ||||||||||||||||||||||||||||||||||||||||
-
- Kind of strange, but it seems to capture the semantic of "hey, I created
- this branch, then I also labelled it this way and that" so that
- VCP::Dest::p4 users can tell the master branch from the cloned branches.
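-
- The placeholder arcs in the diagram can be enumerated mechanically. A
- minimal sketch, assuming the first tag names the master branch: the
- placeholder_revs() helper and the "target" key are made up, and plain
- hashes stand in for VCP::Rev instances. Only action and previous_id
- come from the description above.
-
```perl
use strict;
use warnings;

# Hypothetical helper: list the "branch" and "clone" placeholders for one
# physical branch that carries several branch tags.
sub placeholder_revs {
    my ( $parent_id, $branch_rev_ids, $tags ) = @_;
    my ( $master, @clones ) = @$tags;
    my @revs;

    # One "branch" placeholder links the parent rev to the master branch.
    push @revs, {
        action      => 'branch',
        previous_id => $parent_id,
        target      => "$branch_rev_ids->[0]<$master>",
    };

    # Each rev on the master branch gets one "clone" placeholder per extra tag.
    for my $rev_id (@$branch_rev_ids) {
        for my $tag (@clones) {
            push @revs, {
                action      => 'clone',
                previous_id => "$rev_id<$master>",
                target      => "$rev_id<$tag>",
            };
        }
    }
    return @revs;
}

my @placeholders = placeholder_revs(
    '1.10',
    [ '1.10.2.1', '1.10.2.2', '1.10.2.3' ],
    [ 'bt_1', 'bt_2', 'bt_3' ],
);

print "$_->{action}: $_->{previous_id} -> $_->{target}\n" for @placeholders;
```
-
- For the example in the diagram this yields one "B" arc and six "C"
- arcs.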
-
-
- Per-Backend Notes
- =================
-
- Some things to be aware of when seeing data from particular backends
- (see also the LIMITATIONS sections for each of the modules you might be
- dealing with).
-
- CVS Oddities
- ------------
-
- See the discussion of Cloned Revs elsewhere.
-
- CVS does not guarantee that 1.1 is the first rev on the trunk (I've seen
- 3.0, etc). In general, people get very funky with RCS files. You can
- only count on the branches, next and symbols fields to give you
- structural information, and you need to actually check to see what the
- base rev on the trunk is.
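-
- One way to find the base rev without assuming 1.1 is to walk the next
- fields. In an RCS file the head names the newest trunk rev and each
- trunk rev's next field names the one before it, so the end of that
- chain is the first rev. A minimal sketch: the %next table is made-up,
- already-parsed data and trunk_base_rev() is a hypothetical helper, not
- something VCP::Source::cvs provides.
-
```perl
use strict;
use warnings;

# Hypothetical, already-parsed RCS data: the head rev and each rev's next field.
my $head = '3.2';
my %next = ( '3.2' => '3.1', '3.1' => '3.0', '3.0' => '' );

# Follow next fields from the head until a rev has no predecessor.
sub trunk_base_rev {
    my ( $start, $next ) = @_;
    my $rev = $start;
    my %seen;
    while ( defined $next->{$rev} && length $next->{$rev} ) {
        die "cycle in next fields at $rev\n" if $seen{$rev}++;
        $rev = $next->{$rev};
    }
    return $rev;
}

print "first trunk rev: ", trunk_base_rev( $head, \%next ), "\n";
```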
-
- VCP::Source::cvs can issue two revs for a deleted revision with changes,
- so a dead 1.1 would cause two VCP::Rev instances: one to create the file
- and another to delete it. This is necessary for dead 1.1 revs and for
- multiple consecutive dead but edited revs.
-
- CVS does not supply complete metadata for deletes and branches. user_id
- and time are missing for many deletes and for all branch creations.
-
-
- VCP::Dest::sort_filter()
- ------------------------
-
- The destinations insert filters using this. This is so that users don't
- need to add must-have filters to every .vcp file:
-
- - ChangeSets
-   - See VCP::Dest::p4
- - StringEdit
-   - This is to clean up filenames, labels, usernames, etc.,
-     that would cause svn to choke.
-
- Background:
-
- - sort_filter() is a first cut at a generalized negotiation
- mechanism.
- - The VCP::Filter::svn* filters are the first filters that are
- required.
- - I'd like to generalize the sort_filter() implementation into
- a contract negotiation mechanism where each filter places
- guarantees into a HASH ref and the downstream filters can:
-   - ignore what they don't care about
-   - adapt if need be
-   - die for illegal input (missing guarantees)
-   - warn about odd or dangerous input
-   - insert their own prefilters
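-
- A rough sketch of what such a negotiation pass might look like. None of
- this exists in VCP: negotiate(), the requires / provides /
- warns_without keys and the guarantee names are invented, and only the
- ignore, die and warn behaviours are covered (adapting and inserting
- prefilters are left out).
-
```perl
use strict;
use warnings;

# Hypothetical negotiation: walk the chain from source to dest, letting each
# stage check the guarantees made upstream before adding its own.
sub negotiate {
    my ($stages) = @_;
    my %guarantees;    # guarantee name => name of the stage that provides it
    my @warnings;

    for my $stage (@$stages) {
        # Missing must-have guarantees are fatal.
        for my $need ( @{ $stage->{requires} || [] } ) {
            die "$stage->{name}: missing guarantee '$need'\n"
                unless $guarantees{$need};
        }

        # Missing nice-to-have guarantees only produce a warning.
        for my $odd ( @{ $stage->{warns_without} || [] } ) {
            push @warnings, "$stage->{name}: no '$odd' guarantee"
                unless $guarantees{$odd};
        }

        # Guarantees a stage does not mention are simply ignored by it.
        $guarantees{$_} = $stage->{name} for @{ $stage->{provides} || [] };
    }
    return ( \%guarantees, \@warnings );
}

my @chain = (
    { name => 'source',     provides => ['revs_have_user_id'] },
    { name => 'ChangeSets', provides => ['revs_grouped_by_change'] },
    {
        name          => 'dest',
        requires      => ['revs_grouped_by_change'],
        warns_without => ['names_sanitized'],
    },
);

my ( $guarantees, $warnings ) = negotiate( \@chain );
print "revs_grouped_by_change provided by $guarantees->{revs_grouped_by_change}\n";
print "warning: $_\n" for @$warnings;
```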
-