package VCP::Source::p4 ; =head1 NAME VCP::Source::p4 - A Perforce p4 repository source =head1 SYNOPSIS vcp p4://depot/...@10 # all files after change 10 applied vcp p4://depot/...@1,10 # changes 1..10 vcp p4://depot/...@-2,10 # changes 8..10 vcp p4://depot/...@1,#head # changes 1..#head vcp p4://depot/...@-2,#head # changes 8..10 vcp p4:...@-2,#head # changes 8..10, if only one depot To specify a user name of 'user', P4PASSWD 'pass', and port 'host:1666', use this syntax: vcp p4:user(client)password@host:1666:files Note: the password will be passed in the environment variable P4PASSWD so it shouldn't show up in error messages. This means that a password specified in a P4CONFIG file will override the password you set on the command line. This is a bug. User, client and the server string will be passed as command line options to make them show up in error output. You may use the P4... environment variables instead of any or all of the fields in the p4: repository specification. The repository spec overrides the environment variables. =head1 DESCRIPTION Driver to allow L<vcp|vcp> to extract files from a L<Perforce|http://perforce.com/> repository. Note that not all metadata is extracted: users, clients and job tracking information is not exported, and only label names are exported. Also, the 'time' and 'mod_time' attributes will lose precision, since p4 doesn't report them down to the minute. Hmmm, seems like p4 never sets a true mod_time. It gets set to either the submit time or the sync time. From C<p4 help client>: modtime Causes 'p4 sync' to force modification time to when the file was submitted. nomodtime * Leaves modification time set to when the file was fetched. =head1 OPTIONS =over =item -b, --bootstrap -b '...' --bootstrap='...' -b file1[,file2[,...]] --bootstrap=file1[,file2[,...]] (the C<...> there is three periods, a L<Regexp::Shellish|Regexp::Shellish> wildcard borrowed from C<p4> path syntax). Forces bootstrap mode for an entire export (-b '...') or for certain files. Filenames may contain wildcards, see L<Regexp::Shellish> for details on what wildcards are accepted. Controls how the first revision of a file is exported. A bootstrap export contains the entire contents of the first revision in the revision range. This should only be necessary when exporting for the first time. An incremental export contains a digest of the revision preceding the first revision in the revision range, followed by a delta record between that revision and the first revision in the range. This allows the destination import function to make sure that the incremental export begins where the last export left off. The default is decided on a per-file basis: if the first revision in the range is revision #1, the full contents are exported. Otherwise an incremental export is done for that file. This option is necessary when exporting only more recent revisions from a repository. =item -r, --rev-root B<Experimental>. Falsifies the root of the source tree being extracted; files will appear to have been extracted from some place else in the hierarchy. This can be useful when exporting RevML, the RevML file can be made to insert the files in to a different place in the eventual destination repository than they existed in the source repository. The default C<rev-root> is the file spec up to the first path segment (directory name) containing a wildcard, so p4:/a/b/c... would have a rev-root of C</a/b>. In direct repository-to-repository transfers, this option should not be necessary, the destination filespec overrides it. =back =head1 METHODS =over =cut $VERSION = 1.0 ; use strict ; use Carp ; use Getopt::Long ; use Fcntl qw( O_WRONLY O_CREAT ) ; use VCP::Debug ":debug" ; use Regexp::Shellish qw( :all ) ; use VCP::Rev ; use IPC::Run qw( run io timeout new_chunker ) ; use base qw( VCP::Source VCP::Utils::p4 ) ; use fields ( 'P4_REPO_CLIENT', ## Set by p4_parse_repo_spec in VCP::Utils::p4 'P4_INFO', ## Results of the 'p4 info' command 'P4_LABEL_CACHE', ## ->{$name}->{$rev} is a list of labels for that rev # 'P4_LABELS', ## Array of labels from 'p4 labels' 'P4_MAX', ## The last change number needed 'P4_MIN', ## The first change number needed ) ; =item new Creates a new instance of a VCP::Source::p4. Contacts the p4d using the p4 command and gets some initial information ('p4 info' and 'p4 labels'). =cut sub new { my $class = shift ; $class = ref $class || $class ; my VCP::Source::p4 $self = $class->SUPER::new( @_ ) ; ## Parse the options my ( $spec, $options ) = @_ ; $self->parse_p4_repo_spec( $spec ) ; my $rev_root ; GetOptions( 'b|bootstrap:s' => sub { my ( $name, $val ) = @_ ; $self->bootstrap( $val ) ; }, 'r|rev-root' => \$rev_root, ) or $self->usage_and_exit ; my $name = $self->repo_filespec ; unless ( defined $rev_root ) { if ( length $name >= 2 && substr( $name, 0, 2 ) ne '//' ) { ## No depot on the command line, default it to the only depot ## or error if more than one. my $depots ; $self->p4( ['depots'], \$depots ) ; $depots = 'depot' unless length $depots ; my @depots = split( /^/m, $depots ) ; die "vcp: p4 has more than one depot, can't assume //depot/...\n" if @depots > 1 ; debug "vcp: defaulting depot to '$depots[0]'" if debugging $self ; $name = join( '/', '/', $depots[0], $name ) ; } $self->deduce_rev_root( $name ) ; } else { $self->rev_root( $rev_root ) ; } die "no depot name specified for p4 source '$name'\n" unless $name =~ m{^//[^/]+/} ; $self->repo_filespec( $name ) ; $self->load_p4_info ; $self->load_p4_labels ; return $self ; } sub load_p4_info { my VCP::Source::p4 $self = shift ; my $errors = '' ; $self->p4( ['info'], \$self->{P4_INFO} ) ; } sub is_incremental { my VCP::Source::p4 $self= shift ; my ( $file, $first_rev ) = @_ ; my $bootstrap_mode = $first_rev == 1 || $self->is_bootstrap_mode( $file ) ; return ! $bootstrap_mode ; } # A typical entry in the filelog looks like #-------8<-------8<------ #//revengine/revml.dtd #... #6 change 11 edit on 2000/08/28 by barries@barries (text) # # Rev 0.008: Added some modules and tests and fixed lots of bugs. # #... #5 change 10 edit on 2000/08/09 by barries@barries (text) # # Got Dest/cvs working, lots of small changes elsewhere # #-------8<-------8<------ # And, from a more tangled source tree, perl itself: #-------8<-------8<------ #... ... branch into //depot/ansiperl/x2p/a2p.h#1 #... ... ignored //depot/maint-5.004/perl/x2p/a2p.h#1 #... ... copy into //depot/oneperl/x2p/a2p.h#3 #... ... copy into //depot/win32/perl/x2p/a2p.h#2 #... #2 change 18 integrate on 1997/05/25 by mbeattie@localhost (text) # # First stab at 5.003 -> 5.004 integration. # #... ... branch into //depot/lexwarn/perl/x2p/a2p.h#1 #... ... branch into //depot/oneperl/x2p/a2p.h#1 #... ... copy from //depot/relperl/x2p/a2p.h#2 #... ... branch into //depot/win32/perl/x2p/a2p.h#1 #... #1 change 1 add on 1997/03/28 by mbeattie@localhost (text) # # Perl 5.003 check-in # #... ... branch into //depot/mainline/perl/x2p/a2p.h#1 #... ... branch into //depot/relperl/x2p/a2p.h#1 #... ... branch into //depot/thrperl/x2p/a2p.h#1 #-------8<-------8<------ # # This next regexp is used to parse the lines beginning "... #" my $filelog_rev_info_re = qr{ \G # Use with /gc!! ^\.\.\.\s+ \#(\d+)\s+ # Revision change\s+(\d+)\s+ # Change nubmer (\S+)\s+ # Action \S+\s+ ### 'on ' (\S+)\s+ # date \S+\s+ ### 'by ' (\S(?:.*?\S))\s+ # user id. Undelimited, so hope for best \((\S+?)\) # type .*\r?\n }mx ; # And this one grabs the comment my $filelog_comment_re = qr{ \G ^\r?\n ((?:^[^\S\r\n].*\r?\n)*) ^\r?\n }mx ; sub scan_filelog { my VCP::Source::p4 $self = shift ; my ( $first_change_id, $last_change_id ) = @_ ; my $log = '' ; my $delta = $last_change_id - $first_change_id + 1 ; my $spec = join( '', $self->repo_filespec . '@' . $last_change_id ) ; my $temp_f = $self->command_stderr_filter ; $self->command_stderr_filter( qr{//\S* - no file\(s\) at that changelist number\.\s*\n} ) ; my %oldest_revs ; { my $log_state = "need_file" ; my VCP::Rev $r ; my $name ; my $comment ; my $p4_filelog_parser = sub { local $_ = shift ; REDO_LINE: if ( $log_state eq "need_file" ) { die "\$r defined" if defined $r ; die "vcp: p4 filelog parser: file name expected, got '$_'" unless m{^//(.*?)\r?\n\r?} ; $name = $1 ; $log_state = "revs" ; } elsif ( $log_state eq "revs" ) { return if m{^\.\.\.\s+\.\.\..*\r?\n\r?} ; unless ( m{$filelog_rev_info_re} ) { $log_state = "need_file" ; goto REDO_LINE ; } my $change_id = $2 ; if ( $change_id < $self->min ) { undef $r ; $log_state = "need_comment" ; return ; } my $type = $6 ; my $norm_name = $self->normalize_name( $name ) ; die "\$r defined" if defined $r ; $r = VCP::Rev->new( name => $norm_name, rev_id => $1, change_id => $change_id, action => $3, time => $self->parse_time( $4 ), user_id => $5, p4_info => $_, comment => '', ) ; my $is_binary = $type =~ /^(?:u?x?binary|x?tempobj|resource)/ ; $r->type( $is_binary ? "binary" : "text" ) ; $r->labels( $self->get_p4_file_labels( $name, $r->rev_id ) ); ## Filelogs are in newest...oldest order, so this should catch ## the oldest revision of each file. $oldest_revs{$name} = $r ; debug "vcp: ", $r->as_string if debugging $self ; $log_state = "need_comment" ; } elsif ( $log_state eq "need_comment" ) { unless ( /^$/ ) { die "vcp: p4 filelog parser: expected a blank line before a comment, got '$_'" ; } $log_state = "comment_accum" ; } elsif ( $log_state eq "comment_accum" ) { if ( /^$/ ) { if ( defined $r ) { $r->comment( $comment ) ; $self->revs->add( $r ) ; $r = undef ; } $comment = undef ; $log_state = "revs" ; return ; } unless ( s/^\s// ) { die "vcp: p4 filelog parser: expected a comment line, got '$_'" ; } $comment .= $_ ; } else { die "unknown log_state '$log_state'" ; } } ; $self->p4( [qw( filelog -m ), $delta, "-l", $spec ], '>', new_chunker, $p4_filelog_parser ) ; $self->command_stderr_filter( $temp_f ) ; die "\$r defined" if defined $r ; } my @base_rev_specs ; for my $name ( sort keys %oldest_revs ) { my $r = $oldest_revs{$name} ; my $rev_id = $r->rev_id ; if ( $self->is_incremental( "//$name", $r->rev_id ) ) { $rev_id -= 1 ; push @base_rev_specs, "//$name#$rev_id" ; } else { debug "vcp: bootstrapping '", $r->name, "#", $r->rev_id, "'" if debugging $self ; } $oldest_revs{$name} = undef ; } if ( @base_rev_specs ) { undef $log ; $self->command_stderr_filter( qr{//\S* - no file\(s\) at that changelist number\.\s*\n} ) ; $self->p4( [qw( filelog -m 1 -l ), @base_rev_specs ], \$log ) ; $self->command_stderr_filter( $temp_f ) ; while ( $log =~ m{\G(.*?)^//(.*?)\r?\n\r?}gmsc ) { warn "vcp: Ignoring '$1' in p4 filelog output\n" if length $1 ; my $name = $2 ; my $norm_name = $self->normalize_name( $name ) ; while () { next if $log =~ m{\G^\.\.\.\s+\.\.\..*\r?\n\r?}gmc ; last unless $log =~ m{$filelog_rev_info_re}gc ; my VCP::Rev $br = VCP::Rev->new( name => $norm_name, rev_id => $1, change_id => $2, # Don't send these on a base rev for incremental changes: # action => $3, # time => $self->parse_time( $4 ), # user_id => $5, type => $6, # comment => '', ) ; $self->revs->add( $br ) ; $log =~ m{$filelog_comment_re}gc ; } } } } sub min { my VCP::Source::p4 $self = shift ; $self->{P4_MIN} = shift if @_ ; return $self->{P4_MIN} ; } sub max { my VCP::Source::p4 $self = shift ; $self->{P4_MAX} = shift if @_ ; return $self->{P4_MAX} ; } sub load_p4_labels { my VCP::Source::p4 $self = shift ; my $labels = '' ; my $errors = '' ; $self->p4( ['labels'], \$labels ) ; my @labels = map( /^Label\s*(\S*)/ ? $1 : (), split( /^/m, $labels ) ) ; $self->command_ok_result_codes( 0, 1 ) ; my $marker = "//.../NtLkly" ; my $p4_files_args = join( "", ( map { ( "$marker\n", "//...\@$_\n" ) ; } @labels ), ) ; my $files ; $self->p4( [ qw( -x - -s files) ], "<", \$p4_files_args, ">", \$files ) ; my $label ; for my $spec ( split /\n/m, $files ) { last if $spec =~ /^exit:/ ; if ( $spec =~ /^error: $marker/o ) { $label = shift @labels ; next ; } next if $spec =~ m{^error: //\.\.\.\@.+ file(\(s\))? not in label.$} ; $spec =~ /^.*?: *\/\/(.*)#(\d+)/ or die "Couldn't parse name & rev from '$spec' in '$files'" ; debug "vcp: p4 label '$label' => '$1#$2'" if debugging $self ; push @{$self->{P4_LABEL_CACHE}->{$1}->{$2}}, $label ; } $self->command_ok_result_codes( 0 ) ; return ; } sub denormalize_name { my VCP::Source::p4 $self = shift ; return '//' . $self->SUPER::denormalize_name( @_ ) ; } sub get_p4_file_labels { my VCP::Source::p4 $self = shift ; my $name ; my VCP::Rev $rev ; ( $name, $rev ) = @_ ; return ( ( exists $self->{P4_LABEL_CACHE}->{$name} && exists $self->{P4_LABEL_CACHE}->{$name}->{$rev} ) ? @{$self->{P4_LABEL_CACHE}->{$name}->{$rev}} : () ) ; } my $filter_prog = <<'EOPERL' ; use strict ; my ( $name, $working_path ) = ( shift, shift ) ; } EOPERL sub get_rev { my VCP::Source::p4 $self = shift ; my VCP::Rev $r ; ( $r ) = @_ ; return if defined $r->action && $r->action eq "delete" ; my $fn = $r->name ; my $rev = $r->rev_id ; $r->work_path( $self->work_path( $fn, $rev ) ) ; my $wp = $r->work_path ; $self->mkpdir( $wp ) ; my $denormalized_name = $self->denormalize_name( $fn ) ; my $rev_spec = "$denormalized_name#$rev" ; sysopen( WP, $wp, O_CREAT | O_WRONLY ) or die "$!: $wp" ; binmode WP ; my $re = quotemeta( $rev_spec ) . " - .* change \\d+ \\((.+)\\)"; ## TODO: look for "+x" in the (...) and pass an executable bit ## through the rev structure. $self->p4( [ "print", $rev_spec ], ">", sub { $_ = shift ; s/\A$re\r?\n//m if $re ; print WP or die "$! writing to $wp" ; $re = undef ; }, ) ; close WP or die "$! closing wp" ; return ; } sub handle_header { my VCP::Source::p4 $self = shift ; my ( $header ) = @_ ; $header->{rep_type} = 'p4' ; $header->{rep_desc} = $self->{P4_INFO} ; $header->{rev_root} = $self->rev_root ; $self->dest->handle_header( $header ) ; return ; } sub copy_revs { my VCP::Source::p4 $self = shift ; $self->revs( VCP::Revs->new ) ; $self->scan_filelog( $self->min, $self->max ) ; $self->dest->sort_revs( $self->revs ) ; ## Discard the revs so they'll be DESTROYed and thus ## clean up after themselves. while ( my VCP::Rev $r = $self->revs->shift ) { $self->get_rev( $r ) ; $self->dest->handle_rev( $r ) ; } } =head1 SEE ALSO L<VCP::Dest::p4>, L<vcp>. =head1 AUTHOR Barrie Slaymaker <barries@slaysys.com> =head1 COPYRIGHT Copyright (c) 2000, 2001, 2002 Perforce Software, Inc. All rights reserved. See L<VCP::License|VCP::License> (C<vcp help license>) for the terms of use. =cut 1
# | Change | User | Description | Committed | |
---|---|---|---|---|---|
#1 | 1375 | Sean McCune | Creating my own branch for work on vcp. | ||
//guest/perforce_software/revml/lib/VCP/Source/p4.pm | |||||
#14 | 1367 | Barrie Slaymaker | lots of docco updates | ||
#13 | 1358 | Barrie Slaymaker | Win32 changes | ||
#12 | 814 | Barrie Slaymaker |
Use p4's -x when listing all files in all labels. Much nicer. |
||
#11 | 722 | Barrie Slaymaker | Remove old optimization code that was no longer optimizing anything. | ||
#10 | 703 | Barrie Slaymaker | VCP::Source::p4 now uses VCP::Utils::p4::parse_p4_repo_spec() | ||
#9 | 628 | Barrie Slaymaker | Cleaned up POD in bin/vcp, added BSD-style license. | ||
#8 | 619 | Barrie Slaymaker |
Avoid using p4 print -s, it puts linebreaks in every 4098 characters or so. |
||
#7 | 608 | Barrie Slaymaker |
Lots of changes to get vcp to install better, now up to 0.066. Many thanks to Matthew Attaway for testing & suggestions. |
||
#6 | 480 | Barrie Slaymaker |
0.06 Wed Dec 20 23:19:15 EST 2000 - bin/vcp: Added --versions, which loads all modules and checks them for a $VERSION and print the results out. This should help with diagnosing out-of-sync modules. - Added $VERSION vars to a few modules :-). Forgot to increment any $VERSION strings. - VCP::Dest::cvs: The directory "deeply" was not being `cvs add`ed on paths like "a/deeply/nested/file", assuming "deeply" had no files in it. - VCP::Dest::revml: fixed a bug that was causing files with a lot of linefeeds to be emitted in base64 instead of deltaed. This means most text files. - Various minor cleanups of diagnostics and error messages, including exposing "Can't locate Foo.pm" when a VCP::Source or VCP::Dest module depends on a module that's not installed, as reported by Jeff Anton. |
||
#5 | 478 | Barrie Slaymaker |
0.05 Mon Dec 18 07:27:53 EST 2000 - Use `p4 labels //...@label` command as per Rober Cowham's suggestion, with the '-s' flag recommended by Christopher Siewald and Amaury.FORGEOTDARC@atsm.fr. Though it's actually something like vcp: running /usr/bin/p4 -u safari -c safari -p localhost:5666 -s files //.../NtLkly //...@compiler_a3 //.../NtLkly //...@compiler_may3 and so //on //for 50 parameters to get the speed up. I use the //.../NtLkly "file" as //a separator between the lists of files in various //revisions. Hope nobody has any files named that :-). What I should do is choose a random label that doesn't occur in the labels list, I guess. - VCP::Source::revml and VCP::Dest::revml are now binary, control code, and "hibit ASCII" (I know, that's an oxymoron) clean. The <comment>, <delta>, and <content> elements now escape anything other than tab, line feed, space, or printable chars (32 <= c <= ASCII 126) using a tag like '<char code="0x09">'. The test suite tests all this. Filenames should also be escaped this way, but I didn't get to that. - The decision whether to do deltas or encode the content in base64 is now based on how many characters would need to be escaped. - We now depend on the users' diff program to have a "-a" option to force it to diff even if the files look binary to it. I need to use Diff.pm and adapt it for use on binary data. - VCP::Dest::cvs now makes sure that no two consecutive revisions of the same file have the same mod_time. VCP::Source::p4 got so fast at pulling revisions from the repositories the test suite sets up that CVS was not noticing that files had changed. - VCP::Plugin now allows you to set a list of acceptable result codes, since we now use p4 in ways that make it return non-zero result codes. - VCP::Revs now croaks if you try to add two entries of the same VCP::Rev (ie matching filename and rev_id). - The <type> tag is now limited to "text" or "binary", and is meant to pass that level of info between foreign repositories. - The <p4_info> on each file now carries the one line p4 description of the file so that p4->p4 transferes can pick out the more detailed info. VCP::Source::p4, VCP::Dest::p4 do this. - VCP::{Source,Dest}::{p4,cvs} now set binaryness on added files properly, I think. For p4->p4, the native p4 type is preserved. For CVS sources, seeing the keyword substitution flag 'o' or 'b' implies binaryness, for p4, seeing a filetype like qr/u?x?binary/ or qr/x?tempobj/ or "resource" implies binaryness (to non-p4 destinations). NOTE: Seeing a 'o' or 'b' in a CVS source only ends up setting the 'b' option on the destination. That should be ok for most uses, but we can make it smarter for cvs->cvs transfers if need be. |
||
#4 | 473 | Barrie Slaymaker |
0.04 Tue Dec 12 00:15:57 EST 2000 - Reorg of VCP::Source::p4 - One large filelog command is run instead of many small ones. This takes advantage of the -m option to make sure enough changes are listed. Many extra revisions of most files are probably listed, but listing and ignoring them is quicker than spawning p4 over and over. Wish p4 filelog had a revision range... - it now doesn't suck the entire filelog output in to memory, it parses it line by line as it's emitted from the `p4 filelog` - `p4 print` is now used to print a bunch of files at once, using the header line to separate one file from the next, kind of like splitting a mime-encoded message. There's a very slight chance that it will misjudge the boundary between two files if a file happens to have a line that looks very much like the header line for the next file. This is pretty unlikely and I'll fix it if it crops up. I could batch them more, right now it never puts two revisions of the same filename in the same batch, for no really good reason. Another method might be to batch 25 or 50 revs each time. - it turns out there's a problem spawning multiple p4 commands at the same time against the same p4d (p4d is 99.2, FWIW). Or at least running large `p4 files ...` while there's a large `p4 filelog` still also running. - filelog lines beginning with "... ..." are now ignored. These are notifications of copy, branch, and integrate events that we don't yet do anything with. - deleted cur() and P4_CUR - deleted P4_IS_INCREMENTAL - Made an assertion in VCP::Dest::revml::handle_rev() a little clearer - Added some ok(1) calls to 90p4.t to make it easier to figure out which child process is whining or aborting - Made the message that's printed when a subcommand emits unexpected output say "stderr" instead of "stdout". - Cleaned up documentation for VC::Plugin::work_path(). |
||
#3 | 470 | Barrie Slaymaker |
- Cleaned up VCP::Source::p4 a bit. It doesn't whine as much now when it sees what it considers to be old news in the log file. - Added an easy way to monitor the commands being issued to a repository: simply add "::cmd" to the debug spec for that source: vcp -d Source::cvs::cvs vcp -d Dest::p4::p4 - The next step is to use the -m option to p4 filelog to speed things up. |
||
#2 | 468 | Barrie Slaymaker |
- VCP::Dest::p4 now does change number aggregation based on the comment field changing or whenever a new revision of a file with unsubmitted changes shows up on the input stream. Since revisions of files are normally sorted in time order, this should work in a number of cases. I'm sure we'll need to generalize it, perhaps with a time thresholding function. - t/90cvs.t now tests cvs->p4 replication. - VCP::Dest::p4 now doesn't try to `p4 submit` when no changes are pending. - VCP::Rev now prevents the same label from being applied twice to a revision. This was occuring because the "r_1"-style label that gets added to a target revision by VCP::Dest::p4 could duplicate a label "r_1" that happened to already be on a revision. - Added t/00rev.t, the beginnings of a test suite for VCP::Rev. - Tweaked bin/gentrevml to comment revisions with their change number instead of using a unique comment for every revision for non-p4 t/test-*-in-0.revml files. This was necessary to test cvs->p4 functionality. |
||
#1 | 467 | Barrie Slaymaker | Version 0.01, initial checkin in perforce public depot. |