Monthly Scripting Column for May

Greetings! This continues a series of bulletins, each of which introduces one of the scripts that will be in a future release.

Tip: retrieve the data, then look at it.

Today's script

Back in 1997, there was a thread in the perforce-user mailing list in which customers asked for a script that tells you what files you need to run "p4 add" on.

Greg Spencer posted a script called p4unknown that did this. (We like the name.) We've written a variant, below, as an example of P4Ruby; the full source code is here and at the end of this page.

Note for reading any code on this page: Green text is what you'll cut/paste when you make your own P4Ruby script.

Comments should indicate the flow:

  1. Get a list of files from 'p4 fstat //myclient/...'
  2. Get a list from a recursive directory list (we use the one provided in a library function, instead of writing our own or calling the 'find' command - which might not exist on another platform)
  3. Compare the two lists. In Ruby, set operations such as difference are built into arrays, so it's easy!
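
To make step 3 concrete, here is a minimal sketch of the built-in array operations. The paths are made-up sample data and 'only_in' is a hypothetical helper name, not something taken from the script:

```ruby
# Hypothetical sample lists; the real script builds them from
# "p4 fstat" output and a recursive directory walk.
in_perforce = ["/ws/a.c", "/ws/b.c", "/ws/c.c"]
on_disk     = ["/ws/b.c", "/ws/c.c", "/ws/notes.txt"]

# Array#- is set difference: the elements of 'left' that are missing from 'right'.
def only_in(left, right)
  left - right
end

p only_in(on_disk, in_perforce)   # on disk, unknown to Perforce: ["/ws/notes.txt"]
p only_in(in_perforce, on_disk)   # known to Perforce, not on disk: ["/ws/a.c"]
p(on_disk & in_perforce)          # Array#& is intersection: ["/ws/b.c", "/ws/c.c"]
```

The script itself only needs the difference, taken once in each direction.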

First step: Setting up P4 object

If you're using P4Ruby, which is the Perforce hook for Ruby, then you'll need to initialize your Perforce connection:

   require "P4"
   p4 = P4.new
   p4.port = defaultPort           if defaultPort != nil
   p4.user = defaultUser           if defaultUser != nil
   p4.client = defaultClient       if defaultClient != nil
   p4.tagged
   p4.parse_forms
   p4.connect
   begin
      ...
   end
   p4.disconnect

(There will need to be an end somewhere at the end of your Perforce script, as you see in the example.)

Note that we copy this block into most of our P4Ruby programs, setting a default user/port/client in the argument processing. The other calls are handy because they foist the parsing off to someone else:

  1. The 'parse_forms' call makes it easy to process client specs in the next step,
  2. and the 'tagged' call makes it easy to process the fstat output a bit later.

Second step: retrieving information from a client spec

Note how easy certain things are. "Retrieve a client spec" boils down to this:

  cl_spec = p4.fetch_client
  cl_name = cl_spec['Client']
  cl_root = cl_spec['Root']

If we'd wanted to update it and stash it back into the database, we'd use a call to 'save_client'.

  cl_spec = p4.fetch_client
  cl_name = cl_spec['Client']
  cl_spec['Root'] = "/tmp/herman"
  puts "Setting root of #{cl_name} to #{cl_spec['Root']}"
  ret = p4.save_client(cl_spec)

Third step: getting information about what's mapped in

We chose to run:

  ret = p4.run_fstat("//#{cl_name}/...")

This, in turn, runs "p4 fstat //myclient/...".

You might think, why not just run "p4 have" on every file we find in the workspace? For performance reasons, we choose not to: we'll poll the database once to retrieve data, and then look at data separately. That will save the expense of building/running many similar, small queries. (Those small queries would in turn poll the database individually.)
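
The difference between the two approaches can be sketched without a server at all. In the following, 'CountingServer' is a purely hypothetical stand-in that only counts how often it is asked something; it is not part of P4Ruby, and the method names are ours:

```ruby
require "set"

# Hypothetical stand-in for a server connection: it only counts queries.
class CountingServer
  attr_reader :queries

  def initialize(known_paths)
    @known_paths = known_paths
    @queries = 0
  end

  # One query per file, like asking about each file separately.
  def have?(path)
    @queries += 1
    @known_paths.include?(path)
  end

  # One query for the whole workspace, like a single bulk listing.
  def list_all
    @queries += 1
    @known_paths.dup
  end
end

# Many small queries: one round trip for every path found on disk.
def unknown_one_by_one(server, paths)
  paths.reject { |path| server.have?(path) }
end

# One bulk query, then purely local lookups.
def unknown_in_bulk(server, paths)
  known = Set.new(server.list_all)
  paths.reject { |path| known.include?(path) }
end
```

Both functions return the same list; for N files on disk, the first asks the stand-in N times and the second asks once.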

It turns out that "p4 fstat" returns the local pathname as one of the fields in its -Ztag output, which Python, P4Ruby, and P4Perl users see as a hash/dict/associative array. Although "p4 fstat" is not a trivial command, calling it a single time will still be less expensive than running other commands ("have" and "opened") for every possible file. Large sites should always examine this closely; it's often worth an ounce of investigation, or a note to Tech Support, to verify such assumptions.

There's a way to specify pathnames that describes all the files in the client workspace: "//myclient/...". It includes files mapped into my workspace (but not sync'ed) and files opened for add (but not yet submitted). This is a tidy way to get specific information about the files mapped to your workspace, without showing clutter that wouldn't be mapped to the local area anyhow. Hence, "p4 fstat //myclient/..." provides a local pathname for every file that is mapped into the local area. (Aside: The Tech Support folk that I consulted were happy to help, and pointed out that the //myclient/... version of the syntax helped optimize some database accesses.)

The results included all files, including those that had been officially deleted, so I added a bit of follow-up to remove that specific case:

  ret = p4.run_fstat("//#{cl_name}/...").delete_if { |r| r['headAction'] == 'delete' }
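
To see what that filter does without a server handy, here is a small sketch over hand-written records shaped like tagged fstat output. The records and the helper name 'live_client_files' are invented for illustration; we assume, too, that a file opened for add carries no 'headAction' field yet:

```ruby
# Hypothetical records, shaped like the hashes that tagged "p4 fstat" returns.
SAMPLE_FSTAT = [
  { "depotFile" => "//depot/a.c",   "clientFile" => "/ws/a.c",   "headAction" => "edit" },
  { "depotFile" => "//depot/old.c", "clientFile" => "/ws/old.c", "headAction" => "delete" },
  # Assumption: opened for add, not yet submitted, so no headAction field.
  { "depotFile" => "//depot/new.c", "clientFile" => "/ws/new.c" }
]

# Drop records whose head revision is a delete, then keep the local pathname.
def live_client_files(records)
  records.reject { |r| r["headAction"] == "delete" }
         .map { |r| r["clientFile"] }
end

p live_client_files(SAMPLE_FSTAT)
```

The sketch uses 'reject' rather than 'delete_if' so the sample array is left untouched; on a freshly returned result, either works.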

Fourth step: Figuring out what's on the disk

This really has nothing to do with P4Ruby, just with scripting. We needed a recursive directory list, and used the library functions to get it:

  allFilesPresent = []
  Find.find(cl_root) do |f|
    # Find.find yields paths built from cl_root, never ".." entries,
    # so no pruning is needed; File.file? also tolerates broken symlinks
    allFilesPresent << f if File.file?(f)
  end

The usual rule applies: prefer library functions. Writing this code yourself will usually be nastier and more error-prone.
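
If you'd like to try the directory walk on its own, here is a self-contained sketch that builds a throwaway tree in a temporary directory and lists the regular files in it. The helper name 'regular_files_under' and the file names are ours, chosen only for the example:

```ruby
require "find"
require "tmpdir"
require "fileutils"

# Collect every regular file below 'root', skipping the directories themselves.
def regular_files_under(root)
  files = []
  Find.find(root) do |path|
    files << path if File.file?(path)
  end
  files
end

# Build a small throwaway tree and walk it.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "src", "lib"))
  File.write(File.join(root, "README"), "hello")
  File.write(File.join(root, "src", "lib", "a.rb"), "")
  puts regular_files_under(root).sort
end
```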

Last step: Home free!

From there to the end, it's just a Ruby program. I invite you to look through the rest: it's just grabbing information from two sources, and the set difference ("puts This - That") makes it easy.
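
One detail in that last stretch deserves a quick illustration: the two lists only compare cleanly if both use the same path separator. A minimal sketch, with made-up Windows-style paths and hypothetical helper names ('normalize', 'compare_workspace'):

```ruby
# Map backslashes to forward slashes so both lists use one separator.
def normalize(path)
  path.tr("\\", "/")
end

# Return the two differences the script prints, after normalizing both lists.
def compare_workspace(perforce_paths, disk_paths)
  known   = perforce_paths.map { |path| normalize(path) }
  present = disk_paths.map { |path| normalize(path) }
  { unknown: present - known, missing: known - present }
end

# Hypothetical sample data: backslashes from fstat, forward slashes from Ruby.
from_fstat = ['C:\ws\a.c', 'C:\ws\b.c']
from_disk  = ["C:/ws/b.c", "C:/ws/scratch.txt"]
p compare_workspace(from_fstat, from_disk)
```

Without the 'normalize' step, every file in this sample would show up in both lists of differences, since no two strings would match.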


Reminder: Green text for P4Ruby hooks.
#

# num of calls to 'p4': 2 
# status:    tested on Darwin Mac OS X using P4Ruby API
#

require "P4"

require 'getoptlong'
require "find"

verboseOption = false
defaultPort = nil
defaultUser = nil
defaultClient = nil
options = GetoptLong.new(
	[ '--verbose', '-v',  GetoptLong::NO_ARGUMENT],
	[ '--user', '-u',  GetoptLong::REQUIRED_ARGUMENT],
	[ '--port', '-p',  GetoptLong::REQUIRED_ARGUMENT],
	[ '--client', '-c',  GetoptLong::REQUIRED_ARGUMENT],
	[ '--help', '-h',  GetoptLong::NO_ARGUMENT],
	[ '--quiet', '-q',  GetoptLong::NO_ARGUMENT]
)
options.each do |opt, arg|
	case opt
		when "--verbose"
			verboseOption = true
		when "--user"
			defaultUser = arg
		when "--client"
			defaultClient = arg
		when "--port"
			defaultPort = arg
		when "--quiet"
			puts "'--quiet' not implemented yet."
		when "--help"
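			# GetoptLong has no usage method, so print the summary ourselves
			puts "Usage: #{$0} [--verbose] [--user USER] [--port PORT] [--client CLIENT]"
			exit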
	end
end


p4 = P4.new
p4.port = defaultPort           if defaultPort != nil
p4.user = defaultUser           if defaultUser != nil
p4.client = defaultClient       if defaultClient != nil
p4.tagged
p4.parse_forms
p4.connect

begin
  #-----------------------------------------------------------
  # first call to P4: 'p4 client -o'
  #-----------------------------------------------------------
  cl_spec = p4.fetch_client
  cl_name = cl_spec['Client']
  cl_root = cl_spec['Root']

  #-----------------------------------------------------------
  # second call to P4: 'p4 fstat //myclient/...'
  #-----------------------------------------------------------
  ret = p4.run_fstat("//#{cl_name}/...").delete_if { |r| r['headAction'] == 'delete' }


  #
  # at this point, we create two arrays to hold
  # the filenames:
  #     allFilesPerforce - from "p4 fstat //myclient/..."
  #     allFilesPresent  - from "Find.find(cl_root)"
  # we can use set operations for the tricky stuff, and
  # it's a great advert for Ruby.
  #
  # (note that we map the path-separator to be '/', regardless
  # of platform. Ruby's polite about using '/' everywhere; the
  # output of "p4 fstat" uses '\' for Windows.)
  #
  allFilesPerforce = ret.collect { |r| r['clientFile'].tr('\\', '/') }

  allFilesPresent = []
  Find.find(cl_root) do |f|
    # Find.find yields paths built from cl_root, never ".." entries,
    # so no pruning is needed; File.file? also tolerates broken symlinks
    allFilesPresent << f if File.file?(f)
  end

  puts "List of files present in workspace, but unknown to Perforce:"
  puts (allFilesPresent - allFilesPerforce)

  puts "List of files known to Perforce, but not (yet) sync'ed to workspace:"
  puts (allFilesPerforce - allFilesPresent)


  rescue P4Exception
    p4.errors.each { |e| $stderr.puts( e ) }
    raise
end
p4.disconnect


Note: all the programs shown in these columns have been written four times: in Perl, in P4Perl, in Python, and in P4Ruby. Look into the Perforce example database for the other versions. $Id: //guest/jeff_bowles/scripts/0530ruby.html#1 $
© 2004 Perforce Corporation, Inc.