P4XFER-8 #2

# The form data below was edited by robert_cowham
# Perforce Workshop Jobs
#
#  Job:           The job name. 'new' generates a sequenced job number.
#
#  Status:        Job status; required field.  There is no enforced or
#                 promoted workflow for transition of jobs from one
#                 status to another, just a set of job status values
#                 for users to apply as they see fit.  Possible values:
#
#                 open - Issue is available to be worked on.
#
#                 inprogress - Active development is in progress.
#
#                 blocked - Issue cannot be implemented for some reason.
#
#                 fixed - Fixed, optional status to use before closed.
#                 
#                 closed - Issue has been dealt with definitively.
#
#                 punted - Decision made not to address the issue,
#                    possibly not ever.
#
#                 suspended - Decision made not to address the issue
#                    in the immediate future, but noting that it may
#                    have some merit and may be revisited later.
#
#                 duplicate - Duplicate of another issue.
#
#                 obsolete - The need behind the request has become
#                    overcome by events.
#
#  Project:       The project this job is for. Required.
#
#  Severity:      [A/B/C] (A is highest)  Required.
#
#  ReportedBy:    The user who created the job. Can be changed.
#
#  ReportedDate:  The date the job was created.  Automatic.
#
#  ModifiedBy:    The user who last modified this job. Automatic.
#
#  ModifiedDate:  The date this job was last modified. Automatic.
#
#  OwnedBy:       The owner, responsible for doing the job. Optional.
#
#  Description:   Description of the job.  Required.
#
#  DevNotes:      Developer's comments.  Optional.  Can be used to
#                 explain a status, e.g. for blocked, punted,
#                 obsolete or duplicate jobs.  May also provide
#                 additional information such as the earliest release
#                 in which a bug is known to exist.
#
#  Component:     Projects may use this optional field to indicate
#                 which component of the project a given job is associated
#                 with.
#
#                 For the SDP, the list of components is defined in:
#                 //guest/perforce_software/sdp/tools/components.txt
#
#  Type:          Type of job [Bug/Feature].  Required.
#
#  Release:       Release in which job is intended to be fixed.

Job:	P4XFER-8

Status:	open

Project:	perforce-software-p4transfer

Severity:	B

ReportedBy:	ronprestenback

ReportedDate:	2019/04/18 17:48:43

ModifiedBy:	robert_cowham

ModifiedDate:	2021/04/06 04:49:36

OwnedBy:	ronprestenback

Description:
	Encoding issues
	
	I encountered a file that contained an accented character in the filename: Melée.jpg.  That character was encoded as the single byte E9, but its UTF-8 encoding is C3 A9.  The file that was downloaded from the source server had the correct characters in the filename.  But the filename being tracked by the script had incorrect characters (usually the Unicode replacement character, U+FFFD).  When the script tries to open that file for add on the target server, it fails because the filename it has in memory no longer matches the name of the file on disk.  Additionally, the logging library threw exceptions when trying to log the filename, which led to some additional "fun".
	I'm still not sure exactly which encoding is the "right" one for this (since Python uses different names for some encodings than what I could find elsewhere online), but I was able to work around the issue by setting self.p4.encoding = '1252' in P4Base.connect.
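	
	A minimal sketch of the mismatch described above, in plain Python with no Perforce connection involved.  The byte string stands in for the filename as received; it is an illustration only, not the actual P4Transfer code path.

```python
# Filename bytes with the accented character stored as the single byte E9.
raw_name = b"Mel\xe9e.jpg"

# E9 is not valid UTF-8 here (it opens a multi-byte sequence that never
# completes), so decoding with errors="replace" substitutes U+FFFD.
as_utf8 = raw_name.decode("utf-8", errors="replace")

# Decoding as Windows-1252 recovers the intended character.
as_cp1252 = raw_name.decode("cp1252")

# Only the matching codec reproduces the original bytes when encoding back.
round_trip_ok = as_cp1252.encode("cp1252") == raw_name
round_trip_bad = as_utf8.encode("utf-8") == raw_name

print(repr(as_utf8), repr(as_cp1252), round_trip_ok, round_trip_bad)
```

	Once the replacement character has been substituted, the original byte cannot be recovered from the in-memory name, which fits the symptom of the tracked name no longer matching the file on disk.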
	
	This worked great for the filename, but introduced a new issue.  Now that the p4 connection was running in 1252 encoding, it threw exceptions when it encountered a changelist description (in a different changelist) that contained Microsoft "smart quotes", or as I like to call them, "screw-up-perforce-encoding quotes" (they also cause p4's encoding detection algorithm to fail when adding a text file if the Unicode characters don't occur in the first X bytes of an otherwise ANSI-compatible file... and they frequently don't).
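	
	One way such an exception can arise, assuming (this is not confirmed in the report) that the description bytes were UTF-8: the right curly quote U+201D is E2 80 9D in UTF-8, and Python's cp1252 codec has no mapping for byte 9D.  The description text and the helper try_decode below are hypothetical.

```python
def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Hypothetical changelist description with curly quotes, stored as UTF-8 bytes.
description = "Fixed the \u201cMelee\u201d import".encode("utf-8")

utf8_text = try_decode(description, "utf-8")      # decodes cleanly
cp1252_text = try_decode(description, "cp1252")   # None: byte 0x9D is unmapped

# The filename from the first paragraph behaves the opposite way.
name_cp1252 = try_decode(b"Mel\xe9e.jpg", "cp1252")  # decodes cleanly
name_utf8 = try_decode(b"Mel\xe9e.jpg", "utf-8")     # None: E9 is invalid UTF-8

print(utf8_text, cp1252_text, name_cp1252, name_utf8)
```

	Under that assumption, no single connection-wide encoding decodes both values, which would explain why fixing the filename broke the description.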
	
	Note: copy/pasting that é character will often change the encoding used for it, so you'll need to type it manually if you're trying to reproduce the issue.  The problematic byte is E9, which you can type on a US keyboard by holding ALT while typing 0233 on the numeric keypad (it has to be the numpad; the numbers at the top of the keyboard don't work, as those are accelerator/shortcut keys).  I had to use HexEdit4 to verify the character was still encoded as E9, because copy/pasting the filename would "helpfully" re-encode that character using UTF-8.
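	
	For repro purposes, the byte values can also be checked from Python rather than a hex editor.  A small sketch; hex_bytes is a hypothetical helper, and the character is built from its code point so that no copy/paste re-encoding is involved.

```python
def hex_bytes(text, encoding):
    """Return the bytes of `text` in `encoding` as space-separated upper-case hex."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))

# 0xE9 is decimal 233, the same value typed as ALT+0233 on the numeric keypad.
e_acute = chr(0xE9)

cp1252_hex = hex_bytes(e_acute, "cp1252")
utf8_hex = hex_bytes(e_acute, "utf-8")

print(cp1252_hex)
print(utf8_hex)
```

	The first line printed is the single-byte form that triggers the problem, and the second is the two-byte form that copy/paste tends to produce.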

DevNotes:
	There is a new charset option which may help with this.
	Note that invalid encodings on Windows are sometimes best handled with Python 2.7 on Windows...

Type:	Bug