SDP-654 #1

# The form data below was edited by tom_tyler
# Perforce Workshop Jobs
#
#  Job:           The job name. 'new' generates a sequenced job number.
#
#  Status:        Job status; required field.  There is no enforced or
#                 promoted workflow for transition of jobs from one
#                 status to another, just a set of job status values
#                 for users to apply as they see fit.  Possible values:
#
#                 open - Issue is available to be worked on.
#
#                 inprogress - Active development is in progress.
#
#                 blocked - Issue cannot be implemented for some reason.
#
#                 fixed - Fixed, optional status to use before closed.
#                 
#                 closed - Issue has been dealt with definitively.
#
#                 punted - Decision made not to address the issue,
#                    possibly not ever.
#
#                 suspended - Decision made not to address the issue
#                    in the immediate future, but noting that it may
#                    have some merit and may be revisited later.
#
#                 duplicate - Duplicate of another issue.
#
#                 obsolete - The need behind the request has been
#                    overcome by events.
#
#  Project:       The project this job is for. Required.
#
#  Severity:      [A/B/C] (A is highest)  Required.
#
#  ReportedBy:    The user who created the job. Can be changed.
#
#  ReportedDate:  The date the job was created.  Automatic.
#
#  ModifiedBy:    The user who last modified this job. Automatic.
#
#  ModifiedDate:  The date this job was last modified. Automatic.
#
#  OwnedBy:       The owner, responsible for doing the job. Optional.
#
#  Description:   Description of the job.  Required.
#
#  DevNotes:      Developer's comments.  Optional.  Can be used to
#                 explain a status, e.g. for blocked, punted,
#                 obsolete or duplicate jobs.  May also provide
#                 additional information such as the earliest release
#                 in which a bug is known to exist.
#
#  Component:     Projects may use this optional field to indicate
#                 which component of the project a given job is
#                 associated with.
#
#                 For the SDP, the list of components is defined in:
#                 //guest/perforce_software/sdp/tools/components.txt
#
#  Type:          Type of job [Bug/Feature].  Required.
#
#  Release:       Release in which job is intended to be fixed.

Job:	SDP-654

Status:	open

Project:	perforce-software-sdp

Severity:	C

ReportedBy:	tom_tyler

ReportedDate:	2021/06/18 15:19:12

ModifiedBy:	tom_tyler

ModifiedDate:	2021/06/18 15:19:12

OwnedBy:	tom_tyler

Description:
	Make automated clearing of ckp_running.txt configurable.
	
	After the ckp_running.txt semaphore file was introduced into
	the SDP checkpoint mechanism, the initial behavior was that the
	file would be cleared upon successful completion of a checkpoint,
	but remain in place until manually cleared after a failure.
	The idea was to force a human admin to investigate and resolve
	whatever caused the breakage. This also provided concurrency
	protection in case multiple admins were working on the system at
	the same time, or a human ran an off-schedule checkpoint that
	might interfere with a crontab-initiated one.
	
	That worked, but sometimes the semaphore file was left in place
	due to a temporary problem, e.g. a change of password for the
	'perforce' user without updating the SDP password file.  In
	environments where the SDP was unattended, checkpoints would stop
	even trying to run until the ckp_running.txt file was removed,
	even after the underlying problem was addressed (e.g. by updating
	the password file). Sites could go without checkpoints for longer
	than necessary due to not understanding how to clear that file.
	
	Along the way (@22658), the logic was changed so that, after
	reporting an error with ckp_running.txt, the ckp_running.txt was
	immediately removed.  This is a better choice for unattended
	environments, as the failure is reported only once, and then the
	next day, checkpoints continue to try to run.  However, this is
	possibly a suboptimal choice for well-managed environments where
	admins can be expected to be responsive to SDP notifications.
	
	Attentive admins might prefer the original behavior that demands
	and relies on the attention of human admins. Further, the original
	model provided protection against accidental concurrent operations
	that might occur if there are multiple admins; that protection
	is forgone with the current approach.
	
	It would be desirable to have the behavior configurable via
	a setting in the instance_vars file.  The default could be
	the current behavior, optimized for unattended operation.
	Admins at larger sites with active monitoring of the SDP
	(e.g. checking SDP emails and/or P4Prometheus) might choose the
	model where concurrent operation safety is maintained, and human
	admin interaction is required to resolve checkpoint failures.
	
	option

Component:	core-unix

Type:	Feature
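
The configurable behavior requested in the Description could be sketched
roughly as follows. This is only an illustration under stated assumptions,
not SDP code: the setting name SDP_AUTO_CLEAR_CKP_SEMAPHORE and the function
handle_ckp_semaphore are hypothetical, and the real checkpoint scripts may
structure this logic quite differently.

```shell
# Hypothetical setting that a site might place in its instance_vars file.
#   1 = clear ckp_running.txt after a failure is reported (current behavior,
#       suited to unattended operation). This is the assumed default.
#   0 = leave ckp_running.txt in place after a failure until an admin
#       removes it (original behavior).
: "${SDP_AUTO_CLEAR_CKP_SEMAPHORE:=1}"

# Hypothetical helper: decide what to do with the semaphore file once a
# checkpoint attempt has finished.
#   $1 = path to the semaphore file
#   $2 = exit code of the checkpoint attempt
# Prints the action taken so callers (or logs) can record it.
handle_ckp_semaphore () {
   if [ "$2" -eq 0 ]; then
      # A successful checkpoint clears the semaphore in both models.
      rm -f "$1"
      echo "cleared"
   elif [ "$SDP_AUTO_CLEAR_CKP_SEMAPHORE" = "1" ]; then
      # Failure with auto-clear enabled: remove the file so the next
      # scheduled checkpoint can still attempt to run.
      rm -f "$1"
      echo "cleared-after-failure"
   else
      # Failure with auto-clear disabled: keep the file so that an admin
      # must investigate, and concurrent runs remain blocked.
      echo "retained"
   fi
}
```

With the setting at 0, a failed run leaves the file in place and later runs
stay blocked until an admin intervenes. With the setting at 1, the failure is
reported once and the next scheduled checkpoint is free to try again.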