# The form data below was edited by tom_tyler
# Perforce Workshop Jobs
#
# Job: The job name. 'new' generates a sequenced job number.
#
# Status: Job status; required field. There is no enforced or
# promoted workflow for transition of jobs from one
# status to another, just a set of job status values
# for users to apply as they see fit. Possible values:
#
# open - Issue is available to be worked on.
#
# inprogress - Active development is in progress.
#
# blocked - Issue cannot be implemented for some reason.
#
# fixed - Fixed, optional status to use before closed.
#
# closed - Issue has been dealt with definitively.
#
# punted - Decision made not to address the issue,
# possibly not ever.
#
# suspended - Decision made not to address the issue
# in the immediate future, but noting that it may
# have some merit and may be revisited later.
#
# duplicate - Duplicate of another issue.
#
# obsolete - The need behind the request has been
# overcome by events.
#
# Project: The project this job is for. Required.
#
# Severity: [A/B/C] (A is highest) Required.
#
# ReportedBy: The user who created the job. Can be changed.
#
# ReportedDate: The date the job was created. Automatic.
#
# ModifiedBy: The user who last modified this job. Automatic.
#
# ModifiedDate: The date this job was last modified. Automatic.
#
# OwnedBy: The owner, responsible for doing the job. Optional.
#
# Description: Description of the job. Required.
#
# DevNotes: Developer's comments. Optional. Can be used to
# explain a status, e.g. for blocked, punted,
# obsolete or duplicate jobs. May also provide
# additional information such as the earliest release
# in which a bug is known to exist.
#
# Component: Projects may use this optional field to indicate
# which component of the project a given job is associated
# with.
#
# For the SDP, the list of components is defined in:
# //guest/perforce_software/sdp/tools/components.txt
#
# Type: Type of job [Bug/Feature]. Required.
#
# Release: Release in which job is intended to be fixed.
Job: SDP-654
Status: open
Project: perforce-software-sdp
Severity: C
ReportedBy: tom_tyler
ReportedDate: 2021/06/18 15:19:12
ModifiedBy: tom_tyler
ModifiedDate: 2021/06/18 15:19:12
OwnedBy: tom_tyler
Description:
Make automated clearing of ckp_running.txt configurable.
After the ckp_running.txt semaphore file was introduced into
the SDP checkpoint mechanism, the initial behavior was that the
file would be cleared upon successful completion of a checkpoint,
but remain in place until manually cleared after a failure.
The idea was to force a human admin to investigate and resolve
whatever caused the breakage. This also provided concurrency
protection in case multiple admins were working on the system at
the same time, or a human ran an off-schedule checkpoint that
might interfere with a crontab-initiated one.
That worked, but there were situations in which the semaphore
file was left in place due to a temporary problem, e.g. a change
of password for the 'perforce' user without updating the SDP
password file. In environments where the SDP was unattended,
checkpoints would stop even trying to run until the ckp_running.txt
file was removed, even if the underlying problem (e.g. the outdated
password file) was addressed. Sites could go without
checkpoints for longer than necessary due to not understanding
how to clear that file.
Along the way (@22658), the logic was changed so that, after
reporting an error with ckp_running.txt, the ckp_running.txt file
was immediately removed. This is a better choice for unattended
environments, as the failure is reported only once, and then the
next day, checkpoints continue to try to run. However, this is
possibly a suboptimal choice for well-managed environments where
admins can be expected to be responsive to SDP notifications.
Attentive admins might prefer the original behavior that demands
and relies on the attention of human admins. Further, the original
model provided protection against accidental concurrent operations
that might occur if there are multiple admins; that protection
was forgone with the current approach.
It would be desirable to have the behavior configurable via
a setting in the instance_vars file. The default could be
the current behavior, optimized for unattended operation.
Admins at larger sites with active monitoring of the SDP
(e.g. checking SDP emails and/or P4Prometheus) might choose the
model where concurrent operation safety is maintained, and human
admin interaction is required to resolve checkpoint failures.
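A minimal sketch of how the two behaviors could be selected by one
setting, written in Python purely for illustration (the SDP scripts
themselves are bash). The setting name CKP_CLEAR_SEMAPHORE_ON_FAILURE
and both function names are hypothetical, not existing SDP names, and
the sketch assumes the setting from instance_vars reaches the script
as an environment variable. It also assumes that finding an existing
ckp_running.txt is handled like any other checkpoint failure.

```python
import os
from pathlib import Path

# Hypothetical setting name; this job does not define one.
SETTING = "CKP_CLEAR_SEMAPHORE_ON_FAILURE"


def clear_on_failure_enabled(env=os.environ):
    """Default to the current behavior (clear) unless the setting is '0'."""
    return env.get(SETTING, "1") != "0"


def run_checkpoint(semaphore: Path, do_checkpoint, clear_on_failure=True):
    """Run do_checkpoint() guarded by a semaphore file.

    Returns True on success, False on any failure, including the case
    where the semaphore file already exists.
    """
    try:
        if semaphore.exists():
            raise RuntimeError(f"{semaphore.name} exists; last checkpoint not complete")
        semaphore.write_text("Checkpoint running.\n")
        do_checkpoint()
    except Exception as err:
        print(f"ERROR: {err}")  # stand-in for the SDP failure notification
        if clear_on_failure:
            # Current behavior: report once, then let the next run proceed.
            semaphore.unlink(missing_ok=True)
        # Otherwise (original behavior): leave the file for an admin to clear.
        return False
    semaphore.unlink()  # always cleared after a successful checkpoint
    return True
```

With clear_on_failure=False, a leftover ckp_running.txt keeps blocking
later runs until it is removed by hand, which is what preserves the
concurrency protection. With the default, a leftover file is reported
once and removed, so the next scheduled run can proceed.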
Component: core-unix
Type: Feature