session_log_2026-06-16.md #12

Session Log — 2026-06-16

Work on trim_excess_metadata.sh. Bumped from v2.2.0 to v3.0.0.

Ask

Review ToDo.md and implement all feasible items.
Set up a realistic lab test environment (full simulation of post-divestiture standalone server).
Run the script live against the test environment and verify results.
Add p4 snap phase to resolve lazy-copy archive leakage.

Key Decisions

Version: 3.0.0 — major change warranting major version bump.
Shelf delete failures: Keep as errors (not journal patch). User may handle via journal patch manually later.
Server spec to keep: p4d_ffr_gf (the filtered forwarding replica spec).
Style: All spec-piping operations use temp files (mktemp) so the spec content is visible in error messages.
p4 snap: New Phase 17. Without -snap flag, propose commands; with -snap, execute. Always run snap BEFORE applying the journal patch.
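
The temp-file style for spec piping could look roughly like the sketch below. The helper name `apply_spec` is hypothetical (it is not taken from the script); it reads spec text on stdin, runs the given command with the temp file as input, and echoes the spec if the command fails.

```shell
#!/usr/bin/env bash
# Hypothetical helper: feed a spec to a command via a temp file so the
# spec text can be shown if the command fails.
apply_spec() {
    local spec_file rc=0
    spec_file=$(mktemp) || return 1
    cat > "$spec_file"                 # capture spec text from stdin
    "$@" < "$spec_file" || rc=$?       # run the command with the spec as input
    if (( rc != 0 )); then
        echo "ERROR: '$*' failed (exit $rc). Spec was:" >&2
        cat "$spec_file" >&2
    fi
    rm -f "$spec_file"
    return "$rc"
}
```

A Phase 14 style call might then be `printf 'Typemap:\n' | apply_spec p4 typemap -i`.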

Lab Environment

Topology (Battle School Workshop — 5 servers, same subnet)

Host	Role	ServerID	Port
p4c-bos-01	Commit (master)	commit.p4demo.1	1999
p4c-bos-02	HA standby	p4d_ha_bos	1999
p4c-nyc-03	DR standby → sacrificed for test	p4d_fs_nyc → p4d_ffr_gf → p4d_commit_gf	1999
p4c-syd-04	Edge	p4d_edge_syd	1999
p4c-syd-05	Edge HA	p4d_ha_edge_syd	1999

All SSH accessible as perforce OS user without password.

Test Setup Procedure (Lab 0 → trim test environment on nyc-03)

Starting state: Standard Battle School Lab 0 reset.

Step 1: Add gf site tag.

if ! grep -q '^gf:' /p4/common/config/SiteTags.cfg 2>/dev/null; then
  echo 'gf: GF - Filtered Forwarding Replica test site' >> /p4/common/config/SiteTags.cfg
fi

Step 2: Run mkrep.sh (from bos-01, as perforce):

mkrep.sh -t ffr -s gf -r p4c-nyc-03 -i 1

Creates: server spec p4d_ffr_gf, service user svc_p4d_ffr_gf, all configurables.

Step 3: Add RevisionDataFilter and ArchiveDataFilter to p4d_ffr_gf server spec:

RevisionDataFilter: //jam/...
                    //pb/...
ArchiveDataFilter: //jam/...
                   //pb/...

Step 4: Rotate journal, create filtered seed checkpoint, stop p4d_fs_nyc, reset ServerID to p4d_ffr_gf, load checkpoint.

# On bos-01:
rotate_journal.sh 1
p4d_1 -r /p4/1/offline_db -P p4d_ffr_gf -J off -Z \
  -jd /p4/1/checkpoints.ffr_gf/p4_1.ckp.ffr_gf.1.gz

# On nyc-03:
sudo systemctl stop p4d_1
mkdir -p /p4/1/checkpoints.ffr_gf
echo 'p4d_ffr_gf' > /p4/1/root/server.id
load_checkpoint.sh /p4/1/checkpoints.ffr_gf/p4_1.ckp.ffr_gf.1.gz -i 1 -y
sudo systemctl start p4d_1

Fix service user password expiry (dm.user.resetpassword=1 triggers on new users):

# On bos-01:
p4 passwd svc_p4d_ffr_gf   # set a known password
# On nyc-03 — log the service user in:
p4 -p 1999 -u svc_p4d_ffr_gf login   # enter password set above

Step 5: Blast depot archives on nyc-03:

ssh perforce@p4c-nyc-03 'rm -rf /p4/1/depots/*'

Step 6: Pull filtered archives via p4verify:

ssh perforce@p4c-nyc-03 'p4verify.sh 1'

Expected: only //jam/... and //pb/... archive files pulled. But note: lazy-copy archives may land in /p4/1/depots/depot/ even though //depot/... is filtered — this is correct behaviour (see Lazy Copy section below).

Step 7: Promote to standalone commit:

# On nyc-03:
sudo systemctl stop p4d_1
echo 'p4d_commit_gf' > /p4/1/root/server.id   # new ID — avoids inheriting db.replication=readonly
sudo systemctl start p4d_1
# Clear auth.id so the standalone handles its own auth (no longer points to central auth cluster):
P4PORT=localhost:1999 p4 configure unset auth.id
# Do NOT touch run.users.authorize — just ensure you are logged in:
P4PORT=localhost:1999 p4 login   # enter admin password

Verify: p4 -p p4c-nyc-03:1999 info → ServerID: p4d_commit_gf, Services: standard

Step 8: Create test config files (on bos-01, in /home/perforce/tem/):

cat > .p4config.gf <<EOF
P4PORT=p4c-nyc-03:1999
P4USER=perforce
EOF
echo "perforce" > keep_users.gf.txt
echo "testers" > keep_groups.gf.txt   # Randall_Scott is sole member → tests last-group-member fix

Step 9: Run the script:

# Dry run first:
bash trim_excess_metadata.sh gf
# Then live:
bash trim_excess_metadata.sh gf -y
# Then with snap:
bash trim_excess_metadata.sh gf -y -snap

Lazy Copy / Archive Leakage (CRITICAL)

Background

Perforce uses a lazy copy mechanism for branching: instead of physically copying archive files, db.rev records in the new path point to the existing archive in the original path via db.storage. No physical file copy happens at branch time.

Consequence for filtered replication

When a RevisionDataFilter keeps //jam/... but filters out //depot/..., p4d during p4verify.sh will pull all archives needed to make //jam/... fully accessible — including archives physically stored in /p4/1/depots/depot/ that serve as the backing storage for files in //jam/... via lazy copy. This is correct and expected.

After the filtered replica is promoted to standalone:

//depot/... has 0 db.rev records (filtered) → appears empty
//depot/... db.storage still has entries for lazy-copy-source archives
Physical archives exist in /p4/1/depots/depot/ that //jam/... depends on
p4 depot -df depot → fails with "isn't empty of archive contents"

This is observed in our lab:

/p4/1/depots/depot/: 78 archive files (Jam MAIN src) — needed by //jam/...
p4 snap -n //jam/... confirms: 50+ files in //jam/... are lazy copies from //depot/Jam/MAIN/src/
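
The per-depot archive counts quoted in this log can be gathered with a small helper. This is only a sketch: `count_archives` is a hypothetical name, and it counts regular files under each depot directory, assuming the lab's /p4/1/depots layout as the default root.

```shell
#!/usr/bin/env bash
# Count regular files under each depot directory below a depots root.
# The default root matches the lab layout; pass another root as $1.
count_archives() {
    local root=${1:-/p4/1/depots} dir
    for dir in "$root"/*/; do
        [[ -d $dir ]] || continue
        printf '%s: %s\n' "$(basename "$dir")" \
            "$(find "$dir" -type f | wc -l | tr -d ' ')"
    done
}
```

Running it before and after the snap phase gives a quick view of how many archives moved into the kept depots.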

Resolution: p4 snap + journal patch

p4 snap //jam/... //pb/... — physically copies archives into their target depot directories, breaks lazy-copy chains. After snap: jam/ has 348 archive files, pb/ has 441 (up from 4 and 9 respectively).
Journal patch — removes depot spec entries from db.domain via p4d -jr <file> while offline.
After snap AND journal patch: archives in /p4/1/depots/depot/ are no longer referenced by any db.storage record for kept paths → can be safely deleted.

DO NOT do before snapping:

rm -rf /p4/1/depots/depot/ — would corrupt //jam/... by removing its lazy-copy backing archives.

Changes Made to trim_excess_metadata.sh (v2.2.0 → v3.0.0)

Bug fixes

Phase 7 last-group-member retry: when p4 user -df fails with "last member of group X", script extracts group name, adds p4admin as Owner via temp file, retries user deletion.
Duplicate Phase 2 block: removed duplicate client message block.
3× typo "excepot" → "except" in Phases 2/3/4.
p4 fix -d syntax: was p4 fix -d -c CL -j Job (invalid) → p4 fix -d -c CL Job (positional).
p4 server -d syntax: was p4 server -f -d serverID (invalid -f) → p4 server -d serverID.
Shelved CL failure counting: was msg (not counted) → errmsg (counted in ErrorCount).
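
The msg versus errmsg distinction can be sketched as follows. The function bodies are assumptions (the real ones are not shown in this log); only the names msg, errmsg, and ErrorCount come from the script, and `delete_shelf` is a hypothetical wrapper.

```shell
#!/usr/bin/env bash
# Sketch only: the real msg/errmsg bodies are not reproduced in this log.
declare -i ErrorCount=0

msg()    { echo -e "$*"; }
errmsg() { msg "\nError: ${1:-Unknown Error}\n"; ErrorCount+=1; }

# Hypothetical wrapper: a failed shelf delete is reported AND counted.
delete_shelf() {
    local cl=$1
    p4 shelve -d -c "$cl" > /dev/null 2>&1 ||
        errmsg "Could not delete shelved files in change $cl."
}
```

If the script exits with the error count, as the live test results suggest, each failed shelf delete then adds one to the exit code.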

New phases implemented

Phase 9: Job/fix cleanup — deletes fixes first (p4 fix -d -c CL Job), then jobs (p4 job -df); resets jobspec from default_jobspec.p4s.
Phase 11 (enhanced): Failed front-door depot deletions appended to journal patch file (trim_excess_metadata.<timestamp>.jnl); summary notes path and apply command.
Phase 12: Server spec cleanup — deletes all except p4d_ffr_gf.
Phase 13: Remote spec cleanup — deletes all.
Phase 14: Typemap reset — writes empty Typemap:\n via temp file.
Phase 15: Triggers reset — copies default_triggers.p4s via temp file.
Phase 16: Protections cleanup — rebuilds table keeping lines for kept users/groups; appends super entries for p4admin and perforce.
Phase 17: Snap lazy copies — proposes p4 snap //<depot>/... for each non-empty local/stream depot. Executes with -snap flag.

Operator Tips added to -man

Dry run first.
Space for journal bloat.
Run snap (with -snap) BEFORE applying journal patch.
After snap, filtered-out depot archive dirs can be safely deleted.
Apply journal patch while p4d is offline.

Live Test Results (on p4c-nyc-03)

Run 1 (bugs present)

Exit 39 = 33 fix failures (wrong -j syntax) + 6 server spec failures (wrong -f flag)
Journal patch created with 5 depot entries (HR, depot, gwt, gwt-streams, system) ✅
Phases 1-10, 13-16 all worked correctly

Run 2 (after syntax fixes)

Exit 0, no errors, 1 warning (Stream phase not implemented) ✅
Jobs: 17 deleted (plus 24 in run 1), Fixes: 33 deleted, Server specs: 6 deleted ✅

Run 3 (shelve error counting fix)

Exit 4 = 4 shelved CL failures (CLs in filtered depots, no shelved content to delete) ✅
These are expected and intentionally kept as errors per user decision.

Run 4 (with -snap)

p4 snap //jam/... → 50+ lazy copies resolved from //depot/Jam/MAIN/src/
p4 snap //pb/... → lazy copies resolved
jam/: 4 → 348 archive files; pb/: 9 → 441 archive files ✅
depot/ archives now orphaned (safe to delete after journal patch applied)

Post-run manual steps

Deleted orphaned archives: ssh perforce@p4c-nyc-03 "rm -rf /p4/1/depots/depot/" ✅ (HR, gwt, gwt-streams, system dirs were already absent — never pulled by p4verify)
Applied journal patch to delete 5 depot specs from db.domain (see below) ✅
Final live depot list: Perforce (remote), jam, pb, spec, unload ✅

Journal Patch Format — CRITICAL FINDINGS

Perforce journal verb glossary (from Tom, confirmed by testing)

Verb	Meaning	Notes
`@rv@`	replace value	Write a full DB record (live journal format)
`@dv@`	delete value	Delete a DB record; requires ALL fields
`@pv@`	put value	Write a full DB record (checkpoint format)
`@dl@`	delete librarian	Delete a versioned file/archive; NOT for DB records
`@ex@`	execute/commit	Controls when buffered data is flushed to DB during live replication — NEVER use in patches
`@vv@`	verify value	Triggers journal sequence check against db.counters — NEVER use in patches

Correct format for depot spec deletion

@dv@ 8 @db.domain@ @<name>@ <type> @<extra>@ @<mount>@ @<mount2>@ @<mount3>@ @<owner>@ <update> <access> <options> @<desc>@ @<stream>@ @<serverid>@ <contents>

All fields must be present, even if they are empty placeholders (@@ for strings, 0 for ints).
8 = db.domain table schema version.
Get the exact record via: p4d -r <P4ROOT> -jd - db.domain | grep " @db.domain@ @<name>@ "
Replace @pv@ with @dv@ on the line extracted from p4d -jd output.
The script now does this automatically (fetches full record at journal patch generation time).
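
The verb swap itself is a one-line substitution. The sketch below assumes exactly one logical record on stdin and changes only the leading verb on its first line; `pv_to_dv` is a hypothetical name, and the sample record is illustrative, not lab data.

```shell
#!/usr/bin/env bash
# Turn one record dumped by 'p4d -jd' into a delete record by swapping
# only the leading verb on the first line; later lines pass through.
pv_to_dv() {
    sed -e '1s/^@pv@ /@dv@ /'
}

# Illustrative record with made-up values.
sample='@pv@ 8 @db.domain@ @example_depot@ 100 @@ @@ @@ @@ @someuser@ 0 0 0 @Example description@ @@ @@ 0'
printf '%s\n' "$sample" | pv_to_dv
```

Restricting the substitution to line 1 leaves continuation lines of a multi-line description untouched.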

Multi-record patch files work correctly

Multiple @dv@ records in a single .jnl file are processed correctly by p4d -jr as long as each record has all required fields. Early testing failures were caused by key-only truncated records: the second line was being consumed as continuation fields of the first record.

Apply command

p4d -r <P4ROOT> -jr <patchfile>.jnl

P4ROOT is shown in the script summary output (from p4 -ztag -F %serverRoot% info -s).

What does NOT work (documented for reference)

@dl@ 8 @db.domain@ @HR@: "Bad transaction marker!" (@dl@ = delete librarian, not DB record)
@dv@ @db.domain@ @HR@ (no version #) — "Table HR not known" (parses key as table name)
@dv@ 8 @db.domain@ @HR@ with -jrF — "Bad opcode 'db.domain'" (wrong flag for journal format)
Key-only @dv@ (missing trailing fields) in a multi-record file — second record silently consumed as fields of first
@ex@ between records — @ex@ controls live-replication buffer flushing; it is bad news in patches and stops further processing

p4d Notes

dm.user.resetpassword=1: causes newly created service users to require password reset. Fix: run p4 passwd <svcuser> on the commit server, then p4 -u <svcuser> login from the replica.
p4 -ztag -F %fileCount% sizes -sah returns "" (empty string, not "0") for unreachable remote depots. Code guards against this using == 0 comparison (empty string ≠ "0") — keeps remote depot as "non-empty", which is safe (we don't delete it).
p4 depot -df checks db.storage (not just db.rev) for archive content. A depot with 0 db.rev records but db.storage entries will refuse deletion. This is the correct behaviour; use journal patch instead.
Promotion from ffr → standalone: change server.id to a NEW ID that has no scoped configurables. Reusing p4d_ffr_gf would inherit db.replication=readonly.
run.users.authorize=1: do NOT remove this configurable. It is a security control. Instead, ensure the operator running trim_excess_metadata.sh is logged in (valid ticket) before running the script. Use p4login -v or p4 login manually first. The script calls p4 users and other commands that require auth — a valid ticket is sufficient.
auth.id: when promoting a replica to standalone, the auth.id configurable (pointing to the central auth cluster) must be cleared (p4 configure unset auth.id) so the standalone server handles its own authentication. This is a one-time setup step, not something the trim script does.
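
The empty-string guard from the sizes note above can be sketched as a tiny predicate. `depot_is_empty` is a hypothetical name; the point is that only the literal string "0" counts as empty, so an empty reply from an unreachable remote depot keeps the depot.

```shell
#!/usr/bin/env bash
# Treat a depot as empty only when the reported file count is exactly "0".
# An empty string (unreachable remote depot) is deliberately NOT empty.
depot_is_empty() {
    local file_count=$1
    [[ "$file_count" == 0 ]]
}
```

A caller would pass in the fileCount value obtained from the tagged p4 sizes output, whatever it turned out to be.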

Files in /home/perforce/tem/ (not in P4 depot)

.p4config.gf — P4PORT=p4c-nyc-03:1999, P4USER=perforce, P4TICKETS=/home/perforce/tem/.p4tickets
keep_users.gf.txt — contains only perforce
keep_groups.gf.txt — contains only testers (tests last-group-member scenario)

Bugs Found and Fixed (Session Continuation)

Bug 1: SSH stdin consumes while-loop input (Phase 11 loop terminates early)

Symptom: Phase 11 only examined 1 depot (the first) regardless of how many were present.
Root cause: The while read -r DepotData; do ...; done < "$TmpFile" loop shares its stdin with every command in the loop body, including the ssh command (run to do the p4d -jd dump on the remote host). SSH reads from stdin, consuming all remaining depot lines from TmpFile, so the next read hits EOF and the loop exits early.
Fix: Added ssh -n (redirects SSH stdin from /dev/null) and added < /dev/null on the local p4d -jd invocation.
Impact: Previously, only the first empty depot got a journal patch entry. All subsequent depots were silently skipped.
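
The stdin leak is easy to reproduce without SSH. In the sketch below, `cat > /dev/null` stands in for ssh (like ssh without -n, it reads whatever is left on the loop's stdin); `count_iterations` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Demonstrate a loop body command swallowing the loop's remaining input.
count_iterations() {          # $1 = "leaky" or "fixed"
    local mode=$1 depot n=0
    while read -r depot; do
        n=$((n + 1))
        if [[ $mode == leaky ]]; then
            cat > /dev/null               # swallows the remaining lines
        else
            cat > /dev/null < /dev/null   # same idea as 'ssh -n'
        fi
    done < <(printf 'HR\ndepot\ngwt\n')
    echo "$n"
}

count_iterations leaky    # prints 1
count_iterations fixed    # prints 3
```

Redirecting the inner command's stdin from /dev/null is the general fix; ssh -n is the SSH-specific spelling of the same thing.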

Bug 2: `grep -m1` truncates multi-line db.domain records

Symptom: Journal patch replay failed: "End of input in middle of word! Bad quoting in journal file at line 2!"
Root cause: Some depot descriptions contain embedded newlines. In p4d -jd output, the @text@ field starts on one line and its closing @ is on the next. grep -m1 only captured the first line, leaving an unclosed @ field.
Fix: Replaced grep -m1 with an awk that collects the full logical record — from @pv@/@rv@ start through the next @pv@/@rv@ or EOF.
Affected depots in this lab: HR, gwt-streams, system (descriptions have trailing newline).

Bug 3: awk `exit` triggers END block (double-printing records)

Symptom: Each depot appeared twice in the journal patch: once as @dv@ and once as @pv@.
Root cause: In awk, calling exit from within a rule still executes the END block. The record was printed once in the rule (on exit) and again in the END block.
Fix: Added a found=1 flag set before exit, and guarded the END block with !found.
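
Bugs 2 and 3 together suggest an extractor along the following lines. This is a sketch, not the script's actual awk: `extract_domain_record` is a hypothetical name, and it assumes a continuation line never itself begins with `@pv@ ` or `@rv@ `.

```shell
#!/usr/bin/env bash
# Print the full logical db.domain record for one depot as a @dv@ record.
# A record runs from its @pv@/@rv@ line up to the next such line or EOF.
extract_domain_record() {     # $1 = depot name; 'p4d -jd' dump on stdin
    awk -v name="$1" '
        /^@(pv|rv)@ / {
            if (in_rec) { print rec; found = 1; exit }   # exit still runs END
            if (index($0, " @db.domain@ @" name "@ ") > 0) {
                in_rec = 1
                sub(/^@(pv|rv)@/, "@dv@")
                rec = $0
            }
            next
        }
        in_rec { rec = rec "\n" $0 }                     # continuation line
        END { if (in_rec && !found) print rec }          # record ran to EOF
    '
}
```

The `!found` guard in END is what prevents the record from being printed a second time after `exit`.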

Resulting journal patch format (correct)

Multi-line records now produce valid multi-line @dv@ entries, e.g.:

@dv@ 8 @db.domain@ @HR@ 100 @@ @@ @@ @@ @bruno@ 1297219747 1297219747 0 @Stream depot for Doc review
@ @@ @@ 0

Successful end-to-end test result

Phase 11: All 5 empty depot specs (HR, depot, gwt, gwt-streams, system) added to journal patch ✅
Phase 17: p4 snap //jam/... //pb/... resolved all lazy copies ✅ (jam: 4→348, pb: 9→441 archive files)
Journal patch applied: p4d -r /p4/1/root -jr <file> → exit 0 ✅
Post-patch p4 depots: only jam, pb, Perforce (remote), spec, unload remain ✅
p4 verify //jam/... //pb/...: clean ✅
Note: /p4/1/depots/depot/ still has 78 archive files — these are now safe to delete since snap resolved all lazy-copy references.

Note: `p4d -jr NONEXISTENT_FILE` is harmless

Tom confirmed: a p4d -jr call with a non-existent file simply exits with an error and does NOT affect the running database or require a restart. The "Recovering from..." message is normal p4d startup output.

Note: Journal patch file must be on the p4d server's filesystem

The p4d -r ROOT -jr FILE command reads FILE from the LOCAL filesystem of the machine running p4d. If p4d is on a remote host (nyc-03) and the script runs on bos-01, the operator must SCP or otherwise transfer the .jnl file to the remote host before applying.

p4 snap — Context: "Deep Rename" Operation

Tom provided useful context on where p4 snap fits in the broader SDP toolbox. A deep rename (making it look like a file always had its new path, including all historical revisions) uses a trio of commands:

p4 duplicate //depot/old/path/... //depot/new/path/...   # copy history to new path
p4 snap      //depot/new/path/... //depot/old/path/...   # break lazy copy: give new path its own archives
p4 obliterate //depot/old/path/...                        # remove the source path entirely

The snap step is what severs the lazy-copy link — after snap, //depot/new/path/... has its own physical archive files and no longer depends on the old path's archives. This makes the subsequent obliterate safe (nothing left pointing into the old archives).

In our divestiture handling:

We skip duplicate (the kept depots already existed with full history)
We run snap //jam/... //pb/... to give those depots their own physical archives (severing lazy-copy links into the filtered-out //depot/... archives)
We skip obliterate and instead remove the depot spec via journal patch, then rm -rf the orphaned archive dirs (Phase 18)

This makes Phase 17+18 the moral equivalent of step 2+3 of a deep rename. The duplicate step (step 1) already happened implicitly when the filtered forwarding replica was populated via p4verify.

Note: Deep renames involving streams or top-level depot renames are more complex admin operations; the above describes the basic non-stream case.

Phase 17b Redesign: Eliminate p4d -jd Back-Door (Change 55)

Motivation

The original Phase 11 used p4d -jd db.domain (back-door) to dump the database and extract verbatim journal records for constructing @dv@ delete entries. This requires knowing P4ROOT, having p4d in PATH, and ideally being on the p4d server host — none of which can be assumed in non-SDP customer environments.

New architecture

Primary path (p4 storage -d, p4d 2021.1+):

Phase 11: Try p4 depot -df — if fails, add to FailedDepots[] (don't journal-patch yet)
Phase 17: p4 snap for kept depots (resolves lazy-copy chains)
Phase 17b: For each failed depot:
- p4 storage -d -y //<depot>/... — removes orphaned db.storage entries
- Retry p4 depot -df — should now succeed (db.storage is clean)
- No offline window, no p4d access, no journal patch

Fallback path (journal patch via p4 dbschema): If p4 storage -d fails (unavailable or non-zero exit):

Call p4 dbschema db.domain — returns authoritative schema version + field types
Build @dv@ record dynamically: int*/intv fields → 0, text/key fields → @@
Only the depot name (key field) is set to its real value
Hardcoded schema v8 field-type list as final fallback if p4 dbschema also fails
Still requires operator to apply patch offline: p4d -r <P4ROOT> -jr <file>.jnl

p4 dbschema db.domain output (p4d 2025.2)

Schema version 8, 14 attrs (1 key + 13 non-key):

Field	Name	Type	Format	Placeholder
0	DOname	key	string	(key — real value)
1	DOtype	int8	string	0
2	DOextra	text	string	@@
3	DOmount	text	string	@@
4	DOmount2	text	string	@@
5	DOmount3	text	string	@@
6	DOowner	key	string	@@
7	DOupdate	intv	string	0
8	DOaccess	intv	string	0
9	DOoptions	int	integer	0
10	DOdesc	text	string	@@
11	DOstream	key	string	@@
12	DOserverid	key	string	@@
13	DOcontents	int	string	0

Generated @dv@ record format:

@dv@ 8 @db.domain@ @<depotname>@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
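
Building that record from the field types can be sketched as below. The array name DbDomainFieldTypes matches the variable listed for Phase 17b, but its contents here are typed in by hand from the schema v8 listing, and `build_dv_record` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Field types for db.domain schema v8, in field order (index 0 is the key).
DbDomainFieldTypes=(key int8 text text text text key intv intv int text key key int)

build_dv_record() {           # $1 = depot name
    local name=$1 rec i type
    rec="@dv@ 8 @db.domain@ @${name}@"
    for (( i = 1; i < ${#DbDomainFieldTypes[@]}; i++ )); do
        type=${DbDomainFieldTypes[i]}
        case $type in
            int*) rec+=" 0"  ;;    # int, int8, intv: numeric placeholder
            *)    rec+=" @@" ;;    # text, key: empty string placeholder
        esac
    done
    printf '%s\n' "$rec"
}

build_dv_record HR
```

Only the depot name carries a real value; every other field is a type-appropriate placeholder.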

Variables removed

P4PortHost, P4ServerAddr, DomainDumpFile — no longer needed

Variables added

FailedDepots[], FailedDepotMaps[], FailedDepotTypes[] — deferred from Phase 11
DepotsStorageCleaned — counter for Phase 17b successes
DbDomainSchema, DbDomainSchemaVer, DbDomainFieldTypes[] — fetched once in Phase 17b

UNTESTED — requires fresh lab run

The two new paths have NOT yet been tested against a live p4d:

Does p4 storage -d -y //<depot>/... succeed when db.storage entries have non-zero lbrRefCount (which may be stale after RevisionDataFilter)?
Does p4d -jr accept @dv@ with DOtype=0 (placeholder)? The correct stored value is 100 for all observed depot types.

Test Setup Automation (Change 54)

Created test/ directory with two scripts and a README:

test/setup_lab.sh — full Lab 0 → trim-ready setup (12 phases, idempotent)
test/run_trim_test.sh — dry run → live run → snap run; prints journal patch commands
test/README.md — full documentation including Battle School dependency

Current State of Changes

Change	Description
52	Bugs 1-3 fixed (SSH stdin, grep -m1 multi-line, awk exit+END)
53	Phase 18 added (orphaned archive cleanup, P4DepotRoot)
54	test/ directory: setup_lab.sh + run_trim_test.sh + README.md
55	Phase 17b: p4 storage -d primary + p4 dbschema fallback (removes p4d -jd)
56	Session close-out: keep files, resume notes
57	Session log update: Phase 17b design notes
58	Tom: Added .p4ignore sample file

2026-06-17 — Lab Reset + setup_lab.sh Debug Session

Lab State at Session Start

Lab was reset to Lab 0 baseline (journal counter at 43 due to overnight daily_checkpoint.sh cron — harmless). nyc-03 returned to p4d_fs_nyc as expected.

First Run of setup_lab.sh — Bugs Found

Ran bash test/setup_lab.sh. Several bugs found and fixed:

Bug A: P4CONFIG overrides SDP shell environment

Symptom: Phase 3 (p4 server -o p4d_ffr_gf) failed:

Access for user 'tom_tyler' has not been enabled by 'p4 protect'.

Cause: The calling shell had P4CONFIG pointing to .p4config.local, which sets P4USER=tom_tyler. The script's own p4 calls inherited this override — bypassing the SDP shell's P4USER=perforce.

Fix: Added unset P4CONFIG near the top of setup_lab.sh (before any p4 calls). Unset is the right approach; setting export P4USER=perforce would also work but is less robust. With P4CONFIG unset, the SDP shell environment (P4USER=perforce, P4PORT=1999) takes effect naturally.

Bug B: load_checkpoint.sh argument order wrong

Symptom: Phase 6 failed:

Error: Specified checkpoint does not exist: 1

Cause: Script called load_checkpoint.sh ${SDP_INSTANCE} '${FILTERED_CKP}' (instance first), but the correct SDP calling convention is checkpoint-file first:

load_checkpoint.sh <ckp_file> -i <instance> -y

The -y flag is also required to suppress interactive confirmation prompts.

Fix: Changed to load_checkpoint.sh '${FILTERED_CKP}' -i ${SDP_INSTANCE} -y.

Bug C: Phase 6 idempotency checked server.id file, not live p4d

Cause: Phase 6 would skip the checkpoint load if server.id already said p4d_ffr_gf. But the server.id file is written before load_checkpoint.sh runs — so a failed load left the file in place, causing the phase to be incorrectly skipped on re-run.

Fix: Changed idempotency check to connect to p4d on nyc-03 and verify it responds with the expected ServerID. If p4d is down or returns a different ID, the full setup runs.

Bug D: Phase 10 same server.id idempotency problem

Same as Bug C but for the promotion phase. Changed to check live p4d responds as p4d_commit_gf with services=standard.
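
An idempotency check against the live server, rather than the server.id file, might be sketched as follows. `p4d_is_live_as` is a hypothetical name, and the parsing assumes tagged output lines of the form `... ServerID <id>` and `... serverServices <services>`.

```shell
#!/usr/bin/env bash
# Return 0 only if a live p4d at $1 reports ServerID $2 and, when $3 is
# given, serverServices $3. A down server or a mismatch returns non-zero.
p4d_is_live_as() {
    local port=$1 want_id=$2 want_services=${3:-} info id services
    info=$(p4 -p "$port" -ztag info -s 2>/dev/null) || return 1
    id=$(awk '$1 == "..." && $2 == "ServerID" { print $3 }' <<< "$info")
    services=$(awk '$1 == "..." && $2 == "serverServices" { print $3 }' <<< "$info")
    [[ $id == "$want_id" ]] || return 1
    [[ -z $want_services || $services == "$want_services" ]]
}
```

A promotion phase could then skip itself only when `p4d_is_live_as p4c-nyc-03:1999 p4d_commit_gf standard` succeeds, so a half-finished earlier run is retried instead of skipped.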

Bug E: Phase 10 — p4 configure unset auth.id before p4login

Cause: p4 configure unset auth.id requires an authenticated connection. The original order was: configure first, then p4login. This would fail silently (had || true).

Fix: Swapped order — p4login -v 1 first, then p4 configure unset auth.id.

Bug F: Phase 9 p4verify.sh ran without perforce user ticket on FFR

Cause: Phase 7 only logged in the service user (svc_p4d_ffr_gf). Phase 9 runs p4verify.sh as the perforce OS user, which requires a valid ticket for the FFR. With run.users.authorize=1 and security=4, this would fail or fall back to anonymous access.

Fix: Added p4login -v 1 call on nyc-03 in Phase 7, after the service user login. With rpl.forward.login=1 on the FFR, this login is forwarded to the commit server and produces a ticket that the FFR also accepts.

All Bugs Fixed in Change 59 (pending)

All six bugs fixed in test/setup_lab.sh. Script now needs a clean lab run to verify. Second lab reset requested before session close.

Phase 17b still untested

The primary path (p4 storage -d -y) and the @dv@ fallback path in Phase 17b have still not been tested against a live p4d. This remains the top priority for the next session.

2026-06-17 — Continued: setup_lab.sh Debug + Phase 17b Live Validation

Additional Bug Found: Phase 2 Idempotency (Bug G)

p4 server -o <nonexistent> always returns a template with ServerID: <name> in it (p4d behavior, not a bug). The idempotency check grep "^ServerID:" therefore always fired, and mkrep.sh was always skipped. Fix: use p4 server --exists -o <name> which errors if the spec does not exist.

Additional Bug Found: Phase 6 Missing MD5 File (Bug H)

load_checkpoint.sh requires an accompanying .md5 file. The scp only copied the .gz file. Fix: also copy ${FILTERED_CKP}.md5. Also: the existing read-only checkpoint file on nyc-03 caused scp to fail on re-run — fixed by rm -f before scp.

Additional Bug Found: `services` vs `serverServices` ztag field (Bug I)

p4 -ztag info -s returns the field as serverServices, not services. This caused the verification check in Phase 12 of setup_lab.sh and the pre-flight check in run_trim_test.sh to always show a warning. Fixed in both scripts.

Additional Bug Found: Stale Auth Ticket After Promotion (Bug J)

After promoting nyc-03 to standalone and unsetting auth.id, the ticket in .p4tickets (issued via the FFR's rpl.forward.login=1) was no longer valid. p4login (SDP script) targets bos-01 and does not help here. Fix: use raw p4 login -a < .p4passwd.p4_1.admin with P4CONFIG set to .p4config.gf at end of Phase 11 in setup_lab.sh, and similarly in run_trim_test.sh pre-flight.

setup_lab.sh Now Runs to Completion ✅

All 12 phases completed successfully. Final state verified:

ServerID=p4d_commit_gf, Services=standard ✅
Archive counts: 78 files in depot/, 4 in jam/, 9 in pb/ ✅

trim test run_trim_test.sh ✅ — All 3 Passes

Pass 1 (dry run): exit 0 ✅
Pass 2 (live run): exit 4 ✅ (4 shelved CLs expected to fail)
Pass 3 (snap run): exit 4 ✅

Phase 17b — p4 storage -d Behavior (CONFIRMED)

p4 storage -d only removes entries where lbrRefCount = 0 (truly orphaned). In our test scenario, all 5 filtered-out depots had storage entries with non-zero lbrRefCount (e.g., //HR/draft/401k.rtf had lbrRefCount 3). These are NOT orphaned from p4d's perspective — they are referenced by storagesx or other tables.

Result: p4 storage -d -y ran without error ("Storage entries removed"), but p4 depot -df still refused to delete the depot because non-zero-refcount entries remain. Phase 17b correctly fell back to journal patch for all 5 depots.

Key insight: p4 storage -d helps when there are ZERO-refcount orphans (which can occur after snap resolves lazy copies). In this scenario there were none (the storage table was populated from the full checkpoint, not from lazy copies).

Journal Patch Format — CONFIRMED WORKING ✅

Generated patch:

@dv@ 8 @db.domain@ @HR@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @depot@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @gwt@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @gwt-streams@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @system@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0

Applied with: p4d -r /p4/1/root -jr <file> → exit 0 ✅
DOtype=0 (placeholder, real value is 100) was accepted without error.
Post-patch depot list: Perforce (remote), jam, pb, spec, unload, exactly as expected ✅

p4 verify //jam/... //pb/... — CLEAN ✅

348 files in //jam/..., 441 files in //pb/...: zero MISSING or BADDIGEST.
Orphaned archive dirs removed: HR, depot, gwt, gwt-streams, system.
Remaining: jam/, pb/, spec/ ✅

Change Summary for This Session

Bug	Fix
G	Phase 2 idempotency: `p4 server --exists -o <name>`
H	Phase 6: copy .md5 alongside .gz; rm -f existing files before scp
I	`serverServices` not `services` in p4 ztag info output (setup_lab.sh + run_trim_test.sh)
J	Fresh ticket in .p4tickets via raw `p4 login -a` after standalone promotion

Script Readiness Assessment (v3.0.0)

trim_excess_metadata.sh v3.0.0 has been tested end-to-end:

All phases run correctly
Phase 17b correctly handles the case where p4 storage -d is insufficient
Journal patch (@dv@ format) is confirmed valid
p4 verify clean after full trim + patch application

The script is functionally complete and ready for customer shipment. Remaining concern: the customer's data is much larger; the journal patch approach does not require any back-door access and scales linearly with number of depots to delete.

Change 61: Phase 16b — Spec Depot Obliterate/Delete (v3.1.0)

Problem

After a trim run, the spec depot (singleton depot type "spec") still had content in it. Root cause: Phases 14 (typemap), 15 (triggers), and 16 (protections) each write new versioned entries to the spec depot: typemap.p4s, a triggers entry (none in this test), and protect.p4s history. If Phase 11 obliterated the spec depot content, Phase 16 would write new entries after the obliterate, leaving the depot non-empty and non-deletable.

Fix

Added Phase 16b: runs AFTER Phase 16 (protections), obliterates spec depot content, then tries front-door delete → storage cleanup → journal patch (same cascade as other depots).

Also handled in Phase 11 restructuring:

remote depot type: p4 depot -d (no -f needed — metadata only, no archives)
spec depot type: record name/map to SpecDepots[]/SpecDepotMaps[] arrays for deferred Phase 16b
unload depot type: p4 depot -df directly (no snap needed; content from pre-filter period)
local/stream: existing flow (check empty, keep or attempt delete)

Test Results (nyc-03, 2026-06-17)

Non-snap live run:

Phase 11: spec depot recorded for Phase 16b
Phase 16b: p4 obliterate -y //spec/... → purged 2 revisions (protect.p4s#1, #2)
p4 depot -df spec → failed (orphaned db.storage entries from filtered checkpoint)
p4 storage -d → no-op (lbrRefCount=1 entries, not zero; from filtered checkpoint)
spec added to FailedDepots for Phase 17b

Snap run (after above live run):

Phase 16b: obliterate (1 more revision from Phase 16 rerun) → depot delete fails again
Phase 17b: p4 storage -d -y //spec/... → no-op (still lbrRefCount=1)
Journal patch generated: @dv@ 8 @db.domain@ @spec@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
Patch applied via p4d -r /p4/1/root -jr spec_patch.jnl
p4 depots → only jam, pb remain ✅
p4 verify //... → 0 MISSING ✅
Phase 18: rm -rf /p4/1/depots/spec applied manually → clean

Why orphaned lbrRefCount=1 entries persist

The filtered checkpoint includes db.storage entries for spec depot files (branch/*.p4s, client/*.p4s, etc.) with lbrRefCount=1, but NO corresponding db.rev records were replicated (filtered). So:

p4 obliterate can't purge them (no db.rev to target)
p4 storage -d won't remove them (lbrRefCount != 0)
Only journal patch (remove db.domain) + rm -rf (physical files) resolves this

This is the same root cause as all other "foreign" depots (HR, depot, gwt, etc.). The db.storage orphan entries remain in the database, but are harmless — they reference a depot that no longer exists. p4d does not enforce db.storage integrity against deleted db.domain entries.

Version bump

v3.0.0 → v3.1.0 for Phase 16b addition and Phase 11 case restructuring.

Session Close — 2026-06-17 19:16

Session closing for lab reset. Changes submitted:

Change 61: Phase 16b (spec depot deferred obliterate/delete)

State at close

trim_excess_metadata.sh v3.1.0 — change 61 submitted, DVCS clean
test/setup_lab.sh — working end-to-end (all bugs A-J fixed)
test/run_trim_test.sh — working end-to-end (all 3 passes validated)

What has NOT been tested yet (next session)

A full end-to-end cycle from Lab 0 after v3.1.0 changes — specifically:

The remote depot type (Phase 11 p4 depot -d branch)
The unload depot type (Phase 11 p4 depot -df branch)
The spec depot type (Phase 16b) — tested in isolation but not from Lab 0
All of the above in a single run_trim_test.sh invocation

Next session procedure

cd /home/perforce/tem
P4CONFIG=.p4config.local p4 fetch    # pull change 61
bash test/setup_lab.sh               # Lab 0 → trim-ready (all 12 phases)
export P4CONFIG=/home/perforce/tem/.p4config.gf
bash test/run_trim_test.sh           # dry + live + snap

Change 63: Full Reset Validation — setup_lab.sh Bug K + v3.1.0 Fix (2026-06-17)

Bug K: Phase 3 RevisionDataFilter idempotency (setup_lab.sh)

Symptom: After lab reset and re-run, p4 verify //jam/... showed MISSING files; p4 snap failed with "open for read: ...yyacc,v: No such file or directory".

Root cause: Phase 3 idempotency check grep -q "^RevisionDataFilter:" matched the EMPTY field that p4 always outputs in server spec templates. So Phase 3 ALWAYS skipped, leaving RevisionDataFilter empty. The unfiltered checkpoint included ALL metadata; p4verify only pulled jam/pb archives (6 pb files) but not the depot/ backing files for lazy copies.

Additional root cause: Even with the correct filter, p4 verify -q //depot/... on the FFR returns "no such file(s)" because the FFR's RevisionDataFilter excludes all //depot/... revision data. The depot/ backing archives ARE pulled by p4verify.sh (SDP) when it verifies //jam/... and //pb/..., because the FFR's db.storage entries reference depot/ archive paths and the FFR fetches them from master to satisfy verify.

Fix A: Phase 3 idempotency check changed to detect a non-empty value:

if grep -qE "^[[:space:]]+//" "$SPEC_TMP" && grep -A5 "^RevisionDataFilter:" "$SPEC_TMP" | grep -q "^[[:space:]]//"; then

Fix B: Phase 3 filter insertion changed from append (broken) to sed in-place:

sed -i "s|^RevisionDataFilter:\$|RevisionDataFilter:\n\t//jam/...\n\t//pb/...|" "$SPEC_TMP"
sed -i "s|^ArchiveDataFilter:\$|ArchiveDataFilter:\n\t//jam/...\n\t//pb/...|" "$SPEC_TMP"
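The non-empty check in Fix A can also be scoped to the one field, so that values under a neighbouring field do not satisfy it. The following is a minimal sketch rather than the code in setup_lab.sh: `has_revision_filter` is a hypothetical helper, the two spec files are made-up fragments, and it assumes filter values sit on indented lines below the field name, as Fix B's sed writes them.

```bash
# True if the spec file $1 has at least one indented //path under
# RevisionDataFilter (an empty "RevisionDataFilter:" line does not count).
has_revision_filter() {
    awk '
        /^RevisionDataFilter:/ { in_field = 1; next }
        /^[^ \t]/              { in_field = 0 }   # next unindented line ends the field
        in_field && /^[ \t]+\/\// { found = 1 }
        END { exit !found }
    ' "$1"
}

# Made-up spec fragments: one with the empty template field, one populated.
EmptySpec=$(mktemp)
FilledSpec=$(mktemp)
printf 'ServerID:\tp4d_ffr_gf\n\nRevisionDataFilter:\n\nArchiveDataFilter:\n\t//jam/...\n' > "$EmptySpec"
printf 'ServerID:\tp4d_ffr_gf\n\nRevisionDataFilter:\n\t//jam/...\n\t//pb/...\n' > "$FilledSpec"

if has_revision_filter "$EmptySpec"; then EmptyResult=set; else EmptyResult=unset; fi
if has_revision_filter "$FilledSpec"; then FilledResult=set; else FilledResult=unset; fi
rm -f "$EmptySpec" "$FilledSpec"
echo "empty field: $EmptyResult; populated field: $FilledResult"
```

The first fragment deliberately has a populated ArchiveDataFilter under an empty RevisionDataFilter, the case a fixed-size `grep -A5` window could misread.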

Fix C: Phase 9 — depot/ backing archives missing from nyc-03

The p4 verify -q //depot/... step was added but found to be a no-op (FFR has no //depot/... revision data). Investigation showed: with correct filter, p4verify.sh DOES pull the depot/ backing archives (78 files in depot/, including yyacc,v) because the FFR's db.storage references those paths. Archive counts after correct setup:

/p4/1/depots/depot/: 78 files (backing archives for lazy copies)
/p4/1/depots/jam/: 4 files (directly-modified jam revisions)
/p4/1/depots/pb/: 9 files (directly-modified pb revisions)

Zero MISSING files confirmed before trim test.

v3.1.0 Fix: Phase 16b error count correction

Symptom: Snap run exit code was 5 instead of expected 4.

Root cause: Phase 16b called errmsg when spec depot fell through to journal patch fallback. Phase 17b uses msg for the same scenario on other depots. Inconsistency.

Fix: Changed Phase 16b spec depot journal patch message from errmsg to msg.

Full End-to-End Test Results (v3.1.0, 2026-06-17)

Setup: setup_lab.sh ran to completion (all 12 phases clean). Archive counts correct.

Trim test (3 passes):

Pass	Exit	Status
Dry run	0	✅
Live run	4	✅
Snap run	4	✅

Depot type handling (all branches exercised):

remote (Perforce depot): p4 depot -d → deleted ✅
unload (unload depot): p4 depot -df → deleted ✅
spec (spec depot): Phase 16b obliterate → journal patch fallback ✅
local/stream (HR, depot, gwt, gwt-streams, system): snap → storage cleanup → journal patch ✅

Snap: 2 depots snapped (jam, pb) — no failures ✅
Journal patch: 6 entries (HR, depot, gwt, gwt-streams, system, spec) ✅
Post-patch: p4 depots → jam + pb only; p4 verify //jam/... //pb/... → 0 MISSING ✅

Script Readiness Assessment (v3.1.0)

trim_excess_metadata.sh v3.1.0 is fully validated end-to-end:

All depot types handled correctly
snap works (backing archives confirmed present after correct filter setup)
Journal patch confirmed valid and accepted by p4d -jr
p4 verify clean after full trim + patch + rm -rf
Exit codes consistent and documented

Ready for customer shipment.

Session Continuation — 2026-06-17 Evening (Changes 64–66, Ship)

Changes

Change	Description
64	Document streams limitation and Phase 10 empty-identifier noise in -man
65	Clarify "Empty identifier not allowed" root cause (p4d bug job101555/P4-19364)
66	Add BACKGROUND section to -man: full divestiture process overview

Change 64: Document known limitations

Added KNOWN LIMITATIONS section to -man output covering:

Phase 8 (Stream cleanup): Deferred to future version. Stream spec deletion requires careful ordering of parent/child relationships, mainline vs. virtual streams, and stream history. No impact for environments using only classic/local depots (the current production use case). Manual workaround: p4 stream -df <stream> after confirming no clients are mapped.
Phase 10 "Empty identifier not allowed": Benign noise from submitted CLs with empty description fields. CLs not deleted; does not affect trim or data integrity.

Also updated Phase 8 runtime message from alarming **NOT IMPLEMENTED** to clearer (deferred).

Change 65: Clarify empty-identifier root cause

Expanded KNOWN LIMITATIONS entry for Phase 10 errors to reference the specific p4d defect:

job101555 / P4-19364: "Unable to delete empty submitted changelist"
Those CLs cannot be deleted until a p4d fix is available
Cleanup deferred until p4d bug is resolved; no impact on trim operation

Change 66: BACKGROUND section

Added new BACKGROUND section to -man output before DESCRIPTION, covering the full divestiture workflow for operators unfamiliar with the process:

Configure the FFR with RevisionDataFilter scoped to divested depot paths
Wait for replication to stabilize; run p4 verify to confirm 0 MISSING archives
Promote FFR to standalone commit server (filtered checkpoint load, unset auth.id, services=standard)
Run this script to trim excess spec-level metadata
Post-trim validation (verify archives, apply journal patch, remove orphaned dirs, manual steps)

References test/setup_lab.sh and test/run_trim_test.sh as a concrete end-to-end example.

Final State: v3.1.0 Shipped

All validation complete. p4 push executed (see below).

Check	Result
All known bugs fixed (A–K)	✅
All depot types handled	✅
Dry/live/snap exit codes: 0/4/4	✅
p4 verify 0 MISSING post-trim	✅
-man BACKGROUND + KNOWN LIMITATIONS	✅
DVCS submitted, pushed to public depot	✅

Open Issues (deferred to future versions)

Issue	Tracking	Priority
Phase 8: Stream spec cleanup	Future version	Low (no streams in current prod data)
Phase 10: Cannot delete empty-description CLs	p4d bug job101555/P4-19364	Deferred until p4d fix
Phase 10: Suppress per-CL error noise (184 lines)	Internal improvement	Low
Extensions depot obliterate + cert cleanup	Manual step; EXTRA MANUAL STEPS documents this	Low

Session Wrap-Up — 2026-06-17 Late Evening (Changes 68–69)

Changes

Change	Description
68	v3.1.1: fix DepotsFailed decrement bug (DepotsFailed-=1 → (( DepotsFailed-- ))); regenerate command_summary.txt
69	v3.1.2: prefer explicit arithmetic form DepotsFailed=$((DepotsFailed-1)); regenerate command_summary.txt

Bug: DepotsFailed decrement (found via shellcheck SC2276)

DepotsFailed-=1 is not a valid bash assignment: outside an arithmetic context, bash supports += but has no -= compound assignment operator. Shellcheck reported SC2276 (error): "This is interpreted as a command name containing '='."

The line was silently a no-op (failed command invocation; no set -e), so DepotsFailed was never decremented after a successful Phase 17b storage cleanup. The summary line "Pending (journal patch required)" would show an inflated count. No effect on exit code or correctness of depot deletion, snap, or journal patch.

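Fix (v3.1.1): (( DepotsFailed-- ))
Revised (v3.1.2): DepotsFailed=$((DepotsFailed-1)), which is more explicit and readable.

Both leave the same value in DepotsFailed. They differ only in exit status: (( DepotsFailed-- )) returns non-zero when the pre-decrement value is 0, which is harmless here because the script does not use set -e. v3.1.2 is preferred for readability.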
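The silent no-op is easy to reproduce in isolation. This is a minimal sketch, not an excerpt from trim_excess_metadata.sh: the starting count of 3 and the `Status`, `AfterBroken`, and `AfterFixed` variables are illustrative only.

```bash
DepotsFailed=3

# Pre-v3.1.1 form: the shell looks for a command literally named
# "DepotsFailed-=1", fails to find it, and leaves the variable untouched.
Status=0
DepotsFailed-=1 2>/dev/null || Status=$?
AfterBroken=$DepotsFailed

# v3.1.2 form: plain assignment from an arithmetic expansion.
DepotsFailed=$((DepotsFailed-1))
AfterFixed=$DepotsFailed

echo "broken form: status=$Status, DepotsFailed=$AfterBroken"
echo "v3.1.2 form: DepotsFailed=$AfterFixed"
```

The non-zero status of the broken line is the only signal that anything went wrong, which is why it went unnoticed without set -e.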

Final shipped version: v3.1.2

shellcheck passes cleanly. command_summary.txt regenerated and reflects v3.1.2.

# Session Log — 2026-06-16

Work on `trim_excess_metadata.sh`. Bumped from **v2.2.0** to **v3.0.0**.

## Ask

1. Review ToDo.md and implement all feasible items.
2. Set up a realistic lab test environment (full simulation of post-divestiture standalone server).
3. Run the script live against the test environment and verify results.
4. Add `p4 snap` phase to resolve lazy-copy archive leakage.

## Key Decisions

- **Version**: 3.0.0 — major change warranting major version bump.
- **Shelf delete failures**: Keep as errors (not journal patch). User may handle via journal patch manually later.
- **Server spec to keep**: `p4d_ffr_gf` (the filtered forwarding replica spec).
- **Style**: All spec-piping operations use temp files (`mktemp`) so the spec content is visible in error messages.
- **p4 snap**: New Phase 17. Without `-snap` flag, propose commands; with `-snap`, execute. Always run snap BEFORE applying the journal patch.

## Lab Environment

### Topology (Battle School Workshop — 5 servers, same subnet)
| Host | Role | ServerID | Port |
|------|------|----------|------|
| p4c-bos-01 | Commit (master) | commit.p4demo.1 | 1999 |
| p4c-bos-02 | HA standby | p4d_ha_bos | 1999 |
| p4c-nyc-03 | DR standby → sacrificed for test | p4d_fs_nyc → p4d_ffr_gf → p4d_commit_gf | 1999 |
| p4c-syd-04 | Edge | p4d_edge_syd | 1999 |
| p4c-syd-05 | Edge HA | p4d_ha_edge_syd | 1999 |

All SSH accessible as `perforce` OS user without password.

### Test Setup Procedure (Lab 0 → trim test environment on nyc-03)

**Starting state**: Standard Battle School Lab 0 reset.

**Step 1**: Add `gf` site tag.
```bash
if ! grep -q '^gf' /p4/common/config/SiteTags.cfg 2>/dev/null; then
  echo 'gf: GF - Filtered Forwarding Replica test site' >> /p4/common/config/SiteTags.cfg
fi
```

**Step 2**: Run mkrep.sh (from bos-01, as perforce):
```bash
mkrep.sh -t ffr -s gf -r p4c-nyc-03 -i 1
```
Creates: server spec `p4d_ffr_gf`, service user `svc_p4d_ffr_gf`, all configurables.

**Step 3**: Add RevisionDataFilter and ArchiveDataFilter to `p4d_ffr_gf` server spec:
```
RevisionDataFilter: //jam/...
                    //pb/...
ArchiveDataFilter: //jam/...
                   //pb/...
```

**Step 4**: Rotate journal, create filtered seed checkpoint, stop p4d_fs_nyc, reset ServerID to p4d_ffr_gf, load checkpoint.
```bash
# On bos-01:
rotate_journal.sh 1
p4d_1 -r /p4/1/offline_db -P p4d_ffr_gf -J off -Z \
  -jd /p4/1/checkpoints.ffr_gf/p4_1.ckp.ffr_gf.1.gz

# On nyc-03:
sudo systemctl stop p4d_1
mkdir -p /p4/1/checkpoints.ffr_gf
echo 'p4d_ffr_gf' > /p4/1/root/server.id
load_checkpoint.sh /p4/1/checkpoints.ffr_gf/p4_1.ckp.ffr_gf.1.gz -i 1 -y
sudo systemctl start p4d_1
```

**Fix service user password expiry** (dm.user.resetpassword=1 triggers on new users):
```bash
# On bos-01:
p4 passwd svc_p4d_ffr_gf   # set a known password
# On nyc-03 — log the service user in:
p4 -p 1999 -u svc_p4d_ffr_gf login   # enter password set above
```

**Step 5**: Blast depot archives on nyc-03:
```bash
ssh perforce@p4c-nyc-03 'rm -rf /p4/1/depots/*'
```

**Step 6**: Pull filtered archives via p4verify:
```bash
ssh perforce@p4c-nyc-03 'p4verify.sh 1'
```
Expected: only //jam/... and //pb/... archive files pulled. But note: lazy-copy archives may land in /p4/1/depots/depot/ even though //depot/... is filtered — this is correct behaviour (see Lazy Copy section below).

**Step 7**: Promote to standalone commit:
```bash
# On nyc-03:
sudo systemctl stop p4d_1
echo 'p4d_commit_gf' > /p4/1/root/server.id   # new ID — avoids inheriting db.replication=readonly
sudo systemctl start p4d_1
# Clear auth.id so the standalone handles its own auth (no longer points to central auth cluster):
P4PORT=localhost:1999 p4 configure unset auth.id
# Do NOT touch run.users.authorize — just ensure you are logged in:
P4PORT=localhost:1999 p4 login   # enter admin password
```
Verify: `p4 -p p4c-nyc-03:1999 info` → `ServerID: p4d_commit_gf, Services: standard`

**Step 8**: Create test config files (on bos-01, in /home/perforce/tem/):
```bash
cat > .p4config.gf <<EOF
P4PORT=p4c-nyc-03:1999
P4USER=perforce
EOF
echo "perforce" > keep_users.gf.txt
echo "testers" > keep_groups.gf.txt   # Randall_Scott is sole member → tests last-group-member fix
```

**Step 9**: Run the script:
```bash
# Dry run first:
bash trim_excess_metadata.sh gf
# Then live:
bash trim_excess_metadata.sh gf -y
# Then with snap:
bash trim_excess_metadata.sh gf -y -snap
```

## Lazy Copy / Archive Leakage (CRITICAL)

### Background
Perforce uses a **lazy copy** mechanism for branching: instead of physically copying
archive files, db.rev records in the new path point to the *existing* archive in the
original path via db.storage. No physical file copy happens at branch time.

### Consequence for filtered replication
When a RevisionDataFilter keeps `//jam/...` but filters out `//depot/...`, p4d during
`p4verify.sh` will pull **all archives needed to make //jam/... fully accessible** —
including archives physically stored in `/p4/1/depots/depot/` that serve as the backing
storage for files in `//jam/...` via lazy copy. This is correct and expected.

**After the filtered replica is promoted to standalone:**
- `//depot/...` has 0 db.rev records (filtered) → appears empty
- `//depot/...` db.storage still has entries for lazy-copy-source archives
- Physical archives exist in `/p4/1/depots/depot/` that `//jam/...` depends on
- `p4 depot -df depot` → fails with "isn't empty of archive contents"

**This is observed in our lab:**
- `/p4/1/depots/depot/`: 78 archive files (Jam MAIN src) — needed by `//jam/...`
- `p4 snap -n //jam/...` confirms: 50+ files in `//jam/...` are lazy copies from `//depot/Jam/MAIN/src/`

### Resolution: p4 snap + journal patch
1. **`p4 snap //jam/... //pb/...`** — physically copies archives into their target depot directories, breaks lazy-copy chains. After snap: jam/ has 348 archive files, pb/ has 441 (up from 4 and 9 respectively).
2. **Journal patch** — removes depot spec entries from db.domain via `p4d -jr <file>` while offline.
3. **After snap AND journal patch**: archives in `/p4/1/depots/depot/` are no longer referenced by any db.storage record for kept paths → can be safely deleted.

### DO NOT do before snapping:
- `rm -rf /p4/1/depots/depot/` — would corrupt `//jam/...` by removing its lazy-copy backing archives.

## Changes Made to trim_excess_metadata.sh (v2.2.0 → v3.0.0)

### Bug fixes
- **Phase 7 last-group-member retry**: when `p4 user -df` fails with "last member of group X", script extracts group name, adds `p4admin` as Owner via temp file, retries user deletion.
- **Duplicate Phase 2 block**: removed duplicate client message block.
- **3× typo** `"excepot"` → `"except"` in Phases 2/3/4.
- **`p4 fix -d` syntax**: was `p4 fix -d -c CL -j Job` (invalid) → `p4 fix -d -c CL Job` (positional).
- **`p4 server -d` syntax**: was `p4 server -f -d serverID` (invalid -f) → `p4 server -d serverID`.
- **Shelved CL failure counting**: was `msg` (not counted) → `errmsg` (counted in ErrorCount).

### New phases implemented
- **Phase 9**: Job/fix cleanup — deletes fixes first (`p4 fix -d -c CL Job`), then jobs (`p4 job -df`); resets jobspec from `default_jobspec.p4s`.
- **Phase 11 (enhanced)**: Failed front-door depot deletions appended to journal patch file (`trim_excess_metadata.<timestamp>.jnl`); summary notes path and apply command.
- **Phase 12**: Server spec cleanup — deletes all except `p4d_ffr_gf`.
- **Phase 13**: Remote spec cleanup — deletes all.
- **Phase 14**: Typemap reset — writes empty `Typemap:\n` via temp file.
- **Phase 15**: Triggers reset — copies `default_triggers.p4s` via temp file.
- **Phase 16**: Protections cleanup — rebuilds table keeping lines for kept users/groups; appends super entries for p4admin and perforce.
- **Phase 17**: Snap lazy copies — proposes `p4 snap //<depot>/...` for each non-empty local/stream depot. Executes with `-snap` flag.

### Operator Tips added to -man
1. Dry run first.
2. Space for journal bloat.
3. Run snap (with `-snap`) BEFORE applying journal patch.
4. After snap, filtered-out depot archive dirs can be safely deleted.
5. Apply journal patch while p4d is offline.

## Live Test Results (on p4c-nyc-03)

### Run 1 (bugs present)
- Exit 39 = 33 fix failures (wrong -j syntax) + 6 server spec failures (wrong -f flag)
- Journal patch created with 5 depot entries (HR, depot, gwt, gwt-streams, system) ✅
- Phases 1-10, 13-16 all worked correctly

### Run 2 (after syntax fixes)
- Exit 0, no errors, 1 warning (Stream phase not implemented) ✅
- Jobs: 17 deleted (plus 24 in run 1), Fixes: 33 deleted, Server specs: 6 deleted ✅

### Run 3 (shelve error counting fix)
- Exit 4 = 4 shelved CL failures (CLs in filtered depots, no shelved content to delete) ✅
- These are expected and intentionally kept as errors per user decision.

### Run 4 (with -snap)
- `p4 snap //jam/...` → 50+ lazy copies resolved from //depot/Jam/MAIN/src/
- `p4 snap //pb/...` → lazy copies resolved
- jam/: 4 → 348 archive files; pb/: 9 → 441 archive files ✅
- depot/ archives now orphaned (safe to delete after journal patch applied)

### Post-run manual steps
- Deleted orphaned archives: `ssh perforce@p4c-nyc-03 "rm -rf /p4/1/depots/depot/"` ✅
  (HR, gwt, gwt-streams, system dirs were already absent — never pulled by p4verify)
- Applied journal patch to delete 5 depot specs from db.domain (see below) ✅
- Final live depot list: Perforce (remote), jam, pb, spec, unload ✅

## Journal Patch Format — CRITICAL FINDINGS

### Perforce journal verb glossary (from Tom, confirmed by testing)
| Verb   | Meaning            | Notes |
|--------|--------------------|-------|
| `@rv@` | replace version    | Write a full DB record (live journal format) |
| `@dv@` | **delete value**   | Delete a DB record — requires ALL fields |
| `@pv@` | put version        | Write a full DB record (checkpoint format) |
| `@dl@` | delete library     | Delete a versioned file/archive — NOT for DB records |
| `@ex@` | execute/commit     | Controls when buffered data is flushed to DB during live replication — **NEVER use in patches** |
| `@vv@` | verify value       | Triggers journal sequence check against db.counters — **NEVER use in patches** |

### Correct format for depot spec deletion
```
@dv@ 8 @db.domain@ @<name>@ <type> @<extra>@ @<mount>@ @<mount2>@ @<mount3>@ @<owner>@ <update> <access> <options> @<desc>@ @<stream>@ @<serverid>@ <contents>
```
- All fields must be present, even if they are empty placeholders (`@@` for strings, `0` for ints).
- `8` = db.domain table schema version.
- Get the exact record via: `p4d -r <P4ROOT> -jd - db.domain | grep " @db.domain@ @<name>@ "`
- Replace `@pv@` with `@dv@` on the line extracted from `p4d -jd` output.
- The script now does this automatically (fetches full record at journal patch generation time).

### Multi-record patch files work correctly
Multiple `@dv@` records in a single `.jnl` file are processed correctly by `p4d -jr` as long as
each record has all required fields. Early testing failures were caused by key-only truncated
records: the second line was being consumed as continuation fields of the first record.

### Apply command
```
p4d -r <P4ROOT> -jr <patchfile>.jnl
```
P4ROOT is shown in the script summary output (from `p4 -ztag -F %serverRoot% info -s`).

### What does NOT work (documented for reference)
- `@dl@ 8 @db.domain@ @HR@` — "Bad transaction marker!" (`@dl@` = delete library, not DB record)
- `@dv@ @db.domain@ @HR@` (no version #) — "Table HR not known" (parses key as table name)
- `@dv@ 8 @db.domain@ @HR@` with `-jrF` — "Bad opcode 'db.domain'" (wrong flag for journal format)
- Key-only `@dv@` (missing trailing fields) in a multi-record file — second record silently consumed as fields of first
- `@ex@` between records — `@ex@` controls live-replication buffer flushing; in a patch file it stops further processing

## p4d Notes
- `dm.user.resetpassword=1`: causes newly created service users to require password reset. Fix: `p4 passwd <svcuser>` then `p4 login <svcuser>` from replica.
- `p4 -ztag -F %fileCount% sizes -sah` returns `""` (empty string, not "0") for unreachable remote depots. Code guards against this using `== 0` comparison (empty string ≠ "0") — keeps remote depot as "non-empty", which is safe (we don't delete it).
- `p4 depot -df` checks db.storage (not just db.rev) for archive content. A depot with 0 db.rev records but db.storage entries will refuse deletion. This is the correct behaviour; use journal patch instead.
- Promotion from ffr → standalone: change `server.id` to a NEW ID that has no scoped configurables. Reusing `p4d_ffr_gf` would inherit `db.replication=readonly`.
- `run.users.authorize=1`: do NOT remove this configurable. It is a security control. Instead, ensure the operator running trim_excess_metadata.sh is logged in (valid ticket) before running the script. Use `p4login -v` or `p4 login` manually first. The script calls `p4 users` and other commands that require auth — a valid ticket is sufficient.
- `auth.id`: when promoting a replica to standalone, the `auth.id` configurable (pointing to the central auth cluster) must be cleared (`p4 configure unset auth.id`) so the standalone server handles its own authentication. This is a one-time setup step, not something the trim script does.
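The empty-`fileCount` guard noted above can be illustrated with a string comparison. This is a sketch under assumptions, not the script's actual test: `depot_state` is a hypothetical helper and the three inputs are sample values.

```bash
# Classify a depot from the raw %fileCount% text. Only a literal "0" means
# empty; an empty string (unreachable remote depot) stays "non-empty".
depot_state() {
    if [ "$1" = "0" ]; then
        echo empty
    else
        echo non-empty
    fi
}

Reachable=$(depot_state "0")
Unreachable=$(depot_state "")
Populated=$(depot_state "78")
echo "fileCount=0 -> $Reachable; fileCount='' -> $Unreachable; fileCount=78 -> $Populated"
```

A string comparison is used because a numeric test such as `[ "" -eq 0 ]` raises an error on the empty string instead of returning a clean false.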

## Files in /home/perforce/tem/ (not in P4 depot)
- `.p4config.gf` — P4PORT=p4c-nyc-03:1999, P4USER=perforce, P4TICKETS=/home/perforce/tem/.p4tickets
- `keep_users.gf.txt` — contains only `perforce`
- `keep_groups.gf.txt` — contains only `testers` (tests last-group-member scenario)

## Bugs Found and Fixed (Session Continuation)

### Bug 1: SSH stdin consumes while-loop input (Phase 11 loop terminates early)
- **Symptom**: Phase 11 only examined 1 depot (the first) regardless of how many were present.
- **Root cause**: The `while read -r DepotData; do ...; done < "$TmpFile"` loop shares its stdin with every command in the loop body, including the `ssh` command (run to do the `p4d -jd` dump on the remote host). SSH reads from stdin, consuming all remaining depot lines from TmpFile, so the next `read` hits EOF and the loop exits early.
- **Fix**: Added `ssh -n` (redirects SSH stdin from /dev/null) and added `< /dev/null` on the local `p4d -jd` invocation.
- **Impact**: Previously, only the first empty depot got a journal patch entry. All subsequent depots were silently skipped.
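
The failure mode can be reproduced without a remote host. The following is a minimal sketch, not the script's actual loop: `greedy` is a hypothetical stand-in for `ssh` (any command that reads stdin to EOF), and the depot names are the ones from this lab.

```bash
# Stand-in for ssh: any command that reads stdin until EOF.
greedy() { cat > /dev/null; }

TmpFile=$(mktemp)
printf '%s\n' HR depot gwt gwt-streams system > "$TmpFile"

# Broken: greedy inherits the loop's stdin and drains the remaining lines.
Broken=0
while read -r DepotData; do
    Broken=$((Broken + 1))
    greedy
done < "$TmpFile"

# Fixed: stdin comes from /dev/null, which is what `ssh -n` arranges.
Fixed=0
while read -r DepotData; do
    Fixed=$((Fixed + 1))
    greedy < /dev/null
done < "$TmpFile"

rm -f "$TmpFile"
echo "inherited stdin: $Broken depot(s) examined"
echo "stdin from /dev/null: $Fixed depot(s) examined"
```

Only the first depot is examined when stdin is inherited, and all five are examined once it is redirected.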

### Bug 2: `grep -m1` truncates multi-line db.domain records
- **Symptom**: Journal patch replay failed: "End of input in middle of word! Bad quoting in journal file at line 2!"
- **Root cause**: Some depot descriptions contain embedded newlines. In `p4d -jd` output, the `@text@` field starts on one line and its closing `@` is on the next. `grep -m1` only captured the first line, leaving an unclosed `@` field.
- **Fix**: Replaced `grep -m1` with an awk that collects the full logical record — from `@pv@`/`@rv@` start through the next `@pv@`/`@rv@` or EOF.
- **Affected depots in this lab**: HR, gwt-streams, system (descriptions have trailing newline).

### Bug 3: awk `exit` triggers END block (double-printing records)
- **Symptom**: Each depot appeared twice in the journal patch: once as `@dv@` and once as `@pv@`.
- **Root cause**: In awk, calling `exit` from within a rule still executes the END block. The record was printed once in the rule (on `exit`) and again in the END block.
- **Fix**: Added a `found=1` flag set before `exit`, and guarded the END block with `!found`.
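
The fixes for Bugs 2 and 3 combine into one small awk collector. This is a sketch of the technique, not the script's code: `extract_domain_record` and `sample_dump` are hypothetical names, and the `jam` record in the sample is made up so that the HR record has a successor.

```bash
# Print the full logical db.domain record for depot $1 from a `p4d -jd`-style
# dump on stdin, rewritten as a @dv@ delete record.
extract_domain_record() {
    awk -v name="$1" '
        # Any line starting with @pv@ or @rv@ begins a new record.
        /^@(pv|rv)@ / {
            if (collecting) { printf "%s", rec; found = 1; exit }
            if (index($0, " @db.domain@ @" name "@ ")) collecting = 1
        }
        collecting { rec = rec $0 "\n" }
        # exit still runs END; print here only if the record ran to EOF.
        END { if (collecting && !found) printf "%s", rec }
    ' | sed '1s/^@[pr]v@/@dv@/'
}

# Sample dump: a two-line HR record followed by a made-up single-line record.
sample_dump() {
    cat <<'EOF'
@pv@ 8 @db.domain@ @HR@ 100 @@ @@ @@ @@ @bruno@ 1297219747 1297219747 0 @Stream depot for Doc review
@ @@ @@ 0
@pv@ 8 @db.domain@ @jam@ 100 @@ @@ @@ @@ @bruno@ 1297219747 1297219747 0 @Made-up sample@ @@ @@ 0
EOF
}

HRPatch=$(sample_dump | extract_domain_record HR)    # ends at the next record
JamPatch=$(sample_dump | extract_domain_record jam)  # ends at EOF (END block)
printf '%s\n' "$HRPatch" "$JamPatch"
```

The HR lookup exercises the `exit` path and the jam lookup exercises the END path, so each record is printed exactly once either way.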

### Resulting journal patch format (correct)
Multi-line records now produce valid multi-line `@dv@` entries, e.g.:
```
@dv@ 8 @db.domain@ @HR@ 100 @@ @@ @@ @@ @bruno@ 1297219747 1297219747 0 @Stream depot for Doc review
@ @@ @@ 0 
```

### Successful end-to-end test result
- Phase 11: All 5 empty depot specs (HR, depot, gwt, gwt-streams, system) added to journal patch ✅
- Phase 17: `p4 snap //jam/... //pb/...` resolved all lazy copies ✅ (jam: 4→348, pb: 9→441 archive files)
- Journal patch applied: `p4d -r /p4/1/root -jr <file>` → exit 0 ✅
- Post-patch `p4 depots`: only jam, pb, Perforce (remote), spec, unload remain ✅
- `p4 verify //jam/... //pb/...`: clean ✅
- Note: `/p4/1/depots/depot/` still has 78 archive files — these are now safe to delete since snap resolved all lazy-copy references.

### Note: `p4d -jr NONEXISTENT_FILE` is harmless
Tom confirmed: a `p4d -jr` call with a non-existent file simply exits with an error and does NOT affect the running database or require a restart. The "Recovering from..." message is normal p4d startup output.

### Note: Journal patch file must be on the p4d server's filesystem
The `p4d -r ROOT -jr FILE` command reads FILE from the LOCAL filesystem of the machine running p4d. If p4d is on a remote host (nyc-03) and the script runs on bos-01, the operator must SCP or otherwise transfer the .jnl file to the remote host before applying.

## p4 snap — Context: "Deep Rename" Operation

Tom provided useful context on where `p4 snap` fits in the broader SDP toolbox.
A **deep rename** (making it look like a file always had its new path, including all historical revisions) uses a trio of commands:

```bash
p4 duplicate //depot/old/path/... //depot/new/path/...   # copy history to new path
p4 snap      //depot/new/path/... //depot/old/path/...   # break lazy copy: give new path its own archives
p4 obliterate //depot/old/path/...                        # remove the source path entirely
```

The `snap` step is what severs the lazy-copy link — after snap, `//depot/new/path/...` has its own physical archive files and no longer depends on the old path's archives.  This makes the subsequent `obliterate` safe (nothing left pointing into the old archives).

In our divestiture handling:
- We skip `duplicate` (the kept depots already existed with full history)
- We run `snap //jam/... //pb/...` to give those depots their own physical archives (severing lazy-copy links into the filtered-out `//depot/...` archives)
- We skip `obliterate` and instead remove the depot spec via journal patch, then `rm -rf` the orphaned archive dirs (Phase 18)

This makes Phases 17 and 18 the moral equivalent of steps 2 and 3 of a deep rename.
The `duplicate` step (step 1) already happened implicitly when the filtered forwarding replica was populated via `p4verify`.

**Note**: Deep renames involving streams or top-level depot renames are more complex admin operations; the above describes the basic non-stream case.

## Phase 17b Redesign — Eliminate p4d -jd Back-Door (Change 55)

### Motivation
The original Phase 11 used `p4d -jd db.domain` (back-door) to dump the database
and extract verbatim journal records for constructing `@dv@` delete entries.
This requires knowing P4ROOT, having p4d in PATH, and ideally being on the p4d
server host — none of which can be assumed in non-SDP customer environments.

### New architecture
**Primary path (p4 storage -d, p4d 2021.1+):**
1. Phase 11: Try `p4 depot -df` — if fails, add to `FailedDepots[]` (don't journal-patch yet)
2. Phase 17: `p4 snap` for kept depots (resolves lazy-copy chains)
3. Phase 17b: For each failed depot:
   - `p4 storage -d -y //<depot>/...` — removes orphaned db.storage entries
   - Retry `p4 depot -df` — should now succeed (db.storage is clean)
   - No offline window, no p4d access, no journal patch
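
The per-depot step of the primary path can be sketched as a small function. This is an illustration of the sequence described above, not the script's implementation: `retry_depot_delete` is a hypothetical name, and it assumes `p4` is on PATH with a valid ticket.

```bash
# Returns 0 if the depot was removed through the front door, non-zero if it
# must fall through to the journal-patch path.
retry_depot_delete() {
    # Remove orphaned db.storage entries first; without this the retry fails.
    p4 storage -d -y "//$1/..." || return 1
    # db.storage is now clean, so the front-door delete should succeed.
    p4 depot -df "$1"
}

# Usage sketch (depot names are this lab's; the counter mirrors the log's name):
#   for Depot in HR depot gwt gwt-streams system; do
#       if retry_depot_delete "$Depot"; then
#           DepotsStorageCleaned=$((DepotsStorageCleaned + 1))
#       fi
#   done
```

Returning early on a storage failure keeps the depot in the failed set, so the fallback path still sees it.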

**Fallback path (journal patch via p4 dbschema):**
If `p4 storage -d` fails (unavailable or non-zero exit):
- Call `p4 dbschema db.domain` — returns authoritative schema version + field types
- Build `@dv@` record dynamically: `int*/intv` fields → `0`, `text/key` fields → `@@`
- Only the depot name (key field) is set to its real value
- Hardcoded schema v8 field-type list as final fallback if `p4 dbschema` also fails
- Still requires operator to apply patch offline: `p4d -r <P4ROOT> -jr <file>.jnl`

### p4 dbschema db.domain output (p4d 2025.2)
Schema version 8, 14 attrs (1 key + 13 non-key):
| Field | Name     | Type | Format | Placeholder |
|-------|----------|------|--------|-------------|
| 0     | DOname   | key  | string | (key — real value) |
| 1     | DOtype   | int8 | string | 0 |
| 2     | DOextra  | text | string | @@ |
| 3     | DOmount  | text | string | @@ |
| 4     | DOmount2 | text | string | @@ |
| 5     | DOmount3 | text | string | @@ |
| 6     | DOowner  | key  | string | @@ |
| 7     | DOupdate | intv | string | 0 |
| 8     | DOaccess | intv | string | 0 |
| 9     | DOoptions| int  | integer| 0 |
| 10    | DOdesc   | text | string | @@ |
| 11    | DOstream | key  | string | @@ |
| 12    | DOserverid| key | string | @@ |
| 13    | DOcontents| int | string | 0 |

Generated `@dv@` record format:
```
@dv@ 8 @db.domain@ @<depotname>@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
```
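
The placeholder mapping can be sketched as a loop over the field types. This is a minimal illustration, not the script's code: `build_dv_record` is a hypothetical helper, and the type list passed to it is transcribed from fields 1 to 13 of the table above.

```bash
# Build a @dv@ record for db.domain.
#   $1 = schema version, $2 = depot name, remaining args = non-key field types
build_dv_record() {
    ver=$1
    name=$2
    shift 2
    rec="@dv@ $ver @db.domain@ @$name@"
    for t in "$@"; do
        case $t in
            int*) rec="$rec 0" ;;    # int, int8, intv
            *)    rec="$rec @@" ;;   # text, key
        esac
    done
    printf '%s\n' "$rec"
}

# Field types 1..13 from the schema v8 table (the hard-coded fallback list).
Record=$(build_dv_record 8 HR int8 text text text text key intv intv int text key key int)
printf '%s\n' "$Record"
```

For depot HR this yields the same shape as the generated record format shown above, with only the key field carrying a real value.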

### Variables removed
- `P4PortHost`, `P4ServerAddr`, `DomainDumpFile` — no longer needed

### Variables added
- `FailedDepots[]`, `FailedDepotMaps[]`, `FailedDepotTypes[]` — deferred from Phase 11
- `DepotsStorageCleaned` — counter for Phase 17b successes
- `DbDomainSchema`, `DbDomainSchemaVer`, `DbDomainFieldTypes[]` — fetched once in Phase 17b

### UNTESTED — requires fresh lab run
The two new paths have NOT yet been tested against a live p4d:
1. Does `p4 storage -d -y //<depot>/...` succeed when db.storage entries have
   non-zero lbrRefCount (which may be stale after RevisionDataFilter)?
2. Does `p4d -jr` accept `@dv@` with DOtype=0 (placeholder)?
   The correct stored value is 100 for all observed depot types.

## Test Setup Automation (Change 54)
Created `test/` directory with two scripts and a README:
- `test/setup_lab.sh` — full Lab 0 → trim-ready setup (12 phases, idempotent)
- `test/run_trim_test.sh` — dry run → live run → snap run; prints journal patch commands
- `test/README.md` — full documentation including Battle School dependency

## Current State of Changes
| Change | Description |
|--------|-------------|
| 52 | Bugs 1-3 fixed (SSH stdin, grep -m1 multi-line, awk exit+END) |
| 53 | Phase 18 added (orphaned archive cleanup, P4DepotRoot) |
| 54 | test/ directory: setup_lab.sh + run_trim_test.sh + README.md |
| 55 | Phase 17b: p4 storage -d primary + p4 dbschema fallback (removes p4d -jd) |
| 56 | Session close-out: keep files, resume notes |
| 57 | Session log update: Phase 17b design notes |
| 58 | Tom: Added .p4ignore sample file |

---

## 2026-06-17 — Lab Reset + setup_lab.sh Debug Session

### Lab State at Session Start
Lab was reset to Lab 0 baseline (journal counter at 43 due to overnight daily_checkpoint.sh cron —
harmless). nyc-03 returned to `p4d_fs_nyc` as expected.

### First Run of setup_lab.sh — Bugs Found

Ran `bash test/setup_lab.sh`. Several bugs found and fixed:

#### Bug A: P4CONFIG overrides SDP shell environment
**Symptom:** Phase 3 (`p4 server -o p4d_ffr_gf`) failed:
```
Access for user 'tom_tyler' has not been enabled by 'p4 protect'.
```
**Cause:** The calling shell had `P4CONFIG` pointing to `.p4config.local`, which sets
`P4USER=tom_tyler`. The script's own `p4` calls inherited this override — bypassing the
SDP shell's `P4USER=perforce`.

**Fix:** Added `unset P4CONFIG` near the top of `setup_lab.sh` (before any `p4` calls).
Unset is the right approach; setting `export P4USER=perforce` would also work but is
less robust. With P4CONFIG unset, the SDP shell environment (`P4USER=perforce`,
`P4PORT=1999`) takes effect naturally.

#### Bug B: load_checkpoint.sh argument order wrong
**Symptom:** Phase 6 failed:
```
Error: Specified checkpoint does not exist: 1
```
**Cause:** Script called `load_checkpoint.sh ${SDP_INSTANCE} '${FILTERED_CKP}'` (instance
first), but the correct SDP calling convention is checkpoint-file first:
```
load_checkpoint.sh <ckp_file> -i <instance> -y
```
The `-y` flag is also required to suppress interactive confirmation prompts.

**Fix:** Changed to `load_checkpoint.sh '${FILTERED_CKP}' -i ${SDP_INSTANCE} -y`.

#### Bug C: Phase 6 idempotency checked server.id file, not live p4d
**Cause:** Phase 6 would skip the checkpoint load if `server.id` already said `p4d_ffr_gf`.
But the server.id file is written *before* `load_checkpoint.sh` runs — so a failed load
left the file in place, causing the phase to be incorrectly skipped on re-run.

**Fix:** Changed idempotency check to connect to p4d on nyc-03 and verify it responds
with the expected ServerID. If p4d is down or returns a different ID, the full setup runs.
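
A live check of this kind might look like the sketch below. It is an assumption-laden illustration, not the code in setup_lab.sh: `live_server_is` is a hypothetical helper, and it assumes the tagged `p4 info` output exposes the server ID under the field name `ServerID`.

```bash
# Succeeds only if p4d at P4PORT $1 answers and reports ServerID $2.
# A stopped server, a failed connection, or a different ID all return non-zero,
# so the caller re-runs the full phase instead of skipping it.
live_server_is() {
    actual=$(p4 -p "$1" -ztag -F %ServerID% info -s 2>/dev/null) || return 1
    [ "$actual" = "$2" ]
}

# Usage sketch:
#   if live_server_is p4c-nyc-03:1999 p4d_ffr_gf; then
#       echo "Phase 6 already done; skipping checkpoint load."
#   fi
```

Asking the running server avoids trusting `server.id`, which is written before the load and therefore survives a failed load.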

#### Bug D: Phase 10 same server.id idempotency problem
Same as Bug C but for the promotion phase. Changed to check live p4d responds as
`p4d_commit_gf` with `services=standard`.

#### Bug E: Phase 10 — p4 configure unset auth.id before p4login
**Cause:** `p4 configure unset auth.id` requires an authenticated connection. The original
order was: configure first, then p4login. This would fail silently (had `|| true`).

**Fix:** Swapped order — `p4login -v 1` first, then `p4 configure unset auth.id`.

#### Bug F: Phase 9 p4verify.sh ran without perforce user ticket on FFR
**Cause:** Phase 7 only logged in the *service user* (`svc_p4d_ffr_gf`). Phase 9 runs
`p4verify.sh` as the perforce OS user, which requires a valid ticket for the FFR. With
`run.users.authorize=1` and `security=4`, this would fail or fall back to anonymous access.

**Fix:** Added `p4login -v 1` call on nyc-03 in Phase 7, after the service user login.
With `rpl.forward.login=1` on the FFR, this login is forwarded to the commit server and
produces a ticket that the FFR also accepts.

### All Bugs Fixed in Change 59 (pending)
All six bugs fixed in `test/setup_lab.sh`. Script now needs a clean lab run to verify.
Second lab reset requested before session close.

### Phase 17b still untested
The primary path (`p4 storage -d -y`) and the `@dv@` fallback path in Phase 17b have
still not been tested against a live p4d. This remains the top priority for the next session.

---

## 2026-06-17 — Continued: setup_lab.sh Debug + Phase 17b Live Validation

### Additional Bug Found: Phase 2 Idempotency (Bug G)
`p4 server -o <nonexistent>` always returns a template with `ServerID: <name>` in it
(p4d behavior, not a bug). The idempotency check `grep "^ServerID:"` therefore always
fired, and mkrep.sh was always skipped. Fix: use `p4 server --exists -o <name>` which
errors if the spec does not exist.
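A minimal sketch of the corrected Phase 2 guard, relying on the exit status of `p4 server --exists -o` rather than grepping the spec text. The function names are hypothetical, and the sketch only prints what it would do instead of calling `mkrep.sh`.

```bash
# server_spec_exists <ServerID>: true only if the spec is really stored on the server.
# Plain `p4 server -o` cannot be used here because it prints a template either way.
server_spec_exists() {
  p4 server --exists -o "$1" > /dev/null 2>&1
}

# ensure_replica_spec <ServerID>: idempotency guard around replica creation.
ensure_replica_spec() {
  local server_id="$1"
  if server_spec_exists "$server_id"; then
    echo "Server spec $server_id already exists; skipping mkrep.sh."
  else
    echo "Server spec $server_id not found; mkrep.sh would run here."
  fi
}
```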

### Additional Bug Found: Phase 6 Missing MD5 File (Bug H)
`load_checkpoint.sh` requires an accompanying `.md5` file. The scp only copied
the `.gz` file. Fix: also copy `${FILTERED_CKP}.md5`. Also: the existing read-only
checkpoint file on nyc-03 caused scp to fail on re-run — fixed by `rm -f` before scp.

### Additional Bug Found: `services` vs `serverServices` ztag field (Bug I)
`p4 -ztag info -s` returns the field as `serverServices`, not `services`. This caused
the verification check in Phase 12 of setup_lab.sh and the pre-flight check in
run_trim_test.sh to always show a warning. Fixed in both scripts.

### Additional Bug Found: Stale Auth Ticket After Promotion (Bug J)
After promoting nyc-03 to standalone and unsetting `auth.id`, the ticket in
`.p4tickets` (issued via the FFR's `rpl.forward.login=1`) was no longer valid.
`p4login` (SDP script) targets bos-01 and does not help here.
Fix: use raw `p4 login -a < .p4passwd.p4_1.admin` with `P4CONFIG` set to `.p4config.gf`
at end of Phase 11 in setup_lab.sh, and similarly in run_trim_test.sh pre-flight.
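The Bug J fix amounts to a raw login against the promoted server. A hedged sketch, with a hypothetical wrapper name `refresh_ticket` and the file names taken from the entry above:

```bash
# refresh_ticket <p4config file> <password file>
# Requests a fresh ticket directly from the server named in the given P4CONFIG
# file, bypassing the SDP p4login script (which targets bos-01).
refresh_ticket() {
  local p4config="$1" passwd_file="$2"

  if [[ ! -r "$passwd_file" ]]; then
    echo "Cannot read password file: $passwd_file" >&2
    return 1
  fi

  # P4CONFIG is set for this one command only; the password arrives on stdin.
  P4CONFIG="$p4config" p4 login -a < "$passwd_file"
}
```

Usage, run from the directory that holds the config file: `refresh_ticket .p4config.gf .p4passwd.p4_1.admin`.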

### setup_lab.sh Now Runs to Completion ✅
All 12 phases completed successfully. Final state verified:
- ServerID=p4d_commit_gf, Services=standard ✅
- Archive counts: 78 files in depot/, 4 in jam/, 9 in pb/ ✅

### trim test run_trim_test.sh ✅ — All 3 Passes

- **Pass 1 (dry run): exit 0** ✅
- **Pass 2 (live run): exit 4** ✅ (4 shelved CLs expected to fail)
- **Pass 3 (snap run): exit 4** ✅

### Phase 17b — p4 storage -d Behavior (CONFIRMED)
`p4 storage -d` only removes entries where `lbrRefCount = 0` (truly orphaned).
In our test scenario, all 5 filtered-out depots had storage entries with non-zero
lbrRefCount (e.g., //HR/draft/401k.rtf had lbrRefCount 3). These are NOT orphaned
from p4d's perspective — they are referenced by storagesx or other tables.

Result: `p4 storage -d -y` ran without error ("Storage entries removed"), but
`p4 depot -df` still refused to delete the depot because non-zero-refcount entries
remain. Phase 17b correctly fell back to journal patch for all 5 depots.

Key insight: `p4 storage -d` helps when there are ZERO-refcount orphans (which
can occur after snap resolves lazy copies). In this scenario there were none
(the storage table was populated from the full checkpoint, not from lazy copies).
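The delete cascade described above (front-door delete, then storage cleanup, then journal patch) can be sketched like this. It is an illustration under assumptions, not the Phase 17b code: the function name `delete_depot_cascade` is hypothetical, while the `FailedDepots` array name and the `p4` command forms are the ones recorded in this log.

```bash
# Depots that could not be deleted through the front door; these end up in the journal patch.
declare -a FailedDepots=()

# delete_depot_cascade <depot>: prints which path resolved the depot.
delete_depot_cascade() {
  local depot="$1"

  # 1. Front-door delete.
  if p4 depot -df "$depot" > /dev/null 2>&1; then
    echo "deleted"
    return 0
  fi

  # 2. Remove zero-refcount storage entries, then retry the delete.
  #    Entries with a non-zero lbrRefCount are left alone by p4d.
  p4 storage -d -y "//${depot}/..." > /dev/null 2>&1 || true
  if p4 depot -df "$depot" > /dev/null 2>&1; then
    echo "deleted-after-storage-cleanup"
    return 0
  fi

  # 3. Fall back: record the depot for the journal patch.
  FailedDepots+=("$depot")
  echo "journal-patch"
}
```

Call the function directly rather than inside `$( )` when the `FailedDepots` update needs to survive, since a command substitution runs in a subshell.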

### Journal Patch Format — CONFIRMED WORKING ✅
Generated patch:
```
@dv@ 8 @db.domain@ @HR@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @depot@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @gwt@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @gwt-streams@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
@dv@ 8 @db.domain@ @system@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0
```
Applied with: `p4d -r /p4/1/root -jr <file>` — exit 0 ✅
DOtype=0 (placeholder, real value is 100) was accepted without error.
Post-patch depot list: Perforce (remote), jam, pb, spec, unload — exactly expected ✅
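For reference, a patch in the format shown above can be generated with a few lines of shell. This is a sketch assuming the placeholder-field record layout confirmed in this test; the function name `make_depot_patch` is hypothetical.

```bash
# make_depot_patch <depot>...: print one @dv@ db.domain record per depot name,
# using the placeholder field values that p4d -jr accepted in this test.
make_depot_patch() {
  local depot
  for depot in "$@"; do
    printf '@dv@ 8 @db.domain@ @%s@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0\n' "$depot"
  done
}
```

For example, `make_depot_patch HR depot gwt gwt-streams system > depots_patch.jnl` would reproduce the five records above; the file is then applied on the server host with `p4d -r /p4/1/root -jr depots_patch.jnl`.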

### p4 verify //jam/... //pb/... — CLEAN ✅
348 files in //jam/..., 441 files in //pb/... — zero MISSING or BADDIGEST.
Orphaned archive dirs removed: HR, depot, gwt, gwt-streams, system.
Remaining: jam/, pb/, spec/ ✅

### Change Summary for This Session
| Bug | Fix |
|-----|-----|
| G | Phase 2 idempotency: `p4 server --exists -o <name>` |
| H | Phase 6: copy .md5 alongside .gz; rm -f existing files before scp |
| I | `serverServices` not `services` in p4 ztag info output (setup_lab.sh + run_trim_test.sh) |
| J | Fresh ticket in .p4tickets via raw `p4 login -a` after standalone promotion |

### Script Readiness Assessment (v3.0.0)
trim_excess_metadata.sh v3.0.0 has been tested end-to-end:
- All phases run correctly
- Phase 17b correctly handles the case where p4 storage -d is insufficient
- Journal patch (@dv@ format) is confirmed valid
- p4 verify clean after full trim + patch application

The script is functionally complete and ready for customer shipment.
Remaining concern: the customer's data is much larger than the lab data. Mitigating factor: the
journal patch approach does not require any back-door access, and the patch grows linearly with
the number of depots to delete (one `@dv@` record per depot).

---

## Change 61: Phase 16b — Spec Depot Obliterate/Delete (v3.1.0)

### Problem
After a trim run, the `spec` depot (singleton depot type "spec") still had content in it.
Root cause: Phases 14 (typemap), 15 (triggers), and 16 (protections) each write new versioned
entries to the spec depot: typemap.p4s and protect.p4s history in this test (the test had no
triggers). If Phase 11 obliterated the spec depot content, Phase 16 would write new entries
after the obliterate, leaving the depot non-empty and non-deletable.

### Fix
Added **Phase 16b**: runs AFTER Phase 16 (protections), obliterates spec depot content,
then tries front-door delete → storage cleanup → journal patch (same cascade as other depots).

Also handled in Phase 11 restructuring:
- `remote` depot type: `p4 depot -d` (no `-f` needed — metadata only, no archives)
- `spec` depot type: record name/map to `SpecDepots[]`/`SpecDepotMaps[]` arrays for deferred Phase 16b
- `unload` depot type: `p4 depot -df` directly (no snap needed; content from pre-filter period)
- `local`/`stream`: existing flow (check empty, keep or attempt delete)
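The restructured Phase 11 dispatch can be pictured as a `case` on depot type. The sketch below only prints the planned action for each type; the function name `plan_depot_action` and the message wording are hypothetical, and the real phase runs the commands instead of echoing them.

```bash
# plan_depot_action <depot type> <depot name>: print the Phase 11 action for a depot.
plan_depot_action() {
  local type="$1" name="$2"
  case "$type" in
    remote)
      # Metadata only, no archives: no -f needed.
      echo "p4 depot -d $name" ;;
    spec)
      # Phases 14-16 still write to the spec depot, so defer it.
      echo "defer $name to Phase 16b" ;;
    unload)
      echo "p4 depot -df $name" ;;
    local|stream)
      echo "existing flow for $name" ;;
    *)
      echo "unhandled depot type '$type' for $name" >&2
      return 1 ;;
  esac
}
```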

### Test Results (nyc-03, 2026-06-17)

**Non-snap live run:**
- Phase 11: spec depot recorded for Phase 16b
- Phase 16b: `p4 obliterate -y //spec/...` → purged 2 revisions (protect.p4s#1, #2)
- `p4 depot -df spec` → failed (orphaned db.storage entries from filtered checkpoint)
- `p4 storage -d` → no-op (lbrRefCount=1 entries, not zero; from filtered checkpoint)
- spec added to FailedDepots for Phase 17b

**Snap run (after above live run):**
- Phase 16b: obliterate (1 more revision from Phase 16 rerun) → depot delete fails again
- Phase 17b: `p4 storage -d -y //spec/...` → no-op (still lbrRefCount=1)
- Journal patch generated: `@dv@ 8 @db.domain@ @spec@ 0 @@ @@ @@ @@ @@ 0 0 0 @@ @@ @@ 0`
- Patch applied via `p4d -r /p4/1/root -jr spec_patch.jnl`
- `p4 depots` → only jam, pb remain ✅
- `p4 verify //...` → 0 MISSING ✅
- Phase 18: `rm -rf /p4/1/depots/spec` applied manually → clean

### Why orphaned lbrRefCount=1 entries persist
The filtered checkpoint includes db.storage entries for spec depot files (branch/*.p4s,
client/*.p4s, etc.) with lbrRefCount=1, but NO corresponding db.rev records were
replicated (filtered). So:
- `p4 obliterate` can't purge them (no db.rev to target)
- `p4 storage -d` won't remove them (lbrRefCount != 0)
- Only journal patch (remove db.domain) + rm -rf (physical files) resolves this

This is the same root cause as all other "foreign" depots (HR, depot, gwt, etc.).
The db.storage orphan entries remain in the database, but are harmless — they
reference a depot that no longer exists. p4d does not enforce db.storage integrity
against deleted db.domain entries.

### Version bump
v3.0.0 → v3.1.0 for Phase 16b addition and Phase 11 case restructuring.

---

## Session Close — 2026-06-17 19:16

Session closing for lab reset. Changes submitted:
- Change 61: Phase 16b (spec depot deferred obliterate/delete)

### State at close
- `trim_excess_metadata.sh` v3.1.0 — change 61 submitted, DVCS clean
- `test/setup_lab.sh` — working end-to-end (all bugs A-J fixed)
- `test/run_trim_test.sh` — working end-to-end (all 3 passes validated)

### What has NOT been tested yet (next session)
A full end-to-end cycle from Lab 0 after v3.1.0 changes — specifically:
- The `remote` depot type (Phase 11 `p4 depot -d` branch)
- The `unload` depot type (Phase 11 `p4 depot -df` branch)
- The `spec` depot type (Phase 16b) — tested in isolation but not from Lab 0
- All of the above in a single run_trim_test.sh invocation

### Next session procedure
```bash
cd /home/perforce/tem
P4CONFIG=.p4config.local p4 fetch    # pull change 61
bash test/setup_lab.sh               # Lab 0 → trim-ready (all 12 phases)
export P4CONFIG=/home/perforce/tem/.p4config.gf
bash test/run_trim_test.sh           # dry + live + snap
```


---

## Change 63: Full Reset Validation — setup_lab.sh Bug K + v3.1.0 Fix (2026-06-17)

### Bug K: Phase 3 RevisionDataFilter idempotency (setup_lab.sh)

**Symptom**: After lab reset and re-run, `p4 verify //jam/...` showed MISSING files;
`p4 snap` failed with "open for read: ...yyacc,v: No such file or directory".

**Root cause**: Phase 3 idempotency check `grep -q "^RevisionDataFilter:"` matched the
EMPTY field that p4 always outputs in server spec templates. So Phase 3 ALWAYS skipped,
leaving RevisionDataFilter empty. The unfiltered checkpoint included ALL metadata; p4verify
only pulled jam/pb archives (6 pb files) but not the depot/ backing files for lazy copies.

**Additional root cause**: Even with the correct filter, `p4 verify -q //depot/...` on the
FFR returns "no such file(s)" because //depot/... has no revision data in the FFR's
RevisionDataFilter. The depot/ backing archives ARE pulled by p4verify.sh (SDP) when it
verifies //jam/... and //pb/... — because the FFR's db.storage entries reference depot/
archive paths, and the FFR fetches them from master to satisfy verify.

**Fix A**: Phase 3 idempotency check changed to detect a non-empty value:
```bash
if grep -qE "^[[:space:]]+//" "$SPEC_TMP" && grep -A5 "^RevisionDataFilter:" "$SPEC_TMP" | grep -q "^[[:space:]]//"; then
```
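An alternative way to express the same "field has a non-empty value" test is a small awk state machine that only looks at lines belonging to the `RevisionDataFilter:` field. This is a swapped-in technique for illustration, not the grep pair used in Fix A, and the function name `has_revision_data_filter` is hypothetical.

```bash
# has_revision_data_filter <spec file>
# Returns 0 if RevisionDataFilter has at least one //path value, else 1.
# An empty "RevisionDataFilter:" line (always present in spec templates) does not count,
# and neither do values that belong to a different field such as ArchiveDataFilter.
has_revision_data_filter() {
  local spec_file="$1"
  awk '
    /^RevisionDataFilter:/ {
      in_field = 1
      # A value on the same line as the field name counts too.
      if ($0 ~ /^RevisionDataFilter:[ \t]*\/\//) found = 1
      next
    }
    /^[^ \t#]/ { in_field = 0 }               # any other field name ends the block
    in_field && /^[ \t]+\/\// { found = 1 }   # indented //path inside the block
    END { exit found ? 0 : 1 }
  ' "$spec_file"
}
```

Typical use would be `has_revision_data_filter "$SPEC_TMP" || insert the filter`, so the phase is skipped only when the filter is really set.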

**Fix B**: Phase 3 filter insertion changed from append (broken) to sed in-place:
```bash
sed -i "s|^RevisionDataFilter:\$|RevisionDataFilter:\n\t//jam/...\n\t//pb/...|" "$SPEC_TMP"
sed -i "s|^ArchiveDataFilter:\$|ArchiveDataFilter:\n\t//jam/...\n\t//pb/...|" "$SPEC_TMP"
```

**Fix C**: Phase 9, depot/ backing archives missing from nyc-03

The `p4 verify -q //depot/...` step was added but found to be a no-op (FFR has no
//depot/... revision data). Investigation showed: with correct filter, p4verify.sh
DOES pull the depot/ backing archives (78 files in depot/, including yyacc,v) because
the FFR's db.storage references those paths. Archive counts after correct setup:

- /p4/1/depots/depot/: 78 files (backing archives for lazy copies)
- /p4/1/depots/jam/: 4 files (directly-modified jam revisions)
- /p4/1/depots/pb/: 9 files (directly-modified pb revisions)

Zero MISSING files confirmed before trim test.

### v3.1.0 Fix: Phase 16b error count correction

**Symptom**: Snap run exit code was 5 instead of the expected 4.

**Root cause**: Phase 16b called `errmsg` when the spec depot fell through to the journal patch
fallback, while Phase 17b uses `msg` for the same scenario on other depots. The extra `errmsg`
call added one to the error count, and therefore to the exit code.

**Fix**: Changed the Phase 16b spec depot journal patch message from `errmsg` to `msg`.
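Why the choice of `msg` versus `errmsg` changes the exit code can be seen in a small sketch. It assumes, based on the exit codes observed above, that the script exits with its accumulated error count; the function bodies and the `ErrorCount` variable name are illustrative, not copied from trim_excess_metadata.sh.

```bash
declare -i ErrorCount=0

# msg: informational output only.
msg() { echo -e "$*"; }

# errmsg: report a real error and count it toward the exit code.
errmsg() {
  msg "Error: ${1:-Unknown Error}"
  ErrorCount=$((ErrorCount+1))
}

# The journal patch fallback is an expected outcome, so it is reported with msg
# and leaves the error count unchanged.
note_journal_patch_fallback() {
  msg "Depot $1 deferred to journal patch."
}
```

With four shelved-CL failures reported through `errmsg` and the spec depot fallback reported through `note_journal_patch_fallback`, a closing `exit "$ErrorCount"` would yield 4 rather than 5.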

### Full End-to-End Test Results (v3.1.0, 2026-06-17)

**Setup**: setup_lab.sh ran to completion (all 12 phases clean). Archive counts correct.
**Trim test** (3 passes):

| Pass | Exit | Status |
|------|------|--------|
| Dry run | 0 | ✅ |
| Live run | 4 | ✅ |
| Snap run | 4 | ✅ |

**Depot type handling (all branches exercised)**:
- `remote` (Perforce depot): `p4 depot -d` → deleted ✅
- `unload` (unload depot): `p4 depot -df` → deleted ✅
- `spec` (spec depot): Phase 16b obliterate → journal patch fallback ✅
- `local`/`stream` (HR, depot, gwt, gwt-streams, system): snap → storage cleanup → journal patch ✅

**Snap**: 2 depots snapped (jam, pb) — no failures ✅
**Journal patch**: 6 entries (HR, depot, gwt, gwt-streams, system, spec) ✅
**Post-patch**: `p4 depots` → jam + pb only; `p4 verify //jam/... //pb/...` → 0 MISSING ✅

### Script Readiness Assessment (v3.1.0)
trim_excess_metadata.sh v3.1.0 is **fully validated end-to-end**:
- All depot types handled correctly
- snap works (backing archives confirmed present after correct filter setup)
- Journal patch confirmed valid and accepted by p4d -jr
- p4 verify clean after full trim + patch + rm -rf
- Exit codes consistent and documented

**Ready for customer shipment.**

---

## Session Continuation — 2026-06-17 Evening (Changes 64–66, Ship)

### Changes

| Change | Description |
|--------|-------------|
| 64 | Document streams limitation and Phase 10 empty-identifier noise in -man |
| 65 | Clarify "Empty identifier not allowed" root cause (p4d bug job101555/P4-19364) |
| 66 | Add BACKGROUND section to -man: full divestiture process overview |

### Change 64: Document known limitations

Added **KNOWN LIMITATIONS** section to `-man` output covering:

1. **Phase 8 (Stream cleanup)**: Deferred to future version. Stream spec deletion requires careful
   ordering of parent/child relationships, mainline vs. virtual streams, and stream history. No
   impact for environments using only classic/local depots (the current production use case).
   Manual workaround: `p4 stream -df <stream>` after confirming no clients are mapped.

2. **Phase 10 "Empty identifier not allowed"**: Benign noise from submitted CLs with empty
   description fields. CLs not deleted; does not affect trim or data integrity.

Also updated Phase 8 runtime message from alarming `**NOT IMPLEMENTED**` to clearer `(deferred)`.

### Change 65: Clarify empty-identifier root cause

Expanded KNOWN LIMITATIONS entry for Phase 10 errors to reference the specific p4d defect:
- **job101555 / P4-19364**: "Unable to delete empty submitted changelist"
- Those CLs cannot be deleted until a p4d fix is available
- Cleanup deferred until p4d bug is resolved; no impact on trim operation

### Change 66: BACKGROUND section

Added new **BACKGROUND** section to `-man` output before DESCRIPTION, covering the full
divestiture workflow for operators unfamiliar with the process:

1. Configure the FFR with `RevisionDataFilter` scoped to divested depot paths
2. Wait for replication to stabilize; run `p4 verify` to confirm 0 MISSING archives
3. Promote FFR to standalone commit server (filtered checkpoint load, unset auth.id, services=standard)
4. Run this script to trim excess spec-level metadata
5. Post-trim validation (verify archives, apply journal patch, remove orphaned dirs, manual steps)

References `test/setup_lab.sh` and `test/run_trim_test.sh` as a concrete end-to-end example.

### Final State: v3.1.0 Shipped

All validation complete. `p4 push` executed (see below).

| Check | Result |
|-------|--------|
| All known bugs fixed (A–K) | ✅ |
| All depot types handled | ✅ |
| Dry/live/snap exit codes: 0/4/4 | ✅ |
| p4 verify 0 MISSING post-trim | ✅ |
| -man BACKGROUND + KNOWN LIMITATIONS | ✅ |
| DVCS submitted, pushed to public depot | ✅ |

### Open Issues (deferred to future versions)

| Issue | Tracking | Priority |
|-------|----------|----------|
| Phase 8: Stream spec cleanup | Future version | Low (no streams in current prod data) |
| Phase 10: Cannot delete empty-description CLs | p4d bug job101555/P4-19364 | Deferred until p4d fix |
| Phase 10: Suppress per-CL error noise (184 lines) | Internal improvement | Low |
| Extensions depot obliterate + cert cleanup | Manual step; EXTRA MANUAL STEPS documents this | Low |

---

## Session Wrap-Up — 2026-06-17 Late Evening (Changes 68–69)

### Changes

| Change | Description |
|--------|-------------|
| 68 | v3.1.1: fix DepotsFailed decrement bug (DepotsFailed-=1 → (( DepotsFailed-- ))); regenerate command_summary.txt |
| 69 | v3.1.2: prefer explicit arithmetic form DepotsFailed=$((DepotsFailed-1)); regenerate command_summary.txt |

### Bug: DepotsFailed decrement (found via shellcheck SC2276)

`DepotsFailed-=1` is not valid as a bash statement. Outside an arithmetic context, bash supports
the `+=` assignment operator but has no `-=`; `-=` works only inside `(( ))` or `$(( ))`.
Shellcheck reported SC2276 (error): "This is interpreted as a command name containing '='."

The line was silently a no-op (failed command invocation; no `set -e`), so `DepotsFailed` was
never decremented after a successful Phase 17b storage cleanup.  The summary line "Pending
(journal patch required)" would show an inflated count.  No effect on exit code or correctness
of depot deletion, snap, or journal patch.

Fix (v3.1.1): `(( DepotsFailed-- ))`
Revised (v3.1.2): `DepotsFailed=$((DepotsFailed-1))` — more explicit and readable.

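Both leave `DepotsFailed` with the same value; v3.1.2 is preferred for readability. One
difference: `(( DepotsFailed-- ))` returns a non-zero exit status when the pre-decrement value
is 0, which would matter under `set -e`, whereas the `$(( ))` assignment always returns 0.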
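The three forms can be compared side by side in a short sketch. The starting value 3 is arbitrary, and the broken form is guarded with `||` only so that the demonstration keeps running.

```bash
DepotsFailed=3

# Broken form: bash parses this as a command named "DepotsFailed-=1", which is
# not found. The variable is left unchanged.
DepotsFailed-=1 2> /dev/null || echo "not an assignment; DepotsFailed is still $DepotsFailed"

# v3.1.1 form: arithmetic command with post-decrement.
(( DepotsFailed-- ))
echo "after (( DepotsFailed-- )): $DepotsFailed"

# v3.1.2 form: explicit arithmetic expansion and assignment.
DepotsFailed=$((DepotsFailed-1))
echo "after DepotsFailed=\$((DepotsFailed-1)): $DepotsFailed"
```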

### Final shipped version: v3.1.2

shellcheck passes cleanly.  command_summary.txt regenerated and reflects v3.1.2.

#	Change	User	Description
#12	32785	C. Thomas Tyler	Session log: v3.1.1/3.1.2 shellcheck bug fix wrap-up Documents SC2276 DepotsFailed-=1 bug, fix, and preference revision. Final shipped version: v3.1.2. Co-authored-by: Copilot <[email protected]>
#11	32782	C. Thomas Tyler	Session log: changes 64-66, v3.1.0 shipped, open issues Documents the final session: - Change 64: stream limitation + empty-identifier noise in -man - Change 65: Phase 10 p4d bug job101555/P4-19364 reference - Change 66: BACKGROUND section for divestiture process overview - Final v3.1.0 validation table - Open issues table for future versions Co-authored-by: Copilot <[email protected]>
#10	32778	C. Thomas Tyler	v3.1.0: Fix Phase 16b error count; fix setup_lab.sh filter bugs (Bug K) trim_excess_metadata.sh: - Phase 16b: change errmsg->msg for spec depot journal patch fallback, consistent with Phase 17b behavior for other depots. Snap run now correctly exits 4 (shelved CLs only) not 5. test/setup_lab.sh: - Bug K fix: Phase 3 idempotency grep matched empty RevisionDataFilter field always present in p4 server spec templates, causing filter to never be set. New check tests for actual tab-indented path values. - Bug K fix: Filter insertion changed from append (broken — p4 ignored appended duplicate fields) to sed in-place replacement. - Phase 9: Added p4 verify -q //depot/... step (no-op on FFR but documents the intent; backing archives pulled by p4verify.sh via db.storage references). Full end-to-end validation results (all depot types): - remote depot: p4 depot -d (no -f) OK - unload depot: p4 depot -df OK - spec depot: Phase 16b obliterate + journal patch fallback OK - local/stream depots: snap + storage cleanup + journal patch OK - Snap: 2 depots (jam, pb) snapped successfully, 0 failures - Exit codes: dry=0, live=4, snap=4 (all expected) - p4 verify //jam/... //pb/... = 0 MISSING after journal patch Co-authored-by: Copilot <[email protected]>
#9	32772	C. Thomas Tyler	Session close 2026-06-17: update log and session notes Added Phase 16b test results and session-close summary to session log. Co-authored-by: Copilot <[email protected]>
#8	32771	C. Thomas Tyler	v3.1.0: Add Phase 16b for spec depot obliterate/delete The spec depot (singleton type 'spec') was not being cleaned up because Phases 14-16 (typemap, triggers, protections) write new versioned entries to the spec depot after Phase 11 ran. Obliterating in Phase 11 left new content behind, preventing depot deletion. Changes: - Phase 11 case statement restructured: remote -> p4 depot -d (no -f; no archives) spec -> record in SpecDepots[] for deferred Phase 16b unload -> p4 depot -df directly local\|stream -> existing flow - Phase 16b added after Phase 16: obliterate spec content, then cascade through front-door delete -> storage cleanup -> journal patch fallback - declare -a SpecDepots=() and SpecDepotMaps=() added - Man page updated to include Phase 16b in phase list Tested on nyc-03 (Battle School Workshop): - Non-snap run: Phase 16b obliterates new protect.p4s entries; journal patch path triggered for orphaned db.storage entries from filtered ckp - Snap run: Phase 17b storage cleanup + journal patch generated for spec - Patch applied; p4 depots shows only jam+pb; p4 verify //... = 0 MISSING Co-authored-by: Copilot <[email protected]>
#7	32770	C. Thomas Tyler	Fix setup_lab.sh bugs G-J; fix run_trim_test.sh; Phase 17b validated setup_lab.sh bugs fixed: G. Phase 2 idempotency: use 'p4 server --exists -o' (plain -o always returns template) H. Phase 6: copy .md5 alongside .gz; rm -f before scp (read-only file on re-run) I. serverServices (not services) in p4 -ztag info output -- fix in setup_lab.sh + run_trim_test.sh J. Fresh ticket after standalone promotion: raw 'p4 login -a' with P4CONFIG=.p4config.gf Phase 17b test results (confirmed): - p4 storage -d only removes lbrRefCount=0 entries; all 5 depots had non-zero refcounts - Falls back to journal patch correctly for all 5 depots - @dv@ patch format (DOtype=0 placeholder) accepted by p4d -jr -- exit 0 - Post-patch: correct 5 depots remain; p4 verify clean (348+441 files) Co-authored-by: Copilot <[email protected]>
#6	32769	C. Thomas Tyler	Fix 6 bugs in test/setup_lab.sh; update session log Bugs fixed (all in test/setup_lab.sh): A. unset P4CONFIG at top -- SDP shell env was overridden by .p4config.local B. load_checkpoint.sh arg order: file first, then -i N -y C. Phase 6 idempotency: check live p4d connectivity, not server.id file D. Phase 10 same idempotency fix -- check p4d responds as standalone E. Phase 10: p4login before p4 configure unset auth.id (auth required first) F. Phase 7: add p4login -v 1 for perforce user (needed before Phase 9 p4verify) Co-authored-by: Copilot <[email protected]>
#5	32759	C. Thomas Tyler	Update session log: Phase 17b redesign, p4 snap context, db.domain schema table Co-authored-by: Copilot <[email protected]>
#4	32756	C. Thomas Tyler	Add test/ directory: setup and run scripts for trim_excess_metadata testing Captures the full Lab 0 -> trim-ready setup procedure as runnable scripts, making it easy for a future agent or human to reproduce the test environment from scratch. Files added: - test/README.md — full documentation: Battle School dependency, lab topology, what the test simulates (filtered-replication divestiture), lazy-copy leakage explanation, expected outcomes, gotchas - test/setup_lab.sh — orchestration script (runs on bos-01); adds gf site tag, runs mkrep.sh, adds RevisionDataFilter/ArchiveDataFilter, fixes service user password, rotates journal, creates filtered seed checkpoint, initialises nyc-03 as FFR replica, pulls archives via p4verify, promotes to standalone commit server, creates .p4config.gf / keep_users / keep_groups files - test/run_trim_test.sh — runs trim in 3 passes (dry, live, snap); prints journal patch apply commands and cleanup instructions Also updated: - ai/session_log_2026-06-16.md — added p4 snap / deep-rename context section Co-authored-by: Copilot <[email protected]>
#3	32754	C. Thomas Tyler	fix(trim_excess_metadata.sh): ssh -n to prevent stdin consumption in while loop; fix multi-line db.domain record extraction using awk; fix awk double-print via exit+END; regenerate command_summary.txt; update session log Co-authored-by: Copilot <[email protected]>
#2	32753	C. Thomas Tyler	trim_excess_metadata v3.0.0: journal patch uses full-record @dv@, single patch file, no apply script - @dv@ records now include all db.domain fields (fetched via p4d -jd), matching the format required for multi-record journal patch files - Removed .apply.sh companion script generation; single .jnl file applied with: p4d -r <P4ROOT> -jr <patchfile>.jnl - P4ROOT detected from 'p4 info'; server host detected from P4PORT for local-vs-SSH decision when dumping db.domain records - @ex@ and @vv@ records must never appear in patch files (documented in comments) - Updated session log with correct journal format findings Co-authored-by: Copilot <[email protected]>
#1	32752	C. Thomas Tyler	Add Phase 17 (p4 snap), fix Phase 6/9/12 bugs, add session log (v3.0.0) New features: - Phase 17: Snap lazy copies; -snap flag resolves cross-depot archive references, eliminating depot name leakage at the archive layer. - Without -snap, Phase 17 proposes the p4 snap commands for review. New Operator Tips in -man for snap/journal patch ordering. Bug fixes: - p4 fix -d: was passing invalid -j flag; job name is positional. - p4 server -d: was passing invalid -f flag. - Shelved CL failures: now correctly reported via errmsg(). Co-authored-by: Copilot <[email protected]>
