-
Status
- Open
-
Project
- perforce-software-sdp
-
Severity
- A
-
Reported By
- Adam Morriss
-
Reported Date
- 5 years ago
-
Modified By
- C. Thomas Tyler
-
Modified Date
- 5 years ago
-
Owned By
- amo
-
Dev Notes
- This was filed as a bug, but I have re-classified it as a Feature.
See comments in the job for a discussion on this issue. A fair
question is raised about P4D behavior, as well as reasons for the
current P4D behavior. In any case, the issue here is not an SDP
bug.
-
Component
- core-unix
-
Type
- Feature
There is a strong need for this.
I agree; the SDP is expected to provide a backup of a server instance, and without this it is not a complete snapshot.
Just an FYI that there has been a discussion on this on internal Perforce resources. There are multiple valid use cases for partitioned clients.
In one use case, the data could be deemed disposable, such that losing it in a failover would be acceptable. That usage delivers potentially significant performance gains by avoiding duplication of essentially ephemeral (garbage?) data, of which there can be large amounts, but at the cost of a less transparent failover.
In another use case, we could optimize for a more transparent failover, but at the cost of duplicating lots of "mostly useless" data.
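For reference only, a sketch of how such a client is typically designated (the client name below is a placeholder, and the command-line syntax assumes a p4d/p4 release that supports partitioned client types):

    # Create a partitioned client; its have/db records are stored under
    # client.readonly.dir rather than in the journaled server db tables.
    p4 client --type partitioned build-agent-ws

On supporting server versions, the same effect can be achieved by setting the Type field of the client spec to 'partitioned'.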
My current thinking is not to address this in the SDP, because either the partitioned client data must be deemed persistent (and thus p4d would have it in the P4JOURNAL) or it is disposable (as it is currently). However, I have contributed to the discussion in the associated Helix Core Server jobs, suggesting that we provide a configurable to define whether a customer wants to incur the potentially expensive cost of journaling partitioned client data, negating some of the benefit but optimizing for a more seamless failover.
Part of the reason for not addressing this in the SDP is that it would be hard to implement and test a highly reliable way to replicate data outside of what p4d does. I think that would be a high effort, and outside the scope of what the SDP should do. Wanting a different behavior with respect to how partitioned clients are handled is certainly reasonable: Who doesn't want fully transparent failover and comprehensive backups? But treating what p4d currently deems "disposable ephemeral data" as persistent, replicated data isn't something I can envision doing reliably in the SDP, and so I defer this to P4D.
In any case, some documentation improvement is warranted. For example, if we stay with the original concept of partitioned clients being disposable ephemeral data, we need the documentation to clarify what the user is signing up for when they use those features -- i.e. that they're using data that is neither backed up nor journaled. It would survive a reboot but not a failover without external processing.
In other news, there was a separate discussion about having the SDP define a best-practice value for the client.readonly.dir configurable, but that stalled because the best practice isn't obvious. Putting it on /hxmetadata (in P4ROOT) loses the benefit of keeping P4ROOT reserved for only permanent, journaled data, and goes against the goal of not having to store ephemeral data there. Storing it on /hxdepots would cause it to be backed up, again not desired for the original intended use case for partitioned clients. Storing it on /hxlogs, often a smaller volume, could dramatically increase the size needs for that volume. So, without a clear default, I decided to leave setting a value for client.readonly.dir outside the SDP for now.
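If a site decides it does want a value, the setting itself is just a configurable; for example (the path is an illustrative assumption only, not an SDP-recommended location):

    # Hypothetical placement; choose a volume whose backup/capacity tradeoffs suit the site.
    p4 configure set client.readonly.dir=/hxmetadata/p4/1/partitioned_db

Wherever it points, the data stored there remains unjournaled, so the volume choice only changes the backup and capacity tradeoffs described above.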
One consideration for the Helix Core Server is to ensure this 'disposable' client data is available for use in a failover scenario while excluding it from the longer-lived checkpoint. This would minimise the impact of data loss in the immediate/short term, as the failover server could be brought online with as much, if not all, of the data from the server it was 'shadowing', while ensuring this data would continue to be excluded from the checkpoint process, where (at least longer term) the information is not required.
Amongst other things, the SDP's offline database is intended to provide a quick recovery point. However, it is unlikely to benefit from the method mentioned above. If that remains the case, the need to recover this 'disposable' data in the very short term should be incorporated into the SDP's capabilities. If placing the readonly data on different, more resilient storage addresses this, then 'client.readonly.dir' is possibly the way to go.
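To make the idea concrete, an external approach (explicitly outside current SDP scope, per the Dev Notes above) might be a simple periodic mirror of the partitioned client db files to the standby; the paths and host below are placeholders only:

    # Hypothetical cron job on the commit server: copy unjournaled partitioned
    # client data to the standby so it survives a failover. Not an SDP feature.
    rsync -a --delete /hxmetadata/p4/1/partitioned_db/ p4-standby:/hxmetadata/p4/1/partitioned_db/

This would not make the data transactionally consistent with the journal, but it would narrow the window of loss for sites that care about it.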