Matt Attaway commented
9 years ago
Nice change! I'll talk with the developer about bringing this patch into the product if you're ok with that.
lbarbier commented
9 years ago
Thanks, I'm totally ok with it. I'm the one who asked for the feature in case #00206843. configparse.c is certainly not necessary for the CL, as it's generated at compile time.
Matt Attaway commented
9 years ago
@nick_poole will be in touch tomorrow; he's the developer for p4dctl.
Nick Poole commented
9 years ago
Hi @lbarbier, First off, thank you for contributing your solution back to the community. The issue you've encountered is that 'p4dctl checkpoint' calls 'p4d -z -jc', which compresses both the checkpoint and the rotated journal, breaking replication. I believe you'd also encounter the same problem if you ran 'p4dctl journal', which rotates the journal by calling 'p4d -z -jj'. There are two solutions to this problem: make the compression optional, as your patch does, or pass '-Z' instead of '-z', which compresses only the checkpoint and leaves the rotated journal uncompressed.

I think there are probably use cases for both, so rather than providing a single compression on/off option, we should add two: CompressCheckpoint and CompressJournal. In the case that CompressCheckpoint == true and CompressJournal == true, pass '-z'. What do you think? If you like, I can add my suggested changes to the review for you to test out, or if you'd prefer to make the change, I'll be happy to review it.
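To be concrete, here's roughly how I'd expect the two options to map onto flags (a sketch; only the both-true case is settled above, the rest is my assumption):

    # CompressCheckpoint = true,  CompressJournal = true   ->  p4d -z -jc  (compress checkpoint and rotated journal)
    # CompressCheckpoint = true,  CompressJournal = false  ->  p4d -Z -jc  (compress the checkpoint only)
    # CompressCheckpoint = false, CompressJournal = false  ->  p4d -jc     (no compression)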
lbarbier commented
9 years ago
Hi Nick, I wasn't aware of '-Z' for compressing the checkpoint only. Other than that, I totally agree that we need to go further and add journal compression too.
lbarbier commented
9 years ago
Hello, P4D::Checkpoint might be problematic. I get the idea of doing it with 'p4d -jc', but replicas don't like that and enforce the use of 'p4 admin checkpoint', which makes both the journal and checkpoint operations fail.
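For reference, here's what I'm seeing on my side (ports and paths are just examples):

    # Rotating directly with p4d on a replica is refused:
    p4d -r /p4/replica/root -z -jc
    # The replica enforces the supported route instead:
    p4 -p replica:1666 admin checkpoint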
Nick Poole commented
9 years ago
Hi @lbarbier, The issue with compressing only the journal (not the checkpoint) is that there's no p4d flag to do that. We could compress the rotated journal afterwards, or uncompress the checkpoint, but I think it's an unusual enough case that we don't need to worry about it. I'll go ahead and make the changes in the review; I didn't want to go ahead without checking first, in case I caused offence.

As for checkpointing replica servers, that could be tricky, but it should be possible to identify whether a server is acting as a replica and do something appropriate. I'll do some research and let you know what I come up with.
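(For completeness: if someone really did need an uncompressed checkpoint with a compressed journal, the manual workaround would look something like this; the rotated journal's actual name depends on the configured prefix and journal number:)

    p4d -r "$P4ROOT" -jc        # checkpoint and rotate with no compression
    gzip "$P4ROOT/journal.42"   # hypothetical name: compress the rotated journal by hand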
lbarbier commented
9 years ago
(edited)
Hi @nick_poole, I also added journal compression, but not yet any error handling for CompressJournal = true & CompressCheckpoint = false. No clue yet how to handle that case.
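To be explicit, this is the combination I can't handle yet (server name and paths are just examples; the option names are the ones from the review):

    p4d master
    {
        Owner   = perforce
        Execute = /opt/perforce/sbin/p4d

        # Unhandled: no p4d flag compresses only the journal
        CompressCheckpoint = false
        CompressJournal    = true
    }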
Nick Poole commented
9 years ago
Hi @lbarbier, Thanks for your continued work on this. I've been thinking a bit more about both cases, and I think we might want to consider a different direction: I propose that we don't add an option to rotate journals at all; instead, just change the '-z' flag to '-Z'. I still think that the option to enable/disable the flag is a great idea, and this is very similar to your original review.

As for supporting replica Perforce Servers, I think that running 'p4 admin checkpoint' from p4dctl might be problematic: p4dctl needs to be configured to run the command as a privileged user, potentially with authentication. Unlike the P4D::Stop() implementation, there's no good fallback for this at the moment. I think we could adapt both P4D::Checkpoint() and P4D::Journal() to attempt to execute the 'p4 admin' commands, and fall back to calling 'p4d -jc/-jj'. If we added some error handling to detect the replica error message, we could return a more useful message to the user.

To improve the experience even further, I'll log an enhancement request for 'p4d -jc' against a replica to schedule a checkpoint, so that if 'p4dctl checkpoint' falls back to running that command against a replica, it will still succeed.
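In shell terms, the fallback I'm describing would behave roughly like this (a sketch of the logic only, not the actual P4D::Checkpoint() implementation):

    # Prefer the replica-safe command; fall back to invoking p4d directly.
    if ! p4 -p "$P4PORT" -u "$P4USER" admin checkpoint
    then
        p4d -r "$P4ROOT" -Z -jc    # -z/-Z/no flag chosen from the Compress* options
    fi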
lbarbier commented
9 years ago
Hi @nick_poole, I picked up your changes, merged them into my version, and pushed it back into shelve #16575.

I think we need to preserve journal rotation. As a simple example: my edge server generates 100GB of journal each day, while my commit server only generates 5GB a week. I want to rotate my edge journal daily so that the weekly checkpoint doesn't take an hour and the journal doesn't fill up its filesystem either.

I got this working without any modification. As you said, you need a privileged user, but p4dctl already has all the tools it needs: using P4TICKETS and P4USER in the Environment section totally fixes the problem.

So to checkpoint a replica with an automated task, you first have to schedule the checkpoint on the replica and then launch a journal rotation on the master: two steps that need to be done in the right order. With my current knowledge, I think the whole checkpoint/rotation story on replicas could use some improvement. Regards,
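P.S. Concretely, the relevant part of my p4dctl.conf looks something like this (names and paths are examples):

    p4d replica
    {
        Owner   = perforce
        Execute = /opt/perforce/sbin/p4d

        Environment
        {
            P4ROOT    = /opt/perforce/servers/replica
            P4PORT    = 1666
            P4USER    = superuser
            P4TICKETS = /opt/perforce/servers/replica/.p4tickets
        }
    }

And the two-step sequence for an automated replica checkpoint, in order:

    p4 -p replica:1666 admin checkpoint   # step 1: schedule the checkpoint on the replica
    p4 -p master:1666 admin journal       # step 2: rotate on the master; the rotation propagates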
Nick Poole commented
9 years ago
Hi @lbarbier, I see your point: if the server is at the end of a chain of servers, then uncompressed journals are just consuming disk space. I'm glad to hear that 'p4 admin checkpoint' is working in your environment. I'm actually testing out a change that would use that logic for P4D, falling back to calling 'p4d -jc' if the 'p4 admin' command failed.

The behaviour you're seeing, with rotations starting at the master and checkpoints being scheduled, is by design. The journal number is stored in the counters table and replicated as part of the journal, so in order to ensure that journals and checkpoints match up between replicas, a journal rotation on the master causes journal rotations on the replicas. If you rotated the journal on just a replica, it would have a different journal number to the master. If the journal numbers somehow got out of sync, and you had some sort of failure that required you to restore the master, you would have no reference point from which to restore and restart the replica.

I agree that this could be made better, possibly by improving visibility of which replicas were scheduled to checkpoint, etc., but journal rotations are relatively cheap, so even if you rotated before you scheduled the checkpoint, and again after, I believe the overall time the process takes would be about the same.

I'll update the review tomorrow with an option to compress the journal. I'm also going to move it into the main branch: I'll need to backport it if it's to be included in a patch release. I'll also include the 'p4 admin checkpoint/journal' calls that I mentioned (the only real difference from your shelf is that they are on the P4D class: I'm not convinced that we need another server type, and I'd rather not add unnecessary configuration options). Thanks, Nick
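P.S. You can see that shared state directly, since the journal number is just a counter (ports are examples):

    p4 -p master:1666 counter journal     # e.g. 42
    p4 -p replica:1666 counter journal    # the same number once the replica has caught up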