Putting changelists on another Perforce machine

I did a migration of the Perforce server. Because there are many depots and a lot of data, and because our department does not want the server to be down for a whole day, I did the following:
Step 1: Stop the 'old' Perforce server on machine A
Step 2: Copy all data from machine A to machine B
Step 3: Start the 'old' server again
Step 4: Do the complete migration and start the new server on machine B
Now the problem is that, in the meantime, several changelists have been submitted to machine A.
Is there an easy way to copy those changes to machine B? Note that I have a checkpoint, so the metadata is not the problem.
I know that in version control systems like Git this is easy with patches. Is something similar possible in Perforce?

You may want to look at the p4 pull and p4 replicate commands provided by Perforce.
p4 pull and p4 replicate are the two commands intended for exactly this kind of purpose. I am assuming your server is p4d 2010.2 or later, since these commands require it.
I would also recommend reading the Perforce knowledge base article on offline checkpointing. It is not a direct solution to your problem, but it gives you insight into how you might tackle your situation.
Hope this helps.
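For reference, a minimal sketch of what a read-only replica on a 2010.2+ server could look like; the server names, ports, paths and checkpoint number below are placeholders, not details from the question:

# On the master (machine A): name the replica and give it pull threads
p4 configure set replica1#P4TARGET=machineA:1666
p4 configure set replica1#startup.1="pull -i 1"
p4 configure set replica1#startup.2="pull -u -i 1"

# On machine B: seed the replica from a checkpoint of the master, then start it under that name
p4d -r /p4/replica -jr checkpoint.NNN
p4d -r /p4/replica -In replica1 -p 1667 -d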

Did you truncate your journal when you did the first migration? If so, you can probably replay the latest production journal and then rsync/robocopy the archive content (files in the depots). That'll let you catch up.
I assume this is a one-off activity?
You should probably check in with Perforce tech support just to make sure you're not overlooking anything.
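For reference, the catch-up could look roughly like this; the paths, hostnames and journal number are placeholders, and the exact steps depend on how the first copy was made:

# On machine A: rotate the journal so there is a bounded file to replay
p4d -r /p4/root -jj

# Copy the rotated journal and the depot archive files to machine B
rsync -av /p4/root/journal.123 machineB:/p4/root/
rsync -av /p4/root/depots/ machineB:/p4/root/depots/

# On machine B: replay the journal into the restored database
p4d -r /p4/root -jr /p4/root/journal.123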

Related

Update the have list of a client to reflect what's on the HD

I have this situation: I have many files on my HD that come from an old perforce server. The new server has the same files, but the "have list" is not up to date, and according to it I have no files.
The server is not 100% like my HD, so if I run "p4 flush" I'll break it, because many files that are not on my HD but are on the server will be marked as "have" and will not be synced later.
Is there a way to sync the have list exactly to what you have in your client?
Thanks very much for your help!
Alex
Start with a p4 flush to your best guess, and then do a p4 clean to force-sync everything that doesn't match.
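In command form, that sequence might look like the following; the depot path is a placeholder, and p4 clean needs a 2014.1 or newer client:

# Mark everything as "have" at your best-guess revision without transferring any files
p4 flush //depot/project/...

# Then force the workspace back in line with the depot:
# -e re-syncs modified files, -a deletes local files not in the depot, -d restores missing files
p4 clean -e -a -d //depot/project/...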

P4V - Duplicate workspace pointing to existing data

I was wondering if anyone had any advice on how to do the following task in P4V (I am not too familiar with P4V commands, so apologies if this is some basic command that I am missing).
Currently I have a workspace set up and data synced to my root,
e.g. C:\Data\
I access this workspace from two different Windows machines (the data is on both machines at C:\Data).
Now I need to move the location where the data is stored on ONE of the machines and not the other (machine A: C:\Data, machine B: D:\Data).
Is this possible to do without having to sync all the data again from the server (there is a lot of it, and we have bandwidth limitations)?
My initial thought was to create another workspace pointing to another root, but I do not know how to get this new workspace to pick up the data files at that location.
Any help would be greatly appreciated
Thanks in advance
I don't know of a way to do this through P4V, but it can be done with the command line client. Here's the procedure.
After you have moved your files on machine B, and created a new workspace (without performing an "update all"), you can pass the -k switch to the sync command to let the server know what files you have.
From the web page to which I linked:
Keep existing workspace files; update the have list without updating
the client workspace. Use p4 sync -k only when you need to update the
have list to match the actual state of the client workspace.
And the command line help has this to say:
The -k flag updates server metadata without syncing files. It is
intended to enable you to ensure that the server correctly reflects
the state of files in the workspace while avoiding a large data
transfer. Caution: an erroneous update can cause the server to
incorrectly reflect the state of the workspace.
FYI: p4 flush is an alias for p4 sync -k
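Putting it together on machine B, the procedure might look like this; the workspace name and changelist number are invented for the example:

# Move the files from C:\Data to D:\Data, create a new workspace whose Root is D:\Data,
# then tell the server you already have the files instead of syncing:
p4 -c workspace-machineB sync -k //depot/...

# If the original workspace was synced to a known changelist, flush to that changelist rather than head:
p4 -c workspace-machineB sync -k //depot/...@12345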
You can also look at the AltRoots field in the workspace. You could have one root at c:\Data and the other at d:\Data. As raven mentioned, since the data lives on two separate disks you'll need to make sure it is kept in sync on both machines, although I assume you've already figured that part out since you've been running on two machines.
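If you go the AltRoots route, the client spec would look something like this (the client name is just an example); Perforce picks Root or one of the AltRoots depending on which one matches the directory you run commands from:

Client: shared-data
Root: c:\Data
AltRoots:
	d:\Data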
Any reason you can't just have one workspace per machine?

How to move a perforce depot between two different servers such that revision history is copied but user info and workspaces are not?

I need to copy a depot from one Perforce server to another. The file revision history needs to be intact, but the user information and workspace information must not be copied to the new server.
I've tried a standard checkpoint creation and restore procedure, but if there are users or workspaces with the same name on both servers, the source server will overwrite that information on the destination server. This is pretty bad if those user accounts and workspaces do not have exactly identical details.
The goal of this sort of operation is to allow two separate, disconnected groups to view a versioned source tree with revision history. Updates would be single directional with one group developing and one just viewing. Each group's network is completely enclosed, no outside connections of any kind.
Any ideas would be appreciated; I've been busting my brains on this one for a while.
EDIT:
Ultimately my solution was to install an intermediate Perforce server on the same machine as my source server. Using that, I could do a standard backup/restore from the source server to the intermediate server, delete all unwanted metadata on the intermediate server, and then back up from the intermediate server and restore to the final destination server. Pretty complicated, but it got the job done, and it can all be done programmatically in Windows PowerShell.
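For anyone repeating this, the metadata-scrubbing step on the intermediate server can be driven with commands along these lines; the port, path and names are placeholders, and the same commands can be called from PowerShell:

# On the intermediate server, after restoring the checkpoint from the source:
p4 -p localhost:1667 clients
p4 -p localhost:1667 client -d -f some_workspace
p4 -p localhost:1667 users
p4 -p localhost:1667 user -d -f some_user

# Then take a fresh checkpoint of the intermediate server and restore that on the destination:
p4d -r /p4/intermediate -jc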
There are a few ways, but I think you are going about this one the hard way.
Continue to do what you are doing, but delete db.user, db.view (I think), and db.group. Then, when you start the Perforce server, it will recreate these files, but they will be empty, which will make it hard for anyone to log in, so you'll have to recreate the users and groups. I'm not sure whether you can take those db files from another server and copy them in; I've never tried that.
The MUCH easier way: make a replica. http://www.perforce.com/perforce/r10.2/manuals/p4sag/10_replication.html Make sure you look at the p4d -M flag so that it's a read-only replica. I assume you have a USB drive or something to move between networks, so you can issue a p4 pull onto the USB drive, move the drive, and then either run the server off the USB drive or issue another p4 pull to a final server. I've never tried this, but with some work it should be possible; you'll have to run a server off the USB drive to issue the final p4 pull.
You could take a look at Perforce Git Fusion and make some Git clones.
You could also look at remote depots. Basically, you create a new depot on your destination server and point it at a depot on your source server. This works if you have a fast connection between the two servers. Protections are handled by the destination server, which decides who has access to the new depot. The source server can be set up to share it out as read-only to the destination server. Here is some info:
http://answers.perforce.com/articles/KB_Article/Creating-A-Remote-Depot
Just make sure you test it during a slow period, as it can slow down the destination server. I tried it from two remote locations, both on the US East Coast, and it was acceptable but not too useful. If both servers are in the same building it would be fine.
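For reference, a remote depot is just a depot spec of type 'remote' created on the destination server, along these lines (the depot name and address are examples):

p4 depot remote-src

# In the spec form that opens:
Depot: remote-src
Type: remote
Address: source-server:1666
Map: //depot/...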

Faster clean Perforce sync over VPN

I have to regularly do a clean Perforce sync to new hardware/virtual machines over the VPN. This can take hours as the project is quite large. Is there a way that I can simply copy an up-to-date tree from an existing client and tell Perforce to use this tree?
The Perforce Proxy is the right way to go, but if you really want to, there is a way to do what you asked via the sync command, with the -k switch:
The -k flag bypasses the client file update. It can be used to make the server believe that a client workspace already has the file. Typically this flag is used to correct the Perforce server when it is wrong about what files are on the client; use of this option can confuse the server if you are wrong about the client's contents.
p4 sync -k //depot/someProject/...
You can also use flush, which is a synonym for sync -k:
p4 flush //depot/someProject/...
Just be careful. Remember those last words, "...use of this option can confuse the server if you are wrong about the client's contents."
Perforce Proxy is almost definitely the way to go, assuming you can dedicate a local machine for this purpose.
A useful tip for a proxy is to get it to refresh its contents overnight, simply by creating a dummy client (perhaps on the proxy machine) and kicking off a nightly task to do a sync; a normal sync will do, it does not need to be a clean one. This ensures that any big changes people have checked in won't cause a massive lag the first time you need to do a local sync.
Note that you need a live VPN connection between the proxy and the server - the proxy still has to talk to the server to determine if it has the right versions cached. So the proxy needs a reasonably low latency link to the server, but at least you don't have to wait for actual file transfer.
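A minimal sketch of that setup; the hostnames, ports, paths and client name are placeholders:

# Start the proxy near the clients; it listens locally and forwards to the central server over the VPN
p4p -d -p 1999 -t central-server:1666 -r /var/cache/p4p

# Nightly cache-warming job (crontab entry) syncing a dummy client through the proxy
0 2 * * * p4 -p proxy-host:1999 -c proxy-warm-client sync >> /var/log/p4p-warm.log 2>&1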
Another alternative you may want to try is the compress option in your client specs (workspaces). This tells the server to compress each file before it is sent, and your p4 client will decompress it automatically. The trade-off here is CPU time on both the server and the client. However, given that you want to sync several local clients, I think the proxy will ultimately be the better solution.
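To turn compression on, edit the Options line of the client spec, for example (only the 'compress' entry matters here; the rest are the defaults):

p4 client my-client

Options: noallwrite noclobber compress unlocked nomodtime normdir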
No, but you shouldn't need to: Why do you need to do a clean perforce sync? What's wrong with a normal sync? If you need to clean the tree, then why not work on a copy of the tree?
One alternative might be to run a p4proxy on your end of the VPN connection, then unchanged files won't have to be transferred over the VPN.
If you only require an export, that is, you don't need to keep it up to date or submit changes from it, then you could simply copy an existing checkout and never use Perforce against that tree. But I don't know of any way to convince the Perforce server that you have a checkout without p4 actually checking out the files.

What's the best way to keep multiple Linux servers synced?

I have several different locations in a fairly wide area, each with a Linux server storing company data. This data changes every day in different ways at each different location. I need a way to keep this data up-to-date and synced between all these locations.
For example:
In one location someone places a set of images on their local server. In another location, someone else places a group of documents on their local server. A third location adds a handful of both images and documents to their server. In two other locations, no changes are made to their local servers at all. By the next morning, I need the servers at all five locations to have all those images and documents.
My first instinct is to use rsync and a cron job to do the syncing over night (1 a.m. to 6 a.m. or so), when none of the bandwidth at our locations is being used. It seems to me that it would work best to have one server be the "central" server, pulling in all the files from the other servers first. Then it would push those changes back out to each remote server? Or is there another, better way to perform this function?
The way I do it (on Debian/Ubuntu boxes):
Use dpkg --get-selections to get the list of installed packages
Use dpkg --set-selections to install those packages from the list you created (both steps are sketched below)
Use a source control solution to manage the configuration files. I use git in a centralized fashion, but Subversion could be used just as easily.
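A minimal sketch of the dpkg selections round trip; the file name is arbitrary:

# On the reference machine: record the installed package set
dpkg --get-selections > selections.txt

# On the target machine: feed the list back in and install everything on it
dpkg --set-selections < selections.txt
apt-get dselect-upgrade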
An alternative, if rsync isn't the best solution for you, is Unison. Unison works under Windows, and it has some features for handling changes on both sides (so you don't necessarily need to pick one server as the primary, as you've suggested).
Depending on how complex the task is, either may work.
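A minimal two-way Unison run between two of the sites might look like this; the host and paths are made up, and Unison has to be installed on both ends:

unison /srv/companydata ssh://site2//srv/companydata -batch -auto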
One thing you could (theoretically) do is write a script using Python or something similar and the inotify kernel feature (through the pyinotify package, for example).
You run the script, which registers to receive events on certain trees. The script watches those directories and updates all the other servers as things change on each one.
For example, if someone uploads spreadsheet.doc to the server, the script sees it instantly; if the document doesn't get modified or deleted within, say, 5 minutes, the script copies it to the other servers (e.g. through rsync).
A system like this could theoretically implement a sort of limited 'filesystem replication' from one machine to another. It's kind of a neat idea, but you'd probably have to code it yourself.
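The same idea sketched from the shell with inotify-tools (inotifywait) instead of pyinotify; the paths and hostnames are invented, and the 5-minute settling delay is left out:

# Watch the tree and push each changed file to the other sites as soon as it is written
inotifywait -m -r -e close_write --format '%w%f' /srv/companydata |
while read -r changed; do
    for host in site2 site3 site4 site5; do
        rsync -azR "$changed" "$host":/
    done
done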
AFAIK, rsync is your best choice; it supports partial file updates among a variety of other features. Once set up it is very reliable. You can even set up the cron job with timestamped log files to track what is updated in each run.
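A sketch of the pull-then-push plan from the question, with a timestamped log, run from cron on the central server; the hostnames and paths are placeholders, and note that rsync without --delete never removes files and the last writer wins on name clashes:

#!/bin/sh
# /usr/local/bin/sync-sites.sh -- crontab entry: 0 1 * * * /usr/local/bin/sync-sites.sh
LOG=/var/log/sync-$(date +%Y%m%d).log
SITES="site1 site2 site3 site4"

# Pull everything into the central copy first...
for host in $SITES; do
    rsync -az --partial "$host":/srv/companydata/ /srv/companydata/ >>"$LOG" 2>&1
done

# ...then push the merged tree back out to every site
for host in $SITES; do
    rsync -az --partial /srv/companydata/ "$host":/srv/companydata/ >>"$LOG" 2>&1
done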
I don't know how practical this is, but a source control system might work here. At some point during the day (perhaps each hour?), a cron job runs a commit, and overnight each machine runs a checkout. You could run into issues with a long commit not being finished when a checkout needs to run, and essentially the same thing could be done with rsync.
I guess what I'm thinking is that a central server would make your sync operation easier: conflicts can be handled once on the central server, then pushed out to the other machines.
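As a very rough sketch of that commit/checkout rhythm with Git (the repository location, remote and branch are all assumptions, and a conflicting merge will stop the push until it is resolved):

# crontab on each site's server
0 * * * * cd /srv/companydata && git add -A && git commit -qm "hourly snapshot"; git pull -q --no-edit origin master && git push -q origin master
30 1 * * * cd /srv/companydata && git pull -q --no-edit origin master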
rsync would be your best choice, but you need to carefully consider how you are going to resolve conflicts between updates to the same data at different sites. If site 1 has updated 'customers.doc' and site 2 has a different update to the same file, how are you going to resolve it?
I have to agree with Matt McMinn; especially since it's company data, I'd use source control and, depending on the rate of change, run it more often.
I think the central clearinghouse is a good idea.
It depends on the following:
How many servers/computers need to be synced?
If there are too many servers, rsync becomes a problem. Either you use threads and sync to multiple servers at the same time, or you sync them one after the other; the former puts a high load on the source machine, and the latter means inconsistent data across the servers at any given point in time.
How large are the folders that need to be synced, and how often do they change?
If the data is huge, rsync will take time.
How many files?
If the number of files is large, and especially if they are small files, rsync will again take a lot of time.
So it all depends on the scenario whether to use rsync, NFS, or version control.
If there are only a few servers and a small amount of data, it makes sense to run rsync every hour. You can also package the content into RPMs if the data changes only occasionally.
With the information provided, IMO version control will suit you best. rsync/scp might give problems if two people upload different files with the same name, and NFS over multiple locations needs to be architected carefully.
Why not have one or more repositories, and have everyone just commit to those repositories? All you need to do is keep the repositories in sync. If the data is huge and updates are frequent, then your repository server will need a good amount of RAM and a good I/O subsystem.
