Gitolite and non-bare repos?

Currently, I am tracking my repository with inotifywait. It supports only plain files, so I created a non-bare repo with git.
But I decided to move to Gitolite, and I can't see anything about creating a non-bare repo. Is there an option?

inotifywait waits for changes to files.
If you want to monitor changes at the server level (where gitolite operates, behind ssh), you will need to add a non-update hook, typically a post-receive one, which will checkout the bare repo somewhere monitored by inotifywait.
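A minimal sketch of such a post-receive hook follows. It assumes the branch to publish is master and that /srv/monitored is the directory inotifywait watches; both names are placeholders, not anything Gitolite defines. How the hook file gets installed depends on your Gitolite setup.

```shell
#!/bin/bash
# post-receive sketch: check out one branch of the bare repo into a plain directory.
# WORK_TREE and BRANCH are assumptions for illustration; adjust them to your setup.
WORK_TREE=${WORK_TREE:-/srv/monitored}
BRANCH=${BRANCH:-master}

deploy_ref() {
    # $1 = name of the updated ref, e.g. refs/heads/master
    [ "$1" = "refs/heads/$BRANCH" ] || return 0   # ignore other branches and tags
    mkdir -p "$WORK_TREE" || return 1
    # -f overwrites files in the work tree so they match the branch tip
    git --work-tree="$WORK_TREE" checkout -f "$BRANCH"
}

main() {
    # Git feeds the hook one "<old> <new> <ref>" line per updated ref on stdin.
    local old new ref
    while read -r old new ref; do
        deploy_ref "$ref" || return 1
    done
}

# Run only when this file is installed under the name post-receive.
case "$0" in */post-receive) main ;; esac
```

Because the target is a plain directory rather than a clone, inotifywait sees ordinary file changes and there is no .git folder in the watched path.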

Related

Keeping an auto-updating clone of a bare git repository

Here is what I did
cd /git
git init --bare repo
What I want to do
I want an auto-updating non-bare clone of the repository to be available at a different location, e.g. /srv/web/. What I mean is that every time someone does a git push, the contents of /srv/web/ should automatically update. Similarly, if the git repository is reverted back, then the files in /srv/web should also revert to that.

What I mean is that every time someone does a git push [to the bare repository in /git/repo,] the contents in [non-bare] /srv/web/ should automatically update. Similarly, if the git repository is reverted back, then the files in /srv/web should also revert to that.
You have, in essence, two choices:
Make /git/repo actively update /srv/web. This /git/repo -> /srv/web path is a "push update" (not the same as git push, but might as well be): it has the "mastering" repository update the "slaving" one whenever there is an update available on the master side.
Make /srv/web actively update from /git/repo. This /git/repo <- /srv/web path is a "pull update" (not the same as git pull, unless you implement it that way, but might as well be): it has the slaving repository update from the mastering one at regular intervals.
Your second requirement ("if the git repository is reverted back") is rather mysterious. A bare repository, by definition, has no work-tree; so no one can do any work in it. It can only be updated by bringing in new commits from some other Git repository. If someone wants to do a git revert, they do it in some other repository, and then git push. So all updates to the bare repository should happen via git push and you should not need this second requirement.
Hence, I'll just ignore the second requirement entirely.
When and why to use one or the other
There's no particularly strong reason to favor either approach, but note that each has a different flaw.
If you use push updates, and the receiver is down, the update never happens. The master tries but fails to update the slave. When the slave comes back up, the master just sits around until there's a new update.
(If everything is on a single server, this problem goes away, and this method becomes the clear winner.)
If you use pull updates, there is a time-lag: however long the pull interval is, the slave can remain out of date. Furthermore, if the master goes down just before an update, the slave can remain out of date even longer than that.
Making /srv/web actively update from /git/repo (pull style update)
This is conceptually simpler. You just have your /srv/web poll your /git/repo for any interesting updates. The poll interval determines how long it takes for an update to make it from point A to point B. To make this faster, you could poll infrequently but also have a triggering mechanism that you invoke from, e.g., a post-receive script: "I just got some important update; please poll now." In other words, you use a hybrid of pull-and-push.
You can literally just run git pull from a crontab entry, for instance (though I recommend not using git pull ever, including here: break it up into git fetch followed by another Git command).
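As a rough sketch of the fetch-then-something-else idea, the script below uses git merge --ff-only as the second command. That choice is an assumption, not something stated above; it suits a clone that nobody edits by hand. The function name, the script path, and the five-minute interval are also made up for illustration.

```shell
#!/bin/bash
# Pull-style update sketch: bring a deployed clone up to date with its origin.
# Assumes the clone is never edited by hand, so a fast-forward always applies.

update_clone() {
    # $1 = path to the non-bare clone, $2 = branch to follow
    local clone=$1 branch=$2
    git -C "$clone" fetch --quiet origin || return 1
    # --ff-only refuses to create a merge commit if the histories have diverged
    git -C "$clone" merge --quiet --ff-only "origin/$branch"
}

# Example crontab entry, assuming this file is saved as the hypothetical
# /usr/local/bin/update-clone.sh and ends with:  update_clone /srv/web master
# */5 * * * * /usr/local/bin/update-clone.sh
```

If the branch in /git/repo is ever rewound with a forced push, the fast-forward will fail; git reset --hard origin/master in the clone is the usual alternative in that case, at the cost of discarding anything local.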
Making /git/repo actively update /srv/web (push style update)
[Edit: I got interrupted while writing the original answer, and mixed up the update and post-update hooks; this is now fixed.]
This is relatively straightforward, using a post-receive or post-update hook. There is also an update hook but that's the wrong place to do this. The difference between them all is I think illustrated best with an example: What happens in /git/repo if I, as someone with push access to it, do this from my own Git clone:
git push origin 1234567:refs/heads/zorg 8888888:refs/tags/lucky
Here, I am telling my Git to contact your server Git (my origin = your /git/repo) and deliver my commit 1234567 to your Git. My Git does so, along with any other objects required to make 1234567 useful. I am also telling my Git to deliver commit-or-tag 8888888 to your Git, so my Git does that, along with any other objects required to make 8888888 useful.
Once your Git has all those objects, my Git asks your Git:
Please set your branch zorg (refs/heads/zorg) to 1234567.
Please set your tag lucky (refs/tags/lucky) to 8888888.
At this point, your Git will invoke your pre-receive hook, if you have one. It delivers the old and new hash IDs for refs/heads/zorg and refs/tags/lucky on standard input. Your pre-receive hook's job is to examine these and decide yea-or-nay: "allow all these updates to proceed to the next step" or "forbid any of these updates from occurring at all."
Next, your Git will invoke your update hook twice (again, if you have one). One of these will say "hey, someone is asking to change refs/heads/zorg, here's the old and new hash values, should we let him?" The other will say "hey, someone is asking to change refs/tags/lucky, here's the old and new hash values, should we let him?" Your hook's job is to examine this one update and decide yea-or-nay: allow the update, or reject it. If you allow one and reject the other, the one update occurs and the other fails.
Finally, after all of the updates have been accepted or rejected, for whatever updates actually did occur, your Git invokes your post-receive and post-update hooks (if those exist). Your Git delivers to your post-receive hook, on standard input, one line for each update that did occur, in the same form it used in the pre-receive hook. Your post-receive hook can do whatever it wants with these input lines, but it's too late to stop the updates from happening: they are already done. Your zorg branch now points to commit 1234567 and your lucky tag now points to commit 8888888, assuming your pre-receive and update hooks did not reject these. Your Git delivers to your post-update hook, as arguments, one argument for each updated reference: refs/heads/zorg and refs/tags/lucky.
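To make the input format concrete, here is a small sketch of a post-receive hook that only reports what it was given. The function name describe_updates is invented for this example, and the short hashes stand in for the full hash IDs a real push would deliver.

```shell
#!/bin/bash
# Sketch of how a post-receive hook sees the push described above.
# Each stdin line has the form: <old-hash> <new-hash> <ref-name>

describe_updates() {
    local old new ref
    while read -r old new ref; do
        case "$ref" in
            refs/heads/*) echo "branch ${ref#refs/heads/}: $old -> $new" ;;
            refs/tags/*)  echo "tag ${ref#refs/tags/}: $old -> $new" ;;
            *)            echo "other ref $ref: $old -> $new" ;;
        esac
    done
}

# Run only when this file is installed under the name post-receive.
case "$0" in */post-receive) describe_updates ;; esac
```

A post-update hook would get the same ref names as "$@" instead, without the old and new hashes.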
You may now take any action you like.
The obvious action to take, in the post-receive or post-update hook, is to trigger /srv/web to pick up the new commit(s) on any branch(es) you want it to update. (The update hook is not suitable because, at hook time, the actual change has not yet happened, so if your /srv/web is very fast, it might not be able to get the new objects from your /git/repo yet: they may still be in the process of being cemented into place.)
The actual implementation could be as simple as: "Ditch $GIT_DIR environment variable, cd into slave repository, and run git pull." The reason to unset GIT_DIR is that any Git hook is run with this variable set, and it contains a relative path to the Git repository, which interferes with using other repositories. As before, I recommend avoiding git pull entirely.
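The "ditch GIT_DIR, cd, run a command" step can be wrapped in a small helper, sketched below. The name in_other_repo is made up for this example; the subshell is there so that neither the unset nor the cd affects the rest of the hook.

```shell
#!/bin/bash
# Sketch: run a command inside another repository from within a hook.
# The subshell keeps the unset and the cd from leaking into the rest of the hook.

in_other_repo() {
    # $1 = directory of the other repository; remaining args = command to run
    local dir=$1
    shift
    ( unset GIT_DIR; cd "$dir" && "$@" )
}
```

For example, in_other_repo /srv/web git fetch origin followed by in_other_repo /srv/web git merge --ff-only origin/master would update the clone without git pull; the remote and branch names there are assumptions.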
Also, be aware that the user-ID (i.e., privileges) of the user that is running the post-receive script depends on the authentication method used to do the git push in the first place. This affects all deployment methods, even if the post-receive script simply sends a message (e.g., a packet on a socket port) to some independent process that does the slave-side update, since the privileges available to send a message may depend on user-ID.
Final note: do you really need a Git repository in the deployment area?
If your server is a typical Web server, it doesn't need a Git repository. You can simply update the equivalent of a work-tree. If your web server is on a different system, using a Git repository may be the simplest or most convenient way to achieve this, but if it is all on one machine, you can just run git --work-tree=/path/to/work-tree checkout ... from the bare repository.
(Note that what gets checked out, and how the update happens, depends on what is in the index and HEAD in the actual repository, and how the index compares to the supplied work-tree. Additional arguments to git checkout may change which branch is to be checked-out, which will update HEAD correspondingly.)

Using git is not actually a perfect fit for the scenario you envision, for a couple of reasons.
First, you are completely reversing the normal use of git. A git repository is a logical picture of your project. There may be branches in the project, so this logical picture is much more complex than just the latest version. You need to check out the branch you want into a working copy and work on it. This is what non-bare repositories are about: they are a repository plus a working copy. Pushing the latest version to a working copy is not the intended use of git.
Second, there are technical difficulties with pushing to a non-bare repository. By default, git denies pushes to the checked-out branch of a non-bare repository. There are ways to configure your non-bare repository to allow it, but that configuration is only feasible if you never modify the non-bare working copy. If you begin to modify the working copy there, you will definitely start having problems.
Third, if you are going to serve your working copy on the web, keep in mind that the .git directory will be served too. This might cause vulnerabilities. If you do this, I at least recommend serving a subfolder of your project if possible. This way .git is left out.
However, I recommend another method for all this. Instead of initializing a directory under the web tree as a repository, you can simply auto-copy your whole working copy (without the repository, i.e. the .git folder) to the desired directory. Since you are only interested in serving the files, that would be a more suitable method.
At your repository /git/repo, there is a folder named hooks. Create file /git/repo/hooks/post-receive under this directory with the content
#!/bin/bash
rm -rf /srv/web/*
git archive master | tar -x -C /srv/web
Also you need to give execute permission to this file.
chmod +x /git/repo/hooks/post-receive
Then after each push to this bare repo, HEAD of branch master will be copied to the directory of your choice without any repository information.
Update: I think the initial solution in the answer was not valid. So I removed it, alternative solution is still ok though.
Update 2: As @torek noticed, this solution causes a small window of invalid content in the web directory. Since you indicated you'll serve the web content on a local network, I guess that is not a problem. Moreover, this is basically a poor man's deployment scenario and should not be used in any production deployment. However, it can be improved with a temporary staging directory.
Replace the post-receive hook with the script below. This script reduces the time your /srv/web directory stays empty. Since rm -rf and mv are pretty fast (if your temp directory is on the same disk drive), and since repository size does not affect either command, the invalid content window will be smaller.
#!/bin/bash
# Export master into a fresh temporary directory, then replace /srv/web with it.
STAGING=$(mktemp -d)
git archive master | tar -x -C "$STAGING"
rm -rf /srv/web
mv "$STAGING" /srv/web
Or you can use a swap instead of deleting the folder first, as @torek suggested.
#!/bin/bash
STAGING=$(mktemp -d)
SWAP=$(mktemp -d)
git archive master | tar -x -C "$STAGING"
# Move the old tree out of the way first, put the new one in place, then clean up.
mv /srv/web "$SWAP"
mv "$STAGING" /srv/web
rm -rf "$SWAP"
However note that you are deleting or swapping /srv/web and you'll lose any ownership, permission or ACL information of the folder if you follow this method.
Alternatively, you can use rsync, which still copies the files but works selectively, so the whole content is never deleted at any instant. rsync can also be tuned to preserve ownership, permissions, etc.
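#!/bin/bash
STAGING=$(mktemp -d)
git archive master | tar -x -C "$STAGING"
# mktemp creates a 0700 directory; copy /srv/web's mode so rsync -a keeps it (GNU chmod).
chmod --reference=/srv/web "$STAGING"
# Trailing slashes: sync the contents of $STAGING rather than the directory itself.
rsync -a --delete "$STAGING"/ /srv/web/
rm -rf "$STAGING"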

git workflow with multiple remotes and order of operations

I have a bare git repository that I use to push to and pull from on a linux machine (let's call the bare git repository remote originlinux). From my working repository, which has originlinux as a remote, I push and pull until finally I decide to put it on github. I add the repository for github on their web gui and add the remote repository to my working repository (let's call the remote origingithub) using the git remote add command followed by git pull --rebase, then git push (pull before push since I wasn't allowed to simply push to a newly created github repository without getting one of these: 'hint: Updates were rejected because the tip of your current branch is behind'. I figure this has something to do with their option to create a readme file). And here's the issue: after performing these steps, the originlinux repository is completely out of sync with the origingithub repository even though they have exactly the same commits and were pushed to from the same exact working repository. Could someone please explain in good detail why this is occurring and also what I could do differently to prevent this from happening without reordering how I create my remote repositories? It seems like the workflow or order of operations I'm using doesn't make sense in git land, but how else would you keep multiple remote repositories sync'd on one working copy?
Thanks!

The two repositories do not have the same commits.
When you did git pull --rebase, you rewrote the entire history of the project so that every revision contains that readme file. So every commit in the history will have a different SHA1 identifier.
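One way to confirm this is to compare the commit each remote's branch points at. The sketch below assumes both remotes have been fetched and that the branch is called master on both; tips_match is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: check whether two refs in a repository point at the same commit.

tips_match() {
    # $1 = repository path, $2 and $3 = refs to compare
    # Returns 0 if they match, 1 if they differ, 2 if either ref cannot be resolved.
    local a b
    a=$(git -C "$1" rev-parse --verify --quiet "$2^{commit}") || return 2
    b=$(git -C "$1" rev-parse --verify --quiet "$3^{commit}") || return 2
    [ "$a" = "$b" ]
}

# Usage from inside the working repository, after: git fetch --all
#   tips_match . originlinux/master origingithub/master && echo "in sync"
```

If the two hashes differ even though the commit messages look identical, the histories were rewritten somewhere along the way.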
There are a couple of ways that you may be able to recover from this.
First, you could revert the state of your local repository to match the state of your first (non-github) remote. This would eliminate the readme file that you created on github (you can copy that to some other location and add it back in to git later if desired), along with any changes that you hadn't pushed to the first remote (including changes that haven't been committed).
git reset --hard originlinux/master
git push -f origingithub
The -f option there causes the push to be forced even though that is removing some commits. This is something that should be avoided in general, but it is sometimes necessary such as in this case.
The other option would be to just do a force push to your first remote, accepting the new history caused by the rebase.
git push -f originlinux
If the three repositories that you mentioned are the only ones, it shouldn't matter much which of these methods you use. If there are other repositories, you may want to try to determine which version of the history is more widely known, and keep that version.

Including my own scripts in the admin-repo with gitolite

I have some scripts that I call from a common hook of gitolite. I want to manage them from the configuration directories of the admin-repo repository so that I can modify them more easily and they will also be versioned.
I have tried by adding a new directory and by tracking it with git add, but it does not work as expected. Maybe gitolite has some way to do this but I have not found any information on how to do such a thing.

Note: the following is for Gitolite V3 or g3 only.
You can add and manage them in their own directory, namely the "hooks/common" sub-directory of the gitolite-admin repo (create it if it doesn't exist).
That directory will appear on your Gitolite server in ~/.gitolite/hooks/common, and if you define a LOCAL_CODE rc variable pointing to it, it will be taken into account.
You might need to follow the push of the gitolite-admin repo by a gitolite setup --hooks-only on the server though.

git post-receive hook which connects to remote via ssh and git pulls

Just trying to write my first git hook. I have a remote origin, and when I push to it I would like a post-receive hook to fire which will ssh into my live server and git pull. Can this be done, and is it a good approach?
OK, I've got the hook firing and the live server is doing the git pull, but it's saying it's already up to date? Any ideas?

Yes, that can be done and yes, in my opinion, that is good practice. Here is our use-case:
I work in a small development group that maintains a few sites for different groups of people.
Each site has several environments (beta, which is a sandbox for all developers; staging, which is where we showcase changes to the content owners before going live; training, which is what our training dude uses to train new content managers; and live, where everyone goes to consume content).
We control deployment to all these environments via post-receive hooks based on branch names. We may have a 'hot fix' branch that doesn't get deployed anywhere, but when we merge it with, say, the 'beta' branch, it then gets auto-deployed to the beta server so we can test how our code interacts with the code of the other developers.
There are many ways you can do this. What we do is set up ssh keys so that the git server can ssh into the web server and do a git pull. This means you have to add the public key for git@gitserver to your git@webserver authorized_keys file and vice versa. Then, in the post-receive hook, you parse out the branch and write a 'case' statement for it, like so:
read line
echo "$line" | . /usr/share/doc/git-core/contrib/hooks/post-receive-email
BRANCH=`echo $line | sed 's/.*\///g'`
case $BRANCH in
"beta" )
ssh git@beta "cd /var/www/your-web-folder-here; git pull"
;;
esac
Hope that helps.

That can certainly be done, but it isn't the best approach. When you use git pull, git will fetch and then merge into your working copy. If that merge can be guaranteed to always be a fast-forward, that may be fine, but otherwise you might end up with conflicts to resolve in the deployed code on your live server, which most likely will break it. Another problem with deploying with pull is that you can't simply move back to an earlier commit in the history, since the pull will just tell you that your branch is already up-to-date.
Furthermore, if you're pulling into a non-bare repository on your live server, you'll need to take steps to prevent data in your .git directory from being publicly accessible.
A similar but safer approach is to deploy via a hook in a bare repository on the live server which will use git checkout -f but with the working directory set to the directory that your code should be deployed to. You can find a step-by-step guide to setting up such a system here.

Syncing website files between local and live servers using GIT?

Say I have two web servers, one local development and one live.
Under SVN I would checkout the website files to my local webserver's public_html directory and also to the live webserver's public_html directory. I would then work on the files directly on the local server and commit any changes to the central repository.
When I'm ready for those changes to go live on the live server, I would SSH in and perform an SVN update.
Essentially I have two working copies, one on live and one locally, though other users may also have working copies on their local machines. But there will only ever be one working copy on the live server. The reason for this is so that we can just perform SVN update on live server every time we want changes to be published.
How can a similar workflow be accomplished using Git?

To model your current work flow almost exactly do:
Set up a git repo.
Clone the repo on the server and locally.
Work locally
git push to the git repo
ssh to server
git pull.
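The steps above can be tried out on a single machine before involving real hosts. In the sketch below, temporary local directories stand in for the central repo, the local machine, and the live server; every path, file name, and identity is made up. The "server" clone is created after the first push so that it has a branch to track.

```shell
#!/bin/bash
# Sketch of the workflow above, with local directories standing in for the hosts.
base=$(mktemp -d)
commit() { git -c user.name=dev -c user.email=dev@example.com commit --quiet "$@"; }

git init --quiet --bare "$base/central.git"                      # set up a git repo
git clone --quiet "$base/central.git" "$base/local" 2>/dev/null  # clone it locally

cd "$base/local" || exit 1
echo '<h1>hello</h1>' > index.html                               # work locally
git add index.html
commit -m 'Add index page'
git push --quiet origin HEAD                                     # push to the repo

git clone --quiet "$base/central.git" "$base/live"               # clone on the "server"

echo '<h1>hello, world</h1>' > index.html                        # more local work
commit -am 'Update index page'
git push --quiet origin HEAD

cd "$base/live" || exit 1                                        # "ssh to server"
git pull --quiet                                                 # pick up the change
cat index.html
```

With real hosts, the clone URLs become ssh URLs and the last two steps run in an ssh session on the live server.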
Another way to do it would be to set up a "production" branch in git, have a cron job that continually pulls this branch on the server, and then just merge and push to the "production" branch any time you want to publish your changes. Sounds like you need a more concrete branching strategy.
See: Git flow branching model && git flow cli tool
Good luck! This is a very solvable problem with git.

You might find this useful: http://joemaller.com/990/a-web-focused-git-workflow/

In your local working copy:
git push ssh://you@yourserver/path/to/your/wc
will push the committed changes in your local version to yourserver.

Having a setup that triggers automatically pulling like leonbloy and codemac suggested may seem like a good idea at first but it tends to be very fragile. I suggest a different alternative.
http://toroid.org/ams/git-website-howto
