Clearcase multisite vs Gitlab.... anything? - gitlab

The company I work for uses Clearcase, even though it was EOL'd on the platforms in which we run it years ago. It is ancient and fragile tech, but one thing it does have is a multisite support that allows for the synchronization of air-gapped repos. Because of security issues, we use secure USB sticks to copy packets and take them to the other side, then apply them with scripts.
Developers and DevOps people want to make a business case to migrate to GitLab, but I cannot find any mention of a feature in GitLab that would allow me to do easily do this. There's something about bundles, but the info I have found is years old and it doesn't seem like too many people are using it.
Does GitLab not support this? Simple synchronization of one repo to another over an air gap using some sort of secure media? If so, it's no wonder so many teams are still using ClearCase.

While not exactly easy, air-gap updates of Git repository is possible through the git bundle command.
It produces one file (with all the history, or only the latest commits for an incremental update), that you can:
copy and distribute easily (it is just one file after all)
clone or pull from(!)
This is not tied to GitLab, and can be applied to any Git repository.
From there, I have written before on migration from ClearCase to Git, and I usually:
do not import the full history, only major labels or UCM baselines
split VObs per project, each project being one Git repository
revisit what was versioned in Vobs: some large files/binaries might need to be .gitignore'd in the new Git repository.
You would not "migrate views": they are just workspace (be it static or dynamic). A simple clone of a repository is enough to recreate such a workspace (static here).

Related

Fulltext search over all git repositories (gitlab)

Is there a way to search over all git repositories with gitlab?
At the moment we are using the open source version of gitlab.
Since it is open source, there should be a way to enable it without switching to the commercial version....
A search over "all Git repositories" must first determine its scope: all branches or only the master branch.
Because a search across all branches is not easy to implement, and is discussed in this 2 years old issue 1086.
Searching across all repositories was also recently requested (issue 9794), but also not implemented yet.
A GraphQL query might come close of what you are looking for, but its API is still alpha, and might be limited to one project or group.

Gitlab repository vs project vs submodule

I started exploring Gitlab for version control management and I got an issue at the first step itsself. When ever I create a project its creating a new repository. I have few webapplications which are independent to each other. In that case do I need to use different repository for every project.
What I am looking for is what is what and when to use what but not able to find what is repository and what is project in gitlab website as well as through other sources as well.
Also I came across a term submodule, when can it be used. Can I create one global project and have all the webapplications as different submodules.
Can any one please help me in understanding the difference between those 3 and when to use what based on their intended way of usage. Also please help me by pointing to a good learning site where I can get the information of doing basic version control operations in gitlab.
Thanks.
Gitlab manages projects: a project has many features in addition of the Git repo it includes:
issues: powerful, but lightweight issue tracking system.
merge requests: you can review and discuss code before it is merged in the branch of your code.
wiki: separate system for documentation, built right into GitLab
snippets: Snippets are little bits of code or text.
So fear each repo you create, you get additional features in its associated project.
And you can manage users associated to that project.
See GitLab documentation for more.
The Git repo and Git submodule are pure Git notions.
In your case, a submodule might not be needed, unless you want a convenient way to remember the exact versions of different webapp repo, recorded in one parent repo.
But if that is the case, then yes, you can create one global project and have all the webapplications as different submodules.
Each of those submodules would have their own GitLab project (and Git repo).

Drupal: do you use SVN for websites development?

Do you use Subversion while developing a website with drupal ?
I'm not talking about modules development, but websites development (i.e. adding hook functions, modifying template file.. etc)
thanks
Yes.
Anything that's got any kind of ongoing development or is going to change over time should be version controlled.
Even if you're just doing a very small project, the value of having a version history is indesputable, and being able to make changes without worrying about overwriting someone else's updates is priceless.
Yes, its's good keep a SVN repository synced with your local instance.For that purpose you can use Eclipse.
Yes, but we are moving to git in the near future because it offers a better feature set (distributed SCM ftw) and more options for managing our code base (git submodules, stashing, better hook integration, better merging support, rebasing, and so much more). For the time being we've got our repos setup like so:
/trunk
/branches/6.x/1.x/core
/branches/6.x/1.x/sitename.domain.edu
/branches/6.x/1.x/sitename2.domain.edu
/branches/6.x/1.1.x/core
/branches/6.x/1.1.x/sitename.domain.edu
...
/tags/6.x/1.x/core
/tags/6.x/1.x/sitename.domain.edu
/tags/6.x/1.x/sitename2.domain.edu
/tags/6.x/1.1.x/core
/tags/6.x/1.1.x/sitename.domain.edu
...
Each branch is a svn copy of the trunk repo (where we do most of our development) and each tag is a svn copy of it's corresponding branch. The core branch is the primary distro that we distribute to all of our sites that share the university's look and feel, and each subsite is a site with special modules, custom theme, or any other functionality that isn't part of the primary distro. It makes moving between drupal releases a lot easier, but you can start to run into problems merging occasionally. Also you run into performance issues when the repo starts to grow, which is part of the reasoning behind moving to git.
Yes. Version control is critical. Distributed version control systems such as Git, Mercurial, and Bazaar are particularly nice, and let you start committing immediately, without the need to push those changes to a central server.
My Drupal workflow: use Mercurial and its sub-repositories feature to create independent repositories for 1) Drupal + contributed modules, 2) theme, and 3) custom modules. That way, I can clone from a single URL, get my entire project, and be able to track changes to each distinct piece independently.

Using hg repository as web site

This is somewhat related to my security question here. Is it a bad idea to use an hg / mercurial repository for a live website? If so, why?
Furthermore, we have dev, test and production installations of our website, like dev.example.com, test.example.com and www.example.com. If it's a bad idea to use a repository for a live/production website, would it be OK to use an hg repository for the dev and test sites?
I'm also concerned about ease of deployment. We have technical and less technical co-workers who will be working with the site. The technical people (software engineers) won't have any problem working with the command line or TortoiseHG. I'm more concerned about the less technical people (web designers). They won't be comfortable working on the command line, and may even find TortoiseHG daunting. These co-workers mostly upload .css files and images to the server. I'd like for these files (at least the .css files) to be under version control, but I want this to be as transparent as possible for the non technical team members.
What's the best way to achieve this?
Edit:
Our 'site' is actually a multi-site CMS setup with a main repository and several subrepositories. Mock-up of the repository structure:
/root [main repository containing core files and subrepositories]
/modules [modules subrepository]
/sites/global [subrepository for global .css and .php files]
/sites/site1 [site1 subrepository]
...
/sites/siteN [siteN subrepository]
Software engineers would work in the root, modules and sites/global repositories. Less technical people (web designers) would work only in the site1 ... siteN subrepositories.
Yes, it is a bad idea.
Do not have your repository as your website. It means that things checked in, but unworking, will immediately be available. And it means that accidental checkins (it happens) will be reflected live as well (i.e. documents that don't belong there, etc).
I actually address this "concept" however (source control as deployment) with a tool I've written (a few other companies are addressing this topic now, as well, so you'll see it more). Mine is for SVN (at the moment) so it's not particularly relevant; I mention it only to show that I've considered this previously (not on a Repository though; a working copy, in that scenario the answer is the same: better to have a non-versioned "free" are as the website directory, and automate (via user action) the copying of the 'versioned' data to that directory).
Many folks keep their sites in repositories, and so long as you don't have people live-editing the live-site you're fine. Have a staging/dev area where your non-revision control folks make their changes and then have someone more RCS-friendly do the commit-pull-merge-push cycle periodically.
So long as it's the conscious action of a judging human doing the staging-area -> production-repo push you're fine. You can even put a hook into the production clone that automatically does a 'hg update' of the working directory within that production clone, so that 'push' is all it takes to deploy.
That said, I think you're underestimating either your web team or tortoiseHg; they can get this.
me personally (i'm a team of 1) and i quite like the idea of using src control as a live website. more so with hg, then with svn.
the way i see it, you can load an entire site, (add/remove files) with a single cmd
much easier then ftp/ssh this, delete that etc
if you are using apache (and probably iis as well) you can make a simple .htaccess file that will block all .hg files (or .svn if you are using svn)
my preferred structure is
development site is on local machine running directly out of a repository (no security is really required here, do what you like commit as required)
staging/test machine is a separate box or vm running a recent copy of the live database
(i have a script to push committed changes to staging server and run tests)
live machine
(open ssh connection, push changes to live server, test again, can all be scripted reasonably easily, google for examples)
because of push/pull nature of hg, it means you can commit changes and test without the danger of pushing a broken build to the live website. like you say in your comments, only specific people should have permission to push a version to the live site. (if it fails, you should easily be able to revert to the previous version, via src control)
Why not have a repo also be an active web server (for dev or test/QA environment anyway)?
Here's what I am trying to implement:
Developers have local test environments in which they can build and test their code
Developers make a clone of the dev environment on their local dev machine
Developers commit as often as they want to their local repo
When chunk of work is done and tested, then developer pushes working change sets to dev repo
Changes would be merged and tested on Dev, then pushed to Test/QA, and so on.
BTW, we're using Mercurial. I believe this model would only work using a distributed source code management tool.

How to set up a git repository where different users can only see certain parts?

How do you set up a git repository where some users can see certain parts of the source code and other users can see all of it? I've seen lots of guides for only giving certain users commit access, but these assume everyone should have read access. I've also heard of gitosis, but I'm not sure it supports this and it hasn't had any commits in over a year so I think it's dead.
In short: you can't. Git is snapshot based (at conceptual level at least) version control system, not changeset based one. It treats project (repository) as a whole. The history is a history of a project, not a union of single-file histories (it is more than joining of per-file histories).
Using hooks like update-paranoid hook in contrib, or VREFs mechanism of gitolite, you can allow or forbid access to repository, you can allow or forbid access to individual branches. You can even forbid any commits that change things in specified subdirectory. But the project is always treated as a whole.
Well, there is one thing you can do: make a directory you want to restrict access to into submodule, and restrict access to this submodule repository.
The native git protocol doesn't support this; git assumes in many places that everybody has a complete copy of all of the history.
That said, one option may be to use git-subtree to split off part of the repository into its own subset repository, and periodically merge back.
Git doesn't support access control on the repository. You can however, implement access control on the repository yourself, by using hooks, more specifically the update hook.
Jörg has already pointed out that you can use hooks to do this. Exactly which hook(s) you need depends on your setup. If you want the permissions on a repo that gets pushed to, you'll need the update hook like he said. However, if it's on a repo that you're actually working in (committing and merging), you'll also need the pre-commit and post-merge hooks. The githooks manpage (Jörg linked to this too) notes that there's in fact a script in the contrib section demonstrating a way to do this. You can get this by grabbing a git tarball, or pull it out of git's gitweb repo: setgitperms.perl. Even if you're only using the update hook, that might be a useful model.
In general, Git is not intended for this. By now it seems to have out-of-the-box access control only up to the repository level.
But if you need just to hide some part of secret information in your Git repository (which is often the case) you can use git-crypt (https://github.com/AGWA/git-crypt) with encryption keys shared based on users GPG keys (https://gnupg.org/).
Alternatively you can use git submodules (https://git-scm.com/book/en/v2/Git-Tools-Submodules) if you can break your codebase to logical parts. Then all users receive access only to certain repositories which you then integrate into 'large' codebase through sub-modules where you add other code and allow it for only 'privileged' users.

Resources