In my GitLab repository, I have a group with 20 projects. I want to clone all projects at once. Is that possible?
One-liner with curl and jq:
for repo in $(curl -s --header "PRIVATE-TOKEN: your_private_token" "https://<your-host>/api/v4/groups/<group_id>" | jq -r ".projects[].ssh_url_to_repo"); do git clone "$repo"; done;
For Gitlab.com use https://gitlab.com/api/v4/groups/<group_id>
To include subgroups, add the include_subgroups=true query parameter, like this:
https://<your-host>/api/v4/groups/<group_id>?include_subgroups=true
Note: To clone over HTTP, use http_url_to_repo instead of ssh_url_to_repo in jq (thanks @MattVon for the comment).
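For anyone without jq, here is a minimal Python sketch of the same field selection. The clone_urls helper and the sample response are made up for illustration; only the projects, ssh_url_to_repo and http_url_to_repo field names come from the one-liner above.

```python
import json
from typing import List


def clone_urls(group_json: str, protocol: str = "ssh") -> List[str]:
    """Return one clone URL per project found in a group API response body."""
    field = "ssh_url_to_repo" if protocol == "ssh" else "http_url_to_repo"
    group = json.loads(group_json)
    return [project[field] for project in group.get("projects", [])]


# Hypothetical, heavily trimmed response body (assumption, not real API output).
sample = json.dumps({
    "projects": [
        {
            "ssh_url_to_repo": "git@gitlab.example.com:grp/a.git",
            "http_url_to_repo": "https://gitlab.example.com/grp/a.git",
        }
    ]
})

ssh_urls = clone_urls(sample)
http_urls = clone_urls(sample, protocol="http")
print(ssh_urls)
print(http_urls)
```

Each returned URL can then be handed to git clone, for example through subprocess.run(["git", "clone", url]).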
Update Dec. 2022, use glab repo clone
glab repo clone -g <group> -p --paginate
With:
-p, --preserve-namespace: Clone the repo in a subdirectory based on namespace
--paginate: Make additional HTTP requests to fetch all pages of projects before cloning. Respects --per-page
This supports cloning more than 100 repositories (since MR 1030 and glab v1.24.0, Dec. 2022).
This is for gitlab.com or for a self-managed GitLab instance, provided you set the environment variable GITLAB_URI or GITLAB_HOST: it specifies the URL of the GitLab server if self-managed (eg: https://gitlab.example.com).
Original answer and updates (starting March 2015):
Not really, unless:
you have a 21st project which references the other 20 as submodules.
(in which case a clone followed by a git submodule update --init would be enough to get all 20 projects cloned and checked out)
or you somehow list the projects you have access to (GitLab API for projects), and loop over that result to clone each one (meaning it can be scripted, and then executed as "one" command)
Since 2015, Jay Gabez mentions in the comments (August 2019) the tool gabrie30/ghorg
ghorg allows you to quickly clone all of an org's or user's repos into a single directory.
Usage:
$ ghorg clone someorg
$ ghorg clone someuser --clone-type=user --protocol=ssh --branch=develop
$ ghorg clone gitlab-org --scm=gitlab --namespace=gitlab-org/security-products
$ ghorg clone --help
Also (2020): https://github.com/ezbz/gitlabber
usage: gitlabber [-h] [-t token] [-u url] [--debug] [-p]
[--print-format {json,yaml,tree}] [-i csv] [-x csv]
[--version]
[dest]
Gitlabber - clones or pulls entire groups/projects tree from gitlab
Here's an example in Python 3:
from urllib.request import urlopen
import json
import subprocess

# GitLab caps per_page at 100, so larger instances need several pages.
allProjects = urlopen("https://[yourServer:port]/api/v4/projects?private_token=[yourPrivateTokenFromUserProfile]&per_page=100")
allProjectsDict = json.loads(allProjects.read().decode())
for thisProject in allProjectsDict:
    thisProjectURL = thisProject['ssh_url_to_repo']
    try:
        # Wait for each clone to finish and raise if git exits with an error.
        subprocess.run(['git', 'clone', thisProjectURL], check=True)
    except (OSError, subprocess.CalledProcessError) as e:
        print("Error on %s: %s" % (thisProjectURL, e))
There is a tool called myrepos, which manages multiple version controls repositories. Updating all repositories simply requires one command:
mr update
In order to register all gitlab projects to mr, here is a small python script. It requires the package python-gitlab installed:
import os
from subprocess import call
from gitlab import Gitlab

# Register a connection to a gitlab instance, using its URL and a user private token
gl = Gitlab('http://192.168.123.107', private_token='JVNSESs8EwWRx5yDxM5q')
groupsToSkip = ['aGroupYouDontWantToBeAdded']

gl.auth()  # Connect to get the current user

gitBasePathRelative = "git/"
gitBasePathRelativeAbsolut = os.path.expanduser("~/" + gitBasePathRelative)
os.makedirs(gitBasePathRelativeAbsolut, exist_ok=True)

# Current python-gitlab releases list projects with gl.projects.list();
# the namespace comes back as a plain dict.
for p in gl.projects.list(all=True):
    if not any(p.namespace['path'] in s for s in groupsToSkip):
        pathToFolder = gitBasePathRelative + p.namespace['name'] + "/" + p.name
        commandArray = ["mr", "config", pathToFolder, "checkout=git clone '" + p.ssh_url_to_repo + "' '" + p.name + "'"]
        call(commandArray)

os.chdir(gitBasePathRelativeAbsolut)
call(["mr", "update"])
I built a script (curl, git, jq required) just for that. We use it and it works just fine: https://gist.github.com/JonasGroeger/1b5155e461036b557d0fb4b3307e1e75
To find out your namespace, it's best to check the API quickly:
curl "https://domain.com/api/v3/projects?private_token=$GITLAB_PRIVATE_TOKEN"
There, use "namespace.name" as NAMESPACE for your group.
The script essentially does the following:
1. Gets all projects that match your PROJECT_SEARCH_PARAM
2. Gets their path and ssh_url_to_repo
2.1. If the directory path exists, cd into it and call git pull
2.2. If the directory path does not exist, call git clone
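The pull-or-clone decision in steps 2.1 and 2.2 could be sketched in Python roughly like this. It is not the gist's code; sync_command is a hypothetical helper, and the URL and temporary directory are placeholders so the example runs without touching a real server.

```python
import os
import tempfile
from typing import List, Optional, Tuple


def sync_command(path: str, ssh_url: str) -> Tuple[List[str], Optional[str]]:
    """Return (argv, cwd): 'git pull' inside an existing directory, else 'git clone'."""
    if os.path.isdir(path):
        return ["git", "pull"], path
    return ["git", "clone", ssh_url, path], None


# Placeholder values for illustration only.
url = "git@gitlab.example.com:grp/a.git"
demo_dir = tempfile.mkdtemp()  # stands in for an already cloned project
new_dir = os.path.join(demo_dir, "not-cloned-yet")

existing = sync_command(demo_dir, url)
missing = sync_command(new_dir, url)
print(existing)
print(missing)
```

The returned pair is meant to be passed on as subprocess.run(argv, cwd=cwd), which keeps the decision logic separate from actually running git.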
Here is another example of a bash script to clone all the repos in a group. The only dependency you need to install is jq (https://stedolan.github.io/jq/). Simply place the script into the directory you want to clone your projects into. Then run it as follows:
./myscript <group name> <private token> <gitlab url>
i.e.
./myscript group1 abc123tyn234 http://yourserver.git.com
Script:
#!/bin/bash
if command -v jq >/dev/null 2>&1; then
  echo "jq parser found";
else
  echo "this script requires the 'jq' json parser (https://stedolan.github.io/jq/).";
  exit 1;
fi

if [ -z "$1" ]
then
  echo "a group name arg is required"
  exit 1;
fi

if [ -z "$2" ]
then
  echo "an auth token arg is required. See $3/profile/account"
  exit 1;
fi

if [ -z "$3" ]
then
  echo "a gitlab URL is required."
  exit 1;
fi

TOKEN="$2";
# API v4 (v3 was removed in GitLab 11.0)
URL="$3/api/v4"
PREFIX="ssh_url_to_repo";

echo "Cloning all git projects in group $1";

# Quote the URLs so the shell does not treat '?' as a glob character
GROUP_ID=$(curl --header "PRIVATE-TOKEN: $TOKEN" "$URL/groups?search=$1" | jq '.[].id')
echo "group id was $GROUP_ID";
curl --header "PRIVATE-TOKEN: $TOKEN" "$URL/groups/$GROUP_ID/projects?per_page=100" | jq --arg p "$PREFIX" '.[] | .[$p]' | xargs -L1 git clone
Yep it's possible, here is the code.
prerequisites:
pip install python-gitlab
#!/usr/bin/python3
import sys
import gitlab
import subprocess

# Arguments: GitLab hostname, group name, personal access token
glab = gitlab.Gitlab(f'https://{sys.argv[1]}', private_token=sys.argv[3])
groups = glab.groups.list(all=True)
groupname = sys.argv[2]
for group in groups:
    if group.name == groupname:
        projects = group.projects.list(all=True)
        for repo in projects:
            # Pass an argument list instead of a shell string to avoid quoting issues
            subprocess.run(['git', 'clone', repo.ssh_url_to_repo])
Example:
create a .py file (e.g. gitlab-downloader.py)
copy-paste the code from above
on Linux (or macOS), run chmod +x on the script file (e.g. chmod +x gitlab-downloader.py)
run it with 3 parameters: GitLab hostname, group name, and your Personal Access Token (see https://gitlab.example.com/profile/personal_access_tokens)
I created a tool for that: https://github.com/ezbz/gitlabber, you can use glob/regex expressions to select groups/subgroups you'd like to clone.
Say your top-level group is called MyGroup and you want to clone all projects under it to ~/GitlabRoot you can use the following command:
gitlabber -t <personal access token> -u <gitlab url> -i '/MyGroup**' ~/GitlabRoot
Lot of good answers, but here's my take. Use it if you:
want to clone everything in parallel
have your ssh keys configured to clone from the server without entering a password
don't want to bother creating an access token
are using a limited shell like git bash (without jq)
So, using your browser, access https://gitlab.<gitlabserver>/api/v4/groups/<group name>?per_page=1000, download the JSON with all the project info, and save it as a file named group.json.
Now just run this simple command in the same directory:
egrep -o 'git@[^"]+.git' group.json|xargs -n 1 -P 8 git clone
Increase the number in -P 8 to change the number of parallel processes. If you have more than a thousand repositories, increase the number after per_page=.
If <group name> has spaces or accented characters, note that it must be URL-encoded.
If you want to automate the download, the easiest way to authenticate is to generate an access token in GitLab/GitHub and put it in the URL: https://user:access_token@mygitlab.net/api/v4/groups/<group name>?per_page=1000.
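If you are unsure how to URL-encode a group name by hand, Python's standard library can do it. A minimal sketch, assuming a made-up host and group name; group_api_url is a hypothetical helper, not part of any GitLab tooling:

```python
from urllib.parse import quote


def group_api_url(host: str, group_path: str, per_page: int = 1000) -> str:
    """Build the group API URL with the group name percent-encoded."""
    # safe='' also encodes '/', which matters for nested group paths.
    encoded = quote(group_path, safe="")
    return f"https://{host}/api/v4/groups/{encoded}?per_page={per_page}"


# Example group name with a space, a slash and an accented character.
example_url = group_api_url("gitlab.example.com", "My Group/Caf\u00e9")
print(example_url)
```

Spaces become %20, the slash becomes %2F, and accented characters are encoded as their UTF-8 bytes.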
Using curl, jq and tr and the same approach described previously, but for more than 20 projects:
for repo in $(curl --header "PRIVATE-TOKEN:<Private-Token>" -s "https://<your-host>/api/v4/groups/<group-id>/projects?include_subgroups=true&per_page=100&page=n" | jq '.[].ssh_url_to_repo' | tr -d '"'); do git clone $repo; done;
For Gitlab.com use https://gitlab.com/api/v4/groups/[group-id]/projects
You only need to iterate, changing the page number (the n in page=n) each time.
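The "iterate over the page number" part can be sketched in Python as below. This is only an illustration: fetch_page is a hypothetical callable standing in for the curl request above, and the loop assumes the API returns an empty list once you ask for a page past the last one.

```python
from typing import Callable, Dict, Iterator, List


def iter_projects(fetch_page: Callable[[int], List[Dict]]) -> Iterator[Dict]:
    """Yield projects page by page until an empty page comes back."""
    page = 1
    while True:
        projects = fetch_page(page)
        if not projects:
            break
        yield from projects
        page += 1


# Fake two-page result so the sketch runs without a server (made-up data).
fake_pages = {
    1: [{"ssh_url_to_repo": "git@gitlab.example.com:grp/a.git"},
        {"ssh_url_to_repo": "git@gitlab.example.com:grp/b.git"}],
    2: [{"ssh_url_to_repo": "git@gitlab.example.com:grp/c.git"}],
}


def fake_fetch(page: int) -> List[Dict]:
    return fake_pages.get(page, [])


all_urls = [p["ssh_url_to_repo"] for p in iter_projects(fake_fetch)]
print(all_urls)
```

In real use, fetch_page would request .../projects?include_subgroups=true&per_page=100&page=<page> with the PRIVATE-TOKEN header and return the decoded JSON list.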
An updated Python 3 script that accomplishes this really effectively using Gitlab's latest api and proper pagination:
import requests
import subprocess, shlex
import os

print('Starting getrepos process..')

key = '12345678901234567890'  # your gitlab key
base_url = 'https://your.gitlab.url/api/v4/projects?simple=true&per_page=10&private_token='
url = base_url + key

base_dir = os.getcwd()

while True:
    print('\n\nRetrieving from ' + url)
    response = requests.get(url, verify = False)
    projects = response.json()
    for project in projects:
        project_name = project['name']
        project_path = project['namespace']['full_path']
        project_url = project['ssh_url_to_repo']

        os.chdir(base_dir)
        print('\nProcessing %s...' % project_name)

        try:
            print('Moving into directory: %s' % project_path)
            os.makedirs(project_path, exist_ok = True)
            os.chdir(project_path)
            cmd = shlex.split('git clone --mirror %s' % project_url)
            subprocess.run(cmd)
        except Exception as e:
            # Not every exception has .strerror, so print the exception itself
            print('Error: %s' % e)

    # requests parses the Link response header into response.links
    if 'next' not in response.links:
        break

    url = response.links['next']['url'].replace('127.0.0.1:9999', 'your.gitlab.url')

print('\nDone')
Requires the requests library (for navigating to the page links).
If you are okay with some shell sorcery this will clone all the repos grouped by their group-id (you need jq and parallel)
seq 3 \
| parallel curl -s "'https://[gitlabUrl]/api/v4/projects?page={}&per_page=100&private_token=[privateToken]'
| jq '.[] | .ssh_url_to_repo, .name, .namespace.path'" \
| tr -d '"' \
| awk '{ printf "%s ", $0; if (NR % 3 == 0) print " " }' \
| parallel --colsep ' ' 'mkdir -p {2} && git clone {1} {3}/{2}'
I have written a script to pull the complete code base from GitLab for a particular group.
# Number of pages the projects span (20 projects per page, so 50 projects need pages 1..3)
for pag in {1..3}
do
  curl -s "http://gitlink/api/v4/groups/{groupName}/projects?page=$pag" > url.txt
  grep -o '"ssh_url_to_repo": *"[^"]*"' url.txt | grep -o '"[^"]*"$' | while read -r line ; do
    l1=${line%?}
    l2=${l1:1}
    echo "$l2"
    git clone "$l2"
  done
done
Here's a Java version that worked for me using gitlab4j with an access token and git command.
I ran this on Windows and Mac and it works. For Windows, just add 'cmd /c' before 'git clone' inside the .exec()
// Needs: java.io.File, java.util.List,
// org.gitlab4j.api.GitLabApi, org.gitlab4j.api.models.Project
void doClone() throws Exception {
    // The host is passed as a full URL, including the scheme
    try (GitLabApi gitLabApi = new GitLabApi("https://[your-git-host].com/", "[your-access-token]");) {
        List<Project> projects = gitLabApi.getGroupApi().getProjects("[your-group-name]");
        projects.forEach(p -> {
            try {
                Runtime.getRuntime().exec("git clone " + p.getSshUrlToRepo(), null, new File("[path-to-folder-to-clone-projects-to]"));
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }
}
You can refer to this ruby script here:
https://gist.github.com/thegauraw/da2a3429f19f603cf1c9b3b09553728b
But you need to make sure that you have your organization's GitLab API URL (which looks like https://gitlab.example.com/api/v3/ for an example organization) and a private token (which looks like QALWKQFAGZDWQYDGHADS; you can get it at https://gitlab.example.com/profile/account once you are logged in). Also make sure that you have the httparty gem installed, or run gem install httparty.
Another way to do it with Windows "Git Bash" that has limited packages installed :
#!/bin/bash
curl -o projects.json "https://<GitLabUrl>/api/v4/projects?private_token=<YourToken>"
i=0
while : ; do
    echo "/$i/namespace/full_path" > jsonpointer
    path=$(jsonpointer -f jsonpointer projects.json 2>/dev/null | tr -d '"')
    [ -z "$path" ] && break
    echo $path
    if [ "${path%%/*}" == "<YourProject>" ]; then
        [ ! -d "${path#*/}" ] && mkdir -p "${path#*/}"
        echo "/$i/ssh_url_to_repo" > jsonpointer
        url=$(jsonpointer -f jsonpointer projects.json 2>/dev/null | tr -d '"')
        ( cd "${path#*/}" ; git clone --mirror "$url" )
    fi
    let i+=1
done
rm -f projects.json jsonpointer
In response to @Kosrat D. Ahmad, as I had the same issue (with nested subgroups; mine actually went as much as 5 deep!)
#!/bin/bash
URL="https://mygitlaburl/api/v4"
TOKEN="mytoken"

function check_subgroup {
    echo "checking $gid"
    if [[ $(curl --header "PRIVATE-TOKEN: $TOKEN" "$URL/groups/$gid/subgroups/" | jq -r '.[].id') != "" ]]; then
        for gid in $(curl --header "PRIVATE-TOKEN: $TOKEN" "$URL/groups/$gid/subgroups/" | jq -r '.[].id')
        do
            check_subgroup
        done
    else
        echo $gid >> top_level
    fi
}

> top_level #empty file
> repos #empty file
for gid in $(curl --header "PRIVATE-TOKEN: $TOKEN" "$URL/groups/" | jq -r '.[].id')
do
    check_subgroup
done

# This is necessary because there will be duplicates if each group has multiple nested groups. I'm sure there's a more elegant way to do this though!
for gid in $(sort top_level | uniq)
do
    curl --header "PRIVATE-TOKEN: $TOKEN" "$URL/groups/$gid" | jq -r '.projects[].http_url_to_repo' >> repos
done

while read repo; do
    git clone $repo
done <repos

rm top_level
rm repos
Note: I use jq .projects[].http_url_to_repo; this can be replaced with .ssh_url_to_repo if you'd prefer.
Alternatively, strip out the rm commands and look at the files individually to check the output.
Admittedly this will clone everything, but you can tweak it however you want.
Resources: https://docs.gitlab.com/ee/api/groups.html#list-a-groups-subgroups
This is a slightly improved version of the one-liner in @ruben-lohaus's post.
It works for up to 100 repos in the group.
It clones every repository in the group, including the path.
Requirements:
grep
jq
curl
GITLAB_URL="https://gitlab.mydomain.local/api/v4/groups/1141/projects?include_subgroups=true&per_page=100&page=1"
GITLAB_TOKEN="ABCDEFABCDef_5n"
REPOS=$(curl --header "PRIVATE-TOKEN:${GITLAB_TOKEN}" -s "${GITLAB_URL}" | jq -r '.[].ssh_url_to_repo')
for repo in $(echo -e "$REPOS")
do git clone "$repo" "$(echo "$repo" | grep -oP '(?<=:).*(?=.git$)')"
done
Based on this answer, with personal access token instead of SSH to git clone.
One liner with curl, jq, tr
Without subgroups :
for repo in $(curl -s --header "PRIVATE-TOKEN: <private_token>" https://<your-host>/api/v4/groups/<group-name> | jq ".projects[]".http_url_to_repo | tr -d '"' | cut -c 9-); do git clone https://token:<private_token>@$repo; done;
Including subgroups :
for repo in $(curl -s --header "PRIVATE-TOKEN: <private_token>" "https://<your-host>/api/v4/groups/<group-name>/projects?include_subgroups=true&per_page=1000" | jq ".[]".http_url_to_repo | tr -d '"' | cut -c 9-); do git clone https://token:<private_token>@$repo; done;
Please note that the private_token for the curl call must have API rights. The private_token for git clone must have at least read_repository rights. It can be the same token (if it has API rights), but it could also be two different tokens.
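The cut -c 9- trick above strips "https://" so the token can be put in front of the host. The same rewrite can be sketched in Python with urllib.parse; with_token is a hypothetical helper, and the "token" user name simply mirrors the one-liner above:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_token(http_url: str, token: str, user: str = "token") -> str:
    """Insert user:token@ in front of the host of an HTTPS clone URL."""
    parts = urlsplit(http_url)
    # Percent-encode the token in case it contains characters such as '@' or '/'.
    netloc = f"{user}:{quote(token, safe='')}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Placeholder URL and token for illustration only.
authed = with_token("https://gitlab.example.com/grp/a.git", "abc123")
print(authed)
```

Keep in mind that a URL built this way ends up in the clone's .git/config, so the token is stored in plain text on disk.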
I know this question is a few years old, but I had problems with awk/sed and the scripts here (macOS).
I wanted to clone a root-group including their subgroups while keeping the tree structure.
My python script:
#!/usr/bin/env python3
import os
import re
import requests
import posixpath
import argparse
from git import Repo

parser = argparse.ArgumentParser('gitlab-clone-group.py')
parser.add_argument('group_id', help='id of group to clone (including subgroups)')
parser.add_argument('directory', help='directory to clone repos into')
parser.add_argument('--token', help='Gitlab private access token with read_api and read_repository rights')
parser.add_argument('--gitlab-domain', help='Domain of Gitlab instance to use, defaults to: gitlab.com', default='gitlab.com')

args = parser.parse_args()

api_url = 'https://' + posixpath.join(args.gitlab_domain, 'api/v4/groups/', args.group_id, 'projects') + '?per_page=9999&page=1&include_subgroups=true'
headers = {'PRIVATE-TOKEN': args.token}

res = requests.get(api_url, headers=headers)
projects = res.json()
base_ns = os.path.commonprefix([p['namespace']['full_path'] for p in projects])
print('Found %d projects in: %s' % (len(projects), base_ns))

abs_dir = os.path.abspath(args.directory)
os.makedirs(abs_dir,exist_ok=True)

def get_rel_path(path):
    subpath = path[len(base_ns):]
    if (subpath.startswith('/')):
        subpath = subpath[1:]
    return posixpath.join(args.directory, subpath)

for p in projects:
    clone_dir = get_rel_path(p['namespace']['full_path'])
    project_path = get_rel_path(p['path_with_namespace'])
    print('Cloning project: %s' % project_path)
    if os.path.exists(project_path):
        print("\tProject folder already exists, skipping")
    else:
        print("\tGit url: %s" % p['ssh_url_to_repo'])
        os.makedirs(clone_dir, exist_ok=True)
        Repo.clone_from(p['ssh_url_to_repo'], project_path)
Usage
Download the gitlab-clone-group.py
Generate a private access token with read_api and read_repository rights
Get your group ID (displayed in light gray below your group name)
Run the script
Example:
python3 gitlab-clone-group.py --token glabc-D-e-llaaabbbbcccccdd 12345678 .
Clones the group 12345678 (and subgroups) into the current working directory, keeping the tree structure.
Help:
usage: gitlab-clone-group.py [-h] [--token TOKEN] [--gitlab-domain GITLAB_DOMAIN] group_id directory
positional arguments:
group_id id of group to clone (including subgroups)
directory directory to clone repos into
options:
-h, --help show this help message and exit
--token TOKEN Gitlab private access token with read_api and read_repository rights
--gitlab-domain GITLAB_DOMAIN
Domain of Gitlab instance to use, defaults to: gitlab.com
Source: https://github.com/adroste/gitlab-clone-group
An alternative based on Dmitriy's answer, for the case where you want to clone the repositories of a whole group tree recursively.
#!/usr/bin/python3
import os
import sys
import gitlab
import subprocess

def visit(group):
    # Recreate the group as a directory, clone its projects, then recurse into subgroups
    name = group.name
    real_group = glab.groups.get(group.id)
    os.mkdir(name)
    os.chdir(name)
    clone(real_group.projects.list(all=True))
    for child in real_group.subgroups.list():
        visit(child)
    os.chdir("../")

def clone(projects):
    for repo in projects:
        command = f'git clone {repo.ssh_url_to_repo}'
        process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
        output, _ = process.communicate()
        process.wait()

# Arguments: GitLab hostname, root group name, personal access token
glab = gitlab.Gitlab(f'https://{sys.argv[1]}', f'{sys.argv[3]}')
groups = glab.groups.list()
root = sys.argv[2]
for group in groups:
    if group.name == root:
        visit(group)
Modified @Hot Diggity's answer.
from urllib.request import urlopen
import json
import subprocess, shlex

allProjects = urlopen("https://gitlab.com/api/v4/projects?private_token=token&membership=true&per_page=1000")
allProjectsDict = json.loads(allProjects.read().decode())
for thisProject in allProjectsDict:
    try:
        thisProjectURL = thisProject['ssh_url_to_repo']
        # Flatten group/subgroup/project into a single directory name
        path = thisProject['path_with_namespace'].replace('/', '-')
        command = shlex.split('git clone %s %s' % (thisProjectURL, path))
        p = subprocess.Popen(command)
        p_status = p.wait()
    except Exception as e:
        print("Error on %s: %s" % (thisProjectURL, e))
For PowerShell (replace <gitlab host> and <group>, and pass in a private token from GitLab, or hardcode it):
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$url="https://<gitlab host>/api/v4/groups/<group>/projects?simple=1&include_subgroups=true&private_token="+$args[0]
$req = Invoke-WebRequest $url | ConvertFrom-Json
foreach( $project in $req ) {
    Start-Process git -ArgumentList "clone", $project.ssh_url_to_repo
}
One-liner Python 3 version of Dinesh Balasubramanian's response.
I only made this for lack of jq; it needs only python3 (with requests).
import requests,os; [os.system('git clone {[http_url_to_repo]}'.format(p)) for p in requests.get('https://<<REPO_URL>>/api/v4/groups/<<GROUP_ID>>',headers={'PRIVATE-TOKEN':'<<YOUR_PRIVATE_TOKEN>>'},verify=False).json()['projects']]
Replace <<REPO_URL>>, <<GROUP_ID>> and <<YOUR_PRIVATE_TOKEN>>
If you have far more than 20 projects in the group, you have to deal with pagination (https://docs.gitlab.com/ee/api/#pagination). That is, you have to make several requests to clone all projects. The most challenging part is obtaining the full project list for the group. This is how to do it in a bash shell using curl and jq:
for ((i = 1; i <= <NUMBER_OF_ITERATIONS>; i++)); do curl -s --header "PRIVATE-TOKEN: <YOUR_TOKEN>" "https://<YOUR_URL>/api/v4/groups/<YOUR_GROUP_ID>/projects?per_page=100&page=${i}" | jq -r ".[].ssh_url_to_repo"; done | tee log
The number of iterations in a for loop can be obtained from X-Total-Pages response header in a separate request, see Pagination in Gitlab API only returns 100 per_page max.
After projects list is obtained, you can clone projects with:
for i in `cat log`; do git clone $i; done
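To avoid hardcoding <NUMBER_OF_ITERATIONS>, the page URLs can be derived from the X-Total-Pages header mentioned above. A rough Python sketch, with the headers faked so it runs offline; page_urls is a hypothetical helper, and the fallback to a single page when the header is missing is my own assumption:

```python
from typing import List, Mapping


def page_urls(base_url: str, headers: Mapping[str, str], per_page: int = 100) -> List[str]:
    """Build one URL per page, using the X-Total-Pages response header."""
    # Assumption: fall back to a single page if the header is absent.
    total_pages = int(headers.get("X-Total-Pages", "1"))
    return [
        f"{base_url}?per_page={per_page}&page={page}"
        for page in range(1, total_pages + 1)
    ]


# Fake headers, as they might come back from a first request (made-up values).
fake_headers = {"X-Total-Pages": "3"}
urls = page_urls("https://gitlab.example.com/api/v4/groups/42/projects", fake_headers)
for url in urls:
    print(url)
```

With requests, the headers would come from the first response (response.headers), and each generated URL is then fetched with the same PRIVATE-TOKEN header.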
No need to write code, just be smart.
In Chrome there is an extension, "Link Klipper", that grabs all the URLs on a page.
Extract all the URLs into an Excel file and filter them in Excel.
Create a .sh file:
#!/bin/bash
git clone https:(url) (command)
Run this script, and all the repositories will be cloned to your local machine at once.
I was a little unhappy with this pulling in archived and empty repos, which I accept are not problems everyone has. So I made the following monstrosity from the accepted answer.
for repo in $(curl -s --header "PRIVATE-TOKEN: $GITLAB_TOKEN" "https://gitlab.spokedev.xyz/api/v4/groups/238?include_subgroups=true" | jq '.projects[] | select(.archived == false) | select(.empty_repo == false) | .http_url_to_repo' | sed s"-https://gitlab-https://oauth2:\${GITLAB_TOKEN}@gitlab-g"); do
  echo "git clone $repo"
done | bash
This is derived from the top answer, to give full credit; it just demonstrates selecting based on properties (jq compares with ==; === does not work). I also use HTTPS clone here, because we set passwords on our SSH keys and don't want them in a keychain, or to type the password many times.
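The same property-based selection, written as a small Python sketch for comparison. The project dicts are made up; only the archived, empty_repo and http_url_to_repo field names are taken from the jq filter above, and active_repo_urls is a hypothetical helper.

```python
from typing import Dict, List


def active_repo_urls(projects: List[Dict]) -> List[str]:
    """Keep only projects that are neither archived nor empty."""
    return [
        p["http_url_to_repo"]
        for p in projects
        if not p.get("archived") and not p.get("empty_repo")
    ]


# Made-up sample data for illustration.
sample_projects = [
    {"http_url_to_repo": "https://gitlab.example.com/g/live.git", "archived": False, "empty_repo": False},
    {"http_url_to_repo": "https://gitlab.example.com/g/old.git", "archived": True, "empty_repo": False},
    {"http_url_to_repo": "https://gitlab.example.com/g/empty.git", "archived": False, "empty_repo": True},
]

selected = active_repo_urls(sample_projects)
print(selected)
```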
I was wrestling with issues with the scripts posted here, so I made a dumbed down hybrid version using an API call with PostMan, a JSON Query, and a dumb bash script. Here's the steps in case someone else runs into this.
Get your group ID. My group ID didn't match what was displayed on my group's page for some reason. Grab it by hitting the API here: https://gitlab.com/api/v4/groups
Use Postman to hit the API to list your projects. URL: https://gitlab.com/api/v4/groups/GROUP_ID/projects?per_page=100&include_subgroups=true
Set the Auth to use API Key, where the Key is "PRIVATE-TOKEN" and the value is your private API Key
Copy the results and drop them in at: https://www.jsonquerytool.com/
Switch the Transform to JSONata. Change the query to "$.http_url_to_repo" (ssh_url_to_repo if using SSH)
You should now have a JSON array of your git urls to clone. Change the format to match Bash array notation (change [] to () and drop the commas).
Drop your Bash array into the repos variable of the script below.
repos=(
""
)
for repo in "${repos[@]}"; do git clone "$repo"
done
Save your script in the folder where you want your repos checked out.
Run bash {yourscriptname}.sh
That should do it. You should now have a directory with all of your repos checked out for backup purposes.
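If hand-editing the JSON array into Bash array notation gets tedious, that one step could be scripted. A small sketch, assuming the JSON array of URLs has been saved as text; to_bash_array is a hypothetical helper and the URLs are placeholders:

```python
import json
import shlex


def to_bash_array(json_text: str, name: str = "repos") -> str:
    """Turn a JSON array of URLs into a Bash array assignment."""
    urls = json.loads(json_text)
    # shlex.quote only adds quotes when a value contains shell-special characters.
    lines = [f"  {shlex.quote(url)}" for url in urls]
    return "\n".join([f"{name}=(", *lines, ")"])


# Placeholder input, in the shape produced by the JSONata query.
sample = '["https://gitlab.com/grp/a.git", "https://gitlab.com/grp/b.git"]'
bash_array = to_bash_array(sample)
print(bash_array)
```

The printed text can be pasted in place of the repos=( ... ) block of the script.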
It's my understanding that all the answers only allow you to clone repos, but not issues, boards, other settings, etc. Please correct me if I am wrong.
I feel if I want to back up all the data from multiple projects, the intention is to include not only repos, but also other data, which could be as important as the repos.
Self-hosted Gitlab instances can achieve this with official support, see backup and restore GitLab.