One of our projects' storage size is increasing rapidly, and I no longer know where to look to free up space. In the Admin Area, the project's storage details indicate that Build Artifacts are what I need to clean up:
What I've tried so far
Set up an expiry time for artifacts - this doesn't seem to affect past artifacts, which are still taking up space
Housekeeping - no visible change
Delete pipelines through the REST API - I later found out this does not clean up job artifacts; I should have erased the jobs directly instead (see the sketch after this list)
Check GitLab's database - no jobs/builds are linked to my project anymore (tables checked: ci_builds and ci_job_artifacts)
Rake commands - ran sudo gitlab-rake gitlab:cleanup:orphan_job_artifact_files until the number of deleted orphans reached 0; no effect on my project's storage value. I also tested a couple of other commands on this page, to no avail
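For reference, erasing a job directly through the API (which removes that job's artifacts and log) would look roughly like this - a sketch with a placeholder URL, token, and IDs:

GITLAB_URL="https://gitlab.example.com"   # placeholder
TOKEN="<private-token>"                   # placeholder
PROJECT_ID=42                             # placeholder
JOB_ID=1234                               # placeholder

# Erase one job: deletes its artifacts and job log, the job entry itself is kept
curl --request POST \
  --header "PRIVATE-TOKEN: ${TOKEN}" \
  "${GITLAB_URL}/api/v4/projects/${PROJECT_ID}/jobs/${JOB_ID}/erase"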
I've also found this issue, which seems to indicate that storage cleanup isn't yet properly (or easily) handled by GitLab.
Questions
What is counted in a given project's storage value?
How do I clean up, given that nothing I've tried so far has had any effect?
I have exactly the same problem on GitLab 13.9.5. Executing the mentioned rake task removed the artifacts from disk; the artifacts folder now contains less than 1 GB of data. However, two of my projects still show about 4 GB under storage.
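If the files are already gone from disk but the storage counter looks stale, recalculating the project's statistics might help. This is only a hedged sketch: the project path is a placeholder and it assumes ProjectStatistics#refresh! is available in your GitLab version:

# Recalculate the statistics of one project ('group/project' is a placeholder path)
sudo gitlab-rails runner "Project.find_by_full_path('group/project').statistics.refresh!"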
Is there a way to delete a revision on Azure Container Apps?
Scenario
I have an Azure Container App instance for testing purposes, which I regularly push new revisions to using the az containerapp update command in my CI/CD pipeline whenever I merge a change onto my master branch. As the revisions all use a Docker image with the same tag :latest - but not (necessarily) the same code inside the container - I create a new unique revision suffix for each revision in order to create a revision-scope change.
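The update step in my pipeline looks roughly like this (app, resource group, registry, and build ID are placeholders):

# Push a new revision of the same :latest image, distinguished only by the revision suffix
az containerapp update \
  --name <my-container-app> \
  --resource-group <my-resource-group> \
  --image <myregistry>.azurecr.io/<my-image>:latest \
  --revision-suffix "build-${BUILD_ID}"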
I am using single-revision mode, so there is only ever one revision serving 100% of the traffic. Whenever I push with a new revision suffix, a new revision is created and activated, and the previous revision is deactivated.
With this approach, a lot of revisions accumulate over time. Most of them are no longer needed, yet they still occupy storage and - as revision names must be unique - block a lot of names I would like to reuse, so I'd like to delete them.
However, looking at the available commands in the Azure CLI for revisions, there does not seem to be a way to delete a revision.
The question therefore is: is there a way to delete those revisions? Alternatively, if revisions cannot be deleted, is there another way to force the container app to update the Docker image it is running even though the image tag does not change? (In that case I would not necessarily need to create a new revision every time.)
Expectation
I would have expected a deletion command to exist: there will be many container apps with many revisions, which take up storage (which one might eventually have to pay for) since a revision might be activated again at any time, so other Azure users presumably share the desire to delete outdated, deprecated, or unused revisions.
Agreed with #ahmelsayed's point that it is not possible to delete revisions manually and that they are eventually pruned to the most recent 100.
As mentioned in this MS Doc, a maximum of 100 revisions is kept; revisions beyond that are purged automatically, and there is no cost for inactive revisions.
You can deactivate (and reactivate) unused or outdated revisions using the Azure Portal, the Azure CLI, the REST API, or SDKs such as Java, Go, and JS.
Here is the syntax for deactivating an Azure Container Apps revision using the Azure CLI:
az containerapp revision deactivate --revision <Your_Container_Revision_Name> --resource-group <Your_Resource-Group_Name>
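To find candidates, you can first list the app's revisions and then deactivate the outdated ones by name - a sketch with placeholder names, assuming the active flag is exposed as properties.active:

# List all revisions of the app with their active flag
az containerapp revision list \
  --name <Your_Container_App_Name> \
  --resource-group <Your_Resource-Group_Name> \
  --query "[].{name:name, active:properties.active}" \
  --output table

# Deactivate one outdated revision by name
az containerapp revision deactivate \
  --revision <Outdated_Revision_Name> \
  --resource-group <Your_Resource-Group_Name>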
In the documentation, it seems we can set expire_in to several days or weeks, but I cannot pick an accurate fixed duration. Is it possible to always keep the latest artifact and remove the old ones once a new one is successfully built?
Okay, I found a solution here:
https://docs.gitlab.com/ee/ci/pipelines/job_artifacts.html#keep-artifacts-from-most-recent-successful-jobs
With this, it can easily be achieved.
The question should be closed.
Here are the instructions given in the official GitLab documentation:
Keeping the latest artifacts can use a large amount of storage space in projects with a lot of jobs or large artifacts. If the latest artifacts are not needed in a project, you can disable this behavior to save space:
On the top bar, select Menu > Projects and find your project.
On the left sidebar, select Settings > CI/CD.
Expand Artifacts.
Clear the Keep artifacts from most recent successful jobs checkbox.
You can disable this behavior for all projects on a self-managed instance in the instance’s CI/CD settings.
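If you prefer to script this per project, newer GitLab versions also expose the setting as a keep_latest_artifact attribute in the Projects API; a hedged sketch, assuming your version supports it (URL, token, and project ID are placeholders):

# Turn off "keep artifacts from most recent successful jobs" for one project
curl --request PUT \
  --header "PRIVATE-TOKEN: <private-token>" \
  "https://gitlab.example.com/api/v4/projects/<project-id>?keep_latest_artifact=false"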
We have self-hosted GitLab running on a single instance, but every now and then we face disk space issues because large artifacts fill up the space.
We have to go in and delete the older artifact folders manually.
Is there a way to automate this? Maybe a script that runs overnight and deletes artifact folders older than, say, 7 days?
The default expiration is set to 5 days in the GitLab Admin Area, but that does not mean they are deleted from the box.
When artifacts expire, they should be deleted from disk. If your artifacts are not deleted from your physical storage, there is a configuration issue with your storage. Ensure GitLab has write and delete permissions on the configured storage.
Artifacts that were created before the default expiration was set will still need to be deleted manually - but only once. All new artifacts will respect the artifact expiration.
However, you should do this through the API, not directly on the filesystem. Otherwise there will be a mismatch between what GitLab's database thinks exists and what actually exists on disk.
For an example script: see this answer.
Also note there are several circumstances under which artifacts are kept, such as the latest artifacts. New pipelines must run for old artifacts to expire. See documentation for more information.
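As a rough outline of what such a nightly script could look like, the sketch below walks one project's jobs through the API and deletes the artifacts of jobs that finished more than 7 days ago. The URL, token, and project ID are placeholders, it requires curl and jq, and it is a sketch rather than a finished solution:

#!/usr/bin/env bash
# Delete artifacts of jobs that finished more than 7 days ago (single project).
GITLAB_URL="https://gitlab.example.com"   # placeholder
TOKEN="<private-token>"                   # placeholder
PROJECT_ID=42                             # placeholder
CUTOFF=$(date -d "7 days ago" +%s)

page=1
while :; do
  jobs=$(curl -s --header "PRIVATE-TOKEN: ${TOKEN}" \
    "${GITLAB_URL}/api/v4/projects/${PROJECT_ID}/jobs?per_page=100&page=${page}")
  [ "$(echo "${jobs}" | jq 'length')" -eq 0 ] && break

  echo "${jobs}" | jq -r '.[] | select(.finished_at != null) | "\(.id) \(.finished_at)"' |
  while read -r id finished_at; do
    if [ "$(date -d "${finished_at}" +%s)" -lt "${CUTOFF}" ]; then
      # Deletes the job's artifacts only; the job and its log are kept
      curl -s --request DELETE --header "PRIVATE-TOKEN: ${TOKEN}" \
        "${GITLAB_URL}/api/v4/projects/${PROJECT_ID}/jobs/${id}/artifacts" > /dev/null
      echo "Deleted artifacts of job ${id} (finished ${finished_at})"
    fi
  done
  page=$((page + 1))
done

Run it from cron overnight and wrap an outer loop over your project IDs as needed.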
I started running GitLab CE inside of an x86 Debian VM locally about two years ago, and last year I decided to migrate the GitLab CE instance to a dedicated Intel NUC server. Everything appeared to go well with no issues, and my GitLab CE instance is up-to-date as of today (running 13.4.2).
I discovered recently, though, that some of the repos that were moved give a "NO REPOSITORY!" error when visiting their project pages, and any issue boards, merge requests, etc. they had are gone as well. You wouldn't suspect it, though, since the broken repos still appear in the repo lists alongside working repos that I use all the time.
If I had to find a pattern among the broken repos, it would be that their last activity was over a year ago: either nothing was ever pushed to them beyond an initial push, or any changes, issues, or merge requests date from over a year ago.
Some of these broken repos are rather large with a lot of history, whereas others are super tiny (literally just tracking changes to a shell script), so I don't think repo size itself has anything to do with it.
If I run the GitLab diagnostic check sudo gitlab-rake gitlab:check, everything looks good except for "hashed storage":
All projects are in hashed storage? ... no
Try fixing it:
Please migrate all projects to hashed storage
But then running sudo gitlab-rake gitlab:storage:migrate_to_hashed doesn't appear to complete (with something like six failed jobs in the dashboard), and running the "gitlab:check" again still indicates this "hashed storage" problem. I've also tried running sudo gitlab-rake gitlab:git:fsck and sudo gitlab-rake cache:clear but these commands don't seem to make a difference.
Luckily I have the latest versions of all the missing repos on my machine, and in fact, I still have the original VM running GitLab CE 12.8.5 (with slightly out of date copies of the repos.)
So my questions are:
Is it possible to "repair" the broken repos on my current instance? I suspect I could just "re-push" my local copies of these repos back up to my server, but I really don't want to lose any metadata like issues / merge requests and such.
Is there any way to resolve the "not all projects are in hashed storage" issue? (Again the migrate_to_hashed task fails to complete.)
Would I be able to do something like "backup", "inspect / tweak backup", "restore backup" kind of thing to fix the broken repos, or at least the metadata?
Thanks in advance.
Okay, so I think I figured out what happened.
I found this thread on the GitLab User Forums.
Apparently the scenario here is:
Have a GitLab instance that has repos not in "hashed storage"
Back up your repos
Restore your repos (either to the same server or while migrating to another server)
Either automatically or manually, attempt to migrate your repos to hashed storage
You'll find that any repo with a CI runner (continuous integration runner) is now listed as "NO REPOSITORY!" and is completely unavailable, since the hashed storage migration process fails for it
The fix is to:
Reset runner registration tokens as listed in this article in the GitLab documentation
Re-run the sudo gitlab-rake gitlab:storage:migrate_to_hashed process
Once the background jobs are completed, run sudo gitlab-rake gitlab:check to ensure the output contains the message:
All projects are in hashed storage? ... yes
If successful, the projects that stated "NO REPOSITORY!" should now be fully restored.
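To see which projects are still on legacy storage before and after re-running the migration, these rake tasks can help (assuming your GitLab version still ships them; they were part of the 13.x hashed storage tooling):

# Count, then list, the projects still on legacy (non-hashed) storage
sudo gitlab-rake gitlab:storage:legacy_projects
sudo gitlab-rake gitlab:storage:list_legacy_projects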
A key indicator that you need to run this process is if you:
Log in to your GitLab CE instance as an admin
Go to the Admin Area
Look under Monitoring->Background Jobs->Dead
and see a job with the name
hashed_storage:hashed_storage_project_migrate
with the error
OpenSSL::Cipher::CipherError:
I've recently been on a support ticket with Azure, and they've recommended turning on Local caching to eliminate occasional outage blips.
The problem with that is that you need to watch your disk space, since more than 1 GB is not allowed. And if you deploy from git, like I do, that's an issue because the whole repository is checked out, then built locally, and then Kudu-synced.
I've looked at trimming my repo down, but that's only going to yield small savings. What I'd like to do is remove my repository folder once the deployment has completed. Is that a sensible idea, or are there other solutions to this problem?
There is an upcoming change to the Local Caching behavior that will make it skip the repository folder (since it's not needed at runtime). This should be in the next couple of weeks.
Once that change is out, this issue should automatically go away for you.
The repository folder only contains a copy of your repo. It is OK to remove it if you want to save some space; it will be re-created when there is a new deployment.
There is one side effect when you delete your repository folder: your next deployment will take longer, since it will need to sync your entire repository.
Besides the repository folder, you can clean up the files under D:\home\LogFiles as well to save some more space.
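If you'd rather script that cleanup than do it by hand, the Kudu command API can run the delete for you; a hedged sketch with a placeholder app name and deployment credentials:

# Delete the files under D:\home\LogFiles via the Kudu command API
curl -u "<deploy-user>:<deploy-password>" \
  -H "Content-Type: application/json" \
  -X POST "https://<app>.scm.azurewebsites.net/api/command" \
  -d '{"command": "del /S /Q D:\\home\\LogFiles\\*", "dir": "D:\\home"}'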
I would recommend using a build sequence in Visual Studio Team Services - there you can do anything you want and include extra operations in the build pipeline (build trigger => delete the folder).