List LFS tracked files in pre-receive hook - gitlab

I am trying to write a git pre-receive hook that rejects LFS files larger than a certain size (among other things).
I am trying to execute git lfs ls-files -l -s <new-ref-value> in my script, but it returns
2ec20be70bb1be824e124a61eabac40405d60de62c76d263eff9923f18c098ed - binary.dll (63 B)
Could not scan for Git LFS tree: missing object: a405ce05ac78ea1b820d036676831a474ddf8f90
I cannot even ignore the error message because it stops after the first file.
I guess that the problem has to do with the fact that the commits have not been "validated" on the remote yet. The frustrating thing is that the information that I need (new file paths + sizes) is accessible since it's printed for the first file.
Is there a way to run the git lfs ls-files command with the new ref value successfully at this stage?
Can I obtain the list of the added file paths and sizes in any other way?
EDIT: If that's relevant in any way, the Git server is a GitLab instance in its default configuration.
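One possible workaround (a sketch, untested, and not from the thread): inside a pre-receive hook the incoming objects are readable with plain git commands, so instead of git lfs ls-files you can scan the pushed blobs yourself and parse the LFS pointer text, which carries the size (as in the output above). The 200-byte pointer cutoff and the 5 MiB limit below are arbitrary assumptions for illustration:

#!/bin/sh
# Sketch of a pre-receive hook rejecting LFS pointers above MAX_SIZE bytes.
# pre-receive hooks read "old-value new-value ref-name" lines on stdin.
MAX_SIZE=5242880   # 5 MiB; an arbitrary example limit
ZERO=0000000000000000000000000000000000000000

while read -r old new ref; do
    [ "$new" = "$ZERO" ] && continue          # ref deletion, nothing to check
    # List only objects that are new to the repository on this push.
    git rev-list --objects "$new" --not --all | while read -r sha path; do
        [ -n "$path" ] || continue            # skip commits (no path field)
        size=$(git cat-file -s "$sha" 2>/dev/null) || continue
        [ "$size" -le 200 ] || continue       # real LFS pointers are tiny
        content=$(git cat-file blob "$sha" 2>/dev/null) || continue
        case "$content" in
        "version https://git-lfs.github.com/spec/v1"*)
            lfs_size=$(printf '%s\n' "$content" | sed -n 's/^size //p')
            if [ "$lfs_size" -gt "$MAX_SIZE" ]; then
                echo "rejected: $path is $lfs_size bytes (limit $MAX_SIZE)" >&2
                exit 1
            fi
            ;;
        esac
    done || exit 1                            # propagate the subshell's failure
done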

Related

How to fix merge conflicts for a lot of files in git?

I am using the git mergetool command to fix conflicts. However, I have thousands of conflicts; is there a way to simplify this so I get everything from the remote?
I am asked to enter c, d, or a at the prompt:
{local}: deleted
{remote}: created file
Use (c)reated or (d)eleted file, or (a)bort?
Since I have thousands of files, I don't want to keep typing c. Is there a way to do this in bulk?
You can solve this outside of git mergetool: run git status --porcelain to get a list of all unmerged files and their states in machine-readable format.
If your Git is new enough, it will support --porcelain=v2. See the git status documentation for details on the output formats. Output format v2 is generally superior for all purposes, but you should be able to make do with either one.
Next, you must write a program. Unfortunately Git has no supplied programs for this. Your program can be fairly simple depending on the specific cases you want to solve, and you can use shell scripting (sh or bash) as the programming language, to keep it easy.
Since you're concerned about the cases where git mergetool says:
Use (c)reated or (d)eleted file, or (a)bort?
you are interested in those cases where the file name is missing in the stage 1 ("base") version and also missing in the stage 2 ("local") version, but exists in the stage 3 ("remote") version. (See the git status documentation again and look at examples of your git status --porcelain=v2 output to see how to detect these cases. Two of the three modes will be zero.) For those particular path names, simply run git add on the path name to mark the file as resolved in favor of the created file.
Once you have marked all such files, you can go back to running git mergetool to resolve additional conflicts, if there are any.
Note that your "program" can consist of running:
git status --porcelain=v2 > /tmp/commands.sh
and then editing /tmp/commands.sh to delete all but the lines containing files that you want to git add. Then change all of those lines to read git add <filename> where <filename> is the name of the file. Exit the editor and run sh /tmp/commands.sh to execute all the git add commands. That's your program!
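For example, here is that program as an actual loop, keyed on the mode fields the answer mentions (a minimal sketch, untested; it assumes path names without embedded newlines and the default core.quotePath behavior):

#!/bin/sh
# Resolve "missing in base and local, created in remote" conflicts via git add.
# Unmerged entries in --porcelain=v2 output start with "u" and carry the three
# stage modes as fields 4-6 (see git-status(1) for the full field list).
git status --porcelain=v2 | while read -r kind xy sub m1 m2 m3 mw h1 h2 h3 path; do
    [ "$kind" = "u" ] || continue
    if [ "$m1" = "000000" ] && [ "$m2" = "000000" ] && [ "$m3" != "000000" ]; then
        git add -- "$path"    # keep the created file, marking it resolved
    fi
done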
Supposing you want their changes to win wherever both sides modified a file, you can do a pull like:
git pull -X theirs
Other Stack Overflow answers:
git pull -X
git merge strategies: this link will help you understand other merge strategies for the future.
If you want all the changes you made to be discarded so that you are in sync with the remote, you should do the following:
git stash
git pull
And if you want to restore the changes you made, run:
git stash pop
Basically, git stash sets your changes aside in a temporary stash so your working tree is clean.
You can learn more at:
NDP software:: Git Cheatsheet

git only clones sha for LFS files on Gitlab CI

I pushed .png files, each over 2 MB in size and tracked by git-lfs, to my gitlab.com repository, say repo_a. In a CI job on another repo, repo_b, where git-lfs is installed, repo_a is cloned. Now I see that the size of every .png file is 132 bytes, which seems to be the size of the pointer text shown below (note: some values have been redacted for privacy):
$ git show HEAD:file-a.png | tee sha_temp
version https://git-lfs.github.com/spec/v1
oid sha256:shashashaaaashashashaaaashashashaaaashashashaaaashashashaaaa
size 2430019
$ ls -l sha_temp
-rw-rw-r-- 1 crookednoodle crookednoodle 132 Nov 7 05:35 sha_temp
However, on my computer, as opposed to GitLab CI, I can see the original files when I git clone repo_a.
This makes me think that on CI the contents of these files are still pointers, not the original files. I also noticed that on my computer the output shows the original files being downloaded, like this:
Downloading file-a.png (2.5 MB)
But I don't see this in the output on CI job.
Obviously related: a subsequent process that opens the images with OpenCV fails.
What is wrong?
I managed to solve (well, get around) the issue myself: in the targeted repo, I modified the CI script to run git lfs pull.
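For reference, the change amounts to something like the following in the CI job, after repo_a has been cloned (a sketch; the directory name is a placeholder):

cd repo_a
git lfs install --local   # make sure the LFS smudge/clean filters are set up
git lfs pull              # replace the pointer files with the real content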

Trying to remove commit messages involving a long filename in Git revision history

So I created a file with a very long name (around 300 characters) to test something, but I no longer need it and have deleted it. I did this in my SVN repository. Then I cloned the SVN repository into a Git repository using git svn clone, which didn't give me any issues. However, the creation and deletion of this file are now recorded in my Git commit history.
This is giving me issues when using Git commands like filter-branch. The error is 'Filename too long cannot check out index'.
I know the commit IDs involving the particular file. Is there a way to get rid of these commits in the revision history that involve the file with the long name?
Note: I performed the above on a Windows machine, but I also tried the filter-branch command on a Linux machine and I still get the same 'filename too long' issue. I am new to Linux, so is there any setting I missed for handling long file names?
Thanks in advance.
You should use git filter-branch with --index-filter so that you don't have to deal with filesystem limitations.
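A sketch of that command; path/to/LONGNAME stands in for the actual long file name, since the real 300-character path isn't shown in the question:

# Rewrite all branches, dropping the file from every commit's index without
# ever checking it out, so the filesystem's path-length limit never applies.
git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch -- "path/to/LONGNAME"' \
  --prune-empty -- --all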

dump directory data to a file for new/modified comparison later on a linux server

Is it possible to take some kind of "dump" of a directory on a Linux (Ubuntu) server that I can later use to compare against for new/modified files?
The idea being something like this:
1. Dump directory data (like file hashes)
2. 24 hours later, take another dump and compare against #1 to find new or modified files
Well, this is not the answer you might be looking for, but I would use Git to track the changes, or maybe even git-annex if the files are too big, for example.
Initialize a Git repository in the directory you want to track: git init
Tell Git to track all files: git add .
Commit the changes: git commit -a -m "initial commit"
After 24 hours, run git status (and git diff) to see new and modified files
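For completeness, the hash-dump idea from the question can also be done without Git; a minimal sketch assuming GNU coreutils and file names without embedded newlines:

# Day 1: record a checksum for every file under the directory.
find /path/to/dir -type f -print0 | xargs -0 sha256sum | sort -k 2 > dump1.txt
# Day 2: take another dump and compare; lines present only in dump2.txt
# correspond to new or modified files.
find /path/to/dir -type f -print0 | xargs -0 sha256sum | sort -k 2 > dump2.txt
diff dump1.txt dump2.txt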

Failsafe Automated Git Update/Add/Commit

I have a website for which I need to commit system-generated files and folders to an existing Git repository via the Linux command line. I have an automated process that monitors a folder for new bash scripts and runs them; the website creates the scripts and saves them to that folder.
I keep running into issues where something has changed on either the remote repo or my local one that stops Git from completing the following commands:
git pull --rebase origin
git add [repo path to updated file(s)]/*
git commit -m "commit message"
git push origin master
I need to bulletproof this process so that it will just run and I can forget about it. Right now, permission issues on files pulled down, merge conflicts, etc. keep getting the repo out of sync. How can I bulletproof the commands above so that they will pull down any remote changes and then commit any new ones as needed?
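One way to harden that sequence (a sketch only; the remote/branch names origin and master, the paths, and the commit message are assumptions): commit the generated files first so the rebase never sees unstaged changes, and abort cleanly on conflict so the next run starts from a sane state:

#!/bin/sh
set -e
cd /path/to/repo                          # placeholder path

git add -A -- path/to/updated             # placeholder for the generated-files path
git commit -m "automated commit" || true  # "nothing to commit" is not a failure

if ! git pull --rebase origin master; then
    git rebase --abort                    # back out so the repo stays usable
    exit 1
fi
git push origin master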
