Why does git fail on push/fetch with "Too many open files" - linux

I'm encountering an issue with Git where I'm receiving the following message:
> git fetch
error: cannot create pipe for ssh: Too many open files
fatal: unable to fork
The System Administrators have increased my file limit, but it has not corrected the issue. Additionally, I don't have an issue with creating new files with vi.
When trying to push a new branch, I get a similar message:
git push origin test_this_broken_git
error: cannot create pipe: Too many open files
fatal: send-pack: unable to fork off sideband demultiplexer
Could somebody explain exactly why this is happening? I have not made any recent changes to my Git config, and I have verified that manually.

There are two similar error messages:
EMFILE: Too many open files
ENFILE: Too many open files in system
It looks like you're getting EMFILE, which means that the limit on open files for an individual process is being exceeded. So checking whether vi can open files is irrelevant; vi runs as a separate process with its own file descriptor table. Check your limits with:
$ ulimit -n
1024
So on my system, there is a limit of 1024 open files in a single process. You shouldn't need to ask your system administrator to raise the limit.
You may wish to check which files Git opens by running Git under strace.
This could be a bug in Git or in a library, or it could be you're using an old version of something, or it could be something more bizarre. Try strace first to see which files it opens, and check whether Git closes those files.
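For example, a minimal sketch (the trace file path is arbitrary):
$ strace -f -e trace=open,openat,close -o /tmp/git-fetch.trace git fetch
$ less /tmp/git-fetch.trace
The -f flag follows child processes (Git forks ssh here), and the log shows which files are opened and whether they are ever closed.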
Update from Hazok:
After using the above recommendations, it turns out the error was caused by too many loose objects. There were too many loose objects because git gc wasn't being run often enough.

Why did this happen?
From the git documentation:
When there are approximately more than this many loose objects in the repository, git gc --auto will pack them. Some Porcelain commands use this command to perform a light-weight garbage collection from time to time. The default value is 6700.
Here "Some Porcelain commands" includes git push, git fetch, etc. So if your max open files limit (ulimit -n) is less than 6700, you'll eventually be blocked by git gc --auto once you accumulate ~6700 loose objects in a single Git repo.
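You can see how close a repository is to that threshold with git count-objects; the count: line in its output is the number of loose objects:
$ git count-objects -v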
I'm in a hurry. How to fix it?
If you have sufficient permissions to raise the limit for your shell session:
$ ulimit -n 8192
Note that ulimit is a shell builtin, so it only affects the current shell; raising the hard limit persistently typically requires root access, e.g. via /etc/security/limits.conf.
Otherwise, you can disable automatic garbage collection by setting git config gc.auto 0, so that you can push your local commits to the remote, delete the repo, and clone it back without thousands of loose objects.
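Put together, the emergency workaround might look like this (the directory name and remote URL are placeholders; the branch name is the one from the question):
$ git config gc.auto 0
$ git push origin test_this_broken_git
$ cd .. && rm -rf repo
$ git clone <remote-url> repo
The fresh clone arrives fully packed, so it starts far below the loose-object threshold.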
How can we prevent this from happening again?
Set git config --global gc.auto 200, where 200 is some value less than your max open files limit. If you pick too small a value, git gc will run too frequently, so choose wisely.
If you set gc.auto=0, the loose objects will never be packed unless you run git gc manually, so hundreds of thousands of files could accumulate in the same directory. That might be a problem, especially on mechanical hard drives or for Windows users. (See also: How many files in a directory is too many? and Is it OK (performance-wise) to have hundreds or thousands of files in the same Linux directory?)

Related

Resolving Errors With Git Index Too Small

I recently updated the development server that hosts our code repos to a newer version of Ubuntu (18.04). As part of the process, Git was upgraded to version 2.23.0. The application servers that the code gets deployed to need to be able to check out the latest changes from the Git repos. When I try to do a git fetch on those servers, I get a long list of errors that look like this:
error: index file
./objects/pack/._pack-5b58f700fea57ee6f8ff29514a376b945bb1c8a9.idx is
too small
I did some digging around to see if I could come up with a solution, but so far nothing has worked. I tried the answers listed here: git error: "index file is too small".
Neither git index-pack nor git repack -a -d solved the issue. I even tried deleting the local copy of the files from the application server and installing fresh using git clone. The clone itself threw a bunch of errors similar to before:
remote: error: index file
./objects/pack/._pack-5b58f700fea57ee6f8ff29514a376b945bb1c8a9.idx is
too small
At this point I'm out of ideas. Any help would be appreciated.
Edit: The output of du -h suggests that there is enough disk space.
The error message sounds like file corruption. If you have not run out of disk space, you can delete the index file and recreate it with:
git index-pack -v ./objects/pack/._pack-5b58f700fea57ee6f8ff29514a376b945bb1c8a9.idx
You might also want to run git fsck to verify the connectivity and validity of the objects in the Git database, on both the remote and the local one.
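For example:
$ git fsck --full
The --full flag checks packed as well as loose objects; it is the default in modern Git.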
If your index is corrupt, you can also try to reset the branch, which will create a new index file (commands sketched after this list):
1. To be safe, back up .git/index.
2. Remove the index file .git/index.
3. Run git reset.
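As commands, a sketch to run from the repository root:
$ cp .git/index /tmp/index.backup
$ rm .git/index
$ git reset
git reset (in its default --mixed mode) rebuilds the index from HEAD.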
References
The issue is a possible duplicate of git error: "index file is too small"
Documentation on git index-pack can be found at https://git-scm.com/docs/git-index-pack
Some notes on repairing a broken index: https://makandracards.com/makandra/5899-how-to-fix-a-corrupt-git-index
fatal: packfile name 'server' does not end with '.pack'
I encountered this error when transferring my Git repo from macOS to another system. Files starting with '._' are macOS metadata files generated by the tar command. See this question to avoid '._*' files: Tar command in mac os x adding "hidden" files, why?
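Two sketches that may help here: on macOS, setting COPYFILE_DISABLE=1 stops tar from emitting the '._*' AppleDouble files, and a find command can delete any that already landed in the pack directory (the archive and directory names are placeholders):
$ COPYFILE_DISABLE=1 tar -czf repo.tar.gz repo/
$ find ./objects/pack -name '._*' -delete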

How to fix merge conflicts for a lot of files in git?

I am using the git mergetool command to fix conflicts. However, I have thousands of conflicts; is there a way to simplify this so I get everything from the remote?
I am asked to enter c, d, or a at the prompt:
{local}: deleted
{remote}: created file
Use (c)reated or (d)eleted file, or (a)bort?
Since I have thousands of files, I don't want to keep typing c. Is there a way to do this in bulk?
You can solve this outside of git mergetool: run git status --porcelain to get a list of all unmerged files and their states in machine-readable format.
If your Git is new enough, it will support --porcelain=v2. See the git status documentation for details on the output formats. Output format v2 is generally superior for all purposes, but you should be able to make do with either one.
Next, you must write a program. Unfortunately Git has no supplied programs for this. Your program can be fairly simple depending on the specific cases you want to solve, and you can use shell scripting (sh or bash) as the programming language, to keep it easy.
Since you're concerned about the cases where git mergetool says:
Use (c)reated or (d)eleted file, or (a)bort?
you are interested in those cases where the file name is missing in the stage 1 ("base") version and also missing in the stage 2 ("local") version, but exists in the stage 3 ("remote") version. (See the git status documentation again and look at examples of your git status --porcelain=v2 output to see how to detect these cases. Two of the three modes will be zero.) For those particular path names, simply run git add on the path name to mark the file as resolved in favor of the created file.
Once you have marked all such files, you can go back to running git mergetool to resolve additional conflicts, if there are any.
Note that your "program" can consist of running:
git status --porcelain=v2 > /tmp/commands.sh
and then editing /tmp/commands.sh to delete all but the lines containing files that you want to git add. Then change all of those lines to read git add <filename> where <filename> is the name of the file. Exit the editor and run sh /tmp/commands.sh to execute all the git add commands. That's your program!
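If you'd rather generate the commands than edit them by hand, here is a minimal sketch under the same assumptions (porcelain v2 output, where unmerged entries start with u and fields 4, 5, and 6 are the stage 1, 2, and 3 modes, with 000000 meaning "missing in that stage"; unusual path names such as those containing tabs are not handled):
git status --porcelain=v2 |
awk '$1 == "u" && $4 == "000000" && $5 == "000000" && $6 != "000000" {
    path = $11
    for (i = 12; i <= NF; i++) path = path " " $i   # re-join paths containing spaces
    printf "git add -- \"%s\"\n", path
}' > /tmp/commands.sh
sh /tmp/commands.sh
This stages every file that is missing in the base and local versions but present in the remote version, which is exactly the case described above.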
Supposing you want their changes to take precedence over the ones you modified, you can do a pull like:
git pull -X theirs
Other Stack Overflow answers:
git pull -X
git merge strategies: this link will help you understand other merge strategies in the future.
If you want all the changes you made to be discarded so that you are in sync with the remote, you should do the following:
git stash
git pull
And if you want to restore the change you did you should type:
git stash pop
Basically, git stash moves your changes to temporary storage.
You can learn more at:
NDP software:: Git Cheatsheet

Trying to remove commit messages involving a long filename in Git revision history

So I created a file with a very long name (around 300 characters) to test something, but I no longer need it and have deleted it. I did this in my SVN repository. Then I cloned the SVN repository into a Git repository using git svn clone, which didn't give me any issues. However, the creation and deletion of this file are now recorded in my Git commit history.
This is giving me issues when using Git commands like filter-branch; the error is 'Filename too long cannot check out index'.
I know the commit IDs involving the particular file. Is there a way to get rid of these commits in the revision history that involve the file with the long name?
Note: I performed the above on a Windows machine, but I also tried the filter-branch command on a Linux machine and still got the same 'filename too long' issue. I am new to Linux, so is there any setting I missed for handling long filenames?
Thanks in advance.
You should use git filter-branch with --index-filter so that you don't have to deal with filesystem limitations.
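A sketch of what that looks like (the path below is a placeholder for your actual long filename; --index-filter edits each commit's index directly, so the long name never needs to touch the working tree):
git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch -- "path/to/the/very/long/filename"' \
  -- --all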

Git checkout untracked issue

I'm collaborating with a few other people on a Drupal website, which we are version controlling with Git. We set up a local Git repository containing our commits.
After a colleague pushed some updates and I fetched and merged into my local dev branch, I began experiencing the following problems:
user#server:/var/www/Intranet/sites/intranet/modules/custom$ git checkout dev
error: The following untracked working tree files would be overwritten by checkout:
themes/bigcompany/panels/layouts/radix_bryant_flipped/radix-bryant-flipped.png
themes/bigcompany/panels/layouts/radix_bryant_flipped/radix-bryant-flipped.tpl.php
themes/bigcompany/panels/layouts/radix_bryant_flipped/radix_bryant_flipped.inc
Please move or remove them before you can switch branches.
Aborting
The issue above typically shows up when I try to check out other branches; the checkout fails and I am effectively trapped in my current branch.
Referring to this question, there is a suggestion that my issue is related to the .gitignore file. However, my .gitignore file has nothing indicating that any part of my themes directory should be ignored, as the following shows:
# .gitignore for a standard Drupal 7 build based in the sites subdirectory.
# Drupal
files
settings.php
settings.*.php
# Sass.
.sass-cache
# Composer
vendor/
# Migrate source files
modules/custom/haringeygovuk_migrate/source_data
As mentioned above, my attempts to run git checkout into any branch fail with the message above. I decided to force it with the -f switch and successfully switched to my target branch, but I lost a couple of hundred lines of code, which I'd love to avoid going forward.
I work in a Linux (Ubuntu) VirtualBox VM, while my colleagues prefer working in a WAMP setup and use the Git Bash terminal emulator for executing Git commands. Could the difference in environments be causing these serious issues?
How can I resolve this issue?
Well, the situation is rather simple. In your current branch, certain files are not under the control of Git, but at the same time those files exist in your working tree. The branch you're trying to switch to has those files, so Git would need to overwrite files in the working tree to perform the checkout.
To prevent possible data loss, Git stops the process of switching branches and notifies you that you should either put those files under the control of Git in a separate commit on your current branch and only then perform the switch, or simply move those files out of Git's way.
It seems you chose the second way. Generally, you should "force" any operation only if you really understand what you're doing.
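A safer sequence than checkout -f, sketched with the branch name from the question (git stash push needs Git 2.13 or later; on older versions use git stash save -u):
$ git stash push --include-untracked
$ git checkout dev
$ git stash pop
Note that the pop may itself report conflicts if dev contains different versions of the stashed files, but nothing is silently lost.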

Remove git-annex repository from file tree

I tried installing git-annex yesterday to backup my files. I ran git annex add . in the root of my repository tree and then a git commit. So far everything is fine.
What I didn't know was that git-annex turns the entire file tree into a whole bunch of symlinks. Every single file in my tree is now a symlink into .git/annex/objects! This is messing up my application, which depends on the files not being symlinks.
My question is, how do I get rid of git-annex and restore my file system to its original state? For a normal git repo I could do rm -r .git, but I'm afraid that won't do the job in git-annex. Thanks in advance.
Okay, so I stumbled upon some docs for git-annex, and they give two commands that achieve what I wanted to do:
unannex [path ...]
Use this to undo an accidental git annex add command. You can use git annex unannex to move content out of the annex at any point, even if you've already committed it.
This is not the command you should use if you intentionally annexed a file and don't want its contents any more. In that case you should use git annex drop instead, and you can also git rm the file.
uninit
Use this to stop using git annex. It will unannex every file in the repository, and remove all of git-annex's other data, leaving you with a git repository plus the previously annexed files.
I started running git annex uninit, but my god was it slow. It took about 5 minutes to "unannex" a single file, and my file tree contains about 200,000 files, so that was just unacceptable.
What I ended up doing was surprisingly simple and worked well. I used cp with the -rL flags to duplicate the contents of my file tree, resolving all symlinks in the duplicate copy. And it was blazing fast: around 30 seconds for my entire file tree. The only problem was that the file permissions were not retained from the original, so I needed to run some chmod and chcon commands to fix them up.
This second method worked for me because there were no other symlinks in my schema. If you do have symlinks in your schema beyond those created by git-annex, then my little shortcut probably isn't the right choice for you, and you should consider sticking with just git annex uninit.
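Sketched out with illustrative directory names:
$ cp -rL annexed-tree plain-copy
$ chmod -R u+rwX plain-copy
The -L flag makes cp follow symlinks, so the copy contains real file contents; the chmod (and chcon, on SELinux systems) afterwards repairs the modes carried over from the read-only annex objects.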
I would like to add my own experience of using git annex uninit to the OP's answer.
I didn't have the full repository annexed, only about 40 larger files. After deciding that git-annex gave me no particular benefit, I tried unannexing several files, and each finished in a few seconds. Then I ran git annex uninit, which took more than a minute only for the really huge files (more than a few GB). Overall it was done in about 20 minutes, which was acceptable in my case.
So it seems that the time unannexing takes increases with the size of the annexed file tree.
If you have a v6 repository, you can do the following:
git annex unannex . --fast
which replaces the symlinks with hardlinks instead of slowly replacing them with copies of the original files.
Only v6 repositories can run git annex unannex on uncommitted changes, so it may be necessary to upgrade the git-annex repo to a v6 repository.
See the Official Upgrade Guide.
In my case I had to upgrade v5 -> v6 and I only had to execute
git annex upgrade
which took a few seconds and I was done.
Have you tried using git-annex in direct mode?
Just change your repository with
git annex direct
This will not use symlinks any longer, but some git commands do not work with such annex repositories.
Check out the explanations on their website to see if this scheme fits your purposes.
Maybe the conversion process is faster than the previously mentioned tips; I haven't tried it myself with big repositories.
