Handling unique case to untar and cd into the folder - linux

I have a script to untar/unzip a compressed file, cd into it, and then run another load script. Everything works fine except in the unique case where the top-level entry in the archive listing is just ./. Does anyone know how I should handle this case?
To be more clear, I need to cd into myfile_01 and not the root directory.
tar -xvzf $fname
cd $(tar -tf $fname | grep -m3 /$) #tar it and cd into it
loadIt #run load script
Unique case that would cause a problem:
[user#user my_directory]$ tar -tf myfile_01.tgz | grep -m3 /$
./ # this entry causes it to cd into the top-level directory instead of myfile_01
./myfile_01/

Following the pattern of your script, can't you just filter out the top-level ./ entry that is giving you trouble?
Using grep:
tar -tf myfile_01.tgz | grep -m3 /$ | grep --invert-match --extended-regexp '^\./$'
Using tail:
tar -tf myfile_01.tgz | grep -m3 /$ | tail -1
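Putting it together with the original script, a minimal sketch of the full sequence (assuming loadIt is the load script from the question and that GNU grep is available) might look like:
tar -xvzf "$fname"
# list the archive, keep top-level directory entries, drop a bare ./ and take the first match
dir=$(tar -tf "$fname" | grep -m3 '/$' | grep -v -E '^\./$' | head -1)
cd "$dir" || exit 1 # stop if no directory was found
loadIt # run load script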

Related

How to use xargs curl to read lines of a text file in subdirectory and keep downloaded files in subdirectory?

I have several subdirectories within a parent folder, each with a URLs.txt file within the subdirectory.
$ ls
H3K4me1_assay H3K4me2_assay H3K4me3_assay ... +30 or so more *_assay files
Each *_assay directory contains one URLs.txt file:
$ cat URLs.txt
https://www.encodeproject.org/files/ENCFF052HMX/##download/ENCFF052HMX.bed.gz
https://www.encodeproject.org/files/ENCFF052HMX/##download/ENCFF052HMX.bed.gz
https://www.encodeproject.org/files/ENCFF466DMK/##download/ENCFF466DMK.bed.gz
... +200 or more URLs
Is there any way I can execute a command from the parent folder that reads and curls the URLs.txt file in each subdirectory, and then downloads the file within each subdirectory?
I can cd into each folder and run the following commands to download all of the files:
$ cd ~/largescale/H3K4me3_assay
$ ls URL* | xargs -L 1 -d '\n' zcat | xargs -L 1 curl -O -J -L
But I will have to run this command for experiments with +300 folders, so cd'ing in each time isn't really practical.
I have tried to run this; it does download the correct files, but into the parent folder rather than the subdirectories. Any idea what I am doing wrong?
$ for i in ./*_assay; do cd ~/largescale/"$i" | ls URL* | xargs -L 1 -d '\n' zcat | xargs -L 1 curl -O -J -L; done
Thanks, Steven
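One likely problem in that loop is that cd is being piped into ls, so the directory change never applies to the commands that follow, and curl runs from wherever the shell already is. A possible fix, sketched on the assumption that the per-directory pipeline from the question is otherwise correct, is to run each iteration in a subshell so the cd stays contained:
for i in ~/largescale/*_assay
do
(
cd "$i" || exit # cd inside a subshell; skip this directory if it fails
ls URL* | xargs -L 1 -d '\n' zcat | xargs -L 1 curl -O -J -L
)
done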

Is it possible in this command to cd into the directory that's printed in the output

When I do ls | grep -e *-folder1, it prints my-folder1, which is the name of the folder in the current directory that matched.
Is there a way I can add something to cd into that directory? This is more of an attempt to learn Bash and Linux commands than about the specific thing I am trying to accomplish.
You could do
ls | grep -- -folder1 | while read -r dir
do
cd "$dir"
# do things in $dir
done
# do things in the original directory
but parsing the output of ls is not recommended. You could instead use globbing:
for dir in *-folder*
do
cd "$dir"
# do things in $dir
cd .. # need to back out again
done
# do things in the original directory
If the purpose isn't to grep on all folders matching a certain pattern and to cd down into each one of them, but to simply cd into a directory ending with -folder1, then:
cd *-folder1
If you get zero or multiple hits, cd will show an error.
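If you want to guard against the zero-or-multiple-match case before cd'ing, a small bash sketch (assuming nullglob is acceptable here) can collect the matches into an array first:
shopt -s nullglob # a non-matching glob expands to nothing instead of itself
matches=( *-folder1 )
shopt -u nullglob
if [ "${#matches[@]}" -eq 1 ]
then
cd "${matches[0]}"
else
echo "expected exactly one *-folder1 directory, found ${#matches[@]}" >&2
fi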

Bash script creates symlink inside directory as well

I have a bash script:
for file in `find -mindepth 1 -maxdepth 1`
do
old=`pwd`
new=".${file##*/}"
newfile="$old/${file##*/}"
cd
wd=`pwd`
link="$wd/$new"
ln -s -f $newfile $link
cd $old
done
This is meant to create a symlink in the user's home directory, with '.' prepended, for each of the files and directories in the current working directory. This works fine; however, it also creates a symlink inside any subdirectories pointing to the same directory. E.g. foo will now have a link at ~/.foo and at foo/foo.
Is there any way I can fix this?
EDIT: Thanks to Sameer Naik and dave_thompson_085 I have changed it up a bit, but the problem still persists even when an alternate directory is given as an argument. It isn't a problem with subdirectories; it is that two links are being made, one at ~/.foo123 and one at foo123/foo123, rather than a link being made in ~/ for foo1 and foo1/foo2.
for file in `ls -l | grep -v total | awk '{print $NF;}'`
do
old=`pwd`
wd=`cd $1 && pwd`
new=".${file}"
newfile="$wd/${file}"
cd
wd=`pwd`
link="$wd/$new"
ln -s -f $newfile $link
cd $old
done
Since you don't want to recurse into subdirectories, try using
for file in `ls -l | grep -v ^d | awk '{print $NF;}'`
or use -type f in find to exclude subdirectories
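A sketch of the same idea built on find instead of parsing ls (using -type f as suggested to exclude subdirectories; $HOME is assumed to be the link destination, as in the original script; filenames containing newlines would still need extra care):
src=$(pwd)
find "$src" -mindepth 1 -maxdepth 1 -type f | while IFS= read -r path
do
name="${path##*/}" # strip the leading directories
ln -s -f "$path" "$HOME/.$name" # create ~/.name pointing at the original file
done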

Use grep -lr output to add files to tar

On Ubuntu and CentOS.
I have some files I want to tar based on their contents.
$ grep -rl "123.45" .
returns a list of about 10 files in this kind of format:
./somefolder/someotherfolder/somefile.txt
./anotherfolder/anotherfile.txt
etc...
I want to tar.gz all of them.
I tried:
$ grep -rl "123.45" . | tar -czf files.tar.gz
Doesn't work. That's why I'm here. Any ideas? Thanks.
Just tried this, and it worked in Ubuntu, but in CentOS I get "tar: 02: Cannot stat: No such file or directory".
$ tar -czf test.tar.gz `grep -rl "123.45" .`
If anyone else has a better way, let me know. That above one works great in Ubuntu, at least.
Like this:
... | tar -T - -czf files.tar.gz
"-T -" causes tar to read filenames from stdin. Second minus stands for stdin. –
grep -rl "123.45" . | xargs tar -czf files.tar.gz
Tar wants to be told what files to process, not given the names of the files via stdin.
It does however have a -T / --files-from option. So I'd suggest using that. Output your list of selected files to a temp file and then have tar read that, like this:
T=$(mktemp)
grep -rl "123.45" . > $T
tar cfz files.tar.gz -T $T
rm -f $T
If you want, you can also use shell expansion to do it like this:
tar cfz files.tar.gz -- $(grep -rl "123.45" .)
But that will fail if you have too many files or if any of the files have strange names (like spaces etc).
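For filenames with spaces or other odd characters, a variant using NUL-separated names (this assumes GNU grep and GNU tar, which support -Z and --null respectively) avoids both the argument-length and the quoting problems:
grep -rlZ "123.45" . | tar --null -T - -czf files.tar.gz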

How to delete all files that were recently created in a directory in linux?

I untarred something into a directory that already contained a lot of things. I wanted to untar into a separate directory instead. Now there are too many files to distinguish between. However, the files that I have untarred have been created just now (right?) and the original files haven't been modified for at least a day. Is there a way to delete just these untarred files based on their creation information?
Tar usually restores file timestamps, so filtering by time is not likely to work.
If you still have the tar file, you can use it to delete what you unpacked with something like:
tar tf file.tar --quoting-style=shell-always | xargs rm -i
The above will work in most cases, but not all (filenames that have a carriage return in them will break it), so be careful.
You could remove the directories by adding -r to that, but it's probably safer to just remove the toplevel directories manually.
find . -mtime -1 -type f | xargs rm
but test first with
find . -mtime -1 -type f | xargs echo
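If any of the freshly created files have spaces or other special characters in their names, a NUL-separated variant of the same command (assuming GNU find and xargs) is safer:
find . -mtime -1 -type f -print0 | xargs -0 rm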
There are several different answers to this question in order of increasing complexity.
First, if this is a one off, and in this particular instance you are absolutely sure that there are no weird characters in your filenames (spaces are OK, but not tabs, newlines or other control characters, nor unicode characters) this will work:
tar -tf file.tar | egrep '^(\./)?[^/]+(/)?$' | egrep -v '^\./$' | tr '\n' '\0' | xargs -0 rm -r
All that egrepping is there to keep only the top-level entries and skip everything nested below them.
Another way to do this that works with funky filenames is this:
mkdir foodir
cd foodir
tar -xf ../file.tar
for file in *; do rm -rf ../"$file"; done
That will create a directory in which your archive has been expanded, but it sounds like you wanted that already anyway. It also will not handle any files whose names start with a dot (.).
To make that method work with files that start with ., do this:
mkdir foodir
cd foodir
tar -xf ../file.tar
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 sh -c 'for file in "$@"; do rm -rf ../"$file"; done' junk
Lastly, taking from Mat's answer, you can do this and it will work for any filename and not require you to untar the directory again:
tar -tf file.tar | egrep '^(\./)?[^/]+(/)?$' | grep -v '^\./$' | tr '\n' '\0' | xargs -0 bash -c 'for fname in "$@"; do fname="$(echo -ne "$fname")"; echo -n "$fname"; echo -ne "\0"; done' junk | xargs -0 rm -r
You can handle files and directories in one pass with:
tar -tf ../test/bob.tar --quoting-style=shell-always | sed -e "s/^\(.*\/\)'$/rmdir \1'/; t; s/^\(.*\)$/rm \1/;" | sort | bash
You can see what is going to happen by leaving off the pipe to 'bash':
tar -tf ../test/bob.tar --quoting-style=shell-always | sed -e "s/^\(.*\/\)'$/rmdir \1'/; t; s/^\(.*\)$/rm \1/;" | sort
To handle filenames with linefeeds you need more processing.
