How to use xargs curl to read the lines of a text file in each subdirectory and keep the downloaded files in that subdirectory? - linux

I have several subdirectories within a parent folder, each with a URLs.txt file within the subdirectory.
$ ls
H3K4me1_assay H3K4me2_assay H3K4me3_assay ... +30 or so more *_assay directories
Each *_assay directory contains one URLs.txt file:
$ cat URLs.txt
https://www.encodeproject.org/files/ENCFF052HMX/##download/ENCFF052HMX.bed.gz
https://www.encodeproject.org/files/ENCFF052HMX/##download/ENCFF052HMX.bed.gz
https://www.encodeproject.org/files/ENCFF466DMK/##download/ENCFF466DMK.bed.gz
... +200 or more URLs
Is there any way I can execute a command from the parent folder that reads the URLs.txt file in each subdirectory, curls each URL, and downloads the files into that same subdirectory?
I can cd into each directory and run the following commands to download all of the files:
$ cd ~/largescale/H3K4me3_assay
$ ls URL* | xargs -L 1 -d '\n' zcat | xargs -L 1 curl -O -J -L
But I would have to run this command for experiments with +300 folders, so cd'ing into each one isn't really practical.
I have tried the following; it does download the correct files, but into the parent folder rather than into the subdirectories. Any idea what I am doing wrong?
$ for i in ./*_assay; do cd ~/largescale/"$i" | ls URL* | xargs -L 1 -d '\n' zcat | xargs -L 1 curl -O -J -L; done
Thanks, Steven
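For what it's worth, the pipeline in that loop is the problem: piping into cd runs cd in its own subshell, so the ls/xargs/curl stages never actually change directory and curl writes into wherever the loop started. A minimal sketch of one way around it, assuming the ~/largescale/*_assay layout above and keeping the per-directory command exactly as written, is to run each iteration in an explicit subshell:
for d in ~/largescale/*_assay; do
    # the parentheses keep the cd local to this iteration,
    # so curl -O saves the files inside $d
    (cd "$d" && ls URL* | xargs -L 1 -d '\n' zcat | xargs -L 1 curl -O -J -L)
done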

Related

How to grep all files besides current dir, parent dir and one defined dir?

I have a folder with the following files / folders:
.test
README.md
/dist
/src
I want to grep all files besides dist. So the result should look like:
.test
README.md
/src
When I do
ls -a | grep -v dist
it removes dist, but . and .. are still present. However, I require -a to get the files with a dot prefix.
When I try ls -a | grep -v -e dist -e . -e .., there is no output.
Why does -e . remove all files? How can I do this?
-e . removes everything because an unescaped . is a regex metacharacter that matches any single character, so every line matches. It is better to use find with the -not option instead of error-prone ls | grep:
find . -maxdepth 1 -mindepth 1 -not -name dist
By the way, just to fix your attempt, the correct ls | grep would be:
ls -a | grep -Ev '^(dist|\.\.?)$'
If you use bash, you can do:
shopt -s extglob
echo .[^.]* !(dist)
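If the goal is actually to search inside those files rather than just list them (an assumption based on the title), GNU grep can skip the dist directory by itself; 'pattern' here is a placeholder:
grep -r --exclude-dir=dist 'pattern' .
The recursive search also reads dot files such as .test, so no extra handling is needed for them.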

Bash script creates symlink inside directory as well

I have a bash script:
for file in `find -mindepth 1 -maxdepth 1`
do
    old=`pwd`
    new=".${file##*/}"
    newfile="$old/${file##*/}"
    cd
    wd=`pwd`
    link="$wd/$new"
    ln -s -f $newfile $link
    cd $old
done
This is meant to create a symlink in the user's home directory, with '.' prepended, for every file and directory in the current working directory. This works fine; however, it also creates a symlink inside any subdirectory pointing back to that same directory. E.g. foo will now have a link at ~/.foo and at foo/foo.
Is there any way I can fix this?
EDIT: Thanks to Sameer Naik and dave_thompson_085 I have changed it up a bit, but the problem still persists even when an alternate directory is given as an argument. It isn't a problem with subdirectories; the issue is that two links are being made, one at ~/.foo123 and one at foo123/foo123, rather than a link being made in ~/ for foo1 and for foo1/foo2.
for file in `ls -l | grep -v total | awk '{print $NF;}'`
do
    old=`pwd`
    wd=`cd $1 && pwd`
    new=".${file}"
    newfile="$wd/${file}"
    cd
    wd=`pwd`
    link="$wd/$new"
    ln -s -f $newfile $link
    cd $old
done
Since you don't want to recurse into subdirectories, try using
for file in `ls -l | grep -v ^d | awk '{print $NF;}'`
or use -type f in find to exclude subdirectories
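Following that suggestion, a minimal sketch of the same loop restricted to regular files, assuming GNU find and filenames without whitespace (the unquoted expansion splits on spaces):
for file in $(find . -mindepth 1 -maxdepth 1 -type f); do
    # strip the leading ./ and link ~/.<name> -> <cwd>/<name>
    ln -sf "$PWD/${file##*/}" "$HOME/.${file##*/}"
done
If directories should get links too, GNU ln's -n (--no-dereference) flag stops an existing ~/.foo symlink from being followed into the directory it points at, which is one plausible source of the extra foo/foo link.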

Handling unique case to untar and cd into the folder

I have a script to untar/unzip a compressed file, cd into it, and then run another script to load it. Everything works fine except in the unique case where the top-level entry in the listing is just ./. Does anyone know how I should handle this case?
To be more clear, I need to cd into myfile_01 and not the root directory.
tar -xvzf $fname
cd $(tar -tf $fname | grep -m3 /$) # list the archive and cd into its top-level folder
loadIt #run load script
Unique case that would cause a problem:
[user#user my_directory]$ tar -tf myfile_01.tgz | grep -m3 /$
./ # this causes it to cd to the top-level directory instead of myfile_01
./myfile_01/
Following the pattern of your script, can't you just filter out the top-level ./ entry that is giving you trouble?
Using grep:
tar -tf myfile_01.tgz | grep -m3 /$ | grep --invert-match --extended-regexp '^\./$'
Using tail:
tar -tf myfile_01.tgz | grep -m3 /$ | tail -1
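Putting that filter back into the original script, a sketch assuming the archive always contains exactly one real top-level directory:
tar -xvzf "$fname"
# list the archive, drop the bare ./ entry, keep the first real top-level dir
dir=$(tar -tf "$fname" | grep -m3 '/$' | grep -Ev '^\./$' | head -n 1)
cd "$dir" && loadIt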

How to use ls command output in rm for a particular directory

I want to delete the oldest files in a directory when the number of files is greater than 5. I'm using
(ls -1t | tail -n 3)
to get the oldest 3 files in the directory. This works exactly as I want. Now I want to delete them in a single command with rm. As I'm running these commands on a Linux server, cd'ing into the directory and deleting is not working, so I need to use either find or ls with rm to delete the oldest 3 files. Please help out.
Thanks :)
If you want to delete files from some arbitrary directory, then pass the directory name into the ls command. The default is to use the current directory.
Then use $() command substitution to pass the result of tail to rm, like this:
rm $(ls -1t dirname| tail -n 3)
rm $(ls -1t | tail -n 3) 2> /dev/null
ls may return a No such file or directory error message, which may cause rm to run unnecessarily with that value.
With the help of the following answers: find - suppress "No such file or directory" errors and https://unix.stackexchange.com/a/140647/198423:
find $dirname -type d -exec ls -1t {} + | tail -n 3 | xargs rm -rf
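A sketch that also keeps the question's "more than 5 files" condition, assuming plain filenames without spaces or newlines and $dirname pointing at the target directory:
# delete the 3 oldest files only when the directory holds more than 5 files;
# the subshell keeps the caller's working directory unchanged
(cd "$dirname" && [ "$(ls -1 | wc -l)" -gt 5 ] && rm $(ls -1t | tail -n 3))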

How to put all folders that do not have 'AAA' or 'BBB' in the name into a zip file? - linux

In the current folder I have lots of subfolders and they have files in them. How can I put those folders with no AAA or BBB in the name into a .zip file?
123AAAcc
    123.txt
    BBB.csv
222BBBss
    ...
ADFAAA
    BBB
        adsf.txt
vvBB
    111.mov
    BBB.avi
I've tried the following, but it also excludes vvBB/BBB.avi and ADFAAA/BBB/*. I would like to keep those.
tar -zcvf test.tgz --exclude='*AAA*' --exclude='*BBB*' .
The problem is, how can I make --exclude apply only to the level-1 subfolder names, rather than recursively to all file and folder names?
Hopefully I have a .zip file containing:
vvBB
    111.mov
    BBB.avi
I would like to achieve this as one command line. How can I do this?
You can accomplish this behavior by chaining together the following commands:
ls will list all of the files and folders in the current directory.
grep -v "\(.*BBB.*\)\|\(.*AAA.*\)" looks for all names with AAA or BBB (surrounded by anything), then (with -v) excludes them and returns all other results.
xargs tar -cvzf test.tgz will take arguments from a pipe and apply them to tar. All together, you get:
ls | grep -v "\(.*BBB.*\)\|\(.*AAA.*\)" | xargs tar -cvzf test.tgz
The partial results I get are:
$ ls
123AAAcc
222BBBss
ADFAAA
vvBB
$ ls | grep -v "\(.*BBB.*\)\|\(.*AAA.*\)"
vvBB
$ ls | grep -v "\(.*BBB.*\)\|\(.*AAA.*\)" | xargs tar -cvzf test.tgz
a vvBB
a vvBB/111.mov
a vvBB/BBB.avi
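Since the question asks for a .zip rather than a .tgz, the same filter can feed zip instead (assuming the Info-ZIP zip tool is available and the folder names contain no spaces or newlines):
ls | grep -v "\(.*BBB.*\)\|\(.*AAA.*\)" | xargs zip -r test.zip
zip -r recurses into each surviving top-level folder, so vvBB/111.mov and vvBB/BBB.avi are kept just as in the tar example.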
