For loop on Subset of Files - linux

Cell1.1.annot.gz
Cell1.2.annot.gz
Cell1.3.annot.gz
Cell1.4.annot.gz
Cell1.5.annot.gz
Cell2.1.annot.gz
.
.
.
Cell3.5.annot.gz
making a total of 3 × 5 = 15 files. I would like to run a Python script on them. The catch is that each number (Cell2.NUMBER.annot.gz) has to be matched to another file in a separate directory. I have code below that works, but only for one Cell file at a time. How can I automate this so it works for all files (Cell1 through Cell3)?
for i in $(seq 1 5); do
    python script.py --file1 DNA_FILE.${i} --file2 Cell1.${i}.annot.gz --thin-annot --out Cell1.${i}
done

Another loop?
for c in 1 2 3; do
    for i in 1 2 3 4 5; do
        python script.py --file1 DNA_FILE.${i} --file2 Cell${c}.${i}.annot.gz --thin-annot --out Cell${c}.${i}
    done
done
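For a dry run first, the same nested loop can be written with bash brace expansion, printing each of the fifteen commands instead of executing them (drop the echo to run them for real):

```shell
# Dry run: print all 15 commands (3 cells x 5 numbers); remove "echo" to execute.
for c in {1..3}; do
    for i in {1..5}; do
        echo python script.py --file1 "DNA_FILE.${i}" \
            --file2 "Cell${c}.${i}.annot.gz" --thin-annot --out "Cell${c}.${i}"
    done
done
```

Brace expansion ({1..3}) is bash-specific; under plain sh, stick with seq.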

Related

How to run bash file for (different directory) as input automatically

I have a bash file which takes 5 inputs.
Input1 = file1
Input2 = file2
Input3 = directory1
Input4 = func
Input5 = 50
Inputs 4 and 5 are always the same; they never change.
file1 and file2 are located inside directory1.
directory1 is located inside a code directory:
/code/directory1/file1
/code/directory1/file2
and there are many directories with the same structure, directory(1-70), inside the code folder:
/code/directory1/*
/code/directory2/*
/code/directory3/*
...
/code/directory70/*
In order to run the bash file, I have to run the command from the terminal 70 times :<
Is there a way to run it for all these folders automatically, at once?
UPDATE: the directories (1-70) each have a different name, e.g. bug1, test, 4-A, and so on. Even the files are different, e.g. bug1.c, hash.c:
/code/bug1/bug1.c
/code/bug1/hash.c
Try this (globbing instead of parsing ls, so each entry keeps its full path and names with spaces survive):
for dir in /code/*/; do
    files=( "$dir"* )            # the files inside this directory, with full paths
    <ShellScript>.sh "${files[0]}" "${files[1]}" "${dir%/}" func 50
done

How to get numbers to come after decimal point in alphabetical sorting order in Bash

I have this .sh script that goes through every folder in a parent folder and runs program in each. The code I used was the following:
for d in ./*/; do
    cp program "$d"
    (cd "$d" ; ./program)
done
program, among other things, gets the name of each folder and writes it to a file data.dat, so that all folder names are listed there. These folders' names are numbers (decimal) that identify their contents. program writes the folder name to data.dat when it enters each folder, so that they will appear in the order that Bash goes through the folders.
I want them to be sorted, in data.dat, in alphabetical order, putting lower numbers before higher, regardless of being a 1-digit or 2-digit number. For example, I want 2.32 to come before 10.43 and not the other way around.
The problem, it seems, is that for Bash the . comes after numbers in the order.
How can I change it to come before numbers?
Thanks in advance!
EDIT:
program is in Fortran 77 and goes like this:
      program getData
      implicit none
      character counter*20, ac*4, bash*270, Xname*4, fname*15
      double precision Qwallloss, Qrad, Nrad, Qth, QreacSUM
      double precision Xch4in, Ych4in, length, porosity, Uin, RHOin
      double precision MFLR, Area, Xvalue
      integer I
      bash = "printf '%s\n'"//' "${PWD##*/}" > RunNumber.txt'
      call system(bash) !this gets the folder name and writes
                        !it to RunNumber.txt
      open(21, form="FORMATTED", STATUS="OLD", FILE="RunNumber.txt")
      rewind(21)
      read(21,*) counter !brings the folder name into the program
      close(21)

(...)

      call system(' cp -rf ../PowerData.dat . ')
      open(27, form="FORMATTED", STATUS="OLD", ACCESS="APPEND", !the new row is appended to the existing file
     1     FILE="PowerData.dat")
      write(27,600) Counter, Xvalue, Nrad, Qrad, Qth, !writes a row of variables,
     1     Area, MFLR, Uin, RHOin, Xch4in, Ych4in     !starting with the folder name,
                                                      !to the data file
      close(27)
      call system('cp -rf PowerData.dat ../')
      end program
I expect that your program will do a bit more in the future, so I used two loops.
: > /tmp/tmpfile                 # start from an empty list on every run
for d in */ ; do                 # "*/" (no "./" prefix) so sort -n sees the leading number
    echo "$d" >> /tmp/tmpfile
done
for d in $(sort -n /tmp/tmpfile) ; do
    cp program "$d"
    (cd "$d" ; ./program)
done
There are more ways to do this; for example:
for d in $(ls | sort -n) ; do
(some will castigate me for parsing the output of ls) etcetera.
So if you do:
mkdir test
cd test
touch 100
touch 2.00
touch 50.1
ls will give you
100 2.00 50.1
ls | sort -n will give you
2.00
50.1
100
and, as a bonus, ls -v will give you
2.00 50.1 100
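If a data.dat has already been written in the wrong order, it can also be re-sorted after the fact. A minimal sketch, assuming the folder name is the first field of each row (the sample rows here are made up for illustration):

```shell
# Build a small example data.dat: folder name first, then other columns.
printf '10.43 a\n2.32 b\n50.1 c\n' > data.dat

# Numeric sort on the first field, in place: 2.32 now comes before 10.43.
sort -n -k1,1 data.dat -o data.dat
cat data.dat
```

sort -o is allowed to name the input file, so no temporary file is needed.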

Read the last line of output of a bash command

I want to read the current line of output from a bash command.
I know I could get this with cmd | tail -1, but I want to run this as a separate command (a tint2 executable) as a sort of progress meter.
For example:
I have a python program that outputs Downloaded x out of y as it downloads images, and I want to get the output as a shell variable.
Or:
Maybe I'm running pacman -Syy and I want
extra 420.6 KiB 139K/s 00:09 [#####-----------------] 24%
Is this possible?
Edit: something is already running in a terminal. I want a command that outputs the last line produced by the command running in that other terminal, perhaps given its pid.
You can use tee to write things to the terminal and some logfile.
Let's say your python program's output is simulated by this bash function:
function mypython {
    for i in 10 30 40 50 80 90 120 150 160 180 190 200; do
        (( progress = (100 * i + 50) / 200 ))
        printf "extra xx Kb, total %-3d of 200 (%d %%)\n" $i ${progress}
        sleep 1
    done
}
You can redirect or tee the output to a tmp file:
(mypython > /tmp/robert.out) &
or
(mypython | tee /tmp/robert.out) &
In another window you can get the last line with
tail -1 /tmp/robert.out
When you only want to see progress, you might want each new line to overwrite the previous one:
mypython | while read -r line; do
    printf "Progress of mypython: %s\r" "${line}"
done
If that is what you want, you could instead change your python program itself to end each progress line with a carriage return:
printf "...\r" ...
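Putting the two halves together: a toy producer stands in for the downloader, tee keeps a log, and tail -1 reads the latest line from anywhere. The log path /tmp/progress.log is just an example name:

```shell
# Producer: any long-running job; tee duplicates its output into a log file.
(for i in 1 2 3; do echo "Downloaded $i out of 3"; done) | tee /tmp/progress.log >/dev/null

# Consumer (run separately, e.g. from a tint2 executable): latest line as a variable.
last=$(tail -n 1 /tmp/progress.log)
echo "$last"   # prints: Downloaded 3 out of 3
```

The consumer has no dependency on the producer's terminal; it only needs the log file, so it works across windows and sessions.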

Find which file has the issue from the below Shell Script

Problem Statement:-
Below is a script that someone else wrote; he has since left the company, so I don't know whom to ask about it. That is why I am posting here to find a solution.
What this script does: it decompresses gzipped data from a particular folder (/data/ds/real/EXPORT_v1x0) for a particular date (20121017) and writes it to an HDFS directory (hdfs://ares-nn/apps/tech/ds/new/).
date=20121017
groups=(0 '1[0-3]' '1[^0-3]' '[^01]')
for shard in 0 1 2 3 4 5 6 7 8 9 10 11; do
    for piece in 0 1 2 3; do
        group=${groups[$piece]}
        if ls -l /data/ds/real/EXPORT_v1x0_${date}_${shard}_T_${group}*.dat.gz; then
            gzip -dc /data/ds/real/EXPORT_v1x0_${date}_${shard}_T_${group}*.dat.gz | \
                hadoop jar /export/home/ds/lib/HadoopUtil.jar com.host.hadoop.platform.util.WriteToHDFS -z -u \
                hdfs://ares-nn/apps/tech/ds/new/$date/EXPORT-part-$shard-$piece
            sleep 15
        fi
    done
done
During the migration to HDFS I found that this file has a problem:
hdfs://ares-nn/apps/tech/ds/new/20121017/EXPORT-part-8-3
Is there any way, by tweaking the above script, to find out which files under /data/ds/real/EXPORT_v1x0 were combined into hdfs://ares-nn/apps/tech/ds/new/20121017/EXPORT-part-8-3, the file with the problem?
Any thoughts?
Update:
Something like this?
date=20121017
groups=(0 '1[0-3]' '1[^0-3]' '[^01]')
for shard in 0 1 2 3 4 5 6 7 8 9 10 11; do
    for piece in 0 1 2 3; do
        group=${groups[$piece]}
        if ls -l /data/ds/real/EXPORT_v1x0_${date}_${shard}_T_${group}*.dat.gz; then
            [ "$date/EXPORT-part-$shard-$piece" == "20121017/EXPORT-part-8-3" ] && {
                echo /data/real/EXPORT_v1x0_${date}_${shard}_T_${group}*.dat.gz
            }
        fi
    done
done
A few sample file names from the /data/real/EXPORT folder:
/data/real/EXPORT_v1x0_20121017_4_T_115600_115800.dat.gz
/data/real/EXPORT_v1x0_20121017_4_T_235600_235800.dat.gz
/data/real/EXPORT_v1x0_20121017_4_T_115800_120000.dat.gz
/data/real/EXPORT_v1x0_20121017_4_T_235800_000000.dat.gz
And some sample output I got after making the changes:
/data/real/EXPORT_v1x0_20121017_0_T_0*.dat.gz: No such file or directory
/data/real/EXPORT_v1x0_20121017_0_T_1[0-3]*.dat.gz: No such file or directory
/data/real/EXPORT_v1x0_20121017_0_T_1[^0-3]*.dat.gz: No such file or directory
/data/real/EXPORT_v1x0_20121017_0_T_[^01]*.dat.gz: No such file or directory
In this case, replace the whole gzip line with:
[ "$date/EXPORT-part-$shard-$piece" == "20121017/EXPORT-part-8-3" ] && {
echo /data/real/EXPORT_v1x0_${date}_${shard}_T_${group}*.dat.gz
}
That should do the trick.
Edit: remove sleep to speed up the loop!
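Generalizing that check, a dry-run loop can print the full mapping from every EXPORT-part-shard-piece name to the source glob it would have been built from, without touching HDFS (same groups array and paths as the original script; nothing is executed or read from disk):

```shell
# Print all 48 (12 shards x 4 pieces) HDFS-name -> source-glob pairs.
date=20121017
groups=(0 '1[0-3]' '1[^0-3]' '[^01]')
for shard in 0 1 2 3 4 5 6 7 8 9 10 11; do
    for piece in 0 1 2 3; do
        group=${groups[$piece]}
        echo "EXPORT-part-$shard-$piece <- /data/ds/real/EXPORT_v1x0_${date}_${shard}_T_${group}*.dat.gz"
    done
done
```

Grep the output for EXPORT-part-8-3 to see exactly which glob fed the broken file.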

Bash script - iterate through folders and move into folders of 1000

I have 1.2 million files split out into folders, like so:
Everything
..........Folder 1
..................File 1
..................File 2
..................File 3
..................File 4
..................File 5 etc
..........Folder 2
..................File 1
..................File 2
..................File 3
..................File 4
..................File 5 etc
If I cd into Folder 1 I can run the following script to organize the files there into folders called 1, 2, 3, etc. of 1000 files each:
dir="${1-.}"
x="${2-1000}"
let n=0
let sub=0
while IFS= read -r file ; do
    if [ $(bc <<< "$n % $x") -eq 0 ] ; then
        let sub+=1
        mkdir -p "$sub"
        n=0
    fi
    mv "$file" "$sub"
    let n+=1
done < <(find "$dir" -maxdepth 1 -type f)
However I really would like to run it once on the Everything folder at the top level. From there it would consider the child folders, and do the by-1000 sorting so I could move everything out of Folder 1, Folder 2, etc. and into folders of 1000 items each called 1, 2, 3, etc.
Any ideas?
Edit: here's how I would like the files to end up (as per the comments):
Everything
..........Folder1
.................file1 (these filenames can be anything; they shouldn't be renamed)
.................(every file in between, so file2 through file999)
.................file1000
..........Folder2
.................file1001
.................(every file in between, so file1002 through file1999)
.................file2000
Every single possible file that is in the original folder structure is grouped into folders of 1000 items under the top level.
Let's assume your script is called organize.sh, and the Everything folder contains only directories. Try the following:
cd Everything
for d in *; do
pushd $d
bash ~/temp/organize.sh
popd
done
Update
To answer Tom's question in the comment: you only need one copy of organize.sh. Say you put it in ~/temp; then you can invoke it as updated above.
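If the layout in the edit is the real goal (one top-level pool of folders of 1000, regardless of which child folder each file came from), a variant of the question's script can collect files one level down first and then bin them. A sketch, run from inside Everything; the list file /tmp/allfiles.txt is an arbitrary name, and snapshotting the list before any mv means the newly created numbered folders are never re-scanned:

```shell
# Snapshot every file sitting one folder down, before anything moves.
find . -mindepth 2 -maxdepth 2 -type f > /tmp/allfiles.txt

# Bin them 1000 per numbered folder (1, 2, 3, ...) at the top level.
n=0; sub=0
while IFS= read -r file; do
    if [ $(( n % 1000 )) -eq 0 ]; then
        sub=$(( sub + 1 ))
        mkdir -p "$sub"
    fi
    mv "$file" "$sub"
    n=$(( n + 1 ))
done < /tmp/allfiles.txt
```

Like the question's script, this breaks on file names containing newlines; find -print0 with read -d '' would be the robust variant.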
Pseudo-algorithm:
1) Run ls over all your directories and store the names in a file.
2) cd into each directory listed in the file.
3) Sort all its files.
4) cd ..
5) Repeat steps 2-4 in a for loop.
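The pseudo-algorithm above, written out as bash (assuming organize.sh is the by-1000 script from the question, kept in ~/temp as in the earlier answer; the list file /tmp/dirs.txt is an arbitrary name):

```shell
cd Everything
# Step 1: store the directory names in a file.
ls -d */ > /tmp/dirs.txt
# Steps 2-5: visit each directory, run the sorting script there, come back.
while IFS= read -r d; do
    (cd "$d" && bash ~/temp/organize.sh)
done < /tmp/dirs.txt
```

Running the script in a subshell (the parentheses) makes the explicit cd .. of step 4 unnecessary.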
