Rename output files with iteration number and input with GNU Parallel

I have a list of files (data1.txt, ..., data6.txt) and I want to run the same command on them 3 times, as an example. I am using GNU Parallel.
I want these output files: 1data1.txt, 2data1.txt, 3data1.txt, ..., 2data6.txt, 3data6.txt.
I tried:
for i in $(seq 3); do parallel -j 8 'myCommand data{}.txt > results/out/{$i}data{}.txt' ::: 1 2 3 4 5 6; done
but my output files are: {}data1.txt, ..., {}data6.txt.
I've tried different possibilities but I don't get the expected results.

Use GNU Parallel's feature of making combinations:
parallel -j 8 myCommand data{2}.txt '>' results/out/{1}data{2}.txt ::: 1 2 3 ::: 1 2 3 4 5 6
If your CPU has 8 threads you can leave out -j 8, since GNU Parallel defaults to running one job per CPU core. Leaving it out is a good idea if you are later going to run this on a bigger system.
You can also use --results (requires version >20170222):
parallel --results results/out/{1}data{2}.txt myCommand data{2}.txt ::: 1 2 3 ::: 1 2 3 4 5 6 >/dev/null
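If you want to check what will be run before anything executes, GNU Parallel can print the generated commands with --dry-run:
parallel --dry-run -j 8 myCommand data{2}.txt '>' results/out/{1}data{2}.txt ::: 1 2 3 ::: 1 2 3 4 5 6
(This also shows why the original attempt produced files named {}data1.txt: {$i} is not a Parallel replacement string, and the single quotes kept the shell from ever expanding $i.)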

Related

Merge one-line texts into a data frame with basic Ubuntu shell commands

I have, let's say, two files, Input1.txt and Input2.txt. Each of them is a text file containing a single line of 5 numbers separated by tabs.
For instance Input1.txt is
1 2 3 4 5
and Input2.txt is
6 7 8 9 10
The output that I desire is Output.txt :
Input1 1 2 3 4 5
Input2 6 7 8 9 10
So I want to merge the files into a table with an extra first column containing the names of the original files. Obviously I have more than 2 files (actually 1000) and I would like to do it with a for loop. You can assume that all my files are named Input*.txt with * between 1 and 1000 and that they are all in the same directory.
I know how to do it with R, but I would like to do it with a basic line of commands in the Ubuntu shell. Is it feasible? Thanks for any help.
Assuming the line in Input1.txt, Input2.txt, etc. is terminated with a newline character, you can use
for i in Input*.txt
do
    printf "%s " "$i"
    cat "$i"
done > Output.txt
The result is
Input1.txt 1 2 3 4 5
Input2.txt 6 7 8 9 10
If you want to get Input1 etc. without .txt you can use
printf "%s " "${i%.txt}"

Count occurrences of numbers in Linux

I have a .txt file with 25,000 lines. Each line contains a number from 1 to 20. I want to compute the total occurrences of each number in the file. I don't know whether I should use grep or awk, or how to use them. And I'm worried about confusing 1 with 11, since both contain a 1. Thank you very much for helping!
I was trying this, but it would double-count my numbers:
grep -o '1' degreeDistirbution.txt | wc -l
With grep you can match the beginning and end of a line with '^' and '$' respectively. For the whole thing I'll use an array, but to illustrate this point I'll just use one variable:
one="$(grep -c "^1$" ./$inputfile)"
then we put that together with the magic of bash loops and loop through all the numbers with a while like so:
i=1
while [[ $i -le 20 ]]
do
    arr[i]="$(grep -c "^$i$" "./$inputfile")"
    i=$((i+1))
done
If you like, you can of course use a for as well.
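For instance, an equivalent for loop filling the same arr array:
for i in {1..20}
do
    arr[i]="$(grep -c "^$i$" "./$inputfile")"
done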
An easier method is:
sort -n file | uniq -c
Which will count the occurrences of each number in the sorted file and display the results like:
$ sort -n dat/twenty.txt | uniq -c
3 1
3 2
3 3
4 4
4 5
4 6
4 7
4 8
4 9
4 10
4 11
3 12
2 13
2 14
4 15
4 16
4 17
2 18
2 19
2 20
This shows I have 3 ones, 3 twos, etc. in the sample file.
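If you would rather not sort 25,000 lines, an awk sketch that tallies the counts in a single pass (count is just an arbitrary array name; the +0 makes absent numbers print as 0):
awk '{ count[$1]++ } END { for (n = 1; n <= 20; n++) print count[n]+0, n }' file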

Sort range Linux

Hello everyone. I have some questions about sorting in bash. I am working with Ubuntu 14.04.
The first question is: why, if I have a file some.txt with this content:
b 8
b 9
a 8
a 9
and when I type this:
sort -n -k 2 some.txt
the result will be:
a 8
b 8
a 9
b 9
which means that the file is sorted first by the second field and after that by the first field, but I thought that it would stay stable, i.e.
b 8
a 8
...
...
Maybe if two rows are equal, a lexicographical sort is applied, or what?
The second question is: why doesn't the following work:
sort -n -k 1,2 try.txt
The file try.txt is like this:
8 2
8 11
8 0
8 5
9 2
9 0
The third question is not actually about sorting, but it appears when I try to do this:
sort blank.txt > blank.txt
After this the blank.txt file is empty. Why is that?
Apparently GNU sort is not stable by default: add the -s option
Finally, as a last resort when all keys compare equal, sort compares entire lines as if no ordering options other than --reverse (-r) were specified. The --stable (-s) option disables this last-resort comparison so that lines in which all fields compare equal are left in their original relative order.
(https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html)
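On the sample file, the stable variant should keep the original b-before-a order within each key:
$ sort -s -n -k 2 some.txt
b 8
a 8
b 9
a 9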
Redirections are handled by the shell before handing off control to the program. The > redirection will truncate the file if it exists. After that, you are giving an empty file to sort.
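sort has a flag for this exact situation: with -o it reads all of its input before writing the output file, so giving the same name is safe:
sort -o blank.txt blank.txt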
For #2, you don't actually explain what's not working. Expanding your sample data, this happens:
$ cat try.txt
8 2
8 11
9 2
9 0
11 11
11 2
$ sort -n -k 1,2 try.txt
8 11
8 2
9 0
9 2
11 11
11 2
I assume you want to know why the 2nd column is not sorted numerically. Let's go back to the sort manual:
‘-n’
‘--numeric-sort’
‘--sort=numeric’
Sort numerically. The number begins each line and consists of ...
Looks like -n only applies to a number at the beginning of the line or key. After some trial and error, I found this combination that sorts each column numerically (-k1,1 limits the first key to field 1 alone so its n modifier compares just that field, and -k2,2n then breaks ties on field 2):
$ sort -k1,1n -k2,2n try.txt
8 2
8 11
9 0
9 2
11 2
11 11

Missing numbers from two sequences

How do I find out the missing numbers from two sequences using a bash script?
For example, I have a file which contains the following data:
1 1
1 2
1 3
1 5
2 1
2 3
2 5
Output: the missing numbers are
1 4
2 2
2 4
This awk one-liner gives the requested output for the specified input:
$ awk '$2!=l2+1&&$1==l1{for(i=l2+1;i<$2;i++)print l1,i}{l1=$1;l2=$2}' file
1 4
2 2
2 4
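For readability, the same logic expanded, where l1 and l2 hold the previous line's two fields:
awk '
    # Same group as the previous line, but the second field jumped:
    # print every number that was skipped over.
    $2 != l2 + 1 && $1 == l1 {
        for (i = l2 + 1; i < $2; i++)
            print l1, i
    }
    # Remember the current line for the next comparison.
    { l1 = $1; l2 = $2 }
' file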
A solution using grep:
printf "%s\n" {1..2}" "{1..5} | grep -vf file

How to randomize lines in a txt file with bash

I have a txt file with some lines such as:
a
b
c
f
e
f
1
2
3
4
5
6
Now I want to randomize the lines and print them to another txt file, for example:
f
6
e
1
and so on...
Could anybody help me?
I am new to bash scripting.
You could use shuf (a part of GNU coreutils).
shuf inputfile > outfile
For example:
$ seq 10 | shuf
7
5
8
3
9
4
10
1
6
2
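If you only want a few random lines rather than a full shuffle, shuf can do that too; -n limits the output to the given number of lines:
shuf -n 4 inputfile > outfile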
sort also has an option for that:
sort -R /your/file.txt
Explanation:
-R, --random-sort
sort by random hash of keys
Note that because -R sorts by a hash of the keys, duplicate lines end up next to each other; use shuf if you need them scattered.
Iterate over the file, outputting each line with a certain probability (in this example, with roughly a 10% chance for each line):
while IFS= read -r line; do
    if (( RANDOM % 10 == 0 )); then
        echo "$line"
    fi
done < file.txt
(I say "roughly", because the value of RANDOM ranges between 0 and 32767. As such, there are slightly more values that will produce a remainder of 0-7 than there are that will produce a remainder of 8 or 9 when divided by 10. Other probabilities have similar problems; you can fine-tune the expression to be more precise, but I leave that as an exercise to the reader.)
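A sketch that sidesteps the modulo bias entirely, using awk's floating-point rand() (srand with no argument seeds from the current time, so repeated runs differ):
awk 'BEGIN { srand() } rand() < 0.1' file.txt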
For less fortunate systems without GNU utils, like BSD/OSX, you can use this code:
lines=$(wc -l < file)
for ((i=0; i<10; i++)); do
    n=$((RANDOM % lines + 1))   # sed line addresses start at 1, not 0
    sed "${n}q;d" file          # print line n, then quit
done
Note that, unlike shuf, this can print the same line more than once.
