Sort files in a directory by their text character length and copy to other directory [closed] - linux

I'm trying to find the smallest file by character length inside of a directory and, once it is found, I want to rename it and copy it to another directory.
For example, I have two files in the directory ~/Files: cars.txt and rabbits.txt
Text in cars.txt:
I like red cars that are big.
Text in rabbits.txt:
I like rabbits.
So far I know how to get the character count of a single file with the command wc -m 'filename', but I don't know how to do this for all the files and sort them in order. I know rabbits.txt has the smaller character count, but how do I compare the two?

You could sort the files by character count, then select the name of the first (smallest) one:
file=$(wc -m ~/Files/* 2>/dev/null | sort -n | head -n 1 | awk '{print $2}')
echo "$file"
Note that wc prints a total line when given multiple files, but since the total is the largest number it sorts last and does not affect head -n 1. The awk '{print $2}' step assumes the filenames contain no whitespace.
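To finish the job from the question (rename and copy the smallest file), the result can be passed to cp. A minimal sketch, where the destination directory ~/Backup and the new name smallest.txt are placeholders of my own, not from the question:
# Find the file with the fewest characters and copy it under a new name
smallest=$(wc -m ~/Files/* 2>/dev/null | sort -n | head -n 1 | awk '{print $2}')
cp "$smallest" ~/Backup/smallest.txt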

Related

Loop through every 'even' line in a file [closed]

I have a fasta file with the following structure. For context, a fasta file is simply a text file in which each header line starts with '>' and the line below it holds the sequence text. I want to create a loop that iterates over every even line of this fasta file.
The name of the file is chicken_topmotifs.fasta
>gene8
ATGAATTATTATACACCTCAAATACTCTCCTCAATCTCTCCAACATTCCCCACCACAATTCTCGGTGACTTTACTACACTTCTACAATCATACACTTCT
>gene12
ATGGTAGATCTCTATTACGATTATCTTTCTTAGATCACATAATTATCACCCCCCCTTATAAATCTACACTTCTACAACCAATTACACTTCTACAAAACA
>gene18
ATGCTTTTACACTTCTACAACTACTTTTAACTCGATACTTCTACAATCTACACATATCACAATAACAAAAACAAAAAGCTACTAATATATATATATACA
>gene21
ATGTCTCAATTTCACCAATCTATAATTTACTACGCCGTACTCTTTATAACCTTACTTTCTTAAATAACATTACACTTCTACATTACATATTTTACATCA
for sequence in chicken_topmotifs.fasta;
do
echo $sequence
done
Just do two reads each time through the loop. The first read gets the odd line, the second one gets the even line after it.
while read -r gene; do
    read -r sequence
    # do stuff with $sequence
done < chicken_topmotifs.fasta
Assumptions:
ignore header (>) lines
ignore blank lines
One bash idea:
while read -r sequence
do
    echo "$sequence"
done < <(grep '^[ATGC]' chicken_topmotifs.fasta)
If we don't have to worry about blank lines:
while read -r sequence
do
    echo "$sequence"
done < <(grep -v '^>' chicken_topmotifs.fasta)
Both of these generate:
ATGAATTATTATACACCTCAAATACTCTCCTCAATCTCTCCAACATTCCCCACCACAATTCTCGGTGACTTTACTACACTTCTACAATCATACACTTCT
ATGGTAGATCTCTATTACGATTATCTTTCTTAGATCACATAATTATCACCCCCCCTTATAAATCTACACTTCTACAACCAATTACACTTCTACAAAACA
ATGCTTTTACACTTCTACAACTACTTTTAACTCGATACTTCTACAATCTACACATATCACAATAACAAAAACAAAAAGCTACTAATATATATATATACA
ATGTCTCAATTTCACCAATCTATAATTTACTACGCCGTACTCTTTATAACCTTACTTTCTTAAATAACATTACACTTCTACATTACATATTTTACATCA
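If the goal is literally every even-numbered line rather than filtering by content, awk can select by line number directly; a minimal sketch, assuming the headers and sequences alternate strictly with no blank lines:
# Print only even-numbered lines (the sequences in a strictly alternating fasta file)
awk 'NR % 2 == 0' chicken_topmotifs.fasta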

Shell command to print the statements with N number of words present in other file [closed]

Suppose I have a file with 3 lines:
output.txt:
Maruti
Zen
Suzuki
I used the command wc -l output.txt to get the number of lines; it reported 3.
Based on that count, I have to execute a command once per line:
echo CREATE FROM $(sed -n 1p output.txt)
echo CREATE FROM $(sed -n 2p output.txt)
echo CREATE FROM $(sed -n 3p output.txt)
:
:
echo CREATE FROM $(sed -n np output.txt)
Can you please suggest a command to replace 1 2 3 ... n in the commands above, based on the number of lines in my file?
This is just a sample illustration of my use case. Please suggest a command that executes n times.
You just need one command.
sed 's/^/CREATE FROM /' output.txt
See also Counting lines or enumerating line numbers so I can loop over them - why is this an anti-pattern?
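If each line really must drive a separate command (for example, because every CREATE FROM statement is executed individually), a while read loop handles any number of lines without counting them first; a minimal sketch:
# Build one CREATE FROM statement per line of output.txt
while IFS= read -r line; do
    echo "CREATE FROM $line"
done < output.txt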

Extract the file name of last slash from path [closed]

I am finding files in a specific location, and I need to extract the file name after the last slash of each path, without its extension (e.g. .war), using shell scripting.
I tried the following to find the files in that path:
find /data1/jenkins_devops/builds/develop/5bab159c1c40cfc44930262d30511ac7337805fa -mindepth 1 -type f -name '*.war'
For example, this folder "5bab159c1c40cfc44930262d30511ac7337805fa" contains multiple .war files such as interview.war and auth.war, so the expected output is interview and auth, without the .war extension.
Can someone please help?
There are more elegant ways to achieve the objective; the following uses awk:
find /data1/jenkins_devops/builds/develop/5bab159c1c40cfc44930262d30511ac7337805fa -mindepth 1 -type f -name '*.war' | awk -F "/" '{print $NF}' | awk -F "." '{print $1}'
In awk, NF is the number of fields, so $NF is the last one. The first awk splits each path on / and prints the last field (the file name); the second splits on . and prints the first field, dropping the extension. Note that '*.war' is quoted so the shell does not expand the glob before find sees it.
Just use basename:
the_path="/data1/jenkins_devops/builds/develop/ab7f302d157d839b4ac3d7917cfa2d550ba2e73e/auth.war"
basename "$the_path" .war
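To apply that to everything find returns, basename can be invoked per result; a minimal sketch combining the two approaches:
# Print each .war file name without its directory or .war extension
find /data1/jenkins_devops/builds/develop/5bab159c1c40cfc44930262d30511ac7337805fa -mindepth 1 -type f -name '*.war' -exec basename {} .war \;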

How to create a Unix script to segregate data Line by Line? [closed]

I have some data in a MyFile.CSV file like this:
id,name,country
100,tom cruise,USA
101,Johnny depp,USA
102,John,India
What would the shell script be that takes the above file as input and segregates the data into two different files by country?
I tried using a for loop with two ifs inside it, but I could not get it to work. How can it be done with awk?
For LINE in MyFile.CSV
Do
If grep "USA" $LINE >0 Then
$LINE >> Out_USA.csv
Else
$LINE >> Out_India.csv
Done
You can try this:
grep "USA" MyFile.CSV >> Out_USA.csv
grep "India" MyFile.CSV >> Out_India.csv
(-R is for searching directories recursively; a plain grep on the single file is enough here.)
There are many ways to do this. One way:
$ for i in $(awk -F"," 'NR>1 {print $3}' MyFile.CSV | sort -u);
do
    echo "$i";
    egrep "${i}|country" MyFile.CSV > "Out_${i}.csv";
done
Note that the country values are deduplicated with sort -u (a plain uniq only removes adjacent duplicates, so it needs sorted input anyway). This assumes the country name does not clash with values in other columns. If it does, you can fine-tune the pattern with additional regex; for example, since country is the last field, you can anchor the match with $ in the grep.
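Since the question explicitly asks about awk, here is a minimal one-pass sketch of my own that writes one output file per country value; it assumes the country is always the third field and that the header line should be skipped:
# Split MyFile.CSV into Out_<country>.csv, one file per distinct third field
awk -F',' 'NR > 1 { print > ("Out_" $3 ".csv") }' MyFile.CSV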

Linux Compare two text files [closed]

I have two text files like below:
File1.txt
A|234-211
B|234-244
C|234-351
D|999-876
E|456-411
F|567-211
File2.txt
234-244
999-876
567-211
I want to compare the two files and extract the lines whose values match, like below:
Desired output
B|234-244
D|999-876
F|567-211
$ grep -F -f file2.txt file1.txt
B|234-244
D|999-876
F|567-211
The -F makes grep search for fixed strings (not patterns). Both -F and -f are POSIX options to grep.
Note that this assumes your file2.txt does not contain short strings like 11 which could lead to false positives.
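If substring false positives are a real risk, a stricter alternative (my own suggestion, not part of the original answer) is to match the second |-delimited field of file1.txt exactly against the lines of file2.txt with awk:
# Print lines of file1.txt whose second field appears verbatim in file2.txt
awk -F'|' 'NR==FNR { want[$0]; next } $2 in want' file2.txt file1.txt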
Try:
grep -f File2.txt File1.txt
(Without -F, each line of File2.txt is treated as a basic regular expression rather than a fixed string.)
