Linux: create directories and move corresponding files to the directories

I have a text file that lists the directory names and which files should be included in each.
My text file:
SRS000111 ERR1045156
SRS000112 ERR1045188
SRS000123 ERR1045204
SRS000134 ERR1045237 ERR1045238 ERR1045239
SRS000154 ERR1045255 ERR1045256
SRS000168 ERR1045260 ERR1045261 ERR1045262
... ... ...
SRS001567 ERR1547451 ERR1547676
Now I want to create all the directories using the first column of the text file, but I don't know how to write the for loop.
for filename in cat file.txt | awk -F, '{print $1}'; do mkdir ${filename}; done
but it gives an error.
Second, I have all the ERR files and I want to move them into the corresponding directories according to the text file. I have no idea how to do this part.

I recommend you read the file line by line, split the directory-name column from the file-name columns, and then make the directories and move the files.
This script does it:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
    dir=$(echo "$line" | awk '{print $1}')           # first column: directory name
    files=$(echo "$line" | awk '{$1=""; print $0}')  # remaining columns: file names
    mkdir -p "$dir"
    mv $files "$dir"/    # $files is left unquoted on purpose so it splits into the individual file names
done < myfile.txt
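Just as an illustration (the script name here is a placeholder): save it as, say, make_dirs.sh next to myfile.txt and the ERR files, then run it with bash make_dirs.sh.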
It's not too complicated, but if you have any questions about it, feel free to ask.

You can also do this entirely from awk by making system calls for mkdir and mv.
This awk one-liner would do it:
awk 'FNR>1{system("mkdir \"" $1 "\""); for(i=2; i<=NF; i++) system("mv \"" $i "\" " "\"" $1 "\"")}' file
FNR>1 because we don't want to create a directory for the first line, i.e. header names in your file; drop that condition if your file has no header row.
Note: run this command from the directory where all the files mentioned in your source/input file are present. It will create the directories right there and move all the files into those newly created directories.

Related

Search, match and copy directories into another based on names in a txt file

My goal is to copy a bunch of specific directories whose names are in a txt file as follows:
$ cat names.txt
raw1
raw2
raw3
raw4
raw5
These directories have subdirectories, hence it is important to copy all the contents. When I list them in my terminal it looks like this:
$ ls -l
raw3
raw7
raw1
raw8
raw5
raw6
raw2
raw4
To perform this task, I have tried the following:
cat names.txt | while read line; do grep -l '$line' | xargs -r0 cp -t <desired_destination>; done
But I get this error:
cp: cannot stat No such file or directory
I suppose it's because the names in the file list (names.txt) don't match the order of the ones in the terminal. Notice that they are unsorted, and the while read line approach doesn't work. Thank you for taking the time and commitment to help me.
I'm having problems following the logic of the current code, so in the name of K.I.S.S. I propose:
tgtdir=/my/target/directory
while read -r srcdir
do
[[ -d "${srcdir}" ]] && cp -rp "${srcdir}" "${tgtdir}"
done < <(tr -d '\r' < names.dat)
NOTES:
the < <(tr -d '\r' < names.dat) is used to remove windows/dos line endings from names.dat (per comments from OP); if names.dat is updated to remove the \r then the tr -d will be a no-op (i.e., a bit of overhead to spawn the subprocess, but the script will still read names.dat correctly); a quick way to check for the \r endings is sketched after these notes
assumes the script is run from the directory where the source directories reside; otherwise the code can be modified to either cd to said directory or prefix the ${srcdir} references with said directory
OP can add/modify the cp flags as needed, but I'm assuming at a minimum -r will be needed in order to recursively copy the directories
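As a side note (my addition, not part of the original answer): if you want to confirm that names.dat really has the \r endings before relying on the tr, something like the following should work with GNU coreutils and GNU sed; make a backup before editing in place.
cat -A names.dat | head -n 3     # CRLF lines show up ending in ^M$
sed -i 's/\r$//' names.dat       # strip the \r in place (GNU sed)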
UUoC (Useless Use of Cat).
cat names.txt | while read line; do ...; done
is better written
while read line; do ...; done < names.txt
The do grep -l '$line' | part is eating your input.
printf "%s\n" 1 2 3 |while read line; do echo "Read: [$line]"; grep . | cat; done
Read: [1]
2
3
In your case, it is likely finding no lines that match the literal string $line, which you have embedded in single-quote marks; single quotes do not allow the variable to be expanded. Use "$line" in double quotes instead, and even then grep -l wouldn't be helpful even if it did match:
$: printf "%s\n" 1 2 3 | grep -l .
(standard input)
You didn't tell it what to read from, so -l is pointless since it's reading the same stdin stream that the read is.
I think what you want is a little simpler -
xargs cp -Rt /your/desired/target/directory/ < names.txt
Assuming you wanted to leave the originals where they were.
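One caveat worth adding (not from the original answer): plain xargs splits its input on whitespace, so directory names containing spaces would break. With GNU xargs you can tell it to split on newlines instead:
xargs -d '\n' cp -Rt /your/desired/target/directory/ < names.txt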

Changing file type for multiple files at once using awk command

I am trying to change .dat files to .csv files using the awk command. An example file has 3 columns of numbers with spaces between each column:
23.00005 320.0054 0.0039734
xx.xxxxx xxx.xxxx x.xxxxxxx
The filenames are organized as filenameX.project.dat where X is any number from 1 to a couple hundred. The folder has many other files that I do not want changed. I want to be able to change all of these files at once instead of having to do them over and over.
Here is my example command:
awk '{print $1","$2","$3}' filenameX.project.dat > filenameX.project.csv
How can I automate this to run one command that will make every file a csv file?
I have tried the command below and others similar, but none work.
awk '{print $1","$2","$3}' filename*.project.dat > filename*.project.csv
Something like this:
$ for i in filename*dat; do awk '{print $1","$2","$3}' "$i" >> $(echo "$i" | sed 's,\.dat$,.csv,'); done
It will loop through all filename*dat files in the directory, execute the awk command on each and redirect the output to a file that has .csv instead of .dat at the end.
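A variation on the same loop (just a sketch, not from the original answer) that replaces the echo | sed pipeline with bash parameter expansion; it produces the same .csv names for these files:
for i in filename*dat; do
    awk '{print $1","$2","$3}' "$i" > "${i%.dat}.csv"    # ${i%.dat} drops the trailing .dat
done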
You can do this all in awk like so:
awk 'BEGIN {OFS=","}
FNR==1 {fn=FILENAME; sub(/\.dat$/,".csv",fn)
printf "Copying %s to %s\n", FILENAME, fn}
{ for (i=1;i<=NF;i++) printf "%s%s", $i, (i<NF ? OFS : RS) > fn}' *.dat
Please make a backup first, as I am still not certain what you mean, but suspect it is:
rename -n -S .dat .csv filename*.project.dat
If it looks good, remove the -n and run again for real.

Linux batch copy files into directories based on filename pattern

I have a list of almost 500 pdf files with the following filename structure:
XXXX-YYYY-MM-DD.pdf
where XXXX is a variable-length numeric code (1 to 4 digits), always delimited by "-", for example:
51-2016-08-22.pdf
776-2016-08-22.pdf
3881-2016-08-22.pdf
4-2016-08-22.pdf
2860-2016-08-22.pdf
The goal is to copy each file into its own directory, naming the directories like the pattern (i.e. file 776-2016-08-22.pdf goes to directory 776). How can I use awk or sed to delimit the variable-length field?
Here's my code:
for f in *.pdf
do
FOLDERNAME=`echo $f| awk (awk or sed missing code here)`
mkdir /my/dir/structure/$FOLDERNAME
cp $f /my/dir/structure/$FOLDERNAME/
done
Thanks for your support.
You can use:
for f in *.pdf; do
d="${f%%-*}"
mkdir -p "$d" && cp "$f" "$d"
done
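Here ${f%%-*} removes the longest suffix matching -*, i.e. everything from the first - onward, leaving only the leading numeric code. A quick check in an interactive shell:
$ f=776-2016-08-22.pdf
$ echo "${f%%-*}"
776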
As rightly pointed out by ed-morton, this is NOT the recommended solution as it fails in many cases. Please follow https://stackoverflow.com/a/39089589/3834860 instead.
Keeping this answer for reference.
awk -F '-' specifies the delimiter, and '{print $1}' prints the first element before the delimiter:
for f in *.pdf
do
FOLDERNAME=`echo $f| awk -F '-' '{print $1}'`
mkdir /my/dir/structure/$FOLDERNAME
cp $f /my/dir/structure/$FOLDERNAME/
done

save the output of a bash file

I have some files in a folder, and I need the first line of each file:
transaction1.csv
transaction2.csv
transaction3.csv
transaction4.csv
and I have the following code:
#All folders that begin with the word transaction
folder='"transaction*"'
ls `echo $folder |sed s/"\""/\/g` >testFiles
# The number of lines of testFiles that is the number of transaction files
num=`cat testFiles | wc -l`
for i in `seq 1 $num`
do
#The first transaction file
b=`cat testFiles | head -1`
#The first line of the first transaction file
cat `echo $b` | sed -n 1p
#remove the first line of the testFiles
sed -i '1d' testFiles
done
This code works; the problem is that I need to save the first line of each file into an output file,
and if I change the line:
cat `echo $b` | sed -n 1p > salida
it doesn't work =(
In bash:
for file in *.csv; do head -1 "$file" >> salida; done
As Adam mentioned in the comment, this has the overhead of opening the output file each time through the loop. If you need better performance and reliability, use the following:
for file in *.csv; do head -1 "$file" ; done > salida
head -qn1 *.csv
head -n1 will print the first line of each file, and -q will suppress the header when more than one file is given on the command-line.
=== Edit ===
If the files are not raw text (for example, if they're compressed with "bzip2" as mentioned in your comment) and you need to do some nontrivial preprocessing on each file, you're probably best off going with a for loop. For example:
for f in *.csv.bz2 ; do
bzcat "$f" | head -n1
done > salida
(Another option would be to bunzip2 the files and then head them in two steps, such as bunzip2 *.csv.bz2 && head -qn1 *.csv > salida; however, this will of course change the files in place by decompressing them, which is probably undesirable.)
this awk one-liner should do what you want:
awk 'FNR==1{print > "output"}' *.csv
The first line of each csv will be saved into the file output.
Using sed:
for f in *.csv; do sed -n "1p" "$f"; done >salida

how to compare output of two ls in linux

So here is a task which I can't solve. I have a directory with .h files and a directory with .i files, which have the same names as the .h files. I want, just by typing a command, to get all .h files which do not exist as .i files. It's not a hard problem, I could do it in some programming language, but I'm just curious what it would look like in the shell :). To be more specific, here is the algo:
1. get file names without extensions from ls *.h
2. get file names without extensions from ls *.i
3. compare them
4. print all names from step 1 that are not found in step 2
Good luck!
diff \
<(ls dir.with.h | sed 's/\.h$//') \
<(ls dir.with.i | sed 's/\.i$//') \
| grep '^<' \
| cut -c3-
diff <(ls dir.with.h | sed 's/\.h$//') <(ls dir.with.i | sed 's/\.i$//') executes ls on the two directories, cuts off the extensions, and compares the two lists. Then grep '^<' finds the lines that are only in the first listing, and cut -c3- cuts off the "< " characters that diff inserted.
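An alternative sketch (not from the original answer) using comm, which compares two sorted lists; ls already emits sorted names, and -23 suppresses the lines unique to the second list and the lines common to both, leaving only the names that exist as .h but not as .i:
comm -23 <(ls dir.with.h | sed 's/\.h$//') <(ls dir.with.i | sed 's/\.i$//')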
ls ./dir_h/*.h | sed -r -n 's:.*dir_h/([^.]*).h$:dir_i/\1.i:p' | xargs ls 2>&1 | \
grep "No such file or directory" | awk '{print $4}' | sed -n -r 's:dir_i/([^:]*).*:dir_h/\1:p'
ls -1 dir1/*.hh dir2/*.ii | awk -F"/" '{print $NF}' |awk -F"." '{a[$1]++;b[$0]}END{for(i in a)if(a[i]==1 && b[i".hh"]) print i}'
explanation:
ls -1 dir1/*.hh dir2/*.ii
The above lists all the *.hh and *.ii files in both directories.
awk -F"/" '{print $NF}'
The above prints just the file name, excluding the full path.
awk -F"." '{a[$1]++;b[$0]}END{for(i in a)if(a[i]==1 && b[i".hh"]) print i}'
The above creates two associative arrays: a, keyed on the file name without the extension, and b, keyed on the full file name.
If both the .hh and .ii files exist, the value in the associative array a will be 2; if there is only one file, the value will be 1. So we need the array items whose value is 1 and which correspond to a header file (.hh).
This is checked using the associative array b, which is done in the END block.
Assuming bash is your shell:
for file in dir_with_h/*.h; do
    name=${file%.h};            # trim trailing ".h" file extension
    name=${name#dir_with_h/};   # trim leading folder name
    if [ ! -e "dir_with_i/${name}.i" ]; then
        echo "${name}";
    fi
done
Undoubtedly this can be ported to virtually all other shells. I find this less cryptic than some other approaches (although this is surely my problem) but it is a little wordy. As such, a shell script might help recall it.
