Command line bash for entering multiple directories and executing a command - linux

I'm new to this site (and to programming, more or less), but I'm hoping you can help.
I have numerous directories named 3K, 4K, 5K, etc. Within each directory I have 12 subdirectories named v1 to v12, each containing a file called OUTCAR. I am trying to write a bash command that will allow me to enter each of the subdirectories and gather data from OUTCAR.
The function works with no issues when I enter each subdirectory individually.
I'm using
for file in v{1..12} ; do grep "key_string" OUTCAR | awk '{print(relevant_stuff)}' > output.dat ; done
I run this from the *K directory that contains the v{1..12} subdirectories.
However, I'm getting an error telling me that OUTCAR doesn't exist for each v{1..12}. I know it does, so I'm guessing that I haven't properly directed the command to cd into each subdirectory first. Any tips?
Thanks!

You would be better off using a find command like this from the top-level directory where these subdirectories exist:
find . -type d -name 'v[0-9]*' \
    -exec awk '/key_string/ {print FILENAME ":" $0}' {}/OUTCAR \; >> output.dat
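If you would rather keep the loop from the question, a minimal sketch is to point grep at each subdirectory's OUTCAR rather than cd-ing into it (key_string and the awk body are placeholders from the question):
for dir in v{1..12} ; do
    grep "key_string" "$dir/OUTCAR" | awk '{print $0}' >> output.dat
done
Note the append (>>): a plain > would overwrite output.dat on every pass through the loop.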

Related

shell script to read directory names and create .txt files with the same names in another directory

I have two directories, one called clients and another called test. Inside the clients directory I have some folders. I need a shell script that reads the names of the folders inside clients and creates .txt files with the same names inside the test folder. I am very new to shell and have no idea how to do this; could you help me please?
Try using xargs with ls. ls -F displays all entries in the clients directory, but appends an extra / to the folder names. The grep uses that extra / in the output of ls -F to pass only folders on to the next command. Then sed 's/\///g' removes the extra / from the grep output and passes the names to xargs. xargs substitutes each folder name for the % symbol, and touch creates a text file with that name.
ls -F clients | grep / | sed 's/\///g' | xargs -I % touch test/%.txt
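For what it's worth, a glob-based sketch avoids parsing ls entirely; it assumes the script is run from the parent directory that holds both clients and test:
for dir in clients/*/ ; do
    touch "test/$(basename "$dir").txt"
done
The trailing slash in the glob matches only directories, and basename strips both the leading path and the trailing slash.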

Creating list of files of every subfolders in folders bash

I have a problem with creating a list of files matching the pattern *.cbf in every subfolder of every folder.
I wrote the script in shell, but it always exits with "no such file or directory".
The structure of the path is the following: /dir/*/*/*.cbf
#!/usr/bin/env bash
input_dir=$1
for i in `ls $input_dir/*/*/*_00001.cbf`; do
cbf=$(readlink -e $i)
cbf_fn=$(basename $cbf)
cbf_path=$(dirname $cbf)
cbf_path_p2=$(basename $cbf_path)
cbf_path_p1=$(basename $(dirname $cbf_path))
find `$input_dir/$cbf_path_p1/$cbf_path_p2` -name "*.cbf" -print > files.lst
done
The main reason is that the directory will probably not exist. I'll go through your code:
Suppose your input_dir is /hoppa and your link is /hoppa/1/2/a_00001.cbf, which points to /level1/level2/level3/filename.ext.
for i in `ls $input_dir/*/*/*_00001.cbf`; do
It is in general a bad idea to process the output of ls. Also, for those who once did Fortran (punch cards, ah, those days...), i suggests an integer; f or file would probably be a better choice. So, assuming that your input_dir does not contain spaces,
for file in $input_dir/*/*/*_00001.cbf ; do
cbf=$(readlink -e $i)
(those who suggested find probably missed the readlink)
cbf_fn=$(basename $cbf) # cbf_fn=filename.ext
cbf_path=$(dirname $cbf) # cbf_path=/level1/level2/level3
cbf_path_p2=$(basename $cbf_path)
# cbf_path_p2=level3
cbf_path_p1=$(basename $(dirname $cbf_path))
# cbf_path_p1=level2
find `$input_dir/$cbf_path_p1/$cbf_path_p2` -name "*.cbf" -print > files.lst
So the find will look in /hoppa/level2/level3, a directory which may not exist.
done
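Putting the walkthrough together, here is a minimal corrected sketch. It assumes that what you want is the list of .cbf files sitting next to each resolved link, so it simply searches the directory the link points into, and it appends to files.lst instead of overwriting it on every iteration:
#!/usr/bin/env bash
input_dir=$1
: > files.lst                               # start with an empty list
for file in "$input_dir"/*/*/*_00001.cbf; do
    cbf=$(readlink -e "$file") || continue  # resolve the symlink; skip broken links
    cbf_path=$(dirname "$cbf")              # directory the link points into
    find "$cbf_path" -name '*.cbf' -print >> files.lst
done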

How to open all files in a directory in Bourne shell script?

How can I use the relative path or absolute path as a single command line argument in a shell script?
For example, suppose my shell script is on my Desktop and I want to loop through all the text files in a folder that is somewhere in the file system.
I tried sh myshscript.sh /home/user/Desktop, but this doesn't seem feasible. And how would I handle directory names and file names that contain whitespace?
myshscript.sh contains:
for i in `ls`
do
cat $i
done
Superficially, you might write:
cd "${1:-.}" || exit 1
for file in *
do
cat "$file"
done
except you don't really need the for loop in this case:
cd "${1:-.}" || exit 1
cat *
would do the job. And you could avoid the cd operation with:
cat "${1:-.}"/*
which lists (cats) all the files in the given directory, even if the directory or the file names contain spaces, newlines or other difficult-to-manage characters. You can use any appropriate glob pattern in place of *; if you want files ending in .txt, use *.txt as the pattern, for example.
This breaks down if you might have so many files that the argument list is too long. In that case, you probably need to use find:
find "${1:-.}" -type f -maxdepth 1 -exec cat {} +
(Note that -maxdepth is a GNU find extension.)
Avoid using ls to generate lists of file names, especially if the script has to be robust in the face of spaces, newlines etc in the names.
Use a glob instead of ls, and quote the loop variable:
for i in "$1"/*.txt
do
cat "$i"
done
PS: ShellCheck automatically points this out.
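One small refinement to the glob approach, assuming bash rather than a strict Bourne shell: with nullglob set, the loop body is simply skipped when no .txt files match, instead of cat being handed the literal pattern:
#!/usr/bin/env bash
shopt -s nullglob            # unmatched globs expand to nothing
for i in "${1:-.}"/*.txt
do
    cat "$i"
done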

rsync to backup one file generated in dynamic folders

I'm trying to back up just one file that is generated by another application in dynamically named folders.
for example:
parent_folder/
back_01 -> file_blabla.zip (timestamp 2013.05.12)
back_02 -> file_blabla01.zip (timestamp 2013.05.14)
back_03 -> file_blabla02.zip (timestamp 2013.05.22)
I need to get the latest generated zip. It doesn't matter what the file is named; as long as it is the latest zip inside parent_folder, that's the one I want.
Also, when I do the rsync, the folder structure plus the file name is recreated at the destination. I want to avoid that: I want to back that file up into a single folder, under a fixed name, so I always know where the latest copy is and it is always named the same.
Right now I'm doing this with a Perl script that gets the latest generated folder with
"ls -tAF | grep '/$' | head -1"
and then performs the rsync. It does bring over the latest zip, but with the folder structure that I don't want, so it doesn't overwrite my previous latest zip file.
rsync -rvtW --prune-empty-dirs --delay-updates --no-implied-dirs --modify-window=1 --include='*.zip' --exclude='*.*' --progress /source/ /myBackup/
It would also be great if I could do the rsync without needing to use Perl or any other script.
thanks
The file names will differ each time? That makes it hard for any kind of syncing to work.
What you could do is:
Create a new folder outside of where the file is found, then:
Before you start, remove the previously symlinked file in that folder.
When the newest folder is found, i.e. with ls -tAF | grep '/$' | head -1 ....,
symlink the zip inside it into this folder,
then rsync, ssh or unison the file across to the new node.
If the symlink name is file-latest.zip then it will always be this
one file that is sent across; a rough sketch follows.
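A minimal sketch of that symlink idea, assuming the staging folder is parent_folder/latest, that the newest back_* subdirectory contains exactly one .zip, and that remote_host:/myBackup/ is a placeholder destination:
#!/usr/bin/env bash
cd parent_folder || exit 1
mkdir -p latest
rm -f latest/file-latest.zip                    # drop the previous symlink
newest_dir=$(ls -tAF | grep '/$' | head -1)     # newest subdirectory, as in the question
ln -s "$PWD/$newest_dir"*.zip latest/file-latest.zip
rsync -vtL latest/file-latest.zip remote_host:/myBackup/   # -L copies the file the link points to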
But why do all that when you can just scp? You can also take a look at
https://github.com/vahidhedayati/definedscp
for a more long-winded approach. It is not written for this exact situation, but it uses the real file date/time stamp and converts it to seconds, which might be useful if you wish to do the stat in a different way.
Using stat to work out the latest file and then simply scp it across, here is something to get you started:
One liner:
scp $(find /path/to/parent_folder -name \*.zip -exec stat -t {} \;|awk '{print $1" "$13}'|sort -k2nr|head -n1|awk '{print $1}') remote_server:/path/to/name.zip
A more long-winded way, maybe of use for understanding what the above is doing:
#!/bin/bash
FOUND_ARRAY=()
cd parent_folder
for file in $(find . -name '*.zip'); do
    ptime=$(stat -t "$file" | awk '{print $13}')   # field 13 of GNU stat -t is the mtime in seconds
    FOUND_ARRAY+=("$file $ptime")
done
IFS=$'\n'
FOUND_FILE=$(echo "${FOUND_ARRAY[*]}" | sort -k2nr | head -n1 | awk '{print $1}')
scp "$FOUND_FILE" remote_host:/backup/new_name.zip
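A shorter variant of the same idea, assuming GNU find (for -printf) and, as above, a placeholder remote path:
# Pick the most recently modified .zip under parent_folder and copy it under a fixed name.
latest=$(find parent_folder -name '*.zip' -printf '%T@ %p\n' | sort -nr | head -n1 | cut -d' ' -f2-)
scp "$latest" remote_host:/backup/new_name.zip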

Launching program several times

I am using Mac OS. This is the command line code to launch my program (two parts):
nucmer --mum file1.txt file2.txt
show-snps -Clr -x 2 out.delta > out_file1.snps
The first part of the program creates the file out.delta. My file2.txt is always the same, but I want to run both parts 35000 times, each time with a different file1.txt. All the file1 files are located in the same directory.
Is it possible to do it using BASH?
Keep all the input files in one directory. Create a wrapper script that invokes nucmer and then show-snps. The wrapper script accepts the path to the file directory as input, iterates over all files in that directory, and calls your two programs for each one.
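A minimal sketch of such a wrapper, assuming the file1 inputs end in .txt, the directory is passed as the first argument, and file2.txt lives outside that directory so the glob doesn't pick it up (run_all.sh is just a placeholder name):
#!/usr/bin/env bash
# Usage: ./run_all.sh /path/to/inputs
input_dir=$1
for f in "$input_dir"/*.txt; do
    nucmer --mum "$f" file2.txt
    show-snps -Clr -x 2 out.delta > "out_$(basename "${f%.txt}").snps"
done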
You could do something along these lines:
find . -maxdepth 1 -type f -print | grep -v '^\./out_' | while read -r f
do
    b=$(basename "${f}")
    nucmer --mum "${f}" file2.txt
    show-snps -Clr -x 2 out.delta > "out_${b}.snps"
done
The find bit finds all files in the current directory. grep filters out any previous output files, in case you've run some previously. The basename line strips the leading ./ off the file name, and then your two programs get run with the input file name and an output file name based on that basename output.
If you don't get an argument list too long error, you could just use for:
for f in file*.txt; do nucmer --mum "$f" file2.txt; show-snps -Clr -x 2 out.delta > "out_${f%.txt}.snps"; done
