BASH loop through multiple files replacing content and filename - linux

I'm trying to find all files with a .pux extension 1 level below my parent directory. Once the files are identified, I want to add a carriage return to each line inside each file (if possible, only if that line doesn't already have the CR). Finally I want to rename the extension to .pun ensuring there is no .pux left behind.
I've been trying different methods with this, and my biggest problem is that I cannot develop or debug this code easily, as I cannot access the command line directly. I can't access the Linux server that the script will run on; I can only call it from my application on my Windows server (trust me, I'm thinking exactly what you are right now).
The Linux server is running BASH 3.2.57(2). I don't believe the Unix2Dos utility is installed, as I've tried using it in its most basic form with no success. I've confirmed my find command can successfully identify the files I need, as I have run it and checked my log file output.
#!/bin/bash
MYSCRIPTS=${0%/*}
PARENTDIR=/home/clnt/parent/
LOGFILE="$MYSCRIPTS"/PUX2PUN.log
find "$PARENTDIR" -mindepth 2 -maxdepth 2 -type f -name "*.pux" > "$LOGFILE"
Logfile output:
/home/clnt/parent/z3y/prz3y.pux
/home/clnt/parent/wsl/prwsl.pux
However, when I've tried to build on this code and pipe those results into a while read loop, it doesn't appear to do anything.
#!/bin/bash
MYSCRIPTS=${0%/*}
PARENTDIR=/home/clnt/parent/
LOGFILE="$MYSCRIPTS"/PUX2PUN.log
find "$PARENTDIR" -mindepth 2 -maxdepth 2 -type f -name "*.pux" -print0 | while IFS= read -r file; do
    sed -i '/\r/! s/$/\r/' "${file}" &&
    mv "${file}" "${file/%pux/pun}" >> "$LOGFILE"
done
I'm open to other methods if they are standard in my BASH version and safe. Below my parent directory there should be anywhere from 1-250 folders max, and each of those child folders can have up to 1 pr*.pux file each (* will match the folder name, as shown in my example output earlier). So we're not dealing with a ton of files.
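For what it's worth, the likely culprit in the loop above is pairing find -print0 (NUL-delimited output) with a plain read, which waits for newlines that never arrive, so the loop body never executes. A minimal sketch of the corrected pairing, assuming GNU sed for -i; the printf line is an addition so the log actually records each rename, since mv prints nothing on success:
find "$PARENTDIR" -mindepth 2 -maxdepth 2 -type f -name "*.pux" -print0 |
while IFS= read -r -d '' file; do
    # append a CR only to lines that don't already contain one
    sed -i '/\r/! s/$/\r/' "$file" &&
    mv "$file" "${file%.pux}.pun" &&
    printf '%s\n' "${file%.pux}.pun" >> "$LOGFILE"
done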

Related

Linux - How to zip files per subdirectory separately

I have a directory structure like this.
From this I want to create different zip files such as
data-A-A_1-A_11.zip
data-A-A_1-A_12.zip
data-B-B_1-B_11.zip
data-C-C_1-C_11.zip
while read -r line; do
    echo "zip -r ${line//\//-}.zip $line"
    # zip -r "${line//\//-}.zip" "$line"
done <<< "$(find data -maxdepth 3 -mindepth 2 -type d)"
Redirect the result of a find command into a while loop. The find command searches the directory data for directories only, between 2 and 3 levels deep. In the while loop we use bash expansion to convert all forward slashes to "-" and append ".zip", so that we can build a zip command for each directory. Once you are happy that the echoed zip command looks right for each directory, uncomment the actual zip command.
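For example, taking one of the directory names above, the expansion flattens the path into the zip file name:
line="data/A/A_1/A_11"
echo "${line//\//-}.zip"    # prints: data-A-A_1-A_11.zip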

Iterate through files in sub-folders inside a main folder in bash [duplicate]

The question is really simple: I know how to do it in Python, but I want to do it in the Linux shell (bash).
I have a main folder Dataset, inside which there are multiple sub-folders, Dataset_FinalFolder_0_10 all the way up to Dataset_FinalFolder_1090_1100, each with 10 files.
I want to run a program on each of those files. In Python I would do this with something like:
for folder in /path/to/folders:
    for file in folder:
        run program
Is there any way to mimic this in shell / bash?
I have this code which I have used for more direct iterations:
for i in /path/to/folder/*; do
    program "$i"
done
Thanks in advance
If you are sure that there are no files mixed in with the folders, and no folders mixed in with the files:
for folder in /path/to/Dataset/*; do
    for file in "$folder"/*; do
        program "$file"
    done
done
Alternatively, it is possible to give more than one *:
for file in /path/to/Dataset/*/*; do
    program "$file"
done
If you aren't sure about the folder contents, then find can help. This example selects files in just the first-level subdirectories of the given folder, and xargs calls program once for each one (the -print0 / -0 pair keeps file names containing spaces intact):
find /path/to/Dataset/ -mindepth 2 -maxdepth 2 -type f -print0 |
    xargs -0 -n1 program
The find method may also be useful if .../*/*/*/... could expand to a huge number of paths. On Linux, the command-line length limit is shown by:
getconf ARG_MAX
On my machine that is 2^21 (~2 million) characters. So the limit is high, but it's worth keeping in the back of your mind that there is one.
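As a rough sanity check (it counts only the characters in the expanded paths, not the environment or argument separators), you can compare a glob against that limit:
printf '%s ' /path/to/Dataset/*/* | wc -c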
From the Linux perspective, you have to watch out for properly escaping spaces, newlines, etc., which can get kinda funky. There are multiple references explaining why not to do it that way - see
http://mywiki.wooledge.org/ParsingLs
And
https://unix.stackexchange.com/questions/128985/why-not-parse-ls-and-what-do-to-instead
That said...
You can always use the find command with the -exec option -
find /path/to/top/level -type f -exec /path/to/processing/program {} \;
The \; at the end is required to indicate the end of the -exec command.
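If the program can accept several file names per run, ending the -exec with + instead of \; batches the arguments and spawns far fewer processes:
find /path/to/top/level -type f -exec /path/to/processing/program {} +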
You don't need nested loops in either Python or the shell unless you have so many files that you are running up against "argument list too long" errors.
for file in /path/to/folders/*/*; do
    program "$file"
done
This is equivalent to the Python code
from glob import glob
from subprocess import run
for file in glob('/path/to/folders/*/*'):
    run(['program', file])
Of course, if program is at all competently written, you can probably simply do
program /path/to/folders/*/*
This corresponds to
run(['program'] + glob('/path/to/folders/*/*'))
If program accepts a list of file name arguments, but you do need to break up the command line to avoid "argument list too long" errors, try
printf '%s\0' /path/to/folders/*/* |
xargs -r0 program
(The zero-terminator option -0 is a GNU xargs extension, as is the -r option.)
shopt -s globstar    # Bash 4+: needed for ** to match recursively
for dir in ./* ./**/*    # list entries in the current directory and below
do
    python "$dir"
done
./* matches the files in the current directory, and ./**/* matches the files in subfolders.
Make sure you have only Python files in your directory, as it will run all the files in that directory.
Actually, I have already answered it here:
Iterate shell script over list of subdirectories

Batch file convert to Linux script

Due to migrating a batch job to a Linux server, I'm having trouble finding the equivalent of the following commands in Linux:
The Y drive is a mapped drive to the NAS drive, which is also connected to the Ubuntu server at /NAS/CCTV. I need to search every subfolder for all .264 files.
The Z drive is on the Ubuntu server itself. Every .mp4 file just needs to be moved here, with no folders. The path on Ubuntu is /Share/CCTV/.
It's just a simple script to convert the CCTV capture .264 format to mp4, move the result to the server to be processed, and delete off any h264 files and any folders older than 1 day; the script will be scheduled to run every 3 minutes.
I have ffmpeg installed on the Ubuntu server; I'm just unable to find the "for each file in the folders" equivalent to do the same.
The same goes for the last forfiles command, which deletes folders older than 1 day.
FOR /r y:\ %%F in (*.h264) do c:\scripts\ffmpeg -i %%F %%F.mp4
FOR /r y:\ %%F in (*.h264) do del %%F
FOR /r y:\ %%G in (*.mp4) do move %%G Z:\
forfiles -p "Y:\" -d -1 -c "cmd /c IF #isdir == TRUE rd /S /Q #path"
I'd appreciate any form of help, or a pointer to the right guide, so I can rewrite it on the Linux server. I did try searching for "for loop", but everything showed me how to count numbers; maybe I searched wrongly.
Find all .h264 files (recursively)
find /NAS/CCTV -type f -name '*.h264'
Convert all such files to .mp4
while IFS= read -d '' -r file ; do
    ffmpeg -nostdin -i "$file" "$file".mp4    # -nostdin stops ffmpeg swallowing the loop's input
done < <(find /NAS/CCTV -type f -name '*.h264' -print0)
Note that this will create files called like filename.h264.mp4. This matches your batch file behavior. If you would prefer to replace the extension, use ffmpeg -i "$file" "${file%.*}".mp4 instead and you will get a name like filename.mp4.
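To see the difference between the two forms with a hypothetical file name:
file=/NAS/CCTV/cam1/clip.h264     # hypothetical path
echo "$file".mp4                  # /NAS/CCTV/cam1/clip.h264.mp4
echo "${file%.*}".mp4             # /NAS/CCTV/cam1/clip.mp4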
Also move those mp4 files to another directory
while IFS= read -d '' -r file ; do
    ffmpeg -nostdin -i "$file" "$file".mp4
    if [[ -f $file.mp4 ]] ; then
        mv -f -- "$file".mp4 /Share/CCTV
    fi
done < <(find /NAS/CCTV -type f -name '*.h264' -print0)
Delete old directories (recursively)
find /NAS/CCTV -mindepth 1 -type d -not -newermt '1 day ago' -exec rm -rf {} +    # -mindepth 1 protects /NAS/CCTV itself
Documentation.
The find command recursively lists files according to criteria you specify. Any time you need to deal with files in multiple directories or very large numbers of files it is probably what you want to use. For safety against malicious file names it's important to -print0 so file names are delimited by null rather than newline, which then requires using the IFS= read -d '' construct to interpret later.
The while read variable ; do ... done construct reads data from input and assigns each record to the named variable. This allows each matching file to be handled one at a time inside the loop. The insides of the loop should be fairly obvious.
Again find is used to select files, but in this case the files are directories. The test -not -newermt selects files which are not newer (in other words, which are older): the m means the comparison is against the modification time, and the t means the next argument is text describing a time. Here you can use any expression understood by GNU date's -d switch, so you can write plain English and it will work as expected.
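The dates here are placeholders, but a dry run without -exec is an easy way to confirm what a given phrase matches:
find /NAS/CCTV -type d -not -newermt '3 days ago'    # directories older than 3 days
find /NAS/CCTV -type d -not -newermt '2019-01-01'    # an absolute date works too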
As you embark on your shell scripting journey you should keep two things by your side:
shellcheck - Always run scripts you write through shellcheck to catch basic errors.
Bash FAQ - The bash FAQ at wooledge.org. Most of the answers to questions you have not thought of yet will be here. For example FAQ 15 is highly relevant to this question.
for f in /NAS/CCTV/*.h264; do ffmpeg -i "$f" "$f".mp4; done
rm /NAS/CCTV/*.h264
mv /NAS/CCTV/*.mp4 /Share/CCTV
find /NAS/CCTV/ -mindepth 1 -type d -ctime +1 -exec rm -rf {} \;

sed not working as expected, but only for directory depth greater than 1

I am trying to find all instances of a string in all files on my system up to a specified directory depth. I then want to replace these with another string and I am using 'find' and 'sed' by piping one into the other.
This works when I use a base path such as cd /home/../.. or any other directory which isn't "/". It also only works if I select a directory depth of 1 (so /test.txt is changed, but /home/test.txt isn't). If I change nothing else and use, say, a depth of 2 or 3, neither /test.txt nor /home/test.txt is changed: in the former case no warnings appear, and in the latter I get the results below (and no strings are replaced in either of the files).
Worryingly, it did work once out of the blue, but I have no idea how, and I can't recreate the results. I should say I know the risks of using these commands as root from the base directory, and the specific use of the programs below is intentional, so I am not looking for an alternative way, just a clue as to why this isn't working and perhaps a suggestion on how to fix it.
cd /;find . -maxdepth 3 -type f -print0 | xargs -0 sed -i 's/teststring123/itworked/gI'
sed: couldn't open temporary file ./sys/kernel/sedoPGqGB: No such file or directory
sed: couldn't open temporary file ./proc/878/sedtqayiq: No such file or directory
As you can see, there are warnings, but nevertheless I would expect it to work; the commands appear good. Anything I am missing, folks?
This should be:
find / -maxdepth 3 -type f -print -exec sed -i -e 's/teststring123/itworked/g' {} \;
Although changing all files below / strikes me as a very bad idea indeed (I hope you're not running as root!).
The "couldn't open temporary file ./[...]" errors are likely to be because sed, running as your user, doesn't have permission to create files in /.
My version runs from your current working directory (I assume your ${HOME}), where you'll be able to create the temporary file; but you're still unlikely to be able to replace those files vital to the continued running of your operating system.
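If running against / really is required, pruning the virtual filesystems sidesteps those temporary-file errors; a sketch (extend the prune list as needed, e.g. with /dev and /run):
find / -maxdepth 3 \( -path /proc -o -path /sys \) -prune -o \
    -type f -exec sed -i -e 's/teststring123/itworked/g' {} \;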

Unix: traverse a directory

I need to traverse a directory, starting in one directory and going deeper into different subdirectories. However, I also need access to each individual file in order to modify it. Is there already a command to do this, or will I have to write a script? Could someone provide some code to help me with this task? Thanks.
The find command is just the tool for that. Its -exec flag or -print0 in combination with xargs -0 allows fine-grained control over what to do with each file.
Example: Replace all foo's by bar's in all files in /tmp and subdirectories.
find /tmp -type f -exec sed -i -e 's/foo/bar/g' '{}' ';'
for i in $(find); do
    if [ -d "$i" ]; then :; fi    # do something with a directory
    if [ -f "$i" ]; then :; fi    # do something with a file, etc.
done
This will return the whole tree (recursively) under the current directory as a list that the loop will go through.
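Bear in mind that for i in $(find) word-splits on whitespace, so names containing spaces or newlines will break the loop; a null-safe version of the same idea:
while IFS= read -r -d '' i; do
    if [ -d "$i" ]; then :; fi    # do something with a directory
    if [ -f "$i" ]; then :; fi    # do something with a file, etc.
done < <(find . -print0)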
This can be easily achieved by mixing find, xargs, sed (or other file modification command).
For example:
$ find /path/to/base/dir -type f -name '*.properties' | xargs sed -i -e '/^#/d'
This will filter all files with the file extension .properties.
The xargs command feeds the file paths generated by the find command into the sed command.
The sed command deletes all lines starting with # in those files (as fed in by xargs).
Combining commands in this way is very flexible.
For example, the find command has many different parameters, so you can filter by user name, file size, file path (e.g. under a /test/ subfolder), or file modification time.
Another dimension of flexibility is how and what to change in your files. For example, the sed command allows you to make changes to a file by applying substitutions (specified via regular expressions). Similarly, you could use gzip to compress the file. And so on...
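For instance, the same find | xargs pattern with gzip in place of sed, compressing (hypothetically) all .log files untouched for more than 30 days:
find /path/to/base/dir -type f -name '*.log' -mtime +30 | xargs -r gzip
# -r (a GNU xargs extension) skips the run entirely when nothing matches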
You would usually use the find command. On Linux, you have the GNU version, of course; it has many extra (and useful) options. Both versions will allow you to execute a command (e.g. a shell script) on the files as they are found.
The exact details of how to make changes to the file depend on the change you want to make to the file. That is probably best scripted, with find running the script:
POSIX or GNU:
find . -type f -exec your_script '{}' +
This will run your script once for a group of files, with those names provided as arguments. If you want to run it one file at a time, replace the + with ';' (or \;).
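Side by side:
find . -type f -exec your_script '{}' ';'    # one invocation per file
find . -type f -exec your_script '{}' +      # one invocation per batch of files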
I am assuming SearchMe is the example directory name you need to traverse completely.
I am also assuming, since it was not specified, that the files you want to modify are all text files. Is this correct?
In such scenario I would suggest using the command:
find SearchMe -type f -exec vi {} \;
If you are not familiar with vi editor, just use another one (nano, emacs, kate, kwrite, gedit, etc.) and it should work as well.
Bash 4+
shopt -s globstar
for file in **
do
    if [ -f "$file" ]; then
        :    # do some processing to your file here,
             # where the find command can't do conveniently
    fi
done
