Concatenate (using bash) all file names in subdirectories with option - linux

I have a directory work_dir with some subdirectories inside, and inside the subdirectories there are zip archives. I can see all the zip archives in the terminal:
find . -name *.zip
The output:
./folder2/sub/dir/test2.zip
./folder3/test3.zip
./folder1/sub/dir/new/test1.zip
Now I want to concatenate all these file names into a single line, each preceded by some option. For example, I want a single line like:
my_command -f ./folder2/sub/dir/test2.zip -f ./folder3/test3.zip -f ./folder1/sub/dir/new/test1.zip -u user1 -p pswd1
In this example:
my_command is some command
-f the option
-u user1 another option with value
-p pswd1 another option with value
Can you help me please: how can I do this in Linux Bash?

One way is (updated per M. Nejat Aydin's comments):
find . -name "*.zip" -print0 | xargs -0 -n1 printf -- '-f\0%s\0' | xargs -0 -n100000 my_command -u user1 -p pswd1
Note that the -n100000 parameter forces all of the output of the previous xargs onto a single command line, on the assumption that the number of files found will be less than 100000.
I used null terminated versions (notice: -0 flag, -print0) because file names can contain spaces.
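If you want to preview the assembled command line before running it for real, one hedged trick is to put echo in front of my_command (my_command, user1 and pswd1 are the placeholders from the question):
find . -name "*.zip" -print0 | xargs -0 -n1 printf -- '-f\0%s\0' | xargs -0 -n100000 echo my_command -u user1 -p pswd1
This prints the whole command on one line; the -u/-p pair comes before the -f options, which matters only if my_command is picky about option order.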

This is a bash script that should do what you wanted.
#!/usr/bin/env bash
user=user1
passwd=pswd1
while IFS= read -rd '' file; do
    args+=(-f "$file")
done < <(find . -name '*.zip' -print0)
args=("${args[@]}" -u "$user" -p "$passwd")
##: Just for the human eye to see the output;
##: change this line of code according to the comment below.
printf 'mycommand %s\n' "${args[*]}"
The output will be on one line, like what you wanted. If you actually want to execute mycommand with the arguments, change the last line from
printf 'mycommand %s\n' "${args[*]}"
into
mycommand "${args[@]}"
Change the values of user and passwd too.
A while + read loop with IFS was used to read the null-delimited find output; see "How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?".
The last line needs to be changed if you actually want to run the command; see "Arguments".
Shell quoting is a basic but common source of mistakes when dealing with spaces in file/path names; see "How can I find and safely handle file names containing..." and the documentation of the find utility.
The construct "${args[@]}" is an array expansion; see the Bash documentation on arrays.
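As a small hedged illustration of why the printf line uses "${args[*]}" for display while the real invocation should use "${args[@]}" (the file names here are made up):
args=(-f "dir one/a.zip" -f "dir two/b.zip" -u user1 -p pswd1)
printf 'mycommand %s\n' "${args[*]}"   # joins everything into one word: good for display only
printf '[%s]\n' "${args[@]}"           # one line per argument: this is what mycommand would actually receive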

You can do this by making a bash script.
Make a new file called whatever.sh
Run chmod +x ./whatever.sh so it becomes executable from the terminal.
Add the Bash script shown below.
#!/bin/bash
# Get all the zip files from your FolderName
files="$(find ./FolderName -name '*.zip')"
# Loop through the files and build your args
arg=""
for file in $files; do
    arg="$arg -f $file"
done
# Run your command
mycommand $arg -u user1 -p pswd1

Related

BASH: grep doesn't work in shell script but echo shows correct command and it works on command line

I need to write a script that checks some >20k files for some >2k search strings, and it needs to be flexible, so I came up with this script:
#!/bin/bash
# This script checks all files in a given directory against a list of criteria
shopt -s expand_aliases
source ~/.bashrc
TIMESTAMP=$(date "+%Y-%m-%d-%T")
ROOT_DIR=/data
PROJECT_NAME=$1
FILE_DIR=$ROOT_DIR/projects/$1/$2
RESULT_DIR=$ROOT_DIR/projects/$1/check_result
SEARCHTEXT_FILE=$ROOT_DIR/scripts/$3
OIFS="$IFS"
IFS=$'\n'
files=$(find $FILE_DIR -type f -name '*.json')
for file in $files; do
    while read line; do
        grep -H -o $line "$file" >> $RESULT_DIR/check_result_$TIMESTAMP.log
    done < $SEARCHTEXT_FILE
done
IFS="$OIFS"
This script only produces the empty $RESULT_DIR/check_result_$TIMESTAMP.log log file with correct name.
Because the file names sometimes contain spaces I added the IFS... statements and I enclosed $file in " quotes (copied from another post).
The content of the $SEARCHTEXT_FILE is for example:
'Tel alt........'
'City ..........'
If I place an echo before the grep like this
echo grep -H -o $line "$file"
then output I get is
grep -H -o 'Tel alt........' /data/projects/DNAR/input/report-157538.json
and I can execute this line as is and get the correct result.
I tried to put various combinations of " or ' or ` or () or {} around any part of this grep command but nothing changed.
Somewhere I read about aliases, and the alias set for grep is
alias grep='grep --color=auto'
After many hours of searching on the internet I couldn't find any post that helped me as most of them are covering issues around wrong quotes or inline bash issues.
What am I missing here?
The simple and obvious workaround is to remove all that complexity and simply use the features of the commands you are running anyway.
find "$FILE_DIR" -type f -name '*.json' \
-exec grep -H -o -f "$SEARCHTEXT_FILE" {} + > "$RESULT_DIR/check_result_$TIMESTAMP.log"
Notice also the quoting fixes (see "When to wrap quotes around a shell variable"). To avoid mishaps, you should also switch to lower case for your private variables (see "Correct Bash and shell script variable capitalization").
The shopt -s expand_aliases and source ~/.bashrc lines look merely superfluous, but they could contribute to whatever problem you are trying to troubleshoot; they should basically never be part of a script you plan to use in production.
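Putting those suggestions together, a minimal sketch of how the whole script might look (the directory layout and positional arguments are taken from the question; only the variables that are still used are kept):
#!/bin/bash
timestamp=$(date "+%Y-%m-%d-%T")
root_dir=/data
file_dir=$root_dir/projects/$1/$2
result_dir=$root_dir/projects/$1/check_result
searchtext_file=$root_dir/scripts/$3
find "$file_dir" -type f -name '*.json' \
    -exec grep -H -o -f "$searchtext_file" {} + > "$result_dir/check_result_$timestamp.log"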

Deleting all files except ones mentioned in config file

Situation:
I need a bash script that deletes all files in the current folder, except the files mentioned in a file called ".rmignore". This file may contain paths relative to the current folder, which might also contain asterisks (*). For example:
1.php
2/1.php
1/*.php
What I've tried:
I tried to use GLOBIGNORE but that didn't work well.
I also tried to use find with grep, as follows:
find . | grep -Fxv $(echo $(cat .rmignore) | tr ' ' "\n")
It is considered bad practice to pipe the output of find to another command. You can use -exec or -execdir followed by the command, with '{}' as a placeholder for the file and ';' marking the end of the command; you can also use '+' instead of ';' to pass as many files as possible to a single invocation, as sketched below.
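For illustration, a hedged sketch of the two -exec terminators, using ls as a stand-in command:
# one ls invocation per matching file
find . -type f -name '*.php' -exec ls -l '{}' \;
# one ls invocation for as many files as fit on a command line
find . -type f -name '*.php' -exec ls -l '{}' +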
In your case, you want to list all the content of a directory and remove files one by one.
#!/usr/bin/env bash
set -o nounset
set -o errexit
shopt -s nullglob # allows a glob to expand to nothing if there is no match
shopt -s globstar # lets ** recurse into the current directory
my:rm_all() {
    local ignore_file=".rmignore"
    local ignore_array=()
    while read -r glob; do # Generate the list of entries to keep
        ignore_array+=(${glob});
    done < "${ignore_file}"
    echo "${ignore_array[@]}"
    for file in **; do # iterate over all the content of the current directory
        if [ -f "${file}" ]; then # entry exists and is a regular file
            local do_rmfile=true;
            # Remove only if it matches none of the ignore entries
            for ignore in "${ignore_array[@]}"; do # Iterate over files to keep
                [[ "${file}" == "${ignore}" ]] && do_rmfile=false; #rm ${file};
            done
            ${do_rmfile} && echo "Removing ${file}"
        fi
    done
}
my:rm_all;
If we assume that none of the files listed in .rmignore contain newlines in their names, the following might suffice:
# Gather our exclusions...
mapfile -t excl < .rmignore
# Reverse the array (put data in indexes)
declare -A arr=()
for file in "${excl[@]}"; do arr[$file]=1; done
# Walk through files, deleting anything that's not in the associative array.
shopt -s globstar
for file in **; do
    [ -n "${arr[$file]}" ] && continue
    echo rm -fv "$file"
done
Note: untested. :-) Also, associative arrays were introduced with Bash 4.
An alternate method might be to populate an array with the whole file list, then remove the exclusions. This might be impractical if you're dealing with hundreds of thousands of files.
shopt -s globstar
declare -A filelist=()
# Build a list of all files...
for file in **; do filelist[$file]=1; done
# Remove files to be ignored.
while read -r file; do unset filelist[$file]; done < .rmignore
# And ... delete.
echo rm -v "${!filelist[@]}"
Also untested.
Warning: rm at your own risk. May contain nuts. Keep backups.
I note that neither of these solutions will handle wildcards in your .rmignore file. For that, you might need some extra processing...
shopt -s globstar
declare -A filelist=()
# Build a list...
for file in **; do filelist[$file]=1; done
# Remove PATTERNS...
while read -r glob; do
    for file in $glob; do
        unset filelist[$file]
    done
done < .rmignore
# And remove whatever's left.
echo rm -v "${!filelist[@]}"
And .. you guessed it: untested. This depends on $glob expanding as a glob.
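As a quick hedged illustration of that dependence (the pattern 1/*.php is taken from the question's example .rmignore):
glob='1/*.php'
for file in $glob; do   # deliberately unquoted so the shell expands the pattern
    echo "would unset filelist[$file]"
done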
Lastly, if you want a heavier-weight solution, you can use find and grep:
find . -type f -not -exec grep -q -f '{}' .rmignore \; -delete
This runs a grep for EACH file being considered. It's not really a bash solution; it relies only on find, which is pretty universal.
Note that ALL of these solutions are at risk of errors if you have files that contain newlines.
This line does the job perfectly:
find . -type f | grep -vFf .rmignore
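If you want to act on that listing rather than just print it, one hedged extension (assuming GNU xargs and no newlines in the file names) is:
find . -type f | grep -vFf .rmignore | xargs -d '\n' echo rm -v
The echo is left in deliberately; drop it only after checking the output.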
If you have rsync, you might be able to copy an empty directory to the target one, with suitable rsync ignore files. Try it first with -n, to see what it will attempt, before running it for real!
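For example, a hedged sketch of that rsync approach (the empty directory path is arbitrary; keep the -n until the output looks right):
mkdir -p /tmp/empty
rsync -a -n -v --delete --exclude-from=.rmignore /tmp/empty/ ./
With --delete, files missing from the (empty) source are removed from the destination, except those matched by the exclude patterns.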
This is another bash solution that seems to work ok in my tests:
while read -r line; do
    exclude+=$(find . -type f -path "./$line")$'\n'
done < .rmignore
echo "ignored files:"
printf '%s\n' "$exclude"
echo "files to be deleted"
echo rm $(LC_ALL=C sort <(find . -type f) <(printf '%s\n' "$exclude") | uniq -u) # intentionally unquoted so the newlines split into separate words
Alternatively, you may want to look at the simplest format:
rm $(ls -1 | grep -v .rmignore)

Linux: Update directory structure for millions of images which are already in prefix-based folders

This is basically a follow-up to Linux: Move 1 million files into prefix-based created Folders
The original question:
I want to write a shell command to rename all of those images into the
following format:
original: filename.jpg new: /f/i/l/filename.jpg
Now, I want to take all of those files and add an additional level to the directory structure, e.g:
original: /f/i/l/filename.jpg new: /f/i/l/e/filename.jpg
Is this possible to do with command line or bash?
One way to do it is to simply loop over all the directories you already have, and in each bottom-level subdirectory create the new subdirectory and move the files:
for d in ?/?/?/; do (
    cd "$d" &&
    printf '%.4s\0' * | uniq -z |
        xargs -0 bash -c 'for prefix do
            s=${prefix:3:1}
            mkdir -p "$s" && mv "$prefix"* "$s"
        done' _
) done
That probably needs a bit of explanation.
The glob ?/?/?/ matches all directory paths made up of three single-character subdirectories. Because it ends with a /, everything it matches is a directory so there is no need to test.
( cd "$d" && ...; )
executes ... after cd'ing to the appropriate subdirectory. Putting that block inside ( ) causes it to be executed in a subshell, which means the scope of the cd will be restricted to the parenthesized block. That's easier and safer than putting cd .. at the end.
We then collect the subdirectory names first, by finding the unique initial strings of the file names:
printf '%.4s\0' * | uniq -z | xargs -0 ...
That extracts the first four letters of each filename, nul-terminating each one, then passes this list to uniq to eliminate duplicates, providing the -z option because the input is nul-terminated, and then passes the list of unique prefixes to xargs, again using -0 to indicate that the list is nul-terminated. xargs executes a command with a list of arguments, issuing the command several times only if necessary to avoid exceeding the command-line limit. (We probably could have avoided the use of xargs but it doesn't cost that much and it's a lot safer.)
The command called with xargs is bash itself; we use the -c option to pass it a command to be executed. That command iterates over its arguments by using the for arg in syntax. Each argument is a unique prefix; we extract the fourth character from the prefix to construct the new subdirectory and then mv all files whose names start with the prefix into the newly created directory.
The _ at the end of the xargs invocation will be passed to bash (as with all the rest of the arguments); bash -c uses the first argument following the command as the $0 argument to the script, which is not part of the command line arguments iterated over by the for arg in syntax. So putting the _ there means that the argument list constructed by xargs will be precisely $1, $2, ... in the execution of the bash command.
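A tiny hedged demonstration of that $0 behaviour, using throwaway arguments:
bash -c 'echo "zeroth argument: $0"; for prefix do echo "loop argument: $prefix"; done' _ abcd abce
# prints:
#   zeroth argument: _
#   loop argument: abcd
#   loop argument: abce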
Okay, so I've created a very crude solution:
#!/bin/bash
for file1 in *; do
    if [[ -d "$file1" ]]; then
        cd "$file1"
        for file2 in *; do
            if [[ -d "$file2" ]]; then
                cd "$file2"
                for file3 in *; do
                    if [[ -d "$file3" ]]; then
                        cd "$file3"
                        for file4 in *; do
                            if [[ -f "$file4" ]]; then
                                echo "mkdir -p ${file4:3:1}/; mv $file4 ${file4:3:1}/;"
                                mkdir -p "${file4:3:1}/"; mv "$file4" "${file4:3:1}/"
                            fi
                        done
                        cd ..
                    fi
                done
                cd ..
            fi
        done
        cd ..
    fi
done
I should warn that this is untested, as my actual structure varies slightly, but I wanted to keep the question/answer consistent with the original question for clarity.
That being said, I'm sure a much more elegant solution exists than this one.

Read txt file and parse the values to bash script

I have the following bash script:
#!/bin/bash
filename='config.txt'
while filename = read -r line do
for file in $(find /home/user/ftpuser -maxdepth 1 -name "*.[ew]ar" -type f); do
/apps/oracle/jrockit/4.1.0-1.6.0_37-R28.2.5-x86_64/bin/java -jar ../windup-cli-0.6.8/windup-cli.jar -javaPkgs com.lib - input ../ftpuser/ -output ../reports/ "${file}"
cp "${file}" /home/user/ftpuser/scanned/
done < "$filename"
sleep 60
done
The script needs to find two types of files in a directory: .ear and .war files. Once it finds them, it executes a command that generates reports for those files into a directory called /reports. The next step is to copy all the files found in step one to a directory called scanned. My problem concerns the command I am executing to make the reports. Somewhere in this command there is -javaPkgs com.lib, and I need to read this value from a configuration file. The script needs to read the values from the configuration file and use them, so that each time we change the values in the configuration file we can execute the script with different values. My question is: how can I do this? I tried to do something above, but it doesn't work.
Below you can also see the configuration file.
config.txt
targetHostName=10.125.162.132
packages=com.ibm,com.jboss
path=/home/user/ftpuser/reports
username=root
password=root
The following would give you a list of packages for which you want the reports:
grep "^packages" config.txt | cut -d= -f2 | tr ',' ' '
Based on this, you can loop over the values in the list:
filename="config.txt"
for i in $(grep "^packages" $filename | cut -d= -f2 | tr ',' ' '); do
    for file in $(find /home/user/ftpuser -maxdepth 1 -name "*.[ew]ar" -type f); do
        echo /apps/oracle/jrockit/4.1.0-1.6.0_37-R28.2.5-x86_64/bin/java -jar ../windup-cli-0.6.8/windup-cli.jar -javaPkgs ${i} - input ../ftpuser/ -output ../reports/ "${file}"
        cp "${file}" /home/user/ftpuser/scanned/
    done
done
This is how I see you could use it:
#!/bin/bash
config_file='./config.txt' ## If you want to pass your configuration as an argument, use config_file=$1
. "$config_file"
while read -r file; do
    /apps/oracle/jrockit/4.1.0-1.6.0_37-R28.2.5-x86_64/bin/java -jar ../windup-cli-0.6.8/windup-cli.jar -javaPkgs com.lib - input ../ftpuser/ -output ../reports/ "${file}"
    cp "${file}" /home/user/ftpuser/scanned/
    sleep 60
done < <(exec find /home/user/ftpuser -maxdepth 1 -name '*.[ew]ar' -type f)
The script would read config.txt as another source file, assigning values to variables. With that you could already use those variables as $targetHostName, $packages, $path, $username and $password.
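For example, a quick hedged check that sourcing really does populate the variables (run in the directory containing config.txt):
. ./config.txt
echo "$targetHostName"   # prints 10.125.162.132
echo "$packages"         # prints com.ibm,com.jboss
The hard-coded com.lib in the java invocation could then be replaced with "$packages", assuming windup-cli accepts a comma-separated package list; if it does not, convert the commas to spaces as shown in the first answer.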

Escaping space in bash script

I am trying to make a script that passes every file ending with .hash to md5deep to be used for verification. Files with spaces in their names seem to break this script.
#!/bin/bash
XVAR=""
for f in *.hash
do
    XVAR="$XVAR -x $f "
done
md5deep -e $XVAR -r *
Whenever I run the script with a file called "O S.hash" I get:
O: No such file or directory
If I change XVAR="$XVAR -x $f " to XVAR="$XVAR -x \'$f\' " or XVAR="$XVAR -x \"$f\" ",
md5deep will interpret the input as "O instead:
"O: No such file or directory
An echo of the variable in the script shows XVAR as -x 'O S.hash' or -x "O S.hash".
Typing the command manually in the shell, such as md5deep -e -x "O S.hash" -r *, works, but when it is in the script the command seems to break.
This is not the nicest solution, but it seems it will work:
find . -name '*.hash' -printf "-x\0%p\0" | xargs -0 md5deep -r * -e
This actually doesn't do exactly the same as the OP wanted, so here's a modification as suggested by Tim Pote and Jonathan Leffler:
find . -maxdepth 1 -name '*.hash' -printf "-x\0%p\0" | xargs -0 md5deep -r * -e
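To see exactly which arguments that pipeline hands to md5deep, one hedged check is to send them to printf instead (each bracketed line is one argument):
find . -maxdepth 1 -name '*.hash' -printf "-x\0%p\0" | xargs -0 printf '[%s]\n'
With a file named O S.hash this prints [-x] and [./O S.hash] on separate lines, showing the name stays intact as a single argument.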
Now you know why people on Unix systems traditionally avoided file names with spaces in them (and directory names likewise) — it is a nuisance (to be polite about it) to have to program the shell to handle such names. The shell was designed for use in systems without such names. Newlines also cause much grief.
With bash, your best solution by far is to use an array to hold the elements, and then "${array[#]}" to list them; it is almost trivial:
declare -a XVAR
for file in *.hash
do
XVAR+=("-x" "$file")
done
md5deep -e "${XVAR[#]}" -r *
(Exploiting the array extension notation mentioned by Gordon Davisson. See section §6.7 'Arrays' of the bash reference manual (for Bash 4.1) for a lot of array information; see section §3.4 'Shell Parameters' for the += operator.)
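A small hedged illustration of how the array keeps the space inside the name intact (using the file name from the question):
declare -a XVAR
XVAR+=("-x" "O S.hash")
printf '[%s]\n' "${XVAR[@]}"
# prints:
#   [-x]
#   [O S.hash]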
If you can't use arrays for some reason, then you need a program that escapes its arguments so that the shell won't distort things. I have such a program, called escape:
XVAR=
for file in *.hash
do
    name=$(escape "$file")
    XVAR="$XVAR -x $name"
done
eval md5deep -e $XVAR -r *
Because of the eval, it is tricky to use; it works, but use arrays if you can.
