Exclude filenames with certain prefix in for loop (globbing)


This is probably quite easy, but I can't figure it out. In a for loop I want to exclude certain files with the prefix zz (e.g. zz131232.JPG) but I don't know how to exclude these files.
for i in *.JPG; do
# do something
done
How do I modify the 'for rule' to exclude files with the prefix zz?

Something like
for i in *.JPG; do
[[ $i != "zz"* ]] && echo "$i"
done
or skip them:
for i in *.JPG; do
[[ $i == "zz"* ]] && continue
# process the other files here
done
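To try the filter without touching real photos, here is a minimal sketch that runs the same test in a scratch directory. The helper name list_non_zz and the sample file names are made up for illustration. The existence check is there because a glob that matches nothing is passed through literally.

```bash
#!/usr/bin/env bash
# list_non_zz is a hypothetical helper: print the .JPG files in directory $1
# whose names do not start with "zz".
list_non_zz() {
  local dir=$1 path name
  for path in "$dir"/*.JPG; do
    [[ -e $path ]] || continue        # unmatched glob stays literal; skip it
    name=${path##*/}                  # strip the directory part
    [[ $name == zz* ]] && continue    # skip the zz-prefixed files
    printf '%s\n' "$name"
  done
}

# Demo in a throwaway directory (the sample names are assumptions).
demo=$(mktemp -d)
touch "$demo/a.JPG" "$demo/b.JPG" "$demo/zz131232.JPG"
kept=$(list_non_zz "$demo" | tr '\n' ' ')
echo "kept: $kept"
rm -rf "$demo"
```

The prefix test is applied to the bare file name, not the full path, so a directory whose name happens to start with zz does not affect the result.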

If you are dealing with many files you can also use GLOBIGNORE or extended globbing to avoid expanding the files you wish to skip in the first place (which might be faster):
GLOBIGNORE='zz*'
for file in *.JPG; do
do_something_with "${file}"
done
# save and restore GLOBIGNORE if necessary
or
shopt -s extglob # enable extended globbing
for file in !(zz*).JPG; do
do_something_with "${file}"
done
shopt -u extglob # disable extended globbing, if necessary
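One way to avoid saving and restoring GLOBIGNORE by hand is to set it inside a subshell, so the change disappears when the subshell exits. The following is only a sketch under that assumption; the scratch directory and file names are invented for the demo.

```bash
#!/usr/bin/env bash
demo=$(mktemp -d)
touch "$demo/a.JPG" "$demo/b.JPG" "$demo/zz1.JPG"

# The $( ... ) subshell keeps both the cd and the GLOBIGNORE assignment local.
kept=$(
  cd "$demo" || exit 1
  GLOBIGNORE='zz*'
  files=(*.JPG)                     # zz1.JPG is dropped during expansion
  printf '%s\n' "${files[*]}"
)
echo "kept: $kept"
echo "GLOBIGNORE in the parent shell: ${GLOBIGNORE-<unset>}"
rm -rf "$demo"
```

Keep in mind that setting GLOBIGNORE to a non-empty value also turns on dotglob behaviour, which is another reason to keep the assignment scoped.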
Note that if there are no .JPG files in the current directory, the loop is still entered and the loop variable ($i in your example) is set to the literal string *.JPG, so you need to either check that the file exists inside the loop or enable the nullglob option.
for file in *.JPG; do
[ -e "${file}" ] || continue
do_something_with "${file}"
done
or
shopt -s nullglob
for file in *.JPG; do
do_something_with "${file}"
done
shopt -u nullglob # if necessary
Try the following in your shell to understand what happens without nullglob:
$ for f in *.doesnotexist; do echo "$f"; done
*.doesnotexist
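The effect of nullglob can also be checked by counting matches. This is a small sketch, not a required pattern: count_jpgs is a hypothetical helper whose body runs in a subshell, so the shopt and cd calls do not leak into the calling shell.

```bash
#!/usr/bin/env bash
# count_jpgs is a hypothetical helper; the ( ... ) body is a subshell,
# so nullglob and cd stay local to the function.
count_jpgs() (
  shopt -s nullglob
  cd "$1" || exit 1
  files=(*.JPG)                     # expands to nothing when there is no match
  echo "${#files[@]}"
)

demo=$(mktemp -d)
empty_count=$(count_jpgs "$demo")   # directory is still empty here
touch "$demo/one.JPG"
one_count=$(count_jpgs "$demo")
echo "before: $empty_count, after: $one_count"
rm -rf "$demo"
```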

Related

How to delete multiple same characters from a file using bash script?

I have this script which removes the first period from a file name in order to unhide the file. However, it only removes the first period and not the ones that follow, so the file stays hidden. What I want to know is how I can remove more than one period at a time to unhide the file.
#!/bin/bash
period='.'
for i in $@; do
if [[ "$i" == "$period"* ]] ; then
mv "$i" "${i#"$period"}"
else
mv $i .${i}
fi
done
I have some knowledge of grep and regex, so I thought of using + to remove several of them at a time, but I can't figure out whether that is the correct way to go about it.

You can use the bash extended glob +(pattern) to match one or more periods, combined with the ## parameter expansion to remove the longest leading match:
#!/usr/bin/env bash
# Turn on extended globs
shopt -s extglob
name=...foo
printf "%s -> %s\n" "$name" "${name##+(.)}"
Or you can use a regular expression, combining looking for leading periods with capturing the rest of the name:
#!/usr/bin/env bash
# Note the quoted parameters to avoid many issues.
for i in "$@"; do
if [[ "$i" =~ ^\.+(.*) ]]; then
mv "$i" "${BASH_REMATCH[1]}"
else
mv "$i" ".${i}"
fi
done
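The extended-glob expansion can be wrapped in a small function so it is easy to test on its own. This is a sketch; strip_leading_dots is a hypothetical name, and extglob has to be enabled before the function is parsed.

```bash
#!/usr/bin/env bash
shopt -s extglob   # must be enabled before the function below is parsed

# strip_leading_dots is a hypothetical helper: print $1 without its
# leading run of periods.
strip_leading_dots() {
  printf '%s\n' "${1##+(.)}"
}

visible=$(strip_leading_dots '...foo')       # leading dots removed
unchanged=$(strip_leading_dots 'bar.txt')    # inner dots are left alone
echo "$visible $unchanged"
```

A name that consists only of periods becomes an empty string, so a real rename loop would need to check for that before calling mv.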

bash loop through files that have no extension

related to: Loop through all the files with a specific extension
I want to loop through files that matches a pattern:
for item in ./bob* ; do
echo $item
done
I have a file list like:
bob
bobob
bobob.log
I only want to list files that have no extension:
bob
bobob
What is the best way to achieve this? Can I do it in the loop somehow, or do I need an if statement within the loop?

In bash you can use extended globbing:
shopt -s extglob
for item in ./bob!(*.*) ; do
echo $item
done
You can put shopt -s extglob in your .bashrc file to enable it.

Recent Bash versions have regular expression support:
for f in *
do
if [[ "$f" =~ .*\..* ]]
then
: ignore
else
echo "$f"
fi
done
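If neither extglob nor regular expressions are wanted, a plain pattern match inside [[ ]] can do the same test. The sketch below assumes that "has an extension" simply means "contains a dot"; has_extension and the sample files are made up for the demo.

```bash
#!/usr/bin/env bash
# has_extension is a hypothetical helper: succeeds if the name contains a dot.
has_extension() {
  [[ $1 == *.* ]]
}

demo=$(mktemp -d)
touch "$demo/bob" "$demo/bobob" "$demo/bobob.log"

plain=()
for item in "$demo"/bob*; do
  name=${item##*/}                  # test the file name only: the temporary
                                    # directory path may itself contain a dot
  has_extension "$name" || plain+=("$name")
done
echo "${plain[*]}"
rm -rf "$demo"
```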

Deleting all files except ones mentioned in config file

Situation:
I need a bash script that deletes all files in the current folder, except all the files mentioned in a file called ".rmignore". This file may contain addresses relative to the current folder, that might also contain asterisks(*). For example:
1.php
2/1.php
1/*.php
What I've tried:
I tried to use GLOBIGNORE but that didn't work well.
I also tried to use find with grep, like follows:
find . | grep -Fxv $(echo $(cat .rmignore) | tr ' ' "\n")

Piping the output of find to another command is fragile, because file names that contain newlines or other special characters can break line-based processing. You can use -exec or -execdir followed by the command, with '{}' as a placeholder for the file and ';' to mark the end of the command. You can also terminate the command with '+' instead of ';', which passes as many file names as possible to a single invocation of the command.
In your case, you want to list all the content of a directory and remove the files one by one.
#!/usr/bin/env bash
set -o nounset
set -o errexit
shopt -s nullglob # allows glob to expand to nothing if no match
shopt -s globstar # process recursively current directory
my:rm_all() {
local ignore_file=".rmignore"
local ignore_array=()
while read -r glob; # Generate files list
do
ignore_array+=(${glob});
done < "${ignore_file}"
echo "${ignore_array[@]}"
for file in **; # iterate over all the content of the current directory
do
if [ -f "${file}" ]; # file exist and is file
then
local do_rmfile=true;
# Keep the file if it matches an entry in the ignore list
for ignore in "${ignore_array[@]}"; # Iterate over files to keep
do
[[ "${file}" == "${ignore}" ]] && do_rmfile=false; #rm ${file};
done
${do_rmfile} && echo "Removing ${file}"
fi
done
}
my:rm_all;
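The difference between the two -exec terminators can be seen by counting arguments. This is a sketch in a scratch directory with invented file names: with ';' the command runs once per file, with '+' the file names are batched into as few runs as possible.

```bash
#!/usr/bin/env bash
demo=$(mktemp -d)
touch "$demo/1.php" "$demo/2.php" "$demo/3.php"

# With \; the command runs once per file, so there is one output line per file.
per_file_runs=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l)

# With + the file names are batched, so a single run sees all three arguments.
batched_args=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} +)

echo "runs with ';': $((per_file_runs)), arguments in one run with '+': $batched_args"
rm -rf "$demo"
```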

If we assume that none of the files in .rmignore contain newlines in their name, the following might suffice:
# Gather our exclusions...
mapfile -t excl < .rmignore
# Reverse the array (put data in indexes)
declare -A arr=()
for file in "${excl[@]}"; do arr[$file]=1; done
# Walk through files, deleting anything that's not in the associative array.
shopt -s globstar
for file in **; do
[ -n "${arr[$file]}" ] && continue
echo rm -fv "$file"
done
Note: untested. :-) Also, associative arrays were introduced with Bash 4.
An alternate method might be to populate an array with the whole file list, then remove the exclusions. This might be impractical if you're dealing with hundreds of thousands of files.
shopt -s globstar
declare -A filelist=()
# Build a list of all files...
for file in **; do filelist[$file]=1; done
# Remove files to be ignored.
while read -r file; do unset filelist[$file]; done < .rmignore
# Annd .. delete.
echo rm -v "${!filelist[@]}"
Also untested.
Warning: rm at your own risk. May contain nuts. Keep backups.
I note that neither of these solutions will handle wildcards in your .rmignore file. For that, you might need some extra processing...
shopt -s globstar
declare -A filelist=()
# Build a list...
for file in **; do filelist[$file]=1; done
# Remove PATTERNS...
while read -r glob; do
for file in $glob; do
unset filelist[$file]
done
done < .rmignore
# And remove whatever's left.
echo rm -v "${!filelist[@]}"
And, you guessed it, untested. This depends on $glob expanding as a glob.
Lastly, if you want a heavier-weight solution, you can use find and grep:
find . -type f -not -exec sh -c 'grep -qxF -e "${1#./}" .rmignore' sh {} \; -delete
This runs a grep for EACH file being considered. And it's not a bash solution, it only relies on find which is pretty universal.
Note that ALL of these solutions are at risk of errors if you have files that contain newlines.
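Another way to honour wildcards is to treat each line of .rmignore as a shell pattern and match it against every candidate path. The following dry-run sketch assumes Bash 4 or later (mapfile, globstar); is_ignored and the sample tree are hypothetical, and nothing is actually deleted.

```bash
#!/usr/bin/env bash
shopt -s globstar nullglob

# is_ignored is a hypothetical helper: succeeds if path $1 matches any of the
# remaining arguments, which are treated as shell patterns.
is_ignored() {
  local path=$1 pattern
  shift
  for pattern in "$@"; do
    # Unquoted right-hand side: $pattern is matched as a glob.
    # Note that * here also matches "/", unlike in filename expansion.
    [[ $path == $pattern ]] && return 0
  done
  return 1
}

# Sample tree modelled on the question (names are assumptions).
demo=$(mktemp -d)
mkdir "$demo/1" "$demo/2"
touch "$demo/1.php" "$demo/3.php" "$demo/1/a.php" "$demo/1/b.txt" "$demo/2/1.php"
printf '%s\n' '1.php' '2/1.php' '1/*.php' > "$demo/.rmignore"

to_delete=$(
  cd "$demo" || exit 1
  mapfile -t patterns < .rmignore
  for file in **; do
    [[ -f $file ]] || continue      # skip directories
    is_ignored "$file" "${patterns[@]}" || printf '%s\n' "$file"
  done | tr '\n' ' '
)
echo "would delete: $to_delete"
rm -rf "$demo"
```

Because dotglob is off, ** does not return .rmignore itself, so the ignore file is never a deletion candidate in this sketch.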

This line does the job perfectly:
find . -type f | grep -vFf .rmignore

If you have rsync, you might be able to copy an empty directory to the target one, with suitable rsync ignore files. Try it first with -n, to see what it will attempt, before running it for real!

This is another bash solution that seems to work ok in my tests:
while read -r line;do
exclude+=$(find . -type f -path "./$line")$'\n'
done <.rmignore
echo "ignored files:"
printf '%s\n' "$exclude"
echo "files to be deleted"
echo rm $(LC_ALL=C sort <(find . -type f) <(printf '%s\n' "$exclude") |uniq -u ) #intentionally non quoted to remove new lines

Alternatively, you may want to look at the simplest format:
rm $(ls -1 | grep -v .rmignore)

How can I batch rename multiple images with their path names and reordered sequences in bash?

My pictures are kept in the folder with the picture-date for folder name, for example the original path and file names:
.../Pics/2016_11_13/wedding/DSC0215.jpg
.../Pics/2016_11_13/afterparty/DSC0234.jpg
.../Pics/2016_11_13/afterparty/DSC0322.jpg
How do I rename the pictures into the format below, with continuous sequences and 4-digit padding?
.../Pics/2016_11_13_wedding.0001.jpg
.../Pics/2016_11_13_afterparty.0002.jpg
.../Pics/2016_11_13_afterparty.0003.jpg
I'm using Bash 4.1, so only mv command is available. Here is what I have now but it's not working
#!/bin/bash
p=0
for i in *.jpg;
do
mv "$i" "$dirname.%03d$p.JPG"
((p++))
done
exit 0

Let's say you have something like .../Pics/2016_11_13/wedding/XXXXXX.jpg. Go into the directory .../Pics/2016_11_13; there you should have a bunch of subdirectories such as wedding, afterparty, and so on. Launch this script (disclaimer: I didn't test it):
#!/bin/sh
for subdir in *; do # scan directory
[ ! -d "$subdir" ] && continue # skip non-directory
prognum=0 # progressive number
for file in "$subdir"/*; do # scan subdirectory
[ -f "$file" ] || continue # skip anything that is not a regular file
prognum=$((prognum + 1)) # increment progressive (POSIX arithmetic)
newname=$(printf '%04d' "$prognum") # format it with 4-digit padding
newname="$subdir.$newname.jpg" # compose the new name
if [ -e "$newname" ]; then # check to not overwrite anything
echo "error: $newname already exists."
exit 1
fi
# do the job, move or copy
cp "$file" "$newname"
done
done
Please note that I skipped the "date" (2016_11_13) part - I am not sure about it. If you have a single date, then it is easy to add these digits in # compose the new name. If you have several dates, then you can add a nested for for scanning the "date" directories. One more reason I skipped this, is to let you develop something by yourself, something you can be proud of...

Using only mv and bash builtins:
#! /bin/bash
shopt -s globstar
cd Pics
p=1
# recursive glob for .jpg files
for i in **/*.jpg
do
# (date)/(event)/(filename).jpg
if [[ $i =~ (.*)/(.*)/(.*)\.jpg$ ]]
then
newname=$(printf "%s_%s.%04d.jpg" "${BASH_REMATCH[@]:1:2}" "$p")
echo mv "$i" "$newname"
((p++))
fi
done
globstar is a bash 4.0 feature, and regex matching is available even in OS X's antique bash.
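The name computation can also be done with parameter expansion alone, which makes it easy to check before any file is moved. This is a sketch: new_name is a hypothetical helper, and it assumes paths of the form date/event/file.jpg relative to the Pics directory.

```bash
#!/usr/bin/env bash
# new_name is a hypothetical helper: given a path of the form
# date/event/file.jpg and a sequence number, print date_event.NNNN.jpg.
new_name() {
  local path=$1 seq=$2
  local dir=${path%/*}          # date/event
  local event=${dir##*/}        # event
  local day=${dir%/*}           # date (possibly with leading directories)
  printf '%s_%s.%04d.jpg\n' "${day##*/}" "$event" "$seq"
}

first=$(new_name '2016_11_13/wedding/DSC0215.jpg' 1)
second=$(new_name '2016_11_13/afterparty/DSC0234.jpg' 2)
printf '%s\n' "$first" "$second"
```

In a loop, the result would be passed to mv as in the answer above, ideally behind an echo for a first dry run.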

How to write shell script to create zip file for the files that had same string in file name

How to write simple shell script to create zip file.
I want to create zip file by collecting files with same string pattern in their file names from a folder.
For example, there may be many files under a folder.
xxxxx_20140502_xxx.txt
xxxxx_20140502_xxx.txt
xxxxx_20140503_xxx.txt
xxxxx_20140503_xxx.txt
xxxxx_20140504_xxx.txt
xxxxx_20140504_xxx.txt
After running the shell script, the result must be following three zip files.
20140502.zip
20140503.zip
20140504.zip
Please give me right direction to create simple shell script to output the result as above.

#!/bin/bash
for file in *_????????_*.csv *_????????_*.txt; do
[ -f "${file}" ] || continue
date=${file#*_} # adjust this and next line depending
date=${date%_*} # on your actual prefix/suffix
echo "${date}"
done | sort -u | while read -r date; do
zip "${date}.zip" *"${date}"*
done

Since zip will update the archive, this will do:
shopt -s nullglob
for file in *.{txt,csv}; do [[ $file =~ _([[:digit:]]{8})_ ]] && zip "${BASH_REMATCH[1]}.zip" "$file"; done
The shopt -s nullglob is because you don't want to have unexpanded globs if there are no matching files.
Everything below this line is my old answer...
First, get all the possible dates. Heuristically, this could be the files ending in .txt and .csv that match the regex _[[:digit:]]{8}_:
#!/bin/bash
shopt -s nullglob
declare -A dates=()
for file in *.{csv,txt}; do
[[ $file =~ _([[:digit:]]{8})_ ]] && dates[${BASH_REMATCH[1]}]=
done
printf "Date found: %s\n" "${!dates[@]}"
This will output to stdout all the dates found in the files. E.g. (I called the previous snippet gorilla, ran chmod +x gorilla, and touched a few files for the demo):
$ ls
banana_20010101_gorilla.csv gorilla_20140502_bonobo.csv
gorilla notthisone_123_lol.txt
gorilla_20140502_banana.txt
$ ./gorilla
Date found: 20140502
Date found: 20010101
Next step, for each date found, get all the files ending in .txt and .csv and zip them in the archive corresponding to the date: appending this to gorilla will do the job:
for date in "${!dates[@]}"; do
zip "$date.zip" *"_${date}_"*.{csv,txt}
done
Full script after removing the flooding part:
#!/bin/bash
shopt -s nullglob
declare -A dates=()
for file in *.{csv,txt}; do
[[ $file =~ _([[:digit:]]{8})_ ]] && dates[${BASH_REMATCH[1]}]=
done
for date in "${!dates[@]}"; do
zip "$date.zip" *"_${date}_"*.{csv,txt}
done
Edit. I overlooked your requirement with one line command. Then here's the one-liner:
shopt -s nullglob; declare -A dates=(); for file in *.{csv,txt}; do [[ $file =~ _([[:digit:]]{8})_ ]] && dates[${BASH_REMATCH[1]}]=; done; for date in "${!dates[@]}"; do zip "$date.zip" *"_${date}_"*.{csv,txt}; done
:)

#! /bin/bash
dates=$(ls ?????_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_???.{csv,txt} \
| cut -f2 -d_ | sort -u)
for date in $dates ; do
zip $date.zip ?????_"$date"_???.{csv,txt}
done
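Before running zip for real, the grouping logic can be checked with a dry run that only prints the commands. The sketch below assumes Bash 4 or later for the associative array and file names without spaces; date_of and the sample names are hypothetical.

```bash
#!/usr/bin/env bash
# date_of is a hypothetical helper: print the 8-digit date found between two
# underscores in a file name, or fail if there is none.
date_of() {
  [[ $1 =~ _([0-9]{8})_ ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# Sample names are assumptions modelled on the question.
names=(aaaaa_20140502_one.txt bbbbb_20140502_two.txt ccccc_20140503_one.txt notes.txt)

declare -A members=()          # date -> space-separated file names
for name in "${names[@]}"; do
  d=$(date_of "$name") || continue    # skip names without a date
  members[$d]+="$name "
done

for d in "${!members[@]}"; do
  echo "zip $d.zip ${members[$d]}"    # dry run: prints the command only
done
```

The order in which the dates are printed is not guaranteed, since associative array keys are unordered.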
