Syntax error in expression for if statement in bash - linux

The code is below. I want to find the files whose size is less than 410 bytes:
for file in *; do
if [[ "$file" =~ ^dataset([0-9]+)$ && `du -b $file/${BASH_REMATCH[1]}_conserv.png` -lt 410 ]]; then
cd $file
$some_commands
cd ..
fi
done
However, when I run this script, it complains like this:
less_than_410.bash: line 2: [[: 13605 dataset4866/4866_conserv.png: syntax error in expression (error token is "dataset4866/4866_conserv.png")
Does anyone have ideas about how to fix this? Thanks!

du -b file
It prints both the file size and the file name. Use cut to get the size only:
du -b file | cut -f 1
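Putting that into the loop from the question could look like the sketch below. It assumes GNU du (for -b) and the datasetN/N_conserv.png layout from the question; the function name list_small_datasets is made up for illustration, and it only prints the matching directories instead of running commands in them.

```shell
#!/usr/bin/env bash
# Print every datasetN directory whose N_conserv.png is smaller than a limit.
# list_small_datasets is a hypothetical helper name, not from the question.
list_small_datasets() {
    local limit=${1:-410} dir png size
    for dir in *; do
        [[ $dir =~ ^dataset([0-9]+)$ ]] || continue
        png=$dir/${BASH_REMATCH[1]}_conserv.png
        [[ -f $png ]] || continue
        # du -b prints "SIZE<TAB>NAME"; cut keeps only the first field.
        size=$(du -b -- "$png" | cut -f 1)
        (( size < limit )) && printf '%s\n' "$dir"
    done
    return 0
}
```

Doing the size comparison in (( ... )) on a variable that holds only the number avoids the "syntax error in expression" from the question, which came from the file name being part of the operand.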

Rather than using such a complex expression, just do:
find . -maxdepth 1 -regex '.*/dataset[0-9]+' -size -410c
(Note that -maxdepth and -regex are both GNU extensions to find. Since your question is tagged linux, those options are probably available.)
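Note that the pattern above matches the datasetN entries themselves, so -size is applied to them rather than to the image inside. A sketch that tests the PNG file instead, assuming GNU find and the layout from the question, might look like this. The function name find_small_pngs is hypothetical, and unlike the bash loop it does not check that the number in the directory name equals the number in the file name.

```shell
#!/usr/bin/env bash
# find_small_pngs is a hypothetical name; the layout is assumed from the question.
find_small_pngs() {
    # -size -410c selects files of strictly fewer than 410 bytes (GNU find).
    find . -mindepth 2 -maxdepth 2 -type f \
        -regex '\./dataset[0-9]+/[0-9]+_conserv\.png' \
        -size "-${1:-410}c"
}
```

Called as find_small_pngs 410, it prints paths such as ./dataset4866/4866_conserv.png for every image below the limit.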

Use bash extended globbing to restrict the file loop:
shopt -s extglob
for file in dataset+([0-9]); do
num=${file#dataset}
(( $(stat -c %s "$file/${num}_conserv.png") >= 410 )) && continue
# do stuff here
done

Related

Deleting all files except ones mentioned in config file

Situation:
I need a bash script that deletes all files in the current folder, except all the files mentioned in a file called ".rmignore". This file may contain addresses relative to the current folder, that might also contain asterisks(*). For example:
1.php
2/1.php
1/*.php
What I've tried:
I tried to use GLOBIGNORE but that didn't work well.
I also tried to use find with grep, like follows:
find . | grep -Fxv $(echo $(cat .rmignore) | tr ' ' "\n")
It is considered bad practice to pipe the output of find to another command. You can use -exec or -execdir followed by the command, with '{}' as a placeholder for the file and ';' to mark the end of the command. You can also end the command with '+' instead of ';', in which case find passes many file names to a single invocation of the command.
In your case, you want to list all the contents of a directory and remove the files one by one.
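As a rough illustration of the two terminators, here is a sketch that only echoes the rm command it would run. Both function names are made up for this example.

```shell
#!/usr/bin/env bash
# Both helpers are hypothetical and only echo the rm command they would run.

# ';' runs the command once per file found.
preview_rm_each() {
    find . -type f -name "$1" -exec echo rm -- {} \;
}

# '+' appends as many file names as fit to a single invocation.
preview_rm_batched() {
    find . -type f -name "$1" -exec echo rm -- {} +
}
```

With two matching files, preview_rm_each '*.tmp' prints two rm lines, while preview_rm_batched '*.tmp' prints one rm line that lists both files.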
#!/usr/bin/env bash
set -o nounset
set -o errexit
shopt -s nullglob # allows glob to expand to nothing if no match
shopt -s globstar # process recursively current directory
my:rm_all() {
local ignore_file=".rmignore"
local ignore_array=()
while read -r glob; # Generate files list
do
ignore_array+=(${glob});
done < "${ignore_file}"
echo "${ignore_array[@]}"
for file in **; # iterate over all the content of the current directory
do
if [ -f "${file}" ]; # file exist and is file
then
local do_rmfile=true;
# Keep the file if it matches an entry in the ignore list
for ignore in "${ignore_array[@]}"; # Iterate over files to keep
do
[[ "${file}" == "${ignore}" ]] && do_rmfile=false; #rm ${file};
done
${do_rmfile} && echo "Removing ${file}"
fi
done
}
my:rm_all;
If we assume that none of the files in .rmignore contain newlines in their name, the following might suffice:
# Gather our exclusions...
mapfile -t excl < .rmignore
# Reverse the array (put data in indexes)
declare -A arr=()
for file in "${excl[@]}"; do arr[$file]=1; done
# Walk through files, deleting anything that's not in the associative array.
shopt -s globstar
for file in **; do
[ -n "${arr[$file]}" ] && continue
echo rm -fv "$file"
done
Note: untested. :-) Also, associative arrays were introduced with Bash 4.
An alternate method might be to populate an array with the whole file list, then remove the exclusions. This might be impractical if you're dealing with hundreds of thousands of files.
shopt -s globstar
declare -A filelist=()
# Build a list of all files...
for file in **; do filelist[$file]=1; done
# Remove files to be ignored.
while read -r file; do unset filelist[$file]; done < .rmignore
# Annd .. delete.
echo rm -v "${!filelist[@]}"
Also untested.
Warning: rm at your own risk. May contain nuts. Keep backups.
I note that neither of these solutions will handle wildcards in your .rmignore file. For that, you might need some extra processing...
shopt -s globstar
declare -A filelist=()
# Build a list...
for file in **; do filelist[$file]=1; done
# Remove PATTERNS...
while read -r glob; do
for file in $glob; do
unset filelist[$file]
done
done < .rmignore
# And remove whatever's left.
echo rm -v "${!filelist[@]}"
And .. you guessed it. Untested. This depends on $glob expanding as a glob.
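Another way to handle wildcards, sketched here under the assumption of Bash 4 or later, is to compare each file name against every line of .rmignore as a pattern instead of expanding the patterns first. The function name list_deletable is hypothetical, and it only prints candidates, it never deletes. Be aware that in [[ ... == pattern ]] a * also matches /, so 1/*.php would keep 1/sub/x.php as well.

```shell
#!/usr/bin/env bash
# list_deletable is a hypothetical name; it only prints, it never deletes.
list_deletable() {
    local ignore_file=${1:-.rmignore} file pattern keep
    local -a patterns=()
    # Note: these shopt settings stay enabled for the rest of the shell session.
    shopt -s globstar nullglob
    mapfile -t patterns < "$ignore_file"
    for file in **; do
        [[ -f $file ]] || continue
        keep=false
        for pattern in "${patterns[@]}"; do
            # Unquoted right-hand side: bash treats it as a glob pattern.
            if [[ $file == $pattern ]]; then keep=true; break; fi
        done
        $keep || printf '%s\n' "$file"
    done
}
```

With the example .rmignore from the question, a tree containing 1.php, 3.php, 1/a.php, 1/b.txt and 2/1.php would list 1/b.txt and 3.php. Hidden files such as .rmignore itself are not visited, because ** skips dotfiles by default.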
Lastly, if you want a heavier-weight solution, you can use find and grep:
find . -type f -not -exec grep -q -f '{}' .rmignore \; -delete
This runs a grep for EACH file being considered. And it's not a bash solution, it only relies on find which is pretty universal.
Note that ALL of these solutions are at risk of errors if you have files that contain newlines.
This line does the job perfectly:
find . -type f | grep -vFf .rmignore
If you have rsync, you might be able to copy an empty directory to the target one, with suitable rsync ignore files. Try it first with -n, to see what it will attempt, before running it for real!
This is another bash solution that seems to work ok in my tests:
while read -r line;do
exclude+=$(find . -type f -path "./$line")$'\n'
done <.rmignore
echo "ignored files:"
printf '%s\n' "$exclude"
echo "files to be deleted"
echo rm $(LC_ALL=C sort <(find . -type f) <(printf '%s\n' "$exclude") |uniq -u ) #intentionally non quoted to remove new lines
Alternatively, you may want to look at the simplest format:
rm $(ls -1 | grep -v .rmignore)

How do I search for a file based on what is output by a command running on that file

I am working on a project for one of my professors, and he asked me to sort a couple hundred .fits images based on their headers (specifically, which star they are images of). I think that grep would be the best way to do this; however, I can't seem to figure out how to use grep on the header.
I am entering:
ls | imhead *.fits | grep -E -r "PG\ 1104+243" *
to just list them out for now, once they are listed I know how to copy them into a directory.
I am new to using grep so I am unsure as to where my error lies? any help would be greatly appreciated! Thanks!
Assuming that imhead extracts the headers of the .fits files as text, you can use a simple shell script to do it:
script.sh
#!/bin/bash
grep "$1" "$2" > /dev/null 2>&1 && echo "$2"
Note that + is a special character in extended regular expressions, that is, when you pass -E as in the question. A plain grep without any options should do the trick here.
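To see the difference, here is a small sketch that feeds a made-up header line to grep. The helper name header_mentions and the sample header text are assumptions for illustration only; real imhead output may be laid out differently.

```shell
#!/usr/bin/env bash
# header_mentions is a hypothetical helper: it reads header text on stdin
# and succeeds if the given object name appears literally (-F: fixed string).
header_mentions() {
    grep -qF -- "$1"
}

header='OBJECT  = PG 1104+243'   # made-up header line for illustration

# Fixed-string search: the + is an ordinary character.
printf '%s\n' "$header" | header_mentions 'PG 1104+243' && echo "fixed-string match"

# With -E the + means "one or more of the preceding 4", so the literal + is not matched.
printf '%s\n' "$header" | grep -qE 'PG 1104+243' || echo "no match with -E"
```

Using -F sidesteps the quoting question entirely, which is convenient when the object name comes from a variable.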
Use find to exec the script on every *.fits file in the current folder:
find -maxdepth 1 -name '*.fits' -exec ./script.sh 'PG 1104+243' {} \;
If you are going to copy/move/alter or do something with the files you find, you might be better off, in terms of complexity and ease of quoting, using a loop like this:
#!/bin/bash
find . -name '*.fits' -print0 | while IFS= read -r -d '' file; do
    echo "Checking file: $file"
    # grep -q prints nothing; its exit status tells us whether the header matched
    if imhead "$file" | grep -q 'PG 1104+243'; then
        echo "Object matches: $file"
    fi
done

Linux Bash file Reading Lines and words

I apologize if this is a trivial question. I am learning how to use linux bash and this little task is giving me a headache...
So I need to write a script, let's call it count.sh. For each file in the working directory, it should print the filename, the number of lines, and the number of words to the console:
test.txt 100 1023
someOtherfiles 10 233
So far, I know that the following gives me all the file names in the directory. Thanks to everyone who helped me, I now have this working version:
for f in *; do
echo -n "$f"
cat "$f" | wc -wl
done
I would really appreciate your help! Thanks ahead!
P.s. If you know great resources (links for tutorials) for learning about script and you are willing to share it with me. I think I really need to know these basics. Thanks again!
If you must have the file name as the first field in your output, try this:
for f in *; do
if [ -f "$f" ]; then
echo -n "$f"
cat "$f" | wc -wl
fi
done
for f in *; do
if [[ -f $f ]]; then
echo "$f $(wc -wl < "$f")"
fi
done
[[ -f $f ]] processes only files (excludes subdirectories) and also handles the case where the directory is empty (in which case * is (by default) left unexpanded, i.e. assigned to $f as is).
echo "$f $(wc -wl < "$f")" uses command substitution ($( ... )) to directly include the output from the enclosed command in the output string passed to echo.
Note that the reason that < is used to direct the content of file $f to wc via stdin is that wc would otherwise append the name of the input file to its output (thanks, @R Sahu).
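If you want the output as exactly three fields separated by single spaces, one possible sketch is to read the two counts into variables first. The function name count_files is made up here, and the process substitution requires bash.

```shell
#!/usr/bin/env bash
# count_files is a hypothetical name: prints "NAME LINES WORDS" per regular file.
count_files() {
    local f lines words
    for f in *; do
        [[ -f $f ]] || continue
        # wc always prints lines before words, whatever the option order.
        read -r lines words < <(wc -lw < "$f")
        printf '%s %s %s\n' "$f" "$lines" "$words"
    done
}
```

Because read splits on whitespace, any padding that wc puts in front of the numbers is dropped.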

how to find and delete below line using shell script

The lines below have been inserted into all the PHP pages of my project by a malicious attack. How can I find and delete these lines using a shell script?
function_exists('date_default_timezone') ?
date_default_timezone_set('America/Los_Angeles') :
($_REQUEST['c_id']));
I tried the script below, but I am getting an error: I am not able to match the lines above with the sed command. Please help me correct this script.
#!/bin/sh
search='^function_exists\(\'date_default_timezone\'\)\ \?\ date_default_timezone_set\(\'America/Los_Angeles\'\)\ \:\ \(\$_REQUEST\[\'c_id\'\]\)\)\;'
for file in `find /root/test1 -name "*.php"`; do
    grep "$search" $file &> /dev/null
    if [ $? -ne 0 ]; then
        echo "Search string not found in $file!"
    else
        sed -i '/$search/d' $file
Try sed with a different separator such as : rather than /, since the / in America/Los_Angeles conflicts with the / delimiter; or add a backslash so that it becomes America\/Los_Angeles.
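For a delete command, the alternative delimiter has to be introduced with a backslash. A minimal sketch, with a made-up function name, that filters stdin and drops every line containing the time zone string:

```shell
#!/usr/bin/env bash
# drop_timezone_lines is a hypothetical name; it filters stdin to stdout.
drop_timezone_lines() {
    # \| ... | makes '|' the address delimiter, so '/' needs no escaping.
    sed '\|America/Los_Angeles|d'
}
```

This removes only the line that contains America/Los_Angeles, not the lines before and after it, so it is a demonstration of the delimiter syntax rather than a complete cleanup of the injected block.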
You're not escaping the regex correctly. Try the following:
while IFS= read -r -d '' file; do
if grep -qF "function_exists('date_default_timezone') ? date_default_timezone_set('America/Los_Angeles') : (\$_REQUEST['c_id']));" "$file"
then
sed -i "s|function_exists('date_default_timezone') ? date_default_timezone_set('America/Los_Angeles') : (\$_REQUEST\['c_id'\]));|FOO|g" "$file"
fi
done < <(find /root/test1 -type f -name "*.php" -print0)
This might work for you (GNU sed)
pattern1='function_exists('\''date_default_timezone'\'''
pattern2='.*date_default_timezone_set('\''America\/Los_Angeles'\'') :'
pattern3='.*($_REQUEST\['\''c_id'\''\]));'
sed '/^'"$pattern1"'/{N;N;/^'"$pattern1$pattern2$pattern3"'/d}' file

Renaming a set of files to 001, 002,

I originally had a set of images of the form image_001.jpg, image_002.jpg, ...
I went through them and removed several. Now I'd like to rename the leftover files back to image_001.jpg, image_002.jpg, ...
Is there a Linux command that will do this neatly? I'm familiar with rename but can't see anything to order file names like this. I'm thinking that since ls *.jpg lists the files in order (with gaps), the solution would be to pass the output of that into a bash loop or something?
If I understand right, you have e.g. image_001.jpg, image_003.jpg, image_005.jpg, and you want to rename to image_001.jpg, image_002.jpg, image_003.jpg.
EDIT: This is modified to put the temp file in the current directory. As Stephan202 noted, this can make a significant difference if temp is on a different filesystem. To avoid hitting the temp file in the loop, it now goes through image*
i=1; temp=$(mktemp -p .); for file in image*
do
    mv "$file" "$temp"
    mv "$temp" "$(printf "image_%03d.jpg" "$i")"
    i=$((i + 1))
done
A simple loop (test with echo, execute with mv):
I=1
for F in *; do
echo "$F" `printf image_%03d.jpg $I`
#mv "$F" `printf image_%03d.jpg $I` 2>/dev/null || true
I=$((I + 1))
done
(I added 2>/dev/null || true to suppress warnings about identical source and target files. If this is not to your liking, go with Matthew Flaschen's answer.)
Some good answers here already; but some rely on hiding errors which is not a good idea (that assumes mv will only error because of a condition that is expected - what about all the other reasons mv might error?).
Moreover, it can be done a little shorter and should be better quoted:
for file in *; do
printf -vsequenceImage 'image_%03d.jpg' "$((++i))"
[[ -e $sequenceImage ]] || \
mv "$file" "$sequenceImage"
done
Also note that you shouldn't capitalize your variables in bash scripts.
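A variation on the same idea, as a sketch: skip a file only when it already has the right name, and stop instead of overwriting if an unexpected file is in the way. The function name renumber_images is hypothetical, and the sketch assumes all files follow the image_NNN.jpg naming from the question.

```shell
#!/usr/bin/env bash
# renumber_images is a hypothetical name; run it inside the image directory.
renumber_images() {
    local file target i=0
    for file in image_*.jpg; do
        [[ -e $file ]] || continue        # the pattern matched nothing
        printf -v target 'image_%03d.jpg' "$((++i))"
        [[ $file == "$target" ]] && continue   # already in place
        if [[ -e $target ]]; then
            printf 'refusing to overwrite %s\n' "$target" >&2
            return 1
        fi
        mv -- "$file" "$target"
    done
}
```

Because the glob is expanded in sorted order and each target number is never larger than the number of the file being moved, the target is normally either the file itself or a free name.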
Try the following script:
numerate.sh
This code snippet should do the job:
./numerate.sh -d <your image folder> -b <start number> -L 3 -p image_ -s .jpg -o numerically -r
This does the reverse of what you are asking (taking files of the form *.jpg.001 and converting them to *.001.jpg), but can easily be modified for your purpose:
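# Keep the regex in a variable: since bash 3.2, a quoted right-hand side of =~ is matched as a literal string.
re='(.*)\.([[:alpha:]]+)\.([[:digit:]]{3,})$'
for file in *
do
    if [[ $file =~ $re ]]
    then
        mv "${BASH_REMATCH[0]}" "${BASH_REMATCH[1]}.${BASH_REMATCH[3]}.${BASH_REMATCH[2]}"
    fi
done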
I was going to suggest something like the above using a for loop, an iterator, cut -f1 -d "_", then mv i i.iterator. It looks like it's already covered other ways, though.

Resources