How can I run a command on all files in a directory and mv to a different directory the ones whose output contains 'Cannot read TIFF header'? - linux

I'd like to remove all bad TIFFs from a very large directory. The command-line tool "tiffinfo" makes it easy to identify them:
tiffinfo -D *
This will produce output like this:
00074000/74986.TIF: Cannot read TIFF header.
if the TIFF file is corrupt. When this happens, I'd like to move the file to a different directory: bad_images. I tried using awk on this, but it hasn't worked so far...
Thanks!

Assuming the "Cannot read TIFF header" error comes on standard error, and assuming tiffinfo outputs other data on standard out which you don't want, then:
cd /path/to/tiffs
for file in `tiffinfo -D * 2>&1 >/dev/null | cut -f1 -d:`
do
echo mv $file /path/to/bad_images
done
Remove the echo to actually move the files, once satisfied that the script will work as expected.
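If any of the file names might contain spaces, word-splitting the command substitution in the for loop can misbehave. A slightly more defensive variant of the same idea (just a sketch along the lines of the answer above, with the same placeholder paths) pipes the error lines into a while read loop instead:
cd /path/to/tiffs
tiffinfo -D * 2>&1 >/dev/null | grep 'Cannot read TIFF header' | cut -f1 -d: |
while IFS= read -r file
do
    echo mv "$file" /path/to/bad_images
done
As before, drop the echo once the output looks right.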

Related

Rename large folder of Jpegs

I have a large folder of jpegs, which I would like to rename sequentially to image01.jpg, image02.jpg ... image533.jpg, etc.
I have tried using the following
find ‘/myImages/‘ -maxdepth 1 -name ‘*.jpg’ | sort -n | awk 'BEGIN{ x=1 }{printf "mv \"%s\" \”/myImages/image%04d.jpg\”\n”, $0, x++ }' | bash
which I got from here: http://www.algissalys.com/how-to/how-to-quickly-rename-modify-and-scale-all-images-in-a-directory-using-linux
However, this is only returning
>
And then nothing happens. Any suggestions would be great.
The easiest way to do that is with rename, which you can install with Homebrew using:
brew install rename
Then, you can go into your directory containing the images and run:
rename --dry-run -X -e '$_ = "$N"' *jpg
Sample Output
'a.jpg' would be renamed to '1.jpg'
'article.jpg' would be renamed to '2.jpg'
'blob-0.jpg' would be renamed to '3.jpg'
'blob-1.jpg' would be renamed to '4.jpg'
'blob-2.jpg' would be renamed to '5.jpg'
'blob-3.jpg' would be renamed to '6.jpg'
If that looks correct, you can run it again without the --dry-run to actually do it, rather than just telling you what it will do.
If you want your names zero-padded, the easiest is to let rename work out how much padding you need automatically like this:
rename --dry-run -X -N ...01 -e '$_ = "$N"' *jpg
The benefits of using rename are that:
it is simple and powerful
it will warn you before overwriting any files
it can do a dry run and tell you what would happen without actually doing anything
If you want an explanation of the command '$_ = "$N"' then read on...
The rename command is actually a Perl script, so the part I mention above is just a Perl script enclosed in single quotes. The $N is just a Perl variable that expands to be a sequentially increasing number. The Perl special variable $_ is filled with the name of the current file before your little Perl script is executed, and crucially, you are expected to set it to the name you want that input file renamed as.
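For the original goal of names like image01.jpg, you can combine that counter with your own pattern. The exact invocation below is my reading of the rename documentation rather than something taken from the answer above, so treat it as a sketch and keep --dry-run until the output looks right:
rename --dry-run -N ...01 -e '$_ = "image$N.jpg"' *.jpg
Dropping the -X flag is deliberate here, since the expression supplies the .jpg extension itself.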
You could do that with a bash script. Say you have the following in a file called rename_images.
#!/bin/bash
# Usage: ./rename_images '<glob>' <new name prefix> <extension>
declare -a FILESERIES
FILESERIES=(`ls $1`)      # expand the quoted glob passed as $1
NUM=${#FILESERIES[@]}     # number of matching files
NEWNAME=$2
EXT=$3
for (( i=0; i<$NUM; i++ ))
do
    FI=${FILESERIES[$i]}
    NEWFILENAME="$NEWNAME$i$EXT"
    mv "$FI" "$NEWFILENAME"
done
To do what you need, run the script from within the folder with all the images as follows:
./rename_images '*.jpg' image .jpg
And you should be sorted.
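One caveat: the script above produces image0.jpg, image1.jpg and so on without zero-padding. If you specifically want names like image0001.jpg, a printf-based counter is one way to get there; this is an untested sketch that assumes the images sit in the current directory:
#!/bin/bash
i=1
for f in *.jpg
do
    mv -- "$f" "$(printf 'image%04d.jpg' "$i")"
    i=$((i + 1))
done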

Sort files according to their filetype

After an HD problem and some work, I have a bunch of files with names like "f1234", "f1235", etc.
My goal is to sort these files according to their file type. For example, I want to move all the PDF files into the "pdfs" directory.
For one file, I can do "file f1234", and if it's a PDF, I can "mv f1234 pdfs/". But I have thousands of files... Can you help me with a bash or zsh command to sort all the PDFs in one pass? Thanks
The hard part here is reliably turning the output of file into a directory name. I think probably the best candidate for that is the mime-type of the file rather than the human readable output of file. I'd use something like:
mkdir sorted
for f in f*
do
d=$(file -b --mime-type "$f" | tr / -)
mkdir -p "sorted/$d"
mv "$f" "sorted/$d/"
done
Obviously I'd test that out a bit before running it on your files, but something pretty close to that should work.
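If you only care about the PDFs, as in the example, the same mime-type test can be reduced to a single check per file. An untested sketch, assuming the recovered files are all named f-something as in the question:
mkdir -p pdfs
for f in f*
do
    if [ "$(file -b --mime-type "$f")" = "application/pdf" ]
    then
        mv "$f" pdfs/
    fi
done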

Bash loop to gunzip file and remove file extension and file prefixes

I have several .vcf.gz files:
subset_file1.vcf.vcf.gz
subset_file2.vcf.vcf.gz
subset_file3.vcf.vcf.gz
I want to gunzip these files and rename them (remove subset_ and the redundant .vcf extension) in one go, and get these files:
file1.vcf
file2.vcf
file3.vcf
This is the script I have tried:
iFILES=/file/path/*.gz
for i in $iFILES;
do gunzip -k $i > /get/in/this/dir/"${i##*/}"
done
Since you have to do three operations on the output path name:
1. remove the directory part
2. remove the prefix subset_
3. remove the redundant .vcf extension
it's hard to accomplish with only one command.
Following is a modified version. Be CAREFUL with it; I didn't test it thoroughly on my computer.
for i in /file/path/*.gz;
do
# get the output file name
o=$(echo ${i##*/} | sed 's/.*_\(.*\)\(\.[a-z]\{3\}\)\{2\}.*/\1\2/g')
gunzip -kc "$i" > /get/in/this/dir/"$o"
done
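An alternative to the sed expression is to build the output name with plain parameter expansion, which some may find easier to read. This is an untested sketch that assumes every input is named exactly subset_<name>.vcf.vcf.gz:
for i in /file/path/*.gz
do
    o=${i##*/}               # drop the directory part
    o=${o#subset_}           # drop the subset_ prefix
    o=${o%.vcf.vcf.gz}.vcf   # collapse the doubled extension
    gunzip -kc "$i" > /get/in/this/dir/"$o"
done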

copy multiple files from directory tree to new different tree; bash script

I want to write a script that does a specific thing:
I have a txt file e.g.
from1/from2/from3/apple.file;/to1/to2/to3;some not important stuff
from1/from2/banana.file;/to1/to5;some not important stuff
from1/from10/plum.file;/to1//to5/to100;some not important stuff
Now I want to copy the file from each line (e.g. apple.file) from its original directory tree into the new, non-existing directory given after the first semicolon (;).
I tried a few code examples from similar questions, but nothing works, and I'm too weak in bash scripting to find the errors.
Please help :)
I need to add some conditions:
The file not only needs to be copied, but also renamed. Example lines in file.txt:
from1/from2/from3/apple.file;to1/to2/to3/juice.file;some1
from1/from2/banana.file;to1/to5/fresh.file;something different from above
so apple.file needs to be copied, renamed to juice.file, and put at to1/to2/to3/juice.file.
I think that cp will also rename the file, but
mkdir -p "$to"
from the answer below will create the full folder path, with juice.file treated as a folder.
In addition, after the second semicolon of each line there will be something different, so how do I cut that off?
Thanks for all help
EDIT: There will be no spaces in input txt file.
Try this code..
cat file | while IFS=';' read from to some_not_important_stuff
do
to=${to:1} # strip off leading space
mkdir -p "$to" # create parent for 'to' if not existing yet
cp -i "$from" "$to" # option -i to get a warning when it would overwrite something
done
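For the updated requirement, where the part after the first semicolon already ends in the new file name (e.g. to1/to2/to3/juice.file), a variation on the same loop is to create only the directory portion of the destination and let cp do the renaming. Untested sketch, reading from the same semicolon-separated file:
while IFS=';' read -r from to rest
do
    mkdir -p "$(dirname "$to")"   # create the target directories only
    cp -i "$from" "$to"           # copy and rename in one step
done < file
The third field lands in rest and is simply ignored, which also takes care of cutting off everything after the second semicolon.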
Using awk
(run the awk command first and confirm the output is fine, then add |sh to do the copy)
awk -F";" '{printf "cp %s %s\n",$1,$2}' file |sh
Using shell (updated to create the folder manually, based on alfe's answer)
while IFS=';' read from to X
do
mkdir -p $to
cp $from $to
done < file
I had this same problem and used tar to solve it! Posted here:
tmpfile=/tmp/myfile.tar
files="/some/folder/file1.txt /some/other/folder/file2.txt"
targetfolder=/home/you/somefolder
tar --create --file="$tmpfile" $files
tar --extract --file="$tmpfile" --directory="$targetfolder"
In this case, tar will automatically create all (sub)folders for you! Best,
Nabi
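Note that this relies on $files being left unquoted so that it splits into the two paths, and on tar's --create mode for the first command. If the paths could ever contain spaces, a bash array is safer; a hedged sketch of the same approach:
tmpfile=/tmp/myfile.tar
files=(/some/folder/file1.txt /some/other/folder/file2.txt)
targetfolder=/home/you/somefolder
tar --create --file="$tmpfile" "${files[@]}"
tar --extract --file="$tmpfile" --directory="$targetfolder"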

Command line tool to search docx file under ms dos or cygwin

Is there a command-line tool that is able to search docx files under MS-DOS or Cygwin?
I have tried grep; it works fine with txt files but not with docx.
I know I could always convert the docx to txt first and then search it with grep, but I am wondering:
is there a tool that can search docx directly from the command line?
Thanks
I wrote a small bash script which might help you:
#!/bin/bash
# Usage: docfind.sh searchterm
export DOCKEY="$1"
function searchdoc(){
    # count matches in the raw file and in its unzipped (docx) content
    VK1=$(cat "$1" | grep -i "$DOCKEY" | wc -c)
    VK2=$(unzip -c "$1" | grep -i "$DOCKEY" | wc -c)
    let NUM=$VK1+$VK2
    if [ "$NUM" -gt 0 ]; then
        echo "$NUM occurrences in $1"
        echo opening file.
        gnome-open "$1"
    fi
}
export -f searchdoc
echo searching for "$DOCKEY" ...
find . -exec bash -c 'searchdoc "{}" 2>/dev/null' \;
save it as docfind.sh and you can invoke
docfind.sh searchterm
from any folder you want to scan.
After trying this out, I found the easiest way is to use a Linux utility to batch-convert all docx files into txt files, then grep those txt files.
zgrep might work for you? It usually works in OpenOffice documents, and both are compressed archives containing XML:
zgrep "some string" *.xdoc
I have no .xdoc files to test this with, but in theory it should work...
You can use zipgrep, which calls grep on all files of a zip archive (which a docx file is).
You might be disappointed with the result, though, as it returns raw content of XML files containing both the text and XML tags.
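For example, to check every docx file in the current directory (zipgrep ships with Info-ZIP's unzip, so it should already be present in most Cygwin and Linux setups):
for f in *.docx
do
    echo "== $f =="
    zipgrep "some string" "$f"
done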
save it as docfind.sh and you can invoke
Newbies like me might need to be told that for the .sh script to be runnable from any directory, it needs to have the executable bit set and be located in /usr/bin or elsewhere on your PATH.
I was able to set up the nemo file manager in Linux Mint to open a terminal from any folder's context menu (information here).
