Table output with different variables - linux

How can I modify my script to print output in a table, like in this screenshot?
ruta=$1
nom=$2
size=$3
mod=$4
search="$(find "$ruta" -iname "*$nom*" -size $size -mtime $mod)"
fichero="$(ls -lh $search | awk 'BEGIN{FS="/"; OFS="\t"}{print$NF}')"
mida="$(ls -lh $search | awk '{print$5}')"
modificado="$(ls -l --full-time $search | awk '{print$6}')"
path="$(ls -d $search | awk 'BEGIN{FS="/"; OFS="/"} {$NF=""; print$0,$NF}')"
#echo $search
echo -e "\e[31mNOMBRE DEL FICHERO\n\e[0m$fichero"
echo
echo -e "\e[31mTAMAÑO\n\e[0m$mida"
echo
echo -e "$modificado"
echo
echo -e "$path"
echo

Each of those variables (search, fichero, etc.) receives multiple results from its sub-command, so plural names like searches and ficheros would describe them better.
If you assign those to bash arrays you can loop over them afterwards, like:
# read shell command output into an array
result=($(find /var/log -type f))
# show the number of elements found
echo "Elements found: ${#result[@]}"
# loop over elements by counter
for ((i=0; i<${#result[@]}; i++))
do
echo "${result[$i]}"
done
echo "-----"
# print all content
echo "${result[@]}"
Interesting article for using bash arrays: https://www.thegeekstuff.com/2010/06/bash-array-tutorial/
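Note that `result=($(find ...))` word-splits on any whitespace, so a path like `/var/log/my app.log` becomes two array elements. On bash 4.4+ a NUL-delimited read is safer; a minimal sketch:

```shell
#!/bin/bash
# Sketch: read find results into an array without word-splitting,
# using NUL separators (requires bash >= 4.4 for mapfile -d '').
mapfile -d '' result < <(find /var/log -type f -print0)
echo "Elements found: ${#result[@]}"
for f in "${result[@]}"; do
    echo "$f"
done
```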
ruta=$1
nom=$2
size=$3
mod=$4
search=($(find "$ruta" -not -type d -iname "*$nom*" -size $size -mtime $mod))
echo -e "FICHERO\tTAMANO\tMODIFICATION\tRUTA"
for ((i=0; i<${#search[@]}; i++))
do
fichero=$(basename "${search[$i]}")
mida=$(stat -c %s "${search[$i]}")              # GNU stat (Linux); on BSD/macOS use: stat -f %z
modificado=$(date -r "${search[$i]}" +%Y-%m-%d) # modification date
path=$(dirname "${search[$i]}")
echo -e "$fichero\c"
echo -e "\t$mida\c"
echo -e "\t$modificado\c"
echo -e "\t$path\c"
echo
done
And another simplified variant with printf output and another for loop without array:
ruta=$1
nom=$2
size=$3
mod=$4
search=$(find "$ruta" -not -type d -iname "*$nom*" -size $size -mtime $mod 2>/dev/null)
printf "%s %38s %12s %s\n" "FICHERO" "TAMANO" "MODIFICATION" "RUTA"
printf "%s\n" "--------------------------------------------------------------------"
for file in ${search}
do
fichero=$(basename "$file")
mida=$(stat -c %s "$file")              # GNU stat (Linux); on BSD/macOS use: stat -f %z
modificado=$(date -r "$file" +%Y-%m-%d) # modification date
path=$(dirname "$file")
printf "%35s %10s %12s %s\n" "$fichero" "$mida" "$modificado" "$path"
done
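For aligned columns without per-file stat calls, GNU find's -printf plus column -t can build the whole table in one pass; a minimal sketch along the same lines (assumes GNU findutils and the util-linux column tool):

```shell
#!/bin/bash
# Sketch: one find -printf call emits name, size, date and directory per file,
# and column -t aligns everything on the tab separators.
ruta=$1; nom=$2; size=$3; mod=$4
{
  printf 'FICHERO\tTAMANO\tMODIFICATION\tRUTA\n'
  find "$ruta" -not -type d -iname "*$nom*" -size "$size" -mtime "$mod" \
       -printf '%f\t%s\t%TY-%Tm-%Td\t%h\n'
} | column -t -s $'\t'
```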

Related

Getting a single output from multiple if/else outputs

I'm trying to get the line count of a few files in a directory by a script. Up until now I was able to do so with if/else statements but I'm getting an output for each file that is checked.
My goal is to get a single "main" output that:
If all the outputs are "OK" ->> main output will be "OK"
If even one of the outputs is "PROBLEM" ->> main output will indicate an error.
files=`find /backup/external/logs -type f -daystart -ctime 0 -print | grep csv | grep -v Collateral`
count_files=`echo $files | grep -o " " | wc -l`
count_files=$((count_files+1))
for ((i=1;i<=${count_files}; i++));
do
file=`echo $files | awk -F " " -v a=$i '{ print $a }'`
linecount=`(wc -l "$file"| awk '{print $1}')`
if [ $linecount -gt "1" ]; then
echo "OK"
else
echo "PROBLEM! File $file"
fi
done
And my output is:
PROBLEM! File /backup/external/logs/log1_20211214010002.csv
PROBLEM! File /backup/external/logs/log2_20211214010002.csv
OK
PROBLEM! File /backup/external/logs/log4_20211214010002.csv
OK
PROBLEM! File /backup/external/logs/log6_20211214010002.csv
Accumulate the problematic files in a variable.
problems=""
for ((i=1;i<=${count_files}; i++));
do
file=`echo $files | awk -F " " -v a=$i '{ print $a }'`
linecount=`(wc -l "$file"| awk '{print $1}')`
if [ $linecount -le 1 ]; then
problems+="$file "
fi
done
if [ "$problems" ] ; then
echo Problems: "$problems"
fi
It seems you want to report the csv files whose filenames don't contain "Collateral", which have exactly zero or one line. If so you do not need such a complicated script; this all could be done with a single find command:
find /backup/external/logs -type f -daystart -ctime 0 \
-name '*.csv' \! -name '*Collateral*' \
-exec bash -c '{ read && read || echo "PROBLEM! File $1"; } < "$1"' _ {} \;
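As a rough variant of the same check, a single wc -l invocation over all matching files also exposes anything with fewer than two lines (this sketch breaks on filenames containing whitespace):

```shell
# Sketch: list line counts for all matching files in one wc call, then let
# awk flag anything with fewer than 2 lines (skipping wc's "total" row).
find /backup/external/logs -type f -daystart -ctime 0 \
     -name '*.csv' ! -name '*Collateral*' -exec wc -l {} + \
  | awk '$2 != "total" && $1 < 2 {print "PROBLEM! File " $2}'
```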

Delete files older than the epoch date of a file in a list

I have a file which contains a full path of a filename (space separated) and the last column I put the change date of the file in epoch.
/data/owncloud/c/files/Walkthrough 2019/#25 SEC-C03/Group Enterprise.jpg 1569314988
I want to delete all space separated files which epoch number is smaller then 1568187717.
The script looks like this at the moment, but this if with the space separation can't work :(
#!/bin/bash
IFS=$'\n'
while read i
do printf "%s " "$i"
stat --format=%Z $i
done < <(find /data/owncloud/*/files -type f) > filelistwithchangeddate
filetodelete=expr `date +'%s'` - 2592000
The awk '{print $(NF)}' gives the last column, so somehow I need to compare the awk output with filetodelete and delete the space-separated files.
Update:
Something like this what it should be I think:
for i in `cat filelistwithchangeddate `
do
if [ $(awk '{print $(NF)}' $i) -lt $filetodelete ]
then
echo "this will be deleted:"
awk '{$NF=""}1' $i
fi
done
But need to fix somehow the spaces and run the delete
Ok, Thank you triplee, I think this will work:
IFS=$'\n'
while read i
do printf "%s " "$i"
stat --format=%Z $i
done < <(find /data/owncloud/*/files -type f) > /root/script/newpurge/filelistwithchangeddate
filetodelete=$(expr `date +'%s'` - 2592000)
awk -v epoch="$filetodelete" '$NF<epoch' /root/script/newpurge/filelistwithchangeddate > oldfiles
awk '{$NF=""}1' /root/script/newpurge/oldfiles > marktodelete
sed -i "s/^/'/g" /root/script/newpurge/marktodelete
sed -i "s/[ ]\+$/'/g" /root/script/newpurge/marktodelete
for i in $(cat /root/script/newpurge/marktodelete)
do
rm -f $i
done
This answer is based on this:
This can easily done using find. Normally you would do:
$ find . -type f ! -newermt "@1569314988" -delete
but if you want to pick the time from a file (example from OP):
$ t=$(awk '{print $NF}' file)
$ [[ "$t" != "" ]] && find . -type f ! -newermt "@${t}" -delete
See man find for details on the meaning of the flags and for extra modifications which might be needed.
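If your find doesn't accept the @epoch timestamp syntax, comparing against a reference file created with touch -d is an alternative; a hedged sketch of the same 30-day cutoff (assumes GNU touch, date and find):

```shell
# Sketch: delete files older than (now - 30 days) by comparing mtimes
# against a temporary reference file instead of a timestamp string.
cutoff=$(( $(date +%s) - 2592000 ))   # 30 days ago, in epoch seconds
ref=$(mktemp)
touch -d "@$cutoff" "$ref"            # set the reference mtime
find /data/owncloud/*/files -type f ! -newer "$ref" -print   # add -delete once verified
rm -f "$ref"
```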

echo the output of a ls command with less files than n

I have 400 folders with several files inside, I am interested in:
counting how many files with the extension .solution are in each folder, and
then output only those folder have less than 440 elements
The point 1) is easy to get with the command:
for folder in $(ls -d */ | grep "sol_cv_");
do
a=$(ls -1 "$folder"/*.solution | wc -l);
echo $folder has "${a}" files;
done
But is there any easy way to filter only the files with less than 440 elements?
This simple script could work for you:-
#!/bin/bash
MAX=440
for folder in sol_cv_*; do
COUNT=$(find "$folder" -type f -name "*.solution" | wc -l)
((COUNT < MAX)) && echo "$folder"
done
The script below
counterfun(){
count=$(find "$1" -maxdepth 1 -type f -iname "*.solution" | wc -l)
(( count < 440 )) && echo "$1"
}
export -f counterfun
find /YOUR/BASE/FOLDER/ -maxdepth 1 -type d -iname "sol_cv_*" -exec bash -c 'counterfun "$1"' _ {} \;
#maxdepth 1 in both find above as you've confirmed no sub-folders
should do it
Avoid parsing the ls command; use a printf '%q\n' glob for counting files:
for folder in sol_cv_*/; do
# count the *.solution files with a glob instead of parsing ls
count=$(printf '%q\n' "$folder"*.solution | wc -l)
# report only the folders with fewer than 440
(( count < 440 )) && echo "$folder has $count files"
done
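The per-folder counting can also be collapsed into a single find pass using GNU find's -printf '%h\n' (the containing directory); note that folders with zero matching files never show up in the output:

```shell
# Sketch: one find call prints the parent directory of every *.solution file;
# uniq -c counts per directory, and awk keeps those under the threshold.
# Assumes GNU find and directory names without newlines or spaces.
MAX=440
find . -mindepth 2 -maxdepth 2 -type f -name '*.solution' -printf '%h\n' \
  | sort | uniq -c \
  | awk -v max="$MAX" '$1 < max {print $2, "has", $1, "files"}'
```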

check a substring in a string in bash

I need an if...else statement that verifies whether the filename contains a string specified in bash
for j in `ls `
do
if [ "${j:(-3)}" == ".gz" ]; then
Cmd="zcat"
elif [ "${j:(-4)}" == ".bz2" ]; then
Cmd="bzcat"
else
Cmd="cat"
fi
if [ $j ***contains*** "string1"]; then
$cmd $j | awk -F"," '{print $4}'
elif [ $j *contains* "string2" ]; then
$cmd $j | awk -F"," '{print $2}'
fi
done
Use double brackets, which support wildcards:
if [[ $j == *string1* ]]; then
Also, don't parse ls; use a glob instead:
Instead of
for j in `ls `
use
for j in *
If you want the match to be case-insensitive, you can set the shopt -s nocasematch option:
shopt -s nocasematch
if [[ $j == *string1* ]]; then
The =~ operator does what you want.
I personally would use find and xargs though.
find . -name "*.gz" -print0 | xargs -I{} -0 gzip -dc {} | cut -d, -f4
find . -name "*.bz2" -print0 | xargs -I{} -0 bzip2 -dc {} | cut -d, -f4
Use bash's regex capabilities here. So instead of;
if [ $j ***contains*** "string1"]; then
Use:
if [[ "$j" =~ \bstring1\b ]]; then
PS: Note use of \b (word boundaries) to make sure you don't match string123 in $j.
Also instead of using ls:
for j in `ls `
You should better use:
for j in *
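Putting those two fixes together, a minimal sketch of the whole loop (string1/string2 kept as placeholders from the question):

```shell
#!/bin/bash
# Sketch: glob instead of ls, pick a decompressor by extension,
# then pick the awk field by substring match on the filename.
for j in *; do
  [ -f "$j" ] || continue
  case $j in
    *.gz)  Cmd=zcat  ;;
    *.bz2) Cmd=bzcat ;;
    *)     Cmd=cat   ;;
  esac
  if [[ $j == *string1* ]]; then
    "$Cmd" "$j" | awk -F, '{print $4}'
  elif [[ $j == *string2* ]]; then
    "$Cmd" "$j" | awk -F, '{print $2}'
  fi
done
```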

How to find duplicate files with same name but in different case that exist in same directory in Linux?

How can I return a list of files that are named duplicates i.e. have same name but in different case that exist in the same directory?
I don't care about the contents of the files. I just need to know the location and name of any files that have a duplicate of the same name.
Example duplicates:
/www/images/taxi.jpg
/www/images/Taxi.jpg
Ideally I need to search all files recursively from a base directory. In above example it was /www/
The other answer is great, but instead of the "rather monstrous" perl script I suggest
perl -pe 's!([^/]+)$!lc $1!e'
Which will lowercase just the filename part of the path.
Edit 1: In fact the entire problem can be solved with:
find . | perl -ne 's!([^/]+)$!lc $1!e; print if 1 == $seen{$_}++'
Edit 3: I found a solution using sed, sort and uniq that also will print out the duplicates, but it only works if there are no whitespaces in filenames:
find . |sed 's,\(.*\)/\(.*\)$,\1/\2\t\1/\L\2,'|sort|uniq -D -f 1|cut -f 1
Edit 2: And here is a longer script that will print out the names, it takes a list of paths on stdin, as given by find. Not so elegant, but still:
#!/usr/bin/perl -w
use strict;
use warnings;
my %dup_series_per_dir;
while (<>) {
my ($dir, $file) = m!(.*/)?([^/]+?)$!;
push @{$dup_series_per_dir{$dir||'./'}{lc $file}}, $file;
}
for my $dir (sort keys %dup_series_per_dir) {
my @all_dup_series_in_dir = grep { @{$_} > 1 } values %{$dup_series_per_dir{$dir}};
for my $one_dup_series (@all_dup_series_in_dir) {
print "$dir\{" . join(',', sort @{$one_dup_series}) . "}\n";
}
}
Try:
ls -1 | tr '[A-Z]' '[a-z]' | sort | uniq -c | grep -v " 1 "
Simple, really :-) Aren't pipelines wonderful beasts?
The ls -1 gives you the files one per line, the tr '[A-Z]' '[a-z]' converts all uppercase to lowercase, the sort sorts them (surprisingly enough), uniq -c removes subsequent occurrences of duplicate lines whilst giving you a count as well and, finally, the grep -v " 1 " strips out those lines where the count was one.
When I run this in a directory with one "duplicate" (I copied qq to qQ), I get:
2 qq
For the "this directory and every subdirectory" version, just replace ls -1 with find . or find DIRNAME if you want a specific directory starting point (DIRNAME is the directory name you want to use).
This returns (for me):
2 ./.gconf/system/gstreamer/0.10/audio/profiles/mp3
2 ./.gconf/system/gstreamer/0.10/audio/profiles/mp3/%gconf.xml
2 ./.gnome2/accels/blackjack
2 ./qq
which are caused by:
pax> ls -1d .gnome2/accels/[bB]* .gconf/system/gstreamer/0.10/audio/profiles/[mM]* [qQ]?
.gconf/system/gstreamer/0.10/audio/profiles/mp3
.gconf/system/gstreamer/0.10/audio/profiles/MP3
.gnome2/accels/blackjack
.gnome2/accels/Blackjack
qq
qQ
Update:
Actually, on further reflection, the tr will lowercase all components of the path so that both of
/a/b/c
/a/B/c
will be considered duplicates even though they're in different directories.
If you only want duplicates within a single directory to show as a match, you can use the (rather monstrous):
perl -ne '
chomp;
@flds = split (/\//);
$lstf = $flds[-1];
$lstf =~ tr/A-Z/a-z/;
for ($i = 0; $i < $#flds; $i++) {
print "$flds[$i]/";
};
print "$lstf\n";'
in place of:
tr '[A-Z]' '[a-z]'
What it does is to only lowercase the final portion of the pathname rather than the whole thing. In addition, if you only want regular files (no directories, FIFOs and so forth), use find -type f to restrict what's returned.
I believe
ls | sort -f | uniq -i -d
is simpler, faster, and will give the same result
Following up on the response of mpez0, to detect recursively just replace "ls" by "find .".
The only problem I see with this is that if a whole directory is duplicated, you get one entry for each file inside it, so some human judgment is needed to interpret the output.
But anyway, you're not automatically deleting these files, are you?
find . | sort -f | uniq -i -d
findsn is a nice little command-line app you get if you compile fslint (the deb package does not include it).
It will find any files with the same name, it's lightning fast, and it can handle different case.
/findsn --help
find (files) with duplicate or conflicting names.
Usage: findsn [-A -c -C] [[-r] [-f] paths(s) ...]
If no arguments are supplied the $PATH is searched for any redundant
or conflicting files.
-A reports all aliases (soft and hard links) to files.
If no path(s) specified then the $PATH is searched.
If only path(s) specified then they are checked for duplicate named
files. You can qualify this with -C to ignore case in this search.
Qualifying with -c is more restrictive as only files (or directories)
in the same directory whose names differ only in case are reported.
I.E. -c will flag files & directories that will conflict if transfered
to a case insensitive file system. Note if -c or -C specified and
no path(s) specified the current directory is assumed.
Here is an example how to find all duplicate jar files:
find . -type f -name "*.jar" -printf "%f\n" | sort -f | uniq -i -d
Replace *.jar with whatever duplicate file type you are looking for.
Here's a script that worked for me ( I am not the author). the original and discussion can be found here:
http://www.daemonforums.org/showthread.php?t=4661
#! /bin/sh
# find duplicated files in directory tree
# comparing by file NAME, SIZE or MD5 checksum
# --------------------------------------------
# LICENSE(s): BSD / CDDL
# --------------------------------------------
# vermaden [AT] interia [DOT] pl
# http://strony.toya.net.pl/~vermaden/links.htm
__usage() {
echo "usage: $( basename ${0} ) OPTION DIRECTORY"
echo " OPTIONS: -n check by name (fast)"
echo " -s check by size (medium)"
echo " -m check by md5 (slow)"
echo " -N same as '-n' but with delete instructions printed"
echo " -S same as '-s' but with delete instructions printed"
echo " -M same as '-m' but with delete instructions printed"
echo " EXAMPLE: $( basename ${0} ) -s /mnt"
exit 1
}
__prefix() {
case $( id -u ) in
(0) PREFIX="rm -rf" ;;
(*) case $( uname ) in
(SunOS) PREFIX="pfexec rm -rf" ;;
(*) PREFIX="sudo rm -rf" ;;
esac
;;
esac
}
__crossplatform() {
case $( uname ) in
(FreeBSD)
MD5="md5 -r"
STAT="stat -f %z"
;;
(Linux)
MD5="md5sum"
STAT="stat -c %s"
;;
(SunOS)
echo "INFO: supported systems: FreeBSD Linux"
echo
echo "Porting to Solaris/OpenSolaris"
echo " -- provide values for MD5/STAT in '$( basename ${0} ):__crossplatform()'"
echo " -- use digest(1) instead for md5 sum calculation"
echo " $ digest -a md5 file"
echo " -- pfexec(1) is already used in '$( basename ${0} ):__prefix()'"
echo
exit 1
;;
(*)
echo "INFO: supported systems: FreeBSD Linux"
exit 1
;;
esac
}
__md5() {
__crossplatform
:> ${DUPLICATES_FILE}
DATA=$( find "${1}" -type f -exec ${MD5} {} ';' | sort -n )
echo "${DATA}" \
| awk '{print $1}' \
| uniq -c \
| while read LINE
do
COUNT=$( echo ${LINE} | awk '{print $1}' )
[ ${COUNT} -eq 1 ] && continue
SUM=$( echo ${LINE} | awk '{print $2}' )
echo "${DATA}" | grep ${SUM} >> ${DUPLICATES_FILE}
done
echo "${DATA}" \
| awk '{print $1}' \
| sort -n \
| uniq -c \
| while read LINE
do
COUNT=$( echo ${LINE} | awk '{print $1}' )
[ ${COUNT} -eq 1 ] && continue
SUM=$( echo ${LINE} | awk '{print $2}' )
echo "count: ${COUNT} | md5: ${SUM}"
grep ${SUM} ${DUPLICATES_FILE} \
| cut -d ' ' -f 2-10000 2> /dev/null \
| while read LINE
do
if [ -n "${PREFIX}" ]
then
echo " ${PREFIX} \"${LINE}\""
else
echo " ${LINE}"
fi
done
echo
done
rm -rf ${DUPLICATES_FILE}
}
__size() {
__crossplatform
find "${1}" -type f -exec ${STAT} {} ';' \
| sort -n \
| uniq -c \
| while read LINE
do
COUNT=$( echo ${LINE} | awk '{print $1}' )
[ ${COUNT} -eq 1 ] && continue
SIZE=$( echo ${LINE} | awk '{print $2}' )
SIZE_KB=$( echo ${SIZE} / 1024 | bc )
echo "count: ${COUNT} | size: ${SIZE_KB}KB (${SIZE} bytes)"
if [ -n "${PREFIX}" ]
then
find ${1} -type f -size ${SIZE}c -exec echo " ${PREFIX} \"{}\"" ';'
else
# find ${1} -type f -size ${SIZE}c -exec echo " {} " ';' -exec du -h " {}" ';'
find ${1} -type f -size ${SIZE}c -exec echo " {} " ';'
fi
echo
done
}
__file() {
__crossplatform
find "${1}" -type f \
| xargs -n 1 basename 2> /dev/null \
| tr '[A-Z]' '[a-z]' \
| sort -n \
| uniq -c \
| sort -n -r \
| while read LINE
do
COUNT=$( echo ${LINE} | awk '{print $1}' )
[ ${COUNT} -eq 1 ] && break
FILE=$( echo ${LINE} | cut -d ' ' -f 2-10000 2> /dev/null )
echo "count: ${COUNT} | file: ${FILE}"
FILE=$( echo ${FILE} | sed -e s/'\['/'\\\['/g -e s/'\]'/'\\\]'/g )
if [ -n "${PREFIX}" ]
then
find ${1} -iname "${FILE}" -exec echo " ${PREFIX} \"{}\"" ';'
else
find ${1} -iname "${FILE}" -exec echo " {}" ';'
fi
echo
done
}
# main()
[ ${#} -ne 2 ] && __usage
[ ! -d "${2}" ] && __usage
DUPLICATES_FILE="/tmp/$( basename ${0} )_DUPLICATES_FILE.tmp"
case ${1} in
(-n) __file "${2}" ;;
(-m) __md5 "${2}" ;;
(-s) __size "${2}" ;;
(-N) __prefix; __file "${2}" ;;
(-M) __prefix; __md5 "${2}" ;;
(-S) __prefix; __size "${2}" ;;
(*) __usage ;;
esac
If the find command is not working for you, you may have to change it. For example
OLD : find "${1}" -type f | xargs -n 1 basename
NEW : find "${1}" -type f -printf "%f\n"
You can use:
find -type f -exec readlink -m {} \; | gawk 'BEGIN{FS="/";OFS="/"}{$NF=tolower($NF);print}' | sort | uniq -c
Where:
find -type f
recursively prints every file's path.
-exec readlink -m {} \;
resolves each path to an absolute path.
gawk 'BEGIN{FS="/";OFS="/"}{$NF=tolower($NF);print}'
lowercases only the filename component.
sort | uniq -c
groups identical lowercased paths (uniq needs sorted input); -c prefixes each with its count of duplicates.
Little bit late to this one, but here's the version I went with:
find . -type f | awk -F/ '{print $NF}' | sort -f | uniq -i -d
Here we are using:
find - find all files under the current dir
awk - remove the file path part of the filename
sort - sort case insensitively
uniq - find the dupes from what makes it through the pipe
(Inspired by @mpez0's answer, and @SimonDowdles' comment on @paxdiablo's answer.)
You can check duplicates in a given directory with GNU awk:
gawk 'BEGINFILE {if ((seen[tolower(FILENAME)]++)) print FILENAME; nextfile}' *
This uses BEGINFILE to perform some action before going on and reading a file. In this case, it keeps track of the names that have appeared in an array seen[] whose indexes are the names of the files in lowercase.
If a name has already appeared, no matter its case, it prints it. Otherwise, it just jumps to the next file.
See an example:
$ tree
.
├── bye.txt
├── hello.txt
├── helLo.txt
├── yeah.txt
└── YEAH.txt
0 directories, 5 files
$ gawk 'BEGINFILE {if ((seen[tolower(FILENAME)]++)) print FILENAME; nextfile}' *
helLo.txt
YEAH.txt
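The same seen[] idea works on find output for a recursive, per-directory check; a POSIX awk sketch (breaks on filenames containing newlines):

```shell
# Sketch: for each path from find, lowercase only the basename and key the
# seen[] array on directory + lowercased name; any repeat is a case-duplicate.
find . -type f | awk '{
  path = $0
  base = path; sub(".*/", "", base)                     # strip directory part
  dir  = substr(path, 1, length(path) - length(base))   # keep directory part
  if (seen[dir tolower(base)]++) print path             # second+ occurrence
}'
```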
I just used fdupes on CentOS to clean up a whole buncha duplicate files...
yum install fdupes

Resources