Zipping one-month-old files - Linux

I have written a script to zip a set of files into one zip file if the number of files goes above a limit.
limit=1000 #limit the number of files
files=( /mnt/md0/capture/dcn/*.pcap) #file format to be zipped
if (( ${#files[@]} > limit )); then #if number of files above limit
zip -j /mnt/md0/capture/dcn/capture_zip-$(date "+%b_%d_%Y_%H_%M_%S").zip /mnt/md0/capture/dcn/*.pcap
fi
I need to modify this, so that the script checks the number of files from the previous month rather than the whole set of files. How do I implement that?

This script perhaps.
#!/bin/bash
[ -n "$BASH_VERSION" ] || {
echo "You need Bash to run this script."
exit 1
}
shopt -s extglob || {
echo "Unable to enable extglob option."
exit 1
}
LIMIT=1000
FILES=(/mnt/md0/capture/dcn/*.pcap)
ONE_MONTH_BEFORE=0
ONE_MONTH_OLD_FILES=()
read ONE_MONTH_BEFORE < <(date -d 'TODAY - 1 month' '+%s') && [[ $ONE_MONTH_BEFORE == +([[:digit:]]) && ONE_MONTH_BEFORE -gt 0 ]] || {
echo "Unable to get timestamp one month before current day."
exit 1
}
for F in "${FILES[@]}"; do
read TIMESTAMP < <(date -r "$F" '+%s') && [[ $TIMESTAMP == +([[:digit:]]) && TIMESTAMP -le ONE_MONTH_BEFORE ]] && ONE_MONTH_OLD_FILES+=("$F")
done
if [[ ${#ONE_MONTH_OLD_FILES[@]} -gt LIMIT ]]; then
# echo "Zipping ${ONE_MONTH_OLD_FILES[*]}." ## Just an example message you can create.
zip -j "/mnt/md0/capture/dcn/capture_zip-$(date '+%b_%d_%Y_%H_%M_%S').zip" "${ONE_MONTH_OLD_FILES[@]}"
fi
Make sure you save the script in Unix file format and run it with bash script.sh.
You could also modify the script to get the files from its arguments instead, with:
FILES=("$@")

Complete update:
#!/bin/bash
#Limit of your choice
LIMIT=1000
#Get the number of files matching `*.pcap`, last modified more than 30 days ago
NUMBER=$(find /yourdirectory -maxdepth 1 -name "*.pcap" -mtime +30 | wc -l)
if [[ $NUMBER -gt $LIMIT ]]
then
FILES=$(find /yourdirectory -maxdepth 1 -name "*.pcap" -mtime +30)
zip archive.zip $FILES
fi
The reason I run find twice is that the bash array is delimited by spaces rather than \n, and I couldn't find a clean way to count the number of files; you might want to do some research on that so find only runs once.
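One way to run find only once is a sketch like the following, which collects the matches NUL-delimited into a bash array (GNU findutils and bash 4.4+ assumed; the directory and limit are the asker's values):

```shell
#!/bin/bash
DIR=/mnt/md0/capture/dcn
LIMIT=1000

# -print0 plus mapfile -d '' keeps names with spaces or newlines intact,
# and the array length gives the count without a second find run.
mapfile -d '' OLD_FILES < <(find "$DIR" -maxdepth 1 -name '*.pcap' -mtime +30 -print0)

if (( ${#OLD_FILES[@]} > LIMIT )); then
    # Quoted array expansion passes each name as one argument to zip.
    zip -j "$DIR/capture_zip-$(date '+%b_%d_%Y_%H_%M_%S').zip" "${OLD_FILES[@]}"
fi
```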

Just replace your if line with
if [[ "$(find $(dirname "$files") -maxdepth 1 -wholename "$files" -mtime -30 | wc -l)" -gt "$limit" ]]; then
From left to right this expression
searches (find)
in the path of your pattern ($(dirname "$files") strips away everything from the last "/")
but not in its subdirectories (-maxdepth 1)
for files matching your pattern (-wholename "$files")
that are newer than 30 days (-mtime -30)
and counts the number of those files (wc -l)
I prefer -gt for comparisons, but otherwise it is the same as in your example.
Note that this will only work when all your files are in the same directory!

Related

Move files to folder by date in a bash script

I have few files named as per year+month+date format.
Example:
20220101
20220102
20220103
20220104
..
20220130
20220131
As the files are generated daily, I need to move the first 2 (20220101, 20220102) and the last 2 (20220130, 20220131) files into a specific folder every month. Can someone help me out with how I can write the script?
This helped me a long back -
#!/bin/bash
DIR=/Users/limeworks/Downloads/target
target=$DIR
cd "$DIR"
for file in *; do
# Top-level folder name
year=$(stat -f "%Sm" -t "%Y" "$file")
# Secondary folder name
subfolderName=$(stat -f "%Sm" -t "%d-%m-%Y" "$file")
if [ ! -d "$target/$year" ]; then
mkdir "$target/$year"
echo "starting new year: $year"
fi
if [ ! -d "$target/$year/$subfolderName" ]; then
mkdir "$target/$year/$subfolderName"
echo "starting new day & month folder: $subfolderName"
fi
echo "moving file $file"
mv "$file" "$target/$year/$subfolderName"
done
Well, if you want to do this in bash, I would suggest having a single script file and one log file to keep track of the current/previous month.
#!/bin/bash
x=$(date +%D | cut -c 1,2 | sed 's|^0||')
y=$(sed -n 1p date.log 2>/dev/null)
if ! [ -f date.log ]; then
printf "$x" > date.log
exit 0
fi
if [[ $y -ge 0 && $y -le 12 && $x != $y ]]; then
#if the current month does not equal the previous month then everything here will be executed
echo "a new month is here"
fi
sed -i "1s/^.*$/$x/" date.log
What this script essentially does is create a log file containing the current month if it doesn't already exist. When executed again, it compares the new month value to the one contained in the log file; if they don't match, it executes everything where the commented text is, which would most likely be a bunch of mv commands.
Try this Shellcheck-clean code:
#! /bin/bash -p
datefiles=( 20[0-9][0-9][01][0-9][0-3][0-9] )
mv -n -v -- "${datefiles[@]:0:2}" "${datefiles[@]: -2}" /path/to/folder
datefiles=( 20[0-9][0-9][01][0-9][0-3][0-9] ) makes an array of the files in the current directory with date-formatted names, sorted by name.
"${datefiles[#]:0:2}" expands to the first two elements in the datefiles array.
"${datefiles[#]: -2}" expands to the last two elements in the datefiles array.
You'll need to change /path/to/folder.
Unless it is absolutely guaranteed that there will always be at least 4 date files, you should add a check on the number of files found (e.g. if (( ${#datefiles[*]} >= 4 )) ...).
$ string="20220101 20220102 20220103 20220104 .. 20220130 20220131"
$ awk '{ print |"mv " $1" "$2" "$(NF-1)" "$NF " /your/folder"}' <<<"$string"
or
$ myArray=(20220101 20220102 20220103 20220104 .. 20220130 20220131)
$ mv ${myArray[0]} ${myArray[1]} ${myArray[-2]} ${myArray[-1]} /your/folder
Files to array
$ myArray=($(find /path/to/files -mindepth 1 -maxdepth 1 -type f -name "[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]" -print))
or
$ readarray myArray < <(find /path/to/files -mindepth 1 -maxdepth 1 -type f -name "[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]")
My purpose has been served.
for i in {1..12}; do
mv $(date -d "20220101 $i months" +'%Y%m%d') /path/to/folder/
mv $(date -d "20220102 $i months" +'%Y%m%d') /path/to/folder/
mv $(date -d "20220101 + $i month - 1 day" +'%Y%m%d') /path/to/folder/
mv $(date -d "20220101 + $i month - 2 day" +'%Y%m%d') /path/to/folder/
done
That's the solution. Thanks to everyone who participated.

bash script in loop which counts numbers of files and directories

I need to write a script with a loop which will count the number of files and directories and indicate which is greater and by how much, e.g.: there are 10 more files than directories.
I was trying something like this, but it just shows files and directories, and I have no idea how to indicate which is greater etc. Thanks for any help.
shopt -s dotglob
count=0
for dir in *; do
test -d "$dir" || continue
test . = "$dir" && continue
test .. = "$dir" && continue
((count++))
done
echo $count
files=0
for fname in *; do
test -f "$fname" && ((files++))
done
echo $files
Here is a recursive dir walk I used for something a while back. Added counting of dirs and files:
#!/bin/sh
# recursive directory walk
loop() {
for i in *
do
if [ -d "$i" ]
then
dir=$((dir+1))
cd "$i"
loop
else
file=$((file+1))
fi
done
cd ..
}
loop
echo dirs: $dir, files: $file
Paste it to a script.sh and run with:
$ sh script.sh
dirs: 1, files: 11
You can use the find command to make things simpler.
The following command will list all the files in the given path:
find "path" -mindepth 1 -maxdepth 1 -type f
And also using the -type d you will get the directories.
Piping find into the wc -l will give you the number instead of the actual file and directory names, so:
root="${1:-.}"
files=$( find "$root" -mindepth 1 -maxdepth 1 -type f | wc -l)
dirs=$( find "$root" -mindepth 1 -maxdepth 1 -type d | wc -l)
if [ $files -gt $dirs ]; then
echo "there are $((files - dirs)) more files"
elif [ $files -lt $dirs ]; then
echo "there are $((dirs - files)) more dirs"
else
echo "there are the same"
fi
You could use find to get the number of files/folders in a directory. Use wc -l to count the number of found paths, which you can use to calculate/show the result:
#!/bin/bash
# Path to search
search="/Users/me/Desktop"
# Get number of files
no_files=$(find "$search" -type f | wc -l )
# Number of folders
no_folders=$(find "$search" -type d | wc -l )
echo "Files: ${no_files}"
echo "Folders: ${no_folders}"
# Calculate diff
diff=$((no_files - no_folders))
# Check if there are more folders or files
if [ "$diff" -gt 0 ]; then
echo "There are $diff more files then folders!"
else
diff=$((diff * -1 ) # Invert negative number to positive (-10 -> 10)
echo "There are $diff more folders then files!"
fi;
Files: 13
Folders: 2
There are 11 more files then folders!

Run script skipping files

I have a quite simple script I'd like to write just using bash.
Given a folder with 0..N *.XML files; I want to sort those by name and remove N-10 files (leave the last 10 in place).
I've been tinkering with find and tail/head but couldn't figure out a way:
find /mnt/user/Temporary/1 -name *.xml | tail -n +10 | rm
Please read up. It is about keeping the last 10. If there are 10 or fewer files, none should be deleted!
EDIT:
As someone closed but did not reopen the question, here is the solution for those getting here with the same question.
#!/bin/bash
files=()
while IFS= read -r -d $'\0'; do
files+=("$REPLY")
done < <(find . -name '*.xml' -print0 | sort -z)
Limit=$((${#files[@]}-10))
count=0
while [ $Limit -gt $count ]; do
rm "${files[count]}"
let count=count+1
done
Maybe some linux "pro" can optimize it or give it some parameters (like limit, path and file pattern) to make it callable anywhere.
EDIT: New answer
#!/usr/bin/env bash
files=$(find *.xml | wc -l)
[ "$files" -lt 10 ] && echo "Files are less than 10..." && exit 1
count=$(($files-10))
for i in $(find *.xml | sort -V); do
[ $count -eq 0 ] && echo "Done" && exit 1
rm $i
((count--))
done
$files stores the number of *.xml files in the folder
if the number is less than 10, exit
set a counter to the number of files to delete
loop through each file in order
if the counter is equal to 0, exit
if not, remove the file and decrement the counter
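With GNU coreutils, the whole keep-the-last-10 step can also be sketched as one NUL-safe pipeline: head -n -10 prints everything except the last 10 entries, i.e. exactly the files to delete (directory path taken from the question):

```shell
#!/bin/bash
# sort -z orders the NUL-delimited names; head -z -n -10 drops the last 10,
# so only the older files reach xargs; -r skips rm when nothing matches.
find /mnt/user/Temporary/1 -maxdepth 1 -name '*.xml' -print0 |
    sort -z |
    head -z -n -10 |
    xargs -0 -r rm --
```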

Archive old files only AND re-construct folder tree in archive

I want to move all my files older than 1000 days, which are distributed over various subfolders, from /home/user/documents into /home/user/archive. The command I tried was
find /home/user/documents -type f -mtime +1000 -exec rsync -a --progress --remove-source-files {} /home/user/archive \;
The problem is, that (understandably) all files end up being moved into the single folder /home/user/archive. However, what I want is to re-construct the file tree below /home/user/documents inside /home/user/archive. I figure this should be possible by simply replacing a string with another somehow, but how? What is the command that serves this purpose?
Thank you!
I would take this route instead of rsync:
Change directories so we can deal with relative path names instead of absolute ones:
cd /home/user/documents
Run your find command and feed the output to cpio, requesting it to make hard-links (-l) to the files, creating the leading directories (-d) and preserve attributes (-m). The -print0 and -0 options use nulls as record terminators to correctly handle file names with whitespace in them. The -l option to cpio uses links instead of actually copying the files, so very little additional space is used (just what is needed for the new directories).
find . -type f -mtime +1000 -print0 | cpio -dumpl0 /home/user/archive
Re-run your find command and feed the output to xargs rm to remove the originals:
find . -type f -mtime +1000 -print0 | xargs -0 rm
Here's a script too.
#!/bin/bash
[ -n "$BASH_VERSION" ] && [[ BASH_VERSINFO -ge 4 ]] || {
echo "You need Bash version 4.0 to run this script."
exit 1
}
# SOURCE=/home/user/documents/
# DEST=/home/user/archive/
SOURCE=$1
DEST=$2
declare -i DAYSOLD=10
declare -a DIRS=()
declare -A DIRS_HASH=()
declare -a FILES=()
declare -i E=0
# Check directories.
[[ -n $SOURCE && -d $SOURCE && -n $DEST && -d $DEST ]] || {
echo "Source or destination directory may be invalid."
exit 1
}
# Format source and dest variables properly:
SOURCE=${SOURCE%/}
DEST=${DEST%/}
SOURCE_LENGTH=${#SOURCE}
# Copy directories first.
echo "Creating directories."
while read -r FILE; do
DIR=${FILE%/*}
if [[ -z ${DIRS_HASH[$DIR]} ]]; then
PARTIAL=${DIR:SOURCE_LENGTH}
if [[ -n $PARTIAL ]]; then
TARGET=${DEST}${PARTIAL}
echo "'$TARGET'"
mkdir -p "$TARGET" || (( E += $? ))
chmod --reference="$DIR" "$TARGET" || (( E += $? ))
chown --reference="$DIR" "$TARGET" || (( E += $? ))
touch --reference="$DIR" "$TARGET" || (( E += $? ))
DIRS+=("$DIR")
fi
DIRS_HASH[$DIR]=.
fi
done < <(exec find "$SOURCE" -mindepth 1 -type f -mtime +"$DAYSOLD")
# Copy files.
echo "Copying files."
while read -r FILE; do
PARTIAL=${FILE:SOURCE_LENGTH}
cp -av "$FILE" "${DEST}${PARTIAL}" || (( E += $? ))
FILES+=("$FILE")
done < <(exec find "$SOURCE" -mindepth 1 -type f -mtime +"$DAYSOLD")
# Remove old files.
if [[ E -eq 0 ]]; then
echo "Removing old files."
rm -fr "${DIRS[@]}" "${FILES[@]}"
else
echo "An error occurred during copy. Not removing old files."
exit 1
fi

Very slow script

I have a problem. I need to write a bash script that will find all files and directories in given path and will display some info about results. Allowed time: 30 seconds.
#!/bin/bash
DIRS=0
FILES=0
OLD_FILES=0
LARGE_FILES=0
TMP_FILES=0
EXE_FILES=0
IMG_FILES=0
SYM_LINKS=0
TOTAL_BYTES=0
#YEAR_AGO=$(date -d "now - 1 year" +%s)
#SECONDS_IN_YEAR=31536000
function check_dir {
for entry in "$1"/*
do
if [ -d "$entry" ]; then
((DIRS+=1))
check_dir "$entry"
else if [ -f "$entry" ]; then
((FILES+=1))
#SIZE=$(stat -c%s "$entry")
#((TOTAL_BYTES+=SIZE))
#CREATE_DATE=$(date -r "$entry" +%s)
#CREATE_DATE=$(stat -c%W "$entry")
#DIFF=$((CREATE_DATE-YEAR_AGO))
#if [ $DIFF -ge $SECONDS_IN_YEAR ]; then
# ((OLD_FILES+=1))
#fi
fi
fi
done
}
if [ $# -ne 2 ]; then
echo "Usage: ./srpt path emailaddress"
exit 1
fi
if [ ! -d $1 ]; then
echo "Provided path is invalid"
exit 1
fi
check_dir $1
echo "Execution time $SECONDS"
echo "Dicrecoties $DIRS"
echo "Files $FILES"
echo "Sym links $SYM_LINKS"
echo "Old files $OLD_FILES"
echo "Large files $LARGE_FILES"
echo "Graphics files $IMG_FILES"
echo "Temporary files $TMP_FILES"
echo "Executable files $EXE_FILES"
echo "Total file size $TOTAL_BYTES"
Here are result of executing with commented lines above:
Execution time 1
Directories 931
Files 14515
Sym links 0
Old files 0
Large files 0
Graphics files 0
Temporary files 0
Executable files 0
Total file size 0
If I uncomment
SIZE=$(stat -c%s "$entry")
((TOTAL_BYTES+=SIZE))
I got:
Execution time 31
Directories 931
Files 14515
Sym links 0
Old files 0
Large files 0
Graphics files 0
Temporary files 0
Executable files 0
Total file size 447297022
31 seconds. How can I speed up my script?
Finding the files created more than one year ago adds another 30 seconds.
More often than not, using loops in shells is an indication that you're going for the wrong approach.
A shell is before all a tool to run other tools.
Though it can do counting, awk is a better tool to do it.
Though it can list and find files, find is better at it.
The best shell scripts are those that manage to have a few tools contribute to the task, not those that start millions of tools in sequence and where all the job is done by the shell.
Here, typically a better approach would be to have find find the files and gather all the data you need, and have awk munch it and return the statistics. Here using GNU find and GNU awk (for RS='\0') and GNU date (for -d):
find . -printf '%y.%s.%Ts%p\0' |
awk -v RS='\0' -F'[.]' -v yearago="$(date -d '1 year ago' +%s)" '
{
type[$1]++;
if ($1 == "f") {
total_size+=$2
if ($3 < yearago) old++
if (!index($NF, "/")) ext[tolower($NF)]++
}
}
END {
printf("%20s: %d\n", "Directories", type["d"])
printf("%20s: %d\n", "Total size", total_size)
printf("%20s: %d\n", "old", old)
printf("%20s: %d\n", "jpeg", ext["jpg"]+ext["jpeg"])
printf("%20s: %d\n", "and so on...", 0)
}'
The key is to avoid firing up too many utilities. You seem to be invoking two or three per file, which will be quite slow.
Also, the comments show that handling filenames, in general, is complicated, particularly if the filenames might have spaces and/or newlines in them. But you don't actually need the filenames, if I understand your problem correctly, since you are only using them to collect information.
If you're using gnu find, you can extract the stat information directly from find, which will be quite a lot more efficient, since find needs to do a stat() anyway on every file. Here's an example, which pipes from find into awk for simplicity:
summary() {
find "$#" '(' -type f -o -type d ')' -printf '%y %s %C#\n' |
awk '$1=="d"{DIR+=1;next}
$1!="f"{next}
{REG+=1;SIZE+=$2}
$3<'$(date +%s -d"last year")'{OLD+=1}
END{printf "Directories: %d\nFiles: %d\nOld files: %d\nTotal Size: %d\n",
DIR, REG, OLD, SIZE}'
}
On my machine, that summarised 28718 files in 4817 directories in one-tenth of a second elapsed time. YMMV.
You surely want to avoid parsing the output of find as you did (see my comment): it'll break whenever you have spaces in filenames.
You surely want to avoid forking to external processes like your $(stat ...) or $(date ...) statements: each fork costs a lot!
It turns out that find on its own can do quite a lot. For example, if we want to count the numbers of files, dirs and links.
We all know the naive way in bash (pretty much what you've done):
#!/bin/bash
shopt -s globstar
shopt -s nullglob
shopt -s dotglob
nbfiles=0
nbdirs=0
for f in ./**; do
[[ -f $f ]] && ((++nbfiles))
[[ -d $f ]] && ((++nbdirs))
done
echo "There are $nbdirs directories and $nbfiles files, and we're very happy."
Caveat. This method counts links according to what they link to: a link to a file will be counted as a file.
How about the find way? Count number of files, directories and (symbolic) links:
#!/bin/bash
nbfiles=0
nbdirs=0
nblinks=0
while read t n; do
case $t in
dirs) ((nbdirs+=n+1)) ;;
files) ((nbfiles+=n+1)) ;;
links) ((nblinks+=n+1)) ;;
esac
done < <(
find . -type d -exec bash -c 'echo "dirs $#"' {} + \
-or -type f -exec bash -c 'echo "files $#"' {} + \
-or -type l -exec bash -c 'echo "links $#"' {} + 2> /dev/null
)
echo "There are $nbfiles files, $nbdirs dirs and $nblinks links. You're happy to know aren't you?"
Same principles, using associative arrays, more fields and more involved find logic:
#!/bin/bash
declare -A fields
while read f n; do
((fields[$f]+=n))
done < <(
find . -type d -exec bash -c 'echo "dirs $(($#+1))"' {} + \
-or -type f -exec bash -c 'echo "files $(($#+1))"' {} + -printf 'size %s\n' \
\( \
\( -iname '*.jpg' -printf 'jpg 1\n' -printf 'jpg_size %s\n' \) \
-or -size +100M -printf 'large 1\n' \
\) \
-or -type l -exec bash -c 'echo "links $(($#+1))"' {} + 2> /dev/null
)
for f in "${!fields[@]}"; do
printf "%s: %s\n" "$f" "${fields[$f]}"
done
I hope this will give you some ideas! Good luck!
