bash loop file echo to each file in the directory - linux

I searched for a while and tried it myself, but I have been unable to get this sorted so far. My folder contains these 6 files:
1.txt, 2.txt, 3.txt, 4.txt, 5.txt, 6.txt
I want to get each file's modified time and echo that timestamp into the file itself:
#!/bin/bash
thedate=`ls | xargs stat -s | grep -o "st_mtime=[0-9]*" | sed "s/st_mtime=//g"` #get file modified time
files=$(ls | grep -Ev '(5.txt|6.txt)$') #exclude 5.txt and 6.txt
for i in $thedate; do
    echo $i >> $files
done
I want to insert each timestamp into each file, but I'm getting an "ambiguous redirect" error. Am I doing it incorrectly? Thanks

In this case, files is a "list" of files, so you probably want to add another loop to handle them one by one.
Your description is slightly confusing, but if your intent is to append the last modification date of each file to that file, you can do something like:
for fspec in [1-4].txt ; do
    stat -c %y ${fspec} >>${fspec}
done
Note I've used stat -c %y to get the modification time, such as 2017-02-09 12:21:22.848349503 +0800. I'm not sure which variant of stat you're using, but mine doesn't have a -s option. You can still use your option; you just have to ensure it's run on each file in turn, probably something like this (inside the for loop above):
stat -s ${fspec} | grep -o "st_mtime=[0-9]*" | sed "s/st_mtime=//g" >>${fspec}
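For reference, a minimal sketch of that per-file version, assuming BSD/macOS stat (which is where the -s option, printing st_mtime=... fields, comes from):
#!/bin/bash
# a sketch assuming BSD/macOS stat; GNU stat has no -s option
for fspec in [1-4].txt ; do
    stat -s "$fspec" | grep -o "st_mtime=[0-9]*" | sed "s/st_mtime=//" >> "$fspec"
done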

You cannot redirect output to several files at once, as in > $files.
To process several files you need something like:
#!/bin/bash
for f in ./[0-4].txt ; do
    # get file modified time (in seconds since the epoch)
    thedate="$(stat --printf='%Y\n' "$f")"
    echo "$thedate" >> "$f"
done
If you want a human-readable time format, change %Y to %y:
thedate="$(stat --printf='%y\n' "$f")"

Related

bash/awk/unix detect changes in lines of csv files

I have a timestamp in this format:
(normal_file.csv)
timestamp
19/02/2002
19/02/2002
19/02/2002
19/02/2002
19/02/2002
19/02/2002
The dates are usually uniform; however, there are files with an irregular date pattern, such as in this example:
(abnormal_file.csv)
timestamp
19/02/2002
19/02/2003
19/02/2005
19/02/2006
In my directory, there are hundreds of files like normal_file.csv and abnormal_file.csv.
I want to write a bash or awk script that detects the date pattern in all files of a directory. Files with an abnormal pattern, like abnormal_file.csv, should be moved automatically to a new, separate directory (let's say dir_different/).
Currently, I have tried the following:
#!/bin/bash
mkdir dir_different
for FILE in *.csv;
do
    # pipe 1: detect the changes in the line
    # pipe 2: print the timestamp column (first column, columns are comma-separated)
    awk '$1 != prev {print ; prev = $1}' < $FILE | awk -F , '{print $1}'
done
If the timestamps in a given file are normal, then only a single timestamp will be printed; but for abnormal files, multiple dates will be printed.
I am not sure how to separate the abnormal files from the normal files, and I have tried the following:
do
    output=$(awk 'FNR==3{print $0}' $FILE)
    echo ${output}
    if [[ ${output} =~ ([[:space:]]) ]]
    then
        mv $FILE dir_different/
    fi
done
Or is there an easier method to detect changes in lines and separate files that have different lines? Thank you for any suggestions :)
Assuming that none of your "normal" CSV files have trailing newlines, this should do the separation just fine:
#!/bin/bash
mkdir -p dir_different
for FILE in *.csv;
do
    if awk '{a[$1]++}END{if(length(a)<=2){exit 1}}' "$FILE" ; then
        echo mv "$FILE" dir_different
    fi
done
After a dry-run just get rid of the echo :)
Edit:
{a[$1]++} This bit creates an array a that gets the first field of each line as an index, and that gets incremented every time the same value is seen.
END{if(length(a)<=2){exit 1}} This checks how many elements are in the array. If there are fewer than 3 (which should be the case if it's always the same date and we only get 1 header plus 1 date), exit the processing with 1.
"$FILE" is part of the bash script, not awk, and I quoted your variable out of habit, should you ever have files w/ spaces in their names you'll see why :)
So, a "normal" file contains only two different lines:
timestamp
dd/mm/yyyy
Testing if a file is normal is thus as simple as:
[ $(sort -u file.csv | wc -l) -eq 2 ]
This leads to the following possible solution:
#!/usr/bin/env bash
mkdir -p dir_different
for FILE in *.csv;
do
    if [ $(sort -u "$FILE" | wc -l) -ne 2 ] ; then
        echo mv "$FILE" dir_different
    fi
done

How to use a line read from a file in a grep command

First, I'm sorry for my poor English.
I want to read a file (tel.txt) that contains many telephone numbers (one number per line) and use each line in a grep command to search for that specific number in a source file (another file).
I wrote this code :
dir="/home/mujan/Desktop/data/ADSL_CDR_Text_Parts_A"
file="$dir/tel.txt"
datafile="$dir/ADSL_CDR_Like_Tct4_From_960501_to_97501_Part0.txt"
while IFS= read -r line
do
    current="$line"
    echo `grep -F $current "$datafile" >> output.txt`
done < $file
A sample of the tel file:
44001547
44001478
55421487
But that code returns nothing!
When I set the current variable to a literal value, it works correctly.
What's happening?
Your grep command is redirected to write its output to a file, so you don't see it on the terminal.
Anyway, you should probably be using the much simpler and faster
grep -Ff "$file" "$datafile"
Add | tee -a output.txt if you want to save the output to a file and see it at the same time.
echo `command` is a buggy and inefficient way to write command. (echo "`command`" would merely be inefficient.) There is no reason to capture standard output into a string just so that you can echo that string to standard output.
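Putting those two points together, a minimal sketch using the paths from the question:
#!/bin/bash
dir="/home/mujan/Desktop/data/ADSL_CDR_Text_Parts_A"
file="$dir/tel.txt"
datafile="$dir/ADSL_CDR_Like_Tct4_From_960501_to_97501_Part0.txt"
# one grep over the whole data file, using every line of tel.txt as a fixed-string pattern;
# tee shows matches on the terminal while also appending them to output.txt
grep -Ff "$file" "$datafile" | tee -a output.txt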
Why don't you search for the line variable directly? I've done some tests; this script works on my Linux (CentOS 7.x) with the bash shell:
#!/bin/bash
file="/home/mujan/Desktop/data/ADSL_CDR_Text_Parts_A/tel.txt"
while IFS= read -r line
do
    echo `grep "$line" /home/mujan/Desktop/data/ADSL_CDR_Text_Parts_A/ADSL_CDR_Like_Tct4_From_960501_to_97501_Part0.txt >> output.txt`
done < $file
Give it a try... It shows nothing on the screen because you're redirecting the output to output.txt, so the matching results are saved there.
You should use file descriptors when reading with a while loop. Instead, use a for loop to avoid false redirections:
dir="/home/mujan/Desktop/data/ADSL_CDR_Text_Parts_A"
file="$dir/tel.txt"
datafile="$dir/ADSL_CDR_Like_Tct4_From_960501_to_97501_Part0.txt"
for line in `cat $file`
do
    current="$line"
    echo `grep -F $current "$datafile" >> output.txt`
done

Linux: batch filename change adding creation date

I have a directory with a lot of sub-directories containing files.
For each WAV file I would like to rename it by adding the creation date (the date when the WAV file was first created) at the beginning of the file name, without changing the timestamps of the file itself.
The next step would be to convert the WAV file to an MP3 file, so I can save hard drive space.
For that purpose, I'm trying to create a bash script, but I'm having some issues.
I want to keep the same structure as the original directory, and therefore I was thinking of something like:
for file in `ls -1 *.wav`
do name=`stat -c %y $file | awk -F"." '{ print $1 }' | sed -e "s/\-//g" -e "s/\://g" -e "s/[ ]/_/g"`.wav
cp -r --preserve=timestampcp $dir_original/$file $dir_converted/$name
done
Don't use ls to generate a list of file names, just let the shell glob them (that's what ls *.wav does anyway):
for file in ./*.wav ; do
I think you want the timestamp in the format YYYYMMDD_HHMMSS?
You could use GNU date with stat to have a somewhat neater control of the output format:
epochtime=$(stat -c %Y "$file" )
name=$(date -d "@$epochtime" +%Y%m%d_%H%M%S).wav
stat -c %Y (or %y) gives the last modification date, but you can't really get the date of the file creation on Linux systems.
That cp looks ok, except for the stray cp at the end of timestampcp, but that must be a typo (GNU cp spells the attribute --preserve=timestamps). If you do *.wav, the file names will be relative to the current directory anyway, so there's no need to prefix with $dir_original/.
If you want to walk through a whole subdirectory, use Bash's globstar feature, or find. Something like this:
shopt -s globstar
cd "$sourcedir"
for file in ./**/*.wav ; do
    epochtime=$(stat -c %Y "$file")
    name=$(date -d "@$epochtime" +%Y%m%d_%H%M%S).wav
    dir=$(dirname "$file")
    mkdir -p "$target/$dir"
    cp -r --preserve=timestamps "$file" "$target/$dir/$name"
done
The slight inconvenience here is that cp can't create the directories in the path, so we need to use mkdir there. Also, I'm not sure if you wanted to keep the original filename as part of the resulting one, this would remove it and just replace the file names with the timestamp.
I did some experimenting with the calculation of name to see if I could get it more succinctly, and came up with this:
name=$(date "+%Y%m%d_%H%M%S" -r "$file")
I wanted to prefix all file names in that folder with the date they were created, and the below works perfectly.
#############################
#!/bin/sh
for file in *.JPG;
do
    mv -f "$file" "$(date -r "$file" +"%Y%m%d_%H_%M_%S")_${file}.jpg"
done
##############################

Extract part of a file name in bash

I have a folder with lots of files having a pattern, which is some string followed by a date and time:
BOS_CRM_SUS_20130101_10-00-10.csv (3 strings before date)
SEL_DMD_20141224_10-00-11.csv (2 strings before date)
SEL_DMD_SOUS_20141224_10-00-10.csv (3 strings before date)
I want to loop through the folder, extract only the part before the date, and output it to a file.
Output
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_
This is my script, but it is not working:
#!/bin/bash
# script variables
FOLDER=/app/list/l088app5304d1/socles/Data/LEMREC/infa_shared/Shell/Check_Header_T24/
LOG_FILE=/app/list/l088app5304d1/socles/Data/LEMREC/infa_shared/Shell/Check_Header_T24/log
echo "Starting the programme at: $(date)" >> $LOG_FILE
# Getting part of the file name from FOLDER
for file in `ls $FOLDER/*.csv`
do
mv "${file}" "${file/date +%Y%m%d HH:MM:SS}" 2>&1 | tee -a $LOG_FILE
done #> $LOG_FILE
Use sed with extended regex and groups to achieve this.
cat filelist | sed -r 's/(.*)[0-9]{8}_[0-9][0-9]-[0-9][0-9].[0-9][0-9].csv/\1/'
where filelist is a file with all the names you care about. Of course, this is just a placeholder, because I don't know how you are going to list all eligible files. If a glob will do, for example, you can do:
ls mydir/*.csv | sed -r 's/(.*)[0-9]{8}_[0-9][0-9]-[0-9][0-9].[0-9][0-9].csv/\1/'
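Note that with the ls mydir/*.csv form, the captured group keeps the directory prefix, so (assuming mydir holds the three example files) the output would be:
mydir/BOS_CRM_SUS_
mydir/SEL_DMD_
mydir/SEL_DMD_SOUS_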
Assuming you won't have numbers in the first part, you could use:
$ for i in *csv;do str=$(echo $i|sed -r 's/[0-9]+.*//'); echo $str; done
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_
Or with parameter substitution:
$ for i in *csv;do echo ${i%_*_*}_; done
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_
When you use ${var/pattern/replace}, the pattern must be a filename glob, not a command to execute.
Instead of using the substitution operator, use the pattern removal operator
mv "${file}" "${file%_*-*-*.csv}.csv"
% finds the shortest match of the pattern at the end of the variable, so this pattern will just match the date and time part of the filename.
The substitution:
"${file/date +%Y%m%d HH:MM:SS}"
is unlikely to do anything, because it doesn't execute date +%Y%m%d HH:MM:SS. It just treats it as a pattern to search for, and it's not going to be found.
If you did execute the command, though, you would get the current date and time, which is also (apparently) not what you find in the filename.
If that pattern is precise, then you can do the following:
echo "${file%????????_??-??-??.csv}" >> "$LOG_FILE"
Using grep:
ls *.csv | grep -Po "^([A-Za-z]+_)+"
output:
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_

Clearing archive files with linux bash script

Here is my problem,
I have a folder where multiple files are stored with a specific format:
Name_of_file.TypeMM-DD-YYYY-HH:MM
where MM-DD-YYYY-HH:MM is the time of its creation. There could be multiple files with the same name, but not with the same time, of course.
What I want is a script that can keep the 3 newest versions of each file.
So, I found one example there:
Deleting oldest files with shell
But I don't want to delete a fixed number of files; I want to keep a certain number of newer files. Is there a way to get that find command to parse out the Name_of_file and keep the 3 newest?
Here is the code I've tried yet, but it's not exactly what I need.
find /the/folder -type f -name 'Name_of_file.Type*' -mtime +3 -delete
Thanks for help!
So I decided to add my final solution in case anyone would like it. It's a combination of the 2 solutions given.
ls -r | grep -P "(.+)\d{4}-\d{2}-\d{2}-\d{2}:\d{2}" | awk 'NR > 3' | xargs rm
One line, super efficient. If anything changes in the pattern of the date or name, just change the grep -P pattern to match it. This way you are sure that only the files fitting this pattern will get deleted.
Can you be extra, extra sure that the timestamp on the file is the exact same timestamp on the file name? If they're off a bit, do you care?
The ls command can sort files by timestamp order. You could do something like this:
$ ls -t | awk 'NR > 3' | xargs rm
The ls -t lists the files by modification time, with the newest first.
The awk 'NR > 3' prints out the list of files except for the first three lines, which are the three newest.
The xargs rm will remove the files that are older than the first three.
Now, this isn't the exact solution. There are possible problems with xargs because file names might contain weird characters or whitespace. If you can guarantee that's not the case, this should be okay.
Also, you probably want to group the files by name, and keep the last three. Hmm...
ls | sed 's/MM-DD-YYYY-HH:MM*$//' | sort -u | while read file
do
    ls -t $file* | awk 'NR > 3' | xargs rm
done
The ls will list all of the files in the directory. The sed 's/MM-DD-YYYY-HH:MM*$//' will remove the date-time stamp from the file names. The sort -u will make sure you only have the unique file names. Thus:
file1.txt-01-12-1950
file2.txt-02-12-1978
file2.txt-03-12-1991
Will be reduced to just:
file1.txt
file2.txt
These are placed through the loop, and the ls $file* will list all of the files that start with the file name and suffix, but will pipe that to awk which will strip out the newest three, and pipe that to xargs rm that will delete all but the newest three.
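A concrete version of that sketch; the sed regex here is an assumption based on the MM-DD-YYYY-HH:MM format from the question, and the -r/-d flags assume GNU xargs:
ls | sed -E 's/[0-9]{2}-[0-9]{2}-[0-9]{4}-[0-9]{2}:[0-9]{2}$//' | sort -u | while read -r name
do
    # list this name's versions newest-first, skip the 3 newest, delete the rest
    ls -t "$name"* | awk 'NR > 3' | xargs -r -d '\n' rm --
done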
Assuming we're using the date in the filename to date the archive file, and that it is possible to change the date format to YYYY-MM-DD-HH:MM (as established in comments above), here's a quick and dirty shell script to keep the newest 3 versions of each file within the present working directory:
#!/bin/bash
KEEP=3 # number of versions to keep
while read FNAME; do
    NODATE=${FNAME:0:-16} # get filename without the date (remove last 16 chars)
    if [ "$NODATE" != "$LASTSEEN" ]; then # new file found
        FOUND=1; LASTSEEN="$NODATE"
    else # same file, different date
        let FOUND="FOUND + 1"
        if [ $FOUND -gt $KEEP ]; then
            echo "- Deleting older file: $FNAME"
            rm "$FNAME"
        fi
    fi
done < <(\ls -r | grep -P "(.+)\d{4}-\d{2}-\d{2}-\d{2}:\d{2}")
Example run:
[me@home]$ ls
another_file.txt2011-02-11-08:05
another_file.txt2012-12-09-23:13
delete_old.sh
not_an_archive.jpg
some_file.exe2011-12-12-12:11
some_file.exe2012-01-11-23:11
some_file.exe2012-12-10-00:11
some_file.exe2013-03-01-23:11
some_file.exe2013-03-01-23:12
[me@home]$ ./delete_old.sh
- Deleting older file: some_file.exe2012-01-11-23:11
- Deleting older file: some_file.exe2011-12-12-12:11
[me@home]$ ls
another_file.txt2011-02-11-08:05
another_file.txt2012-12-09-23:13
delete_old.sh
not_an_archive.jpg
some_file.exe2012-12-10-00:11
some_file.exe2013-03-01-23:11
some_file.exe2013-03-01-23:12
Essentially, by changing the date format in the file names to YYYY-MM-DD-HH:MM, a normal string sort (such as that done by ls) will automatically group similar files together, sorted by date-time.
The \ls -r on the last line simply lists all files within the current working directory and prints the results in reverse order so newer archive files appear first.
We pass the output through grep to extract only files that are in the correct format.
The output of that command combination is then looped through (see the while loop) and we can simply start deleting after 3 occurrences of the same filename (minus the date portion).
This pipeline will get you the 3 newest files (by modification time) in the current directory:
stat -c $'%Y\t%n' file* | sort -n | tail -3 | cut -f 2-
To get all but the 3 newest:
stat -c $'%Y\t%n' file* | sort -rn | tail -n +4 | cut -f 2-
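To actually delete everything but the 3 newest, feed that second list to rm; the -r and -d flags below assume GNU xargs, so that file names containing spaces survive:
stat -c $'%Y\t%n' file* | sort -rn | tail -n +4 | cut -f 2- | xargs -r -d '\n' rm --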