How do I create a recursive file list with md5sum in Linux and output to CSV?

I would like to list the files (ideally with an md5sum) within a directory and subdirectories in Ubuntu and output the results to a csv file. I would like the output to be in the following format.
File Name, File Path, File Size (bytes), Created Date Time (dd/mm/yyyy hh:mm:ss), Modified Date Time (dd/mm/yyyy hh:mm:ss), md5sum
I have played around with the ls command but can't seem to get the output correct. Is there a better way to do this?
Thanks

Create the following script that outputs a CSV line for a given filepath argument:
#!/bin/bash
set -eu
filepath=$1
qfilepath=${filepath//\\/\\\\} # Quote backslashes.
qfilepath=${qfilepath//\"/\\\"} # Quote doublequotes.
file=${qfilepath##*/} # Remove the path.
stats=($(stat -c "%s %W %Y" "$filepath"))
size=${stats[0]}
ctime=$(date --date @"${stats[1]}" +'%d/%m/%Y %H:%M:%S')
mtime=$(date --date @"${stats[2]}" +'%d/%m/%Y %H:%M:%S')
md5=$(md5sum < "$filepath")
md5=${md5%% *} # Remove the dash.
printf '"%s","%s",%s,%s,%s,%s\n' \
"$file" "$qfilepath" "$size" "$ctime" "$mtime" "$md5"
Now call it with
find /path/to/dir -type f -exec ~/csvline.sh {} \;
Note that the creation (birth) time is often not supported by the file system; stat then reports 0 for %W, which date renders as the start of the Unix epoch.
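As a variation on the script above, the per-file work can live in shell functions, so that one bash process handles the whole tree instead of starting a script per file. This is only a sketch: the names csv_quote, csv_row and csv_tree are made up for illustration, it assumes GNU stat, date and md5sum, it doubles embedded double quotes (the usual CSV convention) rather than backslash-escaping them, and it leaves out the creation-time column because that value is often unavailable.

```shell
#!/bin/bash
# Sketch only: csv_quote, csv_row and csv_tree are hypothetical helper names.

# Wrap a value in double quotes, doubling any embedded double quote.
csv_quote() {
    local s=$1 q='"'
    printf '"%s"' "${s//$q/$q$q}"
}

# Print one CSV row: name, path, size in bytes, modification time, md5.
csv_row() {
    local path=$1 size mtime md5
    size=$(stat -c %s -- "$path") || return
    mtime=$(date -r "$path" +'%d/%m/%Y %H:%M:%S') || return
    md5=$(md5sum < "$path") || return
    md5=${md5%% *}   # keep only the hash, drop the trailing " -"
    printf '%s,%s,%s,%s,%s\n' \
        "$(csv_quote "${path##*/}")" "$(csv_quote "$path")" \
        "$size" "$mtime" "$md5"
}

# Walk a directory tree; NUL-delimited names survive spaces and newlines.
csv_tree() {
    find "$1" -type f -print0 |
        while IFS= read -r -d '' f; do
            csv_row "$f"
        done
}
```

With these definitions loaded, something like csv_tree /path/to/dir > files.csv would write one row per regular file.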

Related

How to extract the directory from full file path

I have the following script which prints various file stats, which was kindly supplied by another user (choroba) (link).
Is there a way that this can be amended to report just the directory name of each file and not the full file path with the file name? I have tried changing filepath with dirname and I get a series of errors saying No such file or directory. Thanks for any advice.
#!/bin/bash
set -eu
filepath=$1
qfilepath=${filepath//\\/\\\\} # Quote backslashes.
qfilepath=${qfilepath//\"/\\\"} # Quote doublequotes.
file=${qfilepath##*/} # Remove the path.
stats=($(stat -c "%s %W %Y" "$filepath"))
size=${stats[0]}
ctime=$(date --date @"${stats[1]}" +'%d/%m/%Y %H:%M:%S')
mtime=$(date --date @"${stats[2]}" +'%d/%m/%Y %H:%M:%S')
md5=$(md5sum < "$filepath")
md5=${md5%% *} # Remove the dash.
printf '"%s","%s",%s,%s,%s,%s\n' \
"$file" "$qfilepath" "$size" "$ctime" "$mtime" "$md5"
You can use a combination of dirname and basename, where:
dirname will strip the last component from the full path;
basename will get the last component from the path.
So to summarize: $(basename "$(dirname "$qfilepath")") will give you the name of the last directory in the path.
Or, for the full path without the file name, just use $(dirname "$qfilepath").
Although I do not see anything in the script snippet which would produce a recursive list, you can get the directory name in the output by adding dir=$(dirname "$filepath") and modifying your printf call to use "$dir" instead of "$file".
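The two variants can be wrapped as small helpers to see the difference side by side. This is a minimal sketch; the function names dir_of and parent_name and the example path are invented for illustration. The double quotes keep paths that contain spaces intact.

```shell
#!/bin/bash
# Sketch only: dir_of and parent_name are hypothetical helper names.

# Full directory part of a path (everything except the last component).
dir_of() {
    dirname -- "$1"
}

# Name of just the last directory in the path.
parent_name() {
    basename -- "$(dirname -- "$1")"
}

dir_of '/home/user/my docs/report.csv'       # prints /home/user/my docs
parent_name '/home/user/my docs/report.csv'  # prints my docs
```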

Changing File name Dynamically in linux bash

I want to change a filename "Domain_20181012230112.csv" to "Domain_12345_20181012230112.csv", where "Domain" and "12345" are constants while 20181012230112 always changes but has a fixed length. How can I do this in bash?
If all you want is to replace Domain_ with Domain_12345_, then just do
for file in Domain_*;
do
mv "$file" "${file/Domain_/Domain_12345_}"
done
You can make it even shorter if you know that there will only be one underscore:
...
mv "$file" "${file/_/_12345_}"
...
See string substitutions for more info.
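To preview what the substitution produces before touching any files, the expansion can be tried on a plain string first. A small sketch; new_name is a hypothetical helper, and 12345 is the constant from the question.

```shell
#!/bin/bash
# Sketch only: new_name is a hypothetical helper, not part of any tool.

# Insert the constant after the first "Domain_", leaving the rest untouched.
new_name() {
    printf '%s\n' "${1/Domain_/Domain_12345_}"
}

new_name 'Domain_20181012230112.csv'   # prints Domain_12345_20181012230112.csv
```

Inside the loop this would be used as mv -- "$file" "$(new_name "$file")". Keep in mind that the glob Domain_* also matches files that were already renamed, so the loop is meant to be run once.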
You can use mv in a for loop, like this:
for file in Domain_??????????????.csv ; do ts=$(echo "$file" | cut -c8-21); mv "$file" "Domain_12345_${ts}.csv"; done
Given the one file of your example, this will essentially execute this command:
mv Domain_20181012230112.csv Domain_12345_20181012230112.csv
You can simply use the date command to get the date and time information you want
date '+%Y-%m-%d %H:%M:%S'
# 2018-10-26 10:25:47
To then use the result within the filename, you can put it in `` to evaluate it inline, for example you can run
echo "Domain_12345_`date '+%Y-%m-%d %H:%M:%S'`"
# Domain_12345_2018-10-26 10:29:17
You can use date's man page to figure out how to add milliseconds as well.
man date
There are different options like %m and %d for example that always have leading zeroes if necessary, so the file name length stays constant.
To then rename the file you can use the mv (move) command
mv "Domain_20181012230112.csv" "Domain_12345_`date '+%Y-%m-%d %H:%M:%S'`.csv"
Good luck with the rest of the exercise!

Linux: batch filename change adding creation date

I have a directory with a lot of sub-directories containing files.
I would like to rename each WAV file by adding its creation date (the date when the WAV file was first created) at the beginning of the file name, without changing the timestamps of the file itself.
The next step would be to convert each WAV file to MP3 to save hard drive space.
For that purpose, I'm trying to create a bash script but I'm having some issues.
I want to keep the same structure as the original directory, and therefore I was thinking of something like:
for file in `ls -1 *.wav`
do name=`stat -c %y $file | awk -F"." '{ print $1 }' | sed -e "s/\-//g" -e "s/\://g" -e "s/[ ]/_/g"`.wav
cp -r --preserve=timestampcp $dir_original/$file $dir_converted/$name
done
Don't use ls to generate a list of file names, just let the shell glob them (that's what ls *.wav does anyway):
for file in ./*.wav ; do
I think you want the timestamp in the format YYYYMMDD_HHMMSS ?
You could use GNU date with stat to have a somewhat neater control of the output format:
epochtime=$(stat -c %Y "$file" )
name=$(date -d "@$epochtime" +%Y%m%d_%H%M%S).wav
stat -c %Y (or %y) gives the last modification date; the file creation date is often not available on Linux systems, so the modification date is used here.
That cp looks ok, except for the stray cp at the end of timestampcp, but that must be a typo. If you do *.wav, the file names will be relative to current directory anyway, so no need to prefix with $dir_original/.
If you want to walk through a whole subdirectory, use Bash's globstar feature, or find. Something like this:
shopt -s globstar
cd "$sourcedir"
for file in ./**/*.wav ; do
epochtime=$(stat -c %Y "$file" )
name=$(date -d "@$epochtime" +%Y%m%d_%H%M%S).wav
dir=$(dirname "$file")
mkdir -p "$target/$dir"
cp -r --preserve=timestamps "$file" "$target/$dir/$name"
done
The slight inconvenience here is that cp can't create the directories in the path, so we need to use mkdir there. Also, I'm not sure if you wanted to keep the original filename as part of the resulting one, this would remove it and just replace the file names with the timestamp.
I did some experimenting with the calculation of name to see if I could get it more succinctly, and came up with this:
name=$(date "+%Y%m%d_%H%M%S" -r "$file")
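If the original file name should be kept as part of the new one, the timestamp can be prepended instead of replacing the name. A sketch under that assumption; stamped_name is a hypothetical helper, and it relies on GNU date's -r option, which reports the file's last modification time.

```shell
#!/bin/bash
# Sketch only: stamped_name is a hypothetical helper name.

# Print "YYYYMMDD_HHMMSS_<original name>" for the given file.
stamped_name() {
    local file=$1 stamp
    stamp=$(date -r "$file" +%Y%m%d_%H%M%S) || return
    # ${file##*/} strips any leading directories from the path.
    printf '%s_%s\n' "$stamp" "${file##*/}"
}
```

In the loop above, name=$(stamped_name "$file") would then take the place of the date call.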
I wanted to prefix all file names in that folder with their date (date -r reports the last modification time, not the creation time), and the script below works for me.
#!/bin/sh
for file in *.JPG
do
mv -f -- "$file" "$(date -r "$file" +"%Y%m%d_%H_%M_%S")_$file.jpg"
done

bash loop file echo to each file in the directory

I searched for a while and tried it myself, but I have been unable to get this sorted so far. My folder looks like this (I only want to process the first 4 files):
1.txt, 2.txt, 3.txt, 4.txt, 5.txt, 6.txt
I want to get each file's modification time and echo the timestamp into that file.
#!/bin/bash
thedate= `ls | xargs stat -s | grep -o "st_mtime=[0-9]*" | sed "s/st_mtime=//g"` #get file modified time
files= $(ls | grep -Ev '(5.txt|6.txt)$') #exclud 5 and 6 text file
for i in $thedate; do
echo $i >> $files
done
I want to insert each timestamp into its file, but I get an "ambiguous redirect" error. Am I doing it incorrectly? Thanks
In this case, files is a "list" of files, so you probably want to add another loop to handle them one by one.
Your description is slightly confusing but, if your intent is to append the last modification date of each file to that file, you can do something like:
for fspec in [1-4].txt ; do
stat -c %y "${fspec}" >>"${fspec}"
done
Note I've used stat -c %y to get the modification time such as 2017-02-09 12:21:22.848349503 +0800 - I'm not sure what variant of stat you're using but mine doesn't have a -s option. You can still use your option, you just have to ensure it's done on each file in turn, probably something like (in the for loop above):
stat -s "${fspec}" | grep -o "st_mtime=[0-9]*" | sed "s/st_mtime=//g" >>"${fspec}"
You cannot redirect the output to several files at once, as in >> $files.
To process several files you need something like:
#!/bin/bash
for f in ./[0-4].txt ; do
# get file modified time (in seconds)
thedate="$(stat --printf='%Y\n' "$f")"
echo "$thedate" >> "$f"
done
If you want a human-readable time format, replace %Y with %y:
thedate="$(stat --printf='%y\n' "$f")"
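The same loop can be packaged as a function that takes the files as arguments, which makes it easy to pick exactly the files to process. This is a sketch, assuming GNU stat; append_mtime is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: append_mtime is a hypothetical helper name.

# Append each file's modification time (seconds since the epoch) to that file.
append_mtime() {
    local f ts
    for f in "$@"; do
        # Read the timestamp first; the append below updates the mtime.
        ts=$(stat -c %Y -- "$f") || continue
        printf '%s\n' "$ts" >> "$f"
    done
}
```

It could then be called as append_mtime ./[1-4].txt, so each redirection targets exactly one file and the "ambiguous redirect" error cannot occur.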

How can i format the output of stat expression in Linux Gnome Terminal?

I am a real newbie in Linux (Fedora 20) and I am trying to learn the basics.
I have the following command
echo "`stat -c "The file "%n" was modified on ""%y" *Des*`"
This command returns me this output
The file Desktop was modified on 2014-11-01 18:23:29.410148517 +0000
I want to format it as this:
The file Desktop was modified on 2014-11-01 at 18:23
How can I do this?
You can't really do that with stat (unless you have a smart version of stat I'm not aware of).
With date
Very likely, your date is smart enough and handles the -r switch.
date -r Desktop +"The file Desktop was modified on %F at %R"
Because of your glob, you'll need a loop to handle all files that match *Des* (in Bash):
shopt -s nullglob
for file in *Des*; do
date -r "$file" +"The file ${file//%/%%} was modified on %F at %R"
done
With find
Very likely your find has a rich -printf option:
find . -maxdepth 1 -name '*Des*' -printf 'The file %f was modified on %TY-%Tm-%Td at %TH:%TM\n'
I want to use stat
(because your date doesn't handle the -r switch, you don't want to use find, or just because you like using as many tools as possible to impress your little sister). Well, in that case, the safest thing to do is:
date -d "@$(stat -c '%Y' Desktop)" +"The file Desktop was modified on %F at %R"
and with your glob requirement (in Bash):
shopt -s nullglob
for file in *Des*; do
date -d "@$(stat -c '%Y' -- "$file")" +"The file ${file//%/%%} was modified on %F at %R"
done
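A variation on the loops above passes the file name through printf instead of embedding it in the date format string, so a % in the name needs no escaping. A minimal sketch, assuming a date that supports -r; describe_mtime is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: describe_mtime is a hypothetical helper name.

# Print "The file NAME was modified on YYYY-MM-DD at HH:MM" for one file.
describe_mtime() {
    local file=$1 when
    when=$(date -r "$file" +'%F at %R') || return
    printf 'The file %s was modified on %s\n' "$file" "$when"
}
```

Inside the for loop over *Des*, the body would then simply be describe_mtime "$file".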
stat -c "The file "%n" was modified on ""%y" *Des* | awk 'BEGIN{OFS=" "}{for(i=1;i<=7;++i)printf("%s ",$i)}{print "at " substr($8,0,6)}'
Here I use awk to modify the output of your command. Fields 1 to 7 are printed unchanged in a for loop; field 8 (the time) needs changing, so substr extracts its first 5 characters (HH:MM).
