Get the date part of a filename from a path in a shell script

I have a script as follows
pathtofile="/c/github/something/r1.1./myapp/*.txt"
echo $pathtofile
filename=${pathtofile##*/}
echo $filename
I always have only one .txt file, named like 2015-08-07.txt, in the ../myapp/ directory, so the output is as follows:
/c/github/something/r1.1./myapp/2015-08-07.txt
*.txt
I need to extract the filename as 2015-08-07. I followed a lot of the Stack Overflow answers with the same requirement. What is the best approach to get only the date part of the filename from that path?
FYI: the filename changes every time the script is executed, to today's date.

When you are saying:
pathtofile="/c/github/something/r1.1./myapp/*.txt"
you are storing the literal /c/github/something/r1.1./myapp/*.txt in a variable.
When you echo the variable unquoted, the shell expands this *, so you see the matching file name:
$ echo $pathtofile
/c/github/something/r1.1./myapp/2015-08-07.txt
However, if you quote it, you can see that the content is indeed the literal pattern with the *:
$ echo "$pathtofile"
/c/github/something/r1.1./myapp/*.txt
So what you need to do is to store the value in, say, an array:
files=( /c/github/something/r1.1./myapp/*.txt )
This files array will be populated with the expansion of this expression.
Then, since you know that the array contains just one element, you can print it with:
$ echo "${files[0]}"
/c/github/something/r1.1./myapp/2015-08-07.txt
and then get the name by stripping the directory and the extension (see "Extract filename and extension in Bash"):
$ filename=$(basename "${files[0]}")
$ echo "${filename%.*}"
2015-08-07
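
The steps above can be combined into one small function. This is only a sketch: the function name date_from_dir is made up for illustration, and it assumes the directory holds exactly one .txt file, as described in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name (without .txt) of the single
# .txt file expected in the directory given as $1.
date_from_dir() {
  local dir=$1
  local files=( "$dir"/*.txt )      # the glob is expanded here, at assignment time
  # If nothing matched, the array holds the unexpanded pattern; give up.
  [ -e "${files[0]}" ] || return 1
  local filename=${files[0]##*/}    # strip the directory part
  printf '%s\n' "${filename%.txt}"  # strip the extension
}

# Demo in a throwaway directory
demo_dir=$(mktemp -d)
touch "$demo_dir/2015-08-07.txt"
date_from_dir "$demo_dir"
rm -rf "$demo_dir"
```

Using ${var##*/} instead of basename avoids starting an extra process, but both give the same result here.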

You are doing a lot for just getting the filename
$ find /c/github/something/r1.1./myapp/ -type f -printf "%f\n" | sed 's/\.txt$//'
2015-08-07

Related

How to auto insert a string in filename by bash?

I get an output file every day:
linux-202105200900-foo.direct.tar.gz
The date and time string (e.g. 202105200900) changes every day.
I need to rename these files to
linux-202105200900x86-foo.direct.tar.gz
(insert the short string x86 after the date/time)
Is there a bash script that can do this?

If you're always inserting the string "x86" after character #18 of the string, you can use this:
var="linux-202105200900-foo.direct.tar.gz"
var2=${var:0:18}"x86"${var:18}
echo $var2
The 2nd line means: "assign to variable var2 the first 18 characters of var, followed by x86 followed by the rest of the variable var"
If you want to insert "x86" just before the last hyphen in the string, you may write it like this:
var="linux-202105200900-foo.direct.tar.gz"
var2=${var%-*}"x86-"${var##*-}
echo $var2
The 2nd line means: "assign to variable var2:
the content of the variable var after removing the shortest match of the pattern "-*" at the end,
the string "x86-",
the content of the variable var after removing the longest match of the pattern "*-" at the beginning".
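
As a quick check, both variants can be run side by side. This is a minimal sketch; the variable names by_offset and by_hyphen are only illustrative.

```shell
#!/usr/bin/env bash
var="linux-202105200900-foo.direct.tar.gz"

# Fixed offset: the first 18 characters are "linux-" plus the 12-digit timestamp.
by_offset="${var:0:18}x86${var:18}"

# Relative to the last hyphen: keeps working if the prefix length changes.
by_hyphen="${var%-*}x86-${var##*-}"

echo "$by_offset"
echo "$by_hyphen"
```

For this file name both lines print linux-202105200900x86-foo.direct.tar.gz. The second form is the safer choice if the part before the timestamp may vary in length.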

In addition to the very good answer by @Jean-Loup Sabatier, another, perhaps more general, way is simply to replace the second occurrence of '-' with x86-, which you can do with sed. Let's say you have:
fname=linux-202105200900-foo.direct.tar.gz
You can update that with:
fname="$(sed 's/-/x86-/2' <<< "$fname")"
Which simply uses a command substitution with sed and a herestring to modify fname assigning the modified result back to fname.
Example Use/Output
$ fname=linux-202105200900-foo.direct.tar.gz
$ fname="$(sed 's/-/x86-/2' <<< "$fname")"
$ echo "$fname"
linux-202105200900x86-foo.direct.tar.gz

Do you need this?
❯ dat=$(date '+%Y%m%d%H%M%S'); echo ${dat}
20210520170336
❯ filename="linux-${dat}x86-foo.direct.tar.gz"; echo ${filename}
linux-20210520170336x86-foo.direct.tar.gz

I wanted to go as simple as possible, considering only the timestamp is going to change, this script should do it. Just run it inside the folder where files are located and you'll get all of them renamed with x86.
#!/bin/bash
for file in *; do
  replaced=$(echo "$file" | sed 's|-foo|x86-foo|g')
  # skip files whose name does not change
  [ "$file" = "$replaced" ] || mv -- "$file" "$replaced"
done
This is my output
filip@filip-ThinkPad-T14-Gen-1:~/test$ ls
linux-202105200900-foo.direct.tar.gz linux-202105201000-foo.direct.tar.gz linux-202105201100-foo.direct.tar.gz
filip@filip-ThinkPad-T14-Gen-1:~/test$ ./../development/bash-utils/bulk-rename.sh
filip@filip-ThinkPad-T14-Gen-1:~/test$ ls
linux-202105200900x86-foo.direct.tar.gz linux-202105201000x86-foo.direct.tar.gz linux-202105201100x86-foo.direct.tar.gz
The script simply iterates over all the files in the current folder, pipes each name through sed to replace -foo with x86-foo, and then renames the file with the mv command.
As David mentioned in a comment, if you're worried that there could be multiple occurrences of -foo, replace the g (global) flag with 1 (first occurrence only) and that's it!

There is also the rename utility (https://man7.org/linux/man-pages/man1/rename.1.html), you could use:
rename -v 0-foo.direct.tar.gz 0x86-foo.direct.tar.gz *
which results in
`linux-202105200900-foo.direct.tar.gz' -> `linux-202105200900x86-foo.direct.tar.gz'
`linux-202205200900-foo.direct.tar.gz' -> `linux-202205200900x86-foo.direct.tar.gz'
`linux-202305200900-foo.direct.tar.gz' -> `linux-202305200900x86-foo.direct.tar.gz'

In addition to the very good answer by @David C. Rankin, here it is wrapped in a loop that renames the files:
#!/usr/bin/bash
for file in linux*   # all files starting with linux
do
  [ -e "$file" ] || continue   # skip the literal pattern if nothing matched
  echo "$file"
  fname="$(sed 's/-/x86-/2' <<< "$file")"
  mv "$file" "$fname"   # rename the file
done
Output received:
linux-202105200900x86-foo.direct.tar.gz

Bash script to get all file with desired extensions

I'm trying to write a bash script that takes a text file containing some extensions and a folder, and produces an output file listing all files that match the desired extensions, searching recursively in all subdirectories.
The extension list file is my first parameter and the folder is my second parameter.
I have tried:
for i in $1 ; do
find . -name $2\*.$i -print>>result.txt
done
but it doesn't work.

As noted in the comments:
It is not a good idea to write to a hard-coded file name.
The example below only fixes the code given in the OP's question.
Of course, it is even better to call the script with
x.sh y . > blabla
and remove the file name from the script itself. But my intention is only to fix the code from the question...
The following bash script, named x.sh,
#!/bin/bash
: > result.txt # delete old content
while IFS= read -r i; do # read a line from the file
  find "$2" -name "*.$i" -print >> result.txt # for every item do a find
done < "$1" # read from the file named by the first command-line argument
with a text file named y with the following content
txt
sh
and called with:
./x.sh y .
results in a file result.txt whose content is:
./a.txt
./b.txt
./x.sh
OK, let's add some hints taken from the comments:
If the results file does not need to collect content from anywhere else in the script, the script can be simplified to:
#!/bin/bash
while IFS= read -r i; do # read a line from the file
  find "$2" -name "*.$i" -print # for every item do a find
done < "$1" > result.txt # read from the file named by the first command-line argument
And as already mentioned:
The hard coded result.txt could be removed and the call can be something like
./x.sh y . > result.txt
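
Put together as a function that writes to standard output, the idea might look like the sketch below. The name list_by_ext is hypothetical; the argument order (extension file first, folder second) follows the question.

```shell
#!/usr/bin/env bash
# Hypothetical function: $1 = file with one extension per line, $2 = folder.
list_by_ext() {
  local ext_file=$1 dir=$2 ext
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r ext || [ -n "$ext" ]; do
    [ -n "$ext" ] || continue            # skip blank lines
    find "$dir" -type f -name "*.$ext"   # recursive search for this extension
  done < "$ext_file"
}
```

It could then be called as list_by_ext y . > result.txt, leaving the choice of output file to the caller.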

Give this one-liner command a try.
Replace /mydir with the folder to search.
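Change the list of extensions passed as an argument to grep -E (the modern spelling of egrep):
find /mydir -type f | grep -E "[.](txt|xml)$" >> result.txt
Inside the parentheses, separate the extensions with |.
The . character is matched literally by writing it as [.], and the trailing $ anchors the match to the end of the name.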

Extract part of a file name in bash

I have a folder with lots of files having a pattern, which is some string followed by a date and time:
BOS_CRM_SUS_20130101_10-00-10.csv (3 strings before date)
SEL_DMD_20141224_10-00-11.csv (2 strings before date)
SEL_DMD_SOUS_20141224_10-00-10.csv (3 strings before date)
I want to loop through the folder and extract only the part before the date and output into a file.
Output
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_
This is my script but it is not working
#!/bin/bash
# script variables
FOLDER=/app/list/l088app5304d1/socles/Data/LEMREC/infa_shared/Shell/Check_Header_T24/
LOG_FILE=/app/list/l088app5304d1/socles/Data/LEMREC/infa_shared/Shell/Check_Header_T24/log
echo "Starting the programme at: $(date)" >> $LOG_FILE
# Getting part of the file name from FOLDER
for file in `ls $FOLDER/*.csv`
do
mv "${file}" "${file/date +%Y%m%d HH:MM:SS}" 2>&1 | tee -a $LOG_FILE
done #> $LOG_FILE

Use sed with extended regular expressions and groups to achieve this.
cat filelist | sed -r 's/(.*)[0-9]{8}_[0-9]{2}-[0-9]{2}-[0-9]{2}\.csv$/\1/'
where filelist is a file with all the names you care about. Of course, this is just a placeholder because I don't know how you are going to list all eligible files. If a glob will do, for example, you can do
ls mydir/*.csv | sed -r 's/(.*)[0-9]{8}_[0-9]{2}-[0-9]{2}-[0-9]{2}\.csv$/\1/'

Assuming you won't have numbers in the first part, you could use:
$ for i in *csv;do str=$(echo "$i"|sed -r 's/[0-9]+.*//'); echo "$str"; done
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_
Or with parameter substitution:
$ for i in *csv;do echo "${i%_*_*}_"; done
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_

When you use ${var/pattern/replace}, the pattern must be a filename glob, not a command to execute.
Instead of using the substitution operator, use the pattern removal operator
mv "${file}" "${file%_*-*-*.csv}.csv"
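% removes the shortest match of the pattern at the end of the variable. Here that shortest match is the trailing _HH-MM-SS.csv, so only the time part is removed and the date stays in the name; to strip the date as well, use the pattern _*_*-*-*.csv.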

The substitution:
"${file/date +%Y%m%d HH:MM:SS}"
is unlikely to do anything, because it doesn't execute date +%Y%m%d HH:MM:SS. It just treats it as a pattern to search for, and it's not going to be found.
If you did execute the command, though, you would get the current date and time, which is also (apparently) not what you find in the filename.
If that pattern is precise, then you can do the following:
echo "${file%????????_??-??-??.csv}" >> "$LOG_FILE"
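
The fixed-width pattern can be tried on the sample names from the question. This is a small sketch with a made-up helper name, prefix_of, that assumes every name ends in YYYYMMDD_HH-MM-SS.csv.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip a trailing YYYYMMDD_HH-MM-SS.csv from a file name.
prefix_of() {
  local name=${1##*/}                           # drop any directory part
  printf '%s\n' "${name%????????_??-??-??.csv}" # each ? matches one character
}

for f in BOS_CRM_SUS_20130101_10-00-10.csv SEL_DMD_20141224_10-00-11.csv; do
  prefix_of "$f"
done
```

A name that does not end in that exact shape is printed unchanged, because % removes nothing when the pattern does not match.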

Using grep:
ls *.csv | grep -Po "^([A-Za-z]+_)+"
output:
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_

How to remove the extension of a file?

I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.

To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
  cd "$1" || exit 1   # stop if the directory cannot be entered
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that quoting your variables is almost always good practice. With "$FILENAME" quoted, for FILENAME in *.bak copes with spaces in file names; the remaining pitfall is that, when no file matches, the loop runs once with the literal string *.bak. Read David W.'s answer for a solution that also descends into subdirectories.

There are several ways to remove file suffixes:
In BASH and Kornshell, you can filter the value of a variable with parameter expansion. Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of % on the keyboard.
If you use a double filter (i.e. ## or %%), you are trying to filter on the biggest match. If you have a single filter (i.e. # or %), you are trying to filter on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo ${file#*/} # Matches "this/" and will print out "is/my/file/name.txt"
echo ${file##*/} # Matches "this/is/my/file/" and will print out "name.txt"
echo ${file%/*} # Matches "/name.txt" and will print out "this/is/my/file"
echo ${file%%/*} # Matches "/is/my/file/name.txt" and will print out "this"
Notice this is a glob match and not a regular expression match! If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match on the period and all characters after it. Since it is a single %, it will match on the smallest glob on the right side of the string. If the filter can't match anything, the result is the same as your original string.
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=${file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while IFS= read -r -d $'\0' file
do
echo "mv '$file' '${file%.bak}'"
done | tee find.out
The find command finds the files you specify. The -print0 separates the names of the files with a NUL character, which is one of the few characters not allowed in a file name. The -d $'\0' means that your input separators are NUL characters. See how nicely find -print0 and read -d $'\0' work together?
You should almost never use the for file in $(ls *.bak) method. This will fail if the files have any white space in the name.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
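
The dry run and the real rename can also live in one function. The sketch below is an assumption-laden variant, not the answer's own script: the name strip_bak and its "dry" argument are invented for illustration, and it relies on bash features (process substitution, printf %q).

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip ".bak" from every file below directory $1.
# Pass "dry" as $2 to only print the planned renames.
strip_bak() {
  local dir=$1 mode=${2:-} file
  while IFS= read -r -d '' file; do
    if [ "$mode" = dry ]; then
      printf 'mv %q %q\n' "$file" "${file%.bak}"   # %q shell-quotes unusual names
    else
      mv -- "$file" "${file%.bak}"
    fi
  done < <(find "$dir" -type f -name '*.bak' -print0)
}
```

Running strip_bak some_dir dry first shows the commands without touching anything; running it again without "dry" performs the renames.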

rename .bak '' *.bak
(rename is in the util-linux package)

Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done

You can always use the find command to include all the subdirectories; letting find run mv through -exec keeps names with spaces intact:
find . -name "*.bak" -exec sh -c 'mv --force "$1" "${1%.bak}"' _ {} \;

Append date to filename in linux

I want to add the date next to a filename ("somefile.txt"). For example: somefile_25-11-2009.txt or somefile_25Nov2009.txt or anything to that effect.
Maybe a script will do, or some command in the terminal window. I'm using Linux (Ubuntu).
The script or command should update the filename to a new date every time you want to save the file into a specific folder, while still keeping the previous files. So there would eventually be files like this in the folder: filename_18Oct2009.txt, filename_9Nov2009.txt, filename_23Nov2009.txt

Info/Summary
With bash scripting you can enclose commands in backticks or in $( ) parentheses to substitute their output. This works great for labeling files; the following will create a file name with the date appended to it.
Methods
Backticks:
$ echo myfilename-"`date +"%d-%m-%Y"`"
$(parentheses):
$ echo myfilename-$(date +"%d-%m-%Y")
Example Usage:
echo "Hello World" > "/tmp/hello-$(date +"%d-%m-%Y").txt"
(creates text file '/tmp/hello-28-09-2022.txt' with text inside of it)
Note: in Linux, quotes are your friend. It is best practice to quote the file name to prevent issues with spaces and such in variables.

There are two problems here.
1. Get the date as a string
This is pretty easy. Just use the date command with the + option. We can use backticks to capture the value in a variable.
$ DATE=`date +%d-%m-%y`
You can change the date format by using different % options as detailed on the date man page.
2. Split a file into name and extension.
This is a bit trickier. If we think there'll be only one . in the filename we can use cut with . as the delimiter.
$ NAME=`echo $FILE | cut -d. -f1`
$ EXT=`echo $FILE | cut -d. -f2`
However, this won't work with multiple . in the file name. If we're using bash - which you probably are - we can use some bash magic that allows us to match patterns when we do variable expansion:
$ NAME=${FILE%.*}
$ EXT=${FILE##*.}
Putting them together we get:
$ FILE=somefile.txt
$ NAME=${FILE%.*}
$ EXT=${FILE##*.}
$ DATE=`date +%d-%m-%y`
$ NEWFILE=${NAME}_${DATE}.${EXT}
$ echo $NEWFILE
somefile_25-11-09.txt
And if we're less worried about readability we do all the work on one line (with a different date format):
$ FILE=somefile.txt
$ FILE=${FILE%.*}_`date +%d%b%y`.${FILE##*.}
$ echo $FILE
somefile_25Nov09.txt
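
The split-and-rejoin steps can be wrapped in a function. This is a sketch only: the name with_date is hypothetical, the optional second argument exists just to make the output predictable, and dot files such as .bashrc or dots in directory names are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: insert a date stamp between a file's name and its
# last extension. $2 is optional and defaults to today's date.
with_date() {
  local file=$1 stamp=${2:-$(date +%d-%m-%Y)}
  case $file in
    *.*) printf '%s_%s.%s\n' "${file%.*}" "$stamp" "${file##*.}" ;;
    *)   printf '%s_%s\n' "$file" "$stamp" ;;   # no extension at all
  esac
}

with_date somefile.txt 25-11-2009
with_date archive.tar.gz 25-11-2009
```

Only the last extension is treated as the extension, so the second call yields archive.tar_25-11-2009.gz. A copy that keeps the original could then be made with cp "$f" "$(with_date "$f")".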

cp somefile somefile_`date +%d%b%Y`

You can add the date next to a filename by invoking the date command in a subshell.
The date command with the required formatting options, placed inside $( ) or between backticks (`…`), is executed in a subshell and its output is then substituted into the original command.
$(...) is preferred since it can be nested, so you can use a command substitution inside another substitution.
Solutions for the requests in the question:
$ echo somefile_$(date +%d-%m-%Y).txt
somefile_28-10-2021.txt
$ echo somefile_$(date +%d%b%Y).txt
somefile_28Oct2021.txt
The date command comes with many formatting options that allow you to customize the date output according to the requirement.
%D – Display date in the format mm/dd/yy (e.g. : 10/28/21)
%Y – Year (e.g. : 2021)
%m – Month (e.g. : 10)
%B – Month name in the full string format (e.g. : October)
%b – Month name in the shortened string format (e.g. : Oct)
%d – Day of month (e.g. : 28)
%j – Day of year (e.g. : 301)
%u – Day of the week (e.g. : 4)
%A – Weekday in full string format (e.g. : Thursday)
%a – Weekday in shortened format (e.g. : Thu)
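
To try the format options against a fixed day rather than today, something like the following sketch can be used. It assumes GNU date (the -d option is not available in every date implementation), and the helper name stamp is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: format the date given as $1 with the format in $2.
# Assumes GNU date for -d; LC_ALL=C pins month and weekday names to English.
stamp() {
  LC_ALL=C date -d "$1" +"$2"
}

stamp 2021-10-28 %d-%m-%Y
stamp 2021-10-28 %d%b%Y
stamp 2021-10-28 %j
```

With GNU date this should print 28-10-2021, 28Oct2021 and 301, matching the examples in the list above.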

I use this script in bash:
#!/bin/bash
now=$(date +"%b%d-%Y-%H%M%S")
FILE="$1"
name="${FILE%.*}"
ext="${FILE##*.}"
cp -v "$FILE" "$name-$now.$ext"
This script copies filename.ext to filename-date.ext, there is another that moves filename.ext to filename-date.ext, you can download them from here.
Hope you find them useful!!

I use it in raspberry pi, and the first answer doesn't work for me, maybe because I typed wrong or something? I don't know. So I combined the above answers and came up with this:
now=$(date +'%Y-%m-%d')
geany "OptionalName-${now}.txt"
That's if you want to use geany; any other program works the same way.

A slightly more convoluted solution that fully matches your spec:
echo `expr "$FILENAME" : '\(.*\)\.[^.]*'`_`date +%d-%m-%y`.`expr "$FILENAME" : '.*\.\([^.]*\)'`
where the first 'expr' extracts the file name without the extension and the second 'expr' extracts the extension.
