Shell script to list all files in a directory [duplicate]


This question already has answers here:
How to get the list of files in a directory in a shell script?
(11 answers)
Closed 12 months ago.
I am using the following code :
#!/bin/bash
for f in $1 ; do
echo $f
done
The aim is to list down all the files in the directory that is passed as an argument to this script. But it's not printing anything. Not sure what could be wrong with this.

Try this Shellcheck-clean pure Bash code for the "further plan" mentioned in a comment:
#! /bin/bash -p
# List all subdirectories of the directory given in the first positional
# parameter. Include subdirectories whose names begin with dot. Exclude
# symlinks to directories.
shopt -s dotglob
shopt -s nullglob
for d in "$1"/*/; do
dir=${d%/} # Remove trailing slash
[[ -L $dir ]] && continue # Skip symlinks
printf '%s\n' "$dir"
done
shopt -s dotglob causes shell glob patterns to match names that begin with a dot (.). (find does this by default.)
shopt -s nullglob causes shell glob patterns to expand to nothing when nothing matches, so looping over glob patterns is safe.
The trailing slash on the glob pattern ("$1"/*/) causes only directories (including symlinks to directories) to be matched. It's removed (dir=${d%/}) partly for cleanliness but mostly to enable the test for a symlink ([[ -L $dir ]]) to work.
See the accepted, and excellent, answer to Why is printf better than echo? for an explanation of why I used printf instead of echo to print the subdirectory paths.
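For the original question (listing files rather than subdirectories), note that the loop in the question iterates over the argument itself, not over its contents. The same glob approach can be adapted as in the minimal sketch below. The name list_files is hypothetical, and the subshell is only one possible way to keep the shopt changes local.

```shell
#!/usr/bin/env bash
# list_files is a hypothetical helper name, not part of the original answer.
# Print every regular file directly inside the directory given as $1,
# including names that begin with a dot.
list_files() {
    local dir=$1 f
    (
        # The subshell keeps the shopt changes from leaking to the caller.
        shopt -s dotglob nullglob
        for f in "$dir"/*; do
            [[ -f $f ]] || continue   # skip directories and other non-regular files
            printf '%s\n' "$f"
        done
    )
}
```

Called as list_files "$1" inside a script, it should print one path per line and print nothing for an empty directory.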

If you only need to list files, not directories (this part of the question is unclear to me), find is your friend.
find "$1" -maxdepth 1 -type f
Returns:
./output.tf
./locals.tf
./main.tf
./.tflint.hcl
./versions.tf
./.pre-commit-config.yaml
./makefile
./.terraformignore
./jenkins.tf
./devops.tf
./README.md
./.gitignore
./variables.tf
./Jenkinsfile
./accounts.tf
./.terraform.lock.hcl
Furthermore, please run man find.

Related

Moving files to subfolders based on prefix in bash

I currently have a long list of files, which look somewhat like this:
Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton
Gmc_W_GCtl_E_Erz_Aue_Dl_254_toe_taixwon
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_201_head_xaubadan
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_262_bone_bainan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_261_blood_blodan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_281_heart_xerton
The names all follow the same pattern, and I'm mainly seeking to group the files based on the part with "Aue", "Homersdorf", "Peuschen", and so forth (there are many others down the list). The position of these keywords is always the same (e.g. they are all followed by Dl; they all come after the fifth underscore, etc.).
All the files are in the same folder, and I am trying to move these files into subfolders based on these keywords in bash, but I'm not quite certain how. Any help on this would be appreciated, thanks!

I am guessing you want something like this:
$ find . -type f | awk -F_ '{system("mkdir -p "$5"/"$6";mv "$0" "$5"/"$6)}'
This will move, say, Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton into ./Erz/Aue/Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton.

Using the bash shell with a for loop.
#!/usr/bin/env bash
shopt -s nullglob
for file in Gmc*; do
[[ -d $file ]] && continue
IFS=_ read -ra dir <<< "$file"
echo mkdir -pv "${dir[4]}/${dir[5]}" || exit
echo mv -v "$file" "${dir[4]}/${dir[5]}" || exit
done
Place the script inside the directory in question, make it executable, and execute it.
Remove the echos so that it actually creates the directories and moves the files.
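To see what the IFS=_ read -ra step produces before touching any files, a small sketch like the following may help. The function name target_dir is hypothetical and exists only for this illustration.

```shell
#!/usr/bin/env bash
# target_dir is a hypothetical helper name used only for this illustration.
# Split a file name on underscores and print the subfolder path built from
# the fifth and sixth fields (array indices 4 and 5).
target_dir() {
    local -a parts
    IFS=_ read -ra parts <<< "$1"
    printf '%s/%s\n' "${parts[4]}" "${parts[5]}"
}

target_dir Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton   # prints Erz/Aue
```

Array indices start at 0, which is why the fifth and sixth underscore-separated fields are read from indices 4 and 5.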

`mv somedir/* someotherdir` when somedir is empty

I am writing an automated bash script that moves some files from one directory to another directory, but the first directory may be empty:
$ mv somedir/* someotherdir/
mv: cannot stat 'somedir/*': No such file or directory
How can I write this command without generating an error if the directory is empty? Should I just use rm and cp instead? I could write a conditional check to see if the directory is empty first, but that feels overweight.
I'm surprised the command fails if the directory is empty, so I'm trying to find out if I'm missing some simple solution.
Environment:
bash
RHEL

If you really want full control over the process, it might look like:
#!/usr/bin/env bash
# ^^^^- bash, not sh
restore_nullglob=$(shopt -p nullglob) # store the initial state of the nullglob setting
shopt -s nullglob # unconditionally enable nullglob
source_files=( somedir/* ) # store matching files in an array
if (( ${#source_files[@]} )); then # if that array isn't empty...
mv -- "${source_files[@]}" someotherdir/ # ...move the files it contains...
else # otherwise...
echo "No files to move; doing nothing" >&2 # ...write an error message.
fi
eval "$restore_nullglob" # restore nullglob to its original setting
Explaining the moving parts:
When nullglob is set, the shell expands *.txt to an empty list if no .txt files exist; otherwise (by default), it expands *.txt to the string *.txt when there are no matching files.
source_files is an array above -- bash's native mechanism to store a list. ${#source_files[@]} expands to the length of that array, whereas ${source_files[@]} on its own expands to its contents.
(( )) creates an arithmetic context, in which expressions are treated as math. In such a context, 0 is falsey, and positive numbers are truthy. Thus, if (( ${#source_files[@]} )) is true only if there is at least one file listed in the array source_files.
BTW, note that saving and restoring nullglob isn't really essential in an independent script; the purpose of showing how to do it is so you can safely use this code in larger scripts that might make assumptions about whether or not nullglob is set, without disrupting other code.
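If saving and restoring the option feels heavy, one possible alternative is to confine nullglob to a subshell, at the cost of an extra process. This is only a sketch: move_all is a hypothetical name, and its two arguments stand in for somedir and someotherdir.

```shell
#!/usr/bin/env bash
# move_all is a hypothetical helper name; $1 and $2 are placeholders for
# the source and destination directories.
# Move everything in the source into the destination, doing nothing
# (and reporting success) when the source directory is empty.
move_all() {
    local src=$1 dest=$2
    (
        shopt -s nullglob              # confined to this subshell
        files=( "$src"/* )             # empty array when nothing matches
        (( ${#files[@]} )) || exit 0   # nothing to move: succeed quietly
        mv -- "${files[@]}" "$dest"/
    )
}
```

Usage would look like move_all somedir someotherdir; the caller's own nullglob setting is never changed.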

find somedir -type f -exec mv -t someotherdir/. '{}' +
Saves you the check, may not be what you want, though.

Are you aware of the output stream and the error stream? Output stream has number 1, while error stream has number 2. In case you don't want to see a result, you can redirect that result to the garbage bin.
Excuse me?
Well, let's have a look at this case: when the directory is empty, an error is generated and that error is shown in the error stream (2). You can redirect this, using 2>/dev/null (/dev/null being the UNIX/Linux garbage bin), so your command becomes:
$ mv somedir/* someotherdir/ 2>/dev/null

Following up on Dominique's answer, to report all errors except the empty-directory one, use:
mv somedir/* someotherdir 2>&1 | grep -v No.such

How to replace date part in filename with current date

How can I replace only the date part with the current date in the names of all files present in a directory in Unix?
Folder path: C:/shan
Sample files:
CN_Apria_837p_20180924.txt
DN_Apria_837p_20150502.txt
GN_Apria_837p_20160502.txt
CH_Apria_837p_20170502.txt
CU_Apria_837p_20180502.txt
PN_Apria_837p_20140502.txt
CN_Apria_837p_20101502.txt
Desired result should be:
CN_Apria_837p_20190502.txt
DN_Apria_837p_20190502.txt
GN_Apria_837p_20190502.txt
CH_Apria_837p_20190502.txt
CU_Apria_837p_20190502.txt
PN_Apria_837p_20190502.txt
CN_Apria_837p_20190502.txt
Edit:
I'm completely new to Unix shell scripting. I tried the code below, but it's not working.
#!/bin/bash
for i in ls $1 | grep -E '[0-9]{4}-[0-9]{2}-[0-9]{2}'
do
x=echo $i | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}'
y=echo $i | sed "s/$x/$(date +%F)/g"
mv $1/$i $1/$y 2>/dev/null #incase if old date is same as current date
done

I would use regular expressions here. From the bash man-page:
An additional binary operator, =~, is available, with the same
precedence as == and !=. When it is used, the string to the right
of the operator is considered an extended regular expression and
matched accordingly (as in regex(3)). The return value is 0 if the
string matches the pattern, and 1 otherwise. .... Substrings
matched by parenthesized subexpressions within the regular
expression are saved in the array variable BASH_REMATCH. ...
The element of BASH_REMATCH with index n is the portion of the
string matching the nth parenthesized sub-expression.
Hence, assuming that the variable x holds the name of one of the files
in question, the code
if [[ $x =~ ^(.*_)[0-9]+([.]txt$) ]]
then
mv "$x" "${BASH_REMATCH[1]}$(date +%Y%m%d)${BASH_REMATCH[2]}"
fi
first tests roughly whether the file indeed follows the required naming scheme, and then modifies the name accordingly.
Of course in practice, you will tailor the regexp to match your application better. Only you can know what variations in the file name are permitted.
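To check what a rename would produce before running mv, the match can be wrapped in a small function that only prints the new name. This is a sketch under the same naming assumption as above (names ending in _<digits>.txt); new_name is a hypothetical helper.

```shell
#!/usr/bin/env bash
# new_name is a hypothetical helper name. It prints the name a file would get,
# or returns 1 when the name does not end in _<digits>.txt.
new_name() {
    local re='^(.*_)[0-9]+([.]txt)$'
    [[ $1 =~ $re ]] || return 1
    # Array elements need braces: ${BASH_REMATCH[1]}, not $BASH_REMATCH[1].
    printf '%s%s%s\n' "${BASH_REMATCH[1]}" "$(date +%Y%m%d)" "${BASH_REMATCH[2]}"
}

new_name CN_Apria_837p_20180924.txt   # prints CN_Apria_837p_<today's date>.txt
```

Keeping the regular expression in a variable avoids quoting surprises on the right-hand side of =~.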

The below should do this
for f in $(find /path/to/files -name "*_*_*_*.txt")
do
newname=$(echo "$f" | sed -r "s/[12][0-9]{3}[01][0-9][0-3][0-9]/$(date '+%Y%m%d')/g")
mv "$f" "$newname"
done

Try this Shellcheck-clean code:
#! /bin/bash -p
readonly dir=$1
shopt -s nullglob # Make glob patterns that match nothing expand to nothing
readonly dateglob='20[0-9][0-9][0-9][0-9][0-9][0-9]'
currdate=$(date '+%Y%m%d')
# shellcheck disable=SC2231
for path in "$dir"/*_${dateglob}.* ; do
name=${path##*/}
newname=${name/_${dateglob}./_${currdate}.}
if [[ $newname != "$name" ]] ; then
newpath="$dir/$newname"
printf "%q -> %q\\n" "$path" "$newpath"
mv -i -- "$path" "$newpath"
fi
done
shopt -s nullglob stops the code trying to process a garbage path if nothing matches the glob pattern in for path in ....
The pattern assigned to dateglob assumes that you will not have to process dates before 2000 (or after 2099!). Change it if that assumption is not valid.
The # shellcheck ... line is to prevent Shellcheck warning about the use of ${dateglob} without quotes. The quotes would be wrong in this case because they would prevent the glob pattern being expanded.
The pattern used to match filenames (*_${dateglob}.*) will match many more forms of filename than the examples given (e.g. A_20180313.tar.gz). You might want to change it.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for information about the Bash string manipulation mechanisms used (${path##...}, ${name/...}).
I've added a printf to output details of what is being moved.
The -i option to mv prompts for confirmation if a file would be overwritten. This turns out to be an issue for the example files because both CN_Apria_837p_20180924.txt and CN_Apria_837p_20101502.txt are identical except for the date, so the code tries to rename them to the same thing.
If any of the files with dates in their names have names beginning with '.', the code will not process them. Add line shopt -s dotglob somewhere before the loop if that is an issue.

How to delete numbers, dashes and underscores in the beginning of a file name

I have thousands of mp3 files but all with unusual file names such as 1-2songone.mp3, 2songtwo.mp3, 2_2_3_songthree.mp3. I want to remove all the numbers, dashes and underscores in the beginning of these files and get the result:
songone.mp3
songtwo.mp3
songthree.mp3

This can be done using extended globbing:
$ ls
1-2songone.mp3 2_2_3_songthree.mp3 2songtwo.mp3
$ shopt -s extglob
$ for fname in *.mp3; do mv -- "$fname" "${fname##*([-_[:digit:]])}"; done
$ ls
songone.mp3 songthree.mp3 songtwo.mp3
This uses parameter expansion: ${fname##pattern} removes the longest possible match from the beginning of fname. As the pattern, we use *([-_[:digit:]]), where *(pattern) stands for "zero or more matches of pattern", and the actual pattern is a bracket expression for hyphens, underscores and digits.
Remarks:
The -- after mv indicates the end of options for move and makes sure that filenames starting with - aren't interpreted as options.
The *() expression requires the extglob shell option. As pointed out, if you don't want extended globs later, you have to unset it again with shopt -u extglob.
As per Gordon Davisson's comment: this will clobber files if you have, for example, something like 1file.mp3 and 2file.mp3. To avoid that, you can either use mv -i (or --interactive), which will prompt you before overwriting a file, or mv -n (or --no-clobber), which will just not overwrite any files.
triplee points out that this needlessly moves files onto themselves if they don't start with a hyphen, underscore or digit. To avoid that, we can iterate only over matching files with
for fname in [-_[:digit:]]*.mp3; do mv -- "$fname" "${fname##*([-_[:digit:]])}"; done
which makes sure that there is something to rename.
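To try the expansion without renaming anything, it can be wrapped in a function that only prints the result. This is a small sketch; strip_prefix is a hypothetical name, and the sample names are the ones from the question plus one that has nothing to strip.

```shell
#!/usr/bin/env bash
shopt -s extglob   # must be enabled before the *( ) pattern below is used

# strip_prefix is a hypothetical helper name for illustration only.
# Print the argument with any leading run of hyphens, underscores and
# digits removed.
strip_prefix() {
    printf '%s\n' "${1##*([-_[:digit:]])}"
}

for name in 1-2songone.mp3 2songtwo.mp3 2_2_3_songthree.mp3 plain.mp3; do
    printf '%s -> %s\n' "$name" "$(strip_prefix "$name")"
done
```

A name such as plain.mp3 comes back unchanged, which is the case where the unrestricted loop would move a file onto itself.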

Benjamin W.'s answer is helpful and efficient, but has two drawbacks:
It requires setting global shell option extglob, which should be restored to its previous value afterward (the alternative, at the cost of creating an extra process, is to use a subshell: (shopt -s extglob; for fname ...)).
The extglob syntax, an extension to regular glob syntax, is familiar to few people and still less powerful than true regular expressions.
Using Bash's regex-matching operator, =~:
for f in *.mp3; do [[ $f =~ ^[0-9_-]+(.+)$ ]] && echo mv "$f" "${BASH_REMATCH[1]}"; done
Remove the echo to perform actual renaming.
$f =~ ^[0-9_-]+(.+)$ matches the longest nonempty sequence of digits, hyphens, and underscores at the start of the filename, followed by any nonempty sequence of characters captured in a parenthesized subexpression (capture group).
If the match succeeds (&&), the mv command is invoked, with the captured subexpression - accessible via element 1 of special BASH array variable ${BASH_REMATCH[@]} - forming the target filename.

You may do it this way too :
find . -type f -name "*.mp3" -print0 | while IFS= read -r -d '' line
do
mv "$line" "$( sed -E 's!(.*)/[^[:alpha:]]*([[:alpha:]].*mp3)$!\1/\2!' <<<"$line")" 2>/dev/null
done
Using sed gives you more control over the regex, I guess. Also, the 2>/dev/null is for ignoring the mv error for already converted/correct filenames.
Note:
This will recursively change the filenames across subfolders too.

Recursively look for files with a specific extension

I'm trying to find all files with a specific extension in a directory and its subdirectories with my bash (Latest Ubuntu LTS Release).
This is what's written in a script file:
#!/bin/bash
directory="/home/flip/Desktop"
suffix="in"
browsefolders ()
for i in "$1"/*;
do
echo "dir :$directory"
echo "filename: $i"
# echo ${i#*.}
extension=`echo "$i" | cut -d'.' -f2`
echo "Erweiterung $extension"
if [ -f "$i" ]; then
if [ $extension == $suffix ]; then
echo "$i ends with $in"
else
echo "$i does NOT end with $in"
fi
elif [ -d "$i" ]; then
browsefolders "$i"
fi
done
}
browsefolders "$directory"
Unfortunately, when I start this script in terminal, it says:
[: 29: in: unexpected operator
(with $extension instead of 'in')
What's going on here, where's the error?
But this curly brace

find "$directory" -type f -name "*.in"
is a bit shorter than that whole thing (and safer - deals with whitespace in filenames and directory names).
Your script is probably failing for entries that don't have a . in their name, making $extension empty.
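If the results need further processing in the script, one way to keep the whitespace safety is to read null-delimited output into an array. The sketch below assumes a find that supports -print0 (GNU and BSD find do); find_by_suffix and the array name found are hypothetical.

```shell
#!/usr/bin/env bash
# find_by_suffix is a hypothetical helper name. It fills the global array
# `found` with every regular file under $1 whose name ends in .$2.
# Assumes a find that supports -print0 (GNU and BSD find do).
find_by_suffix() {
    local path
    found=()
    # -print0 and read -d '' keep names with spaces or newlines intact.
    while IFS= read -r -d '' path; do
        found+=( "$path" )
    done < <(find "$1" -type f -name "*.$2" -print0)
}
```

After find_by_suffix "$directory" in, the matches can be printed with printf '%s\n' "${found[@]}" or passed on to other commands one element at a time.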

find {directory} -type f -name '*.extension'
Example: To find all csv files in the current directory and its sub-directories, use:
find . -type f -name '*.csv'

The syntax I use is a bit different than what @Matt suggested:
find $directory -type f -name \*.in
(it's one less keystroke).

Without using find:
du -a $directory | awk '{print $2}' | grep '\.in$'

Though using find command can be useful here, the shell itself provides options to achieve this requirement without any third party tools. The bash shell provides an extended glob support option using which you can get the file names under recursive paths that match with the extensions you want.
The extended option is extglob, which needs to be set using the shopt builtin as below. Options are enabled with the -s flag and disabled with the -u flag. Additionally, you could use a couple more options: nullglob, with which an unmatched glob is swept away entirely and replaced with a set of zero words, and globstar, which allows recursing through all the directories.
shopt -s extglob nullglob globstar
Now all you need to do is form the glob expression to include the files of a certain extension which you can do as below. We use an array to populate the glob results because when quoted properly and expanded, the filenames with special characters would remain intact and not get broken due to word-splitting by the shell.
For example to list all the *.csv files in the recursive paths
fileList=(**/*.csv)
The option ** is to recurse through the sub-folders and *.csv is glob expansion to include any file of the extensions mentioned. Now for printing the actual files, just do
printf '%s\n' "${fileList[@]}"
Using an array and doing a proper quoted expansion is the right way when used in shell scripts, but for interactive use, you could simply use ls with the glob expression as
ls -1 -- **/*.csv
This could very well be expanded to match multiple files, i.e. files ending with any of several extensions (similar to adding multiple flags in the find command). For example, consider a case of needing to get all image files recursively, i.e. those with the extensions *.gif, *.png and *.jpg. All you need to do is
ls -1 -- **/+(*.jpg|*.gif|*.png)
This could very well be expanded to negate the results also. With the same syntax, one could use the glob to exclude files of certain types. Assuming you want to exclude file names with the extensions above, you could do
excludeResults=()
excludeResults=(**/!(*.jpg|*.gif|*.png))
printf '%s\n' "${excludeResults[@]}"
The construct !() is a negate operation to not include any of the file extensions listed inside and | is an alternation operator just as used in the Extended Regular Expressions library to do an OR match of the globs.
Note that this extended glob support is not available in the POSIX Bourne shell; it is specific to recent versions of bash (globstar needs bash 4.0 or later). So if you are considering portability of scripts across POSIX and bash shells, this option wouldn't be right.

find "$PWD" -type f -name "*.in"

There's a { missing after browsefolders ()
All $in should be $suffix
The line with cut gets you only the middle part of front.middle.extension. You should read up your shell manual on ${varname%%pattern} and friends.
I assume you do this as an exercise in shell scripting, otherwise the find solution already proposed is the way to go.
To check for proper shell syntax, without running a script, use sh -n scriptname.
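Putting those points together, a corrected version of the function might look like the sketch below. Passing the suffix as a second argument and skipping symlinked directories are assumptions of this sketch, not something the question asked for.

```shell
#!/usr/bin/env bash
# A corrected sketch of the question's function. The suffix is passed as a
# second argument here, which is an assumption of this sketch.
browsefolders() {
    local dir=$1 suffix=$2 i
    for i in "$dir"/*; do
        if [[ -d $i && ! -L $i ]]; then
            browsefolders "$i" "$suffix"   # recurse, but not through symlinks
        elif [[ -f $i && ${i##*/} == *."$suffix" ]]; then
            printf '%s\n' "$i"             # regular file with the wanted suffix
        fi
    done
}
```

It would be called as browsefolders "$directory" "$suffix". The ${i##*/} expansion strips the directory part, so only the file name is compared against the suffix.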

To find all the pom.xml files in your current directory and print them, you can use:
find . -name 'pom.xml' -print

find $directory -type f -name "*.in"|grep $substring

for file in "${LOCATION_VAR}"/*.zip
do
echo "$file"
done
