How to find out if ls command output is file or a directory Bash - linux

The ls command lists everything contained in the current directory. For example, ls -la will output something like this:
drwxr-xr-x 3 user user 4096 dec 19 17:53 .
drwxr-xr-x 15 user user 4096 dec 19 17:39 ..
drwxrwxr-x 2 user user 4096 dec 19 17:53 tess (directory)
-rw-r--r-- 1 user user 178 dec 18 21:52 file (file)
-rw-r--r-- 1 user user 30 dec 18 21:47 text (file)
What if I want to know how much space all the files consume? For that I would have to sum $5 from all lines with ls -la | awk '{ sum+=$5 } END{print sum}'. So how can I sum only the sizes of files and leave directories out?

You can use the following :
find . -maxdepth 1 -type f -printf '%s\n' | awk '{s+=$1} END {print s}'
The find command selects all the regular files in the current directory and outputs their sizes. The awk command sums those integers and outputs the total.
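The underlying question, telling files from directories, can also be answered by the shell's own -f test instead of the first column of ls -l. The following is only a minimal sketch: sum_file_sizes is a hypothetical helper name, and because it relies on the * glob it sees non-hidden entries only.

```shell
# sum_file_sizes DIR: print the total size in bytes of the regular,
# non-hidden files directly inside DIR (hypothetical helper name).
# The body runs in a subshell so its variables do not leak.
sum_file_sizes() (
    dir=${1:-.}
    total=0
    for entry in "$dir"/*; do
        # Skip directories, symlinks and anything else that is not a plain file.
        if [ -f "$entry" ] && [ ! -L "$entry" ]; then
            size=$(wc -c < "$entry")
            total=$((total + size))
        fi
    done
    printf '%s\n' "$total"
)
```

Calling sum_file_sizes . prints a single number, and 0 when the directory holds no regular files.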

Don't.
One of the most quoted pages on SO that I've seen is https://unix.stackexchange.com/questions/128985/why-not-parse-ls-and-what-do-to-instead.
That being said and as a hint for further development, ls -l | awk '/^-/{s+=$5} END {print s}' will probably do what you ask.

Related

how to get previous date files and pass ls output to array in gawk

I have log files generated like the ones below, and I need to run a script daily that lists them and then does two things:
1- get the previous day's (yesterday's) files and transfer them to server x
2- get files older than one day and transfer them to server y
The files look like the listing below; I am trying the code below, but it is not working.
How can we pass the output of ls -altr to gawk? Can we build an associative array like the one below?
array[index]=ls -altr | awk '{print $6,$7,$8}'
This is the code I am trying in order to retrieve the previous day's files, but it is not working:
previous_dates=$(date -d "-1 days" '+-%d')
ls -altr |gawk '{if ( $7!=previous_dates ) print $9 }'
-r-------- 1 root root 6291563 Jun 22 14:45 audit.log.4
-r-------- 1 root root 6291619 Jun 24 09:11 audit.log.3
drwxr-xr-x. 14 root root 4096 Jun 26 03:47 ..
-r-------- 1 root root 6291462 Jun 26 04:15 audit.log.2
-r-------- 1 root root 6291513 Jun 27 23:05 audit.log.1
drwxr-x---. 2 root root 4096 Jun 27 23:05 .
-rw------- 1 root root 5843020 Jun 29 14:57 audit.log
To select files modified yesterday, you could use
find . -daystart -type f -mtime 1
and to select older files, you could use
find . -daystart -type f -mtime +1
possibly adding a -name test to select only files like audit.log*, for example. You could then use xargs to process the files, e.g.
find . -daystart -type f -mtime 1 | xargs -I{} scp {} user@server:
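The two selections above can be wrapped in small functions so that the listing can be checked before anything is transferred. This is a sketch under assumptions: the function names are hypothetical, the search is limited to the directory itself with -maxdepth 1, and -daystart requires GNU find.

```shell
# Hypothetical helpers; -daystart and -maxdepth are GNU find extensions.

# files_from_yesterday DIR PATTERN
# Files matching PATTERN that were last modified during the previous calendar day.
files_from_yesterday() {
    find "$1" -maxdepth 1 -daystart -type f -name "$2" -mtime 1 -print
}

# files_before_yesterday DIR PATTERN
# Files matching PATTERN that were last modified before yesterday.
files_before_yesterday() {
    find "$1" -maxdepth 1 -daystart -type f -name "$2" -mtime +1 -print
}
```

For example, files_from_yesterday . 'audit.log*' only prints the candidates. Once the list looks right, replacing -print with an -exec action hands each file to a command without going through xargs, which avoids trouble with spaces in file names.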

How to get the name of the executables files in bash with ls

I am trying to get the names of the executable files using ls -l.
I tried to pick out the lines of ls -l that contain an x using grep -w x, but the result is not right: some executable files are missing (the .sh ones).
I just need the names of the executable files, not the paths, but I don't know how...
user@user-K53TA:~/Bureau$ ls -l
total 52
-rwxrwxrwx 1 user user 64 oct. 6 21:07 a.sh
-rw-rw-r-- 1 user user 11 sept. 29 21:51 e.txt
-rwxrwxrwx 1 user user 140 sept. 29 23:42 hi.sh
drwxrwxr-x 8 user user 4096 juil. 30 20:47 nerdtree-master
-rw-rw-r-- 1 user user 492 oct. 6 21:07 okk.txt
-rw-rw-r-- 1 user user 1543 oct. 6 21:07 ok.txt
-rw-rw-r-- 1 user user 119 sept. 29 23:27 oo.txt
-rwxrwxr-x 1 user user 8672 sept. 29 21:20 prog
-rw-rw-rw- 1 user user 405 sept. 29 21:23 prog.c
-rw-rw-r-- 1 user user 0 sept. 29 21:58 rev
drwxrwxr-x 3 user user 4096 sept. 29 20:51 sublime
user@user-K53TA:~/Bureau$ ls -l | grep -w x
drwxrwxr-x 8 user user 4096 juil. 30 20:47 nerdtree-master
-rwxrwxr-x 1 user user 8672 sept. 29 21:20 prog
drwxrwxr-x 3 user user 4096 sept. 29 20:51 sublime
Don't parse ls. This can be done with find.
find . -type f -perm /a+x
This finds files with any of the executable bits set: user, group, or other.
Use find instead:
find -executable
find -maxdepth 1 -type f -executable
find -maxdepth 1 -type f -executable -ls
One can use a for loop with glob expansion for discovering and manipulating file names. Observe:
#!/bin/sh
for i in *
do # Only print discoveries that are executable files
[ -f "$i" -a -x "$i" ] && printf "%s\n" "$i"
done
Since the accepted answer uses no ls at all:
ls -l | grep -e '^...x'
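Since the question asked for bare names rather than paths, the find approach can be combined with -printf '%f\n', which prints each name with its leading directories removed. This is a sketch that assumes GNU find (both -perm /a+x and -printf are GNU extensions); list_executable_names is a hypothetical helper name.

```shell
# list_executable_names [DIR]: print the names (not paths) of the regular
# files directly inside DIR that have at least one execute bit set.
# Hypothetical helper; requires GNU find.
list_executable_names() {
    find "${1:-.}" -maxdepth 1 -type f -perm /a+x -printf '%f\n' | sort
}
```

Run in the question's directory, this would be expected to list a.sh, hi.sh and prog, and to leave out the directories that grep -w x picked up.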

parse result of `ls -l` with bash script [duplicate]

I need to store the name of every file contained in a directory with a bash script and process it in some way:
drwxrwxr-x 5 matteorr matteorr 4096 Jan 10 17:37 Cluster
drwxr-xr-x 2 matteorr matteorr 4096 Jan 19 10:43 Desktop
drwxrwxr-x 9 matteorr matteorr 4096 Jan 20 10:01 Developer
drwxr-xr-x 11 matteorr matteorr 4096 Dec 20 13:55 Documents
drwxr-xr-x 2 matteorr matteorr 12288 Jan 20 13:44 Downloads
drwx------ 11 matteorr matteorr 4096 Jan 20 14:01 Dropbox
drwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 Music
drwxr-xr-x 2 matteorr matteorr 4096 Jan 19 22:12 Pictures
drwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 Public
drwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 Templates
drwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 Videos
with the following command I'm able to split the result of ls -l in between all the spaces and then access the last element, which contains the name:
ls -l | awk '{split($0,array," ")} END{print array[9]}'
However, it returns only the last line (i.e. Videos), so I need to iterate over all the lines returned by the ls -l command.
How can I do this?
Is there a better way to approach this whole problem?
ADDED PART
To be a little more specific on what I need to do:
For each entry in a directory: if it is a file I won't do anything; if it is a directory I should append the name of the directory to the names of all the files it contains.
So supposing the directory Videos has the files:
-rwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 video1.mpeg
-rwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 Video2.wmv
I need to rename them as follows:
-rwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 video1_Videos.mpeg
-rwxr-xr-x 2 matteorr matteorr 4096 Oct 18 18:43 Video2_Videos.wmv
A better way would be to use bash globbing
Just listing all files
echo *
Or doing something with them
for file in *; do
echo "$file" # or do something else
done
Or recursively with bash 4+
shopt -s globstar
for file in **/*; do
echo "$file" # or do something else
done
Update to get directory name and append it to all files within it
Replace mv with echo to test what it does. Also note that ${file##*.} assumes the extension is everything after the last period, so a file like file.tar.gz in directory on would be turned into file.tar_on.gz. As far as I know there is no easy way to handle this problem, though you could skip files with multiple periods if you want.
#!/bin/bash
d="/some/dir/to/do/this/on"
name=${d##*/} #name=on
for file in "$d"/*; do
extension=${file##*.}
filename=${file%.*}
filename=${filename##*/}
[[ -f $file ]] && mv "$file" "$d/${filename}_${name}.$extension"
done
e.g.
> ls /some/dir/to/do/this/on
video1.mpeg Video2.wmv
> ./abovescript
> ls /some/dir/to/do/this/on
video1_on.mpeg Video2_on.wmv
Explanation
In bash you can do this
${parameter#word} Removes shortest matching prefix
${parameter##word} Removes longest matching prefix
${parameter%word} Removes shortest matching suffix
${parameter%%word} Removes longest matching suffix
To remove everything (*) before and including the last period, I did the following:
extension=${file##*.}
To remove everything from the last period onwards, I did the following (think of the shortest match here as working from right to left: * matches any non-period text from the right, and once a period is found the whole section is removed):
filename=${file%.*}
To remove everything up to and including the last /, I did below.
filename=${filename##*/}
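The four operators can be compared side by side on one value. The path below is purely illustrative and the variable names are made up for this sketch.

```shell
# Illustrative path only; nothing here touches the file system.
path=/some/dir/archive.tar.gz

without_shortest_prefix=${path#*/}    # some/dir/archive.tar.gz
without_longest_prefix=${path##*/}    # archive.tar.gz
without_shortest_suffix=${path%.*}    # /some/dir/archive.tar
without_longest_suffix=${path%%.*}    # /some/dir/archive

printf '%s\n' "$without_shortest_prefix" "$without_longest_prefix" \
    "$without_shortest_suffix" "$without_longest_suffix"
```

The single and double forms differ only when the pattern can match in more than one place, as it does here with two slashes and two periods.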
Some other notes:
"$d/${filename}_${name}.$extension" Variables can have _ so I switched syntax for a couple of variables here for it to work
"$d"/* Expands to every file of any type (regular, dir, symlink etc...) directly in "$d"
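One case the script above does not cover is a file with no period in its name, where ${file##*.} returns the whole path instead of an extension. A possible variation, written as a sketch with a hypothetical function name (append_dirname), branches on whether the name contains a period:

```shell
# append_dirname DIR: rename every regular file directly inside DIR so that
# the directory's own name is inserted before the extension, or appended
# when the file has no extension. Hypothetical helper, not the answer's script.
append_dirname() (
    d=${1%/}                 # tolerate a trailing slash
    name=${d##*/}            # last path component, e.g. "Videos"
    for file in "$d"/*; do
        [ -f "$file" ] || continue
        base=${file##*/}
        case $base in
            ?*.*) new="${base%.*}_${name}.${base##*.}" ;;  # has an extension
            *)    new="${base}_${name}" ;;                 # no extension
        esac
        mv -- "$file" "$d/$new"
    done
)
```

As with the original script, swapping mv for echo gives a dry run. Names with several periods, such as file.tar.gz, are still split at the last one.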
What is wrong with
ls > myfile.txt
This will only list the file names (nothing else) and send them to myfile.txt
If you want to go the awk route, just do
ls -l | awk '{print $9}'
The default action of awk is to split fields on space - and this prints the 9th field for every line…
If you want to do other things with the file names, you can just extend your awk script. For example, an array with these file names could be created with
ls -l | awk '{a[NR]=$9}'
and you can use this array (called a) in further processing. If the processing requires something other than awk (from the comments I think it does), you would be better off with something that looks like
#!/bin/bash
for f in "$1"/*
do
if [ -d "$f" ] ; then
./listdir "$f"
else
echo "$f"
fi
done
Save this as listdir in your current directory, and you're good to go.
./listdir .
Will list the entire directory, recursing down (with full relative path appended) as needed.
If you want this to be available "from anywhere" (it is a pretty useful command after all) you would put it somewhere in your path (and do a "rehash" command so it will be "known"); then you don't need the ./ at the start of the command.
Good question! Glad you asked. Parsing ls's output is rarely the right thing to do. There are myriad ways to process a list of files. It depends what you want to do with them.
Here are some examples of things you can do. I've used touch as an example command. Replace that with whatever command or commands you want to do.
To run a command over multiple files, often you can simply pass all the files on the command-line.
touch /var/myapp/*
To loop over the files in the current directory:
for file in *; do
touch "$file"
done
To loop over files in another directory:
for file in /some/dir/*; do
touch "$file"
done
To add a .bak suffix to files named *.txt (file.txt becomes file.txt.bak), both here and in sub-directories:
find . -name '*.txt' -exec mv {} {}.bak \;
To delete JPEGs in Bob's home directory (damn you Bob and your wandering eyes):
find ~bob/ -name '*.jpg' -delete
To loop over files recursively and do complicated things to them:
find /dir/to/search -print0 | while IFS= read -r -d '' file; do
echo "$file"
touch "$file"
if [[ -L $file ]]; then
# $file is a symlink, do something special
: # placeholder, because an if-body containing only a comment is a syntax error
fi
done
ls -l | awk '{split($0,array," ")} {print array[9]}'
or
ls -l | awk '{print $9}'
but why not just ls?

counting number of directories in a specific directory

How do I count the number of folders in a specific directory? I am using the following command, but it always reports one extra.
find /directory/ -maxdepth 1 -type d -print| wc -l
For example, if I have 3 folders, this command provides 4. If it contains 5 folders, the command provides 6. Why is that?
find is also printing the directory itself:
$ find .vim/ -maxdepth 1 -type d
.vim/
.vim/indent
.vim/colors
.vim/doc
.vim/after
.vim/autoload
.vim/compiler
.vim/plugin
.vim/syntax
.vim/ftplugin
.vim/bundle
.vim/ftdetect
You can instead test the directory's children and not descend into them at all:
$ find .vim/* -maxdepth 0 -type d
.vim/after
.vim/autoload
.vim/bundle
.vim/colors
.vim/compiler
.vim/doc
.vim/ftdetect
.vim/ftplugin
.vim/indent
.vim/plugin
.vim/syntax
$ find .vim/* -maxdepth 0 -type d | wc -l
11
$ find .vim/ -maxdepth 1 -type d | wc -l
12
You can also use ls:
$ ls -l .vim | grep -c ^d
11
$ ls -l .vim
total 52
drwxrwxr-x 3 anossovp anossovp 4096 Aug 29 2012 after
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 autoload
drwxrwxr-x 13 anossovp anossovp 4096 Aug 29 2012 bundle
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 colors
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 compiler
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 doc
-rw-rw-r-- 1 anossovp anossovp 48 Aug 29 2012 filetype.vim
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 ftdetect
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 ftplugin
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 indent
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 plugin
-rw-rw-r-- 1 anossovp anossovp 2505 Aug 29 2012 README.rst
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 syntax
$ ls -l .vim | grep ^d
drwxrwxr-x 3 anossovp anossovp 4096 Aug 29 2012 after
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 autoload
drwxrwxr-x 13 anossovp anossovp 4096 Aug 29 2012 bundle
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 colors
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 compiler
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 doc
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 ftdetect
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 ftplugin
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 indent
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 plugin
drwxrwxr-x 2 anossovp anossovp 4096 Aug 29 2012 syntax
Get a count of only the directories in the current directory
echo */ | wc
You will get output like 1 309 4594
The second number (the word count) is the number of directories, provided no directory name contains whitespace.
or
tree -L 1 | tail -1
find . -mindepth 1 -maxdepth 1 -type d | wc -l
For find, -mindepth 1 skips the starting directory itself (depth 0)
-maxdepth 1 stops find from descending below the starting directory's immediate children
-type d matches only directories
And wc -l counts the lines of its input
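The effect of -mindepth 1 is easiest to see by running the same search with and without it. The two function names below are hypothetical and exist only for this comparison.

```shell
# Both helpers are hypothetical names. -mindepth and -maxdepth are extensions
# supported by GNU find and most other current implementations.

# Counts DIR itself plus its immediate subdirectories (the "extra one").
count_dirs_with_self() {
    find "$1" -maxdepth 1 -type d | wc -l
}

# -mindepth 1 skips depth 0, i.e. DIR itself, so only children are counted.
count_child_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l
}
```

For a directory with three subfolders the first function reports 4 and the second 3, which is exactly the off-by-one described in the question.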
If you only have directories in the folder and no files this does it:
ls | wc -l
Run stat -c %h folder and subtract 2 from the result. This employs only a single subprocess as opposed to the 2 (or even 3) required by most of the other solutions here (typically find or ls plus wc).
Using sh/bash:
echo $((`stat -c %h folder` - 2))   # 'echo' is a shell builtin
Using csh/tcsh:
@ cnt = `stat -c %h folder` - 2; echo $cnt   # 'echo' is a shell builtin
Explanation: stat -c %h folder prints the number of hardlinks to folder, and each subfolder under folder contains a ../ entry which is a hardlink back to folder. You must subtract 2 because there are two additional hardlinks in the count:
folder's own self-referential ./ entry, and
folder's parent's link to folder
Navigate to your drive and simply execute
ls -lR | grep ^d | wc -l
To find all folders in total, including subdirectories:
find /mount/point -type d | wc -l
...or to find all folders in the root directory (not including subdirectories):
find /mount/point -maxdepth 1 -type d | wc -l
Cheers!
I think the easiest is
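ls -ld images/*/ | wc -l
where images is your target directory. The trailing slash in the pattern limits the matches to directories, the -d flag lists the directories themselves rather than their contents, and the -l flag produces a per-line listing, compatible with the very familiar wc -l for line count.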
The number of directories can be found with the command below
ls -l | grep "^d" | wc -l
Some useful examples:
count files in current dir
/bin/ls -lA | egrep -c '^-'
count dirs in current dir
/bin/ls -lA | egrep -c '^d'
count files and dirs in current dir
/bin/ls -lA | egrep -c '^-|^d'
count files and dirs in one subdirectory
/bin/ls -lA subdir_name/ | egrep -c '^-|^d'
I have noticed a strange thing (at least in my case):
When I tried ls instead of /bin/ls, the -A option (do not list implied . and ..) did not work as expected, presumably because ls was aliased with other options.
Plain ls showed ./ and ../, which made the count wrong. Solution: use /bin/ls instead of ls.
To get the number of directories, go to the directory and execute
ls -l | grep -c ^d
A pure bash solution:
shopt -s nullglob
dirs=( /path/to/directory/*/ )
echo "There are ${#dirs[@]} (non-hidden) directories"
If you also want to count the hidden directories:
shopt -s nullglob dotglob
dirs=( /path/to/directory/*/ )
echo "There are ${#dirs[@]} directories (including hidden ones)"
Note that this will also count links to directories. If you don't want that, it's a bit more difficult with this method.
Using find:
find /path/to/directory/. -type d \! -name . -prune -exec printf x \; | wc -c
The trick is to output an x to stdout each time a directory is found, and then use wc to count the number of characters. This will count the number of all directories (including hidden ones), excluding links.
The methods presented here are all safe with respect to unusual characters that can appear in file names (spaces, newlines, glob characters, etc.).
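The print-one-character trick can be packaged as a function. The sketch below uses -mindepth 1 -maxdepth 1 to restrict the search to immediate children, which is a substitution for the -prune expression above rather than the same command; count_dirs_safe is a hypothetical name.

```shell
# count_dirs_safe DIR (hypothetical helper): count the immediate
# subdirectories of DIR, hidden ones included, symlinks excluded.
# One "x" is printed per directory and wc -c counts the characters, so a
# directory name containing a newline cannot inflate the result the way
# it would with wc -l.
count_dirs_safe() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -exec printf x \; | wc -c
}
```

The trade-off is one printf process per directory, which is usually negligible for a single directory level.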
Using zsh:
a=(*(/N)); echo ${#a}
The N is a nullglob, / makes it match directories, # counts. It will neatly cope with spaces in directory names as well as returning 0 if there are no directories.
The best answer to what you want is
echo `find . -maxdepth 1 -type d | wc -l`-1 | bc
this subtracts one to remove the unwanted '.' directory that find lists (as patel deven mentioned above).
If you want to count subfolders recursively, then just leave off the maxdepth option, so
echo `find . -type d | wc -l`-1 | bc
PS If you find command substitution ugly, subtracting one can be done as a pure stream using sed and bc.
Subtracting one from count:
find . -maxdepth 1 -type d | wc -l | sed 's/$/-1\n/' | bc
or, adding count to minus one:
find . -maxdepth 1 -type d | wc -l | sed 's/^/-1+/' | bc
Count all files and subfolders, windows style:
dir=/YOUR/PATH;f=$(find $dir -type f | wc -l); d=$(find $dir -mindepth 1 -type d | wc -l); echo "$f Files, $d Folders"
If you want to use regular expressions, then try:
ls -l | grep "^d" | wc -l
If you want to count folders that have similar names like folder01,folder02,folder03, etc then you can do
ls -l | grep ^d | grep -c folder
Another way to do it:
ls -la | grep -v total | wc -l
Note that this counts every entry, files as well as directories and including . and .., so it is not a count of directories only.

Linux - Save only recent 10 folders and delete the rest

I have a folder that contains versions of my application, each time I upload a new version a new sub-folder is created for it, the sub-folder name is the current timestamp, here is a printout of the main folder used (ls -l |grep ^d):
drwxrwxr-x 7 root root 4096 2011-03-31 16:18 20110331161649
drwxrwxr-x 7 root root 4096 2011-03-31 16:21 20110331161914
drwxrwxr-x 7 root root 4096 2011-03-31 16:53 20110331165035
drwxrwxr-x 7 root root 4096 2011-03-31 16:59 20110331165712
drwxrwxr-x 7 root root 4096 2011-04-03 20:18 20110403201607
drwxrwxr-x 7 root root 4096 2011-04-03 20:38 20110403203613
drwxrwxr-x 7 root root 4096 2011-04-04 14:39 20110405143725
drwxrwxr-x 7 root root 4096 2011-04-06 15:24 20110406151805
drwxrwxr-x 7 root root 4096 2011-04-06 15:36 20110406153157
drwxrwxr-x 7 root root 4096 2011-04-06 16:02 20110406155913
drwxrwxr-x 7 root root 4096 2011-04-10 21:10 20110410210928
drwxrwxr-x 7 root root 4096 2011-04-10 21:50 20110410214939
drwxrwxr-x 7 root root 4096 2011-04-10 22:15 20110410221414
drwxrwxr-x 7 root root 4096 2011-04-11 22:19 20110411221810
drwxrwxr-x 7 root root 4096 2011-05-01 21:30 20110501212953
drwxrwxr-x 7 root root 4096 2011-05-01 23:02 20110501230121
drwxrwxr-x 7 root root 4096 2011-05-03 21:57 20110503215252
drwxrwxr-x 7 root root 4096 2011-05-06 16:17 20110506161546
drwxrwxr-x 7 root root 4096 2011-05-11 10:00 20110511095709
drwxrwxr-x 7 root root 4096 2011-05-11 10:13 20110511100938
drwxrwxr-x 7 root root 4096 2011-05-12 14:34 20110512143143
drwxrwxr-x 7 root root 4096 2011-05-13 22:13 20110513220824
drwxrwxr-x 7 root root 4096 2011-05-14 22:26 20110514222548
drwxrwxr-x 7 root root 4096 2011-05-14 23:03 20110514230258
I'm looking for a command that will leave the last 10 versions (sub-folders) and deletes the rest.
Any thoughts?
There you go. (edited)
ls -dt */ | tail -n +11 | xargs rm -rf
First list the directories by modification time, most recent first; then take all of them except the first 10 and send them to rm -rf.
ls -dt1 /path/to/folder/*/ | sed -n '11,$p' | xargs rm -r
This assumes those are the only directories and no others are present in the working directory.
ls -d lists the directories themselves rather than their contents, /*/ matches only directories and prints their full paths, 1 ensures one line per match, and t sorts by time with the newest at the top.
sed -n '11,$p' prints only the 11th line down to the bottom, and those lines are then passed to rm through xargs.
For testing you may wish to remove | xargs rm -r to see if the directories are listed properly first.
If the directories' names contain the date one can delete all but the last 10 directories with the default alphabetical sort
ls -d */ | head -n -10 | xargs rm -rf
ls -lt | grep ^d | sed -e '1,10d' | awk '{sub(/.* /, ""); print }' | xargs rm -rf
Explanation:
list all contents of the current directory in chronological order (most recent first)
keep only the directories
drop the first 10 lines (the 10 newest directories)
use awk to extract the names from the remaining 'ls -l' output
remove those directories
EDIT:
find . -maxdepth 1 -type d ! -name \\.| sort | tac | sed -e '1,10d' | xargs rm -rf
I suggest the following sequence. I use a similar approach on my Synology NAS to delete old backups. It doesn't rely on the folder names, instead it uses the last modified time to decide which folders to delete. It also uses zero-termination in order to correctly handle quotes, spaces and newline characters in the folder names:
find /path/to/folder -maxdepth 1 -mindepth 1 -type d -printf '%Ts\t' -print0 \
| sort -rnz \
| tail -n +11 -z \
| cut -f2- -z \
| xargs -0 -r rm -rf
IMPORTANT: This will delete any matching folders! I strongly recommend doing a test run first by replacing the last command xargs -0 -r rm -rf with xargs -0 which will echo the matching folders instead of deleting them.
A short explanation of each step:
find /path/to/folder -maxdepth 1 -mindepth 1 -type d -printf '%Ts\t' -print0
Find all directories (-type d) directly inside the backup folder (-maxdepth 1) except the backup folder itself (-mindepth 1), print (-printf) the Unix time (%Ts) of the last modification followed by a tab character (\t, used in step 4) and the full file name followed by a null character (-print0).
sort -rnz
Sort the zero-terminated items (-z) from the previous step using a numerical comparison (-n) and reverse the order (-r). The result is a list of all folders sorted by their last modification time in descending order.
tail -n +11 -z
Print the last lines (tail) from the previous step starting from line 11 (-n +11) considering each line as zero-terminated (-z). This excludes the newest 10 folders (by modification time) from the remaining steps.
cut -f2- -z
Cut each line from the second field until the end (-f2-), treating each line as zero-terminated (-z), to obtain a list containing the full path of every folder except the 10 newest.
xargs -r -0 rm -rf
Take the zero-terminated (-0) items from the previous step (xargs), and, if there are any (-r avoids running the command passed to xargs if there are no nonblank characters), force delete (rm -rf) them.
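The steps above can be turned into a dry run that only prints what would be removed. This is a sketch under assumptions: list_dirs_to_delete is a hypothetical name, %T@ (seconds since the epoch, documented for GNU find) is used in place of %Ts, and GNU coreutils is required for the -z options.

```shell
# list_dirs_to_delete DIR [KEEP] (hypothetical helper): print, one per line,
# the subdirectories of DIR that are NOT among the KEEP most recently
# modified ones (default 10). Nothing is deleted.
# Requires GNU find, sort, tail and cut.
list_dirs_to_delete() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -printf '%T@\t%p\0' \
        | LC_ALL=C sort -z -rn \
        | tail -z -n "+$(( ${2:-10} + 1 ))" \
        | cut -z -f2- \
        | tr '\0' '\n'   # for display only; this gives up newline safety
}
```

Once the printed list looks right, the final tr stage can be swapped for xargs -0 -r rm -rf as in the original pipeline.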
Your directory names are sorted in chronological order, which makes this easy. The list of directories in chronological order is just *, or [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] to be more precise. So you want to delete all but the last 10 of them.
set [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
while [ $# -gt 10 ]; do
rm -rf "$1"
shift
done
(While there are more than 10 directories left, delete the oldest one.)
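Because rm -rf is unforgiving, it may be worth listing the candidates first. The sketch below follows the same positional-parameter idea but only prints the names; list_stale_versions is a hypothetical helper, and the looser [0-9]*/ pattern is an assumption made to keep the example short.

```shell
# list_stale_versions DIR [KEEP] (hypothetical helper): print the names of
# the digit-named subdirectories of DIR that fall outside the KEEP newest
# (default 10), relying on the timestamp names sorting chronologically.
# Nothing is deleted; the body runs in a subshell so cd and set stay local.
list_stale_versions() (
    keep=${2:-10}
    cd "$1" || exit 1
    set -- [0-9]*/
    # If nothing matched, the pattern is left unexpanded and is not a directory.
    [ -d "$1" ] || exit 0
    while [ "$#" -gt "$keep" ]; do
        printf '%s\n' "${1%/}"
        shift
    done
)
```

The printed names are relative to DIR, so they can be reviewed and then removed from inside that directory.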

Resources