Output of cat to bash numeric variable - linux

I have a set of files, each containing a single (integer) number, which is the number of files in the directory of the same name (without the .txt suffix) - the result of a wc on each of the directories.
I would like to sum the numbers in the files. I've tried:
i=0;
find -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | while read j; do i=$i+`cat $j.txt`; done
echo $i
But the answer is 0. If I simply echo the output of cat:
i=0; find -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | while read j; do echo `cat $j.txt`; done
The values are there:
1313
1528
13465
22258
7262
6162
...
Presumably I have to cast the output of cat somehow?
[EDIT]
I did find my own solution in the end:
i=0;
for j in `find -mindepth 1 -maxdepth 1 -type d -printf '%f\n'`; do
    expr $((i+=$(cat $j.txt)));
done;
28000
30250
...
...
647185
649607
but the accepted answer is neater, as it doesn't print the running total along the way.

You're getting 0 because your while loop runs in a subshell (each stage of a pipeline does), so the variable that stores the sum goes out of scope once the loop ends. For details, see BashFAQ/024. Note also that i=$i+`cat $j.txt` performs string concatenation, not addition; even in the current shell you would need arithmetic expansion, e.g. i=$((i + $(cat "$j.txt"))), to get a numeric sum.
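As a quick illustration of the subshell behavior (a minimal sketch, not part of the original code):
x=0
printf '1\n2\n' | while read n; do x=$((x + n)); done
echo "$x"   # prints 0: x was only modified inside the pipeline's subshell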
Here's one way to solve it, using process substitution (instead of pipes):
SUM=0
while read V; do
    let SUM="SUM+V"
done < <(find -mindepth 1 -maxdepth 1 -type d -exec cat "{}.txt" \;)
Note that I've taken the liberty of changing the find/cat/sum operations, but your approach should work fine as well.
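If you'd rather keep the pipeline, bash 4.2 and newer offer the lastpipe option, which runs the last stage of a pipeline in the current shell when job control is off (as it is in scripts). A sketch along those lines, under that assumption:
#!/bin/bash
shopt -s lastpipe
SUM=0
find -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | while read j; do
    ((SUM += $(cat "$j.txt")))
done
echo "$SUM"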

My one-liner solution, without the need for find:
echo $(( $(printf '%s\n' */ | tr -d / | xargs -I% cat "%.txt" | tr '\n' '+')0 ))
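To see what the arithmetic expansion receives, you can run the pipeline on its own; with the sample values from the question it produces something like 1313+1528+13465+... with a trailing +, which the literal 0 at the end closes off:
printf '%s\n' */ | tr -d / | xargs -I% cat "%.txt" | tr '\n' '+'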

Related

bash - count files per directory and total at the end

Ubuntu 18.04 LTS with bash 4.4.20
I am trying to count the number of files in each directory starting in the directory where I executed the script. Borrowing from other coders, I found this script and modified it. I am trying to modify it to provide a total at the end, but I can't seem to get it. Also, the script is running the same count function twice each loop and that is inefficient. I inserted that extra find command because I could not get the results of the nested 'find | wc -l' to store in a variable. And it still didn't work.
Thanks!
#!/bin/bash
count=0
find . -maxdepth 1 -mindepth 1 -type d | sort -n | while read dir; do
printf "%-25.25s : " "$dir"
find "$dir" -type f | wc -l
filesthisdir=$(find "$dir" -type f | wc -l)
count=$count+$filesthisdir
done
echo "Total files : $count"
Here are the results. It should total up the results. Otherwise, this would work well.
./1800wls1 : 1086
./1800wls2 : 1154
./1900wls-in1 : 780
./1900wls-in2 : 395
./1900wls-in3 : 0
./1900wls-out1 : 8
./1900wls-out2 : 304
./1900wls-out3 : 160
./test : 0
Total files : 0
This doesn't work because the while loop is executed in a subshell. By feeding the loop with a here-string (<<<) instead of a pipe, you make sure it's executed in the current shell.
#!/bin/bash
count=0
while read dir; do
printf "%-25.25s : " "$dir"
find "$dir" -type f | wc -l
filesthisdir=$(find "$dir" -type f | wc -l)
((count+=filesthisdir))
done <<< "$(find . -maxdepth 1 -mindepth 1 -type d | sort -n)"
echo "Total files : $count"
Of course you can also make use of a for loop; note that the command substitution must be left unquoted for the loop to iterate, which means this variant breaks on directory names containing whitespace:
for i in $(find . -maxdepth 1 -mindepth 1 -type d | sort -n); do
    # do something
done
Use ((count += filesthisdir)), and bear in mind that file names containing newlines will throw off counts based on wc -l.
You should change your find command:
filesthisdir=$(find "$dir" -type f -exec echo . \; | wc -l)
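With GNU find, the same newline-safe count can be had with fewer processes by printing one character per file and counting bytes (a sketch, assuming GNU find's -printf):
filesthisdir=$(find "$dir" -type f -printf '.' | wc -c)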

How to remove blank lines from text files?

I want to remove all empty lines from some text files. I can do it with:
grep '[^[:blank:]]' < file1.dat > file1.dat.nospace
But I need to do it for n files in a directory. How can I do it?
Any help would be appreciated. Thanks!
You can do it with find:
find . -name '*.dat' -exec sed -i.bak '/^[[:blank:]]*$/d' {} +
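The -i.bak flag keeps a backup of each original file, so you can review the changes and then discard the backups (hypothetical file names):
diff file1.dat.bak file1.dat   # review what was removed
rm ./*.dat.bak                 # delete the backups once satisfied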
Here is a way:
for filename in *.dat; do
    grep '[^[:blank:]]' < "$filename" > "$filename.nospace"
done
Here is a more robust way, one that works in a larger variety of circumstances:
find . -maxdepth 1 -type f -name "*.dat" | while read filename; do
    grep '[^[:blank:]]' < "$filename" > "$filename.nospace"
done
Here is a much faster way (in execution time, but also in typing); this is the way I would actually do it:
find *.dat -printf "grep '[^[:blank:]]' < \"%f\" > \"%f.nospace\"\n" | sh
Here is a more robust version of that:
find . -maxdepth 1 -type f -name "*.dat" -printf "grep '[^[:blank:]]' < \"%f\" > \"%f.nospace\"\n" | sh
PS: here's the literal grep for non-empty lines (note it keeps whitespace-only lines, which the [^[:blank:]] pattern above removes):
grep -v '^$' < "$filename" > "$filename.nospace"
This one-liner could probably help you:
for a in /path/to/file_pattern*; do sed "/^\s*$/d" "$a" > "$a.nospace"; done
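Note that \s is a GNU sed extension; a portable equivalent uses a POSIX character class (a sketch of the same loop):
for a in /path/to/file_pattern*; do sed '/^[[:blank:]]*$/d' "$a" > "$a.nospace"; done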

Bash find and expression

Is there some way to make this work?
pFile=find ${destpath} (( -iname "${mFile##*/}" )) -o (( -iname "${mFile##*/}" -a -name "*[],&<>*?|\":'()[]*" )) -exec printf '.' \;| wc -c
I need pFile to hold the number of files with the same filename, or 0 if there are none.
I have to do this because, if I only use:
pFile=find ${destpath} -iname "${mFile##*/}" -exec printf '.' \;| wc -c
it doesn't match filenames that contain metacharacters (since -iname treats them as pattern characters).
Thanks
EDIT:
"${mFile##*/}" have as output file name in start folder without path.
echo "${mFile##*/}" -> goofy.mp3
Example:
In the start folder I have:
goofy.mp3 - mickey[1].avi - donald(2).mkv - scrooge.3gp
In the destination folder I have:
goofy.mp3 - mickey[1].avi - donald(2).mkv - donald(1).mkv - donald(3).mkv - minnie.iso
I want this:
echo pFile -> 3
With:
pFile=find ${destpath} -iname "${mFile##*/}" -exec printf '.' \;| wc -c
echo pFile -> 2
With:
pFile=find ${destpath} -name "*[],&<>*?|\":'()[]*" -exec printf '.' \;| wc -c
echo pFile -> 4
By "same file name" I mean:
/path1/mickey[1].avi = /path2/mickey[1].avi
I am not sure I understood your intended semantics of ${mFile##*/}; however, looking at your start/destination folder example, I have created the following use-case directory structure and the script below to solve your issue:
$ find root -type f | sort -t'/' -k3
root/dir2/donald(1).mkv
root/dir1/donald(2).mkv
root/dir2/donald(2).mkv
root/dir2/donald(3).mkv
root/dir1/goofy.mp3
root/dir2/goofy.mp3
root/dir1/mickey[1].avi
root/dir2/mickey[1].avi
root/dir2/minnie.iso
root/dir1/scrooge.3gp
Now, the following script (I've used gfind to indicate that you need GNU find for this to work, but if you're on Linux, just use find):
$ pFile=$(($(gfind root -type f -printf "%f\n" | wc -l) - $(gfind root -type f -printf "%f\n" | sort -u | wc -l)))
$ echo $pFile
3
I'm not sure this solves your issue, however it does print the number you expected in your provided example.
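To see the two counts that the arithmetic subtracts, you can run the pipelines separately; on the example tree above they give 10 file names in total and 7 distinct ones, hence 3 duplicates:
gfind root -type f -printf "%f\n" | wc -l            # 10
gfind root -type f -printf "%f\n" | sort -u | wc -l  # 7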

Can this be printed on the same line?

This command will count the number of files in the sub-directories.
find . -maxdepth 1 -type d | while read dir; do echo "$dir"; find "$dir" -type f | wc -l; done
Which looks like
./lib64
327
./bin
118
Would it be possible to have it to look like
327 ./lib64
118 ./bin
instead?
There are a number of ways to do this... Here's something that doesn't change your code very much. (I've put it on multiple lines for readability.)
find . -maxdepth 1 -type d | while read dir; do
    echo `find "$dir" -type f | wc -l` "$dir"
done
Pipe into tr to remove or replace newlines. I expect you want the newline turned into a tab character, like this:
find . -maxdepth 1 -type d | while read dir; do
    find "$dir" -type f | wc -l | tr '\n' '\t'
    echo "$dir"
done
(Edit: I had them the wrong way around)
do echo -n "$dir "
The -n prevents echo from ending the line afterwards.
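Put together, that suggestion looks like this (a sketch; note it prints the directory before the count, unlike the tr variant above):
find . -maxdepth 1 -type d | while read dir; do
    echo -n "$dir "
    find "$dir" -type f | wc -l
done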

How do I recursively list all directories at a location, breadth-first?

Breadth-first list is important, here. Also, limiting the depth searched would be nice.
$ find . -type d
/foo
/foo/subfoo
/foo/subfoo/subsub
/foo/subfoo/subsub/subsubsub
/bar
/bar/subbar
$ find . -type d -depth
/foo/subfoo/subsub/subsubsub
/foo/subfoo/subsub
/foo/subfoo
/foo
/bar/subbar
/bar
$ < what goes here? >
/foo
/bar
/foo/subfoo
/bar/subbar
/foo/subfoo/subsub
/foo/subfoo/subsub/subsubsub
I'd like to do this using a bash one-liner, if possible. If there were a javascript-shell, I'd imagine something like
bash("find . -type d").sort( function (x) x.findall(/\//g).length; )
The find command supports the -printf option, which recognizes a lot of placeholders.
One such placeholder is %d, which renders the depth of the given path, relative to where find started.
Therefore you can use the following simple one-liner:
find -type d -printf '%d\t%P\n' | sort -nk1 | cut -f2-
It is quite straightforward, and does not depend on heavy tooling like perl.
How it works:
it internally generates a list of paths, each rendered as a two-field line
the first field contains the depth, which is used for numerical sorting and then cut away
the result is a simple listing, one path per line, in shallowest-first (breadth-first) order
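The question also mentions limiting the search depth; with GNU find that is just -maxdepth, and sort's -s flag keeps find's own order among paths at the same depth (a sketch, assuming GNU tools):
find -maxdepth 2 -type d -printf '%d\t%P\n' | sort -s -n -k1,1 | cut -f2-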
If you want to do it using standard tools, the following pipeline should work:
find . -type d | perl -lne 'print tr:/::, " $_"' | sort -n | cut -d' ' -f2
That is,
find and print all the directories here in depth first order
count the number of slashes in each directory and prepend it to the path
sort by depth (i.e., number of slashes)
extract just the path.
To limit the depth found, add the -maxdepth argument to the find command.
If you want the directories listed in the same order that find output them, use "sort -n -s" instead of "sort -n"; the "-s" flag stabilizes the sort (i.e., preserves input order among items that compare equally).
You can use the find command:
find /path/to/dir -type d
For example, this lists the directories in the current directory:
find . -type d
My feeling is that this is a better solution than the previously mentioned ones. It involves grep and a loop, but I find it works very well, specifically for cases where you want the output line-buffered rather than waiting for the whole find to finish.
It is more resource intensive because of:
Lots of forking
Lots of finds
Each directory before the current depth is hit by find as many times as there is total depth to the file structure (this shouldn't be a problem if you have practically any amount of ram...)
This is good because:
It uses bash and basic gnu tools
It can be broken whenever you want (like you see what you were looking for fly by)
It works per line and not per find, so subsequent commands don't have to wait for a find and a sort
It works based on the actual file system separation, so if you have a directory with a slash in it, it won't be listed deeper than it is; if you have a different path separator configured, you still are fine.
#!/bin/bash
depth=0
while find -mindepth $depth -maxdepth $depth | grep '.'
do
    depth=$((depth + 1))
done
You can also fit it onto one line fairly(?) easily:
depth=0; while find -mindepth $depth -maxdepth $depth | grep --color=never '.'; do depth=$((depth + 1)); done
But I prefer small scripts over typing...
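Since the question asks for directories only, the same loop can be narrowed with -type d (a small variation on the above):
depth=0; while find -mindepth $depth -maxdepth $depth -type d | grep --color=never '.'; do depth=$((depth + 1)); done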
I don't think you could do it using built-in utilities, since when traversing a directory hierarchy you almost always want a depth-first search, either top-down or bottom-up. Here's a Python script that will give you a breadth-first search:
import os, sys
rootdir = sys.argv[1]
queue = [rootdir]
while queue:
    file = queue.pop(0)
    print(file)
    if os.path.isdir(file):
        queue.extend(os.path.join(file, x) for x in os.listdir(file))
Edit:
Using os.path-module instead of os.stat-function and stat-module.
Using list.pop and list.extend instead of del and += operators.
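Saved as, say, bfs.py (a hypothetical name), it takes the root directory as its only argument:
python3 bfs.py .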
I tried to find a way to do this with find but it doesn't appear to have anything like a -breadth option. Short of writing a patch for it, try the following shell incantation (for bash):
LIST="$(find . -mindepth 1 -maxdepth 1 -type d)";
while test -n "$LIST"; do
    for F in $LIST; do
        echo $F;
        test -d "$F" && NLIST="$NLIST $(find $F -maxdepth 1 -mindepth 1 -type d)";
    done;
    LIST=$NLIST;
    NLIST="";
done
I sort of stumbled upon this accidentally, so I don't know if it works in general (I was testing it only on the specific directory structure you were asking about).
If you want to limit the depth, put a counter variable in the outer loop, like so (I'm also adding comments to this one):
# initialize the list of subdirectories being processed
LIST="$(find . -mindepth 1 -maxdepth 1 -type d)";
# initialize the depth counter to 0
let i=0;
# as long as there are more subdirectories to process and we haven't hit the max depth
while test "$i" -lt 2 -a -n "$LIST"; do
    # increment the depth counter
    let i++;
    # for each subdirectory in the current list
    for F in $LIST; do
        # print it
        echo $F;
        # double-check that it is indeed a directory, and if so
        # append its contents to the list for the next level
        test -d "$F" && NLIST="$NLIST $(find $F -maxdepth 1 -mindepth 1 -type d)";
    done;
    # set the current list equal to the next level's list
    LIST=$NLIST;
    # clear the next level's list
    NLIST="";
done
(replace the 2 in -lt 2 with the desired maximum depth)
Basically this implements the standard breadth-first search algorithm using $LIST and $NLIST as a queue of directory names. Here's the latter approach as a one-liner for easy copy-and-paste:
LIST="$(find . -mindepth 1 -maxdepth 1 -type d)"; let i=0; while test "$i" -lt 2 -a -n "$LIST"; do let i++; for F in $LIST; do echo $F; test -d "$F" && NLIST="$NLIST $(find $F -maxdepth 1 -mindepth 1 -type d)"; done; LIST=$NLIST; NLIST=""; done
Without the desired ordering:
find . -maxdepth <depth> -type d
To get the desired ordering, you have to do the recursion yourself, with this small shell script:
#!/bin/bash
r ()
{
    let level=$3+1
    if [ $level -gt $4 ]; then return 0; fi
    cd "$1"
    for d in *; do
        if [ -d "$d" ]; then
            echo $2/$d
        fi;
    done
    for d in *; do
        if [ -d "$d" ]; then
            (r "$d" "$2/$d" $level $4)
        fi;
    done
}
r "$1" "$1" 0 "$2"
Then you can call this script with the base directory and the depth as parameters.
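For example, assuming it is saved as breadth.sh (a hypothetical name):
./breadth.sh . 2   # list the directories below the current one, two levels deep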
Here's a possible way, using find. I've not thoroughly tested it, so user beware...
depth=0
output=$(find . -mindepth $depth -maxdepth $depth -type d | sort);
until [[ ${#output} -eq 0 ]]; do
echo "$output"
let depth=$depth+1
output=$(find . -mindepth $depth -maxdepth $depth -type d | sort)
done
Something like this:
find . -type d |
perl -lne 'push @_, $_;
    print join $/,
    sort {
        length $a <=> length $b ||
        $a cmp $b
    } @_ if eof'
