Write a command to display a text file's name and its size on separate lines in Linux

I want to display each text file's name and its size on separate lines.
I have tried
du *.* | cut -f 1
This gives me only the sizes of the files in the given directory.
du *.* | cut -f 2
This gives the file names.
But I couldn't figure out how to format the output so that the size comes first, followed by the file name on the next line.
Example:
4
file1.txt
5
file2.txt

I just figured it out; this works as expected (the character class is quoted so that the shell cannot expand it as a glob pattern):
du *.txt* | tr '[:space:]' '\n'

You can do some awk scripting:
for file in *
do
echo "$file $(du "$file" | awk '{print $1}')"
done
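To get the layout asked for in the question (size on one line, file name on the next), a minimal sketch along these lines may help. It assumes that du separates the size and the name with a tab, and print_size_then_name is a name made up for illustration.

```shell
# Hypothetical helper: print each file's size (in du's default block units),
# then the file name on the following line.
print_size_then_name() {
    for f in "$@"; do
        [ -f "$f" ] || continue            # skip anything that is not a regular file
        size=$(du -- "$f" | cut -f 1)      # first tab-separated field is the size
        printf '%s\n%s\n' "$size" "$f"
    done
}

# Usage: print_size_then_name *.txt
```

Because the name is printed from the loop variable rather than cut out of du's output, file names containing spaces stay on one line.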

Related

Output of wc -l without file-extension

I've got the following line:
wc -l ./*.txt | sort -rn
I want to cut off the file extension. With this code I get the output:
number filename.txt
for all my .txt files in the current directory. But I want the output without the file extension, like this:
number filename
I tried piping to cut with various parameters, but all I managed was to cut off the whole file name with this command:
wc -l ./*.txt | sort -rn | cut -f 1 -d '.'
Assuming you don't have newlines in your filename you can use sed to strip out ending .txt:
wc -l ./*.txt | sort -rn | sed 's/\.txt$//'
Unfortunately, cut doesn't have a syntax for extracting columns by an index counted from the end. One (somewhat clunky) trick is to use rev to reverse each line, apply cut to it, and then rev it back:
wc -l ./*.txt | sort -rn | rev | cut -d'.' -f2- | rev
Using sed in a more generic way to cut off whatever extension the files have:
$ wc -l *.txt | sort -rn | sed 's/\.[^\.]*$//'
14 total
8 woc
3 456_base
3 123_base
0 empty_base
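If the counting is done in a loop rather than in a single wc call, the extension can also be removed with the shell's own ${name%.*} expansion instead of post-processing the text. The following is only a sketch, not a drop-in replacement: it prints no total line, and the function name count_lines_no_ext is hypothetical.

```shell
# Hypothetical helper: print "<line count> <name without last extension>"
# for each file argument, highest count first.
count_lines_no_ext() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        # Reading from stdin keeps the file name out of wc's output;
        # tr removes the padding that some wc versions add.
        lines=$(wc -l < "$f" | tr -d ' ')
        name=${f##*/}                            # strip leading directories
        printf '%s %s\n' "$lines" "${name%.*}"   # strip the last extension only
    done | sort -rn
}

# Usage: count_lines_no_ext ./*.txt
```

Like the generic sed version, this removes only the last extension, so archive.tar.gz becomes archive.tar.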
A more robust approach is to look at the detected file type rather than the extension (what is the extension of a .tar.gz or other multi-extension file?):
#!/bin/bash
# Classify each argument by the description that file(1) reports.
for file; do
case $(file -b "$file") in
*ASCII*) echo "this is ascii" ;;
*PDF*) echo "this is pdf" ;;
*) echo "other cases" ;;
esac
done
This is a proof of concept, not tested; feel free to adapt, improve, or modify it.

Bash shell script for finding file size

Consider:
var=`ls -l | grep TestFile.txt | awk '{print $5}'`
I am able to read file size, but how does it work?
Don't parse ls
size=$( stat -c '%s' TestFile.txt )
Yes, so basically you could divide it into 4 parts:
ls -l
List the current directory content (-l for long listing format)
| grep TestFile.txt
Pipe the result and look for the file you are interested in
| awk '{print $5}'
Pipe the result to awk, which splits each line on whitespace and prints the fifth column, which happens to be the file size in this case (but this is fragile: grep also matches other entries whose names contain the pattern, such as TestFile.txt.bak, in which case several sizes are printed)
var=`...`
The backquotes (`) enclose a command. Its output is stored in the variable var.
NOTE: With GNU coreutils you can get the file size directly: stat -c %s TestFile.txt prints just the size in bytes, while du -b TestFile.txt prints the size followed by the file name.
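Putting the "don't parse ls" advice into a reusable form, here is a small sketch. stat -c '%s' is the GNU coreutils spelling; where stat does not accept -c, the sketch falls back to wc -c, which is specified by POSIX. The name file_size is illustrative only.

```shell
# Hypothetical helper: print the size of one file in bytes.
file_size() {
    # GNU stat: -c '%s' prints the size in bytes. Errors are silenced so the
    # fallback can take over on systems whose stat does not accept -c.
    if size=$(stat -c '%s' -- "$1" 2>/dev/null); then
        printf '%s\n' "$size"
    else
        # POSIX fallback: count the bytes; tr removes padding some wc versions add.
        wc -c < "$1" | tr -d ' '
    fi
}

# Usage: var=$(file_size TestFile.txt)
```

Neither branch depends on the column layout of ls, so owner names, dates, and similarly named files cannot shift the result.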

Removing the file part of the output from du in a bash script

I'm trying to remove the path from du's output in my bash script. I'm just trying to print out the size of the current directory. So it looks like this:
DIRSIZE=$(du -hs "$1")
printf "The size of the directory given is: %s\n" "$DIRSIZE"
I want the output to look like this:
The size of the directory given is: 32K
However, my command currently outputs:
The size of the directory given is: 32K /home/dir_listed/
Is there an easy way to remove the directory?
With awk:
DIRSIZE=$(du -hs "$1" | awk '{print $1}')
Take only the first field from the du output and save it to DIRSIZE.
With sed:
DIRSIZE=$(du -hs "$1" | sed 's/[[:space:]].*//')
Remove everything from the first whitespace character to the end of the line and save the result to DIRSIZE.
With cut:
DIRSIZE=$(du -hs "$1" | cut -f 1)
Take only the first field from the du output, which is tab-separated, and save it to DIRSIZE.
Try this:
DIRSIZE=$(du -hs "$1" | awk '{print $1}')
printf "The size of the directory given is: %s\n" "$DIRSIZE"
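Since the question mentions the current directory, a variant that falls back to . when no argument is given may be useful. This is a sketch under the same assumption as the cut answer (du separates size and path with a tab); dir_size is a made-up name.

```shell
# Hypothetical helper: print only the human-readable size of a directory.
dir_size() {
    dir=${1:-.}                      # fall back to the current directory
    du -sh -- "$dir" | cut -f 1      # keep the size, drop the path
}

# Usage:
#   printf 'The size of the directory given is: %s\n' "$(dir_size "$1")"
```

Quoting "$dir" keeps directory names with spaces intact, which the unquoted $1 would split into several arguments.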

Output filename/lines/type for given directory

I'm trying to teach myself basic file manipulation and scripting in linux but I've hit a wall. Right now I'm trying to output a table that gives something like
FILENAME LINES TYPE
File1 22 File
File2 56 File
Folder1 N/A Directory
when given any directory to search. I've been researching how to format output using awk and using maybe grep and wc to try and get my data but I'm a bit lost. For all I know I'm barking up the wrong tree entirely.
Look at printf to format your output, then look at the commands file to find your file type, wc to print out the number of lines, etc.
All this could be done via a find | while read loop:
printf "%-20.20s %-5.5s %s\n" "File" "Lines" "Type"
# -print0 and read -d '' keep file names with spaces or newlines intact
find . -type f -print0 | while IFS= read -r -d '' file
do
file_name=$(basename "$file")
lines=$(cat "$file" | wc -l | sed 's/^ *//')
desc=$(file --brief "$file")
printf "%-20.20s %5.5s %s\n" "$file_name" "$lines" "$desc"
done
The $(...) syntax returns the output of the enclosed command as a string that can be assigned to a variable. I use cat "$file" | wc -l to keep the name of the file out of wc's output, and then use sed to remove leading spaces. Note that the shell's printf takes its arguments separated by spaces, not commas.
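The table in the question also lists directories, with N/A in the LINES column, which a find -type f loop skips. A rough sketch that covers only the top level of the given directory (no recursion) might look like this; list_entries and the column widths are arbitrary choices, and the TYPE column uses the plain File/Directory labels from the question rather than the output of file.

```shell
# Hypothetical helper: print a FILENAME / LINES / TYPE table for one directory.
list_entries() {
    dir=${1:-.}
    printf '%-20s %-5s %s\n' "FILENAME" "LINES" "TYPE"
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue       # an unmatched glob stays literal; skip it
        name=${entry##*/}                 # strip the directory part
        if [ -d "$entry" ]; then
            printf '%-20s %-5s %s\n' "$name" "N/A" "Directory"
        else
            lines=$(wc -l < "$entry" | tr -d ' ')
            printf '%-20s %-5s %s\n' "$name" "$lines" "File"
        fi
    done
}

# Usage: list_entries /path/to/dir
```

Hidden files are not listed because the * glob does not match names that start with a dot.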

how to compare output of two ls in linux

So here is the task which I can't solve. I have a directory with .h files and a directory with .i files, which have the same names as the .h files. I want just by typing a command to have all .h files which are not found as .i files. It's not a hard problem, I can do it in some programming language, but I'm just curious how it will look like in cmd :). To be more specific here is the algo:
get file names without extensions from ls *.h
get file names without extensions from ls *.i
compare them
print all names from 1 that are not met in 2
Good luck!
diff \
<(ls dir.with.h | sed 's/\.h$//') \
<(ls dir.with.i | sed 's/\.i$//') \
| grep '^<' \
| cut -c3-
diff <(ls dir.with.h | sed 's/\.h$//') <(ls dir.with.i | sed 's/\.i$//') executes ls on the two directories, cuts off the extensions, and compares the two lists. Then grep '^<' keeps the lines that appear only in the first listing, and cut -c3- cuts off the "< " prefix that diff inserted.
ls ./dir_h/*.h | sed -r -n 's:.*dir_h/([^.]*).h$:dir_i/\1.i:p' | xargs ls 2>&1 | \
grep "No such file or directory" | awk '{print $4}' | sed -n -r 's:dir_i/([^:]*).*:dir_h/\1:p'
ls -1 dir1/*.hh dir2/*.ii | awk -F"/" '{print $NF}' | awk -F"." '{a[$1]++;b[$0]}END{for(i in a)if(a[i]==1 && ((i".hh") in b)) print i}'
Explanation:
ls -1 dir1/*.hh dir2/*.ii
The above lists all the *.hh and *.ii files in both directories.
awk -F"/" '{print $NF}'
The above prints just the file name, without the directory part of the path.
awk -F"." '{a[$1]++;b[$0]}END{for(i in a)if(a[i]==1 && ((i".hh") in b)) print i}'
The above builds two associative arrays: a counts the occurrences of each name without its extension, and b records each full file name as a key.
If both the hh and ii files exist, the count in a is 2; if only one of them exists, it is 1. So we need the entries whose count is 1 and that belong to a header file (.hh).
That second condition is checked in the END block with a membership test on b. The test uses the in operator because b[$0] only creates the key with an empty value, so testing b[i".hh"] directly would always be false.
Assuming bash is your shell:
for file in dir_with_h/*.h; do
name=${file%.h} # trim trailing ".h" file extension
name=${name#dir_with_h/} # trim leading folder name
if [ ! -e "dir_with_i/${name}.i" ]; then
echo "${name}"
fi
done
Undoubtedly this can be ported to virtually all other shells. I find this less cryptic than some other approaches (although this is surely my problem), but it is a little wordy. As such, a shell script might help recall it.
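Another way to express "names in the first list but not in the second" is comm, which compares two sorted lists and can suppress the columns that are not wanted. This is a different technique from the answers above, sketched here with temporary files so that it does not depend on process substitution; only_in_h and its two directory arguments are placeholders.

```shell
# Hypothetical helper: print base names that exist as <h_dir>/NAME.h
# but not as <i_dir>/NAME.i.
only_in_h() {
    h_dir=$1
    i_dir=$2
    tmp=$(mktemp -d) || return 1
    # Build sorted lists of base names without their extensions.
    for f in "$h_dir"/*.h; do
        [ -e "$f" ] || continue
        basename "$f" .h
    done | sort > "$tmp/h"
    for f in "$i_dir"/*.i; do
        [ -e "$f" ] || continue
        basename "$f" .i
    done | sort > "$tmp/i"
    # -2 drops lines unique to the second list, -3 drops lines common to both,
    # leaving only the names unique to the first list.
    comm -23 "$tmp/h" "$tmp/i"
    rm -rf "$tmp"
}

# Usage: only_in_h dir_with_h dir_with_i
```

Both lists are sorted by the same sort command under the same locale, which is what comm requires of its input.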

Resources