Awk script which takes as input the path to a directory and displays all the files whose size is more than a limit? - linux

I'm fairly new to Linux and awk. I want to display all files whose size is more than a limit (e.g. 3 KB), where those files are found within a directory whose path is specified by the user.
I managed to do it by "hard-coding" the path in the terminal like this:
ls -l /home/user/Documents | ./testScript
testScript contains:
#!/bin/bash -f
awk '
{
    if ($5 > 3000) {
        print $9
    }
}
'
How do I do this with the user specifying a directory path?

It would be easier to use find than a combination of ls and a script:
find PATH_TO_DIRECTORY -size +10k
You can make it a bash function taking a parameter.
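A minimal sketch of such a function (the name bigfiles and the +3k default are just illustrative, not part of the original answer):
bigfiles() {
    # Usage: bigfiles /path/to/dir [size], e.g. bigfiles ~/Documents +3k
    # $1 is the directory, $2 an optional find -size expression (defaults to +3k)
    find "$1" -type f -size "${2:-+3k}"
}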

Posting an answer as I can't comment for lack of reputation points:
Not sure what you mean by "path specified by user", but assuming you can read the path into some variable, then just do this in your code:
ls -l "$mypath" | ./testScript

altagir's find is a better solution, but in cases where someone wants to use this general structure but doesn't know a way other than ls to do it:
stat -c "%s %n" "$someDir"/* | awk -v max="$maxval" '$1 > max { print $2 }'
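A small sketch of how this could be wrapped into a script that takes the directory from the user as an argument (the script name and the 3000-byte default are illustrative assumptions):
#!/bin/bash
# Usage: ./bigger_than.sh /path/to/dir [limit_in_bytes]
dir=$1
limit=${2:-3000}
# print the names of all entries in $dir larger than $limit bytes
stat -c "%s %n" "$dir"/* | awk -v max="$limit" '$1 > max { print $2 }'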

Related

How to move files in Linux based on their names into a folder with a corresponding name?

I would need to move a series of files into certain folders via scripts. The files are of the format xxxx.date.0000 and I have to move them to a folder whose name is that same date value.
For example:
file hello.20190131.0000
in folder 20190131
Ideally the script would also create the folders before moving the files, but that is not a priority because I can create them by hand. I managed to print the date values with
ls *.0000 | awk -F. '{print $2}'
Does anyone have any suggestions on how to proceed?
The initial awk command provided much of the answer. You just need to do something with the directory name you extract:
A simple option:
ls *.0000 | awk -F. '{printf "mkdir -p \"%s\"; mv \"%s\" \"%s\";\n", $2, $0, $2}' | sh
This might be more efficient with a large number of files:
ls *.0000 | awk -F. '{print $2}' |\
sort | uniq |\
while read dir; do
    mkdir -p "$dir"
    mv *."$dir".0000 "$dir"
done
I would do something like this:
ls *.0000 |\
sort |\
while read f; do
    foldername="$(echo "$f" | cut -d. -f2)"
    echo mkdir -p "$foldername/"
    echo mv "$f" "$foldername/"
done
i.e.: For each of your files, I build the folder name using the cut command with a dot as the field separator, getting the second field (the date in this case); then I create that folder with mkdir -p (the -p flag avoids any error if the folder already exists), and finally I move the file to the brand new folder. (The echo prefixes make this a dry run that only prints the commands; remove them to actually run mkdir and mv.)
You can do that with rename, a.k.a. Perl rename.
Try it on a COPY of your files in a temporary directory.
If you use -p parameter, it will make any necessary directories for you automatically. If you use --dry-run parameter, you can see what it would do without actually doing anything.
rename --dry-run -p 'my @X=split /\./; $_=$X[1] . "/" . $_' hello*
Sample Output
'hello.20190131.0000' would be renamed to '20190131/hello.20190131.0000'
'hello.20190137.0000' would be renamed to '20190137/hello.20190137.0000'
All you need to know is that it passes you the current name of the file in a variable called $_ and it expects you to change that to return the new filename you would like.
So, I split the current name into the elements of an array @X with the dot (period) as the separator:
my @X = split /\./
That gives me the output directory in $X[1]. Now I can set the new filename I want by putting the new directory, a slash and the old filename into $_:
$_=$X[1] . "/" . $_
You could also try this, shorter version:
rename --dry-run -p 's/.*\.(\d+)\..*/$1\/$_/' hello*
On ArchLinux, the package you would use is called perl-rename.
On Debian, it is called rename.
On macOS, use homebrew like this: brew install rename

Using cut in Linux Mint Terminal more precisely

In the directory /usr/lib on Linux Mint there are, among other things, files that go by the name xxx.so.d, where xxx is their name and d is a number. The assignment is to find all files with a .so ending and write out their name, xxx. The code I have so far is
ls | grep "\.so\." | cut -d "." -f 1
The problem now is that cut cuts some filenames short; as an example there is a file called libgimp-2.0.so.0, where the wanted output would be libgimp-2.0, since that part is in front of .so.
Is there any way to make cut cut at ".so" instead of at the first "."?
The answer given by pacholik can give you wrong files (i.e. 'xyz.socket' will appear in your list). To correct his script:
for i in *.so.*; do echo "${i%%.so*}"; done
Another way to do this (easier to read in my opinion) is to use a little Perl:
ls | grep "\.so\." | perl -ne 'print((split(/\.so/))[0], "\n")'
Sorry, I don't think there is a way to use only "cut" as you asked.
for i in *.so*; do echo "${i%.so*}"; done
Just a bash parameter substitution:
http://www.tldp.org/LDP/abs/html/parameter-substitution.html
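A quick illustration of the difference between % (shortest match) and %% (longest match), which is why the corrected version above uses %% (the filename is hypothetical):
f=libfoo.so.so.1
echo "${f%.so*}"     # removes the shortest matching suffix -> libfoo.so
echo "${f%%.so*}"    # removes the longest matching suffix  -> libfoo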
Just use sed instead:
ls | grep -v "\.socket" | grep "\.so" | sed "s/\.so.*//"
This will delete everything after the first .so found in the file names, so files named xxx.so.so would also work.
Depending on the size of the directory, using find could be the best option; as a starting point, give this a try:
find . -iname "*.so.*" -exec basename {} \; | cut -d "." -f 1
Besides cut there are many other options, like sed and awk, that in some cases can help you achieve the same result faster.
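For example, a possible awk-only variant (a sketch; it assumes the goal is still to print everything before the first ".so."):
# split each name on a literal ".so." and print the part before it
ls | awk -F '[.]so[.]' 'NF > 1 { print $1 }'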

How to use select with awk in bash script?

I have to write a bash script for university, the text says:
Write a bash script that allows the root user to get a list of all users
of the machine. After selecting a user, using select, the script will ask
for a directory (as an absolute path). At this point the output will show
a list of all files and folders owned by the selected user, sorted in
ascending order by file size.
To check if the user is root I used:
if [ "$(id -u)" = 0 ]; then
To get the list of users of the machine I was thinking of using awk:
awk -F':' '{ print $1 }' /etc/passwd
How can I use select with awk?
Is there another way without using awk?
Thank you so much in advance
Here is a way to use awk in a select statement; you need to finish the rest for your homework (for example, sorting the result):
#!/usr/bin/env bash
select user in $(awk -F ":" '{print $1}' /etc/passwd)
do
    read -p "input the absolute directory: " path
    find "$path" -type f -user "$user" -ls
done
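For instance, the ascending sort by size that the assignment asks for (the part left as an exercise) could be sketched like this, assuming GNU find's -printf is available:
# print size in bytes followed by path, smallest first
find "$path" -type f -user "$user" -printf '%s %p\n' | sort -n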
Another way to test the UID, by arithmetic evaluation (smarter!?), is:
if ((UID == 0)); then
    ...
else
    ...
fi
Check http://wiki.bash-hackers.org/syntax/arith_expr

Alternative to ls in shell-script compatible with nohup

I have a shell-script which lists all the file names in a directory and store them in a new file.
The problem is that when I execute this script with the nohup command, it lists the first name four times instead of listing the correct names.
Discussing the problem with other programmers, they think that the problem may be the ls command.
Part of my code is the following:
for i in $( ls -1 ./Datasets/); do
    awk '{print $1}' ./genes.txt | head -$num_lineas | tail -1 >> ./aux
    let num_lineas=$num_lineas-1
done
Do you know an alternative to ls that works well with nohup?
Thanks.
Don't use ls to feed the loop, use:
for i in ./Datasets/*; do
or if subdirectories are of interest
for i in ./Datasets/*/*; do
Lastly, and more correctly, use find if you need the entire tree below Datasets:
find ./Datasets -type f | while IFS= read -r file; do
    (do stuff with $file)
done
Others frown, but there is nothing wrong with also using find as:
for file in $(find ./Datasets -type f); do
    (do stuff with $file)
done
Just choose the syntax that most closely meets your needs.
First of all, don't parse ls! A simple glob will suffice. Secondly, your awk | head | tail chain can be simplified by only printing the first column of the line that you're interested in using awk. Thirdly, you can redirect the output of your loop to a file, rather than using >>.
Incorporating all of those changes into your script:
for i in Datasets/*; do
    awk -v n="$(( num_lineas-- ))" 'NR==n{print $1}' genes.txt
done > aux
Every time the loop goes round, the value of $num_lineas will decrease by 1.
In terms of your problem with nohup, I would recommend looking into using something like screen, which is known to be a better solution for maintaining a session between logins.
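For example, a typical screen workflow might look like this (the session name is just an illustration):
screen -S myjob        # start a named session
./myscript.sh          # run the script inside the session
# detach with Ctrl-A d; after logging back in later, reattach with:
screen -r myjob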

Extracting inode from ls command

I am trying to extract the inode from the ls command:
ls -i
Output:
1234 File name with space
2345 File name
In order to extract it, I tried to use the cut command as follows:
ls -i | cut -d" " -f1
The above command didn't work because the file names have different numbers of spaces in them. From the man page I found that by specifying "-Q" to the ls command, it would double-quote the file names:
ls -iQ
1234 "File name with space"
2345 "File name"
I can't find a way to utilize this option. Any help would be greatly appreciated.
Thanks
Update 1
It looks like the spaces in the file names weren't the cause of the problem. It's the width of the inode number. For example:
ls -iQ
2321352 "My Cheat Tables"
507896 "My Data Sources"
Note the leading space before the number " 507896". Therefore, for the first file the inode is in field 1 (-f1), and for the second file the inode number is in field 2 (-f2).
Update 2 (My solution)
I found the solution.. xD The command is as follows:
ls -iQ | cut -d '"' -f1
The above would print the inode only. However, I would like to know if there is a proper or better way of doing this. Please do answer on this post if you know the right way. I am new to Unix, and I would love to learn it the proper way. Thanks.. :)
If you have an unknown number of leading (or delimiting) spaces, you can use awk to get the desired column:
ls -iQ1 | awk '{print $1}'
Unlike cut, awk ignores any number of consecutive spaces, so you don't need to guess the offset.
An even better solution is to ask explicitly for the information you need rather than parsing the ls output:
find . -type f -printf '%i\n'
You can also use the stat(1) command.
stat -c %i "File name with spaces"
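And if you want the inode alongside the name for every entry in the current directory (assuming GNU coreutils stat):
# inode followed by file name, one entry per line
stat -c '%i %n' *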
