Getting directory list with serial number in shell script

I want to list all the subdirectories of a main directory, each with a serial number.
Example:
If a directory A contains B, C, D and E as subdirectories, the output should look like this:
1 B
2 C
3 D
4 E

ls | nl
Here ls lists the directory's entries and nl numbers the lines. Note that plain ls also lists regular files; to number only the subdirectories, use ls -d */ | nl instead.

You could use a loop:
i=0
for f in A/*/; do
echo "$((++i)) $f"
done
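The pattern A/*/ matches only the directories in A, because of the trailing slash. $((++i)) increments the variable i by 1 for each directory found, so every line is printed as a number followed by the directory path.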
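The loop can be wrapped in a small function that prints bare names (B instead of A/B/) and prints nothing when there are no subdirectories. This is only a sketch; the function name number_subdirs is made up for illustration.

```shell
# number_subdirs DIR: print "N name" for each immediate subdirectory of DIR.
# The body runs in a subshell, so its variables do not leak into the caller.
number_subdirs() (
    dir=${1:-.}
    i=0
    for d in "$dir"/*/; do
        [ -d "$d" ] || continue      # skip the literal pattern when nothing matches
        d=${d%/}                     # drop the trailing slash
        i=$((i + 1))
        printf '%d %s\n' "$i" "${d##*/}"
    done
)
```

Calling number_subdirs A on the directory from the question should print the four numbered names, ignoring any regular files in A.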

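Use this:
find * -type d | nl
find * -type d : prints the names of all directories under the current path, including nested ones (use find * -maxdepth 0 -type d to keep only the top-level ones)
nl : adds line numbers to the output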

It depends what you want to do with the number. If you plan to use it further/later, and anyone could make or delete directories between looking at the list and using the number, you will be in trouble.
If that is the case, you can use the inode numbers of the directories like this as they are constant and unique across the entire filesystem:
ls -di */
19866918 f/ 19803132 other/ 19705681 save/
On a Mac, you can also do
stat -f '%i %N' */
19866918 f/
19803132 other/
19705681 save/
and the Linux (GNU stat) equivalent is
stat -c '%i %n' */
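To use such a number later, the directory can be looked up again by inode. A minimal sketch, assuming a find that supports -mindepth, -maxdepth and -inum (GNU and BSD find do); dir_by_inode is a hypothetical helper name.

```shell
# dir_by_inode PARENT INODE: print the immediate subdirectory of PARENT
# that currently has the given inode number, if any.
dir_by_inode() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -inum "$2"
}
```

The lookup still finds the directory after it has been renamed within the same filesystem. An inode number can be reused once its directory is deleted, so a stale number may point at something else.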

Related

How to delete smallest file if names are duplicate

I would like to clean up a folder with videos. I have a bunch of videos that were downloaded with different resolutions, so each file will start with the same name and then end with "_480p" or "_720p" etc.
I just want to keep the largest file of each such set.
So I am looking for a way to delete files based on
check if name before "_" is identical
if true, then delete all files except largest one

For a flexible and fast approach, gather the list of files whose names end in "[[:digit:]]+p" and feed the names to awk on stdin. awk indexes an array by the file prefix (the path plus the part of the name before the final '_'), which is unique per set of files, and stores the resolution number seen for that prefix at that index.
Then it's a simple matter of comparing the stored resolution number against the current file's number and deleting the file with the lesser of the two.
Your find command to locate all files in the directory below the current, recursively, could be:
find ./tmp -type f -regex "^.*[0-9]+p$"
I would then pipe the filenames to a short awk script in which an array stores the last number seen for a given file prefix. If the resolution number of the current record (line) is bigger than the value stored in the array, a filename is built from the stored number, that file is deleted with system() using rm filename, and the array is updated. If the current line's resolution number is less than what is already stored for that prefix, the current file is simply deleted.
You can do that as:
#!/usr/bin/awk -f
BEGIN { FS = "/" }
{
    num = $NF                                # last field holds the number up to 'p'
    prefix = $0                              # prefix is the name up to "_[[:digit:]]+p"
    sub (/^.*_/, "", num)                    # isolate the number
    sub (/p$/, "", num)                      # remove 'p' at the end
    sub (/_[[:digit:]]+p$/, "", prefix)      # isolate path and name prefix
    if (prefix in a) {                       # current file in array a[] ?
        rmfile = $0                          # set file to remove to current
        if (num + 0 > a[prefix] + 0) {       # current number > array number
            rmfile = prefix "_" a[prefix] "p"    # build the filename to remove from the array number
            a[prefix] = num                  # update array with the higher num
        }
        system ("rm -- '" rmfile "'")        # delete the file (single-quoted for the shell)
    }
    else
        a[prefix] = num                      # if no num for prefix in array, store first
}
(note: the field-separator splits the fields using the directory separator so you have all file components to work with.)
Example Use/Output
With a representative set of files in a tmp/ directory below the current one, e.g.
$ ls -1 tmp
a_480p
a_720p
b_1080p
b_480p
c_1080p
c_720p
Running the find command piped to the awk script named awkparse.sh would be as follows (don't forget to make the awk script executable):
$ find ./tmp -type f -regex "^.*[0-9]+p$" | ./awkparse.sh
Looking at the directory after piping the results of find to the awk script, the tmp/ directory now contains only the highest-resolution (largest) file for any given filename, e.g.
$ ls -1 tmp
a_720p
b_1080p
c_1080p
This would be highly efficient. It could also handle all files in a nested directory structure where multiple directory levels hold files you need to clean out. Look things over and let me know if you have questions.

This shell script might be what you want:
previous_prefix=
for file in *_[0-9]*[0-9]p*; do
    prefix=${file%_*}
    resolution=${file##*_}
    resolution=${resolution%%p*}
    if [ "$prefix" = "$previous_prefix" ]; then
        if [ "$resolution" -gt "$greater_resolution" ]; then
            file_to_be_removed=$greater_file
            greater_file=$file
            greater_resolution=$resolution
        else
            file_to_be_removed=$file
        fi
        echo rm -- "$file_to_be_removed"
    else
        greater_resolution=$resolution
        greater_file=$file
        previous_prefix=$prefix
    fi
done
Drop the echo if the output looks good.
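If "largest" should mean the actual file size rather than the resolution in the name, a dry-run variant could compare byte counts instead. The following is a sketch under that assumption, not either answer's method: it takes the prefix to be everything before the last "_", and the name list_smaller_duplicates is made up for illustration. It only prints candidates and deletes nothing.

```shell
# list_smaller_duplicates DIR: print every file in DIR that is not the
# largest (by byte count) among files sharing the same prefix before the last "_".
list_smaller_duplicates() (
    cd "$1" || exit 1
    for f in *_*; do
        [ -f "$f" ] || continue
        # emit "size<TAB>name"; the arithmetic strips any padding wc may add
        printf '%s\t%s\n' "$(( $(wc -c < "$f") ))" "$f"
    done |
    awk -F '\t' '
        {
            prefix = $2
            sub(/_[^_]*$/, "", prefix)          # drop "_480p", "_720p.mp4", ...
            if (!(prefix in best)) {            # first file seen for this prefix
                best[prefix] = $2; size[prefix] = $1
                next
            }
            if ($1 + 0 > size[prefix] + 0) {    # new largest: the old one is a candidate
                print best[prefix]
                best[prefix] = $2; size[prefix] = $1
            } else {
                print $2
            }
        }'
)
```

After reviewing the printed list, the names could be passed to rm. File names containing tabs or newlines are not handled by this sketch.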

I would try to:
list all non-smallest files (non-480p): *_720p* and *_1080p*
for each of them replace *_720p*/*_1080p* in the name with all possible smaller resolutions
and try to delete those files with rm -f, whether they exist or not
#!/bin/bash -e
shopt -s nullglob
for file in *_1080p*; do
    rm -f -- "${file//_1080p/_720p}"
    rm -f -- "${file//_1080p/_480p}"
done
for file in *_720p*; do
    rm -f -- "${file//_720p/_480p}"
done
And here is a Bash script using nested loops to automate the above:
#!/bin/bash -e
shopt -s nullglob
res=(_1080p _720p _480p _240p)
for r in "${res[@]}"; do
    res=("${res[@]:1}") # remove the first element in res array
    for file in *$r*; do
        for r2 in "${res[@]}"; do
            rm -f -- "${file//$r/$r2}"
        done
    done
done

Linux - is there a way to get the file size of a directory BUT only including the files that have a last modified / creation date of x?

As per the title, I am trying to find a way to get the size of a directory (using du) while counting only the files in it that were created (or modified) after a specific date.
Is it something that can be done using the command line?
Thanks :)

From @Bodo's comment. Using GNU find:
find directory/ -type f -newermt 2021-11-25 -printf "%s\t %f\n" | \
awk '{s += $1 } END { print s }' | \
numfmt --to=iec-i
find looks in directory/ (change this)
looks for regular files (-type f)
that have a modification time newer than 2021-11-25 (-newermt) (change this)
and outputs each file's size in bytes (%s) at the start of its line
awk adds up the sizes from the lines with {s += $1 }
and prints the result with END { print s }
numfmt formats the byte value in human-readable form with --to=iec-i
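The same pipeline can be wrapped in a function that returns the raw byte total, which is easier to reuse in scripts. A minimal sketch, assuming GNU find (for -newermt and -printf); size_newer_than is a hypothetical name.

```shell
# size_newer_than DIR DATE: print the total size in bytes of the regular
# files under DIR whose modification time is newer than DATE.
size_newer_than() {
    find "$1" -type f -newermt "$2" -printf '%s\n' |
        awk '{ total += $1 } END { print total + 0 }'   # "+ 0" prints 0 when nothing matches
}
```

For example, size_newer_than directory/ 2021-11-25 | numfmt --to=iec-i gives the human-readable form.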

get name, size and timestamp of all files in dir

I'm trying to get 3 attributes (size, name and timestamp) of all files present in all folders under a main folder.
For example
MainFolder
FolderA
file1
file2
file3
FolderB
file4
file5
file6
Output should be
file1|size|timestamp
file2|size|timestamp
file3|size|timestamp
file4|size|timestamp
file5|size|timestamp
file6|size|timestamp
Is there any way I could do it using a single command?

find . -type f -print0 | xargs -0 stat --format='%n,%s,%.19x' | awk '{split($0,a,","); n=split(a[1],b,"/"); print b[n] "|" a[2] "|" a[3]}'
Let me explain the 3 parts. First, list all files in all subdirectories (NUL-separated, so names containing spaces survive):
find . -type f -print0
Then generate 3 columns with the filename (including the full path), the size in bytes and the last access time (trimmed to 19 characters). If you want another timestamp, such as the last modification time (%y), change the format of the stat command as described in its manual:
xargs -0 stat --format='%n,%s,%.19x'
As the final step, strip the path from the filename and join the columns with "|":
awk '{split($0,a,","); n=split(a[1],b,"/"); print b[n] "|" a[2] "|" a[3]}'
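With GNU find, the same three columns can also come from a single -printf, without stat or awk. This is a sketch under that assumption; it reports the modification time to the minute rather than the access time, and list_name_size_mtime is a hypothetical name.

```shell
# list_name_size_mtime DIR: print "name|size|YYYY-MM-DD HH:MM" for every
# regular file below DIR (GNU find; %f is the base name, %s the size in bytes,
# %T... the modification time fields).
list_name_size_mtime() {
    find "${1:-.}" -type f -printf '%f|%s|%TY-%Tm-%Td %TH:%TM\n'
}
```

Running list_name_size_mtime MainFolder produces one line per file in the requested name|size|timestamp layout.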

Write a specific text/string into a text file for each file present in a specified folder

I am trying to prepare a txt file containing a specific string of text (X tab Y) per line for each file in a folder matching my search parameter.
So far I've got:
find ./directory/*.extension -type f | wc -l
This gives me the number of files with *.extension - but I can't find a way to print (X separated by tab Y) on a line equal to the number of files matching find.
I.e. for 3 files matching my search, the txt file should contain:
X Y
X Y
X Y
Sorry if this is too basic, but any help would be appreciated.

I would do :
find ./directory/ -name "*.extension" -type f -exec echo -e "X\tY" \; > yourfile.txt
This executes echo -e "X\tY" once for each file that find locates and redirects the output to the file yourfile.txt
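Since echo -e is not portable across echo implementations, printf may be a safer choice for the tab. A minimal sketch; line_per_file is a made-up name, and X and Y stand for the asker's placeholder strings.

```shell
# line_per_file DIR PATTERN: print one "X<TAB>Y" line for every regular
# file under DIR whose name matches PATTERN.
line_per_file() {
    find "$1" -type f -name "$2" -exec printf 'X\tY\n' \;
}
```

For instance, line_per_file ./directory '*.extension' > yourfile.txt writes one line per matching file.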

Count occurrences of a character in files

I want to count all $ characters in each file in a directory with several subdirectories.
My goal is to count all variables in a PHP project. The files have the suffix .php.
I tried
grep -r '$' . | wc -c
grep -r '$' . | wc -l
and a lot of other stuff, but everything returned a number that cannot be right. My example file contains only four $ characters.
So I hope someone can help me.
EDIT
My example file
<?php
class MyClass extends Controller {
$a;$a;
$a;$a;
$a;
$a;

To recursively count the number of $ characters in a set of files in a directory you could do:
grep -FRho '$' some_dir | wc -l
To include only files with the extension .php in the recursion you could instead use:
grep -FRho --include='*.php' '$' some_dir | wc -l
A plain grep '$' counts far too much because $ is the regex end-of-line anchor and matches every line; -F (what fgrep does) treats the pattern as a literal string. -R traverses the files in some_dir recursively and -o prints each match on its own line, so wc -l counts occurrences rather than matching lines. --include restricts the set of files to the pattern *.php, and -h suppresses file names so that the output consists of the matches alone.
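As a cross-check that does not rely on grep at all, the files can be concatenated and every character except $ deleted before counting bytes. This is a sketch of that alternative; count_dollars is a hypothetical name.

```shell
# count_dollars DIR: print the number of "$" characters in all *.php
# files below DIR.
count_dollars() (
    # tr -cd '$' deletes every character that is NOT a dollar sign,
    # so wc -c counts only the dollars that remain.
    echo "$(( $(find "$1" -type f -name '*.php' -exec cat {} + | tr -cd '$' | wc -c) ))"
)
```

Both approaches should report the same number for the same directory.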

For counting variables in a PHP project you can use the regular expression for PHP variable names.
The following greps all variables in each file:
cd ~/my/php/project
grep -Pro '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' .
-P - use Perl-compatible regex
-r - recursive
-o - print each match on a separate line
This will produce something like:
./elFinderVolumeLocalFileSystem.class.php:$path
./elFinderVolumeLocalFileSystem.class.php:$path
./elFinderVolumeMySQL.class.php:$driverId
./elFinderVolumeMySQL.class.php:$db
./elFinderVolumeMySQL.class.php:$tbf
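To count them per file, you can add -c. Note that -c counts matching lines, not individual matches, so a line containing several variables is counted once:
$ grep -Proc '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' .
and you will get the number of lines containing variables in each file, like: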
./connector.minimal.php:9
./connector.php:9
./elFinder.class.php:437
./elFinderConnector.class.php:46
./elFinderVolumeDriver.class.php:1343
./elFinderVolumeFTP.class.php:577
./elFinderVolumeFTPIIS.class.php:63
./elFinderVolumeLocalFileSystem.class.php:279
./elFinderVolumeMySQL.class.php:335
./mime.types:0
./MySQLStorage.sql:0
To count by file and by variable, you can use:
$ grep -Pro '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' . | sort | uniq -c
to get a result like:
17 ./elFinderVolumeLocalFileSystem.class.php:$target
8 ./elFinderVolumeLocalFileSystem.class.php:$targetDir
3 ./elFinderVolumeLocalFileSystem.class.php:$test
97 ./elFinderVolumeLocalFileSystem.class.php:$this
1 ./elFinderVolumeLocalFileSystem.class.php:$write
6 ./elFinderVolumeMySQL.class.php:$arc
3 ./elFinderVolumeMySQL.class.php:$bg
10 ./elFinderVolumeMySQL.class.php:$content
1 ./elFinderVolumeMySQL.class.php:$crop
where you can see that the variable $write is used only once, so it may be useless.
You can also count per variable across the whole project:
$ grep -Proh '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' . | sort | uniq -c
and you will get something like:
13 $tree
1 $treeDeep
3 $trg
3 $trgfp
10 $ts
6 $tstat
35 $type
where you can see that $treeDeep is used only once in the whole project, so it is probably useless.
You can achieve many other combinations with different grep, sort and uniq commands.
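Because grep -c counts matching lines, a per-file count of individual variable occurrences needs -o piped into wc -l for each file. A minimal sketch, assuming ASCII-only variable names (the \x7f-\xff range is dropped so that plain -E works) and file names without newlines; count_vars_per_file is a made-up name.

```shell
# count_vars_per_file DIR: print "path:count" for every *.php file below DIR,
# where count is the number of variable occurrences, not of matching lines.
count_vars_per_file() (
    find "$1" -type f -name '*.php' | while IFS= read -r f; do
        # -o prints each match on its own line, so wc -l counts matches
        n=$(( $(grep -Eo '\$[a-zA-Z_][a-zA-Z0-9_]*' "$f" | wc -l) ))
        printf '%s:%s\n' "$f" "$n"
    done
)
```

A line such as $a;$a; contributes 2 here, whereas grep -c would count it once.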
