get name, size and timestamp of all files in dir - linux

I'm trying to get three attributes (name, size and timestamp) of every file in all the folders inside a main folder.
For example
MainFolder
FolderA
file1
file2
file3
FolderB
file4
file5
file6
Output should be
file1|size|timestamp
file2|size|timestamp
file3|size|timestamp
file4|size|timestamp
file5|size|timestamp
file6|size|timestamp
Is there a way to do this with a single command?

find . -type f -print0 | xargs -0 stat --format='%n,%s,%.19x' | awk '{split($0,a,","); n=split(a[1],b,"/"); print b[n] "|" a[2] "|" a[3]}'
Let me explain the three parts. First, list all files in all subdirectories (-print0 together with xargs -0 keeps file names containing spaces intact):
find . -type f -print0
Then generate three comma-separated columns: the file name (including its full path), the size in bytes, and the last access time (trimmed to 19 characters). If you want another timestamp, change the stat format sequence, for example %y for last modification or %w for creation (birth) time:
xargs -0 stat --format='%n,%s,%.19x'
As a final step, strip the path from the file name and join the fields with |:
awk '{split($0,a,","); n=split(a[1],b,"/"); print b[n] "|" a[2] "|" a[3]}'

Related

How to count files in specific subdirectories of a parent directory?

I use the find . -type f | wc -l command to count all the files in a directory. If the directory contains many subdirectories, is it possible to restrict the count to specific ones? For example, I only want to count the files in the image subdirectories, to know how many images (all .jpeg) I have in total in mydirectory.
This command works, but it only displays the files: find /Users/mydirectory -type f -exec file --mime-type {} \; | awk '{if ($NF == "image/jpeg") print $0 }'. How can I count them?
Finally, the command find /Users/mydirectory -type f -exec file --no-pad --mime-type {} + | awk '$NF == "image/jpeg" {$NF=""; sub(": $", ""); print}' | wc -l seems to do the trick.
mydirectory/
folder1/
image/
label/
folder2/
image/
label/
...
I have the impression that you are not aware of the -maxdepth option of find, which limits how deep the search descends: with find ... -maxdepth 1 you search only within the directory itself. I believe this will solve your problem.
Simple example: I created a subdirectory "tralala" containing two files and a sub-subdirectory with one file, and ran the following command:
find tralala -maxdepth 1 -type f | wc -l
The answer was 2, which is correct, as you can see from the entries marked "to be counted":
Prompt$ find tralala -ls
... drwxrwxrwx ... tralala/ => is a directory, don't count.
... drwxrwxrwx ... tralala/dir => is a directory, don't count.
... -rwxrwxrwx ... tralala/dir/test.txt => is inside subdirectory, don't count.
... -rwxrwxrwx ... tralala/test.txt => is file inside directory, to be counted.
... -rwxrwxrwx ... tralala/test2.txt => is file inside directory, to be counted.
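To restrict the count to the image subdirectories from the question, one option is find's -path test. The following is a minimal sketch; the function name count_in_subdirs is hypothetical, and it assumes the images can be recognised by their .jpeg extension rather than by MIME type.

```shell
# count_in_subdirs TOP NAME: count .jpeg files that sit somewhere below a
# directory called NAME under TOP (for example NAME=image).
# Assumption: no file name contains a newline, since wc -l counts lines.
count_in_subdirs() {
  find "$1" -type f -path "*/$2/*" -iname '*.jpeg' | wc -l
}
```

For the layout shown in the question, count_in_subdirs /Users/mydirectory image would count the JPEG files in folder1/image, folder2/image, and so on, while ignoring the label directories.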

Merge multiple files containing same IDs - Linux

I have 10000 files in one folder, like this:
1000.htm
Page_1000.html
file-1000.txt
2000.htm
Page_2000.html
file-2000.txt
I want to merge the files that share the same ID in their names.
Example:
1000.htm Page_1000.html file-1000.txt > 1.txt
2000.htm Page_2000.html file-2000.txt > 2.txt
I have tried merging with cat like this. It works, but I can't do that by hand for 10k files.
cat 1000* > 1.txt
cat 2000* > 2.txt
Thanks
You probably can't do that because the glob (*) expands to too many arguments. You can use find instead to locate all files matching the pattern, and then use xargs to run cat on them.
find . -name '1000*' -print0 | xargs -0 cat > 1.txt
'-print0' and '-0' delimit on the null (\0) character instead of the default line break character (\n). This way, files with line breaks in their names also work as expected.
find . -name '*.htm' -printf '%P\n' |
while IFS='.' read -r key sfx; do
cnt=$(( cnt + 1 ))
cat "${key}.htm" "Page_${key}.html" "file-${key}.txt" > "${cnt}.txt"
done
You should, though, consider using the key in the output file name instead of a cnt counter, so that it is easy to tell which input files went into each output file.
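A minimal sketch of that suggestion follows. The function name merge_by_key and the merged-KEY.txt output naming are assumptions made for illustration; the three input name patterns are the ones from the question.

```shell
# merge_by_key DIR: for every KEY.htm in DIR, concatenate KEY.htm,
# Page_KEY.html and file-KEY.txt into merged-KEY.txt (hypothetical name).
merge_by_key() {
  dir=${1:-.}
  for f in "$dir"/*.htm; do
    [ -e "$f" ] || continue            # glob matched nothing
    key=$(basename "$f" .htm)          # 1000.htm -> 1000
    cat "$dir/$key.htm" "$dir/Page_$key.html" "$dir/file-$key.txt" \
      > "$dir/merged-$key.txt"
  done
}
```

A glob used in a for loop is expanded inside the shell, so it does not run into the argument-length limit that affects external commands.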
i=1;
for ((num = 1000; num < 10000; num+=1000));
do
cat ${num}.htm Page_${num}.html file-${num}.txt > ${i}.txt
i=$((i + 1));
done
You can change num < 10000, as per your requirement.

Listing most recent files whose total sizes are about a certain value

I would like to copy the most recent files in a directory to another directory, such that the total size of the copied files is about 10 GB, say.
I know that I can list 10 files above a certain size like this:
find . -maxdepth 1 -type f -size +100M -print0 | xargs -0 ls -Shal | head
But is there any way to find the most recent files whose total size is about 10 GB?
Thanks
Yes, that is possible. find has a printf action that allows you to output only the information you are interested in. In your case, that would be a timestamp (e.g. last modification time), the file size, and the name of the file. You can then sort the output according to the timestamp, and use awk to sum the file sizes and output the file names up to a certain limit:
find "$some_directory" -type f -printf "%T@ %s %p\n" | sort -nr \
| awk '{ a = a + $2; if (a > 10000) { print a; exit; }; print $3; }'
Adjust the limit (in bytes) according to your needs, and remove print a if you are not interested in the running total. If you want to include the file that pushes the sum over the limit, replace print a with print $3. Note that $3 only holds the complete name if the path contains no spaces.
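The same idea can be sketched as a small function that takes the byte limit as a parameter and keeps paths containing spaces intact. The name recent_up_to is hypothetical, and GNU find is assumed for -printf; %T@ is the modification time in seconds since the epoch.

```shell
# recent_up_to DIR LIMIT: print the newest files in DIR (not recursing)
# whose cumulative size stays within LIMIT bytes.
# Assumptions: GNU find, and no newlines in file names.
recent_up_to() {
  dir=$1
  limit=$2
  find "$dir" -maxdepth 1 -type f -printf '%T@ %s %p\n' | sort -nr |
    awk -v limit="$limit" '{
      total += $2
      if (total > limit) exit
      sub(/^[^ ]+ [^ ]+ /, "")   # drop timestamp and size, keep the full path
      print
    }'
}
```

For roughly 10 GiB, the call would be recent_up_to . $((10 * 1024 * 1024 * 1024)).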

Getting directory list with serial number in shell script

I want to list all the subdirectories of a main directory with serial numbers.
Example:
If a directory A contains B, C, D and E as subdirectories, then the output should look like
1 B
2 C
3 D
4 E
ls | nl ;
where ls lists the contents of the directory and nl numbers the lines
You could use a loop:
i=0
for f in A/*/; do
echo "$((++i)) $f"
done
The pattern A/*/ matches all directories in A (the trailing slash restricts the glob to directories). $((++i)) increments the variable i by 1 for each directory that is found.
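The loop prints each match with its A/ prefix and trailing slash. A variant that strips both, so the output has the "1 B" form asked for, might look like the sketch below; the function name number_subdirs is only illustrative.

```shell
# number_subdirs DIR: print "N NAME" for each immediate subdirectory of DIR.
number_subdirs() {
  i=0
  for sub in "$1"/*/; do
    [ -d "$sub" ] || continue        # glob matched nothing
    sub=${sub%/}                     # drop the trailing slash
    i=$((i + 1))
    printf '%d %s\n' "$i" "${sub##*/}"   # keep only the last path component
  done
}
```

Running number_subdirs A on the example directory would list B, C, D and E numbered from 1, in the shell's glob sort order.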
Use this:
find * -type d | nl
find * -type d : prints the names of all directories below the current path, including nested ones
nl : adds line numbers to the output
It depends what you want to do with the number. If you plan to use it further/later, and anyone could make or delete directories between looking at the list and using the number, you will be in trouble.
If that is the case, you can use the inode numbers of the directories like this as they are constant and unique across the entire filesystem:
ls -di */
19866918 f/ 19803132 other/ 19705681 save/
On a Mac, you can also do
stat -f '%i %N' */
19866918 f/
19803132 other/
19705681 save/
and I believe the Linux equivalent is
stat -c '%i %n' */

Count occurence of character in files

I want to count all $ characters in each file in a directory with several subdirectories.
My goal is to count all variables in a PHP project. The files have the suffix .php.
I tried
grep -r '$' . | wc -c
grep -r '$' . | wc -l
and a lot of other things, but they all returned numbers that cannot be right. There are only four $ in my example file.
I hope someone can help me.
EDIT
My example file
<?php
class MyClass extends Controller {
$a;$a;
$a;$a;
$a;
$a;
To recursively count the number of $ characters in a set of files in a directory you could do:
fgrep -Rho '$' some_dir | wc -l
To include only files of extension .php in the recursion you could instead use:
fgrep -Rho --include='*.php' '$' some_dir | wc -l
fgrep is equivalent to grep -F: the pattern is taken as a fixed string, so $ is matched literally instead of acting as the end-of-line anchor. The -R option traverses some_dir recursively, and -o prints only the matched part of each line, one match per output line. --include restricts the search to files matching *.php, and -h suppresses file names in the output, which could otherwise cause false positives.
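Wrapped as a function and written with grep -F (the spelling GNU grep now recommends over fgrep), the count could be sketched as follows. The function name count_dollars is hypothetical, and GNU grep is assumed for --include.

```shell
# count_dollars DIR: count every literal $ in the *.php files under DIR.
# -F fixed string, -R recursive, -h no file names, -o one match per line.
count_dollars() {
  grep -F -R -h -o --include='*.php' '$' "$1" | wc -l
}
```

Because -o emits one output line per match, wc -l counts the individual $ characters rather than the lines that contain them.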
For counting variables in a PHP project you can use the regex for valid variable names given in the PHP manual.
The following greps all variables in each file:
cd ~/my/php/project
grep -Pro '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' .
-P - use Perl-compatible regex
-r - recursive
-o - print each match on a separate line
This will produce something like:
./elFinderVolumeLocalFileSystem.class.php:$path
./elFinderVolumeLocalFileSystem.class.php:$path
./elFinderVolumeMySQL.class.php:$driverId
./elFinderVolumeMySQL.class.php:$db
./elFinderVolumeMySQL.class.php:$tbf
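To count them, you can use:
$ grep -Proc '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' .
which gives, for each file, the number of lines that contain at least one variable (-c counts matching lines, not individual matches, even when combined with -o), like: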
./connector.minimal.php:9
./connector.php:9
./elFinder.class.php:437
./elFinderConnector.class.php:46
./elFinderVolumeDriver.class.php:1343
./elFinderVolumeFTP.class.php:577
./elFinderVolumeFTPIIS.class.php:63
./elFinderVolumeLocalFileSystem.class.php:279
./elFinderVolumeMySQL.class.php:335
./mime.types:0
./MySQLStorage.sql:0
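Since grep -c counts matching lines rather than individual matches, a per-file count of every variable occurrence needs the -o output to be tallied separately. One way to sketch this, using the same regex, is shown below; the function name count_vars_per_file is hypothetical, and it assumes GNU grep with -P support and file paths that contain no colon.

```shell
# count_vars_per_file DIR: print "COUNT FILE" with the number of variable
# occurrences (not matching lines) in each file under DIR.
# Assumption: no file path contains ":", since cut splits on the first colon.
count_vars_per_file() {
  grep -Pro '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' "$1" |
    cut -d: -f1 | sort | uniq -c
}
```

Unlike the -c form, files without any match do not appear in this output at all.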
To count by file and by variable, you can use:
$ grep -Pro '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' . | sort | uniq -c
to get a result like:
17 ./elFinderVolumeLocalFileSystem.class.php:$target
8 ./elFinderVolumeLocalFileSystem.class.php:$targetDir
3 ./elFinderVolumeLocalFileSystem.class.php:$test
97 ./elFinderVolumeLocalFileSystem.class.php:$this
1 ./elFinderVolumeLocalFileSystem.class.php:$write
6 ./elFinderVolumeMySQL.class.php:$arc
3 ./elFinderVolumeMySQL.class.php:$bg
10 ./elFinderVolumeMySQL.class.php:$content
1 ./elFinderVolumeMySQL.class.php:$crop
Here you can see that the variable $write is used only once, so (maybe) it is useless.
You can also count per variable across the whole project:
$ grep -Proh '\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*' . | sort | uniq -c
which gives something like:
13 $tree
1 $treeDeep
3 $trg
3 $trgfp
10 $ts
6 $tstat
35 $type
Here you can see that $treeDeep is used only once in the whole project, so it is very likely useless.
You can achieve many other combinations with different grep, sort and uniq commands.
