how to get previous date files and pass ls output to array in gawk - linux

I have log files like the ones listed below, and I need to run a script daily that lists them and then does two things:
1- get yesterday's files and transfer them to server x
2- get files older than one day and transfer them to server y
The files look like the listing below. I am trying the code below, but it is not working.
How can we pass the output of ls -altr to gawk? Can we build an associative array like this?
array[index]=ls -altr | awk '{print $6,$7,$8}'
This is the code I am using to retrieve the previous day's files, but it does not work:
previous_dates=$(date -d "-1 days" '+-%d')
ls -altr |gawk '{if ( $7!=previous_dates ) print $9 }'
-r-------- 1 root root 6291563 Jun 22 14:45 audit.log.4
-r-------- 1 root root 6291619 Jun 24 09:11 audit.log.3
drwxr-xr-x. 14 root root 4096 Jun 26 03:47 ..
-r-------- 1 root root 6291462 Jun 26 04:15 audit.log.2
-r-------- 1 root root 6291513 Jun 27 23:05 audit.log.1
drwxr-x---. 2 root root 4096 Jun 27 23:05 .
-rw------- 1 root root 5843020 Jun 29 14:57 audit.log

To select files modified yesterday, you could use
find . -daystart -type f -mtime 1
and to select older files, you could use
find . -daystart -type f -mtime +1
possibly adding a -name test to select only files like audit.log*, for example. You could then use xargs to process the files, e.g.
find . -daystart -type f -mtime 1 | xargs -I{} scp {} user@server:
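As a rough illustration of how the two find tests split the files, here is a minimal sketch. It assumes GNU find, date and touch; the temporary directory and the file timestamps are made up for the demo, and the transfer step is left out.

```shell
# Demo directory with made-up audit logs (hypothetical data, not the real logs).
demo=$(mktemp -d)
touch -d "$(date -d '1 day ago' +%F) 12:00" "$demo/audit.log.1"   # modified yesterday
touch -d "$(date -d '3 days ago' +%F) 12:00" "$demo/audit.log.2"  # modified earlier
touch "$demo/audit.log"                                            # modified today

# With -daystart, ages are counted in whole calendar days, so "-mtime 1"
# means yesterday and "-mtime +1" means anything before yesterday.
yesterday_files=$(find "$demo" -daystart -type f -name 'audit.log*' -mtime 1 -printf '%f\n')
older_files=$(find "$demo" -daystart -type f -name 'audit.log*' -mtime +1 -printf '%f\n')

echo "yesterday: $yesterday_files"
echo "older:     $older_files"
```

Each of the two lists could then be handed to scp, one for server x and one for server y.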

Related

Shell script find cmin evaluates on directory, not on files

I have a directory with files in it. When the files in this directory are older than 10 minutes, I would like to receive a notification from our monitoring.
Our monitoring opens an SSH session that executes a shell script to check the age of the files in the directory.
The shell script only reports the files in the checked directory when the directory itself is older than 10 minutes, not when the individual files are.
See the example below (I tested it on Nov 27th at 09:22).
There are files older than 10 minutes (as of Nov 27th, 09:22):
system:/mls_bmp/indir/BT> ll
-rw-r--r-- 1 sonic sonic 845 Nov 24 08:04 BRMLREL20171124080420572
-rw-r--r-- 1 sonic sonic 845 Nov 24 08:17 BRMLREL20171124081723685
-rw-r--r-- 1 sonic sonic 845 Nov 24 08:17 BRMLREL20171124081729805
-rw-r--r-- 1 sonic sonic 845 Nov 27 08:49 BRMLREL20171127084911037
-rw-r--r-- 1 sonic sonic 845 Nov 27 08:49 BRMLREL20171127084920817
However, this find command reports a count of 0:
system:/mls_bmp/indir> find /mls_bmp/indir/BT -prune -cmin +10 -exec ls {} \; | wc -l | xargs
0
And this is because the directory in which I check is younger than 10 minutes (since Nov 27th, 09:22):
system:/mls_bmp/indir> ll
drwxrwxrwx 4 oracle dba 20480 Nov 27 09:15 BT
I don't want to check subdirs, so using the statement: find /mls_bmp/indir/BT/* -prune -cmin +10 -exec ls {} \; | wc -l | xargs is not an option.
find starts at the specified directory, then looks at its contents (then their contents, and so on). This means that -prune applies to the directory itself (/mls_bmp/indir/BT) and prevents find from looking at the files inside it. As @BenjaminW said in a comment, use -maxdepth 1 -type f instead of -prune.
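A small sketch of the difference, assuming GNU find and touch. It uses -mmin rather than -cmin only because a file's change time cannot be backdated with touch; the directory and file names are invented for the demo.

```shell
dir=$(mktemp -d)
mkdir "$dir/sub"
touch -d '20 minutes ago' "$dir/old1" "$dir/old2" "$dir/sub/old3"
touch "$dir/new"

# -prune stops at the starting directory, so only the directory itself is tested
# (and it was modified just now, when the files were created in it).
pruned_count=$(find "$dir" -prune -mmin +10 | wc -l)

# -maxdepth 1 -type f tests the files directly inside it and skips subdirectories.
file_count=$(find "$dir" -maxdepth 1 -type f -mmin +10 | wc -l)

echo "with -prune: $pruned_count, with -maxdepth 1 -type f: $file_count"
```

The first count stays at 0 while the second one picks up the two old files at the top level.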

replacement on xargs variable returns empty string

I need to search for XML files inside a directory tree and create links to them in another directory (staging_ojs_pootle), naming each link after the file path (with slashes replaced by dots).
My bash command is not working; I am stuck on the replacement part. It seems that the variable from xargs, named 'file', is not accessible inside the replacement code (${file/\//.}).
find directory/ -name '*.xml' | xargs -I 'file' echo "ln" file staging_ojs_pootle/${file/\//.}
The replacement inside ${} gives me an empty string.
I tried using sed, but my regular expressions replaced either all of the slashes or just the last one :/
find directory/ -name '*.xml' | xargs -I 'file' echo "ln" file staging_ojs_pootle/file |sed -e '/^ln/s/\(staging_ojs_pootle.*\)[\/]\(.*\)/\1.\2/g'
regards
Try this:
$ find directory/ -name '*.xml' |sed -r 'h;s|/|.|g;G;s|([^\n]+)\n(.+)|ln \2 staging_ojs_pootle/\1|e'
For example:
$ mkdir -p /tmp/test
$ touch /tmp/test/{1,2,3,4}.xml
# use /tmp/test as staging_ojs_pootle
$ find /tmp/test -name '*.xml' |sed -r 'h;s|/|.|g;G;s|([^\n]+)\n(.+)|ln \2 /tmp/test/\1|e'
$ ls -al /tmp/test
total 8
drwxr-xr-x. 2 root root 4096 Jun 15 13:09 .
drwxrwxrwt. 9 root root 4096 Jun 15 11:45 ..
-rw-r--r--. 2 root root 0 Jun 15 11:45 1.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 2.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 3.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 4.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.1.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.2.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.3.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.4.xml
# if we do NOT use the e modifier of the s command, we can see the final commands
$ find /tmp/test -name '*.xml' |sed -r 'h;s|/|.|g;G;s|([^\n]+)\n(.+)|ln \2 /tmp/test/\1|'
ln /tmp/test/1.xml /tmp/test/.tmp.test.1.xml
ln /tmp/test/2.xml /tmp/test/.tmp.test.2.xml
ln /tmp/test/3.xml /tmp/test/.tmp.test.3.xml
ln /tmp/test/4.xml /tmp/test/.tmp.test.4.xml
Explanation:
For each XML file, use h to keep the original filename in the hold space.
Then use s|/|.|g to replace every / with . in the filename.
Use G to append the hold space to the pattern space, so the pattern space becomes CHANGED_FILENAME\nORIGIN_FILENAME.
Use s|([^\n]+)\n(.+)|ln \2 staging_ojs_pootle/\1|e to assemble the ln command from CHANGED_FILENAME and ORIGIN_FILENAME; the e modifier of the s command (a GNU sed extension) then executes the assembled command, which does the actual work.
Hope this helps!
If you can be sure that the names of your XML files do not contain any word-splitting characters, you can use something like:
find directory -name "*.xml" | sed 'p;s/\//./g' | xargs -n2 echo ln
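For what it's worth, the ${file/\//.} in the question comes back empty because the shell expands it before xargs ever runs, and at that point no shell variable named file exists. A minimal sketch that does the replacement inside a shell loop instead is shown here. It assumes file names without newlines, the sample tree is made up, and tr is used so that it also runs in plain sh (in bash, ${file//\//.} would do the same replacement).

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p directory/a/b staging_ojs_pootle
touch directory/a/one.xml directory/a/b/two.xml

# Here "file" is a real shell variable, so it can be transformed per file.
find directory -name '*.xml' |
while IFS= read -r file; do
    link_name=$(printf '%s' "$file" | tr '/' '.')   # replace every slash with a dot
    ln "$file" "staging_ojs_pootle/$link_name"
done

ls staging_ojs_pootle
```

This creates hard links named directory.a.one.xml and directory.a.b.two.xml in the staging directory.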

How to find files modified in last x minutes (find -mmin does not work as expected)

I'm trying to find files modified in the last x minutes, for example in the last hour. Many forums and tutorials on the net suggest using the find command with the -mmin option, like this:
find . -mmin -60 |xargs ls -l
However, this command did not work for me as expected. As you can see from the following listing, it also shows files modified earlier than 1 hour ago:
-rw------- 1 user user 9065 Oct 28 23:13 1446070435.V902I67a5567M283852.harvester
-rw------- 1 user user 1331 Oct 29 01:10 1446077402.V902I67a5b34M538793.harvester
-rw------- 1 user user 1615 Oct 29 01:36 1446078983.V902I67a5b35M267251.harvester
-rw------- 1 user user 72365 Oct 29 02:27 1446082022.V902I67a5b36M873811.harvester
-rw------- 1 user user 69102 Oct 29 02:27 1446082024.V902I67a5b37M142247.harvester
-rw------- 1 user user 2611 Oct 29 02:34 1446082482.V902I67a5b38M258101.harvester
-rw------- 1 user user 2612 Oct 29 02:34 1446082485.V902I67a5b39M607107.harvester
-rw------- 1 user user 2600 Oct 29 02:34 1446082488.V902I67a5b3aM465574.harvester
-rw------- 1 user user 10779 Oct 29 03:27 1446085622.V902I67a5b3bM110329.harvester
-rw------- 1 user user 5836 Oct 29 03:27 1446085623.V902I67a5b3cM254104.harvester
-rw------- 1 user user 8970 Oct 29 04:27 1446089232.V902I67a5b3dM936339.harvester
-rw------- 1 user user 165393 Oct 29 06:10 1446095400.V902I67a5b3eM290158.harvester
-rw------- 1 user user 105054 Oct 29 06:10 1446095430.V902I67a5b3fM265065.harvester
-rw------- 1 user user 1615 Oct 29 06:24 1446096244.V902I67a5b40M55701.harvester
-rw------- 1 user user 1620 Oct 29 06:24 1446096292.V902I67a5b41M337769.harvester
-rw------- 1 user user 10436 Oct 29 06:36 1446096973.V902I67a5b42M707215.harvester
-rw------- 1 user user 7150 Oct 29 06:36 1446097019.V902I67a5b43M415731.harvester
-rw------- 1 user user 4357 Oct 29 06:39 1446097194.V902I67a5b56M446687.harvester
-rw------- 1 user user 4283 Oct 29 06:39 1446097195.V902I67a5b57M957052.harvester
-rw------- 1 user user 4393 Oct 29 06:39 1446097197.V902I67a5b58M774506.harvester
-rw------- 1 user user 4264 Oct 29 06:39 1446097198.V902I67a5b59M532213.harvester
-rw------- 1 user user 4272 Oct 29 06:40 1446097201.V902I67a5b5aM534679.harvester
-rw------- 1 user user 4274 Oct 29 06:40 1446097228.V902I67a5b5dM363553.harvester
-rw------- 1 user user 20905 Oct 29 06:44 1446097455.V902I67a5b5eM918314.harvester
Actually, it just listed all files in the current directory. We can take one of these files as an example and check if its modification time is really as displayed by the ls command:
stat 1446070435.V902I67a5567M283852.harvester
File: ‘1446070435.V902I67a5567M283852.harvester’
Size: 9065 Blocks: 24 IO Block: 4096 regular file
Device: 902h/2306d Inode: 108680551 Links: 1
Access: (0600/-rw-------) Uid: ( 1001/ user) Gid: ( 1027/ user)
Access: 2015-10-28 23:13:55.281515368 +0100
Modify: 2015-10-28 23:13:55.281515368 +0100
Change: 2015-10-28 23:13:55.313515539 +0100
As we can see, this file was definitely last modified earlier than 1 hour ago! I also tried find -mmin 60 or find -mmin +60, but it did not work either.
Why is this happening, and how do I use the find command correctly?
I can reproduce your problem if there are no files in the directory that were modified in the last hour. In that case, find . -mmin -60 returns nothing. The command find . -mmin -60 | xargs ls -l, however, lists every file in the directory, because xargs still runs ls -l once when it receives no input, and ls -l without arguments lists the current directory.
To make sure that ls -l is only run when a file is found, try:
find . -mmin -60 -type f -exec ls -l {} +
The problem is that
find . -mmin -60
outputs:
.
./file1
./file2
Note the line with one dot?
That makes ls list the whole directory exactly the same as when ls -l . is executed.
One solution is to list only files (not directories):
find . -mmin -60 -type f | xargs ls -l
But it is better to use find's -exec option directly:
find . -mmin -60 -type f -exec ls -l {} \;
Or just:
find . -mmin -60 -type f -ls
Which, by the way, is safe even when directories are included:
find . -mmin -60 -ls
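To see where the stray listing comes from, here is a small sketch (assuming GNU find; the directory is a throwaway created for the demo). Because creating a file updates the directory's own modification time, the directory shows up as . in the unfiltered output.

```shell
dir=$(mktemp -d)
touch "$dir/file1"
cd "$dir" || exit 1

everything=$(find . -mmin -60 | sort)      # includes "." itself
files_only=$(find . -mmin -60 -type f)     # regular files only

printf 'without -type f:\n%s\n' "$everything"
printf 'with -type f:\n%s\n' "$files_only"
```

Passing the first list to ls -l makes ls expand the . entry into a full directory listing; the second list avoids that.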
To search for files in /target_directory and all its sub-directories, that have been modified in the last 60 minutes:
$ find /target_directory -type f -mmin -60
To find the most recently modified files, sorted in the reverse order of update time (i.e., the most recently updated files first):
$ find /etc -type f -printf '%TY-%Tm-%Td %TT %p\n' | sort -r
Manual of find:
Numeric arguments can be specified as
+n for greater than n,
-n for less than n,
n for exactly n.
-amin n
File was last accessed n minutes ago.
-anewer file
File was last accessed more recently than file was modified. If file is a symbolic link and the -H option or the -L option is in effect, the access time of the file it points to is always used.
-atime n
File was last accessed n*24 hours ago. When find figures out how many 24-hour periods ago the file was last accessed, any fractional part is ignored, so to match -atime +1, a file has to have been accessed at least two days ago.
-cmin n
File's status was last changed n minutes ago.
-cnewer file
File's status was last changed more recently than file was modified. If file is a symbolic link and the -H option or the -L option is in effect, the status-change time of the file it points to is always used.
-ctime n
File's status was last changed n*24 hours ago. See the comments for -atime to understand how rounding affects the interpretation of file status change times.
Example:
find /dir -cmin -60 # status change time (ctime), not creation time
find /dir -mmin -60 # modification time
find /dir -amin -60 # access time
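A quick way to convince yourself that -cmin looks at the inode change time rather than at content modification. This is only a sketch, assuming GNU find and touch, with a made-up file name.

```shell
dir=$(mktemp -d)
# Backdate the modification time; the change time (ctime) stays at "now",
# because setting timestamps is itself a status change.
touch -d '2 hours ago' "$dir/report.txt"

by_mtime=$(find "$dir" -type f -mmin -60 | wc -l)   # content modified in the last hour?
by_ctime=$(find "$dir" -type f -cmin -60 | wc -l)   # status changed in the last hour?

echo "matched by -mmin -60: $by_mtime"
echo "matched by -cmin -60: $by_ctime"
```

The file is not matched by -mmin -60 but is matched by -cmin -60, since chmod, chown, rename and timestamp updates all reset the change time.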
I am working through the same need, and I believe your timeframe is incorrect.
Try these (the values are rounded fractions of a day, and they rely on find accepting fractional arguments to -mtime, which GNU find does):
15 min change: find . -mtime -.01
1 hr change: find . -mtime -.04
12 hr change: find . -mtime -.5
You should be using 24 hours as your base. The number after -mtime is a fraction of 24 hours. Thus -.5 is the equivalent of 12 hours, because 12 hours is half of 24 hours.
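The fractions above are rounded. A small sketch of the conversion, where minutes_to_days is a hypothetical helper name and 1440 is the number of minutes in a day:

```shell
# Convert minutes to a fraction of a day, printed with 4 decimal places.
minutes_to_days() {
    LC_ALL=C awk -v m="$1" 'BEGIN { printf "%.4f\n", m / 1440 }'
}

minutes_to_days 15    # 0.0104
minutes_to_days 60    # 0.0417
minutes_to_days 720   # 0.5000
```

With -mmin no conversion is needed at all, since its argument is already in minutes.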
Actually, there's more than one issue here. The main one is that xargs by default executes the command you specified, even when no arguments have been passed. To change that you might use a GNU extension to xargs:
--no-run-if-empty
-r
If the standard input does not contain any nonblanks, do not run the command. Normally, the command is run once even if there is no input. This option is a GNU extension.
Simple example:
find . -mmin -60 | xargs -r ls -l
But this might also match subdirectories, including . (the current directory), and ls will list the contents of each of them individually. So the output will be a mess. Solution: pass -d to ls, which prevents it from listing directory contents:
find . -mmin -60 | xargs -r ls -ld
Now you don't like . (the current directory) in your list? Solution: exclude the first directory level (0) from find output:
find . -mindepth 1 -mmin -60 | xargs -r ls -ld
Now you'd need only the files in your list? Solution: exclude the directories:
find . -type f -mmin -60 | xargs -r ls -l
Now you have some files with names containing white space, quote marks, or backslashes? Solution: use null-terminated output (find) and input (xargs) (these are also GNU extensions, afaik):
find . -type f -mmin -60 -print0 | xargs -r0 ls -l
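The effect of -r is easy to check with a directory that contains nothing recent. A sketch assuming GNU find and xargs, with a throwaway directory:

```shell
dir=$(mktemp -d)
touch -d '3 hours ago' "$dir/old.txt"
cd "$dir" || exit 1

# Nothing matches, yet plain GNU xargs still runs ls once, which lists the directory.
without_r=$(find . -type f -mmin -60 -print0 | xargs -0 ls | wc -l)

# With -r, ls is not run at all when there is no input.
with_r=$(find . -type f -mmin -60 -print0 | xargs -r0 ls | wc -l)

echo "lines without -r: $without_r"
echo "lines with -r:    $with_r"
```

Without -r the old file is listed even though find matched nothing; with -r the output is empty.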
This may work for you. I used it for cleaning folders during deployments for deleting old deployment files.
clean_anyfolder() {
    local temp2="$1/**"   # PATH
    # Entries whose ls output contains today's "Mon DD" string
    temp3=( $(ls -d $temp2 -t | grep "$(date | awk '{print $2" "$3}')") )
    j=0
    while [ $j -lt ${#temp3[@]} ]
    do
        echo "to be removed ${temp3[$j]}"
        # DELETE HERE (delete_file_or_folder is a helper that is not shown)
        delete_file_or_folder "${temp3[$j]}" 0
        j=$((j + 1))
    done
}
This command may help you:
find . -type f -mmin -60

Linux combine sort files by date created and given file name

I need to combine these two commands in order to get a list, sorted by creation date, of the files matching the specified "filename".
I know that sorting files by date can be achieved with:
ls -lrt
and finding a file by name with
find . -name "filename*"
I don't know how to combine these two. I tried with a pipeline but I don't get the right result.
[EDIT]
Not sorted
find . -name "filename" -printf '%TY:%Tm:%Td %TH:%Tm %h/%f\n' | sort
Forget xargs. "Find" and "sort" are all the tools you need.
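One way to do this with only find and sort is to print a sortable timestamp in front of each name. A minimal sketch, assuming GNU find (for -printf) and sorting by modification time, which is also what ls -lrt sorts by; the file names and dates are invented:

```shell
dir=$(mktemp -d)
touch -d '2020-01-03 10:00' "$dir/filename_c"
touch -d '2020-01-01 10:00' "$dir/filename_a"
touch -d '2020-01-02 10:00' "$dir/filename_b"
touch "$dir/unrelated"

# %T@ is the modification time in seconds since the epoch, so a numeric sort
# orders oldest first; cut then drops the timestamp column.
sorted=$(find "$dir" -name 'filename*' -printf '%T@ %f\n' | sort -n | cut -d' ' -f2-)
echo "$sorted"
```

Using sort -rn instead would put the most recently modified file first.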
My best guess would be to use xargs:
find . -name 'filename*' -print0 | xargs -0 /bin/ls -ltr
There's an upper limit on the number of arguments, but it shouldn't be a problem unless they occupy more than 32kB, in which case xargs splits them across several ls invocations and you will get separately sorted blocks of files :)
find . -name "filename" -exec ls --full-time \{\} \; | cut -d' ' -f7- | sort
You might have to adjust the cut command depending on what your version of ls outputs.
Check the command below:
1) List files in a directory with last modified date/time
To list files and show the most recently modified files at the top, use the -lt options with the ls command.
$ ls -lt /run
Output:
total 24
-rw-rw-r--. 1 root utmp 2304 Sep 8 14:58 utmp
-rw-r--r--. 1 root root 4 Sep 8 12:41 dhclient-eth0.pid
drwxr-xr-x. 4 root root 100 Sep 8 03:31 lock
drwxr-xr-x. 3 root root 60 Sep 7 23:11 user
drwxr-xr-x. 7 root root 160 Aug 26 14:59 udev
drwxr-xr-x. 2 root root 60 Aug 21 13:18 tuned
https://linoxide.com/linux-how-to/how-sort-files-date-using-ls-command-linux/

How to limit depth for recursive file list?

Is there a way to limit the depth of a recursive file listing in linux?
The command I'm using at the moment is:
ls -laR > dirlist.txt
But I've got about 200 directories and each of them has tens of subdirectories. So it's just going to take far too long and hog too many system resources.
All I'm really interested in is the ownership and permissions information for the first level subdirectories:
drwxr-xr-x 14 root root 1234 Dec 22 13:19 /var/www/vhosts/domain1.co.uk
drwxr--r-- 14 jon root 1234 Dec 22 13:19 /var/www/vhosts/domain1.co.uk/htdocs
drwxr--r-- 14 jon root 1234 Dec 22 13:19 /var/www/vhosts/domain1.co.uk/cgi-bin
drwxr-xr-x 14 root root 1234 Dec 22 13:19 /var/www/vhosts/domain2.co.uk
drwxr-xrwx 14 proftp root 1234 Dec 22 13:19 /var/www/vhosts/domain2.co.uk/htdocs
drwxr-xrwx 14 proftp root 1234 Dec 22 13:19 /var/www/vhosts/domain2.co.uk/cgi-bin
drwxr-xr-x 14 root root 1234 Dec 22 13:19 /var/www/vhosts/domain3.co.uk
drwxr-xr-- 14 jon root 1234 Dec 22 13:19 /var/www/vhosts/domain3.co.uk/htdocs
drwxr-xr-- 14 jon root 1234 Dec 22 13:19 /var/www/vhosts/domain3.co.uk/cgi-bin
drwxr-xr-x 14 root root 1234 Dec 22 13:19 /var/www/vhosts/domain4.co.uk
drwxr-xr-- 14 jon root 1234 Dec 22 13:19 /var/www/vhosts/domain4.co.uk/htdocs
drwxr-xr-- 14 jon root 1234 Dec 22 13:19 /var/www/vhosts/domain4.co.uk/cgi-bin
EDIT:
Final choice of command:
find -maxdepth 2 -type d -ls >dirlist
Check out the -maxdepth flag of find:
find . -maxdepth 1 -type d -exec ls -ld "{}" \;
Here I used 1 as the maximum depth. -type d means find only directories, and ls -ld then lists each directory itself (not its contents) in long format.
Make use of find's options
There is actually no exec of /bin/ls needed;
Find has an option that does just that:
find . -maxdepth 2 -type d -ls
To see only the one level of subdirectories you are interested in, add -mindepth to the same level as -maxdepth:
find . -mindepth 2 -maxdepth 2 -type d -ls
Use output formatting
When different details should be shown, -printf can print any detail about a file in a custom format.
To show the symbolic permissions and the owner name of the file, use -printf with %M and %u in the format.
I noticed later that you want the full ownership information, which includes the group. Use %g in the format for the symbolic group name, or %G for the numeric group id (likewise %U for the numeric user id).
find . -mindepth 2 -maxdepth 2 -type d -printf '%M %u %g %p\n'
This should give you just the details you need, for just the right files.
I will give an example that shows actually different values for user and group:
$ sudo find /tmp -mindepth 2 -maxdepth 2 -type d -printf '%M %u %g %p\n'
drwx------ www-data www-data /tmp/user/33
drwx------ octopussy root /tmp/user/126
drwx------ root root /tmp/user/0
drwx------ siegel root /tmp/user/1000
drwxrwxrwt root root /tmp/systemd-[...].service-HRUQmm/tmp
(Edited for readability: indented, shortened last line)
Notes on performance
Although the execution time is mostly irrelevant for this kind of command, the increase in performance is large enough here to make it worth pointing out:
Not only do we avoid creating a new process for each name, which is a costly task, but the information does not even need to be read again, as find already has it.
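To try the depth limits on a throwaway tree, something like the following sketch could be used. It assumes GNU find; the domain directories are made up, and %P (the path relative to the starting point) is used instead of %p only to keep the output short.

```shell
root=$(mktemp -d)
mkdir -p "$root/domain1/htdocs/deep" "$root/domain1/cgi-bin" "$root/domain2/htdocs"

# Only directories exactly two levels below $root are printed;
# "deep" (level 3) and the domain directories (level 1) are skipped.
listing=$(find "$root" -mindepth 2 -maxdepth 2 -type d -printf '%M %u %g %P\n' | LC_ALL=C sort -k4)
echo "$listing"
```

Each output line holds the permissions, owner, group and relative path of one second-level directory.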
tree -L 2 -u -g -p -d
Prints the directory tree in a pretty format up to depth 2 (-L 2).
Print user (-u) and group (-g) and permissions (-p).
Print only directories (-d).
tree has a lot of other useful options.
All I'm really interested in is the ownership and permissions information for the first level subdirectories.
I found an easy solution while playing with my fish, which fits your need perfectly.
ll `ls`
or
ls -l $(ls)
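Note that ls -l $(ls) splits names on whitespace. If only the first-level subdirectories themselves are wanted, a glob with ls -d is a possible alternative. This is a sketch with invented directory names, not something taken from the answers above:

```shell
root=$(mktemp -d)
mkdir "$root/with space" "$root/plain"
touch "$root/file.txt"
cd "$root" || exit 1

# */ matches only directories; -d lists the directories themselves, not their contents.
ls -ld -- */

subdirs=$(ls -d -- */)
```

The glob keeps each name intact, so the directory containing a space is listed as a single entry.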
