How to create a patch file for text files only in a directory - linux

I have a directory with hundreds of text files and object files. I had a copy of this directory somewhere else, and I edited it and recompiled it. The object files are now different, but I want to generate a patch from the text files only. Is there a way to do this, or do I need to separate them into different folders?
diff -uraN original/ new/ > patch.diff
How can I specify file types in this command?
-X excludes, but I want the opposite of that: I want to exclude everything except .txt files.

Did you want one diff per txt?
for f in original/*.txt                           # for each original .txt file
do  d=${f#original/}                              # strip the directory prefix to get the base name
    diff -uraN "$f" "new/$d" > "${d%.txt}.diff"   # diff old against new
done
You mention -X; I'm not sure how diff implements it, but the bash CLI allows extended globbing.
$: shopt -s extglob
$: ls -l *.???
-rw-r--r-- 1 P2759474 1049089 0 May 10 21:49 OCG3C82.tmp
-rw-r--r-- 1 P2759474 1049089 0 May 11 03:22 OCG511D.tmp
-rw-r--r-- 1 P2759474 1049089 0 May 12 00:03 OCG5214.tmp
-rw-r--r-- 1 P2759474 1049089 0 May 14 09:34 a.txt
-rw-r--r-- 1 P2759474 1049089 0 May 14 09:34 b.txt
$: ls -l *.!(txt)
-rw-r--r-- 1 P2759474 1049089 0 May 10 21:49 OCG3C82.tmp
-rw-r--r-- 1 P2759474 1049089 0 May 11 03:22 OCG511D.tmp
-rw-r--r-- 1 P2759474 1049089 0 May 12 00:03 OCG5214.tmp
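One way to tie the two together (my own sketch, untested; it assumes GNU diff, whose -X/--exclude-from reads one exclude pattern per line and matches them against base names) is to build the exclude list with extglob and hand it to diff:
shopt -s extglob
# list everything that is NOT a .txt file in either tree, one name per line
{ ( cd original && printf '%s\n' !(*.txt) )
  ( cd new      && printf '%s\n' !(*.txt) ); } | sort -u > exclude.list
diff -uraN -X exclude.list original/ new/ > patch.diff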

If I understand your question correctly, you just want to run diff on files with the extension .txt only.
Unfortunately, diff has no include option, but we can get around that by using find to list the files which aren't text files, and then excluding those with -X. This one-liner will do that.
find original new -type f ! -name '*.txt' -printf '%f\n' | diff -uraN original/ new/ -X - > patch.diff
If you want more info on how that works you can check out the man pages for find and diff.
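If the exclude-list route feels brittle, the "separate folders" idea from the question can also be automated (a rough sketch, assuming GNU cp for --parents; note the resulting patch will reference the temporary directory names):
tmp_old=$(mktemp -d); tmp_new=$(mktemp -d)
( cd original && find . -type f -name '*.txt' -exec cp --parents {} "$tmp_old" \; )
( cd new      && find . -type f -name '*.txt' -exec cp --parents {} "$tmp_new" \; )
diff -uraN "$tmp_old" "$tmp_new" > patch.diff
rm -rf "$tmp_old" "$tmp_new"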


replacement on xargs variable returns empty string

I need to search for XML files inside a directory tree and create links for them on another directory (staging_ojs_pootle), naming these links with the file path (replacing slashes per dots).
The bash command is not working; I got stuck on the replacement part. It seems the variable from xargs, named 'file', is not accessible inside the replacement code (${file/\//.}).
find directory/ -name '*.xml' | xargs -I 'file' echo "ln" file staging_ojs_pootle/${file/\//.}
The replacement inside ${} gives me an empty string.
I tried using sed, but the regular expressions were replacing either all of the slashes or just the last slash :/
find directory/ -name '*.xml' | xargs -I 'file' echo "ln" file staging_ojs_pootle/file |sed -e '/^ln/s/\(staging_ojs_pootle.*\)[\/]\(.*\)/\1.\2/g'
regards
Try this:
$ find directory/ -name '*.xml' |sed -r 'h;s|/|.|g;G;s|([^\n]+)\n(.+)|ln \2 staging_ojs_pootle/\1|e'
For example:
$ mkdir -p /tmp/test
$ touch /tmp/test/{1,2,3,4}.xml
# use /tmp/test as staging_ojs_pootle
$ find /tmp/test -name '*.xml' |sed -r 'h;s|/|.|g;G;s|([^\n]+)\n(.+)|ln \2 /tmp/test/\1|e'
$ ls -al /tmp/test
total 8
drwxr-xr-x. 2 root root 4096 Jun 15 13:09 .
drwxrwxrwt. 9 root root 4096 Jun 15 11:45 ..
-rw-r--r--. 2 root root 0 Jun 15 11:45 1.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 2.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 3.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 4.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.1.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.2.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.3.xml
-rw-r--r--. 2 root root 0 Jun 15 11:45 .tmp.test.4.xml
# if we do NOT use the e modifier of the s command, we can see the final commands
$ find /tmp/test -name '*.xml' |sed -r 'h;s|/|.|g;G;s|([^\n]+)\n(.+)|ln \2 /tmp/test/\1|'
ln /tmp/test/1.xml /tmp/test/.tmp.test.1.xml
ln /tmp/test/2.xml /tmp/test/.tmp.test.2.xml
ln /tmp/test/3.xml /tmp/test/.tmp.test.3.xml
ln /tmp/test/4.xml /tmp/test/.tmp.test.4.xml
Explanation:
For each xml file, use h to keep the original filename in the hold space.
Then use s|/|.|g to substitute every / with . in the filename.
Use G to append the hold space to the pattern space, so the pattern space becomes CHANGED_FILENAME\nORIGINAL_FILENAME.
Use s|([^\n]+)\n(.+)|ln \2 staging_ojs_pootle/\1|e to assemble the command from CHANGED_FILENAME and ORIGINAL_FILENAME; the e modifier of the s command then executes the assembled command, which does the actual work.
Hope this helps!
If you can be sure that the names of your XML files do not contain any word-splitting characters, you can use something like:
find directory -name "*.xml" | sed 'p;s/\//./g' | xargs -n2 echo ln
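The reason ${file/\//.} comes up empty, by the way, is that the parent shell expands it before xargs ever runs, and $file is unset there. A plain while-read loop keeps the expansion in a shell that actually has $file set (a sketch using the directory names from the question; ${file//\//.} replaces every slash):
find directory/ -name '*.xml' | while IFS= read -r file; do
    ln "$file" "staging_ojs_pootle/${file//\//.}"   # link named after the path with / replaced by .
done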

how to get previous date files and pass ls output to array in gawk

I have log files generated like below, and I need a script that runs daily, lists them, and then does 2 things:
1- get the previous day's (yesterday's) files and transfer them to server x
2- get files older than one day and transfer them to server y
The files are like below, and I am trying the code below, but it is not working.
How can we pass ls -altr output to gawk? Can we build an associative array like below?
array[index]=ls -altr | awk '{print $6,$7,$8}'
The code I am trying, to retrieve the previous day's files, but it is not working:
previous_dates=$(date -d "-1 days" '+-%d')
ls -altr |gawk '{if ( $7!=previous_dates ) print $9 }'
-r-------- 1 root root 6291563 Jun 22 14:45 audit.log.4
-r-------- 1 root root 6291619 Jun 24 09:11 audit.log.3
drwxr-xr-x. 14 root root 4096 Jun 26 03:47 ..
-r-------- 1 root root 6291462 Jun 26 04:15 audit.log.2
-r-------- 1 root root 6291513 Jun 27 23:05 audit.log.1
drwxr-x---. 2 root root 4096 Jun 27 23:05 .
-rw------- 1 root root 5843020 Jun 29 14:57 audit.log
To select files modified yesterday, you could use
find . -daystart -type f -mtime 1
and to select older files, you could use
find . -daystart -type f -mtime +1
possibly adding a -name test to select only files like audit.log*, for example. You could then use xargs to process the files, e.g.
find . -daystart -type f -mtime 1 | xargs -n 1 -I{} scp {} user@server:
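Putting both cases together in one daily script might look like this (a sketch only; the host names, destination path, and the audit.log* pattern are placeholders, not from the answer above):
#!/bin/bash
# Yesterday's logs go to server x, anything older goes to server y.
find . -daystart -type f -name 'audit.log*' -mtime 1  -exec scp {} user@x-server:/var/log/archive/ \;
find . -daystart -type f -name 'audit.log*' -mtime +1 -exec scp {} user@y-server:/var/log/archive/ \;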

Rename a file in linux which has date appended to it

There is a single file in a folder which has the date and time appended to it. I would like to rename it to something else so that I can access it easily. I know the starting word of this file name. Is there a way I can rename it (using a wildcard or something)? I can't use Tab completion since I am trying to write a script to automate something.
Also, I would like to access the lexicographical last element and rename it if there are multiple files.
You want to find the last changed file in a directory? Which matches a pattern?
find . -maxdepth 1 -mindepth 1 -type f -name "prefix*" \
    -printf "%TY%Tm%Td%TH%TM %p\n" | sort -nr |
    { read -r _ file; printf "%s" "$file"; }
Assuming no filename contains newlines. And that you actually just want to get the last changed file in a directory.
Alternatively, you can sort with something like this:
find . -maxdepth 1 -mindepth 1 -type f -name "prefix*" | sort -nr -t- -k2
Which will sort files like this:
prefix-2016-05-05
prefix-2016-06-05
prefix-2016-04-08
To
prefix-2016-06-05
prefix-2016-05-05
prefix-2016-04-08
Assuming the name of the file is bob_2016-06-10_06:00:00.txt you could do this:
mv *_20[0-9][0-9]-[01][0-9]-[0-3][0-9]_[0-2][0-9]:[0-5][0-9]:[0-5][0-9].txt commonname.txt
You can use rename with a regexp. The syntax is:
Usage: rename [-v] [-n] [-f] perlexpr [filenames]
So, given the following files:
coda@pong:/tmp/kk$ ls -l
total 0
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 aaaa-2016-01-01.txt
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 aaaa-2016-01-02.txt
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 aaaa-2016-01-03.txt
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 aaaa-2016-01-04.txt
You can rename them this way:
coda@pong:/tmp/kk$ rename -v 's/aaaa/xxxx/' *.txt
aaaa-2016-01-01.txt renamed as xxxx-2016-01-01.txt
aaaa-2016-01-02.txt renamed as xxxx-2016-01-02.txt
aaaa-2016-01-03.txt renamed as xxxx-2016-01-03.txt
aaaa-2016-01-04.txt renamed as xxxx-2016-01-04.txt
If you want to keep track of the lexicographical last element, I would use something like this in the script:
ln -sf `ls | tail -n1` latest
Since ls sorts by name by default, you will always have a link to your lexicographical last element:
coda@pong:/tmp/kk$ ls -lrt
total 0
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 xxxx-2016-01-01.txt
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 xxxx-2016-01-02.txt
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 xxxx-2016-01-03.txt
-rw-rw-r-- 1 coda coda 0 Jun 10 13:28 xxxx-2016-01-04.txt
lrwxrwxrwx 1 coda coda 19 Jun 10 13:59 latest -> xxxx-2016-01-04.txt
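If this has to run unattended, a glob-based sketch (prefix* and commonname.txt are placeholders, and the negative array index needs bash 4.3 or newer) that renames only the lexicographically last match:
files=( prefix* )                       # globs expand in sorted (lexicographical) order
if [ -e "${files[-1]}" ]; then          # guard: if nothing matched, the pattern is left literal
    mv -- "${files[-1]}" commonname.txt
fi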

Linux combine sort files by date created and given file name

I need to combine these two commands in order to get a list of files with the specified "filename", sorted by date created.
I know that sorting files by date can be achieved with:
ls -lrt
and finding a file by name with
find . -name "filename*"
I don't know how to combine these two. I tried with a pipeline but I don't get the right result.
[EDIT]
This is not sorted correctly:
find . -name "filename" -printf '%TY:%Tm:%Td %TH:%Tm %h/%f\n' | sort
Forget xargs. "Find" and "sort" are all the tools you need.
My best guess would be to use xargs:
find . -name 'filename*' -print0 | xargs -0 /bin/ls -ltr
There's an upper limit on the number of arguments, but it shouldn't be a problem unless they occupy more than 32 kB, in which case you will get blocks of sorted files :)
find . -name "filename" -exec ls --full-time \{\} \; | cut -d' ' -f7- | sort
You might have to adjust the cut command depending on what your version of ls outputs.
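A variant of the same idea that avoids parsing ls output at all (a sketch, assuming GNU find for -printf): print the modification time in epoch seconds, sort numerically, then strip the timestamp.
find . -name 'filename*' -printf '%T@ %p\n' | sort -n | cut -d' ' -f2-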
Check the command below:
1) List files in a directory with the last modified date/time
To list files with the most recently modified ones at the top, use the -lt option of the ls command.
$ ls -lt /run
Output:
total 24
-rw-rw-r--. 1 root utmp 2304 Sep 8 14:58 utmp
-rw-r--r--. 1 root root 4 Sep 8 12:41 dhclient-eth0.pid
drwxr-xr-x. 4 root root 100 Sep 8 03:31 lock
drwxr-xr-x. 3 root root 60 Sep 7 23:11 user
drwxr-xr-x. 7 root root 160 Aug 26 14:59 udev
drwxr-xr-x. 2 root root 60 Aug 21 13:18 tuned
https://linoxide.com/linux-how-to/how-sort-files-date-using-ls-command-linux/

Linux - Save only recent 10 folders and delete the rest

I have a folder that contains versions of my application. Each time I upload a new version, a new sub-folder is created for it; the sub-folder name is the current timestamp. Here is a printout of the main folder (ls -l | grep ^d):
drwxrwxr-x 7 root root 4096 2011-03-31 16:18 20110331161649
drwxrwxr-x 7 root root 4096 2011-03-31 16:21 20110331161914
drwxrwxr-x 7 root root 4096 2011-03-31 16:53 20110331165035
drwxrwxr-x 7 root root 4096 2011-03-31 16:59 20110331165712
drwxrwxr-x 7 root root 4096 2011-04-03 20:18 20110403201607
drwxrwxr-x 7 root root 4096 2011-04-03 20:38 20110403203613
drwxrwxr-x 7 root root 4096 2011-04-04 14:39 20110405143725
drwxrwxr-x 7 root root 4096 2011-04-06 15:24 20110406151805
drwxrwxr-x 7 root root 4096 2011-04-06 15:36 20110406153157
drwxrwxr-x 7 root root 4096 2011-04-06 16:02 20110406155913
drwxrwxr-x 7 root root 4096 2011-04-10 21:10 20110410210928
drwxrwxr-x 7 root root 4096 2011-04-10 21:50 20110410214939
drwxrwxr-x 7 root root 4096 2011-04-10 22:15 20110410221414
drwxrwxr-x 7 root root 4096 2011-04-11 22:19 20110411221810
drwxrwxr-x 7 root root 4096 2011-05-01 21:30 20110501212953
drwxrwxr-x 7 root root 4096 2011-05-01 23:02 20110501230121
drwxrwxr-x 7 root root 4096 2011-05-03 21:57 20110503215252
drwxrwxr-x 7 root root 4096 2011-05-06 16:17 20110506161546
drwxrwxr-x 7 root root 4096 2011-05-11 10:00 20110511095709
drwxrwxr-x 7 root root 4096 2011-05-11 10:13 20110511100938
drwxrwxr-x 7 root root 4096 2011-05-12 14:34 20110512143143
drwxrwxr-x 7 root root 4096 2011-05-13 22:13 20110513220824
drwxrwxr-x 7 root root 4096 2011-05-14 22:26 20110514222548
drwxrwxr-x 7 root root 4096 2011-05-14 23:03 20110514230258
I'm looking for a command that will leave the last 10 versions (sub-folders) and deletes the rest.
Any thoughts?
There you go.
ls -dt */ | tail -n +11 | xargs rm -rf
First list the directories by modification time (most recently modified first), then take all of them except the first 10, then send them to rm -rf.
ls -dt1 /path/to/folder/*/ | sed -n '11,$p' | xargs rm -r
This assumes those are the only directories and no others are present in the working directory.
The /*/ glob matches only directories and passes their full paths; -d lists the directories themselves rather than their contents, -1 prints one entry per line, and -t sorts by modification time with the newest at the top.
sed -n '11,$p' prints only the 11th line down to the bottom, and those lines are passed via xargs to rm.
For testing, you may wish to remove | xargs rm -r first to check that the directories are listed correctly.
If the directories' names contain the date, one can delete all but the last 10 directories using the default alphabetical sort:
ls -d */ | head -n -10 | xargs rm -rf
ls -lt | grep ^d | sed -e '1,10d' | awk '{sub(/.* /, ""); print }' | xargs rm -rf
Explanation:
list all contents of the current directory in chronological order (most recently modified first)
keep only the directories
skip the first 10 lines / directories
use awk to extract the directory names from the remaining 'ls -l' output
remove those directories
EDIT:
find . -maxdepth 1 -type d ! -name \\.| sort | tac | sed -e '1,10d' | xargs rm -rf
I suggest the following sequence. I use a similar approach on my Synology NAS to delete old backups. It doesn't rely on the folder names, instead it uses the last modified time to decide which folders to delete. It also uses zero-termination in order to correctly handle quotes, spaces and newline characters in the folder names:
find /path/to/folder -maxdepth 1 -mindepth 1 -type d -printf '%Ts\t' -print0 \
| sort -rnz \
| tail -n +11 -z \
| cut -f2- -z \
| xargs -0 -r rm -rf
IMPORTANT: This will delete any matching folders! I strongly recommend doing a test run first by replacing the last command xargs -0 -r rm -rf with xargs -0 which will echo the matching folders instead of deleting them.
A short explanation of each step:
find /path/to/folder -maxdepth 1 -mindepth 1 -type d -printf '%Ts\t' -print0
Find all directories (-type d) directly inside the backup folder (-maxdepth 1) except the backup folder itself (-mindepth 1), print (-printf) the Unix time (%Ts) of the last modification followed by a tab character (\t, used in step 4) and the full file name followed by a null character (-print0).
sort -rnz
Sort the zero-terminated items (-z) from the previous step using a numerical comparison (-n) and reverse the order (-r). The result is a list of all folders sorted by their last modification time in descending order.
tail -n +11 -z
Print the last lines (tail) from the previous step starting from line 11 (-n +11) considering each line as zero-terminated (-z). This excludes the newest 10 folders (by modification time) from the remaining steps.
cut -f2- -z
Cut each line from the second field until the end (-f2-), treating each line as zero-terminated (-z), to obtain a list containing the full path to each folder except the 10 newest.
xargs -r -0 rm -rf
Take the zero-terminated (-0) items from the previous step (xargs), and, if there are any (-r avoids running the command passed to xargs if there are no nonblank characters), force delete (rm -rf) them.
Your directory names are sorted in chronological order, which makes this easy. The list of directories in chronological order is just *, or [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] to be more precise. So you want to delete all but the last 10 of them.
set [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
while [ $# -gt 10 ]; do
    rm -rf "$1"
    shift
done
(While there are more than 10 directories left, delete the oldest one.)
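The same idea in pure bash with an array (a sketch; like the answer above, it relies on the timestamp names sorting chronologically):
dirs=( [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/ )   # oldest first ... newest last
if (( ${#dirs[@]} > 10 )); then
    rm -rf -- "${dirs[@]:0:${#dirs[@]}-10}"    # everything except the newest 10
fi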
