I would like to do the following using svn: search a file for a certain string across its entire revision history. Is this possible? How do I best go about it?
Do I come up with a script using svn cat -r1 and grep, or is there some better method?
SVN has no direct support for this.
If it's only one file, this will work, albeit slowly:
svn log -q <file> | grep '^r' | awk '{print $1;}' | \
xargs -n 1 -i svn cat -r {} <file> | grep '<string>'
Fill in <file> and <string>.
A for loop will also work, so you can print the matching file/revision if desired.
Here is a version using a for loop for a bit more output control (this is bash):
#!/usr/bin/env bash
f=$1    # path to the file to search
s=$2    # string to search for

# walk every revision in which the file changed and grep that revision's content
for r in $(svn log -q "$f" | grep '^r' | awk '{print $1;}'); do
    e=$(svn cat -r "$r" "$f" | grep "$s")
    if [[ -n "$e" ]]; then
        echo "Found in revision $r: $e"
    fi
done
This takes two arguments: the path to the file to search and the string to search for in that file.
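For example, if the script above were saved as svn-grep-history.sh (a file name chosen here purely for illustration), a run could look like this:
# hypothetical invocation: search every revision of trunk/foo.c for the string "TODO"
./svn-grep-history.sh trunk/foo.c "TODO"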
Trying to download the latest SBT version from GitHub:
version="$(curl -vsLk https://github.com/sbt/sbt/releases/latest 2>&1 | grep "< Location" | rev | cut -d'/' -f1 | rev)"
version is set to v1.1.0-RC2
Then attempting to download the .tar.gz package:
curl -fsSLk "https://github.com/sbt/sbt/archive/${version}.tar.gz" | tar xvfz - -C /home/myuser
However, instead of the correct URL:
https://github.com/sbt/sbt/archive/v1.1.0-RC2.tar.gz
Somehow the version string is interpreted as a command(?!), resulting in:
.tar.gzttps://github.com/sbt/sbt/archive/v1.1.0-RC2
When I manually set version="v1.1.0-RC2", this doesn't happen.
Thanks in advance!
You should use the -I flag in your curl command and a much simpler pipeline to grab the version number, like this:
curl -sILk https://github.com/sbt/sbt/releases/latest |
awk -F '[/ ]+' '$1 == "Location:"{sub(/\r$/, ""); print $NF}'
v1.1.0-RC2
Also note the use of the sub function to strip the trailing \r from the end of the Location line in curl's output; that stray carriage return is what was mangling your URL.
Your script then becomes:
version=$(curl -sILk https://github.com/sbt/sbt/releases/latest | awk -F '[/ ]+' '$1 == "Location:"{sub(/\r$/, ""); print $NF}')
curl -fsSLk "https://github.com/sbt/sbt/archive/${version}.tar.gz" | tar xvfz - -C /home/myuser
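To see why the original pipeline misbehaved: the Location header that curl prints ends with a carriage return, and that \r ends up inside ${version}. A tiny illustration, with the value hard-coded purely for the demo:
version=$'v1.1.0-RC2\r'   # simulate a value that still carries the trailing \r
echo "https://github.com/sbt/sbt/archive/${version}.tar.gz"
# the \r rewinds the cursor to column 1, so ".tar.gz" prints over the start of the URL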
I am writing a bash script that will run a couple of times a minute. What I would like it to do is find all files in a specified directory that contain a specified string, and then go through that list of files and delete any line beginning with a different specific string (in this case "<meta").
Here's what I've tried so far, but neither is working:
ls -1t /the/directory | head -10 | grep -l "qualifying string" * | sed -i '/^<meta/d' *'
ls -1t /the/directory | head -10 | grep -l "qualifying string" * | sed -i '/^<meta/d' /the/directory'
The only reason I added in the head -10 is so that every time the script runs, it will start by only looking at the 10 most recent files. I don't want it to spend a lot of time searching needlessly through the entire directory since it will be going through and removing the line many times a minute.
The script has to be run out of a different directory than the files are in. Also, would the modified date on the files change if the "<meta" string doesn't exist in the file?
There are a variety of problems with this part of the command:
ls -1t /the/directory | head -10 | grep -l "qualifying string" * ...
First, you appear to be trying to pipe the output of ls ... | head -10 into grep, which would cause grep to search for "qualifying string" in the output of ls. Except then you turn around and provide * as a command line argument to grep, causing it to search in all the files, and completely ignoring the ls and head commands.
You probably want to read about the xargs command, which reads a list of files on stdin and then runs a given command against that list. For example, you ought to be able to generate your file list like this:
ls -1t /the/directory | head -10 |
xargs grep -l "qualifying string"
And to apply sed to those files:
ls -1t /the/directory | head -10 |
xargs grep -l "qualifying string" |
xargs sed -i 's/something/else/g'
Modifying the files with sed will update the modification time on the files.
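Putting those pieces together for the concrete task in the question, a rough sketch might look like this (the directory and the two strings are the placeholders from the question, and it assumes file names without spaces, since ls and xargs split on whitespace):
# work inside the target directory so the bare names that ls prints resolve correctly
cd /the/directory &&
    ls -1t | head -10 |
    xargs grep -l "qualifying string" |
    xargs sed -i '/^<meta/d'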
You can use globbing with the * character to expand file names and loop through the directory.
n=0
for file in /the/directory/*; do
    if [ -f "$file" ]; then
        # delete the "<meta" lines only from files containing the qualifying string
        grep -q "qualifying string" "$file" && sed -i '/^<meta/d' "$file"
        n=$((n+1))
    fi
    [ $n -eq 10 ] && break
done
I found several posts like this one that explain how to find the latest file inside a folder.
My question goes one step further: how do I find the second latest file inside the same folder? The purpose is that I am looking for a way to diff the latest log with the previous one, so as to know what has changed. The logs are generated on a daily basis.
Building on the linked solutions, you can just make tail keep the last two files, and then pass the result through head to keep the first one of those:
ls -Art | tail -n 2 | head -n 1
To diff the two most recently modified files:
ls -t | head -n 2 | xargs diff
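If you prefer to capture the two names separately, for instance to keep the older log as the first argument to diff, a small sketch along the same lines (it assumes file names without spaces or newlines):
newest=$(ls -t | sed -n '1p')      # most recently modified file
previous=$(ls -t | sed -n '2p')    # second most recently modified file
diff "$previous" "$newest"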
Here's a stat-based solution (tested on Linux); it prints the name and modification time of the second most recently modified regular file:
for x in ./*; do
    if [[ -f "$x" ]]; then
        # print "name modification-time-in-epoch-seconds" for each regular file
        stat --printf="%n %Y\n" "$x"
    fi
done |
    sort -k2,2 -n -r |   # newest first
    sed -n '2{p;q}'      # print only the second line, then quit
ls -dt {{your file pattern}} | head -n 2 | tail -n 1
This will print the second latest file matching the pattern you give.
Here's a command that returns the second latest file in the folder:
ls -t | head -n 2 | tail -n 1
enjoy...!
So here is the task which I can't solve. I have a directory with .h files and a directory with .i files, which have the same names as the .h files. I want, just by typing a command, to get all the .h files which have no matching .i file. It's not a hard problem, and I could do it in some programming language, but I'm just curious what it would look like on the command line :). To be more specific, here is the algorithm:
1. get file names without extensions from ls *.h
2. get file names without extensions from ls *.i
3. compare them
4. print all names from step 1 that are not found in step 2
Good luck!
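As a rough sketch of those four steps, comm can compare the two sorted name lists directly and keep only the names unique to the first list (dir_h and dir_i are placeholder directory names; this is bash, since it relies on process substitution):
comm -23 \
    <(ls dir_h | sed 's/\.h$//' | sort) \
    <(ls dir_i | sed 's/\.i$//' | sort)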
diff \
<(ls dir.with.h | sed 's/\.h$//') \
<(ls dir.with.i | sed 's/\.i$//') \
| grep '^<' \
| cut -c3-
diff <(ls dir.with.h | sed 's/\.h$//') <(ls dir.with.i | sed 's/\.i$//') executes ls on the two directories, cuts off the extensions, and compares the two lists. Then grep '^<' finds the files that are only in the first listing, and cut -c3- cuts off the "< " characters that diff inserted.
# map each dir_h/NAME.h to the expected dir_i/NAME.i, try to ls those paths, and
# turn each "No such file or directory" error back into the corresponding dir_h/NAME
ls ./dir_h/*.h | sed -r -n 's:.*dir_h/([^.]*).h$:dir_i/\1.i:p' | xargs ls 2>&1 | \
    grep "No such file or directory" | awk '{print $4}' | sed -n -r 's:dir_i/([^:]*).*:dir_h/\1:p'
ls -1 dir1/*.hh dir2/*.ii | awk -F"/" '{print $NF}' | awk -F"." '{a[$1]++;b[$0]++}END{for(i in a)if(a[i]==1 && b[i".hh"]) print i}'
explanation:
ls -1 dir1/*.hh dir2/*.ii
The above lists all the *.hh and *.ii files in the two directories.
awk -F"/" '{print $NF}'
The above prints just the file name, stripping the directory part of the path.
awk -F"." '{a[$1]++;b[$0]++}END{for(i in a)if(a[i]==1 && b[i".hh"]) print i}'
The above creates two associative arrays: a, keyed by the file name without its extension, and b, keyed by the full file name.
If both the .hh and .ii files exist, the value in array a will be 2; if there is only one file, the value will be 1. So we need the array entries whose value is 1 and which correspond to a header file (.hh).
That check is done against the associative array b in the END block.
Assuming bash is your shell:
for file in $( ls dir_with_h/*.h ); do
    name=${file%\.h};         # trim trailing ".h" file extension
    name=${name#dir_with_h/}; # trim leading folder name
    if [ ! -e dir_with_i/${name}.i ]; then
        echo ${name};
    fi
done
Undoubtedly this can be ported to virtually all other shells. I find it less cryptic than some other approaches (although this is surely my problem), but it is a little wordy. As such, a shell script might help you remember it.
I'm having some rather unusual problems using grep in a bash script. Below is an example of the bash script code that I'm using that exhibits the behaviour:
UNIQ_SCAN_INIT_POINT=1
cat "$FILE_BASENAME_LIST" | uniq -d >> $UNIQ_LIST
sed '/^$/d' $UNIQ_LIST >> $UNIQ_LIST_FINAL
UNIQ_LINE_COUNT=`wc -l $UNIQ_LIST_FINAL | cut -d \ -f 1`
while [ -n "`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`" ]; do
    CURRENT_LINE=`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`
    CURRENT_DUPECHK_FILE=$FILE_DUPEMATCH-$CURRENT_LINE
    grep $CURRENT_LINE $FILE_LOCTN_LIST >> $CURRENT_DUPECHK_FILE
    MATCH=`grep -c $CURRENT_LINE $FILE_BASENAME_LIST`
    CMD_ECHO="$CURRENT_LINE matched $MATCH times," cmd_line_echo
    echo "$CURRENT_DUPECHK_FILE" >> $FILE_DUPEMATCH_FILELIST
    let UNIQ_SCAN_INIT_POINT=UNIQ_SCAN_INIT_POINT+1
done
On numerous occasions, when grepping for the current line in the file location list, it has put no output to the current dupechk file even though there have definitely been matches to the current line in the file location list (I ran the command in terminal with no issues).
I've rummaged around the internet to see if anyone else has had similar behaviour, and thus far all I have found is that it is something to do with buffered and unbuffered outputs from other commands operating before the grep command in the Bash script....
However no one seems to have found a solution, so basically I'm asking you guys if you have ever come across this, and any idea/tips/solutions to this problem...
Regards
Paul
The "problem" is the standard I/O library. When it is writing to a terminal it is unbuffered, but if it is writing to a pipe then it sets up buffering.
Try changing
CURRENT_LINE=`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`
to
CURRENT_LINE=`sed "$UNIQ_SCAN_INIT_POINT"'q;d' $UNIQ_LIST_FINAL`
Are there any directories with spaces in their names in $FILE_LOCTN_LIST? Because if there are, those spaces will need to be escaped or quoted somehow. Some combination of find and xargs can usually deal with that for you, especially xargs -0.
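A rough sketch of that find/xargs -0 idea (the path /some/location is a placeholder for wherever the files live): separating the names with NUL bytes keeps embedded spaces intact.
find /some/location -type f -print0 | xargs -0 grep -l "$CURRENT_LINE"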
A small bash script using md5sum and sort that detects duplicate files in the current directory:
CURRENT=""
md5sum * |
    sort |
    while read md5sum filename; do
        # identical checksums sort next to each other, so compare with the previous line's
        [[ $CURRENT == $md5sum ]] && echo "$filename is duplicate"
        CURRENT=$md5sum
    done
You tagged linux, so I assume you have tools like GNU find, md5sum, uniq, sort, etc. Here's a simple example to find duplicate files:
$ echo "hello world">file
$ md5sum file
6f5902ac237024bdd0c176cb93063dc4 file
$ cp file file1
$ md5sum file1
6f5902ac237024bdd0c176cb93063dc4 file1
$ echo "blah" > file2
$ md5sum file2
0d599f0ec05c3bda8c3b8a68c32a1b47 file2
$ find . -type f -exec md5sum "{}" \; |sort -n | uniq -w32 -D
6f5902ac237024bdd0c176cb93063dc4 ./file
6f5902ac237024bdd0c176cb93063dc4 ./file1