List the files I own in subversion - linux

So I have a bit of an issue. I work for a small startup (about 8 developers) and my boss recently decided that we need to put the owner of each file in the documentation. So I have been trying to write something using svn blame to loop through every PHP file and see which files have my username on more than 15 lines, but I haven't been able to get it quite right.
What I would really like is a one-liner (or simple bash script) that will list every file in a subversion repository and the username that last edited the majority of the lines. Any ideas?

Alright, this is what I came up with:
#!/bin/bash
set -e
for file in `svn ls -R`; do
    if [ -f "$file" ]; then
        # most frequent author among all blamed lines of this file
        owner=`svn blame "$file" | tr -s " " " " | cut -d" " -f3 | sort | uniq -c | sort -nr | head -1 | tr -s " " " " | cut -d" " -f3`
        if [ -n "$owner" ]; then
            echo "$file" "$owner"
        fi
    fi
done
It uses svn ls to determine each file in the repository, then for each file, svn blame output is examined:
tr -s " " " " squeezes multiple spaces into one space
cut -d" " -f3 gets the third space-delimited field, which is the username
sort sorts the output so all lines last edited by one user are together
uniq -c gets all unique lines and outputs the count of how many times each line appeared
sort -nr sorts numerically, in reverse order (so that the username that appeared most is sorted first)
head -1 returns the first line
tr -s " " " " | cut -d" " -f3 same as before: squeezes spaces and returns the third field, which is the username.
It'll take a while to run but at the end you'll have a list of <filename> <most prevalent author>
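For example, on a file whose lines were mostly last edited by one user, the middle of that pipeline might produce something like this (made-up sample output, with an invented file name and user names, purely to illustrate what each stage sees):
$ svn blame some/file.php | tr -s " " " " | cut -d" " -f3 | sort | uniq -c | sort -nr
     12 alice
      3 bob
head -1 then keeps the "12 alice" line, and the final tr/cut pair reduces it to just alice.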
Caveats:
Error checking is not done to make sure the script is called from within an SVN working copy
If called from deeper than the root of a WC, only files at that level and deeper will be considered
As mentioned in the comments, you might want to take revision date into account (if the majority of checkins happened 10 years ago, you might want to discount them when determining the owner); a rough sketch of this follows the list
Any working copy changes that aren't checked in won't be taken into account
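A rough, untested sketch of the revision-date idea from the caveat above: svn blame -v (verbose) also prints each line's revision date, so lines older than some cutoff could be dropped before counting (the 2015-01-01 below is an arbitrary example value):
# only count lines whose last edit is on or after the cutoff date (ISO dates compare as strings)
owner=$(svn blame -v "$file" \
    | awk -v cutoff="2015-01-01" '$3 >= cutoff { print $2 }' \
    | sort | uniq -c | sort -nr | head -1 | awk '{ print $2 }')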

for f in $(find . -name .svn -prune -o -type f -print); do
    echo "$f" $(svn blame "$f" | awk '{ print $2 }' | sort | uniq -c | sort -nr | head -n 1 | cut -f 1)
done

Related

Bash - if statement not automatically operating

I'm having a very odd problem with a bash script, written on an Ubuntu 17.04 machine.
I have a txt file that contains information about people in this fashion:
number name surname city state
With this information I have to create an organization system that works by state. For example, with a list like this
123 alan smith new_york NEW_YORK
123 bob smith buffalo NEW_YORK
123 charles smith los_angeles CALIFORNIA
123 dean smith mobile ALABAMA
the outcome at the end of the computation should be three new files named NEW_YORK, CALIFORNIA and ALABAMA that contain the people who live there.
The script takes the list of names as a parameter. I implemented an if statement (whose condition is a test of existence for the file, in case more than one person lives in a certain state) inside a for loop that, oddly, doesn't operate unless I press Enter while the program is running. The outcome is right, I get the files with the right people in them, but it baffles me that I have to press Enter to make the code work; it doesn't make sense to me.
Here's my code:
#!/bin/bash
clear
#finding how many file lines and adding 1 to use the value as a counter later
fileLines=`wc -l addresses | cut -f1 --delimiter=" "`
(( fileLines = fileLines+1 ))
for (( i=1; i<$fileLines; i++ ))
do
    #if the file named as the last column already exists do not create new one
    test -e `head -n$i | tail -n1 | cut -f5 --delimiter=" "`
    if [ $? = 0 ]
    then
        head -n$i $1 | tail -n1 >> `head -n$i $1 | tail -n1 | cut -f5 --delimiter=" "`
    else
        head -n$i $1 | tail -n1 > `head -n$i $1 | tail -n1 | cut -f5 --delimiter=" "`
    fi
done
echo "cancel created files? y/n"
read key
if [ $key = y ]
then
    rm `ls | grep [A-Z]$`
    echo "done"
    read
else
    echo "done"
    read
fi
clear
What am I doing wrong here? And moreover, why doesn't it tell me that something's wrong (clearly there is)?
The immediate problem (pointed out by @that other guy) is that in the line:
test -e `head -n$i | tail -n1 | cut -f5 --delimiter=" "`
The head command isn't given a filename to read from, so it's reading from stdin (i.e. you). But I'd change the whole script drastically, because you're doing it in a very inefficient way. If you have, say, a 1000-line file, you run head to read the first line (3 times, actually), then the first two lines (three times), then the first three... by the time you're done, head has read the first line of the file 3000 times, and then tail has discarded it 2997 of those times. You only really needed to read it once.
When iterating through a file like this, you're much better off just reading the file line-by-line, with something like this:
while read -r line; do
    # process $line here
done <"$1"
But in this case, there's an even better tool. awk is really good at processing files like this, and it can handle the task really simply:
awk '{ if($5!="") { print $0 >>$5 }}' "$1"
(Note: I also put in the if to make sure there is a fifth field/ignore blank lines. Without that check it would've just been awk '{ print $0 >>$5 }' "$1").
Also, the command:
rm `ls | grep [A-Z]$`
...is a really weird and fragile way to do this. Parsing the output of ls is generally a bad idea, and again there's a much simpler way to do it:
rm *[A-Z]
Finally, I recommend running your script through shellcheck.net, since it'll point out some other problems (e.g. unquoted variable references).

change name of file in nested folders

I have been trying to think of a way to rename files that are listed in nested folders and am having an issue resolving this matter. As a test I have been able to cut out the part of the name I would like to alter, but I can't think of how to put that into a variable and chain the name together. The file format looks like this:
XXX_XXXX_YYYYYYYYYY_100426151653-all.mp3
I have been testing this out to cut out the part I was looking to change, but I am not sure this would be the best way of doing it:
echo XXX_XXXX_YYYYYYYYYY_100426095135-all.mp3 | awk -F_ '{print $4}' | cut -c 1-6
I would like to change the 100426151653 to this 20100426-151653 format in the name.
I have tried to rename the file with this command using the format 's/ //g', but that did not work; I had to resort to rename ' ' '' filename to remove a blank space.
So the file would start as this
XXX_XXXX_YYYYYYYYYY_100426151653-all.mp3
and end like this
XXX_XXXX_YYYYYYYYYY_20100426-151653-all.mp3
How about using find and a bash function?
#!/bin/bash
modfn () {
    suffix=$2
    fn=$(basename "$1")
    path=$(dirname "$1")
    fld1=$(echo "$fn" | cut -d '_' -f1)
    fld2=$(echo "$fn" | cut -d '_' -f2)
    fld3=$(echo "$fn" | cut -d '_' -f3)
    fld4=$(echo "$fn" | cut -d '_' -f4)
    fld5=${fld4%$suffix}                  # strip the "-all.mp3" suffix
    l5=${#fld5}
    fld6=${fld5:0:$(($l5 - 6))}           # date digits (everything but the last six)
    fld7=${fld5:$(($l5 - 6)):6}           # time digits (the last six)
    newfn="${fld1}_${fld2}_${fld3}_20${fld6}-${fld7}${suffix}"
    echo "moving ${path}/${fn} to ${path}/${newfn}"
    mv "${path}/${fn}" "${path}/${newfn}"
}
export -f modfn
suffix="-all.mp3"
export suffix
find . -type f -name "*${suffix}" ! -name "*-*${suffix}" -exec bash -c 'modfn "$0" "$suffix"' {} \;
The above bash script uses find to search the current folder and its contents for files like WWW_XXXX_YYYYYYYYYY_AAAAAABBBBBB-all.mp3, yet excludes ones that are already renamed and look like WWW_XXXX_YYYYYYYYYY_20AAAAAA-BBBBBB-all.mp3.
W, X, Y, A, B can be any character other than underscore or dash.
All the found files are renamed.
NOTE: There are ways to shrink the above script but doing that makes the operation less obvious.
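For instance, a shorter (untested) sketch of the same rename, assuming the timestamp is always the twelve digits immediately before -all.mp3:
find . -type f -name "*-all.mp3" ! -name "*-*-all.mp3" -print0 |
while IFS= read -r -d '' f; do
    # split the 12-digit timestamp into date and time parts and insert the "20" century prefix
    if [[ $f =~ ^(.*_)([0-9]{6})([0-9]{6})(-all\.mp3)$ ]]; then
        mv "$f" "${BASH_REMATCH[1]}20${BASH_REMATCH[2]}-${BASH_REMATCH[3]}${BASH_REMATCH[4]}"
    fi
done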
This perl one-liner does the job:
find . -name "XXX_XXXX_YYYYYYYYYY_*-all.mp3" -printf '%P\n' 2>/dev/null | perl -nle '$o=$_; s/_[0-9]{6}/_20100426-/; $n=$_; rename($o,$n)if!-e$n'
Note: I only came up with the find command and the regex part. The credit for the perl one-liner goes to a perlmonks user at http://www.perlmonks.org/?node=823355

Output of wc -l without file-extension

I've got the following line:
wc -l ./*.txt | sort -rn
I want to cut the file extension. With this code I get the output:
number filename.txt
for all my .txt files in the current directory. But I want the output without the file extension, like this:
number filename
I tried a pipe with cut with different kinds of parameters, but all I managed was to cut off the whole filename, with this command:
wc -l ./*.txt | sort -rn | cut -f 1 -d '.'
Assuming you don't have newlines in your filename you can use sed to strip out ending .txt:
wc -l ./*.txt | sort -rn | sed 's/\.txt$//'
Unfortunately, cut doesn't have a syntax for extracting columns according to an index from the end. One (somewhat clunky) trick is to use rev to reverse the line, apply cut to it and then rev it back:
wc -l ./*.txt | sort -rn | rev | cut -d'.' -f2- | rev
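If you would rather not fix up the output after the fact, a small loop with bash parameter expansion can strip the extension before printing (a sketch; note that it omits the "total" line wc normally prints):
for f in ./*.txt; do
    printf '%s %s\n' "$(wc -l < "$f")" "${f%.*}"   # ${f%.*} drops everything after the last dot
done | sort -rn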
Using sed in a more generic way to cut off whatever extension the files have:
$ wc -l *.txt | sort -rn | sed 's/\.[^\.]*$//'
14 total
8 woc
3 456_base
3 123_base
0 empty_base
A better approach uses the proper MIME type instead of the extension (what is the extension of tar.gz or other such multi-part extensions?):
#!/bin/bash
for file; do
    case $(file -b "$file") in
        *ASCII*) echo "this is ascii" ;;
        *PDF*)   echo "this is pdf" ;;
        *)       echo "other cases" ;;
    esac
done
This is a proof of concept, not tested; feel free to adapt/improve/modify it.

appending to a tar file in a loop

I have a directory that has maybe 6 files.
team1_t444444_jill.csv
team1_t444444_jill.csv
team1_t444444_jill.csv
team1_t999999_jill.csv
team1_t999999_jill.csv
team1_t111111_jill.csv
team1_t111111_jill.csv
I want to be able to tar each of the files based on their t number, so t444444 should have its own tar file with all the corresponding csv files. t999999 should then have its own, and so on... a total of three tar files should be created dynamically.
for file in $bad_dir/*.csv; do
    fbname=`basename "$file" | cut -d. -f1` #takes the path off, only shows xxx_tyyyyy_zzz.csv
    t_name=$(echo "$fbname" | cut -d_ -f2) #takes the remaining stuff off, only shows tyyyyy

    #now i am stuck on how to create a tar file and send email
    taredFile = ??? #no idea how to implement
    (cat home/files/hello.txt; uuencode $taredFile $taredFile) | mail -s "Failed Files" $t_name@hotmail.com
The simplest edit of your script that should do what you want is likely something like this.
for file in $bad_dir/*.csv; do
    fbname=`basename "$file" | cut -d. -f1` #takes the path off, only shows xxx_tyyyyy_zzz.csv
    t_name=$(echo "$fbname" | cut -d_ -f2) #takes the remaining stuff off, only shows tyyyyy

    tarFile=$t_name-combined.tar
    if [ ! -f "$tarFile" ]; then
        tar -cf "$tarFile" *_${t_name}_*.csv
        { cat home/files/hello.txt; uuencode $tarFile $tarFile; } | mail -s "Failed Files" $t_name@hotmail.com
    fi
done
Use a tar file name based on the unique bit of the input file names. Then check for that file existing before creating it and sending email (protects against creating the file more than once and sending email more than once).
Use the fact that the files are globbable to get tar to archive them all from the first one we see.
You'll also notice that I replaced (commands) with { commands; } in the pipeline. The () force a sub-shell but so does the pipe itself so there's no reason (in this case) to force an extra sub-shell manually just for the grouping effect.
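One further detail worth noting (an observation on the sketch above, assuming the script is not run from inside $bad_dir): the glob handed to tar should include the directory, along the lines of:
tar -cf "$tarFile" "$bad_dir"/*_"${t_name}"_*.csv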
This is what you want:
for i in `find | cut -d. -f2 | cut -d_ -f1,2 | sort | uniq`; do
    tar -zvcf $i.tgz $i*
    # mail the $i.tgz file
done
Take a look at my run:
$ for i in `find | cut -d. -f2 | cut -d_ -f1,2 | sort | uniq`; do tar -zvcf $i.tgz $i*; done
team1_t111111_jill.csv
team1_t111111_jxx.csv
team1_t111111.tgz
team1_t444444_j123.csv
team1_t444444_j444.csv
team1_t444444_jill.csv
team1_t444444.tgz
team1_t999999_jill.csv
team1_t999999_jilx.csv
team1_t999999.tgz
ubuntu@ubuntu1504:/tmp/foo$ ls
team1_t111111_jill.csv team1_t111111.tgz team1_t444444_j444.csv team1_t444444.tgz team1_t999999_jilx.csv
team1_t111111_jxx.csv team1_t444444_j123.csv team1_t444444_jill.csv team1_t999999_jill.csv team1_t999999.tgz

Why don't bc and xargs work together in one line?

I need help using xargs(1) and bc(1) in the same line. I can do it in multiple lines, but I really want to find a solution in one line.
Here is the problem: the following line will print the size of file.txt
ls -l file.txt | cut -d" " -f5
And, the following line will print 1450 (which is obviously 1500 - 50)
echo '1500-50' | bc
Trying to add those two together, I do this:
ls -l file.txt | cut -d" " -f5 | xargs -0 -I {} echo '{}-50' | bc
The problem is, it's not working! :)
I know that xargs is probably not the right command to use, but it's the only command I can find that lets me decide where to put the argument I get from the pipe.
This is not the first time I've had issues with this kind of problem, so any help would be much appreciated.
Thanks
If you do
ls -l file.txt | cut -d" " -f5 | xargs -0 -I {} echo '{}-50'
you will see this output:
23
-50
This means that bc does not see a complete expression.
Just use -n 1 instead of -0:
ls -l file.txt | cut -d" " -f5 | xargs -n 1 -I {} echo '{}-50'
and you get
23-50
which bc will process happily:
ls -l file.txt | cut -d" " -f5 | xargs -n 1 -I {} echo '{}-50' | bc
-27
So your basic problem is that -0 expects \0-terminated strings, not lines. Hence the newline(s) produced by the previous commands in the pipe garble the expression bc sees.
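You can make that stray newline visible by dumping the bytes the pipe actually carries (using the same example file, whose size is 23 bytes):
$ ls -l file.txt | cut -d" " -f5 | od -c
0000000   2   3  \n
0000003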
This might work for you:
ls -l file.txt | cut -d" " -f5 | sed 's/.*/&-50/' | bc
In fact, you could remove the cut:
ls -l file.txt | sed -r 's/^(\S+\s+){4}(\S+).*/\2-50/' | bc
Or use awk:
ls -l file.txt | awk '{print $5-50}'
Parsing the output of the ls command is not the best idea (really).
You can use many other solutions, like:
find . -name file.txt -printf "%s\n"
or
stat -c %s file.txt
or
wc -c <file.txt
and you can use bash arithmetic to avoid unnecessary slow process forks, like:
find . -type f -print0 | while IFS= read -r -d '' name
do
    size=$(wc -c <"$name")
    s50=$(( $size - 50 ))
    echo "the file=$name= size:$size minus 50 is: $s50"
done
Here is another solution, which only uses one external command, stat:
file_size=$(stat -c "%s" file.txt) # Get the file size
let file_size=file_size-50 # Subtract 50
If you really want to combine them into one line:
let file_size=$(stat -c "%s" file.txt)-50
The stat command gets you the file size in bytes. The syntax above is for Linux (I tested against Ubuntu). On the Mac the syntax is a little different:
let file_size=$(stat -f "%z" mini.csv)-50
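For completeness, plain arithmetic expansion gives the same result without let (Linux stat syntax again):
file_size=$(( $(stat -c "%s" file.txt) - 50 ))
echo "$file_size"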
