Python: skip groups of output lines that are missing any of the expected starting strings - python-3.x

I am trying to write code, with help from Google and SO, to parse the output of a command, but I am still running into a problem. I expect the output to consist of consecutive groups of three lines starting with dn, instance and tag, but the very first group contains only dn and tag. I want to skip any group that does not have all three of these starting strings. I am still learning, so I cannot work out how to do that.
Below is my code:
import subprocess as sp
p = sp.Popen(somecmd, shell=True, stdout=sp.PIPE)
stout = p.stdout.read().decode('utf8')
output = stout.splitlines()
startline = ["instance:", "tag"]
for line in output:
    print(line)
Script output:
dn: ou=People,ou=pti,o=pt
tag: pti00631
dn: cn=pti00857,ou=People,ou=pti,o=pt
instance: Jassu Lal
tag: pti00857
dn: cn=pti00861,ou=People,ou=pti,o=pt
instance: Gatti Lal
tag: pti00861
Desired output:
dn: cn=pti00857,ou=People,ou=pti,o=pt
instance: Jassu Lal
tag: pti00857
dn: cn=pti00861,ou=People,ou=pti,o=pt
instance: Gatti Lal
tag: pti00861

Assuming your output always has the same shape, your loop can look like this:
lines_to_skip = 3
skip_lines = False
skipped_lines = 0
# output is the list of lines from the question, so iterate over it directly
for line in output:
    # a "dn:" line without "cn" marks the start of a group to skip
    if "dn: " in line and "dn: cn" not in line:
        skip_lines = True
    if skip_lines:
        if skipped_lines < lines_to_skip:
            skipped_lines += 1
            continue
        if skipped_lines == lines_to_skip:
            skip_lines = False
            skipped_lines = 0
    print(line)
It checks whether there is a dn without the cn, counts to 3 (or rather to lines_to_skip), and starts printing again once it has skipped that many lines.
It's a pretty hacky solution, but it is the best one I could come up with for the given context.

The code below is flexible. You only need to list in the necessary_tags dictionary the tags that a group must contain in order to be printed; there can be more than three. It also handles a tag that appears more than once in a group. Note that it relies on groups being separated by an empty line.
import subprocess as sp
p = sp.Popen(somecmd, shell=True, stdout=sp.PIPE)
stout = p.stdout.read().decode('utf8')
output = stout.splitlines()
output.append("")
necessary_tags = {'dn':0, 'instance':0, 'tag':0}
temp_output = []
for line in (output):
    tag = line.split(':')[0].strip()
    if necessary_tags.get(tag, -1) != -1:
        necessary_tags[tag] += 1
        temp_output.append(line)
    elif line == "":
        if all(necessary_tags.values()):
            for out in temp_output:
                print(out)
        temp_output = []
        necessary_tags.update({}.fromkeys(necessary_tags,0))
        print()
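Both answers above depend on how the groups are laid out (a fixed number of lines, or blank-line separators). As an alternative sketch, and only under the assumption that every group starts with a "dn:" line as in the sample output, the lines can be grouped per "dn:" and a group kept only when all required keys are present. The names complete_records and _has_all are made up for this illustration.

```python
def _has_all(record, required):
    # Collect the key in front of the first ":" of every line in the record.
    keys = {line.split(":", 1)[0].strip() for line in record}
    return all(key in keys for key in required)


def complete_records(lines, required=("dn", "instance", "tag")):
    """Yield each record (a list of lines) that contains every required key.

    Assumption: a new record starts at each line beginning with "dn:".
    """
    record = []
    for line in lines:
        if line.startswith("dn:") and record:
            # A new "dn:" line closes the previous record.
            if _has_all(record, required):
                yield record
            record = []
        if line.strip():  # ignore blank separator lines, if any
            record.append(line)
    # Do not forget the last record.
    if record and _has_all(record, required):
        yield record


sample = """dn: ou=People,ou=pti,o=pt
tag: pti00631
dn: cn=pti00857,ou=People,ou=pti,o=pt
instance: Jassu Lal
tag: pti00857
dn: cn=pti00861,ou=People,ou=pti,o=pt
instance: Gatti Lal
tag: pti00861""".splitlines()

kept = [line for record in complete_records(sample) for line in record]
print("\n".join(kept))
```

With the sample from the question this prints the two complete groups and drops the first one, which has no instance line.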

Related

python3 : append string to every list

I am using Python 3 with GitPython and generating the result shown below:
0bf35c4cf243e0fe13adbe7aeba99a03ddf6acfd refs/release/17.xp.0.95/head
d0c5f748e65488ce2e90c1ed027c2da252a5c6a2 refs/release/17.xp.0.96/head
530bdbf8f06859d8aca55cee7b57e27e68e87a94 refs/release/17.xp.0.97/head
0dd0342466540bc38e26ef74af6c8837d165cae5 refs/release/17.xp.0.98/head
919b78fb737b00830a8e48353b0f977c442600dd refs/release/17.xp.0.99/head
But I want to append the string "acme" after every line, for example:
0bf35c4cf243e0fe13adbe7aeba99a03ddf6acfd refs/release/17.xp.0.95/head
acme
d0c5f748e65488ce2e90c1ed027c2da252a5c6a2 refs/release/17.xp.0.96/head
acme
530bdbf8f06859d8aca55cee7b57e27e68e87a94 refs/release/17.xp.0.97/head
acme
0dd0342466540bc38e26ef74af6c8837d165cae5 refs/release/17.xp.0.98/head
acme
919b78fb737b00830a8e48353b0f977c442600dd refs/release/17.xp.0.99/head
acme
Below is the code I am using. Please advise how to append/concatenate the string to the end of every line.
import os,re,sys,argparse
import git
if len(sys.argv) < 2:
    print('Usage : --track <track name> without "track/" ')
    sys.exit()
input_track = sys.argv[1].strip()
print ("Checking for the track name - track/",input_track)
def show_ref(input_track,gitname):
    url = "git#github/"+gitname+".git"
    g = git.cmd.Git()
    ig1 = g.ls_remote(url,"refs/heads/track/"+input_track).split('\d')
    print ("Branch for glide-test:\n",'\n'.join(ig1))
    for x in range(13,20):
        ig6 = g.ls_remote(url,"refs/release/"+str(x)+"."+input_track+".*/head").split('|')
        print ('\n'.join(ig6))
        #"\n".join(map(lambda word: word+"x", s.split("\n")))
show_ref(input_track,"acme")
You can simply modify this line
ig6 = g.ls_remote(url,"refs/release/"+str(x)+"."+input_track+".*/head").split('|')
By adding the string "acme" to the string you build.
Like this
ig6 = g.ls_remote(url,"refs/release/"+str(x)+"."+input_track+".*/head acme").split('|')
Is that what you meant?
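If the goal is the layout shown in the question (the name on its own line after every ref), another option is to post-process the returned text instead of changing the ref pattern. The sketch below assumes that ls_remote returns its output as one newline-separated string; append_to_each_line is a hypothetical helper name, and the sample string stands in for the real command output.

```python
def append_to_each_line(text, suffix):
    """Return text with suffix placed on its own line after every non-empty line."""
    out = []
    for line in text.splitlines():
        if line.strip():
            out.append(line)
            out.append(suffix)
    return "\n".join(out)


# Stand-in for the string returned by the ls-remote call.
sample = (
    "0bf35c4cf243e0fe13adbe7aeba99a03ddf6acfd\trefs/release/17.xp.0.95/head\n"
    "d0c5f748e65488ce2e90c1ed027c2da252a5c6a2\trefs/release/17.xp.0.96/head"
)
result = append_to_each_line(sample, "acme")
print(result)
```

To put the name on the same line instead, replace the two append calls with a single out.append(line + " " + suffix).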

I need to get the text from a file and put it into another empty file

This is the code I have written so far. I need help solving this problem: I am trying to get the text from one file and put it in another file, but I have not found a way to do this correctly.
When I run this code, the output file is empty.
f = open('teste.txt','r')
texto = f.readlines()
x = 0
while x < len(texto):
    if texto[x] == "\n":
        local = texto.index(texto[x])
        texto.pop(local)
    else:
        texto[x] = texto[x].split(',')
        x += 1
# print(texto[1])
texto1 = open('gravando.txt','r+')
# texto1.write(texto[1,5,6,7,8])
(texto1.write(line) for line in (texto[i] for i in [1,5,6,7,8]))
print('O conteudo do texto1 e ', texto1.readlines())
This is the text in file teste.txt
Name: compute-resources
Namespace: voting-application
Scopes: NotTerminating
* Matches all pods that do not have an active deadline. These pods usually include long running pods whose container command is not expected to terminate.
Resource Used Hard
-------- ---- ----
limits.cpu 4 4
limits.memory 2Gi 2Gi
And this is the result i expected
Namespace: voting-application
Resource Used Hard
-------- ---- ----
limits.cpu 4 4
limits.memory 2Gi 2Gi
As far as I understand, you are trying to copy text from one file to another while omitting some lines, such as empty lines, and you only want to write certain lines, like 1, 5, 6.
Below is a corrected version of the part that writes to the other file. The lines are no longer split into lists, and a list comprehension is used so that the writes actually run:
f = open('teste.txt','r')
texto = f.readlines()
x = 0
while x < len(texto):
    if texto[x] == "\n":
        local = texto.index(texto[x])
        texto.pop(local)
    else:
        x += 1
# print(texto[1])
texto1 = open('gravando.txt','r+')
[texto1.write(texto[i]) for i in [1,5,6,7,8]]
Or you can do it like this:
for i in [1,5,6,7,8]:
    texto1.write(texto[i])
I would suggest performing file handling with a context manager, so the file is closed (and its contents flushed) automatically:
with open("gravando.txt", "w") as f:
    [f.write(texto[i]) for i in [1,5,6,7,8]]
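For context on why the original attempt left the file empty: a parenthesised expression such as (texto1.write(line) for line in ...) only builds a generator, and nothing is written until something iterates over it. The sketch below puts the whole task into one function using context managers. The function name copy_selected_lines, the temporary directory and the shortened sample content are assumptions made for the demonstration.

```python
import os
import tempfile


def copy_selected_lines(src_path, dst_path, wanted_indices):
    """Copy the non-blank lines at the given indices from src_path to dst_path."""
    with open(src_path, "r") as src:
        # Drop blank lines first, so indices refer to the remaining lines.
        lines = [line for line in src if line.strip()]
    selected = [lines[i] for i in wanted_indices if i < len(lines)]
    with open(dst_path, "w") as dst:
        for line in selected:
            dst.write(line)
    return selected


# Demonstration with a shortened stand-in for teste.txt.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "teste.txt")
dst_path = os.path.join(workdir, "gravando.txt")
with open(src_path, "w") as f:
    f.write("Name: compute-resources\n\n"
            "Namespace: voting-application\n"
            "Scopes: NotTerminating\n")

copy_selected_lines(src_path, dst_path, [1, 2])
with open(dst_path) as f:
    copied = f.read()
print(copied)
```

Opening the destination with "w" creates the file if it does not exist, whereas "r+" requires it to exist already.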

How to handle blank line,junk line and \n while converting an input file to csv file

Below is the sample data in the input file. I need to process this file and turn it into a CSV file. With some help I was able to convert it, but not completely, because I cannot handle the \n inside a value, the junk line (2nd line) and the blank line (4th line). I also need help filtering on transaction_type, i.e. skipping records whose transaction_type is "rewrite".
{"transaction_type": "new", "policynum": 4994949}
44uu094u4
{"transaction_type": "renewal", "policynum": 3848848,"reason": "Impressed with \n the Service"}
{"transaction_type": "cancel", "policynum": 49494949, "cancel_table":[{"cancel_cd": "AU"}, {"cancel_cd": "AA"}]}
{"transaction_type": "rewrite", "policynum": 5634549}
Below is the code
import ast
import csv
with open('test_policy', 'r') as in_f, open('test_policy.csv', 'w') as out_f:
    data = in_f.readlines()
    writer = csv.DictWriter(
        out_f,
        fieldnames=[
            'transaction_type', 'policynum', 'cancel_cd','reason'],lineterminator='\n',
        extrasaction='ignore')
    writer.writeheader()
    for row in data:
        dict_row = ast.literal_eval(row)
        if 'cancel_table' in dict_row:
            cancel_table = dict_row['cancel_table']
            cancel_cd= []
            for cancel_row in cancel_table:
                cancel_cd.append(cancel_row['cancel_cd'])
            dict_row['cancel_cd'] = ','.join(cancel_cd)
        writer.writerow(dict_row)
Below is my output not considering the junk line,blank line and transaction type "rewrite".
transaction_type,policynum,cancel_cd,reason
new,4994949,,
renewal,3848848,,"Impressed with
the Service"
cancel,49494949,"AU,AA",
Expected output
transaction_type,policynum,cancel_cd,reason
new,4994949,,
renewal,3848848,,"Impressed with the Service"
cancel,49494949,"AU,AA",
Hmm, I tried to fix them, but I do not know how CSV files work. With my limited knowledge, I would suggest running this code on each record before converting the file.
txt = {"transaction_type": "renewal",
       "policynum": 3848848,
       "reason": "Impressed with \n the Service"}
newTxt = {}
for i,j in txt.items():
    # local variables (temporary)
    lastX = ""
    correctJ = ""
    # check whether J contains the whitespace character "\n" and remove it
    if "\n" in f"b'{j}'":
        j = j.replace("\n", "")
    # for grammar purposes, check whether
    # J has at least one space
    if " " in str(j):
        # if yes, check it more closely (character by character)
        for x in ([j[y:y+1] for y in range(0, len(j), 1)]):
            # if 2 spaces are consecutive, skip the second one
            if x == " " and lastX == " ":
                pass
            # if not, append the character to correctJ
            else:
                correctJ += x
            # remember the last character checked
            lastX = x
        # at the end, make J the corrected value
        j = correctJ
    # add the corrected value to a new dictionary
    newTxt[i]=j
# show the result
print(f"txt = {txt}\nnewTxt = {newTxt}")
Terminal output:
txt = {'transaction_type': 'renewal', 'policynum': 3848848, 'reason': 'Impressed with \n the Service'}
newTxt = {'transaction_type': 'renewal', 'policynum': 3848848, 'reason': 'Impressed with the Service'}
Process finished with exit code 0
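The remaining points from the question (junk line, blank line, skipping "rewrite") can be handled in the same pass. The sketch below is one possible approach, not the asker's or answerer's code: it swaps ast.literal_eval for json.loads, on the assumption that every valid line is a JSON object, and it collapses runs of whitespace (including the newline) with split and join. FIELDS and rows_from_lines are names chosen for this example, and the output is written to an in-memory buffer rather than to test_policy.csv.

```python
import csv
import io
import json

FIELDS = ["transaction_type", "policynum", "cancel_cd", "reason"]


def rows_from_lines(lines, skip_types=("rewrite",)):
    """Yield cleaned dict rows, skipping blank, junk and unwanted records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank line
        try:
            record = json.loads(line)
        except ValueError:
            continue  # junk line that is not valid JSON
        if not isinstance(record, dict):
            continue  # valid JSON, but not a record
        if record.get("transaction_type") in skip_types:
            continue
        if "cancel_table" in record:
            record["cancel_cd"] = ",".join(
                entry["cancel_cd"] for entry in record["cancel_table"])
        if isinstance(record.get("reason"), str):
            # Collapse newlines and repeated spaces into single spaces.
            record["reason"] = " ".join(record["reason"].split())
        yield record


sample = [
    '{"transaction_type": "new", "policynum": 4994949}',
    '44uu094u4',
    r'{"transaction_type": "renewal", "policynum": 3848848,"reason": "Impressed with \n the Service"}',
    '',
    '{"transaction_type": "cancel", "policynum": 49494949, "cancel_table":[{"cancel_cd": "AU"}, {"cancel_cd": "AA"}]}',
    '{"transaction_type": "rewrite", "policynum": 5634549}',
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                        extrasaction="ignore", lineterminator="\n")
writer.writeheader()
for row in rows_from_lines(sample):
    writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

With the default quoting, the csv module only quotes fields that need it, so the cleaned reason is written without quotes while "AU,AA" keeps them.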

How to convert cmudict-0.7b or cmudict-0.7b.dict in to FST format to use it with phonetisaurus?

I am looking for a simple procedure to generate FST (finite state transducer) from cmudict-0.7b or cmudict-0.7b.dict, which will be used with phonetisaurus.
I tried the following set of commands (phonetisaurus aligner, Google NGram library and phonetisaurus arpa2wfst) and was able to generate an FST, but it did not work. I am not sure where I made a mistake or missed a step. I suspect the very first command, phonetisaurus-align, is not correct.
phonetisaurus-align --input=cmudict.dict --ofile=cmudict/cmudict.corpus --seq1_del=false
ngramsymbols < cmudict/cmudict.corpus > cmudict/cmudict.syms
/usr/local/bin/farcompilestrings --symbols=cmudict/cmudict.syms --keep_symbols=1 cmudict/cmudict.corpus > cmudict/cmudict.far
ngramcount --order=8 cmudict/cmudict.far > cmudict/cmudict.cnts
ngrammake --v=2 --bins=3 --method=kneser_ney cmudict/cmudict.cnts > cmudict/cmudict.mod
ngramprint --ARPA cmudict/cmudict.mod > cmudict/cmudict.arpa
phonetisaurus-arpa2wfst-omega --lm=cmudict/cmudict.arpa > cmudict/cmudict.fst
I tried fst with phonetisaurus-g2p as follows:
phonetisaurus-g2p --model=cmudict/cmudict.fst --nbest=3 --input=HELLO --words
But it didn't return anything....
Appreciate any help on this matter.
It is very important to keep the dictionary in the right format. Phonetisaurus is very sensitive about that: it requires the word and its phonemes to be separated by a tab; spaces will not work. It also does not allow the pronunciation variant numbers that CMUSphinx uses, like (2) or (3). You need to clean up the dictionary, for example with a simple Python script, before feeding it into phonetisaurus. Here is the one I use:
#!/usr/bin/python
import sys
if len(sys.argv) != 3:
    print("Split the list on train and test sets")
    print()
    print("Usage: traintest.py file split_count")
    exit()
infile = open(sys.argv[1], "r")
outtrain = open(sys.argv[1] + ".train", "w")
outtest = open(sys.argv[1] + ".test", "w")
cnt = 0
split_count = int(sys.argv[2])
for line in infile:
    items = line.split()
    # strip a variant marker such as "(2)" from the word
    if items[0][-1] == ')':
        items[0] = items[0][:-3]
    # skip words that contain an underscore
    if items[0].find("_") > 0:
        continue
    # word, a tab, then the phonemes separated by single spaces
    line = items[0] + '\t' + " ".join(items[1:]) + '\n'
    if cnt % split_count == 3:
        outtest.write(line)
    else:
        outtrain.write(line)
    cnt = cnt + 1
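The clean-up step on its own, without the train/test split, can be sketched as a small function. This is an illustration under assumptions: normalize_entry is a made-up name, the sample entries are only illustrative, and lines starting with ";;;" are treated as comments to be dropped.

```python
import re

# Matches a trailing pronunciation-variant marker such as "(2)".
VARIANT_SUFFIX = re.compile(r"\(\d+\)$")


def normalize_entry(line):
    """Return 'WORD<TAB>phonemes' for a dictionary line, or None to skip it."""
    if line.startswith(";;;"):
        return None  # comment line
    items = line.split()
    if len(items) < 2:
        return None  # blank line or word without phonemes
    word = VARIANT_SUFFIX.sub("", items[0])
    return word + "\t" + " ".join(items[1:])


# Illustrative input lines, not copied from the real dictionary.
entries = [
    ";;; comment",
    "HELLO  HH AH0 L OW1",
    "HELLO(2)  HH EH0 L OW1",
    "",
]
cleaned = [e for e in (normalize_entry(line) for line in entries) if e]
print("\n".join(cleaned))
```

Unlike the slice [:-3] in the script above, the regular expression also copes with variant numbers of more than one digit.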

Lines of code you have written [closed]

Out of curiosity, is there any way to get the number of lines of code you have written (in a specific project)?
I tried perforce with p4 describe #CLN | wc -l, but apart from so many edge cases (comments being included, new lines being added etc.), it skips the newly added files as well. Edge cases can be ignored, if we try to display physical line of code but newly added files still cause the issue.
I went ahead and wrote a Python script that prints out the number of lines of code added/changed by a user and the average number of lines per change.
Tested on Windows with Python 2.7.2. You can run from the command line - it assumes you have p4 in your path.
Usage: codestats.py -u [username]
It works with git too: codestats.py -u [authorname] -g.
It does some blacklisting to prune out bulk adds (e.g. you just added a library), and also imposes a blacklist on certain types of files (e.g. .HTML files, etc.). Otherwise, it works pretty well.
Hope this helps!
########################################################################
# Script that computes the lines of code stats for a perforce/git user.
########################################################################
import argparse
import logging
import subprocess
import sys
import re
VALID_ARGUMENTS = [
("user", "-u", "--user", "Run lines of code computation for the specified user.", 1),
("change", "-c", "--change", "Just display lines of code in the passed in change (useful for debugging).", 1),
("git", "-g", "--git", "Use git rather than perforce (which is the default versioning system queried).", 0)
]
class PrintHelpOnErrorArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        logging.error("error: {0}\n\n".format(message))
        self.print_help()
        sys.exit(2)
def is_code_file(depot_path):
    fstat_output = subprocess.Popen(['p4', 'fstat', depot_path], stdout=subprocess.PIPE).communicate()[0].split('\n')
    text_file = False
    head_type_regex = re.compile('^... headType (\S+)\s*$')
    for line in fstat_output:
        head_type_line = head_type_regex.match(line)
        if head_type_line:
            head_type = head_type_line.group(1)
            text_file = (head_type.find('text') != -1)
    if text_file:
        blacklisted_file_types = ['html', 'css', 'twb', 'twbx', 'tbm', 'xml']
        for file_type in blacklisted_file_types:
            if re.match('^\/\/depot.*\.{}#\d+$'.format(file_type), depot_path):
                text_file = False
                break
    return text_file
def parse_args():
    parser = PrintHelpOnErrorArgumentParser()
    for arg_name, short_switch, long_switch, help, num_args in VALID_ARGUMENTS:
        if num_args != 0:
            parser.add_argument(
                short_switch,
                nargs=num_args,
                type=str,
                dest=arg_name)
        else:
            parser.add_argument(
                long_switch,
                short_switch,
                action="store_true",
                help=help,
                dest=arg_name)
    return parser.parse_args()
file_edited_regex = re.compile('^... .*?#\d+ edit\s*$')
file_deleted_regex = re.compile('^... .*?#\d+ delete\s*$')
file_integrated_regex = re.compile('^... .*?#\d+ integrate\s*$')
file_added_regex = re.compile('^... (.*?#\d+) add\s*$')
affected_files_regex = re.compile('^Affected files ...')
outliers = [] # Changes that seem as if they weren't hand coded and merit inspection
def num_lines_in_file(depot_path):
    lines = len(subprocess.Popen(['p4', 'print', depot_path], stdout=subprocess.PIPE).communicate()[0].split('\n'))
    return lines
def parse_change(changelist):
    change_description = subprocess.Popen(['p4', 'describe', '-ds', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    parsing_differences = False
    parsing_affected_files = False
    differences_regex = re.compile('^Differences \.\.\..*$')
    line_added_regex = re.compile('^add \d+ chunks (\d+) lines.*$')
    line_removed_regex = re.compile('^deleted \d+ chunks (\d+) lines.*$')
    line_changed_regex = re.compile('^changed \d+ chunks (\d+) / (\d+) lines.*$')
    file_diff_regex = re.compile('^==== (\/\/depot.*#\d+)\s*\S+$')
    skip_file = False
    num_lines_added = 0
    num_lines_deleted = 0
    num_lines_changed_added = 0
    num_lines_changed_deleted = 0
    num_files_added = 0
    num_files_edited = 0
    for line in change_description:
        if differences_regex.match(line):
            parsing_differences = True
        elif affected_files_regex.match(line):
            parsing_affected_files = True
        elif parsing_differences:
            if file_diff_regex.match(line):
                regex_match = file_diff_regex.match(line)
                skip_file = not is_code_file(regex_match.group(1))
            elif not skip_file:
                regex_match = line_added_regex.match(line)
                if regex_match:
                    num_lines_added += int(regex_match.group(1))
                else:
                    regex_match = line_removed_regex.match(line)
                    if regex_match:
                        num_lines_deleted += int(regex_match.group(1))
                    else:
                        regex_match = line_changed_regex.match(line)
                        if regex_match:
                            num_lines_changed_added += int(regex_match.group(2))
                            num_lines_changed_deleted += int(regex_match.group(1))
        elif parsing_affected_files:
            if file_added_regex.match(line):
                file_added_match = file_added_regex.match(line)
                depot_path = file_added_match.group(1)
                if is_code_file(depot_path):
                    lines_in_file = num_lines_in_file(depot_path)
                    if lines_in_file > 3000:
                        # Anomaly - probably a copy of existing code - discard this
                        lines_in_file = 0
                    num_lines_added += lines_in_file
                num_files_added += 1
            elif file_edited_regex.match(line):
                num_files_edited += 1
    return [num_files_added, num_files_edited, num_lines_added, num_lines_deleted, num_lines_changed_added, num_lines_changed_deleted]
def contains_integrates(changelist):
    change_description = subprocess.Popen(['p4', 'describe', '-s', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    contains_integrates = False
    parsing_affected_files = False
    for line in change_description:
        if affected_files_regex.match(line):
            parsing_affected_files = True
        elif parsing_affected_files:
            if file_integrated_regex.match(line):
                contains_integrates = True
                break
    return contains_integrates
#################################################
# Note: Keep this function in sync with
# generate_line.
#################################################
def generate_output_specifier(output_headers):
    output_specifier = ''
    for output_header in output_headers:
        output_specifier += '| {:'
        output_specifier += '{}'.format(len(output_header))
        output_specifier += '}'
    if output_specifier != '':
        output_specifier += ' |'
    return output_specifier
#################################################
# Note: Keep this function in sync with
# generate_output_specifier.
#################################################
def generate_line(output_headers):
    line = ''
    for output_header in output_headers:
        line += '--' # for the '| '
        header_padding_specifier = '{:-<'
        header_padding_specifier += '{}'.format(len(output_header))
        header_padding_specifier += '}'
        line += header_padding_specifier.format('')
    if line != '':
        line += '--' # for the last ' |'
    return line
# Returns true if a change is a bulk addition or a private change
def is_black_listed_change(user, changelist):
    large_add_change = False
    all_adds = True
    num_adds = 0
    is_private_change = False
    is_third_party_change = False
    change_description = subprocess.Popen(['p4', 'describe', '-s', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    for line in change_description:
        if file_edited_regex.match(line) or file_deleted_regex.match(line):
            all_adds = False
        elif file_added_regex.match(line):
            num_adds += 1
        if line.find('... //depot/private') != -1:
            is_private_change = True
            break
        if line.find('... //depot/third-party') != -1:
            is_third_party_change = True
            break
    large_add_change = all_adds and num_adds > 70
    #print "{}: {}".format(changelist, large_add_change or is_private_change)
    return large_add_change or is_third_party_change
change_header_regex = re.compile('^Change (\d+)\s*.*?\s*(\S+)#.*$')
def get_user_and_change_header_for_change(changelist):
    change_description = subprocess.Popen(['p4', 'describe', '-s', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    user = None
    change_header = None
    for line in change_description:
        change_header_match = change_header_regex.match(line)
        if change_header_match:
            user = change_header_match.group(2)
            change_header = line
            break
    return [user, change_header]
if __name__ == "__main__":
    log = logging.getLogger()
    log.setLevel(logging.DEBUG)
    args = parse_args()
    user_stats = {}
    user_stats['num_changes'] = 0
    user_stats['lines_added'] = 0
    user_stats['lines_deleted'] = 0
    user_stats['lines_changed_added'] = 0
    user_stats['lines_changed_removed'] = 0
    user_stats['total_lines'] = 0
    user_stats['files_edited'] = 0
    user_stats['files_added'] = 0
    change_log = []
    if args.git:
        git_log_command = ['git', 'log', '--author={}'.format(args.user[0]), '--pretty=tformat:', '--numstat']
        git_log_output = subprocess.Popen(git_log_command, stdout=subprocess.PIPE).communicate()[0].split('\n')
        git_log_line_regex = re.compile('^(\d+)\s*(\d+)\s*\S+$')
        total = 0
        adds = 0
        subs = 0
        for git_log_line in git_log_output:
            line_match = git_log_line_regex.match(git_log_line)
            if line_match:
                adds += int(line_match.group(1))
                subs += int(line_match.group(2))
        total = adds - subs
        num_commits = 0
        git_shortlog_command = ['git', 'shortlog', '--author={}'.format(args.user[0]), '-s']
        git_shortlog_output = subprocess.Popen(git_shortlog_command, stdout=subprocess.PIPE).communicate()[0].split('\n')
        git_shortlog_line_regex = re.compile('^\s*(\d+)\s+.*$')
        for git_shortlog_line in git_shortlog_output:
            line_match = git_shortlog_line_regex.match(git_shortlog_line)
            if line_match:
                num_commits += int(line_match.group(1))
        print "Git Stats for {}: Commits: {}. Lines of code: {}. Average Lines Per Change: {}.".format(args.user[0], num_commits, total, total*1.0/num_commits)
        sys.exit(0)
    elif args.change:
        [args.user, change_header] = get_user_and_change_header_for_change(args.change)
        change_log = [change_header]
    else:
        change_log = subprocess.Popen(['p4', 'changes', '-u', args.user, '-s', 'submitted'], stdout=subprocess.PIPE).communicate()[0].split('\n')
    output_headers = ['Current Change', 'Num Changes', 'Files Added', 'Files Edited']
    output_headers.append('Lines Added')
    output_headers.append('Lines Deleted')
    if not args.git:
        output_headers.append('Lines Changed (Added/Removed)')
    avg_change_size = 0.0
    output_headers.append('Total Lines')
    output_headers.append('Avg. Lines/Change')
    line = generate_line(output_headers)
    output_specifier = generate_output_specifier(output_headers)
    print line
    print output_specifier.format(*output_headers)
    print line
    output_specifier_with_carriage_return = output_specifier + '\r'
    for change in change_log:
        change_match = change_header_regex.search(change)
        if change_match:
            user_stats['num_changes'] += 1
            changelist = change_match.group(1)
            if not is_black_listed_change(args.user, changelist) and not contains_integrates(changelist):
                [files_added_in_change, files_edited_in_change, lines_added_in_change, lines_deleted_in_change, lines_changed_added_in_change, lines_changed_removed_in_change] = parse_change(change_match.group(1))
                if lines_added_in_change > 5000 and changelist not in outliers:
                    outliers.append([changelist, lines_added_in_change])
                else:
                    user_stats['lines_added'] += lines_added_in_change
                    user_stats['lines_deleted'] += lines_deleted_in_change
                    user_stats['lines_changed_added'] += lines_changed_added_in_change
                    user_stats['lines_changed_removed'] += lines_changed_removed_in_change
                    user_stats['total_lines'] += lines_changed_added_in_change
                    user_stats['total_lines'] -= lines_changed_removed_in_change
                    user_stats['total_lines'] += lines_added_in_change
                    user_stats['files_edited'] += files_edited_in_change
                    user_stats['files_added'] += files_added_in_change
            current_output = [changelist, user_stats['num_changes'], user_stats['files_added'], user_stats['files_edited']]
            current_output.append(user_stats['lines_added'])
            current_output.append(user_stats['lines_deleted'])
            if not args.git:
                current_output.append('{}/{}'.format(user_stats['lines_changed_added'], user_stats['lines_changed_removed']))
            current_output.append(user_stats['total_lines'])
            current_output.append(user_stats['total_lines']*1.0/user_stats['num_changes'])
            print output_specifier_with_carriage_return.format(*current_output),
    print
    print line
    if len(outliers) > 0:
        print "Outliers (changes that merit inspection - and have not been included in the stats):"
        outlier_headers = ['Changelist', 'Lines of Code']
        outlier_specifier = generate_output_specifier(outlier_headers)
        outlier_line = generate_line(outlier_headers)
        print outlier_line
        print outlier_specifier.format(*outlier_headers)
        print outlier_line
        for change in outliers:
            print outlier_specifier.format(*change)
        print outlier_line
The other answers seem to have missed the source-control history side of things.
From http://forums.perforce.com/index.php?/topic/359-how-many-lines-of-code-have-i-written/
Calculate the answer in multiple steps:
1) Added files:
p4 filelog ... | grep ' add on .* by <username>'
p4 print -q foo#1 | wc -l
2) Changed files:
p4 describe <changelist> | grep "^>" | wc -l
Combine all the counts together (scripting...), and you'll have a total.
You might also want to get rid of whitespace lines, or lines without alphanumeric chars, with a grep?
Also if you are doing it regularly, it would be more efficient to code the thing in P4Python and do it incrementally - keeping history and looking at only new commits.
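The filter suggested above (dropping whitespace-only lines and lines without alphanumeric characters) can also be done in Python once the text has been captured, for instance from p4 print -q. A minimal sketch; count_meaningful_lines is a name made up for this example and the sample text is invented.

```python
def count_meaningful_lines(text):
    """Count the lines that contain at least one letter or digit."""
    return sum(
        1 for line in text.splitlines()
        if any(ch.isalnum() for ch in line)
    )


sample = "int x = 1;\n\n}\n   \n// note\n"
meaningful = count_meaningful_lines(sample)
print(meaningful)
```

Here the empty line, the lone closing brace and the whitespace-only line are ignored, while the comment line still counts because it contains letters.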
Yes, there are many ways to count lines of code.
tl;dr Install the Eclipse Metrics Plugin. Below is a short script if you want to do it without Eclipse.
Shell script
I will present a fairly general approach. It works on Linux, but it is portable to other systems. Save these two lines to a file named lines.sh:
#!/bin/sh
find -name "*.java" | awk '{ system("wc "$0) }' | awk '{ print $1 "\t" $4; lines += $1; files++ } END { print "Total: " lines " lines in " files " files."}'
It's a shell script which uses find, wc and great awk. Add permission to execute:
chmod +x lines.sh
Now we can execute our shell script.
Let's say you saved lines.sh in /home/you/workspace/projectX.
Script counts lines in .java files, which are located in subdirectories of /home/you/workspace/projectX.
So let's run it with ./lines.sh. You can change *.java for any other types of files.
Sample output:
adam@adam ~/workspace/Checkers $ ./lines.sh
23 ./src/Checkers.java
14 ./src/event/StartGameEvent.java
38 ./src/event/YourColorEvent.java
52 ./src/event/BoardClickEvent.java
61 ./src/event/GameQueue.java
14 ./src/event/PlayerEscapeEvent.java
14 ./src/event/WaitEvent.java
16 ./src/event/GameEvent.java
38 ./src/event/EndGameEvent.java
38 ./src/event/FakeBoardEvent.java
127 ./src/controller/ServerThread.java
14 ./src/controller/ServerConfig.java
46 ./src/controller/Server.java
170 ./src/controller/Controller.java
141 ./src/controller/ServerNetwork.java
246 ./src/view/ClientNetwork.java
36 ./src/view/Messages.java
53 ./src/view/ButtonField.java
47 ./src/view/ViewConfig.java
32 ./src/view/MainWindow.java
455 ./src/view/View.java
36 ./src/view/ImageLoader.java
88 ./src/model/KingJump.java
130 ./src/model/Cords.java
70 ./src/model/King.java
77 ./src/model/FakeBoard.java
90 ./src/model/CheckerMove.java
53 ./src/model/PlayerColor.java
73 ./src/model/Checker.java
201 ./src/model/AbstractPiece.java
75 ./src/model/CheckerJump.java
154 ./src/model/Model.java
105 ./src/model/KingMove.java
99 ./src/model/FieldType.java
269 ./src/model/Board.java
56 ./src/model/AbstractJump.java
80 ./src/model/AbstractMove.java
82 ./src/model/BoardState.java
Total: 3413 lines in 38 files.
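On systems without find, wc and awk, roughly the same report can be produced in Python. This is a sketch rather than a drop-in replacement: count_lines is a hypothetical helper, the demonstration builds a tiny throwaway directory tree, and a final line without a trailing newline is counted here although wc would not count it.

```python
import os
import tempfile
from pathlib import Path


def count_lines(root, pattern="*.java"):
    """Map each matching file under root (relative path) to its line count."""
    counts = {}
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            with open(path, "rb") as handle:
                counts[path.relative_to(root).as_posix()] = sum(1 for _ in handle)
    return counts


# Throwaway project tree for the demonstration.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "src"))
with open(os.path.join(root, "src", "A.java"), "w") as f:
    f.write("class A {\n}\n")
with open(os.path.join(root, "notes.txt"), "w") as f:
    f.write("ignored\n")

counts = count_lines(root)
for name, lines in counts.items():
    print("%d\t%s" % (lines, name))
total = sum(counts.values())
print("Total: %d lines in %d files." % (total, len(counts)))
```

As with the shell version, changing the pattern argument selects other file types.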
Find an app to calculate the lines, there are many subtleties to counting lines - comments, blank lines, multiple operators per line etc.
Visual Studio has "Calculate Code Metrics" functionality, since you're not mentioning one single language I can't be more specific about which tool to use, just saying "find" and "grep" may not be the way to go.
Also consider the fact that lines of code don't measure actual progress. Completed features on your roadmap measures progress and the lower the lines of code - the better. It wouldn't be a first if a proud developer claims his 60,000 lines of code are marvelous only to find out there's a way to do the same thing in 1000 lines.
Have a look at SLOCCount. It only counts actual lines of code and performs some additional computations as well.
On OSX, you can easily install it via Homebrew with brew install sloccount.
Sample output for a project of mine:
$ sloccount .
Have a non-directory at the top, so creating directory top_dir
Adding /Users/padde/Desktop/project/./Gemfile to top_dir
Adding /Users/padde/Desktop/project/./Gemfile.lock to top_dir
Adding /Users/padde/Desktop/project/./Procfile to top_dir
Adding /Users/padde/Desktop/project/./README to top_dir
Adding /Users/padde/Desktop/project/./application.rb to top_dir
Creating filelist for config
Adding /Users/padde/Desktop/project/./config.ru to top_dir
Creating filelist for controllers
Creating filelist for db
Creating filelist for helpers
Creating filelist for models
Creating filelist for public
Creating filelist for tmp
Creating filelist for views
Categorizing files.
Finding a working MD5 command....
Found a working MD5 command.
Computing results.
SLOC    Directory       SLOC-by-Language (Sorted)
256     controllers     ruby=256
66      models          ruby=66
10      config          ruby=10
9       top_dir         ruby=9
5       helpers         ruby=5
0       db              (none)
0       public          (none)
0       tmp             (none)
0       views           (none)
Totals grouped by language (dominant language first):
ruby: 346 (100.00%)
Total Physical Source Lines of Code (SLOC) = 346
Development Effort Estimate, Person-Years (Person-Months) = 0.07 (0.79)
(Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 0.19 (2.28)
(Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule) = 0.34
Total Estimated Cost to Develop = $ 8,865
(average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
SLOCCount is Open Source Software/Free Software, licensed under the GNU GPL.
SLOCCount comes with ABSOLUTELY NO WARRANTY, and you are welcome to
redistribute it under certain conditions as specified by the GNU GPL license;
see the documentation for details.
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."
There is an easier way to do all this, which incidentally is faster than using grep:
First get all the changelists for a particular user. This is a command-line command; you can run it from a Python script by using os.system():
p4 changes -u <username> > 'some_text_file.txt'
Now you need to extract all the changelist numbers, so we will use a regex for it. Here it is done using Python:
import re
f = open('some_text_file.txt','r')
lists = f.readlines()
# a changelist number is assumed to have exactly seven digits
pattern = re.compile(r'\b[0-9][0-9][0-9][0-9][0-9][0-9][0-9]\b')
labels = []
for i in lists:
    labels.append(pattern.findall(i))
changelists = []
for h in labels:
    # findall returns a list; keep the first match and skip lines with none
    if h:
        changelists.append(str(h[0]))
Now you have all the changelist numbers in 'changelists'.
We will iterate through the list and, for every changelist, find the number of lines added and the number of lines deleted; the difference between the two gives the total number of lines added. The following lines of code do exactly that:
import os
for i in changelists:
    os.system('p4 describe -ds '+i+' | findstr "^add" >> added.txt')
    os.system('p4 describe -ds '+i+' | findstr "^del" >> deleted.txt')
added = []
deleted = []
file = open('added.txt')
for i in file:
    added.append(i)
count = []
count_added = 0
count_add = 0
count_del = 0
for j in added:
    # the second number on the line is the line count
    count = [int(s) for s in j.split() if s.isdigit()]
    count_add += count[1]
    count = []
file = open('deleted.txt')
for i in file:
    deleted.append(i)
# iterate over the deleted lines here, not over labels
for j in deleted:
    count = [int(s) for s in j.split() if s.isdigit()]
    count_del += count[1]
    count = []
count_added = count_add - count_del
print(count_added)
count_added will have number of lines that were added by the user.
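The temporary files and the Windows-only findstr can be avoided by parsing the summary lines directly in Python. The sketch below takes the three line formats from the regular expressions used in the longer script earlier in this thread ("add N chunks M lines", "deleted N chunks M lines", "changed N chunks A / B lines"); they have not been checked here against real p4 describe -ds output, and net_lines_added is a made-up name.

```python
import re

# Line formats as assumed by the earlier script in this thread.
ADDED = re.compile(r"^add \d+ chunks (\d+) lines")
DELETED = re.compile(r"^deleted \d+ chunks (\d+) lines")
CHANGED = re.compile(r"^changed \d+ chunks (\d+) / (\d+) lines")


def net_lines_added(summary_lines):
    """Return lines added minus lines removed across the summary lines."""
    added = 0
    removed = 0
    for line in summary_lines:
        match = ADDED.match(line)
        if match:
            added += int(match.group(1))
            continue
        match = DELETED.match(line)
        if match:
            removed += int(match.group(1))
            continue
        match = CHANGED.match(line)
        if match:
            # "changed" lines report old lines / new lines.
            removed += int(match.group(1))
            added += int(match.group(2))
    return added - removed


sample = [
    "==== //depot/project/file.c#2 (text) ====",
    "add 1 chunks 5 lines",
    "deleted 2 chunks 3 lines",
    "changed 1 chunks 4 / 6 lines",
]
net = net_lines_added(sample)
print(net)
```

In a real script, summary_lines would come from the captured output of p4 describe -ds for each changelist, for example via subprocess.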

Resources