Python script to move the oldest 1000 files into another directory - python-3.x

Here is my code, which reads the input from a config file, moves files to another directory based on a condition, and logs the information to a log file:
import shutil
import configparser
import logging.handlers
import os

# Reading the input configuration
config = configparser.ConfigParser()
config.read("config_input.ini")
src_filepath = config.get("Configuration Inputs", "src_filepath")
dst_filepath = config.get("Configuration Inputs", "dst_filepath")
log_file_name = config.get("Configuration Inputs", "log_file_name")
file_limit = int(config.get("Configuration Inputs", "file_limit"))

if not os.path.exists(dst_filepath):
    os.makedirs(dst_filepath)

onlyfiles_in_dst = next(os.walk(dst_filepath))[2]
file_count_indst = len(onlyfiles_in_dst)
onlyfiles_in_src = next(os.walk(src_filepath))[2]
file_count_insrc = len(onlyfiles_in_src)

def sorted_ls(src_filepath):
    mtime = lambda f: os.stat(os.path.join(src_filepath, f)).st_mtime
    return list(sorted(os.listdir(src_filepath), key=mtime))

move_list = sorted_ls(src_filepath)
# print(move_list)

if file_count_indst < file_limit:
    for mfile in move_list:
        shutil.move(os.path.join(src_filepath, mfile), dst_filepath)

# Logging everything
logger = logging.getLogger()
logging.basicConfig(filename=log_file_name, format='%(asctime)s %(message)s', filemode='a')
logger.setLevel(logging.INFO)
logger.info('Number of files moved from source: ' + str(len(move_list)))
But the problem is that I want to move only the oldest 1000 files from source to destination.
Something like
"ls -lrt | head -n 1000"
which I cannot do, as I am running this script on the Windows platform.
Please suggest a proper way to do it.
Also, please suggest how I can put it in a user-defined class, so that it can be reused in some other program.

Can't a simple counter be the solution?
if file_count_indst < file_limit:
    count = 0
    for mfile in move_list:
        shutil.move(os.path.join(src_filepath, mfile), dst_filepath)
        count = count + 1
        if count == 1000:
            break
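A counter works, but slicing the sorted list is tidier, and wrapping the logic in a class answers the reuse question too. Below is a minimal sketch; the class name OldestFileMover and its methods are this sketch's own invention, not part of the original script. It sorts by mtime the same way sorted_ls does, then moves at most limit files:

```python
import os
import shutil

# Hypothetical wrapper class (name and API are this sketch's own): sorts the
# source listing by mtime, oldest first, then moves at most `limit` files.
class OldestFileMover:
    def __init__(self, src, dst, limit=1000):
        self.src = src
        self.dst = dst
        self.limit = limit

    def sorted_by_mtime(self):
        # Same idea as sorted_ls in the question: oldest files first.
        mtime = lambda f: os.stat(os.path.join(self.src, f)).st_mtime
        return sorted(os.listdir(self.src), key=mtime)

    def move(self):
        os.makedirs(self.dst, exist_ok=True)
        # The slice replaces the manual counter: take at most `limit` names.
        to_move = self.sorted_by_mtime()[:self.limit]
        for name in to_move:
            shutil.move(os.path.join(self.src, name), self.dst)
        return len(to_move)
```

Another program can then import the class and call OldestFileMover(src, dst, 1000).move(); the slice makes both the counter and the break unnecessary.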

Related

Python Program error - The process cannot access the file because it is being used by another process

I am trying to test Python code which moves a file from a source path to a target path. The test is done using pytest in Python 3, but I am facing a roadblock: at the end of the test I try to remove the source and target paths using shutil.rmtree(path) or os.rmdir(path), and this causes the error "[WinError 32] The process cannot access the file because it is being used by another process". Please help me with this. Below is the pytest code:
import pytest
import os
import shutil
import tempfile
from sample_test_module import TestCondition

object_test_condition = TestCondition()

@pytest.mark.parametrize("test_value", ['0'])
def test_condition_pass(test_value):
    temp_dir = tempfile.mkdtemp()
    temp_src_folder = 'ABC_File'
    temp_src_dir = os.path.join(temp_dir, temp_src_folder)
    temp_file_name = 'Sample_Test.txt'
    temp_file_path = os.path.join(temp_src_dir, temp_file_name)
    os.chdir(temp_dir)
    os.mkdir(temp_src_folder)
    try:
        with open(temp_file_path, "w") as tmp:
            tmp.write("Hello-World\n")
            tmp.write("Hi-All\n")
    except IOError:
        print("Error has occurred, please check it.")
    org_val = object_test_condition.sample_test(temp_dir)
    print("Temp file path is : " + temp_file_path)
    print("Temp Dir is : " + temp_dir)
    shutil.rmtree(temp_dir)
    print("The respective dir path is now removed.")
    assert org_val == test_value
Upon execution of the code, the below error pops up:
[WinError32] The process cannot access the file because it is being used by another process : 'C:\Users\xyz\AppData\Local\Temp\tmptryggg56'
You are getting this error because the directory you are trying to remove is the current directory of the process. If you save the current directory before calling os.chdir (using os.getcwd()), and chdir back to that directory before removing temp_dir, it should work.
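The save-and-restore pattern can be shown in isolation like this (a minimal sketch; use_temp_dir is a hypothetical helper written for illustration, not part of the test above):

```python
import os
import shutil
import tempfile

def use_temp_dir():
    """Work inside a temp directory, then restore the old cwd before deleting it."""
    temp_dir = tempfile.mkdtemp()
    prev_dir = os.getcwd()   # remember where we were
    os.chdir(temp_dir)
    try:
        pass                 # ... do work inside temp_dir ...
    finally:
        os.chdir(prev_dir)   # leave the directory before removing it
        shutil.rmtree(temp_dir)
    return not os.path.exists(temp_dir)
```

The try/finally guarantees the chdir back happens even if the work raises, so rmtree never runs while temp_dir is still the process's current directory.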
Your code isn't correctly indented, so here is my best guess at what it should look like.
import pytest
import os
import shutil
import tempfile
from sample_test_module import TestCondition

object_test_condition = TestCondition()

@pytest.mark.parametrize("test_value", ['0'])
def test_condition_pass(test_value):
    temp_dir = tempfile.mkdtemp()
    temp_src_folder = 'ABC_File'
    temp_src_dir = os.path.join(temp_dir, temp_src_folder)
    temp_file_name = 'Sample_Test.txt'
    temp_file_path = os.path.join(temp_src_dir, temp_file_name)
    prev_dir = os.getcwd()
    os.chdir(temp_dir)
    os.mkdir(temp_src_folder)
    try:
        with open(temp_file_path, "w") as tmp:
            tmp.write("Hello-World\n")
            tmp.write("Hi-All\n")
    except IOError:
        print("Error has occurred, please check it.")
    org_val = object_test_condition.sample_test(temp_dir)
    print("Temp file path is : " + temp_file_path)
    print("Temp Dir is : " + temp_dir)
    os.chdir(prev_dir)
    shutil.rmtree(temp_dir)
    print("The respective dir path is now removed.")
    assert org_val == test_value
Can you also try closing the temp file before removing the directory?
temp.close()

Iterate through folder/sub-directories and move found regex files into new folder

I've got a folder/sub-directory structure as follows:
-main_folder
    -sub_1
        322.txt
        024.ops
    -sub_2
        977.txt
        004.txt
    -sub_3
        396.xml
        059.ops
I'm trying to iterate with os.walk through the folder and its sub-directories and collect the file names inside them. When a name is matched by a regex rule, I want to either store the path in a list or directly move that file into a new folder (created with mkdir).
I've already got the regex rules to find the documents I want.
For example:
find_000_099 = r'\b(0\d{2}.\w{1,4})'
find_300_399 = r'\b(3\d{2}.\w{1,4})'
find_900_999 = r'\b(9\d{2}.\w{1,4})'
I wish my expected result to be like:
-main_folder
    -sub_from_000_099
        024.ops
        004.txt
        059.ops
    -sub_from_300_399
        322.txt
        396.xml
    -sub_from_900_999
        977.txt
You can use the below-given code, which moves the file from its initial directory to the desired directory.
import os
import re
import shutil

find_000_099 = r'\b(0\d{2}.\w{1,4})'
find_300_399 = r'\b(3\d{2}.\w{1,4})'
find_900_999 = r'\b(9\d{2}.\w{1,4})'

count = 0
for roots, dirs, files in os.walk('Directory Path'):
    # print(roots, len(dirs), len(files))
    if count == 0:
        parent_dir = roots
        os.mkdir(parent_dir + "/sub_from_000_099")
        os.mkdir(parent_dir + "/sub_from_300_399")
        os.mkdir(parent_dir + "/sub_from_900_999")
        count += 1
    else:
        print(count)
        for file in files:
            print(file)
            if re.match(find_000_099, file):
                shutil.move(roots + "/" + file, parent_dir + "/sub_from_000_099/" + file)
            elif re.match(find_300_399, file):
                shutil.move(roots + "/" + file, parent_dir + "/sub_from_300_399/" + file)
            elif re.match(find_900_999, file):
                shutil.move(roots + "/" + file, parent_dir + "/sub_from_900_999/" + file)
It's skeleton code that fulfills your requirements.
You can add checks when creating directories, by first checking whether the directory exists or not, plus other checks as per your needs.
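For instance, the directory-creation step can be made idempotent with os.makedirs and its exist_ok flag. A minimal sketch (ensure_subdirs is a hypothetical helper name, not from the answer's code):

```python
import os

def ensure_subdirs(parent_dir, names):
    """Create each subdirectory under parent_dir, skipping any that already exist."""
    paths = [os.path.join(parent_dir, name) for name in names]
    for path in paths:
        os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    return paths
```

Calling it a second time with the same names is then harmless, so the script can be re-run safely.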
Here is a simpler way, using pathlib and shutil
import re
import shutil
from pathlib import Path

new_path = Path("new_folder")
if not new_path.exists():
    new_path.mkdir()

# Getting all files in the main directory
files = Path("main_folder").rglob("*.*")

regs = {
    r'\b(0\d{2}.\w{1,4})': "sub_1",  # find_000_099
    r'\b(3\d{2}.\w{1,4})': "sub_2",  # find_300_399
    r'\b(9\d{2}.\w{1,4})': "sub_3",  # find_900_999
}

for f in files:
    for reg in regs:
        if re.search(reg, f.name):
            temp_path = new_path / regs[reg]
            if not temp_path.exists():
                temp_path.mkdir()
            # Change the following method to 'move' after testing it
            shutil.copy(f, temp_path / f.name)
            break

Lines of code you have written [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Out of curiosity, is there any way to get the number of lines of code you have written (in a specific project)?
I tried Perforce with p4 describe #CLN | wc -l, but apart from the many edge cases (comments being included, blank lines being added, etc.), it also skips newly added files. The edge cases can be ignored if we only count physical lines of code, but the newly added files still cause a problem.
I went ahead and wrote a Python script that prints out the number of lines of code added/changed by a user and the average number of lines per change.
Tested on Windows with Python 2.7.2. You can run from the command line - it assumes you have p4 in your path.
Usage: codestats.py -u [username]
It works with git too: codestats.py -u [authorname] -g.
It does some blacklisting to prune out bulk adds (e.g. you just added a library), and also imposes a blacklist on certain types of files (e.g. .HTML files, etc.). Otherwise, it works pretty well.
Hope this helps!
########################################################################
# Script that computes the lines of code stats for a perforce/git user.
########################################################################
import argparse
import logging
import subprocess
import sys
import re

VALID_ARGUMENTS = [
    ("user", "-u", "--user", "Run lines of code computation for the specified user.", 1),
    ("change", "-c", "--change", "Just display lines of code in the passed in change (useful for debugging).", 1),
    ("git", "-g", "--git", "Use git rather than perforce (which is the default versioning system queried).", 0)
]

class PrintHelpOnErrorArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        logging.error("error: {0}\n\n".format(message))
        self.print_help()
        sys.exit(2)

def is_code_file(depot_path):
    fstat_output = subprocess.Popen(['p4', 'fstat', depot_path], stdout=subprocess.PIPE).communicate()[0].split('\n')
    text_file = False
    head_type_regex = re.compile('^... headType (\S+)\s*$')
    for line in fstat_output:
        head_type_line = head_type_regex.match(line)
        if head_type_line:
            head_type = head_type_line.group(1)
            text_file = (head_type.find('text') != -1)
    if text_file:
        blacklisted_file_types = ['html', 'css', 'twb', 'twbx', 'tbm', 'xml']
        for file_type in blacklisted_file_types:
            if re.match('^\/\/depot.*\.{}#\d+$'.format(file_type), depot_path):
                text_file = False
                break
    return text_file

def parse_args():
    parser = PrintHelpOnErrorArgumentParser()
    for arg_name, short_switch, long_switch, help, num_args in VALID_ARGUMENTS:
        if num_args != 0:
            parser.add_argument(
                short_switch,
                nargs=num_args,
                type=str,
                dest=arg_name)
        else:
            parser.add_argument(
                long_switch,
                short_switch,
                action="store_true",
                help=help,
                dest=arg_name)
    return parser.parse_args()

file_edited_regex = re.compile('^... .*?#\d+ edit\s*$')
file_deleted_regex = re.compile('^... .*?#\d+ delete\s*$')
file_integrated_regex = re.compile('^... .*?#\d+ integrate\s*$')
file_added_regex = re.compile('^... (.*?#\d+) add\s*$')
affected_files_regex = re.compile('^Affected files ...')

outliers = []  # Changes that seem as if they weren't hand coded and merit inspection

def num_lines_in_file(depot_path):
    lines = len(subprocess.Popen(['p4', 'print', depot_path], stdout=subprocess.PIPE).communicate()[0].split('\n'))
    return lines

def parse_change(changelist):
    change_description = subprocess.Popen(['p4', 'describe', '-ds', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    parsing_differences = False
    parsing_affected_files = False
    differences_regex = re.compile('^Differences \.\.\..*$')
    line_added_regex = re.compile('^add \d+ chunks (\d+) lines.*$')
    line_removed_regex = re.compile('^deleted \d+ chunks (\d+) lines.*$')
    line_changed_regex = re.compile('^changed \d+ chunks (\d+) / (\d+) lines.*$')
    file_diff_regex = re.compile('^==== (\/\/depot.*#\d+)\s*\S+$')
    skip_file = False
    num_lines_added = 0
    num_lines_deleted = 0
    num_lines_changed_added = 0
    num_lines_changed_deleted = 0
    num_files_added = 0
    num_files_edited = 0
    for line in change_description:
        if differences_regex.match(line):
            parsing_differences = True
        elif affected_files_regex.match(line):
            parsing_affected_files = True
        elif parsing_differences:
            if file_diff_regex.match(line):
                regex_match = file_diff_regex.match(line)
                skip_file = not is_code_file(regex_match.group(1))
            elif not skip_file:
                regex_match = line_added_regex.match(line)
                if regex_match:
                    num_lines_added += int(regex_match.group(1))
                else:
                    regex_match = line_removed_regex.match(line)
                    if regex_match:
                        num_lines_deleted += int(regex_match.group(1))
                    else:
                        regex_match = line_changed_regex.match(line)
                        if regex_match:
                            num_lines_changed_added += int(regex_match.group(2))
                            num_lines_changed_deleted += int(regex_match.group(1))
        elif parsing_affected_files:
            if file_added_regex.match(line):
                file_added_match = file_added_regex.match(line)
                depot_path = file_added_match.group(1)
                if is_code_file(depot_path):
                    lines_in_file = num_lines_in_file(depot_path)
                    if lines_in_file > 3000:
                        # Anomaly - probably a copy of existing code - discard this
                        lines_in_file = 0
                    num_lines_added += lines_in_file
                    num_files_added += 1
            elif file_edited_regex.match(line):
                num_files_edited += 1
    return [num_files_added, num_files_edited, num_lines_added, num_lines_deleted, num_lines_changed_added, num_lines_changed_deleted]

def contains_integrates(changelist):
    change_description = subprocess.Popen(['p4', 'describe', '-s', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    contains_integrates = False
    parsing_affected_files = False
    for line in change_description:
        if affected_files_regex.match(line):
            parsing_affected_files = True
        elif parsing_affected_files:
            if file_integrated_regex.match(line):
                contains_integrates = True
                break
    return contains_integrates

#################################################
# Note: Keep this function in sync with
# generate_line.
#################################################
def generate_output_specifier(output_headers):
    output_specifier = ''
    for output_header in output_headers:
        output_specifier += '| {:'
        output_specifier += '{}'.format(len(output_header))
        output_specifier += '}'
    if output_specifier != '':
        output_specifier += ' |'
    return output_specifier

#################################################
# Note: Keep this function in sync with
# generate_output_specifier.
#################################################
def generate_line(output_headers):
    line = ''
    for output_header in output_headers:
        line += '--'  # for the '| '
        header_padding_specifier = '{:-<'
        header_padding_specifier += '{}'.format(len(output_header))
        header_padding_specifier += '}'
        line += header_padding_specifier.format('')
    if line != '':
        line += '--'  # for the last ' |'
    return line

# Returns true if a change is a bulk addition or a private change
def is_black_listed_change(user, changelist):
    large_add_change = False
    all_adds = True
    num_adds = 0
    is_private_change = False
    is_third_party_change = False
    change_description = subprocess.Popen(['p4', 'describe', '-s', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    for line in change_description:
        if file_edited_regex.match(line) or file_deleted_regex.match(line):
            all_adds = False
        elif file_added_regex.match(line):
            num_adds += 1
        if line.find('... //depot/private') != -1:
            is_private_change = True
            break
        if line.find('... //depot/third-party') != -1:
            is_third_party_change = True
            break
    large_add_change = all_adds and num_adds > 70
    # print "{}: {}".format(changelist, large_add_change or is_private_change)
    return large_add_change or is_third_party_change

change_header_regex = re.compile('^Change (\d+)\s*.*?\s*(\S+)#.*$')

def get_user_and_change_header_for_change(changelist):
    change_description = subprocess.Popen(['p4', 'describe', '-s', changelist], stdout=subprocess.PIPE).communicate()[0].split('\n')
    user = None
    change_header = None
    for line in change_description:
        change_header_match = change_header_regex.match(line)
        if change_header_match:
            user = change_header_match.group(2)
            change_header = line
            break
    return [user, change_header]

if __name__ == "__main__":
    log = logging.getLogger()
    log.setLevel(logging.DEBUG)
    args = parse_args()
    user_stats = {}
    user_stats['num_changes'] = 0
    user_stats['lines_added'] = 0
    user_stats['lines_deleted'] = 0
    user_stats['lines_changed_added'] = 0
    user_stats['lines_changed_removed'] = 0
    user_stats['total_lines'] = 0
    user_stats['files_edited'] = 0
    user_stats['files_added'] = 0
    change_log = []
    if args.git:
        git_log_command = ['git', 'log', '--author={}'.format(args.user[0]), '--pretty=tformat:', '--numstat']
        git_log_output = subprocess.Popen(git_log_command, stdout=subprocess.PIPE).communicate()[0].split('\n')
        git_log_line_regex = re.compile('^(\d+)\s*(\d+)\s*\S+$')
        total = 0
        adds = 0
        subs = 0
        for git_log_line in git_log_output:
            line_match = git_log_line_regex.match(git_log_line)
            if line_match:
                adds += int(line_match.group(1))
                subs += int(line_match.group(2))
        total = adds - subs
        num_commits = 0
        git_shortlog_command = ['git', 'shortlog', '--author={}'.format(args.user[0]), '-s']
        git_shortlog_output = subprocess.Popen(git_shortlog_command, stdout=subprocess.PIPE).communicate()[0].split('\n')
        git_shortlog_line_regex = re.compile('^\s*(\d+)\s+.*$')
        for git_shortlog_line in git_shortlog_output:
            line_match = git_shortlog_line_regex.match(git_shortlog_line)
            if line_match:
                num_commits += int(line_match.group(1))
        print "Git Stats for {}: Commits: {}. Lines of code: {}. Average Lines Per Change: {}.".format(args.user[0], num_commits, total, total*1.0/num_commits)
        sys.exit(0)
    elif args.change:
        [args.user, change_header] = get_user_and_change_header_for_change(args.change)
        change_log = [change_header]
    else:
        change_log = subprocess.Popen(['p4', 'changes', '-u', args.user, '-s', 'submitted'], stdout=subprocess.PIPE).communicate()[0].split('\n')
    output_headers = ['Current Change', 'Num Changes', 'Files Added', 'Files Edited']
    output_headers.append('Lines Added')
    output_headers.append('Lines Deleted')
    if not args.git:
        output_headers.append('Lines Changed (Added/Removed)')
    avg_change_size = 0.0
    output_headers.append('Total Lines')
    output_headers.append('Avg. Lines/Change')
    line = generate_line(output_headers)
    output_specifier = generate_output_specifier(output_headers)
    print line
    print output_specifier.format(*output_headers)
    print line
    output_specifier_with_carriage_return = output_specifier + '\r'
    for change in change_log:
        change_match = change_header_regex.search(change)
        if change_match:
            user_stats['num_changes'] += 1
            changelist = change_match.group(1)
            if not is_black_listed_change(args.user, changelist) and not contains_integrates(changelist):
                [files_added_in_change, files_edited_in_change, lines_added_in_change, lines_deleted_in_change, lines_changed_added_in_change, lines_changed_removed_in_change] = parse_change(change_match.group(1))
                if lines_added_in_change > 5000 and changelist not in outliers:
                    outliers.append([changelist, lines_added_in_change])
                else:
                    user_stats['lines_added'] += lines_added_in_change
                    user_stats['lines_deleted'] += lines_deleted_in_change
                    user_stats['lines_changed_added'] += lines_changed_added_in_change
                    user_stats['lines_changed_removed'] += lines_changed_removed_in_change
                    user_stats['total_lines'] += lines_changed_added_in_change
                    user_stats['total_lines'] -= lines_changed_removed_in_change
                    user_stats['total_lines'] += lines_added_in_change
                    user_stats['files_edited'] += files_edited_in_change
                    user_stats['files_added'] += files_added_in_change
            current_output = [changelist, user_stats['num_changes'], user_stats['files_added'], user_stats['files_edited']]
            current_output.append(user_stats['lines_added'])
            current_output.append(user_stats['lines_deleted'])
            if not args.git:
                current_output.append('{}/{}'.format(user_stats['lines_changed_added'], user_stats['lines_changed_removed']))
            current_output.append(user_stats['total_lines'])
            current_output.append(user_stats['total_lines']*1.0/user_stats['num_changes'])
            print output_specifier_with_carriage_return.format(*current_output),
    print
    print line
    if len(outliers) > 0:
        print "Outliers (changes that merit inspection - and have not been included in the stats):"
        outlier_headers = ['Changelist', 'Lines of Code']
        outlier_specifier = generate_output_specifier(outlier_headers)
        outlier_line = generate_line(outlier_headers)
        print outlier_line
        print outlier_specifier.format(*outlier_headers)
        print outlier_line
        for change in outliers:
            print outlier_specifier.format(*change)
        print outlier_line
The other answers seem to have missed the source-control history side of things.
From http://forums.perforce.com/index.php?/topic/359-how-many-lines-of-code-have-i-written/
Calculate the answer in multiple steps:
1) Added files:
p4 filelog ... | grep ' add on .* by <username>'
p4 print -q foo#1 | wc -l
2) Changed files:
p4 describe <changelist> | grep "^>" | wc -l
Combine all the counts together (scripting...), and you'll have a total.
You might also want to get rid of whitespace lines, or lines without alphanumeric chars, with a grep?
Also if you are doing it regularly, it would be more efficient to code the thing in P4Python and do it incrementally - keeping history and looking at only new commits.
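The "get rid of whitespace-only lines" idea from the answer above can be sketched in a few lines of Python instead of grep (count_code_lines is a hypothetical helper written for illustration, not part of any tool mentioned here):

```python
import re

def count_code_lines(text):
    """Count only lines containing at least one alphanumeric character,
    mirroring the 'filter out whitespace-only lines with grep' idea."""
    return sum(1 for line in text.splitlines()
               if re.search(r'[A-Za-z0-9]', line))
```

Blank lines and punctuation-only lines (e.g. a lone brace or a row of #'s) are then excluded from the count.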
Yes, there are many ways to count lines of code.
tl;dr Install the Eclipse Metrics Plugin. Here are the instructions on how to do it. Below is a short script if you want to do it without Eclipse.
Shell script
I will present a quite general approach. It works on Linux, but it is portable to other systems. Save these two lines to a lines.sh file:
#!/bin/sh
find -name "*.java" | awk '{ system("wc "$0) }' | awk '{ print $1 "\t" $4; lines += $1; files++ } END { print "Total: " lines " lines in " files " files."}'
It's a shell script which uses find, wc and great awk. Add permission to execute:
chmod +x lines.sh
Now we can execute our shell script.
Let's say you saved lines.sh in /home/you/workspace/projectX.
Script counts lines in .java files, which are located in subdirectories of /home/you/workspace/projectX.
So let's run it with ./lines.sh. You can change *.java for any other types of files.
Sample output:
adam#adam ~/workspace/Checkers $ ./lines.sh
23 ./src/Checkers.java
14 ./src/event/StartGameEvent.java
38 ./src/event/YourColorEvent.java
52 ./src/event/BoardClickEvent.java
61 ./src/event/GameQueue.java
14 ./src/event/PlayerEscapeEvent.java
14 ./src/event/WaitEvent.java
16 ./src/event/GameEvent.java
38 ./src/event/EndGameEvent.java
38 ./src/event/FakeBoardEvent.java
127 ./src/controller/ServerThread.java
14 ./src/controller/ServerConfig.java
46 ./src/controller/Server.java
170 ./src/controller/Controller.java
141 ./src/controller/ServerNetwork.java
246 ./src/view/ClientNetwork.java
36 ./src/view/Messages.java
53 ./src/view/ButtonField.java
47 ./src/view/ViewConfig.java
32 ./src/view/MainWindow.java
455 ./src/view/View.java
36 ./src/view/ImageLoader.java
88 ./src/model/KingJump.java
130 ./src/model/Cords.java
70 ./src/model/King.java
77 ./src/model/FakeBoard.java
90 ./src/model/CheckerMove.java
53 ./src/model/PlayerColor.java
73 ./src/model/Checker.java
201 ./src/model/AbstractPiece.java
75 ./src/model/CheckerJump.java
154 ./src/model/Model.java
105 ./src/model/KingMove.java
99 ./src/model/FieldType.java
269 ./src/model/Board.java
56 ./src/model/AbstractJump.java
80 ./src/model/AbstractMove.java
82 ./src/model/BoardState.java
Total: 3413 lines in 38 files.
Find an app to calculate the lines, there are many subtleties to counting lines - comments, blank lines, multiple operators per line etc.
Visual Studio has "Calculate Code Metrics" functionality, since you're not mentioning one single language I can't be more specific about which tool to use, just saying "find" and "grep" may not be the way to go.
Also consider the fact that lines of code don't measure actual progress. Completed features on your roadmap measures progress and the lower the lines of code - the better. It wouldn't be a first if a proud developer claims his 60,000 lines of code are marvelous only to find out there's a way to do the same thing in 1000 lines.
Have a look at SLOCCount. It only counts actual lines of code and performs some additional computations as well.
On OSX, you can easily install it via Homebrew with brew install sloccount.
Sample output for a project of mine:
$ sloccount .
Have a non-directory at the top, so creating directory top_dir
Adding /Users/padde/Desktop/project/./Gemfile to top_dir
Adding /Users/padde/Desktop/project/./Gemfile.lock to top_dir
Adding /Users/padde/Desktop/project/./Procfile to top_dir
Adding /Users/padde/Desktop/project/./README to top_dir
Adding /Users/padde/Desktop/project/./application.rb to top_dir
Creating filelist for config
Adding /Users/padde/Desktop/project/./config.ru to top_dir
Creating filelist for controllers
Creating filelist for db
Creating filelist for helpers
Creating filelist for models
Creating filelist for public
Creating filelist for tmp
Creating filelist for views
Categorizing files.
Finding a working MD5 command....
Found a working MD5 command.
Computing results.
SLOC Directory SLOC-by-Language (Sorted)
256 controllers ruby=256
66 models ruby=66
10 config ruby=10
9 top_dir ruby=9
5 helpers ruby=5
0 db (none)
0 public (none)
0 tmp (none)
0 views (none)
Totals grouped by language (dominant language first):
ruby: 346 (100.00%)
Total Physical Source Lines of Code (SLOC) = 346
Development Effort Estimate, Person-Years (Person-Months) = 0.07 (0.79)
(Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 0.19 (2.28)
(Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule) = 0.34
Total Estimated Cost to Develop = $ 8,865
(average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
SLOCCount is Open Source Software/Free Software, licensed under the GNU GPL.
SLOCCount comes with ABSOLUTELY NO WARRANTY, and you are welcome to
redistribute it under certain conditions as specified by the GNU GPL license;
see the documentation for details.
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."
There is an easier way to do all this, which incidentally is faster than using grep:
First, get all the changelists for a particular user. This is a command-line command; you can use it in a Python script via os.system():
p4 changes -u <username> > 'some_text_file.txt'
Now you need to extract all the changelist numbers, so we will use a regex. Here it is done using Python:
import os
import re

f = open('some_text_file.txt', 'r')
lists = f.readlines()
# Note: this pattern assumes seven-digit changelist numbers.
pattern = re.compile(r'\b[0-9][0-9][0-9][0-9][0-9][0-9][0-9]\b')
labels = []
for i in lists:
    labels.append(pattern.findall(i))
changelists = []
for h in labels:
    if type(h) is list:
        changelists.append(str(h[0]))
    else:
        changelists.append(str(h))
Now you have all the changelist numbers in changelists.
We will iterate through the list and, for every changelist, find the number of lines added and the number of lines deleted; the difference gives us the total number of lines added. The following lines of code do exactly that:
for i in changelists:
    os.system('p4 describe -ds ' + i + ' | findstr "^add" >> added.txt')
    os.system('p4 describe -ds ' + i + ' | findstr "^del" >> deleted.txt')

added = []
deleted = []
file = open('added.txt')
for i in file:
    added.append(i)

count = []
count_added = 0
count_add = 0
count_del = 0
for j in added:
    count = [int(s) for s in j.split() if s.isdigit()]
    count_add += count[1]
    count = []

file = open('deleted.txt')
for i in file:
    deleted.append(i)
for j in deleted:  # the original iterated over 'labels' here, which looks like a typo
    count = [int(s) for s in j.split() if s.isdigit()]
    count_del += count[1]
    count = []

count_added = count_add - count_del
print count_added
count_added will have number of lines that were added by the user.

Writing to an excel sheet using Bash

Is it possible to write to an Excel sheet (of any type) from a bash script?
What I am looking for is something along these lines:
sed -e :a -e '$!N; s/\n/ /; ta' file.c > #( first column, second row of the spreadsheet )
echo "$cdvar" > #( second column, third row of the spreadsheet )
Thank you for your replies and suggestions.
You can write Excel files from bash, Perl, Python, and more; each language has its own solutions.
bash
You could use join or awk, and I think that there are other solutions.
join
If you want join to files with same column, look these posts: Bash join command and join in bash like in SAS
awk
You can write a CSV and rename it to .xls; Excel, Gnumeric, and other programs will then recognize it as an xls file.
ls -R -ltr / | head -50 | awk '{if ($5 >0) print $5,$9}' OFS="," > sample.xls
When you modify that xls with Excel, Gnumeric, or another program and save it in real xls format, you can no longer read it with bash. That is why @Geekasaur recommended the Perl or Python solutions.
perl
You can write xls in Perl; here is a sample:
#!/usr/bin/perl
use Spreadsheet::WriteExcel;

my $workbook = Spreadsheet::WriteExcel->new("test.xls");
my $worksheet = $workbook->add_worksheet();

open(FH, "<file") or die "Cannot open file: $!\n";
my ($x, $y) = (0, 0);
while (<FH>) {
    chomp;
    my @list = split /\s+/, $_;
    foreach my $c (@list) {
        $worksheet->write($x, $y++, $c);
    }
    $x++; $y = 0;
}
close(FH);
$workbook->close();
And then you could modify xls with Spreadsheet::ParseExcel package: look How can I modify an existing Excel workbook with Perl? and reading and writing sample [Editor's note: This link is broken and has been reported to IBM]
python
You can write real xls files in Python; here is a sample:
#!/usr/local/bin/python
# Tool to convert CSV files (with configurable delimiter and text wrap
# character) to Excel spreadsheets.
import string
import sys
import getopt
import re
import os
import os.path
import csv
from pyExcelerator import *
def usage():
""" Display the usage """
print "Usage:" + sys.argv[0] + " [OPTIONS] csvfile"
print "OPTIONS:"
print "--title|-t: If set, the first line is the title line"
print "--lines|-l n: Split output into files of n lines or less each"
print "--sep|-s c [def:,] : The character to use for field delimiter"
print "--output|o : output file name/pattern"
print "--help|h : print this information"
sys.exit(2)
def openExcelSheet(outputFileName):
""" Opens a reference to an Excel WorkBook and Worksheet objects """
workbook = Workbook()
worksheet = workbook.add_sheet("Sheet 1")
return workbook, worksheet
def writeExcelHeader(worksheet, titleCols):
""" Write the header line into the worksheet """
cno = 0
for titleCol in titleCols:
worksheet.write(0, cno, titleCol)
cno = cno + 1
def writeExcelRow(worksheet, lno, columns):
""" Write a non-header row into the worksheet """
cno = 0
for column in columns:
worksheet.write(lno, cno, column)
cno = cno + 1
def closeExcelSheet(workbook, outputFileName):
""" Saves the in-memory WorkBook object into the specified file """
workbook.save(outputFileName)
def getDefaultOutputFileName(inputFileName):
""" Returns the name of the default output file based on the value
of the input file. The default output file is always created in
the current working directory. This can be overriden using the
-o or --output option to explicitly specify an output file """
baseName = os.path.basename(inputFileName)
rootName = os.path.splitext(baseName)[0]
return string.join([rootName, "xls"], '.')
def renameOutputFile(outputFileName, fno):
""" Renames the output file name by appending the current file number
to it """
dirName, baseName = os.path.split(outputFileName)
rootName, extName = os.path.splitext(baseName)
backupFileBaseName = string.join([string.join([rootName, str(fno)], '-'), extName], '')
backupFileName = os.path.join(dirName, backupFileBaseName)
try:
os.rename(outputFileName, backupFileName)
except OSError:
print "Error renaming output file:", outputFileName, "to", backupFileName, "...aborting"
sys.exit(-1)
def validateOpts(opts):
""" Returns option values specified, or the default if none """
titlePresent = False
linesPerFile = -1
outputFileName = ""
sepChar = ","
for option, argval in opts:
if (option in ("-t", "--title")):
titlePresent = True
if (option in ("-l", "--lines")):
linesPerFile = int(argval)
if (option in ("-s", "--sep")):
sepChar = argval
if (option in ("-o", "--output")):
outputFileName = argval
if (option in ("-h", "--help")):
usage()
return titlePresent, linesPerFile, sepChar, outputFileName
def main():
""" This is how we are called """
try:
opts,args = getopt.getopt(sys.argv[1:], "tl:s:o:h", ["title", "lines=", "sep=", "output=", "help"])
except getopt.GetoptError:
usage()
if (len(args) != 1):
usage()
inputFileName = args[0]
try:
inputFile = open(inputFileName, 'r')
except IOError:
print("File not found:", inputFileName, "...aborting")
sys.exit(-1)
titlePresent, linesPerFile, sepChar, outputFileName = validateOpts(opts)
if (outputFileName == ""):
outputFileName = getDefaultOutputFileName(inputFileName)
workbook, worksheet = openExcelSheet(outputFileName)
fno = 0
lno = 0
titleCols = []
reader = csv.reader(inputFile, delimiter=sepChar)
for line in reader:
if (lno == 0 and titlePresent):
if (len(titleCols) == 0):
titleCols = line
writeExcelHeader(worksheet, titleCols)
else:
writeExcelRow(worksheet, lno, line)
lno = lno + 1
if (linesPerFile != -1 and lno >= linesPerFile):
closeExcelSheet(workbook, outputFileName)
renameOutputFile(outputFileName, fno)
fno = fno + 1
lno = 0
workbook, worksheet = openExcelSheet(outputFileName)
inputFile.close()
closeExcelSheet(workbook, outputFileName)
if (fno > 0):
renameOutputFile(outputFileName, fno)
if __name__ == "__main__":
main()
You could also convert to CSV with this SourceForge project.
And once you can convert to CSV, you could rewrite the XLS side by modifying the script.
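As a side note, the splitting behaviour in main() above (start a fresh workbook once linesPerFile rows have been written) can be sketched with just the standard csv module; the sample data and chunk size below are purely illustrative:

```python
import csv
import io

def chunked_rows(rows, lines_per_file):
    """Yield successive lists of at most lines_per_file rows each."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) >= lines_per_file:
            yield chunk
            chunk = []
    if chunk:  # last, possibly short, chunk
        yield chunk

# Five rows split into chunks of two rows each: 2 + 2 + 1
data = io.StringIO("a,1\nb,2\nc,3\nd,4\ne,5\n")
chunks = list(chunked_rows(csv.reader(data), 2))
print([len(c) for c in chunks])  # [2, 2, 1]
```

Each yielded chunk corresponds to one output workbook in the script above; the header row would simply be re-written at the top of each one.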
You can easily do this by first creating an R script (xsltocsv) and then calling it from your Bash file.
The R script would look something like:
#!/usr/bin/Rscript
suppressMessages(library("gdata"))
suppressMessages(library("argparse"))
#. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
parser <- ArgumentParser(
description = "A script to convert a given xsl file to a csv one"
)
parser$add_argument(
'-rn',
'--print-row-names',
action = 'store_true',
help = 'outputs row names in the output csv file'
)
parser$add_argument(
'-cn',
'--print-column-names',
action = 'store_true',
help = 'outputs column names in the output csv file'
)
parser$add_argument(
'-s',
'--separator',
metavar='separator',
type='character',
default=';',
action = 'store',
help = 'field separator to use in the output csv file'
)
parser$add_argument(
"xsl",
metavar = "xsl-file",
action = "store",
help = "xsl input file"
)
parser$add_argument(
"csv",
metavar = "csv-file",
action = "store",
help = "csv output file"
)
args <- parser$parse_args(commandArgs(TRUE))
#. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
vals <- read.xls(args$xsl)
write.table(vals, file=args$csv, quote = FALSE,
col.names=args$print_column_names,
row.names=args$print_row_names, sep=args$separator)
Let us say that you put this into your system path after making the file executable (chmod +x xsltocsv). Then, invoke this script passing the associated parameters, and you are good to go ;)

Script to rename and copy files to a new directory.

Hi, I have recently made this script to rename files I scan for work with a prefix and a date. It works pretty well; however, it would be great if it could make a directory in the current directory with the same name as the first file, then move all the scanned files there. E.g. the first file is renamed to 'Scanned As At 22-03-2012 0', then a directory called 'Scanned As At 22-03-2012 0' (path being M:\Claire\Scanned As At 22-03-2012 0) is made and that file is placed in there.
I'm having a hard time figuring out the best way to do this. Thanks in advance!
import os
import datetime
#target = input( 'Enter full directory path: ')
#prefix = input( 'Enter prefix: ')
target = 'M://Claire//'
prefix = 'Scanned As At '
os.chdir(target)
allfiles = os.listdir(target)
count = 0
for filename in allfiles:
t = os.path.getmtime(filename)
v = datetime.datetime.fromtimestamp(t)
x = v.strftime( ' %d-%m-%Y')
os.rename(filename, prefix + x + " "+str(count) +".pdf")
count +=1
Your requirement is not quite clear. If you don't want to rename the files but only to put them under the directory, you can use the following code (replacing the for-loop of your example):
for filename in allfiles:
if not os.path.isfile(filename): continue
t = os.path.getmtime(filename)
v = datetime.datetime.fromtimestamp(t)
x = v.strftime( ' %d-%m-%Y')
dirname = prefix + x + " " + str(count)
target = os.path.join(dirname, filename)
os.renames(filename, target)
count +=1
You can check help(os.renames).
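The reason os.renames fits here: unlike plain os.rename, it first creates any missing intermediate directories in the new path. A minimal sketch using throwaway temp paths:

```python
import os
import tempfile

base = tempfile.mkdtemp()
src = os.path.join(base, "scan0.pdf")
with open(src, "w") as f:
    f.write("dummy scan")

# The "Scanned As At ..." directory does not exist yet;
# os.renames creates it before moving the file into it.
dst = os.path.join(base, "Scanned As At 22-03-2012 0", "scan0.pdf")
os.renames(src, dst)

print(os.path.exists(dst))  # True
print(os.path.exists(src))  # False
```

Note that os.renames also prunes any directories of the old path that become empty after the move, which is harmless here but worth knowing before using it on deeper trees.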
