I'm new to programming in Python, and I have made a script to move files from one location to another. Now I want a log file for it, but I can't find a way to format the text it puts in the log file.
I have the following code:
#logging
log= 'Succesfully moved', x, 'to', moveto
logging.basicConfig(filename='\\\\fatboy.leleu.be\\iedereen\\Glenn\\insitecopy.log',filemode='a',level=logging.INFO,format='%(asctime)s %(message)s',datefmt='%d/%m/%Y ' ' %I:%M:%S %p')
logging.info(log)
The output is this:
14/12/2018 08:54:17 AM ('Succesfully moved', '2126756_landrover.pdf', 'to', '\\\\fatboy.leleu.be\\MPWorkflow\\Jobs\\2126756_test\\PDF Druk')
14/12/2018 08:54:17 AM ('Succesfully moved', '2126757_landrover - kopie.pdf', 'to', '\\\\fatboy.leleu.be\\MPWorkflow\\Jobs\\2126757_test2\\PDF Druk')
Now I want to remove the brackets, the apostrophes, and the commas, but I don't know how.
The simplest way is to use logging.info(" ".join(log)), because your log variable is a tuple. But this only works if log really is a tuple and contains only str elements.
Python renders tuples in exactly the form you see in your log: an opening round bracket, the items (between apostrophes if an item is a string), and a closing round bracket.
Please try the code below:
log = 'Successfully moved ' + x + ' to ' + moveto
This
log = 'Successfully moved', x, 'to', moveto
creates a tuple. Try something like:
log = 'Successfully moved {} to {}'.format(x, moveto)
I spent too much time trying to write a generic solution to a problem (below). I ran into a couple of issues, so I ended up writing a do-it script, which is here:
# No imports necessary
# Set file paths
annofh = "/Path/To/Annotation/File.tsv"
datafh = "/Path/To/Data/File.tsv"
mergedfh = "/Path/To/MergedOutput/File.tsv"

# Read all the annotation data into a dict:
annoD = {}
with open(annofh, 'r') as annoObj:
    h1 = annoObj.readline()
    for l in annoObj:
        l = l.strip().split('\t')
        k = l[0] + ':' + l[1] + ' ' + l[3] + ' ' + l[4]
        annoD[k] = l
keyset = set(annoD.keys())

with open(mergedfh, 'w') as oF:
    with open(datafh, 'r') as dataObj:
        h2 = dataObj.readline().strip()
        oF.write(h2 + '\t' + h1)  # write the header line to the output file
        # Read through the data to be annotated line by line:
        for l in dataObj:
            l = l.strip().split('\t')
            if "-" in l[13]:
                pos = l[13].split('-')
                l[13] = pos[0]
            key = l[12][3:] + ":" + l[13] + " " + l[15] + " " + l[16]
            if key in annoD:
                l = l + annoD[key]
            oF.write('\t'.join(l) + '\n')
The function of DoIt.py (which works correctly, above) is simple:
first, read a file containing annotation information into a dictionary;
then read through the data to be annotated line by line, and add the annotation info to the data by matching a string constructed by pasting together four columns.
As you can see, this script contains hard-coded index positions, which I obtained by writing a quick awk one-liner, finding the corresponding columns in both files, and putting those indices into the Python script.
Here's the thing: I do this kind of task all the time. I want to write a robust solution that will let me automate this task, even if column names vary. My first goal is to use partial string matching; eventually it would be nice to be even more robust.
I got part of the way to doing this, but at present the below solution is actually no better than the DoIt.py script...
# Across many projects, the correct column names vary.
# For example, the name might be "#CHROM" or "Chromosome" or "CHR" in the first DF,
# but "Chrom" in the second DF. In any case, if I apply str.lower() and then search
# for a substring, it should match any of the above options.
MasterColNamesList = ["chr", "pos", "ref", "alt"]

def selectFields(h, columnNames):
    # Currently this only fixes lower-/upper-case problems. Need to extend it to
    # catch any kind of mapping issue, like a partial string match
    # (e.g., "chr" should match "#CHROM").
    indices = []
    h = list(map(str.lower, h))
    for fld in columnNames:
        if fld in h:
            indices.append(h.index(fld))
    # Now, this works, but only if the field names are an exact match.
    return indices

def mergeDFsByCols(DF1, DF2, colnames):  # <-- single set of colnames; no need to use indices
    # Eventually, need to write the merge statement; I could paste the cols together
    # into a string, make that the index for both DFs, then match on the indices.
    pass

def mergeData(annoData, studyData, MasterColNamesList):
    import pandas as pd
    aDF = pd.read_csv(annoData, header=0, sep='\t')
    sDF = pd.read_csv(studyData, header=0, sep='\t')
    annoFieldIdx = selectFields(list(aDF.columns.values), columnNames1)  # currently columnNames1; should be MasterColNamesList
    dataFieldIdx = selectFields(list(sDF.columns.values), columnNames2)
    mergeDFsByCols(aDF, sDF, MasterColNamesList)
Now, although the above runs, it is actually no more automated than the DoIt.py script, because columnNames1 and columnNames2 are specific to each file and still need to be found manually.
What I want to be able to do is enter a list of generic strings that, when processed, will result in the correct columns being pulled from both files, and then merge the pandas DFs on those columns.
I'd greatly appreciate your help.
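One way to approach the generic matching, sketched with pandas. The partial-match rule (a lowercased substring test) and the toy data are assumptions standing in for the real files:

```python
import pandas as pd
from io import StringIO

MASTER_COLS = ["chr", "pos", "ref", "alt"]

def match_columns(columns, master):
    """Map each generic name to the first column whose lowercased
    name contains it as a substring (case-insensitive partial match)."""
    mapping = {}
    for generic in master:
        for col in columns:
            if generic in col.lower():
                mapping[generic] = col
                break
    return mapping

# Toy stand-ins for the annotation and study files
anno = pd.read_csv(StringIO("#CHROM\tPOS\tREF\tALT\tgene\nchr1\t100\tA\tT\tBRCA1\n"), sep="\t")
study = pd.read_csv(StringIO("Chrom\tPosition\tRef\tAlt\tsample\nchr1\t100\tA\tT\tS1\n"), sep="\t")

# Rename both frames to the generic names, then merge on those columns
a_map = match_columns(anno.columns, MASTER_COLS)
s_map = match_columns(study.columns, MASTER_COLS)
merged = study.rename(columns={v: k for k, v in s_map.items()}).merge(
    anno.rename(columns={v: k for k, v in a_map.items()}),
    on=MASTER_COLS, how="left")
print(merged)
```

Renaming to a shared set of generic names sidesteps the index bookkeeping entirely; a real version would also need to handle a generic name matching zero or multiple columns.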
I'm trying to split a string into a list of strings. I have to split whenever I see any of these characters: '.', ';', ':', '?', '!', '( )', '[ ]', '{ }' (keep in mind that I have to keep whatever is inside the brackets).
To solve it I tried to write
print(re.split(r"\(([^)]*)\)|[.,;:?!]\s*", "Hello world,this is(example)"))
but as output I get:
['Hello world', None, 'this is', 'example', '']
Omitting the ' ' at the end that I'll solve later, how can I remove the None that appears in the middle of the list?
By the way I can't iterate in the list another time because the program shall work with huge files and I have to make it as fast as possible.
Also I don't have to necessarily use re.split so everything that works will be just fine!
I'm still new at this so I'm sorry if something is incorrect.
Not sure if this is fast enough but you could do this:
re.sub(r";|,|:|\(|\)|\[|\]|\?|\.|\{|\}|!", " ", "Hello world,this is(example)").split()
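The None comes from the capture group of the alternative that didn't match; re.split emits it for every split. filter(None, …) drops both the None entries and the trailing empty string in a single pass over the result list:

```python
import re

text = "Hello world,this is(example)"
# filter(None, ...) removes the None from the unmatched capture
# group and the trailing empty string in one pass
parts = list(filter(None, re.split(r"\(([^)]*)\)|[.,;:?!]\s*", text)))
print(parts)  # → ['Hello world', 'this is', 'example']
```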
When I try to print data on several lines using Python 3, a single space gets added to the beginning of every line except the first. For example:
[in] print('a','\n','b','\n','c')
the output will be:
a
b
c
but my desired output is:
a
b
c
So far I've only been able to avoid this by using three separate print calls. Does anyone have any thoughts?
From the docs:
print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)
Print objects to the text stream file, separated by sep and followed by end.
sep, end, file and flush, if present, must be given as keyword
arguments.
Calling print('a', '\n', 'b') will print each of those three items with a space in between, which is what you are seeing.
You can change the separator argument to get what you want:
print('a', 'b', sep='\n')
Also see the format method.
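When the values are already strings in a list, str.join gives the same result without relying on print's separator at all:

```python
items = ['a', 'b', 'c']
# join builds one string with embedded newlines, so print adds nothing between items
print('\n'.join(items))
```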
I have an external file that I'm reading a list from, and then printing out the list. So far I have a for loop that is able to read through the list and print out each item in the list, in the same format as it is stored in the external file. My list in the file is:
['1', '10']
['Hello', 'World']
My program so far is:
file = open('Original_List.txt', 'r')
file_contents = file.read()
for i in file_contents.split():
    print(i)
file.close()
The output I'm trying to get:
1 10
Hello World
And my current output is:
['1',
'10']
['Hello',
'World']
I'm part way there, I've managed to separate the items in the list into separate lines, but I still need to remove the square brackets, quotation marks, and commas. I've tried using a loop to loop through each item in the line, and only display it if it doesn't contain any square brackets, quotation marks, and commas, but when I do that, it separates the list item into individual characters, rather than leave it as one entire item. I also need to be able to display the first item, then tab it over, and print the second item, etc, so that the output looks identical to the external file, except with the square brackets, quotation marks, and commas removed. Any suggestions for how to do this? I'm new to Python, so any help would be greatly appreciated!
Formatting is your friend.
file = open('Original_List.txt', 'r')
file_contents = file.readlines()  # readlines() so the contents are already split line by line
for line in file_contents:
    # Be careful when using eval, but it suits your use case: it turns the
    # list literal on each line into an actual list
    for item in eval(line):
        print("{:<10}".format(item), end='')  # each item padded to 10 characters, left-aligned
    print()  # newline after each line we have interpreted
file.close()
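A safer variant of the eval approach, assuming each line in the file really is a Python list literal: ast.literal_eval parses literals only, so arbitrary code in the file can't execute. The hard-coded lines here are stand-ins for reading Original_List.txt:

```python
import ast

# Hypothetical lines standing in for the contents of Original_List.txt
lines = ["['1', '10']\n", "['Hello', 'World']\n"]

for raw in lines:
    items = ast.literal_eval(raw.strip())  # parses the list literal; refuses arbitrary code
    print('\t'.join(items))
```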
I have a .csv file that I am creating, and it is being created by iterating through an input file. My current code for the specific column this question is about looks like this:
input_filename = sys.argv[1]
output_filename = sys.argv[2]
f = open(sys.argv[3]).read()
list.append(("A B", f[0:2], "numeric", "A B"))
For the 'f[0:2]' part: rather than appending the first few characters of f as a whole file (which obviously appends the same characters every time), I want it to take [0:2] of the next line in f each time the loop executes. I have tried:
list.append(("A B", f.line[0:2], "numeric", "A B"))
and other similar approaches, to no avail. I hope this question is clear - if not, I am happy to clarify. Any suggestions for putting this stipulation into this append line are appreciated!
Thank you!
It's a little hard for me to guess what you're trying to do here, but is this something like what you're looking for?
Contents of data.txt
abc
def
The code:
# I'm simply replacing your names so I can test this more easily
input_filename = 'input.txt'
output_filename = 'output.txt'
data_filename = 'data.txt'

transformed_data = []
with open(data_filename) as df:
    for line in df:
        # remove surrounding whitespace - assuming you want this
        line = line.strip()
        if line:  # make sure there are non-whitespace characters left
            transformed_data.append(("A B", line[0:2], "numeric", "A B"))

print(transformed_data)
# produces
# [('A B', 'ab', 'numeric', 'A B'), ('A B', 'de', 'numeric', 'A B')]
If you're working with .csv files, I highly recommend the csv library that comes with Python. Let it handle encoding and formatting for you.
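For instance, a minimal sketch of writing tuples like transformed_data with csv.writer; the io.StringIO buffer is a stand-in for the real output file, and the library handles quoting and commas for you:

```python
import csv
import io

# Tuples shaped like transformed_data above; StringIO stands in for the output file
rows = [("A B", "ab", "numeric", "A B"), ("A B", "de", "numeric", "A B")]
buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```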