Re-organizing the data in a text file in python3 - python-3.x

I have a text file that looks like this small example:
Name sample1 sample2 sample3
A2M 9805.6 3646.8 1376.48
ACVR1C 20 37.8 20
ADAM12 197.8 120.96 31.28
I am trying to re-organize the data and write a new text file that looks like this expected output:
Name Sample
A2M 9805.6
A2M 3646.8
A2M 1376.48
ACVR1C 20
ACVR1C 37.8
ACVR1C 20
ADAM12 197.8
ADAM12 120.96
ADAM12 31.28
In other words, the last 3 columns of the input data go into the 2nd column of the output, and every item in the 1st column of the input file is repeated 3 times (there are 3 samples per Name).
To do so, I wrote the following code in Python 3:
def convert(input_file, output_file):
    with open(input_file, 'r') as infile:
        res = {}
        line = infile.split()
        res.keys = line[0]
        res.values = line[2:]
    outfile = open(output_file, "w")
    for k, v in res.items():
        outfile.write(str(k) + '\t'+ str(v) + '\n')
But it does not produce the output I want. Do you know how to fix it?

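You have a few problems in your code.
First, you should also open the output file within the with statement so that it gets closed properly. Second, a dict's keys and values attributes are read-only methods, so you cannot assign to them. Last, you call split() on the file object itself, which is not possible because a file object has no split method. You want to loop over the lines instead, like so: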
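def convert(input_file, output_file):
    with open(input_file) as infile, open(output_file, "w") as outfile:
        outfile.write("Name\tSample\n")
        next(infile, None)  # skip the input header line (Name sample1 sample2 sample3)
        for line in infile:
            values = line.split()
            # write one output row per sample value, repeating the name
            for value in values[1:]:
                outfile.write(values[0] + "\t" + value + "\n")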
You should also consider changing your format to CSV and reading it into a DataFrame.
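A minimal sketch of the CSV half of that suggestion, using only the standard-library csv module rather than a DataFrame library. The function name wide_to_long and the in-memory sample are illustrative assumptions, not part of the original answer:

```python
import csv
import io

def wide_to_long(infile, outfile):
    """Write one (Name, Sample) CSV row per sample value found in infile."""
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(["Name", "Sample"])
    next(infile, None)  # skip the input header line
    for line in infile:
        values = line.split()
        if not values:  # ignore blank lines
            continue
        for value in values[1:]:
            writer.writerow([values[0], value])

# Hypothetical in-memory input standing in for the real text file
sample = io.StringIO(
    "Name sample1 sample2 sample3\n"
    "A2M 9805.6 3646.8 1376.48\n"
    "ACVR1C 20 37.8 20\n"
)
result = io.StringIO()
wide_to_long(sample, result)
print(result.getvalue())
```

With real files, the two StringIO objects would be replaced by open(input_file) and open(output_file, "w", newline="").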

Try this,
d = {}
with open('file1.txt','r') as f: # Your file
    header = next(f)
    for i in f:
        d.setdefault(i.split()[0],[]).extend(i.split()[1:])
with open('nflie1.txt','w') as f: # New file
    f.write('Name Sample\n')
    for k,v in d.items():
        for el in v:
            f.write('{} {}\n'.format(k,el))
Output:
Name Sample
A2M 9805.6
A2M 3646.8
A2M 1376.48
ACVR1C 20
ACVR1C 37.8
ACVR1C 20
ADAM12 197.8
ADAM12 120.96
ADAM12 31.28

Related

How to sum specific values from two different txt files in python

I have 2 txt files with names and scores. For example:
File 1          File 2          Desired Output
Name     Score  Name     Score  Name     Score
Michael  20     Michael  30     Michael  50
Adrian   40     Adrian   50     Adrian   90
Jane     60                     Jane     60
I want to sum the scores that share a name and print them. I tried to pair names and scores in two different dictionaries and then merge the dictionaries. However, merging cannot keep the same name with different scores, so I'm stuck here. I've written something like the following:
d1=dict()
d2=dict()
with open('data1.txt', "r") as f:
    test = [i for line in f for i in line.split()]
    i = 0
    while i < len(test) - 1:
        d1[test[i]] = test[i + 1]
        i += 2
    del d1['Name']
with open('data2.txt', "r") as f:
    test = [i for line in f for i in line.split()]
    i = 0
    while i < len(test) - 1:
        d2[test[i]] = test[i + 1]
        i += 2
    del d2['Name']
z = dict(d2.items() | d1.items())
Using a dictionary comprehension should get you what you are after. I have assumed the contents of the files are:
File1.txt:
Name Score
Michael 20
Adrian 40
Jane 60
File2.txt:
Name Score
Michael 30
Adrian 50
Then you can get a total as:
with open("file1.txt", "r") as file_in:
    next(file_in) # skip header
    # row.strip() filters out blank lines (a bare "\n" is still truthy)
    file1_data = dict(row.split() for row in file_in if row.strip())
with open("file2.txt", "r") as file_in:
    next(file_in) # skip header
    file2_data = dict(row.split() for row in file_in if row.strip())
result = {
    key: int(file1_data.get(key, 0)) + int(file2_data.get(key, 0))
    for key
    in set(file1_data).union(file2_data) # could also use file1_data.keys()
}
print(result)
This should give you a result like:
{'Michael': 50, 'Jane': 60, 'Adrian': 90}
Use defaultdict:
from collections import defaultdict
name_scores = defaultdict(int)
files = ('data1.txt', 'data2.txt')
for file in files:
    with open(file, 'r') as f:
        next(f)  # skip the "Name Score" header line
        for line in f:
            if not line.strip():  # ignore blank lines
                continue
            name, score = line.split()
            name_scores[name] += int(score)
edit: The header line has to be skipped and trailing whitespace cleaned up, which the code above now does with next(f) and split(); the gist is the defaultdict accumulation.
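A related sketch along the same lines, using collections.Counter, whose update() method adds counts instead of overwriting them. The helper name read_scores and the in-memory files are assumptions made for illustration; they mirror the sample data from the question:

```python
from collections import Counter
import io

def read_scores(lines):
    """Return a Counter mapping name -> score from 'Name Score' lines."""
    scores = Counter()
    next(lines, None)  # skip the header line
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            scores[parts[0]] += int(parts[1])
    return scores

# Hypothetical in-memory stand-ins for data1.txt and data2.txt
file1 = io.StringIO("Name Score\nMichael 20\nAdrian 40\nJane 60\n")
file2 = io.StringIO("Name Score\nMichael 30\nAdrian 50\n")

totals = Counter()
for f in (file1, file2):
    totals.update(read_scores(f))  # update() adds counts rather than replacing them
print(dict(totals))
```

A name that appears in only one file, such as Jane, keeps its single score because a missing key counts as 0.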

How to convert DataFrame to single row list

I want to convert the contents of the file "text.txt", which holds 101 rows × 1 column, into 1 row × 1 column with a (',') separator.
I tried this code:
tweets_data = []
with open("text.txt", "r", encoding="utf8") as f:
    for tweet in f:
        ayu = tweet.rstrip('\n').split(',')
        print(ayu)
I expected the output [{'text'},{'text'},....,{'text'}]
but the actual output is
[{'text\n'},
{'text\n'},
...
{'text\n'},]
Can anyone help me?
From what I can see the solution would be something like this:
with open("fileName.txt", "r", encoding="utf-8") as f:
    listOfTweets = []
    for tweet in f:
        listOfTweets.append(tweet.rstrip("\n"))  # drop the trailing newline
print(listOfTweets)
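If the goal is a single row with a ',' separator, one possible reading of the question is to strip each line and join the results. The sketch below assumes that reading; the function name lines_to_single_row and the sample text are hypothetical:

```python
import io

def lines_to_single_row(lines, separator=","):
    """Strip line endings and join all non-empty lines into one string."""
    items = [line.rstrip("\r\n") for line in lines]
    items = [item for item in items if item]  # drop empty lines
    return items, separator.join(items)

# Hypothetical in-memory stand-in for text.txt
sample = io.StringIO("first tweet\nsecond tweet\nthird tweet\n")
tweets, single_row = lines_to_single_row(sample)
print(tweets)
print(single_row)
```

Note that joining with ',' is ambiguous if the tweets themselves contain commas; in that case the list form is the safer result to keep.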

get multiple columns into text file

I have a CSV file and I want to split it into text files based on the first column, which holds the ids. Each file should then contain the remaining columns of its row. For example:
file.csv
id val1 val2 val3
1 50 52 60
2 45 84 96
and so on.
Here is my code:
dir_name = '/Users/user/My Documents/test/'
with io.open('file1.csv', 'rt',encoding='utf8') as f:
    reader = csv.reader(f, delimiter=',')
    next(reader)
    xx = []
    for row in reader:
        with open(os.path.join(dir_name, row[0] + ".txt"),'a') as f2:
            xx = row[1:2]
            f2.write(xx +"\n")
so it should be:
1.text
50 52 60
2.text
45 84 96
But it only creates files without content.
Can anyone help me? Thanks in advance.
There were a couple of issues:
It's actually a whitespace-separated values file, not a comma-separated values file, so you have to change the delimiter from ',' to ' '. Also, the whitespace is repeated, so you can pass the additional skipinitialspace flag to the csv module.
The slice row[1:2] selects only one column, and the resulting list cannot be concatenated with a string; use row[1:] and join the values instead.
This program meets your requirements:
#!/usr/bin/python
import io
import csv
import os
dir_name = './'
with io.open('input.csv', 'rt',encoding='utf8') as f:
    reader = csv.reader(f, skipinitialspace=True, delimiter=' ')
    next(reader)
    xx = []
    for row in reader:
        filename = os.path.join(dir_name, row[0])
        with open(filename + ".txt", 'a') as f2:
            xx = row[1:]
            f2.write(" ".join(xx) +"\n")
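Since the data is whitespace-separated, the same split can also be sketched without the csv module, relying on str.split() to handle runs of spaces. The function name split_by_id, the temporary output directory and the sample rows are assumptions for illustration only:

```python
import os
import tempfile

def split_by_id(lines, out_dir):
    """Append the columns after the id to '<id>.txt' in out_dir, one line per input row."""
    written = []
    rows = iter(lines)
    next(rows, None)  # skip the header row
    for line in rows:
        fields = line.split()  # splits on any run of whitespace
        if len(fields) < 2:  # ignore blank or id-only lines
            continue
        path = os.path.join(out_dir, fields[0] + ".txt")
        with open(path, "a") as out:
            out.write(" ".join(fields[1:]) + "\n")
        written.append(path)
    return written

# Hypothetical sample rows standing in for the real input file
sample = ["id val1 val2 val3\n", "1   50   52   60\n", "2   45   84   96\n"]
out_dir = tempfile.mkdtemp()
paths = split_by_id(sample, out_dir)
with open(paths[0]) as f:
    first_content = f.read()
print(first_content)
```

Because the files are opened in append mode, as in the answer above, running the function twice on the same directory adds the rows a second time.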

I want to read file and write another file. Basically, I want to do some arithmetic and write few other columns

I have a file like
2.0 4 3
0.5 5 4
-0.5 6 1
-2.0 7 7
.......
The actual file is pretty big.
I want to read it and add a couple of columns: the first added column is column(4) = column(2) * column(3), and the second is column(5) = column(2)/column(1) + column(4), so the result should be
2.0 4 3 12 14
0.5 5 4 20 30
-0.5 6 1 6 -6
-2.0 7 7 49 45.5
.....
I then want to write the result to a different file.
with open('test3.txt', encoding ='latin1') as rf:
    with open('test4.txt', 'w') as wf:
        for line in rf:
            float_list= [float(i) for i in line.split()]
            print(float_list)
But so far I just have this. I am able to create the list, but I am not sure how to perform arithmetic on it and create the new columns. I think I am completely off here. I am just a beginner in Python. Any help will be greatly appreciated. Thanks!
I would reuse your formulae, but shifting indexes since they start at 0 in python.
I would extend the read column list of floats with the new computations, and write back the line, space separated (converting back to str in a list comprehension)
So, the inner part of the loop can be written as follows:
with open('test3.txt', encoding ='latin1') as rf:
    with open('test4.txt', 'w') as wf:
        for line in rf:
            column= [float(i) for i in line.split()] # your code
            column.append(column[1] * column[2]) # add column
            column.append(column[1]/column[0] + column[3]) # add another column
            wf.write(" ".join([str(x) for x in column])+"\n") # write joined strings, separated by spaces
Something like this - see comments in code
with open('test3.txt', encoding ='latin1') as rf:
    with open('test4.txt', 'w') as wf:
        for line in rf:
            float_list = [float(i) for i in line.split()]
            # calculate two new columns
            float_list.append(float_list[1] * float_list[2])
            float_list.append(float_list[1]/float_list[0] + float_list[3])
            # convert all values to text
            text_list = [str(i) for i in float_list]
            # concatenate all elements and write line
            wf.write(' '.join(text_list) + '\n')
Try the following:
map() is used to convert each element of the list to float; at the end it is used again to convert each float to str so we can concatenate them. In Python 3, map() returns an iterator, so the first call is wrapped in list() to allow appending.
with open('out.txt', 'w') as out:
    with open('input.txt', 'r') as f:
        for line in f:
            my_list = list(map(float, line.split()))
            my_list.append(my_list[1]*my_list[2])
            my_list.append(my_list[1] / my_list[0] + my_list[3])
            my_list = map(str, my_list)
            out.write(' '.join(my_list) + '\n')
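The answers above write every value through str(), so the new columns come out as 12.0 and 14.0 rather than the 12 and 14 shown in the question. A hedged sketch of one way to match the expected output: keep the original tokens untouched and format only the computed values with the 'g' presentation type. The function name extend_line is an assumption for illustration:

```python
def extend_line(line):
    """Return the line with column 4 = c2 * c3 and column 5 = c2 / c1 + c4 appended."""
    tokens = line.split()
    c1, c2, c3 = (float(t) for t in tokens[:3])
    c4 = c2 * c3
    c5 = c2 / c1 + c4  # raises ZeroDivisionError if column 1 is 0
    # 'g' drops a trailing ".0" but keeps real fractions such as 45.5
    return " ".join(tokens + [format(c4, "g"), format(c5, "g")])

# Sample rows taken from the question
sample = ["2.0 4 3", "0.5 5 4", "-0.5 6 1", "-2.0 7 7"]
extended = [extend_line(line) for line in sample]
for row in extended:
    print(row)
```

The 'g' format keeps 6 significant digits by default and switches to exponent notation for large values, so a different format spec may be needed for the real data.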

How to calculate from a dictionary in python

import operator
with open("D://program.txt") as f:
    Results = {}
    for line in f:
        part_one,part_two = line.split()
        Results[part_one] = part_two
c=sum(int(Results[x]) for x in Results)
r=c/12
d=len(Results)
F=max(Results.items(), key=operator.itemgetter(1))[0]
u=min(Results.items(), key=operator.itemgetter(1))[0]
print ("Number of entries are",d)
print ("Student with HIGHEST mark is",F)
print ("Student with LOWEST mark is",u)
print ("Avarage mark is",r)
Results = [ (v,k) for k,v in Results.items() ]
Results.sort(reverse=True)
for v,k in Results:
    print(k,v)
import sys
orig_stdout = sys.stdout
f = open('D://programssr.txt', 'w')
sys.stdout = f
print ('Number of entries are',d)
print ("Student with HIGHEST mark is",F)
print ("Student with LOWEST mark is",u)
print ("Avarage mark is",r)
for v,k in Results:
    print(k,v)
sys.stdout = orig_stdout
f.close()
I want to read a txt file, but the problem is that it can't compute the results I want to write to a new file because of the NAMES and MARKS header in the file. If you remove it, it works fine. I want to make the calculations without removing NAMES and MARKS from the txt file. What am I doing wrong?
NAMES MARKS
Lux 95
Veron 70
Lesley 88
Sticks 80
Tipsey 40
Joe 62
Goms 18
Wesley 35
Villa 11
Dentist 72
Onty 50
Just consume the first line using the next() function before looping over the file:
with open("D://program.txt") as f:
    Results = {}
    next(f)
    for line in f:
        part_one,part_two = line.split()
        Results[part_one] = part_two
Note that file objects are iterators (one-shot iterables): when you loop over them you consume the lines, and you cannot access them again without reopening the file or seeking back to the start.
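A small sketch of that one-shot behaviour, with the header consumed by next() before the loop. It also converts the marks to int so that the highest and lowest marks are compared as numbers, and it divides by the actual number of entries; both are assumptions about what the question intends. The helper name read_marks and the shortened sample are hypothetical:

```python
import io

def read_marks(lines):
    """Return {name: mark} from 'NAMES MARKS' lines, skipping the header."""
    rows = iter(lines)
    next(rows, None)  # consume the header line
    marks = {}
    for line in rows:
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            marks[parts[0]] = int(parts[1])
    return marks

# Hypothetical in-memory stand-in for program.txt (a subset of the question's data)
sample = io.StringIO("NAMES MARKS\nLux 95\nVeron 70\nGoms 18\nVilla 11\n")
marks = read_marks(sample)
highest = max(marks, key=marks.get)
lowest = min(marks, key=marks.get)
average = sum(marks.values()) / len(marks)
leftover = list(sample)  # the iterator is exhausted, so nothing is left to read
print(highest, lowest, average, leftover)
```

The empty leftover list shows that a second pass over the same file object yields nothing.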
