How can I save the values to a single row in each loop iteration?
import numpy as np
A = 5.4654
B = [4.78465, 6.46545, 5.798]
for i in range(2):
    f = open(f'file.txt', 'a')
    np.savetxt(f, np.r_[A, B], fmt='%22.16f')
    f.close()
The output is:
5.4653999999999998
4.7846500000000001
6.4654499999999997
5.7980000000000000
5.4653999999999998
4.7846500000000001
6.4654499999999997
5.7980000000000000
The desired output is:
5.4653999999999998 4.7846500000000001 6.4654499999999997 5.7980000000000000
5.4653999999999998 4.7846500000000001 6.4654499999999997 5.7980000000000000
According to the documentation:
newline : str, optional
String or character separating lines.
So, perhaps:
np.savetxt(f, np.r_[A, B], fmt='%22.16f', newline=' ')
print(file=f)  # end the row with a single newline
An alternative might be to reshape the array into a single row, e.g. np.r_[A, B].reshape(1, -1), since np.transpose has no effect on a 1-D array.
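For example, a minimal sketch of the reshape idea, assuming the same A, B and output file as in the question:
import numpy as np

A = 5.4654
B = [4.78465, 6.46545, 5.798]

with open('file.txt', 'a') as f:
    for i in range(2):
        # reshape(1, -1) turns the 1-D array into a single row,
        # so savetxt writes all four values on one line
        np.savetxt(f, np.r_[A, B].reshape(1, -1), fmt='%22.16f')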
I am trying to iterate over the Excel files in a directory with the code below:
import glob
import pandas as pd
files = glob.glob('./*.xlsx')
for file_path in files:
    print(file_path)
Out:
./data\S273-2021-12-09.xlsx
./data\S357-2021-12-09.xlsx
./data\S545-2021-12-09.xlsx
./data\S607-2021-12-09.xlsx
Now I want to replace S273, S357, etc., based on a dataframe df that maps old_name to new_name:
old_name new_name
0 S273 a
1 S357 b
2 S545 c
3 S607 d
4 S281 e
To convert the dataframe to a dictionary, if necessary: name_dict = dict(zip(df.old_name, df.new_name))
The expected result will look like:
./data\a-2021-12-09.xlsx
./data\b-2021-12-09.xlsx
./data\c-2021-12-09.xlsx
./data\d-2021-12-09.xlsx
How could I achieve that in Python? Sincere thanks in advance.
EDIT:
for file_path in files:
    for key, value in name_dict.items():
        if key in str(file_path):
            new_path = file_path.replace(key, value)
            print(new_path)
The code above works; feel free to share other solutions if possible.
You can first split off the basename with os.path.split, then split the first part of the file name on - and map it with dict.get; if there is no match, the same value is returned, so the second argument of get is also first:
import os
name_dict = dict(zip(df.old_name, df.new_name))
print (name_dict)
{'S273': 'a', 'S357': 'b', 'S545': 'c', 'S607': 'd', 'S281': 'e'}
#for test
L = './data\S273-2021-12-09.xlsx ./data\S357-2021-12-09.xlsx ./data\S545-2021-12-09.xlsx ./data\S607-2021-12-09.xlsx'
files = L.split()
for file_path in files:
    head, tail = os.path.split(file_path)
    first, last = tail.split('-', 1)
    out = os.path.join(head, f'{name_dict.get(first, first)}-{last}')
    print(out)
./data\a-2021-12-09.xlsx
./data\b-2021-12-09.xlsx
./data\c-2021-12-09.xlsx
./data\d-2021-12-09.xlsx
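If the actual goal is to rename the files on disk (an assumption; the thread only prints the new paths), os.rename can be applied to each computed path, reusing files and name_dict from above:
import os

for file_path in files:
    head, tail = os.path.split(file_path)
    first, last = tail.split('-', 1)
    new_path = os.path.join(head, f'{name_dict.get(first, first)}-{last}')
    if new_path != file_path:
        os.rename(file_path, new_path)  # rename the file on disk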
I am trying to find specific words in a pandas column and assign them to a new column; a cell may contain two or more of these words. Once I find them, I want to replicate the row, creating one copy per matched word.
import pandas as pd
import numpy as np
import re
wizard = pd.read_excel(r'C:\Python\L\Book1.xlsx', sheet_name='Sheet1', header=0)
test_set = {'941', '942'}
test_set2 = {'MN', 'OK', '33/3305'}
wizard['ZTYPE'] = wizard['Comment'].apply(lambda x: any(i in test_set for i in x.split()))
wizard['ZJURIS'] = wizard['Comment'].apply(lambda x: any(i in test_set2 for i in x.split()))
wizard_new = pd.DataFrame(np.repeat(wizard.values,3,axis=0))
wizard_new.columns = wizard.columns
wizard_new.head()
I am getting True and False, but I am unable to split it.
Above is how the sample data looks. I need to find anything like '33/3305'; the year could be entered as '19' or '2019', the quarter as 'Q1', '1Q', 'Q 1' or '1 Q', plus the entries in my test sets.
ZJURIS = dict(list(itertools.chain(*[[(y_, x) for y_ in y] for x, y in wizard.comment()])))
def to_category(x):
    for w in x.lower().split(" "):
        if w in ZJURIS:
            return ZJURIS[w]
    return None
Finally, apply the method on the column and save the result to a new one:
wizard["ZJURIS"] = wizard["comment"].apply(to_category)
I tried the above solution, but it did not work.
Any suggestions on how to get the code to work?
Sample data.
data={ 'ID':['351362278576','351539320880','351582465214','351609744560','351708198604'],
'BU':['SBS','MAS','NAS','ET','SBS'],
'Comment':['940/941/w2-W3NYSIT/SUI33/3305/2019/1q','OK SUI 2Q19','941 - 3Q2019NJ SIT - 3Q2019NJ SUI/SDI - 3Q2019','IL,SUI,2016Q4,2017Q1,2017Q2','1Q2019 PA 39/5659 39/2476','UT SIT 1Q19-3Q19']
}
df = pd.DataFrame(data)
Based on the sample data set, the attached output is what is expected.
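A hedged sketch of one possible approach (not a confirmed solution from the thread): use regular expressions to pull the form, jurisdiction and quarter tokens out of each Comment, then explode the matches into one row per match. The patterns, column names and the reduced sample below are assumptions for illustration.
import re
import pandas as pd

data = {'ID': ['351362278576', '351539320880', '351582465214'],
        'Comment': ['940/941/w2-W3NYSIT/SUI33/3305/2019/1q',
                    'OK SUI 2Q19',
                    '941 - 3Q2019NJ SIT - 3Q2019NJ SUI/SDI - 3Q2019']}
df = pd.DataFrame(data)

# assumed patterns: form numbers, jurisdiction codes, and quarters
# written as 'Q1', '1Q', 'Q 1' or '1 Q'
form_pat = r'(941|942)'
juris_pat = r'(MN|OK|33/3305)'
quarter_pat = r'([1-4]\s?Q|Q\s?[1-4])'

df['ZTYPE'] = df['Comment'].str.findall(form_pat)
df['ZJURIS'] = df['Comment'].str.findall(juris_pat)
df['QUARTER'] = df['Comment'].str.findall(quarter_pat, flags=re.IGNORECASE)

# one row per matched form number, similar to the np.repeat idea above
wizard_new = df.explode('ZTYPE')
print(wizard_new)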
There are three files with the names file_2018-01-01_01_temp.tif, file_2018-01-01_02_temp.tif and file_2018-01-01_03_temp.tif. I want to list their names as ['2018010101', '2018010102', '2018010103'] in Python.
The code below creates an incorrect list (the slice f[5:9] keeps only the year, so every file collapses to the same date):
import pandas as pd
from glob import glob
from os import path
pattern = '*.tif'
filenames = [path.basename(x) for x in glob(pattern)]
pd.DatetimeIndex([pd.Timestamp(f[5:9]) for f in filenames])
Result:
DatetimeIndex(['2018-01-01', '2018-01-01', '2018-01-01'], dtype='datetime64[ns]', freq=None)
I think the simplest is slicing with replace in a list comprehension:
a = [f[5:18].replace('_','').replace('-','') for f in filenames]
print (a)
['2018010101', '2018010102', '2018010103']
Similar with Index.str.replace (regex=True so the pattern is treated as a regular expression):
a = pd.Index([f[5:18] for f in filenames]).str.replace(r'\-|_', '', regex=True)
print (a)
Index(['2018010101', '2018010102', '2018010103'], dtype='object')
Or convert values to DatetimeIndex and then use DatetimeIndex.strftime:
a = pd.to_datetime([f[5:18] for f in filenames], format='%Y-%m-%d_%H').strftime('%Y%m%d%H')
print (a)
Index(['2018010101', '2018010102', '2018010103'], dtype='object')
EDIT:
The dtype is object, but it must be dtype='datetime64[ns]'.
If you need datetimes, the formatting has to stay the default; it is not possible to change it:
d = pd.to_datetime([f[5:18] for f in filenames], format='%Y-%m-%d_%H')
print (d)
DatetimeIndex(['2018-01-01 01:00:00', '2018-01-01 02:00:00',
'2018-01-01 03:00:00'],
dtype='datetime64[ns]', freq=None)
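A small usage sketch (my addition, assuming the goal is to pair each file with its parsed timestamp): build a Series indexed by the DatetimeIndex so files can be looked up or sorted by time:
import pandas as pd

filenames = ['file_2018-01-01_01_temp.tif',
             'file_2018-01-01_02_temp.tif',
             'file_2018-01-01_03_temp.tif']

idx = pd.to_datetime([f[5:18] for f in filenames], format='%Y-%m-%d_%H')
files_by_time = pd.Series(filenames, index=idx)
print(files_by_time.sort_index())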
I have a csv file with a list of names and means.
For example:
ali,5.0
hamid,6.066666666666666
mandana,7.5
soheila,7.833333333333333
sara,9.75
sina,11.285714285714286
sarvin,11.375
I am going to rewrite the csv with the three lowest means. I have written the code, but I have a problem writing the csv again. I need to keep the mean numbers exactly as they appear in the input.
import csv
import itertools
from collections import OrderedDict
with open('grades4.csv', 'r') as input_file:
    reader = csv.reader(input_file)
    val1 = []
    key = list()
    threelowval = []
    for row in reader:
        k = row[0]
        val = [num for num in row[1:]]  # separate the number in every row
        key.append(k)                   # build the list of names (keys)
        val1.append(val)                # build the list of values

value = list(itertools.chain.from_iterable(val1))  # flatten the list of lists into a simple list
value = [float(i) for i in value]  # convert the strings to floats
#print(key)
#print(value)
dictionary = dict(zip(key, value))
#print(dictionary)
findic = OrderedDict(sorted(dictionary.items(), key=lambda t: t[1]))  # sort by mean
#print(findic)
# separate the three lowest means from the sorted dict
lv = []
for item in findic.values():
    lv.append(item)
#print(lv)
for item in lv[0:3]:
    threelowval.append(item)
print(threelowval)
I have tried the code below, but I get an error (writerows expects each element to be a row, i.e. an iterable, not a bare float).
with open('grades4.csv', 'w', newline='') as output_file_name:
    writer = csv.writer(output_file_name)
    writer.writerows(threelowval)
expected result:
5.0
6.066666666666666
7.5
You should try this:
with open('grades4.csv', 'w', newline='') as output_file_name:
    writer = csv.writer(output_file_name)
    for i in threelowval:
        writer.writerow([i])
I have tried the code below and received the correct results.
with open('grades4.csv', 'w', newline='') as output_file_name:
    writer = csv.writer(output_file_name)
    writer.writerows(map(lambda x: [x], threelowval))
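A hedged, more compact sketch of the same task (an alternative, not the asker's code): sort the rows numerically by the mean column but write back the original strings, so the means stay exactly as they appear in the input:
import csv

# read all rows first so the file can safely be rewritten afterwards
with open('grades4.csv', 'r', newline='') as input_file:
    rows = list(csv.reader(input_file))

# sort by the mean (column 1) numerically, keeping the original string values
rows.sort(key=lambda row: float(row[1]))

with open('grades4.csv', 'w', newline='') as output_file:
    writer = csv.writer(output_file)
    for name, mean in rows[:3]:
        writer.writerow([mean])  # one mean per line, exactly as in the input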
I am new to Python, apologies if this is a stupid question.
I have a text file with the following input:
Apple Apple1
Apple Apple2
Aaron Aaron1
Aaron Aaron2
Aaron Aaron3
Tree Tree1
I have the following code:
import csv
import sys
from itertools import groupby
with open('File.txt', 'r', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile, delimiter='\t')
    next(reader, None)
    a = [[k] + [x[1] for x in g] for k, g in groupby(reader, key=lambda row: row[0])]

sys.stdout = open('Out.txt', 'w', encoding='utf-8')
print(str(a))
What I want to achieve:
Apple Apple1,Apple2
Aaron Aaron1,Aaron2,Aaron3
Tree Tree1
However, the output I am now getting is in list form, while I want it printed line by line. How can I achieve this?
How about
import pandas as pd

df = pd.read_csv('File.txt', delimiter=' ', header=None)
grouped = df.groupby(0).agg(lambda col: ', '.join(col)).to_records()
for group in grouped:
    print(group[0] + ' ' + group[1])
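Alternatively, a hedged sketch that stays with the csv/itertools.groupby approach from the question (assuming a tab-delimited File.txt with no header row): write one line per group instead of printing the whole list:
import csv
from itertools import groupby

with open('File.txt', 'r', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile, delimiter='\t')
    # groupby relies on the file already being grouped by the first column
    groups = [(k, [row[1] for row in g]) for k, g in groupby(reader, key=lambda row: row[0])]

with open('Out.txt', 'w', encoding='utf-8') as out:
    for key, values in groups:
        out.write(f'{key} {",".join(values)}\n')  # e.g. "Apple Apple1,Apple2"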