I can't read pixel values from a pandas DataFrame when slicing img[] in OpenCV. Here are my code and the reported error:
import cv2
import numpy as np
import csv
import os
import pandas as pd
path_csv='/home/'
npa=pd.read_csv(path_csv+"char.csv", usecols=[2,3,4,5], header=None)
nb_charac=npa.shape[0]-1
#stock the actual letters of your csv in an array
characs=[]
cpt=0
#take characters
f = open(path_csv+"char.csv", 'rt')
reader = csv.reader(f)
for row in reader:
    if cpt>=1: #skip header
        characs.append(str(row[1]))
    cpt+=1
#open your image
path_image= '/home/'
img=cv2.imread(os.path.join(path_image,'image1.png'))
path_save= '/home/2/'
i=0
#for every line on your csv,
for i in range(nb_charac):
    #get coordinates
    #coords=npa[i,:]
    coords=npa.iloc[[i]]
    charac=characs[i]
    #actual cropping of the image (easy with numpy)
    img_charac=img[int(coords[2]):int(coords[4]),int(coords[3]):int(coords[5])]
    img_charac=cv2.resize(img_charac, (32, 32), interpolation=cv2.INTER_NEAREST)
    i+=1
    #charac=charac.strip('"\'')
    #x=switch(charac)
    #saving the image
    cv2.imwrite(path_save+str(charac)+"_"+str(i)+"_"+str(img_charac.shape)+".png",img_charac)
    img_charac2 = 255 - img_charac
    cv2.imwrite(path_save +str(charac)+ "_switched" + str(i) + "_" + str(img_charac2.shape) + ".png", img_charac2)
    print(i)
I got the following error:
img_charac=img[int(coords[2]):int(coords[3]),int(coords[0]):int(coords[1])]
File "/usr/lib/python2.7/dist-packages/pandas/core/series.py", line 79, in wrapper
return converter(self.iloc[0])
ValueError: invalid literal for int() with base 10: 'left_column_pixel'
The error is related to this line of code:
img_charac=img[int(coords[2]):int(coords[4]),int(coords[3]):int(coords[5])]
My variable coords looks like this:
>>> coords=npa.iloc[[1]]
>>> coords
    2    3     4     5
1  38  104  2456  2492
The values of columns 2, 3, 4 and 5 needed for img_charac are:
>>> coords[2]
1 38
Name: 2, dtype: object
>>> coords[3]
1 104
Name: 3, dtype: object
>>> coords[4]
1 2456
Name: 4, dtype: object
>>> coords[5]
1 2492
Name: 5, dtype: object
I updated the img_charac line as follows:
img_charac = img[int(float(coords[2].values[0])):int(float(coords[4].values[0])), int(float(coords[3].values[0])):int(float(coords[5].values[0]))]
I no longer get
ValueError: invalid literal for int() with base 10: 'left_column_pixel'
but I now get the following error:
ValueError: could not convert string to float: left_column_pixel
I noticed that the img_charac line works outside the loop.
I think the ValueError occurs because you are reading the header row of your csv file in the first iteration of your for-loop. The header contains string labels, which can't be converted to integers:
for i in range(nb_charac) will start with i having 0 as the first value.
Then, coords=npa.iloc[[i]] will return the first row (0th row) of your csv-file.
Since you've set header=None in npa=pd.read_csv(path_csv+"char.csv", usecols=[2,3,4,5], header=None), you iterate over strings within your header row.
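So either set header=0 or skip row 0 in the loop. With header=0 the columns are labelled by the header names, so select them by position (for example coords.iloc[0, 0]) rather than with coords[2]. With header=None, use for i in range(1, nb_charac + 1) together with characs[i - 1], because characs already excludes the header row.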
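A minimal sketch of the header=0 variant, assuming the layout implied by the question (the character in column 1 and the four pixel bounds in columns 2 to 5). The sample rows, the column names v2 to v5, and the zero-filled array standing in for the image returned by cv2.imread are all made up for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV contents; the real column names are not known.
csv_text = (
    "id,char,v2,v3,v4,v5\n"
    "0,a,2,3,10,12\n"
    "1,b,5,1,9,8\n"
)

# header=0 makes pandas treat the first row as column names, so the
# numeric columns are parsed as integers instead of strings.
npa = pd.read_csv(io.StringIO(csv_text), usecols=[1, 2, 3, 4, 5], header=0)

# Stand-in for the loaded image: a blank 20x20 image with 3 channels.
img = np.zeros((20, 20, 3), dtype=np.uint8)

crops = []
for charac, v2, v3, v4, v5 in npa.itertuples(index=False):
    # Same slicing order as in the question: rows v2:v4, columns v3:v5.
    crop = img[int(v2):int(v4), int(v3):int(v5)]
    crops.append((charac, crop.shape))

print(crops)
```

Reading the character column in the same read_csv call also removes the need for the separate csv.reader pass, so the labels and the coordinates cannot drift out of step.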
I have this .txt:
'4 1 15 12'
It's just one long line whose items are separated by tabs. I need to read it into a list of int items.
I can't seem to make pandas, csv module or open to do the trick.
This kinda works:
f = open('input.txt')
for line in f:
    memory = line.split()
    for item in memory:
        item = int(item)
print(memory)
['4', '1', '15', '12']
But it gives me an error when I compare its max value to an int:
max_val = max(memory)
while max_val > 0:
TypeError: '>' not supported between instances of 'str' and 'int'
It appears as though the text in the question was not tab spaced.
I have created a tab spaced file and the following works:
import pandas as pd
test_file = "C:\\Users\\lefcoe\\Desktop\\test.txt"
df = pd.read_csv(test_file, delimiter='\t', header=None)
print(df)
#%% convert to a list of ints
my_list = df.loc[0, :].values.tolist()
my_list_int = [int(x) for x in my_list]
my_list_int
#%% get the max
m = max(my_list_int)
print(m)
result:
   0  1   2   3
0  4  1  15  12
15
It's a TypeError: you can't compare a str with an int using >, because they are different types.
max_val = max(memory)
print(type(max_val))
>>> <class 'str'>
Just convert max_val to an int, for example:
max_val = max(memory)
while int(max_val) > 0:
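One caveat worth illustrating: in the question's loop, item = int(item) only rebinds the loop variable, so memory still holds strings, and max() on strings compares them character by character rather than numerically. A minimal sketch that converts the list first, using an inline string as a stand-in for the line read from input.txt:

```python
line = "4\t1\t15\t12\n"  # stand-in for the single line in input.txt

# Build a new list of ints; reassigning the loop variable would not change the list.
memory = [int(item) for item in line.split()]

max_val = max(memory)             # numeric comparison
max_as_text = max(line.split())   # string comparison, shown for contrast

print(memory, max_val, max_as_text)
```

With the list converted up front, max_val is already an int and can be compared with 0 directly.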
I found this thread: how to make a variable change from the text "1m" into "1000000" in Python.
My string values are in a column of a pandas DataFrame. The string/object values look like 18M, 345K, 12.9K, 0, etc.
values = df5['Values']
multipliers = {
    'k': 1e3,
    'm': 1e6,
    'b': 1e9,
}
pattern = r'([0-9.]+)([bkm])'
for number, suffix in re.findall(pattern, values):
    number = float(number)
    print(number * multipliers[suffix])
Running the code gives this error:
Traceback (most recent call last):
File "c:/Users/thebu/Documents/Python Projects/trading/screen.py", line 19, in <module>
for number, suffix in re.findall(pattern, values):
File "C:\Users\thebu\Anaconda3\envs\trading\lib\re.py", line 223, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or bytes-like object
Thanks
Here's another way using regex:
import re
def get_word(s):
    # find word
    r = re.findall(r'[a-z]', s)
    # find numbers
    w = re.findall(r'[0-9]', s)
    if len(r) > 0 and len(w) > 0:
        r = r[0]
        v = multipliers.get(r, None)
        if v:
            w = int(''.join(w))
            w *= v
            return round(w)
df['col2'] = df['col'].apply(get_word)
print(df)
   col      col2
0  10k     10000
1  20m  20000000
Sample Data
df = pd.DataFrame({'col': ['10k', '20m']})
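The TypeError in the question comes from passing the whole column to re.findall, which expects a single string. Below is a minimal sketch of a per-value converter, assuming the suffix may be upper or lower case and that values may contain a decimal point or have no suffix at all (as in 18M, 345K, 12.9K, 0). The name parse_suffixed is hypothetical.

```python
import re

MULTIPLIERS = {"k": 1e3, "m": 1e6, "b": 1e9}
# A number with an optional decimal part, followed by an optional k/m/b suffix.
PATTERN = re.compile(r"([0-9]*\.?[0-9]+)\s*([kmb]?)", re.IGNORECASE)


def parse_suffixed(text):
    """Convert strings such as '12.9K' or '0' to a float; return None if unparseable."""
    match = PATTERN.fullmatch(str(text).strip())
    if match is None:
        return None
    number, suffix = match.groups()
    # Values without a suffix are multiplied by 1.
    return float(number) * MULTIPLIERS.get(suffix.lower(), 1)


values = ["18M", "345K", "12.9K", "0"]
parsed = [parse_suffixed(v) for v in values]
print(parsed)
```

On a DataFrame, the same function can be applied element by element with df5['Values'].apply(parse_suffixed).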
This code reads a CSV file line by line and counts the occurrences of each Unicode code, but I can't understand the two parts of the code below. I've already googled but couldn't find the answer. Could you give me some advice?
1) Why should I use numpy here instead of []?
emoji_time = np.zeros(200)
2) What does -1 mean ?
emoji_time[len(emoji_list)-1] = 1
This is the code result:
0x100039, 47,
0x10002D, 121,
0x100029, 30,
0x100078, 6,
unicode_count.py
import codecs
import re
import numpy as np
file0 = "./message.tsv"
f0 = codecs.open(file0, "r", "utf-8")
list0 = f0.readlines()
f0.close()
print(len(list0))
len_list = len(list0)
emoji_list = []
emoji_time = np.zeros(200)
for i in range(len_list):
    a = "0x1000[0-9A-F][0-9A-F]"
    if "0x1000" in list0[i]: # 0x and 0x1000: same number
        b = re.findall(a, list0[i])
        # print(b)
        for j in range(len(b)):
            if b[j] not in emoji_list:
                emoji_list.append(b[j])
                emoji_time[len(emoji_list)-1] = 1
            else:
                c = emoji_list.index(b[j])
                emoji_time[c] += 1
print(len(emoji_list))
1) If you use a list instead of a NumPy array, the result does not change in this case. You can check this yourself by running the same code with emoji_time = np.zeros(200) replaced by emoji_time = [0]*200.
2) emoji_time[len(emoji_list)-1] = 1 works as follows: when an emoji appears for the first time, it has just been appended to emoji_list, so its count in emoji_time (the array holding how many times each emoji occurred) is set to 1. len(emoji_list)-1 is the index of that newly appended emoji; the minus 1 is needed because list indexing in Python starts at 0.
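As an aside, the same counting can be sketched with collections.Counter from the standard library. This is a swapped-in alternative, not what the original script does; it avoids both the fixed size of 200 and the separate index lookup. The sample lines are made up and stand in for the contents of message.tsv.

```python
import re
from collections import Counter

# Made-up sample lines standing in for the contents of message.tsv.
lines = [
    "hello 0x100039 0x10002D\n",
    "0x100039 again\n",
    "no emoji here\n",
]

pattern = re.compile(r"0x1000[0-9A-F][0-9A-F]")

counts = Counter()
for line in lines:
    # findall returns every code on the line; update adds 1 per occurrence.
    counts.update(pattern.findall(line))

for code, n in counts.items():
    print(f"{code}, {n},")
```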
How can I import this file, which contains plain text with numbers?
It's difficult to import because the first line contains 7 numbers and the second line contains 8 numbers...
In general:
LINE 1: 7 numbers.
LINE 2: 8 numbers.
LINE 3: 7 numbers.
LINE 4: 8 numbers.
... and so on
I tried to read it but could not import it. I need to save the data in a NumPy array.
filepath = 'CHALLENGE.001'
with open(filepath) as fp:
    line = fp.readline()
    cnt = 1
    while line:
        print("Line {}: {}".format(cnt, line.strip()))
        line = fp.readline()
        cnt += 1
LINK TO DATA
This file contains information for each frequency, as explained below:
You'll have to skip the blank lines when reading as well.
Just check if the first line is blank. If it isn't, read 3 more lines.
Rinse and repeat.
Here's an example of both a numpy array and a pandas dataframe.
import pandas as pd
import numpy as np
filepath = 'CHALLENGE.001'
data = []
headers = ['frequency in Hz',
'ExHy coherency',
'ExHy scalar apparent resistivity',
'ExHy scalar phase',
'EyHx coherency',
'EyHx scalar apparent resistivity',
'EyHx scalar phase',
're Zxx/√(µo)',
'im Zxx/√(µo)',
're Zxy/√(µo)',
'im Zxy/√(µo)',
're Zyx/√(µo)',
'im Zyx/√(µo)',
're Zyy/√(µo)',
'im Zyy/√(µo)',
]
with open(filepath) as fp:
    while True:
        line = fp.readline()
        if not len(line):
            break
        fp.readline()
        line2 = fp.readline()
        fp.readline()
        combined = line.strip().split() + line2.strip().split()
        data.append(combined)
df = pd.DataFrame(data, columns=headers).astype('float')
array = np.array(data).astype(float)  # np.float was removed in NumPy 1.24; use the builtin float
# example of type
print(type(df['frequency in Hz'][0]))
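The loop above relies on exactly one blank line following every data line. A variant that filters out blank lines instead, so it does not depend on the exact spacing, might look like the sketch below. The inline sample is made up and only mimics the 7-number / 8-number alternation described in the question.

```python
import io

import numpy as np

# Made-up stand-in for CHALLENGE.001: 7 numbers, blank line, 8 numbers, blank line.
sample = (
    "1 2 3 4 5 6 7\n"
    "\n"
    "8 9 10 11 12 13 14 15\n"
    "\n"
    "16 17 18 19 20 21 22\n"
    "\n"
    "23 24 25 26 27 28 29 30\n"
)

with io.StringIO(sample) as fp:
    # Keep only non-blank lines, each split into its numeric tokens.
    rows = [line.split() for line in fp if line.strip()]

# Join each 7-number line with the 8-number line that follows it.
records = [rows[i] + rows[i + 1] for i in range(0, len(rows) - 1, 2)]

array = np.array(records, dtype=float)
print(array.shape)
```

For the real file, io.StringIO(sample) would be replaced by open(filepath).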
d =dict(input('Enter a dictionary'))
sum = 0
for i in d.values():
    sum +=i
print(sum)
Input at the prompt: Enter a dictionary{'a': 100, 'b':200, 'c':300}
This is the error that arises:
Traceback (most recent call last):
File "G:/DurgaSoftPython/smath.py", line 2, in <module>
d =dict(input('Enter a dictionary'))
ValueError: dictionary update sequence element #0 has length 1; 2 is required
You can't create a dict from a string using the dict constructor, but you can use ast.literal_eval:
from ast import literal_eval
d = literal_eval(input('Enter a dictionary'))
s = 0 # don't name your variable `sum` (which is a built-in Python function
# you could've used to solve this problem)
for i in d.values():
    s +=i
print(s)
Output:
Enter a dictionary{'a': 100, 'b':200, 'c':300}
600
Using sum:
d = literal_eval(input('Enter a dictionary'))
s = sum(d.values())
print(s)
import json
inp = input('Enter a dictionary')
inp = dict(json.loads(inp))
total = sum(inp.values())  # avoid shadowing the built-in sum
print(total)
input Enter a dictionary{"a": 100, "b":200, "c":300}
output 600
The input function always returns a string. To get a valid Python dict, you need to evaluate that string and convert it into a dict.
One way to do this is with literal_eval from the ast module.
Here is an example:
from ast import literal_eval as le
d = le(input('Enter a dictionary: '))
_sum = 0
for i in d.values():
    _sum +=i
print(_sum)
Demo:
Enter a dictionary: {'a': 100, 'b':200, 'c':300}
600
PS: Another way is to use eval, but it's not recommended because it executes arbitrary code contained in the input.
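Since the text comes from a user, it may also help to reject anything that is not a dict literal. A minimal sketch, with a fixed string standing in for the result of input(); the helper name sum_dict_values is hypothetical.

```python
from ast import literal_eval


def sum_dict_values(text):
    """Parse text as a Python dict literal and return the sum of its values."""
    try:
        d = literal_eval(text)
    except (ValueError, SyntaxError):
        raise ValueError("input is not a valid Python literal")
    if not isinstance(d, dict):
        raise ValueError("input is not a dictionary")
    return sum(d.values())


# Fixed string standing in for input('Enter a dictionary: ').
total = sum_dict_values("{'a': 100, 'b': 200, 'c': 300}")
print(total)
```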