How to get sum of floats in python? - python-3.x

I have a file in which each line contains a Persian sentence, a tab, a Persian word, another tab, and an English word. I also have a dictionary with word keys and float values. For each line, I have to find the words that are also in the dictionary and look up their values, take the logarithm of each value, and then sum the logarithms separately for each line. The problem is that when I try to calculate the sum, this error occurs: TypeError: 'float' object is not iterable. How can I fix it?
import math
probabilities = {"شور": 0.02, "نمک": 0.05,"زندگی": 0.07, "غذاهای": 0.01, "غذای": 0.05}
filename = "F.txt"
for line in open(filename, encoding="utf-8"):
    list_line = line.split("\t")
    words = list_line[0].split()
    for key, value in probabilities.items():
        for word in words:
            if word == key:
                result = sum(float(math.log(value)))
                print(word, result, end=" ")
    print()
When I run it, this error appears:
Traceback (most recent call last):
  File "C:\example.py", line 14, in <module>
    result = sum(float(math.log(value)))
TypeError: 'float' object is not iterable
F.txt (https://www.dropbox.com/s/ag5at9iuuln2x02/F.txt?dl=0):
شور ورود دانشگاه جالب توجه شور passion
۱۳ راهکار شور اشتیاق واقعی زندگی شور passion
نمک موجود ذائقه غذاهای شور عادت شور salty
از مضرات نمک غذای شور بدانید شور salty
I have to calculate the sum of each line separately and have just one number for each line at last.

Your code has several problems (you may skip to point #4):
1. Your dictionary has a syntax error with the quotes.
2. You're splitting a file handle, not lines.
3. You create a double loop to search for keys when you already have a dictionary; a direct lookup is enough.
4. sum() is for iterables, and math.log(value) returns a single float. You just need result += math.log(value), with result initialized to 0 outside the loop over the words.
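Point 4 can be sketched roughly as follows. This is only an illustration, not the asker's final script: the sample lines are inlined instead of being read from F.txt, the tab-separated layout is assumed from the question, and line_log_sum is a hypothetical helper name.

```python
import math

probabilities = {"شور": 0.02, "نمک": 0.05, "زندگی": 0.07, "غذاهای": 0.01, "غذای": 0.05}

def line_log_sum(line, probs):
    """Sum log(probability) over the sentence words of one line that are in probs."""
    sentence = line.split("\t")[0]  # the first tab-separated field is the sentence
    # sum() needs an iterable, so it is given a generator of floats, one per match
    return sum(math.log(probs[word]) for word in sentence.split() if word in probs)

# Inlined stand-ins for two lines of F.txt (assumed layout: sentence, tab, word, tab, word)
sample_lines = [
    "شور ورود دانشگاه جالب توجه\tشور\tpassion",
    "از مضرات نمک غذای شور بدانید\tشور\tsalty",
]

for line in sample_lines:
    print(line_log_sum(line, probabilities))  # one number per line
```

The dictionary lookup replaces the double loop from point 3, and a line with no matching words simply yields 0.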

Related

Find, multiply and replace numbers in strings, by line

I'm scaling the Gcode for my CNC laser power output. The laser's "S" value maxes at 225 and the current file scale is 1000. I need to multiply only/all S values by .225, omit S values of 0, and replace in the string for each line. There are pre-designated "M", "G", "X", "Y", "Z", "F", and "S" in the Gcode for axis movement and machine functions.
Note: I can't do this manually as there's like 7.5k lines of code.
Hoping for .py with an outcome like (top 3 lines):
Old> G1Y0.1S0 New> G1Y0.1S0
Old> G1X0.1S248 New> G1X0.1S55.8
Old> G1X0.1S795.3 New> G1X0.1S178.9
Example file Code:
G1Y0.1S0
G1X0.1S248
G1X0.1S795.3
G1X0.2S909.4
G1X0.1S874
G1X0.1S374
G1X1.1S0
G1X0.1S610.2
G1X0.1S893.7
G1X0.6S909.4
G1X0.1S893.7
G1X0.1S661.4
G1X0.1S157.5
G1X0.1Y0.1S0
G1X-0.1S66.9
G1X-0.1S539.4
G1X-0.2S909.4
G1X-0.1S897.6
G1X-0.1S811
G1X-0.1S515.7
G1X-0.1S633.9
G1X-0.1S874
G1X-0.3S909.4
G1X-0.1S326.8
G1X-0.8S0
Tried this:
import os
import sys
import fileinput
print("Text to Search For:")
textToSearch = input("> ")
print("Set Max Power Output:")
valueMult = input("> ")
print("File to work:")
fileToWork = input("> ")
tempFile = open(fileToWork, 'r+')
sValue = int
for line in fileinput.input (fileToWork):
    if textToSearch in line:
        c = str(textToSearch,(sValue)) #This is where I'm stuck.
        print("Match Found >> ", sValue)
    else:
        print("Match Not Found")
    tempFile.write(line.replace(textToSearch, (sValue,"(sValue * (int(valueMult)/1000))")))
tempFile.close()
#input("\n\n Press Enter to Exit")
Output:
Text to Search For:
> S
Set Max Power Output:
> 225
File to work:
> test.rtf
Match Not Found
Traceback (most recent call last):
  File "/Users/iamme/Desktop/ConvertGcode.py", line 25, in <module>
    tempFile.write(line.replace(textToSearch, (sValue,"(sValue * (int(valueMult)/1000))")))
TypeError: replace() argument 2 must be str, not tuple
>>>
test.rtf file:
Hello World
X-095Y15S434.5
That is Solid!
Your code has a couple of issues that need to be addressed:
first, you declare the sValue variable but never assign it the value from every line in your loop,
second, said variable is an int, but should be a float or you'll lose the decimal part seen in the file,
and third, since you're not getting the corresponding values, you're not multiplying the aforementioned values by the new scale factor to then replace the old with this.
Additionally, you're opening the original file in read/write mode (r+), but I would recommend you write to a new file instead.
Now, here is your code with fixes and changes (I'm taking the liberty to write variable names in Python style):
multiplier = input("New max power output for S: ")
input_file = input("Input file: ")
output_file = input("Output file: ")
with open(input_file, 'r') as source, open(output_file, 'w') as target:
    for line in source:
        if 'S' in line:
            line = line.removesuffix('\n')
            split_line = line.split('S', -1)
            new_value = float(split_line[1]) * float(multiplier)
            new_line = f'{split_line[0]}S{new_value:.1f}\n'
            print(f'Old> {line:25s}New> {new_line}', end='')
            target.write(new_line)
        else:
            target.write(line)
As you can see, we open both source and target files at the same time. By using the with statement, the files are closed at the end of that block.
The code assumes the text to search will appear no more than once per line.
When a match is found, we need to remove the newline from the line (\n) so it's easy to work with the number after the S. We split the line in two parts (stored in the list split_line), and convert the second element (S's value) to a float and multiply it by the entered multiplier. Then we construct the new line with its new value, print the old and new lines, and write it to the target file. We also write the line to the target file when a match isn't found so we don't lose them.
IMPORTANT: this code also assumes no additional values appear after S{value} in the lines, as per your sample. If that is not the case, this code will fail when reaching those lines.
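If other words can follow the S value, or if S0 has to stay exactly as written, a regular expression is one possible alternative to splitting on 'S'. The following is a minimal sketch under stated assumptions: S values are non-negative integers or decimals, the factor 0.225 comes from the question, and S_WORD and scale_s_values are hypothetical names.

```python
import re

# An S word: the letter S followed directly by an integer or decimal number
S_WORD = re.compile(r"S(\d+(?:\.\d+)?)")

def scale_s_values(line, factor):
    """Return line with every nonzero S value multiplied by factor (one decimal place)."""
    def _scale(match):
        value = float(match.group(1))
        if value == 0:
            return match.group(0)  # leave S0 exactly as it was
        return f"S{value * factor:.1f}"
    return S_WORD.sub(_scale, line)

for old in ["G1Y0.1S0", "G1X0.1S248", "G1X0.1S795.3"]:
    print(f"Old> {old} New> {scale_s_values(old, 0.225)}")
```

Because the pattern requires a digit right after the S, text such as "That is Solid!" passes through unchanged.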

Trying to print the line a string appears in a text file

I am analysing an episode of Brooklyn 99 specifically trying to find the line number in a text file where Gina says Scully looks 'like an eggplant' but my code isn't working, any help would be appreciated, I am using jupyter and not getting an error message when running my code.
f = open(r'C:\Users\bubba\Downloads\B99_episode_.txt', 'r')
print(f)
# Choosing TERRY
# Initialising the value of count as -1 because it appears in the cast list
count = -1
terry_in_f = f.readlines()
for line in terry_in_f:
    if 'TERRY' in line:
        count = count + 1
print(count)
# Finding the line number in which Gina states 'like an eggplant'
for index, line in enumerate(f):
    if line.lower() == "like an eggplant":
        print(index)
        break
if "like an eggplant" will always enter the block because "like an eggplant" isn't falsey. You need to check the actual line from the file is equal to the string you're looking for. So it should be if line == "like an eggplant".
Also, you want to print the line number. You can use enumerate() to give you the index of the line you're on instead of just printing the actual line itself.
for index, line in enumerate(f):
    if line.lower() == "like an eggplant":
        print(index)
        break
Lastly, instead of doing a hard comparison of if line == "like an eggplant":, it may be better to do if "like an eggplant" in line:. This will return true if the string "like an eggplant" is in the script line, even if there is some surrounding noise. For example, if the script says "Gina: like an eggplant", having a direct comparison would return false. Checking if the string is inside the line would return True. It gives you more flexibility.
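Two further details may explain why nothing is printed at all. After f.readlines() the file object is exhausted, so a second loop over f has nothing left to read, and a line read from a file normally ends with "\n", so an exact == comparison fails even on a line that contains only the phrase. A minimal sketch of the substring approach, using a made-up three-line stand-in for the episode file and a hypothetical helper find_line_number:

```python
import io

# Made-up stand-in for the episode file; the real script would use open(path)
script = io.StringIO(
    "CAST: TERRY, GINA, SCULLY\n"
    "TERRY: Morning, everyone.\n"
    "GINA: Scully, you look like an eggplant.\n"
)

lines = script.readlines()  # reads everything; iterating script again yields nothing

# Count the lines mentioning TERRY, minus the one in the cast list
terry_count = sum("TERRY" in line for line in lines) - 1

def find_line_number(lines, phrase):
    """Return the 1-based number of the first line containing phrase, or None."""
    for number, line in enumerate(lines, start=1):
        if phrase.lower() in line.lower():
            return number
    return None

print(terry_count)
print(find_line_number(lines, "like an eggplant"))
```

Both steps reuse the list returned by readlines(), so the file only has to be read once.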

Finding a List Position Represented by a Variable

Basically I have a variable equal to a number and want to find the number in the position represented by the variable. This is what I
numbertocheck =1
loopcriteria = 1
while loopcriteria == 1:
    if numbertocheck in ticketnumber:
        entryhour.append[numbertocheck] = currenttime.hour
        entryminute.append[numbertocheck] = currenttime.minute
        print("Thank you. Your ticket number is", numbertocheck)
        print("There are now", available_spaces, "spaces available.")
        loopcriteria = 2
I get this error (in pyCharm):
Traceback (most recent call last): File
"/Users/user1/Library/Preferences/PyCharmCE2017.3/scratches/scratch_2.py",
line 32, in entryhour.append[numbertocheck] =
currenttime.hour TypeError: 'builtin_function_or_method' object does
not support item assignment
How do I do what I'm trying to do?
Though you haven't provided the complete code, I think the only problem is how you use append. append is a method, so it has to be called with parentheses and cannot be followed by [] for item assignment. To insert at a particular position, you need insert.
Replace the relevant lines with:
entryhour.insert(numbertocheck,currenttime.hour)
entryminute.insert(numbertocheck,currenttime.minute)
# available_spaces-=1 # guessing you need this too here?
P.S. your loop doesn't seem to make sense, I hope you debug it yourself if it doesn't work the way you want.
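For comparison, here is a small sketch of the three list operations involved. The starting values in entryhour are made up for illustration.

```python
entryhour = [9, 10, 11]   # hypothetical existing entry hours

entryhour.append(12)      # append is called with (): adds 12 at the end
entryhour.insert(1, 8)    # insert puts 8 at index 1 and shifts later items right
entryhour[0] = 7          # index assignment overwrites an existing position

print(entryhour)          # [7, 8, 10, 11, 12]
```

Note that insert grows the list, while index assignment only works for a position that already exists.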

pandas to_numeric couldn't convert string values to integers

I am trying to use pandas.to_numeric to convert a series to ints.
df['numeric_col'] = pd.to_numeric(df['numeric_col'], errors='raise')
I got errors,
Traceback (most recent call last):
File "/home/user_name/script.py", line 86, in execute
data = module(**module_args).execute(data)
File "/home/user_name/script.py", line 62, in execute
invoices['numeric_invoice_no'] = pd.to_numeric(invoices['numeric_invoice_no'], errors='raise')
File "/usr/local/lib/python3.5/dist-packages/pandas/core/tools/numeric.py", line 126, in to_numeric
coerce_numeric=coerce_numeric)
File "pandas/_libs/src/inference.pyx", line 1052, in pandas._libs.lib.maybe_convert_numeric (pandas/_libs/lib.c:56638)
ValueError: Integer out of range. at position 106759
if I change it to,
df['numeric_col'] = pd.to_numeric(df['numeric_col'], errors='coerce')
the values in numeric_col will not convert to ints, i.e. they are still strings.
if I changed to,
df['numeric_col'] = df['numeric_col'].astype(int)
I got error,
OverflowError: Python int too large to convert to C long
so I have to change it to,
df['numeric_col'] = df['numeric_col'].astype(float)
then there was no error generated.
The size of the series is about 994572, the strings in the column are like 52333612273, 56032860 or 02031757.
I am wondering what are the issues with to_numeric and astype here.
I am running Python 3.5 on Linux mint 18.1 64-bit.
Maybe you have a comma (,) within your numeric string values, or null values (NaN) in the column of your dataframe. Try removing the commas with the
.str.replace() method
and then drop or fill in the null values with
.fillna(), .replace() or .dropna()
before using
df['DataFrame Column'] = df['DataFrame Column'].astype(int)
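The messages "Integer out of range" and "Python int too large to convert to C long" suggest, though the question does not confirm it, that at least one string holds a number too large for a 64-bit integer. A plain-Python sketch for locating such values before converting; out_of_int64_range is a hypothetical helper, and the last sample value is made up:

```python
INT64_MAX = 2**63 - 1

def out_of_int64_range(values):
    """Return (position, text) pairs whose integer value does not fit in int64."""
    bad = []
    for position, text in enumerate(values):
        try:
            number = int(text)
        except (TypeError, ValueError):
            continue  # non-numeric entries are a separate problem
        if not -INT64_MAX - 1 <= number <= INT64_MAX:
            bad.append((position, text))
    return bad

# The first three values are from the question; the last one is invented
sample = ["52333612273", "56032860", "02031757", "99999999999999999999999"]
print(out_of_int64_range(sample))
```

On a real dataframe the same helper could be given df['numeric_col'].tolist().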

How to get first synset from list of Sentiwordnet?

I have a review text and I want to determine whether it is positive or negative. I'm using Sentiwordnet to get the score of each word in the review. My problem is that each word has multiple synsets and I want only the first one:
for example:
swn.senti_synsets('slow')
[SentiSynset('decelerate.v.01'), SentiSynset('slow.v.02'), \
SentiSynset('slow.v.03'), SentiSynset('slow.a.01'), SentiSynset('slow.a.02'), \
SentiSynset('slow.a.04'), SentiSynset('slowly.r.01'), SentiSynset('behind.r.03')]
I want the first one which is SentiSynset('decelerate.v.01')
Here is my code:
Text = " I love the movie but hate the music"
word_tok = word_tokenize(Text)
for i in word_tok :
    g = nltk.tag.pos_tag([i])
    for word, tag in g:
        if tag.startswith('JJ'):
            new = 'a'
        elif tag.startswith('V'):
            new = 'v'
        elif tag.startswith('R'):
            new = 'r'
        else:
            new =''
        if new != '':
            synsets = list(swn.senti_synsets(word, new))
            b = synsets[0]
First I tokenize the text. Then I get the tag of each word and change it to the tag that is recognized by Sentiwordnet. If the word is an adjective, adverb or verb, I want its first synset to get the pos/neg score.
When I run this script I get the error
Traceback (most recent call last):
File "C:\Python34\test2.py", line 39, in <module>
b = synsets[0]
IndexError: list index out of range
Can anyone see where I went wrong in my code?
Thanks in advance
