How to get first synset from list of Sentiwordnet? - python-3.x

I have a review text and I want to determine whether it is positive or negative. I'm using SentiWordNet to get the score of each word in the review. My problem is that each word has multiple synsets, and I want only the first one:
for example:
swn.senti_synsets('slow')
[SentiSynset('decelerate.v.01'), SentiSynset('slow.v.02'), \
SentiSynset('slow.v.03'), SentiSynset('slow.a.01'), SentiSynset('slow.a.02'), \
SentiSynset('slow.a.04'), SentiSynset('slowly.r.01'), SentiSynset('behind.r.03')]
I want the first one which is SentiSynset('decelerate.v.01')
Here is my code:
Text = " I love the movie but hate the music"
word_tok = word_tokenize(Text)
for i in word_tok :
    g = nltk.tag.pos_tag([i])
    for word, tag in g:
        if tag.startswith('JJ'):
            new = 'a'
        elif tag.startswith('V'):
            new = 'v'
        elif tag.startswith('R'):
            new = 'r'
        else:
            new =''
        if new != '':
            synsets = list(swn.senti_synsets(word, new))
            b = synsets[0]
First I tokenize the text, then I get the tag of each word and change it to the tag that SentiWordNet recognizes. If the word is an adjective, adverb or verb, I want its first synset so I can get the pos/neg score.
When I run this script I get the error:
Traceback (most recent call last):
File "C:\Python34\test2.py", line 39, in <module>
b = synsets[0]
IndexError: list index out of range
Can anyone see where I went wrong in my code?
Thanks in advance
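One likely cause, assuming swn.senti_synsets(word, new) returns nothing for some word and tag combinations (for example when a token is tagged as a verb but SentiWordNet has no verb entry for it), is that synsets is an empty list, so synsets[0] raises IndexError. A minimal sketch of a guard follows; first_or_none is a hypothetical helper, and the lists are plain stand-ins for the real synset lists so the snippet runs without NLTK:

```python
def first_or_none(items):
    """Return the first element of items, or None if items is empty."""
    return next(iter(items), None)


# Stand-ins for list(swn.senti_synsets(word, pos)); the values are illustrative only.
found = ["decelerate.v.01", "slow.v.02", "slow.v.03"]
missing = []  # what an unknown word/tag combination is assumed to produce

first_found = first_or_none(found)
first_missing = first_or_none(missing)

print(first_found)    # the first synset when there is one
print(first_missing)  # None instead of an IndexError
```

In the loop from the question, the same idea would be to skip the word when the helper returns None instead of indexing the list directly.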

Related

How to get elements from a text file and searching them in another text file in python 3?

I have a file, players.txt, that looks like this:
Iteration: 1
Lloris
Kounde
Varane
Upamecano
Hernandez
Tchoumeni
Rabiot
Dembele
Griezman
Mbappe
Giroud
Iteration:2
Martinez
Molina
Otamendi
Romero
Tagliafico
Mac Allister
De Paul
Fernandez
Alvarez
Messi
Lautaro
Iteration 3:
xxxxx
yyyyy
zzzzz
namenamename
name
name
name
And I have superstarplayers.txt, which looks like this:
Messi
Modric
Ronaldo
Mbappe
playername
playername
playername
playername
It goes on like this, with a lot of player names.
There are no blank lines in superstarplayers.txt.
But in players.txt there is a blank line after each player line: the 1st row is a player, the 2nd row is blank, the 3rd row is a player, the 4th row is blank. One player, one blank.
For each iteration, I want to get the number of players in that iteration, and also the number of those players who also appear in superstarplayers.txt.
How can I do that in Python 3? I am learning Python file operations.
(In players.txt there are not always 11 players in an iteration; the number may differ.)
I'm learning how to read from and write to a txt file, but I have no idea how to get elements from one text file and search for them in another in Python 3. How can I do that?
You can do what you are asking with this:
def fix_data(lst):
    """Strips newlines and surrounding whitespace from every line"""
    #remove the newline character from the strings if there
    remove_newline = [x.replace("\n","") for x in lst]
    #strip surrounding whitespace (blank lines become empty strings, they are not dropped)
    remove_empty = [x.strip() for x in remove_newline]
    #return data
    return remove_empty

#open a text file for reading like this - its wrapped in a fix_data function
with open('/home/lewis/players.txt', 'r') as f:
    lines = fix_data(f.readlines())
#open superstar players - also wrapped
with open('/home/lewis/superstar.txt', 'r') as f:
    superstar_lines = fix_data(f.readlines())
#create a iteration dictionary.
iteration_store = {}
#create a variable to store the iteration no
iteration = 0
#create a variable to store the number of lines seen since the last iteration line
iteration_start = 0
#create a variable to store unknown player no
unknown_player = 1
#loop each line and add to dict on each iteration
for line in lines:
    #count this line relative to the last iteration line
    iteration_start += 1
    #check if an iteration line
    if "Iteration" in line:
        # get the iteration number from the string by splitting the string on 'Iteration:' and taking the last value, then removing white space
        iteration = line.split("Iteration:")[-1].strip()
        #store this value in the dict as a list
        iteration_store[iteration] = []
        #set to zero
        iteration_start = 0
    # after the iteration line is found then find the player name in every other space
    elif iteration_start % 2 != 0:
        #check if the player is unknown and rename the line
        if line == "" or line is None:
            #set the line
            line = f"Unknown Player {unknown_player}"
            #increment the player number
            unknown_player += 1
        #if not an iteration line then store the player name in the dict[list]
        iteration_store[iteration].append(line)
#print the information
print(iteration_store)
#print a new line to separate the data
print("\n\n")
#loop the newly created dict and print each iteration number and count of players
for k,v in iteration_store.items():
    # check the superstar list for matches
    superstars = [x for x in v if x in superstar_lines]
    #print the information
    print(f"Iteration: {k} - No Players {len(v)} - No Superstars {len(superstars)}")
EDIT:
Changed the code to take the player name from every other line after the "Iteration" line has been found. This will only work if the file ALWAYS follows this format:
iteration line
player name (empty or not)
empty string
player name (empty or not)
ALSO
If a superstar player is not found, then something doesn't match, for example the name is written messi in one file and Messi in the other. The comparison is case-sensitive, so these are different. The code is sound.
Read the code, research and play with it. Google is your friend.
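As an alternative sketch (not the code above), the counting can also be done by skipping blank lines instead of relying on the strict every-other-line layout, and by putting the superstar names in a set for fast lookups. The function name count_players and the in-memory sample lists are assumptions for illustration; in practice the lists would come from f.readlines() on the two files. Note that, unlike the code above, this version does not count a missing player name as an "Unknown Player".

```python
def count_players(player_lines, superstar_lines):
    """Return {iteration header: [player count, superstar count]}."""
    # A set makes each "is this player a superstar?" check cheap.
    superstars = {name.strip() for name in superstar_lines if name.strip()}
    counts = {}
    current = None
    for raw in player_lines:
        line = raw.strip()
        if not line:
            continue  # blank separator line
        if line.startswith("Iteration"):
            current = line  # keep the header text as the key
            counts[current] = [0, 0]
        elif current is not None:
            counts[current][0] += 1
            if line in superstars:
                counts[current][1] += 1
    return counts


# Small sample in the same shape as the files in the question.
players = ["Iteration: 1\n", "Lloris\n", "\n", "Mbappe\n", "\n",
           "Iteration:2\n", "Messi\n", "\n", "Romero\n", "\n", "Alvarez\n"]
stars = ["Messi\n", "Modric\n", "Ronaldo\n", "Mbappe\n"]

result = count_players(players, stars)
for header, (n_players, n_stars) in result.items():
    print(f"{header} - players: {n_players} - superstars: {n_stars}")
```

The match is still case-sensitive, so messi and Messi count as different names here too.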

Find, multiply and replace numbers in strings, by line

I'm scaling the Gcode for my CNC laser power output. The laser's "S" value maxes out at 225 and the current file scale is 1000. I need to multiply every S value (and only the S values) by 0.225, leave S values of 0 unchanged, and write the new value back into each line. There are pre-designated "M", "G", "X", "Y", "Z", "F", and "S" words in the Gcode for axis movement and machine functions.
Note: I can't do this manually as there are about 7.5k lines of code.
I'm hoping for a .py script with an outcome like this (top 3 lines):
Old> G1Y0.1S0 New> G1Y0.1S0
Old> G1X0.1S248 New> G1X0.1S55.8
Old> G1X0.1S795.3 New> G1X0.1S178.9
Example file Code:
G1Y0.1S0
G1X0.1S248
G1X0.1S795.3
G1X0.2S909.4
G1X0.1S874
G1X0.1S374
G1X1.1S0
G1X0.1S610.2
G1X0.1S893.7
G1X0.6S909.4
G1X0.1S893.7
G1X0.1S661.4
G1X0.1S157.5
G1X0.1Y0.1S0
G1X-0.1S66.9
G1X-0.1S539.4
G1X-0.2S909.4
G1X-0.1S897.6
G1X-0.1S811
G1X-0.1S515.7
G1X-0.1S633.9
G1X-0.1S874
G1X-0.3S909.4
G1X-0.1S326.8
G1X-0.8S0
Tried this:
import os
import sys
import fileinput
print("Text to Search For:")
textToSearch = input("> ")
print("Set Max Power Output:")
valueMult = input("> ")
print("File to work:")
fileToWork = input("> ")
tempFile = open(fileToWork, 'r+')
sValue = int
for line in fileinput.input (fileToWork):
    if textToSearch in line:
        c = str(textToSearch,(sValue)) #This is where I'm stuck.
        print("Match Found >> ", sValue)
    else:
        print("Match Not Found")
    tempFile.write(line.replace(textToSearch, (sValue,"(sValue * (int(valueMult)/1000))")))
tempFile.close()
#input("\n\n Press Enter to Exit")
Output:
Text to Search For:
> S
Set Max Power Output:
> 225
File to work:
> test.rtf
Match Not Found
Traceback (most recent call last):
File "/Users/iamme/Desktop/ConvertGcode.py", line 25, in <module>
tempFile.write(line.replace(textToSearch, (sValue,"(sValue * (int(valueMult)/1000))")))
TypeError: replace() argument 2 must be str, not tuple
>>>
test.rtf file:
Hello World
X-095Y15S434.5
That is Solid!
Your code has a couple of issues that need to be addressed:
first, you declare the sValue variable (bound to the int type itself) but never assign it the value found on each line in your loop,
second, the value should be a float rather than an int, or you'll lose the decimal part seen in the file,
and third, since you never extract the values, you never multiply them by the new scale factor and so have nothing to replace the old values with.
Additionally, you're opening the original file in read/write mode (r+), but I would recommend you write to a new file instead.
Now, here is your code with fixes and changes (I'm taking the liberty to write variable names in Python style):
multiplier = input("Multiplier for S values (e.g. 0.225): ")
input_file = input("Input file: ")
output_file = input("Output file: ")
with open(input_file, 'r') as source, open(output_file, 'w') as target:
    for line in source:
        if 'S' in line:
            # strip the trailing newline (str.removesuffix needs Python 3.9+)
            line = line.removesuffix('\n')
            # split on 'S': the text before it, and the S value after it
            split_line = line.split('S', -1)
            new_value = float(split_line[1]) * float(multiplier)
            new_line = f'{split_line[0]}S{new_value:.1f}\n'
            print(f'Old> {line:25s}New> {new_line}', end='')
            target.write(new_line)
        else:
            target.write(line)
As you can see, we open both source and target files at the same time. By using the with statement, the files are closed at the end of that block.
The code assumes the text to search will appear no more than once per line.
When a match is found, we need to remove the newline from the line (\n) so it's easy to work with the number after the S. We split the line in two parts (stored in the list split_line), and convert the second element (S's value) to a float and multiply it by the entered multiplier. Then we construct the new line with its new value, print the old and new lines, and write it to the target file. We also write the line to the target file when a match isn't found so we don't lose them.
IMPORTANT: this code also assumes no additional values appear after S{value} in the lines, as per your sample. If that is not the case, this code will fail when reaching those lines.
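If lines may carry other words after the S value, or if S0 should stay exactly as written, a regular expression is one possible alternative. This is only a sketch under assumptions: scale_s_values is a hypothetical helper, S values are assumed to be non-negative decimals written directly after an upper-case S, and the results are rounded to one decimal place as in the expected output from the question.

```python
import re

# "S" followed by a non-negative number such as 248 or 795.3
S_VALUE = re.compile(r"S(\d+(?:\.\d+)?)")


def scale_s_values(line, factor):
    """Multiply every non-zero S value in line by factor, keeping the rest of the line."""
    def repl(match):
        value = float(match.group(1))
        if value == 0:
            return match.group(0)  # leave S0 exactly as written
        return f"S{value * factor:.1f}"
    return S_VALUE.sub(repl, line)


for old in ["G1Y0.1S0", "G1X0.1S248", "G1X0.1S795.3"]:
    print(f"Old> {old:15s}New> {scale_s_values(old, 0.225)}")
```

Because only the matched S value is replaced, lines without a numeric S value pass through unchanged.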

Removing a line from a file

I have a file named (data.txt):
243521,Biscuit,Flour:Cream,89.5,9,1
367534,Bread,Flour,67.3,1,2
463254,Chocolate,Cocoa butter:Sugar:Milk powder,45.6,4,0
120014,Buns,Wheat Flour,24.9,5,2
560214,Cake,Flour:Baking Powder:Cake Mix,70.5,3,1
123456,burger,bread crumbs:beef:tomato,99.9,10,0
The number after the last comma is the number of sold items. I want to write code that deletes a line only if the number after the last comma is 0. This is the code I wrote, but it removes the line even if the number after the last comma is not zero:
productID=input("")
with open("data.txt","r+") as file:
    lines= file.readlines()
    file.seek(0)
    for line in lines:
        productInfo= line.split(",")
        y=0
        if productInfo[5]>"0":
            if y==0:
                print("Product cannot be removed: sold items must be 0")
                y=1
        elif productID not in line:
            file.write(line)
    file.truncate()
    print("Product is removed successfully")
I regret that I do not understand what you are asking for. If you have trouble expressing a difficult question, try asking the question to a friend, and then write down what you say.
Other than the noise that y introduces for no reason, the other odd thing about this code is this comparison:
productInfo[5]>"0"
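That comparison probably does not do what you expect: it compares the two strings character by character rather than as numbers, and because productInfo[5] normally still ends with the newline read from the file, even "0\n" > "0" is true.
I believe you just want to know if the last token is a "0" or not. For this it is better to test for equality or inequality, instead of attempting to perform a value comparison of two strings.
String equality can be tested with ==. Check for inequality with !=.
I believe you want this (with the trailing newline stripped first):
productInfo[5].strip() != "0"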
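A short sketch of why comparing the raw text field with > misleads, and of comparing numbers instead. The helper name sold_count is hypothetical; the two sample lines are taken from data.txt in the question.

```python
def sold_count(line):
    """Return the sold-items count (the last comma-separated field) as an int."""
    return int(line.rsplit(",", 1)[-1].strip())


raw_zero = "463254,Chocolate,Cocoa butter:Sugar:Milk powder,45.6,4,0\n"
raw_two = "367534,Bread,Flour,67.3,1,2\n"

# Comparing the raw text field: the trailing newline makes "0\n" sort after "0".
text_compare = raw_zero.split(",")[5] > "0"

# Comparing numbers behaves as intended.
zero_is_sold = sold_count(raw_zero) > 0
two_is_sold = sold_count(raw_two) > 0

print(text_compare, zero_is_sold, two_is_sold)
```

Converting to int also avoids surprises with multi-digit counts, since as strings "10" sorts before "9".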
From what I have understood, you have a file that contains comma-separated data and last value is of your interest. If that value is 0, you want to remove that line.
The general idea is to read the lines of the file, split each line on "," and look at the last item. Your mistake, as many have pointed out, is comparing strings with >, which compares them character by character rather than as numbers. The following code works for me with your sample data:
#reading the lines in data as list
with open("data.txt", "r") as f:
    lines = f.readlines()
new_array = []
#empty array so we can populate it with the lines we want to keep
user_input = input("Enter a number: ") #Capturing user input
for line in lines: #iterating over all lines
    fields = line.split(",") #splitting , in line
    if fields[0] != user_input:
        #first item is not user input: keep the line
        new_array.append(line)
    elif fields[-1].strip() != "0":
        #first item is user input but last item is not 0: keep the line
        print("cannot remove, sold items must be 0:", line.strip())
        new_array.append(line)
    else:
        #first item is user input and last item is 0: drop the line
        print("ignoring:", line.strip())
with open("data2.txt", "w") as f: #creating new data file
    f.writelines(new_array) #writing new array to new datafile
#data2.txt now contains every line except the removed one

Python coinflip code problem: How to delete lines in .text files?

So I'm trying to write a basic coinflip program that I will implement on the web. I need the three .text files (heads, crowns and total) to keep the overall values. Is there an algorithm or a module that lets you delete the previous content of the file?
(Sorry for anything wrong with my question, it is my first time asking on Stack.)
I tried running the code and it works. My problem is that after it reads and tries to write the new number, the new number gets written after the previous one. My only experience with file handling was in C, and in C if you write it makes a new file.
def main():
    tot_num = open('total.txt', 'r+')
    while True:
        try:
            x = input('Flip(F) or Exit(E)').lower()
        except ValueError:
            print('You had ur options try again')
        else:
            if x == 'f' or x == 'flip':
                cf = coin_flip()
                if cf == 'head':
                    print('Coin --> HEAD')
                    heads = open('heads.txt', 'r+')
                    h_num = int(heads.read())
                    heads.write(f'{h_num + 1}')
                    tn = int(tot_num.read())
                    tot_num.write(f'{tn + 1}')
                    heads.close()
                    show_coin_flip_num()
                elif cf == 'crown':
                    print('Coin --> CROWN')
                    crowns = open('crown.txt', 'r+')
                    c_num = int(crowns.read())
                    crowns.write(f'{c_num + 1}')
                    tn = int(tot_num.read())
                    tot_num.write(f'{tn + 1}')
                    crowns.close()
                    show_coin_flip_num()
                else:
                    break
            else:
                print('Exiting...')
                break
The error is basically there because after the new number is added it goes next to the previous one, so it can't read it normally the next time. It takes '012' from the file.
Traceback (most recent call last):
File "file_path", line 462, in <module>
main()
File "file_path", line 442, in main
tn = int(tot_num.read())
ValueError: invalid literal for int() with base 10: ''
I made my program work, but it is peculiar that there is no function to delete a specific line from a file. I will try to make one myself and post it here, because all the answers in
https://stackoverflow.com/q/4710067/7987118 are used to remove specific strings from a text file.
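For the counter files themselves, one common pattern is to rewind and truncate after reading. In 'r+' mode, read() leaves the file position at the end, so a following write() appends (0 becomes 01, then 012), and a second read() on a file that is still open returns '', which is what int() rejects in the traceback. A minimal sketch, where increment_counter is a hypothetical helper and a temporary file stands in for total.txt:

```python
import os
import tempfile


def increment_counter(path):
    """Read an integer from path, add 1, and overwrite the file with the result."""
    with open(path, "r+") as f:
        text = f.read().strip()
        current = int(text) if text else 0
        f.seek(0)                  # go back to the start of the file
        f.write(str(current + 1))  # overwrite the old digits
        f.truncate()               # drop anything left over after the new text
    return current + 1


# Demo on a temporary file so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    counter_path = os.path.join(tmp, "total.txt")
    with open(counter_path, "w") as f:
        f.write("9")
    increment_counter(counter_path)
    increment_counter(counter_path)
    with open(counter_path) as f:
        final_text = f.read()

print(final_text)
```

Opening the file with mode 'w' for the write step would also discard the old content; the seek and truncate pair simply lets one open handle do both the read and the write.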

text file reading and writing, ValueError: need more than 1 value to unpack

I need to make a program in a single def that opens a text file 'grades' where first name, last name and grade are separated by commas. Each line is a separate student. Then it displays the students and grades as well as the class average. Then it goes on to add another student and grade, and saves it to the text file while keeping the old students.
I guess I just don't understand the way Python goes through the text file. If I comment out 'lines' I see it prints old_names, but it's as if everything is gone after. When 'lines' is not commented out, old_names is not printed, which makes me think the file is closed? Or empty? However, everything is still in the txt file as it should be.
Currently I get this error, which I am pretty sure is telling me there's no information in 'line':
File "D:\Dropbox\Dropbox\1Python\Batch Processinga\grades.py", line 45, in main
first_name[i], last_name[i], grades[i] = line.split(',')
ValueError: need more than 1 value to unpack
End goal is to get it to give me the current student names and grades, average. Then add one student, save that student and grade to file. Then be able to pull the file back up with all the students including the new one and do it all over again.
I apologize for being a nub.
def main():
    #Declare variables
    #List of strings: first_name, last_name
    first_name = []
    last_name = []
    #List of floats: grades
    grades = []
    #Float grade_avg, new_grade
    grade_avg = new_grade = 0.0
    #string new_student
    new_student = ''
    #Intro
    print("Program displays information from a text file to")
    print("display student first name, last name, grade and")
    print("class average then allows user to enter another")
    print("student.\t")
    #Open file "grades.txt" for reading
    infile = open("grades.txt","r")
    lines = infile.readlines()
    old_names = infile.read()
    print(old_names)
    #Write for loop for each line creating a list
    for i in len(lines):
        #read in line
        line = infile.readline()
        #Split data
        first_name[i], last_name[i], grades[i] = line.split(',')
        #convert grades to floats
        grades[i] = float(grades[i])
    print(first_name, last_name, grades)
    #close the file
    infile.close()
    #perform calculations for average
    grade_avg = float(sum(grades)/len(grades))
    #display results
    print("Name\t\t Grade")
    print("----------------------")
    for n in range(5):
        print(first_name[n], last_name[n], "\t", grades[n])
    print('')
    print('Average Grade:\t% 0.1f'%grade_avg)
    #Prompt user for input of new student and grade
    new_student = input('Please enter the First and Last name of new student:\n').title()
    new_grade = eval(input("Please enter {}'s grade:".format(new_student)))
    #Write new student and grade to grades.txt in same format as other records
    new_student = new_student.split()
    new_student = str(new_student[1] + ',' + new_student[0] + ',' + str(new_grade))
    outfile = open("grades.txt","w")
    print(old_names, new_student ,file=outfile)
    outfile.close()
File objects in Python have a "file pointer", which keeps track of what data you've already read from the file. It uses this to know where to start looking when you call read or readline or readlines. Calling readlines moves the file pointer all the way to the end of the file; subsequent read or readline calls will return an empty string. This explains why you're getting a ValueError on the line.split(',') line. line is an empty string, so line.split(",") returns a list of length 1 (containing just the empty string), but you need a list of length 3 to do the triple assignment you're attempting.
Once you get the lines list, you don't need to interact with the infile object any more. You already have all the lines; you may as well simply iterate through them directly.
#Write for loop for each line creating a list
for line in lines:
    columns = line.split(",")
    first_name.append(columns[0])
    last_name.append(columns[1])
    grades.append(float(columns[2]))
Note that I'm using append instead of listName[i] = whatever. This is necessary because Python lists will not automatically resize themselves when you try to assign to an index that doesn't exist yet; you'll just get an IndexError. append, on the other hand, will resize the list as desired.
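The file pointer behaviour and the append approach can be tried without touching grades.txt. In this sketch io.StringIO stands in for the open file, the two student records are made-up sample data, and parse_grades is a hypothetical helper that also skips blank or malformed lines:

```python
import io


def parse_grades(lines):
    """Split 'first,last,grade' lines into three parallel lists."""
    first_names, last_names, grades = [], [], []
    for line in lines:
        columns = line.strip().split(",")
        if len(columns) != 3:
            continue  # skip blank or malformed lines
        first_names.append(columns[0])
        last_names.append(columns[1])
        grades.append(float(columns[2]))
    return first_names, last_names, grades


# io.StringIO behaves like an open text file, including the file pointer.
sample = io.StringIO("Ada,Lovelace,95\nAlan,Turing,88.5\n")
lines = sample.readlines()  # moves the pointer to the end
leftover = sample.read()    # nothing left to read, so this is ''

first_names, last_names, grades = parse_grades(lines)
average = sum(grades) / len(grades)
print(first_names, last_names, grades, average)
```

Calling sample.seek(0) before the second read would move the pointer back to the start, which is the other way to read the same content twice.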
