Converting matrix of strings to PyTorch tensor - pytorch

I wanted to convert the following matrix into a PyTorch tensor:
[['SELF', '', '', '', ''],
['nsubj', 'SELF', '', '', ''],
['', 'compound', 'SELF', '', ''],
['dobj', '', '', 'SELF', ''],
['pobj', '', '', '', 'SELF']]
I want a 0/1 mask matrix with a 1 at every position that holds a non-empty string and a 0 everywhere else. This should be easy, but I cannot find an answer that avoids iterating through the matrix and building the tensor one cell at a time.
The solution I have:
size = len(sample["edges"])
edge_mask = torch.zeros([size, size])
for i, row in enumerate(sample["edges"]):
    for j, v in enumerate(row):
        if v != "":
            edge_mask[i, j] = 1

You can convert it to a boolean NumPy array, wrap that array with torch.from_numpy, and then cast the result to int:
torch.from_numpy(np.array(sample["edges"], dtype=bool)).to(int)
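As a rough sketch of the same idea, the mask can also be built by comparing the string array against the empty string, which makes the "non-empty means 1" rule explicit. The variable names `edges` and `mask` are illustrative, and the data is the matrix from the question inlined so the snippet runs on its own:

```python
import numpy as np

# The matrix from the question, inlined for illustration.
edges = [['SELF', '', '', '', ''],
         ['nsubj', 'SELF', '', '', ''],
         ['', 'compound', 'SELF', '', ''],
         ['dobj', '', '', 'SELF', ''],
         ['pobj', '', '', '', 'SELF']]

# Elementwise comparison with "" gives a boolean array (True where the
# string is non-empty); casting to int64 turns it into a 0/1 mask.
mask = (np.array(edges) != "").astype(np.int64)
```

Passing `mask` to torch.from_numpy should then give a tensor that shares memory with the NumPy array, with no per-cell loop in Python.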

Related

Need to generate a dictionary using comprehension

I have this code in python:
layers = ['two', 'three', 'five', 'six']
dict_toggle_bullet = {}
for x, y in enumerate(layers):
    dict_toggle_bullet[y] = ["active" if j == y else "" for _, j in enumerate(layers)]
The output is:
{'two': ['active', '', '', ''], 'three': ['', 'active', '', ''], 'five': ['', '', 'active', ''], 'six': ['', '', '', 'active']}
Is there a way to convert this using dictionary comprehension in a single line?
This is what I came up with:
lst = ['two','three','five','six']
d = {lst[i]:['' if j != i else 'active' for j in range(4)] for i in range(4)}
However, I wouldn't advise doing this, as the one-liner is more complicated and harder to understand; the direct loop is the better choice.
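For comparison, here is a sketch of a comprehension that iterates over the names themselves rather than over a hard-coded range(4), so it should keep working if the list changes length. The loop variables `name` and `other` are illustrative:

```python
layers = ['two', 'three', 'five', 'six']

# For each layer name, build a row that marks only that name as "active".
dict_toggle_bullet = {
    name: ["active" if other == name else "" for other in layers]
    for name in layers
}
```

This assumes the names in the list are unique; with duplicates, every matching position would be marked "active".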

text file with list of list to open as dataframe

I am new to Python. I have a text file, 'asv.txt', with the following content:
[['10', '50', '', ' Ind ', '', ''], ['40', '30', '', ' Ind ', 'Mum', ''], ['50', '10', '', ' Cd ', '', '']]
How do I read it as a CSV or as a DataFrame?
import ast
import pandas as pd

# Read the file (or just paste the text)
with open('asv.txt') as f:
    data = f.read()
# Parse the string into a list of lists with ast.literal_eval
data = ast.literal_eval(data)
# Build the DataFrame; the "data" argument accepts a list of lists and treats each inner list as a row
df = pd.DataFrame(data=data)
Or much simpler for this specific case:
df = pd.DataFrame(data=[['10', '50', '', ' Ind ', '', ''], ['40', '30', '', ' Ind ', 'Mum', ''], ['50', '10', '', ' Cd ', '', '']])
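Since the question also asks about CSV, here is a minimal standard-library sketch that parses the same text and writes it out as CSV rows. The text is inlined instead of read from 'asv.txt' so the snippet is self-contained, and stripping the padding around values such as ' Ind ' is an assumption about what is wanted, not something the question states:

```python
import ast
import csv
import io

# Content of 'asv.txt' from the question, inlined for illustration.
text = "[['10', '50', '', ' Ind ', '', ''], ['40', '30', '', ' Ind ', 'Mum', ''], ['50', '10', '', ' Cd ', '', '']]"

# Safely evaluate the literal into a list of lists.
rows = ast.literal_eval(text)

# Assumption: surrounding whitespace in each cell is not meaningful.
rows = [[cell.strip() for cell in row] for row in rows]

# Write the rows as CSV into an in-memory buffer (a real file works the same way).
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
```

To write to disk instead, the buffer could be replaced with a file opened with newline='', as the csv module documentation recommends.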

Looping difficulties with 2 csv files

OK, this is my last question about CSV files and looping.
Here is what I want my loops to do.
This is the CSV file of students, which I have read into lists.
File 1
['Needie Seagoon', '57', '', '83', '55', '78', '', '91', '73', '65', '56', '', '', '']
['Eccles', '', '98', '91', '80', '', '66', '', '', '', '77', '78', '48', '77']
['Bluebottle', '61', '', '88', '80', '60', '', '45', '52', '91', '85', '', '', '']
['Henry Crun', '92', '', '58', '50', '57', '', '67', '45', '77', '72', '', '', '']
['Minnie Bannister', '51', '', '97', '52', '53', '', '68', '58', '70', '69', '', '', '']
['Hercules Grytpype-Thynne', '', '78', '62', '75', '', '67', '', '', '', '48', '56', '89', '67']
['Count Jim Moriarty', '51', '', '68', '51', '66', '', '55', '72', '50', '74', '', '', '']
['Major Dennis Bloodnok', '', '54', '47', '59', '', '48', '', '', '', '66', '58', '53', '83']
I then have another csv file with the max scores of each course:
File 2
CITS1001 95
CITS1401 100
CITS1402 97
CITS2002 99
CITS2211 94
CITS2401 95
CITS3001 93
CITS3002 93
CITS3003 91
CITS3200 87
CITS3401 98
CITS3402 93
CITS3403 88
What I have been trying very hard to do is divide each student's score by the maximum score of the corresponding course.
That is, each value in a student's row (read horizontally) should be divided by the matching value in the list of maximum scores (read vertically).
For example:
['Needie Seagoon', '57', '', '83', '55', '78', '', '91', '73', '65', '56', '', '', '']
I want 57/95 , skip, 83/100, 55/97... you get where I'm going?
I want to do this for every name in the file. This code might be familiar to some of you, but I know I'm doing something wrong.
def normalise(students_file, units_list):
    file1 = open(students_file, 'r')
    data1 = file1.read().splitlines()
    file2 = open(units_list, 'r')
    data2 = file2.read().splitlines()
    for line in data1:
        line = line.split(",")
        for row in data2:
            row = row.split(",")
            for n in range(1, len(row), 2):
                for i in range(1, len(line), 1):
                    if line[i] == '' :
                        pass
                    else:
                        answer = int(line[1]) / int(row[n])
    file1.close()
    file2.close()
I'll show you some of the output (it goes on for a very long time).
output:
1st loop
0.6
none
0.8736842105263158
0.5789473684210527
0.8210526315789474
none
0.9578947368421052
0.7684210526315789
0.6842105263157895
0.5894736842105263
none
none
none
2nd loop
0.57
none
0.83
0.55
0.78
none
0.91
0.73
0.65
0.56
none
none
I understand that I have readline(), but when I use readlines() I cannot strip the \n, and using end='' makes the code messy. The output shows that every value in the student's row is divided by 95, then the loop starts over and divides every value by 100, and so on. How can I make the first value divide by 95, the second by 100, and so on?
Sorry for the long explanation, but I keep getting told to explain myself more.
Thanks.
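One possible way to pair values up, sketched under an assumption: each score column corresponds to the unit in the same position of File 2, so a blank score also skips that unit's maximum (the third score, 83, is divided by 97, not by 100 as the example in the question suggests). The function name `normalise_row` and the choice of None for blanks are illustrative:

```python
def normalise_row(student_row, max_scores):
    """Divide each score by the max score in the same position.

    Blank scores (courses not taken) are kept as None.
    """
    name, scores = student_row[0], student_row[1:]
    normalised = []
    # zip walks both sequences in step, so each score meets only its own max.
    for score, max_score in zip(scores, max_scores):
        if score == '':
            normalised.append(None)
        else:
            normalised.append(int(score) / max_score)
    return name, normalised


# Max scores from File 2, in file order (CITS1001 ... CITS3403).
max_scores = [95, 100, 97, 99, 94, 95, 93, 93, 91, 87, 98, 93, 88]
row = ['Needie Seagoon', '57', '', '83', '55', '78', '', '91', '73', '65', '56', '', '', '']

name, result = normalise_row(row, max_scores)
```

The nested loops in the question visit every combination of score and maximum, which is why each score gets divided by 95, then by 100, and so on; stepping through both lists together avoids that.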

How to add a character to a chararray element that already holds characters in Python 3

In python 2.7 I can do...
>>> import numpy
>>> flag=numpy.chararray(10) + ' '
>>> flag
chararray(['', '', '', '', '', '', '', '', '', ''],
dtype='|S6')
>>> flag[5] = 'a'
>>> flag
chararray(['', '', '', '', '', 'a', '', '', '', ''],
dtype='|S6')
>>> flag[5]=flag[5]+'b'
>>> flag
chararray(['', '', '', '', '', 'ab', '', '', '', ''],
dtype='|S6')
But this did not work in Python 3.
By the way, how can I save the "flag" array together with some numeric arrays to a text file? Like this:
1 1
1 1
1 1
1 1
1 1
1 1 ab
1 1
1 1
1 1
1 1
I tried np.savetxt, but it did not work.
Many thanks.
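A possible Python 3 approach, sketched with a plain NumPy array of unicode strings rather than chararray (a swapped-in technique, since chararray stores bytes by default in Python 3). The width 'U6', the two columns of ones, and the names `col_a`, `col_b`, and `table` are assumptions chosen to mirror the desired output:

```python
import io
import numpy as np

# Ten empty unicode strings, each able to hold up to 6 characters.
flag = np.full(10, '', dtype='U6')
flag[5] = 'a'
flag[5] = flag[5] + 'b'   # element is now 'ab'

# Two illustrative numeric columns, converted to strings explicitly so
# all three columns share a string dtype before stacking.
col_a = np.ones(10, dtype=int).astype(str)
col_b = np.ones(10, dtype=int).astype(str)
table = np.column_stack((col_a, col_b, flag))

# fmt='%s' writes every cell as text; a filename works in place of the buffer.
buffer = io.StringIO()
np.savetxt(buffer, table, fmt='%s')
lines = buffer.getvalue().splitlines()
```

Rows with an empty flag end in a trailing space, because savetxt still writes the delimiter before the empty third column.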

What does the following code do, in simple terms?

Can you tell me in simple terms what this code does:
board = [['' for x in range(BOARD_SIZE)] for y in range(BOARD_SIZE)]
This code creates a list of BOARD_SIZE lists. Each of these lists will contain BOARD_SIZE empty strings. So if BOARD_SIZE is 3 then the board will be:
board = [ ['', '', ''],
          ['', '', ''],
          ['', '', ''] ]
For BOARD_SIZE equal to 3, the comprehension is equivalent to writing the list out literally:
board = [['', '', ''], ['', '', ''], ['', '', '']]
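To make the nesting explicit, the comprehension can be unrolled into two ordinary loops. This is only an illustrative expansion; BOARD_SIZE = 3 and the names `expanded`, `row`, and `same` are assumptions for the example:

```python
BOARD_SIZE = 3

# The original one-liner.
board = [['' for x in range(BOARD_SIZE)] for y in range(BOARD_SIZE)]

# The same thing written as nested loops: the outer loop builds the rows,
# the inner loop fills each row with empty strings.
expanded = []
for y in range(BOARD_SIZE):
    row = []
    for x in range(BOARD_SIZE):
        row.append('')
    expanded.append(row)

same = (board == expanded)

# Each row is a separate list, so changing one cell leaves other rows alone.
board[0][0] = 'X'
```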
