Sum of two square matrices code keeps failing - python-3.x

I have this task: 'Write a program that adds two square matrices. The program will read the dimension of the matrix, N, and will then read N*N numbers representing the first matrix, row by row. It will then read another N*N numbers representing the second matrix. The program will output the resulting matrix, one row per line.' I wrote the code below for it. However, the platform I am doing the task on keeps saying that 1 of 2 tests failed, although it works just fine for me. Maybe the problem is on their side?
from operator import add
# Enter the dimension of your matrix, e.g. if you want it to be 2x2 enter 2
n = int(input())
# Input the numbers for both matrices, one row at a time
matrix1_r1 = [int(input()) for x in range(n)]
matrix1_r2 = [int(input()) for x in range(n)]
matrix2_r1 = [int(input()) for x in range(n)]
matrix2_r2 = [int(input()) for x in range(n)]
final1 = list(map(add, matrix1_r1, matrix2_r1))
final2 = list(map(add, matrix1_r2, matrix2_r2))
print(final1)
print(final2)
Their sample input is:
2
1
2
3
4
5
6
7
8
Their sample output is:
[6, 8]
[10, 12]

Your code works for the example, and for any input that is 2 by 2. It will fail for any other sized matrix, because your code only computes two rows for each matrix. Rather than hard-coding something so fundamental, you should be using nested loops and a list of lists to get the right number of rows. Or, if you want to be a little fancy, list comprehensions can do it all really neatly:
n = int(input())
matrix1 = [[int(input()) for col in range(n)] for row in range(n)]
matrix2 = [[int(input()) for col in range(n)] for row in range(n)]
matrix_sum = [[a + b for a, b in zip(row1, row2)] for row1, row2 in zip(matrix1, matrix2)]
print(*matrix_sum, sep='\n')
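The nested-loop variant mentioned in the answer could look roughly like this. It is only a sketch: read_matrix and add_matrices are hypothetical helper names, and the values come from an iterator instead of input() so the example runs without typing anything.

```python
def read_matrix(values, n):
    """Build an n x n list of lists from the next n*n items of an iterator."""
    matrix = []
    for _ in range(n):
        row = []
        for _ in range(n):
            row.append(int(next(values)))
        matrix.append(row)
    return matrix


def add_matrices(first, second):
    """Element-wise sum of two equally sized square matrices."""
    n = len(first)
    total = []
    for r in range(n):
        row = []
        for c in range(n):
            row.append(first[r][c] + second[r][c])
        total.append(row)
    return total


# The sample input from the question, flattened into one token stream.
sample = iter("2 1 2 3 4 5 6 7 8".split())
n = int(next(sample))
matrix1 = read_matrix(sample, n)
matrix2 = read_matrix(sample, n)
for row in add_matrices(matrix1, matrix2):
    print(row)
# [6, 8]
# [10, 12]
```

Because the number of rows comes from n rather than from hard-coded variable names, the same code handles a 3x3 or larger matrix unchanged.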

Related

Confused in behaviour of 2-d array in python

n, m, k = map(int, input().split())
students = [int(x) for x in input().split()]
classroom = []
count = 0
rows = [0] * k
for i in range(m):
    classroom.append(rows)
for i in students:
    for j in range(k):
        if classroom[i-1][j] == 1:
            continue
        else:
            classroom[i-1][j] = 1
            count += 1
            break
print(classroom)
I want to count the students who are seated in their preferred row (the row must still have a vacant seat for the student). In my code 0 marks a vacant seat, and there are n students, each with a preferred row (an array of n values whose ith element is the preferred row of student i).
My input is:
5 2 2
1 1 2 1 1
Here n=5, m=2 (number of rows), k=2 (row length) and
students=[1,1,2,1,1],
so classroom is a 2D array of size 2x2.
Logically it should print the classroom [[1,1],[1,0]], but I can't understand why it prints [[1,1],[1,1]].
I have also tested with the input
5 2 2
1
which should logically print [[1,0],[0,0]], but it prints [[1,0],[1,0]]. I have tested this on Python 3.
Please let me know what I did wrong, or which concept I have misunderstood, and what the logic behind this behaviour is.
This line
classroom.append(rows)
appends a reference to the same list object every time, so classroom ends up holding m references to one list. When it is modified through any one of them, the change shows up in all the others. That's why the rows of your output are all the same.
Change this line to
classroom.append([0] * k)
This ensures that the rows are independent of each other.
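A small demonstration of the difference, using the same k and m as in the question (the names aliased and independent are only for illustration):

```python
k, m = 2, 2

# Appending the same list object m times: every "row" is one and the same list.
rows = [0] * k
aliased = []
for _ in range(m):
    aliased.append(rows)
aliased[0][0] = 1
print(aliased)                           # [[1, 0], [1, 0]]
print(aliased[0] is aliased[1])          # True

# Building a fresh list for each row keeps the rows independent.
independent = [[0] * k for _ in range(m)]
independent[0][0] = 1
print(independent)                       # [[1, 0], [0, 0]]
print(independent[0] is independent[1])  # False
```

The `is` check compares object identity, which is what makes the shared row visible.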

Finding x numbers in list greater than 0 with potential duplicates and assigning index of original list

I have lists of floats which will have some zeros in it. Eg.
numbers = [1.2, 0.0, 0.0, 1.2, 2.0, 2.5, 17, 1.3, 1.8, 1.3, 1.2]
I am trying to assign the n lowest values that are greater than 0 (assume n is 5) to variables.
I can get the first by using:
first = min(o for o in numbers if o > 0)
But as there are duplicates in the smallest value (1.2), I cannot easily assign second, third, fourth and fifth.
I need to assign these and allow me to keep the index of their values in the original list and assign these too. Eg.
first_pos = numbers.index(first)
I cannot use the above for second as it will assign it the first index value.
Is there any efficient way, using a for loop, a list comprehension or even a small function, to assign the other numbers so that:
second = 1.2
second_pos = 3
third = 1.2
third_pos = 10
fourth = 1.3
fourth_pos = 7
fifth = 1.3
fifth_pos = 9
I cannot do this with any list comprehension I know of for second as it will not pick up a duplicate. Eg.:
sec = min(o for o in numbers if o > first)
The lists vary in length of values (at least 5, though) and may or may not have duplicates and zeros but many will.
IIUC, one way using sorted with enumerate:
sorted(((i, v) for i, v in enumerate(numbers) if v > 0), key=lambda pair: pair[1])[:5]
Output of (index, value) pairs of first 5 smallest values:
[(0, 1.2), (3, 1.2), (10, 1.2), (7, 1.3), (9, 1.3)]
Ok, to badly answer my own question, I have been able to do this by copying and removing the zeros, enumerating over the list for the index values and removing each number once assigned:
number = [n for n in numbers if n > 0]               # values greater than zero
numbs = [i for i, x in enumerate(numbers) if x > 0]  # their indexes in the original list
# smallest remaining value and its index in the original list
first = min(number)
first_pos = number.index(first)
first_ind = numbs[first_pos]
number.remove(first)
numbs.remove(first_ind)
# repeat the same steps for the next smallest value
second = min(number)
sec_pos = number.index(second)
sec_ind = numbs[sec_pos]
number.remove(second)
numbs.remove(sec_ind)
This will keep and assign the values and indexes of each minimum value greater than zero.
Is there any way to add this into a function to assign all values greater than zero in the list to its own variables?
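One possible way to wrap this in a function is sketched below. The name smallest_positive and its count parameter are made up for illustration, and it returns a list of (value, index) pairs rather than creating a separate variable per value, which tends to be easier to work with.

```python
def smallest_positive(numbers, count=5):
    """Return up to `count` (value, index) pairs for the smallest values > 0."""
    pairs = [(value, index) for index, value in enumerate(numbers) if value > 0]
    pairs.sort()  # sorts by value first, then by index for equal values
    return pairs[:count]


numbers = [1.2, 0.0, 0.0, 1.2, 2.0, 2.5, 17, 1.3, 1.8, 1.3, 1.2]
result = smallest_positive(numbers)
print(result)
# [(1.2, 0), (1.2, 3), (1.2, 10), (1.3, 7), (1.3, 9)]

# Individual names can still be unpacked from the list if they are needed.
(first, first_pos), (second, second_pos) = result[:2]
```

Duplicates are handled naturally because each pair carries its own index, so nothing has to be removed from the list between lookups.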

I want to improve speed of my algorithm with multiple rows input. Python. Find average of consecutive elements in list

I need to find the average of consecutive elements of a list.
First I am given the length of the list,
then the list of numbers,
then the number of tests I need to perform (several rows of input),
then the inputs for those tests, one per row (and I need to print as many rows of results).
Each test row consists of the start and end index in the list.
My algorithm:
nu = int(input())  # first I am given the length of the list
numbers = input().split()  # then the list of numbers
num = input()  # number of rows with inputs
k = [float(i) for i in numbers]  # the numbers in the list are floats
i = 0
while i < int(num):
    a, b = input().split()  # start and end element in list
    i += 1
    print(round(sum(k[int(a):(int(b)+1)])/(-int(a)+int(b)+1), 6))  # round to 6 decimals
But it's not fast enough. I was told it's better to get rid of the "while" loop, but I don't know how. I'd appreciate any help.
Example:
Input:
8 - len(list)
79.02 36.68 79.83 76.00 95.48 48.84 49.95 91.91 - list
10 - number of test
0 0 - a1,b1
0 1
0 2
0 3
0 4
0 5
0 6
0 7
1 7
2 7
Output:
79.020000
57.850000
65.176667
67.882500
73.402000
69.308333
66.542857
69.713750
68.384286
73.668333
i = 0
while i < int(num):
    a, b = input().split()  # start and end element in list
    i += 1
Replace your while-loop with a for loop. Also you could get rid of multiple int calls in the print statement:
for _ in range(int(num)):
    a, b = [int(j) for j in input().split()]
You didn't spell out the constraints, but I am guessing that the ranges to be averaged could be quite large. Computing sum(k[int(a):(int(b)+1)]) may take a while.
However, if you precompute the partial (prefix) sums of the input list, each query can be answered in constant time: the sum of the numbers in a range is the difference of the two corresponding partial sums.
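A minimal sketch of the partial-sum idea, assuming the same inclusive, 0-based a and b as in the question. The function name range_averages is hypothetical, and the data is taken from the example input.

```python
from itertools import accumulate


def range_averages(values, queries):
    """Average of values[a..b] (inclusive) for each (a, b), using prefix sums."""
    # prefix[i] holds the sum of the first i values, so prefix[0] == 0.
    prefix = [0.0] + list(accumulate(values))
    return [(prefix[b + 1] - prefix[a]) / (b - a + 1) for a, b in queries]


values = [79.02, 36.68, 79.83, 76.00, 95.48, 48.84, 49.95, 91.91]
queries = [(0, 0), (0, 2), (1, 7), (2, 7)]
for average in range_averages(values, queries):
    print(f"{average:.6f}")
# 79.020000
# 65.176667
# 68.384286
# 73.668333
```

Building the prefix list costs one pass over the data; after that every query is two lookups and a division, regardless of how wide the range is.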

I want to remove rows where a specific value doesn't increase. Is there a faster/more elegant way?

I have a dataframe with 30 columns and 1,000,000 rows, about 150 MB in size. One column is categorical with 7 different elements and another column (Depth) contains mostly increasing numbers. The graph for each of the elements looks more or less like this.
I tried to save the column Depth as series and iterate through it while dropping rows that won't match the criteria. This was reeeeeaaaally slow.
Afterwards I added a boolean column to the dataframe which indicates if it will be dropped or not, so I could drop the rows in the end in a single step. Still slow. My last try (the code to it is in this post) was to create a boolean list to save the fact if it passes the criteria there. Still really slow (about 5 hours).
dropList = [True]*len(df.index)
for element in elements:
    currentMax = 0
    minIdx = df.loc[df['Element']==element]['Depth'].index.min()
    maxIdx = df.loc[df['Element']==element]['Depth'].index.max()
    for x in range(minIdx,maxIdx):
        if df.loc[df['Element']==element]['Depth'][x] < currentMax:
            dropList[x]=False
        else:
            currentMax = df.loc[df['Element']==element]['Depth'][x]
df: The main dataframe
elements: a list with the 7 different elements (same as in the categorical column in df)
All rows in an element, where the value Depth isn't bigger than all previous ones should be dropped. With the next element it should start with 0 again.
Example:
Input:  'Depth' = [0 1 2 3 4 2 3 5 6]
        'AnyOtherColumn' = [a b c d e f g h i]
Output: 'Depth' = [0 1 2 3 4 5 6]
        'AnyOtherColumn' = [a b c d e h i]
This should apply to whole rows in the dataframe of course.
Is there a way to get this faster?
EDIT:
The whole rows of the input dataframe should stay as they are. Just the ones where the 'Depth' does not increase should be dropped.
EDIT2:
The remaining rows should stay in their initial order.
How about you take a 2-step approach. First you use a fast sorting algorithm (for example Quicksort) and next you get rid of all the duplicates?
Okay, I found a way that's faster. Here is the code:
from tqdm import tqdm  # progress bar for the inner loop
dropList = [True]*len(df.index)
for element in elements:
    currentMax = 0
    # 'Tiefe' is the 'Depth' column
    minIdx = df.loc[df['Element']==element]['Tiefe'].index.min()
    # maxIdx = df.loc[df['Element']==element]['Tiefe'].index.max()
    elementList = df.loc[df['Element']==element]['Tiefe'].to_list()
    for x in tqdm(range(len(elementList))):
        if elementList[x] < currentMax:
            dropList[x+minIdx]=False
        else:
            currentMax = elementList[x]
I took the column and saved it as a list. To preserve the index of the dataframe, I saved the lowest index and add it back inside the loop.
Overall it seems the problem was the loc function. The runtime went from about 5 hours initially to about 10 seconds.
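The Python loop can probably be avoided altogether with a grouped running maximum. The following is only a sketch under some assumptions: the column names 'Element' and 'Depth' are taken from the question, the small dataframe is invented for illustration, Depth is assumed to be non-negative (the loop starts from currentMax = 0), and ties with the running maximum are kept, as in the loop.

```python
import pandas as pd

# Invented sample data: element "A" repeats the example from the question.
df = pd.DataFrame({
    "Element": ["A"] * 9 + ["B"] * 4,
    "Depth": [0, 1, 2, 3, 4, 2, 3, 5, 6, 0, 2, 1, 3],
    "AnyOtherColumn": list("abcdefghijklm"),
})

# Running maximum of Depth, restarted for every element.
running_max = df.groupby("Element")["Depth"].cummax()

# A row survives only if its Depth reaches the running maximum of its group.
filtered = df[df["Depth"] == running_max]

print(filtered["Depth"].tolist())
# [0, 1, 2, 3, 4, 5, 6, 0, 2, 3]
```

Boolean indexing keeps whole rows and preserves their original order, which matches the two edits to the question.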

how to get a kind of "maximum" in a matrix, efficiently

I have the following problem: I have a matrix loaded with the pandas module, where each cell holds a number between -1 and 1. What I want to find is the maximum "possible" value in each row that is not also the maximum value in another row.
If, for example, 2 rows have their maximum value in the same column, I compare both values and keep the bigger one. For the row whose maximum is smaller than the other row's, I take its second-largest value (and repeat the same analysis again and again).
To explain myself better consider my code
import numpy as np
import pandas as pd
matrix = pd.read_csv("matrix.csv")
# this matrix has an id (or name) for each column
# ... and the first column has the id of each row
results = pd.DataFrame(np.empty((len(matrix),3),dtype=pd.Timestamp),columns=['id1','id2','max_pos'])
l = len(matrix.columns) # number of columns
again = 1 # loop flag, renamed from `next` so it does not shadow the built-in
while again == 1:
    again = 0
    for i in range(0, len(matrix)):
        max_column = str(0)
        for j in range(1, l): # 1 because the first column is an id
            if matrix[max_column][i] < matrix[str(j)][i]:
                max_column = str(j)
        results['id1'][i] = str(i) # I could also put matrix['0'][i] here
        results['id2'][i] = max_column
        results['max_pos'][i] = matrix[max_column][i]
    for i in range(0, len(results)): # now I will check if two or more rows have the same max column
        for ii in range(0, len(results)):
            # if two id1 have their max in the same column, I keep the one with the biggest
            # ... max value and change the other to "-1" to iterate again
            if (results['id2'][i] == results['id2'][ii]) and (results['max_pos'][i] < results['max_pos'][ii]):
                matrix[results['id2'][i]][i] = -1
                again = 1
Putting an example:
#consider
pd.DataFrame({'a':[1, 2, 5, 0], 'b':[4, 5, 1, 0], 'c':[3, 3, 4, 2], 'd':[1, 0, 0, 1]})
   a  b  c  d
0  1  4  3  1
1  2  5  3  0
2  5  1  4  0
3  0  0  2  1
#at the first iteration I will have the following result
0 b 4 # this means that the row 0 has its maximum at column 'b' and its value is 4
1 b 5
2 a 5
3 c 2
#the problem is that column b is the maximum of row 0 and 1, but I know that the maximum of row 1 is bigger than row 0, so I take the second maximum of row 0, then:
0 c 3
1 b 5
2 a 5
3 c 2
#now I solved the problem for row 0 and 1, but I have that the column c is the maximum of row 0 and 3, so I compare them and take the second maximum in row 3
0 c 3
1 b 5
2 a 5
3 d 1
#now I'm done. In the case that two rows have the same column as maximum and also the same value, nothing happens and I keep those values.
#what if the matrix were
pd.DataFrame({'a':[1, 2, 5, 0], 'b':[5, 5, 1, 0], 'c':[3, 3, 4, 2], 'd':[1, 0, 0, 1]})
   a  b  c  d
0  1  5  3  1
1  2  5  3  0
2  5  1  4  0
3  0  0  2  1
#then, at the first iteration the result will be:
0 b 5
1 b 5
2 a 5
3 c 2
#then, given that the max value of row 0 and 1 is at the same column, I should compare the maximum values
# ... but in this case the values are the same (both are 5), this would be the end of iterating
# ... because I can't choose between row 0 and 1 and the other rows have their maximum at different columns...
This code works perfectly for me with a 100x100 matrix, for example. But if the matrix size grows to 50,000x50,000, the code takes too much time to finish. I know that my code may be the most inefficient way to do this, but I don't know how to deal with it.
I have been reading about threads in Python, which could help, but it doesn't help if I start 50,000 threads because my computer doesn't use more CPU. I also tried some functions such as .max(), but I wasn't able to get the column of the maximum and compare it with the other maxima.
If anyone could help me or give me a piece of advice to make this more efficient, I would be very grateful.
Going to need more information on this. What are you trying to accomplish here?
This will help you get some of the way, but in order to fully achieve what you're doing I need more context.
We'll import numpy, random, and Counter from collections:
import numpy as np
import random
from collections import Counter
We'll create a random 50k x 50k matrix of numbers between -10M and +10M
mat = np.random.randint(-10000000,10000000,(50000,50000))
Now to get the maximums for each row we can just do the following list comprehension:
maximums = [max(mat[x,:]) for x in range(len(mat))]
Now we want to find out which ones are not maximums in any other rows. We can use Counter on our maximums list to find out how many of each there are. Counter returns a counter object that is like a dictionary with the maximum as the key, and the # of times it appears as the value.
We then use a dictionary comprehension to keep only the entries whose value equals 1. That gives us the maximums that show up only once. We use the .keys() function to grab the numbers themselves, and then turn them into a list.
c = Counter(maximums)
{9999117: 15,
9998584: 2,
9998352: 2,
9999226: 22,
9999697: 59,
9999534: 32,
9998775: 8,
9999288: 18,
9998956: 9,
9998119: 1,
...}
k = list( {x: c[x] for x in c if c[x] == 1}.keys() )
[9998253,
9998139,
9998091,
9997788,
9998166,
9998552,
9997711,
9998230,
9998000,
...]
Lastly we can use the following list comprehension to iterate through the original maximums list and get the indices of the rows where these maximums occur.
indices = [i for i, x in enumerate(maximums) if x in k]
Depending on what else you're looking to do we can go from here.
It's not the speediest program, but finding the maximums, the counter, and the indices takes 182 seconds on a 50,000 by 50,000 matrix that is already loaded.
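The same three steps (row maximums, counting them, and collecting the row indices) can also be written with NumPy array operations instead of list comprehensions, which may be noticeably faster on large matrices. This is only a sketch: the function name rows_with_unique_maximum is made up, and like the answer above it compares the maximum values themselves, not the columns they sit in.

```python
import numpy as np


def rows_with_unique_maximum(mat):
    """Indices of rows whose maximum value is not the maximum of any other row."""
    maximums = mat.max(axis=1)                               # one maximum per row
    values, counts = np.unique(maximums, return_counts=True)
    unique_values = values[counts == 1]                      # maximums seen exactly once
    return np.flatnonzero(np.isin(maximums, unique_values))


# The first example matrix from the question (columns a, b, c, d).
mat = np.array([[1, 4, 3, 1],
                [2, 5, 3, 0],
                [5, 1, 4, 0],
                [0, 0, 2, 1]])
print(rows_with_unique_maximum(mat))
# [0 3]
```

Rows 1 and 2 both have 5 as their maximum, so only rows 0 and 3 are reported. If the column of each maximum is needed as well, mat.argmax(axis=1) returns it.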
