Finite loop over a list with repeated parameter - python-3.x

I am working on an algorithm and would like to apply a function a finite number of times, cycling over the 3 rows of an array: row1, row2, row3, then back to row1, row2, and so on.
What I wrote stops after the 3rd row:
import numpy as np
m, n = 3, 3
A = [[1,2,3], [4,5,6], [7,8,9]]
b = [1, 7, 9]
def my_func(x, i):
    pro = x + A_i^T[i,:]
    return pro
rows = 1
rows = [1, 2, 3]
x = np.zeros(n)
for n in range(1000):
    y = my_func(x, rows)
    print(y)
    x = y
    rows += 1

Wrap the loop over the rows in an outer loop that repeats it:
for i in range(1000 // len(rows)):  # number of full passes over rows 1, 2 and 3
    for row in A:
        pass  # put your operations here
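A related option, sketched here under assumptions: `my_func` and `iterate_rows` are hypothetical stand-ins for the asker's update, plain lists replace NumPy arrays, and rows are 0-indexed. `itertools.cycle` repeats the row indices 0, 1, 2, 0, 1, ... and `itertools.islice` cuts the cycle off after a fixed number of steps, so the loop stays finite.

```python
from itertools import cycle, islice

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

def my_func(x, i):
    # Hypothetical update: add row i of A to x, element by element.
    return [xi + ai for xi, ai in zip(x, A[i])]

def iterate_rows(num_steps):
    # Start from the zero vector, as in the question.
    x = [0] * len(A[0])
    # cycle() yields 0, 1, 2, 0, 1, 2, ...; islice() stops after num_steps.
    for i in islice(cycle(range(len(A))), num_steps):
        x = my_func(x, i)
    return x

print(iterate_rows(4))  # uses rows 0, 1, 2, then row 0 again
```

The same row index can also be computed without itertools as `k % len(A)` for step `k`.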

Related

Sum of two square matrixes code keeps failing

I have this task: 'Write a program that adds two square matrices. The program will read the dimension of the matrix, N, and will then read N*N numbers representing the first matrix, row by row. It will then read another N*N numbers representing the second matrix. The program will output the resulting matrix, one row per line. ' for which I wrote the code below. However, the platform I am doing the task on keeps saying that 1 of 2 tests failed...It works just fine for me. Maybe the problem is on their side?
from operator import add
#Enter a digit for your matrix, e.g. if you want it to be 2x2, enter 2
n = int(input())
#Input digits for both matrixes rows one at a time
matrix1_r1 = [int(input()) for x in range(n)]
matrix1_r2 = [int(input()) for x in range(n)]
matrix2_r1 = [int(input()) for x in range(n)]
matrix2_r2 = [int(input()) for x in range(n)]
final1 = list(map(add, matrix1_r1, matrix2_r1))
final2 = list(map(add, matrix1_r2, matrix2_r2))
print(final1)
print(final2)
Their sample input is:
2
1
2
3
4
5
6
7
8
Their sample output is:
[6, 8]
[10, 12]
Your code works for the example, and for any input that is 2 by 2. It will fail for any other sized matrix, because your code only computes two rows for each matrix. Rather than hard-coding something so fundamental, you should be using nested loops and a list of lists to get the right number of rows. Or, if you want to be a little fancy, list comprehensions can do it all really neatly:
n = int(input())
matrix1 = [[int(input()) for col in range(n)] for row in range(n)]
matrix2 = [[int(input()) for col in range(n)] for row in range(n)]
matrix_sum = [[a + b for a, b in zip(row1, row2)] for row1, row2 in zip(matrix1, matrix2)]
print(*matrix_sum, sep='\n')
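For comparison, the nested-loop version mentioned above might look like the following sketch. The function name `add_matrices` is made up for illustration, and the matrices are passed in as lists of lists instead of being read with `input()`.

```python
def add_matrices(matrix1, matrix2):
    # Assumes both arguments are n x n lists of lists.
    n = len(matrix1)
    result = []
    for row in range(n):
        result_row = []
        for col in range(n):
            result_row.append(matrix1[row][col] + matrix2[row][col])
        result.append(result_row)
    return result

total = add_matrices([[1, 2], [3, 4]], [[5, 6], [7, 8]])
for row in total:
    print(row)  # one row per line, as the task asks
```

Because the loops run over `range(n)`, the same code handles any matrix size, which is what the hard-coded two-row version could not do.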

Iterate through Pandas dataframe rows in a triangular fashion

I have a Pandas dataframe df like this:
    col1     col2
0   value11  List1
1   value12  List2
2   value13  List3
..  ...      ...
i   value1i  List_i
j   value1j  List_j
..  ...      ...
Col1 is the key (it does not repeat). Col2 is a list. In the end, I want the set intersection of each pair of rows of Col2.
I would like to iterate through this dataframe in a triangular fashion.
Something along the lines of:
for i in range(len(df)):
    for j in range(i + 1, len(df)):
        set(List_i).intersection(set(List_j))
So, 1st iterator goes through the full dataframe, while the second iterator, starts from one greater index than the 1st iterator and goes until the end of the dataframe.
How to do this efficiently and in a fast manner?
Edit:
Naive way of doing this is:
col1_list = list(set(df.col1))
num_col1_entries = len(col1_list)
for idx, value1 in enumerate(col1_list):
    for j in range(idx + 1, num_col1_entries):
        value2 = col1_list[j]
        # col1 is unique, so each lookup matches exactly one row
        list1 = df.loc[df.col1 == value1, 'col2'].iloc[0]
        list2 = df.loc[df.col1 == value2, 'col2'].iloc[0]
        print(set(list1).intersection(set(list2)))
Expected output: n(n-1)/2 prints of set intersections of each pair of rows of col2.
You can use itertools. Let's say this is your dataframe:
   col1     col2
0  value11  List1
1  value12  List2
2  value13  List3
3  value14  List4
4  value15  List5
5  value16  List6
Then get all the combinations (15) and print the intersection between the two lists:
from itertools import combinations
for pair in combinations(df.index, 2):
    print(pair)
    list1 = df.iloc[pair[0], 1]
    list2 = df.iloc[pair[1], 1]
    print(set(list1).intersection(set(list2)))
Output (only printing the pair):
(0, 1)
(0, 2)
(0, 3)
(0, 4)
(0, 5)
(1, 2)
(1, 3)
(1, 4)
(1, 5)
(2, 3)
(2, 4)
(2, 5)
(3, 4)
(3, 5)
(4, 5)
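As a self-contained sketch of the same idea without a DataFrame: the hypothetical helper `pairwise_intersections` takes a plain list of lists (with pandas, this could be `df['col2'].tolist()`) and converts each list to a set only once, instead of once per pair.

```python
from itertools import combinations

def pairwise_intersections(lists):
    # Convert each list to a set once, rather than once per pair.
    sets = [set(items) for items in lists]
    # combinations() yields index pairs (i, j) with i < j, the "triangular" order.
    return {
        (i, j): sets[i] & sets[j]
        for i, j in combinations(range(len(sets)), 2)
    }

result = pairwise_intersections([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
for pair, common in result.items():
    print(pair, common)
```

For n rows this still produces n(n-1)/2 intersections, so the saving is in the repeated set construction, not in the number of pairs.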

Replace dataframe value by indices

I have the dataframe df:
import pandas as pd
df = pd.DataFrame(
    columns=[1, 2],
    data=[['a', 'b'], ['c', 'd']],
)
df
1 2
0 a b
1 c d
and would like to replace single values (here: d with z) by indices (of row, column) resulting in a dataframe out:
1 2
0 a b
1 c z
How can I replace a value by its indices (here: row index 1, column index 1) most efficiently in terms of memory consumption and execution time?
Use DataFrame.iloc to set values by position (the first position is 0, because Python counts from 0):
df.iloc[1,1] = 'z'
Or, to set values by labels (index and column values), use DataFrame.loc:
df.loc[1,2] = 'z'
To set a single value only, DataFrame.iat or DataFrame.at is the better choice:
#by positions
df.iat[1,1] = 'z'
#by labels
df.at[1,2] = 'z'

how to get a kind of "maximum" in a matrix, efficiently

I have the following problem: I have a matrix loaded with the pandas module, where each cell holds a number between -1 and 1. I want to find, for each row, the maximum "possible" value that is not also the maximum value of another row.
If, for example, 2 rows have their maximum value in the same column, I compare both values and keep the bigger one; for the row whose maximum is smaller, I take its second maximum value instead (and repeat the same analysis again and again).
To explain myself better consider my code
import numpy as np
import pandas as pd
matrix = pd.read_csv("matrix.csv")
# this matrix has an id (or name) for each column
# ... and the first column has the id of each row
results = pd.DataFrame(np.empty((len(matrix),3),dtype=pd.Timestamp),columns=['id1','id2','max_pos'])
l = len(matrix.columns) # number of columns
next = 1
while next == 1:
    next = 0
    for i in range(0, len(matrix)):
        max_column = str(0)
        for j in range(1, l): # 1 because the first column is an id
            if matrix[max_column][i] < matrix[str(j)][i]:
                max_column = str(j)
        results['id1'][i] = str(i) # I could also put matrix['0'][i] here
        results['id2'][i] = max_column
        results['max_pos'][i] = matrix[max_column][i]
    for i in range(0, len(results)): # now I will check if two or more rows have the same max column
        for ii in range(0, len(results)):
            # if two id1 have their max in the same column, I keep the one with the biggest
            # ... max value and change the other to "-1" to iterate again
            if (results['id2'][i] == results['id2'][ii]) and (results['max_pos'][i] < results['max_pos'][ii]):
                matrix[results['id2'][i]][i] = -1
                next = 1
Putting an example:
#consider
pd.DataFrame({'a':[1, 2, 5, 0], 'b':[4, 5, 1, 0], 'c':[3, 3, 4, 2], 'd':[1, 0, 0, 1]})
a b c d
0 1 4 3 1
1 2 5 3 0
2 5 1 4 0
3 0 0 2 1
#at the first iteration I will have the following result
0 b 4 # this means that the row 0 has its maximum at column 'b' and its value is 4
1 b 5
2 a 5
3 c 2
#the problem is that column b is the maximum of row 0 and 1, but I know that the maximum of row 1 is bigger than row 0, so I take the second maximum of row 0, then:
0 c 3
1 b 5
2 a 5
3 c 2
#now I solved the problem for row 0 and 1, but I have that the column c is the maximum of row 0 and 3, so I compare them and take the second maximum in row 3
0 c 3
1 b 5
2 a 5
3 d 1
#now I'm done. In the case that two rows have the same column as maximum and also the same number, nothing happens and I keep those values.
#what if the matrix would be
pd.DataFrame({'a':[1, 2, 5, 0], 'b':[5, 5, 1, 0], 'c':[3, 3, 4, 2], 'd':[1, 0, 0, 1]})
a b c d
0 1 5 3 1
1 2 5 3 0
2 5 1 4 0
3 0 0 2 1
#then, at the first iteration the result will be:
0 b 5
1 b 5
2 a 5
3 c 2
#then, given that the max value of row 0 and 1 is at the same column, I should compare the maximum values
# ... but in this case the values are the same (both are 5), this would be the end of iterating
# ... because I can't choose between row 0 and 1 and the other rows have their maximum at different columns...
This code works perfectly for me on a 100x100 matrix, for example. But if the matrix size goes to 50,000x50,000, the code takes too much time to finish. I know that my code could be the most inefficient way to do it, but I don't know how to deal with this.
I have been reading about threads in Python that could help, but starting 50,000 threads doesn't help because my computer doesn't use more CPU. I also tried to use functions such as .max(), but I'm not able to get the column of the max and compare it with the other maxima.
If anyone could help me or give me a piece of advice to make this more efficient, I would be very grateful.
Going to need more information on this. What are you trying to accomplish here?
This will help you get some of the way, but in order to fully achieve what you're doing I need more context.
We'll import numpy, random, and Counter from collections:
import numpy as np
import random
from collections import Counter
We'll create a random 50k x 50k matrix of numbers between -10M and +10M
mat = np.random.randint(-10000000,10000000,(50000,50000))
Now to get the maximums for each row we can just do the following list comprehension:
maximums = [max(mat[x,:]) for x in range(len(mat))]
Now we want to find out which ones are not maximums in any other rows. We can use Counter on our maximums list to find out how many of each there are. Counter returns a counter object that is like a dictionary with the maximum as the key, and the # of times it appears as the value.
We then do dictionary comprehension where the value is == to 1. That will give us the maximums that only show up once. we use the .keys() function to grab the numbers themselves, and then turn it into a list.
c = Counter(maximums)
{9999117: 15,
9998584: 2,
9998352: 2,
9999226: 22,
9999697: 59,
9999534: 32,
9998775: 8,
9999288: 18,
9998956: 9,
9998119: 1,
...}
k = list( {x: c[x] for x in c if c[x] == 1}.keys() )
[9998253,
9998139,
9998091,
9997788,
9998166,
9998552,
9997711,
9998230,
9998000,
...]
Lastly, we can use the following list comprehension to iterate through the original maximums list and get the indices of the rows where these values occur.
indices = [i for i, x in enumerate(maximums) if x in k]
Depending on what else you're looking to do we can go from here.
It's not the speediest program, but finding the maximums, the counter, and the indices takes 182 seconds on a 50,000 by 50,000 matrix that is already loaded.
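The steps above can be tried on a small matrix with the following sketch. It uses plain lists instead of NumPy, the function name `rows_with_unique_maximum` is hypothetical, and it keeps the unique maxima in a set, which is a substitution for the list `k` used above. Note that, like the answer, it compares maximum values and not the columns they sit in, so it does not implement the column-conflict rule from the question.

```python
from collections import Counter

def rows_with_unique_maximum(matrix):
    # Maximum value of each row.
    maximums = [max(row) for row in matrix]
    # How many rows share each maximum value.
    counts = Counter(maximums)
    # A set gives constant-time membership tests, unlike a list, which is scanned linearly.
    unique = {value for value, count in counts.items() if count == 1}
    # Indices of the rows whose maximum value appears in no other row.
    return [i for i, value in enumerate(maximums) if value in unique]

# First example matrix from the question, written row by row.
example = [[1, 4, 3, 1], [2, 5, 3, 0], [5, 1, 4, 0], [0, 0, 2, 1]]
print(rows_with_unique_maximum(example))
```

Here the row maxima are 4, 5, 5 and 2, so only rows 0 and 3 have a maximum value that no other row shares.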

Max Value in List Array

The following code creates a list with entered values:
def locateLargest():
    matrix = []
    numberOfRows = int(input("Enter the number of rows: "))
    numberOfColumns = 2
    for row in range(0, numberOfRows):
        matrix.append([])
        for column in range(0, numberOfColumns):
            value = int(input("Enter a value: "))
            matrix[row].append(value)
    max_value = None
    for value in matrix:
        if not max_value:
            max_value = value
        elif value > max_value:
            max_value = value
    print(max_value)
locateLargest()
The issue I am running into is that it asks for each value in the row individually, and it returns the largest row (as a list of values), not the index of the maximum value.
The sample run of what I should be getting is:
Enter the number of rows in the list: 3
Enter a row: 23.5 35 2 10
Enter a row: 4.5 3 45 3.5
Enter a row: 35 44 5.5 11.6
The location of the largest element is at (1,2)
Any ideas?
My current output is:
Enter the number of rows: 2
Enter the number of columns: 6
Enter a value: 2
Enter a value: 2
Enter a value: 2
Enter a value: 2
Enter a value: 2
Enter a value: 2
Enter a value: 7
Enter a value: 6
Enter a value: 4
Enter a value: 3
Enter a value: 6
Enter a value: 2
[7, 6, 4, 3, 6, 2]
This is not very 'pythonic' but will help you achieve your end goal and hopefully understand the process. As Łukasz mentioned, you need to do an iteration for each row, and for each column in each row:
First declare the variable to store your location:
maxPoint = [0,0]
Then enumerate your matrix such that you can get the list from each row, but also retrieve the index of the currently active row:
for idx, row in enumerate(matrix):
Find the max value in the current list of values, e.g. [10, 20, 30]:
    maxRowValue = max(row)
Find which column this maximum value lives in, e.g. one of [0, 1, 2, ...]:
    maxRowIndex = row.index(maxRowValue)
Determine whether the row maximum is in fact greater than the previously located point; if it is not, discard it:
    if maxRowValue <= matrix[maxPoint[0]][maxPoint[1]]:
        continue
If the value is greater, save it to the maxPoint variable:
    maxPoint = [idx, maxRowIndex]
EDIT
For the sake of completeness, here is the complete code sample with AChampion's performance improvements added:
def locateLargest():
    matrix = []
    numberOfRows = int(input("Enter the number of rows: "))
    numberOfColumns = 2
    for row in range(0, numberOfRows):
        matrix.append([])
        for column in range(0, numberOfColumns):
            value = int(input("Enter a value: "))
            matrix[row].append(value)
    maxPoint = [0,0]
    for rIndex, row in enumerate(matrix):
        cIndex, maxRowValue = max(enumerate(row), key=lambda x: x[1])
        if maxRowValue <= matrix[maxPoint[0]][maxPoint[1]]:
            continue
        maxPoint = [rIndex, cIndex]
    print(maxPoint)
locateLargest()
EDIT 2
Here is the same algorithm without using enumerate:
maxPoint = [0, 0]
currentRow = 0
for row in matrix:
    maxRowValue = max(row)
    maxRowIndex = row.index(maxRowValue)
    if maxRowValue > matrix[maxPoint[0]][maxPoint[1]]:
        maxPoint = [currentRow, maxRowIndex]
    currentRow += 1
Using enumerate() and some generator expressions, you can reduce this code quite a bit:
Generate the rows
Generate the maximum for each row
Find the maximum across all rows
More complex perhaps than some would like:
numberOfRows = int(input("Enter the number of rows: "))
# Generate the rows
rows = (map(int, input("Enter a row: ").split()) for _ in range(numberOfRows))
# Generate the maximum for each row
max_row = (max(enumerate(data), key=lambda x: x[1]) for data in rows)
# Find the maximum across all rows
i, (j, v) = max(enumerate(max_row), key=lambda x: x[1][1])
print("The location of the largest element is at {} [{}]".format((i, j), v))
Input / Output:
Enter the number of rows: 3
Enter a row: 1 2 3
Enter a row: 3 6 3
Enter a row: 1 2 3
The location of the largest element is at (1, 1) [6]
If you want to see what is going on change the generators to list comprehensions:
>>> rows = [list(map(int, input("Enter a row: ").split())) for _ in range(numberOfRows)]
Enter a row: 1 2 3
Enter a row: 3 6 3
Enter a row: 1 2 3
>>> rows
[[1, 2, 3], [3, 6, 3], [1, 2, 3]]
>>> max_row = [max(enumerate(data), key=lambda x: x[1]) for data in rows]
>>> max_row
[(2, 3), (1, 6), (2, 3)]
>>> list(enumerate(max_row))
[(0, (2, 3)), (1, (1, 6)), (2, (2, 3))]
              ^^^^^^^^^^^
               i, (j, v)
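A variant that separates the search from the input handling, as a sketch: `locate_largest` and `parse_row` are hypothetical names, rows are parsed with `float` so that values such as 23.5 from the question's sample run are accepted, and the rows are hard-coded here in place of `input()` calls.

```python
def locate_largest(matrix):
    # Pair every value with its (row, column) position and take the largest value.
    # max() returns the first maximal item, so ties resolve in row-major order.
    _, position = max(
        ((value, (r, c)) for r, row in enumerate(matrix) for c, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return position

def parse_row(text):
    # Split a line such as "23.5 35 2 10" into a list of floats.
    return [float(token) for token in text.split()]

rows = [parse_row(line) for line in ["23.5 35 2 10", "4.5 3 45 3.5", "35 44 5.5 11.6"]]
print("The location of the largest element is at", locate_largest(rows))
```

With the sample rows from the question, the largest element is 45 in row 1, column 2, which matches the expected location (1,2).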

Resources