Merging multiple lists with same length when matching conditions - python-3.x

Given input, where each item is [<userID>, <number_of_views>]:
theList = [
[[3, 5], [1, 1], [2, 3]],
[[1, 2], [3, 5], [3, 0], [2, 3], [4, 2]],
[[1, 2], [3, 5], [3, 0], [2, 3], [4, 2]],
[[1, 2], [1, 1], [4, 2]]
]
expected output = [
[[3, 5], [2, 3], [1, 1]],
[[3, 5], [2, 3], [1, 2], [4, 2]],
[[3, 5], [2, 3], [1, 2], [4, 2]],
[[1, 3], [4, 2]]
]
For each sublist in theList, for example:
theList[3] = [[1,2], [1,1], [4,2]]
how can I merge the items that share the same userID (userID = 1 in this case) and sum their views (2 + 1 = 3 views) into a new item [1,3]?
Expected: theList[3] = [[1,3], [4,2]].
How can I apply this process to every sublist in theList?
Thanks so much for spending time on this question!

This is one approach using collections.defaultdict.
Ex:
from collections import defaultdict
theList = [
[[3, 5], [1, 1], [2, 3]],
[[1, 2], [3, 5], [3, 0], [2, 3], [4, 2]],
[[1, 2], [3, 5], [3, 0], [2, 3], [4, 2]],
[[1, 2], [1, 1], [4, 2]]
]
result = []
for i in theList:
    r = defaultdict(int)
    for j, k in i:
        r[j] += k
    result.append(list(r.items()))
print(result)
Output:
[[(3, 5), (1, 1), (2, 3)],
[(1, 2), (3, 5), (2, 3), (4, 2)],
[(1, 2), (3, 5), (2, 3), (4, 2)],
[(1, 3), (4, 2)]]
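The output above holds tuples in first-seen order, while the expected output in the question holds lists that appear to be ordered by views, highest first. A minimal sketch of that variant follows. The descending sort is an assumption read off the expected output, and the function name merge_views is made up for this example.

```python
from collections import defaultdict

def merge_views(groups):
    """Sum views per userID in each sublist, then order by views (descending)."""
    merged = []
    for group in groups:
        totals = defaultdict(int)
        for user_id, views in group:
            totals[user_id] += views
        # sorted() is stable, so ties keep their first-seen order (dicts preserve
        # insertion order in Python 3.7+).
        ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
        merged.append([[user_id, views] for user_id, views in ranked])
    return merged

theList = [
    [[3, 5], [1, 1], [2, 3]],
    [[1, 2], [3, 5], [3, 0], [2, 3], [4, 2]],
    [[1, 2], [3, 5], [3, 0], [2, 3], [4, 2]],
    [[1, 2], [1, 1], [4, 2]],
]
print(merge_views(theList)[3])  # [[1, 3], [4, 2]]
```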

Related

Converting CSV to PyG graph

I have a CSV dataset as shown below:
index | s_key | identifier | edge_pairs
0 | [1683, 1684, 1685, 1686, 1688, 1689, 1691, 1692, 1693, 1694, 1695, 1696, 1697, 1698, 1699, 12740] | [0, 0] | [[0, 793]]
1 | [9774, 9800, 9807, 9818, 9831, 9834, 9836, 9837, 9839, 9843, 13723, 21455] | [0, 1] | [[1, 3], [1, 123], [1, 152], [1, 163], [1, 266], [1, 337], [1, 351], [1, 352], [1, 355], [1, 606], [1, 869], [1, 962], [1, 1125], [1, 1412], [1, 1413], [1, 1417], [1, 1435], [1, 1440], [1, 1454], [1, 1572], [1, 1588], [1, 1653], [1, 1726], [1, 1898], [1, 2075], [1, 2076], [1, 2166], [1, 2297], [1, 2299], [1, 2319], [1, 2327], [1, 2330], [1, 2335], [1, 2393], [1, 2395], [1, 2400], [1, 2405], [1, 2486]]
3 | [2156, 2896, 3028, 4023, 4256, 6787, 7265, 8882, 8970, 9831, 10959, 11268, 11341, 12601, 13737, 17264, 18906, 20430, 21747, 22228, 22229, 22512, 22841, 24049, 25104, 25394, 25731, 26045, 26103, 31121, 31522, 31839, 31851, 31859, 31872, 35527, 35547, 36538, 37150, 37345, 37692, 37888, 37895, 38962, 45332] | [0, 3] | [[3, 8], [3, 11], [3, 12], [3, 13], [3, 27], [3, 34], [3, 99], [3, 123], [3, 125], [3, 130], [3, 132], [3, 133], [3, 134], [3, 144], [3, 147], [3, 152], [3, 154], [3, 180], [3, 181], [3, 207]]
4 | [25203, 25204, 25215, 25219, 25227, 25232, 25235, 25248, 25251, 25252, 25259, 25270] | [0, 4] | [[4, 215], [4, 322], [4, 342], [4, 793], [4, 1043], [4, 1127], [4, 1176], [4, 1454], [4, 2154], [4, 2284], [4, 2331], [4, 2400], [4, 2759], [4, 2920], [4, 3335]]
5 | [27099, 27101, 27104, 27107, 27108, 27111, 27117, 27120, 27123, 27131, 27143, 27153, 27156, 27158, 27162, 27167, 27172, 27175, 27176, 27178, 27184, 27185] | [0, 5] | [[5, 8], [5, 239], [5, 378], [5, 1163], [5, 1220], [5, 1378], [5, 1422], [5, 1440], [5, 1636], [5, 1681], [5, 2190], [5, 2303], [5, 2399]]
The index column represents each node.
The edge_pairs column lists the connections of each node.
For example, at index 0 the edge_pairs value [[0, 793]] represents a connection between node 0 and node 793, and so on.
I want to build a graph from this CSV in the format that PyG accepts: data = Data(x=x, edge_index=edge_index, y=y).
I am unsure of what to take as Node Features & Labels and how to represent the connection of edges between them.
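For the edge part of the question, one possible starting point is to flatten every edge_pairs cell into two parallel lists, sources and targets, which is the [2, num_edges] layout that edge_index uses. The sketch below uses only the standard library. The function name build_edge_index is made up, and it assumes each cell arrives either as a Python list or as its string form (as it would after reading the CSV). It does not address node features or labels.

```python
import ast

def build_edge_index(edge_pairs_column):
    """Flatten per-node edge lists into [sources, targets]."""
    sources, targets = [], []
    for cell in edge_pairs_column:
        # CSV readers return text, so parse "[[1, 3], [1, 123]]" into a list.
        pairs = ast.literal_eval(cell) if isinstance(cell, str) else cell
        for src, dst in pairs:
            sources.append(src)
            targets.append(dst)
    return [sources, targets]

cells = ["[[0, 793]]", "[[1, 3], [1, 123]]"]  # first rows of the sample, shortened
edge_index = build_edge_index(cells)
print(edge_index)  # [[0, 1, 1], [793, 3, 123]]
```

With PyTorch installed, torch.tensor(edge_index, dtype=torch.long) would turn this into a tensor suitable for Data(edge_index=...). Node ids are expected to run from 0 to num_nodes - 1, so if the dataset skips ids (the sample has no index 2), a remapping step may be needed first.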

Why removing a value from a 2D list works differently with a dynamic list vs. a fixed list (pre-defined)

If I generate a list and try to remove a value (e.g. 1) from a sub-list, it is removed from all sub-lists, but if I use a pre-defined list (identical to the one created), the result is different. Why?
The build function creates a matrix of x rows by x columns where the first item of each row is the row number, e.g.:
[0,[1,2,3],[1,2,3],[1,2,3]] [1,[1,2,3],[1,2,3],[1,2,3]] [2,[1,2,3],[1,2,3],[1,2,3]]
def build(size):
    values = []
    activetable = []
    for value in range(size):  # create the list of possible values
        values.append(value + 1)
    for row in range(size):
        # Create the "Active" table with all possible values
        activetable.append([row])
        for item in range(size):
            activetable[row].append(values)
    return activetable
This function is intended to remove a specific value from the list at the given row and column coordinates:
def remvalue(row, col, value, table):
    before = table[row][col]
    before.remove(value)
    table[row][col] = before
    return table
When I build a list and try to remove a value from one sub-list, it is removed from all sub-lists:
print("start")
table1 = build(3) # this function create a 2d table called table1
print(f" table 1: {table1}")
newtable = remvalue(row=0, col=1, value=1, table=table1)
print(f"from a dynamic table : {newtable}")
As you can see the value "1" has been removed from all sub-lists
start
table 1: [[0, [1, 2, 3], [1, 2, 3], [1, 2, 3]], [1, [1, 2, 3], [1, 2, 3], [1, 2, 3]], [2, [1, 2, 3], [1, 2, 3], [1, 2, 3]]]
from a dynamic table : [[0, [2, 3], [2, 3], [2, 3]], [1, [2, 3], [2, 3], [2, 3]], [2, [2, 3], [2, 3], [2, 3]]]
But if I use a pre-defined list with exactly the same data, the result is different
table1 = [[0, [1, 2, 3], [1, 2, 3], [1, 2, 3]], [1, [1, 2, 3], [1, 2, 3], [1, 2, 3]], [2, [1, 2, 3], [1, 2, 3], [1, 2, 3]]]
newtable = remvalue(row=0, col=1, value=1, table=table1)
print(f"from a predefined table : {newtable}")
As you can see it works as desired only when I use a pre-defined list. Why do we have this difference?
start
table 1: [[0, [1, 2, 3], [1, 2, 3], [1, 2, 3]], [1, [1, 2, 3], [1, 2, 3], [1, 2, 3]], [2, [1, 2, 3], [1, 2, 3], [1, 2, 3]]]
from a dynamic table : [[0, [2, 3], [2, 3], [2, 3]], [1, [2, 3], [2, 3], [2, 3]], [2, [2, 3], [2, 3], [2, 3]]]
from a predefined table : [[0, [2, 3], [1, 2, 3], [1, 2, 3]], [1, [1, 2, 3], [1, 2, 3], [1, 2, 3]], [2, [1, 2, 3], [1, 2, 3], [1, 2, 3]]]
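The difference comes from aliasing: build appends the same values list object to every cell, so all cells refer to one list, whereas the literal table creates a separate list for each cell. A minimal sketch of the effect and of one way to avoid it follows. The names build_shared and build_independent are made up for this illustration.

```python
def build_shared(size):
    """Same behaviour as build above: every cell refers to ONE list object."""
    values = list(range(1, size + 1))
    return [[row] + [values for _ in range(size)] for row in range(size)]

def build_independent(size):
    """Each cell gets its own freshly created list."""
    return [[row] + [list(range(1, size + 1)) for _ in range(size)]
            for row in range(size)]

shared = build_shared(3)
print(shared[0][1] is shared[2][3])  # True: two cells, one underlying list

independent = build_independent(3)
independent[0][1].remove(1)          # only this one cell changes
print(independent[0])  # [0, [2, 3], [1, 2, 3], [1, 2, 3]]
print(independent[1])  # [1, [1, 2, 3], [1, 2, 3], [1, 2, 3]]
```

In the original build, appending values.copy() (or list(values)) instead of values would have the same effect.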

Removing a list from a list of lists! PYTHON

a1=[[1, 2], [2, 3], [2, 4],[3, 4] ,[3, 6], [4, 5]]
I want the output to be:
a1=[[1, 2], [2, 3], [3, 4], [4, 5]]
I've tried removing the items with a for loop, but it throws an "index out of range" error.
You can use pop() if you want to remove by index (e.g. the element at index 4):
In [1]: a1 = [[1, 2], [2, 3], [2, 4],[3, 4] ,[3, 6], [4, 5]]
In [2]: a1.pop(4)
Out[2]: [3, 6]
In [3]: a1
Out[3]: [[1, 2], [2, 3], [2, 4], [3, 4], [4, 5]]
Or, you can remove by specifying the element:
In [4]: a1 = [[1, 2], [2, 3], [2, 4],[3, 4] ,[3, 6], [4, 5]]
In [5]: a1.remove([3, 6])
In [6]: a1
Out[6]: [[1, 2], [2, 3], [2, 4], [3, 4], [4, 5]]
The answer is very simple: just use the pop() method.
https://www.geeksforgeeks.org/python-list-pop/
For your case it would be:
a1.pop(4)
You can call pop() in a loop to remove multiple items.
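Note that the expected output drops two items, [2, 4] and [3, 6], and that removing items while looping forward over the same list shifts the remaining indices, which is a common cause of the "index out of range" error. Two hedged alternatives are sketched below: building a new filtered list, or popping from the highest index down. The variable names unwanted and filtered are made up for this example.

```python
a1 = [[1, 2], [2, 3], [2, 4], [3, 4], [3, 6], [4, 5]]

# Option 1: build a new list that skips the unwanted items.
unwanted = [[2, 4], [3, 6]]
filtered = [pair for pair in a1 if pair not in unwanted]
print(filtered)  # [[1, 2], [2, 3], [3, 4], [4, 5]]

# Option 2: pop in place, highest index first, so the indices that are
# still to be removed do not shift.
for index in sorted([2, 4], reverse=True):
    a1.pop(index)
print(a1)  # [[1, 2], [2, 3], [3, 4], [4, 5]]
```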

Recursion - Euler 15

I am aware that there are published solutions to Euler 15. I technically have a working solution (it yields the correct number), but when it prints the routes, it does so incorrectly.
def legal_moves (row, column, grid):
    legal_moves = []
    if row != grid:
        legal_moves.append ([row+1,column])
    if column != grid:
        legal_moves.append ([row,column+1])
    if column == grid and row == grid:
        return False
    return legal_moves
def find_route (row,column,grid, route):
    l_moves = legal_moves (row,column,grid)
    if l_moves == False:
        route.append ([row,column])
        list_routes.append (route)
        return
    else:
        route.append ([row,column])
        if len(l_moves) == 1:
            row = l_moves[0][0]
            column = l_moves[0][1]
            find_route (row,column,grid,route)
        if len(l_moves) ==2:
            row_a, column_a = l_moves[0][0], l_moves[0][1]
            row_b, column_b = l_moves[1][0], l_moves[1][1]
            find_route (row_a,column_a,grid,route)
            find_route (row_b,column_b,grid,route)
grid = int(input("Enter A for grid size AxA: "))
list_routes = []
find_route(0,0,grid, route = [])
for item in list_routes:
    print (item)
    print ()
I ran it for board size 2 (so input: grid = 2) and the terminal printed
[[0, 0], [1, 0], [2, 0], [2, 1], [2, 2], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 1], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 2], [1, 2], [2, 2]]
[[0, 0], [1, 0], [2, 0], [2, 1], [2, 2], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 1], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 2], [1, 2], [2, 2]]
[[0, 0], [1, 0], [2, 0], [2, 1], [2, 2], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 1], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 2], [1, 2], [2, 2]]
[[0, 0], [1, 0], [2, 0], [2, 1], [2, 2], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 1], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 2], [1, 2], [2, 2]]
[[0, 0], [1, 0], [2, 0], [2, 1], [2, 2], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 1], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 2], [1, 2], [2, 2]]
[[0, 0], [1, 0], [2, 0], [2, 1], [2, 2], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 1], [1, 1], [2, 1], [2, 2], [1, 2], [2, 2], [0, 2], [1, 2], [2, 2]]
I cannot work out why line 1 does not just print:
[0, 0], [1, 0], [2, 0], [2, 1], [2, 2]
and line 2:
[0, 0], [1, 0], [1, 1], [2, 1], [2, 2]
etc...
Please can someone help me understand why?
Thanks
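The six identical lines are what you would expect when every recursive call appends to the same route list object: list_routes ends up holding six references to that one list, which by then contains every visited cell. A minimal sketch of one fix is to give each call its own list, as below. The function find_routes and its routes parameter are a rewrite for illustration, not the original code.

```python
def find_routes(row, column, grid, route, routes):
    # route + [...] creates a NEW list, so sibling branches never share state.
    route = route + [[row, column]]
    if row == grid and column == grid:
        routes.append(route)
        return
    if row != grid:
        find_routes(row + 1, column, grid, route, routes)
    if column != grid:
        find_routes(row, column + 1, grid, route, routes)

routes = []
find_routes(0, 0, 2, [], routes)
print(len(routes))  # 6
print(routes[0])    # [[0, 0], [1, 0], [2, 0], [2, 1], [2, 2]]
print(routes[1])    # [[0, 0], [1, 0], [1, 1], [2, 1], [2, 2]]
```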

How can I generate all possible sums in the following sequence?

Suppose I have an array, say [1,2,3,4].
I want to find the sum as follows.
First I generate groupings like:
(1234)
(123)(4)
(1)(234)
(12)(34)
(12)(3)(4)
(1)(23)(4)
(1)(2)(34)
(1)(2)(3)(4)
The answer is then, for every group in every arrangement, the sum of the group's elements multiplied by the length of that group.
E.g. for the arrangement (123)(4), the value would be
(1+2+3)*3 + (4)*1
I just want the final sum, which is the sum of all such values, not the actual groups. How can I do this?
I was able to do it by first generating all possible groups and then finding the sum.
But since I only need the sum and not the actual groups, is there a better way?
The number of arrangements is 2**(len(L)-1), so a list of 8 elements produces 128 different arrangements. Enumerating them is exponential: whether you generate all arrangements first and then score each one, or score each one on the fly, the work is still exponential.
def part1(L, start, lsum):
    # Score each arrangement on the fly, carrying the running total in lsum.
    if start == len(L):
        print(lsum)
    else:
        for i in range(start, len(L)):
            left = sum(L[start:i+1]) * (i-start+1)
            part1(L, i + 1, lsum + left)
def part2(L, M, X, start):
    # Build each arrangement X explicitly, store it in M, then score it.
    if start == len(L):
        M.append(X)
        print(sum([sum(x) * len(x) for x in X]))
    else:
        for i in range(start, len(L)):
            part2(L, M, X + [L[start:i+1]], i + 1)
Example:
>>> L = [1, 2, 3, 4]
>>> part1(L, 0, 0)
10
17
15
28
13
20
22
40
>>> M = []
>>> part2(L, M, [], 0)
10
17
15
28
13
20
22
40
Edit: the sum of all the sums in O(n**3).
For L = [1,2,3,4,5,6], the arrangements are:
[[[1], [2], [3], [4], [5], [6]],
[[1], [2], [3], [4], [5, 6]],
[[1], [2], [3], [4, 5], [6]],
[[1], [2], [3], [4, 5, 6]],
[[1], [2], [3, 4], [5], [6]],
[[1], [2], [3, 4], [5, 6]],
[[1], [2], [3, 4, 5], [6]],
[[1], [2], [3, 4, 5, 6]],
[[1], [2, 3], [4], [5], [6]],
[[1], [2, 3], [4], [5, 6]],
[[1], [2, 3], [4, 5], [6]],
[[1], [2, 3], [4, 5, 6]],
[[1], [2, 3, 4], [5], [6]],
[[1], [2, 3, 4], [5, 6]],
[[1], [2, 3, 4, 5], [6]],
[[1], [2, 3, 4, 5, 6]],
[[1, 2], [3], [4], [5], [6]],
[[1, 2], [3], [4], [5, 6]],
[[1, 2], [3], [4, 5], [6]],
[[1, 2], [3], [4, 5, 6]],
[[1, 2], [3, 4], [5], [6]],
[[1, 2], [3, 4], [5, 6]],
[[1, 2], [3, 4, 5], [6]],
[[1, 2], [3, 4, 5, 6]],
[[1, 2, 3], [4], [5], [6]],
[[1, 2, 3], [4], [5, 6]],
[[1, 2, 3], [4, 5], [6]],
[[1, 2, 3], [4, 5, 6]],
[[1, 2, 3, 4], [5], [6]],
[[1, 2, 3, 4], [5, 6]],
[[1, 2, 3, 4, 5], [6]],
[[1, 2, 3, 4, 5, 6]]]
There seems to be a pattern. The special case is the groups that start with the first element of the sequence: all 32 arrangements contain one. For every other element, 16 arrangements contain a group that starts with it. So, for each element of the list, I add up all the groups that have that element as their first element.
def part3(L):
    ret = 0
    for i in range(len(L)):
        p = 0
        for k in range(len(L) - i - 1):
            p += sum(L[i:i+k+1]) * (k+1) * 2**(len(L) - i - k - 2)
        p += sum(L[i:]) * (len(L) - i)
        ret += p * max(1, 2**(i-1))
    return ret
Edit 2: to lower it to O(n^2) you need to use DP, building a table of prefix sums so that each group sum is calculated in O(1). Build an array S with S[0] = 0 and S[i+1] = S[i] + L[i]; then sum(L[a:b]) is S[b] - S[a].
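The prefix-sum idea can be sketched as follows. This is one possible O(n^2) arrangement of the same counting argument as part3, not the answerer's own code: each group L[start:end] is weighted by the number of ways to split what comes before it and what comes after it. The names total_of_all_arrangements and ways are made up for this example.

```python
from itertools import accumulate

def total_of_all_arrangements(L):
    n = len(L)
    prefix = [0] + list(accumulate(L))  # prefix[j] == sum(L[:j])

    def ways(m):
        # Number of ways to split m consecutive elements into groups.
        return 1 if m == 0 else 2 ** (m - 1)

    total = 0
    for start in range(n):
        for end in range(start + 1, n + 1):
            group_sum = prefix[end] - prefix[start]  # O(1) instead of sum(...)
            total += group_sum * (end - start) * ways(start) * ways(n - end)
    return total

print(total_of_all_arrangements([1, 2, 3, 4]))  # 165
```

For [1, 2, 3, 4] this equals 10 + 17 + 15 + 28 + 13 + 20 + 22 + 40, the sum of the per-arrangement values printed by part1.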
