Related
I have a list[(position,id)] from the user based on the ids it chooses which I collect in data
data[(position,id)] = [(1,0),(2,0),(7,3),(8,6),(3,11),(3,11),(4,0),(5,1),(5,1),(6,2),(9,5),(10,7),(15,0),(16,10),(11,0),(11,1),(12,15),(13,8),(13,8),(13,9),(14,9)]
There are some duplicate elements in the list which I sorted out using *set(list) command
listdata = data
res=[]
res = [*set(listdata)]
print(res)
I get the res list as follows:
output: res = [(11, 1), (13, 8), (6, 2), (4, 0), (16, 10), (11, 0), (2, 0), (5, 1), (10, 7), (7, 3), (9, 5), (15, 0), (13, 9), (14, 9), (8, 6), (12, 15), (1, 0), (3, 11)]
But what I want is only 16 elements in the list on first come basis with positions 1 to 16, if you see here I have got 2 entries with position 11 [(11,1),(11,0)] and position 13[(13,8),(13,9)]
Required output:
res=[(11, 1), (13, 8), (6, 2), (4, 0), (16, 10), (2, 0), (5, 1), (10, 7), (7, 3), (9, 5), (15, 0), (14, 9), (8, 6), (12, 15), (1, 0), (3, 11)]
Can anyone suggest alternate solution?
This should probably be done with a generator.
res = [(11, 1), (13, 8), (6, 2), (4, 0), (16, 10), (11, 0), (2, 0), (5, 1), (10, 7), (7, 3), (9, 5), (15, 0), (13, 9), (14, 9), (8, 6), (12, 15), (1, 0), (3, 11)]
def foo(_res, max_items):
keys = []
for k,v in _res:
if k not in keys:
yield (k,v)
keys.append(k)
if len(keys) > max_items:
return
print(list(foo(res, 16)))
output:
[(11, 1), (13, 8), (6, 2), (4, 0), (16, 10), (2, 0), (5, 1), (10, 7), (7, 3), (9, 5), (15, 0), (14, 9), (8, 6), (12, 15), (1, 0), (3, 11)]
you can use a map to record every position only once
data = [(1,0),(2,0),(7,3),(8,6),(3,11),(3,11),(4,0),(5,1),(5,1),(6,2),(9,5),(10,7),(15,0),(16,10),(11,0),(11,1),(12,15),(13,8),(13,8),(13,9),(14,9)]
bucket = {}
for d in data:
bucket[d[0]] = d[1]
print(list(bucket.items()))
output is:
[(1, 0), (2, 0), (7, 3), (8, 6), (3, 11), (4, 0), (5, 1), (6, 2), (9, 5), (10, 7), (15, 0), (16, 10), (11, 1), (12, 15), (13, 9), (14, 9)]
update:
I am sorry I missed the description: "But what I want is only 16 elements in the list on first come basis", my answer above keeps the later came value, if you want to keep the first came one, do something like this:
data = [(1,0),(2,0),(7,3),(8,6),(3,11),(3,11),(4,0),(5,1),(5,1),(6,2),(9,5),(10,7),(15,0),(16,10),(11,0),(11,1),(12,15),(13,8),(13,8),(13,9),(14,9)]
bucket = {}
for d in data:
if d[0] not in bucket:
# if you never seen it, keep it, else ingore it.
bucket[d[0]] = d[1]
print(list(bucket.items()))
I've got a list of tuples where I need to return a list of the frequency of elements per every 10-second interval depending on a variable ie
data_list =[(0, 84), (1, 84), (2, 84), (3, 84), (4, 84), (5, 84), (6, 84), (7, 84), (8, 84), (9, 84), (10, 84), (11, 84), (12, 84), (13, 84), (14, 84), (15, 84), (16, 84), (17, 84), (18, 84), (19, 84), (20, 84)]
and size = 3
should return
[[0, 10], [1, 10], [2, 1]]
as there are 10(index 2) elements in range 0-9(index 1), 10 elements in range 10-19 and 1 element in range 20-29
I was thinking about creating a for loop that creates x many lists depending on the variable size but not sure that would work at all. Then I tried using a counter but not sure how I would group them in groups of 10 and by the index
Any ideas would be much appreciated.
from collections import Counter
def get_frequency(tuple_list):
x = Counter(elem[0] for elem in tuple_list)
return x
A file (included with two examples) is a list of banned number intervals. A line that contains, for example, 12-18, indicates that all numbers 12 to (inclusive) 18 are prohibited. The intervals may overlap.
We want to know what the minimum number is.
Use variables to analyze run-time (not necessarily need all them):
• N: Maximum (not maximum permissible) number; So the numbers are between 0 and N
• K: number of intervals in a file
• M: width of maximum interval.
A. There is an obvious way to solve this problem: we're checking all numbers until we run into the smallest allowed.
• How fast is such an algorithm?
B. You can probably imagine another simple algorithm that uses N bytes (or bits) of memory.
(Hint: strikethrough.)
• Describe it with words. For example, you can make your own assignment (say a few intervals with numbers between 0 and 20), and show the algorithm on them. However, it also draws up a general description.
• How fast is this algorithm? When thinking, use N, K, and M (if you need it).
C. Make an algorithm that does not consume additional memory (more accurately: the memory consumption should be independent of N, K and M), but it is faster than the algorithm under point A.
• Describe it.
• How fast is it? Is it faster than the B algorithm?
D. Now we are interested in how many numbers are allowed (between 0 and N). How would you adjust the above algorithms for this question? What happens to their rates?
file = "0-19.txt"
intervals = [tuple(map(int, v.split("-"))) for v in open(file)]
#example# intervals = [(12, 18), (2, 5), (3, 8), (0, 4), (15, 19), (6, 9), (13, 17), (4, 8)]#
my current code just executes the program but better algorithms for the code i am yet to figure, still need a lot of work to understand, i would need a quick solution code/algorithm for examples A, B, and C and maybe D. Then i can study the time analysis myself. Appreciate help!
def generator_intervala(start, stop, step):
forbidden_numbers = set()
while start <= stop:
forbidden_numbers.add(start)
start += step
return (forbidden_numbers)
mnozica = set()
for interval in intervals:
a, b = interval
values = (generator_intervala(a, b, 1))
for i in values:
mnozica.add(i)
allowed_numbers = set()
N = max(mnozica)
for i in range(N):
if i not in mnozica:
allowed_numbers.add(i)
print(intervals)
print(mnozica)
print(min(allowed_numbers))
print(max(mnozica))
Output:
[(12, 18), (2, 5), (3, 8), (0, 4), (15, 19), (6, 9), (13, 17), (4, 8)]
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19}
10
19
Your set approach is needlessly complex:
N = 100
ranges = [(12, 18), (2, 5), (3, 8), (0, 4), (15, 19), (6, 9), (13, 17), (4, 8)]
do_not_use = set()
for (a,b) in ranges:
do_not_use.update(range(a,b+1))
print(do_not_use)
print( min(a for a in range(N+1) if a not in do_not_use))
Is about all that is needed. Output:
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19])
10
This is independend of N it just depends on how many numbers are in the ranges.
Storing only forbidden numbers in a set takes O(1) for checking, using the min() buildin over a range to get the minimum.
You can make it faster if you sort your tuples first and then iterate them until you find the first gap making it Θ(N log N) for the sort, followed by Θ(N) for the search:
def findme():
ranges = [(12, 18), (2, 5), (3, 8), (0, 4), (15, 19), (6, 9), (13, 17), (4, 8)]
ranges.sort() # inplace sort, no additional space requirements
if ranges[0][0]>0:
return 0
for ((a_min,a_max),(b_min,b_max)) in zip(ranges,ranges[1:]):
if a_max < b_min-1:
return a_max+1
return ranges[-1][1]+1 # might give you N+1 if no solution in 0-N exists
timeit of yours vs mine:
Your code uses 2 sets, as well as multiple loops, incremental addition to your set and function calls that makes it slower:
N = 100
def findme():
ranges = [(12, 18), (2, 5), (3, 8), (0, 4), (15, 19), (6, 9), (13, 17), (4, 8)]
ranges.sort()
if ranges[0][0]>0:
return 0
for ((a_min,a_max),(b_min,b_max)) in zip(ranges,ranges[1:]):
if a_max < b_min-1:
return a_max+1
return ranges[-1][1]+1
def mine():
ranges = [(12, 18), (2, 5), (3, 8), (0, 4), (15, 19), (6, 9), (13, 17), (4, 8)]
N = 100
do_not_use = set()
for (a,b) in ranges:
do_not_use.update(range(a,b+1))
return min(a for a in range(N+1) if a not in do_not_use)
def yours():
ranges = [(12, 18), (2, 5), (3, 8), (0, 4), (15, 19), (6, 9), (13, 17), (4, 8)]
def generator_intervala(start, stop, step):
forbidden_numbers = set()
while start <= stop:
forbidden_numbers.add(start)
start += step
return (forbidden_numbers)
mnozica = set()
for interval in ranges:
a, b = interval
values = (generator_intervala(a, b, 1))
for i in values:
mnozica.add(i)
allowed_numbers = set()
N = max(mnozica)
for i in range(N):
if i not in mnozica:
allowed_numbers.add(i)
return min(allowed_numbers)
import timeit
print("yours", timeit.timeit(yours,number=100000))
print("mine", timeit.timeit(mine,number=100000))
print("findme", timeit.timeit(findme,number=100000))
Output:
yours 1.3931225209998956
mine 1.263602267999886
findme 0.1711935210005322
I am trying to find a shortest path that passes through a set of nodes [4,7,9] (order does not need to be preserved) and then returns to the origin (node 1). I have the set of edges:
E = [(1, 10), (1, 11), (2, 3), (2, 10), (3, 2), (3, 12), (4, 5), (4, 12), (5, 4), (5, 14), (6, 7), (6, 11), (7, 6), (7, 13), (8, 9), (8, 13), (9, 8), (9, 15), (10, 1), (10, 11), (10, 2), (11, 1), (11, 10), (11, 6), (12, 13), (12, 3), (12, 4), (13, 12), (13, 7), (13, 8), (14, 15), (14, 5), (15, 14), (15, 9)]
and I tried adapting the answer at How can I use BFS to get a path containing some given nodes in order? but yielded the error:
Traceback (most recent call last):
File "C:/Users/../rough-work.py", line 41, in <module>
graph[edge[0]].link(graph[edge[-1]])
KeyError: 15
My adapted code is as follows:
class Node:
def __init__(self, name):
self.name = name
self.neighbors = []
def link(self, node):
# The edge is undirected: implement it as two directed edges
self.neighbors.append(node)
node.neighbors.append(self)
def shortestPathTo(self, target):
# A BFS implementation which retains the paths
queue = [[self]]
visited = set()
while len(queue):
path = queue.pop(0) # Get next path from queue (FIFO)
node = path[-1] # Get last node in that path
for neighbor in node.neighbors:
if neighbor == target:
# Found the target node. Return the path to it
return path + [target]
# Avoid visiting a node that was already visited
if not neighbor in visited:
visited.add(neighbor)
queue.append(path + [neighbor])
###
n = 15
nodes = list(range(1,n))
E = [(1, 10), (1, 11), (2, 3), (2, 10), (3, 2), (3, 12), (4, 5), (4, 12), (5, 4), (5, 14), (6, 7), (6, 11), (7, 6), (7, 13), (8, 9), (8, 13), (9, 8), (9, 15), (10, 1), (10, 11), (10, 2), (11, 1), (11, 10), (11, 6), (12, 13), (12, 3), (12, 4), (13, 12), (13, 7), (13, 8), (14, 15), (14, 5), (15, 14), (15, 9)]
# Create the nodes of the graph (indexed by their names)
graph = {}
for letter in nodes:
graph[letter] = Node(letter)
print(graph)
# Create the undirected edges
for edge in E:
graph[edge[0]].link(graph[edge[-1]])
# Concatenate the shortest paths between each of the required node pairs
start = 1
path = [graph[1]]
for end in [4,7,9,1]:
path.extend( graph[start].shortestPathTo(graph[end])[1:] )
start = end
# Print result: the names of the nodes on the path
print([node.name for node in path])
What could possibly be the problem with the code? I will like to extend the graph to a arbitrarily large number of nodes, greater than 26 - the number of alphabets (as I infer that the previous implementation was only for character-based nodes). Or, if there is a more straightforward way in doing this that will be great!
Thanks and some help will be deeply appreciated!
The KeyError: 15 and your line print(graph) should have given you the clue: the latter shows that your graph dictionary contains only 14 entries, whereas your edges in E clearly make reference to 15 separate indices.
Change n = 15 to n = 16 and it works:
[1, 10, 2, 3, 12, 4, 12, 13, 7, 13, 8, 9, 8, 13, 7, 6, 11, 1]
Remember that:
>>> len(list(range(1,16)))
15
I'm trying to convert .txt to .gpickle in order to obtain the nodes and edges in networkx. I used the following codes to do so:
M = open("data.txt", "r")
G=nx.path_graph(M)
>>> nx.write_gpickle(G,"data.gpickle")
>>> G=nx.read_gpickle("data.gpickle")
After looking up the nodes and edges:
G.nodes()
G.edges()
I got outputs such as NodeView(()) and EdgeView([]), which should contain numerical values in the brackets. I assume that G=nx.path_graph(M) is the problem since it worked fine when I tried using the example from the reference:
>>> G = nx.path_graph(4)
>>> nx.write_gpickle(G, "test.gpickle")
>>> G = nx.read_gpickle("test.gpickle")
What you have is a weighted adjacency matrix in data.txt, the example you are using is to create a path graph, which has nothing to do with the information in your data. In order to create the proper graph, networkx cannot read it directly with that format. However, you can use numpy or pandas to read data.txt and then convert it to a networkx graph.
See the following code to get your graph with numpy:
import numpy as np
import networkx as nx
numpy_array = np.genfromtxt('data.txt', delimiter='\t', dtype='float')
G = nx.from_numpy_array(numpy_array)
Now you will have
In [1]: G.nodes()
Out[1]: NodeView((0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15))
In [2]: G.edges()
Out[2]: EdgeView([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (0, 10), (0, 11), (0, 12), (0, 13), (0, 14), (0, 15), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11), (1, 12), (1, 13), (1, 14), (1, 15), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (2, 10), (2, 11), (2, 12), (2, 13), (2, 14), (2, 15), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (3, 10), (3, 11), (3, 12), (3, 13), (3, 14), (3, 15), (4, 5), (4, 6), (4, 7), (4, 8), (4, 9), (4, 10), (4, 11), (4, 12), (4, 13), (4, 14), (4, 15), (5, 6), (5, 7), (5, 8), (5, 9), (5, 10), (5, 11), (5, 12), (5, 13), (5, 14), (5, 15), (6, 7), (6, 8), (6, 9), (6, 10), (6, 11), (6, 12), (6, 13), (6, 14), (6, 15), (7, 8), (7, 9), (7, 10), (7, 11), (7, 12), (7, 13), (7, 14), (7, 15), (8, 9), (8, 10), (8, 11), (8, 12), (8, 13), (8, 14), (8, 15), (9, 10), (9, 11), (9, 12), (9, 13), (9, 14), (9, 15), (10, 11), (10, 12), (10, 13), (10, 14), (10, 15), (11, 12), (11, 13), (11, 14), (11, 15), (12, 13), (12, 14), (12, 15), (13, 14), (13, 15), (14, 15)])
To save your graph with gpickle format you do:
nx.write_gpickle(G, 'my_graph.gpickle')
Now you should be able to read it with G = nx.read_gpickle('my_graph.gpickle').
Depend on your data, you may read them directly, using networkx read_edgelist function, then write it into a pickle using its Doc:
G = nx.read_edgelist('test.txt', delimiter='\t', data=[('weight', int)], create_using=nx.DiGraph())
nx.write_gpickle(G, "test.gpickle")