Extract list from all_simple_paths and their lengths in python - python-3.x

I have a long list of sources and targets that form a graph as follows:
id_a = [...] #source nodes
id_b = [...] #target nodes
distance = [..] #distance between source and target nodes
G = nx.Graph()
path, length = [], []
for a, b, c in zip(id_a, id_b, distance):
    G.add_edge(a, b, weight=c)
cl is a subset of the nodes in the graph, and I want to extract the paths connecting every pair of nodes in cl, so I use all_simple_paths():
path = []
for i in range(len(cl)):
    for j in range(len(cl)):
        if i != j:
            path.append(nx.all_simple_paths(G, source=cl[i], target=cl[j]))
I want to be able to list all the simple paths and their lengths so I try:
for i in range(len(path)):
    total_length = 0
    for j in range(len(path[i])-1):
        source, target = path[i][j], path[i][j+1]
        edge = G[source][target]
        length = edge['weight']
        total_length += length
    length.append(total_length)
But I keep getting the error
object of type 'generator' has no len()
And I can't figure out how to convert the generator of all_simple_paths() to lists that I can iterate over and extract the full lengths of all the paths.
Any help is appreciated!

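If you read the documentation of all_simple_paths, you will see that it returns a generator. append stores that generator object itself in the list, which is why len() fails; extend iterates over the generator and adds each path (a list of nodes) to the list. So just use extend instead of append, like this: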
path = []
for i in range(len(cl)):
    for j in range(len(cl)):
        if i != j:
            path.extend(nx.all_simple_paths(G, source=cl[i], target=cl[j]))
For more info on why extend works in this case, see this answer.
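The difference between append and extend on a generator can be seen without networkx. The following is a minimal sketch; gen_paths is a hypothetical stand-in for nx.all_simple_paths and is not part of any library.

```python
def gen_paths():
    # Hypothetical stand-in for nx.all_simple_paths: yields one path at a time
    yield [1, 2, 3]
    yield [1, 3]

appended = []
appended.append(gen_paths())   # stores the generator object itself

extended = []
extended.extend(gen_paths())   # consumes the generator, storing each path

print(len(appended))  # 1 (a single generator object, not a path)
print(extended)       # [[1, 2, 3], [1, 3]]
```

Only the second list contains items that support len() and indexing.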
Also, in the last part of your code you assign length = edge['weight'] and then call length.append(total_length). This raises an error, because after that assignment length is a number (the edge weight), not a list. Use different variable names, something like this:
path_weight = []  # <----- List to store all paths' weights
for i in range(len(path)):
    total_length = 0
    for j in range(len(path[i])-1):
        source, target = path[i][j], path[i][j+1]
        edge = G[source][target]
        length = edge['weight']  # <--- Get the weight
        total_length += length
    path_weight.append(total_length)  # Append to the list
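The same summation can be written more compactly by pairing each node with its successor. This is only a sketch: path_length is a hypothetical helper, and toy is a plain dict of dicts that is indexed the same way as G[source][target]['weight'] above, so the example runs without networkx.

```python
def path_length(graph, path, weight="weight"):
    """Sum the edge weights along consecutive node pairs of `path`."""
    return sum(graph[u][v][weight] for u, v in zip(path, path[1:]))

# Hypothetical toy graph, indexed like G[u][v]['weight']
toy = {
    "a": {"b": {"weight": 2}, "c": {"weight": 7}},
    "b": {"a": {"weight": 2}, "c": {"weight": 3}},
    "c": {"a": {"weight": 7}, "b": {"weight": 3}},
}
paths = [["a", "b", "c"], ["a", "c"]]
path_weight = [path_length(toy, p) for p in paths]
print(path_weight)  # [5, 7]
```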

Related

Don't understand why I am getting list index out of range here

I wrote this to simulate studying for a test
#number of questions
qList = [] * q
#question numbers
for i in range(len(qList)):
    qList[i] = i + 1
#questions studied
sList = [] * s
for i in range(0, s):
    x = randint(0, len(qList))
    sList[i] = qList[x]
#questions passed
pList = []
for i in range(len(sList)):
    if i in sList and i in qList:
        pList.append(i)
The line sList[i] = qList[x] gives me an index out of range error. I haven't used lists in a while and I can't figure out what is wrong here.
I am trying to get the output here as three lists
a list of questions
the questions that have been studied
the questions passed
randint includes both boundaries:
Return a random integer N such that a <= N <= b. Alias for randrange(a, b+1).
Change invocation to:
x = randint(0, len(qList) - 1)
# or
x = randrange(0, len(qList))
Also, [] * q and [] * s always produce empty lists, because repeating an empty list yields an empty list, so there is nothing to index. Put an element in the list so it can be repeated (or better, use list comprehensions):
qList = [1] * q
sList = [1] * s
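Both points can be combined in a short sketch using list comprehensions. The values of q and s are hypothetical, and the names q_list and s_list are illustrative rather than taken from the question.

```python
import random

q, s = 10, 4  # hypothetical: number of questions and number studied

print([] * q)  # [] (repeating an empty list stays empty)

# Question numbers 1..q
q_list = [i + 1 for i in range(q)]

# randrange excludes the upper bound, so the index is always valid
s_list = [q_list[random.randrange(len(q_list))] for _ in range(s)]
print(q_list, s_list)
```

Note that this draws with replacement, so the same question may be studied twice, as in the original loop.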

Extracting groundlevel from Res1d file

I would like to know if it's possible to extract the ground level of nodes from the result file. I have already extracted the water level and computed the maximum values, but for this to be truly useful, I need the ground level to determine whether a node is flooded.
I use this function to get the coordinates:
def get_node_geometry(nodes):
    allnodes = [[n.ID, Point(n.XCoordinate, n.YCoordinate)] for n in nodes]
    allnode_df = pd.DataFrame(allnodes, columns=['MUID', 'geometry'])
    allnode_df.set_index('MUID', inplace=True)
    return gpd.GeoDataFrame(allnode_df, geometry='geometry')
and these lines to get waterlevel:
varname = "WaterLevel"
queries_all = [QueryDataNode(varname, n.Id) for n in res1d.data.Nodes]
df_node = res1d.read(queries_all)

Writing sequences into separate list or array

I'm trying to extract these sequences from a file into separate lists or arrays in Python.
My data looks like:
>gene_FST
AGTGGGTAATG--TGATG...GAAATTTG
>gene_FPY
AGT-GG..ATGAAT---AAATGAAAT--G
I would like to have
seq1 = [AGTGGGTAATG--TGATG...GAAATTTG]
seq2 = [AGT-GG..ATGAAT---AAATGAAAT--G]
My plan is to later compare the contents of the lists.
I would appreciate any advice.
So far, here's what I have done:
f = open(r"C:\Users\Olukayode\Desktop\my_file.txt", 'r')  # the r prefix makes this a raw string

def parse_fasta(lines):
    seq = []
    seq1 = []
    seq2 = []
    head = []
    data = ''
    for line in lines:
        if line.startswith('>'):
            if data:
                seq.append(data)
                data = ''
            head.append(line[1:])
        else:
            data += line.rstrip()
    seq.append(data)
    return seq

h = parse_fasta(f)
print(h)
print(h[0])
print(h[1])
gives:
['AGTGGGTAATG--TGATG...GAAATTTG', 'AGT-GG..ATGAAT---AAATGAAAT--G']
AGTGGGTAATG--TGATG...GAAATTTG
AGT-GG..ATGAAT---AAATGAAAT--G
I think I just figured it out: I can put each string from the list containing both sequences into its own list, if that is possible.
If you want to get the exact results you were looking for in your original question, i.e.
seq1 = [AGTGGGTAATG--TGATG...GAAATTTG]
seq2 = [AGT-GG..ATGAAT---AAATGAAAT--G]
you can do it in a variety of ways. Instead of changing anything you already have though, you can just convert your data into a dictionary and print the dictionary items.
your code block...
h = parse_fasta(f)
sDict = {}
for i in range(len(h)):
    sDict["seq"+str(i+1)] = [h[i]]
for seq, data in sDict.items():
    print(seq, "=", data)
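An alternative is to key the dictionary by the FASTA header instead of a generated seqN name, so each sequence stays attached to its gene. This is a sketch under that assumption; parse_fasta_dict is a hypothetical helper, and sample stands in for the lines read from the file.

```python
def parse_fasta_dict(lines):
    """Map each header (without '>') to its concatenated sequence."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records

# Hypothetical input mirroring the data in the question
sample = [
    ">gene_FST\n",
    "AGTGGGTAATG--TGATG...GAAATTTG\n",
    ">gene_FPY\n",
    "AGT-GG..ATGAAT---AAATGAAAT--G\n",
]
records = parse_fasta_dict(sample)
for name, sequence in records.items():
    print(name, "=", [sequence])
```

Passing an open file object instead of sample works the same way, since both yield lines one at a time.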

How to apply multiprocessing in python3.x for the following nested loop

for i in range(1, row):
    for j in range(1, col):
        if i > j and i != j:
            x = Aglo[0][i][0]
            y = Aglo[j][0][0]
            Aglo[j][i] = offset.myfun(x, y)
            Aglo[i][j] = Aglo[j][i]
Aglo[][] is a 2D array, which consists of lists in the first row
offset.myfun() is a function defined elsewhere
This might be a trivial question, but I couldn't understand how to use multiprocessing for these nested loops, as x and y (used in myfun()) are different for each process (if multiprocessing is used).
Thank you
If I'm reading your code right, you are not overwriting any previously calculated values. If that's true, then you can use multiprocessing. If not, then you can't guarantee that the results from multiprocessing will be in the correct order.
To use something like multiprocessing.Pool, you would need to gather all valid (x, y) pairs to pass to offset.myfun(). Something like this might work (untested):
from multiprocessing import Pool

# gather every (i, j) pair together with its inputs
pairs = [(i, j, Aglo[0][i][0], Aglo[j][0][0])
         for i in range(1, row) for j in range(1, col) if i > j]
# offset.myfun now needs to take a tuple (i, j, x, y) instead of x, y
# it additionally needs to emit i and j in addition to the return value
# e.g. (i, j, result)
with Pool(4) as p:
    results = p.map(offset.myfun, pairs)
# fill in Aglo with the results (iterate over results, not pairs)
for i, j, value in results:
    Aglo[i][j] = value
    Aglo[j][i] = value
You will need to pass in i and j to offset.myfun because otherwise there is no way to know which result goes where. offset.myfun should then return i and j along with the result so you can fill in Aglo appropriately. Hope this helps.
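If changing offset.myfun itself is not desirable, a small wrapper can carry i and j through instead. The following is a self-contained sketch of that idea: combine is a hypothetical stand-in for offset.myfun, values stands in for the first row and column of Aglo, and worker, build_tasks and fill_table are illustrative names.

```python
from multiprocessing import Pool

def combine(x, y):
    # Hypothetical stand-in for offset.myfun
    return x + y

def worker(task):
    # Unpack one task and return the indices along with the result
    i, j, x, y = task
    return i, j, combine(x, y)

def build_tasks(values):
    # One task per pair with i > j (index 0 is skipped, as in the question)
    n = len(values)
    return [(i, j, values[i], values[j]) for i in range(1, n) for j in range(1, i)]

def fill_table(values, processes=2):
    n = len(values)
    table = [[None] * n for _ in range(n)]
    with Pool(processes) as pool:
        for i, j, result in pool.map(worker, build_tasks(values)):
            table[i][j] = table[j][i] = result
    return table

if __name__ == "__main__":
    print(fill_table([0, 10, 20, 30]))
```

The worker must be defined at module level so that it can be pickled and sent to the worker processes, and the __main__ guard keeps child processes from re-running the pool setup on platforms that spawn rather than fork.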

python3 string to variable

I am currently trying to implement Conway's Game of Life, and therefore built a function which generates the coordinates depending on the size of the window.
def coords_maker(num_x, num_y):
    num_x += 1
    num_y += 1
    coords = []
    for i in range(0, num_y, 1):
        for n in range(0, num_x, 1):
            coords.append('x'+str(n)+'y'+str(i))
    return coords
Yet I would like to randomly assign values to the resulting strings, to mark them as either alive (1) or dead (0). However, the only way known to me to convert a string to a variable name is via a dict and vars(), and it is essential for the rest of the code that the coordinates stay sorted, as I want to iterate over the ordered items and position the cursor according to each coordinate's name. Something like:
print ('\033['+X_COORD+';'+Y_COORD+'f'+ x1y5)
where x1y5, for example, holds the corresponding value (0 or 1).
Is there a convenient way either to do this via a dict or to convert the strings into variable names?
Or perhaps I could keep one list and one dict, storing the coordinate names in the list and the values in the dict?
Thank you in advance!
kyril
You use a dictionary:
def coords_maker(num_x, num_y):
    num_x += 1
    num_y += 1
    coords = {}
    for i in range(0, num_y, 1):
        for n in range(0, num_x, 1):
            coords['x'+str(n)+'y'+str(i)] = 0
    return coords
You then access a value by its string key, for example
coords['x1y5']
And change it like so:
coords['x1y5'] = 1
Now, of course this converting of coordinates to strings is completely pointless. Simply use a list of lists:
def coords_maker(num_x, num_y):
    num_x += 1
    num_y += 1
    coords = [[0]*num_x for x in range(num_y)]
    return coords
And I don't know why you add 1 to the coordinates either:
def coords_maker(num_x, num_y):
    return [[0]*num_x for x in range(num_y)]
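With a list of lists, the random alive/dead assignment asked about in the question could be done at creation time. This is a minimal sketch; random_grid is a hypothetical name, and the cell that the question calls x1y5 would be grid[5][1] here (row index first, then column).

```python
import random

def random_grid(num_x, num_y):
    """Return num_y rows of num_x cells, each 0 (dead) or 1 (alive)."""
    return [[random.randint(0, 1) for _ in range(num_x)] for _ in range(num_y)]

grid = random_grid(5, 3)

# Rows and columns keep their order, so iterating visits cells in sequence
for row in grid:
    print("".join(str(cell) for cell in row))
```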
