How to tackle the following error in NetworkX? - python-3.x

I need to add nodes and edges from two different lists that I created. The nodes are processed, but the edges aren't, hence the error.
Here's a small snippet of my code:
# list3 is the list of the whole dataset
listOfRelations = []
listOfNodes = []
for index in range(0, len(list3)):
    # if list3[index].isalpha():
    if list3[index].isdigit():
        listOfNodes.append(list3[index])
    else:
        listOfRelations.append(list3[index])
# testing purposes
print(listOfNodes[3])
print(listOfRelations[2])
G.add_nodes_from((listOfNodes))
G.add_edges_from((TupleOfEdges))
I also tried to convert the list into a tuple, but that didn't work either :(
Error: NetworkXError: Edge tuple _hypernym must be a 2-tuple or 3-tuple.
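The message names `_hypernym`, which suggests that add_edges_from received plain relation strings rather than node pairs. Below is a minimal sketch of building edge tuples in a shape NetworkX accepts. It assumes, and this is not stated in the question, that list3 is a flat sequence of (source id, relation, target id) triples. The sample values and the names edges and nodes are hypothetical.

```python
# Assumption (not from the question): list3 is a flat sequence of
# (source id, relation, target id) triples. The values are made up.
list3 = ["00260881", "_hypernym", "00260622",
         "01332730", "_derivationally_related_form", "03122748"]

edges = []
for i in range(0, len(list3) - 2, 3):
    source, relation, target = list3[i], list3[i + 1], list3[i + 2]
    # Each edge is a 3-tuple: two endpoints plus an attribute dict.
    edges.append((source, target, {"relation": relation}))

# Collect every node id that appears as an endpoint.
nodes = sorted({n for u, v, _ in edges for n in (u, v)})

print(edges)
print(nodes)
```

With the data in this shape, G.add_nodes_from(nodes) and G.add_edges_from(edges) should both be accepted, since add_edges_from takes 2-tuples (u, v) or 3-tuples (u, v, attribute_dict).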

Related

Python filter string

With the following command I can print the balance of my assets from my Binance account.
Command:
USDT_BAL = client.futures_account_balance(asset='USDT')
Return:
[{'accountAlias': 'sRuXXqTioCfWFz', 'asset': 'BNB', 'balance': '0.00000142', 'withdrawAvailable': '0.00000142', 'updateTime': 1621516315044}, {'accountAlias': 'sRuXXqTioCfWFz', 'asset': 'USDT', 'balance': '0.00000000', 'withdrawAvailable': '0.00000000', 'updateTime': 0}, {'accountAlias': 'sRuXXqTioCfWFz', 'asset': 'BUSD', 'balance': '0.00000000', 'withdrawAvailable': '0.00000000', 'updateTime': 0}]
It returns the balances of other assets, but I only need the balance of the USDT asset. How can I filter the USDT_BAL variable for it?
Expanding on my comment:
You have a list of dicts. List access is done by iteration (for loops) or by index: my_list[0], etc.
Dict access can also be done by iteration, but a big benefit is keyed access: my_dict['some_key'], etc.
Python has simplified ways to do common list and dict building, commonly called "comprehensions".
So a list comprehension for something like:
my_list = []
for i in range(10):
    my_list.append(i)
Could be written as
my_list = [i for i in range(10)]
What I gave you isn't strictly a list comprehension but follows the same idea. It's called a "generator expression". The difference is that it generates its output as you iterate over it, but its output as a whole isn't stored in a built-in collection (list or dict).
The reason it makes sense in this context is:
I need to iterate over the list to find the dict with the correct 'asset' key.
I expect there is only one occurrence of this so I care only about the first occurrence.
So to break it down you have a generator expression:
(i['balance'] for i in USDT_BAL if i['asset'] == 'USDT')
Which is roughly equivalent to.
def my_gen():
    for i in USDT_BAL:
        if i['asset'] == 'USDT':
            yield i['balance']
Or if you're not familiar with generators and would like it as a list:
my_list = []
for i in USDT_BAL:
    if i['asset'] == 'USDT':
        my_list.append(i['balance'])
So now you can see we have a problem.
If we have it as a list comprehension it's in the form of a list with one element.
print(my_list) # ['0.00000000']
We could access it with my_list[0], but that looks ugly IMO; to each their own.
So that's where the next function comes in.
According to the docs, next calls the __next__ method on an iterator (which a generator is) and basically advances the generator.
So if our generator were to produce 1 then 2 then 3, calling next on it would produce 1, then calling it again would produce 2, and so on.
Since I expect this generator expression to only produce 1 item, I only call it once. Giving it a default of None means, if it's empty, rather than raising an error it will produce None.
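To make the advancing behaviour concrete, here is a small sketch. The generator function count_to_three is a made-up name for illustration, not part of the original answer.

```python
def count_to_three():
    # A generator function: each yield hands back one value.
    yield 1
    yield 2
    yield 3

gen = count_to_three()        # calling the function creates the generator
first = next(gen)             # advances to the first yield
second = next(gen)            # advances to the second yield
third = next(gen)             # advances to the third yield
exhausted = next(gen, None)   # nothing left, so the default is returned

print(first, second, third, exhausted)
```

Without the default, the last call would raise StopIteration instead of returning None.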
So:
next((i['balance'] for i in USDT_BAL if i['asset'] == 'USDT'), None)
creates a generator that iterates over your list, only produces the 'balance' key of dicts whose 'asset' key equals 'USDT', and calls next on that generator with a default of None.
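A runnable sketch of that expression, using the data from the question trimmed to the two keys that matter here. The 'ETH' lookup and the names usdt and missing are added only to show the default in action.

```python
# Data from the question, trimmed to the 'asset' and 'balance' keys.
USDT_BAL = [
    {"asset": "BNB", "balance": "0.00000142"},
    {"asset": "USDT", "balance": "0.00000000"},
    {"asset": "BUSD", "balance": "0.00000000"},
]

# First matching balance, or None if no dict has that asset.
usdt = next((i["balance"] for i in USDT_BAL if i["asset"] == "USDT"), None)
missing = next((i["balance"] for i in USDT_BAL if i["asset"] == "ETH"), None)

print(usdt)     # the balance comes back as a string
print(missing)  # no matching asset, so the default is used
```

Note that the balance is a string in the API response, so it would need float(...) or decimal.Decimal(...) before any arithmetic.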

Flatten List of Lists Recursively

Can someone illustrate or decompose how this recursive function is executed?
def flatten(S):
    if S == []:
        return S
    if isinstance(S[0], list):
        return flatten(S[0]) + flatten(S[1:])
    return S[:1] + flatten(S[1:])
s=[[1,2],[3,4]]
print("Flattened list is: ",flatten(s))
How could I trace the execution of this algorithm?
OK, so this is a recursive function, as you have stated. It is mostly a 'look at the next element and decide what to do with it' method. It starts with the base case.
    if S == []:
        return S
So this makes sense. You have an empty list, so you would expect to get back an empty list; it's flat.
    if isinstance(S[0], list):
        return flatten(S[0]) + flatten(S[1:])
Next is the first 'look at the next element, decide what to do' step: if I receive a list whose first element is itself a list, I run this same flattening method on that first element.
Then comes the rest of the list. We don't know whether that is flat, so I do the same thing there, calling flatten on it as well.
When these calls return, both results should be flat lists. Adding two lists just joins them into a new list, which is returned up a level to the previous call of the recursive method, or to the user.
return S[:1] + flatten(S[1:])
From before, we know that the first element of the list is not a list, since the if statement was if isinstance(S[0], list). So this just takes a list containing the first element and, just like before, runs flatten on the rest of the list, as we don't know whether the rest is flat or not.
As for tracing, if you don't have PyCharm or pdb is too complex for you, throw in some prints within each of the if statements. Don't be shy, you're the one who's going to read them. Do a print(f"First element was a list: {S[0]}, {S[1:]}"); that will be fine if you're a beginner dealing with such a small amount of code. Otherwise try pdb or similar.
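One way the print-based tracing could look is sketched below. The depth parameter is an addition for illustration only (it indents the output by recursion level) and is not part of the original function.

```python
def flatten(S, depth=0):
    # depth is a hypothetical extra parameter used only to indent the trace.
    pad = "  " * depth
    print(f"{pad}flatten({S})")
    if S == []:
        return S
    if isinstance(S[0], list):
        print(f"{pad}first element is a list: {S[0]}, rest: {S[1:]}")
        return flatten(S[0], depth + 1) + flatten(S[1:], depth + 1)
    print(f"{pad}first element is a value: {S[0]}, rest: {S[1:]}")
    return S[:1] + flatten(S[1:], depth + 1)

result = flatten([[1, 2], [3, 4]])
print("Flattened list is:", result)
```

Each level of indentation in the output corresponds to one level of recursion, so the order of calls and returns can be read top to bottom.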

Mapping a List-Value pair to a key-value pair with PySpark

I am working on a problem where I have to convert around 7 million list-value pairs to key-value pairs using the map() function in PySpark, where the list in a given list-value pair has at most 20 elements.
For example:
listVal= [(["ank","nki","kit"],21),(["arp","rpi","pit"],22)]
Now, I want key-value pairs as
keyval= [("ank",21),("nki",21),("kit",21),("arp",22),("rpi",22),("pit",22)]
When I write
keyval= listVal.map(lambda x: some_function(x))
where some_function() is defined as:
def some_function(x):
    shingles=[]
    for i in range(len(x[0])):
        temp=[]
        temp.append(x[0][i])
        temp.append(x[1])
        shingles.append(tuple(temp))
    return shingles
I don't get the desired output because I think map() returns one key-value pair for an item of the list, not multiple key-value pairs. I have tried other things and searched the web but did not find anything related to it.
Any help would be appreciated.
So, given your constraints, this can be done with PySpark's .flatMap():
def conversion(n):
    return [(x, n[1]) for x in n[0]]
listVal.flatMap(conversion)
or in one line
listVal.flatMap(lambda n: [(x, n[1]) for x in n[0]])
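The difference between the two operations can be sketched in plain Python, without Spark. This is only an analogy: the list comprehension stands in for map(), itertools.chain.from_iterable stands in for flatMap(), and the names mapped and flat_mapped are made up for illustration.

```python
from itertools import chain

listVal = [(["ank", "nki", "kit"], 21), (["arp", "rpi", "pit"], 22)]

def conversion(n):
    # One (shingle, value) pair per shingle in the list part of n.
    return [(x, n[1]) for x in n[0]]

# map-like: one output element (a list of pairs) per input element.
mapped = [conversion(n) for n in listVal]

# flatMap-like: the per-element lists are concatenated into one sequence.
flat_mapped = list(chain.from_iterable(conversion(n) for n in listVal))

print(mapped)
print(flat_mapped)
```

The map-like result keeps two nested lists, while the flatMap-like result is the single flat list of key-value pairs the question asks for.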

numba gives error when reshaping numpy array

I am trying to optimize some code which has some loops and matrix operations. However, I am running into some errors. Please find the code and output below.
Code:
@njit
def list_of_distance(d1): #d1 was declared as List()
    list_of_dis = List()
    for k in range(len(d1)):
        sum_dist = List()
        for j in range(3):
            s = np.sum(square(np.reshape(d1[k][:,:,j].copy(),d1[k][:,:,j].shape[0]*d1[k][:,:,j].shape[1])))
            sum_dist.append(s) # square each value in the resulting list (dimension)
        distance = np.sum(sum_dist) # adding the total value for each dimension to a list
        list_of_dis.append(np.round(np.sqrt(distance))) # Sum the values to get the total squared values of residual images
    return list_of_dis
Output:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<function sum at 0x7f898814bd08>) with argument(s) of type(s): (list(int64))
* parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function(<function sum at 0x7f898814bd08>)
[2] During: typing of call at <ipython-input-18-8c787cc8deda> (7)
File "<ipython-input-18-8c787cc8deda>", line 7:
def list_of_distance(d1):
<source elided>
for j in range(3):
s = np.sum(square(np.reshape(d1[k][:,:,j].copy(),d1[k][:,:,j].shape[0]*d1[k][:,:,j].shape[1])))
^
This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.
To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/latest/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/latest/reference/numpysupported.html
For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile
If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new
Would anyone be able to help me out with this issue?
Thanks & Best Regards
Michael
I had to make a few changes to get this to work and mocked up "d1", but this does work for me with Numba. The main issue that caused the error appears to be that np.sum does not work on a list under Numba, although it did run correctly when I commented out @njit. Wrapping sum_dist with np.array() resolves this issue.
import numpy as np
from numba import njit

d1 = [np.arange(27).reshape(3,3,3), np.arange(27,54).reshape(3,3,3)]
@njit
def list_of_distance(d1): #d1 was declared as List()
    list_of_dis = [] #List() Changed - would not compile
    for k in range(len(d1)):
        sum_dist = [] #List() Changed - would not compile
        for j in range(3):
            s = np.sum(np.square(np.reshape(d1[k][:,:,j].copy(),d1[k][:,:,j].shape[0]*d1[k][:,:,j].shape[1]))) #Added np. to "square"
            sum_dist.append(s) # square each value in the resulting list (dimension)
        distance = np.sum(np.array(sum_dist)) # adding the total value for each dimension to a list - Wrapped list in np.array
        list_of_dis.append(np.round(np.sqrt(distance))) # Sum the values to get the total squared values of residual images
    return list_of_dis
list_of_distance(d1)
Out[11]: [79.0, 212.0]
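The reported values can be cross-checked without Numba or NumPy. Summing the squares over the three slices of each array is the same as summing the squares of all 27 elements, so each result is the rounded square root of that total. A minimal sketch, assuming the same mock data (the integers 0 to 26 and 27 to 53); the names blocks and results are made up for illustration:

```python
import math

# Same values as the mocked-up d1: 0..26 and 27..53.
blocks = [range(0, 27), range(27, 54)]

results = []
for block in blocks:
    total = sum(v * v for v in block)   # sum of squares over all elements
    results.append(float(round(math.sqrt(total))))

print(results)
```

This should agree with the Out[11] line above.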

Merging uneven Corresponding Elements from a method returning value in Dictionaries

How can I sort data stored in a global list after inserting it within a method, so that the entries are grouped by their corresponding elements before they are stacked into another list? Or is it bad practice, and does it complicate things, to store the data in one global list instead of separate lists within each method and sort them afterwards?
Below is the example of the scenario
list = []
dictionary = {}
def MethodA():  # returns title
    # searches for corresponding data using beautifulsoup
    # adds data into dictionary
    # list.append(dictionary)
    # returns list
def MethodB():  # returns description
    # searches for corresponding data using beautifulsoup
    # adds data into dictionary
    # list.append(dictionary)
    # returns list
Example of Wanted output
MethodA():[title] #scrapes (text.title) data from the web
MethodB():[description] #scrapes (text.description) from the web
#print(list)
>>>list=[{title,description},{title,description},{title,description},{title,description}]
Actual output
MethodA():[title] #scrapes (text.title) data from the web
MethodB():[description] #scrapes (text.description) from the web
#print(list)
>>>list =[{title},{title},{description},{description}]
There are a few examples I've seen, such as using NumPy and sorting them in an array:
arraylist = np.array(list)
arraylist[:, 0]
# but I get a 'too many indices for array' error
# because I have too much data loading in, and some of the entries
# have no data and are replaced with `None`, so there's an imbalance of indexes.
I'm trying to keep it as modular as possible. I've tried plain iteration,
but it gets complicated because I have to nest more loops in it.
I've tried NumPy and enumerate, but I can't work out how to go about it. Because the list is unbalanced, meaning that some values are returned as None, I get the error: all the input array dimensions except for the concatenation axis must match exactly
Example : ({'Toy Box','Has a toy inside'},{'Phone', None }, {'Crayons','Used for colouring'})
Update: code sample of MethodA
def MethodA(tableName, rowName, selectedLink):
    try:
        for table_tag in selectedLink.find_all(tableName, {'class': rowName}):
            topic_title = table_tag.find('a', href=True)
            if topic_title:
                def_dict1 = {
                    'Titles': topic_title.text.replace("\n", "")}
                global_list.append(def_dict1)
        return def_dict1
    except:
        def_dict1 = None
Assuming you have something of the form:
x = [{'a'}, {'a1'}, {'b'}, {'b1'}, {'c'}, {None}]
you can do:
dictionary = {list(k)[0]: list(v)[0] for k, v in zip(x[::2], x[1::2])}
or
dictionary = {k.pop(): v.pop() for k, v in zip(x[::2], x[1::2])}
The second method empties the sets in x, because pop() removes the element it returns.
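Both expressions assume that keys and values alternate in x. Below is a minimal sketch of the first one, plus a variant for the layout shown in the question's actual output, where all titles come first and all descriptions after. That variant, its sample values (taken from the question's example), and the names titles_then_descriptions, half, and paired are assumptions for illustration, not part of the answer.

```python
x = [{'a'}, {'a1'}, {'b'}, {'b1'}, {'c'}, {None}]

# Alternating layout: even positions are keys, odd positions are values.
dictionary = {list(k)[0]: list(v)[0] for k, v in zip(x[::2], x[1::2])}
print(dictionary)

# Grouped layout (assumed): first half titles, second half descriptions.
titles_then_descriptions = [
    {'Toy Box'}, {'Phone'}, {'Crayons'},
    {'Has a toy inside'}, {None}, {'Used for colouring'},
]
half = len(titles_then_descriptions) // 2
paired = {
    next(iter(k)): next(iter(v))   # read the single element without removing it
    for k, v in zip(titles_then_descriptions[:half],
                    titles_then_descriptions[half:])
}
print(paired)
```

Either way the pairing relies on both methods appending exactly one entry per item, including a None placeholder when a description is missing; otherwise the positions drift out of step.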
