Bulbs: g.scripts.update() : TypeError: sequence item 2: expected string or Unicode, NoneType found - groovy

I'm using TitanGraphDB + Cassandra. I'm starting Titan as follows
cd titan-cassandra-0.3.1
bin/titan.sh config/titan-server-rexster.xml config/titan-server-cassandra.properties
I have a Rexster shell that I can use to communicate to Titan + Cassandra above.
cd rexster-console-2.3.0
bin/rexster-console.sh
I'm attempting to model a network topology using Titan Graph DB. I want to program the Titan Graph DB from my Python program, and I'm using the bulbs package for that.
I create five types of vertices
- switch
- port
- device
- flow
- flow_entry
I create edges between vertices that are connected logically. The edges are not labelled.
Let us say I want to test the connectivity between Vertex A and Vertex B
I have a groovy script is_connected.groovy
def isConnected(portA, portB) {
    return portA.both().retain([portB]).hasNext()
}
Now from my rexster console
g = rexster.getGraph("graph")
==>titangraph[embeddedcassandra:null]
rexster[groovy]> g.V('type', 'flow')
==>v[116]
==>v[100]
rexster[groovy]> g.V('type', 'flow_entry')
==>v[120]
==>v[104]
As you can see above I have two vertices of type flow v[116] and v[100]
I have two vertices of type flow_entry v[120] and v[104]
I want to check for the connectivity between v[120] and v[116], for example:
rexster[groovy]> ?e is_connected.groovy
==>null
rexster[groovy]> is_connected(g.v(116),g.v(120))
==>true
So far so good. Now I want to be able to use this script from my Python program that imports the bulbs package.
My directory structure is as follows.
Projects/ryu
--> ryu/app_simple_switch.py
Projects/ryu_extras
--> rexster-console-2.3.0
--> titan-cassandra-0.3.1
My script is_connected.groovy, which contains the isConnected() function, is kept in Projects/ryu_extras/rexster-console-2.3.0.
Now, from my Python program, which is in Projects/ryu/ryu/app/simple_switch.py, I try to do the following.
self.g.scripts.update('Projects/ryu_extras/rexster-console-2.3.0/is_connected.groovy') # add file to scripts index
script = self.g.scripts.get('isConnected') # get a function by its name
params = dict(portA=flow,portB=fe1) # put function params in dict
items = self.g.gremlin.query(script, params)
self.create_outgoing_edge(flow,fe1)
self.create_outgoing_edge(fe1,sw_vertex)
I get the following error.
hub: uncaught exception: Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/ryu/lib/hub.py", line 48, in _launch
func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/ryu/base/app_manager.py", line 256, in _event_loop
handler(ev)
File "/home/karthik/Projects/ryu/ryu/app/simple_switch.py", line 322, in _packet_in_handler
self.compute_path(src,dst,datapath)
File "/home/karthik/Projects/ryu/ryu/app/simple_switch.py", line 289, in compute_path
self.g.scripts.update('/home/karthik/Projects/ryu_extras/rexster-console-2.3.0/is_connected.groovy') # add file to scripts index
File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 120, in update
methods = self._get_methods(file_path)
File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 160, in _get_methods
return Parser(file_path).get_methods()
File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 255, in __init__
Scanner(handlers).scan(groovy_file)
File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 246, in scan
self.get_item(fin,line)
File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 236, in get_item
content = "\n".join(sections).strip()
TypeError: sequence item 2: expected string or Unicode, NoneType found
As you can see, the error is in the scripts.update() function. I just can't seem to figure out what I am doing wrong. Any help would be appreciated.

You need to save your scripts in a file named gremlin.groovy or specify the script's namespace when you get it from the Bulbs script index.
Like Rexster, Bulbs uses the first part of the Groovy filename as a namespace.
For example, methods defined in gremlin.groovy files are added to the Bulbs gremlin namespace.
All of Bulbs' pre-defined Gremlin scripts are defined in gremlin.groovy files and thus gremlin is the default namespace:
https://github.com/espeed/bulbs/blob/master/bulbs/gremlin.groovy
https://github.com/espeed/bulbs/blob/master/bulbs/rexster/gremlin.groovy
https://github.com/espeed/bulbs/blob/master/bulbs/titan/gremlin.groovy
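To make the filename-to-namespace rule concrete, here is a minimal sketch; it illustrates the rule described above rather than reproducing Bulbs' actual internals. A file named is_connected.groovy would land its methods in the is_connected namespace, not gremlin:

```python
import os

def script_namespace(path):
    """Derive a script namespace from a Groovy filename, Bulbs-style:
    the part of the filename before the first dot."""
    return os.path.basename(path).split(".")[0]

print(script_namespace("/path/to/gremlin.groovy"))       # gremlin
print(script_namespace("/path/to/is_connected.groovy"))  # is_connected
```

This is why a plain g.scripts.get('isConnected') finds nothing when the method was loaded from is_connected.groovy: the lookup defaults to the gremlin namespace.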
You can have multiple/additional gremlin.groovy files in your app. Do this if you want to keep everything under the same namespace or if you want to override a pre-defined method:
>>> g.scripts.update("/path/to/gremlin.groovy") # add scripts to gremlin namespace
See...
https://github.com/espeed/bulbs/blob/f666fa89b3c99bc0a6b4e964aa1bff70b05a2e96/bulbs/groovy.py#L112
You can create app-specific and model-specific namespaces by defining your Gremlin methods in files with names like myapp.groovy or somemodel.groovy:
>>> g.scripts.update("/path/to/myapp.groovy") # add scripts to myapp namespace
>>> g.scripts.update("/path/to/somemodel.groovy") # add scripts to somemodel namespace
And then to get a script under a particular namespace, do:
>>> script = g.scripts.get('myapp:isConnected') # use prefix notation, or...
>>> script = g.scripts.get('isConnected', 'myapp') # specify namespace as arg
See...
https://github.com/espeed/bulbs/blob/f666fa89b3c99bc0a6b4e964aa1bff70b05a2e96/bulbs/groovy.py#L77
To generate concatenated server-side script files for each namespace, use the g.make_script_files() method:
>>> g.make_script_files() # write files to the current dir, or...
>>> g.make_script_files("/path/to/scripts/dir") # write files to specified dir
The make_script_files() method will create a separate .groovy file for each namespace. If you override a method in a namespace, only the latest definition will be included in the generated file.
See...
https://github.com/espeed/bulbs/blob/f666fa89b3c99bc0a6b4e964aa1bff70b05a2e96/bulbs/rexster/graph.py#L62
For more details, see...
Using Rexster Server-Side Scripts with Bulbs
https://groups.google.com/d/topic/gremlin-users/Up3JQUwrq-A/discussion

There might be a "bulbs way" to do this, but you could try to expose your function globally by putting it on the server with Rexster. Then in the <script-engine><init-scripts> section you can just add your is_connected.groovy. The sample rexster.xml should already have an example of this:
<script-engines>
  <script-engine>
    <name>gremlin-groovy</name>
    <reset-threshold>-1</reset-threshold>
    <init-scripts>config/is_connected.groovy</init-scripts>
    <imports>com.tinkerpop.rexster.client.*</imports>
    <static-imports>java.lang.Math.PI</static-imports>
  </script-engine>
</script-engines>

Related

How to recognize a third-party yaml dump object back without specifying redundant import statements

Imagine you have something similar to the following yaml:
model: !!python/object:Thirdpartyfoo.foo_module.foo_class
  some_attribute: value
In addition, assume you already installed package Thirdpartyfoo using some pip install or something.
Now you want to get things out of the YAML back into a Python object, so you do:
import yaml
with open('foo.yaml') as f:
dict = yaml.load(f, yaml.Loader)
But after you run it, you get an error like:
except ImportError as exc:
raise ConstructorError("while constructing a Python object", mark,
"cannot find module %r (%s)" % (module_name, exc), mark)
if module_name not in sys.modules:
raise ConstructorError("while constructing a Python object", mark,
"module %r is not imported" % module_name, mark)
yaml.constructor.ConstructorError: while constructing a Python object
module 'Thirdpartyfoo.foo_module' is not imported
You end up with a very ugly solution for that:
import Thirdpartyfoo.foo_module.foo_class as dummy_import
with open('foo.yaml') as f:
dict = yaml.load(f, yaml.Loader)
Note that if I don't explicitly use dummy_import somewhere, I will get an "unused import at line ..." warning from the flake8 lint check :)
Any ideas?
Based on the YAML documentation, yaml.load() converts YAML documents to Python objects.
The function yaml.load() converts a YAML document to a Python object. yaml.load() accepts a byte string, a Unicode string, an open binary file object, or an open text file object.
The reason you're getting this error is that you're loading data from a YAML file that references a class the interpreter hasn't loaded. As long as you do not import the library into the Python runtime, it's just a file like any other on your computer and has no specific meaning to Python. So the only way to fix this issue is to import the proper class definition before loading, which is exactly what you did in the last part. Just importing Thirdpartyfoo would do the job as well, because it loads every definition into the Python runtime environment.
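As an aside, if the explicit import is only there for its side effect, you can avoid the flake8 unused-import warning by loading the module through importlib instead of an import statement. A sketch, using the stdlib json module as a stand-in for the hypothetical Thirdpartyfoo.foo_module:

```python
import importlib
import sys

# Load a module purely for its side effect of being registered in
# sys.modules; no local name is bound, so flake8 has nothing to flag.
importlib.import_module("json")  # stand-in for "Thirdpartyfoo.foo_module"

print("json" in sys.modules)  # True
```

After this call, yaml.load() can resolve !!python/object tags that reference the loaded module, exactly as if it had been imported normally.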

Execute stand alone python script inside another script

I have made a .py file (I call it acomp.py) that reads from an SQL database and, after a series of calculations, exports its output to an Excel file. Later I send this .xlsx file by e-mail to various people.
I wish to put it inside another Python script so I can use the schedule module to call acomp.py automatically at selected times, run it and send the output by email:
def exec_script():
    exec(open("acomp.py", encoding='utf-8').read())
    send_email()

schedule.every().day.at("07:20").do(exec_script)
When the second script calls the first, it messes with acomp.py's internal functions, returning the error:
" File "<string>", line 125, in <lambda>
NameError: name 'modalidade_contrato' is not defined"
This 'modalidade_contrato' is defined inside acomp.py, and it runs perfectly when I execute acomp.py directly.
Any idea how I should proceed? I think my whole strategy is unusual, but I have to do it this way because I do not have admin privileges on my computer.
I learned that you can simply run it as you would in cmd:
os.system(r'"C:\Users\user.name\Miniconda3\python.exe acomp.py"')
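A slightly more defensive variant of the same idea is a sketch like the following: subprocess.run gives you the exit status and captured output, which os.system discards, so a failure inside acomp.py doesn't pass silently:

```python
import subprocess
import sys

def run_script(path):
    """Run a standalone script in a fresh interpreter and return its stdout.

    check=True raises CalledProcessError on a non-zero exit status,
    so the scheduler loop notices when the script fails.
    """
    result = subprocess.run([sys.executable, path],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

exec_script() would then call run_script("acomp.py") followed by send_email(), and acomp.py keeps its own clean namespace.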

Finding the source code of methods implemented in C?

Please note I am asking this question for informational purposes only.
I know the title sounds like a duplicate of Finding the source code for built-in Python functions?. But let me explain.
Say, for example, I want to find the source code of the most_common method of the collections.Counter class. Since the Counter class is implemented in Python, I can use the inspect module to get its source code.
ie,
>>> import inspect
>>> import collections
>>> print(inspect.getsource(collections.Counter.most_common))
This will print
def most_common(self, n=None):
    '''List the n most common elements and their counts from the most
    common to the least. If n is None, then list all element counts.
    >>> Counter('abcdeabcdabcaba').most_common(3)
    [('a', 5), ('b', 4), ('c', 3)]
    '''
    # Emulate Bag.sortedByCount from Smalltalk
    if n is None:
        return sorted(self.items(), key=_itemgetter(1), reverse=True)
    return _heapq.nlargest(n, self.items(), key=_itemgetter(1))
However, if the method or class is implemented in C, inspect.getsource will raise a TypeError instead.
>>> my_list = []
>>> print(inspect.getsource(my_list.append))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\abdul.niyas\AppData\Local\Programs\Python\Python36-32\lib\inspect.py", line 968, in getsource
lines, lnum = getsourcelines(object)
File "C:\Users\abdul.niyas\AppData\Local\Programs\Python\Python36-32\lib\inspect.py", line 955, in getsourcelines
lines, lnum = findsource(object)
File "C:\Users\abdul.niyas\AppData\Local\Programs\Python\Python36-32\lib\inspect.py", line 768, in findsource
file = getsourcefile(object)
File "C:\Users\abdul.niyas\AppData\Local\Programs\Python\Python36-32\lib\inspect.py", line 684, in getsourcefile
filename = getfile(object)
File "C:\Users\abdul.niyas\AppData\Local\Programs\Python\Python36-32\lib\inspect.py", line 666, in getfile
'function, traceback, frame, or code object'.format(object))
TypeError: <built-in method append of list object at 0x00D3A378> is not a module, class, method, function, traceback, frame, or code object.
So my question is: is there any way (perhaps using a third-party package?) to find the source code of a class or method implemented in C as well?
i.e., something like this:
>> print(some_how_or_some_custom_package([].append))
int
PyList_Append(PyObject *op, PyObject *newitem)
{
    if (PyList_Check(op) && (newitem != NULL))
        return app1((PyListObject *)op, newitem);
    PyErr_BadInternalCall();
    return -1;
}
No, there is not. There is no metadata accessible from Python that will let you find the original source file. Such metadata would have to be created explicitly by the Python developers, without a clear benefit as to what that would achieve.
First and foremost, the vast majority of Python installations do not include the C source code. Next, while you could conceivably expect users of the Python language to be able to read Python source code, Python's userbase is very broad and a large number either do not know C or are not interested in how the C code works. Finally, even developers who know C can't be expected to read the Python C API documentation, something that quickly becomes a requirement if you want to understand the CPython codebase.
C files do not directly map to a specific output file, unlike Python bytecode cache files and scripts. Unless you create a debug build with a symbol table, the compiler doesn't retain the source filename in the generated object file (.o) it outputs, nor will the linker record what .o files went into the result it produces. Nor do all C files end up contributing to the same executable or dynamic shared object file; some become part of the Python binary, others become loadable extensions, and the mix is configurable and dependent on what external libraries are available at the time of compilation.
And between makefiles, setup.py and C pre-processor macros, the combination of input files and the lines of source code actually used to create each output file also varies. Last but not least, because the C source files are no longer consulted at runtime, they can't be expected to still be available in the same original location, so even if there were some metadata stored, you still couldn't map it back to the original.
So it's just easier to remember a few base rules about how the Python C-API works, then map those back to the C code with a few informed code searches.
Alternatively, download the Python source code and create a debug build, and use a good IDE to help you map symbols and such back to source files. Different compilers, platforms and IDEs have different methods of supporting symbol tables for debugging.
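You can at least detect, before calling getsource, whether an object carries Python-level source at all. A small sketch:

```python
import collections
import inspect

def safe_getsource(obj):
    """Return the Python source for obj, or None if it lives in C."""
    try:
        return inspect.getsource(obj)
    except TypeError:
        # Built-ins implemented in C carry no source metadata.
        return None

print(safe_getsource([].append))  # None
print(safe_getsource(collections.Counter.most_common) is not None)  # True
```

A None result is your cue to fall back to searching the CPython sources by hand.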
There could be a way if you had the full debug information (which is usually stripped).
Then you would take the .so or .pyd and use platform-specific tools to extract the debug information (stored in the .so itself, or in the .pdb on Windows) for the required function. You may want to have a look at DWARF information for Linux (on Windows, the format is not publicly documented AFAIK).

Broadcast python objects using mpi4py

I have a python object
<GlobalParams.GlobalParams object at 0x7f8efe809080>
which contains various numpy arrays, parameter values etc. which I am using in various functions calling as for example:
myParams = GlobalParams(input_script) #reads in various parameters from an input script and assigns these to myParams
myParams.data #calls the data array from myParams
I am trying to parallelise my code and would like to broadcast the myParams object so that it is available to the other child processes. I have done this previously for individual numpy arrays, values etc. in the form:
points = comm.bcast(points, root = 0)
However, I don't want to have to do this individually for all the contents of myParams. I would like to broadcast the object in its entirety so that it can be accessed on other cores. I have tried the obvious:
myParams = comm.bcast(myParams, root=0)
but this returns the error:
myParams = comm.bcast(myParams, root=0)
File "MPI/Comm.pyx", line 1276, in mpi4py.MPI.Comm.bcast (src/mpi4py.MPI.c:108819)
File "MPI/msgpickle.pxi", line 612, in mpi4py.MPI.PyMPI_bcast (src/mpi4py.MPI.c:47005)
File "MPI/msgpickle.pxi", line 112, in mpi4py.MPI.Pickle.dump (src/mpi4py.MPI.c:40704)
TypeError: cannot serialize '_io.TextIOWrapper' object
What is the appropriate way to share this object with the other cores? Presumably this is a common requirement in python, but I can't find any documentation on this. Most examples look at broadcasting a single variable/array.
This doesn't look like an MPI problem; it looks like a problem with object serialisation for broadcast, which internally is using the Pickle module.
Specifically in this case, it can't serialise a _io.TextIOWrapper - so I suggest hunting down where in your class this is used.
Once you work out which field(s) can't be serialised, you can remove them, broadcast, then reassemble them on each individual rank, using some method that you need to design yourself (recreateUnpicklableThing() in the example below). You could do that by adding these methods to your class for Pickle to call before and after broadcast:
def __getstate__(self):
    members = self.__dict__.copy()
    # Remove things that can't be pickled, by name.
    del members['someUnpicklableThing']
    return members

def __setstate__(self, members):
    self.__dict__.update(members)
    # On unpickle, manually recreate the things that you couldn't pickle
    # (this method recreates self.someUnpicklableThing using some metadata,
    # carefully chosen by you, that Pickle can serialise).
    self.recreateUnpicklableThing(self.dataForSettingUpSomething)
See here for more on how these methods work
https://docs.python.org/2/library/pickle.html
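Because comm.bcast pickles the object under the hood, the pattern can be exercised without MPI at all. A self-contained sketch; the Params class and its StringIO member are hypothetical stand-ins for your GlobalParams and whatever _io.TextIOWrapper it holds:

```python
import io
import pickle

class Params:
    """Stand-in for GlobalParams: picklable data plus an unpicklable stream."""
    def __init__(self, text):
        self.text = text
        self.stream = io.StringIO(text)  # io objects cannot be pickled

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["stream"]              # drop the unpicklable member
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Recreate the stream from data that did survive pickling.
        self.stream = io.StringIO(self.text)

clone = pickle.loads(pickle.dumps(Params("hello")))
print(clone.text)           # hello
print(clone.stream.read())  # hello
```

With these two methods in place, comm.bcast(myParams, root=0) serialises everything except the stream and rebuilds it on each receiving rank.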

Calling a gremlin script from python program that uses Bulbs

I am using TitanGraphDB + Cassandra. I am starting Titan as follows:
cd titan-cassandra-0.3.1
bin/titan.sh config/titan-server-rexster.xml config/titan-server-cassandra.properties
I have a Rexster shell that I can use to communicate to Titan+Cassandra above.
cd rexster-console-2.3.0
bin/rexster-console.sh
I am attempting to model a network topology using Titan Graph DB. I want to program the Titan Graph DB from my Python program. I am using the bulbs package for that.
I create three types of vertices
- switch
- port
- device
I create labelled edges between ports that are connected physically. The label that I use is "link".
Let us say I have two port vertices portA and portB.
I want to check if portA is connected to portB from my python program using bulbs package.
As a first step, I write a script (saved in a file is_connected.sh):
def is_connected(portA, portB):
    return portA.both("link").retain([portB]).hasNext()
If I try to execute the above script from my rexster-console as follows, I get the following result:
sudo ./start_rexter.sh
(l_(l
(_______( 0 0
( (-Y-) <woof>
l l-----l l
l l,, l l,,
opening session [127.0.0.1:8184]
?h for help
rexster[groovy]> ?e
specify the file to execute
rexster[groovy]> is_connected.sh
==>An error occurred while processing the script for language [groovy]. All transactions across all graphs in the session have been concluded with failure: java.util.concurrent.ExecutionException: javax.script.ScriptException: javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: is_connected for class: Script2
This is my very first attempt at writing a stored procedure (a.k.a. a gremlin script). I don't know if this is the right way to approach it. Also, my final aim is to be able to call this script from my Python program that uses bulbs. If someone could point me in the right direction, that would be great!
The ?e command requires that you specify the file to execute on the same line. I created sum.groovy:
def sum(x,y) { x+y }
then from the console:
rexster[groovy]> ?e sum.groovy
==>null
rexster[groovy]> sum(1,2)
==>3
Strange that specifying ?e without the file doesn't do a proper line feed. I'll try to go fix that.
