How to extract the origen_testers test names in hierarchical order based on the AST? - origen-sdk

We use an external binning rules engine instead of the native binning API in origen_testers, and we need to pass it a list of the test names in the proper order. The issue arises when we use on_fail or on_pass flow arguments that contain procs. These child tests can contain several layers of similar proc calls. The child tests are processed before the parent tests, so the binning order gets reversed. Is there a way to extract a list of test names that preserves the parent-to-child relationship? I know flow.atp.raw shows the hierarchy.
How can I select out the TestSuite names in the proper order?
thx
EDIT
Here is an example of the ATP:
s(:test,
  s(:object, <TestSuite: jtag_ccra_all_vmin>),
  s(:name, "jtag_ccra_all_vmin"),
  s(:number, 0),
  s(:id, "jtag_ccra_all_vmin"),
  s(:on_fail,
    s(:if_flag, "$Alarm",
      s(:render, "multi_bin;")),
    s(:test,
      s(:object, <TestSuite: jtag_ccra_top_vmin>),
      s(:name, "jtag_ccra_top_vmin"),
      s(:number, 0),
      s(:id, "jtag_ccra_top_vmin"),
      s(:on_fail,
        s(:if_flag, "$Alarm",
          s(:render, "multi_bin;")))),
    s(:test,
      s(:object, <TestSuite: jtag_ccra_gasket_vmin>),
      s(:name, "jtag_ccra_gasket_vmin"),
      s(:number, 0),
      s(:id, "jtag_ccra_gasket_vmin"),
      s(:on_fail,
I am wondering if there is a way to extract the names in order:
['jtag_ccra_all_vmin', 'jtag_ccra_top_vmin', 'jtag_ccra_gasket_vmin']
flow.atp.raw.children returns them in order as an Array, but I would like to query out just the name attribute rather than the ATP node objects.
thx

The way to extract information from the AST is to make a processor, something like this: https://github.com/Origen-SDK/origen_testers/blob/master/lib/origen_testers/atp/processors/extract_set_flags.rb
In your processor you would make a method called on_test and then from there you can extract the name.
This is untested, but something like this:
class ExtractTestNames < OrigenTesters::ATP::Processor
  def run(node)
    @results = []
    process(node)
    @results
  end

  # All 'test' nodes in the AST will be handed to this method; all nodes which don't have
  # a specific handler defined will pick up a default handler which simply processes
  # (looks for a handler for) all of the node's children
  def on_test(node)
    @results << node.find(:name).value
    # Keep processing the children of this node, so that any tests embedded
    # in on_fail nodes, etc. are picked up
    process_all(node.children)
  end
end
To call it:
ExtractTestNames.new.run(flow.atp.raw) # => ['jtag_ccra_all_vmin', 'jtag_ccra_top_vmin', ...]

Related

How to inspect mapped tasks' inputs from reduce tasks in Prefect

I'm exploring Prefect's map-reduce capability as a powerful idiom for writing massively-parallel, robust importers of external data.
As an example - very similar to the X-Files tutorial - consider this snippet:
@task
def retrieve_episode_ids():
    api_connection = APIConnection(prefect.context.my_config)
    return api_connection.get_episode_ids()

@task(max_retries=2, retry_delay=datetime.timedelta(seconds=3))
def download_episode(episode_id):
    api_connection = APIConnection(prefect.context.my_config)
    return api_connection.get_episode(episode_id)

@task(trigger=all_finished)
def persist_episodes(episodes):
    db_connection = DBConnection(prefect.context.my_config)
    ...store all episodes by their ID with a success/failure flag...

with Flow("import_episodes") as flow:
    episode_ids = retrieve_episode_ids()
    episodes = download_episode.map(episode_ids)
    persist_episodes(episodes)
The peculiarity of my flow, compared with the simple X-Files tutorial, is that I would like to persist results for all the episodes that I have requested, even for the failed ones. Imagine that I'll be writing episodes to a database table as the episode ID decorated with an is_success flag. Moreover, I'd like to write all episodes with a single task instance, in order to be able to perform a bulk insert - as opposed to inserting each episode one by one - hence my persist_episodes task being a reduce task.
The trouble I'm having is in being able to gather the episode ID for the failed downloads from that reduce task, so that I can store the failed information in the table under the appropriate episode ID. I could of course rewrite the download_episode task with a try/catch and always return an episode ID even in the case of failure, but then I'd lose the automatic retry/failure functionality which is a good deal of the appeal of Prefect.
Is there a way for a reduce task to infer the argument(s) of a failed mapped task? Or, could I write this differently to achieve what I need, while still keeping the same level of clarity as in my example?
Mapping over a list preserves the order. This is a property you can use to link inputs with the errors. Check the code I have below; I will add more explanation after.
from prefect import Flow, task
import prefect

@task
def retrieve_episode_ids():
    return [1, 2, 3, 4, 5]

@task
def download_episode(episode_id):
    if episode_id == 5:
        return ValueError()
    return episode_id

@task()
def persist_episodes(episode_ids, episodes):
    # Note the last element here will be the ValueError
    prefect.context.logger.info(episodes)
    # We change that ValueError into a "fail" message
    episodes = ["fail" if isinstance(x, BaseException) else x for x in episodes]
    # Note the last element here will be the "fail"
    prefect.context.logger.info(episodes)
    result = {}
    for i, episode_id in enumerate(episode_ids):
        result[episode_id] = episodes[i]
    # Check final results
    prefect.context.logger.info(result)
    return

with Flow("import_episodes") as flow:
    episode_ids = retrieve_episode_ids()
    episodes = download_episode.map(episode_ids)
    persist_episodes(episode_ids, episodes)

flow.run()
The handling largely happens in persist_episodes. Just pass in the list of inputs again, and then we can match the inputs with the failed tasks. I added some handling around identifying errors and replacing them with what you want. Does that answer the question?
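For reference, the matching logic from the sketch above can be checked outside of Prefect with plain Python (the values are the example episode IDs used above; this is not Prefect output):
episode_ids = [1, 2, 3, 4, 5]
downloaded = [1, 2, 3, 4, ValueError()]          # what download_episode would hand back
downloaded = ["fail" if isinstance(x, BaseException) else x for x in downloaded]
print(dict(zip(episode_ids, downloaded)))        # {1: 1, 2: 2, 3: 3, 4: 4, 5: 'fail'}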
Always happy to chat more. You can reach out in the Prefect Slack or Discourse as well.

Unable to fetch only values from a List in Groovy with Jmeter script

In Groovy, I am getting the output below in a List. I am using the JMeter JSR223 PostProcessor for the script. My List prints the data below as the result.
def a = [{Zip=36448, CountryID=2}]
I want to fetch only the values (36448 and 2) from this List, not the keys. How can I do that?
For a simple single-instance fetch, do this:
def zip = a.first().Zip
def countryId = a.first().CountryID
Seems pretty straightforward if those are the only known values that you want.
If you want all Zips and CountryIDs then you can do this:
def zips = a*.Zip
def countryIds = a*.CountryID
That will return two Lists, one with all the Zips and one with all the CountryIDs, using the spread operator.
I don't know what data structure is inside your list; your code is not valid Groovy code.
For Map it would be something like:
a[0].collect {it -> it.value}
More information on Groovy scripting in JMeter: Apache Groovy - Why and How You Should Use It

Creating custom component in SpaCy

I am trying to create a spaCy pipeline component to return Spans of meaningful text (my corpus comprises PDF documents that have a lot of garbage that I am not interested in - tables, headers, etc.)
More specifically I am trying to create a function that:
takes a doc object as an argument
iterates over the doc tokens
When certain rules are met, yield a Span object
Note I would also be happy with returning a list([span_obj1, span_obj2])
What is the best way to do something like this? I am a bit confused on the difference between a pipeline component and an extension attribute.
So far I have tried:
from spacy.lang.en import English
from spacy.tokens import Doc

nlp = English()
Doc.set_extension('chunks', method=iQ_chunker)
####
raw_text = get_test_doc()
doc = nlp(raw_text)
print(type(doc._.chunks))
>>> <class 'functools.partial'>
iQ_chunker is a method that does what I explain above and it returns a list of Span objects
This is not the result I expect, as the function I pass in as method returns a list.
I imagine you're getting a functools partial back because you are accessing chunks as an attribute, despite having passed it in as an argument for method. If you want spaCy to intervene and call the method for you when you access something as an attribute, it needs to be
Doc.set_extension('chunks', getter=iQ_chunker)
Please see the Doc documentation for more details.
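For completeness, here is a minimal, untested sketch (using a stand-in iQ_chunker and a blank pipeline, neither taken from your code) of what access looks like if you keep method=; the extension then has to be called, which is why reading it as a plain attribute gave you a functools.partial:
import spacy
from spacy.tokens import Doc

def iQ_chunker(doc):                 # stand-in for your chunker; returns a list of Spans
    return [doc[0:2], doc[2:len(doc)]]

Doc.set_extension("chunks", method=iQ_chunker)

nlp = spacy.blank("en")              # blank English pipeline, just for illustration
doc = nlp("I love spaCy")
print(doc._.chunks)                  # a functools.partial bound to this doc
print(doc._.chunks())                # [I love, spaCy] -- calling it runs iQ_chunker(doc)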
However, if you are planning to compute this attribute for every single document, I think you should make it part of your pipeline instead. Here is some simple sample code that does it both ways.
import spacy
from spacy.tokens import Doc

def chunk_getter(doc):
    # the getter is called when we access _.extension_1,
    # so the computation is done at access time
    # also, because this is a getter,
    # we need to return the actual result of the computation
    first_half = doc[0:len(doc)//2]
    second_half = doc[len(doc)//2:len(doc)]
    return [first_half, second_half]

def write_chunks(doc):
    # this pipeline component is called as part of the spaCy pipeline,
    # so the computation is done at parse time
    # because this is a pipeline component,
    # we need to set our attribute value on the doc (which must be registered)
    # and then return the doc itself
    first_half = doc[0:len(doc)//2]
    second_half = doc[len(doc)//2:len(doc)]
    doc._.extension_2 = [first_half, second_half]
    return doc

nlp = spacy.load("en_core_web_sm", disable=["tagger", "parser", "ner"])
Doc.set_extension("extension_1", getter=chunk_getter)
Doc.set_extension("extension_2", default=[])
nlp.add_pipe(write_chunks)

test_doc = nlp('I love spaCy')
print(test_doc._.extension_1)
print(test_doc._.extension_2)
This just prints [I, love spaCy] twice because it's two methods of doing the same thing, but I think making it part of your pipeline with nlp.add_pipe is the better way to do it if you expect to need this output on every document you parse.
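One caveat, in case you are on spaCy 3.x: nlp.add_pipe no longer accepts a bare function there, so the pipeline variant above would need the component registered first. A minimal sketch of that v3 registration (same write_chunks body as above):
from spacy.language import Language

@Language.component("write_chunks")
def write_chunks(doc):
    first_half = doc[0:len(doc)//2]
    second_half = doc[len(doc)//2:len(doc)]
    doc._.extension_2 = [first_half, second_half]
    return doc

nlp.add_pipe("write_chunks")         # in v3 the component is added by its registered name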

How to modify the signature of a function dynamically

I am writing a framework in Python. When a user declares a function, they do:
def foo(row, fetch=stuff, query=otherStuff)
def bar(row, query=stuff)
def bar2(row)
When the backend sees query=value, it executes the function with a query argument whose content depends on value. This way the function has access, in its own scope, to the result of something done by the backend.
Currently I build the arguments each time by checking whether query, fetch and the other items are None, and launch the function with a set of args that exactly matches what the user asked for; otherwise I get a "got an unexpected keyword argument" error. This is the code in the backend:
# fetch and query are something computed by the backend
if fetch == None and query == None:
    userfunction(row)
elif fetch == None:
    userfunction(row, query=query)
elif query == None:
    userfunction(row, fetch=fetch)
else:
    userfunction(row, fetch=fetch, query=query)
This is not good; for each additional "service" the backend offers, I need to write all the combinations with the previous ones.
Instead of that I would like to primarily take the function and manually add a named parameter, before executing it, removing all the unnecessary code that does these checks. Then the user would just use the stuff it really wanted.
I don't want the user to have to modify the function by adding stuff it doesn't want (nor do I want them to specify a kwarg every time).
So, if this is doable, I would like an example of a function addNamedVar(name, function) that adds the variable name to the function function.
I want to do it that way because the user's functions are called many times; the alternative would be to, for example, build a dict of the function's named parameters (with inspect) on each call and then use **dict (a sketch of that approach is shown below for reference). I would really like to just modify the function once, to avoid any kind of per-call overhead.
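For reference, here is a minimal sketch of the inspect-and-filter approach mentioned above (call_user_function is a hypothetical helper, not part of the framework; in practice the signature lookup would be cached per function to keep the per-call overhead down):
import inspect

def bar(row, query=None):            # one of the user-declared functions from above
    return (row, query)

def call_user_function(func, row, **services):
    # forward only the services that the user function actually declares
    accepted = inspect.signature(func).parameters
    kwargs = {k: v for k, v in services.items() if k in accepted and v is not None}
    return func(row, **kwargs)

print(call_user_function(bar, "row-1", fetch=None, query="some-query"))
# ('row-1', 'some-query')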
This is indeed doable with the AST, and that's what I am going to do because that solution suits my use case better. However, you could do what I asked more simply with a function-cloning approach like the snippet I show below. Note that this code returns the same function with different default values. You can use this code as an example to do whatever you want.
This works for Python 3 (the CodeType call below uses the pre-3.8 constructor; Python 3.8 added a co_posonlyargcount argument):
import inspect
import types

def copyTransform(f, name, **args):
    signature = inspect.signature(f)
    params = list(signature.parameters)
    numberOfParam = len(params)
    numberOfDefault = len(f.__defaults__)
    firstDefault = numberOfParam - numberOfDefault   # index of the first parameter that has a default
    listTuple = list(f.__defaults__)
    for key, val in args.items():
        # .index raises ValueError if the name is not a defaulted parameter of f
        toChangeIndex = params.index(key, firstDefault)
        listTuple[toChangeIndex - firstDefault] = val
    newTuple = tuple(listTuple)
    oldCode = f.__code__
    newCode = types.CodeType(
        oldCode.co_argcount,           # integer
        oldCode.co_kwonlyargcount,     # integer
        oldCode.co_nlocals,            # integer
        oldCode.co_stacksize,          # integer
        oldCode.co_flags,              # integer
        oldCode.co_code,               # bytes
        oldCode.co_consts,             # tuple
        oldCode.co_names,              # tuple
        oldCode.co_varnames,           # tuple
        oldCode.co_filename,           # string
        name,                          # string
        oldCode.co_firstlineno,        # integer
        oldCode.co_lnotab,             # bytes
        oldCode.co_freevars,           # tuple
        oldCode.co_cellvars            # tuple
    )
    newFunction = types.FunctionType(newCode, f.__globals__, name, newTuple, f.__closure__)
    newFunction.__qualname__ = name    # also needed for serialization
    return newFunction
You need to do that weird stuff with the names if you want to pickle your cloned function.
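For illustration, a usage sketch (assuming a Python version covered by the CodeType call above, i.e. pre-3.8; bar and the default values here are made up):
def bar(row, query="original-default"):
    return (row, query)

bar_clone = copyTransform(bar, "bar_clone", query="backend-result")
print(bar_clone("row-1"))            # ('row-1', 'backend-result') -- the clone picks up the new default
print(bar("row-1"))                  # ('row-1', 'original-default') -- the original is untouched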

class inherit from networkx.Graph fails when using networkx.union()

All,
I am trying to inherit from networkx.Graph with my own class, adding two nodes and an edge when the graph is created. But it fails with
networkx.exception.NetworkXError: ('The node sets of G and H are not disjoint.', 'Use appropriate rename=(Gprefix,Hprefix)or use disjoint_union(G,H).')
when I try to union my graphs. Here is my code. Am I missing anything?
#!/usr/bin/python3
import networkx as nx

class die(nx.Graph):
    nLatency = 2

    def __init__(self):
        super().__init__()
        self.addNet()

    def addNet(self):
        self.add_node('N0')
        self.add_node('N1')
        self.add_edge('N0', 'N1', name='nLink', latency=self.nLatency)

S0D0 = die()
S1D0 = die()
Top = nx.union(S0D0, S1D0, rename=('S0D0', 'S1D0'))
So what is happening here is that networkx tries to create two temporary graphs whose nodes are 'S0D0-N0', 'S0D0-N1' for one and 'S1D0-N0', 'S1D0-N1' for the other. Then it tries to join them.
However, as you dig through the code, the two new graphs created along the way have the same class as the originals. So let's call the new graphs H1 and H2. Because H1 and H2 both also have class die, they are initialized with the nodes 'N0' and 'N1', and then 'S0D0-N0', 'S0D0-N1' or 'S1D0-N0', 'S1D0-N1' are added. So both contain 'N0' and 'N1'.
So then at the next stage in the union process it tests whether or not H1 and H2 have any common nodes, and they do. So you get the error.
So that's the cause of the error. How to fix it probably depends on why you are initializing the graphs with these nodes, and what class you want Top to have.
If Top has class die, it's going to have to have 'N0' and 'N1' (because of the initialization), which I suspect you don't actually want. If you just want Top to be a Graph, you can first turn S0D0 and S1D0 into Graphs:
Top = nx.union(nx.Graph(S0D0), nx.Graph(S1D0), rename=('S0D0', 'S1D0'))
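An alternative sketch, not from the answer above: if the seed nodes do not have to be added inside __init__ itself, moving them into a factory method keeps the die instances that networkx creates internally (during the relabel/union steps) empty, so the union goes through while Top can stay a die. The new() name is just an illustration:
import networkx as nx

class die(nx.Graph):
    nLatency = 2

    @classmethod
    def new(cls):
        # build the seed topology here rather than in __init__,
        # so a bare die() is an empty graph
        g = cls()
        g.add_node('N0')
        g.add_node('N1')
        g.add_edge('N0', 'N1', name='nLink', latency=cls.nLatency)
        return g

S0D0 = die.new()
S1D0 = die.new()
Top = nx.union(S0D0, S1D0, rename=('S0D0', 'S1D0'))
print(sorted(Top.nodes()))           # the renamed copies of N0/N1 from both dies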
