Does it support executing only selected scenario based on tag from called feature file? [duplicate] - cucumber

We have a feature A with several scenarios. And we need one scenario from that file. Can we call it in our feature B?

No. You need to extract that Scenario into a separate *.feature file and then re-use it using the call keyword.
EDIT: Karate 0.9.0 onwards supports calling a scenario by tag as follows:
* def result = call read('some.feature@tagname')

To illustrate the answer from Karate-inventor Peter Thomas with an example:
Given a feature file some.feature with multiple scenarios marked by tags:

@tagname
Scenario: A, base case that shares results with e.g. B
  # given, when, then ... and share the result under the key `uri`
  * def uri = responseHeaders['Location'][0]

@anotherTag
Scenario: X, base case that shares results with e.g. B
  # given, when, then ... and share the result under the key `createdResourceId`
  * def createdResourceId = $.id

In another feature we can call a specific scenario from that feature by its tag, e.g. tagname:

Scenario: B, another case reusing some results of A
  * def result = call read('some.feature@tagname')
  * print 'given result of A is:', result
  Given path result.uri + '/update'
See also: demo of adding custom tags to scenarios

How to inspect mapped tasks' inputs from reduce tasks in Prefect

I'm exploring Prefect's map-reduce capability as a powerful idiom for writing massively-parallel, robust importers of external data.
As an example - very similar to the X-Files tutorial - consider this snippet:
@task
def retrieve_episode_ids():
    api_connection = APIConnection(prefect.context.my_config)
    return api_connection.get_episode_ids()

@task(max_retries=2, retry_delay=datetime.timedelta(seconds=3))
def download_episode(episode_id):
    api_connection = APIConnection(prefect.context.my_config)
    return api_connection.get_episode(episode_id)

@task(trigger=all_finished)
def persist_episodes(episodes):
    db_connection = DBConnection(prefect.context.my_config)
    # ...store all episodes by their ID with a success/failure flag...

with Flow("import_episodes") as flow:
    episode_ids = retrieve_episode_ids()
    episodes = download_episode.map(episode_ids)
    persist_episodes(episodes)
The peculiarity of my flow, compared with the simple X-Files tutorial, is that I would like to persist results for all the episodes that I have requested, even for the failed ones. Imagine that I'll be writing episodes to a database table as the episode ID decorated with an is_success flag. Moreover, I'd like to write all episodes with a single task instance, in order to be able to perform a bulk insert - as opposed to inserting each episode one by one - hence my persist_episodes task being a reduce task.
The trouble I'm having is in being able to gather the episode ID for the failed downloads from that reduce task, so that I can store the failed information in the table under the appropriate episode ID. I could of course rewrite the download_episode task with a try/catch and always return an episode ID even in the case of failure, but then I'd lose the automatic retry/failure functionality which is a good deal of the appeal of Prefect.
Is there a way for a reduce task to infer the argument(s) of a failed mapped task? Or, could I write this differently to achieve what I need, while still keeping the same level of clarity as in my example?
Mapping over a list preserves the order, and this is a property you can use to link inputs with errors. Check the code below; I'll add more explanation after.
from prefect import Flow, task
import prefect

@task
def retrieve_episode_ids():
    return [1, 2, 3, 4, 5]

@task
def download_episode(episode_id):
    if episode_id == 5:
        return ValueError()
    return episode_id

@task()
def persist_episodes(episode_ids, episodes):
    # Note the last element here will be the ValueError
    prefect.context.logger.info(episodes)
    # We change that ValueError into a "fail" message
    episodes = ["fail" if isinstance(x, BaseException) else x for x in episodes]
    # Note the last element here will be the "fail"
    prefect.context.logger.info(episodes)
    result = {}
    for i, episode_id in enumerate(episode_ids):
        result[episode_id] = episodes[i]
    # Check final results
    prefect.context.logger.info(result)
    return

with Flow("import_episodes") as flow:
    episode_ids = retrieve_episode_ids()
    episodes = download_episode.map(episode_ids)
    persist_episodes(episode_ids, episodes)

flow.run()
The handling largely happens in persist_episodes: just pass in the list of inputs again, and then we can match the inputs with the failed tasks. I added some handling that identifies the errors and replaces them with what you want. Does that answer the question?
Always happy to chat more. You can reach out in the Prefect Slack or Discourse as well.

I don't understand how Python Dash @callback knows to execute the def function

I want to understand how the @callback decorator knows to execute def update_graph, because I don't see any link where a variable, let's say the Input country_selector or value, is used both in the callback AND in the def function, so that the callback knows I want the def function to be executed. Can anyone give me a simple answer for that?
@app.callback(
    Output('timeseries', 'figure'),
    [Input('country_selector', 'value')]
)
def update_graph(selected_dropdown_value):
    trace = []
    for countriesAndTerritories in selected_dropdown_value:
        # build the bar chart iteratively
        trace.append(go.Bar(
            x=df.month,
            y=df[df["countriesAndTerritories"] == countriesAndTerritories]["cases"],
            name=countriesAndTerritories
        ))
    data = trace
A humble attempt to explain callbacks. Let's look at the first few lines:
@app.callback(
    Output('timeseries', 'figure'),
    [Input('country_selector', 'value')]
)
@app.callback is Dash's way of making the display react to user input. It takes inputs (and states of inputs) and changes outputs. So one defines all the Output() components that need to change; this can be a list of more than one, in which case use [] to enclose all of them. Similarly, Input and State can be lists, to denote multiple inputs or states that can then affect or change the outputs.
Further, if we take a look at Output('timeseries', 'figure'), what we are telling Dash is that we want to update the element with the id timeseries, and specifically its figure property. figure can be replaced with, say, value or children depending on what we are trying to change. The same holds for Input and State: the first parameter gives the id of the element, and the second the property that changes.
Now, moving on to the def that is defined below the @app.callback. The name of this function is not a major factor per se, but its parameters will now be all the inputs we defined earlier. In your specific example, def update_graph(selected_dropdown_value):, we have one input: the value of country_selector. So selected_dropdown_value will now hold this value.
Inside this function then, we can either call other business logic functions defined in other modules or within the dash app itself, which may take in these inputs and generate or return the necessary output.
An example pseudo code:
def generate_bar(country):
    # logic for extracting the right info goes here
    scatter_fig = go.Figure()
    scatter_fig.add_trace(go.Bar(x=df['country'], y=df['counts']))
    scatter_fig.update_layout(title='new graph')
    return scatter_fig

@app.callback(
    Output('timeseries', 'figure'),
    [Input('country_selector', 'value')]
)
def update_figs(selected_dropdown_value):
    new_fig = generate_bar(selected_dropdown_value)
    return new_fig
Finally this new_fig now replaces the figure element having the id as timeseries.
To add to what @Syamanthaka said, the callback acts on the function definition that comes directly below it.
I understand your concern, as I had the same one, especially when there is more than one function definition in the code window; the functions that are not meant to be affected by the callback decorator have to be placed above it.
I understand that you would have preferred something like the code below, to show that it wraps the function; sadly, it does not work this way.
@app.callback(
    Output('timeseries', 'figure'),
    [Input('country_selector', 'value')],
    def function_to_be_called_back(selected_dropdown_value):
        trace = []
        for countriesAndTerritories in selected_dropdown_value:
            # build the bar chart iteratively
            trace.append(go.Bar(
                x=df.month,
                y=df[df["countriesAndTerritories"] == countriesAndTerritories]["cases"],
                name=countriesAndTerritories
            ))
        data = trace
)
Sadly, it is not designed this way; it works based on the position of the function definition.
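For what it's worth, the mechanism itself is plain Python, not Dash magic. Here is a minimal sketch (the registry dict is a made-up stand-in for Dash's internal bookkeeping, not Dash's actual API) showing how a decorator receives the function defined directly below it:

```python
# registry stands in for Dash's internal callback map (illustrative only)
registry = {}

def callback(output_id):
    # callback(...) returns a decorator; the decorator receives the
    # function written directly below the @ line and records it
    def decorator(func):
        registry[output_id] = func
        return func
    return decorator

@callback('timeseries')
def update_graph(value):
    return f"figure for {value}"

# The framework, not the user, later looks up and calls the function
# when the input changes:
print(registry['timeseries']('Germany'))  # figure for Germany
```

This is why positioning matters: the @ syntax binds the decorator to exactly one function, the one defined immediately after it.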

Creating custom component in SpaCy

I am trying to create SpaCy pipeline component to return Spans of meaningful text (my corpus comprises pdf documents that have a lot of garbage that I am not interested in - tables, headers, etc.)
More specifically I am trying to create a function that:
takes a doc object as an argument
iterates over the doc tokens
when certain rules are met, yields a Span object
Note: I would also be happy with returning a list ([span_obj1, span_obj2])
What is the best way to do something like this? I am a bit confused on the difference between a pipeline component and an extension attribute.
So far I have tried:
nlp = English()
Doc.set_extension('chunks', method=iQ_chunker)
####
raw_text = get_test_doc()
doc = nlp(raw_text)
print(type(doc._.chunks))
>>> <class 'functools.partial'>
iQ_chunker is a method that does what I explained above, and it returns a list of Span objects.
This is not the result I expect, as the function I pass in as method returns a list.
I imagine you're getting a functools partial back because you are accessing chunks as an attribute, despite having passed it in as an argument for method. If you want spaCy to intervene and call the method for you when you access something as an attribute, it needs to be
Doc.set_extension('chunks', getter=iQ_chunker)
Please see the Doc documentation for more details.
However, if you are planning to compute this attribute for every single document, I think you should make it part of your pipeline instead. Here is some simple sample code that does it both ways.
import spacy
from spacy.tokens import Doc

def chunk_getter(doc):
    # the getter is called when we access _.extension_1,
    # so the computation is done at access time
    # also, because this is a getter,
    # we need to return the actual result of the computation
    first_half = doc[0:len(doc)//2]
    second_half = doc[len(doc)//2:len(doc)]
    return [first_half, second_half]

def write_chunks(doc):
    # this pipeline component is called as part of the spacy pipeline,
    # so the computation is done at parse time
    # because this is a pipeline component,
    # we need to set our attribute value on the doc (which must be registered)
    # and then return the doc itself
    first_half = doc[0:len(doc)//2]
    second_half = doc[len(doc)//2:len(doc)]
    doc._.extension_2 = [first_half, second_half]
    return doc

nlp = spacy.load("en_core_web_sm", disable=["tagger", "parser", "ner"])
Doc.set_extension("extension_1", getter=chunk_getter)
Doc.set_extension("extension_2", default=[])
nlp.add_pipe(write_chunks)

test_doc = nlp('I love spaCy')
print(test_doc._.extension_1)
print(test_doc._.extension_2)
This just prints [I, love spaCy] twice because it's two methods of doing the same thing, but I think making it part of your pipeline with nlp.add_pipe is the better way to do it if you expect to need this output on every document you parse.

How to return a variable from a python function with a single parameter

I have the following function:
def test(crew):
    crew1 = crew_data['CrewEquipType1']
    crew2 = crew_data['CrewEquipType2']
    crew3 = crew_data['CrewEquipType3']
    return

test('crew1')
I would like to be able to use any one of the 3 variables as an argument and return the output accordingly to use as a reference later in my code. FYI, each of the variables above is a Pandas series from a DataFrame.
I can create functions without parameters, but for some reason I can't quite get the concept of how to use parameters effectively, such as in the example above; instead I find myself writing individual functions rather than writing a single one and adding a parameter.
If someone could provide a solution to the above that would be greatly appreciated.
Assumption: your problem seems to be that you want to return the corresponding variable crew1, crew2 or crew3 based on the input to the function test.
Some test cases based on my understanding of your problem
test('crew1') should return crew_data['CrewEquipType1']
test('crew2') should return crew_data['CrewEquipType2']
test('crew3') should return crew_data['CrewEquipType3']
To accomplish this you can implement a function like this:
def test(crew):
    if crew == 'crew1':
        return crew_data['CrewEquipType1']
    elif crew == 'crew2':
        return crew_data['CrewEquipType2']
    elif crew == 'crew3':
        return crew_data['CrewEquipType3']
    ...
    ...  # add as many cases as you would like
    ...
    else:
        # You could handle an incorrect value for the `crew` parameter here
Hope this helps!
Drop a comment if not

Semantics and ambiguity of "returning" a value?

I was reading up on questions from a python quiz. Here is the following code and its respective question:
class Player(object):
    def __init__(self, name, health):
        self._name = name
        self._health = health

    def get_health(self):
        """Return the players health."""
        ## LINE ##
What is the required code for ## LINE ## so that the method satisfies the comment?
(a) print(self.health)
(b) return self.health
(c) print(self._health)
(d) return self._health
(e) More than one of the above is correct.
So, I'm wondering, is this question ambiguous?
If I state that a specific function's purpose is to "return the value of x", could that not be interpreted both as literally employing the return statement to give x's value and as using the print function to display the value?
Both give the same answer at face value in the interpreter.
Of course, things are different if you attempt to manipulate it indirectly:
get_health() * 5 yields a normal output if using return
get_health() * 5 yields an error if using print
So should I always treat 'return something' as actually using the return command?
I suppose print and return would both be viable only if the function's purpose said something like "Display the value in the python interpreter".
The correct answer is simply d): return self._health.
You almost answered your own question. "Return" in programming parlance means using the return statement (in Python/C/..., or an implicit return in other languages).
The point here is that the comment is meant for programmers, not users.
A print statement would imply something to the user running your program ("return output visible to the user"), but the user will not see or know about that comment.
And, as you already pointed out, the use of returning an actual value allows constructs like get_health() * 5.
Going one small step further, I would expect a printing function to be called print_health(); but that's up to the logic of the programming standard & style that is being used.
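Following that naming, a short sketch making the difference concrete (the health value and print_health counterpart are illustrative, not part of the quiz):

```python
class Player(object):
    def __init__(self, name, health):
        self._name = name
        self._health = health

    def get_health(self):
        """Return the player's health."""
        return self._health  # answer (d): hands the value back to the caller

    def print_health(self):
        """Display the player's health."""
        print(self._health)  # shows the value, but the call evaluates to None

p = Player('Ana', 80)
print(p.get_health() * 5)  # 400 -- the returned value supports arithmetic
p.print_health()           # displays 80, but...
# p.print_health() * 5 would raise TypeError, because the call returns None
```

The arithmetic line is exactly the get_health() * 5 construct from the question, and it only works with the return version.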
