RunGreatExpectationsValidation execution returns an exception - prefect

I am struggling with a great_expectations integration problem.
I am using the RunGreatExpectationsValidation task as follows:
validation_task = RunGreatExpectationsValidation()

with Flow(
    "GE_pull_and_run",
) as GE_pull_and_run_flow:
    .......
    validation_task(
        context_root_dir=root_dir,
        checkpoint_name=expectation_checkpoint_name
    )
When I run the checkpoint from the GE CLI (great_expectations --v3-api checkpoint run my_checkpoint) it works, but in the Prefect task I get an exception:
With GE V3 api:
.....
File "c:\Users\vincent2\DK\prefect.data.pipeline\venv\lib\site-
packages\prefect\tasks\great_expectations\checkpoints.py", line 246, in run
for batch in ge_checkpoint["batches"]:
TypeError: 'Checkpoint' object is not subscriptable
The same happens with the GE V2 API:
...
for batch in ge_checkpoint["batches"]:
TypeError: 'LegacyCheckpoint' object is not subscriptable
great_expectations==0.13.43 (also tried 0.12.10)
prefect==0.15.9
Has anyone experienced this problem?
Thanks

To provide an update on this: the issue has been fixed as part of this PR in Prefect. Feel free to give it a try now and if something still doesn't work for you, let us know.
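For reference, a minimal flow along those lines might look roughly like the sketch below (the project root and checkpoint name are placeholders, and it assumes Prefect 0.15.x with the great_expectations extra installed):

from prefect import Flow
from prefect.tasks.great_expectations import RunGreatExpectationsValidation

validation_task = RunGreatExpectationsValidation()

with Flow("GE_pull_and_run") as flow:
    # placeholder project root and checkpoint name
    validation_task(
        context_root_dir="great_expectations",
        checkpoint_name="my_checkpoint",
    )

flow.run()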

Related

Strapi debug: Error on attribute departure in model

Today when I tried to run my Strapi project, there was an error saying that the inversedBy attribute flight was not found in the target api::airport.airport. The command reports that the Admin UI was built successfully, but I cannot access the admin panel or do anything with it. The error seems to belong to one of the content types, but the entire API is not working. What should I do? Does anyone know how to fix this bug?
Thank you.
Firstly, I tried running the start command (npm run develop) several times, but it kept reporting the same error.
Secondly, I tried to access the administration panel directly, but that failed as well.
I hope someone can help me figure out how to solve this bug/error.
I had a similar error.
The issue for me was that the attribute key (i.e. the key in the schema JSON) didn't match the name referenced by mappedBy and inversedBy in the model.
E.g. mappedBy: "f_light" should point to an attribute actually keyed
"f_light": { type: "relation", ... }
At least that was the problem for me.
Strapi Docs on how the schema is supposed to look
My issue: Error on attribute a_token in model a-request(api::a-request.a-request): inversedBy attribute a-requests not found target api::a-token.a-token
This occurred because I used inversedBy: 'a-token' when the attribute key was 'a_token'. Changing them so they matched solved my issue ('a-token' -> 'a_token').
The names used by mappedBy, inversedBy, and the attribute keys MUST use '_' rather than '-' as a separator, otherwise they will fail Strapi's naming convention checks.

Azure timer function. When to add cursor.close

I am very new to Azure functions and have a question. I am working on an Azure timer function that pulls data via an API and inserts it into an Azure SQL db. I am able to do all that part successfully. However, at the end of the script, I get the following error:
Exception: ProgrammingError: Attempt to use a closed cursor.
My question is, when would I include cursor.close? Should I have that in there at all? I assume yes, but, if so, where do I use that?
If I comment it out, it works fine, but I feel like I should have that in there.
Here's my code:
def main(mytimer: func.TimerRequest) -> None:
    gp_data = get_properties()
    for index, row in gp_data.iterrows():
        cursor.execute("""INSERT INTO dbo.get_properties3 (propertyid, property_name, street_address,
                       city, state_code, zip_code, phone, email, manager, currentperiod_start,
                       currentperiod_end, as_of_date) values(?,?,?,?,?,?,?,?,?,?,?,?)""",
                       row.propertyid, row.property_name, row.street_address, row.city, row.state_code,
                       row.zip_code, row.phone, row.email, row.manager,
                       row.currentperiod_start, row.currentperiod_end, row.as_of_date)
    cnxn.commit()
    # cursor.close()
Any advice would be greatly appreciated.
Thanks!
In my opinion, the cursor.close() call is unnecessary, because the cursor will be garbage collected like any other Python object. Each run of your timer function will not be affected even if you don't call cursor.close().
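If you do want to close it explicitly, one option (a sketch only, assuming cnxn is your module-level database connection, get_properties() is your existing helper, and insert_sql is the same INSERT statement as above) is to open a fresh cursor inside the function and close it in a finally block, so no later invocation ever touches a closed cursor:

import azure.functions as func

def main(mytimer: func.TimerRequest) -> None:
    gp_data = get_properties()
    cursor = cnxn.cursor()  # fresh cursor for this invocation
    try:
        for index, row in gp_data.iterrows():
            cursor.execute(insert_sql,  # insert_sql: the INSERT statement shown above
                           row.propertyid, row.property_name, row.street_address, row.city,
                           row.state_code, row.zip_code, row.phone, row.email, row.manager,
                           row.currentperiod_start, row.currentperiod_end, row.as_of_date)
        cnxn.commit()
    finally:
        cursor.close()  # safe here: the next run creates its own cursor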

Aws cloudwatch logs getQueryResults returns empty when tried with boto3

Using boto3, I am trying to start a CloudWatch Logs Insights query and then get the results using the query id, but it doesn't work as expected from a Python script. start_query returns the expected JSON output and I am able to fetch the queryId, but when I try to fetch the query results using that queryId, it returns an empty result set.
import boto3

client = boto3.client('logs')
executeQuery = client.start_query(
    logGroupName='LOGGROUPNAME',
    startTime=STARTDATE,
    endTime=ENDDATE,
    queryString='fields status',
    limit=10000
)
getQueryId = executeQuery.get('queryId')
getQueryResults = client.get_query_results(
    queryId=getQueryId
)
The response from get_query_results is:
{'results': [], 'statistics': {'recordsMatched': 0.0, 'recordsScanned': 0.0, 'bytesScanned': 0.0}, 'status': 'Running',
But if I try the AWS CLI with the queryId generated from the script, it returns the JSON output as expected.
Can anyone tell me why it didn't work from the boto3 Python script but worked in the CLI?
Thank you.
The query status is Running in your example; it is not in the Complete status yet.
Running queries is not instantaneous. You have to wait a bit for the query to complete before you can get results.
You can use describe_queries to check whether your query has completed. You can also check whether the logs service has dedicated waiters in boto3 for query results; they would save you from polling the describe_queries API in a loop while waiting for your queries to finish.
When you do this in the CLI, there is probably more time between starting the query and requesting its results.
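A minimal polling sketch along those lines (the retry count and sleep interval are arbitrary choices, not values required by the API):

import time

response = None
for _ in range(30):                      # arbitrary retry limit
    response = client.get_query_results(queryId=getQueryId)
    if response['status'] == 'Complete':
        break
    time.sleep(2)                        # give the query time to finish

print(response['results'])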
The other issue you might be encountering is that the syntax for the queryString in the API is different from a query you would type into the CloudWatch console.
Console query syntax example:
{ $.foo = "bar" && $.baz > 0 }
API syntax for same query:
filter foo = "bar" and baz > 0
Source: careful reading and extrapolation from the official documentation plus some trial-and-error.
My logs are in JSON format. YMMV.
Not sure if this problem is resolved. I was facing the same issue with the AWS Java SDK, but when I terminate the thread performing the executeQuery call and run get_query_results from a new thread with the old queryId, it seems to work fine.
Adding a sleep will work here. If the query takes longer than the sleep time, it will still show the Running status, so write a loop that checks for the Completed status; if the status is still Running, sleep again for a few seconds and retry, up to some retry count.
Sample pseudocode:

define a sleep function (say SleepFunc())
loop until the retry count is reached:
    check if the status is Completed
    if yes, break
    else call SleepFunc()

Reading return from server.workbooks.refresh using Tableau REST API

This will probably sound stupid, but I have a Python script which is trying to refresh a Tableau extract using a workbook id on Server. I have all the code working just fine, and the extract refresh even works using the server.workbooks.refresh method, passing the workbook id in the call. I am assigning the return value to a variable called "results". The problem is that I want to pull the job id from the results variable, and everything I have tried to reference the id within the "results" variable does not work. I keep getting an AttributeError saying the 'JobItem' object has no attribute.
I have tried to reference the object as a string, as a tuple, as a dictionary, and as a list, but I cannot figure out what this object actually is so I can reference the data within it, and I cannot find anywhere on the internet that explains what is returned.
results = server.workbooks.refresh(selected_workbook_id)
print(results)
print("\nThe data of workbook {0} is refreshed.".format(results.name))
Here is the error after the print statement:
<Job#fc62052d-e824-4594-8681-64dbb9a8216c RefreshExtract created_at(2019-11-06 22:18:21+00:00) started_at(None) completed_at(None) progress (None) finish_code(-1)>
https://wnuapesstablu01.dstcorp.net/api/3.4/auth
Traceback (most recent call last):
File "C:\Users\dt24358\Python36\Scripts\Tableau REST API Scripts\Refresh_Single_Extract_v2.py", line 134, in <module>
main()
File "C:\Users\dt24358\Python36\Scripts\Tableau REST API Scripts\Refresh_Single_Extract_v2.py", line 131, in main
print("\nThe data of workbook {0} is refreshed.".format(results.name))
AttributeError: 'JobItem' object has no attribute 'name'
To close this issue out: I realized that I needed to use the right API reference for the JobItem class. See https://tableau.github.io/server-client-python/docs/api-ref#jobs
Valid references are things like "id", "type", "created_at", "started_at". So, for those who didn't understand this like me, the reference is:
workbook = server.workbooks.get_by_id(selected_workbook_id)
results = server.workbooks.refresh(workbook.id)
print(results)
jobid = results.id
This will return the job id of the refresh task that was started. You can then write a routine to poll the server to see when the extract job has finished.
Hope this helps someone... It was driving me crazy.
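A rough polling sketch (assuming tableauserverclient; in the JobItem printed above, finish_code is -1 while the job is still running, and 0 indicates success):

import time

while True:
    job = server.jobs.get_by_id(jobid)
    if job.finish_code != -1:   # -1 while running; 0 on success
        break
    time.sleep(10)              # arbitrary poll interval

print("Refresh finished with finish_code {0}".format(job.finish_code))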

How to translate error messages in Colander

How can I translate the error messages from the colander validators? The documentation just says that it's possible.
def valid_text(node, value):
    raise Invalid(node, u"Some error message")

class form(colander.MappingSchema):
    name = colander.SchemaNode(colander.String(), validator=valid_text)
I know that deform does it already, but I need to use colander on its own.
According to the API documentation, the msg argument to Invalid can be a translation string instance. Information on working with translation strings is here.
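For example, a minimal sketch using the translationstring package (the 'myapp' domain is a placeholder; the actual lookup still has to be done by whatever i18n machinery you wire up around colander):

import colander
from colander import Invalid
from translationstring import TranslationStringFactory

_ = TranslationStringFactory('myapp')  # placeholder translation domain

def valid_text(node, value):
    # The message is now a TranslationString, so it can be looked up in a
    # message catalog instead of being shown verbatim.
    raise Invalid(node, _(u"Some error message"))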
Looks like this issue was already addressed and fixed, but it will be part of the next release. I've just added the changes from commit f6be836 and it works like a charm.
