How to load a specific catalog dataset instance in kedro 0.17.0? - hook

We were using kedro version 0.15.8 and we were loading one specific item from the catalog this way:
from kedro.context import load_context
get_context().catalog.datasets.__dict__[key]
Now, we are changing to kedro 0.17.0 and trying to load the catalogs datasets the same way(using the framework context):
from kedro.framework.context import load_context
get_context().catalog.datasets.__dict__[key]
And now we get the error:
kedro.framework.context.context.KedroContextError: Expected an instance of ConfigLoader, got NoneType instead.
It's because the hook register_config_loader from the project it's not being used by the hook_manager that calls the function.
The project hooks are the defined the following way:
class ProjectHooks:
#hook_impl
def register_pipelines(self) -> Dict[str, Pipeline]:
"""Register the project's pipeline.
Returns:
A mapping from a pipeline name to a ``Pipeline`` object.
"""
pm = pre_master.create_pipeline()
return {
"pre_master": pm,
"__default__": pm
}
#hook_impl
def register_config_loader(self, conf_paths: Iterable[str]) -> ConfigLoader:
return ConfigLoader(conf_paths)
#hook_impl
def register_catalog(
self,
catalog: Optional[Dict[str, Dict[str, Any]]],
credentials: Dict[str, Dict[str, Any]],
load_versions: Dict[str, str],
save_version: str,
journal: Journal,
) -> DataCatalog:
return DataCatalog.from_config(
catalog, credentials, load_versions, save_version, journal
)
project_hooks = ProjectHooks()
And the settings are called the following way:
"""Project settings."""
from price_based_trading.hooks import ProjectHooks
HOOKS = (ProjectHooks(),)
How can we configure that in a way that the hooks are used calling the method load_context(_working_dir).catalog.datasets ?
I posted the same question in the kedro community: https://discourse.kedro.community/t/how-to-load-a-specific-catalog-item-in-kedro-0-17-0/310

It was a silly mistake because I was not creating the Kedro session. To load an item of the catalog it can be done with the following code:
from kedro.framework.session import get_current_session
from kedro.framework.session import KedroSession
KedroSession.create("name_of_proyect") as session:
key = "item_of_catalog"
session = get_current_session()
context = session.load_context()
kedro_connector = context.catalog.datasets.__dict__[key]
// or kedro_connector = context.catalog._get_datasets(key)

Related

Subiquity can't establish connection after a new server controller has been added

I'm trying to add a new screen to Subiquity but Subiquity gets stuck every time I run it.
running server pid 40812
connecting... /
To create my controller I used an example from DESIGN.md
My server controller class is as below:
import logging
from subiquity.common.apidef import API
from subiquity.server.controller import SubiquityController
log = logging.getLogger('subiquity.server.controllers.example')
class ExampleController(SubiquityController):
endpoint = API.example
model_name = 'example'
async def GET(self) -> str:
return self.model.thing
async def POST(self, data: str):
self.model.thing = data
await self.configured()
Also, I added my Example controller class to controllers[] in subiquity/server/server.py:
controllers = [
# other classes
"Kernel",
"Keyboard",
"Example",
"Zdev",
"Source",
# some other classes
]
And added example = simple_endpoint(str) to apidef.py
My model as simple as possible:
class ExampleModel:
thing = "example_var"
What might cause the problem? If I remove my Example controller from controllers[], Subiquity works but obviously doesn't use my controller.

Repeated checking of ID path parameter in FastAPI

I have the following route specs:
GET /councils/{id}
PUT /councils/{id}
DELETE /councils/{id}
In all three routes, I have to check in the database whether the council with the id exists, like this:
council = crud.council.get_by_id(db=db, id=id)
if not council:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail='Council not found'
)
Which adds to a lot of boilerplate in the code. Is there any way of reducing this? I have thought of creating a dependency but then I have to write different dependency function for different models in my database. Is there any standard practice for this?
Thanks.
Using a dependency is the way to go - it allows you to extract the logic around "get a valid council from the URL"; you can generalize it further if you want to map /<model>/<id> to always retrieving something from a specific model; however - you might want to validate this against a set of possible values to avoid people trying to make you load random Python identifiers in your models class.
from fastapi import Path
def get_council_from_path(council_id: int = Path(...),
db: Session = Depends(get_db),
):
council = crud.council.get_by_id(db=db, id=id)
if not council:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail='Council not found'
)
return council
#app.get('/councils/{council_id}')
def get_council(council: Council = Depends(get_council_from_path)):
pass
You can generalize the dependency definition to make it reusable (I don't have anything available to test this right now, but the idea should work):
def get_model_from_path(model, db: Session = Depends(get_db)):
def get_model_from_id(id: int = Path(...)):
obj = model.get_by_id(db=db, id=id)
if not council:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail='{type(model)} not found'
)
return get_model_from_id
#app.get('/councils/{id}')
def get_council(council: Council = Depends(get_model_from_path(crud.council)):
pass
Here is one solution that works. FastAPI supports classes as dependencies. Therefore I can have a class like this:
class DbObjIdValidator:
def __init__(self, name: str):
if name not in dir(crud):
raise ValueError('Invalid model name specified')
self.crud = getattr(crud, name)
def __call__(self, db: Session = Depends(get_db), *, id: Any):
db_obj = self.crud.get_by_id(db=db, id=id)
if not db_obj:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail='not found'
)
return db_obj
Considering crud module imports the crud classes for all the ORM models. My example is closely following the fastAPI cookiecutter project by Tiangolo, and the crud design pattern he followed.
Now in order to use it, for example in councils_route.py, I can do the following:
router = APIRouter(prefix='/councils')
council_id_dep = DbObjIdValidator('council') # name must be same as import name
#router.get('/{id}', response_model=schemas.CouncilDB)
def get_council_by_id(
id: UUID,
db: Session = Depends(get_db),
council: Councils = Depends(council_id_dep)
):
return council

Using hydra.main on main method

Summary of the problem:
I'm running a requests call on an API endpoint, whose request params are hidden in a config file and I decided to try out hydra to retrieve those params [Reason being the request params do change as I'm working on collecting custom dataset using RapidAPI]
I have created a class called QueryParamsLocations which implements the getter methods to fetch the parameters to be later used by run_query method.
class QueryParamsLocations(QueryParams):
#hydra.main(config_path='configs', config_name='location_query')
def get_params_query_string(self, cfg: DictConfig) -> dict():
return {
'query': cfg.location_params.query,
'locale': cfg.location_params.locale,
'currency': cfg.location_params.currency
}
#hydra.main(config_path='configs', config_name='location_query')
def get_url(self, cfg: DictConfig) -> str():
return cfg.urls.location_url
#hydra.main(config_path='configs', config_name='location_query')
def get_headers(self, cfg: DictConfig) -> dict():
return {
'X-RapidAPI-Host': cfg.headers.x_rapidapi_host,
'X-RapidAPI-Key': cfg.headers.x_rapidapi_key
}
class QueryParams is an abstract class which has these 3 getter templates. run_query method is an external call to run the request.
#hydra.main(config_path='configs', config_name='location_query')
def run_query(cfg: DictConfig) -> None:
try:
LoggerFactory.get_logger('logs/logger.log', 'INFO').info('Running query for location')
qpl = QueryParamsLocations()
response = requests.request("GET", qpl.get_url(cfg), headers=qpl.get_headers(cfg), params=qpl.get_params_query_string(cfg))
print(response.json())
except Exception as e:
LoggerFactory.get_logger('logs/logger.log',
'ERROR').error(f'Error in running query: {e}')
run_query()
While running run_query without if name == 'main': and with it as well , the following error is encountered :
[2022-05-16 13:43:32,614][logs/logger.log][ERROR] - Error in running query: **decorated_main()** takes from 0 to 1 positional arguments but 2 were given
Although newer version of hydra (I'm using hydra-core==1.1.2) uses two arguments while creating cfg object however , I'm not sure as to whether there's other way of handling this as such.
Also, by searching through other threads, following was also tried - Compose API
however, from the docs, it requires an override parameter , which is not needed atm.
Would like to know if any other approach can be tried out. Happy to provide more details if needed.
Definitely use the Compose API and not hydra.main() for this use case.
You can just pass an empty array for your override list if you have nothing to override.

How do I properly set up a single SQLAlchemy session for each unit test?

When testing my Pyramid application using WebTest, I have not been able to create/use a separate Session in my tests without getting warnings about a scoped session already being present.
Here is the main() function of the Pyramid application, which is where the database is configured.
# __init__.py of Pyramid application
from pyramid_sqlalchemy import init_sqlalchemy
from sqlalchemy import create_engine
def main(global_config, **settings):
...
db_url = 'some-url'
engine = create_engine(db_url)
init_sqlalchemy(engine) # Warning thrown here.
Here is the test code.
# test.py (Functional tests)
import transaction
from unittest import TestCase
from pyramid.paster import get_appsettings
from pyramid_sqlalchemy import init_sqlalchemy, Session
from sqlalchemy import create_engine
from webtest import TestApp
from app import main
from app.models.users import User
class BaseTestCase(TestCase):
def base_set_up(self):
# Create app using WebTest
settings = get_appsettings('test.ini', name='main')
app = main({}, **settings)
self.test_app = TestApp(app)
# Create session for tests.
db_url = 'same-url-as-above'
engine = create_engine(db_url)
init_sqlalchemy(engine)
# Note: I've tried both using pyramid_sqlalchemy approach here and
# creating a "plain, old" SQLAlchemy session here using sessionmaker.
def base_tear_down(self):
Session.remove()
class MyTests(BaseTestCase):
def setUp(self):
self.base_set_up()
with transaction.manager:
self.user = User('user#email.com', 'John', 'Smith')
Session.add(self.user)
Session.flush()
Session.expunge_all()
...
def tearDown(self):
self.base_tear_down()
def test_1(self):
# This is a typical workflow on my tests.
response = self.test_app.patch_json('/users/{0}'.format(self.user.id), {'email': 'new.email#email.com')
self.assertEqual(response.status_code, 200)
user = Session.query(User).filter_by(id=self.user.id).first()
self.assertEqual(user.email, 'new.email#email.com')
...
def test_8(self):
...
Running the tests gives me 8 passed, 7 warnings, where every test except the first one gives the following warning:
From Pyramid application: __init__.py -> main -> init_sqlalchemy(engine):
sqlalchemy.exc.SAWarning: At least one scoped session is already present. configure() can not affect sessions that have already been created.
If this is of any use, I believe I am seeing the same issue here, except I am using pyramid_sqlalchemy rather than creating my own DBSession.
https://github.com/Pylons/webtest/issues/5
Answering my own question: I'm not sure if this is the best approach, but one that worked for me.
Instead of trying to create a separate session within my tests, I am instead using the pyramid_sqlalchemy Session factory, which is configured in the application. As far as I can tell, calls to Session in both test and application code return the same registered scoped_session.
My original intent with creating a separate session for my tests was to confirm that records were being written to the database, and not just updated in the active SQLAlchemy session. With this new approach, I've managed to avoid these "caching" issues by issuing Session.expire_all() at points in the tests where I transition between test transactions and application transactions.
# test.py (Functional tests)
import transaction
from unittest import TestCase
from pyramid.paster import get_appsettings
from pyramid_sqlalchemy import Session
from webtest import TestApp
from app import main
from app.models.users import User
class BaseTestCase(TestCase):
def base_set_up(self):
# Create app using WebTest
settings = get_appsettings('test.ini', name='main')
app = main({}, **settings)
self.test_app = TestApp(app)
# Don't set up an additional session in the tests. Instead import
# and use pyramid_sqlalchemy.Session, which is set up in the application.
def base_tear_down(self):
Session.remove()
class MyTests(BaseTestCase):
def setUp(self):
self.base_set_up()
with transaction.manager:
self.user = User('user#email.com', 'John', 'Smith')
Session.add(self.user)
Session.flush()
Session.expunge_all()
Session.expire_all() # "Reset" your session.
def tearDown(self):
self.base_tear_down()

How to get a user by email in JIRA Script Runner

When writing a Groovy script for JIRA Script Runner, how do you get a user, or just their username, given their email address?
It seems that you're supposed to use the findUsersByEmail method in the UserSearchService interface.
https://docs.atlassian.com/jira/7.0.2/com/atlassian/jira/bc/user/search/UserSearchService.html
But how do you get an instance of this class?
Related question: How to get a user by email in a JIRA plugin.
The difference is that question is about a plugin, and my question is about JIRA Script Runner.
This code does not work:
setUserProperties(httpMethod: "POST", groups: ["jira-administrators"])
{ MultivaluedMap queryParams, String body, HttpServletRequest request ->
def userPropertyManager = ComponentAccessor.getUserPropertyManager()
def userManager = ComponentAccessor.getUserManager()
def userSearchService = DefaultUserPickerSearchService;
def users = userSearchService.findUsersByEmail("felicity.smoak#queenconsolidated.com")
users.each {
aUser ->
userPropertyManager.getPropertySet(aUser).setString("jira.meta.Company", "Smoak Technologies")
}
return Response.ok(users).build();
}
This is the error I got:
2016-04-18 15:23:06,168 ERROR [common.UserCustomScriptEndpoint]: *************************************************************************************
2016-04-18 15:23:06,168 ERROR [common.UserCustomScriptEndpoint]: Script endpoint failed on method: POST setUserProperties
groovy.lang.MissingMethodException: No signature of method: static com.atlassian.jira.bc.user.search.DefaultUserPickerSearchService.findUsersByEmail() is applicable for argument types: (java.lang.String) values: [felicity.smoak#queenconsolidated.com]
Possible solutions: findUsersByEmail(java.lang.String), findUserKeysByEmail(java.lang.String)
at Script462$_run_closure3.doCall(Script462.groovy:40)
at com.onresolve.scriptrunner.runner.rest.common.UserCustomScriptEndpoint.doEndpoint(UserCustomScriptEndpoint.groovy:308)
at com.onresolve.scriptrunner.runner.rest.common.UserCustomScriptEndpoint.postUserEndpoint(UserCustomScriptEndpoint.groovy:208)
EDIT
Based on #Oldskultxo's and #BjörnKautler suggestions, this is now my working code:
import com.atlassian.jira.component.ComponentAccessor
import com.atlassian.jira.ComponentManager
import com.atlassian.jira.user.*
import com.atlassian.jira.bc.user.search.UserSearchService
import com.atlassian.sal.api.user.UserManager
import com.onresolve.scriptrunner.runner.rest.common.CustomEndpointDelegate
import groovy.json.*
import groovy.transform.BaseScript
import javax.servlet.http.HttpServletRequest
import javax.ws.rs.core.MultivaluedMap
import javax.ws.rs.core.Response
#BaseScript CustomEndpointDelegate delegate
setUserProperties(httpMethod: "POST", groups: ["jira-administrators"])
{ MultivaluedMap queryParams, String body, HttpServletRequest request ->
def userPropertyManager = ComponentAccessor.getUserPropertyManager()
def userManager = ComponentAccessor.getUserManager()
def userSearchService = ComponentAccessor.getComponent(UserSearchService.class)
def users = userSearchService.findUsersByEmail("felicity.smoak#queenconsolidated.com")
users.each {
aUser ->
userPropertyManager.getPropertySet(aUser).setString("jira.meta.Company", "Smoak Technologies")
}
return Response.ok("200").build();
}
Use ComponentAccessor.getComponent(UserSearchService) to get the right service if there is no concrete getUserSearchService() method.
I usually get components this way:
ComponentManager.getComponentInstanceOfType(UserSearchService.class);
And then just look for its methods.
Regards

Resources