dbt workflow on Databricks fails: AttributeError in object SeedNode

Today our dbt workflow in Databricks failed. The workflow runs as:
dbt run --target workflow --project-dir dbt/projectdir/ --profiles-dir dbt/
Any suggestions what could be wrong or how to fix it?
Version reported in Databricks logs:
Running with dbt=1.4.1
The error message is below:
'SeedNode' object has no attribute 'depends_on'
09:59:17 Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/main.py", line 135, in main
results, succeeded = handle_and_check(args)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/main.py", line 198, in handle_and_check
task, res = run_from_args(parsed)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/main.py", line 245, in run_from_args
results = task.run()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 454, in run
self._runtime_initialize()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 165, in _runtime_initialize
super()._runtime_initialize()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 94, in _runtime_initialize
self.load_manifest()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 81, in load_manifest
self.manifest = ManifestLoader.get_full_manifest(self.config)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/manifest.py", line 203, in get_full_manifest
manifest = loader.load()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/manifest.py", line 339, in load
self.parse_project(
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/manifest.py", line 467, in parse_project
parser.parse_file(block)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 425, in parse_file
self.parse_node(file_block)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 386, in parse_node
self.render_update(node, config)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 363, in render_update
self.update_parsed_node_config(node, config, context=context)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 336, in update_parsed_node_config
get_rendered(hook.sql, context, parsed_node, capture_macros=True)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 590, in get_rendered
return render_template(template, ctx, node)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 545, in render_template
return template.render(ctx)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/jinja2/environment.py", line 1301, in render
self.environment.handle_exception()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/jinja2/environment.py", line 936, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 1, in top-level template code
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/jinja2/sandbox.py", line 393, in call
return __context.call(__obj, *args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 328, in call
with self.track_call():
File "/usr/lib/python3.9/contextlib.py", line 117, in __enter__
return next(self.gen)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 319, in track_call
self.node.depends_on.add_macro(unique_id)
AttributeError: 'SeedNode' object has no attribute 'depends_on'

I got the same issue, but I am on Snowflake.

This seems to have been a version issue. Explicitly pinning the task to an older version appears to have solved it:
dbt-core<=1.3.1
dbt-databricks<=1.3.1
This can be set in the Databricks workflow task settings.
I'm not sure which is the last version that would work, but 1.3.1 at least works in our case.
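To confirm which dbt version a cluster actually resolved, a small check against the pin can help. The following is a minimal Python sketch of the comparison that `<=1.3.1` expresses. The helper names `parse_version`, `satisfies_ceiling` and `installed_version` are hypothetical, and the parser assumes plain dotted numeric versions such as `1.4.1` (no pre-release suffixes); it is not a replacement for pip's own version handling.

```python
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn a plain dotted version such as '1.4.1' into (1, 4, 1)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_ceiling(installed: str, ceiling: str = "1.3.1") -> bool:
    """Return True when `installed` is at or below `ceiling`."""
    return parse_version(installed) <= parse_version(ceiling)


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

For example, `satisfies_ceiling("1.4.1")` is false for the version reported in the logs above, while `satisfies_ceiling("1.3.1")` is true. Calling `installed_version("dbt-core")` inside the task would show what was actually installed.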

Getting an error while trying to create a superuser for NetBox on Ubuntu. How do I solve it?

I'm trying to get an instance of NetBox set up. I'm at the step where I need to create a superuser.
As per the documentation, I'm running source /opt/netbox/venv/bin/activate
and confirming I'm in the venv,
followed by python3 manage.py createsuperuser
What I get in response is
`You have 167 unapplied migration(s). Your project may not work properly until you apply the migrations for app(s): admin, auth, circuits, contenttypes, dcim, django_rq, extras, ipam, sessions, social_django, taggit, tenancy, users, virtualization, wireless.
Run 'python manage.py migrate' to apply them.
Traceback (most recent call last):
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.UndefinedTable: relation "auth_user" does not exist
LINE 1: ...user"."is_active", "auth_user"."date_joined" FROM "auth_user...
^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/netbox/netbox/manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 446, in execute_from_command_line
utility.execute()
File "/opt/netbox/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 440, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/core/management/base.py", line 402, in run_from_argv
self.execute(*args, **cmd_options)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/contrib/auth/management/commands/createsuperuser.py", line 88, in execute
return super().execute(*args, **options)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/core/management/base.py", line 448, in execute
output = self.handle(*args, **options)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/contrib/auth/management/commands/createsuperuser.py", line 109, in handle
default_username = get_default_username(database=database)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/contrib/auth/management/__init__.py", line 163, in get_default_username
auth_app.User._default_manager.db_manager(database).get(
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/models/query.py", line 646, in get
num = len(clone)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/models/query.py", line 376, in __len__
self._fetch_all()
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/models/query.py", line 1867, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/models/query.py", line 87, in __iter__
results = compiler.execute_sql(
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1398, in execute_sql
cursor.execute(sql, params)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 84, in _execute
with self.db.wrap_database_errors:
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/opt/netbox/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: relation "auth_user" does not exist
LINE 1: ...user"."is_active", "auth_user"."date_joined" FROM "auth_user...`
Originally I was getting an error with my authorized users, where I had forgotten to put the value in quotes. I fixed that, and this was the next error to come up.
I found the line in question, but I'm not sure how I should change it so that this command succeeds.
See this part of your output:
You have 167 unapplied migration(s). Your project may not work properly until you apply the migrations for app(s): admin, auth, circuits, contenttypes, dcim, django_rq, extras, ipam, sessions, social_django, taggit, tenancy, users, virtualization, wireless. Run 'python manage.py migrate' to apply them.
Try applying your Django migrations as prompted:
python manage.py migrate
This will create the necessary database tables, including the one where your new superuser will be stored.
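The ordering matters: `createsuperuser` queries the `auth_user` table, which only exists once `migrate` has run. A minimal Python sketch of that ordering follows. The helper names `setup_commands` and `run_setup` are hypothetical, and the sketch assumes it is run with the virtual environment's interpreter so that `sys.executable` points at the right Python.

```python
import subprocess
import sys


def setup_commands(manage_py: str = "manage.py") -> list:
    """Return the two manage.py invocations in the order they must run."""
    return [
        # Creates auth_user and the other tables the apps need.
        [sys.executable, manage_py, "migrate"],
        # Reads from auth_user, so it has to come second.
        [sys.executable, manage_py, "createsuperuser"],
    ]


def run_setup(manage_py: str = "manage.py") -> None:
    """Run each command in turn, stopping at the first failure."""
    for command in setup_commands(manage_py):
        subprocess.run(command, check=True)
```

With the path from the traceback above, this would be called as `run_setup("/opt/netbox/netbox/manage.py")`.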

Creating an Expectation Suite with an automated profiler in Great Expectations

I am new to Great Expectations and, while setting it up, I am facing the issue below when creating an Expectation Suite with an automated profiler.
C:\Users\user\great_expectations>great_expectations --v3-api suite new
Using v3 (Batch Request) API
How would you like to create your Expectation Suite?
1. Manually, without interacting with a sample batch of data (default)
2. Interactively, with a sample batch of data
3. Automatically, using a profiler
: 3
A batch of data is required to edit the suite - let's help you to specify it.
Traceback (most recent call last):
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\user\AppData\Local\Programs\Python\Python310\Scripts\great_expectations.exe\__main__.py", line 7, in <module>
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\great_expectations\cli\cli.py", line 190, in main
cli()
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\click\decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\great_expectations\cli\suite.py", line 151, in suite_new
_suite_new_workflow(
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\great_expectations\cli\suite.py", line 335, in _suite_new_workflow
raise e
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\great_expectations\cli\suite.py", line 268, in _suite_new_workflow
suite: ExpectationSuite = toolkit.get_or_create_expectation_suite(
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\great_expectations\cli\toolkit.py", line 82, in get_or_create_expectation_suite
default_expectation_suite_name: str = get_default_expectation_suite_name(
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\great_expectations\cli\toolkit.py", line 131, in get_default_expectation_suite_name
suite_name = f"batch-{BatchRequest(**batch_request).id}"
TypeError: BatchRequest.__init__() missing 1 required positional argument: 'data_asset_name'
C:\Users\user\great_expectations>
I had the same issue and, for me, the problem came from a badly configured data source. I suggest testing your data source config to see how many data assets it finds:
from ruamel import yaml
import great_expectations as ge
context = ge.get_context()
datasource_config = {...}
context.test_yaml_config(yaml.dump(datasource_config))
When you run this, test_yaml_config outputs a report on how many assets it found.
If it didn't find any, you'll run into the issue you're describing when you try to create a suite on your data.

JupyterLab/Elyra: pipeline run on Kubeflow Pipelines fails with "No host specified" in local deployment

I have Kubeflow Pipelines running in my local environment, along with JupyterLab and the Elyra extensions. I've created a notebook pipeline and set up the runtime configuration with api_endpoint set to http://localhost:31380/pipeline (with security disabled). When I try to run the pipeline, the following error message is displayed:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/tornado/web.py", line 1703, in _execute
result = await result
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/handlers.py", line 89, in post
response = await PipelineProcessorManager.instance().process(pipeline)
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor.py", line 70, in process
res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
File "/usr/local/Cellar/python@3.8/3.8.6/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor_kfp.py", line 100, in process
raise lve
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor_kfp.py", line 89, in process
client.upload_pipeline(pipeline_path,
File "/usr/local/lib/python3.8/site-packages/kfp/_client.py", line 720, in upload_pipeline
response = self._upload_api.upload_pipeline(pipeline_package_path, name=pipeline_name, description=description)
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 83, in upload_pipeline
return self.upload_pipeline_with_http_info(uploadfile, **kwargs) # noqa: E501
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 177, in upload_pipeline_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 378, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 195, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 421, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/rest.py", line 279, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/rest.py", line 196, in request
r = self.pool_manager.request(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 79, in request
return self.request_encode_body(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 171, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 325, in urlopen
conn = self.connection_from_host(u.host, port=u.port, scheme=u.scheme)
File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 231, in connection_from_host
raise LocationValueError("No host specified.")
urllib3.exceptions.LocationValueError: No host specified.
The root cause is an issue in the Kubeflow Pipelines kfp package version 1.0.0 that is distributed with Elyra v1.4.1 (and lower). To work around the issue, replace localhost with 127.0.0.1 in your runtime configuration, e.g. http://127.0.0.1:31380/pipeline.
Edit: With the availability of Elyra v1.5+, which requires a more recent version of the kfp package, you can also upgrade Elyra to resolve the issue.
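If the endpoint is built or stored programmatically, the `localhost` to `127.0.0.1` workaround can be applied with the standard library. This is a minimal sketch; the function name `use_loopback_ip` is hypothetical and is not part of Elyra or kfp.

```python
from urllib.parse import urlsplit, urlunsplit


def use_loopback_ip(api_endpoint: str) -> str:
    """Replace a 'localhost' host with 127.0.0.1, keeping scheme, port and path.

    Any user:password@ prefix in the URL is dropped by this sketch.
    """
    parts = urlsplit(api_endpoint)
    if parts.hostname != "localhost":
        # Nothing to rewrite for other hosts.
        return api_endpoint
    netloc = "127.0.0.1"
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

Applied to the endpoint from the question, `use_loopback_ip("http://localhost:31380/pipeline")` gives the value suggested in the answer.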

MLflow Projects throw JSONDecodeError when run

I'm trying to run MLflow Projects using the MLflow CLI, and following the tutorial leads to an error. For any project I try to run from the CLI, I get the following error:
Traceback (most recent call last):
File "/home/rbc/.local/bin/mlflow", line 11, in <module>
sys.exit(cli())
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/cli.py", line 139, in run
run_id=run_id,
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 230, in run
storage_dir=storage_dir, block=block, run_id=run_id)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 88, in _run
active_run = _create_run(uri, experiment_id, work_dir, entry_point)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 579, in _create_run
active_run = tracking.MlflowClient().create_run(experiment_id=experiment_id, tags=tags)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/tracking/client.py", line 101, in create_run
source_version=source_version
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/store/rest_store.py", line 156, in create_run
response_proto = self._call_endpoint(CreateRun, req_body)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/store/rest_store.py", line 66, in _call_endpoint
js_dict = json.loads(response.text)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Here's an example of the type of command I'm using to start the run, which comes directly from the tutorial:
mlflow run https://github.com/mlflow/mlflow#examples/sklearn_elasticnet_wine -m databricks -c cluster-spec.json --experiment-id 72647065958042 -P alpha=2.0 -P l1_ratio=0.5
I've traced the error to something involving MLflow getting an empty result back when it tries to start a run. However, I can successfully run MLflow experiments using the Databricks environment I'm connecting to, so I'm not sure where the problem is. I'm running MLflow 0.9.1 on Ubuntu 18.04.
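The last frame of the traceback can be reproduced in isolation: `json.loads` raises this exact message whenever the text it is given does not start with a JSON value, for instance an empty body or an HTML page. The sketch below only illustrates that behaviour. It does not show what the server actually returned in this case, and the helper name `decode_error_message` is hypothetical.

```python
import json
from typing import Optional


def decode_error_message(body: str) -> Optional[str]:
    """Return the JSONDecodeError message for `body`, or None if it parses."""
    try:
        json.loads(body)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None


# An empty body and an HTML body both fail at the very first character.
print(decode_error_message(""))
print(decode_error_message("<html></html>"))
```

Logging the raw response text before decoding it is usually the quickest way to see which of these cases applies.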
I'm not sure if you have solved your issue, but here is how I fixed it.
The databricks-cli works with the following config without a problem:
host = https://xxx.databricks.net/?o=<org_id>
token=dapixxx
but MLflow is not quite happy with that. Change it to:
host = https://xxx.databricks.net
username = token
password = dapixxx
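The host change above amounts to keeping only the scheme and host name and dropping the `/?o=<org_id>` suffix. A minimal Python sketch of that normalisation, using only the standard library, could look like the following. The function name `workspace_host` is hypothetical and is not part of MLflow or the databricks-cli.

```python
from urllib.parse import urlsplit


def workspace_host(url: str) -> str:
    """Keep only scheme and host, dropping any path, query or fragment."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"
```

For a URL of the first form, such as `https://xxx.databricks.net/?o=12345`, this returns `https://xxx.databricks.net`, which matches the second form of the config.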

Can't start ocsmanager in OpenChange

I compiled and installed OpenChange, started Samba, and then ran:
make ocsmanager-install
make pyopenchange
make pyopenchange-install
All of these completed without errors, but when I try to run
paster serve /etc/ocsmanager/ocsmanager.ini --pid-file /var/run/ocsmanager.pid --log-file /var/log/ocsmanager.log
it aborts and writes this to the log file:
Traceback (most recent call last):
File "/usr/bin/paster", line 9, in <module>
load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
File "/usr/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
invoke(command, command_name, options, args[1:])
File "/usr/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
exit_code = runner.run(args)
File "/usr/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
result = self.command()
File "/usr/lib/python2.7/site-packages/paste/script/serve.py", line 284, in command
relative_to=base, global_conf=vars)
File "/usr/lib/python2.7/site-packages/paste/script/serve.py", line 321, in loadapp
**kw)
File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
return loadobj(APP, uri, name=name, **kw)
File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
return context.create()
File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 710, in create
return self.object_type.invoke(self)
File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 146, in invoke
return fix_call(context.object, context.global_conf, **context.local_conf)
File "/usr/lib/python2.7/site-packages/paste/deploy/util.py", line 56, in fix_call
val = callable(*args, **kw)
File "/usr/local/samba/lib/python2.7/site-packages/ocsmanager/config/middleware.py", line 43, in make_app
config = load_environment(global_conf, app_conf)
File "/usr/local/samba/lib/python2.7/site-packages/ocsmanager/config/environment.py", line 157, in load_environment
mstore = mapistore.MAPIStore(config['ocsmanager']['main']['mapistore_root'])
SystemError: error in mapistore_init
Removing PID file /var/run/ocsmanager.pid
My /etc/ocsmanager/ocsmanager.conf is the one installed by the make scripts; the only change is the location of mapistore:
mapistore_root = /usr/local/samba/private
mapistore_data = /usr/local/samba/private/mapistore
