Cannot create Jupyter Notebook in HDInsight 4.0

I'm using Azure HDInsight 4.0 (Spark 2.4). When I attempt to create a new Jupyter notebook (Spark, but I get a similar error for PySpark notebooks), I get the following error message:
Traceback (most recent call last):
File "/usr/bin/anaconda/lib/python2.7/site-packages/notebook/base/handlers.py", line 457, in wrapper
result = yield gen.maybe_future(method(self, *args, **kwargs))
File "/usr/bin/anaconda/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
value = future.result()
File "/usr/bin/anaconda/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
raise_exc_info(self._exc_info)
File "/usr/bin/anaconda/lib/python2.7/site-packages/tornado/gen.py", line 1021, in run
yielded = self.gen.throw(*exc_info)
File "/usr/bin/anaconda/lib/python2.7/site-packages/notebook/services/contents/handlers.py", line 216, in post
yield self._new_untitled(path, type=type, ext=ext)
File "/usr/bin/anaconda/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
value = future.result()
File "/usr/bin/anaconda/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
raise_exc_info(self._exc_info)
File "/usr/bin/anaconda/lib/python2.7/site-packages/tornado/gen.py", line 285, in wrapper
yielded = next(result)
File "/usr/bin/anaconda/lib/python2.7/site-packages/notebook/services/contents/handlers.py", line 171, in _new_untitled
model = yield gen.maybe_future(self.contents_manager.new_untitled(path=path, type=type, ext=ext))
File "/usr/bin/anaconda/lib/python2.7/site-packages/notebook/services/contents/manager.py", line 338, in new_untitled
return self.new(model, path)
File "/usr/bin/anaconda/lib/python2.7/site-packages/notebook/services/contents/manager.py", line 364, in new
model = self.save(model, path)
File "/var/lib/.jupyter/jupyterazure/jupyterazure/httpfscontentsmanager.py", line 84, in save
self.create_checkpoint(path)
File "/usr/bin/anaconda/lib/python2.7/site-packages/notebook/services/contents/manager.py", line 459, in create_checkpoint
return self.checkpoints.create_checkpoint(self, path)
File "/usr/bin/anaconda/lib/python2.7/site-packages/notebook/services/contents/checkpoints.py", line 79, in create_checkpoint
model = contents_mgr.get(path, content=True)
File "/var/lib/.jupyter/jupyterazure/jupyterazure/httpfscontentsmanager.py", line 56, in get
'metadata': {}})
File "/var/lib/.jupyter/jupyterazure/jupyterazure/model.py", line 45, in create_model_from_blob
nbformat.version_info[0])
File "/usr/bin/anaconda/lib/python2.7/site-packages/nbformat/__init__.py", line 75, in reads
nb = convert(nb, as_version)
File "/usr/bin/anaconda/lib/python2.7/site-packages/nbformat/converter.py", line 54, in convert
"version doesn't exist" % (to_version))
ValueError: Cannot convert notebook to v5 because that version doesn't exist
After this, a new notebook does appear on the home screen, but if I try to open it I get the following popup message:
An unknown error occurred while loading this notebook. This version can load notebook formats v4 or earlier. See the server log for details.
I can create a notebook just fine on an otherwise-identical HDI 3.6 cluster, but not on 4.0. (I need 4.0 because I need to use Spark 2.4.)
Has anyone experienced/resolved this before?

Recently, we have seen a couple of questions about this same issue. You can follow the steps below to resolve it.
Steps to resolve this issue:
Step 1: Connect to the headnode via SSH and edit the file /usr/bin/anaconda/lib/python2.7/site-packages/nbformat/_version.py, replacing the major version 5 with 4.
Change the version line to:
version_info = (4, 0, 3)
Step 2: Restart the Jupyter service via Ambari.
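To verify the change took effect, a quick sanity check from the same Python environment (a minimal sketch; the expected tuple assumes the edit above):
import nbformat
print(nbformat.version_info)  # should now report (4, 0, 3)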
For more details, refer to the thread "HDInsight: Cannot create Jupyter notebook".
Hope this helps. Do let us know if you have any further queries.

Related

DBT workflow on Databricks fails: AttributeError in object SeedNode

Today our dbt workflow in Databricks failed. The workflow runs as:
dbt run --target workflow --project-dir dbt/projectdir/ --profiles-dir dbt/
Any suggestions as to what could be wrong or how to fix it?
Version reported in Databricks logs:
Running with dbt=1.4.1
The error message below:
'SeedNode' object has no attribute 'depends_on'
09:59:17 Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/main.py", line 135, in main
results, succeeded = handle_and_check(args)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/main.py", line 198, in handle_and_check
task, res = run_from_args(parsed)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/main.py", line 245, in run_from_args
results = task.run()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 454, in run
self._runtime_initialize()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 165, in _runtime_initialize
super()._runtime_initialize()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 94, in _runtime_initialize
self.load_manifest()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/task/runnable.py", line 81, in load_manifest
self.manifest = ManifestLoader.get_full_manifest(self.config)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/manifest.py", line 203, in get_full_manifest
manifest = loader.load()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/manifest.py", line 339, in load
self.parse_project(
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/manifest.py", line 467, in parse_project
parser.parse_file(block)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 425, in parse_file
self.parse_node(file_block)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 386, in parse_node
self.render_update(node, config)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 363, in render_update
self.update_parsed_node_config(node, config, context=context)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/parser/base.py", line 336, in update_parsed_node_config
get_rendered(hook.sql, context, parsed_node, capture_macros=True)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 590, in get_rendered
return render_template(template, ctx, node)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 545, in render_template
return template.render(ctx)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/jinja2/environment.py", line 1301, in render
self.environment.handle_exception()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/jinja2/environment.py", line 936, in handle_exception
raise rewrite_traceback_stack(source=source)
File "", line 1, in top-level template code
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/jinja2/sandbox.py", line 393, in call
return __context.call(__obj, *args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 328, in call
with self.track_call():
File "/usr/lib/python3.9/contextlib.py", line 117, in enter
return next(self.gen)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/dbt/clients/jinja.py", line 319, in track_call
self.node.depends_on.add_macro(unique_id)
AttributeError: 'SeedNode' object has no attribute 'depends_on'
Got the same issue, but I am on Snowflake.
This seems to have been a version issue: per the traceback, SeedNode in dbt 1.4 no longer has a depends_on attribute, which the macro-tracking code touches while rendering seed hooks. Explicitly setting the task to use an older version seems to have solved it:
dbt-core<=1.3.1
dbt-databricks<=1.3.1
This can be set in the Databricks workflow task settings.
I'm not sure which is the last version that would work, but 1.3.1 at least works in our case.
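If you want to confirm which versions actually ended up on the cluster after pinning, a minimal sketch (assuming the packages are installed under these PyPI names):
# Print the installed dbt package versions (importlib.metadata needs Python 3.8+)
from importlib.metadata import version
for pkg in ("dbt-core", "dbt-databricks"):
    print(pkg, version(pkg))  # should report <= 1.3.1 after the pin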

Why am I not able to create a table in SQLyog using PyCharm, while PyCharm is connected to SQLyog?

I was trying to connect to a database and create tables in it using Python. I was using PyCharm for that, and I successfully connected PyCharm to the SQLyog database.
import pymysql

def CreateConn():
    return pymysql.connect(host="localhost", database="myfirstDB", user="root", password="", port=3306)

CreateConn()
but as I was trying to create the table from code, it shows some error lines which I don't understand. I changed the SQL database engine to SQLite and also tried changing the IDE to Jupyter, but it still shows the error and I don't know why.
I tried the code below for table creation in SQLyog:
def CreateTable():
    conn = CreateConn()
    cursor = conn.cursor()  # helping to execute your query
    query = "create table student(sid int primary key auto_increment,name VARCHAR(50),email VARCHAR(50),city VARCHAR(50)"
    cursor.execute(query)
    conn.commit()
    print("table created")
    conn.close()

CreateTable()
I expected the table below to be created in the SQLyog database:
[screenshot: expected result of the above code in SQLyog, via PyCharm]
What I got instead are the following error lines:
Traceback (most recent call last):
File "C:\Users\asus\PycharmProjects\pythonProject\Database\Database.py", line 29, in <module>
CreateTable() #CALLING CREATE TABLE FUNCTION
File "C:\Users\asus\PycharmProjects\pythonProject\Database\Database.py", line 24, in CreateTable
cursor.execute(query)
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\cursors.py", line 148, in execute
result = self._query(query)
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\cursors.py", line 310, in _query
conn.query(q)
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\connections.py", line 548, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\connections.py", line 775, in _read_query_result
result.read()
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\connections.py", line 1156, in read
first_packet = self.connection._read_packet()
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\connections.py", line 725, in _read_packet
packet.raise_for_error()
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\protocol.py", line 221, in raise_for_error
err.raise_mysql_exception(self._data)
File "C:\Users\asus\PycharmProjects\pythonProject\venv\lib\site-packages\pymysql\err.py", line 143, in raise_mysql_exception
raise errorclass(errno, errval)
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1")
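For what it's worth, the 1064 error pointing at '' (the very end of the statement) is consistent with the CREATE TABLE string missing its closing parenthesis. A minimal sketch of the corrected query (column definitions unchanged):
# Note the added ')' that closes the column list
query = "create table student(sid int primary key auto_increment,name VARCHAR(50),email VARCHAR(50),city VARCHAR(50))"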

JupyterLab/Elyra: pipeline run on Kubeflow Pipelines fails with "No host specified" in local deployment

I have Kubeflow Pipelines running in my local environment, along with JupyterLab and the Elyra extensions. I've created a notebook pipeline and configured the runtime configuration, setting api_endpoint to http://localhost:31380/pipeline (with security disabled). When I try to run the pipeline, the following error message is displayed:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/tornado/web.py", line 1703, in _execute
result = await result
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/handlers.py", line 89, in post
response = await PipelineProcessorManager.instance().process(pipeline)
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor.py", line 70, in process
res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
File "/usr/local/Cellar/python#3.8/3.8.6/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor_kfp.py", line 100, in process
raise lve
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor_kfp.py", line 89, in process
client.upload_pipeline(pipeline_path,
File "/usr/local/lib/python3.8/site-packages/kfp/_client.py", line 720, in upload_pipeline
response = self._upload_api.upload_pipeline(pipeline_package_path, name=pipeline_name, description=description)
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 83, in upload_pipeline
return self.upload_pipeline_with_http_info(uploadfile, **kwargs) # noqa: E501
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 177, in upload_pipeline_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 378, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 195, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 421, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/rest.py", line 279, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/rest.py", line 196, in request
r = self.pool_manager.request(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 79, in request
return self.request_encode_body(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 171, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 325, in urlopen
conn = self.connection_from_host(u.host, port=u.port, scheme=u.scheme)
File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 231, in connection_from_host
raise LocationValueError("No host specified.")
urllib3.exceptions.LocationValueError: No host specified.
The root cause is an issue in the Kubeflow Pipelines kfp package version 1.0.0 that is distributed with Elyra v1.4.1 (and lower). To work around the issue, replace localhost with 127.0.0.1 in your runtime configuration, e.g. http://127.0.0.1:31380/pipeline.
Edit: With the availability of Elyra v1.5+, which requires a more recent version of the kfp package, you can also upgrade Elyra to resolve the issue.
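If you want to confirm the endpoint works outside Elyra, a minimal sketch using the kfp client directly (assuming Kubeflow Pipelines is exposed on port 31380 with security disabled):
import kfp

# Connecting via 127.0.0.1 sidesteps the localhost-handling bug in kfp 1.0.0
client = kfp.Client(host="http://127.0.0.1:31380/pipeline")
print(client.list_pipelines())  # should return pipeline metadata rather than raising LocationValueError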

RobotFramework RIDE not opening

I have installed robotframework and robotframework-ride using pip. All the other required components are also updated. I am using Python 3.7.6 and Windows 10.
When I run RIDE, I get the following error:
Traceback (most recent call last):
File "C:\Python37-32\lib\site-packages\robotide\application\application.py", line 62, in OnInit
self._plugin_loader.enable_plugins()
File "C:\Python37-32\lib\site-packages\robotide\application\pluginloader.py", line 43, in enable_plugins
p.enable_on_startup()
File "C:\Python37-32\lib\site-packages\robotide\application\pluginconnector.py", line 52, in enable_on_startup
self.enable()
File "C:\Python37-32\lib\site-packages\robotide\application\pluginconnector.py", line 57, in enable
self._plugin.enable()
File "C:\Python37-32\lib\site-packages\robotide\recentfiles\recentfiles.py", line 44, in enable
self._add_recent_files_to_menu()
File "C:\Python37-32\lib\site-packages\robotide\recentfiles\recentfiles.py", line 114, in _add_recent_files_to_menu
self.register_action(action)
File "C:\Python37-32\lib\site-packages\robotide\pluginapi\plugin.py", line 204, in register_action
action = self.__frame.actions.register_action(action_info)
File "C:\Python37-32\lib\site-packages\robotide\ui\mainframe.py", line 751, in register_action
self._menubar.register(action)
File "C:\Python37-32\lib\site-packages\robotide\ui\actiontriggers.py", line 60, in register
menu.add_menu_item(action)
File "C:\Python37-32\lib\site-packages\robotide\ui\actiontriggers.py", line 98, in add_menu_item
menu_item = self._construct_menu_item(action)
File "C:\Python37-32\lib\site-packages\robotide\ui\actiontriggers.py", line 107, in _construct_menu_item
menu_item = self._create_menu_item(action)
File "C:\Python37-32\lib\site-packages\robotide\ui\actiontriggers.py", line 139, in _create_menu_item
pos = action.get_insertion_index(self.wx_menu)
File "C:\Python37-32\lib\site-packages\robotide\action\action.py", line 40, in get_insertion_index
return self._insertion_point.get_index(menu)
File "C:\Python37-32\lib\site-packages\robotide\action\actioninfo.py", line 286, in get_index
index = self._find_position_in_menu(menu)
File "C:\Python37-32\lib\site-packages\robotide\action\actioninfo.py", line 296, in _find_position_in_menu
if self._get_menu_item_name(item).lower() == self._item.lower():
File "C:\Python37-32\lib\site-packages\robotide\action\actioninfo.py", line 301, in _get_menu_item_name
return self._shortcut_remover.split(item.GetLabel())[0]
AttributeError: 'MenuItem' object has no attribute 'GetLabel'
OnInit returned false, exiting...
Error in atexit._run_exitfuncs:
wx._core.wxAssertionError: C++ assertion "GetEventHandler() == this" failed at ..\..\src\common\wincmn.cpp(475) in wxWindowBase::~wxWindowBase(): any pushed event handlers must have been removed
I can't tell whether I am doing something wrong. I have installed RIDE previously on this PC and on others as well, and this is the first time I am running into this error.
Kindly help.
To use the newest wxPython version, 4.1.0, you will have to install the current development version of RIDE (2.0b1.dev1) from source code. Otherwise, you should downgrade wxPython to version 4.0.7.post2.
See project page at https://github.com/robotframework/RIDE
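To check which wxPython build you are running, a quick sketch (MenuItem.GetLabel appears to have been removed in wxPython 4.1, which is what the AttributeError above points to):
import wx
print(wx.version())  # RIDE releases before 2.0b1 need wxPython 4.0.7.post2 or earlier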

MLFlow Projects throw JSONDecode error when run

I'm trying to get MLflow Projects to run using the MLflow CLI, and following the tutorial leads to an error. For any project I try to run from the CLI, I get the following error:
Traceback (most recent call last):
File "/home/rbc/.local/bin/mlflow", line 11, in <module>
sys.exit(cli())
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/cli.py", line 139, in run
run_id=run_id,
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 230, in run
storage_dir=storage_dir, block=block, run_id=run_id)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 88, in _run
active_run = _create_run(uri, experiment_id, work_dir, entry_point)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 579, in _create_run
active_run = tracking.MlflowClient().create_run(experiment_id=experiment_id, tags=tags)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/tracking/client.py", line 101, in create_run
source_version=source_version
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/store/rest_store.py", line 156, in create_run
response_proto = self._call_endpoint(CreateRun, req_body)
File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/store/rest_store.py", line 66, in _call_endpoint
js_dict = json.loads(response.text)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Here's an example of the type of command I'm using to start the run, which comes directly from the tutorial:
mlflow run https://github.com/mlflow/mlflow#examples/sklearn_elasticnet_wine -m databricks -c cluster-spec.json --experiment-id 72647065958042 -P alpha=2.0 -P l1_ratio=0.5
I've traced the error to MLflow receiving an empty response when it tries to start a run. However, I can successfully run MLflow experiments in the Databricks environment I'm connecting to, so I'm not sure where the problem is. I'm running MLflow 0.9.1 on Ubuntu 18.04.
Not sure if you have solved your issue, but here is how I fixed it.
The databricks-cli works without problems with the following config (typically in ~/.databrickscfg):
host = https://xxx.databricks.net/?o=<org_id>
token = dapixxx
But MLflow is not quite happy with that; change it to:
host = https://xxx.databricks.net
username = token
password = dapixxx
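Once the config is updated, a minimal sketch to check that MLflow can reach the Databricks tracking server (assuming the profile above is your default one):
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("databricks")
client = MlflowClient()
print(client.list_experiments())  # should list experiments instead of choking on an empty response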
