My ElasticsearchDocumentStore() is not initialized - python-3.x

from haystack.document_stores import ElasticsearchDocumentStore
doc_store = ElasticsearchDocumentStore(host='localhost', username='', password='', index='document')
I'm trying to build a question answering system and I'm using the Haystack document store to index the documents, but I'm not able to initialize the ElasticsearchDocumentStore module from Haystack. While trying to create a new index in the Elasticsearch document store, it throws an error saying: "Mapping definition for [embedding] has unsupported parameters"
WARNING - elasticsearch - PUT http://localhost:9200/document [status:400 request:0.008s]
---------------------------------------------------------------------------
RequestError Traceback (most recent call last)
File ~\anaconda3\lib\site-packages\elasticsearch\connection\base.py:315, in Connection._raise_error(self, status_code, raw_data)
312 except (ValueError, TypeError) as err:
313 logger.warning("Undecodable raw error response from server: %s", err)
--> 315 raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
316 status_code, error_message, additional_info
317 )
RequestError: RequestError(400, 'mapper_parsing_exception', 'Mapping definition for [embedding] has unsupported parameters: [dims : 768]')

Dense vector support was introduced to Elasticsearch in version 7.0 as a preview and in version 7.3 as a released feature (see https://www.elastic.co/blog/text-similarity-search-with-vectors-in-elasticsearch). You'd need to upgrade your Elasticsearch instance in order to use it with Haystack if you're running an older version.
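As a quick sanity check before initializing the document store, you can ask the cluster for its version (a minimal sketch, assuming Elasticsearch is reachable at localhost:9200 without authentication):
import requests

# dense_vector fields (used for the [embedding] mapping) need Elasticsearch >= 7.3.
info = requests.get("http://localhost:9200").json()
major, minor = (int(x) for x in info["version"]["number"].split(".")[:2])
if (major, minor) < (7, 3):
    raise RuntimeError("Upgrade Elasticsearch: dense_vector requires version 7.3 or later.")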

Related

Encountered an internal AutoML error - ClientException: Message: No objects to concatenate

I am trying to implement hierarchical time series forecasting on Azure AutoML pipelines.
I followed this notebook for the implementation:
https://github.com/Azure/azureml-examples/blob/main/v1/python-sdk/tutorials/automl-with-azureml/forecasting-hierarchical-timeseries/auto-ml-forecasting-hierarchical-timeseries.ipynb
When I ran the training pipeline on a compute instance it worked, but when I run the same pipeline on a compute cluster it breaks at the hts-proportion-calculation step.
This is the error I am getting:
system error:
Encountered an internal AutoML error. Error Message/Code: ClientException. Additional Info: ClientException:
      Message: No objects to concatenate
      InnerException: None
      ErrorResponse
{
"error": {
"message": "No objects to concatenate"
}
}
logs :
Loading arguments for scenario proportions-calculation
adding argument --input-medatadata
adding argument --hts-graph
adding argument --enable-event-logger
Input arguments dict is {'--input-medatadata': '/mnt/azureml/cr/j/85509be625484b6caa3c1d97b7ab2e33/cap/data-capability/wd/INPUT_automl_training_workspaceblobstore/azureml/17ca5ae7-7269-4246-888f-e781071e3f5c/automl_training', '--hts-graph': '/mnt/azureml/cr/j/85509be625484b6caa3c1d97b7ab2e33/cap/data-capability/wd/INPUT_hts_graph_workspaceblobstore/azureml/a2c1b15a-c895-41e8-b6a6-1ca37ebe9e77/hts_graph', '--enable-event-logger': None}
Unknown file to proceed outputs.txt
processing: outputs.txt with type None.
Cleaning up all outstanding Run operations, waiting 300.0 seconds
3 items cleaning up...
Cleanup took 0.001676321029663086 seconds
Traceback (most recent call last):
File "proportions_calculation_wrapper.py", line 47, in <module>
runtime_wrapper.run()
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_many_models/automl_pipeline_step_wrapper.py", line 63, in run
self._run()
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 44, in _run
proportions_calculation(self.arguments_dict, self.event_logger, script_run=self.step_run)
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 173, in proportions_calculation
proportion_files_list, forecasting_parameters.time_column_name, graph.label_column_name
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 92, in calculate_time_agg_sum_for_all_files
df = pd.concat(pool.map(concat_func, files_batches), ignore_index=True)
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in concat
sort=sort,
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 351, in __init__
raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
Please let me know how I can resolve this issue.
This error occurred because the iteration timeout was not less than the experiment timeout; the system error and logs are somewhat misleading:
df = pd.concat(pool.map(concat_func, files_batches), ignore_index=True)
The logs point at pandas' "No objects to concatenate".
The error can be avoided by setting the iteration timeout to less than the experiment timeout. I had set iteration_timeout_minutes=60, which caused the error.
automl_settings = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    experiment_timeout_hours=1,
    label_column_name=label_column_name,
    track_child_runs=False,
    forecasting_parameters=forecasting_parameters,
    pipeline_fetch_max_batch_size=15,
    model_explainability=model_explainability,
    n_cross_validations="auto",  # Feel free to set to a small integer (>=2) if runtime is an issue.
    cv_step_size="auto",
    # The following settings are specific to this sample and should be adjusted according to your own needs.
    iteration_timeout_minutes=10,  # must be less than the experiment timeout (60 minutes here)
    iterations=15,
)
We are able to run the sample successfully using a compute cluster configured as shown below.
from azureml.core.compute import ComputeTarget, AmlCompute

# Name your cluster
compute_name = "hts-compute"

if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and type(compute_target) is AmlCompute:
        print("Found compute target: " + compute_name)
else:
    print("Creating a new compute target...")
    provisioning_config = AmlCompute.provisioning_configuration(
        vm_size="STANDARD_D16S_V3", max_nodes=20
    )
    # Create the compute target
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)

    # Can poll for a minimum number of nodes and for a specific timeout.
    # If no min node count is provided it will use the scale settings for the cluster.
    compute_target.wait_for_completion(
        show_output=True, min_node_count=None, timeout_in_minutes=20
    )

    # For a more detailed view of current cluster status, use the 'status' property.
    print(compute_target.status.serialize())

ClientError: An error occurred (InvalidTextEncoding) when calling the SelectObjectContent operation: UTF-8 encoding is required. reading gzip file

I am getting the above error in my code. I think encoding='latin-1' needs to be included as a parameter somewhere in select_object_content, but since I am new to this, I am not sure where to add it.
Can anyone help me with this?
Code:
import boto3

client = boto3.client(
    's3',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)
resp = client.select_object_content(
    Bucket='mybucket',
    Key='path_to_file/file_name.gz',
    ExpressionType='SQL',
    Expression=query,
    InputSerialization={'CSV': {"FileHeaderInfo": "Use"}, 'CompressionType': compressionType},
    OutputSerialization={'CSV': {}},
)
Traceback:
ClientError Traceback (most recent call last)
C:\path/3649752754.py in <module>
78 Expression=SQL,
79 InputSerialization = {'CSV': {"FileHeaderInfo": "Use"}, 'CompressionType': compression},
---> 80 OutputSerialization = {'CSV': {}},
81 )
82
ClientError: An error occurred (InvalidTextEncoding) when calling the SelectObjectContent operation: UTF-8 encoding is required. The text encoding error was found near byte 90,112.
You need to save your CSV file with UTF-8 encoding, for example with Notepad++, or in Excel via Save As and selecting "CSV UTF-8" from the file-type dropdown.
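If re-saving files by hand doesn't scale, here is a minimal sketch of re-encoding the gzipped CSV in Python before uploading, assuming the source really is Latin-1 (file names are placeholders):
import gzip

# Stream the Latin-1 source and rewrite it as UTF-8 line by line,
# so the whole file never has to fit in memory.
with gzip.open("file_name.gz", "rt", encoding="latin-1") as src, \
        gzip.open("file_name_utf8.gz", "wt", encoding="utf-8") as dst:
    for line in src:
        dst.write(line)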

ValueError: Unsupported 'device_type'

I am new to Python.
I have this code in Python:
from netmiko import ConnectHandler

sshCli = ConnectHandler(
    device_type='Cisco_ios',
    host='192.168.56.101',
    port=22,
    username='cisco',
    password='cisco123!'
)
output = sshCli.send_command("show ip int brief")
print("show ip int brief:\n{}\n".format(output))
I get the following error:
========== RESTART: C:/Users/edanpc/AppData/Local/Programs/Python/Python38-32/Lab2.2.py =========
Traceback (most recent call last):
File "C:/Users/edanpc/AppData/Local/Programs/Python/Python38-32/Lab2.2.py", line 2, in <module>
sshCli = ConnectHandler(
File "C:\Users\edanpc\AppData\Local\Programs\Python\Python38-32\lib\site-packages\netmiko\ssh_dispatcher.py", line 297, in ConnectHandler
raise ValueError(
ValueError: Unsupported 'device_type' currently supported platforms are:
a10
accedian
adtran_os
alcatel_aos
alcatel_sros
apresia_aeos
arista_eos
aruba_os
avaya_ers
avaya_vsp
broadcom_icos
brocade_fastiron
brocade_netiron
brocade_nos
brocade_vdx
brocade_vyos
calix_b6
centec_os
checkpoint_gaia
ciena_saos
cisco_asa
cisco_ios
cisco_nxos
cisco_s300
cisco_tp
cisco_wlc
cisco_xe
cisco_xr
cloudgenix_ion
coriant
dell_dnos9
dell_force10
dell_isilon
dell_os10
dell_os6
dell_os9
dell_powerconnect
dlink_ds
eltex
eltex_esr
endace
enterasys
extreme
extreme_ers
extreme_exos
extreme_netiron
extreme_nos
extreme_slx
extreme_vdx
extreme_vsp
extreme_wing
f5_linux
f5_ltm
f5_tmsh
flexvnf
fortinet
generic
generic_termserver
hp_comware
hp_procurve
huawei
huawei_olt
huawei_smartax
huawei_vrpv8
ipinfusion_ocnos
juniper
juniper_junos
juniper_screenos
keymile
keymile_nos
linux
mellanox
mellanox_mlnxos
mikrotik_routeros
mikrotik_switchos
mrv_lx
mrv_optiswitch
netapp_cdot
netgear_prosafe
netscaler
nokia_sros
oneaccess_oneos
ovs_linux
paloalto_panos
pluribus
quanta_mesh
rad_etx
raisecom_roap
ruckus_fastiron
ruijie_os
sixwind_os
sophos_sfos
ubiquiti_edge
ubiquiti_edgeswitch
ubiquiti_unifiswitch
vyatta_vyos
vyos
watchguard_fireware
yamaha
zte_zxros
>>>
What is wrong with my code?
This error is specific to netmiko: the error text lists the supported device_type values. To solve the error, fix the following in your code.
device_type='cisco_ios'
The device type should be lowercase cisco_ios instead of Cisco_ios.
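For completeness, the corrected snippet from the question (host and credentials are the question's own placeholders):
from netmiko import ConnectHandler

sshCli = ConnectHandler(
    device_type='cisco_ios',  # must match one of the supported platform strings exactly
    host='192.168.56.101',
    port=22,
    username='cisco',
    password='cisco123!'
)
output = sshCli.send_command("show ip int brief")
print("show ip int brief:\n{}\n".format(output))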

Bazel Error After Upgrading Nodejs Rules - ERROR: defs.bzl has been removed from build_bazel_rules_nodejs

After upgrading build_bazel_rules_nodejs from 0.42.2 to 1.0.1 I get this error:
ERROR: /home/flolu/.cache/bazel/_bazel_flolu/698f7adad10ea020bcdb85216703ce08/external/build_bazel_rules_nodejs/defs.bzl:19:5: Traceback (most recent call
last):
File "/home/flolu/Desktop/minimal-bazel-monorepo/services/server/src/BUILD", line 76
nodejs_image(name = "server", <2 more arguments>)
File "/home/flolu/.cache/bazel/_bazel_flolu/698f7adad10ea020bcdb85216703ce08/external/io_bazel_rules_docker/nodejs/image.bzl", line 112, in nodejs_image
nodejs_binary(name = binary, <2 more arguments>)
File "/home/flolu/.cache/bazel/_bazel_flolu/698f7adad10ea020bcdb85216703ce08/external/build_bazel_rules_nodejs/defs.bzl", line 19, in nodejs_binary
fail(<1 more arguments>)
ERROR: defs.bzl has been removed from build_bazel_rules_nodejs
Please update your load statements to use index.bzl instead.
See https://github.com/bazelbuild/rules_nodejs/wiki#migrating-off-build_bazel_rules_nodejsdefsbzl for help.
ERROR: error loading package 'services/server/src': Package 'services/server/src' contains errors
INFO: Elapsed time: 0.119s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (1 packages loaded)
FAILED: Build did NOT complete successfully (1 packages loaded)
Line 76 in the error refers to this part of the BUILD file:
load("#io_bazel_rules_docker//nodejs:image.bzl", "nodejs_image")
nodejs_image(
name = "server",
data = [":lib"],
entry_point = ":index.ts",
)
But there is no defs.bzl there, so I am confused by the error.
In detail, I have upgraded from
http_archive(
    name = "build_bazel_rules_nodejs",
    sha256 = "16fc00ab0d1e538e88f084272316c0693a2e9007d64f45529b82f6230aedb073",
    urls = ["https://github.com/bazelbuild/rules_nodejs/releases/download/0.42.2/rules_nodejs-0.42.2.tar.gz"],
)
to
http_archive(
    name = "build_bazel_rules_nodejs",
    sha256 = "e1a0d6eb40ec89f61a13a028e7113aa3630247253bcb1406281b627e44395145",
    urls = ["https://github.com/bazelbuild/rules_nodejs/releases/download/1.0.1/rules_nodejs-1.0.1.tar.gz"],
)
You can recreate the error by cloning this repo: https://github.com/flolude/minimal-bazel-monorepo/tree/48add7ddcad4d25e361e1c7f7f257cf916a797b2 and running
bazel test //services/server/src:test
There are some breaking changes between those versions of build_bazel_rules_nodejs. Namely, this import path:
load("@build_bazel_rules_nodejs//:defs.bzl", <whatever>)
needs to become this:
load("@build_bazel_rules_nodejs//:index.bzl", <whatever>)
You also need to update your io_bazel_rules_docker to at least v0.13.0; judging by the release notes, that is the version compatible with rules_nodejs 1.0.1. https://github.com/bazelbuild/rules_docker/releases/
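A sketch of the corresponding WORKSPACE entry (the sha256 and the exact archive URL are placeholders; take the real values from the v0.13.0 entry on the releases page above):
http_archive(
    name = "io_bazel_rules_docker",
    sha256 = "<sha256 from the v0.13.0 release page>",
    urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.13.0/rules_docker-v0.13.0.tar.gz"],
)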

WebDriverException: Message: unknown error: bad inspector message error while printing HTML content using ChromeDriver Chrome through Selenium Python

I am scraping some HTML content:
for i, c in enumerate(cards[75:77]):
    print(i)
    a = c.find_element_by_class_name("influencer-stagename")
    print(a.get_attribute('innerHTML'))
It works fine for all records except the 76th one. Output before the error:
0
b'<a class="influencer-analytics-link" href="/influencers/sophiewilling"><h5><span>SOPHIE WILLING</span></h5></a>'
1
b'<a class="influencer-analytics-link" href="/influencers/ferntaylorr"><h5><span>Fern Taylor.</span></h5></a>'
2
b'<a class="influencer-analytics-link" href="/influencers/officialshaniceslatter"><h5><span>Shanice Slatter</span></h5></a>'
3
Stacktrace...
---------------------------------------------------------------------------
WebDriverException                       Traceback (most recent call last)
<ipython-input-484-0a80d1af1568> in <module>
3 #print(c.find_element_by_class_name("influencer-stagename").text)
4 a = c.find_element_by_class_name("influencer-stagename")
----> 5 print(a.get_attribute('innerHTML').encode('ascii', 'ignore'))
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py in get_attribute(self, name)
141 self, name)
142 else:
--> 143 resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
144 attributeValue = resp.get('value')
145 if attributeValue is not None:
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py in _execute(self, command, params)
631 params = {}
632 params['id'] = self._id
--> 633 return self._parent.execute(command, params)
634
635 def find_element(self, by=By.ID, value=None):
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py in execute(self, driver_command, params)
319 response = self.command_executor.execute(driver_command, params)
320 if response:
--> 321 self.error_handler.check_response(response)
322 response['value'] = self._unwrap_value(
323 response.get('value', None))
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
240 alert_text = value['alert'].get('text')
241 raise exception_class(message, screen, stacktrace, alert_text)
--> 242 raise exception_class(message, screen, stacktrace)
243
244 def _value_or_default(self, obj, key, default):
WebDriverException: Message: unknown error: bad inspector message: {"id":110297,"result":{"result":{"type":"object","value":{"status":0,"value":"<a class=\"influencer-analytics-link\" href=\"/influencers/bookishemily\"><h5><span>Emily | 18 | GB | Student\uD83C...</span></h5></a>"}}}} (Session info: chrome=75.0.3770.100) (Driver info: chromedriver=2.40.565386 (45a059dc425e08165f9a10324bd1380cc13ca363),platform=Mac OS X 10.14.0 x86_64)
I suspect there is an invalid character in
value":"Emily | 18 | GB | Student\uD83C..."
Specifically, I suspect "\uD83C".
Adding
.encode("utf-8") OR .encode('ascii', 'ignore')
to the second print statement changes nothing.
Any thoughts on how to solve this?
UPDATE: The problem is with emoji characters. I have found 3 examples so far, and each has an emoji (pink flower 🌸, Russian flag 🇷🇺 and swirling leaves 🍃). If I edit them out with the Chrome inspector my code runs fine, but this is not a solution that works at scale.
This error message...
WebDriverException: Message: unknown error: bad inspector message: {"id":110297,"result":{"result":{"type":"object","value":{"status":0,"value":"<a class=\"influencer-analytics-link\" href=\"/influencers/bookishemily\"><h5><span>Emily | 18 | GB | Student\uD83C...</span></h5></a>"}}}} (Session info: chrome=75.0.3770.100) (Driver info: chromedriver=2.40.565386 (45a059dc425e08165f9a10324bd1380cc13ca363),platform=Mac OS X 10.14.0 x86_64)
...implies that ChromeDriver was unable to parse some non-UTF-8 characters due to a JSON encoding/decoding issue.
Deep Dive
As per the discussion in Issue 723592: 'Bad inspector message' errors when running URL web-platform-tests via webdriver John Chen (Owner - WebDriver for Google Chrome) in his comment mentioned:
A JSON encoding/decoding issue caused the "Bad inspector message" error reported at https://travis-ci.org/w3c/web-platform-tests/jobs/232845351. Part of the error message from part 1 contains an invalid Unicode character \uFDD0 (from https://github.com/w3c/web-platform-tests/blob/34435a4/url/urltestdata.json#L3596). The JSON encoder inside Chrome didn't detect such error, and passed it through in the JSON blob sent to ChromeDriver. ChromeDriver uses base/json/json_parser.cc to parse the JSON string. This parser does a more thorough error detection, notices that \uFDD0 is an invalid character, and reports an error. I think our JSON encoder and decoder should have exactly the same amount of error checking. It's problematic that the encoder can create a blob that is rejected by the decoder.
Analysis
John Chen (Owner - WebDriver for Google Chrome) further added:
The JSON encoding happens in protocol layout of DevTools, just before the result is sent back to ChromeDriver. The relevant code is in https://cs.chromium.org/chromium/src/out/Debug/gen/v8/src/inspector/protocol/Protocol.cpp. In particular, escapeStringForJSON function is responsible for encoding strings. It's actually quite conservative. Anything above 126 is encoded in \uXXXX format. (Note that Protocol.cpp is a generated file. The real source is https://cs.chromium.org/chromium/src/v8/third_party/inspector_protocol/lib/Values_cpp.template.)
The error occurs in the JSON parser used by ChromeDriver. The decoding of \uXXXX sequence happens at https://cs.chromium.org/chromium/src/base/json/json_parser.cc?l=564 and https://cs.chromium.org/chromium/src/base/json/json_parser.cc?l=670. After decoding an escape sequence, the decoder rejects anything that's not a valid Unicode character.
I noticed that there was a recent change to prevent a JSON encoder from emitting invalid Unicode code point: https://crrev.com/478900. Unfortunately it's not the JSON encoder used by the code involved in this bug, so it doesn't help us directly, but it's an indication that we're not the only ones affected by this type of issue.
Solution
This issue was addressed by replacing invalid UTF-16 escape sequences when decoding invalid UTF strings in ChromeDriver (web platform tests may use ECMAScript strings, which aren't necessarily valid UTF-16) through this revision / commit.
So a quick solution would be to ensure the following and re-execute your tests:
Selenium is upgraded to the current level, Version 3.141.59.
ChromeDriver is updated to the current ChromeDriver v79.0.3945.36 level.
Chrome is updated to the current Chrome Version 79.0 level (as per the ChromeDriver v79.0 release notes).
Alternative
As an alternative you can use the GeckoDriver / Firefox combination; you can find a relevant discussion in Chromedriver only supports characters in the BMP error while sending Emoji with ChromeDriver Chrome using Selenium Python to Tkinter's label() textbox
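A minimal sketch of that alternative (an assumption, not part of the original answer): it uses the same Selenium 3 style API as the question and assumes geckodriver is on the PATH; the URL is a placeholder and the class name is taken from the question.
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://example.com/influencers")  # placeholder URL
# GeckoDriver does not route element values through Chrome's DevTools JSON
# layer, so non-BMP characters such as emoji survive the round trip.
for i, card in enumerate(driver.find_elements_by_class_name("influencer-stagename")):
    print(i, card.get_attribute("innerHTML"))
driver.quit()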
