pyflink Unsupported Python SqlFunction CAST when working with amazon-kinesis-sql-connector and udtf function - python-3.x

I am currently trying to get PyFlink running with the AWS Kinesis SQL connector.
I use the Table API and can read from Kinesis and also write back to another Kinesis stream. As soon as I use a udtf-decorated function, I get the following exception:
File "/home/user/anaconda3/envs/flink-env/lib/python3.8/site-packages/pyflink/table/table_environment.py", line 828, in execute_sql
return TableResult(self._j_tenv.executeSql(stmt))
File "/home/user/anaconda3/envs/flink-env/lib/python3.8/site-packages/py4j/java_gateway.py", line 1321, in __call__
return_value = get_return_value(
File "/home/user/anaconda3/envs/flink-env/lib/python3.8/site-packages/pyflink/util/exceptions.py", line 158, in deco
raise java_exception
pyflink.util.exceptions.TableException: org.apache.flink.table.api.TableException: Unsupported Python SqlFunction CAST.
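Here are the core snippets of the script:
# Imports assumed by the snippet (not shown in the original post)
from pyflink.common import Row
from pyflink.table import DataTypes
from pyflink.table.udf import udtf
@udtf(result_types=[DataTypes.STRING(), DataTypes.INT()])
def flatten_row(row: Row) -> Row:
    # Emit one output row per entry in the "members" array
    for s in row["members"]:
        yield Row(str(s["id"]), s["name"])
result_table = input_table.flat_map(flatten_row).alias("id", "name")
table_env.create_temporary_view("result_table", result_table)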
As soon as I execute it on the stream, the exception is raised:
table_result = table_env.execute_sql(f"INSERT INTO {output_table_name} SELECT * FROM result_table")
The output_table and input_table are connected to Kinesis streams, and without the udtf function it works.
Environment
apache-flink==1.16.0 and Python 3.8. I have already tried different versions of apache-flink and the amazon-kinesis-sql-connector, in both Conda and pip environments.
Thank you!

Finally, I found out that the problem was the JDK version pre-installed on my macOS machine. I downgraded step by step from 15.0.2 until I reached 11.0.16, which finally worked without any error. So it seems that the Python apache-flink package needs an older JDK version.
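A minimal sketch of how the JDK version could be checked before starting a PyFlink job. The function names and the KNOWN_GOOD_MAJORS set are hypothetical; the set contains only 11 because 11.0.16 is the version reported to work above, and that is an assumption rather than an official compatibility list.

```python
import re

# Assumption: only JDK 11 is treated as known-good, because 11.0.16 is the
# version reported to work above. Adjust the set for your own setup.
KNOWN_GOOD_MAJORS = {11}


def java_major_version(version: str) -> int:
    """Extract the major version from a JDK version string.

    Handles the modern scheme ("11.0.16" -> 11) and the legacy
    "1.x" scheme ("1.8.0_292" -> 8).
    """
    parts = re.findall(r"\d+", version)
    if not parts:
        raise ValueError(f"cannot parse JDK version: {version!r}")
    major = int(parts[0])
    if major == 1 and len(parts) > 1:
        major = int(parts[1])
    return major


def is_known_good(version: str) -> bool:
    """Return True if the major version is in the known-good set."""
    return java_major_version(version) in KNOWN_GOOD_MAJORS


print(is_known_good("15.0.2"))   # False
print(is_known_good("11.0.16"))  # True
```

The version string itself can be read from the output of java -version, which the JDK prints to stderr rather than stdout.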

Python xlwings: EventError: Command failed: Parameter error. (-50)

I would like to use Python to run an Excel macro, so I tried the xlwings package.
My laptop runs macOS Catalina (10.15.7), my IDE is PyCharm (2021.2.3), my Python version is 3.8.8 with Anaconda (22.11.1) providing the interpreter, and my Excel version is 16.66.1 (Microsoft Excel for Mac).
When I first used this package, I hit the error "Command Error -1743: The User has declined permission", which I solved by installing an old version of the IDE and running my code with it. (This is NOT the question I want to ask, but I'm not sure whether it is related to the issue I faced, so I attached my automation privacy settings to the original post. I have since uninstalled the old version of the IDE.)
I would like to run an existing macro (called Hi) in an existing Excel file (called [StakeOverflow]HelloWorld.xlsm) from Python (MY_PYTHON_FILE.py below).
My Excel file and my Python code are stored on OneDrive. My macro code is shown below:
Sub Hi()
    MsgBox "Good morning!"
End Sub
My python code was shown below:
import xlwings as xw
import time
wb = xw.Book('/Users/<MY NAME>/OneDrive/MY PATH DETAILS/[StakeOverflow]HelloWorld.xlsm')
time.sleep(10)
app = wb.app
macro_vba = app.macro("Hi")
macro_vba()
The code looks really simple, but I still got an error. Excel did not open automatically, and afterwards I could not even open the Excel file manually. The error is shown below:
/Users/.../.conda/envs/Program/bin/python "/Users/.../OneDrive/.../MY_PYTHON_FILE.py"
Traceback (most recent call last):
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/main.py", line 4914, in open
impl= self.impl(name)
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/_xlmac.py", line 366, in __call__
raise KeyError(name_or_index)
KeyError: '[stakeoverflow] helloworld.xlsm'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/aeosa/appscript/reference.py", line 482, in __call__
return self.AS_appdata.target().event(self._code, params, atts, codecs=self.AS_appdata).send(timeout, sendflags)
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/aeosa/aem/aemsend.py", line 92, in send
raise EventError(errornum, errormsg, eventresult)
aem.aemsend.EventError: Command failed: Parameter error. (-50)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/.../OneDrive/.../MY_PYTHON_FILE.py", line 18, in <module>
wb = xw.Book('/Users/.../OneDrive/.../[stakeoverflow] helloworld.xlsm')
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/main.py", line 876, in __init__
impl= app.books.open(
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/main.py", line 4921, in open
impl = self.impl.open(
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/_xlmac.py", line 420, in open
self.app.xl.open_workbook(
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/aeosa/appscript/reference.py", line 518, in __call__
raise CommandError(self, (args, kargs), e, self.AS_appdata) from e
appscript.reference.CommandError: Command failed:
OSERROR: -50
MESSAGE: Parameter error.
COMMAND: app(pid=1647).open_workbook (workbook_file_name='/users/.../onedrive/.../[stakeoverflow] helloworld.xlsm', update_links=k.do_not_update_links, read_only=None, format=None, password=None, write_reserved_password=None, ignore_read_only_recommended=None, origin=None, delimiter=None, editable=None, notify=None, converter=None, add_to_mru=None, timeout=-1)
Process finished with exit code 1
I also tried running my Python code from the terminal, but that did not solve the problem either.
When I then tried to open the Excel file manually, the following error was shown:
Excel cannot open the file ’[StakeOverflow]HelloWorld.xlsm’
because the file format or file extension is not valid. Verify
that the file has not been corrupted and that the file
extension matches the format of the file.
I tried to Google this error, but found few solutions. It seems to be related to the external storage location: when I created another Excel file with the same name and macro code on my desktop and tried again, the error disappeared. (However, our company stores its files on OneDrive, so I would like to be able to use the file there.)
Has anyone here faced this situation before?
I found the answer myself today. The issue is related to the file name rather than to permissions.
As we can see, the error is KeyError: '[stakeoverflow] helloworld.xlsm', which suggests the problem lies in the file name. (Maybe the system could not find an Excel file with this name.)
I changed the name from [stakeoverflow] helloworld.xlsm to helloworld.xlsm, and the error was gone. It seems that when xlwings opens an Excel file, it checks the validity of the file name and converts the name to lowercase. If the file name contains special characters that are not allowed (e.g., "[]"), the error occurs.
Note that I could store the Excel file with these special characters on my laptop and on OneDrive, but xlwings did not accept them.
I hope this is helpful to those who face this issue when using xlwings!
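A minimal sketch of a pre-flight check based on the finding above. The helper names and the example path are hypothetical, and the set of flagged characters contains only "[" and "]" because those are the ones reported here; other characters may or may not cause the same problem.

```python
from pathlib import PurePosixPath

# Assumption: only "[" and "]" are treated as problematic, because those are
# the characters reported above. Other characters may need adding.
PROBLEM_CHARS = "[]"


def has_problem_chars(path: str) -> bool:
    """Return True if the file name (not its directories) contains a flagged character."""
    name = PurePosixPath(path).name
    return any(ch in name for ch in PROBLEM_CHARS)


def safe_name(path: str) -> str:
    """Return the file name with flagged characters removed and outer whitespace trimmed."""
    name = PurePosixPath(path).name
    cleaned = "".join(ch for ch in name if ch not in PROBLEM_CHARS)
    return cleaned.strip()


# Hypothetical path, used for illustration only.
workbook = "/Users/me/OneDrive/[StakeOverflow]HelloWorld.xlsm"
print(has_problem_chars(workbook))  # True
print(safe_name(workbook))          # StakeOverflowHelloWorld.xlsm
```

Renaming or copying the workbook to the cleaned name before passing it to xw.Book would then avoid the bracket characters.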

ElasticSearch error: 'The client noticed that the server is not a supported distribution of Elasticsearch'

New to ElasticSearch. I was following this guide to get things set up: https://john.soban.ski/boto3-ec2-to-amazon-elasticsearch.html
I ran the "connect_to_es.py" script from it, and oddly it worked the first time, but in subsequent runs it started throwing this error:
Traceback (most recent call last):
File "../connect_to_es.py", line 21, in <module>
print(json.dumps(es.info(), indent=4, sort_keys=True))
File "/home/ubuntu/projects/.venv/lib/python3.8/site-packages/elasticsearch/client/utils.py", line 168, in _wrapped
return func(*args, params=params, headers=headers, **kwargs)
File "/home/ubuntu/projects/.venv/lib/python3.8/site-packages/elasticsearch/client/__init__.py", line 294, in info
return self.transport.perform_request(
File "/home/ubuntu/projects/.venv/lib/python3.8/site-packages/elasticsearch/transport.py", line 413, in perform_request
_ProductChecker.raise_error(self._verified_elasticsearch)
File "/home/ubuntu/projects/.venv/lib/python3.8/site-packages/elasticsearch/transport.py", line 630, in raise_error
raise UnsupportedProductError(message)
elasticsearch.exceptions.UnsupportedProductError: The client noticed that the server is not a supported distribution of Elasticsearch
The elasticsearch python library version I have is 7.14, and my elasticsearch on AWS is running 7.10. Any thoughts on what's going on here?
Copy of code:
from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3
import json
host = '<url>.us-east-1.es.amazonaws.com'
region = 'us-east-1'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)
print(json.dumps(es.info(), indent=4, sort_keys=True))
Seems like downgrading fixed it: pip3 install 'elasticsearch<7.14.0'
New elasticsearch-js has an issue:
The new product version check rejects oss distributions?
Downgrading to a lower version (e.g. 7.13) should help.
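A minimal sketch of a version check that follows from the answers above. The helper names are hypothetical, and the 7.14.0 threshold is an assumption inferred from the 'elasticsearch<7.14.0' pin quoted earlier.

```python
# Assumption: 7.14.0 is the first client release with the product check,
# inferred from the "elasticsearch<7.14.0" pin quoted above.
FIRST_CHECKING_RELEASE = (7, 14, 0)


def parse_version(version: str) -> tuple:
    """Turn "7.13.4" into (7, 13, 4), padding short versions with zeros.

    Non-numeric suffixes such as "rc1" are not handled.
    """
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def has_product_check(client_version: str) -> bool:
    """Return True if this client version is at or above the threshold."""
    return parse_version(client_version) >= FIRST_CHECKING_RELEASE


print(has_product_check("7.13.4"))  # False
print(has_product_check("7.14.0"))  # True
```

The installed client version can be looked up with importlib.metadata.version("elasticsearch") from the standard library (Python 3.8+).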
As some of the other answers indicate, you can downgrade right now, but opensearch-py is a better long-term solution.
It should be a drop-in replacement for elasticsearch-py and it will be updated and patched over time. It supports OSS Elasticsearch and OpenSearch.
This error occurs because of a version conflict. The version of the elasticsearch Python library and the version of Elasticsearch itself should be the same.
In my case, the Elasticsearch version was 7.10 on AWS and I was using version 7.15 of the elasticsearch Python library in my Django project. I removed it and installed version 7.10 of the Python library in the Django project, and it worked fine for me.
I fixed the error by making the following change in my Gemfile. I changed
gem 'elasticsearch'
to
gem 'elasticsearch', '~> 7.1'
In effect, I downgraded from 7.18 (the current version as of today) to 7.1.

ValueError: check_hostname requires server_hostname using Fiddler 4

A recently posted question has some useful answers, but it's not the same as mine. I'm running urllib3 1.26.4 and Python 3.7 from an ArcGIS Pro Notebook. I also have Fiddler 4 open because I want to track web traffic while troubleshooting a script. I only get the following error when I have Fiddler open. If I close Fiddler I get <Response [200]>. Is it not possible to use the requests module with Fiddler open? I'm new to Fiddler.
Truncated script:
import requests
#url
idph_data = 'https://idph.illinois.gov/DPHPublicInformation/api/covidVaccine/getVaccineAdministrationCurrent'
#headers
headers = {'user-agent': 'Mozilla/5.0'}
response = requests.get(idph_data, headers=headers, verify=True)
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
In [35]:
Line 4: response = requests.get(idph_data,verify=True)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\requests\api.py, in get:
Line 76: return request('get', url, params=params, **kwargs)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\requests\api.py, in request:
Line 61: return session.request(method=method, url=url, **kwargs)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\requests\sessions.py, in request:
Line 542: resp = self.send(prep, **send_kwargs)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\requests\sessions.py, in send:
Line 655: r = adapter.send(request, **kwargs)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\requests\adapters.py, in send:
Line 449: timeout=timeout
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\urllib3\connectionpool.py, in urlopen:
Line 696: self._prepare_proxy(conn)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\urllib3\connectionpool.py, in _prepare_proxy:
Line 964: conn.connect()
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\urllib3\connection.py, in connect:
Line 359: conn = self._connect_tls_proxy(hostname, conn)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\urllib3\connection.py, in _connect_tls_proxy:
Line 506: ssl_context=ssl_context,
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\urllib3\util\ssl_.py, in ssl_wrap_socket:
Line 432: ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\urllib3\util\ssl_.py, in _ssl_wrap_socket_impl:
Line 474: return ssl_context.wrap_socket(sock)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\ssl.py, in wrap_socket:
Line 423: session=session
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\ssl.py, in _create:
Line 827: raise ValueError("check_hostname requires server_hostname")
ValueError: check_hostname requires server_hostname
---------------------------------------------------------------------------
I am running into this issue as well with the environment provided by the current version of ArcGIS Pro. Per a lower-rated answer in the question you linked, I ran pip install urllib3==1.25.11 in the desired environment (in my case a clone of the default), and the issue appears to be resolved.
This is apparently due to a new feature in the urllib3 version provided by ArcGIS Pro. The above command downgrades to a relatively recent, but working, version. This will not be resolved in newer versions of urllib3, but instead, there is currently a pull request pending to fix the underlying issue in Python itself.
By the way, while it's possible to configure pip to run through the Fiddler proxy, it's not easy, so it is best to turn off Fiddler while running any pip commands.
The pertinent bug report is found here. The issue appears to be a very old bug in how Windows system proxy settings are parsed by CPython's built-in urllib, which causes the proxy entry used for https URLs to always receive an HTTPS prefix (instead of HTTP). Newer versions of urllib3 actually support using proxies over HTTPS, which was not previously the case. So before, urllib3 would ignore the prefix, but now it attempts to use HTTPS to communicate with an HTTP URL.
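A minimal sketch of one possible workaround for the prefix problem described above: read the system proxy settings and rewrite any https:// proxy URL to http:// before handing them to requests. The helper name is hypothetical, and the 127.0.0.1:8888 address in the example is an assumption based on Fiddler's default listener.

```python
import urllib.request


def force_http_scheme(proxies: dict) -> dict:
    """Return a copy of `proxies` where every https:// proxy URL is rewritten to http://."""
    fixed = {}
    for protocol, proxy_url in proxies.items():
        if proxy_url.startswith("https://"):
            proxy_url = "http://" + proxy_url[len("https://"):]
        fixed[protocol] = proxy_url
    return fixed


# urllib.request.getproxies() returns the proxy settings Python detects
# for this system as a {protocol: url} dict.
proxies = force_http_scheme(urllib.request.getproxies())

# Example with an assumed local Fiddler listener:
example = {"http": "http://127.0.0.1:8888", "https": "https://127.0.0.1:8888"}
print(force_http_scheme(example))
```

The resulting dict can be passed explicitly, for example requests.get(idph_data, headers=headers, proxies=proxies), so that the mis-prefixed system entry is not used.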
I've installed requests v. 2.7.0 (an older release than the 2.25.1 I was using), and I'm no longer receiving the error. Whether it was a version-specific issue related to v. 2.25.1, I'm not sure. I haven't come across any evidence of that.
In a Windows command prompt in the same directory as my Python executable:
python -m pip install requests==2.7.0
Now if I run my original script with Fiddler capturing, I get an HTTP status of 200 and my script no longer gives me the error.

Why does draw() in pygraphviz/agraph not work on the server (but locally)?

I have a Python app using Pygraphviz that works fine locally, but on the server the draw function throws an error. It happens in make_svg. The following lines are the relevant part of the errors I get. (The full trail is here.)
File "/path/to/app/utils/make_svg.py", line 17, in make_svg
prog='dot'
File "/path/to/pygraphviz/agraph.py", line 1477, in draw
fh = self._get_fh(path, 'w+b')
File "/path/to/pygraphviz/agraph.py", line 1506, in _get_fh
fh = open(path, mode=mode)
FileNotFoundError: [Errno 2] No such file or directory: 'app/svg_files/nope.svg'
Logging type(g) gives <class 'pygraphviz.agraph.AGraph'> as expected.
I work in a virtualenv in a mod_wsgi 4.6.5/Python3.7 environment on a Webfaction server.
Locally I use a virtualenv with Python 3.5.
The version of Pygraphviz is 1.3.1. (First I had 1.5 on the server. The error was exactly the same, except for the line numbers.)
What can I do?
The same error is described in this bug report from last year.
I don't understand which directory I am supposed to create. svg_files exists and has permissions 777.
The draw function at the end of make_svg should create the SVG. (And at the end of extract_coordinates_from_svg the file is removed again.) The file name is a hash created in connected_dag (svg_name).
On the server, the relative path app/svg_files apparently does not resolve to the same place as it does locally.
I defined the path as an absolute path derived from __file__, and now it works.
import os
# Build the path from this file's grandparent directory, not the working directory
file_path = '{grandparent}/svg_files/{name}.svg'.format(
    grandparent=os.path.dirname(os.path.dirname(__file__)),
    name=name
)
g.draw(file_path, prog='dot')
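To illustrate why the relative path failed on the server, here is a small sketch showing that the same relative path refers to different locations depending on the process's working directory. The helper name and both working directories are hypothetical examples, not the actual server layout.

```python
import posixpath


def resolve_from(cwd: str, relative_path: str) -> str:
    """Return the absolute POSIX path that `relative_path` refers to when the process runs in `cwd`."""
    return posixpath.normpath(posixpath.join(cwd, relative_path))


relative = "app/svg_files/nope.svg"

# Both working directories are hypothetical examples.
print(resolve_from("/home/me/project", relative))  # /home/me/project/app/svg_files/nope.svg
print(resolve_from("/", relative))                 # /app/svg_files/nope.svg
```

Under a web server such as mod_wsgi the working directory is usually not the project directory, which is why anchoring the path on __file__ is more robust.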

Mapreduce no longer works after 2.7 conversion

After converting our app to Python 2.7, configuring for multithreading, and referencing mapreduce in app.yaml like this...
- url: /mapreduce(/.*)?
  script: mapreduce.main.app
  #script: google.appengine.ext.mapreduce.main.app
  login: admin
and invoking mapreduce like this...
control.start_map(
    "FNFR",
    "fnfr.fnfrHandler",
    "mapreduce.input_readers.BlobstoreLineInputReader",
    {"blob_keys": blobKey},
    shard_count=32,
    mapreduce_parameters={'done_callback': '/fnfrdone', 'blobKey': blobKey, 'userID': thisUserID})
we get the following stack trace...
Traceback (most recent call last):
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 189, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 241, in _LoadHandler
raise ImportError('%s has no attribute %s' % (handler, name))
ImportError: <module 'mapreduce.main' from '/base/data/home/apps/s~xxxxxxonline/2.361692533819432574/mapreduce/main.pyc'> has no attribute app
I found one SO reference ( How to migrate my app.yaml to 2.7? ) but as you can see from my yaml, I think I've tried all combinations to try to get it to resolve. Thanks.
This worked for me, but I'm still on a pretty old version of the SDK, so I don't know whether this has since been fixed:
- url: /mapreduce(/.*)?
  script: mapreduce.main.APP
  login: admin
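Since the fix came down to the attribute being named APP rather than app, a small sketch of how one might look for the actual attribute name in a module. The helper name and the stand-in module are hypothetical; in a real project you would pass the imported mapreduce.main module instead.

```python
import types


def find_attribute_names(module, wanted: str) -> list:
    """List attributes of `module` whose names match `wanted`, ignoring case."""
    return [name for name in dir(module) if name.lower() == wanted.lower()]


# Hypothetical stand-in for mapreduce.main, which exposes APP rather than app.
fake_main = types.ModuleType("fake_main")
fake_main.APP = object()

print(find_attribute_names(fake_main, "app"))  # ['APP']
```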
