How can I do a wget with parameters in NiFi - get

I'm trying to query an API with NiFi, using parameters that come from a database, so I need to use attributes as part of the URL.
I cannot use GetHTTP, because it doesn't accept attributes. I've tried ExecuteScript with Jython, but I'm running into trouble:
import json
import java.lang.Exception
from urllib import urlopen
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
# Define a subclass of StreamCallback for use in session.write()
class PyInputStreamCallback(StreamCallback):
    def __init__(self):
        pass
    def process(self,inputStream,outputStream):
        text = IOUtils.toString(inputStream,StandardCharsets.UTF_8)
        data_old = json.loads(text)
        data_new = {}
        for data in data_old:
            # Prepare key
            ip = data_old.get('keys')[0].get('ip')
            data_ok = urlopen('http://'+ip+'/api/data?begintime=2021-09-30T23:59:59.000+02:00')
        #data_ok = list(data_new.values())
        outputStream.write(bytearray(json.dumps(data_ok).encode("utf-8")))
flowFile = session.get()
if (flowFile != None):
    try:
        flowFile = session.write(flowFile,PyInputStreamCallback())
        session.transfer(flowFile,REL_SUCCESS)
    except java.lang.Exception as err:
        log.error("Something went wrong", err)
        session.transfer(flowFile,REL_FAILURE)
It shows me:
ScriptException: TypeError: <addinfourl at 2764827 whose fp = <_socket._fileobject object at 0x2a301c>> is not JSON serializable in at line number 30
The line number is not important, because it only says that the error is in the write call, which invokes PyInputStreamCallback.process.
I've tried to use Python's requests library, but it is not available in Jython.
Has anybody run into this problem before?

While GetHTTP may not take attributes, InvokeHTTP does. GetHTTP is largely deprecated, so you should evaluate InvokeHTTP for your needs, as this sounds like a good fit and you can avoid custom scripting.
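If the flow does stay on ExecuteScript, the TypeError in the question arises because urlopen returns a response object rather than the body, and json.dumps cannot serialize that object. Below is a minimal sketch of one way around it, assuming the API returns a UTF-8 encoded JSON body: read the response, parse it, then serialize the parsed value. The helper name body_to_json_text is hypothetical, and io.BytesIO stands in for the object returned by urlopen so the snippet runs without a network.

```python
import io
import json


def body_to_json_text(response):
    """Read a file-like HTTP response and return its JSON body as text."""
    raw = response.read()                      # the response body
    payload = json.loads(raw.decode("utf-8"))  # parse, so invalid JSON fails early
    return json.dumps(payload)                 # a plain string, safe to write out


# Stand-in for urlopen(...): any object with a read() method works here.
fake_response = io.BytesIO(b'{"temperature": 21.5}')
result = body_to_json_text(fake_response)
print(result)
```

In the script from the question, the equivalent change would be to call read() on the value returned by urlopen before passing anything to json.dumps.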

Python mocking using MOTO for SSM

Taken from this answer:
Python mock AWS SSM
I now have this code:
test_2.py
from unittest import TestCase
import boto3
import pytest
from moto import mock_ssm
@pytest.yield_fixture
def s3ssm():
    with mock_ssm():
        ssm = boto3.client("ssm")
        yield ssm
@mock_ssm
class MyTest(TestCase):
    def setUp(self):
        ssm = boto3.client("ssm")
        ssm.put_parameter(
            Name="/mypath/password",
            Description="A test parameter",
            Value="this is it!",
            Type="SecureString",
        )
    def test_param_getting(self):
        import real_code
        resp = real_code.get_variable("/mypath/password")
        assert resp["Parameter"]["Value"] == "this is it!"
and this is my code to test (or a cut down example):
real_code.py
import boto3
class ParamTest:
    def __init__(self) -> None:
        self.client = boto3.client("ssm")
        pass
    def get_parameters(self, param_name):
        print(self.client.describe_parameters())
        return self.client.get_parameters_by_path(Path=param_name)
def get_variable(param_name):
    p = ParamTest()
    param_details = p.get_parameters(param_name)
    return param_details
I have tried a number of solutions, and switched between pytest and unittest quite a few times!
Each time I run the code, it doesn't reach out to AWS, so the boto3 client does seem to be mocked, but it doesn't return the parameter. If I edit real_code.py to not have a class inside it, the test passes.
Is it not possible to patch the client inside the class in the real_code.py file? I'm trying to do this without editing the real_code.py file at all if possible.
Thanks,
The get_parameters_by_path call returns all parameters that are prefixed with the supplied path.
When given /mypath, it returns /mypath/password.
But when given /mypath/password, as in your example, it only returns parameters that look like this: /mypath/password/..
If you only want to retrieve a single parameter, the get_parameter call is more suitable:
class ParamTest:
    def __init__(self) -> None:
        self.client = boto3.client("ssm")
        pass
    def get_parameters(self, param_name):
        # Decrypt the value, as it is stored as a SecureString
        return self.client.get_parameter(Name=param_name, WithDecryption=True)
Edit: Note that Moto behaves the same as AWS in this.
From https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ssm.html#SSM.Client.get_parameters_by_path:
[The Path-parameter is t]he hierarchy for the parameter. [...] The hierarchy is the parameter name except the last part of the parameter. For the API call to succeed, the last part of the parameter name can't be in the path.
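The path rule can be illustrated with a small pure-Python model. This is only a sketch of the matching behavior described above, not boto3 or Moto code: params_under_path is a hypothetical function, and the recursive flag is assumed to mirror the Recursive option of the real call.

```python
def params_under_path(names, path, recursive=False):
    """Hypothetical model of the path rule; not a boto3 or Moto function."""
    prefix = path.rstrip("/") + "/"
    matches = []
    for name in names:
        if not name.startswith(prefix):
            continue
        remainder = name[len(prefix):]
        # Without recursion only direct children of the path qualify.
        if recursive or "/" not in remainder:
            matches.append(name)
    return matches


stored = ["/mypath/password", "/mypath/db/user"]
by_parent = params_under_path(stored, "/mypath")
by_full_name = params_under_path(stored, "/mypath/password")
print(by_parent, by_full_name)
```

Passing the full parameter name as the path matches nothing, because no stored name continues past /mypath/password/.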

Adding type-hinting to functions that return boto3 objects?

How do I add type-hinting to my functions that return various boto3 resources? I'd like to get automatic completion/checking on my return values in IDEs like PyCharm. Boto3 does some factory creation magic so I can't figure out how to declare the types correctly
import boto3
ec2 = boto3.Session().resource('ec2')
a = ec2.Image('asdf')
a.__class__ # => boto3.resources.factory.ec2.Image
But boto3.resources.factory.ec2.Image doesn't seem to be a class that's recognized by Python. So I can't use it for a type hint.
The docs show that the return type is EC2.Image. But is there a way to import that type as regular Python type?
UPDATE 2021
As mentioned by @eega, I no longer maintain the package. I'd recommend checking out boto3-stubs. It's a much more mature version of boto3_type_annotations.
Original Answer
I made a package that can help with this, boto3_type_annotations. It's available with or without documentation as well. Example usage below. There's also a gif at my github showing it in action using PyCharm.
import boto3
from boto3_type_annotations.s3 import Client, ServiceResource
from boto3_type_annotations.s3.waiter import BucketExists
from boto3_type_annotations.s3.paginator import ListObjectsV2
# With type annotations
client: Client = boto3.client('s3')
client.create_bucket(Bucket='foo')  # Not only does your IDE know the name of this method,
                                    # it knows the type of the `Bucket` argument too!
                                    # It also knows that `Bucket` is required, but `ACL` isn't!
# Waiters and paginators are defined also...
waiter: BucketExists = client.get_waiter('bucket_exists')
waiter.wait(Bucket='foo')  # waiters take keyword arguments
paginator: ListObjectsV2 = client.get_paginator('list_objects_v2')
response = paginator.paginate(Bucket='foo')
# Along with service resources.
resource: ServiceResource = boto3.resource('s3')
bucket = resource.Bucket('bar')
bucket.create()
# With type comments
client = boto3.client('s3')  # type: Client
response = client.get_object(Bucket='foo', Key='bar')
# In docstrings
class Foo:
    def __init__(self, client):
        """
        :param client: It's an S3 Client and the IDE is gonna know what it is!
        :type client: Client
        """
        self.client = client
    def bar(self):
        """
        :rtype: Client
        """
        self.client.delete_object(Bucket='foo', Key='bar')
        return self.client
The boto3_type_annotations mentioned by Allie Fitter is deprecated, but he links to an alternative: https://pypi.org/project/boto3-stubs/
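For the EC2 case in the question, a sketch with boto3-stubs might look like the following. It assumes boto3-stubs is installed with its ec2 extra and that the mypy_boto3_ec2.service_resource module exposes EC2ServiceResource and Image; check the package documentation for the exact names. The helper get_image is hypothetical. The import sits under TYPE_CHECKING, so the stubs are needed only by the IDE or type checker, not at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by type checkers and IDEs; assumes boto3-stubs with the
    # ec2 extra is installed in the development environment.
    from mypy_boto3_ec2.service_resource import EC2ServiceResource, Image


def get_image(ec2: "EC2ServiceResource", image_id: str) -> "Image":
    """Return the Image resource for image_id (hypothetical helper)."""
    return ec2.Image(image_id)
```

It would be called as get_image(boto3.Session().resource('ec2'), 'asdf'), and the IDE can then complete attributes on the returned value.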

Flask-sqlalchemy: How serialize objects with custom constructor from existing database?

I'm trying to learn how to create Python-based back-ends from some existing data that I have collected. I've come to realize that I definitely want to use SQLAlchemy and that Flask seems like a good library to go with it. My problem is that even after many hours of reading the SQLAlchemy docs and browsing various answers on Stack Exchange, I still don't understand how I can reshape data from an existing table into an object with a completely different structure.
The transformation I want to do is very concrete. I want to go from this structure in my MariaDB table:
Columns: company_name, date, indicators(1...23)
To this JSON output generated from a serialized class object:
{
    "company_name[1]":
    {
        "indicator_name[1]":
        {
            "date[1]": "indicator_name[1].value[1]",
            "date[2]": "indicator_name[1].value[2]",
            "date[3]": "indicator_name[1].value[3]",
            "date[4]": "indicator_name[1].value[4]",
            "date[5]": "indicator_name[1].value[5]"
        },
        "indicator_name[2]":
        {
            "date[1]": "indicator_name[2].value[1]",
            "date[2]": "indicator_name[2].value[2]",
            "date[3]": "indicator_name[2].value[3]",
            "date[4]": "indicator_name[2].value[4]",
            "date[5]": "indicator_name[2].value[5]"
        },
I found a great tutorial with which I can output the entire table record by record, but the structure is not what I want, and I don't think creating the desired structure on the front-end makes sense in this case.
Here is the code that outputs the entire table to JSON record by record:
from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import PrimaryKeyConstraint
from sqlalchemy import orm
from sqlalchemy import select, func
from sqlalchemy import Column, Integer, String, ForeignKey
from flask_marshmallow import Marshmallow
import decimal
import flask.json
class MyJSONEncoder(flask.json.JSONEncoder): # Enables decimal queries for the API
    def default(self, obj):
        if isinstance(obj, decimal.Decimal):
            # Convert decimal instances to strings.
            return str(obj)
        return super(MyJSONEncoder, self).default(obj)
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql+pymysql://USER:PASS@localhost:3306/kl_balance_sheets'
app.json_encoder = MyJSONEncoder
db = SQLAlchemy(app)
ma = Marshmallow(app)
# Bind declarative base to engine
db.Model.metadata.reflect(db.engine)
class CompanyData(db.Model):
    __table__ = db.Model.metadata.tables['kl_balance_sheets']
class CompanyDataSchema(ma.ModelSchema):
    class Meta:
        model = CompanyData
@app.route('/')
def index():
    company_data = CompanyData.query.all()
    company_data_schema = CompanyDataSchema(many=True)
    output = company_data_schema.dump(company_data).data
    return jsonify({'company_data' : output})
if __name__ == '__main__':
    app.run(debug=True)
My main question, I guess, is: how do I edit this code to produce the desired JSON?
What I think I should do is create a custom constructor and then feed that into the index function, but I can't figure out how to do that concretely. The two options I've come across are:
@orm.reconstructor
def init_on_load(self):
    # do custom stuff
or:
class Foo(db.Model):
    # ...
    def __init__(self, **kwargs):
        super(Foo, self).__init__(**kwargs)
        # do custom stuff
To me this seems like a basic operation any flask-marshmallow user would be doing regularly. Could someone please explain how SQL data is normally inserted into an object with a new structure and then serialized? In my case, do I need to change things mainly on the metadata, object or marshmallow level? I'm surprised I can't find some good examples of this.
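One possible approach, offered as a sketch and not as the established flask-marshmallow way, is to leave the model untouched and reshape the queried rows in plain Python before calling jsonify. The function nest_rows, the sample rows and the indicator column names revenue and equity are all hypothetical; the rows are assumed to be available as dictionaries keyed by column name.

```python
from collections import defaultdict


def nest_rows(rows, indicator_columns):
    """Group flat rows as company -> indicator -> date -> value."""
    nested = defaultdict(lambda: defaultdict(dict))
    for row in rows:
        for indicator in indicator_columns:
            # str() keeps date objects usable as JSON object keys.
            nested[row["company_name"]][indicator][str(row["date"])] = row[indicator]
    # Convert the defaultdicts to plain dicts before serializing.
    return {company: dict(indicators) for company, indicators in nested.items()}


# Hypothetical sample rows standing in for the query result.
rows = [
    {"company_name": "ACME", "date": "2018-12-31", "revenue": 100, "equity": 40},
    {"company_name": "ACME", "date": "2019-12-31", "revenue": 120, "equity": 55},
]
result = nest_rows(rows, ["revenue", "equity"])
print(result)
```

In the index function, the list produced by the schema dump could be passed to such a helper and the returned dictionary handed to jsonify.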

What's get_products() missing 1 required positional argument: 'self'

I am writing a program for a friend of mine, for fun and to get better at Python 3.6.3, and I don't really understand why I get this error:
TypeError: get_products() missing 1 required positional argument: 'self'
I have done some research, and it says I should initialize the object, which I did, but it still gives me this error. Can anyone tell me what I did wrong? Or is there a better way to do it?
from datetime import datetime, timedelta
from time import sleep
from gdax.public_client import PublicClient
# import pandas
import requests
class MyGdaxHistoricalData(object):
    """class for fetch candle data for a given currency pair"""
    def __init__(self):
        print([productList['id'] for productList in PublicClient.get_products()])
        # self.pair = input("""\nEnter your product name separated by a comma.
        self.pair = [i for i in input("Enter: ").split(",")]
        self.uri = 'https://api.gdax.com/products/{pair}/candles'.format(pair = self.pair)
    @staticmethod
    def dataToIso8681(data):
        """convert a data time object to the ISO-8681 format
        Args:
            date(datetime): The date to be converted
        Return:
            string: The ISO-8681 formated date
        """
        return 0
if __name__ == "__main__":
    import gdax
    MyData = MyGdaxHistoricalData()
    # MyData = MyGdaxHistoricalData(input("""\nEnter your product name separated by a comma.
    # print(MyData.pair)
You probably forgot to create an instance of PublicClient. Try PublicClient().get_products()
Edited:
Why do I need an instance of PublicClient?
A simple rule of thumb in OOP: if you want to use a property (attribute) or behavior (method) of a class, you need an object of that class. Otherwise you need to make it static, using the @staticmethod decorator in Python.
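The rule can be seen in a small self-contained sketch. The Client class here is a hypothetical stand-in, not the gdax PublicClient: calling an instance method on the class itself leaves nothing to bind to self, while calling it on an instance, or declaring it static, works.

```python
class Client:
    def get_products(self):
        # An instance method: Python passes the instance as `self`.
        return [{"id": "BTC-USD"}]

    @staticmethod
    def api_name():
        # A static method needs no instance.
        return "demo"


try:
    Client.get_products()  # called on the class: nothing is bound to `self`
except TypeError as err:
    error_message = str(err)

product_ids = [p["id"] for p in Client().get_products()]  # called on an instance
name = Client.api_name()
print(error_message)
```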

Python twisted putChild not forwarding expectedly

Code here.
from twisted.web.static import File
from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import ssl, reactor
from twisted.python.modules import getModule
import secure_aes
import urllib.parse
import cgi
import json
import os
import hashlib
import coserver
import base64
import sim
if not os.path.exists(os.path.join(os.getcwd(),'images')):
    os.mkdir(os.path.join(os.getcwd(),'images'))
with open ('form.html','r') as f:
    fillout_form = f.read()
with open ('image.html','r') as f:
    image_output = f.read()
port = 80#int(os.environ.get('PORT', 17995))
class FormPage(Resource):
    #isLeaf = True
    def getChild(self, name, request):
        print('GC')
        if name == '':
            return self
        return Resource.getChild(self, name, request)
    def render_GET(self, request):
        print(request)
        #do stuff and return stuff
root = FormPage()
root.putChild('rcs', File("./images"))
#factory = Site(FormPage())
factory = Site(root)
reactor.listenTCP(port, factory)
reactor.run()
As you can see, I call root.putChild near the end, expecting that when I go to http://site/rcs I get a directory listing of the contents of ./images, but that doesn't happen. What am I missing? I've tried many things suggested from here. Also this one doesn't work because that's just serving static files anyways. It goes to getChild all the time regardless of whether I have specified putChild or not.
On Python 3, a bare string literal like "rcs" is a unicode string (which Python 3 calls "str" but which I will call "unicode" to avoid ambiguity).
However, twisted.web.resource.Resource.putChild requires a byte string as its first argument. It misbehaves when given unicode instead. Make your path segments into byte strings (e.g. b"rcs") and the server will behave better on Python 3.
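Why the type matters can be shown without Twisted at all. The sketch below uses a plain dictionary as a stand-in for a table of child resources, on the assumption that children are looked up by the path segment taken from the request and that the segment arrives as bytes. In Python 3 a str key never equals a bytes key, so the lookup misses. The same applies to the comparison name == '' in the question's getChild, which is never true for a bytes name.

```python
children = {}
children["rcs"] = "images resource"   # registered with a str key, as in the question

segment = b"rcs"                      # stand-in for a path segment arriving as bytes
found_with_str_key = children.get(segment)

children[b"rcs"] = "images resource"  # registered with a bytes key instead
found_with_bytes_key = children.get(segment)

print(found_with_str_key, found_with_bytes_key)
```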
