How to read the boto3 file object in OpenCV (Python 3)

I am trying to read a file from an AWS S3 presigned URL with OpenCV, but the result of the read is of NoneType. How can I read the boto3 file object in OpenCV and process it further?
import cv2
import boto3

s3Client = boto3.client('s3')
file_path = s3Client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'www.mybucket.com', 'Key': 'hello.txt'},
    ExpiresIn=100)
img = cv2.imread(file_path)
But the result is of <class 'NoneType'>, not an image. I need cv2 to be able to read the file.

Can you please try this? urllib2 is Python 2 only, so use urllib.request in Python 3, and since cv2.imread expects a file path rather than raw bytes, decode the downloaded bytes with cv2.imdecode:

import urllib.request

import cv2
import numpy as np

response = urllib.request.urlopen(file_path)
image_bytes = response.read()
img = cv2.imdecode(np.frombuffer(image_bytes, np.uint8), cv2.IMREAD_COLOR)
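Alternatively, a minimal sketch that skips the presigned URL and reads the object into memory directly with boto3 (bucket and key as in the question; note that cv2.imdecode returns None unless the key actually points at an encoded image, so a plain text file like hello.txt will still fail):

import boto3
import cv2
import numpy as np

s3 = boto3.client('s3')
# read the object body into memory and decode it as an image
obj = s3.get_object(Bucket='www.mybucket.com', Key='hello.txt')
img = cv2.imdecode(np.frombuffer(obj['Body'].read(), np.uint8), cv2.IMREAD_COLOR)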

Related

How to read parquet file from s3 using pandas

I am trying to read a parquet file that is in S3 using pandas.
Below is the code
import boto3
import pandas as pd

key = 'key'
secret = 'secret'
s3_client = boto3.client(
    's3',
    aws_access_key_id=key,
    aws_secret_access_key=secret,
    region_name='region_name'
)
print(s3_client)

AWS_S3_BUCKET = 'bucket_name'
filePath = 'data/wine_dataset'
response = s3_client.get_object(Bucket=AWS_S3_BUCKET, Key=filePath)
status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
if status == 200:
    print(f"Successful S3 get_object response. Status - {status}")
    books_df = pd.read_parquet(response.get("Body"))
    print(books_df)
else:
    print(f"Unsuccessful S3 get_object response. Status - {status}")
I am getting the below error
NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
But when I read the same S3 path using PySpark, it worked:
path = 's3a://bucket_name/data/wine_dataset'
df = spark.read.parquet(path)
I am not sure why it is not working with pandas. Can anyone help me with this?
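A likely cause: spark.read.parquet treats data/wine_dataset as a directory and reads the part files inside it, while get_object needs the exact key of a single object, hence NoSuchKey. If the s3fs package is available, pandas can read the dataset path directly; a minimal sketch (storage_options needs a reasonably recent pandas):

import pandas as pd

# pandas delegates s3:// paths to s3fs, which lists the part files like Spark does
df = pd.read_parquet(
    's3://bucket_name/data/wine_dataset',
    storage_options={'key': key, 'secret': secret},
)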

Kaggle login and unzip file to store in s3 bucket

Create a Lambda function for Python 3.7.
The role attached to the Lambda function should have S3 access and Lambda basic execution.
Read the data from https://www.kaggle.com/therohk/india-headlines-news-dataset/download and save it into S3 as CSV. The file is a zip; how do I unzip it and store it in a temp file?
I am getting failures in the AWS Lambda function below, a Lambda handler to download the news headline dataset from Kaggle:
import urllib3
import boto3
from botocore.client import Config

http = urllib3.PoolManager()

def lambda_handler(event, context):
    bucket_name = 'news-data-kaggle'
    file_name = "india-news-headlines.csv"
    lambda_path = "/tmp/" + file_name
    kaggle_info = {'UserName': "bossdk", 'Password': "xxx"}
    url = "https://www.kaggle.com/account/login"
    data_url = "https://www.kaggle.com/therohk/india-headlines-news-dataset/download"
    r = http.request('POST', url, kaggle_info)
    r = http.request('GET', data_url)
    f = open(lambda_path, 'wb')
    for chunk in r.iter_content(chunk_size=512 * 1024):
        if chunk:
            f.write(chunk)
    f.close()
    data = ZipFile(lambda_path)
    # S3 Connect
    s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
    # Uploaded File
    s3.Bucket(bucket_name).put(Key=lambda_path, Body=data, ACL='public-read')
    return {
        'status': 'True',
        'statusCode': 200,
        'body': 'Dataset Uploaded'
    }
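Several things here would fail: ZipFile is never imported, urllib3 responses have no iter_content (that is a requests method), a boto3 Bucket has no put method, and a plain POST to the login page is unlikely to authenticate the download (Kaggle expects its API token). A hedged sketch of the download-and-upload flow using the official kaggle package instead, assuming that package can be bundled with the Lambda and that KAGGLE_USERNAME/KAGGLE_KEY are set in its environment:

import boto3
from kaggle.api.kaggle_api_extended import KaggleApi

def lambda_handler(event, context):
    bucket_name = 'news-data-kaggle'
    file_name = "india-news-headlines.csv"
    # authenticate from the KAGGLE_USERNAME / KAGGLE_KEY environment variables
    api = KaggleApi()
    api.authenticate()
    # download the dataset zip to /tmp and unzip it in one step
    api.dataset_download_files(
        'therohk/india-headlines-news-dataset',
        path='/tmp',
        unzip=True)
    # upload the extracted CSV; upload_file streams the file from disk
    s3 = boto3.resource('s3')
    s3.Bucket(bucket_name).upload_file('/tmp/' + file_name, file_name)
    return {'statusCode': 200, 'body': 'Dataset Uploaded'}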

How to upload video to s3 using API GW and python?

I'm trying to make an API that uploads video to S3. I already managed to upload the video to S3, but the uploaded file does not play. I checked the content type of the video file, and it is binary/octet-stream instead of video/mp4, so I set the content type to "video/mp4" when calling the put_object API, but it still does not work.
I use a Lambda function to put the video into S3. Here is my Lambda code:
import json
import base64
import boto3

def lambda_handler(event, context):
    bucket_name = 'ad-live-streaming'
    s3_client = boto3.client('s3')
    file_content = event['content']
    merchantId = event['merchantId']
    catelogId = event['catelogId']
    file_name = event['fileName']
    file_path = '{}/{}/{}.mp4'.format(merchantId, catelogId, file_name)
    s3_response = s3_client.put_object(Bucket=bucket_name, Key=file_path,
                                       Body=file_content, ContentType='video/mp4')
    return {
        'statusCode': 200,
        "merchantId": merchantId,
        "catelogId": catelogId,
        "file_name": file_name,
    }
Any idea how to solve this issue?
Based on the example in Upload binary files to S3 using AWS API Gateway with AWS Lambda | by Omer Hanetz | The Startup | Medium, it appears that you need to decode the file from base64:
file_content = base64.b64decode(event['content'])
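For intuition, a quick standalone check of the round trip (the byte string is just a stand-in for real video bytes):

import base64

raw = b'\x00\x01\x02 video bytes'
encoded = base64.b64encode(raw).decode()  # roughly what API Gateway hands the Lambda for a binary payload
decoded = base64.b64decode(encoded)       # what should actually be written to S3
assert decoded == raw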

Writing string to S3 with boto3: "'dict' object has no attribute 'put'"

In an AWS lambda, I am using boto3 to put a string into an S3 file:
import boto3
s3 = boto3.client('s3')
data = s3.get_object(Bucket=XXX, Key=YYY)
data.put('Body', 'hello')
I am told this:
[ERROR] AttributeError: 'dict' object has no attribute 'put'
The same happens with data.put('hello'), which is the method recommended by the top answers at "How to write a file or data to an S3 object using boto3", and with data.put_object: 'dict' object has no attribute 'put_object'.
What am I doing wrong?
In contrast, reading works fine (with data.get('Body').read().decode('utf-8')).
put_object is a method of the s3 client object, not of data, which is just the dict returned by get_object.
Here is a full working example with Python 3.7:
import json
import logging
import boto3

s3 = boto3.client('s3')
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    bucket = 'mybucket'
    key = 'id.txt'
    id = None
    # Write id to S3
    s3.put_object(Body='Hello!', Bucket=bucket, Key=key)
    # Read id from S3
    data = s3.get_object(Bucket=bucket, Key=key)
    id = data.get('Body').read().decode('utf-8')
    logger.info("Id:" + id)
    return {
        'statusCode': 200,
        'body': json.dumps('Id:' + id)
    }
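For reference, the .put() style that the linked answers recommend belongs to the resource API's Object, not to the dict that the client's get_object returns; a minimal sketch of that variant:

import boto3

s3 = boto3.resource('s3')
# Object(...).put does exist on the resource API
s3.Object('mybucket', 'id.txt').put(Body='Hello!')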

Unable to read the buffer from BytesIO in google app engine flex environment

Here is the related code
import logging
logging.getLogger('googleapiclient.discovery_cache').setLevel(logging.ERROR)
import datetime
import io
import json
import httplib2
from flask import Flask, render_template, request
from flask import make_response
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from oauth2client.client import AccessTokenCredentials
...
@app.route('/callback_download')
def userselectioncallback_with_drive_api():
    """
    Need to make it a background process
    """
    logging.info("In download callback...")
    code = request.args.get('code')
    fileId = request.args.get('fileId')
    logging.info("code %s", code)
    logging.info("fileId %s", fileId)
    credentials = AccessTokenCredentials(
        code,
        'flex-env/1.0')
    http = httplib2.Http()
    http_auth = credentials.authorize(http)
    # Exports a Google Doc to the requested MIME type and returns the exported
    # content. Please note that the exported content is limited to 10MB.
    # v3 does not work? over quota?
    drive_service = build('drive', 'v3', http=http_auth)
    drive_request = drive_service.files().export(
        fileId=fileId,
        mimeType='application/pdf')
    b = bytes()
    fh = io.BytesIO(b)
    downloader = MediaIoBaseDownload(fh, drive_request)
    done = False
    try:
        while done is False:
            status, done = downloader.next_chunk()
            # logging.log needs a level as its first argument; use info instead
            logging.info("Download %d%%.", int(status.progress() * 100))
    except Exception as err:
        logging.error(err)
        logging.error(err.__class__)
    response = make_response(fh.getbuffer())
    response.headers['Content-Type'] = 'application/pdf'
    response.headers['Content-Disposition'] = \
        'inline; filename=%s.pdf' % 'yourfilename'
    return response
It is based on a Drive API code example. I am trying to export some files from Google Drive to PDF format.
The exception comes from the line
response = make_response(fh.getbuffer())
It throws:
TypeError: 'memoryview' object is not callable
How can I retrieve the PDF content properly from fh? Do I need to apply some base64 encoding as well?
My local runtime is Python 3.4.3.
I have used an incorrect API. I should do this instead:
response = make_response(fh.getvalue())
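The difference, in a standalone snippet: getbuffer() returns a memoryview over the underlying buffer, while getvalue() returns a bytes copy, which is what make_response can serialize:

import io

buf = io.BytesIO(b'fake pdf bytes')
print(type(buf.getbuffer()))  # <class 'memoryview'>
print(type(buf.getvalue()))   # <class 'bytes'>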
