I am getting a list of all of my usage plans in AWS through Boto3 and noticed that I am missing several usage plans compared to what should be there. Specifically, Boto3 thinks there are 25 plans while awscli counts 39 (which is the number displayed in the AWS console). Below is the code that I'm using to get the usage plans for my specific setup:
Python file:
import boto3
session = boto3.session.Session(profile_name='myprofile')
plans = session.client('apigateway').get_usage_plans()
print(len(plans.get('items')))
Running the file returns the following:
$ python3 getplans.py
25
While going through awscli returns the following:
$ aws apigateway get-usage-plans --profile myprofile | jq '.items | length'
39
I looked through the output of both and there are just some complete plans missing, without any real rhyme or reason behind which ones. Does anyone know why this might be happening?
I figured it out, for anyone who finds this question later: it looks like Boto3 was paginating the response. I ended up fixing the problem with the following code:
import boto3

session = boto3.session.Session(profile_name='myprofile')
client = session.client('apigateway')
paginator = client.get_paginator('get_usage_plans')
page_iterator = paginator.paginate()

plans = []
for page in page_iterator:
    for plan in page['items']:
        plans.append(plan)
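Equivalently, botocore paginators can aggregate the pages for you. A minimal sketch of the same lookup using build_full_result(), assuming the same 'myprofile' profile as above:

import boto3

session = boto3.session.Session(profile_name='myprofile')
client = session.client('apigateway')
# build_full_result() walks every page and merges the paginated 'items' lists
result = client.get_paginator('get_usage_plans').paginate().build_full_result()
print(len(result['items']))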
I have written Python Boto3 code to take an EC2 inventory in AWS, and in that same code I am modifying it to write the EC2 inventory of multiple AWS accounts to a CSV, but I am getting the EC2 output details only for the last value. Can someone help me with the script below to generate a CSV EC2 inventory for multiple AWS accounts?
import boto3
import csv

profiles = ['dev','stag']
for name in profiles:
    aws_mag_con=boto3.session.Session(profile_name=name)
ec2_con_re=aws_mag_con.resource(service_name="ec2",region_name="ap-southeast-1")
cnt=1
csv_ob=open("inventory_info.csv","w",newline='')
csv_w=csv.writer(csv_ob)
csv_w.writerow(["S_NO","Instance_Id",'Instance_Type','Architecture','LaunchTime','Privat_Ip'])
for each in ec2_con_re.instances.all():
    print(cnt,each,each.instance_id,each.instance_type,each.architecture,each.launch_time.strftime("%Y-%m-%d"),each.private_ip_address)
    csv_w.writerow([cnt,each.instance_id,each.instance_type,each.architecture,each.launch_time.strftime("%Y-%m-%d"),each.private_ip_address])
    cnt+=1
csv_ob.close()
With the above script I am getting the output for the stag AWS account only.
This is because your indentation is incorrect. Only the first line is inside the loop, so everything else runs once, after the loop finishes, using the last element of profiles. It should be:
import boto3
import csv

profiles = ['dev','stag']
for name in profiles:
    aws_mag_con=boto3.session.Session(profile_name=name)
    ec2_con_re=aws_mag_con.resource(service_name="ec2",region_name="ap-southeast-1")
    cnt=1
    csv_ob=open("inventory_info.csv","w",newline='')
    csv_w=csv.writer(csv_ob)
    csv_w.writerow(["S_NO","Instance_Id",'Instance_Type','Architecture','LaunchTime','Privat_Ip'])
    for each in ec2_con_re.instances.all():
        print(cnt,each,each.instance_id,each.instance_type,each.architecture,each.launch_time.strftime("%Y-%m-%d"),each.private_ip_address)
        csv_w.writerow([cnt,each.instance_id,each.instance_type,each.architecture,each.launch_time.strftime("%Y-%m-%d"),each.private_ip_address])
        cnt+=1
    csv_ob.close()
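One thing to watch out for: because inventory_info.csv is opened with "w" inside the loop, each profile still recreates the file, so only the last profile's rows survive. A minimal sketch of one way around that, opening the file once before iterating over the same hypothetical profiles:

import boto3
import csv

profiles = ['dev','stag']

# Open the CSV once so rows from every profile land in the same file
with open("inventory_info.csv", "w", newline='') as csv_ob:
    csv_w = csv.writer(csv_ob)
    csv_w.writerow(["S_NO","Instance_Id","Instance_Type","Architecture","LaunchTime","Private_Ip"])
    cnt = 1
    for name in profiles:
        session = boto3.session.Session(profile_name=name)
        ec2 = session.resource(service_name="ec2", region_name="ap-southeast-1")
        for each in ec2.instances.all():
            csv_w.writerow([cnt, each.instance_id, each.instance_type, each.architecture,
                            each.launch_time.strftime("%Y-%m-%d"), each.private_ip_address])
            cnt += 1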
I am writing a simple Cloud Function that is triggered by a file upload into GCS and in turn triggers a Dataflow job. For the sake of simplicity, my current pipeline simply reads the file from GCS and then writes it to another bucket. While this Dataflow job works well without the Cloud Function, things go differently when it is launched from the Cloud Function: the function logs the file details correctly and triggers a Dataflow job, but the Dataflow job then fails with a "module not found" error. So while the function executes and submits the job properly, the Dataflow job itself does not come through. Here is the code that I have:
def hello_gcs(event, context):
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    input_file = f"gs://{event['bucket']}/{event['name']}"
    output_path = 'gs://<gcs_output_path>'
    dataflow_options = ['--project=<project_name>', '--runner=DataflowRunner','--region=<region>','--temp_location=gs://<temp_location>']
    options = PipelineOptions(dataflow_options, save_main_session = True)

    print('Event ID: {}'.format(context.event_id))
    print('Event type: {}'.format(context.event_type))
    print('Bucket: {}'.format(event['bucket']))
    print('File: {}'.format(event['name']))
    print('Metageneration: {}'.format(event['metageneration']))
    print('Created: {}'.format(event['timeCreated']))
    print('Updated: {}'.format(event['updated']))

    p = beam.Pipeline(options=options)
    print_files = (p | beam.io.ReadFromText(input_file) | beam.io.WriteToText(output_path, file_name_suffix='.txt'))
    result = p.run()
I also have a "requirements.txt" file added in the same directory as my function for the following two dependencies:
apache-beam[gcp]==2.39.0
functions-framework==3.*
I have seen in multiple comments that making a Dataflow template bypasses this issue, but I am wondering whether anyone has an idea why this error is being thrown, whether it can be circumvented by modifying the current setup, and if not, how to create a template such that this input file can be fed in as a parameter?
Thank you!
This is probably a limitation of the save_main_session approach to staging dependencies. The functions-framework is not needed for Beam or Dataflow, but is just something that is loaded into the interpreter during the execution of your Cloud Function.
I suggest disabling the save_main_session option and/or using the --requirements_file or --setup_file options to provide a specification of the dependencies your pipeline will need at runtime.
Detailed documentation for dependency management is at https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
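A minimal sketch of what that could look like in the function above, reusing the placeholders from the question and assuming a separate requirements file for the pipeline itself:

from apache_beam.options.pipeline_options import PipelineOptions

dataflow_options = [
    '--project=<project_name>',
    '--runner=DataflowRunner',
    '--region=<region>',
    '--temp_location=gs://<temp_location>',
    # stages only the packages the pipeline needs at runtime (e.g. not functions-framework)
    '--requirements_file=requirements.txt',
]
# save_main_session is left at its default (False) instead of being set to True
options = PipelineOptions(dataflow_options)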
I'm using the AWS SDK for Python (boto3) and want to set the subtitle output format (e.g. SRT). When I use this code, I get the error below, which says the Subtitles parameter is not a valid parameter, but according to the AWS documentation I should be able to pass values in this parameter.
import time

import boto3

s3 = boto3.client('s3', aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET_KEY)
transcribe = boto3.client('transcribe', aws_access_key_id=ACCESS_KEY,
                          aws_secret_access_key=SECRET_KEY, region_name=region_name)

job_name = "kateri1234"
job_uri = "s3://transcribe-upload12/english.mp4"

transcribe.start_transcription_job(TranscriptionJobName=job_name,
                                   Media={'MediaFileUri': job_uri},
                                   MediaFormat='mp4',
                                   LanguageCode='en-US',
                                   Subtitles={'Formats': ['vtt']},
                                   OutputBucketName="transcribe-output12")

while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
The ERROR I get is: Unknown parameter in input: "Subtitles", must be one of: TranscriptionJobName, LanguageCode, MediaSampleRateHertz, MediaFormat, Media, OutputBucketName, OutputEncryptionKMSKeyId, Settings
I referred to the AWS documentation.
I have faced a similar issue, and after some research I found out it was because of my boto3 and botocore versions.
I have upgraded these two packages, and it worked. My requirements.txt for these two packages:
boto3==1.20.0
botocore==1.23.54
P.S.: Remember to check that these two new versions are compatible with your other Python packages, especially if you are using other AWS libraries like awsebcli. To make sure everything works together, run this command after upgrading these two libraries to check for errors:
pip check
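As a quick sanity check, you can also print the installed versions from the same interpreter that runs your script; a minimal sketch:

import boto3
import botocore

# The Subtitles parameter is only recognised by sufficiently recent releases,
# so confirm the script actually picks up the upgraded packages.
print(boto3.__version__, botocore.__version__)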
I am trying to retrieve historical financial data from iex or morningstar. For this I use the following code:
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
import pandas_datareader.data as web
import datetime
start = datetime.datetime(2019, 1, 1)
end = datetime.datetime(2019, 1, 10)
facebook = web.DataReader("FB", 'morningstar', start, end)
print(facebook.head())
Unfortunately I get the error message:
NotImplementedError: data_source='morningstar' is not implemented
or
ValueError: The IEX Cloud API key must be provided either through the
api_key variable or through the environment variable IEX_API_KEY
depending on which of both sources I use.
I tried to
pip uninstall pandas-datareader
pip install pandas-datareader
several times and also restarted the kernel, but nothing changes. Was there any change to these APIs or am I doing anything wrong?
From the documentation:
You need to obtain the IEX_API_KEY from IEX and pass it to os.environ["IEX_API_KEY"]. (https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#remote-data-iex)
I don't know if the IEX API still works.
The morningstar source is not implemented. The following data sources are (at the time of writing):
Tiingo
IEX
Alpha Vantage
Enigma
Quandl
St.Louis FED (FRED)
Kenneth French’s data library
World Bank
OECD
Eurostat
Thrift Savings Plan
Nasdaq Trader symbol definitions
Stooq
MOEX
You must provide an API Key when using IEX. You can do this using
os.environ["IEX_API_KEY"] = "pk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
or by exporting the key before starting the IPython session.
You can visit iexcloud.io; after creating a student account, you will get an API key for free.
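Putting that together with the code from the question, a minimal sketch (the key is a placeholder, and as noted above I don't know whether the IEX endpoint still works):

import os
import datetime
import pandas_datareader.data as web

os.environ["IEX_API_KEY"] = "pk_<your key>"  # placeholder, use your own key

start = datetime.datetime(2019, 1, 1)
end = datetime.datetime(2019, 1, 10)

# 'iex' is an implemented source; 'morningstar' is not
facebook = web.DataReader("FB", "iex", start, end)
print(facebook.head())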
This script is giving me a 500 Error, any ideas?
I am taking the script from a page of Python samples and I am also using the path given to me by my hosting company (and I know it works because I have another script that does work).
The file has 755 permissions, as does its directory:
#!/home3/master/bin/python
import sys
sys.path.insert(1,'/home3/master/lib/python2.6/site-packages')
from twython import Twython
twitter = Twython()
trends = twitter.getCurrentTrends()
print trends
There are two problems with this code. The first is that you have not included any OAuth credentials, so the Twitter API will reject whatever you send. The second is that there is no getCurrentTrends() method in Twython. Did you mean get_available_trends or get_closest_trends?
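A minimal sketch of what an authenticated call could look like, with placeholder credentials and using get_available_trends(), which does exist in Twython:

from twython import Twython

# Placeholders - substitute the keys and tokens from your own Twitter app
APP_KEY = 'xxxx'
APP_SECRET = 'xxxx'
OAUTH_TOKEN = 'xxxx'
OAUTH_TOKEN_SECRET = 'xxxx'

twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
# get_available_trends() returns the locations Twitter has trending topics for
trends = twitter.get_available_trends()
print(trends)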