Using shared AWS config/credentials files locally in NodeJS ExpressJS app - node.js

I'm asking this question because I spent a long time trawling Google trying to find an answer, and finally stumbled upon a single line in an AWS doc that pointed me to the correct answer. It is not obvious, so I hope this saves someone some time.
Problem:
I am trying to call AWS services via the AWS SDK in my NodeJS Express app, but the SDK is not loading the profile and key details I have in my ~/.aws/credentials and ~/.aws/config files. When I try to log the credentials object, it is empty. I am trying to use a profile which I have selected via the AWS_PROFILE environment variable, set in a .env file. According to the AWS docs, the SDK should look in these files under ~/.aws, but it doesn't seem to be doing this.

So the answer lies in a single line in one of the AWS docs: http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/getting-started-nodejs.html
Similarly, if you have set your region correctly in your config file, you can display that value by setting the AWS_SDK_LOAD_CONFIG environment variable to a truthy value
Adding AWS_SDK_LOAD_CONFIG=true to my .env file caused the SDK to start using the credentials stored in the ~/.aws/config and ~/.aws/credentials files.
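For anyone who wants to see it end to end, here is a minimal sketch (the profile name is illustrative; it assumes the v2 aws-sdk and dotenv packages, and loads the .env file before the SDK so the variables are visible when the SDK initialises):

// .env (illustrative profile name)
//   AWS_PROFILE=my-profile
//   AWS_SDK_LOAD_CONFIG=true

require('dotenv').config(); // load .env before requiring the SDK
const AWS = require('aws-sdk');

console.log(AWS.config.region); // now picked up from ~/.aws/config
AWS.config.getCredentials((err) => {
  if (err) console.error(err);
  else console.log('Loaded credentials for key:', AWS.config.credentials.accessKeyId);
});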
I don't know why the docs don't make it more explicit that you need to do this (that page suggests you only need to set it to get the region picked up); everything I read suggested that these files would be checked by the SDK regardless.
Hopefully this helps save someone else a lot of frustration and Googling!

Related

Read Words from a PDF or DOC file in React or Firebase Cloud Functions

I have a simple React application (with no back-end servers), hosted on Firebase.
The application takes in a file (either a Word document or a PDF), stores it in Firebase Storage, and stores the metadata in Firestore.
I have a requirement to read the number of words in the file and, if it is more than 500, block the upload.
I have been searching for a way to do this using just React and I think it can't be done. The other option I have is to use Cloud Functions in Firebase, which use Node.js, and even with that I am not finding any solution.
At this point in time, I can't set up a proper back-end server to do this work.
I would be grateful if someone can point me in the right direction to solve this.
Thanks.
A few PDF readers exist for Node that can be installed in Cloud Functions, since most don't work in the browser. It is advisable to upload the PDF to Storage first to avoid processing issues; simply add the file's reference path to the Cloud Function payload and delete the file after completion.
pdfreader is, at this time, the best available option for PDF parsing, but it requires a Node environment such as Cloud Functions (a rough sketch follows after the links below).
The second issue is reading image-based PDFs, which require OCR; links are provided below.
https://www.npmjs.com/package/pdfreader
https://www.npmjs.com/package/pdf-ocr
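As a rough, hypothetical sketch of the word-count idea with pdfreader (it assumes the PDF has already been downloaded from Storage to a local path such as /tmp/upload.pdf, and the count is approximate because pdfreader emits positioned text fragments):

const { PdfReader } = require('pdfreader');

// Resolve with an approximate word count for the PDF at filePath.
function countPdfWords(filePath) {
  return new Promise((resolve, reject) => {
    let words = 0;
    new PdfReader().parseFileItems(filePath, (err, item) => {
      if (err) reject(err);
      else if (!item) resolve(words); // no more items: parsing finished
      else if (item.text) words += item.text.trim().split(/\s+/).filter(Boolean).length;
    });
  });
}

// Inside a Cloud Function you could then block the upload when the limit is exceeded:
// countPdfWords('/tmp/upload.pdf').then((n) => { if (n > 500) { /* reject and clean up */ } });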

Archive not found in the storage location: Google Function

I have a running Google Cloud Function which I use in my code and it works fine.
But when I go to the Cloud Function to see the source code, it shows:
Archive not found in the storage location
Why can't I see my source code? What should I do?
Runtime: Node.js 10
There are two possible reasons:
You may have deleted the source bucket in Google Cloud Storage. Have you perhaps deleted a GCS bucket named something like gcf-source-xxxxxx? It is the source storage where your code is archived. If you are sure you have deleted the source bucket, there is no way to restore your source code.
Much more likely, though, is that you did not delete anything but instead renamed the bucket, for example by choosing a nearby city in the location settings. If the GCS bucket's region does not match your Cloud Function's region, the error is thrown. You should check both services' regions.
You can check the Cloud Function's region under Details -> General information.
This error had appeared before, when I browsed the Google Storage location that is used by the Cloud Function without deleting anything there. It might have happened, though, that I changed the location / city of the bucket to Region MY_REGION (MY_CITY). In my case the CF was likely already in the chosen region, so bullet point 2 of the answer above probably does not cover the whole issue.
I guess a third point could be added to the list:
3. If you choose a region for the first time, the bucket name gets a new suffix that was not there before; that is, it changes from gcf-sources-XXXXXXXXXXXX to gcf-sources-XXXXXXXXXXXX-MY_REGION. Then the CF is no longer able to find its source code at the old bucket address. That would explain this first error.
Setting that first error aside, the error in question now appears again, and this time I have not done anything apart from running into Google app engine deployment fails- Error while finding module specification for 'pip' (AttributeError: module '__main__' has no attribute '__file__'). I left it for two days without doing anything, only to get the error in question afterwards. So you can sometimes just lose your deployed script out of nowhere; it is better to keep a backup before each deployment.
Solution:
Create a new Cloud Function or
edit the existing Cloud Function: choose Inline Editor as the source code option, recreate the default files for the Node.js 10 runtime manually, and fill them with your backup code (a minimal sketch of those default files follows below).
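For reference, the inline editor's default layout for a Node.js 10 HTTP function is roughly the following (the function name and contents here are only illustrative defaults; replace them with your backup code):

// index.js
exports.helloWorld = (req, res) => {
  const message = req.query.message || (req.body && req.body.message) || 'Hello World!';
  res.status(200).send(message);
};

// package.json
// {
//   "name": "sample-http",
//   "version": "0.0.1"
// }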

I want to record some user actions for my Python Flask app deployed on Heroku. Suggest ways to do it

I have a Python Flask app deployed on Heroku. I want to record user interactions in a file (a kind of log file). Since Heroku's storage is ephemeral, the data is lost even though I append actions to a log file. I don't want to use a database for this simple task. My idea is to have an API that can modify files in a remote file system. I am looking for such a remote file system (cloud storage), along with an API, to accomplish my task.
For example, let us assume that I have 3 buttons on my app and a tracking.txt file. Then
if button1 is clicked, I want to write (append) 1 to tracking.txt.
Similarly for button2 and button3.
I have searched the internet but didn't find anything that fits my exact need, or I didn't understand the options well.
Any help is appreciated. Thanks in advance.
PS: I am open to changing my approach if there's no way other than using a DB.
One possible solution is to use Amazon S3 together with Boto3, the Amazon Web Services (AWS) SDK for Python.
You can copy (push) your file from Heroku to an S3 bucket (at intervals or after every change, depending on your logic):
import boto3

session = boto3.session.Session()
s3 = session.client(
    service_name='s3',
    aws_access_key_id='MY_AWS_ACCESS_KEY_ID',          # placeholder credentials
    aws_secret_access_key='MY_AWS_SECRET_ACCESS_KEY'
)

# upload the file from a local path to the S3 bucket
s3.upload_file(Bucket='data', Key='files/file1.log', Filename='/tmp/file1.log')
A nice option with this approach is that you can use LocalStack for your local development; that way only your (production-like) application on Heroku sends files to the real S3, while during development you can work offline.

Node not able to read environment variables in AWS Beanstalk

I can't go into details unfortunately, but I'll try to be as thorough as possible. My company is using AWS Beanstalk to deploy one of our Node services. We have set an environment property through the AWS configuration dashboard: the key ENV_NAME points to a value, in this case one of our domains.
According to the documentation, and another resource I found, once you plug your variables in you should be able to access them through process.env.ENV_NAME. However, nothing is coming out. The names are correct, and process.env is even logging an empty object.
The documentation seems straightforward enough, and the other guide as well. Is anyone aware of any extra steps between setting the key-value pair in the dashboard and console logging the value once the application is running in the browser?
Turns out I'm an idiot. We were referencing the environment variable in the JavaScript that was being sent to the client, so we were never looking for the env variable until after it had left the server. We've added a new route that returns this value in a response instead.
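In case it helps, here is a minimal sketch of that pattern (the route path is hypothetical): the value stays on the server, and the client fetches it over HTTP instead of reading process.env directly.

const express = require('express');
const app = express();

// Expose the server-side environment variable to the browser via an API route.
app.get('/api/config', (req, res) => {
  res.json({ envName: process.env.ENV_NAME }); // ENV_NAME is set in the Beanstalk dashboard
});

app.listen(process.env.PORT || 3000);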

cherrypy: how to access file system from cherrypy?

I am working on a CherryPy web service to work with my web app. The service needs to be able to access the file system; for example, I want to be able to list all files under a certain directory. I am using os.walk('/public/') but don't seem to get it to work, even though the same code works outside of CherryPy.
Is there a way to make this work so I can use CherryPy to manage files?
What user is the webapp running as, and does it have access to read the folder?
According to the documentation, os.walk() will ignore errors from the underlying calls to os.listdir():
http://docs.python.org/release/2.4.4/lib/os-file-dir.html
You could try setting the onerror argument, like this:
def print_error(error):
    print(error)  # os.walk silently skips unreadable directories unless onerror is set

# onerror must be passed by keyword; the second positional argument is topdown
for root, dirs, files in os.walk('/public/', onerror=print_error):
    pass
which might give you a hint as to what's going on.
Also, you could try calling os.listdir() directly and see if you get any errors from it.
