I'm building a REST API with Node.js and MongoDB as the database.
This API only has GET routes to retrieve data. The data is aggregated from other sources via a cron job.
I'm not sure how to implement this correctly or what the best practices are.
Do I need to create POST/PUT routes to put data into the database?
Or should I just put the data directly into the database?
Edit (more information):
Only the cron jobs would use the POST route.
The cron jobs get data from other REST APIs and from some web scraping.
Is it a good idea to have my cron jobs in the same application as the API, or should I make another application to manage the cron jobs and populate the database?
I suggest creating an API endpoint that can be called with an accessKey for updating the data, because you would not want to write your MongoDB username and password in a shell file.
But if the cron job is a Kubernetes CronJob, or just a program written in some language that can access the database in a secure way and is hosted on the same network, then you can go ahead and write to the database from the cron job directly.
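A minimal sketch of the accessKey idea, assuming an Express app and the official mongodb driver (the x-access-key header, the INGEST_ACCESS_KEY variable, and the items collection are all just placeholder names, not a prescription):

const express = require('express');
const { MongoClient } = require('mongodb');

const app = express();
app.use(express.json());

const client = new MongoClient(process.env.MONGO_URL);

// Reject any request that does not carry the shared access key.
function requireAccessKey(req, res, next) {
  if (req.get('x-access-key') !== process.env.INGEST_ACCESS_KEY) {
    return res.status(401).json({ error: 'invalid access key' });
  }
  next();
}

// Only the cron jobs call this route; the public API stays GET-only.
app.post('/internal/items', requireAccessKey, async (req, res) => {
  // Upsert so the cron job can safely re-send the same records.
  await client.db('mydb').collection('items').updateOne(
    { externalId: req.body.externalId },
    { $set: req.body },
    { upsert: true }
  );
  res.status(204).end();
});

client.connect().then(() => app.listen(3000));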
If you'd like to control the data flow entirely through the API at this point, creating a POST route would be the way to go. I am not sure how your GET routes are secured; if they aren't secured at all, consider implementing, or at least hard-coding, some sort of security for the routes that modify your data (OAuth2 or similar).
If you can access the database directly and that's what you want, you can just insert/update the data directly.
The second option would probably be quicker, but the first one offers more room for expansion in the future and could be more useful overall.
So in the end, both options are valid; it comes down to preference and your use case.
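If you go with the direct option and keep the cron jobs inside the same Node application, a rough sketch could look like this (node-cron, Node 18+'s global fetch, and the source URL and items collection are assumptions for illustration):

const cron = require('node-cron');
const { MongoClient } = require('mongodb');

const client = new MongoClient(process.env.MONGO_URL);

// Pull data from another REST API and write it straight into MongoDB.
async function aggregate() {
  const response = await fetch('https://example.com/api/source-data');
  const records = await response.json();

  const items = client.db('mydb').collection('items');
  for (const record of records) {
    await items.updateOne(
      { externalId: record.id },
      { $set: record },
      { upsert: true }
    );
  }
}

client.connect().then(() => {
  // Run at minute 0 of every hour.
  cron.schedule('0 * * * *', () => {
    aggregate().catch((err) => console.error('aggregation failed', err));
  });
});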
I am attempting to write an Azure Cosmos DB integration (Core SQL API) that integrates with an external service to provide some of the query data. As an example, I need a query made on Cosmos DB to convert some of the data returned by the query (e.g. IDs) into real data by calling an external service via a REST API. This should only happen when querying certain columns.
I initially investigated using a JS stored procedure and/or a UDF to make this external call, but the JS environment seems to be extremely limited and doesn't provide any way to make external calls. I then tried this https://github.com/Oblarg/cosmosdb-storedprocs-ts repository, which uses webpack to bundle all of Node.js into the stored procedure, allowing node modules to be used in stored procedures. While this does allow some node modules to be used, whenever I try to use the "https", "fetch", or "axios" modules to make an HTTP GET request I get errors (the same code works fine in a normal Node environment, but I'm not a JS expert and can't seem to work past these errors). After a day of attempts it seems like the stored procedure approach is not possible.
Is this the case, or is there some way of making HTTP GET requests from a JS stored procedure? If it isn't possible with stored procedures, are there any other techniques to achieve the requirement of reading data from a remote API when querying Cosmos DB?
Thanks
There is no way to achieve this from Cosmos DB directly. For queries you also cannot use the change feed, since the documents don't change, so really your only option is to use a function or some preprocessor app to handle it. As you say, it's not ideal, but there is no other solution here. If it were an insert or an update, the change feed would allow you to do this, but for plain queries it's not possible.
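To make the "preprocessor app" idea concrete, here is a rough sketch in Node using the @azure/cosmos SDK, where the query results are enriched by calling the external service afterwards (the database/container names, the query, and the lookup URL are placeholders, and it assumes Node 18+ for the global fetch):

const { CosmosClient } = require('@azure/cosmos');

const client = new CosmosClient({
  endpoint: process.env.COSMOS_ENDPOINT,
  key: process.env.COSMOS_KEY,
});

// Query Cosmos DB, then resolve the returned IDs against the external REST API.
async function queryWithEnrichment(type) {
  const container = client.database('mydb').container('items');

  const { resources } = await container.items
    .query({
      query: 'SELECT c.id, c.externalId FROM c WHERE c.type = @type',
      parameters: [{ name: '@type', value: type }],
    })
    .fetchAll();

  return Promise.all(
    resources.map(async (doc) => {
      const res = await fetch(`https://example.com/api/lookup/${doc.externalId}`);
      return { ...doc, details: await res.json() };
    })
  );
}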
Let's say, hypothetically, I am working on a website which provides live score updates for sporting fixtures.
A script checks an external API for updates every few seconds. If there is a new update, the information is saved to a database, and then pushed out to the user.
When a new user accesses the website, a script queries the database and populates the page with all the information ingested so far.
I am using socket.io to push live updates. However, when someone is accessing the page for the first time, I have a couple of options:
I could use the existing socket.io infrastructure to populate the page
I could request the information when routing the user, pass it into res.render() as an argument and render the data using, for example, Pug.
In this circumstance, my instinct would be to utilise the existing socket.io infrastructure, purely because it would save me writing additional code. However, I am curious to know whether there are any other reasons for, or against, either approach. For example, would it be more performant to render the initial data one way or the other?
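For reference, the two options I have in mind would look roughly like this (just a sketch; the scores collection and the Pug template name are made up):

const express = require('express');
const http = require('http');
const { Server } = require('socket.io');
const { MongoClient } = require('mongodb');

const app = express();
app.set('view engine', 'pug');
const server = http.createServer(app);
const io = new Server(server);

const client = new MongoClient(process.env.MONGO_URL);
const db = client.db('scoresdb');

// Option 1: push the current snapshot over the existing socket.io channel
// as soon as a client connects.
io.on('connection', async (socket) => {
  const scores = await db.collection('scores').find().toArray();
  socket.emit('initialState', scores);
});

// Option 2: query the database while routing and render the page server-side.
app.get('/', async (req, res) => {
  const scores = await db.collection('scores').find().toArray();
  res.render('index', { scores });
});

client.connect().then(() => server.listen(3000));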
I have a question regarding one of my React apps that I recently developed.
It's basically a landing page, which uses a React frontend and a Node+Express backend, and it scrapes data from various pages (the scrapers are developed in Python).
Right now, the React app itself is hosted on Heroku and the scrapers do run, but not ideally.
What I would like to do is to set up a proper flow:
create a database
schedule the scrapers
collect the data in the database
request data from the database in the React app, when needed
I've read about different possibilities such as Firebase, and also different AWS options like EC2, Lambda, S3, etc. I'm a bit lost in the midst of all this, so maybe you can help out and give me some suggestions!
Thanks in advance!
If I understood your problem correctly, then the scraping itself does not have to be tied to your landing page/React application. Let's walk through a potential solution.
SQL Database
You can use any SQL database here, really. Create a table with relevant columns for each source that you will scrape. I personally like RDS Postgres within AWS. Scraping Yahoo Finance? Well, have a table called "yahoo" with columns such as "ticker", "open", "close", "date", etc.
Schedule the scrapers
I assume you have already taken care of the actual scraping/extracting of information from the source with Python. You can use a cron job or the schedule package to run the scrapers hourly/daily/weekly/etc. Connect your scrapers to the SQL database so they can store the data in whichever shape you need. The scrapers can live on an EC2 instance in AWS; you would need to do some setup for the instance. You can also connect the scrapers to a service such as Sentry to easily monitor the progress and errors of the scraping.
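For example, a plain crontab entry on the EC2 instance could look like this (the interpreter path, script path, and log file are placeholders):

# Run the Yahoo Finance scraper at the top of every hour.
0 * * * * /usr/bin/python3 /home/ubuntu/scrapers/yahoo.py >> /var/log/scrapers.log 2>&1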
React App
Connect the database to the Node backend and expose a simple API call from your backend to access the data. You can use the Sequelize ORM to talk to the Postgres database.
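A rough sketch of that last step, assuming Sequelize on top of the "yahoo" table from earlier (the connection string, column types, and route path are illustrative only):

const express = require('express');
const { Sequelize, DataTypes } = require('sequelize');

// Points at the RDS Postgres instance.
const sequelize = new Sequelize(process.env.DATABASE_URL, { dialect: 'postgres' });

// Model mirroring the table the scrapers write to.
const Yahoo = sequelize.define('yahoo', {
  id: { type: DataTypes.INTEGER, autoIncrement: true, primaryKey: true },
  ticker: DataTypes.STRING,
  open: DataTypes.FLOAT,
  close: DataTypes.FLOAT,
  date: DataTypes.DATEONLY,
}, { tableName: 'yahoo', timestamps: false });

const app = express();

// The React app calls this endpoint whenever it needs fresh data.
app.get('/api/quotes/:ticker', async (req, res) => {
  const rows = await Yahoo.findAll({
    where: { ticker: req.params.ticker },
    order: [['date', 'DESC']],
    limit: 30,
  });
  res.json(rows);
});

app.listen(3001);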
To conclude, I believe the idea is relatively straightforward; you just need to select the tools (I gave some suggestions above) and start implementing them!
I'm using Firebase and the SimpleLogin to allow users to login via Google, Twitter etc.
I'd like to use some of the thirdpartyuserdata object to create a user profile for my application which runs on Node.
Currently I'm posting this data to the server so that I can add to it and create the profile object, but I wondered if there's a better way of doing this - is there something I can call server side to get this thirdpartyuserdata without having to post it from the client?
Start by considering that your "server" is actually just another consumer of Firebase data. Since FirebaseSimpleLogin is simply a token generator with some fancy tools for doing OAuth, and because all of this happens completely client-side, there is nothing on the server to consume here.
If you want to consume the data at the server, you will either need to POST it, as you have done, or use Firebase itself to transfer the information. You'll find that a queue approach can save you a large amount of code, as it allows you to use Firebase as the API and avoid creating RESTful services in Node, along with all the baggage that comes with that.
The idea of a queue is simply that you push data into Firebase from one client and read it out (and probably delete it) at the intended recipient (in this case your Node worker).
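A minimal sketch of that queue on the Node side, using the legacy Firebase client from the SimpleLogin era (the /profileQueue and /profiles paths and the field names are assumptions):

var Firebase = require('firebase');

var queueRef = new Firebase('https://your-app.firebaseio.com/profileQueue');
var profilesRef = new Firebase('https://your-app.firebaseio.com/profiles');

// The worker wakes up whenever a client pushes an item into the queue.
queueRef.on('child_added', function (snapshot) {
  var item = snapshot.val();

  // Build the profile from whatever the client pushed after login.
  profilesRef.child(item.uid).set({
    displayName: item.displayName,
    provider: item.provider
  }, function () {
    // Remove the queue entry once it has been processed.
    snapshot.ref().remove();
  });
});

On the client, right after the SimpleLogin callback fires, you would simply push an object into the queue containing the uid, the provider, and whichever pieces of the third-party user data you want the worker to turn into a profile.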
I am writing a service using NodeJS + Restify. I have split each actual service into a separate file (which, I assume, is what everyone does). They are all going to use a MySQL database, so I thought I could open a single connection to the database and share it across the services, rather than opening a new connection every time a request comes in.
The problem is that I can't seem to find a way to pass user data. By user data I mean any custom data that would be accessible by every service callback invoked by the server.
I primarily use NodeJS + Express, but having looked through some of the Restify documentation, I believe you could use the authorization parser (listed under Bundled Plugins on their site).
I think that would be the most basic way to pass user data.
I haven't tested it but, I believe you'd just add this to use it:
server.use(restify.authorizationParser());
You could then access the user data with:
//This is based on the structure of req.authorization in the documentation.
req.authorization.basic.user
I believe you could set new user data (when the user logs in or something) like:
req.authorization.id = 'id';
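Putting it together, a small sketch might look like this (it uses the older restify.authorizationParser() call to match the snippet above; on newer restify versions the plugin lives under restify.plugins.authorizationParser(), and the exact shape of req.authorization depends on the version, so check the docs for your release):

var restify = require('restify');

var server = restify.createServer();

// Parses the Authorization header into req.authorization.
server.use(restify.authorizationParser());

server.get('/whoami', function (req, res, next) {
  // Echo back whatever the parser extracted; with Basic auth the
  // credentials end up under req.authorization.basic.
  res.send(req.authorization);
  return next();
});

server.listen(8080);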