How to create database for React app and schedule scrapers - node.js

I have a question regarding one of my React apps that I recently developed.
It's basically a landing page with a React frontend and a Node+Express backend, and it scrapes data from various pages (the scrapers are written in Python).
Right now, the React app itself is hosted on Heroku, and the scrapers do run, but not ideally.
What I would like to do is set up a proper flow:
create a database
schedule the scrapers
collect the data in the database
request data from the database in the React app, when needed
I've read about different possibilities such as Firebase, also different AWS options like EC2, Lambda, S3 etc. I'm a bit lost in the midst of all this, so maybe you can help out and give me some suggestions!
Thanks in advance!

If I understood your problem correctly, the scraping itself does not have to be tied to your landing page/React application. Let's walk through a potential solution.
SQL Database
You can use any SQL database here, really. I personally like RDS Postgres within AWS. Create a table with relevant columns for each source that you will scrape. Scraping Yahoo Finance? Then have a table called "yahoo" with columns such as "ticker", "open", "close", "date", etc.
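As a minimal sketch of the table described above, the snippet below creates a "yahoo" table with the example columns. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in so it runs anywhere; with RDS Postgres the same CREATE TABLE idea applies through a Postgres driver. The column types and the primary key are assumptions, not something the scrapers require.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres so the sketch is runnable.
conn = sqlite3.connect(":memory:")

# One table per scraped source; column names follow the Yahoo Finance example.
# The names are quoted because some databases treat words like "open" specially.
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS yahoo (
        "ticker" TEXT NOT NULL,
        "open"   REAL,
        "close"  REAL,
        "date"   TEXT NOT NULL,
        PRIMARY KEY ("ticker", "date")
    )
    """
)

# List the columns that were created.
columns = [row[1] for row in conn.execute("PRAGMA table_info(yahoo)")]
print(columns)
```

The composite primary key (one row per ticker per date) is only one possible design; pick whatever uniquely identifies a scraped record for your source.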
Schedule the scrapers
I assume you have already taken care of the actual scraping/extraction of information from the source with Python. You can use a cron job or the schedule package to run the scrapers hourly/daily/weekly/etc. Connect your scrapers to the SQL database so they can store the data in whichever way you need. The scrapers can live on an EC2 instance in AWS; you would need to do some setup for the instance. You can also connect the scrapers to an application such as Sentry to easily monitor scraping progress and errors.
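A rough sketch of one scheduled run, under some loud assumptions: scrape_yahoo is a hypothetical placeholder for your existing scraper and returns a dummy row, and SQLite again stands in for Postgres. The job scrapes and then stores, and the insert is written so that re-running it does not duplicate rows.

```python
import sqlite3
import time


def scrape_yahoo():
    """Hypothetical placeholder for the real scraper; returns dummy rows."""
    return [("TEST", 1.0, 2.0, "2000-01-01")]


def store(conn, rows):
    """Create the table if needed and upsert the scraped rows."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS yahoo ('
        '"ticker" TEXT, "open" REAL, "close" REAL, "date" TEXT, '
        'PRIMARY KEY ("ticker", "date"))'
    )
    # INSERT OR REPLACE keeps repeated runs from duplicating a ticker/date pair.
    conn.executemany("INSERT OR REPLACE INTO yahoo VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_job(conn):
    """One scheduled run: scrape, then persist."""
    store(conn, scrape_yahoo())


def run_forever(conn, interval_seconds=3600):
    """In-process alternative to cron: run the job, sleep, repeat."""
    while True:
        run_job(conn)
        time.sleep(interval_seconds)


# With cron, the script would just call run_job once per invocation, e.g. hourly:
#   0 * * * * /usr/bin/python3 /path/to/scraper.py   (placeholder path)

conn = sqlite3.connect(":memory:")
run_job(conn)
run_job(conn)  # the second run replaces the row instead of duplicating it
row_count = conn.execute("SELECT COUNT(*) FROM yahoo").fetchone()[0]
print(row_count)
```

Letting cron (or a similar scheduler) start a short-lived script is usually easier to monitor than a long-running loop, since each run either exits cleanly or fails visibly.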
React App
Connect the database to the Node backend. The React app can then make a simple API call to your backend to fetch the data and use it. You can use the Sequelize ORM to access the Postgres database.
To conclude, I believe the idea is relatively straightforward. You just need to select the tools (I gave some suggestions) and start implementing them!

Related

Why does backend development need a separate server?

I am developing my own website. So far, I've used React for the frontend and Flask for the backend. I've been doing frontend development for a while now but I'm just starting to get into the backend.
From my limited understanding, frameworks like Flask and ExpressJS create their own servers and host data that the frontend can use. It seems to me that they automatically create websites to host and receive data. In my website, I route the backend to do what I want and use fetch requests with POST and GET from the frontend to communicate.
Although it works, to me, it seems overly complex. Why does the backend need its own server? It seems unnecessary to create a proxy for the frontend and fetch data. Why can a website not just run custom code in the background? Why does it need a service like Flask or ExpressJS to do that for it? These backend frameworks run Python or NodeJS in the background, but wouldn't it be much simpler if the website itself could run Python or NodeJS in the background?
I also see that in frameworks like React, you can import things and use modules, like in NodeJS. While importing some modules works, the require keyword is not allowed and normal NodeJS code will not work. Therefore, the backend code will not work. Why is this? Why can't you just run backend code natively? Instead you have to go through fetch and specify headers to basically translate information from your frontend to your backend.
Forgive my amateur understanding of web development, but the frontend/backend system seems overly complex to me. Thanks in advance.
Why does the backend need its own server?
Where will the client store data so that when you open the page again the data will still be there? You can use localStorage but this is locked to that particular browser. What if someone logs in on a different device or uses a different browser?
Where will the client get the application from in the first place? Your application needs to be packaged up in a form that can be easily downloaded, and it needs an address to be loaded from. This is all considered "back end" even if you're using a static hosting service like GitHub Pages.
There are many reasons why a back end exists and needs its own server. Any application with persistent state that is expected to work across different sessions needs at least one server.

How do I fetch in Remix JS from a specific (back-end) API link / Database URL?

I am trying to follow the Jokes tutorial (https://remix.run/docs/en/v1/tutorials/jokes) and I was wondering: how do I fetch from a specific (back-end) API link / database URL?
Can you provide an example (where you use Prisma)?
On the webpage it says "You can use any persistence solution you like with Remix; Firebase, Supabase, Airtable, Hasura, Google Spreadsheets, Cloudflare Workers KV, Fauna, a custom PostgreSQL, or even your backend team's REST/GraphQL APIs"
I have set the DATABASE_URL in the .env file to the API link; however, I don't know how to continue from here.
Please provide some code, so we can help you based on a specific question.
In general, you would fetch data inside your loader function in Remix. That function runs on your server and can be used to fetch from a database or API.
If you have trouble with Prisma, I would suggest you have a look at the tutorial you are following or look at the Prisma documentation.
You can find more information about data loading in Remix in the Remix documentation: https://remix.run/docs/en/v1/guides/data-loading

Access database from different network

I am trying to build a react-native application and to do so, I am following this tutorial : https://www.youtube.com/playlist?list=PLB97yPrFwo5hMR8znwt0NqgmmqzoPemnT
My application is like a form. People enter data into some fields and it is sent to a database. They can later view their form from the application. So it is like a simple CRUD todo list application.
However, I do not want to use on-demand cloud computing platforms (AWS, GCP, OVH, ...) to store my data. I have my own server at home (NUC Intel) and want to use it as the application's server (store the data in there).
On my server I want to have a MongoDB database and make it available to any computer that requests it with the correct credentials. I would add firewalls as well for security.
This way, users of the React Native application will be able to read from and write to this database wherever they are in the world, as long as they have internet access.
For my backend I will use Node.js and Express.
For my front end I will use React Native and Redux.
Any ideas or tutorials I could follow? I have been looking a lot, but all the tutorials end up storing their database in GCP or AWS. So is it possible to access a MongoDB database from a different network? Or would MySQL be a better solution?

the process between a frontend, backend and cloud database

I'm having a hard time finding information on a simple, straightforward process. I keep getting forwarded to things like the "google cloud engine" and such.
I am attempting to start a new project to expand my knowledge. Previously, I developed a localhost web app which included a working frontend with React, an Express backend (REST API) and a Mongo database. I understood the concepts of REST calls, state management, authentication and such.
The new setup is flutter, nodeJS (express), and firebase.
Looking at quick tutorials, I have a simple Flutter app working with an HTTP POST for a user sign-up. Makes sense.
Normally in nodeJS, I'd have a route it hits, e.g. router.post('/users', function (req, res, next) ... and then I'd have a model schema, and if everything is correct it would post.
Exploring the relationship between Firebase and nodeJS, I'm slightly overwhelmed by how this works. I thought it would be something as simple as an authentication key (which, btw, I have sorted out with firebase-admin) and then I could proceed on my merry way with my models and routes/services.
Are the models defined within firebase, and my node just confirms the requests and talks through the firebase API? I haven't been able to locate any simple resources for this.
Since you didn't say which product within Firebase you're using (Firebase is a suite of products, not just one thing), I'm going to assume you mean Realtime Database or Cloud Firestore. They are both schemaless NoSQL databases -- they don't impose any structure on the data you put into them. There's no model, there's no validation. That's all stuff you have to do on your own, if you want. Or not, if you want flexibility.
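To illustrate what "doing validation on your own" can look like, here is a minimal, language-agnostic sketch written in Python. The same kind of check would sit in your Express route handler before the write to Firebase. The field names ("email", "name") and the rules are hypothetical, chosen only for illustration.

```python
def validate_user(payload):
    """Return a list of problems; an empty list means the payload is acceptable."""
    errors = []

    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append("email must be a string containing '@'")

    name = payload.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")

    return errors


# Only write to the database when validation passes.
ok = validate_user({"email": "a@example.com", "name": "Ada"})
bad = validate_user({"email": "nope"})
print(ok)
print(bad)
```

Because the database itself will accept any shape of data, a check like this in the backend is the only place the "model" is enforced.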

How to fetch from nodejs-api-starter into react-starter-kit

I am trying out React-Starter-Kit for the first time and loving all the cutting-edge features baked in (apollo/graphql-client in particular). A crucial part of any app for me is the database, and for that my understanding is that the same author provides nodejs-api-starter, which sets up a REST interface for accessing Postgres at localhost:5000 and has a GraphQL web UI at localhost:5000/graphql.
That is about as far as I have been able to understand the setup so far. I have changed the frontend code a little bit so that a new component, "Counter", is loaded on the home page. I need to be able to make a new counter, fetch the latest counter, and increment and decrement the counter. Right now the component just outputs the 'value' retrieved from the server at port 5000.
I do not think I am accessing the server at port 5000 correctly. Do I put the port in this URL line somehow?
You can pull the repo down from: https://github.com/Falieson/react-starter-kit-crud-counter-demo
This is my first time setting up a nodejs api server, I am used to using MeteorJS which has pub/sub to MongoDB baked in. I am looking forward to the separation the RSK strategy (which seems more industry standard?) provides.
I've just finished setting up a full site with a database from React-Starter-Kit. I'm also a newbie, so I understand your frustration.
About this question: you don't need the NodeJS-API-Starter. It has more advanced features (such as a Redis cache) and is not suited for newbies. You should look deeper into RSK, which already includes a database. If you ran the boilerplate and played around, chances are you'll see a file named database.sqlite in your folder; that is the database. Here are the things you should learn:
Use SequelizeJS to connect the NodeJS server to the database. Your database can be MySQL/MariaDB, PostgreSQL or SQLite. The connection is easy, and there's a tool to auto-generate models from your database.
How to create GraphQL types and queries. If your queries need to search through the database, import Sequelize's models and use their functions.
Test your API via GraphiQL.
Note: if you want to use MongoDB or other NoSQL, try Mongoose instead of Sequelize.

Resources