Pre-render a static website from REST-api and templates? - node.js

I have a rest-api that I will use to render html using some basic templating language. I wonder if there is any good platform or service for pre-rendering HTML-files and serv them statically. For performance and scalability.
I need to pre render the pages contiously, like every 24 hours, and it should also be possible to tell the system to re-render a specific page somehow. I'm comfortable in most open-source languages, node is a favourite.

It seems to me that the most straightforward way to accomplish this is to use two tiers: a rendering server and a cache server. When cache server starts up it would crawl through every url on the rendering server and store the pre-rendered HTMLS files into its local directory. For simplicity you can mirror the "directory structure" and make the resource paths identical. In other words, for every URL on the rendering server that looks like this:
http://render.xyz/path/to/resource
You create a directory structure /path/to on the cache server and put a file resource in it.
Your end-users don't need to be aware of this architecture. They make requests to the cache server like this:
http://cache.xyz/path/to/resource
The cache server gives them the result they are looking for.
There are many ways to tell the cache server to refresh (re-generate) a page. You could add a "hidden" directory, let's call it .cache-command, and use it to handle refresh requests. For example, to tell the cache server to refresh a resource, you would use a URL like this:
http://cache.xyz/.cache-command/refresh/path/to/resource
When the cache server received that request, it would refresh the resource.
One of the advantages of this approach is that your cache server can be completely independent of the render server. They could be written in different languages, running on different hardware, or they could be part of the same nodejs application. Whatever works best for you.

Related

What is the difference between these two ways of serving React App?

I would like to serve a react project from the nodejs server. I encountered the two ways of doing it:
The first way is to use express to serve just the build folder for whatever the req made:
const express = require('express')
const app = express()
const path = require('path')
app.use(express.static(path.join(__dirname,'build')))
app.get('*',function(req,res){
res.sendFile(path.join(__dirname,'build','index.html'))
})
module.exports = app;
The second way is one using ReactDOM.hydrate and ReactDOMServer.renderToString to serve the app. It is described here.
What is best way to achieve the good SEO from the above mentioned ways and when to choose one over other?
Thank you!!!
CSR
The first approach, where you just serve the build folder and direct all the requests to index.html is a default way of how single-page applications (SPA) work. This approach is called Client Side Rendering (CSR), meaning that client (browser) will be responsible for preparing all the content of your website via executing javascript code of your application, then fetch all the data from API (news, posts, profile, etc.) and, finally, build the page layout and display everything on the screen.
SSR
In turn, with the second approach you mentioned, server prepares (renders) the whole document (HTML) with content and sends it to the client which only needs to display it. This is called Server Side Rendering (SSR) and in your case the ReactDOMServer is responsible for that. However, since you want your application to be interactive, you need to "revive" it with javascript (in our case with React) and that is what ReactDOM.hydrate actually does. It appends all the necessary event listeners to existing markup and makes page to behave in the way it would behave if it was fully rendered on the client (default CSR).
SEO
There is a general opinion that using CSR has a bad impact on SEO because bots crawling the site need to perform additional steps (execute javascript) and it slows down the process and make it less efficient, moreover, not all the bots can run javascript at all.
However, nowadays, modern crawlers (e.g. Google) can cope with SPA quite good, but the end results might be not as good as with SSR.
If you are at the beginning of project development and SEO is really a very high priority for you, then you should choose SSR.
However, instead of implementing everything yourself with ReactDOMServer and hydrate, I'd recommend you to take a look at the Next.js - it is powerful and easy to learn React framework.
P.S.
SSG
You also should be aware of the Static Site Generation (SSG) approach, where every single page of your application gets prerendered at the build stage, producing bunch of HTML files and other assets of your site. Then, all those static files are served from a simple hosting and/or CDN. The main benefits of such approach are: very high speed of page loading, great SEO and usually very low cost for maintenance.
However, this approach suits only sites where content changes very rarely and pages are not interactive. Of course, you may combine it with hydration, but it often leads to quite tricky and buggy solutions in the end.
You can read more details about all three approaches here.
React renders on the client side by default. Most search engine bots however, cannot read JavaScript. Therefore, using server-side rendering is better for SEO, because it generates a static HTML file on the server, which is then served to the client.
Another difference is that client-side rendering will take longer to load the first time, but all consecutive times it will render faster (if the client didn't disable cache). A server-side rendered website has to render the page everytime it loads on the server. Making it slightly slower on average, but will provide consistent loading speeds and a faster first-time loading speed which is important for business landing pages as an example.

Using NodeJS functions in html

So I have made a back-end in NodeJS but I ran into one problem, how is it possible to link my back-end to my front end html/css page and use my NodeJS functions as scripts?
In case this wasn't clear to you, your nodejs back-end runs on your server. The server's job (in a webapp) is to deliver data to the browser. It delivers HTML pages. It delivers resources referenced in those HTML pages such as scripts, images, fonts, style sheets, etc.. It can answer programmatic requests for data also.
The scripts in those web pages run inside the browser which is nearly always (except for some developer testing scenarios) running on a completely different computer on a completely different local network (only connected via some network - usually the internet).
As such, a script in the browser cannot directly reference variables that exist in the server or call functions that exist on the server. They are completely different computers.
The kinds of things you can do in order to work-around this architectural limitation are as follows:
The server can dynamically modify the web page it is sending the browser and it can insert data into that web page. That data can be in the form of already rendered HTML or it can be variables inside of script tags that your web page Javascript can then use.
The javascript in the web page can make network requests to your server asking it for data. These are often called AJAX calls. In this scenario, some Javascript in your page sends a request to the server to retrieve some data or cause some action on the server. The server receives that request, carries out the desired operation and then returns a result back to the client Javascript running in the browser. That client Javascript receives the result and can then act on it, inserting data into the page, making the browser go to a new web page, prompting the user, etc...
There are some other ways that the web page javascript can communicate with the server such as webSocket connections, but we'll put those aside for now as they are just more ways for remote communication to happen - the structure of the communication doesn't really change.
how is it possible to link my back-end to my front end html/css page and use my NodeJS functions as scripts?
You can't directly use your nodejs functions as scripts in the front-end. You can make Ajax calls to the server and ask the server to execute it's own server code on your behalf to carry out some operation or retrieve some data.
If appropriate, you can also insert scripts into the web page and run Javascript directly in the browser, but whether you can do that for your particular situation depends entirely upon what the scripts are doing. If the scripts are accessing some resource that is only available from the server (like a database or a server storage system), then you won't be able to run those types of scripts in the browser. You will have to use ajax calls to ask the server to run them for you and then retrieve the results.

How do you manage repositories for production/deployment of Node-React app?

Not long ago , we used to have server render pages and then React came for client side rendering and single page application.It introduced virtual DOM's and changed the way we write our code.
We require all these react libraries and install them as dependencies before writing our codes. Now we can break into many components , have many css and scss files including images. But at the end we will build the files, make compact bundle and serve from build folder.
Express get route
app.get('*', (req,res) =>{
res.sendFile(path.join(__dirname+'/client/build/index.html'));
});
Heres, What I have understood :
Build folder is the place where webpack combines all the files and create minified bundle ready for deployment. That file is basically simple HTML and JS files which every browser can understand. As all the browser doesn't understand ES6 and much more, we have to convert all these files into plain language that every browser can understand.
Also, webpack-dev server is only for development purposes and we won't be running it into production.
Is virtual DOM/Real DOM just for development purposes? or
are those react libraries also trans-piled while building the minified files? If later is the case , react is run on background mode on client's browser? I want to know how react takes care of client side routing after the building the app.
How do you manage github repositories for Node-React app? Do you keep two different repositories one for front end and other for back-end? Whats the industry standard?
If you keep two repository, how do you deploy the front-end code? As you can't run the webpack-dev-server into production. Nor you can specify the public static (build folder) in your back-end(express server) as they are separated in two repos. How does, either the integration of these two repositories take place( lets say we have two AWS EC2 instance, one for each) or front-end get served from the front-end repo??). Can you actually use something like npm serve in production ??
what am I trying to do ?
I want to deploy my node-react app on AWS. I have only one repository on github. I have one folder "client" inside my repo where all the react code sits with its package.json file. All the other files for server are inside root folder (server doesn't have its own folder and files are scattered inside root folder). So there are two package.json files, one inside root folder for server and one inside client folder.I am planning to run my-node app on a docker container.
Please help me understand the core concepts and standard practices for code hosting and deployment keeping large scale enterprise application in picture.
I would not go into explaining all the points in your question here because, #Arnav Yagnik and #PrivateOmega have both done a brilliant job at explaining most of them. I would definitely recommend you to read their answers properly and read the links provided for more information before reading this answer.
I would like to address your question of deploying a Node-React application. In production, generally, we have different deployments (or "repositories" as you mention in your question) for both the front-end (React) and back-end (Node). This allows your back-end to sit in an EC2 instance, for example, with auto-scaling to make sure that it can cope up with all the requests coming in.
As mentioned in the previous answers, and in your question as well, webpack compiles and minifies the React files into simple HTML and JS files, which most browsers can run (I'm not going to explain VirtualDOM here because it has already been perfectly explained in other answers). You would then take these minified files and serve them from an S3 bucket for example, because again, it is a single page application (also discussed in the other answers) and the business logic is already in the minified JS files and its just simply sending all requests to your back-end server.
Now for front-end, you can use TravisCI for example to deploy the build folder (the one you talk about in your question) to an EC2 instance and serve your files using NGINX or if you can configure a CDN deployment properly, you can serve the files from an S3 bucket for the most optimal performance.
You can think of serving the React application like sending a cryptic block of code to your user's browser. Now you can deploy this cryptic block of code to a publicly available S3 bucket, and serve it from there. Again, because of webpack and minification/uglification, no on would be able to make any proper sense of what your original code was, remember that you can still access all the code in Chrome's Sources tabs for example.
I would like to address this with different approach.
Server Rendered Pages : The concept has not changed, server when encountered with a DOC request it has to respond with a html. Now HTML may or may not contain scripts(can be either inline or a external server address). In case of question's context you can still ship HTML where it will download scripts that you have written(may include react or not). for most cases you can ship empty html with scripts tags which will download the scripts over network and execute them which would contain all the rendering logic.
To Answer your questions :
1st : There is no background mode in a single threaded JS(unless we want to talk about workers but we can leave them out for this discussion). By writing in code you are not interacting with any DOM. You are instructing your components(extended by React) when to change their state and when to re-render(setState). React internally calculates the virtual DOM and compare to Real DOM to calculate actual changes that are to be made on Real DOM(this is very abstract answer, to get more understanding please read react docs, Baseline here is you are not interacting with any DOM just instructing React core library when to update and what is the updated state)
2nd : If you want to support SSR(server rendered pages). I would suggest to make 2 folders , client(this would include all client components and logic) and server(would include all server side logic) with different package.json as packages differ for both applications.There is no such industry standard here, what floats your boat should work but generally making directories based on logical entities should satisfy separation and maintainability, if in future you think you want to fork out server and client in separate repos , it would definitely make the process easy.
3rd : You shun running webpack-dev-server in production. Files are generally not obfuscated hence payload is heavy(not to forget your written code is out there). Even if you want to make different repos, server can spit out html and html can request scripts with your client server.
How to deploy : Deploy your code and run :
node server/app.js
and in app.js you can write the location block what you have mentioned.
P.S. : If you just need a server with that location block. do you really need a express server? You can upload the client build to a CDN and route your domain to serve index.html from the CDN(s3 bucket can also be used here)
I would like to start off with clearing up the terminologies as much as I can.
Like you said server rendered pages was a more prominent standard in the past, but it hasn't changed at all with the introduction of React, because even React has the support for Server rendering or SSR, which means HTML pages are generated at server side and then served to clients using browser.
And client side rendering means, a HTML page is loaded to browser and then javascript code renders things on top of those HTML pages and make them interactive.
And single page application concept is that we have only a single HTML file or base HTML page on top of which based on user interactions and data from server it is rewritten continuously.
Virtual Dom is an amazing concept introduced by React. React library code recreates the structure of all elements(called DOM elements) of a HTML page in the memory in a tree form. This enables React algorithm called Fiber to reconcile appropriate changes as per route update or any other changes first on this tree like structure before translating them onto the real elements in the HTML page.
Babel is a transpiler to transpile latest features that browser engines haven't started supporting to code that they can understand, usually ES6+ code into pre-ES6 because all browser supports that. In React application, if you have written application using JSX syntax, babel supports transforming JSX into normal javascript also.
Yes, breaking up of pages into many components is possible due to compositional nature of components by React which means we can build complex things by combining small and more focussed things.
At the end before serving it to end users, we can't have web application lag due to the huge size of code, so during the build process, things like minifying(removing whitespace etc) and other optimization like combining multiple javascript files into one etc are done, and then compact bundle is served from build folder like you said.
Yes, build folder is where webpack does the minifcation and combination to create a bundle as small as possible. It is basic HTML and JS files that is understood by every browser, and if the code contains something that a particular browser doesn't support, appropriate support code or something called polyfill is also bundled with it. Technically you can't say browsers only understand pre-ES6 code because a lot of browser engines have implemented plenty of ES6 features already.
Webpack dev server is just used to serve a webpack application over a port like a node.js server and gives us features like live-reloading which is needed when you constantly make changes to your application codebase and it isn't needed at production because like we said previously, at production time it's just HTML and JS and nobody ever makes any changes on these files.
Virtual DOM is a memory representation or concept used by React Code just like we have stacks and queues and it not just used at development time. Yes and No. Because I think appropriate parts of react source code which is required to run the application would also be bundled before generating the production bundle.
I would say, don't have a preset way of things, because it is totally upto the developer and the team, because I have seen people using 2 seperate repos because frontend people work on frontend things whereas backend people work on backend things. But there's also a case when everyone's a fullstack developer and you can Technically have it in a single repo with a single package.json and use the backend to serve the frontend files and you have to manually install each react dependency and cannot directly use CRA or create-react-app like generator.
What has 2 repositories to do with front-end deployment in production? You don't need to run webpack-dev-server to server files in production. You can create a production bundle and then setup any http server to serve the generated bundle.
Regarding your current scenario I would say instead of having 2 package.json, you can go with a single package.json and install all dependencies together or go with a monorepo approach using something like lerna or yarn workspaces.
But for a total beginner I would suggest 2 separate repositories to encounter less problems.
And a bonus point if you are not aware, you can write React in pre-ES6 code and also without JSX as well.
1) virtual DOM is basically to say that you are calling a function of react not the actual function which does manipulation on the real DOM
like this one
document.getElementById("demo").innerHTML ="Helloworld"
modifies the actual dom
but this
ReactDOM.render(
<HelloMessage name="Taylor" />,
document.getElementById('demo')
);
if you see this properly you aren't doing anything directly on the dom you are just giving the react function control to do things , internally react take cares of modifying the that dom element demo whenever the react wants to re-render it based on its own logic which is what they claim as optimized which is why people use it in first place. Yes when you build your code with webpack it does include react in it which is part of that minified code, so if you see any of the error stacktrace in development you do see react is the starting point for it
2) I think its a choice to be made, as there are not restrictions on this
3) Coming to deployment , In general if you want use nodejs you might choose expressjs server type of deployment but otherwise generally its better to use a high performance server like Nginx or Apache or else if you just don't want to get into this whole drama of things people generally use heroku based deployment or else people are using special platforms like netlify,surge.sh these days (its super easy to deploy on these platforms).
I believe others have done a pretty good job explaining the React Virtual DOM. In a simple and practical way, I’ll attempt to explain how I (would) manage the deployment of a dynamic website (including medium-sized enterprise systems) using NodeJS and React. I’ll also attempt not to bore you.
Imagine for once that React never existed and that you were building a traditional Server-Side Rendered application. Each time the user hits a route, the controller works with the model to perform some business logic and returns a view. In NodeJS, this view is usually compiled using a template engine such as handlebars. If you reflect for a second, it becomes obvious that the view could be any html content which is then sent back to the browser as a response.
This is a typical response that could be sent back:
<html>
<head>
<title>Our Website</title>
<style></style>
<script src="/link/to/any/JS/script"></script>
</head>
<body>
<h1>Hello World </h1>
</body>
</html>
If this response hits the browser, obviously “Hello World” is displayed on the screen.
Now, with this simple approach, we can do powerful things!
OPTION 1:
We can designate one controller to handle all incoming routes app.get("*", controllerFunc) and render one view for our entire server.
OPTION 2:
We could ask multiple controllers to handle different routes and render route-specific views from our server.
OPTION 3:
We could ask multiple controllers to handle different routes and generate pages on-the-fly (i.e. dynamically) from our server.
If we were building a traditional web application, option 3 would be the only reasonable standard. Here, pages are generated dynamically for different routes. However, with option 1, we can produce a quality Single-Page Application where the response sent to the server is an empty html page but with the built JS script that has the ability to manipulate the DOM – Yes, React! Here’s what such a response might look like:
<html>
<head>
<title>Our Website</title>
<style></style>
<script src="/link/to/any/JS/script"></script>
</head>
<body>
<h1>Hello World </h1>
<div id="root"> </div>
<script async type=”text/javascript” src="/link/to/our/transpiled/ReactSPA.js"></script>
<!--async attribute is important to ensure that the script has access to the DOM whenever it loads. This also makes the script non-blocking -->
</body>
</html>
Clearly, we’re giving all the responsibility to the generated SPA and all routing logic is handled on the client-side (See, react-router-dom). On the server side, we can introduce the concept in option 2 and tweak NodeJS route handlers to listen to another specific route for any REST API communication. If you’re familiar with NodeJS, the order in which routes are registered either by app.get() or app.post() matters.
However, using option 1, we can quickly become limited and only able to serve one Single-Page application from that server. Why? Because we have asked one controller to handle all non-API incoming routes and render one view. We also risk serving an unnecessarily bloated JS file. Users are served the complete website when all they probably wanted was just the landing page.
If we look to the option 2 though, we can tweak things a lot more and serve multiple Single-Page Applications for different routes, all from our server. This approach helps to reduce the sizes of the JS build being sent to the browser. A typical example would be a website that has a welcome page (or an introduction directory), a login page and a dashboard.
By assigning controllers for different routes, we can build SPAs uniquely for those routes. SPA for the intro page, another for the login page, and then another for the dashboard. Yes, the browser would have to load while transitioning between the three, but at least we highly increase initial render time for our website. We can also use the more secure option of cookie for authorization rather than the less secure option of storing session tokens on localStorage.
In a more advanced setting, we could have dynamic websites with different React components rendered as widgets within the dynamically generated page. Actually, this is what Facebook does.
The way to build such SPAs or components is pretty simple. Start up a react project and configure webpack to render the production-ready JS file into your preferred public static directory within the server-side repo. The <script> specified in the view can then easily load these built react components since they exist within the scope of the server-side’s public directory.
In essence, this means one repo with several client directories and one server directory where the destination of the production build files to be generated by webpack for each client project is set to the server’s public static directory. So, each client side’s directory is a project (either full SPA or simple React Component as a widget) with it’s own webpack.config and package.json file. In fact you can have two separate config files – production and development. Then, to build, you use npm ~relevant command~ for either production or development build.
You could then go ahead to host it the way you would host any NodeJS application. Because, the main application is the NodeJS - that's where the server is. Replace NodeJS with PHP and Apache/NGINX, the concept still remains the same.

NodeJS 6.x Express 4.x PostgreSQL 9.x dynamic routes and dynamic views with dynamic SQL

I am new to Node.js 6 and Express 4. I am wondering if something like this is possible todo? It appears wildcards can be used in the routeing of node. Is it possible to have a database driven app that is dynamic with routes and views? What I mean is something like the following URL's
/ <- can be anything
/xyz
/15/abc/xyz
So node/express would hit the database for the URL of / and then dynamically take the values in the row for / lookup it found, and output the path to the view template page with the SQL query ready to be used in the view template file that is local on disk. I know their is no way to dynamically generate the html because the SQL will be different for each view template URL. So that would haft to be a hard file with template engine like handlebars, etc. It appears node/express can dynamically deal with routes on the fly so this should be possible todo I think.
So when node/express get the URL of /xyz it will go into the database look up the URL and then output the SQL in the lookup row and call the path to the view template for that row it found in the database. Database could be a json file not sql too. Do not know what would be faster since both would be in RAM.
I am wondering if anyone has ever tried this? If they have dose anything like this or dose anyone know of a boilerplate with this kind of a setup on github? I can see several problems.
Handling 404 errors
Database pools, Ways to reduce the open and closing sockets. So when 100 URL requests would not have a 1,000 open and close socket requests. It would have just 1 open socket request and do all the SQL via that socket. Or have 64 sockets for 64 cpu system. Not open and closing socket every time you hit the URL.
Run app under PM2 Clustering so it will use all the CPU's not just one CPU.
I would like any input. How you would over come the problems listed or boilerplate to something like this if it is out their already?
What you're describing is a RESTful API. Generally, the very first item in the URL path is static if you're serving webpages, since otherwise you wouldn't be able to serve anything else from the root path, such as -- like you noticed -- error pages. So you'll commonly see URLs like www.mystore.com/products/1234.
Your #2 and #3 have nothing to do with routing. Connection pools don't work how you think: it's about reusing and managing the lifespans of many connections, which are picked up and released by your app as needed. You don't want to be sending all your SQL over one socket (since a long-running query would halt everything else until it completed), and the number of open sockets isn't limited by how many CPUs you have.
Clustering is just as possible with a RESTful app as it is with a non-RESTful one.

Ember.js on the server

I'm developing a very dynamic web application via ember.js. The client-side communicates with a server-side JSON API. A user can make various choices and see diced & filtered data from all kinds of perspectives, where all of this data is brought from said API.
Thing is, I also need to generate static pages (that Google can understand) from the same data. These static pages represent pre-defined views and don't allow much interaction; they are meant to serve as landing pages for users arriving from search engines.
Naturally, I'd like to reuse as much as I can from my dynamic web application to generate these static pages, so the natural direction I thought of going for is implementing a server-side module to render these pages which would reuse as much as possible of my Ember.js views & code.
However - I can't find any material on that. Ember's docs say "Although it is possible to use Ember.js on the server side, that is beyond the scope of this guide."
Can anyone point out what would be possible to reuse on the server-end, and best practices for designing the app in a way to enable maximal such reuse?
Of course, if you think my thinking here doesn't make sense, I'd be glad to hear this (and why) too :-)
Thanks!
C.
Handlebars - Ember's templating engine - does run on the server (at least under Node.js). I've used it in my own projects.
When serving an HTTP request for a page, you could quite possibly use your existing templates: pull the relevant data from the DB, massage it into a JSON object, feed it to handlebars along with the right template, then send the result to the client.
Have a look at http://phantomjs.org/
You could use it to render the pages on the server and return a plain html version.
You have to make it follow googles ajax crawling guides: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started

Resources