NodeJS CI/CD Build Missing Files

I am setting up CI/CD for a Node.js project, and occasionally a developer forgets to commit a file (module) to source control. npm ci and npm test run without a problem and the application gets deployed to my server. However, it errors out once executed due to the missing module.
Is there a best practice for ensuring that all files required by a node application are available before allowing it to be deployed?

I don't know your exact configuration but here is how my teams have handled similar issues in the past:
Unit Tests. Normally, a CI system can catch this type of error before you deploy. If your CI tests aren't flagging your missing module before the code gets deployed, then a commonly used solution would be to write a test that ensures the module is present. That way the problem would be automatically caught when your unit tests are run. Something along these lines might work:
// mymodule.test.js using Mocha syntax:
import { expect } from 'chai';
import mymodule from './mymodule';

describe('my module', () => {
  it('should export something!', () => {
    expect(!!mymodule).to.be.true;
  });
});
Version Control workflow. It sounds like there is also an issue here with your team's version control workflow. Generally, all of the files required to build the production application should be kept under version control and developers should be committing frequently. In this situation, I would normally do some investigation to see what is happening -- it might require training or perhaps the app is structured in a way that is overly confusing or complex to engineers.
Use a lockfile for npm packages. If the missing module is an npm module, then there are a few things that could cause it to be missing. Generally, all npm modules should be listed in your package-lock.json or yarn.lock file. This ensures that the production version of the application stays in sync with what developers are using locally. I personally discourage developers from installing npm modules globally unless it is absolutely necessary; in that situation, your CI server (and perhaps your production server) will need to be updated to include the exact same versions of the global packages.
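As a minimal illustration of enforcing this in CI (the file name check-lockfile.js is hypothetical, not a standard tool), a small pre-install step could fail the build when the lockfile was never committed:

// check-lockfile.js -- hypothetical CI check, run before npm ci
const fs = require('fs');

if (!fs.existsSync('package-lock.json')) {
  console.error('package-lock.json is missing; commit it so CI and production use the same dependency tree.');
  process.exit(1);
}
// npm ci itself will then fail if package.json and package-lock.json are out of sync.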
Automated build systems. I think you are indicating that your problem is caused by modules that aren't under source control, but I have also seen situations where legacy build systems omit a file that is important when building the app for production. Modern build tools like Webpack and Babel normally include every module referenced by the application, but older solutions like Grunt and Gulp might require some fine-tuning to ensure the files are always included automatically, to avoid a situation where developers are expected to add modules to the build configuration by hand and often forget to do so.

The best way to prevent this is to have the developer compute a checksum of the project and compare it with the one from source control and/or your server. If the checksums match, then all of the files were transferred.
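A rough sketch of that idea (the script name and hashing scheme are illustrative, not a standard tool) is a small Node script that produces one checksum for a directory tree; running it against the repository checkout and against the deployed directory should give the same value when every file made it across:

// checksum-tree.js -- hypothetical helper: print a single checksum for a directory tree
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

function listFiles(dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    if (entry.name === 'node_modules' || entry.name === '.git') return [];
    const full = path.join(dir, entry.name);
    return entry.isDirectory() ? listFiles(full) : [full];
  });
}

const root = process.argv[2] || '.';
const hash = crypto.createHash('sha256');
for (const file of listFiles(root).sort()) {
  hash.update(path.relative(root, file)); // include the relative path
  hash.update(fs.readFileSync(file));     // and the file contents
}
console.log(hash.digest('hex'));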

If you (or your deployment software/service) use rsync in your deployment process with the -C (--cvs-exclude) flag, some files and directories might be getting filtered out.
I had a similar CI/CD issue where an npm package used the directory name "core", and it was ignored because of the -C flag.

Replace all of your require() calls with import statements that webpack can resolve, and have your build run webpack. At runtime, Node will run the bundle instead of your regular entrypoint.
Webpack will catch any unresolvable imports at build time.
All of this assumes the missing files are modules (code) rather than resources (e.g. JSON files).
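A minimal sketch of that setup (the entry path and output names are assumptions about your project layout) might be:

// webpack.config.js -- minimal sketch; the build exits non-zero if any import cannot be resolved
const path = require('path');

module.exports = {
  mode: 'production',
  target: 'node',          // produce a bundle that Node can run directly
  entry: './src/index.js', // assumed entrypoint
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'bundle.js',
  },
};

Because the build fails when a referenced module is missing from the working tree, a broken deploy never reaches the server.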

Related

How to speed up the build of front-end monorepo projects?

Here are some of my current thoughts, but I would also like to understand how the community is handling this now.
Use caching to reduce build time
Motivation
Currently, every time someone changes something in the libs, everyone else needs to initialize again after pulling the change through git. If you know which package it's in, you can just run the initialize command for that specific package. If you don't know, you need to run the root initialize command, which is very slow because it re-runs the initialize command in every npm package that defines one, regardless of whether it has changed or not.
Reduce initialize time and improve the collaborative development experience
Support CI/CD caching of built libs to speed up build time
Current state
lerna run:
First initialization of a new project: 198.54s
Re-initialization of an existing project: 90.17s
Requirements
Execute commands as concurrently as possible according to the dependency graph, and implement caching based on git changes
Execute commands in modules that specify module dependencies
Execute commands in all submodules
Implementation ideas
The Lerna API is currently severely under-documented: https://github.com/lerna/lerna/issues/2013
Scan lerna.json for all modules
Run the specified commands with as much parallelism as the dependency graph allows
When running a command, check git to see whether the package has changed and the command can be skipped (a rough sketch follows this list)
Schematic: https://photos.app.goo.gl/aMyaewgJxkQMhe1q6
The ultra-runner project already exists, so it may be better to build this feature on top of it.
It does not seem to support running commands other than build at the moment; see: https://github.com/folke/ultra-runner/issues/224
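A rough sketch of the git-based skip check mentioned above (the cache file name and package path are placeholders; it only tracks committed changes) might be:

// should-skip.js -- hypothetical sketch: skip a package's command if its files haven't changed
const { execSync } = require('child_process');
const fs = require('fs');

function lastCommitFor(pkgDir) {
  // latest commit hash that touched this package directory
  return execSync(`git log -1 --format=%H -- ${pkgDir}`).toString().trim();
}

function shouldSkip(pkgDir, cacheFile = '.build-cache.json') {
  const cache = fs.existsSync(cacheFile) ? JSON.parse(fs.readFileSync(cacheFile, 'utf8')) : {};
  const current = lastCommitFor(pkgDir);
  if (cache[pkgDir] === current) return true; // nothing changed since the last recorded run
  cache[pkgDir] = current;
  fs.writeFileSync(cacheFile, JSON.stringify(cache, null, 2));
  return false;
}

console.log(shouldSkip(process.argv[2] || 'packages/some-lib'));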
Known solutions
Trying Microsoft Rush -- though we are hoping for a better solution that does not require replacing our main framework, Lerna + Yarn

Building monorepo babel-transpiled node JS application with dependencies

I am working on a project that is hosted as a monorepo. For simplification purposes let's say that inside there are three self-explanatory packages: server, a webapp client and library. The directory structure would be something like the following:
the-project
packages
server
src
webapp
src
library
src
All packages use Flow type annotations and a few post-ES5 features and, for this reason, go through Babel transpilation. The key difference is that transpilation of the webapp package is done via webpack, whereas server employs a gulp task that triggers script transpilation through the gulp-babel package. library is transpiled automatically when webapp is built.
Now, the problem I have is that for server to build, babel requires library to be built first and its package.json to specify its (built) main JS source file so its transpiled artifacts can be included. As you can imagine, this would quickly become problematic if the project were to contain multiple libraries that are actively being developed (which it does), as all would require building, including any dependent packages (like server in this simple case).
As an attempt to overcome this annoyance, I initially thought of using webpack to build the server, which would take care of including whatever dependencies it requires into a bundle, but I ran into issues as apparently webpack is not meant to be used on node JS applications.
What strategies are available for building a node JS application requiring Babel transpilation, such that the application's source files as well as any dependencies are built transparently and contained in a single output directory?
Annex A
Simplified gulp task for transpilation of scripts, as employed by server.
const gulp = require('gulp');
const babel = require('gulp-babel');
gulp.task('transpile', () =>
  gulp
    .src([`src/**/*.js`], { allowEmpty: true })
    .pipe(babel({ sourceMap: true }))
    .pipe(gulp.dest('dist'))
);
As can be seen above, only server's own source files are included in the task. If src were changed to also include library, the task would emit the dependencies' artifacts into server's own output directory, and any require('library') statements would still attempt to locate the built artifacts in packages/library rather than packages/server/dist, resulting in import failures.
First of all, I am not sure what your server is doing. If it is doing database connections or some heavy calculations, then I would not recommend building it with webpack; whereas if your server is just doing server-side rendering and making some API calls to other servers, then I would recommend bundling it with webpack.
A lot of projects follow this philosophy. For example, you can take a look at something similar I have done in one of my personal projects, Blubus. Specifically, you might be interested in its webpack-server-config. You can also take a look at how big projects like Spectrum do it.
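If you do go the webpack route for the server, a sketch along these lines (the entry path, the library name, and the use of the third-party webpack-node-externals package are assumptions) keeps installed node_modules out of the bundle while still bundling the monorepo's own library package:

// packages/server/webpack.config.js -- illustrative sketch only
const path = require('path');
const nodeExternals = require('webpack-node-externals'); // third-party helper

module.exports = {
  mode: 'production',
  target: 'node',          // bundle for Node, not the browser
  entry: './src/index.js', // assumed server entrypoint
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'server.js',
  },
  // leave installed packages unbundled, but do bundle the sibling "library" package
  externals: [nodeExternals({ allowlist: ['library'] })],
  module: {
    rules: [
      {
        test: /\.js$/,
        // transpile both the server's and the library's sources
        include: [path.resolve(__dirname, 'src'), path.resolve(__dirname, '../library/src')],
        use: 'babel-loader',
      },
    ],
  },
};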

The best way to actively develop an NPM package that's consumed by another app running locally

I'm currently working on a React application that's consuming a React library we also develop.
Currently, the process is to copy the library's dist folder over to the application's node_modules folder.
To resolve the tedious nature of this, I thought the solution would be simple: to npm link the package in our application, and have the JSX/React components run through the application's babel-loader. That way, we'd also get webpack's dev server to watch for changes in the library and refresh automatically.
The problem with this is that the library's babel settings are different from those of the consuming application. For instance, root imports in the library (e.g. import ~/some-module) are supposed to resolve from the root folder of the library, but instead, they resolve to the root folder of the application, resulting in errors, because the only babel configuration it uses is the .babelrc from the application.
I tried adding separate webpack config rules to make exceptions for the library, but now it feels kind of hacky. In addition to that, the webpack dev server runs incredibly slow to boot up, presumably because it's running a babel transformation on the library too.
Is there an easier way to do this? Like telling webpack that "for this library in node_modules, use its own configuration file, and respect all of its own babel settings and relative imports?"
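For reference, the kind of per-library override described above might look roughly like this sketch (the library name, paths, and config file location are placeholders):

// webpack.config.js (application) -- sketch of a rule that applies the library's own Babel config
const path = require('path');

module.exports = {
  // ...rest of the application's existing config...
  resolve: {
    symlinks: false, // keep the npm-linked library addressed via node_modules paths
  },
  module: {
    rules: [
      {
        test: /\.jsx?$/,
        include: path.resolve(__dirname, 'node_modules/my-react-library'), // placeholder name
        use: {
          loader: 'babel-loader',
          options: {
            babelrc: false, // ignore the application's .babelrc for these files
            configFile: path.resolve(__dirname, 'node_modules/my-react-library/babel.config.js'),
          },
        },
      },
    ],
  },
};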

Including local dependencies in deployment to lambda

I have a repo which consists of several "micro-services" that I upload to AWS Lambda. In addition, I have a few shared libraries that I'd like to package up when sending to AWS.
Therefore my directory structure looks like:
/micro-service-1
/dist
package.json
index.js
/micro-service-2
/dist
package.json
index.js
/shared-component-1
/dist
package.json
component-name-1.js
/shared-component-2
/dist
package.json
component-name-2.js
The basic deployment leverages the handy node-lambda npm module but when I reference a local shared component with a statement like:
var sharedService = require('../../shared-component-1/dist/index');
This works just fine with the node-lambda run command, but node-lambda deploy drops this local dependency. That probably makes sense because the dependency lives outside the micro-service's "root" directory, so I thought I'd leverage gulp to make this work. However, I'm pretty darn new to gulp, so I may be doing something dumb. My strategy was to:
Have gulp deploy depend on a local-deps task
the local-deps task would:
npm build --production to a directory
then pipe this directory over to the micro-service under the /local directory
clean up the install in the shared
I would then refer to all shared components like so:
var sharedService = require('local/component-name-1');
Hopefully this makes clear what I'm trying to achieve. Does this strategy make sense? Is there a simpler way I should be considering? Does anyone have any examples of anything like this in "gulp speak"?
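To make the strategy above concrete, a very rough gulp sketch of the copy step (the task names, the local/ destination, and the shared component paths are placeholders based on the layout described) could be:

// gulpfile.js -- rough sketch of the copy step described above
const gulp = require('gulp');

// copy a shared component's built output into this micro-service's local/ folder
gulp.task('local-deps', () =>
  gulp
    .src('../shared-component-1/dist/**/*')
    .pipe(gulp.dest('local/component-name-1'))
);

// a deploy task could then run local-deps before calling node-lambda deploy
gulp.task('deploy-prep', gulp.series('local-deps'));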
I have an answer to this! :D
TL;DR - Use npm link to create a symbolic link between your common component and the dependent component.
So, I have a project with only two modules:
- main-module
- referenced-module
Each of these is a node module. If I cd into referenced-module and run npm link, then cd into main-module and npm link referenced-module, npm will 'install' my referenced-module into my main-module and store it in my node_modules folder. NOTE: When running the second npm link, the name of the project is the one you find in your package.json, not the name of the directory (see npm link documentation, previously linked).
Now, in my main-module all I need to do is var test = require('referenced-module') and I can use that to my heart's content. Be sure to module.exports your code from your referenced-module!
Now, when you zip up main-module to deploy it to AWS Lambda, the links are resolved and the real modules are put in their place! I've tested this and it works, though not with node-lambda yet; I don't see why that should be a problem (unless it does something different with the package restores).
What's nice about this approach as well is that any changes I make to my referenced-module are automatically picked up by my main-module during development, so I don't have to run any gulp tasks or anything to sync them.
I find this is quite a nice, clean solution and I was able to get it working within a few minutes. If anything I've described above doesn't make any sense (as I've only just discovered this solution myself!), please leave a comment and I'll try and clarify for you.
UPDATE FEB 2016
Depending on your requirements and how large your application is, there may be an interesting alternative that solves this problem even more elegantly than using symlinking. Take a look at Serverless. It's quite a neat way of structuring serverless applications and includes useful features like being able to assign API Gateway endpoints that trigger the Lambda function you are writing. It even allows you to script CloudFormation configurations, so if you have other resources to deploy then you could do so here. Need a 'beta' or 'prod' stage? This can do it for you too. I've been using it for just over a week and while there is a bit of setup to do and things aren't always as clear as you'd like, it is quite flexible and the support community is good!
While using Serverless we faced a similar issue when we needed to share code between AWS Lambdas. Initially we duplicated the code across each microservice, but, as always, that later became difficult to manage.
Since development was done in a Windows environment, using symbolic links was not an option for us.
We then came up with a solution: keep the local dependencies in a shared folder and use a custom-written gulp task to copy these dependencies into each of the microservice endpoints, so that a dependency can be required just like an npm package.
One of the decisions we made was not to define a microservice's dependencies in two places, so we used the same package.json to declare the local shared dependencies; the gulp task parses this file, copies the shared dependencies accordingly, and also installs the npm dependencies, all with a single command.
Later we made the code open source as npm modules serverless-dependency-install and gulp-dependency-install.

Should I .npmignore my tests?

What exactly should I put in .npmignore?
Tests? Stuff like .travis.yml, .jshintrc? Anything that isn't needed when running the module (except the readme)?
I can't find any guidance on this.
As you probably found, npm doesn't really state specifically what should go in there; rather, it has a list of files that are ignored by default. Many people don't even use it, since everything in your .gitignore is ignored by npm by default if .npmignore doesn't exist. Additionally, many files are already ignored by default regardless of settings, and some files are always excluded from being ignored, as outlined in the link above.
There is not much official guidance on what should always be there, because it is basically a subset of .gitignore, but from what I gather from using Node for five-ish years, here's what I've come up with.
Note: by "production" I mean any time your module is used by someone, rather than being developed on itself.
Pre-release cross-compiled sources
Pros: If you are using a language that cross-compiles into JavaScript, you can precompile before release and not include .coffee files in your package but keep tracking them in your git repository.
Build file leftovers
Pros: People using things like node-gyp might have object files that get generated during a build that never should go into the package.
Cons: This should always go into the .gitignore anyway. You must place these things inside here if you are using a .npmignore file already as it overrides .gitignore from npm's point of view.
Tests
Pros: Less baggage in your production code.
Cons: You cannot run the tests in live environments, on the slim chance there is a system-specific failure, such as an out-of-date version of Node causing a test to fail.
Continuous integration settings/Meta files
Pros: Again, less baggage. Things such as .travis.yml are not required for using, testing, or viewing the code.
Non-readme docs and code examples
Pros: Less baggage. Some people are of the school of thought that if you cannot express at least minimum viable functionality in your readme, your module is too big.
Cons: People cannot see exhaustive documentation and code examples on their own file system. They would have to visit the repository (which also requires an internet connection).
Github-pages objects
Pros: You certainly don't need to litter your releases with CNAME files or placeholder index.html files if your module's repository serves double duty as a gh-pages repository as well.
bower.json and friends
Pros: If you decide to build in your dependencies prior to release, you don't need the end-user to install Bower and then install more things with it. I would, personally, keep that stuff in the package. When I do an npm install, I should only be relying on npm and no other external sources.
Basically, you should only ever use it if there is something you wish to keep out of your npm package but checked in to your module's repo. It's not a long list of items, but npm would rather build in the functionality than leave people stuck with irrelevant objects in their package.
I agree with lante's short and concise answer and SamT's big answer:
You should not include your tests in your package.
Your package should only contain production runtime files.
That will make your package more straightforward and faster to download.
My contribution to those answers:
.npmignore is the blacklist way to achieve package file selection. A more practical way is to whitelist the files you need to include in your package using the files field in your package.json:
{
  "files": [
    "lib/",
    "index.js"
  ]
}
I think that's simpler, more future-proof, and has better semantics ;)
Just to clarify: any time someone does npm install your-library, npm will download all of the files that the package includes. The files matched by the .npmignore file in the source code of your-library are excluded when the lib is published, so users of your-library won't download them.
Keep in mind that people installing your library just need your library to run; anything else is unnecessary.
For example, when someone installs a library, it's probable that they don't care about your .travis.yml or .jshintrc files, or even some images, Grunt files, documentation, etc.
.npmignore lets your npm package contain fewer files and download faster.
Don't include your tests. Oftentimes tests are like 5x the size of the actual codebase. As long as your tests are on GitHub, etc., that's good enough.
But what you absolutely should do is test your npm package in its published format. Create some smoke tests that reside in the actual codebase, but are not part of the test suite.
You can read about testing your package after tarballing it, here:
https://github.com/ORESoftware/r2g
How to test an `npm publish` result, without actually publishing to NPM?
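A rough sketch of that kind of smoke test (the script name and paths are placeholders; it assumes the script sits in the package root) is to pack the module, install the tarball into a temporary directory, and then require it exactly as a consumer would:

// smoke-test.js -- hypothetical sketch: install the packed tarball and require it
const { execSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// npm pack prints the generated tarball filename on stdout
const tarball = execSync('npm pack', { encoding: 'utf8' }).trim().split('\n').pop();
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'pkg-smoke-'));

execSync('npm init -y', { cwd: tmp, stdio: 'ignore' });
execSync(`npm install ${path.resolve(tarball)}`, { cwd: tmp, stdio: 'inherit' });

// require the published entrypoint from the temp install, as a consumer would
const pkgName = require(path.join(__dirname, 'package.json')).name;
const pkg = require(path.join(tmp, 'node_modules', pkgName));
console.log('smoke test passed, exports:', Object.keys(pkg));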
