can't copy deep folder structure with gulp - node.js

I'm getting the strangest behavior.
I have a gulpfile with three tasks: clean, copy a bunch of stuff to a folder, and copy that entire folder to another destination.
It seems to copy MOST of the folder, but leaves certain folders empty. There is no rhyme or reason to it; they are all folders that contain JavaScript files and subfolders. It makes no sense.
Here's what I've got.
gulp.task('clean', function(cb) {
  del([config.get("deploy.output.deploy"), config.get("deploy.buildDirectory")], {force: true}, cb);
});
gulp.task("copy-source", ["clean"], function () {
  gulp.src("src/**")
    .pipe(gulp.dest(config.get("deploy.output.app") + "/src"));
  gulp.src(["package.json", "server.js", "bootstrap.js"])
    .pipe(gulp.dest(config.get("deploy.output.app")));
  gulp.src("config/**")
    .pipe(gulp.dest(config.get("deploy.output.app") + "/config"));
  return gulp.src("deploy/*")
    .pipe(gulp.dest(config.get("deploy.output.deploy")));
});
gulp.task("copy-to-buildDir", ["copy-source"], function () {
  return gulp.src(config.get("deploy.output.deploy") + "/**")
    .pipe(gulp.dest(config.get("deploy.buildDirectory")));
});
gulp.task("deploy", ["copy-to-buildDir"]);
The src folder structure looks like this (more or less; obviously a pseudo structure):
output
└── app
    ├── config
    │   └── file7.js
    ├── src
    │   ├── modules
    │   │   ├── ges
    │   │   │   ├── file1.js
    │   │   │   └── file2.js
    │   │   └── file3.js
    │   ├── file4.js
    │   └── controllers
    │       └── file5.js
    └── file6.js
dest folder structure looks like this
output
└── app
    ├── src
    │   ├── modules
    │   │   └── EMPTY
    │   └── controllers
    │       └── file5.js
    └── file6.js
So modules and controllers are siblings, and one has its files while the other does not. It makes no sense.
If you have any ideas, I'd greatly appreciate it.
Also, the building up of the src (the first task) works every time.
I've also tried just about every conceivable permutation of the dependencies, e.g. ["clean"].

The task definition below is not correct:
gulp.task("copy-source", ["clean"], function () {
  gulp.src("src/**")
    .pipe(gulp.dest(config.get("deploy.output.app") + "/src"));
  gulp.src(["package.json", "server.js", "bootstrap.js"])
    .pipe(gulp.dest(config.get("deploy.output.app")));
  gulp.src("config/**")
    .pipe(gulp.dest(config.get("deploy.output.app") + "/config"));
  return gulp.src("deploy/*")
    .pipe(gulp.dest(config.get("deploy.output.deploy")));
});
You are starting four copies but returning only the last stream, so gulp waits only for that final copy to finish. The other three may still be writing files when copy-to-buildDir starts, which is why some folders arrive empty.
Either separate them into four tasks, or combine the sources into a single stream within one task, to solve the issue.
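As a rough illustration of the point above, here is a minimal sketch that runs on plain Node.js without gulp. It collects several streams and produces a promise that settles only when every one of them has finished. The helper name whenAllDone and the fakeCopy stand-ins are hypothetical; stream.finished is a Node.js core function (Node 10 or later). It assumes each real pipeline ends the way it does when gulp waits on a returned stream.

```javascript
const { PassThrough, finished } = require('stream');

// Hypothetical helper: resolves once every stream has finished,
// rejects as soon as any stream reports an error.
function whenAllDone(streams) {
  return Promise.all(
    streams.map(
      (s) =>
        new Promise((resolve, reject) => {
          finished(s, (err) => (err ? reject(err) : resolve()));
        })
    )
  );
}

// Stand-in for one gulp.src(...).pipe(gulp.dest(...)) pipeline:
// a stream that ends after delayMs and records its name when it does.
function fakeCopy(delayMs, log, name) {
  const s = new PassThrough();
  s.resume(); // drain the readable side so the stream can end
  setTimeout(() => {
    log.push(name);
    s.end();
  }, delayMs);
  return s;
}

const done = [];
const allCopies = whenAllDone([
  fakeCopy(30, done, 'src'),
  fakeCopy(10, done, 'root files'),
  fakeCopy(20, done, 'config'),
  fakeCopy(5, done, 'deploy'),
]);

allCopies.then(() => console.log('all copies finished:', done.length));
```

In a gulpfile, the copy-source task could build its four pipelines and return such a promise instead of only the last stream. Merging the streams with a package such as merge-stream and returning the merged stream is a common alternative.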

Related

How to merge all files within many sub-directories using Spark, but maintaining the directory structure?

I currently have data stored as csv files in an s3 bucket. The structure of the data is as follows
s3_bucket/
├── dir_1/
│ ├── part_0000.csv
│ ├── part_0001.csv
│ ├── part_0002.csv
│ └── ...
├── dir_2/
│ ├── part_0000.csv
│ ├── part_0001.csv
│ ├── part_0002.csv
│ └── ...
├── dir_3/
│ └── ...
├── dir_4/
└── ...
I want to write some kind of Spark job to go into each subdirectory dir_n/ and merge all the data into a single file, resulting in the following structure
s3_bucket/
├── dir_1/
│ └── merged.csv
├── dir_2/
│ └── merged.csv
├── dir_3/
│ └── merged.csv
├── dir_4/
└── ...
I was thinking of somehow spawning multiple workers to crawl each subdirectory, read the data into memory, and merge it using repartition(1); however, I am not too sure how to do this.
Any help is greatly appreciated!
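You can loop over the directories with the Hadoop FileSystem API, then read each directory and do a coalesce(1). Here's the code in Scala:
import org.apache.spark.sql._
import org.apache.hadoop.fs.Path

val path = "/path/s3_bucket/"
// Get the file system that backs the bucket path
val hdfs = new Path(path).getFileSystem(spark.sparkContext.hadoopConfiguration)
// For each subdirectory, read all of its CSV parts and write them back as a single part file.
// Caution: this overwrites the same directory it reads from; a separate output path is safer.
hdfs.listStatus(new Path(path)).foreach(file => if (file.isDirectory)
  spark.read.csv(path + file.getPath.getName).coalesce(1)
    .write.mode(SaveMode.Overwrite).csv(path + file.getPath.getName)
)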

For all subfolders in a Node.js project, use global custom entry point for require

Question
Is it possible to configure a global, custom entry point to be used by require for all subfolders in a Node.js project?
Rationale
When working in Node.js, I like having my index.js file as the topmost file in each subfolder in my IDE.
However, depending on the IDE and the way it sorts files, this is not always possible (for example, VSCode has several sorting options available, and none of them can achieve this).
To achieve that, I rename it to _index.js, but then lose require's built-in ability to recognize it as the default entry point.
Although this can be mitigated by adding a package.json into each subfolder, with a main property directing to the entry point file - I'd like to know if there's a way to define a "global" custom entry point, be it in the topmost package.json or using some npm package which I'm not aware of.
Example
Let's say I have the following folder structure, and assume that our IDE sorts files alphabetically:
MyApp
├── app.js
├── package.json
├─┬ featureA
│ ├── func1.featureA.js
│ ├── func2.featureA.js
│ └── index.js
└─┬ featureB
  ├── func1.featureB.js
  ├── func2.featureB.js
  └── index.js
To keep index.js as the topmost file, we prefix it with an underscore, and use a package.json for each subfolder to define it as an entry point:
MyApp
├── app.js
├── package.json
├─┬ featureA
│ ├── _index.js
│ ├── func1.featureA.js
│ ├── func2.featureA.js
│ └── package.json
└─┬ featureB
  ├── _index.js
  ├── func1.featureB.js
  ├── func2.featureB.js
  └── package.json
The package.json for both featureA and featureB is identical:
{
  "main": "_index.js"
}
That package.json is necessary so that we can use require in the following way in app.js:
// app.js
const featureA = require('./featureA');
const featureB = require('./featureB');
But can these two package.json files be replaced with some "global" alternative?
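One possible workaround, sketched here under assumptions rather than as a built-in Node.js feature: wrap require in a small helper that always loads a folder through its _index.js. The helper name requireFolder and the temporary demo layout are hypothetical; only fs, os, path, and require itself come from Node.js.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: load a folder through its `_index.js`
// instead of relying on a per-folder package.json "main" entry.
function requireFolder(baseDir, folder, entry = '_index.js') {
  return require(path.join(baseDir, folder, entry));
}

// Demo: build a throwaway project layout in the OS temp directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'myapp-'));
fs.mkdirSync(path.join(root, 'featureA'));
fs.writeFileSync(
  path.join(root, 'featureA', '_index.js'),
  "module.exports = { name: 'featureA' };\n"
);

const featureA = requireFolder(root, 'featureA');
console.log(featureA.name);
```

In app.js this would read `const featureA = requireFolder(__dirname, 'featureA');`. The trade-off is that every call site has to use the helper instead of plain require.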

Typescript folder structure in a project with client, server and shared code

I have a project with the following structure
project
├── client
│   └── src
│       ├── index.js
│       ├── and.js
│       ├── some.js
│       ├── other.js
│       └── files.js
├── public
├── server
│   ├── out
│   │   ├── index.js
│   │   └── any.other.dependency.js
│   ├── src
│   │   ├── index.ts
│   │   └── foo.js
│   └── templates
├── shared
│   └── constants.js
└── multiple.config.files.json
My goal is to have a server with all the server logic inside server/src, which serves different html files from server/templates. I want the server code to use Typescript, and the compiled output should go to server/out.
There's also the client side of the application, which lives in client/src. The logic there is complex enough that I decided to use webpack for bundling. I might even add some React in the future. All this code is compiled by webpack and the resulting files live in /public.
I also share some constants between the client and server logic, and I decided to put them in ./shared. I might want to add some utilities there in the future, so let's assume it's not just constants.
At some point in the future I'd like to migrate the whole project to TS, but I'm not close to that yet.
How can I achieve this with Typescript?
I have the webpack side sorted out. My problem comes with the TS compiler. I can't manage to get it working because shared is outside the compilerOptions.rootDir, but if I set rootDir to the whole project folder I end up with a crazy server/out folder structure, something like server/out/server/src/index.js.
My tsconfig.json looks like this:
{
  "extends": "@tsconfig/node12/tsconfig.json",
  "include": ["server/src/*", "shared/*"],
  "exclude": ["node_modules"],
  "compilerOptions": {
    "allowJs": true,
    "noImplicitAny": false,
    "outDir": "./server/out/",
    "rootDir": "./server/src"
  }
}

Rebase Relative Assets in Gulp

I'm using gulp-pretty-url to keep page URLs clean in a project, such that a file src/about.html will be output as app/about/index.html. Some files end up in deeper structures though, like src/series-460.html, which goes to app/series/460/index.html. Of course, each of the HTML files references JavaScript and CSS files, and they need to use relative paths.
How can I rebase relative asset paths in files as they're being run through a gulp task? Here's the file structure I'm working with:
project
├── app
│   ├── about
│   │   └── index.html
│   ├── assets
│   │   ├── css
│   │   │   └── styles.css
│   │   └── js
│   │       └── scripts.js
│   └── index.html
└── src
    ├── assets
    │   ├── css
    │   │   └── styles.css
    │   └── js
    │       └── scripts.js
    ├── about.html
    └── index.html
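The path arithmetic behind such a rebase can be sketched with Node's path module alone. This covers only the calculation, not a gulp plugin or the HTML rewriting itself. The helper name rebaseAsset is hypothetical, and the example paths are taken from the structure above.

```javascript
const path = require('path');

// Hypothetical helper: given the output location of an HTML page and of an
// asset (both relative to the project root), return the asset URL relative
// to the page. POSIX path functions keep forward slashes on every platform.
function rebaseAsset(pageOutputPath, assetOutputPath) {
  const pageDir = path.posix.dirname(pageOutputPath);
  return path.posix.relative(pageDir, assetOutputPath);
}

console.log(rebaseAsset('app/index.html', 'app/assets/css/styles.css'));
// prints "assets/css/styles.css"
console.log(rebaseAsset('app/about/index.html', 'app/assets/css/styles.css'));
// prints "../assets/css/styles.css"
console.log(rebaseAsset('app/series/460/index.html', 'app/assets/js/scripts.js'));
// prints "../../assets/js/scripts.js"
```

A gulp task would still need to find each asset reference in the HTML and substitute the computed value.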

Ignore folders in .couchappignore

My CouchApp has the following folder structure, where files inside the app folder are compiled into the _attachments folder:
my_couchapp
├── _attachments/
│ ├── app.js
│ ├── app-tests.js
│ └── index.html
├── app/
│ └── app.js
├── Assetfile
└── views/
I want to exclude the file Assetfile, _attachments/app-tests.js and the folder app.
My current .couchappignore looks like this:
[
"app",
"Assetfile",
"_attachments/app-tests.js"
]
But this doesn't seem to work. All files beginning with app inside the _attachments folder are not pushed.
How do I define folders and specific files to be excluded when the CouchApp is pushed via couchapp push?
After a little more experimentation I found a way: the app folder can be excluded by specifying app$, so the final .couchappignore now looks like this:
[
"app$",
"Assetfile",
"app-tests.js"
]
In case you arrived here looking for a way to ignore subfolders, you are just like me. Here's my problem:
my-couchapp/
├── node_modules/
│ ├── react.js
│ ├── url/
│ ├── browserify/
│ └── coffee-script/
├── app/
│ └── app.js
└── views/
I wanted to include node_modules/react.js and node_modules/url/ (and all subfolders), but didn't want to include node_modules/browserify/ and node_modules/coffee-script/.
I was trying
[
"node_modules/browserify$",
"node_modules/coffee-script$"
]
but it wasn't working.
[
"node_modules\/browserify",
"node_modules\/coffee-script"
]
also didn't work.
The only thing that worked was
[
"browserify",
"coffee-script"
]
I don't know why.
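The results reported in this thread are consistent with one possible explanation, stated here as an assumption rather than documented CouchApp behavior: each entry is treated as a regular expression and tested from the start of a single file or folder name, not against the full path. The simulation below uses a hypothetical isIgnored function to show how that rule would reproduce every outcome above.

```javascript
// Assumption (not confirmed by any CouchApp documentation quoted here):
// each entry is a regular expression tested from the START of a single
// file or folder name, never against the full relative path.
function isIgnored(name, patterns) {
  return patterns.some((p) => new RegExp('^(?:' + p + ')').test(name));
}

// "app" also swallows app.js and app-tests.js; "app$" matches only the folder.
console.log(isIgnored('app.js', ['app']));   // true
console.log(isIgnored('app.js', ['app$']));  // false
console.log(isIgnored('app', ['app$']));     // true

// A pattern that includes the parent folder never matches a bare name.
console.log(isIgnored('browserify', ['node_modules/browserify$'])); // false
console.log(isIgnored('browserify', ['browserify']));               // true
```

If that assumption holds, a bare entry such as "browserify" would also exclude any other file or folder whose name starts with browserify, wherever it sits in the tree.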
