Duplicate bin/ directories from Apache Spark install with Homebrew - apache-spark

When installing Apache Spark 2.2.1 through Homebrew, the resulting install location seems to have two slightly-different bin/ directories, one a level below the other. (Directory structure at the bottom of this question.)
My main concern is that the load-spark-env.sh (Spark environment variable load script) looks pretty drastically different between the two, and it's tough to confirm which is being used.
In short, I'm wondering:
Why might there be two similar bin/ directories here? Sorry if I'm missing something obvious about Spark setup.
If I have $SPARK_HOME set to libexec/ (see below), will the bin/ here always be referenced over the other directory, or are there other environment variables that I need to set?
Info
I have the following set in ~/.bash_profile:
export JAVA_HOME="/Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home"
export SPARK_HOME="/usr/local/Cellar/apache-spark/2.2.1/libexec"
export PYSPARK_PYTHON="/Users/brad/anaconda3/bin/python3"
Homebrew's install file is here, for reference.
The directory looks like this (I've excluded some irrelevant folders such as sbin):
Cellar$ tree -l apache-spark/
apache-spark/
└── 2.2.1
    ├── INSTALL_RECEIPT.json
    ├── LICENSE
    ├── NOTICE
    ├── README.md
    ├── bin
    │   ├── find-spark-home
    │   ├── load-spark-env.sh
    │   ├── pyspark
    │   ├── run-example
    │   ├── spark-beeline
    │   ├── spark-class
    │   ├── spark-shell
    │   ├── spark-sql
    │   ├── spark-submit
    │   └── sparkR
    └── libexec
        ├── bin
        │   ├── find-spark-home
        │   ├── load-spark-env.sh
        │   ├── pyspark
        │   ├── run-example
        │   ├── spark-beeline
        │   ├── spark-class
        │   ├── spark-shell
        │   ├── spark-sql
        │   ├── spark-submit
        │   └── sparkR
        ├── conf
        ├── ...
        └── yarn
Edit: I've noticed this same structure with brew-installed Hadoop, so it looks like the question pertains more to Homebrew than the tools it installs.

Why might there be two similar bin/ directories here? Sorry if I'm missing something obvious about Spark setup.
The higher-level bin directory contains only wrappers that override the JAVA_HOME variable and then start the executables in the libexec directory. Homebrew links these wrappers into your /usr/local/bin directory so that you can use things like spark-shell without specifying the full path.
If I have $SPARK_HOME set to libexec/ (see below), will the bin/ here always be referenced over the other directory, or are there other environment variables that I need to set?
If you do need to set SPARK_HOME for whatever reason, then you're right to point it at the libexec directory and not at the directory that holds the higher-level bin.
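To confirm which script a given wrapper hands off to, you can read the wrapper and look at its exec line. The sketch below assumes the wrapper is a small shell script of the form shown in SAMPLE_WRAPPER; that sample text and the helper names are illustrative assumptions, not copied from a real install, so check the actual contents of your own bin/ files.

```python
import re
from typing import Optional

# Hypothetical wrapper contents, for illustration only. The real file that
# Homebrew generates on your machine may look different.
SAMPLE_WRAPPER = '''#!/bin/bash
JAVA_HOME="$(/usr/libexec/java_home)" exec "/usr/local/Cellar/apache-spark/2.2.1/libexec/bin/spark-shell" "$@"
'''


def exec_target(script_text: str) -> Optional[str]:
    """Return the first double-quoted path passed to `exec`, or None."""
    match = re.search(r'\bexec\s+"([^"]+)"', script_text)
    return match.group(1) if match else None


def is_wrapper_for_libexec(script_text: str) -> bool:
    """True if the script hands off to something under a libexec/bin directory."""
    target = exec_target(script_text)
    return target is not None and "/libexec/bin/" in target


print(exec_target(SAMPLE_WRAPPER))
```

To inspect a real wrapper, pass the text of the file instead, for example exec_target(open(path_to_wrapper).read()).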

Related

How to create a nodejs microservices project in intellij?

The problem is this: say I want to start multiple services (several npm start commands) concurrently. Running the services as separate projects would be inconvenient. I want to have a folder structure similar to the following under one project workspace:
project
├── service1
│   ├── node_modules
│   │   ├── #module1
│   │   └── #module2
│   ├── package-lock.json
│   ├── package.json
│   ├── public
│   └── src
├── service2
│   ├── node_modules
│   │   ├── #module1
│   │   └── #module2
│   ├── package-lock.json
│   ├── package.json
│   ├── public
│   └── src
└── service3
    ├── node_modules
    │   ├── #module1
    │   └── #module2
    ├── package-lock.json
    ├── package.json
    ├── public
    └── src
What would be a clean way to do so? I need to start multiple services together and, if possible, debug any of them conveniently.
You may want to look into Lerna or Nx.
Both are tools that manage mono-repo microservices.
There are some subtle differences between them, but essentially both do the same thing.
They offer ways to share dependencies between your microservices.
They offer ways to create shared libraries.
They offer ways to launch multiple services together.
Lerna
One of the subtle differences is that Nx will force you to use a single package.json in your root folder, essentially forcing you to use the same dependencies for all microservices. By contrast, Lerna still allows a specific package.json in each individual folder, which seems to resemble your current directory structure better.
In general, I think Lerna is a safe choice. And you can find a good tutorial here.
Nx
On the other hand, even though Lerna has been around longer, it has some quirks at times. I believe Nx is probably the more robust solution technically.
However, I must admit that I've mostly seen it being used for mono-repo front-end projects, and less often for back-ends. Technically, it should be able to handle both.
To get you started with Nx, you could follow this tutorial.
Spoiler: Nx has commands like nx run-many that can help you to execute multiple services together. After migrating to nx, you could then put that command in your "start": script of the package.json, so that npm run start and npm start will execute it.

importing folder that shadows same name as third party dependency. What to do?

Say I have a folder structure that looks like this:
.
├── CODEOWNERS
├── Makefile
├── README.rst
├── __pycache__
│   └── conftest.cpython-37-pytest-6.2.2.pyc
├── airflow
│   ├── __pycache__
│   │   └── __init__.cpython-37.pyc
│   ├── dev
│   │   └── dags
│   └── <some_file_I_need>
How do I import the file I need from the local airflow package (not the third-party dependency named airflow)? Unfortunately, I also have a dependency called airflow, and that is what gets imported when I do: import airflow.dev... and that errors out.
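One possible workaround, sketched here under assumptions rather than taken from the thread, is to load the local file by its path under a different module name, so the name airflow keeps referring to the third-party package. The helper name load_module_from_path, the alias, and the example path are all hypothetical; importlib.util.spec_from_file_location and module_from_spec are standard-library calls.

```python
import importlib.util
import sys
from pathlib import Path
from types import ModuleType


def load_module_from_path(alias: str, file_path: Path) -> ModuleType:
    """Load a .py file under `alias`, bypassing the normal package lookup."""
    spec = importlib.util.spec_from_file_location(alias, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {file_path}")
    module = importlib.util.module_from_spec(spec)
    # Register under the alias so later `import <alias>` calls reuse this module.
    sys.modules[alias] = module
    spec.loader.exec_module(module)
    return module


# Hypothetical usage; replace the path with the file you actually need:
# local_file = load_module_from_path("local_airflow_file", Path("airflow/some_file.py"))
```

Because the module is registered under an alias, nothing named airflow is overwritten in sys.modules. Renaming the local folder, where that is an option, avoids the clash altogether.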

What does a proper Node installation with N look like on Linux?

I may have messed something up with my installation, I don't know.
I reinstalled node with N package manager (script? package?) some time ago to solve sudo problems for my global packages, and today I encountered a problem -- I couldn't require globally installed packages in the node REPL (require('lodash') for example).
I am pretty sure I have something messed up with my node setup, but the more I skim through the web the more confused I am: what are $NODE_MODULES, $PREFIX, .../lib/node, .../lib/node_modules and $N_PREFIX, which of them are obsolete and why, what are the differences, etc.
My current setup looks like this:
# fragment of ~/.bashrc
# setup NODE
export N_PREFIX="$HOME/.n"
export PATH=$N_PREFIX/bin:$PATH
directory structure:
# output of `tree $N_PREFIX -L 2`. Arrows are symlinks, bin/node is executable
/home/tooster/.n
├── bin
│   ├── check -> ../lib/node_modules/checker/cli.js
│   ├── eslint -> ../lib/node_modules/eslint-cli/bin/eslint.js
│   ├── eslint-cli -> ../lib/node_modules/eslint-cli/bin/eslint.js
│   ├── ffmpeg-bar -> ../lib/node_modules/ffmpeg-progressbar-cli/lib/main.js
│   ├── js-yaml -> ../lib/node_modules/js-yaml/bin/js-yaml.js
│   ├── node
│   ├── npm -> ../lib/node_modules/npm/bin/npm-cli.js
│   ├── npx -> ../lib/node_modules/npm/bin/npx-cli.js
│   ├── tsc -> ../lib/node_modules/typescript/bin/tsc
│   └── tsserver -> ../lib/node_modules/typescript/bin/tsserver
├── include
│   └── node
├── lib
│   └── node_modules
├── n
│   └── versions
└── share
    ├── doc
    ├── man
    └── systemtap
And the return of 'module' from Node REPL:
Module {
  id: '<repl>',
  path: '.',
  exports: {},
  filename: null,
  loaded: false,
  children: [],
  paths: [ <-- None of these actually exist
    '/home/tooster/repl/node_modules',
    '/home/tooster/node_modules',
    '/home/node_modules',
    '/node_modules',
    '/home/tooster/.node_modules',
    '/home/tooster/.node_libraries',
    '/home/tooster/.n/lib/node'
  ]
}
I cannot find any information about the difference between .../lib/node and .../lib/node_modules. Symlinking the latter to the former seems to work, but I don't know what the consequences are, so I'd rather not do it blindly.
Also npm config get prefix returns /home/tooster/.n.
What should a proper installation of npm with N look like? I want to have my globally installed packages available in the REPL, because I often write quick scripts in Node.

linux how to get the deepest child folders?

My current directory contents are:
$ tree
├── README.md
├── deploy.sh
├── grizzly
│   ├── configs
│   │   ├── nginx-conf.yml
│   │   └── proxy-conf.yml
│   ├── deployments
│   │   ├── api.yml
│   │   ├── celery.yml
│   │   └── proxy.yml
│   ├── secrets
│   └── services
│       ├── api.yml
│       └── proxy.yml
├── ingress.yml
└── shared
    ├── configs
    │   └── rabbitmq.yml
    └── env
        └── variables.yml
I plan to create a script that will run $ kubectl apply for all files in this tree.
My thought is to get all the deepest child directories (which are expected to hold the yml files) and then run $ kubectl apply against each of them so that my resources get created.
This is an instance of the XY Problem. You want to apply all yamls which are somewhere within the directory structure of the current directory.
Just run:
kubectl apply -f . --recursive
If you want to filter the files based on certain conditions, you can use a construct like
find . -type f -name 'api.yml' -print0 | xargs -0 -n 1 kubectl apply -f
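If the filtering gets more involved than a shell one-liner, the same idea can be sketched in Python. This is a minimal illustration, not part of the original answer: the function names yaml_files and kubectl_commands are made up for the example, and the code only builds the command lists without running kubectl.

```python
from pathlib import Path
from typing import List


def yaml_files(root: Path) -> List[Path]:
    """Return every .yml/.yaml file under root, sorted for stable output."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in {".yml", ".yaml"}
    )


def kubectl_commands(root: Path) -> List[List[str]]:
    """Build one `kubectl apply -f <file>` argument list per YAML file."""
    return [["kubectl", "apply", "-f", str(p)] for p in yaml_files(root)]
```

Each returned list can be passed to subprocess.run(cmd, check=True) once you are happy with the selection. Add any extra conditions (file name, parent directory) inside yaml_files.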

Installing Viki on Vim 7.4.52

I'm following Swaroop's Byte of Vim, and have reached the chapter on personal information management, where it says to install the Viki plugin. The instructions are as follows (and I have no real idea of what is going on, but):
Download multvals.vim [2] and store as $vimfiles/plugin/multvals.vim
Download genutils.zip [3] and unzip this file to $vimfiles
Download Viki.zip [4] and unzip this file to $vimfiles (make sure all the folders and files under the 'Viki' folder name are stored directly in the $vimfiles folder)
After this I open a new text file in vim and run the command
:set filetype=viki
but I get a whole slew of errors. I've tried clearing out my ~/.vim folder and reinstalling everything, along with tlib this time as specified on the Viki vimscript page, and extracted the version 4.0 viki.vba instead of using the version 4.08 zip file, but I'm still getting errors about non-existent functions:
Error detected while processing home/user/.vim/ftplugin/viki.vim:
line 100
E117: Unknown function: tlib#balloon#Register
I don't really know what's going on, and am quite a new Vim user, so please be patient. Right now my ~/.vim directory tree looks like this:
.vim
├── autoload
│   ├── genutils.vim
│   ├── tlib
│   │   ├── eval.vim
│   │   ├── list.vim
│   │   ├── notify.vim
│   │   ├── persistent.vim
│   │   ├── progressbar.vim
│   │   ├── TestChild.vim
│   │   └── vim.vim
│   ├── tlib.vim
│   ├── viki
│   │   ├── enc_latin1.vim
│   │   └── enc_utf-8.vim
│   ├── viki_anyword.vim
│   ├── viki_latex.vim
│   ├── viki_viki.vim
│   └── viki.vim
├── colors
│   └── molokai.vim
├── compiler
│   └── deplate.vim
├── doc
│   ├── tlib.txt
│   └── viki.txt
├── ftplugin
│   ├── bib
│   │   └── viki.vim
│   └── viki.vim
├── plugin
│   ├── 02tlib.vim
│   ├── genutils.vim
│   ├── multvals.vim
│   └── viki.vim
└── test
    └── tlib.vim
Any help is much appreciated.
The info is outdated. You need to install current versions of tlib and viki from:
https://github.com/tomtom/viki_vim
https://github.com/tomtom/tlib_vim
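The error itself follows from Vim's autoload naming convention: a function called a#b#Func is looked up in autoload/a/b.vim somewhere on the runtimepath. The small sketch below (helper name autoload_file is made up for illustration) maps the function from the error message to the file Vim expects, and compares it with the autoload/tlib listing from the question.

```python
from pathlib import PurePosixPath


def autoload_file(function_name: str) -> PurePosixPath:
    """Map a Vim autoload function name to its expected file under autoload/."""
    *namespace, _func = function_name.split("#")
    if not namespace:
        raise ValueError(f"{function_name!r} is not an autoload function name")
    return PurePosixPath("autoload", *namespace).with_suffix(".vim")


# Files listed under autoload/tlib in the question's directory tree.
INSTALLED_TLIB_FILES = {
    "eval.vim", "list.vim", "notify.vim", "persistent.vim",
    "progressbar.vim", "TestChild.vim", "vim.vim",
}

expected = autoload_file("tlib#balloon#Register")
print(expected, expected.name in INSTALLED_TLIB_FILES)
```

The expected file, autoload/tlib/balloon.vim, is not in the listed tree, which is consistent with the installed tlib being too old for this version of viki.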

Resources