Protect user credentials when connecting R with databases using JDBC/ODBC drivers

Protect user credentials when connecting R with databases using JDBC/ODBC drivers - linux

Usually I connect to a database with R using JDBC/ODBC driver. A typical code would look like
library(RJDBC)
vDriver = JDBC(driverClass="com.vertica.jdbc.Driver", classPath="/home/Drivers/vertica-jdbc-7.0.1-0.jar")
vertica = dbConnect(vDriver, "jdbc:vertica://servername:5433/db", "username", "password")
I would like others to access the db using my credentials but I want to protect my username and password. So I plan save the above script as a "Connections.r" file and ask users to source this file.
source("/opt/mount1/Connections.r")
If I give execute only permission to Connections.r others cannot source the file
chmod 710 Connections.r
Only if I give read and execute permission R lets users to source it. If I give the read permission my credentials will be exposed. Is there anyways we could solve this by protecting user credentials?

Unless you were to deeply obfuscate your credentials by making an Rcpp function or package that does the initial JDBC connection (which won't be trivial) one of your only lighter obfuscation mechanisms is to store your credentials in a file and have your sourced R script read them from the file, use them in the call and them rm them from the environment right after that call. That will still expose them, but not directly.
One other way, since the users have their own logins to RStudio Server, is to use Hadley's new secure package (a few of us sec folks are running it through it's paces), add the user keys and have your credentials stored encrypted but have your sourced R script auto-decrypt them. You'll still need to do the rm of any variables you use since they'll be part of environment if you don't.
A final way, since you're giving them access to the data anyway, is to use a separate set of credentials (the way you phrased the question it seems you're using your credentials for this) that only work in read-only mode to the databases & tables required for these analyses. That way, it doesn't matter if the creds leak since there's nothing "bad" that can be done with them.
Ultimately, I'm as confused as to why you can't just setup the users with read only permissions on the database side? That's what role-based access controls are for. It's administrative work, but it's absolutely the right thing to do.

Do you want to give someone access, but not have them be able to see your credentials? That's not possible in this case. If my code can read a file, I can see everything in the file.
Make more accounts on the SQL server. Or make one guest account. But you're trying to solve the problem that account management solves.

Have the credentials sent as command arguments? Here's an example of how one would do that:
suppressPackageStartupMessages(library("argparse"))
# create parser object
parser <- ArgumentParser()
# specify our desired options
# by default ArgumentParser will add an help option
parser$add_argument("-v", "--verbose", action="store_true", default=TRUE,
help="Print extra output [default]")
parser$add_argument("-q", "--quietly", action="store_false",
dest="verbose", help="Print little output")
parser$add_argument("-c", "--count", type="integer", default=5,
help="Number of random normals to generate [default %(default)s]",
metavar="number")
parser$add_argument("--generator", default="rnorm",
help = "Function to generate random deviates [default \"%(default)s\"]")
parser$add_argument("--mean", default=0, type="double", help="Mean if generator == \"rnorm\" [default %(default)s]")
parser$add_argument("--sd", default=1, type="double",
metavar="standard deviation",
help="Standard deviation if generator == \"rnorm\" [default %(default)s]")
# get command line options, if help option encountered print help and exit,
# otherwise if options not found on command line then set defaults,
args <- parser$parse_args()
# print some progress messages to stderr if "quietly" wasn't requested
if ( args$verbose ) {
write("writing some verbose output to standard error...\n", stderr())
}
# do some operations based on user input
if( args$generator == "rnorm") {
cat(paste(rnorm(args$count, mean=args$mean, sd=args$sd), collapse="\n"))
} else {
cat(paste(do.call(args$generator, list(args$count)), collapse="\n"))
}
cat("\n")
Sample run (no parameters):
usage: example.R [-h] [-v] [-q] [-c number] [--generator GENERATOR] [--mean MEAN] [--sd standard deviation]
optional arguments:
-h, --help show this help message and exit
-v, --verbose Print extra output [default]
-q, --quietly Print little output
-c number, --count number
Number of random normals to generate [default 5]
--generator GENERATOR
Function to generate random deviates [default "rnorm"]
--mean MEAN Mean if generator == "rnorm" [default 0]
--sd standard deviation
Standard deviation if generator == "rnorm" [default 1]
The package was apparently inspired by the python package of the same name, so looking there may also be useful.
Looking at your code, I'd probably rewrite it as follows:
library(RJDBC)
library(argparse)
args <- ArgumentParser()
args$add_argument('--driver', dest='driver', default="com.vertica.jdbc.Driver")
args$add_argument('--classPath', dest='classPath', default="/home/Drivers/vertica-jdbc-7.0.1-0.jar")
args$add_argument('--url', dest='url', default="jdbc:vertica://servername:5433/db")
args$add_argument('--user', dest='user', default='username')
args$add_argument('--password', dest='password', default='password')
parser <- args$parse_args
vDriver <- JDBC(driverClass=parser$driver, parser$classPath)
vertica <- dbConnect(vDriver, parser$url, parser$user , parser$password)
# continue here

Jana, it seems odd that you are willing to let the users connect via R but not in any other way. How is that obscuring anything from them?
I don't understand why you would not be satisfied with a guest account that has specific SELECT-only access to certain tables (or even views)?

Related

How to reload a batch file with .qwv containing section access in QlikView Desktop?

I execute "C:\Program Files\QlikView\qv.exe" /R "D:\QlikViewAdm\qlik_file.qvw" to launch the loader. It works when there is no section access. However, when I run a file with section access QV asks me for login and password. I can link the access to Windows user credentials, but it only works on my terminal.
Is there a way to point QV to a local file with credentials?

You can do it by using NTNAME in the section Access like so:
Section Access;
LOAD * INLINE [
ACCESS, USERID, PASSWORD,NTNAME
ADMIN, ADMIN, ADMIN , *
ADMIN, , , yourWindowsUserID
];
you can get yourWindowsUserID by using the OsUser() function (in QlikView).
hope this will help

First thing to notice is that the user churning is following the exponential function like the nuclear decay. Hence, t-test is useless (it assumes normal distribution).
Second, it is time to look at counts of users with 0,1,2.. events.
Third, just use Wilcoxon test if you compare the same cohort of users before and after. Otherwise, use Mann–Whitney U test. It is less stable than the tests on normally distributed values. However, it does the job whenever you cannot promise normality.
Spoiler: yes, it worked out. Cohort 1 and 2 were nearly indistinguishable, but cohort 3 and 1 yielded a difference at p < 0.001.

Security question: Are NodeJS spawns logged anywhere?

If you run a command in Terminal, say
rsync -avuP [SourcPath] [DestPath]
That command will get logged in, say .bash_history, .zsh_history, .bash_sessions, etc.
So if you make use of something as notoriously insecure as sshpass
say sshpass -P password -p MySecetPassword [Some command requiring std input], that too will be logged.
But what happens when you do the equivalent when spawning a process using Node JS?
passw = functionThatRetrievesPasswordSecurely();
terminalCmd = "sshpass";
terminalArgs = ["-P, "password", "-p", passw, "command", "requiring", "a", "password", "entry"];
spawn = require("child_process").spawn(terminalCmd, terminalArgs);
spawn.stdout.on("data", data => {console.log("stdout, no details"});
spawn.stderr.on("data", data => {console.log("stderr, no details"});
spawn.on("close", data => {console.log("Process complete, no details"});
Are the terminalCmd or terminalArgs logged anywhere?
If so, where?
If not, is this a secure way to make use opf sshpass?

There isn't a node specific history file for execs unless you created one by logging the arguments. There can be lower level OS logging that captures this type of data, like an audit log.
Passing a password on the command line is still considered the least secure way.
Try -f to pass a file or -d for a file descriptor instead (or ssh keys should always be the first port of call)
The man page explains...
The -p option should be considered the least secure of all of sshpass's options. All system users can see the password in the command line with a simple "ps" command. Sshpass makes a minimal attempt to hide the password, but such attempts are doomed to create race conditions without actually solving the problem. Users of sshpass are encouraged to use one of the other password passing techniques, which are all more secure.
In particular, people writing programs that are meant to communicate the password programatically are encouraged to use an anonymous pipe and pass the pipe's reading end to sshpass using the -d option.

What is the correct method to determine if a system user exists locally on windows?

I am working on an authentication system for a local server jupyterhub that relies on OAuth protocol. Additionally, it creates a local system user on windows, in case it does not exist.
What is the correct way to check whether a user exists on windows platforms using python?
This would include cases in which the system uses LDAP authentication and the user logged in the machine at least once.
I am looking for the correct windows alternative to the unix-like:
import pwd
try:
pwd.getpwnam(user.name)
except Exception as e:
print(repr(e))
My current solution is to check for the existence of the f"os.environ["SystemDrive"]\Users\{username}" folder. Side question, is there any drawback with the current method?

Here's a solution to checking if a local Windows user exists using python:
import subprocess
def local_user_exists_windows(username):
r = subprocess.run("net user",stdout=subprocess.PIPE)
#look for username in the output. Return carriage followed by line break followed by name, then space
return f"\\r\\n{username.lower()} " in str(r.stdout).lower()
Alternative is to use a regular expression to find username match (^ is regex for beginning of line if used in conjunction with multiline, \b for word boundary):
import re
re.findall(rf"^{username}\b", out,flags=re.MULTILINE | re.IGNORECASE)
Note that the \b could be replaced with \s+ meaning a space character one or more times and yield similar results. The function above will return True if given user name is an exact match with local username on Windows.
Again, my reason for this solution is there might be drawback to checking whether the path f"os.environ["SystemDrive"]\Users\{username}" exists. For example, I have a case where a Local User (e.g,local_username) exists via the net user command or via looking at "Local Users and Groups" control panel, but there is no C:\Users\local_user_name folder. One reason for this I can think of off the top of my head is perhaps the user switched from logging in as a Local User to using a Domain Account, and their User folder was deleted to save space, so the User exists, but the folder does not, etc.)
The call to net user gets local users - and the output looks something like this:
User accounts for \\SOME-WINDOWS-COMPUTER
-------------------------------------------------------------------------------
SomeUser Administrator DefaultAccount
Guest local_admin WDAGUtilityAccount
Notice how the SomeUser in this example is preceded by a \r\n followed by multiple spaces, hence looking for a username string inside this string could yield a false positive if the string you are searching is contained inside another string.
The solution above works for me, but has been tested all of ten minutes, and there might be some other simpler or more pythonic way of doing this.

How can you hide passwords in command line arguments for a process in linux

There is quite a common issue in unix world, that is when you start a process with parameters, one of them being sensitive, other users can read it just by executing ps -ef. (For example mysql -u root -p secret_pw
Most frequent recommendation I found was simply not to do that, never run processes with sensitive parameters, instead pass these information other way.
However, I found that some processes have the ability to change the parameter line after they processed the parameters, looking for example like this in processes:
xfreerdp -decorations /w:1903 /h:1119 /kbd:0x00000409 /d:HCG /u:petr.bena /parent-window:54526138 /bpp:24 /audio-mode: /drive:media /media /network:lan /rfx /cert-ignore /clipboard /port:3389 /v:cz-bw47.hcg.homecredit.net /p:********
Note /p:*********** parameter where password was removed somehow.
How can I do that? Is it possible for a process in linux to alter the argument list they received? I assume that simply overwriting the char **args I get in main() function wouldn't do the trick. I suppose that maybe changing some files in /proc pseudofs might work?

"hiding" like this does not work. At the end of the day there is a time window where your password is perfectly visible so this is a total non-starter, even if it is not completely useless.
The way to go is to pass the password in an environment variable.

Passing key material to openssl commands

Is it safe to pass a key to the openssl command via the command line parameters in Linux? I know it nulls out the actual parameter, so it can't be viewed via /proc, but, even with that, is there some way to exploit that?
I have a python app that I want to use OpenSSL to do the encryption/description through stdin/stdout streaming in a subprocess, but I want to know my keys are safe.

Passing the credentials on the command line is not safe. It will result in your password being visible in the system's process listing - even if openssl erases it from the process listing as soon as it can, it'll be there for an instant.
openssl gives you a few ways to pass credentials in - the man page has a section called "PASS PHRASE ARGUMENTS", which documents all the ways you can pass credentials into openssl. I'll explain the relevant ones:
env:var
Lets you pass the credentials in an environment variable. This is better than using the process listing, because on Linux your process's environment isn't readable by other users by default - but this isn't necessarily true on other platforms.
The downside is that other processes running as the same user, or as root, will be able to easily view the password via /proc.
It's pretty easy to use with python's subprocess:
new_env=copy.deepcopy(os.environ)
new_env["MY_PASSWORD_VAR"] = "my key data"
p = subprocess.Popen(["openssl",..., "-passin", "env:MY_PASSWORD_VAR"], env=new_env)
fd:number
This lets you tell openssl to read the credentials from a file descriptor, which it will assume is already open for reading. By using this you can write the key data directly from your process to openssl, with something like this:
r, w = os.pipe()
p = subprocess.Popen(["openssl", ..., "-passin", "fd:%i" % r], preexec_fn=lambda:os.close(w))
os.write(w, "my key data\n")
os.close(w)
This will keep your password secure from other users on the same system, assuming that they are logged in with a different account.
With the code above, you may run into issues with the os.write call blocking. This can happen if openssl waits for something else to happen before reading the key in. This can be addressed with asynchronous i/o (e.g. a select loop) or an extra thread to do the write()&close().
One drawback of this is that it doesn't work if you pass closeFds=true to subprocess.Popen. Subprocess has no way to say "don't close one specific fd", so if you need to use closeFds=true, then I'd suggest using the file: syntax (below) with a named pipe.
file:pathname
Don't use this with an actual file to store passwords! That should be avoided for many reasons, e.g. your program may be killed before it can erase the file, and with most journalling file systems it's almost impossible to truly erase the data from a disk.
However, if used with a named pipe with restrictive permissions, this can be as good as using the fd option above. The code to do this will be similar to the previous snippet, except that you'll need to create a fifo instead of using os.pipe():
pathToFifo = my_function_that_securely_makes_a_fifo()
p = subprocess.Popen(["openssl", ..., "-passin", "file:%s" % pathToFifo])
fifo = open(pathToFifo, 'w')
print >> fifo, "my key data"
fifo.close()
The print here can have the same blocking i/o problems as the os.write call above, the resolutions are also the same.

No, it is not safe. No matter what openssl does with its command line after it has started running, there is still a window of time during which the information is visible in the process' command line: after the process has been launched and before it has had a chance to null it out.
Plus, there are many ways for an accident to happen: for example, the command line gets logged by sudo before it is executed, or it ends up in a shell history file.
Openssl supports plenty of methods of passing sensitive information so that you don't have to put it in the clear on the command line. From the manpage:
pass:password
the actual password is password. Since the password is visible to utilities (like 'ps' under Unix) this form should only be used where security is not important.
env:var
obtain the password from the environment variable var. Since the environment of other processes is visible on certain platforms (e.g. ps under certain Unix OSes) this option should be used with caution.
file:pathname
the first line of pathname is the password. If the same pathname argument is supplied to -passin and -passout arguments then the first line will be used for the input password and the next line for the output password. pathname need not refer to a regular file: it could for example refer to a device or named pipe.
fd:number
read the password from the file descriptor number. This can be used to send the data via a pipe for example.
stdin
read the password from standard input.
All but the first two options are good.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string