Are third party packages/pip safe?

My question may seem a bit strange, but that's just because I'm new to programming. I am currently reading Automate the Boring Stuff with Python, and it requires you to download openpyxl to work with Excel spreadsheets. I want to use what I'm learning to generate reports where I work, but that requires me to work with sensitive customer information. I'm fairly certain that these packages are safe, but I just wanted more experienced advice.
So the question is: Are third party modules like openpyxl safe to use in a workplace environment?

This is an excellent and very relevant question. Security of 3rd party modules is indeed an increasingly important question for enterprise software development as well.
One thing is the security of the package manager itself. It should download packages over a secure channel (mostly HTTPS) and validate downloaded packages to make sure there was no tampering, either on the host or on the client after downloading but before installing. You must also be careful to enter the right package name when you install a package manually, because the package's install scripts run as the user you are installing with (often root on Linux); see this research on why that is a threat (the original website is down at the time of writing this response, articles are here or here).
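The point about entering the right package name can be made concrete with a small check run before installing anything. This is only a sketch: the allow-list APPROVED and the helper check_name are hypothetical names invented for illustration, and the difflib similarity score is a rough heuristic, not a real typosquatting detector.

```python
import difflib

# Hypothetical allow-list of packages your team has already vetted.
APPROVED = {"openpyxl", "requests", "numpy"}


def check_name(requested, approved=APPROVED):
    """Classify a requested package name against the allow-list.

    Returns ("ok", name) for an exact match, ("suspicious", closest) when the
    name is merely similar to an approved one, and ("unknown", None) otherwise.
    """
    name = requested.strip().lower()
    if name in approved:
        return "ok", name
    # A near miss of a vetted name is a hint of a typo or a look-alike package.
    close = difflib.get_close_matches(name, sorted(approved), n=1, cutoff=0.8)
    if close:
        return "suspicious", close[0]
    return "unknown", None
```

With this allow-list, check_name("reqeusts") returns ("suspicious", "requests"), which is the cue to stop and re-read the name before running pip install.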
The other thing is the code you are adding from the installed package. When you add a 3rd party module to your application, you inherently trust the person or organization that made the package. Whether you are willing to do that is your decision; the risk is that you might be adding vulnerabilities to your software through the packages you install. Of course, well-known packages probably pose less of a risk, but being well-known and widely used is by no means a guarantee of a package's security.
What you can (and should) do as due diligence when adding a new package is check online whether it has known vulnerabilities. In general, you can use online databases like the NVD for these types of queries; I don't know of such a database specific to Python.
In case of languages like Python or Ruby, you can of course also look at the source code of the package and check it for vulnerabilities. Note though that security code review is tricky business, sometimes it's not easy to spot security flaws.
So the short answer is that most packages for Python are probably OK, but using packages from unknown authors can indeed introduce serious vulnerabilities. Also, new vulnerabilities may be discovered in old packages over time, so besides checking a package when adding it to your project, you should also regularly update your 3rd party packages, especially if there are known vulnerabilities (but also if there are none).
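Checking for known vulnerabilities starts with knowing exactly which packages and versions are installed. The following is a minimal sketch using only the standard library (importlib.metadata, Python 3.8+); the function name installed_packages is an assumption made for this example, and the lookup against a vulnerability database is left to you.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) for installed distributions."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip broken installs that lack a Name field
        found.add((name.lower(), dist.version or "unknown"))
    return sorted(found)


if __name__ == "__main__":
    # Print one "name==version" line per package, ready to compare
    # against a vulnerability database by hand or with a tool.
    for name, version in installed_packages():
        print(f"{name}=={version}")
```

Running it inside the virtual environment of your project keeps the list limited to what that project actually uses.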

Related

Terraform providers vulnerability detection

I use a lot of (official and non-official) terraform providers, and I'm looking for a tool to perform security analysis on terraform providers before executing terraform plan/apply commands (and so executing provider code). I want to prevent malicious code from providers from being executed blindly.
I'm basically executing terraform providers mirror command to save local copies of required providers and I'm wondering if I can security scan that result.
I tested kics, checkov and tfsec but they are all looking for security issues in my terraform static code but not in providers.
Do you have any good advice regarding this topic?

This is actually quite a good question. There are many other problems that can be reduced to the same generic question: how do you make sure that the thing you downloaded from the internet does not do anything malicious to you? For example:
How to make sure that a minecraft plugin does not hack you?
How to make sure that a spring boot dependency does not hack you?
How to make sure that a library xxx you attach to your project does not do harm to you?
Should you use docker image yyy in your project?
Truth is: everything you use has the potential to explode right in your face (or more correctly: right in the face of the system owner). That's why the system owner (usually a company) defines a set of rules for what is allowed and what is not allowed. No set of rules you are aware of? Below is a set of rules we came up with ourselves when thinking about on-boarding a new library for some projects to use:
Do not take random stuff from github. Take only products with a longer history, a small bug backlog, few or no past issues in the CVE list, and active maintenance.
Do static code analysis yourself. Sometimes tools that work at the binary level can do that for you. Sometimes you can do it at the source level only. In the case of Java libraries, check what tools like Dependency Track think about the library and version you are about to use.
Run the code and see how it works: what does it write, what does it read, what URLs does it communicate with (do a TCP dump if necessary).
Document everything you have done somewhere.
This does not give you 100% confidence that things will not go terribly wrong. But it is a systematic approach that will reduce the risk of doing something stupid.
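The "document everything" rule can be partly automated for a local copy of downloaded artifacts, such as a directory of mirrored providers. The sketch below records a SHA-256 hash per file so that a later download can be compared against the copy you reviewed. The function names are hypothetical, and a matching hash only shows that a file is unchanged since your review; it says nothing about whether the file was safe to begin with.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old, new):
    """Paths that were added, removed, or whose hash differs."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

Storing the manifest as JSON next to your review notes gives you something concrete to diff the next time the mirror is refreshed.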

When should I create my own module package instead of using other packages?

I'm still a new node js developer, currently building a personal project, and I recently found out that there are open source packages available on npm similar to the thing I'm developing.
These packages carry new advanced concepts that I haven't come up with yet and provide more options than I want, but after thinking, it occurred to me why not develop a package that serves me in my project the way I want instead of using packages where I won't use more than 5% of the functions in my project?

Benefits of using an existing, well-supported module:
You save your development time for things that haven't already been written by someone else allowing you to make faster progress on your project
Well tested by the community (pre-tested code saves you lots of time)
Other people finding and fixing bugs (don't underestimate the importance of this)
The code will likely be kept up-to-date as tech changes over time
Possible community of people to ask questions of that knows about that package
Non-issues with using an existing, well-supported module:
Code size is rarely an issue for server-side nodejs development so the fact that a package may contain extra code that you don't need is generally not a practical issue of any consequence. If code size is paramount (like say you were running on a small, embedded system), then nodejs itself might not be the right environment as it's not exactly compact.
Reasons not to use an existing, well-supported module:
You aren't allowed to use open-source code in your project (but then you wouldn't be using nodejs if that was the case).
No existing module does what you want.
Existing modules that do what you want don't appear to be well supported or have many relevant bugs that have been open for a long time. In this case, it still might be worth it for you to clone the repository and use it as a starting point or learning point for your own module.
"I'm still a new node js developer, currently building a personal project, and I recently found out that there are open source packages available on npm similar to the thing I'm developing."
IMO, this is part of the magic sauce of doing nodejs development. The huge repository of open source packages (through NPM) that are so easy to use make your development far more productive than developing everything from scratch yourself.
"why not develop a package that serves me in my project the way I want instead of using packages where I won't use more than 5% of the functions in my project?"
Unused code doesn't really cost you anything of consequence in a server-side environment. If you really wanted you can use bundlers that support tree-shaking which removes the code you're not using.
The question that really matters is whether an existing module meets your needs or is close enough that you only have to write a little bit of code in order to use it. If that's the case, then the question becomes this: "Why should I use my precious development time to write a package from scratch when I could use far less development time by using something that is already available for free, is already tested and is already proven, and then spend that development time (which I would have spent developing that package) on other things that advance my product/service further?"
In many ways, this is really no different than using the fs module built into nodejs. You use it because it's already developed and already tested and saves you time over developing your own file access module. Yes, the fs module contains lots of code you may never need, but that's not the question. The question is whether it already contains the code you DO need.

Could there be malware in Python/Javascript files that I purchased?

I recently purchased a django-react code off fiverr. I don't know too much about web development.
This may be just paranoia, but is it possible that there could be malware in these kinds of files, and that if I run the server, something could happen to the PC I run it on?

Short answer: is it possible for HTML, PY, JS etc. files to contain malicious content? Yes. If you run this server on a PC, can any malicious content do bad things to that PC? Yes.
Ok, so that is the scary side of this done. Let's consider the question a little more objectively. Let's think about how these files can contain malicious content, and more importantly what can you do about it.
The author deliberately wrote malicious code into the files
Of course this is possible, but in my opinion unlikely. People producing malware are looking for a return on the investment of their time. Writing a solution to your request on fiverr and including malicious content is a huge investment for minimal return.
Also, please bear in mind that any contractor / freelancer is building their career on trust. If they get caught writing malicious code for customers, their reputation will suffer. There is a great book, Who Can You Trust?, which goes into detail about trust on platforms for sharing goods and services.
If you do want to check for issues in the code, then I would use a static code analyser (e.g. Fortify) and a penetration test.
The author has included an open source module that has malicious content
This, in my opinion, is more likely. There have been examples in the past of modules published via sharing mechanisms such as NPM that contained malicious content. Here are a couple of examples:
Malicious NPM package
Bitcoin Stealer
The good news here is that it is quite easy for you to check for known issues. For example, as you have JS files, I assume there is a dependency on npm; you can use npm audit to check the dependencies for security advisories.
In summary, in your position I would start by ensuring that the dependencies used by the code don't have any significant security advisories. Then, if the system is critical enough, I would use a static code analysis tool (e.g. Fortify) to check the custom code. Finally, getting a good penetration test done is always a good idea for a public facing system.
The key point here is to think about the risk profile for the system and then decide what investment you need to make to ensure the security of the system. Is this a customer facing system taking sensitive information (e.g. bank / credit card details), or an internal intranet system? The first will require more stringent security checks to ensure that you keep your customers safe.
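Before paying for a full static analysis or a penetration test, a crude first pass over purchased code can at least point a reviewer at the lines worth reading first. The sketch below is an assumption-laden illustration: the pattern list SUSPICIOUS is hypothetical and deliberately small, it will produce false positives, and obfuscated code can easily slip past it. It is a triage aid, not a substitute for a real analyser.

```python
import re
from pathlib import Path

# Hypothetical, deliberately small pattern list; tune it for your own review.
SUSPICIOUS = {
    "dynamic code execution": re.compile(r"\b(eval|exec)\s*\("),
    "shell or process spawn": re.compile(r"\b(subprocess|os\.system|child_process)\b"),
    "base64 decoding": re.compile(r"\b(b64decode|atob)\s*\("),
}


def scan_text(text):
    """Return (line_number, label, stripped_line) for every pattern hit."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPICIOUS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


def scan_tree(root, suffixes=(".py", ".js")):
    """Scan matching files under root; map file path -> list of hits."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            hits = scan_text(path.read_text(encoding="utf-8", errors="replace"))
            if hits:
                results[str(path)] = hits
    return results
```

Point scan_tree at the custom code only; a node_modules directory is better covered by npm audit, as described above.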

Jenkins security as an open-source tool

I work in a corporate development environment that is fairly risk-averse where management is often afraid of change. I've prototyped out how a Jenkins solution for our development team might work, and highlighted some success stories where the pilot implementation has helped, but the time has now come to get it approved to a wider audience and in a more permanent way, and some security concerns have been raised.
Primarily, the concerns so far have focused on the fact that the tool is open-sourced and the plugins are open-sourced and made by community contributors, so management is concerned that somebody could insert malicious code that would go unnoticed by us when we update. My opinion is that if so many other places can make Jenkins work, we probably can too, but that is not necessarily a very compelling argument to our security testing team.
My question is, can anybody tell me how they have secured their own Jenkins implementations, or what specific Jenkins capabilities (sandboxing, etc.) are in place to prevent malicious code from being executed on our systems?

Using 3rd party components either in your software or your infrastructure will always have risks. One very important thing to note is that open source is not less secure than closed source. While probably anybody might contribute code to an open source project, in most cases there is review before it actually makes its way into the project. Of course, a vulnerability may slip through, but how is that different from a software company with lots of developers? A vulnerability may slip through there too, and based on the experience of many of us, it quite often does. :) And in case of closed source, you don't even have the power of a diverse community to spot such security flaws, the best you can rely on are 3rd party penetration tests or code scans, both of which miss many issues.
In case of such a well established project like Jenkins, you can be pretty sure that there is lots of scrutiny on its security, probably more so than any closed source commercial tool you may currently have.
As with any 3rd party component, you should exercise due diligence though. Have a look at online vulnerability databases like NVD regularly to find security issues. Install updates as they come out to mitigate the risk. You should do these for closed source components too.
As for how to secure a Jenkins installation, an answer here is not the right format I think, but there is a whole set of pages on their website dedicated to the topic.
Having said all this and looking at past vulnerabilities in Jenkins, there are quite a few. It's up to you (and your security department) to assess how exactly you would want to deploy Jenkins, and whether those past vulnerabilities are serious enough for you to think the whole tool is not adequate for your environment considering the way you want to deploy it. Again, it's the same process you would follow with a closed source tool too.

How to prevent malicious *.js scripts from executing in Node.js

I'm using Node.js to create a web service. In the implementation, I use many third party modules which are installed via npm. There is a security issue if the modules I use contain malicious *.js scripts. For example, the malicious code may delete all my disk files, or silently collect secret data.
I have a couple of questions regarding this.
How to detect if there is a security issue in a module?
What should I do to prevent malicious *.js scripts from executing in Node.js?
I would very much appreciate it if you could share any experience of building a node.js service.
Thanks,
Jeffrey

One concern you did not raise is that a module might try to make a direct connection to your database itself, or to other services on your internal network. This might be prevented by setting passwords which the module cannot find so easily.
1. Restricting disk access
This project was presented at NodeConf last year. It attempts to restrict filesystem access in precisely the situation you describe.
https://github.com/yahoo/fs-lock
"The goal for this module is to help when you are loading 3rd party modules and you need to restrict their access."
It sounds rather like the proposal Jeffrey made in the comments in Plato's answer.
(If you want to look further into hooking OS calls, this hookit project may present a few ideas. Although in its current form it only wraps the callback function, it might provide inspiration of what to hook, and how. Here is an example of it being used.)
2. Analyse flow of sensitive data
If you are only worried about data-stealing (not filesystem or database access), then you can focus your concerns:
You should be most concerned about those packages which are being passed sensitive data. Presumably some of the data on your web-service is presented to the public anyway!
Most packages will not have access to the full stack of your application, only the bits of data you pass them. If a package is only being passed a small amount of sensitive data, and never passed the rest of the data, it may not be able to do anything malicious with the data it receives. (For example, if you pass all your usernames to one package for processing and all your addresses to a different package, that is a much smaller concern than if you pass all your usernames, addresses and credit-card numbers to the same package!)
Identify the sensitive data in your app, and note which functions in which modules they are passed to.
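The idea of passing each package only the data it needs is independent of the language. Here is a minimal sketch of it in Python; only_fields and third_party_format_name are hypothetical names, the latter standing in for a function from a package you do not fully trust.

```python
def only_fields(record, allowed):
    """Copy just the allowed keys so the callee never sees the rest."""
    return {key: record[key] for key in allowed if key in record}


# Hypothetical stand-in for a function from an untrusted package.
def third_party_format_name(data):
    return data["username"].title()


customer = {
    "username": "jane doe",
    "address": "1 Example Street",
    "card_number": "0000-0000-0000-0000",
}

# The untrusted function receives a copy holding the username and nothing else.
safe_view = only_fields(customer, ["username"])
display_name = third_party_format_name(safe_view)
```

Because safe_view is a separate dictionary, the called code cannot reach the address or card number through the object it was handed.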
3. Perform efficient code review
You may not need to go to Github to read the code. The great majority of packages provide all their source-code in their install folder inside node_modules. (There are a few packages which provide binaries however; these are naturally harder to verify.)
If you do want to check the code yourself, there may be ways to reduce the amount of work involved:
To secure your own app, you do not need to read the entire source code of all packages in your project. You only need to review those functions which are actually called.
You may trace the code by reading it, or with the aid of a text-based debugger, or a GUI debugger. (Of course you should look out for branching, where different inputs may cause different parts of the module to be called.)
Set breakpoints when you call into a module which you don't trust, so you can step through the code that is called and see what it does. You may be able to conclude that only a small part of the module is used, so only that code needs to be verified.
Whilst tracing flow should cover concerns about sensitive data at runtime, to check for file access or database access, we should also look at the initialisation code of each module which is required, and all calls (including requires) which are made from there.
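The advice to review only the functions that are actually called can be supported by tooling. The answer is about Node.js debuggers; the sketch below shows an analogous idea in Python, using sys.setprofile to record which Python-level functions of a given package run during one call. The helper record_calls is a hypothetical name, and the result covers only the code path taken for that particular input, so other branches and import-time code still need separate attention.

```python
import os
import sys


def record_calls(func, package_dir, *args, **kwargs):
    """Run func and collect Python functions executed under package_dir.

    Returns (result, sorted list of (file name, function name)).
    """
    package_dir = os.path.abspath(package_dir)
    seen = set()

    def profiler(frame, event, arg):
        # "call" fires for Python functions; C functions report "c_call".
        if event == "call":
            filename = os.path.abspath(frame.f_code.co_filename)
            if filename.startswith(package_dir + os.sep):
                seen.add((os.path.basename(filename), frame.f_code.co_name))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(previous)  # always restore the earlier profiler
    return result, sorted(seen)
```

For example, calling record_calls(json.dumps, os.path.dirname(json.__file__), {"a": 1}) lists the functions inside the json package that were touched while serialising that one value.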
4. Other measures
It might be wise to lock the version number of each package in package.json so that you don't accidentally install a new version of a package until you decide that you need to.
You may use social factors to build confidence in a package. Check the respectability of the author. Who is he, and who does he work for? Do the author and his employers have a reputation to uphold? Similarly, who uses his project? If the package is very popular, and used by industry giants, it is likely that others have already reviewed the code.
You may wish to visit github and enable notifications for all the top-level modules you are using, by "watching" the repository. This will inform you if any vulnerabilities are reported in the package in future.
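The suggestion to lock version numbers in package.json can be checked mechanically. The sketch below, written in Python for consistency with the other examples here, flags dependency entries that are not a single exact version. The function name unpinned_dependencies and the EXACT_VERSION pattern are assumptions for illustration, and the pattern treats anything other than a plain x.y.z version (ranges, tags, URLs) as unpinned.

```python
import json
import re

# Matches one exact semantic version such as 4.17.21 or 1.0.0-beta.1.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def unpinned_dependencies(package_json_text):
    """Return {name: spec} for dependencies not locked to one exact version."""
    manifest = json.loads(package_json_text)
    loose = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                loose[name] = spec
    return loose
```

Note that this only looks at direct dependencies listed in package.json; the versions of their own dependencies are recorded in the lock file rather than here.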

Most (all?) modules have source code available on Github; you can read through the source and look for security problems, or hire a security professional to do the job.
I just take the risk, although I tend to use popular packages with hundreds of commits, active maintenance, and issue lists.

If your project dependency tree is large enough, reviewing all of your dependencies is not a feasible long-term strategy.
The original answer from Joey has some good countermeasures you can use for specific scenarios. I've also seen https://github.com/berstend/node-safe - could make you slightly safer on mac.
A general solution to the problem is taking shape though.
How to protect a project from malicious packages
Make sure you don't run lifecycle (postinstall) scripts unless they're known and necessary (see my talk on this topic).
Put 3rd party code in a compartment, lock down the environment, and decide which powerful APIs to pass to each package.
The second step requires the use of Compartment, which is a work-in-progress proposal in TC39: https://github.com/tc39/proposal-compartments/
But a shim exists, and some tooling was built on top of that shim.
You could use the SES-shim directly and implement your own controls, or use the convenience of LavaMoat.
LavaMoat lets you generate and tweak a per-package policy where you can decide which globals and builtins it should have access to.
LavaMoat also offers a tool to manage install scripts.
Here's my talk on SES and LavaMoat with a demo at the end.
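Before deciding which lifecycle scripts to allow, it can help to see which installed packages declare any. The sketch below, in Python, walks a node_modules directory and reports packages whose package.json defines a preinstall, install or postinstall script. The function name is hypothetical, and because it reads every package.json it finds, it may also report manifests that packages ship as test fixtures; treat the output as a list of things to look at, not a verdict.

```python
import json
from pathlib import Path

LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")


def packages_with_install_scripts(node_modules):
    """Map package name -> {hook: command} for packages declaring install hooks."""
    found = {}
    for manifest_path in sorted(Path(node_modules).rglob("package.json")):
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest; skip it
        if not isinstance(manifest, dict):
            continue
        scripts = manifest.get("scripts") or {}
        if not isinstance(scripts, dict):
            continue
        hooks = {h: scripts[h] for h in LIFECYCLE_HOOKS if h in scripts}
        if hooks:
            # Fall back to the directory path if the manifest has no name.
            name = manifest.get("name") or str(manifest_path.parent)
            found[name] = hooks
    return found
```

Each command it prints is something that would run with your user's permissions during an ordinary install, which is why reviewing that list comes first.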
How to set up LavaMoat
See LavaMoat docs for more details
Disable/allow dependency lifecycle scripts (e.g. "postinstall") via @lavamoat/allow-scripts:
npm i --ignore-scripts -D @lavamoat/allow-scripts
npx --no-install allow-scripts setup
npx --no-install allow-scripts auto
Then, edit the allow-list in package.json.
After every install/reinstall, run allow-scripts.
run your server or build process in lavamoat-node
npm i -D lavamoat
in your package.json add something like:
"scripts": {
  "lavamoat-policy": "lavamoat app.js --autopolicy",
  "start": "lavamoat app.js"
}
run lavamoat-policy every time you make changes to your dependency tree and review the policy (see also: policy override)
run npm start to start your app
Disclaimer: I contribute to LavaMoat and Endo. They are Open Source projects on permissive licenses.
