In my experience, we can use replace() to filter out && and | to prevent command injection.
Our code needs to send a base64 string to another process, but Checkmarx reports a Stored Command Injection finding. Can Convert.ToBase64String(Encoding.UTF8.GetBytes(jsonFromDb)) deal with it, or do we still need to do the replacement?
Good question. I think you should still make sure that you use prepared statements. Base64 only contains the letters of the alphabet (upper and lowercase), digits, +, / and = as the final padding character. That leaves out characters such as quotes.
However, what would happen if somebody decides to use a different alphabet? What if somebody decides that base 64 is a waste of space? Because that's certainly what it is if you convert existing text to base 64. Then you'd suddenly be in trouble.
All in all, using prepared statements is more secure, more performant and the best way of handling command injection. As an added bonus, you'd get rid of that pesky checkmarx.
I'd only use base 64 if there is absolutely no other way to change that other process, and if that process cannot be reached otherwise. In that case you might want to use base64url though, as it is likely you can do even less with _ and - instead of +, / and =.
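To make the alphabet difference concrete, here is a minimal sketch in Go (not the C# from the question) using the standard library's encoding/base64. The helper names encodeStd and encodeURLNoPad are made up for illustration, and the two input bytes are chosen only because they produce every special character.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeStd uses the standard alphabet (+ and /) with = padding.
func encodeStd(data []byte) string {
	return base64.StdEncoding.EncodeToString(data)
}

// encodeURLNoPad uses the base64url alphabet (- and _) and omits the padding.
func encodeURLNoPad(data []byte) string {
	return base64.RawURLEncoding.EncodeToString(data)
}

func main() {
	data := []byte{0xfb, 0xff} // hypothetical payload that hits +, / and =

	fmt.Println(encodeStd(data))      // +/8=
	fmt.Println(encodeURLNoPad(data)) // -_8
}
```

Either way, this only narrows the character set; it does not replace passing the value to the other process as data rather than as part of a command string.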
But to be honest, that code needs to get cleaned up pronto.
I am taking user input in an application I am writing and would like to expand escape sequences that the user enters.
For example, if the user enters \n it will be interpreted into a str \\n. I want to, in a general way, interpret that string (into a newline) and similar ones.
I could of course use String::replace() on the most essential ones and live without the rest, but I would prefer a general solution that also handles hex escapes (\x61 is a).
Escapes are usually handled by the lexer / parser (basically they're part of the language grammar). I don't think there's a stdlib function which would manage them, as it would be done at a much lower level.
Furthermore, escapes tend to be highly language-specific, in possibly unexpected ways. Rust has a singularly small list of escape sequences, which is probably desirable compared to the garbage available from C's, but I still do not know that you'd want to allow e.g. arbitrary hex or unicode escape sequences.
I would therefore recommend setting up your own explicitly supported list of escapes, though if you really do not want to there are probably third-party packages which can help you.
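As a rough sketch of what an "explicitly supported list" could look like, here is a small hand-rolled expander. It is written in Go rather than Rust, and the function name expandEscapes and the chosen escape set (\n, \t, \r, \\, \" and \xHH limited to ASCII) are assumptions for illustration, not anything a standard library provides.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// expandEscapes interprets a fixed allow-list of escapes and rejects the rest.
// Scanning byte by byte is safe for UTF-8 input because a backslash byte
// never appears inside a multi-byte sequence.
func expandEscapes(in string) (string, error) {
	var out strings.Builder
	for i := 0; i < len(in); i++ {
		if in[i] != '\\' {
			out.WriteByte(in[i])
			continue
		}
		i++ // move to the character after the backslash
		if i >= len(in) {
			return "", errors.New("trailing backslash")
		}
		switch in[i] {
		case 'n':
			out.WriteByte('\n')
		case 't':
			out.WriteByte('\t')
		case 'r':
			out.WriteByte('\r')
		case '\\', '"':
			out.WriteByte(in[i])
		case 'x':
			if i+2 >= len(in) {
				return "", errors.New("truncated \\x escape")
			}
			v, err := strconv.ParseUint(in[i+1:i+3], 16, 8)
			if err != nil || v > 0x7f { // keep the result valid UTF-8
				return "", errors.New("invalid \\x escape")
			}
			out.WriteByte(byte(v))
			i += 2
		default:
			return "", fmt.Errorf("unsupported escape \\%c", in[i])
		}
	}
	return out.String(), nil
}

func main() {
	s, err := expandEscapes(`\x61 then\ta newline\n`)
	fmt.Printf("%q %v\n", s, err)
}
```

Rejecting unknown escapes instead of passing them through is a design choice here; it keeps the supported list explicit.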
I have read a bit about the security of printf() in C++.
Examples can be found e.g. here.
It left me wondering if fmt.Printf() from golang is safe.
To be more specific if it is safe if the formatted string itself could be forged.
inputString := "String from user"
x := "test"
fmt.Printf(inputString, x, 15)
When trying to replicate the exploits from C++, golang does not seem to be vulnerable.
E.g. fmt.Printf("%s%s%s%s%s%s%s%s%s%s%s%s\n") does not crash the program in golang.
Such an analysis is of course no proof that this would be secure in golang. So I wanted to ask here: have the developers of Go foolproofed its printf function?
Edit: By foolproof I mean that it does not have any unexpected side effects.
I would expect the resulting string to be totally compromised of course.
I would not expect the user to be able to gain privileged information (like the content of variables not passed to printf), or the user to be able to write any memory (e.g. assign a new value to x).
Many of the memory issues in C/C++ are related to null termination and buffer overflows. Golang lacks both of those. Strings are managed resources. Barring a compiler bug, it's not possible to terminate a string in such a way as to escape into the stack.
Take your own example. Because the function is variadic, a format string with a lot of verbs and no matching arguments does not impact the code. As far as printf knows, there is simply nothing available to substitute for them. Since you can't pass in anything destructive, even if your example took a dynamic value for 'a ...interface{}', you're protected by the compiler's string-protecting code.
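A short sketch of what that looks like in practice: when verbs and arguments don't line up, fmt writes an error marker into the output instead of reading stray memory. The helpers renderUntrusted and renderSafe are hypothetical names used only for this example.

```go
package main

import "fmt"

// renderUntrusted uses caller-supplied text as the format string.
func renderUntrusted(format string, args []interface{}) string {
	return fmt.Sprintf(format, args...)
}

// renderSafe keeps the format string constant and treats the input as data.
func renderSafe(input string) string {
	return fmt.Sprintf("%s", input)
}

func main() {
	// Verbs without arguments are reported in the output, nothing else happens.
	fmt.Println(renderUntrusted("%s%s", nil)) // %!s(MISSING)%!s(MISSING)

	// A verb that does not fit the argument type is reported the same way.
	fmt.Println(renderUntrusted("%d", []interface{}{"test"})) // %!d(string=test)

	// With a constant format string the input is printed verbatim.
	fmt.Println(renderSafe("%s%s")) // %s%s
}
```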
tl;dr: Don't print untrusted data, escape it first. If in doubt or a hurry, use %q instead of %s.
Go's string formatting methods are, indeed, memory-safe, and the specific class of vulnerabilities you're worried about and which were so prevalent in C are not applicable.
However, it is not generally safe in any language to output untrusted, unsanitized input via any means without careful consideration. One well-known example of why is cross-site scripting. Less well-known than XSS attacks, however, are terminal escape sequence attacks.
Common terminals respond to a wide variety of escape sequences that can potentially do nasty things directly, or to help exploit another vulnerability. Attackers can include these sequences in messages to a targeted system - say, in the URL of a request to a webserver - and await an admin cating the logs or similar. This can also be used to hide information - including backspace sequences effectively hides whatever came before from view in a terminal.
The quickest option is to simply use %q instead of %s. This outputs the string as a Go string literal, suitable for use in Go source code. This is convenient and safe for printing, but if you're trying to match a specific format, may not be ideal. (strconv.Quote(s string) string will also perform the same operation directly).
This answer to another question has a more involved but easily customized option using strings.Map, however in the given form it removes all non-printable characters rather than escaping them.
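For reference, a minimal sketch of that strings.Map approach might look like the following. This is not the code from the linked answer; stripNonPrintable is a name chosen here, and using unicode.IsPrint as the filter is an assumption (it also drops tabs and newlines).

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripNonPrintable removes every rune that unicode.IsPrint rejects.
// Returning a negative value from the mapping function drops the rune.
func stripNonPrintable(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsPrint(r) {
			return r
		}
		return -1
	}, s)
}

func main() {
	// The backspaces and the ESC byte are removed; the rest is kept.
	fmt.Println(stripNonPrintable("control\x08\x08hidden \x1b[31mred"))
}
```

To escape instead of remove, the mapping function would have to produce more than one rune per input rune, which strings.Map cannot do; that is where %q or strconv.Quote is the simpler tool.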
One way or another, sanitize your output. You will save someone a lot of pain later, possibly yourself.
package main

import (
	"fmt"
	"strconv"
)

func main() {
	s := "a string with some control\x08\x08\x08\x08\x08\x08\x08hidden characters"

	// Prints with the control characters intact. Terminal output:
	// a string with some hidden characters
	fmt.Printf("%s\n", s)

	// Prints as Go string literal. Terminal output:
	// "a string with some control\b\b\b\b\b\b\bhidden characters"
	fmt.Printf("%q\n", s)

	// Prints as Go string literal, but without surrounding double-quotes.
	// Terminal output:
	// a string with some control\b\b\b\b\b\b\bhidden characters
	x := strconv.Quote(s)
	fmt.Printf("%s\n", x[1:len(x)-1])
}
I am building a string compressor and for simplicity reasons, I wanted to use some non-printable characters.
1) Is it in some way "bad" to use the 0-31 ASCII characters?
2) Can these characters occur in a normal text string?
If the answer is "partially":
3) What of them is better to use in this case? I think I will need maximum 9 of them.
Well the answer is that it depends on how you're using it. If you're treating the "string" as binary, then binary by definition can have any value. However if it is meant to be read/printed, it could cause serious problems to use characters 0-31.
It isn't too big a deal for the most part, except that 0 is "end of string" by many platforms. Though again, it depends entirely on how you're using it. My advice would be at the very least, avoid character 0. If you want the user to be able to copy and paste the string, then none of these would be suitable. They must be printable characters, in other words.
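If you do go with control characters as markers, one way to pick them is to check which ones never occur in the input. The sketch below does that in Go; the function name unusedControlBytes is made up, and skipping 0, tab, LF and CR is an assumption based on the advice above (0 as string terminator, the others being common in ordinary text).

```go
package main

import (
	"fmt"
	"strings"
)

// unusedControlBytes returns the bytes in 1..31 that do not occur in text,
// leaving out tab, line feed and carriage return because normal text uses them.
func unusedControlBytes(text string) []byte {
	var free []byte
	for c := byte(1); c < 32; c++ {
		if c == '\t' || c == '\n' || c == '\r' {
			continue
		}
		if strings.IndexByte(text, c) < 0 {
			free = append(free, c)
		}
	}
	return free
}

func main() {
	free := unusedControlBytes("some text\nwith a stray \x01 byte")
	fmt.Println(len(free), free[:9]) // how many are free, and nine candidates
}
```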
I'm working on bringing Unicode support into a narrow-string application, and looking at how carefree it is with its all-char * strings, never weighed down by the possibility of an invalid string, made me think of the following:
When decoding Unicode, the programmer is generally presented with three choices on how to handle ill-formed strings: ignore all decoding errors and strip invalid characters out of the resulting string, stop at the first decoding error, or replace anything that can't be decoded with replacement characters.
I don't like the ignoring approach for security reasons: it's easy to make a string that looks good at first glance but becomes evil after carefully designed errors are stripped out. Replacing errors with replacement characters is much better in this case. It might look worse, but there is a clear visual indication that something did not go as planned, and replacement characters don't allow words to merge into something with a different meaning.
But what are real-life use-cases of throwing an exception or stopping decoding after first error? What is the point of such "validation"? Let's assume some function got an apparently invalid UTF8 string - what is programmer supposed to do with this knowledge?
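For illustration, here is how the three choices look in Go (used here only as an example language; the application in question is C-style). The wrapper names isValid, stripInvalid and replaceInvalid are made up; utf8.ValidString and strings.ToValidUTF8 are standard library calls. The sample string shows the merging problem described above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// isValid is the strict choice: report the problem and let the caller decide.
func isValid(s string) bool {
	return utf8.ValidString(s)
}

// stripInvalid silently drops ill-formed bytes.
func stripInvalid(s string) string {
	return strings.ToValidUTF8(s, "")
}

// replaceInvalid substitutes U+FFFD for each run of ill-formed bytes.
func replaceInvalid(s string) string {
	return strings.ToValidUTF8(s, "\uFFFD")
}

func main() {
	s := "<scr\xffipt>" // 0xFF can never appear in well-formed UTF-8

	fmt.Println(isValid(s))        // false
	fmt.Println(stripInvalid(s))   // <script>
	fmt.Println(replaceInvalid(s)) // the two halves stay separated by U+FFFD
}
```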
Base64 encoding is often used to obfuscate plaintext. I am wondering if there are any quick/easy ways of obfuscating a base 64 string, so that it is not easily recognizable as such. To do so, the method should obfuscate the padding characters (='s) such that they become some other symbol and are more dispersed.
Does anyone know of an easy (and easily reversible) way to do this?
You could use a shift cipher, but I am looking for something that's a little more comprehensive, for example if my shift cipher mapped = to a, someone might notice a string that frequently ends in a's.
The purpose is not to add security; it is simply to make base64 unrecognizable as base 64. It also does not need to get past a security professional, just an individual who knows what base64 is and what it looks like (e.g. ='s at the end).
The method I describe would probably add non-base-64 characters, like ^%$##!, to help confuse the reader.
Most of the replies seem to be on the topic of WHY I would want to do this, and the basic answer is that the operation would be completed numerous times (So I want something inexpensive), and done in a way where no password can be remembered (Why I don't XOR). Also the data isn't highly sensitive, and is just to be used as a method against the casual user, who might have knowledge of what a base 64 string is.
A couple of suggestions:
Strip any ending = (according to Wikipedia they are not needed) and then bitwise negate each byte. This will transform the text into mostly non-readable characters.
Loop over the data and xor each character with its position, modulo 256. This will eliminate any simple statistical analysis since the mapping of each character depends on the position in the string.
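A minimal Go sketch of the second suggestion combined with dropping the padding, assuming the result may be arbitrary bytes rather than printable text. The function names are made up for this example; base64.RawStdEncoding is the standard library's unpadded variant.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// xorWithPosition XORs every byte with its index modulo 256.
// Applying it a second time restores the original bytes.
func xorWithPosition(data []byte) []byte {
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = b ^ byte(i) // byte(i) wraps around every 256 positions
	}
	return out
}

// obfuscate encodes without = padding, then applies the positional XOR.
func obfuscate(plain []byte) []byte {
	unpadded := base64.RawStdEncoding.EncodeToString(plain)
	return xorWithPosition([]byte(unpadded))
}

// deobfuscate reverses obfuscate.
func deobfuscate(obf []byte) ([]byte, error) {
	unpadded := xorWithPosition(obf)
	return base64.RawStdEncoding.DecodeString(string(unpadded))
}

func main() {
	obf := obfuscate([]byte("foobar1"))
	fmt.Printf("%q\n", obf)

	plain, err := deobfuscate(obf)
	fmt.Println(string(plain), err)
}
```

This is obfuscation only, in line with the question; anyone who guesses the scheme can undo it.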
In contrast to one of the points in Anders Abel's best answer, the = signs in the base64 strings seem to matter:
$ echo -n foobar | base64
Zm9vYmFy
$ echo -n foobar1 | base64
Zm9vYmFyMQ==
$ echo -n Zm9vYmFyMQ | base64 -D
foobar$ echo -n Zm9vYmFyMQ= | base64 -D
foobar$ echo -n Zm9vYmFyMQ== | base64 -D
foobar1$
What you are asking for is called "security by obscurity" and generally is a bad idea.
Base64 encoding was never designed or intended to be used to obfuscate text or data. It's used to encode binary data which needs to travel through some communication channel which allows only ASCII characters, like email messages, or be part of XML, etc.
Better to use real encryption if you want to hide the data. In any case, if after encrypting the data you need to pass it as XML, etc., you may end up encoding it in Base64 again for transport purposes.
I suppose you could generate a small amount of random data, and then use that to encode the Base64 characters. Prepend the random data to the re-encoded Base64 data.
A very simple example: given an input string "Hello", generate a random number in the range 1-9 and use that as the offset to apply to each input character. Suppose you generate "5", then the re-encoded string would be "5Mjqqt". Or encode the offset as a letter rather than as a number (a=1, b=2, ...) Then the "=" padding will be translated to a different character each time.
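Here is a small Go sketch of that idea. One assumption beyond the description above: instead of adding the offset to the raw character code, it rotates within the 65 symbols of base64 (including =), so the output stays within the same character set. For "Hello" with offset 5 this gives the same "5Mjqqt". The names shiftEncode and shiftDecode are made up.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"strings"
)

// The 64 base64 symbols plus the padding character.
const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="

// shiftEncode rotates each symbol by offset (1-9) and prepends the offset digit.
func shiftEncode(b64 string, offset int) (string, error) {
	if offset < 1 || offset > 9 {
		return "", errors.New("offset must be in 1..9")
	}
	out := []byte{byte('0' + offset)}
	for i := 0; i < len(b64); i++ {
		idx := strings.IndexByte(alphabet, b64[i])
		if idx < 0 {
			return "", errors.New("input is not base64")
		}
		out = append(out, alphabet[(idx+offset)%len(alphabet)])
	}
	return string(out), nil
}

// shiftDecode reads the leading offset digit and rotates back.
func shiftDecode(enc string) (string, error) {
	if len(enc) == 0 || enc[0] < '1' || enc[0] > '9' {
		return "", errors.New("missing offset digit")
	}
	offset := int(enc[0] - '0')
	out := make([]byte, 0, len(enc)-1)
	for i := 1; i < len(enc); i++ {
		idx := strings.IndexByte(alphabet, enc[i])
		if idx < 0 {
			return "", errors.New("input is not in the expected alphabet")
		}
		out = append(out, alphabet[(idx-offset+len(alphabet))%len(alphabet)])
	}
	return string(out), nil
}

func main() {
	fixed, _ := shiftEncode("Hello", 5)
	fmt.Println(fixed) // 5Mjqqt

	// With a random offset the padding maps to a different symbol each time.
	random, _ := shiftEncode("Zm9vYmFyMQ==", rand.Intn(9)+1)
	fmt.Println(random)
}
```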
Or you could just drop the padding; according to the Wikipedia article, it's not really necessary.
(But consider whether this is really a necessary and sufficient thing to be doing in the first place. It's not clear from your question why you want to obfuscate base 64 data.)
Agreed with the responses suggesting use of encryption if your requirements are to actually keep someone who is determined to decode the data from reversing the process.
Otherwise, the answer somewhat depends on other constraints of your system, but a few ideas came to mind. If you're just concerned about the delimiter characters, and you have control over the process that generates the Base64 to begin with, you could choose some method of padding the data prior to conversion, thus eliminating the '=' characters from the output.
Along this same vein, you could use one of the variants like 'base64url' encoding (see http://en.wikipedia.org/wiki/Base64 for lots of good info on the variants) that does not use the pad character.
After eliminating the '=' by one of these methods, you could perhaps do some sort of char-swapping on the generated Base64, just swapping every other character and leaving any final character in place. You could also perhaps substitute some of the upper- or lowercase letters with other characters to make it look less like Base64 at a quick glance.
However, whatever idea you choose, just remember that it will not be a substitute for a real encryption scheme if you require real protection of that data.
Base64 is usually used when you want your data to go through some channel that can distort non-alphanumeric symbols, for example in XML. If that is your task too, your output will look similar to Base64 no matter how you try :)
If your channel handles binary data well, then just get the source text (decode the Base64 back), get its binary representation and use some sort of xor. For example, xor every byte of the source bytes with 37. The same operation will restore your text.
But it is still easily recognizable by anyone who has basic knowledge of cryptanalysis. If that is a problem, use real encryption.
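A tiny Go sketch of the constant-key XOR suggested above, with 37 as the key from the answer and xorBytes as a made-up helper name. The output also shows why it is easy to spot: equal input bytes map to equal output bytes.

```go
package main

import "fmt"

// xorBytes XORs every byte with the same key. Running it twice with the
// same key returns the original data.
func xorBytes(data []byte, key byte) []byte {
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = b ^ key
	}
	return out
}

func main() {
	hidden := xorBytes([]byte("Hello"), 37)
	fmt.Println(string(hidden)) // m@IIJ (the double l is still a double letter)

	fmt.Println(string(xorBytes(hidden, 37))) // Hello
}
```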