I've seen that SQL injection strings are often constructed like this:
' ; DROP DATABASE db --
Therefore, if I disallow the use of semicolons in my application's inputs, does this 100% prevent any SQL injection attack?
Use parameterized queries (or stored procedures) and avoid dynamic SQL like the plague.
I suggest using built in library functions instead of trying to write your own anti-injection code.
A naive implementation will strip out ; even if it should be used (say as part of a passed in VARCHAR or CHAR parameter, where it is legal). You will end up having to write your own SQL parser in order to accept/reject queries.
You can read here more about dynamic SQL and the problems it presents (and solves).
No it does not prevent sql injection attacks. Any time you're dynamically constructing SQL either in the client side, or with the EXEC inside a stored proc, you are at risk.
Parameterized queries are the preferred way to get your input into query.
No, it doesn’t. You just showed one example of an SQL injection. But there are far more, all depending on the context you insert the data into.
Besided that, it’s not the semicolon that causes this issue but the ' that ends the string declaration prematurely. Encode your input data properly to prevent SQL injection.
It MAY stop from injecting a secondary DDL statement like your drop since they'll likely create a syntax error within a select, but it won't stop statements like this:
Return more data than intended
' or 1=1
Delete with output
' or 1 in (select * from (delete someOtherTable OUTPUT DELETED.* ) a)
Injection attacks revolve around ANY manipulation of the intended SQL statement, not just termination of that statement and injecting a second statement.
Injection Attacks Come from Un-sanitized User Input
It's important, however, to not conflate a SQL injection attack entirely with dynamic/inline sql. SQL injection attacks come from UN-SANITIZED USER INPUT used in dynamic sql. You can "build" a query with no problem as long as the components of that query all come from sources that you trust. For example we use schemas to hold custom structures...
$"select * from {customSchemaName}.EmployeeExtension where id=#id and clientid=#clientId"
The above is still a parameterized query from an input point of view, but the schema name is an internal system lookup that no interface has access to.
I make the case for inline/dynamic sql here:
https://dba.stackexchange.com/a/239571/9798
The best way to avoid SQL injection is to avoid string concatenation of user supplied data. This is best accomplished by either using stored procedures or using parameterized queries.
No, don't focus on semicolons. Focus on the way you put user input in the sql query - usually in quotes - and then focus on the quotes. Also don't forget when you work with regexp in sql that they need slightly different escaping procedure.
it will depend on a variety of things (which queries, etc). you should use prepared statements for this
Related
I am on jooq queries now...I feel the SQL queries looks more readable and maintainable and why we need to use JOOQ instead of using native SQL queries.
Can someone explains few reason for using the same?
Thanks.
Here are the top value propositions that you will never get with native (string based) SQL:
Dynamic SQL is what jOOQ is really really good at. You can compose the most complex queries dynamically based on user input, configuration, etc. and still be sure that the query will run correctly.
An often underestimated effect of dynamic SQL is the fact that you will be able to think of SQL as an algebra, because instead of writing difficult to compose native SQL syntax (with all the keywords, and weird parenthesis rules, etc.), you can think in terms of expression trees, because you're effectively building an expression tree for your queries. Not only will this allow you to implement more sophisticated features, such as SQL transformation for multi tenancy or row level security, but every day things like transforming a set of values into a SQL set operation
Vendor agnosticity. As soon as you have to support more than one SQL dialect, writing SQL manually is close to impossible because of the many subtle differences in dialects. The jOOQ documentation illustrates this e.g. with the LIMIT clause. Once this is a problem you have, you have to use either JPA (much restricted query language: JPQL) or jOOQ (almost no limitations with respect to SQL usage).
Type safety. Now, you will get type safety when you write views and stored procedures as well, but very often, you want to run ad-hoc queries from Java, and there is no guarantee about table names, column names, column data types, or syntax correctness when you do SQL in a string based fashion, e.g. using JDBC or JdbcTemplate, etc. By the way: jOOQ encourages you to use as many views and stored procedures as you want. They fit perfectly in the jOOQ paradigm.
Code generation. Which leads to more type safety. Your database schema becomes part of your client code. Your client code no longer compiles when your queries are incorrect. Imagine someone renaming a column and forgetting to refactor the 20 queries that use it. IDEs only provide some degree of safety when writing the query for the first time, they don't help you when you refactor your schema. With jOOQ, your build fails and you can fix the problem long before you go into production.
Documentation. The generated code also acts as documentation for your schema. Comments on your tables, columns turn into Javadoc, which you can introspect in your client language, without the need for looking them up in the server.
Data type bindings are very easy with jOOQ. Imagine using a library of 100s of stored procedures. Not only will you be able to access them type safely (through code generation), as if they were actual Java code, but you don't have to worry about the tedious and useless activity of binding each single in and out parameter to a type and value.
There are a ton of more advanced features derived from the above, such as:
The availability of a parser and by consequence the possibility of translating SQL.
Schema management tools, such as diffing two schema versions
Basic ActiveRecord support, including some nice things like optimistic locking.
Synthetic SQL features like type safe implicit JOIN
Query By Example.
A nice integration in Java streams or reactive streams.
Some more advanced SQL transformations (this is work in progress).
Export and import functionality
Simple JDBC mocking functionality, including a file based database mock.
Diagnostics
And, if you occasionally think something is much simpler to do with plain native SQL, then just:
Use plain native SQL, also in jOOQ
Disclaimer: As I work for the vendor, I'm obviously biased.
We've logged get requests where someone tried to add /**/or/**/ to the query string.
What does it mean, where can it do damage and what should I look out for in my code?
Edit
Specifically /**/or/**/1=##version)-- - what would that be targeting?
It's almost certainly an attempt at a SQL Injection Attack or something very similar. It's easiest to think (and read) about this kind of vulnerability in terms of SQL, even if it's a slightly different attack vector in your case.
The idea is that if you've got SQL of this kind:
"SELECT User from USERS where UserName =" + user + " AND PASSWORD = " + password
then inserting values like the one you've seen directly into the SQL can cause the query to mean something entirely different (and typically laxer) than you expect. Normally SQL injection attacks contain single quotes though, to terminate text values within SQL. Are you sure the attacks you've seen don't have single quotes?
If you use parameterized SQL throughout your application (and no dynamic SQL within your database), and never try to use incoming user-specified values as "code" of any kind, you should be okay.
I suspect in your case it may not be SQL that the attacker expects to use to penetrate your security; it may well be Javascript. The idea is the same though: if you use any user input as "code" in some form, e.g. by building a string mixing that with predefined code, then execute that code, you're leaving yourself vulnerable to attack. Always treat code as code (under your control) and data as data.
People say to prevent SQL Injection, you can do one of the following (amongst other things):
Prepare statements (parameterized)
Stored procedures
Escaping user input
I have done item 1, preparing my statements, but I'm now wondering if I should escape all user input as well. Is this a waste of time seeming as I have prepared statements or will this double my chances of prevention?
It's usually a waste of time to escape your input on top of using parametrized statements. If you are using a database "driver" from the database vendor and you are only using parametrized
statements without doing things like SQL String concatenation or trying to parametrize the actual SQL syntax, instead of just providing variable values, then you are already as safe as you can be.
To sum it up, your best option is to trust the database vendor to know how to escape values inside their own SQL implementation instead of trying to roll your own encoder, which for a lot of DBs out there can be a lot more work then you think.
If on top of this you want additional protection you can try using a SQL Monitoring Solution. There are a few available that can spot out-of-the-ordinary SQL queries and block/flag them, or just try to learn your default app behavior and block everything else. Obviously your mileage may vary with these based on your setup and use-cases.
Certainly the first step to prevent SQL Injection attacks is to always use parameterised queries, never concatenate client supplied text into a SQL string. The use of stored procedures is irrelevant once you have taken the step to parameterise.
However there is a secondary source of SQL injection where SQL code itself (usually in an SP) will have to compose some SQL which is then EXEC'd. Hence its still possible to be vunerable to injection even though your ASP code is always using parameterised queries. If you can be certain that none of your SQL does that and will never do that then you are fairly safe from SQL Injection. Depending on what you are doing and what version to SQL Server you are using there are occasions where SQL compositing SQL is unavoidable.
With the above in mind a robust approach may require that your code examines incoming string data for patterns of SQL. This can be fairly intense work because attackers can get quite sophisticated in avoiding SQL pattern detection. Even if you feel the SQL you are using is not vunerable it useful to be able to detect such attempts even if they fail. Being able to pick that up and record additional information about the HTTP requests carrying the attempt is good.
Escaping is the most robust approach, in that case though all the code that uses the data in you database must understand the escaping mechanim and be able to unescape the data to make use of it. Imagine a Server-side report generating tool for example, would need to unescape database fields before including them in reports.
Server.HTMLEncode prevents a different form of Injection. Without it an attacker could inject HTML (include javascript) into the output of your site. For example, imagine a shop front application that allowed customers to review products for other customers to read. A malicious "customer" could inject some HTML that might allow them to gather information about other real customers who read their "review" of a popular product.
Hence always use Server.HTMLEncode on all string data retrieved from a database.
Back in the day when I had to do classic ASP, I used both methods 2 and 3. I liked the performance of stored procs better and it helps to prevent SQL injection. I also used a standard set of includes to filter(escape) user input. To be truly safe, don't use classic ASP, but if you have to, I would do all three.
First, on the injections in general:
Both latter 2 has nothing to do with injection actually.
And the former doesn't cover all the possible issues.
Prepared statements are okay until you have to deal with identifiers.
Stored provedures are vulnerable to injections as well. It is not an option at all.
"escaping" "user input" is most funny of them all.
First, I suppose, "escaping" is intended for the strings only, not whatever "user input". Escaping all other types is quite useless and will protect nothing.
Next, speaking of strings, you have to escape them all, not only coming from the user input.
Finally - no, you don't have to use whatever escaping if you are using prepared statements
Now to your question.
As you may notice, HTMLEncode doesn't contain a word "SQL" in it. One may persume then, that Server.HTMLEncode has absolutely nothing to do with SQL injections.
It is more like another attack prevention, called XSS. Here it seems a more appropriate action and indeed should be used on the untrusted user input.
So, you may use Server.HTMLEncode along with prepared statements. But just remember that it's completely different attacks prevention.
You may also consider to use HTMLEncode before actual HTML output, not at the time of storing data.
I agree that correct input validation is the only 'fool-proof' way to prevent SQL Injection, however it requires modifying a lot of code in existing applications, possibly might require a badly designed application to be re-structured.
There has been a lot of academic interest in automated mechanisms to prevent SQL Injection (won't go on listing them here, I've done a literature survey and seen at least 20), but I haven't seen anything that's actually been implemented.
Does anyone know of any framework that's actually in use outside an academic environment, either Signature-Based, Anomaly-Based, or otherwise?
Edit: I'm looking for something that does not modify the code-base.
The company i work for uses Barracuda Web Application Firewall for what you are talking about. From what I have seen it works fairly well. Basically if it detects suspect input it will redirect the user to a page of our choosing. This allows you to place a layer between the internet and your applications and does not require you to change any of your code.
That said, it's a bad idea to not secure your applications.
If you aren't going to modify your code then you can only intercept requests. Since there is no such thing as a good or bad SQL command you're pretty limited in options but you could try rejecting multiple queries which initiate from a single string. In other words:
LEGAL
SELECT * FROM foo WHERE bar='baz';
ILLEGAL
SELECT * FROM foo WHERE bar=''; DELETE * FROM foo; SELECT 'baz';
Since pretty much every injection attack requires multiple queries in a single request and provided your application doesn't require this feature you may just get away with this. It probably won't catch every type of attack (there's probably a lot of damage you could do using subquerys and functions) but it's probably better than nothing at all.
The default behavior with PreparedStatements in Java where you pass in each parameter makes it mostly foolproof because the framework escapes the input for you. This doesn't prevent you from doing something like
exec spDoStuff <var>
where spDoStuff does:
exec( <var> )
But if you use normal queries it's very effective. I don't know whether you'd consider it non-programmatic but the developer doesn't have to write the code to manage input validation themselves.
Like this:
int id;
String desc;
Connection conn = dataSource.getConnection();
PreparedStatement ps = conn.prepareStatement("SELECT * FROM table1 t1 WHERE t1.id = ? AND t2.description = ?");
// or conn.prepareStatement("EXEC spMyProcedure ?, ?");
ps.setInt(1, id);
ps.setString(2, desc);
ResultSet rs = ps.executeQuery();
...
rs.close();
ps.close();
conn.close();
The only way to leave the code untouched while patching vulnerabilities like SQL Injection is to use a web application firewall like the open source project mod_security. Recently Oracle has released a database firewall which filters nasty queries. This approach is better at solving the problem of SQL Injection, but that all it can address.
WAF's are very useful and free, if you don't believe it put it to the test.
A WAF is just one layer. You should also test the application* under it. This is a defense in depth approach.
*This is a service I sell with a limited free offer.
In my travels in Oracle, the 'stragg' function, or 'String Aggregator' was life-saving when I had to create dynamic SQL queries on the fly.
You can read up about it here: http://www.oratechinfo.co.uk/delimited_lists_to_collections.html
The basic use of it was:
select stragg(fruit) from food;
fruit
-----------
apple,pear,banana,strawberry
1 row(s) returned
So simple to use, concatenating chr(13) turned it into a long list, and selecting information from system tables gave a 5 minute solution to dynamically generated SQL, e.g. auditing triggers.
Now I've been charged with transferring oracle functionality related to auditing into Sybase, and a function similar to Stragg would be ideal for this purpose.
E.g.
select #my_table = 'table_of_fruit'
select 'insert into '+#mytable+'_copy (' +char(10)
+ stragg(c.name) +char(10)
+ 'select '
+ stragg('inserted.'+c.name) + char(10)
+ 'from '+#mytable
from syscolumns c
where objectid(#mytable) = c.id
------------------------------------------
insert into table_of_fruit_copy
(fruit, sweetness, price)
select fruit, sweetness,price
from inserted
Done. Simple.
Except I don't know how to get a string-aggregation function working in Sybase.
Does anyone know of an attempt to do this kind of thing, or code that could work the same as stragg that could be used in this way?
The alternative at the moment is printing code based on complex cursors and such (sample LOC: 500), or select statements combining static strings and columns from user tables (sample LOC: 200). Stragg would severely reduce the complexity of this code, and would be a great deal of help in the future (sample LOC: who knows, maybe 50?)
p.s. I'm calling these selects through a shell script then piping them to file, then running the file through iSQL. Not the nicest solution, but it's better than the alternatives.
There are three separate answers
Question
You have made comments about simplicity, which need to be addressed before we get to the solution.
It is a common requirement to be able to take a delimited list of values, say A,B,C,D, and treat this data like it was a set of rows in a table, or vice versa
This one of the Top Ten Worst Programming Practices I read about recently.
In general, Sybase types tend to be somewhat more academically and Relationally qualified than Oracle types, so we simply do not do that sort of thing in SybaseLand or DB2Land.
In 20 years of working with Sybase, I have had to code that as part of my project just once, and that was for non-technical Auditor who loaded the result set into MS Access.
On the other hand, I have had to code that at least 12 times, when producing text files for importation into Oracle databases (fulfilling external requirements is outside my project, but I satisfy any such requirement free). Obviously the target databases were sub-standard and non-relational (loading a column with more than one datum breaks 1NF, and creates Update Anomalies), which is typical of what Oracle types have to do to get some speed.
Therefore, no, it is not simplicity, at least in the sense of that principle. It is by definition, complexity.
Your reference to "arrays" is incorrect. All commercial dbms handle arrays, according to the ISO/IEC/ANSI SQL (STRAGGR and LIST operators are non-standard SQL, therefore not SQL). Sybase is very strong in processing arrays. If it was an array, you would not need special hand coding to handle it (and you do, as per your question). This is not an array, there is no definition to the cells. This is a single concatenated scalar string.
Pivoting is an entirely different process, which uses set-processing; it does not require row-processing. (I understand on good authority, that Oracle is hopeless at scalar subqueries, and thus Oracle people are used to writing them as [very inefficient] joins or inline views, and then filtering: all that can be elevated to set-processing via scalar subqueries, and it will perform much faster. Particularly your Pivots.)
Even the author in your link posts as follows. Please familiarise yourself with the caveats:
It's as simple as this: If you want to have a system with no logical limitation in the number of data elements passed to a given process, then forget the following mechanisms! They are simply the wrong way to approach the problem.
Therefore, know whatever you are doing is sub-standard, non-relational, and limited; and go ahead with your eyes open. No use pretending that: it will not break; it is not limited; it is an "array"; or that Sybase doesn't have a neat little function that Oracle has. Any professional will see through all that. And if the string length is exceeded, for God's sake send some indicator back to the caller ("!Exceeded" in the string) identifying that condition.
Essentially you are turning the set-processing engine on its head, and forcing it into row-processing mode, so it will be very slow. A WHILE loop is distinctly faster than a cursor, but both are in the same class, row-processors.
The alternative at the moment is printing code based on complex cursors and such
What 200 or 500 LoC ? It is possible I am missing something, but my code is the same few lines of code identified under "Using a Table Function" in your link. Maximum 20, if you count nice formatting; the loop; initialisation; error handling. There is nothing "complex" about it. Do the exact reverse to cancatenate a single string from multiple rows. We use stored procedures for this (which oracle does not have, really, PL/SQL is a different animal). If you have ASE 15.0.2 or greater, you can use a User Defined Function, which you can then use in place of a column. Stored procs are better for true arrays.
the concatenation operator in Sybase is the plus sign. For reversal (decomposing the CSV string) you need CHARINDEX and SUBSTRING functions
You may need the Function Reference Manual, if for nothing else, to avoid writing code where we have functions.
Likewise, we do not have a RANK() function. We are quite happy with the 4 lines of code requires for the subquery. It is only required for Oracle because subqueries are crippled.
Ok, I have answered your question, Now to address the approaches.
You will be aware that code using Oracle Extensions to the SQL standard will need to be changed.
Sybase is way more automated than Oracle; if you familiarise yourself with its feature set, in many instances, you can get the same result (as you did in Oracle) without writing any code. Writing code-for-code blocks is the chain gang, rock-breaking method of building roads, in the context of bulldozers. Even if your company had good reason to use that method, you need to the aware that features work quite differently, eg. triggers, which is why I am posting so much detail.
Another issue that will annoy you is that Oracle isn't really ANSI SQL compliant (stretches the definitions in many places, in order to appear to be compliant), and Sybase, given its customer base, is rigidly SQL compliant. So in addition to the same function working differently, or in a different deployment, you need to be aware that code changes may be required to elevate Oracle code to ANSI compliance levels, just to execute on an ANSI SQL compliant platform.
I am not sure if you are trying to write code for the content of a trigger, or if you are trying to capture the changes to a database. I will provide both answers.
Auditing
Capture Changes to Database
We have an very robust, fast and configurable Auditing subsystem, fit for high volumes and banking level auditing requirements. Get your DBA to setup the sybaudit (separate) database, and to configure exactly what changes need to be captured. This facility will perform much faster than any code you or I can write in a trigger (as much as 100 times faster than your row-by-processing required for the above, as it is executed within the engine, within your executing thread). And of course the setup time is a fraction of your coding time.
Triggers
Again, I am not sure exactly what you are trying to achieve, but assuming you want to copy every insert to some table to a COPY of that table (inside the Trigger), that example code you have provided will not work (and I am not counting syntax issues).
Speaking to your example, you need to do way more work, to deal with the different datatypes; column sizes; precisions; scale; etc. And perhaps the UPDATE() function to identify which columns have changed (for an UPDATE trigger of course). If all you are trying to do is convert the various datatypes to strings, check the CONVERT() function.
Triggers are transactional.
Never place row-processing code in a Trigger (it will strangle the table)
You can't place Dynamic SQL in a Trigger.
But in Sybase even that is not necessary. Refer to the User Guide, chapter 19 is devoted to Triggers, with several variations, and examples. Inside the trigger, you should be able to simply:
INSERT table_copy
SELECT column_list -- never use * unless you want the db fixed in cement
FROM inserted
If you are trying to copy the inserts to all tables into one Audit table, then beware. Then I understand your example a little bit more. You will be forcing a highly Symmetric Muli-Threading server (oracle is not a server in the architecture sense) into single-threading through your table. Auditing is multi-threaded.
Last, the use of manual methods of any kind is not required, so if you could expand a bit more on your PS, what the requirement you are trying to fulfil is, I can identify the programmatic method for you. It appears you are trying to use the PL/SQL approach (which is very limited).
Just use the LIST() function. It's a direct replacement for stragg() function. Example:
SELECT LIST(state, ', ') FROM cities
Result:
name
CA, CA, MA, NY