Safe read-only sqlite3 database

I want to give my website users arbitrary read-only access to an
SQLite3 database, without letting them write to the database or do any
other damage. How?
Making the db file read-only helps a little, but commands like
"ATTACH", ".load" and ".output" allow people to read/write other
files, which may not be protected.
Of course, if I knew all such commands, I'd just filter against them,
but I'm mostly worried about commands I haven't thought of.
I tried briefly to alter sqlite3's source code to disallow writes, but
this is harder than it looks: even the SELECT statement appears to do
some internal INSERTS/etc.
Note: I've considered DoS attacks, and will ulimit cputime to 5s or
something. My main concern is damage to files/"hacking", not DoS.
chroot() may work, but seems extreme.
Thoughts?
EDIT: Wow, did I really ask this 3+ years ago?
Since then, I've actually written a program to do this,
which I think is reasonably secure (but I could be wrong).
Here is a sample query.

"Of course, if I knew all such commands, I'd just filter against them, but I'm mostly worried about commands I haven't thought of."
Have you considered using a whitelist instead of a blacklist? Only allow statements that start with SELECT or EXPLAIN.
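A minimal sketch of such a whitelist check, in Python. The function name is_whitelisted and the rule "exactly one statement whose first keyword is SELECT or EXPLAIN" are assumptions made for illustration. The check is deliberately conservative: it rejects any input containing a semicolon in the middle, even inside a string literal. A prefix check like this is only a first filter and is best combined with a read-only connection.

```python
import re

# Hypothetical allow-list: only these first keywords are accepted.
ALLOWED_FIRST_KEYWORDS = {"SELECT", "EXPLAIN"}


def is_whitelisted(sql: str) -> bool:
    """Return True only for a single statement starting with an allowed keyword."""
    # Drop surrounding whitespace and one or more trailing semicolons.
    statement = sql.strip().rstrip(";").strip()
    # Any remaining semicolon could separate a second statement, so reject it.
    if ";" in statement:
        return False
    # The statement must begin with a plain keyword (this also rejects
    # shell dot-commands such as ".load").
    match = re.match(r"[A-Za-z]+", statement)
    return bool(match) and match.group(0).upper() in ALLOWED_FIRST_KEYWORDS
```

Anything this function rejects is never sent to the database at all, so unknown commands fail closed instead of having to be anticipated one by one.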

You haven't mentioned how you are providing access to the SQLite database.
If you are doing so through the C API (e.g. writing a CGI in C that takes a raw SQL query, passes it to sqlite, and then returns whatever was returned), then the dot commands like ".load" are of no concern. These are implemented by the sqlite3 shell program, and will not work when calling the C API functions directly.
In this case you can call sqlite3_open_v2 passing SQLITE_OPEN_READONLY as one of the flags to prevent the database from being written.
The ATTACH command can be disabled by calling sqlite3_limit() to set SQLITE_LIMIT_ATTACHED to 0, so that attaching any additional database fails (the limit counts attached databases only, not the main database). Since the DETACH statement "detaches an additional database connection previously attached using the ATTACH statement", it sounds like this would also prevent one from detaching the original database in order to bypass this restriction.
As far as I can tell from looking at the SQL understood by SQLite, this should close up all the holes. You may wish to run through the pragmas with a fine-tooth comb just to make sure. If there is anything I missed, let me know and I'll update this answer.
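The same idea can be sketched with Python's sqlite3 module instead of the C API. This is an illustration under assumptions, not the setup described above: open_readonly and try_sql are hypothetical helper names, the mode=ro URI parameter stands in for SQLITE_OPEN_READONLY, and an authorizer callback is used to refuse ATTACH and DETACH, which is a different mechanism from sqlite3_limit(). Where the interpreter provides Connection.setlimit (Python 3.11 and later), the attach limit is lowered as well.

```python
import sqlite3


def open_readonly(path):
    """Open the database at `path` read-only and refuse ATTACH/DETACH."""
    # mode=ro is the URI counterpart of SQLITE_OPEN_READONLY.
    # Assumes `path` contains no characters that need URI escaping.
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)

    def deny_attach(action, arg1, arg2, db_name, source):
        # Called by SQLite while a statement is being compiled.
        if action in (sqlite3.SQLITE_ATTACH, sqlite3.SQLITE_DETACH):
            return sqlite3.SQLITE_DENY
        return sqlite3.SQLITE_OK

    conn.set_authorizer(deny_attach)
    # Python 3.11+ also exposes sqlite3_limit(); 0 means no extra databases.
    if hasattr(conn, "setlimit"):
        conn.setlimit(sqlite3.SQLITE_LIMIT_ATTACHED, 0)
    return conn


def try_sql(conn, sql):
    """Run one statement; return its rows, or a message if SQLite refuses it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError as exc:
        return f"rejected: {exc}"
```

With a connection from open_readonly, a SELECT returns rows as usual, while an INSERT or an ATTACH comes back from try_sql as a "rejected: ..." string instead of touching any file.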

Ensure that your user has write access and that other users (especially the user that the webserver runs as) have only read access to the file itself. How you do this of course depends on your platform (Linux, Windows, etc.).

Make your database file read only in the operating system. Once you've done that SQLite can't override it. If you still have issues it's not a SQLite issue. They might still be able to find a php/cgi/etc issue but that's the nature of the security beast.

Related

safely executing arbitrary code

I have a program that can get code from a user as input (This question is language-agnostic, though I am primarily interested in answers for Java and Python). Usually, this code is going to be useful, but I don't have a guarantee that the user isn't making a mistake, or even deliberately giving malicious code.
I want to be able to execute this code safely, i.e. without harmful side effects if it turns out to be faulty or malicious.
More specifically:
the user specifies that the input code should operate on some objects that exist in the primary program (the program that gets the code from the user and executes it). Optimally, it should be able to access these objects directly, but sending them over to the child program through some communication protocol or a file is also fine.
in the same way, the code should generate some output that is transmitted back to the parent program.
the user can specify whether the code should be allowed to access any other data, whether it should be allowed to read or write to files, and whether it should have access to any other interfaces or OS methods.
it is possible to specify a maximum runtime after which the code will be interrupted if it hasn't finished executing yet.
the parent program and the code to execute may be different languages. You can assume that the programs necessary to compile and execute the given code are installed and available to the parent program. If the languages are different assume that some standard format like JSON can be used for transmitting the data (or is there a way to do this more efficiently?)
I think that this should be doable with a Virtual Machine. However, speed is a concern and I want to be able to execute many code blocks quickly, so that creating and tearing down a VM for each of them may be prohibitively expensive.
Another option is creating a sandbox, which e.g. Java can do, but as far as I am aware only for executing other Java code. I am unable to find a solution to do this with arbitrary languages.
For which languages does this work well, for which is it difficult?
Is this easier on some OS than on others?
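Two of the requirements above, the maximum runtime and the exchange of data in a standard format, can be sketched in Python as a child process with a timeout and JSON over stdin/stdout. The function name run_untrusted and the shape of the returned dictionary are assumptions made for illustration. Note that this covers only those two requirements: the child is an ordinary process, so it does not restrict file or OS access in any way.

```python
import json
import subprocess
import sys


def run_untrusted(code, payload, timeout_seconds):
    """Run `code` in a separate Python process, passing `payload` as JSON on stdin."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            input=json.dumps(payload),
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": "timeout"}
    if proc.returncode != 0:
        return {"ok": False, "error": proc.stderr.strip()}
    try:
        # The child is expected to print exactly one JSON document.
        return {"ok": True, "result": json.loads(proc.stdout)}
    except ValueError:
        return {"ok": False, "error": "child did not print valid JSON"}
```

Starting one process per code block is much cheaper than starting a VM, but the isolation has to come from somewhere else (for example OS-level restrictions applied to the child).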

How to Synchronize object between multiple instance of Node Js application

Is there any way to lock an object in a Node.js application?
If there are multiple instances of the application, some functions shouldn't run concurrently. When the function in instance A has completed, it should unlock that object/key or some identifier, and instance B of the application should check whether it is unlocked before running the function.
Any object or key can be used to identify the lock when locking and unlocking the function.
How can I do that in a Node.js application which has multiple instances?

As mentioned above Redis may be your answer, however, it really depends on the resources available to you. There are some other possibilities less complicated and certainly less powerful which may also do the trick.
node-cache may also do the trick, if you set it up correctly. It is not anywhere near as powerful as Redis, but on the bright side it does not require as much setup and interaction with your environment.
So there is Redis and node-cache for memory locks. I should mention there are quite a few NPM packages which do the cache. Depends on what you need, and how intricate your cache needs to be.
However, there are less elegant ways to do what you want, though less elegant is not necessarily worse.
You could use a JSON file based system and hold locks on the files for a TTL. lockfile or proper-lockfile will accomplish the task. You can read the information from the files when needed, delete when required, give them a TTL. Basically a cache system to disk.
The memory system is obviously faster. The file system requires just as much planning in your code as the memory system.
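The file-based approach can be sketched as follows. The example is in Python rather than Node.js and does not use the lockfile or proper-lockfile packages; the names acquire_lock and release_lock are hypothetical. It relies on exclusive file creation (O_CREAT | O_EXCL, comparable to the 'wx' flag of Node's fs.open), which either creates the lock file or fails if it already exists. The stale-lock cleanup is a simplification: two processes noticing the same expired lock at the same moment can still race.

```python
import os
import time


def _create_exclusive(path):
    """Create the lock file, or raise FileExistsError if it already exists."""
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the holder for debugging


def acquire_lock(path, ttl_seconds):
    """Try to take the lock; return True on success, False if someone holds it."""
    try:
        _create_exclusive(path)
        return True
    except FileExistsError:
        pass
    try:
        age = time.time() - os.path.getmtime(path)
    except FileNotFoundError:
        return False  # released in the meantime; the caller can simply retry
    if age <= ttl_seconds:
        return False  # lock is held and still within its TTL
    # The lock is older than its TTL: treat it as stale and try once more.
    release_lock(path)
    try:
        _create_exclusive(path)
        return True
    except FileExistsError:
        return False


def release_lock(path):
    """Remove the lock file; a missing file is not an error."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

Every instance would call acquire_lock with the same path before running the protected function and release_lock afterwards; the TTL keeps a crashed instance from holding the lock forever.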
There is yet another way. This is possibly the most dangerous one, and you would have to think long and hard on the consequences in terms of security and need.
Node.js has its own process.env. As most know, this holds the environment variables of the running process, available by simply writing process.env.foo, where foo has been declared as an environment variable. A package such as dotenv allows you to add to these variables by way of a .env text file. Thus if you put sam=mongoDB in that file, then process.env.sam in your code evaluates to mongoDB. Tons of variables can be set up here.
So what good does that do, you may ask? Well, these variables can be changed in mid-flight, so if you need to set a lock flag and then change it, this is a simple way to do it. Beware of the gotchas here, though. An assignment to process.env is only visible inside the process that made it (and in child processes it starts afterwards), so separately started instances will not see each other's changes. And once the system goes down, or all processes stop, and everything is started again, your environment variables will return to the defaults in the .env file.
Additionally, unless you are running a system which is somewhat safe on AWS or Azure etc., I would not feel secure in having my .env file open to the world. There is a way around this one too. You can encrypt the variables and put the encrypted values in the file, then decrypt each value just before you actually use it.
There are probably many more ways to lock and unlock, not the least of which is to use the native Node.js structures: combine file system events together with crypto. But this demands a much deeper level of understanding of the actual Node.js library and structures.
Hope some of this helped.

I strongly recommend Redis in your case.
There are several ways to create an object shared between applications/processes; using locks is one of them, as you mentioned.
But they're just complicated. Unless you really need to do that yourself, Redis will be good enough. It gives you atomic operations across multiple processes, transactions and so on.

Old thread but I didn't want to use redis so I made my own open source solution which utilizes websocket connections:
https://github.com/OneAndonlyFinbar/sync-cache

Nodejs - How to maintain a global datastructure

So I have a backend implementation in node.js which mainly contains a global array of JSON objects. The JSON objects are populated by user requests (POSTS). So the size of the global array increases proportionally with the number of users. The JSON objects inside the array are not identical. This is a really bad architecture to begin with. But I just went with what I knew and decided to learn on the fly.
I'm running this on a AWS micro instance with 6GB RAM.
How to purge this global array before it explodes?
Options that I have thought of:
1) At a periodic interval write the global array to a file and purge. Disadvantage here is that if there are any clients in the middle of a transaction, that transaction state is lost.
2) Restart the server every day and write the global array into a file at that time. Same disadvantage as above.
3) Follow 1 or 2, and for every incoming request - if the global array is empty look for the corresponding JSON object in the file. This seems absolutely absurd and stupid.
Somehow I can't think of any other solution without having to completely rewrite the nodejs application. Can you guys think of any .. ? Will greatly appreciate any discussion on this.

I see that you are using memory as storage. If that is the case and your code is synchronous (you don't seem to use a database, so it might be), then solution 1 is actually correct. This is because JavaScript is single-threaded, which means that while one piece of code is running, no other code can run. There is no concurrency in JavaScript. It is only an illusion, because Node.js is sooooo fast.
So your cleaning code won't fire until the transaction is over. This is of course assuming that your code is synchronous (and from what I see it might be).
But still there are like 150 reasons for not doing that. The most important is that you are reinventing the wheel! Let the database do the hard work for you. Using a proper database will save you all the trouble in the future. There are many possibilities: MySQL, PostgreSQL, MongoDB (my favourite), CouchDB and many, many others. It shouldn't matter at this point which one. Just pick one.

I would suggest that you start saving your JSON to a non-relational DB like http://www.couchbase.com/.
Couchbase is extremely easy to setup and use even in a cluster. It uses a simple key-value design so saving data is as simple as:
couchbaseClient.set("someKey", "yourJSON")
then to retrieve your data:
data = couchbaseClient.get("someKey")
The system is also extremely fast and is used by OMGPOP for Draw Something. http://blog.couchbase.com/preparing-massive-growth-revisited

Sqlite thread modes and sqlite misuse paradox

I have a project where I should use multiple tables to avoid keeping duplicated data in my SQLite file (even though I knew that using several tables was a nightmare).
In my application I am reading data from one table in one method and inserting data into another table in another method. When I do this, the sqlite3_step function returns error code 21, which is SQLITE_MISUSE.
According to my research, that was because I was not able to reach the tables from multiple threads.
Up to now, I have read the SQLite website and learned that there are 3 modes for configuring an SQLite database:
1) singlethread: you have no chance to call it from several threads.
2) multithread: yeah, multiple threads, but there are some obstacles.
3) serialized: this is the best match for multithreaded database applications.
If sqlite3_threadsafe() == 2 returns true then yes, your SQLite database is serialized, and this returned true, so I proved it for myself.
Then I have code to configure my SQLite database as serialized, just to be sure:
sqlite3_config(SQLITE_CONFIG_SERIALIZED);
When I use the code above in a class where I read and insert data in 1 table, it works perfectly :). But if I try to use it in a class where I read and insert data in 2 tables (which is where I really need it), the SQLITE_MISUSE problem comes up.
I checked the code where I open and close the database, and there is no problem with it. They work unless I delete the other.
I am using iOS 5 and this is really a big problem for my project. I heard that Instagram uses PostgreSQL; maybe this was the reason? Would you suggest PostgreSQL or SQLite to start with?

It seems to me like you've got two things mixed up.
Single vs. multi-threaded
Single threaded builds are only ever safe to use from one thread of your code because they lack the mechanisms (mutexes, critical sections, etc.) internally that permit safe use from several. If you are using multiple threads, use a multi-threaded build (or expect “interesting” trouble; you have been warned).
SQLite's thread support is pretty simple. With a multi-threaded build, particular connections should only be used from a single thread (except that they can be initially opened in another).
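The "one connection per thread" rule can be sketched in Python, whose sqlite3 module by default refuses to use a connection from a thread other than the one that created it. The table name log and the helpers insert_rows and run_workers are hypothetical. Each worker opens, uses and closes its own connection, and writers that collide wait on SQLite's busy timeout instead of sharing a handle.

```python
import sqlite3
import threading


def insert_rows(db_path, worker_id, count):
    """Worker body: open a private connection, insert rows, close it again."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait up to 30 s for locks
    try:
        with conn:  # one transaction, committed on success
            conn.executemany(
                "INSERT INTO log(worker, n) VALUES (?, ?)",
                [(worker_id, n) for n in range(count)],
            )
    finally:
        conn.close()


def run_workers(db_path, workers=4, count=10):
    """Start several writer threads and return the resulting row count."""
    setup = sqlite3.connect(db_path)
    with setup:
        setup.execute("CREATE TABLE IF NOT EXISTS log(worker INTEGER, n INTEGER)")
    setup.close()

    threads = [
        threading.Thread(target=insert_rows, args=(db_path, w, count))
        for w in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    check = sqlite3.connect(db_path)
    total = check.execute("SELECT COUNT(*) FROM log").fetchone()[0]
    check.close()
    return total
```

No connection object ever crosses a thread boundary here, which is the property the paragraph above asks for.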
All recent (last few years?) SQLite builds are happy with access to a single database from multiple processes, but the degree of parallelism depends on the…
Transaction type
SQL in general supports multiple types of transaction. SQLite supports only a subset of them, and its default is SERIALIZABLE. This is the safest mode of access; it simulates what you would see if only one thing could happen at a time. (Internally, it's implemented using a scheme that lets many readers in at once, but only one writer; there's some cleverness to prevent anyone from starving anyone else.)
SQLite also supports read-uncommitted transactions. This increases the amount of parallelism available to code, but at the risk of readers seeing information that's not yet been guaranteed to persist. Whether this matters to you depends on your application.

Control Linux Application Launch/Licensing

I need to employ some sort of licensing on some Linux applications that I don't have access to their code base.
What I'm thinking is having a separate process read the license key and check for the availability of that application. I would then need to ensure that process is run during every invocation of the respective application. Is there some feature of Linux that can assist in this? For example something like the sudoers file in which I detect what user and what application is trying to be launched, and if a combination is met, run the license process check first.
Or can I do something like not allow the user to launch the (command-line) application by itself, and force them to pipe it to my license process like so:
/usr/bin/tm | license_process // whereas '/usr/bin/tm' would fail on its own

I need to employ some sort of licensing on some Linux applications
Please note that license checks will generally cost you way more (in support and administration) than they are worth: anybody who wants to bypass the check and has a modicum of skill will do so, and will not pay for the license if he can't anyway (that is, by not implementing a licensing scheme you are generally not leaving any money on the table).
that I don't have access to their code base.
That makes your task pretty much impossible: the only effective copy-protection schemes require that you rebuild your entire application, and make it check the license in so many distinct places that the would-be attacker gets bored and goes away. You can read about such schemes here.
I'm thinking is having a separate process read the license key and check for the availability of that application.
Any such scheme will be bypassed in under 5 minutes by someone skilled with strace and gdb. Don't waste your time.

You could write a wrapper binary that does the checks, and then link in the real application as part of that binary. Using some dlsym tricks, you may be able to call the real main function from the wrapper main function.
IDEA
read up on ELF-hacking: http://www.linuxforums.org/articles/understanding-elf-using-readelf-and-objdump_125.html
use ld to rename the main function of the program you want to protect access to. http://fixunix.com/aix/399546-renaming-symbol.html
write a wrapper that does the checks and uses dlopen and dlsym to call the real main.
link together real application with your wrapper, as one binary.
Now you have an application that has your custom checks that are somewhat hard to break, but not impossible.
I have not tested this and don't have the time, but it would be a sort of fun experiment.
