I have been creating a text editor online, just for learning experience. I was curious what the best way to store multiple versions of a text file that is consistently changing is.
I've looked at a variety of options and I am yet to see a cheap, and scale-able option.
I've looked into Google Cloud Storage and Amazon S3. The only issue is that too many requests to save the file start to add up a lot in cost. I'd like files to be saved practically instantly, and also versioned every so often. I've also looked into data deduplication which looks like a great option, but I have not yet found a way to do it without writing my own software.
Any and all advice would be greatly appreciated. Thanks!
This is a very broad question, but the basic answer is usually some flavor of Operational Transform. Basically you don't want to be constantly sending the entire document back and forth between the user(s) and the server, nor do you want to overwrite the whole of the document repeatedly. Instead you want to store diffs. Then you need to deal with the idea that multiple users might be changing the file simultaneously, but possibly in different areas, and dealing with that effectively.
Wikipedia has some good, formal discussion of the idea: https://en.wikipedia.org/wiki/Operational_transformation
You wouldn't need all of that for a document that will only be edited by one person at a time, but even then, the answer is to think in terms of diffs from previous versions and only occasionally persist whole snapshots.
My embedded linux gets its data files from an external source (sd card). As this media is easily detachable I'd like to protect it in a certain way.
First idea that comes in mind is to do encryption. I'm afraid though this would take too much processing power. My files are not deeply sensitive, but I don't want that people can put the card into their desktop and see/copy my files. I assume these people know how to mount a standard ext4 drive.
Content is initially loaded on to the disk via a desktop linux box, so the process should be
I wouldn't care too much if the solution is not hack-proof. Basically I want to avoid to have my content copied by the general copycat.
I'm not looking for a turn-key solution, but like to get some pointers into the right direction.
A simple XOR Cipher requires very little processing. The security is limited in the sense that if someone has a both the encrypted and plain-text data, by XOR'ing the two the encryption key is revealed. However so long as you can avoid someone being knowingly in possession of both, and the key itself remains confidential, it may meet your requirements of simplicity and security.
Obviously you need a longer key that the simple 8 bit one in the example in the link. The key itself can be arbitrarily long with no impact on performance.
I saw allot of companies offering exe wrappers , but is there any in pdf side security programmatically ?
Well, you can encrypt the PDF. You can also use custom encryption handler and thus make your file unreadable with stock Acrobat or Reader (one will need to install your decryption plugin to Acrobat or Reader to make them understand your encryption). The problem is acrobat's DRM SDK (the one that allow you create encryption plugins) once had enormous cost (smth. like $25K to start). I don't know if this is still the case, though.
Another not-so-bad option is render everything to graphics - this makes text copying harder (though one can print everything and OCR it back).
The short answer is no. When you give someone the ciphertext, key, and cipher they will always be able to reproduce the plaintext. DRM fails universally for just this reason.
The long answer is that you can sometimes try little gimmicky tricks to prevent copying in some circumstances which may "work" if your audience doesn't try breaking it, but not in the general case. You can't really call something secure which is "safe as long as nobody tries to break it".
The PDF format itself has an "owner password" which allows the author to disallow readers from printing the document, modifying it, etc... Of course there's not actually any mechanism for preventing anyone from doing so. If you are trying to prevent the guys in the marketing department from printing it off or something, then maybe. But if you're releasing it out into the Internet, just assume that it can and will be copied however users see fit.
This is one of my assignments and I need some help in getting started. The basic idea behind the assignment is that I have to design a self destructible email program that is capable of destructing the message after (n) time duration.
Speaking about self destructible emails, there are quite a few ones on the internet offering the same service. But what they do is, they just convert the email message into an image and store them on their servers. Now, they send the message attaching the image inline with it. After they receive a hit on that image (which means that the message was being opened), they simply delete the image and the inline image link breaks! BOOM!
IMO, that's not what a self destructing email should be like. Nevertheless, in my case, I have to take care of following points:
I have to do it for TEXT. No image, nothing else.
I have to assume that the systems used throughout the process will be UNIX based (I don't know how that is going to make a difference).
There are also some hints regarding the usage of various network layers in solving the problem.
This isn't supposed to be done "in general". What I mean by that is, I have to do that ONLY for one/two UNIX systems. Let me put it this way, all I have is two UNIX systems and nothing else. Now I want to create a program (in UNIX itself) that would do that self-destructing thing. I have total control of protocols and the network layers and I have to code anything and everything required at any level.
This is more geared towards the StackOverflow side of things but I have no problem getting you started.
The first thing I'd like to point out is that you seem to be heavily over-analyzing this. The services that have self-destructing e-mails which are image based are simply deleting a file after it is viewed. All you need to do differently is put that text in a file and get it's contents before deleting it. This fits well with the UNIX philosophy since so many programs already make use of flat files.
The part you seem to have left out is how you are building this. You describe it as an e-mail program and then talk about web services. Is this a web-based project or a program you are designing for Linux? Do you have to code everything from scratch or can you parse output from Linux utilities to grab the mail? These kinds of things really would simplify the process.
How does one combine several resources for an application (images, sounds, scripts, xmls, etc.) into a single/multiple binary file so that they're protected from user's hands? What are the typical steps (organizing, loading, encryption, etc...)?
This is particularly common in game development, yet a lot of the game frameworks and engines out there don't provide an easy way to do this, nor describe a general approach. I've been meaning to learn how to do it, but I don't know where to begin. Could anyone point me in the right direction?
There are lots of ways to do this. m_pGladiator has some good ideas, especially with seralization. I would like to make a few other comments.
First, if you are going to pack a bunch of resources into a single file (I call these packfiles), then I think that you should work to avoid loading the whole file and then deseralizing out of that file into memory. The simple reason is that it's more memory. That's really not a problem on PC's I guess, but it's good practice, and it's essential when working on the console. While we don't (currently) serialize objects as m_pGladiator has suggested, we are moving towards that.
There are two types of packfiles that you might have. One would be a file where you want arbitrary access to the contents of the files. A second type might be a collection of files where you need all of those files when loading a level. A basic example might be:
An audio packfile might contain all the audio for your game. You might only need to load certain kinds of audio for the menus or interface screens and different sets of audio for the levels. This might fall intot he first category above.
A type that falls into the second category might be all models/textures/etc for a level. You basically want to load the entire contents of this file into the game at load time because you will (likely) need all of it's contents while a player is playing that level or section.
many of the packfiles that we build fall into the second category. We basically package up the level contents, and then compresses them with something like zlib. When we load one of these at game time, we read a small amount of the file, uncompress what we've read into a memory buffer, and then repeat until the full file has been read into memory. The buffer we read into is relatively small while final destination buffer is large enough to hold the largest set of uncompressed data that we need. This method is tricky, but again, it saves on RAM, it's an interesting exercise to get working, and you feel all nice and warm inside because you are being a good steward of system resources. once the packfile has been completely uncompressed into it's destinatino buffer, we run a final pass on the buffer to fix up pointer locations, etc. This method only works when you write out your packfile as structures that the game knows. In other words, our packfile writing tools share struct (or classses) with the game code. We are basically writing out and compressing exact representations of data structures.
If you simply want to cut down on the number of files that you are shipping and installing on a users machine, you can do with something like the first kind of packfile that I describe. Maybe you have 1000s of textures and would just simply like to cut down on the sheer number of files that you have to zip up and package. You can write a small utility that will basically read the files that you want to package together and then write a header containing the files and their offsets in the packfile, and then you can write the contents of the file, one at a time, one after the other, in your large binary file. At game time, you can simply load the header of this packfile and store the filenames and offsets in a hash. When you need to read a file, you can hash the filename and see if it exists in your packfile, and if so, you can read the contents directly from the packfile by seeking to the offset and then reading from that location in the packfile. Again, this method is basically a way to pack data together without regards for encryption, etc. It's simply an organizational method.
But again, I do want to stress that if you are going a route like I or m_pGladiator suggests, I would work hard to not have to pull the whole file into RAM and then deserialize to another location in RAM. That's a waste of resources (that you perhaps have plenty of). I would say that you can do this to get it working, and then once it's working, you can work on a method that only reads part of the file at a time and then decompresses to your destination buffer. You must use a comprsesion scheme that will work like this though. zlib and lzw both do (I believe). I'm not sure about an MD5 algorithm.
Hope that this helps.
do as Java: pack it all in a zip, and use an filesystem-like API to read directly from there.
Personally, I never used the already available tools to do that. If you want to prevent your game to be hacked easily, then you have to develop your own resource manipulation engine.
First of all read about serializing objects. When you load a resource from file (graphic, sound or whatever), it is stored in some object instance in the memory. A game usually uses dozens of graphical and sound objects. You have to make a tool, which loads them all and stores them in collections in the memory. Then serialize those collections into a binary file and you have every resource there.
Then you can use for example MD5 or any other encryption algorithm to encrypt this file.
Also, you can use zlib or other compression library to make this big binary file a bit smaller.
In the game, you should load the encrypted binary file and unpack it. Then decrypt it. Then deserialize the object collections and you have all resources back in memory.
Of course you can make this more comprehensive by storing in different binary files the resources for different levels and so on - there are plenty of variants, depending on what you want. Also you can first zip, then encrypt, or make other combinations of the steps.
Short answer: yes.
In Mac OS 6,7,8 there was a substantial API devoted to this exact task. Lookup the "Resource Manager" if you are interested. Edit: So does the ROOT physics analysis package.
Not that I know of a good tool right now. What platform(s) do you want it to work on?
Edited to add: All of the two-or-three tools of this sort that I am away of share a similar struture:
The file starts with a header and index
There are a series of blocks some of which may have there own headers and indicies, some of which are leaves
Each leaf is a simple serialization of the data to be stored.
The whole file (or sometimes individual blocks) may be compressed.
Not terribly hard to implement your own, but I'd look for a good existing one that meets your needs first.
For future people, like me, who are wondering about this same topic, check out the two following links:
http://www.sfml-dev.org/wiki/en/tutorials/formatdat
http://archive.gamedev.net/reference/programming/features/pak/