How much data can RMS store, approximately? - java-me

I want to use RMS for storing a large amount of data, and I have tested it up to a certain limit. I just want to be sure that RMS is capable of storing it.
I have stored around 135,000 characters in RMS and can also fetch them back. How much data can I store using RMS?
I want to implement this in a live application.

There is no fixed limit on RMS storage capacity. It all depends on how much free memory is available on the device.
Try the following code snippet in your application to check the memory status on the device.
import javax.microedition.rms.RecordStore;

Runtime rt = Runtime.getRuntime();
long totalMemory = rt.totalMemory();
long freeMemory = rt.freeMemory();

// Open (or create) your app's record store
RecordStore rs = RecordStore.openRecordStore( "App_Db_Name_Here", true );
// getSizeAvailable() returns the amount of additional room (in bytes)
// available for this record store to grow
int sizeAvailable = rs.getSizeAvailable();
Compare the above three values and proceed accordingly in your application.
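For instance, here is a minimal sketch of gating a write on the available room, using the rs handle opened above (the payload is illustrative):
byte[] record = "some payload".getBytes();
// Only write if the store reports enough room to grow
if (rs.getSizeAvailable() > record.length) {
    rs.addRecord(record, 0, record.length);
} else {
    // not enough room: warn the user or prune old records
}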
Note that RMS I/O operations get slower as the store grows, which is why this kind of local database is not normally used for storing large amounts of data. This also varies from device to device, so you should decide on your implementation approach when porting your app across devices.
Refer to the RecordStore and getSizeAvailable() documentation for complete notes.

Related

Latency in IMap.get(key) when the object being retrieved is heavy

We have a map from a custom key object to a custom value object (a complex object). We set the in-memory-format to OBJECT, but IMap.get takes longer when the retrieved object is large. We cannot afford this latency, and the value is required for further processing. IMap.get is called in the JVM where the cluster is started. Is there a way to get objects quickly regardless of their size?
This is partly the price you pay for in-memory-format == OBJECT.
To confirm, try in-memory-format == BINARY and compare the difference.
Store and retrieve are slower with OBJECT, but some queries will be faster. If you run enough of those queries, the penalty is justified.
If you do get(X) and the value is stored deserialized (OBJECT), the following sequence occurs:
1 - the object is serialized, from object to byte[]
2 - the byte array is sent to the caller, possibly across the network
3 - the object is deserialized by the caller, from byte[] to object
If you change to store serialized (BINARY), step 1 isn't needed.
If the caller is the same process, step 2 isn't needed.
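As a quick sketch, the storage format is set per map in the configuration (the map name "myMap" here is an assumption):
import com.hazelcast.config.Config;
import com.hazelcast.config.InMemoryFormat;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

Config config = new Config();
// Store values serialized so get() can skip the serialize step
config.getMapConfig("myMap").setInMemoryFormat(InMemoryFormat.BINARY);
HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);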
If you can, it's worth upgrading (the latest is 5.1.3), as there are some newer options that may perform better. See this blog post for an explanation.
You also don't necessarily have to return the entire object to the caller. A read-only EntryProcessor can extract just the part of the data you need and return that across the network. A smaller network packet will help, but if the cost is in the serialization then the difference may not be dramatic.
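A minimal sketch of such a read-only EntryProcessor; the value type Trade and its getPrice() field are assumptions for illustration:
import java.util.Map;
import com.hazelcast.core.ReadOnly;
import com.hazelcast.map.EntryProcessor;

public class PriceExtractor
        implements EntryProcessor<String, Trade, Double>, ReadOnly {
    public Double process(Map.Entry<String, Trade> entry) {
        // Only this one field crosses the network, not the whole Trade
        return entry.getValue().getPrice();
    }
    public EntryProcessor<String, Trade, Double> getBackupProcessor() {
        return null; // read-only: nothing to apply on backups
    }
}
// usage: Double price = map.executeOnKey("trade-42", new PriceExtractor());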
If you're retrieving a non-local map entry (either because you're using client-server deployment model, or an embedded deployment with multiple nodes so that some retrievals are remote), then a retrieval is going to require moving data across the network. There is no way to move data across the network that isn't affected by object size; so the solution is to find a way to make the objects more compact.
You don't mention which serialization method you're using, but default Java serialization is horribly inefficient ... any other option would be an improvement. If your code is all Java, IdentifiedDataSerializable is the most performant. See the following blog post for some numbers:
https://hazelcast.com/blog/comparing-serialization-options/
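For reference, a hedged sketch of an IdentifiedDataSerializable value class (the Trade fields and the factory/class ids are illustrative; the factory id must match a DataSerializableFactory you register in the config):
import java.io.IOException;
import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.IdentifiedDataSerializable;

public class Trade implements IdentifiedDataSerializable {
    private long tid;
    private double price;

    public int getFactoryId() { return 1; } // must match a registered factory
    public int getClassId()   { return 1; }

    public void writeData(ObjectDataOutput out) throws IOException {
        out.writeLong(tid);
        out.writeDouble(price);
    }

    public void readData(ObjectDataInput in) throws IOException {
        tid = in.readLong();
        price = in.readDouble();
    }
}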
Also, if your data is stored in BINARY format, then it's stored in serialized form (whatever serialization option you've chosen), so at retrieval time the data is ready to be put on the wire. By storing in OBJECT form, you'll have to perform the serialization at retrieval time. This will make your GET operation slower. The trade-off is that if you're doing server-side compute (using the distributed executor service, EntryProcessors, or Jet pipelines), the server-side compute is faster if the data is in OBJECT format because it doesn't have to deserialize the data to access the data fields. So if you aren't using those server-side compute capabilities, you're better off with BINARY storage format.
Finally, if your objects are large, do you really need to be retrieving the entire object? Using the SQL API, you can do a SELECT of just certain fields in the object, rather than retrieving the entire object. (You can also do this with Projections and the older Predicate API but the SQL method is the preferred way to do this). If the client code doesn't need the entire object, selecting certain fields can save network bandwidth on the object transfer.
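A brief sketch of such a field-level SELECT via the SQL API, assuming hz is your HazelcastInstance and a SQL mapping already exists; the mapping name "trades" and field names are illustrative:
import com.hazelcast.sql.SqlResult;
import com.hazelcast.sql.SqlRow;

SqlResult result = hz.getSql().execute(
        "SELECT price, amount FROM trades WHERE tid = ?", 42L);
for (SqlRow row : result) {
    double price = row.getObject("price"); // only these fields were shipped
}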

Usage of Redis for very large memory cache

I am planning to use Redis for storing a large amount of data in a cache. Currently I store it in my own cache written in Java. My use case is below.
I get 15-minute data from a source and I need to aggregate the data hourly. So for a given object A, every hour I will get 4 values that I need to aggregate into one value using a formula (max / min / sum).
For the key I plan to use the following:
a) object id - long
b) time - long
c) property id - int (each object may have many properties, which I need to aggregate separately)
So the final key would look like:
objectid_time_propertyid
Every 15 minutes I may get around 50 to 60 million keys. I need to fetch these keys every time, convert the property value to a double, apply the formula (max/min/sum etc.), then convert it back to a String and store it back.
So for every key I have one read and one write, plus a conversion in each case.
My questions are the following:
Is it advisable to use Redis for such a use case? Going forward I may aggregate hourly data to daily, daily to weekly, and so on.
What would the performance of reads and writes in the cache be? (I did a sample test on Windows where reading and writing 100K keys took 30-40 seconds; that's not great, but I ran it on Windows and will finally run on Linux.)
I want to use the persistence function of Redis; what are its pros and cons?
If anyone has real experience of using Redis as a memory cache that requires frequent updates, please give a suggestion.
Is it advisable to use Redis for such a use case? Going forward I may aggregate hourly data to daily, daily to weekly, and so on.
Whether it's advisable depends on who you ask, but I certainly feel Redis will be up to the job. If a single server isn't enough, your description suggests that the dataset can be easily sharded, so a cluster will let you scale.
I would advise, however, that you store your data a little differently. First, every key in Redis has an overhead, so the more keys you have, the more RAM you'll need. Therefore, instead of keeping a key per object-time-property, I recommend Hashes as a means of aggregating some values together. For example, you could use an object_id:timestamp key and store the property_id:value pairs under it.
Furthermore, instead of keeping the 4 discrete measurements for each object-property by timestamp and recomputing your aggregates, I suggest you keep just the aggregates and update them with new measurements. So you'd basically have an object_id Hash with the following structure:
object_id:hourtimestamp -> property_id1:max = x
                           property_id1:min = y
                           property_id1:sum = z
When getting new data - d - for an object's property, just recompute the aggregates:
property_id1:max = max(x, d)
property_id1:min = min(y, d)
property_id1:sum = z + d
Repeat the same for every resolution needed, e.g. use object_id:daytimestamp to keep day-level aggregates.
Finally, don't forget to expire your keys once they are no longer required (e.g. set a 24-hour TTL on the hourly counters, and so forth).
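Putting this together, here is a minimal sketch of the update using the Jedis client (key layout as above; note that this read-modify-write is not atomic, so a real version would wrap it in a Lua script or MULTI/EXEC):
import redis.clients.jedis.Jedis;

public class HourlyAggregator {
    // d is the new 15-minute measurement for (objectId, propertyId)
    static void update(Jedis jedis, long objectId, long hourTs,
                       int propertyId, double d) {
        String key = objectId + ":" + hourTs;
        String max = jedis.hget(key, propertyId + ":max");
        if (max == null || d > Double.parseDouble(max)) {
            jedis.hset(key, propertyId + ":max", String.valueOf(d));
        }
        String min = jedis.hget(key, propertyId + ":min");
        if (min == null || d < Double.parseDouble(min)) {
            jedis.hset(key, propertyId + ":min", String.valueOf(d));
        }
        jedis.hincrByFloat(key, propertyId + ":sum", d);
        jedis.expire(key, 86400); // hourly data expires after a day
    }
}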
There are other possible approaches, mainly using Sorted Sets, that can be applicable to solve your querying needs (remember that storing the data is the easy part - getting it back is usually harder ;)).
What would the performance of reads and writes in the cache be? (I did a sample test on Windows where reading and writing 100K keys took 30-40 seconds; that's not great, but I ran it on Windows and will finally run on Linux.)
Redis, running on my laptop in a Linux VM, does in excess of 500K reads and writes per second. Performance depends heavily on how you use Redis' data types and API. Given your throughput of 60 million values over 15 minutes, i.e. ~70K/sec writes of smallish data, Redis is more than equipped to handle that.
I want to use the persistence function of Redis; what are its pros and cons?
This is an extremely well-documented subject - please refer to http://redis.io/topics/persistence and http://oldblog.antirez.com/post/redis-persistence-demystified.html for starters.

Redis key design for real-time stock application

I am trying to build a real-time stock application.
Every second I can get some data from a web service, like below:
[{"amount":"20","date":1386832664,"price":"183.8","tid":5354831,"type":"sell"},{"amount":"22","date":1386832664,"price":"183.61","tid":5354833,"type":"buy"}]
tid is the ticket ID for stock buying and selling;
date is seconds since 1970-01-01 (Unix time);
price/amount is at what price and how many stock traded.
Requirement
My requirement is to show the user the highest/lowest price for every minute/5 minutes/hour/day in real time, and to show the sum of the amounts traded in every minute/5 minutes/hour/day in real time.
Question
My question is how to store the data in Redis so that I can easily and quickly get the highest/lowest trade from the DB for different periods.
My design is something like below:
[date]:[tid]:amount
[date]:[tid]:price
[date]:[tid]:type
I am new to Redis. If I use this design, does that mean I need to use a sorted set? Will there be any performance issues? Or is there another way to get the highest/lowest price for different periods?
Looking forward to your suggestions and design.
My suggestion is to store min/max/total for all intervals you are interested in and update the current ones with every arriving data point. To avoid network latency when reading previous data for comparison, you can do it entirely inside the Redis server using Lua scripting.
One key per data point (or, even worse, per data point field) is going to consume too much memory. For best results, you should group points into small lists/hashes (see http://redis.io/topics/memory-optimization).
Redis only allows one level of nesting in its data structures: if your data has multiple fields and you want to store more than one item per key, you need to encode it yourself. Fortunately, the standard Redis Lua environment includes msgpack support, which is a very efficient binary JSON-like format. The JSON entries in your example, encoded with msgpack "as is", will be 52-53 bytes long.
I suggest grouping by time so that you have 100-1000 entries per key. Suppose a one-minute interval fits this requirement. Then the keying scheme would be like this:
YYmmddHHMMSS — a hash from tid to msgpack-encoded data points for the given minute.
5m:YYmmddHHMM, 1h:YYmmddHH, 1d:YYmmdd — window data hashes which contain min, max, sum fields.
Let's look at a sample Lua script that accepts one data point and updates all keys as necessary. Due to the way Redis scripting works, we need to explicitly pass the names of all keys the script will access, i.e. the live data key and all three window keys. Redis Lua also has a JSON parsing library available, so for the sake of simplicity let's assume we just pass it a JSON dictionary. That means we have to parse the data twice, on the application side and on the Redis side, but the performance effects of that are not clear.
local function update_window(winkey, price, amount)
    -- HGETALL returns a flat {field1, value1, field2, value2, ...}
    -- array in Redis Lua, so convert it to a field -> value table first
    local raw = redis.call('HGETALL', winkey)
    local windata = {}
    for i = 1, #raw, 2 do
        windata[raw[i]] = raw[i + 1]
    end
    if price > (tonumber(windata.max) or 0) then
        redis.call('HSET', winkey, 'max', price)
    end
    if price < (tonumber(windata.min) or 1e12) then
        redis.call('HSET', winkey, 'min', price)
    end
    redis.call('HSET', winkey, 'sum', (tonumber(windata.sum) or 0) + amount)
end

local currkey, fiveminkey, hourkey, daykey = unpack(KEYS)
local data = cjson.decode(ARGV[1])
local packed = cmsgpack.pack(data)
local tid = data.tid

redis.call('HSET', currkey, tid, packed)

local price = tonumber(data.price)
local amount = tonumber(data.amount)
update_window(fiveminkey, price, amount)
update_window(hourkey, price, amount)
update_window(daykey, price, amount)
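A hedged sketch of invoking the script from Java with Jedis, where script holds the Lua source above and json is the raw data point; the key names follow the scheme above and the timestamps are illustrative:
import java.util.Arrays;
import java.util.Collections;

Object ok = jedis.eval(script,
        Arrays.asList(
                "131212101400",   // live minute hash (YYmmddHHMMSS)
                "5m:1312121010",  // 5-minute window
                "1h:13121210",    // hourly window
                "1d:131212"),     // daily window
        Collections.singletonList(json));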
This setup can handle thousands of updates per second, is not very memory-hungry, and window data can be retrieved instantly.
UPDATE: On the memory side, 50-60 bytes per point is still a lot if you want to store more than a few million of them. With this kind of data, I think you can get as low as 2-3 bytes per point using a custom binary format, delta encoding, and subsequent compression of chunks with something like snappy. Whether it's worth doing depends on your requirements.

In J2ME, does RMS still hold all records after closing the app or restarting the phone?

I am developing a location-based app in J2ME, using a configuration of CLDC 1.1 & MIDP 2.0.
In it I have to store place name, address, latitude, longitude, reminder text, and tone name in a database. My questions about RMS are:
1) When I close or restart the app, do the records stored by the app in RMS get deleted?
2) What is the maximum record-holding capacity of RMS? Is it infinite?
3) How many records can RMS hold without slowing the phone down?
4) Which J2ME database system provides efficiency, simplicity & speed in adding, deleting & updating records? Does RMS provide that?
For your questions, the answers are given below:
1) When I close or restart the app, do the records stored in RMS get deleted?: No. The RMS is not deleted when you close the app or restart the phone; records are only deleted when you call the delete method on the RMS. If you delete your application (MIDlet suite) from the device, then the RMS associated with the application is deleted too.
2) What is the maximum record-holding capacity of RMS? Is it infinite?: I think it is based on how much memory is available. If you install your application on the SD card, then the RMS also occupies SD-card memory.
If you install the application in device memory (not on the SD card), then the RMS occupies device memory (which is much smaller).
4) Which J2ME database system provides efficiency, simplicity & speed in adding, deleting & updating records? Does RMS provide that?: RMS meets your requirements.
But records in RMS are stored as in a flat file system. There are delete and insert methods, etc., in the RMS API, but you need to build your own logic to find which record to delete, insert, and so on. For instance, if we want to delete the records in which age > 20, we cannot use a query like delete from table1 where age > 20. Instead we have to read all the records one by one, find which records have age > 20, then find their positions and delete by position, because we cannot use SQL queries with RMS. This is a big disadvantage of RMS.
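To illustrate, here is a minimal sketch of that manual scan-and-delete; the record layout (age stored as a plain string payload) is an assumption for illustration:
import javax.microedition.rms.*;

void deleteOlderThan20(RecordStore rs) throws RecordStoreException {
    RecordEnumeration re = rs.enumerateRecords(null, null, false);
    while (re.hasNextElement()) {
        int id = re.nextRecordId();
        int age = Integer.parseInt(new String(rs.getRecord(id)));
        if (age > 20) {
            rs.deleteRecord(id); // no WHERE clause: delete by record id
        }
    }
    re.destroy();
}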

Store RMS to memory card

I had a question: is it possible to store RMS on a memory card? Since we store a large amount of data in RMS, is there any alternative way?
I also want to know how RMS memory is calculated.
I want to know how RMS memory is calculated
The API for this is RecordStore#getSizeAvailable(); see the method javadocs:
...Returns the amount of additional room (in bytes) available for this record store to grow. Note that this is not necessarily the amount of extra MIDlet-level data which can be stored, as implementations may store additional data structures with each record to support integration with native applications, synchronization, etc.
To get the total memory size:
Runtime.getRuntime().totalMemory()
To get the free memory size:
Runtime.getRuntime().freeMemory()
