Using Redis to send friend status - node.js

I was looking around the internet to find out how I can send a user's status (offline, online, away, etc.) to only their friends using Socket.IO. Some people suggested using Redis, so I had a look and played around with it. I am also using MongoDB to store friends and users.
This is my setup right now:
// Status list:
// 0 - offline
// 1 - online
// 2 - away
// 3 - busy

// Set the status
redisClient.hmset("online_status:userID", "status", "1");

// Check if someone is online
redisClient.hgetall("online_status:userID", (err, reply) => {
  console.log(reply);
});
Is it fine if I use it like this to get user status, or is there a better way to do this?
Another question: is it fine to keep looping hgetall, or is there a better way to get multiple statuses at once?

You are using a hash type to store a single piece of information and hgetall to retrieve it, so I assume you are not that familiar with Redis data types yet. So first let me briefly explain the three data types I'll talk about (find all types in the docs here: https://redis.io/topics/data-types-intro ):
String: A simple key/value type. Access it with set(key, value) and get(key).
Hash: A bunch of key/values stored under one Redis key. Useful for storing attributes of an entity; for example, you could have a "userdata:userID" key and store name, avatar, status... with it. Access it with hset(key, field, value), hget(key, field), hgetall(key).
Set: A collection of unique strings. Access it with sadd(key, member), sismember(key, member), smembers(key).
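As a quick illustration of the three types, a minimal sketch using the node_redis callback API (the client setup and keys are just examples):

const redis = require("redis");
const client = redis.createClient();

// String: one value per key
client.set("greeting", "hello");
client.get("greeting", (err, value) => console.log(value)); // "hello"

// Hash: several fields stored under one key
client.hset("userdata:42", "status", "1");
client.hget("userdata:42", "status", (err, value) => console.log(value)); // "1"

// Set: unordered collection of unique members
client.sadd("users:online", "42");
client.sismember("users:online", "42", (err, value) => console.log(value)); // 1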
If you are only going to store the online status, it would be cleaner to use a string type with set, get and del (since most users are offline most of the time, delete them and save space). For this simple key/value use case Redis is actually not even better than good old memcached.
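A minimal sketch of that string approach (node_redis again; the one-hour TTL is an assumption, there so a stale status expires even if a disconnect handler never runs):

const redis = require("redis");
const client = redis.createClient();

// Mark a user online; EX attaches a TTL in seconds.
function setOnline(userID) {
  client.set("online_status:" + userID, "1", "EX", 3600);
}

// Going offline simply deletes the key.
function setOffline(userID) {
  client.del("online_status:" + userID);
}

// A missing key means offline; mget() can read many statuses at once.
function getStatus(userID, callback) {
  client.get("online_status:" + userID, (err, reply) => {
    callback(err, reply || "0");
  });
}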
If you intend to store more user-related attributes (mood, motto, avatar...), you should rename the key to "userdata:userID", check the status with hget("userdata:userID", "status"), and use hgetall only to retrieve all attributes.
Another approach could be to store all online users in a SET: sadd('users:online', userID), check with sismember('users:online', userID), or get all online users with smembers('users:online'). Suppose you store all friends in another SET friends:userID; you could then grab all online friends of a user with a single intersect command, sinter('friends:userID', 'users:online') - pretty nice and elegant IMHO, but this gets complicated with more different states and doesn't work with redis-cluster.
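For illustration, a sketch of that SET approach (node_redis; friends:userID is assumed to be maintained elsewhere, and the userID is just an example):

const redis = require("redis");
const client = redis.createClient();
const userID = "42"; // example id

// User comes online / goes offline
client.sadd("users:online", userID);
client.srem("users:online", userID);

// All online friends of one user in a single round trip
client.sinter("friends:" + userID, "users:online", (err, onlineFriends) => {
  console.log(onlineFriends); // array of userIDs
});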
I would prefer the SET approach. Multiple hgets should also be fine until you run into issues because of the one guy (there is always one) who has thousands of contacts and refreshes all the time. At that point you could still introduce friendship limits or caching.

Related

CouchDB - Get custom fields within _users for replication filtering

I am developing a simple client for Android which fetches data from a CouchDB database. There will be only one database for all users. The pull-replicated data is filtered by a JS function. Such a function (simplified) would look like this:
function(doc, req) {
  if (!doc.type || doc.type != 'item') { return false; }
  if (doc.foo && ... && req.userCtx.bar.indexOf(doc.foo) != -1) { return true; }
  ...
}
As I have read in the official documentation, _users is the perfect place to set custom fields related to a user. That is what I did, as you can see in the code above (note the req.userCtx.bar array).
The problem I am facing is that the object/JSON req.userCtx only contains these fields: db, name and roles.
1. What would be a good alternative to my idea? I am a little bit stuck at this point.
2. How can I retrieve the user's data (all fields, official and custom)?
3. Is it correct to pass a large array as a filter parameter?
NOTE
I am considering a messy alternative: adding an array field to every item that contains the list of all users allowed to pull that item, although I have the feeling that there must be another way.
Saving user data in _users is interesting because only the user or an admin can read a user's document.
However, as you've found out, that doesn't mean that all user data is available to the userCtx object. All you get is the user's name and roles array. Can you make do with roles?
To retrieve all of the user's data, you should fetch the user's document from the _users database. You can do that with a GET request on http://localhost:5984/_users/org.couchdb.user:[USER].
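For example, from Node (a sketch; the username and credentials are placeholders, and the request must be authenticated as that user or as an admin):

const http = require("http");

http.get(
  "http://localhost:5984/_users/org.couchdb.user:alice",
  { auth: "alice:secret" }, // hypothetical credentials
  (res) => {
    let body = "";
    res.on("data", (chunk) => (body += chunk));
    res.on("end", () => console.log(JSON.parse(body)));
  }
);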
To know what an appropriate solution to your problem would be, we'd need quite a bit more info. For instance, looking at your code, it seems you designed that filter to restrict replication to documents listed as visible to the user. However, you can't really lock down CouchDB so that replication works while the user lacks read access to the entire database. You really need one db per user for this to work.

CouchDB replication strategy with dynamic groups of users

This is the situation:
We have a series of users who share some documents. The documents they can share might change throughout the day, and so can the documents themselves (changes and deletions). The users can change some information on the documents.
E.g.
Users | Documents
A | X
A | Y
A | Z
B | X
B | Z
C | Y
Possible groups: A+C, A+B
The CouchDB server is a replica of a SQL Server DB holding this data; an ETL process takes care of propagating changes to CouchDB. The CouchDB database is in turn replicated to each user's phone via PouchDB.
The goal:
To replicate changes and deletions accordingly.
What we've tried:
1) We figured we'd structure our documents with a list of the users that can access them. Each document would have a "Users" array, and a filter in the design document would take care of replication to the clients. Unfortunately, document deletions and document changes that no longer pass the filter (e.g. a user is removed from the array) do not appear in the filtered _changes feed, so they cannot be replicated accordingly on the clients.
2) database per user. This is not possible, because users need to see each other's work on the documents (they share them)
3) database per group of users. Pretty much the same problem as the first solution, but worse. In fact:
- groups of users can change and no longer be present: how do we reflect that client-side?
- a document can shift to a new group: it would have to be redownloaded from scratch, which greatly increases the download size
- the same document can be in more than one group! (see the example above)
- each client would have to know which groups she is in every time she logs in and replicate multiple databases; then, on the return trip, you'd have to know in which databases the document was present
Is there a recipe for this situation? Am I missing an obvious solution?
EDIT
Partial solution for case 1:
localDB.sync(remoteDB, {
  live: true,
  retry: true,
  filter: 'app/by_user',
  query_params: { "agente": agent }
})
.on('paused', function (info) {
  console.log("paused");
  // Compare every local doc against its remote counterpart: if the
  // agent is no longer listed on the remote doc, drop the local copy.
  localDB.allDocs().then(function (docs) {
    console.log("allDocs");
    docs.rows.forEach(function (row) {
      console.log(row);
      remoteDB.get(row.id)
        .then(function (doc) {
          if (doc.Agents.indexOf(agent) < 0) {
            localDB.remove(doc);
          }
        });
    });
  });
})
.on('change', function (result) {
  console.log("change!");
  result.change.docs.forEach(function (change) {
    if (!change.deleted) {
      $rootScope.$apply(function () {
        $rootScope.$broadcast('upsert', change);
      });
    }
  });
});
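For context, a sketch of what the 'app/by_user' filter referenced above could look like in the design document (the exact filter is an assumption; the field and parameter names follow the code):

{
  "_id": "_design/app",
  "filters": {
    "by_user": "function(doc, req) { return doc.Agents && doc.Agents.indexOf(req.query.agente) !== -1; }"
  }
}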
Each remove() is giving me a 409 (conflict), and rightfully so. Is there a way to tell Pouch "no longer consider this as replicable; just remove it from my DB"?
(3) Seems like the simplest solution to me, i.e. the "database per role" solution.
I think your difficulty stems from trying to manage permissions inside the documents themselves (and then using filtering replication). When you do that, you are basically trying to mirror CouchDB's permission system inside your documents, which is going to cause headaches.
Why not create a database per role, and assign roles to users using the normal _users database? If roles change, then users will lose or gain access to a set of documents. You would need to have server endpoints to handle the role-shuffling, or you would need to set up separate "admin" databases with special privileges, where users can change the roles.
Then on the client side, you can either replicate each CouchDB database into its own local PouchDB (and then collate the results together yourself), or replicate them all into a single PouchDB (probably a bad idea if you need to sync bidirectionally). Obviously you would need an initial step where you determine which databases the user has access to, but that's a small downside in my opinion.
Then if the user loses access to a document, they will simply get normal 401 errors during replication (which will show up in the 'denied' event during live replication). No need for ddocs or filtered replication - much simpler!
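A sketch of that client-side setup with PouchDB (the role_<name> database naming and the way the client learns its roles are assumptions):

const PouchDB = require("pouchdb");

function syncByRoles(roles) {
  const local = new PouchDB("local");
  roles.forEach(function (role) {
    const remote = new PouchDB("http://localhost:5984/role_" + role);
    // One live pull replication per role database.
    PouchDB.replicate(remote, local, { live: true, retry: true })
      // A revoked role surfaces as a 'denied' event here.
      .on("denied", function (err) { console.log("lost access:", err); });
  });
}

// e.g. syncByRoles(userCtx.roles) once you know the user's roles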
We arrived at the conclusion that:
1) our use-case might not be what CouchDB is good for
2) we value our mental health: after almost a month of struggling with this problem, we'd rather try and fail
3) documents are relatively inexpensive, so even if they stay on the user's phone they won't cause any major distress. If the data builds up too much, the user can simply clear it and start fresh
Solution:
1) Keep the architecture as in point 1
2) After each 'paused' event fires, compare local docs with remote docs; if a remote doc no longer passes the filter, remove it from the UI. Should there turn out to be a way to remove the local document only, we'd be very interested in upgrading to that logic.
1) still sounds like the simplest approach to me.
I don't know PouchDB very well, but in plain CouchDB, changes on deleted documents can be worked around by extending the attributes of the deleted document, using your own custom delete function.
I mean: a delete is just an update which sets the _deleted attribute to true.
So, instead of deleting documents directly with the normal CouchDB CRUD DELETE, you can create an update function like this:
function (doc, req) {
  // optional ACLs for deleting the doc; doc is owned by req.userCtx.name
  // doc.users are the users already granted to work with this doc
  return [{
    "_id": doc._id,
    "_rev": doc._rev,
    "_deleted": true,
    "users": doc.users
  }, "Ok doc deleted"];
}
Furthermore, using document rewriting rules, this update function can even be called when submitting an HTTP DELETE request (not only on PUT or POST). In this way your delete behaviour becomes totally transparent to the client, and you delete in a way that is more useful for your use case.
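A sketch of such a rewrite rule in the design document (the handler name and URL path are assumptions):

{
  "_id": "_design/app",
  "updates": {
    "soft_delete": "function(doc, req) { return [{ _id: doc._id, _rev: doc._rev, _deleted: true, users: doc.users }, 'Ok doc deleted']; }"
  },
  "rewrites": [
    { "from": "/docs/:docid", "to": "_update/soft_delete/:docid", "method": "DELETE" }
  ]
}

With this in place, a DELETE request to /db/_design/app/_rewrite/docs/some-id runs the update function instead of a plain delete.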
The Smileupps Chatty couchapp tutorial app uses this approach: extended deletes for different document types are performed within user/drop.js, profile/drop.js, chat/drop.js files

global identifiers? - iCloud + Core Data + Ensembles - duplicates when deleting objects

I am trying to implement iCloud sync in my Core Data app. I am not that experienced a programmer, and this is a really advanced topic, as I've learned... I found the Core Data sync framework "Ensembles" by Drew McCormack. It seems to make iCloud sync much easier.
I integrated it in my App and syncing does work quite well as long as I add new objects to my Core Data model. But when I delete an object, it creates duplicates. And then duplicates from duplicates. I ended up having the same Entry (object) like 3-4 times...
Why is that? What am I doing wrong? I did some research and my guess is that global identifiers could solve this?
What are global identifiers? My guess is that they help to avoid duplicates!? But how do I set them? I really have no idea; I did a lot of research but couldn't find an answer to that.
Thanks for the help!
Update:
Thanks for the help! I read the README and the book, but since I am a beginner, not everything is clear to me.
I think I understand the use of global identifiers in Ensembles now, but I don't know if I'm doing it correctly.
If I understand it right, I have to assign an identifier to each object, which I can do by storing it in an attribute. This identifier can be anything, as long as it is unique and an NSString?
In my app the user can store different things, say name, text, title, date and so on. The app is based on the Master-Detail View template in Xcode and uses Core Data. My Core Data model has only a single entity with some attributes, most of them strings plus an NSDate. No relationships or anything. If the user hits "+", a new object is created and I store what the user enters in the attributes.
What I did to add global identifiers is to add a new attribute that stores it.
So when a new object is created I do:
// I found this to use as the identifier!?
NSString *taskUniqueStringKey = newManagedObject.objectID.URIRepresentation.absoluteString;
// ...and store it in the attribute.
[newManagedObject setValue:taskUniqueStringKey forKey:@"coreDataObjectID"];
Then I use this:
- (NSArray *)persistentStoreEnsemble:(CDEPersistentStoreEnsemble *)ensemble globalIdentifiersForManagedObjects:(NSArray *)objects
{
    return [objects valueForKeyPath:@"coreDataObjectID"];
}
This seems to work for me. But am I doing it right? Is this the right place to assign a global identifier? I have no awakeFromInsert!?
If this works, I've got the next problem: my app is already live, and older entries that the user saved before the update will be missing the global identifier. What can I do about that? The only unique thing I already have is an attribute that saves [NSDate date] when the object is created.
I was trying to use that, but I failed because Ensembles only accepts an NSString, not an NSDate!? Can I use this date attribute - is it unique enough to work as a global identifier? And if yes, could you please give me a code example of how to convert it from a date to a string?
Syncing with Ensembles works quite well. No duplicates anymore; you can switch iCloud off and the entries stay, switch it on again and it syncs like it should, without losing locally stored objects. Ensembles is really cool! I am seeing some minor strange behaviour, like sync sometimes taking long and sometimes being really quick, and if I edit things within a short time period on two different devices it gets a bit messed up, e.g. an object that I just deleted reappears. But I guess that's normal? If I leave some time between using the app on the different devices, everything works fine.
Do I understand it right that there is only this one method to call for syncing:
- (void)syncWithCompletion:(void (^)(void))completion
{
    if (self.ensemble.isMerging) return;
    if (!self.ensemble.isLeeched) {
        [self.ensemble leechPersistentStoreWithCompletion:^(NSError *error) {
            if (error) NSLog(@"Error in leech: %@", error);
            if (completion) completion();
        }];
    }
    else {
        [self.ensemble mergeWithCompletion:^(NSError *error) {
            if (completion) completion();
        }];
    }
}
and you just call it when needed? There is nothing else, like doing a merge without leeching first, or a method like "this is the actual state - save it as it is now"?
There are different points in the app where you want to sync. App start and termination would be good points. In my app there are two more points where I should sync, I guess: when adding an object and saving it to Core Data, and when saving changes to an object. I could also provide a "sync now" button. Is this a good approach, and do I always just call
[self syncWithCompletion:NULL];
Another question that came up: can I exclude objects from syncing with Ensembles? My app loads tutorial entries as objects once, on first app start. I don't want to sync them, if that's somehow possible?
Thanks a lot for your help! If I can help you with anything, like localizing into German, let me know! ;)
Yes, this is almost certainly due to not setting up global identifiers for your objects, or at least not doing it properly.
When you leech your ensemble, the local persistent store is imported into the sync data. Without global identifiers, Ensembles will assign random ids to your objects, so it can track them across devices.
Duplicates arise when you leech a second device that has the same data. Ensembles has no way to know that the data represents the same logical objects as on the other device, so it again assigns random ids. Effectively, it treats the objects on each device as completely independent, so they all end up in your data set after syncing.
The solution is global identifiers. By implementing a CDEPersistentStoreEnsemble delegate method, you can provide Ensembles with global ids, which it can use to identify which objects on different devices belong together.
What should you use for global ids? Often just a UUID, though for singleton-like objects you will just want to pick a fixed id.
You can initialize them in awakeFromInsert. You can store the global ids in attributes on your entities. (Note that if you are migrating an existing app, you will want to check with a fetch whether the global ids have already been generated BEFORE you try to leech the store for syncing.)
More details are in the README on GitHub and in the book on Leanpub.
Update
To answer your update questions:
Yes, an identifier just has to be a string, and immutable. It should not change once assigned.
The NSManagedObjectID is not a very good global identifier, in that it will be different on different devices. We really want something that is global across devices.
If you are starting from scratch, using NSUUID is a good approach. Just create a unique id, and store it in the object.
If you have an existing app, and it has been syncing via another mechanism, you need to come up with a way to produce the same global identifiers on each device. One way to do that is to mash up the object properties in some way. Usually that will give you a pretty-close-to-unique value, and it will be good enough for the transition.
As an example: you do a quick fetch and discover that your objects don't yet have global ids. You go through the objects and set the global ids to a string composed of creationDate + text. (You could even shorten this by taking a hash, but it probably isn't that important.) After this initial 'migration' to global identifiers, you would just use UUIDs for any newly created objects.
Note that you don't have to use awakeFromInsert. That is simply a convenient place to put it. As long as you assign the global identifier before saving the object you should be fine.
The easiest way to get a string from an NSDate is to call its description method; another way would be to get a double via timeIntervalSince1970 and turn that into a string. (Be careful with dates as unique identifiers on their own: objects created together will often have the same creation date.)
You are correct about how you should do a sync: you can simply call syncWithCompletion:.
To answer the question about excluding objects: You can't exclude individual objects, mainly because it could become tricky when those objects have relationships to synced objects. You can handle these objects in one of two ways:
Put them in a separate persistent store, and add that store to the same persistent store coordinator.
Sync the objects, but give them global ids manually, so that the objects are treated the same on each device. E.g. you could just assign global ids like 'Sample1', 'Sample2', etc.
To integrate Drew's answer, I guess the two steps are the following.
1. Implement the CDEPersistentStoreEnsemble delegate method (see the README):
- (NSArray *)persistentStoreEnsemble:(CDEPersistentStoreEnsemble *)ensemble
globalIdentifiersForManagedObjects:(NSArray *)objects {
    return [objects valueForKeyPath:@"yourUniqueIdentifier"];
}
2. Generate the unique identifier in your NSManagedObject subclass:
- (void)awakeFromInsert {
    [super awakeFromInsert];
    if (!self.yourUniqueIdentifier) {
        self.yourUniqueIdentifier = [[NSUUID UUID] UUIDString];
    }
}
In awakeFromInsert you can initialize special default property values, such as an identifier.
The check is necessary because, for example, when you have parent-child contexts, awakeFromInsert can run more than once; without the check you would overwrite the identifier previously set. See "Why is awakeFromInsert called twice?".

CouchDB and Couchbase Document Keys

In the reference material for CouchDB and Couchbase, it is common guidance to store the type of a document as a parameter within the document itself.
I've got a database where different documents record certain behaviour by URL, so naturally I use the URL as the id of the document.
The problem I find is that by using just the URL as the document id, I get clashes between documents of different types. So I have started using the type as the first part of the key, like this:
{ "_id": "rss_entry|http://www.spiegel.de/1234", [...] }
{ "_id": "page_text|http://www.spiegel.de/1234", [...] }
Now I start to wonder why I've never seen this approach to model type in any of the documentation.
Prefixes are commonly used. In addition to supporting scenarios such as yours, prefixing allows one to perform logical range queries against views. This technique appears in the modeling examples, though perhaps the concept is not described in as much detail as you are expecting. In the section http://docs.couchbase.com/couchbase-devguide-2.5/#modeling-documents, the documents are keyed as beer_NNNN and brewery_NNNN. The section http://docs.couchbase.com/couchbase-devguide-2.5/#using-reference-documents-for-lookups goes a bit deeper into the technique: there is a counter document named user::count, and each user is keyed as user::NNNN. Additionally, there are documents in the example keyed as fb::NNNN for a Facebook ID, email::XXX@YYYY.com for a user's email address, etc.
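As an illustration of the range-query benefit, a sketch against a PouchDB-style _all_docs API (the database handle and the prefix are assumptions):

const PouchDB = require("pouchdb");
const db = new PouchDB("mydb"); // or a remote CouchDB URL

// Keys sort lexicographically, so a prefix defines a logical range;
// the high Unicode character \ufff0 caps the range at that prefix.
db.allDocs({
  startkey: "rss_entry|",
  endkey: "rss_entry|\ufff0",
  include_docs: true
}).then(function (result) {
  result.rows.forEach(function (row) {
    console.log(row.doc);
  });
});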

couchdb design views, updating fields on doc creation

Is it possible to have CouchDB update or change fields on the fly when you create/update a doc, for example in the design document's validate_doc_update:
function(newDoc, oldDoc, userCtx) {
}
Within that function I can throw errors like:
if (!newDoc.user_email || !newDoc.user_name || !newDoc.user_password) {
  throw({ forbidden: 'all fields required' });
}
My question is: how would I reassign a field? I tried this:
newDoc.user_password = "changed";
with "changed" being some new value or hashed value. My overall goal is to build a user registration/login system with Node and CouchDB, and I have not found very good examples.
The validate_doc_update function cannot have any side effects and cannot change the document before storage. It only has the power to block an update or to let it through. This is important, because the function is not only called when a user requests an update, but also when changes are replicated from one CouchDB instance to another. So the function can be called multiple times for one document.
However, CouchDB now supports Document Update Handlers that can modify a document or even build it from scratch. These can be used to convert non-JSON input data into usable documents. You can find some documentation in the CouchDB Wiki.
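For illustration, a minimal sketch of an update handler that rewrites a field before storage (the handler name, the fields, and the placeholder "hashing" are assumptions; real password handling should use a proper KDF, e.g. in your Node layer):

// In a design document, e.g. _design/users, registered under
// "updates" as e.g. "store":
function (doc, req) {
  var body = JSON.parse(req.body);
  if (!doc) {
    // The document doesn't exist yet: build it from the request.
    doc = { _id: req.uuid, type: "user" };
  }
  doc.user_name = body.user_name;
  doc.user_email = body.user_email;
  // Reassigning a field before storage -- exactly what
  // validate_doc_update is not allowed to do.
  doc.user_password = "hashed:" + body.user_password; // placeholder only
  return [doc, "stored"];
}

It would be invoked with a POST to /db/_design/users/_update/store, or a PUT with a document id appended.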
Before you build your own user registration/login system, I'd suggest you look into the built-in CouchDB security features (if you haven't - some information here). They might not be enough for you (e.g. if you need email validation or something similar), but maybe you can build on them.
