storing and deleting in cassandra wide row - cassandra

i am using cassandra for a blogging app. one of my column families, UserFollowers, stores all the followers of a user: each row is a user and the columns are sorted keys for the followers, composed of firstname+lastname+uuid. the composite key is there so i can run range queries on the followers and serve them paginated.
example - followers of user A would look like:
A | john:2f432t3 | sam:f242fg | joe:f24gf24
all well and good so far. when i add a follower he falls into his sorted place and i can search and retrieve however i like. but now sam decides to stop being a follower and i need to delete him. moreover, just before that sam changed his name to samuel, so the delete message i send now is for samuel:f242fg. that column will not be found and the column sam:f242fg will stay.
my only solution for now is that when i want to delete i have to pull out the entire row, locate sam by his id only, get the key that was stored initially and remove it. this is very inefficient for people with many followers and only works if these kinds of removals don't happen a lot.
any better strategies out there?

I suggest the following:
Change your key on UserFollowers to an ID that represents the user.
Add a "name" column that contains the name of that user.
Instead of storing followers' names, store their IDs.
So your data now looks like this:
f1341df | name: george | 2f432t3 | f242fg | f24gf24
2f432t3 | name: john | f242fg | f1341df
... etc
Now you can get a list of followers' names by first querying the user and getting a list of IDs, then doing a multi-get with all those keys in a single query. If a user changes their name, this doesn't break your model.
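A minimal sketch of that layout, using plain Python dicts as a stand-in for the column family rather than a real Cassandra client. The helper names (`follower_names`, `rename`, `unfollow`) are made up for illustration, and the names for IDs f242fg and f24gf24 are taken from the question's example.

```python
# In-memory stand-in for rows keyed by user ID; not a Cassandra client.
users = {
    "f1341df": {"name": "george", "followers": {"2f432t3", "f242fg", "f24gf24"}},
    "2f432t3": {"name": "john", "followers": {"f242fg", "f1341df"}},
    "f242fg": {"name": "sam", "followers": set()},
    "f24gf24": {"name": "joe", "followers": set()},
}


def follower_names(user_id):
    """Read the follower IDs, then resolve all names in one batched lookup."""
    ids = users[user_id]["followers"]
    return sorted(users[follower_id]["name"] for follower_id in ids)


def rename(user_id, new_name):
    """A rename touches only the user's own row."""
    users[user_id]["name"] = new_name


def unfollow(follower_id, followed_id):
    """Deletion needs only the stable ID, never the current name."""
    users[followed_id]["followers"].discard(follower_id)


rename("f242fg", "samuel")
unfollow("f242fg", "f1341df")
print(follower_names("f1341df"))
```

The trade-off is that followers are no longer stored sorted by name, so name-ordered pagination has to happen after the lookup.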

ok i think i've found a way to do it more efficiently. it requires a bit more work on the application side, but it works and allows deletions regardless of changes made to the source.
just to define the problem again:
we have 2 entities that reference each other. example - User and Other Users. Users follow Other Users and Other Users are followed by Users.
we want to store the related entities horizontally. so we have a CF UserFollowers that stores in each row all the followers of the user.
we also have an inverse CF, UserFollowing, to store all the users this user is following.
what we actually store is a column for each followed or following user where the name is a key composed of firstname:lastname:uuid and the value is a compact json of the user.
now getting followers or following users is easy enough with range queries on the name.
removing a user from either one of the lists is however more tricky because we need to send a delete message with the original key that was stored.
example: if sam:jones:safg8sdfg followed abe:maxwell:fh2497h9 we would have -
in UserFollowers: fh2497h9 | sam:jones:safg8sdfg<json for sam>
and in UserFollowing: safg8sdfg | abe:maxwell:fh2497h9<json for abe>
if sam changes his name to sammy and tries to unfollow abe it won't work, because the delete message will now attempt to delete a column in UserFollowers named sammy:jones:safg8sdfg when the column actually stored is sam:jones:safg8sdfg.
so my solution to this was to store a reverseKey with the stored json on each side so that each side knows what key was actually stored on the other side and can use that to remove itself from there.
it would look like:
in UserFollowers: fh2497h9 | sam:jones:safg8sdfg<json for sam.. reverseKey:abe:maxwell:fh2497h9>
and in UserFollowing: safg8sdfg | abe:maxwell:fh2497h9<json for abe.. reverseKey:sam:jones:safg8sdfg>
now when sam wants to remove abe from his Following he can use the reverseKey sam:jones:safg8sdfg to remove himself from abe's follower list.
and everyone is happy.
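a rough sketch of the reverseKey idea, with python dicts standing in for the two column families. the function names and the dict layout are assumptions for illustration, not a cassandra client api.

```python
# Stand-ins for the two column families: row key -> {column name: value}.
user_followers = {}
user_following = {}


def column_key(user):
    """Build the sortable column name firstname:lastname:uuid."""
    return f"{user['first']}:{user['last']}:{user['id']}"


def follow(follower, followed):
    """Write both columns, each carrying the key used on the other side."""
    follower_key = column_key(follower)
    followed_key = column_key(followed)
    user_followers.setdefault(followed["id"], {})[follower_key] = {
        "user": dict(follower),
        "reverseKey": followed_key,
    }
    user_following.setdefault(follower["id"], {})[followed_key] = {
        "user": dict(followed),
        "reverseKey": follower_key,
    }


def unfollow(follower_id, followed_key):
    """Delete both columns using only keys that were actually stored."""
    entry = user_following[follower_id].pop(followed_key)
    followed_id = entry["user"]["id"]
    user_followers[followed_id].pop(entry["reverseKey"])


sam = {"first": "sam", "last": "jones", "id": "safg8sdfg"}
abe = {"first": "abe", "last": "maxwell", "id": "fh2497h9"}
follow(sam, abe)
sam["first"] = "sammy"  # the rename does not touch the stored columns
unfollow("safg8sdfg", "abe:maxwell:fh2497h9")
```

the delete never rebuilds sam's key from his current name. it reads the key back from the column on his own side.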

Related

How can we know ID of the document on client side

I recently followed a tutorial about RESTful APIs. In it, my instructor suggested that to delete a document we should pass its id as a parameter of the request. But now I am confused about how to handle this on the client side. I mean, how would the front-end programmer even know that particular document's ID? Do they need to go to the database each time?
Common practice for accessing a record in a db is to use its unique identifier to get, update, or delete the record.
On the client side (if you mean the user interface), when a user wants to delete a document, he/she must see the document somewhere in the interface. Suppose a page with a table listing all the (for instance) books in the db. On each row, you have the book title, the author's name, and the id of the book document in the db.
So you can use that id to call the delete REST API.
In a nutshell, to delete something you must already have fetched it from the db in order to display it, so the id is already at hand.
When you want to delete a document from the database, you first need to bring the documents to the front end so the user can see the data and decide what to do with it, right?
Imagine any database GUI that you have worked with,
let's say phpMyAdmin when using MySQL.
In that case phpMyAdmin's GUI lets you clearly see what the tables are and how things are stored in the database. You need to see that in order to make a decision.
In the same way, you need to bring at least a portion of the data to the front end so the user can see it and choose which part to change or delete.
So when we have a set of data in the front end, such as a list, and the user selects one item from it, the id or the name of that item can be sent to the server side to perform the task the user asked for.
That's why you need an id or some other identifying field for that particular data.
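To make the flow concrete, here is a small sketch with an in-memory dict standing in for the database behind the API. The ids, the sample records, and the function names `list_books` and `delete_book` are all hypothetical; a real app would expose them as something like a GET on the collection and a DELETE with the id.

```python
# Hypothetical in-memory stand-in for the database behind a REST API.
books = {
    "64a1": {"title": "Book A", "author": "Author A"},
    "64a2": {"title": "Book B", "author": "Author B"},
}


def list_books():
    """What a GET on the collection would return: every record carries its id."""
    return [{"id": book_id, **fields} for book_id, fields in books.items()]


def delete_book(book_id):
    """What a DELETE with an id parameter would do; True if a record was removed."""
    return books.pop(book_id, None) is not None


rows = list_books()      # the client renders these rows and keeps each id
selected = rows[1]       # the user clicks "delete" on the second row
deleted = delete_book(selected["id"])
```

The client never queries the database for the id separately; it arrives with the data that was fetched for display.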

GetStream: Prevent adding the same activity in flat-feed

I'm using a flat-feed so that users can follow and unfollow other profiles.
But I also want to group activities by verb, as is possible in the aggregated-feed,
because I want to aggregate likes of the same object into one activity.
user2 and user3 like image1.
user1 follows user2 and user3.
This means user1 now has two activities of users liking the same image, and I want them displayed as one activity.
Should I handle this myself by sorting on the foreign key and then collapsing the activities into one? Or can I somehow blend the flat-feed with the aggregated-feed?
This question, "How to build a news feed with aggregate and flat types?", is similar, but I need to be able to follow and unfollow.
After finding out that I could combine a flat-feed and an aggregated-feed, I came up with this setup:
In the example below, User2 publishes an artwork, so User2-feed now includes an activity that shows his artwork.
Shortly after, User3 likes the newly posted artwork, so User3-feed now includes an activity that shows his like of the artwork.
User1 follows User2-feed and User3-feed, which creates User1-aggregatedfeed, containing all activities from the two feeds that User1 currently follows. This means that User1's own activity will not be shown in his own User1-aggregatedfeed unless he follows his own feed, which isn't possible.
The same artwork that is added to User2-feed is also added to Artist1-feed, because the artist of the artwork is Artist1. So the same artwork is now in two feeds that User1 follows. But the activity will only be added to User1-aggregatedfeed once, because we store the "artwork publish" activity with the same foreign key.
Flat-feed combined with aggregated-feed example
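For comparison, the "handle it myself" option from the question could look roughly like the sketch below. It is plain Python over lists of dicts and does not call the GetStream client; the field names (`actor`, `verb`, `object`, `foreign_id`) and the foreign key values are assumptions chosen for illustration.

```python
def merge_feeds(*feeds):
    """Combine several flat feeds, keeping one copy of each foreign_id."""
    seen = {}
    for feed in feeds:
        for activity in feed:
            seen.setdefault(activity["foreign_id"], activity)
    return list(seen.values())


def group_by_verb_and_object(activities):
    """Collapse activities that share a verb and an object into one group of actors."""
    groups = {}
    for activity in activities:
        key = (activity["verb"], activity["object"])
        groups.setdefault(key, []).append(activity["actor"])
    return groups


publish = {"actor": "user2", "verb": "publish", "object": "artwork1",
           "foreign_id": "artwork:1"}
user2_feed = [publish,
              {"actor": "user2", "verb": "like", "object": "image1",
               "foreign_id": "like:user2:image1"}]
user3_feed = [{"actor": "user3", "verb": "like", "object": "image1",
               "foreign_id": "like:user3:image1"}]
artist1_feed = [publish]  # same activity, same foreign key

merged = merge_feeds(user2_feed, user3_feed, artist1_feed)
groups = group_by_verb_and_object(merged)
```

Here the publish activity survives once even though it sits in two followed feeds, and the two likes of image1 end up in a single group.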

cucumber feature: simulate multiple selection fields in a form

I have started writing the following feature within an app designed to manage a cleaning business:
Feature: Creating a new cleaner
  In order to allow Franchisees to allocate cleaners to jobs they need to be uploaded to the system
  Background:
    Given I am currently logged in to my account
    And I have navigated to the "Cleaners" page
    And I want to add a new cleaner to the database
  Scenario: Add a new cleaner to the system
    Given I have brought up the "Add Cleaner" form
    Then I will need to complete the fields within the following form:
      | first_name |
      | last_name |
      | email |
      | date_of_birth |
      | postcode |
      | mobile |
      | other_phone |
      | address_1 |
      | address_2 |
      | work_radius |
      | **days_available** |
      | notes |
    When I have entered valid data
    Then I can save to the database
    And I will have added a new cleaner to the system
In addition to welcoming comments on the way I have written the scenarios etc, my main problem is that I can't work out how to simulate selecting from a pre-populated field:
Populating the days_available should allow the franchisee to choose which days of the week, and which hours within those days, that a cleaner will be available for work. This obviously makes it possible to return queries which only show available cleaners for any given day/time of day.
Really hope someone can explain how this is done?
Just a quick comment on the structure of your feature file ... the 'Then' step in your feature should be asserting that something has or has not been done successfully.
Given I have logged into the site
When I add a new Cleaner to the site
Then I should see that the Cleaner has been added successfully
I would recommend using language that can be easily understood. Your scenario doesn't need to be instructions on how to use the site. Excessive navigational steps can make you lose track of the purpose of the scenario.
Answering your question regarding days_available accurately would require some knowledge of how the site is structured and how the days_available are entered. Are you choosing from select lists, filling in form fields, etc.? Also, since you are testing, you could consider setting the data from within your step (i.e. a hash or array) instead of passing all of the info in via a table.
Just some food for thought. Cheers.
Based on your updated post, I would suggest the following:
The step "And I want to add a new cleaner to the database" doesn't seem like an actionable step and could be removed. The same goes for the step "When I have entered valid data": if you handle filling out the form in the previous step, you have already entered valid data.
If you need multiple available days, I would consider making that its own step:
And(/^the cleaner is available from (.*?) to (.*?) on (.*?)$/) do |start_time, end_time, day|
  # fill in start time
  # fill in end time
  # select day
end
Background:
  Given I am currently logged in to my account
  And I have navigated to the "Cleaners" page
Scenario:
  And I bring up the "Add Cleaners" form
  And I complete the form with
    | first name | Bob |
    | last name | Smith |
    ...
  And the cleaner is available from 0600 to 1800 on W
  When I submit the Add Cleaners form
  Then I should see the new cleaner has been successfully added

Creating a view of users who have created nodes

Basically I want to create a block display view which displays a list of all the users that have posted nodes on the Drupal website.
Oddly enough thinking about this right now it could be a little tricky. You have two possible solutions off the top of my head.
1 - Create a new view of item type Node. Your row style will obviously be set to Fields. Under which Fields to pull select the User group and then tick off the User: Name checkbox. Set your Items to display setting to 0 for unlimited results.
Under the preview you should get a ton of results looking something like:
Name: John Doe
Name: Mary Jane
Name: John Doe
Name: Anonymous
What you're seeing is the authors of all the nodes posted in your system. There will be duplication because a user in your system could be the author of multiple nodes. Unfortunately you can't just tick off the Distinct: Yes option because this only applies to nodes and not the users.
How do you deal with the duplicate user names, though? Custom theme your view by creating a custom template under Theme: information. Inside the template, write some PHP code that intercepts the row results from the View query before rendering and only renders distinct user names. You'd have to write the logic that determines whether a user name has already been added.
It is as simple as creating a new custom array and adding each row result (user name) to it, but first checking whether the name already exists in the array. If it does, toss it and move on to the next user name. At the end you'll have an array filled with the distinct user names of those who have posted on your site.
voila! It works. It's not elegant but it definitely will work if built this way.
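The de-duplication logic itself is small. In the view template it would be PHP; the sketch below shows the same idea in Python, with a hypothetical `distinct_names` helper fed the preview rows from above.

```python
def distinct_names(rows):
    """Keep the first occurrence of each user name, preserving row order."""
    seen = set()
    result = []
    for name in rows:
        if name in seen:
            continue  # already added, toss it and move on
        seen.add(name)
        result.append(name)
    return result


rows = ["John Doe", "Mary Jane", "John Doe", "Anonymous"]
print(distinct_names(rows))
```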
2 - Alternatively maybe you can get this module working to accomplish the same thing in a less complicated manner: http://drupal.org/project/views_customfield but I have never used it so I cannot comment on it.
Good luck. Hope that helps.
My solution was to:
Create a view of people
Add a UID field (and any other fields you want)
Create a theme.tpl.php file for the Row Style
Do a DB call on each loop through the row to search for nodes created by the supplied UID.
Here is what I have in my semanticviews-view-fields-VIEWNAME.tpl.php
<?php
// Query the Drupal DB for a node of the given type created by this row's UID.
// Placeholder keys carry the leading colon, following the Drupal 7 convention.
$existing_nid = db_query(
  "SELECT nid FROM {node} WHERE type = :ctype AND uid = :uid",
  array(":ctype" => "CONTENT_TYPE", ":uid" => $fields['uid']->content)
)->fetchField();
// If the user has created content of that type, show their name.
if ($existing_nid != FALSE) {
  echo "Name: " . $fields['name']->content;
}
?>
This way only UIDs that have content associated with them in the DB will be printed out, and those that don't, won't.
Hope that helps!

How can I create human readable key for notes documents

For the documents stored in the database, I would like to create a human readable key to uniquely identify the document. e.g. PO20090110-001. How do I go about doing that?
When saving a document you can put together the first part of the number by using the date or any technique you like (e.g. "PO" & format(date, "YYYYMMDD") & confDoc.getitemvalue("doccounter")).
As for the counter, I like to store it in a configuration document and update it when each doc is saved. If lots of documents are created during the day you can run into replication conflicts on your configuration document. If that is the case, you can have an agent on the server do the actual assigning of the number; the drawback is that you don't get the number right away when saving.
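The prefix-date-counter scheme can be sketched independently of Notes. The Python below is only an illustration: the `KeyGenerator` class is hypothetical, its dict stands in for the configuration document, and resetting the counter each day is an assumption based on the PO20090110-001 format in the question.

```python
from datetime import date


class KeyGenerator:
    """Builds keys like PO20090110-001 from a prefix, a date and a counter."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.counters = {}  # stand-in for the configuration document

    def next_key(self, day):
        stamp = day.strftime("%Y%m%d")
        # Assumption: the counter restarts at 1 for each new date.
        self.counters[stamp] = self.counters.get(stamp, 0) + 1
        return f"{self.prefix}{stamp}-{self.counters[stamp]:03d}"


generator = KeyGenerator("PO")
print(generator.next_key(date(2009, 1, 10)))
```

A single in-process counter like this avoids nothing on its own; the replication-conflict concern above still applies wherever the counter is actually stored.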
Hope this helps.
One solution used in our help desk is to take the initials of the current user and combine them with a number taken from the last document in a view: add one to that number and store it in the new document, along with the initials and the new number as the key.
It's not simple.
Create a field for the unique key and set it on save (or on another event), but you must ensure that the number stays unique.
You can create an agent that checks the numbers on the Domino server and, if it finds a conflict, notifies the application administrator or another responsible person to resolve it.
Alternatively, each replica generates its own number and, after replication to the Domino server, an agent assigns the number in the right format.
You can create a "nearly" unique key in Domino simply by using the @Unique function, with no arguments. This will generate a string key based on the current user's first and last name and the current clock time. You will end up with a string something like: "ESCR-12345678".
I say "nearly" unique because it is not really like an identity column in SQL: Domino does not guarantee it will only give out a particular string once. If you use @Unique in a server-side agent that generates many IDs at once (for example, an agent that loops and calls @Unique within the loop), you can get into a situation where @Unique returns a duplicate, because you create 2 docs within the same second and because your "username" is always the server's canonical name. But outside of that scenario, @Unique is generally safe to use.
If you then need to open or reference docs by this ID, just create a view sorted by that ID and you can use a URL of the form ../myView/id?readDocument.
