I have tried both the native API and the fluent API for DataStax Graph in Java.
I found the fluent API more readable since it resembles Java's OOP style.
The native API is less readable in Java, since strings are basically being appended to create the entire Gremlin script. On the plus side, a single call executes the entire Gremlin script.
I want to know which API is best when I need to add a large number of edges and vertices in one transaction, and what performance issues can occur in either case.
Going forward I would recommend using the Fluent API over the String-based API. While we still support the string-based API in the DataStax drivers, most of our work and improvements will be using the fluent API.
The primary benefit of the Fluent API is that you can use the Apache TinkerPop library directly to form traversals, so they don't need to go through the Groovy scripting engine (as they do with the String-based API).
In terms of loading multiple vertices/edges in one transaction, you can do that with Apache TinkerPop, and it will be much more efficient than the String-based API because none of it needs to be evaluated through the gremlin-groovy engine. Also, any future work around batching will likely be done in the Fluent API (via Apache TinkerPop); see JAVA-1311 for more details.
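For comparison, here is a minimal sketch of the String-based style the question describes: one Gremlin-Groovy script is assembled by appending strings, with the values passed as bindings rather than embedded in the script. It uses only the Java standard library and does not call the DataStax driver. The class name `GremlinScriptBatch`, the `person` label, and the `name` property are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: builds one parameterized Gremlin-Groovy script for many vertices.
// The "person" label and "name" property are hypothetical.
final class GremlinScriptBatch {
    final String script;
    final Map<String, Object> bindings;

    private GremlinScriptBatch(String script, Map<String, Object> bindings) {
        this.script = script;
        this.bindings = bindings;
    }

    static GremlinScriptBatch addPeople(List<String> names) {
        StringBuilder script = new StringBuilder();
        Map<String, Object> bindings = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            String param = "name" + i;
            // Each value is referenced by a binding name, not written into the script text.
            script.append("g.addV('person').property('name', ")
                  .append(param)
                  .append(").iterate()\n");
            bindings.put(param, names.get(i));
        }
        return new GremlinScriptBatch(script.toString(), bindings);
    }

    public static void main(String[] args) {
        GremlinScriptBatch batch = addPeople(Arrays.asList("alice", "bob"));
        System.out.print(batch.script);      // the whole script would be sent in a single call
        System.out.println(batch.bindings);  // values sent alongside the script
    }
}
```

The whole script still has to be compiled by the gremlin-groovy engine on the server, which is the cost the Fluent API avoids.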
https://cloud.google.com/datastore/docs/concepts/queries#datastore-distinct-on-query-nodejs
When reading the documentation about querying entities, I noticed that keys-only queries and projection queries without a distinct on clause are considered small operations, which according to their quota and pricing are considered free.
However, when you look at the examples from different languages on that page, it looks like several (C#, Java, PHP, etc...) support a way of telling the query to specifically perform a distinct on operation, but there doesn't seem to be support in NodeJS for specifying this directly. This seems to significantly impact cost, but NodeJS is missing support.
What am I missing?
I don't think the NodeJS API uses distinct by default, though you should be able to run a simple test to confirm. Looking through the examples, the NodeJS API uses slightly different terminology and calls it groupBy for fetching distinct results. Here is the link to the API Documentation.
We're planning on implementing a server-side notification mechanism that pushes out to iOS and Android via ANH. We will have no code footprint on our mobile clients, short of a call to our server API for "registration". In this way our approach is looking similar to this MSDN discussion.
I also see the alternate, more bare-bones, approach noted on MSDN.
Is it fair to conclude that the two approaches will have similar performance on the 'send' side?
It appears the main difference is this:
The former approach has already done the work of integrating with the Task and async mechanism, presenting a callable C# API that takes on more of the RESTful API layer.
The DirectBatch/Send API is just that -- the raw RESTful API for you to use as you see fit.
For operations that are available as both REST API and SDK, you shouldn't see any significant difference in performance on the client side because the SDK is just a wrapper around the REST APIs. There are SDKs for both iOS and Android and it's recommended to use those so that you don't have to re-write the wrapper.
Direct Send is currently available only in the .NET SDK; on other platforms it is available only as a REST API, so you'd have to implement your own wrapper if you're using something other than .NET for the operation. You can use the sample to help you in the process.
In terms of performance it depends on what you mean by that.
Direct send will most likely be delivered to customers a bit faster because the ANH service doesn't have to look up any registrations in the process; it just delivers notifications with your parameters. But it has its limitations in terms of the number of handles you can provide, and you also need to manage the handles yourself.
If you only mean performance on the client side, then there should be no difference as all calls are asynchronous. And if you take advantage of tags, then you can do really tricky sends in one server call and let ANH figure out the details behind it.
But without knowing your scenario and requirements there's no way to give a proper recommendation.
I'm learning more about ArangoDB and its Foxx framework. But it's not clear to me what I gain by using that framework over building my own standalone Node.js app for API/access control, logic, etc.
What does Foxx offer that a regular nodejs app wouldn't?
Full disclosure: I'm an ArangoDB core maintainer and part of the Foxx team.
I would recommend taking a look at the webinar I gave last year for a detailed overview of the differences between Foxx and Node and the advantages of using Foxx when you are using ArangoDB. I'll try to give a quick summary here.
If you apply ideas like the Single Responsibility Principle to your architecture, your server-side code has two responsibilities:
Backend: persist and query data using the backend data storage (i.e. ArangoDB or other databases).
Frontend: transform the query results into a format acceptable for the client (e.g. HTML, JSON, XML, CSV, etc).
In most conventional applications, these two responsibilities are fulfilled by the same monolithic application code base running in the same process.
However the task of interacting with the data storage usually requires writing a lot of code that is specific to the database technology. You need to write queries (e.g. using SQL, AQL, ReQL or any other technology-specific language) or use database-specific drivers.
Additionally in many non-trivial applications you need to interact with things like stored procedures which are also part of the "backend code" but live in the database. So in addition to having the application server do two different tasks (storage and rendering), half the code for one of the tasks ends up living somewhere else, often using an entirely different language.
Foxx lets you solve this problem by allowing you to move the logic we identified as the "backend" of your server-side code into ArangoDB. Not only can you hide all the nitty gritty of query languages, edges and collections behind a more application-specific API, you also eliminate the network overhead often necessary to handle requests that would cause more than a single roundtrip to the database.
For trivial applications this may mean that you can eliminate the Node server completely and access your Foxx API directly from the client. For more complicated scenarios you may want to use Node to build external micro services your Foxx service can tap into (e.g. to interface with external non-HTTP APIs). Or you just put your conventional Node app in front of ArangoDB and use Foxx to create an HTTP API that better represents your application's problem domain than the database's raw HTTP API.
It's also worth keeping in mind that structurally Foxx services aren't entirely dissimilar from Node applications. You can use NPM dependencies and split your code up into modules and it can all live in version control and be deployed from zip bundles. If you're not convinced I'd suggest giving it a try by implementing a few of your most frequent queries as Foxx endpoints and then deciding whether you want to move more of your logic over or not.
I would like to work with Cassandra from a JavaScript web app using a REST API.
The REST API should support basic database commands: create table, and select/add/update/remove items. Something similar to the OData protocol would be perfect.
P.S. I'm looking for a library or component. Java is preferred.
Staash solution looks perfect for the task - https://github.com/Netflix/staash
You can use the DataStax drivers. I used them via Scala, but you can use Java. Note that a Session object is long-lived and should not be created and discarded per request/response, but it's up to you.
Ref.: rules when using DataStax drivers
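A minimal sketch of the "long-lived session" advice, assuming nothing about the driver itself: `SharedSession` is a hypothetical holder written with the standard library only, and the `Supplier` stands in for whatever call actually opens a driver session. It opens the session once on first use and hands the same instance to every later request.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Sketch only: SharedSession and its Supplier are hypothetical stand-ins, not driver classes.
final class SharedSession<S> {
    private final Supplier<S> connector;
    private volatile S session;

    SharedSession(Supplier<S> connector) {
        this.connector = connector;
    }

    // Opens the session on first use, then returns the same instance to every caller.
    S get() {
        S local = session;
        if (local == null) {
            synchronized (this) {
                local = session;
                if (local == null) {
                    local = connector.get();
                    session = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        AtomicInteger opened = new AtomicInteger();
        SharedSession<String> shared =
            new SharedSession<>(() -> "session-" + opened.incrementAndGet());
        // Three simulated requests reuse the single session.
        for (int i = 0; i < 3; i++) {
            System.out.println(shared.get());
        }
        System.out.println("sessions opened: " + opened.get());
    }
}
```

In a REST service, a holder like this would typically be created at application startup and shared by all request handlers.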
There is no "best" language for REST APIs; it depends on what you're comfortable using. Virtually all languages will be able to do this reasonably well, depending on your skill level.
The obvious choice is probably Java, because Cassandra is written in Java, the Java driver from DataStax is well supported, and because it's probably pretty easy to find Spring REST frameworks to do what you want. Second to that would be Python: again, good driver support, and REST frameworks such as Django or Flask + Potion. The Ruby driver isn't bad, and there are lots of Ruby REST APIs out there, too.
I am looking at CouchDB and I saw Ektorp, which presents a JPA-like interface to the database. However, I see examples that show how to write queries in JavaScript. I don't understand how the system works.
Do I query the database from the web tier without a middle tier? How can security be handled in that case?
CouchDB uses JavaScript to define the map and reduce functions for its views. Ektorp simply provides a convenient way to create those functions, which are then used by CouchDB. You might want to read the CouchDB wiki page on views:
http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
Just because the views are written in JavaScript does not mean that you have to create the views from a 'web tier'.
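To make that concrete, here is a minimal sketch, in plain Java with no Ektorp dependency, of what a view amounts to: a JavaScript map function stored as a string inside a design document, which is then queried over CouchDB's HTTP API. The database name `blog`, the design document `posts`, the view `by_author`, and the document fields are all hypothetical, and the code only builds the JSON and the URI without contacting a server.

```java
import java.net.URI;

// Sketch only: database, design document, view and field names are hypothetical.
final class CouchViewSketch {
    // JavaScript map function, kept as a plain string on the Java side.
    static final String MAP_FUNCTION =
        "function (doc) { if (doc.type === 'post') { emit(doc.author, null); } }";

    // JSON body of the design document that would be saved under _design/<name>.
    static String designDocument(String viewName) {
        String escapedMap = MAP_FUNCTION.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"language\":\"javascript\",\"views\":{\"" + viewName
            + "\":{\"map\":\"" + escapedMap + "\"}}}";
    }

    // URI that queries the view through CouchDB's HTTP API.
    static URI viewUri(String base, String db, String designDoc, String view) {
        return URI.create(base + "/" + db + "/_design/" + designDoc + "/_view/" + view);
    }

    public static void main(String[] args) {
        System.out.println(designDocument("by_author"));
        System.out.println(viewUri("http://localhost:5984", "blog", "posts", "by_author"));
    }
}
```

Whichever tier saves the design document and issues the query, the JavaScript itself runs inside CouchDB, not in the browser.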
In terms of architecture, you have a couple of options. You can use a traditional three-tier approach with a Java front end, and call CouchDB with Ektorp in your middle tier. Then you are in full control of security.
You can also go with what is coming to be known as the 2.1-tier model, where users interact directly with CouchDB, mainly through a couchapp. You can then provide support services that listen to the changes feed. I have done this with Ektorp and it works very well. Others have used node.js. It is a different way of thinking, but it can work. You can read a fun post about this model here:
http://markmail.org/thread/cfw7f3ef75aoqzin
Anyway, I just wanted to provide you with possible options in how you 'tier' your architecture.