ActivePivot QueriesService.retrieveObject on a distributed cube

I've been trying to create a new action in ActivePivot Live, that calls retrieveObject on the QueriesService. Something like this:
IQueriesService queriesService = getSessionService(IQueriesService.class);
ObjectDTO dto = queriesService.retrieveObject("MyDistributedCube", action.getObjectKey());
This works fine on a local cube, but in a distributed setup it fails to retrieve an object from a remote server. Maybe this is not surprising, but the question is: how do I make it work?
Would a new query type, similar to the LargeDealsQuery in this example help me?
http://support.quartetfs.com/confluence/display/AP4/How+to+Implement+a+Custom+Web+Service
UPDATE:
Here is the context. What I have is too many fields to reasonably show in the drill-through blotter, so I'm hiding some in the cube drill-through config, both for display purposes and to reduce the amount of data transferred. To see all the fields when that is needed, I added a "drill-through details" item to the right-click menu, which queries the cube for all fields of a single drill-through row and shows them in a pop-up. Maybe there is a better way to get this functionality?

IQueriesService.retrieveObject() is an obsolete service that was introduced in ActivePivot 3.x. At that time ActivePivot stored the input objects directly in the memory and it was natural to provide means to retrieve those objects. But later versions of ActivePivot introduced a column store: the data is extracted from the input objects and packed and compressed into columnar structures. The input objects are then released, vastly reducing memory usage.
For ActivePivot 4.x the retrieveObject() service has been somewhat maintained, although indirectly: generic objects are in fact reconstructed on the fly from the compressed data. As you noticed, the implementation only supports local cubes. Only MDX queries and drillthrough queries have a distributed implementation.
For ActivePivot 5.x the retrieveObject() service has been removed completely, in favor of direct access to the underlying datastore.
There is a good chance you can address your use case with a (distributed) drillthrough query that retrieves raw facts. Another quick fix would be to issue your request manually on each of the local cubes in the cluster.
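For illustration, here is a minimal sketch of that second workaround, reusing the getSessionService/action context from the snippet above; the local cube names are assumptions, and it assumes retrieveObject either returns null or throws when the key is unknown to a given cube:
IQueriesService queriesService = getSessionService(IQueriesService.class);
// Names of the local cubes contributing to the distributed cube (assumed, adapt to your topology).
List<String> localCubeNames = Arrays.asList("LocalCubeA", "LocalCubeB");
ObjectDTO dto = null;
for (String cubeName : localCubeNames) {
    try {
        dto = queriesService.retrieveObject(cubeName, action.getObjectKey());
        if (dto != null) {
            break; // first cube that knows this key wins
        }
    } catch (Exception e) {
        // the key is not on this cube (or the cube is unreachable); try the next one
    }
}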
More generally, drillthrough queries (and also MDX queries and GetAggregates queries) are contextual in ActivePivot. You can attach IContextValue instances to the query that will alter the way the query is executed. For drillthrough queries in particular, you can attach the IDrillthroughProperties context value to the query:
public interface IDrillthroughProperties extends IContextValue {

    /**
     * @return The list of hidden columns defined by their header names.
     */
    List<String> getHiddenColumns();

    /**
     * @return The comparator used to sort drillthrough headers (impacts the column order).
     */
    IComparator<String> getHeadersComparator();

    /**
     * Returns the post-processed columns defined as plugin definitions of {@link IPostProcessedProperty}.
     * @return the post-processed columns defined as plugin definitions of {@link IPostProcessedProperty}.
     */
    List<IPluginDefinition> getPostProcessedColumns();

    @Override
    IDrillthroughProperties clone();
}
This will among other things allow you to retrieve only the columns you want for a specific drillthrough query.
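As a purely illustrative sketch of such a context value (implementing only the contract shown above; it assumes the one extra method inherited from IContextValue is getContextInterface(), which may differ in your ActivePivot version, and the column names are made up):
// Hypothetical sketch: a context value hiding two drillthrough columns (imports omitted).
public class HiddenColumnsDrillthroughProperties implements IDrillthroughProperties {

    // Column header names to hide; these names are examples only.
    private final List<String> hiddenColumns = Arrays.asList("InternalId", "AuditComment");

    @Override
    public List<String> getHiddenColumns() {
        return hiddenColumns;
    }

    @Override
    public IComparator<String> getHeadersComparator() {
        return null; // keep the default column ordering
    }

    @Override
    public List<IPluginDefinition> getPostProcessedColumns() {
        return Collections.emptyList(); // no extra post-processed columns
    }

    // Assumed to be required by IContextValue; adapt if your version declares something else.
    @Override
    public Class<? extends IContextValue> getContextInterface() {
        return IDrillthroughProperties.class;
    }

    @Override
    public IDrillthroughProperties clone() {
        return this; // this instance is immutable, so sharing it is fine
    }
}
Attached to the drillthrough query (or to a user's context) like any other context value, the hidden columns are then left out of the result.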

Given your update, one approach would be the following:
Set the drillthrough properties not in the shared context but per role or per user, and allow each user to change them before firing a drillthrough query.
So you would code a service that lists all the attributes a user can access; the user chooses which fields should appear in the drillthrough, the service populates the drillthrough properties accordingly, and then the drillthrough query is fired. You will then see only what you are interested in.
Think of this like the currency context in the sandbox project, except that here it impacts the drillthrough.

Related

Record User Activity In Express Nodejs

I have a backend api with express. I've implemented logging with winston and morgan.
My next requirement is to record a user's activity: the timestamp, the user, and the content they've fetched or changed, into a MySQL database. I've searched the web and found this, but since there is no answer yet, I've come here.
My Thought:
I can add another query that INSERTs all the information mentioned above, right before I respond to the client, in my route handlers. But I'm curious whether there is another, more elegant way to achieve it.
Select the approach that best suits your system from the following cases.
Decide whether your activity log should be persistent or in-memory, based on your use case. Let's assume it is persistent and the DB is MySQL.
If your data is already in the DB, there is no point in storing all of it again; you can just store the keys/IDs that identify the rows on which you performed the CRUD operations. They can be stored as foreign keys if the operations performed are always fixed, or as serialized JSON in the activity table.
For instance, the structure could look like the row below, where activity_data is a serialized JSON value.
ID | activity_name | activity_data | start_date | end_date |
If gathering the data again when storing the activity (just before sending the response) is a struggle, you can consider hooking the activity functions into the database abstraction layer or wrapper module you presumably have around MySQL.
For instance :
try {
    await query(`SELECT * FROM products`);
    // record the successful operation:
    // performActivity(insertion)
} catch {
    // record the failure instead:
    // performErrorActivity(insertion)
}
Here we need to consider a minor performance trade-off, as we perform an insert operation at each step.
If we want to do it all at once, we need to maintain a collection that accumulates references to all the activity, in something like request.activityPayload or perhaps a cache, and perform the insertion at the end.
If you are thinking of adding a new data source specifically for activity, a non-relational DB is worth recommending for storing/dumping such data (MongoDB, to be opinionated). It does not impose a schema structure the way a relational DB does, and you can get performance benefits compared to MySQL specifically for storing activity.

If Cassandra facilitates the use of non-normalized data, how do users edit it without creating inconsistencies?

If Apache Cassandra's architecture encourages the use of non-normalized column families designed specifically for anticipated queries, how do users edit data that is replicated across many columns without creating inconsistencies?
e.g., example 3 here: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
If Jay was no longer interested in iPhones, deleting this piece of information would require that columns in 2 separate column families be deleted. Do users just need to code add/edit/delete functions that appropriately update all the relevant tables, or does Cassandra somehow know how records are related and handle this for users?
In the Cassandra 2.x world, the way to keep your denormalized query tables consistent is to use atomic batches.
In an example taken from the CQL documentation, assume that I have two tables for user data. One is the "users" table and the other is "users_by_ssn." To keep these two tables in sync (should a user change their "state" of residence) I would need to apply an upsert like this:
BEGIN BATCH
    UPDATE users
    SET state = 'TX'
    WHERE user_uuid = 8a172618-b121-4136-bb10-f665cfc469eb;
    UPDATE users_by_ssn
    SET state = 'TX'
    WHERE ssn = '888-99-3987';
APPLY BATCH;
Users need to code the add/edit/delete functions themselves.
Note that Cassandra 3.0 has materialized views, which automate denormalization on the server side. A materialized view is added to/edited/updated automatically based on its parent (base) table.
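For completeness, a small sketch (DataStax Java driver 3.x style; the keyspace name is an assumption) of the materialized view that would replace the manually maintained users_by_ssn table from the batch example above:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CreateUsersBySsnView {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("my_keyspace"); // assumed keyspace name
            // Every primary key column of the base table must appear in the view's key,
            // and every view key column must be filtered with IS NOT NULL.
            session.execute(
                "CREATE MATERIALIZED VIEW IF NOT EXISTS users_by_ssn AS "
              + "SELECT * FROM users "
              + "WHERE ssn IS NOT NULL AND user_uuid IS NOT NULL "
              + "PRIMARY KEY (ssn, user_uuid)");
            // From here on, a single UPDATE on users is enough; Cassandra keeps users_by_ssn in sync.
        }
    }
}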

Selecting and updating against tables in separate data sources within the same transaction

The attributes for the <jdbc:inbound-channel-adapter> component in Spring Integration include data-source, sql and update. These allow for separate SELECT and UPDATE statements to be run against tables in the specified database. Both sql statements will be part of the same transaction.
The limitation here is that both the SELECT and UPDATE will be performed against the same data source. Is there a workaround for the case where the UPDATE will be on a table in a different data source (not just a separate database on the same server)?
Our specific requirement is to select rows in a table which have a timestamp prior to a specific time. That time is stored in a table in a separate data source. (It could also be stored in a file). If both sql statements used the same database, the <jdbc:inbound-channel-adapter> would work well for us out of the box. In that case, the SELECT could use the time stored, say, in table A as part of the WHERE clause in the query run against table B. The time in table A would then be updated to the current time, and all this would be part of one transaction.
One idea I had was, within the sql and update attributes of the adapter, to use SpEL to call methods in a bean. The method defined for sql would look up a time stored in a file, and then return the full SELECT statement. The method defined for update would update the time in the same file and return an empty string. However, I don't think such an approach is failsafe, because the reading and writing of the file would not be part of the same transaction that the data source is using.
If, however, the update was guaranteed to fire only upon commit of the data source transaction, that would work for us. In the event of a failure, the database transaction would commit, but the file would not be updated. We would then get duplicate rows, but we should be able to handle that. The issue would be if the file was updated and the database transaction failed. That would mean lost messages, which we could not handle.
If anyone has any insights as to how to approach this scenario it is greatly appreciated.
Use two different channel adapters with a pub-sub channel, or an outbound gateway followed by an outbound channel adapter.
If necessary, start the transaction(s) upstream of both; if you want true atomicity you would need to use an XA transaction manager and XA datasources. Or, you can get close by synchronizing the two transactions so they get committed very close together.
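If you go the synchronization route, a minimal sketch using Spring's core transaction synchronization support could look like the following (the class and method names here are mine, not Spring Integration API); the second update only runs after the main transaction commits, so a failure can at worst cause re-read duplicates, never lost messages:
import org.springframework.transaction.support.TransactionSynchronizationAdapter;
import org.springframework.transaction.support.TransactionSynchronizationManager;

public class AfterCommitRunner {

    /**
     * Call this from code running inside the main (SELECT/table B) transaction.
     * The update of the other data source (or file) is deferred until that
     * transaction has committed.
     */
    public static void runAfterCommit(final Runnable timestampUpdate) {
        if (TransactionSynchronizationManager.isSynchronizationActive()) {
            TransactionSynchronizationManager.registerSynchronization(
                    new TransactionSynchronizationAdapter() {
                        @Override
                        public void afterCommit() {
                            timestampUpdate.run();
                        }
                    });
        } else {
            // No active transaction: fall back to an immediate update.
            timestampUpdate.run();
        }
    }
}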
See Dave Syer's article "Distributed transactions in Spring, with and without XA" and specifically the section on Best Efforts 1PC.

Using dataCache property in view control in XPages

In the data source of a view control there is a dataCache property with the options Full, ID and NoData. From some sources I gather that:
Full - The entire view is persisted
ID - Minimal scalar data: ID and position. Access to column values during a POST is not available
NoData - Enough said: the entire view needs to be reconstructed
But exactly how does this property affect the performance of the XPage? Which methods/functionality can I use with each of these options? What is the suitability of each option?
I haven't tested, but I would presume the following are true. The more you persist in memory, the quicker it is to restore for expand/collapse etc. However, the more users and the bigger the view (number of columns rather than number of documents, because not all documents will get cached), the greater the risk of out-of-memory issues. The restriction on access to column values means you may have problems using getColumnValues() in SSJS against the view.
XPages is pretty quick, so unless you have specific performance issues, the default should be sufficient.

Preferred way to store a child object in Azure Table Storage

I did a little experiment with storing child objects in Azure Table Storage today.
Something like Person.Project where Person is the table entity and Person is just a POCO. The only way I was able to achieve this was by serializing the Project into a byte[]. It might be what is needed, but is there another way around it?
Thanks
Rasmus
Personally I would prefer to store the Project in a different table with the same partition key as its parent, i.e. the Person's partition key. This ensures that the person and the underlying projects are stored in the same storage cluster. On the code side, I would like to have some attributes on top of the reference properties, for example [Reference(typeof(Person))] and [Collection(typeof(Project))], and in the data context class I can use some extension methods to retrieve the child elements on demand.
In terms of the original question though, you certainly can store both parent and child in the same table - were you seeing an error when trying to do so?
One other thing you sacrifice by separating parent and child into separate tables is the ability to group updates into a transaction. Say you created a new 'person' and added a number of projects for that person: if they are in the same table with the same partition key, you can send the multiple inserts as one atomic operation. With a multi-table approach, you're going to have to manage atomicity yourself (if that's a requirement of your data consistency model).
I'm presuming that when you say person is just a POCO you mean Project is just a POCO?
My preferred method is to store the child object in its own Azure table with the same partition key and row key as the parent. The main reason is that this allows you to run queries against the child object if you have to. You can't run a single query that uses properties from both parent and child, but at least you can run queries against the child entity. Another advantage is that the child entity can take up more space: the limit on how much data you can store in a single property is smaller than the amount you can store in a whole row.
If neither of these things are a concern for you, then what you've done is perfectly acceptable.
I have come across a similar problem and have implemented a generic object flattener/recomposer API that will flatten your complex entities into flat EntityProperty dictionaries and make them writeable to Table Storage, in the form of DynamicTableEntity.
Same API will then recompose the entire complex object back from the EntityProperty dictionary of the DynamicTableEntity.
Have a look at: https://www.nuget.org/packages/ObjectFlattenerRecomposer/
Usage:
//Flatten complex object (of type ie. Order) and convert it to EntityProperty Dictionary
Dictionary<string, EntityProperty> flattenedProperties = EntityPropertyConverter.Flatten(order);
// Create a DynamicTableEntity and set its PK and RK
DynamicTableEntity dynamicTableEntity = new DynamicTableEntity(partitionKey, rowKey);
dynamicTableEntity.Properties = flattenedProperties;
// Write the DynamicTableEntity to Azure Table Storage using client SDK
//Read the entity back from AzureTableStorage as DynamicTableEntity using the same PK and RK
DynamicTableEntity entity = [Read from Azure using the PK and RK];
//Convert the DynamicTableEntity back to original complex object.
Order order = EntityPropertyConverter.ConvertBack<Order>(entity.Properties);
