What is the best approach to designing a data model based on a generalization relationship? For example, imagine there is a base class A and two derived classes B and C that inherit from A. Now I want to design the data model. I have three choices:
1) Create table A with a type column specifying whether a row holds B or C data.
2) Create tables A, B, and C just like my class diagram, and relate B and C to A.
3) Create tables A, B, and C, but don't relate B and C to A.
Any clue would be appreciated.
Check out this article. Although it is written for JPA, it tells you the pros and cons of each of the strategies you mentioned.
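To make options 1) and 2) concrete, here is a minimal sketch using SQLAlchemy's inheritance mapping as a Python analogue of those JPA strategies; all class, table, and column names are invented for illustration:

from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+ import path

Base = declarative_base()

# Option 1: single-table inheritance -- one table 'a' with a discriminator
class A(Base):
    __tablename__ = 'a'
    id = Column(Integer, primary_key=True)
    type = Column(String(16))          # the "type column" from option 1
    __mapper_args__ = {'polymorphic_on': type,
                       'polymorphic_identity': 'a'}

class B(A):                            # B rows live in table 'a', type == 'b'
    __mapper_args__ = {'polymorphic_identity': 'b'}

class C(A):                            # C rows live in table 'a', type == 'c'
    __mapper_args__ = {'polymorphic_identity': 'c'}

# Option 2: joined-table inheritance -- B gets its own table whose primary
# key is also a foreign key back to A (the class-diagram-like layout)
class A2(Base):
    __tablename__ = 'a2'
    id = Column(Integer, primary_key=True)
    kind = Column(String(16))
    __mapper_args__ = {'polymorphic_on': kind,
                       'polymorphic_identity': 'a2'}

class B2(A2):
    __tablename__ = 'b2'
    id = Column(Integer, ForeignKey('a2.id'), primary_key=True)
    __mapper_args__ = {'polymorphic_identity': 'b2'}

The usual trade-offs: option 1 (single table) queries fastest but accumulates nullable columns; option 2 (joined tables) stays normalized at the cost of a join; option 3 (unrelated tables, JPA's table-per-class) makes queries spanning all A's awkward.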
I have two Spark data frames (A and B) with respective sizes a x m and b x m, containing floating point values.
Additionally, each data frame has a column 'ID', which is a string identifier. A and B have exactly the same set of 'ID's (i.e., they contain information about the same group of customers).
I'd like to combine a column of A with a column of B by some function.
More specifically, I'd like to compute the scalar product of a column of A with a column of B, with the rows ordered according to the ID.
Even more specifically, I'd like to calculate the correlation between columns of A and B.
Performing this operation on all pairs of columns would be the same as a matrix multiplication: A_transposed x B.
However, for now I'm only interested in correlations of a small subset of pairs.
I have two approaches in mind, but I'm struggling to implement them. (And I don't know whether either is possible or advisable at all.)
(1) Take the column of interest from each data frame and combine each entry into a key-value pair, where the key is the ID. Then apply something like reduceByKey() to the two columns of key-value pairs and sum afterwards.
(2) Take the column of interest from each data frame, sort it by its ID, cast it to an RDD (I haven't figured out how to do this), and simply apply
Statistics.corr(rdd1, rdd2) from pyspark.mllib.stat.
Also I wonder: is it generally computationally preferable to operate on columns rather than rows (since Spark data frames are column-oriented), or does that make no difference?
Starting from Spark 1.4, if all you need is the Pearson correlation, you can go like this:
cor = (dfA.join(dfB, dfA.id == dfB.id, how='inner')
          .select(dfA.value.alias('aval'), dfB.value.alias('bval'))
          .corr('aval', 'bval'))
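Spelled out end to end with toy data, and assuming a column named 'value' in each frame (those names are made up for illustration), the same idea looks like this; the last lines also show the .rdd conversion that approach (2) in the question was missing:

from pyspark.sql import SparkSession
from pyspark.mllib.stat import Statistics

spark = SparkSession.builder.getOrCreate()
dfA = spark.createDataFrame([('x', 1.0), ('y', 2.0), ('z', 3.0)], ['id', 'value'])
dfB = spark.createDataFrame([('x', 2.0), ('y', 4.0), ('z', 5.0)], ['id', 'value'])

# join on the shared ID so the rows pair up, then correlate the two columns
joined = (dfA.join(dfB, dfA.id == dfB.id, how='inner')
             .select(dfA.value.alias('aval'), dfB.value.alias('bval')))
print(joined.corr('aval', 'bval'))

# approach (2): a DataFrame column becomes an RDD via .rdd; deriving both
# RDDs from the same parent keeps their partitioning aligned for corr()
pairs = joined.rdd
rdd1 = pairs.map(lambda row: row.aval)
rdd2 = pairs.map(lambda row: row.bval)
print(Statistics.corr(rdd1, rdd2, method='pearson'))

Note that the join pairs the rows by ID, so no explicit sort is needed.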
Suppose I have a huge list of movies categorized by genre. Users can vote for movies, and each movie can be in multiple genres.
What is a good way to store this in Cassandra if I want to present the top X movies per category? Please ignore other use cases as I can have other column families as required (like presenting detailed movie information).
Action
  - Movie A
  - Movie B
  - Movie C
Comedy
  - Movie D
  - Movie E
  - Movie A
Based on the information you have presented, I'd say you have only given requirements for a single column family.
The columns would be: movie name, category, and any other attributes of the movie.
It is quite common to create several column families with different structures that act like 'materialized views' of the original column family.
In Cassandra you design column families based on how the application is going to use them. So you design your queries first, then you design the column family to support them.
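As a sketch of that query-first idea (the keyspace, table, and column names below are all invented, and a 'movies' keyspace is assumed to exist), a column family keyed for "top X movies per genre" could look like this with the DataStax Python driver:

from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect()

# one table per query: partition by genre, cluster by votes descending, so
# the top-voted movies of a genre come back in order from a single partition
session.execute("""
    CREATE TABLE IF NOT EXISTS movies.top_movies_by_genre (
        genre text,
        votes int,
        movie text,
        PRIMARY KEY (genre, votes, movie)
    ) WITH CLUSTERING ORDER BY (votes DESC, movie ASC)
""")

# "top 10 Action movies" is then a single partition read
rows = session.execute(
    "SELECT movie, votes FROM movies.top_movies_by_genre "
    "WHERE genre = %s LIMIT 10", ('Action',))
for row in rows:
    print(row.movie, row.votes)

Because votes is part of the primary key here, changing a vote count means deleting the old row and inserting a new one; that bookkeeping is the usual price of a 'materialized view' style table.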
Is it possible by default to merge more than two contacts in Dynamics CRM 2011?
If it is not, what is the workaround needed to make it happen?
I don't think so.
Workaround: You have three contacts, A, B, and C.
Merge A and B (A + B = AB).
Then merge the result with C (AB + C = ABC).
I would also suggest creating an entity that stores the old 1:N relationships as you merge records. For example, suppose you have two contacts A and B, each with one order, say X and Y respectively. When you merge A and B, the child records (both X and Y) are attached to the primary record, say A. This means you will never again be able to relate the deactivated record B to order Y. Having this stored in a custom entity will save you time if you ever need to investigate the data.
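The bookkeeping idea, reduced to a toy Python sketch (this is not the Dynamics SDK; every name here is invented purely to illustrate the record-keeping):

# snapshot who owned which order before reparenting, as a merge would do
merge_history = []  # stands in for the suggested custom entity

def merge(primary, secondary, orders):
    for order in orders:
        if order['contact_id'] == secondary['id']:
            # remember the original 1:N link before it is lost
            merge_history.append({'old_contact_id': secondary['id'],
                                  'order_id': order['id']})
            order['contact_id'] = primary['id']  # reparent to the primary
    secondary['state'] = 'inactive'              # merged record is deactivated

a = {'id': 'A', 'state': 'active'}
b = {'id': 'B', 'state': 'active'}
orders = [{'id': 'X', 'contact_id': 'A'}, {'id': 'Y', 'contact_id': 'B'}]
merge(a, b, orders)
print(merge_history)  # [{'old_contact_id': 'B', 'order_id': 'Y'}]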
I'm new to Core Data.
I have a data model, in which there are two tables and a 1-N relationship between them.
The application downloads all data from a service and saves the data in each table.
In addition, the tables are related, and therefore I want to do this:
a) Obtain all elements of table2 that satisfy certain conditions.
b) For each element of table2, look up its table1 identifier and save table1's id.
c) Get the item in table1 whose ID meets that requirement.
d) Relate the table2 element to the table1 element.
I haven't been able to do this. :(
I don't know whether relating the tables this way is a good approach or not.
This is sort of difficult to answer. If you think about Core Data as an SQL table you'll just get yourself into difficulty.
Core Data isn't about joining and searching tables; it's about an object graph. An object has a relationship to another object, which has an inverse relationship back to it. Essentially, what you should be looking to do is:
a) This is a fetch request for the entity you are storing in table2, subject to a predicate that defines your conditions.
b) You don't actually deal with ids directly in Core Data. You hardly ever touch the keys.
c) Step a) returned a collection of objects, and you can run a further predicate on it to filter it.
d) That is what the inverse relationship is for.
I know this doesn't answer your actual question. I'm trying to get you to think of your Core Data store as a collection of objects related to each other rather than as a bunch of linked tables.
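As a toy analogue of that object-graph mindset (plain Python, not the Core Data API; every name is invented), "fetching" becomes filtering and "relating" becomes setting a relationship together with its inverse:

class Parent:                       # plays the role of table1
    def __init__(self, name):
        self.name = name
        self.children = []          # to-many side of the relationship

class Child:                        # plays the role of table2
    def __init__(self, value):
        self.value = value
        self.parent = None          # inverse, to-one side

def relate(parent, child):
    child.parent = parent           # set one side ...
    parent.children.append(child)   # ... and its inverse together

p1, p2 = Parent('p1'), Parent('p2')
children = [Child(3), Child(42)]
relate(p2, children[1])

# step a): "fetch" the children matching a predicate
matches = [c for c in children if c.value > 10]
# steps b)-d): no ids anywhere; just walk the inverse relationship
for c in matches:
    print(c.value, '->', c.parent.name if c.parent else None)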
Let's say I have 3 entities: A, B, and C.
A and B are both aggregate roots and reference C.
C does not reference anything.
Does this mean, in DDD terms, that C is also an aggregate root because it is referenced by more than one entity? (Or else could it be part of the aggregate of the one entity referencing it?)
Thanks
Yes. In this case C is an entity, and not an aggregate root.