I am learning how to use IBM Cognos and my first task is to create relationships between the tables I have uploaded into Cognos.
Basically, I am trying to tell Cognos to link the id column in the Person Table with the person_id and related_person_id columns in the Relationship Table, as shown here:
However, this does not seem possible since the Match Selected Columns button becomes disabled when I try to also link the related_person_id column.
The reason I need to do this is because person_id and related_person_id are foreign keys - they point to people in the Person Table and explain how they are related.
How can this be accomplished in Cognos?
Thank you.
You can have any number of matches. You need to match a single query item from each side for each match. IIRC, a query item can be used in multiple matches, although that would only be really helpful once relational operators are implemented.
It isn't clear whether, in your case, you want to use person_id and related_person_id as a composite key, whether you want a 1:n relationship between ID and person_id plus some other relationship (n:1?) between ID and related_person_id, or whether a 1:n relationship between ID and person_id alone would be sufficient for whatever you are trying to accomplish.
Editorial comment:
It would be really really nice if Cognos introduced relational operators Real Soon Now.
I have been looking into this issue for some time and need some help. I have looked into sub-queries and CTEs with no luck.
I want to create a select of all of the data in Table A. I also want to add some columns from Table B, so an inner join will work initially.
However, in Table B I want to build a string concatenated from all the different categories for each person and list it as a column in my select.
For example, one person might be in both the IT and Construction categories, so I would like the string "Construction, IT" listed for that person, and the same for all the others.
I have looked into COALESCE, which gets me part of the way there. Any advice would be most appreciated.
Table A
[Name | Surname | Person Ref]
Table B
[Person Ref | Person Category | Person Gender]
These are example tables; I can't post the ones I'm actually looking at for security reasons, but hopefully you get my point.
Martin
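A minimal sketch of one way to do this, assuming SQL Server 2017+ (STRING_AGG; older versions would need a FOR XML PATH workaround, and other databases have LISTAGG/string_agg equivalents) and the placeholder table and column names from the example above:
SELECT a.Name,
       a.Surname,
       a.[Person Ref],
       b.Categories
FROM TableA AS a
INNER JOIN (
    -- collapse the 1:N category rows into one comma-separated string per person
    SELECT [Person Ref],
           STRING_AGG([Person Category], ', ')
               WITHIN GROUP (ORDER BY [Person Category]) AS Categories
    FROM TableB
    GROUP BY [Person Ref]
) AS b
    ON b.[Person Ref] = a.[Person Ref];
Because the aggregation happens in a derived table grouped by Person Ref, each person still comes back as a single row, with a string like "Construction, IT" in the Categories column.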
I have a PowerPivot with two tables. One contains a list of facilities, their status (active/inactive), and whether they belong to org A or org B (FacilityID | Active/Inactive | ORG).
The other table has a list of users and the facilities assigned to them, plus their org, so it looks like (userID | FacilityID | ORG), where each userID is repeated once for each facility assigned to it.
Initially I needed to report the number of active facilities, and I easily built a PivotTable for that.
Now I need to get a list of the facilities that each user is missing, so I basically need to do an outer join between the two tables for each user, and I just can't figure out a way to do it! I joined both tables on FacilityID and am able to see whether they have inactive facilities, but I can't figure out a way to show all the facilities they are missing!
Thanks
Nonexistence is hard. This is not the sort of thing that is best solved through measures, but through modeling. In your source, you should cross join Facility and User to get FacilityUser. In FacilityUser, every user has one row for every facility, and you add a flag to indicate whether the user is or isn't assigned to that facility. Then your problem becomes one of filtering on that flag value. This is solvable in DAX, but you don't want to do that.
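A hedged SQL sketch of that cross-join step, assuming source tables named Facility (FacilityID, Status, Org) and UserFacility (UserID, FacilityID, Org) - the names are placeholders for whatever your source actually uses:
-- every (user, facility) pair, flagged with whether the assignment exists
SELECT u.UserID,
       f.FacilityID,
       CASE WHEN uf.FacilityID IS NOT NULL THEN 1 ELSE 0 END AS IsAssigned
FROM (SELECT DISTINCT UserID FROM UserFacility) AS u
CROSS JOIN Facility AS f
LEFT JOIN UserFacility AS uf
       ON uf.UserID = u.UserID
      AND uf.FacilityID = f.FacilityID;
Load the result as FacilityUser, and "facilities the user is missing" becomes a simple filter on IsAssigned = 0.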
I just switched from Oracle to Cassandra 2.0 with the DataStax driver, and I'm having difficulty structuring my model for this big-data approach. I have a Persons table with a UUID and serialized Persons. These Persons have lists of addresses, names, identifications, and DOBs. For each of these lists I have an additional table with a compound key on each value in the respective list plus a person_UUID column. This model feels too relational to me, but I don't know how else to structure it so that I can have an index on (am able to search by) address, name, identification, and DOB. If Cassandra supported indexes on lists, I would have just the one Persons table containing indexed lists for each of these.
In my application we receive transactions, which can contain within them 0 or more of each of those address, name, identification, and DOB. The persons are scored based on which person matched which criteria. A single person with the highest score is matched to a transaction. Any additional address, name, identification, and DOB data from the transaction that was matched is then added to that person.
The problem I'm having is that this matching is taking too long and the processing is falling far behind. This is caused by having to loop through result sets performing additional queries, since I can't make complex queries in Cassandra, and I don't have sufficient memory to just do a huge select-all and filter in Java. For instance, I would like to select all Persons having at least two names in common with the transaction (names can have their order scrambled, so there is no first, middle, last; that would just be three names), but this would require a GROUP BY, which Cassandra does not support, and if I just selected all Persons having any of the names in common in order to filter in Java, the result set is too large and I run out of memory.
I'm currently searching by only identifications and addresses, which yield a smaller result set (although it could still be hundreds), and for each one in this result set I query to see if it also matches on names and/or DOB. Besides still being slow, this does not meet the project's requirements, as a match on name and DOB alone would be sufficient to link a transaction to a person if no higher score is found.
I know in Cassandra you should model your tables by the queries you do, not by the relationships of the entities, but I don't know how to apply this while maintaining the ability to query individually by address, name, identification, and DOB.
Any help or advice would be greatly appreciated. I'm very impressed by Cassandra but I haven't quite figured out how to make it work for me.
Tables:
Persons
[UUID | serialized_Person]
addresses
[address | person_UUID]
names
[name | person_UUID]
identifications
[identification | person_UUID]
DOBs
[DOB | person_UUID]
I did a lot more reading, and I'm now thinking I should change these tables around to the following:
Persons
[UUID | serialized_Person]
addresses
[address | Set of person_UUID]
names
[name | Set of person_UUID]
identifications
[identification | Set of person_UUID]
DOBs
[DOB | Set of person_UUID]
But I'm afraid of going beyond the max storage for a set (65,536 UUIDs) for some names and DOBs. Instead I think I'll have to use a dynamic column family with the column names as the Person_UUIDs - or is a row with over 65k columns very problematic as well? Thoughts?
It looks like you can't have these dynamic column families in the new version of Cassandra; you have to alter the table to insert a new column with a specific name. I don't know how to store more than 64k values for a row, then. With a perfect distribution I will run out of space for DOBs at 23 million persons, and I'm expecting to have over 200 million persons. Maybe I just have to have multiple set columns?
DOBs
[DOB | Set of person_UUID_A | Set of person_UUID_B | Set of person_UUID_C]
and I just check the size and alter the table when a set reaches 64k? Is there anything better I can do?
I guess it's just CQL3 that enforces this, and if I really wanted I could still do dynamic columns with Cassandra 2.0?
Ugh, this page from the DataStax docs seems to say I had it right the first way...:
When to use a collection
This answer is not very specific, but I'll come back and add to it when I get a chance.
First thing - don't serialize your Persons into a single column. This complicates searching for and updating any person info. OTOH, there are people who know what they're talking about who disagree with this view. ;)
Next, don't normalize your data. Disk space is cheap, so don't be afraid to write the same data to two places. Your code will need to make sure the right thing is done.
Those items feed into this: If you want queries to be fast, consider what you need to make that query fast. That is, create a table just for that query. That may mean writing data to multiple tables for multiple queries. Pick a query, and build a table that holds exactly what you need for that query, indexed on whatever you have available for the lookup, such as an id.
So, if you need to query by address, build a table (really, a column family) indexed on address. If you need to support another query based on identification, index on that. Each table may contain duplicate data, which means that when you add a new user you may be writing the same data to more than one table. This seems unnatural if relational databases are the only kind you've ever used, but you get benefits in return - namely, horizontal scalability thanks to the CAP Theorem.
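For instance, a minimal CQL sketch of a lookup table for the address query from the question (the table and column names here are just illustrative placeholders):
-- one partition per address; each person with that address is a row in it
CREATE TABLE persons_by_address (
    address text,
    person_uuid uuid,
    PRIMARY KEY (address, person_uuid)
);

-- fast lookup: all persons at a given address ('123 Main St' is an example value)
SELECT person_uuid FROM persons_by_address WHERE address = '123 Main St';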
Edit:
The two column families in that last example could just hold identifiers pointing into another table. So, voilà, you have made an index. OTOH, that means each query takes two reads. But it will still be a performance improvement in many cases.
Edit:
Attempting to explain the previous edit:
Say you have a users table/column family:
CREATE TABLE users (
    id uuid PRIMARY KEY,
    display_name text,
    avatar text
);
And you want to find a user's avatar given a display name (a contrived example). Searching users will be slow. So, you could create a table/CF that serves as an index, let's call it users_by_name:
CREATE TABLE users_by_name (
    display_name text PRIMARY KEY,
    user_id uuid
);
The search on display_name is now done against users_by_name, and that gives you the user_id, which you use to issue a second query against users. In this case, user_id in users_by_name has the value of the primary key id in users. Both queries are fast.
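In CQL, the two lookups would be something like this ('jdoe' and the UUID are just example values; the UUID is whatever the first query returned):
-- first query: resolve the display name to a user id
SELECT user_id FROM users_by_name WHERE display_name = 'jdoe';

-- second query: fetch the row from the main table by primary key
SELECT avatar FROM users WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;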
Or, you could put avatar in users_by_name, and accomplish the same thing with one query by using more disk space.
CREATE TABLE users_by_name (
    display_name text PRIMARY KEY,
    avatar text
);
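With that denormalized layout, the lookup becomes a single read (again, 'jdoe' is just an example value):
SELECT avatar FROM users_by_name WHERE display_name = 'jdoe';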
I have two forms that are related, and I would like to combine them in a view control. Not that difficult. This is for a "1 to Many" type scenario.
Say I have a customer view with the columns customerID and Customer Name. Then I have a view showing the "many" documents that has the columns masterCustomerID, orderNumber, orderDate.
On the XPage I create a view control of the "many" documents and add the columns masterCustomerID, orderNumber, and orderDate. Then I add a column at the front that does a DbLookup to pull in the actual name of the customer. Nothing too fancy really.
My question is: in this situation, where the lookup column is the FIRST column, what are the strategies for sorting the view by that column? By default it would sort by the key value in the order view, which is likely different from the Name values.
I'm not averse to using repeat controls if that would be easier.
My first thought would be to employ TreeMaps somehow, but I don't know if that's practical in the event that there might be a LOT of documents. Maybe there's something I'm missing...
Any advice would be appreciated. Thanks
Use a view with the structure (Customer Name, Customer ID) as the main view. Then, based on Customer ID, populate the other columns with a lookup from a view with the structure (Customer ID, Order ID, Order Date). Since it is a 1:N relationship, you can't use a single view component, but two nested ones - a repeat inside a view column - would do.
I hope you are aware of the performance impact (orders are looked up for every customer row), so don't try to show too many customers at once.