Tableau - Use different palette according to name - colors

What I am looking for is a way to assign a color palette to a subcategory. I haven't found any information online for my problem. I'll explain using dummy values:
Imagine I have school data, with teachers, students and janitors names. On the database, their names are preceded by their job at school (eg: prof-John, st-Trinity, func-Manuel).
The purpose is to build a graph with the years on the job of each person. It is possible to create a calculated field and assign blue to teachers, red to students and green to janitors using the contains function. However, I want to distinguish (in the graph) each person within their job: instead of assigning the single color blue to all teachers, I want to assign a blue palette to teachers, and follow the same idea for students and janitors.
Does anyone know how to do it?
Thanks in advance

EDIT: This solution gives you a color palette for continuous data. If you're looking to assign colors to discrete fields, this is clearly overkill. Alex Blakemore's suggestion to discretize your continuous data simplifies this process a LOT. But if you're feeling frisky and want a continuous color palette for each member of your dimension, this oughta do the trick.
Tableau doesn't let you assign entire palettes to members of a dimension, but I came up with a solution for you. A few caveats:
Tableauing always seems to be one silly hack after another, but this is truly the most hacktastic thing I have ever done in Tableau.
You're about to do a lot of manual work. Nothing about this process is even remotely dynamic.
This solution is extremely fragile. If your data currently contains nothing but students, professors, and janitors, but one day, you add a person that's an adventurer, this will break immediately, and you'll have to go redo a lot of your work.
So... consider yourself warned.
For simplicity, I'm just going to have two Roles in the data, but the formulas I use will be generalized for any number of Roles. I also added Age, just so we have a measure to work with.
+-------------+-----+
| Person | Age |
+-------------+-----+
| prof-John | 53 |
| st-Trinity | 22 |
| prof-Andrew | 47 |
| st-Alice | 21 |
| st-George | 20 |
| st-Frank | 21 |
| prof-Ed | 74 |
| st-Ralph | 26 |
| st-Skrillex | 18 |
+-------------+-----+
Let's start with the easy part. Tableau has a neat option called split that splits fields on delimiters. If you do a custom split, you can choose the delimiter, but Tableau is pretty clever, so if you just choose Split, there's a solid chance it will figure things out for you.
So right click on [Person] and click on Transform/Split. It will give you two calculated fields, the first of which looks like this:
TRIM( SPLIT( [Person], "-", 1 ) )
It should be pretty clear what that's doing, and equally clear what the second field will look like. Let's go ahead and rename those fields to Role and Name, so our table looks like this:
+-------------+------+----------+-----+
| Person | Role | Name | Age |
+-------------+------+----------+-----+
| prof-John | prof | John | 53 |
| st-Trinity | st | Trinity | 22 |
| prof-Andrew | prof | Andrew | 47 |
| st-Alice | st | Alice | 21 |
| st-George | st | George | 20 |
| st-Frank | st | Frank | 21 |
| prof-Ed | prof | Ed | 74 |
| st-Ralph | st | Ralph | 26 |
| st-Skrillex | st | Skrillex | 18 |
+-------------+------+----------+-----+
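If you want to sanity-check the split outside Tableau, here is a rough Python equivalent of the two generated fields. The function name `split_person` is made up for illustration, and the sketch assumes the role prefix itself never contains a hyphen; everything after the first hyphen is kept as the name.

```python
def split_person(person: str) -> tuple[str, str]:
    """Split 'prof-John' into ('prof', 'John') on the first hyphen."""
    role, _, name = person.partition("-")
    # Mirror the TRIM() that Tableau wraps around each split result
    return role.strip(), name.strip()


people = ["prof-John", "st-Trinity", "prof-Andrew"]
rows = [split_person(p) for p in people]
```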
It's eventually going to be important that you add a serial ID for each of the Roles, starting at 0. We'll be using that number for some math later. Since we only have two Roles, we can just do it manually without too much effort:
Role #
IF [Role] = 'st'
THEN 0
ELSEIF [Role] = 'prof'
THEN 1
END
If you have more values than that, then you'll need to come up with something clever, but frankly, if you have enough values that doing this manually would be a challenge, then you probably shouldn't be giving each of those values its own color palette anyway.
Now the hard, hideous, hacktastic monstrosity of a solution I've concocted for you. We're going to make a custom color palette. (You're going to need to understand how that works for the rest of this post to make sense, so read up on Tableau's custom color palettes first if you've never made one. No worries, it's easy.) More specifically, we're going to build a single sequential palette with a region for each of your Roles.
Our goal will be to normalize and manipulate our data so that the students are in the green region and the professors are in the blue region. Let's start with the normalization.
We're going to need the minimum and maximum ages in each Role, so we'll use LOD expressions:
Maximum Age
{ FIXED [Role] : MAX([Age]) }
Minimum Age
{ FIXED [Role] : MIN([Age]) }
Now let's normalize the ages:
Normalized Age Value (NAV)
( ([Age] - [Minimum Age]) / ([Maximum Age] - [Minimum Age]) )
We now have a [Normalized Age Value] (henceforth NAV) between 0 and 1 for each person, normalized within each Role. Our data now looks like this:
+-------------+------+--------+----------+-----+-----+-----+------+
| Person | Role | Role # | Name | Age | Min | Max | NAV |
+-------------+------+--------+----------+-----+-----+-----+------+
| prof-John | prof | 1 | John | 53 | 47 | 74 | .22 |
| st-Trinity | st | 0 | Trinity | 22 | 18 | 26 | .5 |
| prof-Andrew | prof | 1 | Andrew | 47 | 47 | 74 | 0 |
| st-Alice | st | 0 | Alice | 21 | 18 | 26 | .375 |
| st-George | st | 0 | George | 20 | 18 | 26 | .25 |
| st-Frank | st | 0 | Frank | 21 | 18 | 26 | .375 |
| prof-Ed | prof | 1 | Ed | 74 | 47 | 74 | 1 |
| st-Ralph | st | 0 | Ralph | 26 | 18 | 26 | 1 |
| st-Skrillex | st | 0 | Skrillex | 18 | 18 | 26 | 0 |
+-------------+------+--------+----------+-----+-----+-----+------+
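To double-check the NAV column, the same per-Role normalization can be sketched in plain Python. This is only an illustration of the arithmetic, not Tableau syntax. The guard for a Role whose minimum and maximum age are equal is my own assumption; the Tableau formula as written has no such guard.

```python
from collections import defaultdict

# (role, name, age) rows copied from the table above
people = [
    ("prof", "John", 53), ("st", "Trinity", 22), ("prof", "Andrew", 47),
    ("st", "Alice", 21), ("st", "George", 20), ("st", "Frank", 21),
    ("prof", "Ed", 74), ("st", "Ralph", 26), ("st", "Skrillex", 18),
]

# Stand-in for the two FIXED LOD expressions: min and max age per role
ages_by_role = defaultdict(list)
for role, _, age in people:
    ages_by_role[role].append(age)
bounds = {role: (min(ages), max(ages)) for role, ages in ages_by_role.items()}


def nav(role: str, age: float) -> float:
    """Normalized Age Value in [0, 1], computed within a single role."""
    lowest, highest = bounds[role]
    if highest == lowest:
        # Assumption: a role with only one distinct age maps to 0
        return 0.0
    return (age - lowest) / (highest - lowest)


navs = {name: round(nav(role, age), 3) for role, name, age in people}
```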
Now we need to move the professors to the blue region of our palette, and this is where things get a little tricky.
In a perfect world that exists only in our dreams, we could just add 1 to our professors' NAVs, giving us NAVs between 0 and 1 for our students and between 1 and 2 for our professors, but what our dreams didn't notice is that because we're using a sequential color palette, there are "dead zones" between each region of the palette.
Let's say we've built a sequential color palette with just two colors for each color region. Let's call them Green0, Green1, Blue0, and Blue1. There will be a space between Green1 and Blue0 where the color is continuously shifting from green to blue, thus making this area unusable in our palette. We can drive this point home by discretizing the palette:
That greenish/bluish section in the middle is actually lighter than Green1. So we need to make sure that our students only get the area of the palette between Green0 and Green1, and that our professors only get the area of the palette between Blue0 and Blue1.
We should not consider Green0 and Green1 regions on our palette — they are points. And these points have split up our palette into three distinct regions, the Green Zone, the Dead Zone, and the Blue Zone.
Since the Dead Zone is just the space between our two color regions (between our final green color and our first blue color), we can lower the size of the Dead Zone by adding more colors, which I don't think is necessarily valuable, but it is worth noting. Here's what the zones look like if we build a palette with ten colors per region.
Now if we add more color regions to our palette (for example, a Red Zone), that will add more Dead Zones.
Now we just need the starting point of each color zone and the size of a color zone, and we wind up with the formula:
[NAV] * [Color Zone Size] + [Color Zone Starting Point]
It's not hard to math out the size of the zones when you know the size of the Dead Zones:
So the formula for the size of a color zone is:
Color Zone Size
( ([numColorCodes] / [numColorZones]) - 1 ) / ([numColorCodes] - 1)
The start point is easy to derive from there — it's just the size of a color zone plus the size of the subsequent dead zone. We'll need to multiply by that Role # we calculated earlier:
Color Zone Starting Point
( ([numColorCodes] / [numColorZones]) * [Role #] ) / ([numColorCodes] - 1)
So, to reiterate, our color field will be:
Color Coordinate
[NAV] * [Color Zone Size] + [Color Zone Starting Point]
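Here is the same arithmetic as a small Python sketch, in case you want to verify the zone boundaries before building the calculated fields. The function and parameter names are mine; the defaults of 20 color codes and 2 zones match the palette used in this answer.

```python
def color_zone_size(num_color_codes: int, num_color_zones: int) -> float:
    """Width of one usable color zone on the 0..1 palette axis."""
    return (num_color_codes / num_color_zones - 1) / (num_color_codes - 1)


def color_zone_start(num_color_codes: int, num_color_zones: int, role_number: int) -> float:
    """Position of the first color of the zone belonging to role_number."""
    return (num_color_codes / num_color_zones) * role_number / (num_color_codes - 1)


def color_coordinate(nav: float, role_number: int,
                     num_color_codes: int = 20, num_color_zones: int = 2) -> float:
    """Map a normalized value (0..1) into the zone reserved for its role."""
    size = color_zone_size(num_color_codes, num_color_zones)
    start = color_zone_start(num_color_codes, num_color_zones, role_number)
    return nav * size + start
```

With 20 codes and 2 zones, students (Role # 0) land between 0 and 9/19, professors (Role # 1) between 10/19 and 1, and the gap from 9/19 to 10/19 is the Dead Zone that nobody is mapped into.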
I went ahead and put this together — here's a bar chart as a proof of concept.
And, as a bonus, here's the 20 code color palette I made.
<color-palette name="Hacktastic" type="ordered-sequential">
<color>#DBE9B1</color>
<color>#BFE38D</color>
<color>#A7DA72</color>
<color>#92D064</color>
<color>#80C45D</color>
<color>#70B557</color>
<color>#62A74D</color>
<color>#569A33</color>
<color>#498E0F</color>
<color>#398300</color>
<color>#B3D4DB</color>
<color>#8CCCE0</color>
<color>#71BFDF</color>
<color>#63ADD6</color>
<color>#4592C2</color>
<color>#2B7FB7</color>
<color>#1471B3</color>
<color>#1660A2</color>
<color>#1C508C</color>
<color>#24446F</color>
</color-palette>
Now obviously, you'll need to add another color zone to that palette to include the janitors. Unfortunately, that bit is always going to be manual, but you can save a few color palettes with different numbers of zones that you can recycle in the future. You can also have Tableau count the number of color zones you'll need with:
{ FIXED : COUNTD([Role]) }
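If you end up maintaining several of these palettes, a small script can take some of the manual pain away. The sketch below is only a helper of my own invention (`build_palette` is not a Tableau tool): it glues one color ramp per Role into the same XML shape as the palette above. It insists that every ramp has the same length, because the zone formulas assume equally sized zones.

```python
from xml.sax.saxutils import quoteattr


def build_palette(name: str, ramps: list[list[str]]) -> str:
    """Concatenate one color ramp per role into a single sequential palette."""
    if len({len(ramp) for ramp in ramps}) != 1:
        raise ValueError("every ramp needs the same number of colors")
    lines = [f'<color-palette name={quoteattr(name)} type="ordered-sequential">']
    for ramp in ramps:
        lines.extend(f"<color>{code}</color>" for code in ramp)
    lines.append("</color-palette>")
    return "\n".join(lines)


# Two-color ramps, using the first and last codes of each zone above
greens = ["#DBE9B1", "#398300"]
blues = ["#B3D4DB", "#24446F"]
palette_xml = build_palette("Hacktastic", [greens, blues])
```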
Godspeed.

The easiest approach is to make sure you have two discrete (i.e. blue) fields, usually dimensions. Say one is profession and one is years on the job bin. You can use the create bins command to create a discrete bin dimension based on the years on the job measure. You can adjust the bin size by editing the bin field.
Then you can place two discrete fields on the color shelf if you hold down the SHIFT key when adding the second field. In that case, Tableau will effectively create a combined field and assign colors intelligently. You can edit the color assignments by double clicking on the color legend.
Here is an example.
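To make the idea concrete, here is a small Python sketch of what the combined discrete field amounts to. The staff rows are made-up values, and labeling each bin by its lower edge is an assumption for illustration; each distinct (profession, bin) pair is one entry that would get its own color in the legend.

```python
def years_bin(years: float, bin_size: int = 5) -> int:
    """Lower edge of the bin that `years` falls into."""
    return int(years // bin_size) * bin_size


# Hypothetical (profession, years on the job) rows
staff = [("prof", 12), ("prof", 3), ("st", 2), ("func", 7)]

# One legend entry per distinct (profession, bin) combination
combined = sorted({(role, years_bin(years)) for role, years in staff})
```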

Related

pyexcel - is there a way to get the column width?

I have an excel sheet that I can display via pyexcel.get_book like this:
+------------------------+-------+---------------+--------------+-----------------------------------------------------------------+------------------------+----------+
| Rec No. | Title | Forenames | Surname | Address | Plaintiff Name | Amount |
+------------------------+-------+---------------+--------------+-----------------------------------------------------------------+------------------------+----------+
| 01 | MR | GUY PERSON | FITZGERALD | 69 BLAZEIT TERRACE, BAGEL ROAD, ST SAVIOUR, GERBILTONG, JE2 7TR | BANANA COMPANY LIMITED | 69420.69 |
+------------------------+-------+---------------+--------------+-----------------------------------------------------------------+------------------------+----------+
Is there a way to determine the width of the column that pyexcel calculates to display the table above (address is longer than the rest for example)? Like a col.width?
I want to convert this table to an image that takes up the full width of a page. The columns are of different widths, which I would like to use.

Cognos : create custom groups in report studio

I am new to Cognos and I know SQL, but it seems I can’t figure out the Cognos logic for some basic stuff. I have been trying for two days and have looked all over the internet without finding anything.
Here’s the problem.
I have a Dimension Product that has two dimensions under it: type of product and article (in this order, article is below type of product in terms of hierarchy).
Let’s simplify and say I have this table:
Product line | Article | Sales
-------------------------------
Shoes | Article1 | 1000
| Article2 | 2000
| Article3 | 10
| Article4 | 20
| Article5 | 30
Bags | Article6 | 100
| Article7 | 100
| Article8 | 30
Balls | Article9 | 50
| Article10 | 50
I want to display the sales per product line and per article for article1 and article2 and the sales per product line only for the rest.
I want my final result to look like this:
Product line | Article | Sales
-------------------------------
Shoes | Article1 | 1000
| Article2 | 2000
| Other | 60
Bags | Other | 230
Balls | Other | 100
I created a data item with the following expression: “if [article-name] in (‘article1’,’article2’) then ([article-name]) else (‘other’)”, but it gives me this:
Product line | ArticleNEW| Sales
-------------------------------
Shoes | Article1 | 1000
| Article2 | 2000
| Other | 10
| Other | 20
| Other | 30
Bags | Other | 100
| Other | 100
| Other | 30
Balls | Other | 50
| Other | 50
I thought Cognos would group by automatically but it seems it does not when you create a new expression….
Please note that I have thousands of articles and I cannot create a data item that would say “article3+article4+article5 etc.”.
If anyone has an idea on this, it would be great!
Thank you in advance!
I believe the issue is with the model. If you have access to Framework Manager and the project/metadata, that would change my answer.
Try this method, which uses 3 queries:
1) Query 1 has just product line and article
2) Query 2 has product line, article and sales
3) Query 3 is the join of the two: go to Queries, then the toolbox, and find the join.
Drag that over. There will be spots to add query 1 and query 2
In the middle is how you define the join
Connect the product line and article (there should be a button to add links so you should have 2 lines). This will be 1 to many (1.1 to 1.n). The first part represents the type of join, 1 being inner, 0 being outer. The second part is the relationship (either 1 or n for many).
We can group by query 1 and aggregate query 2 the way we want
Double click on query 3 and drag the data items (from query 1 and query 2)
Grab sales from query 2, and everything else from query 1
Now you should be able to set the aggregate property for Sales (either total or sum)
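For reference, the grouping the question is after can be written out in a few lines of Python. This is not Cognos syntax, just a sketch of the logic the joined query has to reproduce: relabel every article outside the keep-list as "Other", then total the sales per (product line, label) pair. The rows are the sample data from the question.

```python
from collections import defaultdict

# (product line, article, sales) rows from the question
sales_rows = [
    ("Shoes", "Article1", 1000), ("Shoes", "Article2", 2000),
    ("Shoes", "Article3", 10), ("Shoes", "Article4", 20), ("Shoes", "Article5", 30),
    ("Bags", "Article6", 100), ("Bags", "Article7", 100), ("Bags", "Article8", 30),
    ("Balls", "Article9", 50), ("Balls", "Article10", 50),
]
KEEP = {"Article1", "Article2"}  # articles that keep their own row

totals = defaultdict(int)
for product_line, article, sales in sales_rows:
    label = article if article in KEEP else "Other"
    # Rows sharing the same (product line, label) collapse into one total
    totals[(product_line, label)] += sales
```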

Should all fields that are visible on a screen be validated in Gherkin?

We are creating Gherkin feature files for our application to create executable specifications. Currently we have files that look like this:
Given product <type> is found
When the product is clicked
Then detailed information on the product appears
And the field text has a value
And the field price has a value
And the field buy is available
We are wondering if this whole list of And keywords that validate if fields are visible on the screen is the way to go, or if we should shorten that to something like 'validate input'.
We have a similar case in that our service can return tens of elements for each case that we could validate. We do not validate every element for each interaction; we only test the elements that are relevant to the test case.
To make it easier to maintain and switch which elements we are using, we use scenario outlines and tables of examples.
Scenario Outline: PO Boxes correctly located
When we search in the USA for "<Input>"
Then the address contains
| Label | Text |
| PO Box | <PoBox> |
| City name | <CityName> |
| State code | <StateCode> |
| ZIP Code | <ZipCode> |
| +4 code | <ZipPlus4> |
Examples:
| ID | Input | PoBox | CityName | StateCode | ZipCode |
| 01 | PO Box 123, 12345 | PO Box 123 | Boston | MA | 12345 |
| 02 | PO Box 321, Whitefish | PO Box 321 | Whitefish | MN | 54321 |
By doing it this way, we have a generic step "the address contains" that uses the 'Label' and 'Text' to test the individual elements. It is a neat and tidy way to test a lot of potential combinations - but it probably depends on your individual use case - how important all of the fields are.
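A generic step like that boils down to looping over the table rows and comparing each labelled element. Here is a minimal sketch in plain Python with no BDD framework involved; the function name and the dictionary shape of the address are assumptions made for illustration, not our actual step definition.

```python
def address_contains(address: dict[str, str],
                     expected_rows: list[tuple[str, str]]) -> list[str]:
    """Compare each (Label, Text) row with the address; return mismatch messages."""
    failures = []
    for label, text in expected_rows:
        actual = address.get(label)
        if actual != text:
            failures.append(f"{label}: expected {text!r}, got {actual!r}")
    return failures  # an empty list means the step passes


# Hypothetical service result and the rows of one step table
address = {"PO Box": "PO Box 123", "City name": "Boston",
           "State code": "MA", "ZIP Code": "12345"}
table = [("PO Box", "PO Box 123"), ("City name", "Boston"), ("State code", "MA")]
```

Collecting all mismatches instead of stopping at the first one means a failing scenario reports every wrong element in one run.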
You only need to validate the ones that provide business value, which is probably all of them. I would avoid using tech terms like "field" because it isn't related to a behavior. Al Mills is right on for using the tables.
I'd word it like this:
Scenario Outline: Review product details
Given I find the product <Type>
When I select the product
Then detailed information on the product appears including
| Description | <Description> |
| Price | <Price> |
And I can buy the product
Examples:
| Type | Description | Price |
| Hose | Rubber Hose | 31.99 |
| Sprinkler | Rotating Sprinkler | 12.99 |
The words I chose are behaviors or whats, not technical implementations or hows.

How to return the header by value in a PivotTable

I have a PivotTable like this:
Sum of Gf_Amount | Column Labels
| 2015 | | | | Grand Total
Row Labels | 17-Mar | 18-Mar | 19-Mar | 20-Mar |
3601 | 20 | 20 | | | 40
10386 | 35 | | | | 35
76301 | 5 | | | | 5
80941 | | | | 10 | 10
205738 | | | 5 | | 5
219576 | | 15 | | | 15
Grand Total | 60 | 35 | 5 | 10 | 110
What I want to do is find the last non-empty column and return the date according to the value. For example: for ID 3601 the result should be 2015 18-Mar.
Currently I know how to find the last non-empty column by using =LOOKUP(9.99E+307,B6:E6). For ID 3601 it gives me 20 which is correct. However when I use:
=INDEX($B$5:$E$5,MATCH(LOOKUP(9.99E+307,B6:E6),B6:E6,0))
to find the header, it gives me 17-Mar which is the corresponding header for the first 20. Besides, the formula I wrote can't even give me the year.
Can anyone help me out so I can find the date and year? (It doesn't have to be in PivotTable. You can copy and paste it in a normal table.)
I'm guessing that your column labels are date serial numbers formatted as dd-mmm, so there is no need to look up the 2015 that is displayed separately. Hence:
=INDEX($5:$5,MATCH(1E+100,A6:E6))
formatted as, say, dd-mmm-yyyy and presumably copied down, may suit.
It is a peculiarity (perhaps never really intended) of the MATCH function that, when the optional match-type argument is omitted, it assumes the list is sorted in ascending order and looks for the largest value less than or equal to the lookup value. If the lookup value is bigger than everything in the list, it returns the position of the last numeric entry in the list, which is very useful at times, as here! So all the “big number” (there are lots of versions of it, for example the one you used, 9.99E+307) does is feed MATCH a number so large it is never likely to find it (to force selection of the last entry).
I like 1E+100, a googol, as short and easy to remember, and for its ‘derivation’. 9.99E+307 is theoretically better as closer to the largest number Excel can handle:
9.99999999999999E+307
but
10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000
for me is big enough – I don’t expect ever to want to work with a number bigger than that and smaller than or equal to 9.99E+307.
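If it helps to see the intent of the formula spelled out, here is the same "last filled column" logic as a Python sketch. It is not how Excel evaluates MATCH internally; it simply walks each row from right to left and returns the header of the first non-empty cell it meets. The rows are copied from the PivotTable in the question, with `None` standing for an empty cell.

```python
headers = ["17-Mar", "18-Mar", "19-Mar", "20-Mar"]
rows = {
    3601: [20, 20, None, None],
    10386: [35, None, None, None],
    76301: [5, None, None, None],
    80941: [None, None, None, 10],
    205738: [None, None, 5, None],
    219576: [None, 15, None, None],
}


def last_filled_header(values, headers):
    """Header above the right-most non-empty cell, or None if the row is empty."""
    for header, value in zip(reversed(headers), reversed(values)):
        if value is not None:
            return header
    return None


result = {row_id: last_filled_header(values, headers) for row_id, values in rows.items()}
```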

Excel Calculations and VBA

In the following Excel spreadsheet, I need to make the following calculations:
Input Color Selected Output
------- -------------- --------
40 red 40x18
40 blue 40x12
40 green 40x16
40 yellow 40x13
39 red 39x18
28 blue 28x12
33 green 33x16
25 yellow 25x13
My question is, how can I assign values to the colors being selected using Java?
It sounds like you want to be doing something like this... "Countif cell color is red". That is, you mean to apply different multipliers based on the format of a cell. See also "Color Functions In Excel".
But to be honest, the best thing to do is to create a new column that contains the semantics of the information you are trying to represent using formatting and use THAT for your conditional expression instead. Make a column that contains the information contained in the "color" formatting and use that.
You could have a separate table with colors and numbers:
| F | G |
---|---------|-------|--
1 | red | 18 |
2 | blue | 12 |
3 | green | 16 |
. | ... | .. |
And then use the table in your calculation:
| A | B | C |
---|-------|-----------|-----------------------------------|--
1 | 40 | red | =A1*VLOOKUP(B1,$F$1:$G$100,2,0) |
2 | 40 | blue | =A2*VLOOKUP(B2,$F$1:$G$100,2,0) |
. | .. | ... | ... |
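The lookup-table idea is the same in any language. As a rough sketch in Python, assuming the "40x18" in the question means 40 multiplied by 18, the colors become keys in a dictionary and the calculation is a single multiplication. The multipliers are the ones listed in the question; the function name is made up.

```python
# Color -> multiplier, taken from the Output column of the question
MULTIPLIERS = {"red": 18, "blue": 12, "green": 16, "yellow": 13}


def output_value(input_value: float, color: str) -> float:
    """Multiply the input by the number assigned to the selected color."""
    return input_value * MULTIPLIERS[color]


results = [output_value(40, "red"), output_value(28, "blue"), output_value(25, "yellow")]
```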
