Cognos CASE statement

I am new to Cognos and I'm writing a CASE statement and have two general questions.
Is there a more efficient way of writing this statement?
Can the different WHEN statements be grouped where they share a common
description?
In the sample below, several of the WHEN clauses share a common THEN result, i.e. 'Key Tillage' or 'Other Crop Production'.
Can you write the WHEN statement as:
WHEN ('Combo Primary Tillage','Disk Harrows','Field Cultivators') THEN ('Key Tillage')
Sample CASE code:
CASE ([Class Long description])
WHEN ('TR. 20<40') THEN ('Under 40')
WHEN ('TR. 40<60') THEN ('40-59')
WHEN ('TR. 60<100') THEN ('60-99')
WHEN ('TR. 100<140') THEN ('100-139')
WHEN ('TR. 140<180') THEN ('140+')
WHEN ('TR. 180+') THEN ('140+')
WHEN ('TR. 4WD') THEN ('4WD')
WHEN ('CMB CAT 5') THEN ('Combines')
WHEN ('CMB CAT 6') THEN ('Combines')
WHEN ('CMB CAT 7') THEN ('Combines')
WHEN ('DISC MC') THEN ('Major Hay')
WHEN ('SICKLE MC') THEN ('Major Hay')
WHEN ('LARGE SQUARE BALER') THEN ('Major Hay')
WHEN ('SMALL SQUARE BALER') THEN ('Major Hay')
WHEN ('ROUND BALER') THEN ('Major Hay')
WHEN ('SP WINDROWER') THEN ('Major Hay')
WHEN ('BALE THROWER') THEN ('Other Hay')
WHEN ('SP SPRAYERS') THEN ('Sprayers')
WHEN ('PLANTERS') THEN ('Planters')
WHEN ('COMBO PRIMARY TILLAGE') THEN ('Key Tillage')
WHEN ('DISK HARROWS') THEN ('Key Tillage')
WHEN ('FIELD CULTIVATORS') THEN ('Key Tillage')
WHEN ('MIN PRIMARY TILLAGE') THEN ('Key Tillage')
WHEN ('VERTICAL SEEDBED TILLAGE') THEN ('Key Tillage')
WHEN ('AIR DRILLS') THEN ('Other Crop Production')
WHEN ('FLOATER APPLICATORS') THEN ('Other Crop Production')
WHEN ('CHISEL PLOWS') THEN ('Other Crop Production')
WHEN ('CRUMBLERS') THEN ('Other Crop Production')
WHEN ('PULL TYPE SPRAYERS') THEN ('Other Crop Production')
WHEN ('AIR SYSTEMS') THEN ('Other Crop Production')
WHEN ('FLOATERS') THEN ('Other Crop Production')
ELSE ([Class Long description])
END

No, you can't. But you can write it like this:
CASE
WHEN [Class Long description] in ('COMBO PRIMARY TILLAGE','DISK HARROWS','FIELD CULTIVATORS') THEN ('Key Tillage')
WHEN [Class Long description] in ('AIR SYSTEMS','FLOATERS') THEN ('Other Crop Production')
.....
END
But your approach is inefficient. You'd be better off creating a mapping table and joining it to your original table.
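A minimal sketch of that idea in generic SQL (table and column names are hypothetical, not from the original post): keep the description-to-group mapping in its own table and LEFT JOIN it, falling back to the raw description exactly like the ELSE branch above.

-- Hypothetical mapping table: one row per class description.
CREATE TABLE class_grouping (
    class_long_description VARCHAR(100) PRIMARY KEY,
    class_group            VARCHAR(50) NOT NULL
);
INSERT INTO class_grouping VALUES ('COMBO PRIMARY TILLAGE', 'Key Tillage');
INSERT INTO class_grouping VALUES ('DISK HARROWS',          'Key Tillage');
INSERT INTO class_grouping VALUES ('AIR SYSTEMS',           'Other Crop Production');
-- ... one row per class ...

-- LEFT JOIN so unmapped descriptions fall through unchanged.
SELECT f.*,
       COALESCE(g.class_group, f.class_long_description) AS class_group
FROM   fact_sales f
LEFT JOIN class_grouping g
       ON g.class_long_description = f.class_long_description;

This also lets business users maintain the grouping as data instead of editing report expressions.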

Related

Assign a timezone label to a list of geometries of a geodataframe

I have a geodataframe ("timezones") that contains the timezone labels for the whole world (for example "Europe/Zurich" or "Pacific/Galapagos") and their corresponding geometries.
On the other hand, I have a geodataframe ("regions") with ~80K rows, where each row represents a region in some country, defined by a certain geometry (a Polygon or MultiPolygon).
I need to assign a timezone to each of the regions, so the final dataframe has "region", "province", "geometry" and "timezone" columns.
I don't know how to do it; maybe using a for loop over each row and checking whether the geometry is inside the timezone with geopandas' within or contains?
for i, row in regions.iterrows():
    if regions.geometry.within(timezones.geometry) == "True":
        region['timezone'] = timezones['timezone']
This example does not work, but could it be something similar to this? Or maybe there is a better way to do it?
Any suggestions would be highly appreciated.
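For what it's worth, a minimal sketch of the loop-free way this is usually done, via geopandas' spatial join (assuming both GeoDataFrames share the same CRS; the predicate argument needs geopandas >= 0.10, older versions call it op):

import geopandas as gpd

# regions: columns "region", "province", "geometry" (~80K rows)
# timezones: columns "timezone", "geometry"
joined = gpd.sjoin(regions, timezones[["timezone", "geometry"]],
                   how="left", predicate="within")
# sjoin adds an index_right column; keep only the columns we need
result = joined[["region", "province", "geometry", "timezone"]]

Regions that straddle a timezone border are not within any single timezone polygon; predicate="intersects" plus deduplication is one hypothetical way to handle those.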

Split mixed string input into a range of single values

I have an input which allows multiple IDs.
They can be entered like this:
[ 1000, 1001, 1050-1060, 1100 ]
Out of this input string I want to get all the single IDs.
I already found this to split at each comma, so the part with 1000, 1001 already works.
DATA itab TYPE TABLE OF string.
SPLIT l_bukrs_string AT ',' INTO TABLE itab.
My problem is the self-built range. Any idea how I could combine this with the case above to split 1050-1060 into single values?
I want to get 1050 | 1051 | 1052 | ... | 1060 out of it.
Appreciate every hint :) Thank you so much!
The easiest solution would be to use a real range/select-option for user (?) input instead. Then you would use that range to select every value from the database table.
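A minimal sketch of that first suggestion (not from the original answer; report context and names assumed). A select-option is already a range table, so it can go straight into the WHERE clause:

TABLES t001.
" Accepts single values and intervals on the selection screen.
SELECT-OPTIONS s_bukrs FOR t001-bukrs.

DATA bukrs_table TYPE TABLE OF bukrs.

START-OF-SELECTION.
  SELECT bukrs
    FROM t001
    INTO TABLE bukrs_table
    WHERE bukrs IN s_bukrs.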
If you cannot use a real range/select-option, then you could convert the string to one as shown below.
DATA: bukrs_string TYPE string,
split_bukrs TYPE TABLE OF string,
bukrs TYPE bukrs,
bukrs_between TYPE TABLE OF bukrs,
bukrs_range TYPE RANGE OF bukrs,
bukrs_rline LIKE LINE OF bukrs_range,
bukrs_table TYPE TABLE OF bukrs.
FIELD-SYMBOLS: <string> TYPE string,
<bukrs> TYPE bukrs,
<bukrs_from> TYPE bukrs,
<bukrs_to> TYPE bukrs.
bukrs_string = '1000, 1001, 1050-1060, 1100'.
CONDENSE bukrs_string NO-GAPS.
SPLIT bukrs_string AT ',' INTO TABLE split_bukrs.
LOOP AT split_bukrs ASSIGNING <string>.
  bukrs_rline-sign = 'I'.
  IF <string> CA '-'.
    SPLIT <string> AT '-' INTO TABLE bukrs_between.
    bukrs_rline-option = 'BT'.
    READ TABLE bukrs_between INDEX 1 ASSIGNING <bukrs_from>.
    bukrs_rline-low = <bukrs_from>.
    READ TABLE bukrs_between INDEX 2 ASSIGNING <bukrs_to>.
    bukrs_rline-high = <bukrs_to>.
  ELSE.
    bukrs_rline-option = 'EQ'.
    bukrs = <string>.
    bukrs_rline-low = bukrs.
  ENDIF.
  APPEND bukrs_rline TO bukrs_range.
  CLEAR bukrs_rline.
ENDLOOP.

SELECT bukrs
  FROM t001
  INTO TABLE bukrs_table
  WHERE bukrs IN bukrs_range.
Before you split the string, you would condense it to remove all spaces. Then you would loop over the resulting parts and check whether each one contains a '-'. If it does, you split it again and create a BETWEEN entry in your range (consider adding a check that the second number is actually the higher one). If there is no '-', you just create an EQUAL entry.
After you have your real range, you use it to select from the database. This is because not every bukrs in that range has to exist. You may only have 1000, 1050, 1055 and 1060, for example.
Edit: The reason there is no command, function module or class to convert a range to individual values is that what needs to be done changes heavily depending on WHAT data the range is for and if/how many values need to be verified.
If you have an integer range, then all you need to do is take the from-value and add 1 to it until you reach the to-value. What about a range of binary floating point numbers? What about a range of colours? What about your range of company codes, where not all of them necessarily exist? That's why the conversion has to be done manually.
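To illustrate the integer case just described, a minimal sketch (variable names hypothetical):

DATA: lv_from   TYPE i VALUE 1050,
      lv_to     TYPE i VALUE 1060,
      lv_value  TYPE i,
      lt_values TYPE TABLE OF i.

lv_value = lv_from.
WHILE lv_value <= lv_to.
  APPEND lv_value TO lt_values. " collects 1050, 1051, ..., 1060
  lv_value = lv_value + 1.
ENDWHILE.

For company codes this is not enough, which is why the SELECT against t001 above is the right final step.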
Provided you are given a string with a list of mixed values, both single BUKRS values and dash-separated intervals, with the list items separated by comma+space, then:
DATA: input TYPE string VALUE '1000, 1001, 1050-1060, 1100, 1300-1340',
itab TYPE TABLE OF char10,
r_bukrs TYPE RANGE OF bukrs.
SPLIT input AT `, ` INTO TABLE itab.
r_bukrs = VALUE #( FOR GROUPS bukrs OF <bukrs> IN itab WHERE ( table_line+4(1) NE '-' ) GROUP BY <bukrs> WITHOUT MEMBERS ( sign = 'I' option = 'EQ' low = bukrs ) ).
DATA(ranges) = VALUE ddtest_ttyp_char( FOR GROUPS bukrs OF <bukrs> IN itab WHERE ( table_line+4(1) EQ '-' ) GROUP BY <bukrs> WITHOUT MEMBERS ( bukrs ) ).
LOOP AT ranges ASSIGNING FIELD-SYMBOL(<range>).
  r_bukrs = VALUE #( BASE r_bukrs FOR j = CONV i( <range>(4) ) UNTIL j = CONV i( <range>+5(4) ) + 1 ( sign = 'I' option = 'EQ' low = j ) ).
ENDLOOP.
The first table expression fills r_bukrs with the unique single values from the initial string table.
The second table expression fills the ranges table with the dash intervals found in the initial string table, 1050-1060 and 1300-1340 in our case.
In the loop through the ranges table, <range>(4) is the left bound of the interval and <range>+5(4) is the right bound, e.g. 1300 and 1340 respectively for the last interval.

How to convert Matlab table [Inf], '' entries to char strings

I have a Matlab table and want to create an SQL INSERT statement of this line(s).
K>> obj.ConditionTable
obj.ConditionTable =
Name Data Category Description
________________ ____________ _________________ ___________
'Layout' 'STR' '' ''
'Radius' [ Inf] 'Radius_2000_inf' ''
'aq' [ 0] '0' ''
'VehicleSpeed' [ 200] 'Speed_160_230' ''
Errors when conditionTable = obj.ConditionTable(1,:);
K>> char(conditionTable.Data)
Error using char
Cell elements must be character arrays.
K>> char(conditionTable.Description)
ans =
Empty matrix: 1-by-0
problem: the [Inf] entry
problem: possibly [123] number entries
problem: '' entries
Additionally, following commands are also useless in this matter:
K>> length(conditionTable.Data)
ans =
1
K>> isempty(conditionTable.Description)
ans =
0
Target Statement would be something like this:
INSERT INTO `ConditionTable` (`Name`, `Data`, `Category`, `Description`, `etfmiso_id`) VALUES ("Layout", "STR", "", "", 618);
Yes, num2str accepts a single variable of any type and will return a string, so all these operations are valid:
>> num2str('123')
ans =
123
>> num2str('chop')
ans =
chop
>> num2str(Inf)
ans =
Inf
However, it can deal with purely numeric arrays (e.g. num2str([5 456]) is also valid), but it will bomb out if you try to throw a cell array at it (even if all your cells are numeric).
There are two possible ways to work around that and convert all your values to character arrays:
1) Use an intermediate cell array
I recreated a table [T] with the same data as in your example. Then running:
%% Intermediate Cell array
T3 = cell2table( cellfun( @num2str , table2cell(T) , 'uni',0) ) ;
T3.Properties.VariableNames = T.Properties.VariableNames
T3 =
Name Data Category Description
______________ _____ _________________ ___________
'Layout' 'STR' '' ''
'Radius' 'Inf' 'Radius_2000_inf' ''
'aq' '0' '0' ''
'VehicleSpeed' '200' 'Speed_160_230' ''
produces a new table containing only strings. Notice that we had to recreate the column names (copied from the initial table), as these are not transferred into the cell array during conversion.
This method is suitable for relatively small tables, as the round trip table/cell array/table plus the call to cellfun will probably be quite slow for larger tables.
2) Use varfun function
varfun is for tables what cellfun is for cell arrays. You'd think that a simple
T2 = varfun( @num2str , T )
would do the job then ... well no. This will error too. If you look at the varfun code at the line indicated by the error, you'll notice that internally, data in your table are converted to cell arrays and the function is applied to that. As we saw above, num2str errors when met with a cell array. The trick to overcome that, is to send a customised version of num2str which will accept cell arrays. For example:
cellnum2str = @(x) cellfun(@num2str,x,'uni',0)
Armed with that, you can now use it to convert your table:
%% Use "varfun"
cellnum2str = @(x) cellfun(@num2str,x,'uni',0) ;
T2 = varfun( cellnum2str , T ) ;
T2.Properties.VariableNames = T.Properties.VariableNames ;
This will produce the same table as in example 1 above. Notice that again we had to reassign the column headers on the newly created table (the irony is varfun choked trying to apply the function to the column headers, but does not re-use or return them in the output ... go figure).
discussion: Initially I tried to make the varfun solution work (hence the T2 name of the result), and wanted to recommend this one, because I didn't like the table/cell/table conversion of the other solution. Now that I have seen what goes on inside varfun, I am not so sure this solution will be faster. It might be slightly more readable in a semantic way, but if speed is a concern you'll have to try both versions and choose which one gives you the best results.
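If speed is the deciding factor, timeit settles it on your actual data; a minimal sketch comparing both approaches on the table T from above:

% Wrap each conversion in a zero-argument handle and time it.
cellnum2str = @(x) cellfun(@num2str, x, 'uni', 0);
f1 = @() cell2table(cellfun(@num2str, table2cell(T), 'uni', 0));  % method 1
f2 = @() varfun(cellnum2str, T);                                  % method 2
[timeit(f1), timeit(f2)]  % typical execution time in seconds for each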
For the record: num2str(cell2mat(conditionTable.Data)) works, independent of whether the content is 'abc', [Inf], [0] or [123.123], apparently...
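From there, assembling the target statement is a plain sprintf; a minimal sketch using the all-character table T3 from method 1, with the etfmiso_id value (618) hard-coded as in the example above:

vals = table2cell(T3(1, :));  % first row: 'Layout', 'STR', '', ''
stmt = sprintf(['INSERT INTO `ConditionTable` ' ...
    '(`Name`, `Data`, `Category`, `Description`, `etfmiso_id`) ' ...
    'VALUES ("%s", "%s", "%s", "%s", %d);'], vals{:}, 618);

The cell expansion vals{:} passes each column's string as a separate sprintf argument.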

Check if a sentence is in a string in pl/pgsql

I have a pl/pgsql script that needs to check if a word/sentence is in a string; it must take care of word boundaries and be case insensitive.
Example:
String: "my label xx zz yy", Pattern: "my label", MATCH
String: "xx my label zz", Pattern: "my label", MATCH
String: "my labelxx zz", Pattern: "my label", NO MATCH
So the obvious solution is to use a regex, like this:
select _label ~* (E'\\y' || _pattern || E'\\y') into _match;
It works but is slow, compared to a simple
select _label ilike '%' || _pattern || '%' into _match;
This is wrapped in a function that my script calls A LOT (tens of millions of times; I do a lot of recursion), and with this requirement the overall runtime doubled.
Now my question is, is there a faster way to implement this ?
Thanks.
EDIT: ended up using this:
if _label ilike '%' || _pattern || '%' then
select _label ~* (E'\\m' || _pattern || E'\\M') into _match;
end if;
and it is significantly faster.
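Wrapped up as a function, that prefilter trick looks something like this (a minimal sketch, names hypothetical; the result simply becomes false whenever the cheap ILIKE test already fails):

CREATE OR REPLACE FUNCTION label_matches(_label text, _pattern text)
RETURNS boolean AS $$
BEGIN
    -- Cheap substring test first; only run the expensive word-boundary,
    -- case-insensitive regex when the substring test passes.
    IF _label ILIKE '%' || _pattern || '%' THEN
        RETURN _label ~* (E'\\m' || _pattern || E'\\M');
    END IF;
    RETURN false;
END;
$$ LANGUAGE plpgsql IMMUTABLE;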
I would consider the full text search capabilities, but from what you're describing, I'd likely implement this using PostgreSQL arrays.
First: define a function that takes a label, lowercases it (or uppercases it, if you prefer), splits it on word boundaries, and returns an array. Say:
CREATE OR REPLACE FUNCTION label_to_array(text) RETURNS text[] AS $$
SELECT regexp_split_to_array(lower($1), E'\\W');
$$ LANGUAGE sql IMMUTABLE;
$ select label_to_array('my label xx zz yy');
label_to_array
---------------------
{my,label,xx,zz,yy}
Now, create a GIN index over this function:
CREATE INDEX sometable_label_array_key ON sometable
USING GIN ((label_to_array(label)));
From here, PostgreSQL can use this index for many queries involving array operators, such as "contains":
SELECT *
FROM sometable
WHERE label_to_array(label) @> label_to_array('my label');
This query would split 'my label' into {my,label}, and would then use the index to find a list of rows containing my, intersect that with the list of rows containing label, and then return the result. This isn't exactly equivalent to your original query (since it doesn't check their order), but since it uses an index to eliminate most of the rows in the table, adding the original check on the end would work just fine:
SELECT *
FROM sometable
WHERE label_to_array(label) @> label_to_array('my label')
AND label ~* (E'\\y' || 'my label' || E'\\y');

Access list element using get()

I'm trying to use get() to access a list element in R, but am getting an error.
example.list <- list()
example.list$attribute <- c("test")
get("example.list") # Works just fine
get("example.list$attribute") # breaks
## Error in get("example.list$attribute") :
## object 'example.list$attribute' not found
Any tips? I am looping over a vector of strings which identify the list names, and this would be really useful.
Here's the incantation that you are probably looking for:
get("attribute", example.list)
# [1] "test"
Or perhaps, for your situation, this:
get("attribute", eval(as.symbol("example.list")))
# [1] "test"
# Applied to your situation, as I understand it...
example.list2 <- example.list
listNames <- c("example.list", "example.list2")
sapply(listNames, function(X) get("attribute", eval(as.symbol(X))))
# example.list example.list2
# "test" "test"
Why not simply:
example.list <- list(attribute="test")
listName <- "example.list"
get(listName)$attribute
# or, if both the list name and the element name are given as arguments:
elementName <- "attribute"
get(listName)[[elementName]]
If your strings contain more than just object names, e.g. operators like here, you can evaluate them as expressions as follows:
> string <- "example.list$attribute"
> eval(parse(text = string))
[1] "test"
If your strings are all of the type "object$attribute", you could also parse them into object/attribute, so you can still get the object, then extract the attribute with [[:
> parsed <- unlist(strsplit(string, "\\$"))
> get(parsed[1])[[parsed[2]]]
[1] "test"
flodel's answer worked for my application, so I'm gonna post what I built on it, even though this is pretty uninspired. You can access each list element with a for loop, like so:
#============== List with five elements of non-uniform length ================#
example.list <- list(letters[1:5], letters[6:10], letters[11:15],
                     letters[16:20], letters[21:26])
#==============================================================================#
#====== for loop that names and concatenates each consecutive element ========#
derp <- c()
for (i in 1:length(example.list)) {
  derp <- append(derp, eval(parse(text = example.list[i])))
}
derp  # Not a particularly useful application here, but it proves the point.
I'm using code like this for a function that pulls certain sets of columns from a data frame by their column names. The user enters a list whose elements each represent a different set of column names (each set being a group of items belonging to one measure), together with the big data frame containing all those columns. The for loop applies each consecutive list element as the set of column names for an internal function* applied only to the currently named set of columns of the big data frame. Each pass of the loop fills one column of an output matrix with the function's output for the corresponding subset of the big data frame, and after the loop the function returns that matrix.
Not sure if you're looking to do something similar with your list elements, but I'm happy I picked up this trick. Thanks to everyone for the ideas!
"Second example" / tangential info regarding application in graded response model factor scoring:
Here's the function I described above, just in case anyone wants to calculate graded response model factor scores* in large batches... Each column of the output matrix corresponds to an element of the list (i.e., a latent trait with ordinal indicator items specified by column name in the list element), and the rows correspond to the rows of the data frame used as input. Each row should presumably contain mutually dependent observations, as from a given individual, to whom the factor scores in the same row of the output matrix belong. Also, I feel I should add that if all the items in a given list element use the exact same Likert scale rating options, the graded response model may be less appropriate for factor scoring than a rating scale model (cf. http://www.rasch.org/rmt/rmt143k.htm).
grmscores <- function(ColumnNameList, DataFrame) {
  require(ltm)  # (Rizopoulos, 2006)
  x <- matrix(NA, nrow = nrow(DataFrame), ncol = length(ColumnNameList))
  for (i in 1:length(ColumnNameList)) {  # flodel's magic featured below!
    cols <- eval(parse(text = ColumnNameList[i]))
    x[, i] <- factor.scores(grm(DataFrame[, cols]),
                            resp.patterns = DataFrame[, cols])$score.dat$z1
  }
  x
}
Reference
*Rizopoulos, D. (2006). ltm: An R package for latent variable modelling and item response theory analyses. Journal of Statistical Software, 17(5), 1-25. URL: http://www.jstatsoft.org/v17/i05/
