I have a string field with mostly numeric values like 13.4, but some have 13.4%. I am trying to use the following expression to remove the % symbols and retain just the numeric values, so that the field can be converted to an integer.
Here is what I have so far in the expression definition of Cognos 8 Report Studio:
IF(POSITION('%' IN [FIELD1]) = NULL) THEN
/*** this captures rows with valid data **/
([FIELD1])
ELSE
/** trying to remove the % sign from rows with data like this 13.4% **/
(SUBSTRING([FIELD1]), 1, POSITION('%' IN [FIELD1])))
Any hints/help is much appreciated.
An easy way to do this is to use the trim() function. The following will remove any trailing % characters:
TRIM(trailing '%',[FIELD1])
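Since the end goal is a numeric field, you could wrap the trimmed value in a cast as well. A minimal sketch, assuming your Report Studio version supports cast() in report expressions (the precision shown is only an example):
cast(trim(trailing '%', [FIELD1]), decimal(9,2))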
The approach you are using is feasible. However, the syntax you are using is not compatible with the version of Report Studio that I'm familiar with. Below you will find an updated expression which works for me.
IF ( POSITION( '%'; [FIELD1]) = 0) THEN
( [FIELD1] )
ELSE
( SUBSTRING( [FIELD1]; 1; POSITION( '%'; [FIELD1]) - 1 ) )
Since character positions in strings are 1-based in Cognos, it's important to subtract 1 from the position returned by POSITION(). Otherwise you would only cut off characters after the percent sign.
Another note: what you are doing here is data cleansing. It's usually more advantageous to push these chores down to a lower level of the data retrieval chain, e.g. the Data Warehouse or at least the Framework Manager model, so that at the reporting level you can use this field directly as a numeric field.
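For example, if you can touch the Data Warehouse load, the cleanup could be a one-time fix at the source. A rough sketch in plain SQL, where stg_sales and field1 are placeholder names:
-- Sketch only: strip the stray '%' and cast once at the source,
-- so every report downstream sees a numeric column.
SELECT CAST(REPLACE(field1, '%', '') AS DECIMAL(9,2)) AS field1_num
FROM   stg_sales;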
I've been given data to build an application that has sequential data in the form of part numbers of products: "000000", "000001", "000002", "000010", "000011" .... The previous application was an old MS Access database that didn't have any gap filling features in the part number generator, hence the gap between "000002" and "000010" (Yes, they are also strings, but I can work with that...).
We could continue to increment based on the last value and ignore the gaps; however, in an attempt to use all the numbers available to us under our naming scheme, we'd like to be able to fill the gaps. Our naming scheme encodes the "product family" in the first two digits, such that [00]0000 is a different family from [02]0000.
I can find the starting and ending values using something like:
let query = `
  LET first = (
    MIN(
      FOR part in part_search
        SEARCH STARTS_WITH(part.PartNumber, #family)
        RETURN part.PartNumber
    )
  )
  LET last = (
    MAX(
      FOR part in part_search
        SEARCH STARTS_WITH(part.PartNumber, #family)
        RETURN part.PartNumber
    )
  )
  RETURN { first, last }
`
The above example returns: {first: "000000", last: "000915"}
Using ArangoDB and AQL, how could I go about finding these gaps? I've found some SQL examples but I feel the features of AQL are a bit more limiting.
Thanks in advance!
To start with, I think your best bet for getting min/max values is using aggregates:
FOR part in part_search
  SEARCH STARTS_WITH(part.PartNumber, #family)
  COLLECT x = 1
  AGGREGATE first = MIN(part.PartNumber), last = MAX(part.PartNumber)
  RETURN {
    first: first,
    last: last
  }
But that won't really help when trying to find gaps. And you're right - SQL has several logical constructs that could help (like using variables and cursor iteration), but even that would be a pattern I would discourage.
The better path might be to take a "brute force" approach - compare a list of your existing numbers against a list of all possible numbers and let a native set operation find the difference (the role a JOIN would play in SQL). Here's how you might do that in AQL:
LET allNumbers = 0..9999
LET existingParts = (
  FOR part in part_search
    SEARCH STARTS_WITH(part.PartNumber, #family)
    LET childId = RIGHT(part.PartNumber, 4)
    RETURN TO_NUMBER(childId)
)
RETURN MINUS(allNumbers, existingParts)
The x..y construct creates a sequence (an array of numbers), which we use as the full set of possible numbers. For the existing parts, we return only the "non-family" portion of the ID (I'm calling it "child"), converted to a number so it can be compared against that set. Finally, MINUS removes the elements of existingParts from the allNumbers list, leaving only the gaps.
One thing to note: that query returns only the "child" portion of the part number, so you would have to join it back to the family prefix later. Alternatively, you could skip the string-splitting and get "fancy" with your list creation:
LET allNumbers = TO_NUMBER(CONCAT(#family, '0000'))..TO_NUMBER(CONCAT(#family, '9999'))
LET existingParts = (
  FOR part in part_search
    SEARCH STARTS_WITH(part.PartNumber, #family)
    RETURN TO_NUMBER(part.PartNumber)
)
RETURN MINUS(allNumbers, existingParts)
I have a text field in my SQLite database that stores a Time value, but for unrelated reasons I can't change the data type to TIME.
The values are stored in HH:MM format, and I'm having trouble trying to sort results by time because the values below '10:00' are missing a leading zero. I would prefer not to store the data with leading zero for the same unrelated reasons.
I'd like to add something to the Query that would pad the missing character if necessary, causing the results to read '08:30' when collected. I've been searching through the command and function lexicon though and I'm not finding what I need.
Is there a simple way to do this inside a query?
Thanks
I think this would work:
select your_col,
       case when length(your_col) < 5 then '0' || your_col
            else your_col
       end
from your_table
Demo using Python
>>> conn.execute('''select c, case when length(c) < 5
then '0' || c else c end from t''').fetchall()
[(u'10:00', u'10:00'), (u'8:00', u'08:00')]
SELECT REPLACE(PRINTF('%5s', your_col), ' ', '0') FROM your_table
The PRINTF call pads the value with spaces until it's 5 characters, and the REPLACE call replaces those spaces with zeros.
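Either expression can also be used directly in the ORDER BY clause, so the stored values stay untouched while the results sort correctly. A small sketch, reusing the placeholder names your_table and your_col from above:
-- Sort by the zero-padded value without changing what is stored
SELECT your_col
FROM   your_table
ORDER  BY REPLACE(PRINTF('%5s', your_col), ' ', '0');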
I have a query calculation that should give me either a value (if conditions are met) or a blank/null value.
The code is in the following form:
if([attribute] > 3)
then ('value')
else ('')
At the moment the only way I could find to obtain this result is to use '' (i.e. an empty character string), but this is a value as well, so when I subsequently count the number of distinct values in another query I struggle to get the correct number (the empty string should be removed from the count, if present).
I can get the result with the following code:
if ('' in ([first_query].[attribute]))
then (count(distinct [attribute]) - 1)
else (count(distinct [attribute]))
How can I avoid this double calculation in all later queries involving the count of the attribute?
I use this Cognos function:
nullif(1, 1)
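A sketch of how that might slot into the original expression, assuming the intent is to return a null instead of '' when the condition fails:
if ([attribute] > 3)
then ('value')
else (nullif(1, 1)) /* nullif(1, 1) always evaluates to null */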
I found out that this can be managed using the case when function:
case
when ([attribute] > 3)
then ('value')
end
The difference is that case when doesn't need to list every possible option for handling the data; if it encounters a case that is not in the list, it simply returns a blank (null) cell.
Perfect for what I needed (and not as well documented on the web as the opposite case, i.e. dealing with null cases that should be zero).
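This also resolves the counting issue from the question: count(distinct ...) skips nulls, so no correction is needed afterwards. A sketch, where [calculated_attribute] is a placeholder for the query calculation above:
count(distinct [calculated_attribute])
/* rows where the case returned null are simply not counted */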
I currently have an assignment where I have to handle data from a lot of countries. My customer has given me a list of acceptable characters; let's call it:
'aber =*'
All other characters should just be changed to '_'.
I know the conversion for my country's specific chars (æøå), easily done with something like
select replace ('Ål', 'Å', 'AA') from dual;
But how would I go about removing all unwanted "noise" without splitting it up into a char-by-char comparison?
For example "bear*2 = fear" should become "bear*_ = _ear" as 2 and f are not in the accepted list.
Oracle 10g and up. As one approach, you can use the regular-expression function regexp_replace():
select regexp_replace('bear*2 = fear', '[^aber =*]', '_') as res
from dual
res
------------------------------
bear*_ = _ear
Find out more about the regexp_replace() function in the Oracle documentation.
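If you also want to keep the country-specific substitutions from the question (e.g. Å becoming AA), one possible sketch is to run replace() first and widen the character class so the substituted letters survive the second pass:
-- Sketch only: 'A' is added to the accepted list so the substituted
-- 'AA' is not turned into underscores in the second step.
select regexp_replace(replace('beÅr*2 = fear', 'Å', 'AA'),
                      '[^aberA =*]', '_') as res
from dual;
-- res: 'beAAr*_ = _ear'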
I am rolling up a huge table by counts into a new table, where I want to change all the empty strings to NULL and typecast some columns as well. I read through some of the posts and could not find a query that would let me do it across all the columns in a single query, without using multiple statements.
Let me know if it is possible to iterate across all columns and replace cells containing empty strings with NULL.
Ref: How to convert empty spaces into null values, using SQL Server?
To my knowledge there is no built-in function to replace empty strings across all columns of a table. You can write a plpgsql function to take care of that.
The following function replaces empty strings in all basic character-type columns of a given table with NULL. You can then cast to integer if the remaining strings are valid number literals.
CREATE OR REPLACE FUNCTION f_empty_text_to_null(_tbl regclass, OUT updated_rows int)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _typ CONSTANT regtype[] := '{text, bpchar, varchar}';  -- ARRAY of all basic character types
   _sql text;
BEGIN
   SELECT INTO _sql                       -- build SQL command
          'UPDATE ' || _tbl
          || E'\nSET ' || string_agg(format('%1$s = NULLIF(%1$s, '''')', col), E'\n ,')
          || E'\nWHERE ' || string_agg(col || ' = ''''', ' OR ')
   FROM  (
      SELECT quote_ident(attname) AS col
      FROM   pg_attribute
      WHERE  attrelid = _tbl      -- valid, visible, legal table name
      AND    attnum >= 1          -- exclude tableoid & friends
      AND    NOT attisdropped     -- exclude dropped columns
      AND    NOT attnotnull       -- exclude columns defined NOT NULL!
      AND    atttypid = ANY(_typ) -- only character types
      ORDER  BY attnum
      ) sub;

   -- RAISE NOTICE '%', _sql;  -- test?

   -- Execute
   IF _sql IS NULL THEN
      updated_rows := 0;  -- nothing to update
   ELSE
      EXECUTE _sql;
      GET DIAGNOSTICS updated_rows = ROW_COUNT;  -- report number of affected rows
   END IF;
END
$func$;
Call:
SELECT f_empty_text_to_null('mytable');
SELECT f_empty_text_to_null('myschema.mytable');
To get the result with the column name updated_rows:
SELECT * FROM f_empty_text_to_null('mytable');
Major points
Table name has to be valid and visible and the calling user must have all necessary privileges. If any of these conditions are not met, the function will do nothing - i.e. nothing can be destroyed, either. I cast to the object identifier type regclass to make sure of it.
The table name can be supplied as is ('mytable'), then the search_path decides. Or schema-qualified to pick a certain schema ('myschema.mytable').
Query the system catalog to get all (character-type) columns of the table. The provided function covers these basic character types: text, bpchar, varchar; extend the _typ array if you need more (like "char"). Only relevant columns are processed.
Use quote_ident() or format() to sanitize column names and safeguard against SQL injection.
The updated version uses the basic SQL aggregate function string_agg() to build the command string without looping, which is simpler and faster. And more elegant. :)
Has to use dynamic SQL with EXECUTE.
The updated version excludes columns defined NOT NULL and only updates each row once in a single statement, which is much faster for tables with multiple character-type columns.
Should work with any modern version of PostgreSQL. Tested with Postgres 9.1, 9.3, 9.5 and 13.
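For the typecasting part of the question: once the empty strings are NULL, a column that now holds only valid number literals can be converted in place. A sketch, assuming a table mytable with a character column col:
-- Hypothetical follow-up: convert the cleaned column to integer.
ALTER TABLE mytable
  ALTER COLUMN col TYPE integer USING col::integer;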