Finding PUA characters anywhere in column?

I asked a similar question here: sqlite3 run sql - select all with PUA characters.
The answer:
SELECT *
FROM TableName
WHERE (ColumnName >= '' AND ColumnName < '豈')
OR (ColumnName >= '󰀀' AND ColumnName < '󿿾')
OR (ColumnName >= '􀀀' AND ColumnName < '􏿾');
Only works with entries that start with a PUA character.
I'm trying to find these characters anywhere inside the entries (i.e., the way LIKE '%...%' would), but I can't figure out how to do it other than with the query above.
Ideas?

To search for character ranges, you need GLOB:
SELECT *
FROM MyTable
WHERE ColumnName GLOB '*[-]*'
OR ColumnName GLOB '*[󰀀-󿿽]*'
OR ColumnName GLOB '*[􀀀-􏿽]*';
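A minimal sketch of the GLOB approach, run through Python's built-in sqlite3 module against an in-memory database. The sample rows are made up for illustration, and the range endpoints are written as escape sequences so the otherwise invisible private-use code points are explicit.

```python
import sqlite3

# In-memory database; table and column names follow the query above,
# the sample rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ColumnName TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?)",
    [
        ("plain ascii",),
        ("middle \ue000 bmp pua",),        # U+E000, BMP private use area
        ("end plane 15 \U000f0000",),      # U+F0000, plane 15 private use
        ("\U00100000 start plane 16",),    # U+100000, plane 16 private use
        ("cjk \u8c48 not pua",),           # ordinary CJK ideograph
    ],
)

# One GLOB pattern per private-use range; '*' on both sides means "anywhere".
patterns = [
    "*[\ue000-\uf8ff]*",
    "*[\U000f0000-\U000ffffd]*",
    "*[\U00100000-\U0010fffd]*",
]
rows = conn.execute(
    "SELECT ColumnName FROM MyTable "
    "WHERE ColumnName GLOB ? OR ColumnName GLOB ? OR ColumnName GLOB ? "
    "ORDER BY rowid",
    patterns,
).fetchall()
matches = [r[0] for r in rows]
```

Passing the patterns as bound parameters avoids pasting private-use characters into the SQL text, where they are easy to lose when copying.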

Related

how to modify textfile using U-SQL

I have a large file of around 130 MB in which each line contains 10 A characters followed by a \t after the 10th A. I want to extract this text file and then change all the A's to B's. Can anyone help with a code snippet?
This is what I have written so far:
USE DATABASE imodelanalytics;
@searchlog =
EXTRACT characters string
FROM "/iModelAnalytics/Samples/Data/dummy.txt"
USING Extractors.Text(delimiter: '\t', skipFirstNRows: 1);
@modify =
SELECT characters AS line
FROM @searchlog;
OUTPUT @modify
TO "/iModelAnalytics/Samples/Data/B.txt"
USING Outputters.Text();
I'm new to this, so any suggestions will be helpful! Thanks

Assuming every field value is AAAAAAAAAA, you could write:
@modify = SELECT "BBBBBBBBBB" AS characters FROM @searchlog;
If only some values are all A's, you would handle it in the SELECT clause:
@modify =
SELECT (characters == "AAAAAAAAAA" ? "BBBBBBBBBB" : characters) AS characters
FROM @searchlog;
If there are other characters around the AAAAAAAAAA then you would use more of the C# string functions to find them and replace them in a similar pattern.
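As a rough illustration of that last case, here is the per-line logic sketched in Python rather than U-SQL. The function name and sample lines are hypothetical. In U-SQL itself the equivalent would presumably be a C# string call such as characters.Replace("A", "B") inside the SELECT, which is an assumption and not something the answer spells out.

```python
def replace_a_with_b(line: str) -> str:
    """Replace every 'A' in a line with 'B', leaving tabs and other text alone."""
    return line.replace("A", "B")


# Made-up sample lines: one all A's, one with surrounding text, one without A's.
sample = ["AAAAAAAAAA\t", "xxAAAAAAAAAAyy\t", "no match\t"]
converted = [replace_a_with_b(line) for line in sample]
```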

how to use like and substring in where clause in sql

I hope someone can help me and explain this:
why does the first query return a result but the second does not?
First query:
select name from Items where name like '%abc%'
Second query:
select name from Items where name like substring('''%abc%''',1,10)
Why does the first return a result while the second returns nothing, given that
substring('''%abc%''',1,10)='%abc%'
If there is a logic behind that, is there another approach to do something like the second query?
My purpose is to transform a string like '''abc''' to 'abc' in order to use it in a LIKE statement.

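You can concatenate strings to form your LIKE string. To strip the wrapping characters from a string, use the SUBSTRING and LEN functions. The following example assumes your match string is in a variable called @input whose value is '%abc%' including the quote marks, which is what the literal '''%abc%''' evaluates to (each doubled quote inside a literal stands for one quote). Those quote marks are part of the pattern, which is why the second query matches nothing. SUBSTRING(@input, 3, LEN(@input) - 4) drops the first two and last two characters (the quote and the %), and the % wildcards are then concatenated back on:
select name from Items where name like '%' + SUBSTRING(@input, 3, LEN(@input) - 4) + '%'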
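The effect can be reproduced in a small sketch using Python's sqlite3 module. SQLite is a stand-in for the question's database here, so substr, length and || replace SUBSTRING, LEN and +, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (name TEXT)")
conn.executemany("INSERT INTO Items VALUES (?)", [("xxabcxx",), ("other",)])

# The value that the SQL literal '''%abc%''' evaluates to: quotes included.
quoted = "'%abc%'"

# Using the quoted value directly: the quote marks become part of the pattern.
with_quotes = conn.execute(
    "SELECT name FROM Items WHERE name LIKE ?", (quoted,)
).fetchall()

# Dropping the first two and last two characters, then re-adding the wildcards.
stripped = conn.execute(
    "SELECT name FROM Items "
    "WHERE name LIKE '%' || substr(?, 3, length(?) - 4) || '%'",
    (quoted, quoted),
).fetchall()
```

The first query returns no rows because no name contains a literal quote mark; the second matches xxabcxx.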

How to remove percent character from a string in Cognos?

I have a string field with mostly numeric values like 13.4, but some have 13.4%. I am trying to use the following expression to remove the % symbols and retain just the numeric values to convert the field to integer.
Here is what I have so far in the expression definition of Cognos 8 Report Studio:
IF(POSITION('%' IN [FIELD1]) = NULL) THEN
/*** this captures rows with valid data **/
([FIELD1])
ELSE
/** trying to remove the % sign from rows with data like this 13.4% **/
(SUBSTRING([FIELD1]), 1, POSITION('%' IN [FIELD1])))
Any hints/help is much appreciated.

An easy way to do this is to use the trim() function. The following will remove any trailing % characters:
TRIM(trailing '%',[FIELD1])

The approach you are using is feasible. However, the syntax you are using is not compatible with the version of Report Studio that I'm familiar with. Below you will find an updated expression which works for me.
IF ( POSITION( '%'; [FIELD1]) = 0) THEN
( [FIELD1] )
ELSE
( SUBSTRING( [FIELD1]; 1; POSITION( '%'; [FIELD1]) - 1 ) )
Since character positions in strings are 1-based in Cognos, it's important to subtract 1 from the position returned by POSITION(). Otherwise the substring would still include the percent sign and cut off only the characters after it.
Another note: what you are doing here is data cleansing. It's usually more advantageous to push these chores down to a lower level of the data retrieval chain, e.g. the Data Warehouse or at least the Framework Manager model, so that at the reporting level you can use this field as numeric field directly.
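If the cleansing is pushed down to a lower level as suggested, the same find-and-cut logic might look like the following Python sketch. The function name is hypothetical. Note that Python's str.find is 0-based and returns -1 when nothing is found, so no "- 1" adjustment is needed here, unlike with the 1-based POSITION().

```python
def strip_percent(value: str) -> float:
    """Return the numeric part of strings such as '13.4' or '13.4%'."""
    pos = value.find("%")          # -1 when there is no percent sign
    if pos == -1:
        return float(value)        # already a plain number
    return float(value[:pos])      # keep everything before the '%'


cleaned = [strip_percent(v) for v in ["13.4", "13.4%", "7%"]]
```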

How to search for unicode characters in records of DB2?

I have a table in DB2, say METAATTRIBUTE, in which a column, say "content", might contain any special character, including Unicode characters.
For any special character, e.g. "#", I can simply search with:
Select * from METAATTRIBUTE where content like '%#%';
But how do I search for Unicode characters like U+201B or U+201E?
Thanks in advance.

Assuming you are talking about DB2 LUW, the Unicode string literals are designated by the symbols "u&", followed by a regular string literal in single quotes. Unicode code points are designated by an escape character, backslash by default. For example:
$ db2 "values u&'\201b'"
1
---
‛
1 record(s) selected.
So your query would look like:
Select * from METAATTRIBUTE where content like u&'%\201b%';

I recently had the same problem. This worked for me:
select *
from METAATTRIBUTE
where MEDEDELINGSZONE like '%' || UX'201B' || '%'
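Both answers build the LIKE pattern from a code point instead of pasting the character itself. The same idea is sketched below with Python's sqlite3 module; SQLite is only a stand-in for DB2 here, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE METAATTRIBUTE (content TEXT)")
conn.executemany(
    "INSERT INTO METAATTRIBUTE VALUES (?)",
    [("quote \u201b here",), ("low quote \u201e here",), ("plain # text",)],
)

# Build the LIKE pattern from the code point U+201B, then bind it as a parameter.
pattern = "%" + chr(0x201B) + "%"
hits = [
    row[0]
    for row in conn.execute(
        "SELECT content FROM METAATTRIBUTE WHERE content LIKE ?", (pattern,)
    )
]
```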

Replace empty strings with null values

I am rolling up a huge table by counts into a new table, where I want to change all the empty strings to NULL, and typecast some columns as well. I read through some of the posts and I could not find a query, which would let me do it across all the columns in a single query, without using multiple statements.
Let me know if it is possible for me to iterate across all columns and replace cells with empty strings with null.
Ref: How to convert empty spaces into null values, using SQL Server?

To my knowledge there is no built-in function to replace empty strings across all columns of a table. You can write a plpgsql function to take care of that.
The following function replaces empty strings in all basic character-type columns of a given table with NULL. You can then cast to integer if the remaining strings are valid number literals.
CREATE OR REPLACE FUNCTION f_empty_text_to_null(_tbl regclass, OUT updated_rows int)
LANGUAGE plpgsql AS
$func$
DECLARE
_typ CONSTANT regtype[] := '{text, bpchar, varchar}'; -- ARRAY of all basic character types
_sql text;
BEGIN
SELECT INTO _sql -- build SQL command
'UPDATE ' || _tbl
|| E'\nSET ' || string_agg(format('%1$s = NULLIF(%1$s, '''')', col), E'\n ,')
|| E'\nWHERE ' || string_agg(col || ' = ''''', ' OR ')
FROM (
SELECT quote_ident(attname) AS col
FROM pg_attribute
WHERE attrelid = _tbl -- valid, visible, legal table name
AND attnum >= 1 -- exclude tableoid & friends
AND NOT attisdropped -- exclude dropped columns
AND NOT attnotnull -- exclude columns defined NOT NULL!
AND atttypid = ANY(_typ) -- only character types
ORDER BY attnum
) sub;
-- RAISE NOTICE '%', _sql; -- test?
-- Execute
IF _sql IS NULL THEN
updated_rows := 0; -- nothing to update
ELSE
EXECUTE _sql;
GET DIAGNOSTICS updated_rows = ROW_COUNT; -- Report number of affected rows
END IF;
END
$func$;
Call:
SELECT f_empty_text_to_null('mytable');
SELECT f_empty_text_to_null('myschema.mytable');
To also get the column name updated_rows:
SELECT * FROM f_empty_text_to_null('mytable');
Major points
Table name has to be valid and visible and the calling user must have all necessary privileges. If any of these conditions are not met, the function will do nothing - i.e. nothing can be destroyed, either. I cast to the object identifier type regclass to make sure of it.
The table name can be supplied as is ('mytable'), then the search_path decides. Or schema-qualified to pick a certain schema ('myschema.mytable').
Query the system catalog to get all (character-type) columns of the table. The provided function uses these basic character types: text, bpchar, varchar. Only relevant columns are processed.
Use quote_ident() or format() to sanitize column names and safeguard against SQLi.
The updated version uses the basic SQL aggregate function string_agg() to build the command string without looping, which is simpler and faster. And more elegant. :)
Has to use dynamic SQL with EXECUTE.
The updated version excludes columns defined NOT NULL and only updates each row once in a single statement, which is much faster for tables with multiple character-type columns.
Should work with any modern version of PostgreSQL. Tested with Postgres 9.1, 9.3, 9.5 and 13.
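The same strategy (read the column list from the catalog, build a single UPDATE with NULLIF, run it once) can be sketched outside PostgreSQL as well. The version below uses Python's sqlite3 module with PRAGMA table_info in place of pg_attribute. It is only an analogue under those assumptions, not a port of the function above: the helper name is hypothetical, and the type check is simplified to exact matches on the declared type.

```python
import sqlite3


def empty_text_to_null(conn: sqlite3.Connection, table: str) -> int:
    """Set '' to NULL in every nullable text column of `table`; return rows updated."""

    def quote(name: str) -> str:
        # Double-quote identifiers so odd names cannot break the statement.
        return '"' + name.replace('"', '""') + '"'

    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    cols = [
        quote(name)
        for _, name, ctype, notnull, _, _ in conn.execute(
            f"PRAGMA table_info({quote(table)})"
        )
        if ctype.upper() in ("TEXT", "VARCHAR", "CHAR") and not notnull
    ]
    if not cols:
        return 0  # nothing to update
    sets = ", ".join(f"{c} = NULLIF({c}, '')" for c in cols)
    where = " OR ".join(f"{c} = ''" for c in cols)
    cur = conn.execute(f"UPDATE {quote(table)} SET {sets} WHERE {where}")
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a TEXT, b TEXT NOT NULL, n INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("", "", 1), ("x", "", 2), ("", "y", 3)],
)
updated = empty_text_to_null(conn, "mytable")
result = conn.execute("SELECT a, b, n FROM mytable ORDER BY n").fetchall()
```

As in the original, NOT NULL columns are skipped and each row is touched at most once, because all columns are handled in one UPDATE.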
