How to find pattern in file and comment that pattern in file using python - python-3.x

I want to find the pattern in a file that runs from "COMPRESS" to the closing ")" and comment it out.
My input file is as follows:
CREATE MULTISET TABLE TESTDB.testTbl ,FALLBACK ,
(
Local_Pd BIGINT NOT NULL,
Year_Id INTEGER NOT NULL,
par_t CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,
PB_Ind INTEGER COMPRESS(0,1,2,3,4,5,6,6))
UNIQUE PRIMARY INDEX ( Local_Pd ,Year_Id ,par_t,
PB_Ind);
Output file :
CREATE MULTISET TABLE TESTDB.testTbl ,FALLBACK ,
(
Local_Pd BIGINT NOT NULL,
Year_Id INTEGER NOT NULL,
par_t CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,
PB_Ind INTEGER /* COMPRESS(0,1,2,3,4,5,6,6) */ )
UNIQUE PRIMARY INDEX ( Local_Pd ,Year_Id ,par_t,
PB_Ind);

Something like this should work
import re
test_str = "CREATE MULTISET TABLE TESTDB.testTbl ,FALLBACK , ( Local_Pd BIGINT NOT NULL, Year_Id INTEGER NOT NULL, par_t CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC, PB_Ind INTEGER COMPRESS(0,1,2,3,4,5,6,6)) UNIQUE PRIMARY INDEX ( Local_Pd ,Year_Id ,par_t, PB_Ind);"
regex = r"(COMPRESS\([^\)]*\))"
t=re.sub(regex, r"/* \1 */", test_str)
print(t)
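The question asks about a file rather than a string, so here is a minimal sketch of the same substitution applied to file contents. The function names and the idea of writing to a separate output file are assumptions for illustration, not part of the original answer. Note that the pattern stops at the first ")", so it assumes the COMPRESS value list contains no nested parentheses and no space between COMPRESS and "(".

```python
import re
from pathlib import Path

# Same pattern as above: "COMPRESS(", then anything up to the first ")".
COMPRESS_RE = re.compile(r"(COMPRESS\([^)]*\))")


def comment_compress(sql_text: str) -> str:
    """Wrap every COMPRESS(...) clause in a /* ... */ comment."""
    return COMPRESS_RE.sub(r"/* \1 */", sql_text)


def comment_compress_in_file(src: str, dst: str) -> None:
    """Read src, comment out COMPRESS clauses, and write the result to dst.

    Both paths are hypothetical; pass the same path twice to edit in place.
    """
    text = Path(src).read_text()
    Path(dst).write_text(comment_compress(text))
```

Because the whole file is read into one string, line breaks in the DDL are preserved and a COMPRESS list that spans several lines is still matched.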

Replace One character in string with multiple characters in loop - ORACLE

I have a string that contains the same replaceable placeholder several times, for example:
Thi[$] is a strin[$] I am [$]ew to Or[$]cle
I need to replace each [$] with s, g, n, a respectively.
How can I do that? Please help.
There is a special PL/SQL function, UTL_LMS.FORMAT_MESSAGE.
You can use it in an inline PL/SQL function:
with function format(
str in varchar2
,s1 in varchar2 default null
,s2 in varchar2 default null
,s3 in varchar2 default null
,s4 in varchar2 default null
,s5 in varchar2 default null
,s6 in varchar2 default null
,s7 in varchar2 default null
,s8 in varchar2 default null
,s9 in varchar2 default null
,s10 in varchar2 default null
) return varchar2
as
begin
return utl_lms.format_message(replace(str,'[$]','%s'),s1,s2,s3,s4,s5,s6,s7,s8,s9,s10);
end;
select format('Thi[$] is a strin[$] I am [$]ew to Or[$]cle', 's','g','n','a') as res
from dual;
Result:
RES
-------------------------------------
This is a string I am new to Oracle
Here is a hand-rolled solution using a recursive WITH clause, and INSTR and SUBSTR functions to chop the string and inject the relevant letter at each juncture.
with rcte(str, sigils, occ) as (
select 'Thi[$] is a strin[$] I am [$]ew to Or[$]cle' as str
, 'sgna' as sigils
, 0 as occ
from dual
union all
select substr(str, 1, instr(str,'[$]',1,1)-1)||substr(sigils, occ+1, 1)||substr(str, instr(str,'[$]',1,1)+3) as str
, sigils
, occ+1 as occ
from rcte
where occ <= length(sigils)
)
select *
from rcte
where occ = length(sigils)
Here is a working demo on db<>fiddle.
However, it looks like @sayanm has provided a neater solution.
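For readers who find the recursive query hard to follow, the same chop-and-inject idea can be sketched in Python: on each pass, only the leftmost remaining placeholder is replaced with the next letter. The function name and parameters are made up for this illustration.

```python
def fill_placeholders(text: str, letters: str, placeholder: str = "[$]") -> str:
    """Replace the leftmost remaining placeholder with the next letter, one per pass."""
    for letter in letters:
        # A count of 1 makes str.replace touch only the first occurrence,
        # which mirrors instr(str, '[$]', 1, 1) in the recursive query.
        text = text.replace(placeholder, letter, 1)
    return text


print(fill_placeholders("Thi[$] is a strin[$] I am [$]ew to Or[$]cle", "sgna"))
# prints: This is a string I am new to Oracle
```

If there are fewer letters than placeholders, the leftover placeholders simply stay in the string.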
Consider this method, which lets the lookup values be table-based. See the comments within. The original string is split into rows using the placeholder as a delimiter. Then the rows are put back together using LISTAGG, joining on their order to the lookup table.
It is table-driven, so you can use as many placeholders as you want. The order still matters, of course, just as with the other answers.
-- First CTE just sets up source data
WITH tbl(str) AS (
SELECT 'Thi[$] is a strin[$] I am [$]ew to Or[$]cle' FROM dual
),
-- Lookup table. This does not have to be a CTE; it could be a normal table
-- in the database.
tbl_sub_values(ID, VALUE) AS (
SELECT 1, 's' FROM dual UNION ALL
SELECT 2, 'g' FROM dual UNION ALL
SELECT 3, 'n' FROM dual UNION ALL
SELECT 4, 'a' FROM dual
),
-- Split the source data using the placeholder as a delimiter
tbl_split(piece_id, str) AS (
SELECT LEVEL AS piece_id, REGEXP_SUBSTR(t.str, '(.*?)(\[\$\]|$)', 1, LEVEL, NULL, 1)
FROM tbl T
CONNECT BY LEVEL <= REGEXP_COUNT(t.str, '[$]') + 1
)
-- select * from tbl_split;
-- Put the string back together, joining with the lookup table
SELECT LISTAGG(str||tsv.value) WITHIN GROUP (ORDER BY piece_id) STRING
FROM tbl_split ts
LEFT JOIN tbl_sub_values tsv
ON ts.piece_id = tsv.id;
STRING
--------------------------------------------------------------------------------
This is a string I am new to Oracle
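The split-and-rejoin logic of this query can be sketched in Python as follows. This is only an illustration of the technique under assumed names: the dictionary stands in for the lookup table, and a missing key yields an empty string, which plays the role of the LEFT JOIN for the last piece.

```python
def fill_from_lookup(text: str, lookup: dict[int, str], placeholder: str = "[$]") -> str:
    """Split on the placeholder, then append the lookup value for each piece number."""
    pieces = text.split(placeholder)
    # Piece 1 gets lookup[1], piece 2 gets lookup[2], and so on.
    # The final piece has no matching id, so it gets "" (like the LEFT JOIN).
    return "".join(
        piece + lookup.get(piece_id, "")
        for piece_id, piece in enumerate(pieces, start=1)
    )


sub_values = {1: "s", 2: "g", 3: "n", 4: "a"}
print(fill_from_lookup("Thi[$] is a strin[$] I am [$]ew to Or[$]cle", sub_values))
# prints: This is a string I am new to Oracle
```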

SQL Server 2017 - Dynamically generate a string based on the number of columns in another string

I have the following table & data:
CREATE TABLE dbo.TableMapping
(
[GenericMappingKey] [nvarchar](256) NULL,
[GenericMappingValue] [nvarchar](256) NULL,
[TargetMappingKey] [nvarchar](256) NULL,
[TargetMappingValue] [nvarchar](256) NULL
)
INSERT INTO dbo.TableMapping
(
[GenericMappingKey]
,[GenericMappingValue]
,[TargetMappingKey]
,[TargetMappingValue]
)
VALUES
(
'Generic'
,'Col1Source|Col1Target;Col2Source|Col2Target;Col3Source|Col3Target;Col4Source|Col4Target;Col5Source|Col5Target;Col6Source|Col6Target'
,'Target'
,'Fruit|Apple;Car|Red;House|Bungalo;Gender|Female;Material|Brick;Solution|IT'
)
I would need to be able to automatically generate my GenericMappingValue string dynamically based on the number of column pairs in the TargetMappingValue column.
Currently, there are 6 column mapping pairs. However, if I only had two mapping column pairs in my TargetMapping such as the following...
'Fruit|Apple;Car|Red'
then I would like for the GenericMappingValue to be automatically generated (updated) such as the following since, as a consequence, I would only have 2 column pairs in my string...
'Col1Source|Col1Target;Col2Source|Col2Target'
I've started building the following query logic:
DECLARE @Mapping nvarchar(256)
SELECT @Mapping = [TargetMappingValue] from TableMapping
print @Mapping
SELECT count(*) ColumnPairCount
FROM String_split(@Mapping, ';')
The above query gives me a correct count of 6 for my column pairs.
How would I be able to continue my logic to achieve my automatically generated mapping string?
I think I understand what you are after. This should get you moving in the right direction.
Since you've tagged SQL Server 2017, you can use STRING_AGG().
You'll want to split your TargetMappingValue using STRING_SPLIT() with ROW_NUMBER() in a sub-query. (NOTE: The order is not guaranteed when using STRING_SPLIT() with ROW_NUMBER() here, but it will work for this situation. Example below using OPENJSON if we need to ensure accurate order.)
You can then use that ROW_NUMBER() as the column indicator/number in a CONCAT().
Then bring it all back together using STRING_AGG().
Have a look at this working example:
DECLARE @TableMapping TABLE
(
[GenericMappingKey] [NVARCHAR](256) NULL
, [GenericMappingValue] [NVARCHAR](256) NULL
, [TargetMappingKey] [NVARCHAR](256) NULL
, [TargetMappingValue] [NVARCHAR](256) NULL
);
INSERT INTO @TableMapping (
[GenericMappingKey]
, [GenericMappingValue]
, [TargetMappingKey]
, [TargetMappingValue]
)
VALUES ( 'Generic'
, 'Col1Source|Col1Target;Col2Source|Col2Target;Col3Source|Col3Target;Col4Source|Col4Target;Col5Source|Col5Target;Col6Source|Col6Target'
, 'Target'
, 'Fruit|Apple;Car|Red;House|Bungalo;Gender|Female;Material|Brick;Solution|IT' );
SELECT [col].[GenericMappingKey]
, STRING_AGG(CONCAT('Col', [col].[ColNumber], 'Source|Col', [col].[ColNumber], 'Target'), ';') AS [GeneratedGenericMappingValue]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
FROM (
SELECT *
, ROW_NUMBER() OVER ( ORDER BY (
SELECT 1
)
) AS [ColNumber]
FROM @TableMapping
CROSS APPLY STRING_SPLIT([TargetMappingValue], ';')
) AS [col]
GROUP BY [col].[GenericMappingKey]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue];
Here's an example of what an update would look like assuming your primary key is the GenericMappingKey column:
--This what an update would look like
--Assuming your primary key is the [GenericMappingKey] column
UPDATE [upd]
SET [upd].[GenericMappingValue] = [g].[GeneratedGenericMappingValue]
FROM (
SELECT [col].[GenericMappingKey]
, STRING_AGG(CONCAT('Col', [col].[ColNumber], 'Source|Col', [col].[ColNumber], 'Target'), ';') AS [GeneratedGenericMappingValue]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
FROM (
SELECT *
, ROW_NUMBER() OVER ( ORDER BY (
SELECT 1
)
) AS [ColNumber]
FROM @TableMapping
CROSS APPLY STRING_SPLIT([TargetMappingValue], ';')
) AS [col]
GROUP BY [col].[GenericMappingKey]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
) AS [g]
INNER JOIN @TableMapping [upd]
ON [upd].[GenericMappingKey] = [g].[GenericMappingKey];
Shnugo brings up a great point in the comments: we are not guaranteed sort order when using STRING_SPLIT() with ROW_NUMBER(). In this particular situation it doesn't matter, as the output mapping is generic. But if you needed to use elements from your "TargetMappingValue" column in the final "GenericMappingValue", you would need to make sure the sort order was accurate.
Here's an example showing how to use OPENJSON() and its "key", which guarantees that order, based on Shnugo's example:
SELECT [col].[GenericMappingKey]
, STRING_AGG(CONCAT('Col', [col].[colNumber], 'Source|Col', [col].[colNumber], 'Target'), ';') AS [GeneratedGenericMappingValue]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
FROM (
SELECT [tm].*
, [oj].[Key] + 1 AS [colNumber] --Use the key as our order/column number, adding 1 as it is zero based.
, [oj].[Value] -- and if needed we can bring the split value out.
FROM @TableMapping [tm]
CROSS APPLY OPENJSON('["' + REPLACE([tm].[TargetMappingValue], ';', '","') + '"]') [oj] --Basically turn the column value into JSON string.
) AS [col]
GROUP BY [col].[GenericMappingKey]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue];
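Stripped of the SQL machinery, the generation step is: count the pairs in the target string, then emit one ColNSource|ColNTarget pair per position. A minimal Python sketch of that logic, with a made-up function name and assuming a non-empty, semicolon-delimited input, could look like this:

```python
def generic_mapping(target_mapping_value: str) -> str:
    """Build 'Col1Source|Col1Target;...' with one pair per pair in the input."""
    # Assumes a non-empty string: an empty input would still count as one pair.
    pair_count = len(target_mapping_value.split(";"))
    return ";".join(
        f"Col{n}Source|Col{n}Target" for n in range(1, pair_count + 1)
    )


print(generic_mapping("Fruit|Apple;Car|Red"))
# prints: Col1Source|Col1Target;Col2Source|Col2Target
```

Only the number of pairs is used here, which is why the split order does not matter for this particular output.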
If the data is already in the table and you want to break it out into columns, this should work:
select
v.value
,left(v.value, charindex('|',v.value) -1) col1
,reverse(left(reverse(v.value), charindex('|',reverse(v.value)) -1)) col2
from String_split(@Mapping,';') v

How to tokenize string by delimiters in teradata

I have a list of values stored as a string of the form [val1, val2, val3] is there a way to tokenize this string and stack the values in Teradata 15, in the style of NVP? E.g.
select <magic function>(values,'[ , ]')
returns
col
------
Val1
Val2
Val3
This mainly depends on the actual values and delimiters.
If any of the chars ',[] ' are treated as delimiters:
SELECT *
FROM
TABLE (STRTOK_SPLIT_TO_TABLE(1, '[val1, val2, val3]', ',[] ')
RETURNS (keycol INT, tokennum INTEGER, token VARCHAR(100) CHARACTER SET UNICODE)) AS dt
For multi-character delimiters like yours ('[', ']', ', '), it is probably better to use REGEXP_SPLIT_TO_TABLE:
SELECT *
FROM
TABLE (REGEXP_SPLIT_TO_TABLE(1, '[val1, val2, val3]', '(\[|\]|, )', 'i')
RETURNS (keycol INT, tokennum INTEGER, token VARCHAR(100) CHARACTER SET UNICODE)) AS dt
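To see what a delimiter pattern of this kind does to the sample value, here is a small Python sketch using the same alternation. It illustrates the regex only and makes no claim about how Teradata handles empty tokens; the function name is made up, and the empty strings produced at the leading "[" and trailing "]" are filtered out explicitly.

```python
import re


def tokenize(value: str) -> list[str]:
    """Split on '[', ']' or ', ' and drop the empty tokens at the edges."""
    tokens = re.split(r"\[|\]|, ", value)
    return [token for token in tokens if token]


print(tokenize("[val1, val2, val3]"))
# prints: ['val1', 'val2', 'val3']
```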

In T-SQL, can I compare two strings, "MY String" and "my string", and show they are different?

I need to run a query between two tables and find non-matching fields.
In table 1, the field locations has "my String".
In table 2, the field locations has "MY string".
They are equal as text but differ in capitalization; I need the comparison to return false for this.
Having the following data:
DECLARE @TableOne TABLE
(
[ID] TINYINT
,[Value] VARCHAR(12)
)
DECLARE @TableTwo TABLE
(
[ID] TINYINT
,[Value] VARCHAR(12)
)
INSERT INTO @TableOne ([ID], [Value])
VALUES (1,'my String')
INSERT INTO @TableTwo ([ID], [Value])
VALUES (1,'MY String')
You can specify a case-sensitive collation like this:
SELECT [TO].[Value]
,[TW].[Value]
FROM @TableOne [TO]
INNER JOIN @TableTwo [TW]
ON [TO].[ID] = [TW].[ID]
AND [TO].[Value] <> [TW].[Value]
COLLATE Latin1_General_CS_AS
or use HASH functions like this:
SELECT [TO].[Value]
,[TW].[Value]
FROM @TableOne [TO]
INNER JOIN @TableTwo [TW]
ON [TO].[ID] = [TW].[ID]
WHERE HASHBYTES('SHA1', [TO].[Value]) <> HASHBYTES('SHA1', [TW].[Value])
DECLARE @Table1 AS TABLE (FieldName VARCHAR(100))
DECLARE @Table2 AS TABLE (FieldName VARCHAR(100))
INSERT INTO @Table1 (FieldName) VALUES ('MY Location')
INSERT INTO @Table2 (FieldName) VALUES ('My Location')
With a default case-insensitive collation, this matches and returns results:
SELECT * FROM @Table1 AS T1
INNER JOIN @Table2 AS T2
ON T1.FieldName = T2.FieldName
With a case-sensitive collation specified, this will not match:
SELECT * FROM @Table1 AS T1
INNER JOIN @Table2 AS T2
ON T1.FieldName = T2.FieldName COLLATE Latin1_General_CS_AS_KS_WS
Microsoft article on collation

Using Case to match strings in sql server?

I am trying to use CASE in a SQL SELECT statement so that I can use the length of one string to produce the result for another string. This is for non-matched records from two data sets that share a common ID but have a different Data Source.
Case statement is below:
Select Column1, Column2,
Case
When Column1 = 'Something" and Len(Column2) = '35' Then Column1 = "Something Else" and substring(Column2, 1, 35)
End as Column3
From dbo.xxx
When I run it I get the following error:
Msg 102, Level 15, State 1, Line 5 Incorrect syntax near '='.
You need to have a value for each WHEN, and ought to have an ELSE:
Select Data_Source, CustomerID,
CASE
WHEN Data_Source = 'Test1' and Len(CustomerName) = 35 THEN 'First Value'
WHEN Data_Source = 'Test2' THEN substring(CustomerName, 1, 35)
ELSE 'Sorry, no match.'
END AS CustomerName
From dbo.xx
FYI: Len() doesn't return a string.
EDIT:
A SQL Server answer that addresses some of the comments might be:
declare @DataSource as Table ( Id Int Identity, CustomerName VarChar(64) )
declare @VariantDataSource as Table ( Id Int Identity, CostumerName VarChar(64) )
insert into @DataSource ( CustomerName ) values ( 'Alice B.' ), ( 'Bob C.' ), ( 'Charles D.' )
insert into @VariantDataSource ( CostumerName ) values ( 'Blush' ), ( 'Dye' ), ( 'Pancake Base' )
select *,
-- Output the CostumerName padded or trimmed to the same length as CustomerName. NULLs are not handled gracefully.
Substring( CostumerName + Replicate( '.', Len( CustomerName ) ), 1, Len( CustomerName ) ) as Clustermere,
-- Output the CostumerName padded or trimmed to the same length as CustomerName. NULLs in CustomerName are explicitly handled.
case
when CustomerName is NULL then ''
when Len( CustomerName ) > Len( CostumerName ) then Substring( CostumerName, 1, Len( CustomerName ) )
else Substring( CostumerName + Replicate( '.', Len( CustomerName ) ), 1, Len( CustomerName ) )
end as 'Crustymore'
from @DataSource as DS inner join
@VariantDataSource as VDS on VDS.Id = DS.Id
Select
Column1,
Column2,
Case
When Column1 = 'Something' and Len(Column2) = 35
Then 'Something Else' + substring(Column2, 1, 35)
End as Column3
From dbo.xxx
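Changes made to your query:
use '+' for string concatenation
Len() returns an int, so there is no need to put 35 in quotes
remove "Column1 =" after THEN; the THEN branch must be a value, not an assignment
replace double quotes ("") with single quotes ('')
Hope this helps.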
