How to find a specific mask within a string - Oracle?

I have a field in a table that can hold different kinds of values.
Examples:
Row 1 - (2012,2013)
Row 2 - 8871
Row 3 - 01/04/2012
Row 4 - 'NULL'
I have to identify the rows whose string matches the date mask 'dd/mm/yyyy', like Row 3, so that I can apply a TO_DATE function to them.
Any idea how I can search for a mask within the field?
Thanks a lot

Sounds like a data model problem (storing a date in a string).
But, since it happens and we sometimes can't control or change things, I usually keep a function around like this one:
CREATE OR REPLACE FUNCTION safe_to_date (p_string      IN VARCHAR2,
                                         p_format_mask IN VARCHAR2,
                                         p_error_date  IN DATE DEFAULT NULL)
  RETURN DATE
  DETERMINISTIC IS
  x_date DATE;
BEGIN
  BEGIN
    x_date := TO_DATE (p_string, p_format_mask);
    RETURN x_date; -- Only gets here if conversion was successful
  EXCEPTION
    WHEN OTHERS THEN
      RETURN p_error_date;
  END;
END safe_to_date;
Then use it like this:
WITH d AS
(SELECT 'X' string_field FROM DUAL
UNION ALL
SELECT '11/15/2012' FROM DUAL
UNION ALL
SELECT '155' FROM DUAL)
SELECT safe_to_date (d.string_field, 'MM/DD/YYYY')
FROM d;

Oracle 11g R2 Schema Setup:
CREATE TABLE Test ( id, VALUE ) AS
SELECT 'Row 1', '(2012,2013)' FROM DUAL
UNION ALL SELECT 'Row 2', '8871' FROM DUAL
UNION ALL SELECT 'Row 3', '01/04/2012' FROM DUAL
UNION ALL SELECT 'Row 4', NULL FROM DUAL
UNION ALL SELECT 'Row 5', '99,99,2015' FROM DUAL
UNION ALL SELECT 'Row 6', '32/12/2015' FROM DUAL
UNION ALL SELECT 'Row 7', '29/02/2015' FROM DUAL
UNION ALL SELECT 'Row 8', '29/02/2016' FROM DUAL
/
Query 1 - You can check with a regular expression:
SELECT *
FROM TEST
WHERE REGEXP_LIKE( VALUE, '^\d{2}/\d{2}/\d{4}$' )
Results:
| ID | VALUE |
|-------|------------|
| Row 3 | 01/04/2012 |
| Row 6 | 32/12/2015 |
| Row 7 | 29/02/2015 |
| Row 8 | 29/02/2016 |
Query 2 - You can make the regular expression more complicated to catch more invalid dates:
SELECT *
FROM TEST
WHERE REGEXP_LIKE( VALUE, '^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/\d{4}$' )
Results:
| ID | VALUE |
|-------|------------|
| Row 3 | 01/04/2012 |
| Row 7 | 29/02/2015 |
| Row 8 | 29/02/2016 |
Query 3 - But the best way is to try and convert the value to a date and see if there is an exception:
CREATE OR REPLACE FUNCTION is_Valid_Date(
datestr VARCHAR2,
format VARCHAR2 DEFAULT 'DD/MM/YYYY'
) RETURN NUMBER DETERMINISTIC
AS
x DATE;
BEGIN
IF datestr IS NULL THEN
RETURN 0;
END IF;
x := TO_DATE( datestr, format );
RETURN 1;
EXCEPTION
WHEN OTHERS THEN
RETURN 0;
END;
/
SELECT *
FROM TEST
WHERE is_Valid_Date( VALUE ) = 1
Results:
| ID | VALUE |
|-------|------------|
| Row 3 | 01/04/2012 |
| Row 8 | 29/02/2016 |
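For comparison, the same "check the shape, then try to convert" idea can be sketched outside the database. This is only an illustration in Python, not part of the Oracle answers above; the names `is_valid_date` and `DATE_SHAPE` are assumptions made for this sketch, and the sample rows are copied from the Test table.

```python
import re
from datetime import datetime

# Shape check equivalent to Query 1: two digits, slash, two digits, slash, four digits.
DATE_SHAPE = re.compile(r"\d{2}/\d{2}/\d{4}")

def is_valid_date(text, fmt="%d/%m/%Y"):
    """Return True only if text has the dd/mm/yyyy shape and is a real calendar date."""
    if text is None or not DATE_SHAPE.fullmatch(text):
        return False
    try:
        # strptime raises ValueError for impossible dates such as 32/12/2015.
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False

# Sample values copied from the Test table above.
rows = {
    "Row 1": "(2012,2013)",
    "Row 2": "8871",
    "Row 3": "01/04/2012",
    "Row 4": None,
    "Row 5": "99,99,2015",
    "Row 6": "32/12/2015",
    "Row 7": "29/02/2015",
    "Row 8": "29/02/2016",
}

valid_rows = [row_id for row_id, value in rows.items() if is_valid_date(value)]
print(valid_rows)
```

As with Query 3, only Row 3 and Row 8 pass, because the conversion step rejects 29/02/2015 (2015 is not a leap year) as well as 32/12/2015.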

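You can use the LIKE operator to match the shape of the pattern. Each underscore matches any single character, so this does not check that the value is a valid date:
where possible_date_field like '__/__/____';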

Related

How can I filter for a specific date on a CQL timestamp column?

I have a table defined as:
CREATE TABLE downtime(
asset_code text,
down_start timestamp,
down_end timestamp,
down_duration duration,
down_type text,
down_reason text,
PRIMARY KEY ((asset_code, down_start), down_end)
);
I'd like to get downtime on a particular day, such as:
SELECT * FROM downtime \
WHERE asset_code = 'CA-PU-03-LB' \
AND todate(down_start) = '2022-12-11';
I got a syntax error:
SyntaxException: line 1:66 no viable alternative at input '(' (...where asset_code = 'CA-PU-03-LB' and [todate](...)
If functions are not allowed on a partition key in the WHERE clause, how can I get the rows whose "down_start" falls on a particular day?
You don't need to use the TODATE() function to filter for a specific date. You can simply specify the date as '2022-12-11' when applying a filter on a CQL timestamp column.
The catch is that you cannot use the equality operator (=): the CQL timestamp data type is encoded as the number of milliseconds since the Unix epoch (Jan 1, 1970 00:00 GMT), so an equality filter matches only one exact millisecond and you need to be precise when working with timestamps.
Let me illustrate using this example table:
CREATE TABLE tstamps (
id int,
tstamp timestamp,
colour text,
PRIMARY KEY (id, tstamp)
)
My table contains the following sample data:
cqlsh> SELECT * FROM tstamps ;
id | tstamp | colour
----+---------------------------------+--------
1 | 2022-12-05 11:25:01.000000+0000 | red
1 | 2022-12-06 02:45:04.564000+0000 | yellow
1 | 2022-12-06 11:06:48.119000+0000 | orange
1 | 2022-12-06 19:02:52.192000+0000 | green
1 | 2022-12-07 01:48:07.870000+0000 | blue
1 | 2022-12-07 03:13:27.313000+0000 | indigo
The cqlsh client formats the tstamp column into a human-readable date in UTC. But really, the tstamp values are stored as integers:
cqlsh> SELECT tstamp, TOUNIXTIMESTAMP(tstamp) FROM tstamps ;
tstamp | system.tounixtimestamp(tstamp)
---------------------------------+--------------------------------
2022-12-05 11:25:01.000000+0000 | 1670239501000
2022-12-06 02:45:04.564000+0000 | 1670294704564
2022-12-06 11:06:48.119000+0000 | 1670324808119
2022-12-06 19:02:52.192000+0000 | 1670353372192
2022-12-07 01:48:07.870000+0000 | 1670377687870
2022-12-07 03:13:27.313000+0000 | 1670382807313
To retrieve the rows for a specific date, you need to specify the range of timestamps that fall on that date. For example, the timestamps for 6 Dec 2022 UTC range from 1670284800000 (2022-12-06 00:00:00.000 UTC) to 1670371199999 (2022-12-06 23:59:59.999 UTC).
This means if we want to query for December 6, we need to filter using a range query:
SELECT * FROM tstamps \
WHERE id = 1 \
AND tstamp >= '2022-12-06' \
AND tstamp < '2022-12-07';
and we get:
id | tstamp | colour
----+---------------------------------+--------
1 | 2022-12-06 02:45:04.564000+0000 | yellow
1 | 2022-12-06 11:06:48.119000+0000 | orange
1 | 2022-12-06 19:02:52.192000+0000 | green
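The millisecond bounds quoted for 6 Dec 2022 can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper name `utc_day_bounds_ms` is an assumption for this example, and the sample timestamps are the TOUNIXTIMESTAMP values listed earlier.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def utc_day_bounds_ms(day):
    """Return (inclusive start, exclusive end) of a UTC day in epoch milliseconds."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    one_ms = timedelta(milliseconds=1)
    # Integer division of timedeltas keeps the arithmetic exact.
    return (start - EPOCH) // one_ms, (end - EPOCH) // one_ms

lower, upper = utc_day_bounds_ms(date(2022, 12, 6))

# Epoch-millisecond values taken from the TOUNIXTIMESTAMP output above.
stored = [
    1670239501000,
    1670294704564,
    1670324808119,
    1670353372192,
    1670377687870,
    1670382807313,
]

# Same half-open filter as the CQL query: tstamp >= start AND tstamp < next day.
on_dec_6 = [ts for ts in stored if lower <= ts < upper]
print(lower, upper - 1, on_dec_6)
```

Using a half-open range (>= start, < next day) avoids having to spell out 23:59:59.999 as the upper bound.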
WARNING - In your case where the timestamp column is part of the partition key, performing a range query is dangerous because it results in a multi-partition query -- there are 86M possible values between 1670284800000 and 1670371199999. For this reason, timestamps are not a good choice for partition keys. Cheers!

Difference between 2 consecutive values in Kusto

I have the following script:
let StartTime = datetime(2022-02-18 10:10:00 AM);
let EndTime = datetime(2022-02-18 10:15:00 AM);
MachineEvents
| where Timestamp between (StartTime .. EndTime)
| where Id == "00112233" and Name == "Higher"
| top 2 by Timestamp
| project Timestamp, Value
I got the following result:
What I am trying to achieve after that is to check whether the last Value received (in this case, for example, 15451.433) is less than 30,000. If that condition is true, I should then check the difference between the last two consecutive values (in this case: 15451.433 - 15457.083). If the difference is < 0, the result should be true; otherwise it should be false. In other words, Value should come back as a boolean instead of a double as shown in the figure.
datatable(Timestamp:datetime, Value:double)
[
datetime(2022-02-18 10:15:00 AM), 15457.083,
datetime(2022-02-18 10:14:00 AM), 15451.433,
datetime(2022-02-18 10:13:00 AM), 15433.333,
datetime(2022-02-18 10:12:00 AM), 15411.111
]
| top 2 by Timestamp
| project Timestamp, Value
| extend nextValue=next(Value)
| extend finalResult = iff(Value < 30000, nextValue - Value < 0, false)
| top 1 by Timestamp
| project finalResult
Output:
finalResult
1
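To make the logic of that query easier to follow, here is the same expression written as a small Python sketch. It is an illustration only, not Kusto; the function name `final_result` and the choice to return False when fewer than two readings exist are assumptions of this sketch.

```python
from datetime import datetime

def final_result(readings, limit=30000.0):
    """Mirror the query: take the two latest rows, then evaluate
    Value < limit and (nextValue - Value) < 0 on the latest row."""
    latest_two = sorted(readings, key=lambda r: r[0], reverse=True)[:2]
    if len(latest_two) < 2:
        # Assumption: with no second row to compare against, report False.
        return False
    (_, value), (_, next_value) = latest_two
    return value < limit and (next_value - value) < 0

# Same sample rows as the datatable above.
readings = [
    (datetime(2022, 2, 18, 10, 15), 15457.083),
    (datetime(2022, 2, 18, 10, 14), 15451.433),
    (datetime(2022, 2, 18, 10, 13), 15433.333),
    (datetime(2022, 2, 18, 10, 12), 15411.111),
]

print(final_result(readings))
```

Here the latest value is 15457.083 and the one before it is 15451.433, so the difference 15451.433 - 15457.083 is negative and the result is true, matching the output of 1.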
You can use the prev() function (or next()) to process the values in the other rows.
...
| extend previous = prev(Value)
| extend diff = Value - previous
| extend isPositive = diff > 0
You might need to use serialize if you don't have something like top that already does that for you.

ROUNDUP Function giving error as Invalid Identifier in oracle

I want to use Excel's ROUNDUP formula in my Oracle procedure, but when I use it I get the error
ROUNDUP is Invalid Identifier.
Below is my code:
SELECT ROUNDUP(15/30) FROM DUAL;
Please suggest how I can use this.
You cannot, ROUNDUP is not an Oracle function (which is why you get the invalid identifier error).
You could instead use CEIL.
SELECT CEIL(15/30) FROM DUAL;
| CEIL(15/30) |
| ----------: |
| 1 |
If you want to round up to a given precision then you could create a user-defined function:
CREATE FUNCTION roundup(
value IN NUMBER,
precision IN PLS_INTEGER DEFAULT 0
) RETURN NUMBER DETERMINISTIC
IS
BEGIN
IF precision = 0 THEN
RETURN CEIL( value );
ELSE
RETURN CEIL( value * POWER( 10, precision ) ) / POWER( 10, precision );
END IF;
END;
/
Then:
SELECT ROUNDUP(0.56789),
ROUNDUP(0.56789, 1),
ROUNDUP(0.56789, 2),
ROUNDUP(0.56789, -1)
FROM DUAL;
Outputs:
ROUNDUP(0.56789) | ROUNDUP(0.56789,1) | ROUNDUP(0.56789,2) | ROUNDUP(0.56789,-1)
---------------: | -----------------: | -----------------: | ------------------:
1 | .6 | .57 | 10
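The arithmetic behind the user-defined function (scale, take the ceiling, scale back) can be checked with a short Python sketch. This is an illustration only; `roundup` here is a hypothetical Python helper, and `Decimal` is used as an assumption of this sketch to avoid binary floating-point artifacts.

```python
from decimal import Decimal, ROUND_CEILING

def roundup(value, precision=0):
    """Ceiling of value at the given number of decimal places.

    Mirrors CEIL(value * 10^precision) / 10^precision from the PL/SQL above.
    """
    factor = Decimal(10) ** precision
    scaled = Decimal(value) * factor
    return scaled.to_integral_value(rounding=ROUND_CEILING) / factor

# Same calls as the SELECT above; values are passed as strings to stay exact.
results = [
    roundup("0.56789"),
    roundup("0.56789", 1),
    roundup("0.56789", 2),
    roundup("0.56789", -1),
]
print(results)
```

One caveat: a ceiling rounds toward positive infinity, whereas Excel's ROUNDUP rounds away from zero, so the two agree for positive inputs but differ for negative ones (for example, the ceiling of -3.2 is -3, while Excel's ROUNDUP(-3.2, 0) gives -4).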

Excel : Get the most frequent value for each group

I have a table (Excel) with two columns (Time 'hh:mm:ss', Value) and I want to get the most frequent value for each group of rows.
For example, I have:
Time | Value
4:35:49 | 122
4:35:49 | 122
4:35:50 | 121
4:35:50 | 121
4:35:50 | 111
4:35:51 | 122
4:35:51 | 111
4:35:51 | 111
4:35:51 | 132
4:35:51 | 132
And I want to get the most frequent value for each Time:
Time | Value
4:35:49 | 122
4:35:50 | 121
4:35:51 | 132
Thanks in advance
UPDATE
The first answer by #scott, using a helper column, is the correct one.
See the pic
You could use a helper column.
In column C (starting in C2) I put:
=COUNTIFS($A$2:$A$11,A2,$B$2:$B$11,B2)
Then in F2 I put the following Array Formula:
=INDEX($B$2:$B$11,MATCH(MAX(IF($A$2:$A$11=E2,IF($C$2:$C$11 = MAX(IF($A$2:$A$11=E2,$C$2:$C$11)),$B$2:$B$11))),$B$2:$B$11,0))
It is an array formula and must be confirmed with Ctrl-Shift-Enter. Then copied down.
I set it up like this:
Here is one way to do this in MS Access:
select tv.*
from (select time, value, count(*) as cnt
from t
group by time, value
) as tv
where exists (select 1
from (select top 1 time, value, count(*) as cnt
from t as t2
where t.time = t2.time
group by time, value
order by count(*) desc, value desc
) as x
where x.time = tv.time and x.value = tv.value
);
MS Access doesn't support features such as window functions or CTEs that make this type of query easier in other databases.
Would that work? I haven't tried and got inspired here
;WITH t3 AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY time ORDER BY c DESC, value DESC) AS rn
FROM (SELECT COUNT(*) AS c, time, value FROM t GROUP BY time, value) AS t2
)
SELECT *
FROM t3
WHERE rn = 1
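The rule that all of these answers implement (count each Time/Value pair, keep the highest count per Time, break ties by the larger Value) can be sketched in Python as a cross-check. This is an illustration only; `most_frequent_per_group` is a hypothetical helper name, and the rows are the sample data from the question.

```python
from collections import Counter, defaultdict

def most_frequent_per_group(rows):
    """Map each time to its most frequent value; ties go to the larger value."""
    counts = defaultdict(Counter)
    for time, value in rows:
        counts[time][value] += 1
    result = {}
    for time, counter in counts.items():
        # Rank by (count, value) so a tie on count is won by the larger value,
        # like ORDER BY c DESC, value DESC in the SQL above.
        value, _ = max(counter.items(), key=lambda item: (item[1], item[0]))
        result[time] = value
    return result

# Sample rows from the question.
rows = [
    ("4:35:49", 122), ("4:35:49", 122),
    ("4:35:50", 121), ("4:35:50", 121), ("4:35:50", 111),
    ("4:35:51", 122), ("4:35:51", 111), ("4:35:51", 111),
    ("4:35:51", 132), ("4:35:51", 132),
]

print(most_frequent_per_group(rows))
```

For 4:35:51 both 111 and 132 occur twice, and the tie-break on the larger value is what produces 132 in the expected output.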

Detect overlapping ranges and correct them in Oracle

Googling around a bit, I found this to be an interesting question. I would like your take on it.
Given my table:
USER | MAP | STARTDAY | ENDDAY
1 | A | 20110101 | 20110105
1 | B | 20110106 | 20110110
2 | A | 20110101 | 20110107
2 | B | 20110105 | 20110110
What I want is to fix user 2's case, where maps A and B overlap by a couple of days (from 20110105 until 20110107).
I would like to query that table in such a way that it never returns overlapping ranges. My input data is flaky already, so I don't have to worry about how conflicts are handled; I just want to be able to get a single value for any given date when querying with BETWEEN on these dates.
Possible outputs for the query I'm trying to build would be like
USER | MAP | STARTDAY | ENDDAY
2 | B | 20110108 | 20110110 -- pushed overlapping days ahead..
2 | A | 20110101 | 20110104 -- shrunk overlapping range
It doesn't even matter if the algorithm causes "invalid ranges", e.g. Start = 20110105, End = 20110103, I'll just put null when I get to these cases.
What would you guys say? Is there any straightforward way to get this done?
Thanks!
f.
Analytic functions could help:
select userid, map
, case when prevend >= startday then prevend+1 else startday end newstart
, endday
from
( select userid, map, startday, endday
, lag(endday) over (partition by userid order by startday) prevend
from mytable
)
order by userid, startday
Gives:
USERID MAP NEWSTART ENDDAY
1 A 01/01/2011 01/05/2011
1 B 01/06/2011 01/10/2011
2 A 01/01/2011 01/07/2011
2 B 01/08/2011 01/10/2011
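The LAG-based adjustment can be traced step by step with a Python sketch. This is an illustration only, not Oracle; `remove_overlaps` is a hypothetical helper name, and the days are represented as `date` objects on the assumption that STARTDAY and ENDDAY are dates.

```python
from datetime import date, timedelta

def remove_overlaps(rows):
    """rows: (userid, map, startday, endday). Push each start past the previous end.

    Mirrors lag(endday) over (partition by userid order by startday).
    """
    result = []
    prev_end_by_user = {}
    for userid, map_name, startday, endday in sorted(rows, key=lambda r: (r[0], r[2])):
        prev_end = prev_end_by_user.get(userid)
        if prev_end is not None and prev_end >= startday:
            new_start = prev_end + timedelta(days=1)
        else:
            new_start = startday
        result.append((userid, map_name, new_start, endday))
        # Like LAG, remember the original endday of this row for the next one.
        prev_end_by_user[userid] = endday
    return result

# Sample rows from the question.
rows = [
    (1, "A", date(2011, 1, 1), date(2011, 1, 5)),
    (1, "B", date(2011, 1, 6), date(2011, 1, 10)),
    (2, "A", date(2011, 1, 1), date(2011, 1, 7)),
    (2, "B", date(2011, 1, 5), date(2011, 1, 10)),
]

fixed = remove_overlaps(rows)
for row in fixed:
    print(row)
```

Only user 2's map B changes: its start moves from 5 January to 8 January, the day after map A ends. As the question anticipates, this approach can produce a start later than the end when one range lies entirely inside the previous one.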
