How to compare two dates in a text string in excel? - excel

I have a set of data with this format:
Each cell contains one or more names, each followed by a date. I want to compare the dates in each cell and check whether they are all the same.
Example of a cell content: university XXX (2016-10-21) company YYY (2016-10-22)
I used the formula =MID(A1,SEARCH("(",A1,1)+1,10) to find the first date. How can I find the 2nd, 3rd, ... dates?
Thank you in advance,

Easiest if done step by step:
So break down your MID(A1,SEARCH("(",A1,1)+1,10) (and make it more specific to dates - you don't want to match "(KACST)") as:
B1: =SEARCH("(2",A1,1) and H1: =mid(A1,B1+1,10)
Then add
C1: =SEARCH("(2",A1,B1+1) This tells the search to start from the character after the one it has already found
I1: =mid(A1,C1+1,10)
etc
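The same step-by-step idea, sketched in Python for readers who want to check the logic outside Excel. The function name find_dates and its parameters are illustrative, not part of the answer; str.find plays the role of SEARCH (0-based instead of 1-based) and slicing plays the role of MID.

```python
def find_dates(text, marker="(2", width=10):
    """Collect the `width` characters that follow each "(" starting a `marker` hit."""
    dates = []
    pos = text.find(marker)  # like SEARCH("(2",A1,1), but 0-based
    while pos != -1:
        # like MID(A1, pos+1, 10): skip the "(" and take the next 10 characters
        dates.append(text[pos + 1 : pos + 1 + width])
        # like SEARCH("(2",A1,B1+1): resume one character after the last hit
        pos = text.find(marker, pos + 1)
    return dates


cell = "university XXX (2016-10-21) company YYY (2016-10-22)"
print(find_dates(cell))
```

Because the marker is "(2" rather than "(", a parenthesized abbreviation such as "(KACST)" is skipped.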

If you have Office 365 (Windows), here's one way that should work, depending on the variability in your real data:
=LET(arr,--MID(FILTERXML("<t><s>" & SUBSTITUTE(A1," ","</s><s>") & "</s></t>","//s[contains(.,'(')]"),2,10),
numDates,COUNT(arr),
AGGREGATE(14,6,arr,SEQUENCE(numDates)))
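Create an XML string by turning each space-separated word into a node.
Use FILTERXML to return only the nodes containing "(".
Convert the 10 characters after the "(" in each node to a date serial number.
This produces an error if those 10 characters are not a valid date.
AGGREGATE then returns the date values and ignores the errors.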
Then you can compare those values however you want
For example, if you wanted to see if all the dates were the same:
=COUNT(UNIQUE(
LET(arr,--MID(FILTERXML("<t><s>" & SUBSTITUTE(A1," ","</s><s>") & "</s></t>","//s[contains(.,'(')]"),2,10),
numDates,COUNT(arr),
AGGREGATE(14,6,arr,SEQUENCE(numDates)))))=1
If you don't have Office 365, or if your data is more varied than what you show, such that the method I used for determining if the parenthesized values are dates is not reliable, I suggest you develop a VBA solution, possibly using Regular Expressions.
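As a rough illustration of the regular-expression route, here is a minimal Python sketch (not VBA). It assumes every date is written as (YYYY-MM-DD), as in the question's example; the names DATE_IN_PARENS, dates_in_cell and all_dates_equal are made up for this sketch.

```python
import re
from datetime import date

# Assumption: dates always appear as (YYYY-MM-DD), as in the sample cell.
DATE_IN_PARENS = re.compile(r"\((\d{4})-(\d{2})-(\d{2})\)")


def dates_in_cell(text):
    """Return every parenthesized date in `text` as a datetime.date."""
    # date() raises ValueError for impossible dates such as 2016-13-40.
    return [date(int(y), int(m), int(d)) for y, m, d in DATE_IN_PARENS.findall(text)]


def all_dates_equal(text):
    """True when the cell holds at least one date and all of them match."""
    found = dates_in_cell(text)
    return len(found) > 0 and len(set(found)) == 1


print(all_dates_equal("university XXX (2016-10-21) company YYY (2016-10-22)"))
```

A parenthesized value that is not a date, such as "(KACST)", simply does not match the pattern and is ignored.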

Related

splitting underscores in Excel

I'm fairly new to Excel and need some assistance. I have a Column that has a list of files that look like:
12345_v1.0_TEST_Name [12345]_01.01.2022.html
45321_v55.9_Some Name Here [64398]_07.15.2018.html
56871_v14.2_Test[64398]_10.30.2019.html
Each file name can be different depending on what output is provided to me.
Note: There are other random files in the same format, however where it says Test_Name there could be an underscore and sometimes no underscore. Would like that to be ignored in the formula or vba. Files also can change but will be in the same format.
I need some help with a formula or vba that splits the underscores and outputs the data into their own cells:
Column C 12345
Column D v1.0
Column E TEST_Name [12345]
Column F 01.01.2022
Column G .html
Since the file extension can differ while the format stays the same, I have amended the formula I provided earlier with a few tweaks so that it works for any file extension.
FORMULA IN CELL C1
=IF(LEN($B1)-LEN(SUBSTITUTE($B1,"_",""))+1>4,
TRIM(MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($B1,"."&TRIM(RIGHT(SUBSTITUTE(
SUBSTITUTE($B1,"."," ",LEN($B1)-LEN(SUBSTITUTE($B1,".","")))," ",REPT(" ",200)),100)),"_"&"."&
TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE($B1,"."," ",LEN($B1)-LEN(SUBSTITUTE($B1,".","")))," ",
REPT(" ",200)),100))),"_"," ",3),"_",REPT(" ",100)),COLUMN(A1)*99-98,100)),
TRIM(MID(SUBSTITUTE(SUBSTITUTE($B1,"."&TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE(
$B1,"."," ",LEN($B1)-LEN(SUBSTITUTE($B1,".","")))," ",REPT(" ",200)),100)),"_"&"."&
TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE($B1,"."," ",LEN($B1)-LEN(SUBSTITUTE($B1,".","")))," ",
REPT(" ",200)),100))),"_",REPT(" ",100)),COLUMN(A1)*99-98,100)))
FILL DOWN & FILL ACROSS!!!
There are other random files in the same format.....Files also can change but will be in the same format.
So, assuming the files will indeed be in the same format, we can break this query down into the following requirements:
Change the 1st, the 2nd and the very last occurrence of the underscore into something to split on;
Change the dot before the file extension into something to split on, under the assumption that we don't know whether it will be '.html' or any other extension.
Since you have Microsoft365 we can use dynamic arrays and some basic functions to retrieve what you want:
=LET(X,SEARCH("_??.??.????.",A1),Y,"</s><s>",TRANSPOSE(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(REPLACE(A1,X,12,Y&MID(A1,X+1,10)&Y),"_",Y,2),"_",Y,1)&"</s></t>","//s")))
To break this down a little bit:
SEARCH("_??.??.????.",A1) - This part finds the position of the very last underscore, matching up to the dot before the file extension, assuming you don't have any other date in this specific format in your filenames;
SUBSTITUTE() - We can use this formula to specifically change the 1st and 2nd instances of the underscore to anything we can split on;
FILTERXML() - You may notice we used valid xml start/end-tags to split our data using this function.
TRANSPOSE() - This last function will now spill the returned array over the columns instead of rows.
Without LET():
=TRANSPOSE(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(REPLACE(A1,SEARCH("_??.??.????.",A1),12,"</s><s>"&MID(A1,SEARCH("_??.??.????.",A1)+1,10)&"</s><s>"),"_","</s><s>",2),"_","</s><s>",1)&"</s></t>","//s"))
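For comparison, the same splitting rules can be sketched in Python. This is only an illustration of the requirements listed above, under the same assumption that every name follows the id_version_name_date.extension pattern; split_filename is a hypothetical helper name.

```python
def split_filename(name):
    """Split 'id_version_name_date.ext' into its five parts."""
    # Split on the 1st and 2nd underscore only; later underscores stay in `rest`.
    first, second, rest = name.split("_", 2)
    # Split on the very last underscore, which precedes the date.
    middle, tail = rest.rsplit("_", 1)
    # Split on the last dot, which precedes the extension (whatever it is).
    stem, dot, ext = tail.rpartition(".")
    return [first, second, middle, stem, dot + ext]


for sample in (
    "12345_v1.0_TEST_Name [12345]_01.01.2022.html",
    "56871_v14.2_Test[64398]_10.30.2019.html",
):
    print(split_filename(sample))
```

Underscores inside the name part (such as TEST_Name) are left alone because only the first two and the last underscore are used as split points.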
Is this what you are trying to achieve? There may be a more elegant formula for this, but you could try the following:
FORMULA USED IN CELL C1
=IF(LEN($B2)-LEN(SUBSTITUTE($B2,"_",""))+1>4,TRIM(MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($B2,".html","_.html"),"_"," ",3),"_",REPT(" ",100)),COLUMN(A2)*99-98,100)),TRIM(MID(SUBSTITUTE(SUBSTITUTE($B2,".html","_.html"),"_",REPT(" ",100)),COLUMN(A2)*99-98,100)))
Fill Down & Fill Across !

Make SumIf ignore words?

=SUMIF(E3:E,"YES",C3:C)
The above formula works in adding the numbers in C if the corresponding E cell is "YES", however my cells in C have "# MINS" in them, is there a way to make SumIf ignore words and only add the number?
SCREENSHOT OF SPREADSHEET: https://cdn.discordapp.com/attachments/358381825246101505/488443165364322327/Screenshot_1.png
If you’re using Google Spreadsheets, you have the possibility to format the numbers as you want.
In the cells, store only the numbers so that SUMIF will work, then create a custom number format: in the toolbar, go to Format - Number - More Formats - Custom number format and type in # "MINS".
=SUMPRODUCT(LEFT(C3:C5,LEN(C3:C5)-LEN(" mins"))*(D3:D5="yes"))
This is an array-like calculation. As such, full column references may bog your computer down with excess calculations.
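The idea of stripping the text before summing can be sketched in Python as follows. The sample values are hypothetical, since the screenshot is not reproduced here, and leading_number and sum_minutes_if are illustrative names.

```python
import re


def leading_number(value):
    """Return the number at the start of a value like '30 MINS', or 0.0 if none."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", str(value))
    return float(match.group(1)) if match else 0.0


def sum_minutes_if(values, flags, wanted="YES"):
    """Sum the leading numbers of `values` whose flag equals `wanted` (case-insensitive)."""
    return sum(
        leading_number(v)
        for v, f in zip(values, flags)
        if str(f).upper() == wanted.upper()
    )


# Hypothetical sample data standing in for columns C and E.
minutes = ["30 MINS", "15 MINS", "45 MINS"]
flags = ["YES", "NO", "yes"]
print(sum_minutes_if(minutes, flags))
```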
Get rid of the MINS. You can use Find & Replace or Text to Columns, etc.
Create a custom number format of 0 \M\I\N\S.
Use your original formula.
excel
=SUMIF(E:E,"YES",C:C)
google-spreadsheet
=SUMIF(E3:E,"YES",C3:C)

Is there any way to use a wildcard in order to count dates?

Is there a way to use a wildcard to count the partial string of a date?
In my spreadsheet, I want to use the COUNTIF function to count a certain date. However, the date value also contains the time.
Example: "12/06/2017 17:35:12"
I only want to include "12/06/2017"
This is the formula I have: =COUNTIF(Pivot_Data[Created Date],"*12/06/2017*")
If I recall correctly, you must use the "&" concatenation operator on the wildcard symbol instead of including it with the rest of your string. This is specific to the countif and sumif functions:
=COUNTIF(Pivot_Data[Created Date],"*"&"12/06/2017"&"*")
EDIT: Here is a solution for date formats instead of text formats:
=COUNTIFS(Pivot_Data[Created Date],">="&DATEVALUE("12/06/2017"),Pivot_Data[Created Date],"<"&(DATEVALUE("12/06/2017")+1) )
This works because each date is represented as a unique integer in Excel's date encoding, with the time of day as the decimal part.
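The range test can be illustrated in Python with real timestamps. This sketch counts values in the half-open interval from midnight of the target day up to, but not including, midnight of the next day, so a timestamp at exactly midnight is counted. count_on_day and the sample timestamps are hypothetical.

```python
from datetime import date, datetime, timedelta


def count_on_day(timestamps, day):
    """Count timestamps that fall on `day`, whatever their time of day."""
    start = datetime(day.year, day.month, day.day)  # midnight of the target day
    end = start + timedelta(days=1)                 # midnight of the next day
    return sum(start <= ts < end for ts in timestamps)


# Hypothetical sample data.
created = [
    datetime(2017, 12, 6, 0, 0, 0),
    datetime(2017, 12, 6, 17, 35, 12),
    datetime(2017, 12, 7, 0, 0, 0),
]
print(count_on_day(created, date(2017, 12, 6)))
```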
Place * on either side of the date, e.g. =COUNTIF(E9,"*12/06/2017*")
This is a wildcard match. Adjust range according to needs.
Instead of searching for a specific text, I would just check if the date is the same as Excel's built in serial number. Also I prefer SUMPRODUCT over COUNTIF.
Something like:
= SUMPRODUCT((FLOOR(Pivot_Data[Created Date],1)=DATE(2017,12,6))+0)
The FLOOR function (in this case) effectively just takes the date and removes any reference to what time it is on that date.
I think this way is better because it doesn't rely on the cell being a specific text format.
See below, I used this formula but replaced Pivot_Data[Created Date] with A1:A5 just to demonstrate that this formula works with sample data. As expected, the formula returns a value of 2 because the data contains two dates on 12/6/2017. Notice how it doesn't care what time it is.
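To make the serial-number reasoning concrete, here is a small Python sketch. It assumes Excel's default 1900 date system, where 1899-12-30 acts as day 0 for any date after February 1900; to_serial, count_by_floor and the sample rows are illustrative only.

```python
import math
from datetime import datetime

# Assumption: 1900 date system; this epoch is valid for dates after Feb 1900.
EXCEL_EPOCH = datetime(1899, 12, 30)


def to_serial(ts):
    """Convert a datetime to an Excel-style serial: whole days plus day fraction."""
    return (ts - EXCEL_EPOCH).total_seconds() / 86400


def count_by_floor(timestamps, target):
    """Count timestamps whose serial number floors to the same day as `target`."""
    target_day = math.floor(to_serial(target))
    return sum(math.floor(to_serial(ts)) == target_day for ts in timestamps)


# Hypothetical sample rows standing in for A1:A5.
rows = [
    datetime(2017, 12, 5, 9, 0, 0),
    datetime(2017, 12, 6, 8, 15, 0),
    datetime(2017, 12, 6, 17, 35, 12),
    datetime(2017, 12, 7, 0, 0, 1),
    datetime(2017, 12, 8, 12, 0, 0),
]
print(count_by_floor(rows, datetime(2017, 12, 6)))
```

Flooring drops the fractional part, so the comparison depends only on the calendar day.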

Excel VLOOKUP with multiple possible options in table array

I have two lists, the first is a set of users. The second list contains different encounter dates for these users.
I need to identify the date that is within 10 days of the "Renew Date" [Column C], but not before. With Member 1 this would be row 3 1/8/2017. With Member 2 this would be row 6, 1/21/2017.
Using a VLOOKUP, which is what the user who managed this spreadsheet before me did, obviously isn't viable, as it will simply pick up the first date that has a matching Member ID. Is there a way to do this in Excel at all?
I have included a link to a sample file and a screenshot of the sample data.
https://drive.google.com/file/d/0B5kjsJZFrUgFcUFTNFBzQzN4cm8/view?usp=sharing
To avoid the slowness and complexities of array formulas, you can try SUMIFS, but the problem is that if you have more than one match, it will add them up rather than return the first match, and the sum will look like an aberration. It will work, however, if you are guaranteed to have only one match in the data.
An alternative is also to use AVERAGEIFS, which, in case of multiple matches, will give you their average and it will look like a valid date and a good result. Enter this formula in D2 and fill down the column:
D2:
=AVERAGEIFS(G:G,F:F,A2,G:G,">="&C2,G:G,"<="&C2+10)
and don't forget to format column D as Date.
Try this
=SUMPRODUCT($G$2:$G$7,--($F$2:$F$7=A2),--($G$2:$G$7<=C2+10),--($G$2:$G$7>C2))
Format the result as date. See screenshot (my system uses DMY order)
Don't use whole column references with this formula. It will slow down the workbook.
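The lookup rule itself (same member, date on or after the renew date, and no more than 10 days later) can be sketched in Python. The data below is hypothetical because the sample file is not reproduced here; only the two expected results (1/8/2017 and 1/21/2017) come from the question. Unlike SUMIFS or AVERAGEIFS, this sketch returns the earliest match when several dates fall inside the window.

```python
from datetime import date, timedelta


def first_encounter_in_window(member_id, renew, encounters, days=10):
    """Earliest encounter for `member_id` within `days` after `renew`, else None."""
    window_end = renew + timedelta(days=days)
    matches = sorted(
        d for m, d in encounters if m == member_id and renew <= d <= window_end
    )
    return matches[0] if matches else None


# Hypothetical (member id, encounter date) rows and renew dates.
encounters = [
    (1, date(2016, 12, 20)),
    (1, date(2017, 1, 8)),
    (1, date(2017, 2, 1)),
    (2, date(2017, 1, 2)),
    (2, date(2017, 1, 21)),
]
print(first_encounter_in_window(1, date(2017, 1, 1), encounters))
print(first_encounter_in_window(2, date(2017, 1, 15), encounters))
```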

Excel: Count cell once within a column when it meets at least one of multiple conditions

I'm working in Excel 2013. I've had a quick google for this, but I think my terminology might be wrong as I'm not finding a response that solves my exact problem.
I have a column of concatenated healthcare-related outcomes, such as:
"999 Emergency Ambulance By Co-op Admit Accident & Emergency! Admitted To Hospital"
"Advice Only"
"Admit to hospital"
"Medication prescribed"
I want to enter one formula in one cell that counts the records that EITHER contain "999" "Admit" OR "A&E" OR "Admission" for the ENTIRE column.
I don't want to have another column performing this count, I know the following works:
=IF(ISNUMBER(SEARCH("999",K2)),1,IF(ISNUMBER(SEARCH("ADMIT",K2)),1,IF(ISNUMBER(SEARCH("ADMISSION",K2)),1,0)))
But the formula I would need would replace the sum of the column that contained the above formula, negating the need for an extra column.
What I'm struggling with is that the other solutions I've seen would check the column for each condition and you'd end up with cells counted twice. As you can see in the above, some cells will contain "999" AND "Admit".
Apologies if this is a simple question!
Thanks in advance.
You can use this formula:
=SUMPRODUCT(1*((NOT(ISERROR(SEARCH("999";A1:A4)))+NOT(ISERROR(SEARCH("Admit";A1:A4)))+NOT(ISERROR(SEARCH("Admission";A1:A4))))>0))
Or, as in your example, use Isnumber instead of NOT(ISERROR(
Depending on your regional settings you may need to replace field separator ";" by ","
You can use MMULT here to avoid some repetition, i.e.
=SUMPRODUCT((MMULT(ISNUMBER(SEARCH({999,"Admit","A&E","Admission"},K2:K100))+0,{1;1;1;1})>0)+0)
Assumes data in the range K2:K100 - change as required
Note that there are 4 1s in {1;1;1;1} to match the 4 search terms - if you increase the number of search terms you need to increase the numbers of 1s
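The "count each row once if any term matches" logic is easy to verify outside Excel. A minimal Python sketch using the four sample outcomes from the question follows; count_matching is an illustrative name, and the match is case-insensitive like SEARCH.

```python
def count_matching(cells, terms):
    """Count cells containing at least one of `terms`, each cell at most once."""
    lowered = [t.lower() for t in terms]
    # any() stops at the first hit, so a cell with several terms still counts once.
    return sum(any(t in cell.lower() for t in lowered) for cell in cells)


outcomes = [
    "999 Emergency Ambulance By Co-op Admit Accident & Emergency! Admitted To Hospital",
    "Advice Only",
    "Admit to hospital",
    "Medication prescribed",
]
print(count_matching(outcomes, ["999", "Admit", "A&E", "Admission"]))
```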
