I have a list of >100,000 diagnosis codes in a .XLS document and need to extract from this all codes that are relevant to a number of specific diseases.
What I would like to be able to do is include all 100,000 diagnostic codes in Column A, diagnostic labels in column B, and then have a "search term" cell (e.g. C1) in which I can write a word such as "fracture".
I would then like all the diagnostic codes including the string "fracture" to appear in column D.
Is there a simple way of doing this in Excel? I have looked online without much success but this might be because I'm not certain where to start. Conditional formatting hasn't helped as it's still unmanageable to scroll through 100,000 codes even if they are highlighted nicely.
Any initial thoughts or tips as to what I could try searching for would be very welcome.
Sample dataset:
238 Fracture of proximal humerus
202 Aortic stenosis
990 Chronic obstructive pulmonary disease
302 Hip fracture
182 Recurrent fractures
094 Marfan syndrome
298 Diabetic retinopathy
We can use a helper column to find the matching rows. In E1 enter:
=MATCH("*" & $C$1 & "*",B:B,0)
and in E2 enter:
=IFERROR(MATCH("*" & $C$1 & "*",INDEX(B:B,E1+1):INDEX(B:B,999999),0)+E1,"")
and copy down. Column E tells us where the matches are. Then in D1 enter:
=IFERROR(INDEX(A:A,E1),"")
and copy down:
This is a fairly standard way to do a keyword search.
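The chained MATCH idea above can also be sketched outside Excel. The Python below is only an illustration, not part of the workbook; the names `find_matches`, `codes` and `labels` are hypothetical. It mirrors column E by searching for the next match after the previous one, and mirrors column D by reading the code on each matching row.

```python
def find_matches(codes, labels, term):
    """Return the codes whose label contains term (case-insensitive), in row order."""
    term = term.lower()
    results = []
    start = 0  # plays the role of the previous match position in column E
    while True:
        # Look for the next matching row at or after `start`
        hit = next((i for i in range(start, len(labels))
                    if term in labels[i].lower()), None)
        if hit is None:  # corresponds to IFERROR(..., "")
            break
        results.append(codes[hit])  # corresponds to INDEX(A:A, E1)
        start = hit + 1
    return results


codes = ["238", "202", "990", "302", "182", "094", "298"]
labels = [
    "Fracture of proximal humerus",
    "Aortic stenosis",
    "Chronic obstructive pulmonary disease",
    "Hip fracture",
    "Recurrent fractures",
    "Marfan syndrome",
    "Diabetic retinopathy",
]
print(find_matches(codes, labels, "fracture"))  # ['238', '302', '182']
```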
I realize you didn't ask for a filter, but if you are open to a slightly different solution, this seems to work well. The column A title and values should start at cell A2. Type your search term in cell B1. The filter matches with wildcards and is case-insensitive (e.g. "fracture" shows fracture, Fracture, fractures, Fractures, fractured, Fractured).
Option Explicit

Sub Filter()
    Dim MyArray() As Variant
    Dim MyNewArray() As Variant
    Dim i As Long
    Dim item As Variant
    Dim FilterRange As Range

    With ActiveSheet
        ' Title in A2, values from A3 down to the last used cell in column A
        Set FilterRange = .Range(.Cells(2, 1), .Cells(.Rows.Count, 1).End(xlUp))
        ' Read the values directly as a 2-D array
        MyArray = FilterRange.Value
        i = 0
        For Each item In MyArray
            ' Case-insensitive "contains" test against the search term in B1
            If UCase(item) Like "*" & UCase(.Range("B1").Value) & "*" Then
                ReDim Preserve MyNewArray(i)
                MyNewArray(i) = item
                i = i + 1
            End If
        Next item
        ' Only filter when at least one value matched
        If i > 0 Then
            FilterRange.AutoFilter Field:=1, Criteria1:=MyNewArray(), Operator:=xlFilterValues
        End If
    End With
End Sub
Additionally, you could add the following in the Worksheet object so you don't have to click a button to run the macro:
Option Explicit

Private Sub Worksheet_Change(ByVal Target As Range)
    If Not Application.Intersect(Target, Range("B1")) Is Nothing Then
        Call Filter
    End If
End Sub
Honestly, you can achieve the same thing with Filter > Text Filters > Custom Filter. You can use wildcards, too. :)
Let's say you want to search the term contained in C1 in column B.
You can try using this in A
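=IF($C$1<>"",IFERROR(FIND($C$1,B1,1),0),0)
This will return a number (the position of the match) if the text in C1 is present, else a 0. Note that FIND is case-sensitive; use SEARCH instead for a case-insensitive match.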
You can then set D to be
=IF(A1>0,B1,"")
You will obtain this
I am putting the string to be searched for in the first row.
In row 2, I do a string match to get the first instance of a match for C1 in column A.
C2 =INDEX($A:$A,MATCH("*"&C1&"*",$A2:$A$100,0)+1)
Then find the next match starting from the array with row that matches C2 as the first row. I am fixing the last row as A100 which you can change.
C3 =INDEX($A:$A,MATCH("*"&$C$1&"*",INDIRECT("A"&MATCH(C2,$A:$A,0)+1&":A100"),0)+MATCH(C2,$A:$A,0))
Copy the formula down as required. Then copy columns across as required.
You can wrap the formula in IFERROR to suppress #N/A errors.
You can also use an advanced filter in Excel by going to Data|Advanced:-
Note that the column headers in C1 and D1 must match the column headers in A1 and B1.
You can also adopt a DIY approach with:-
=IFERROR(INDEX(B:B,SMALL(IF(ISNUMBER(SEARCH($H$2,$B$1:$B$100000)),ROW($B$1:$B$100000)),ROW(1:1))),"")
assuming the search term is in H2, starting in, say, J2 and pulled down as necessary. This may be a little slow with 100K terms but is usable. It must be entered as an array formula with Ctrl+Shift+Enter.
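As a rough illustration of what the array formula computes, here is a minimal Python sketch (the function name `kth_match` and the sample labels are hypothetical). It collects the row numbers of all matching labels and then picks the k-th smallest one, which is the role ROW(1:1) plays as the formula is pulled down.

```python
def kth_match(labels, term, k):
    """Return the k-th (1-based) label containing term, or "" if there is none."""
    term = term.lower()
    # Row numbers of matching rows, like IF(ISNUMBER(SEARCH(...)), ROW(...))
    rows = sorted(i for i, label in enumerate(labels) if term in label.lower())
    if k > len(rows):  # corresponds to IFERROR(..., "")
        return ""
    return labels[rows[k - 1]]  # corresponds to INDEX(B:B, SMALL(rows, k))


labels = [
    "Fracture of proximal humerus",
    "Aortic stenosis",
    "Hip fracture",
    "Recurrent fractures",
]
for k in range(1, 5):
    print(repr(kth_match(labels, "fracture", k)))
```

Like SEARCH, the comparison here ignores case; the fourth call returns an empty string because only three labels match.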
Input
I need Summary like below
I am looking for the distinct number of accounts here, without counting duplicates.
You can FILTER the original data and then count the number of unique instances.
Try =COUNT(UNIQUE(FILTER($A$2:$A$10, $B$2:$B$10=$D2)))
Here I assume that the original data is in cells A2:B10, and that the criteria for the filtering is in column D.
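The filter-then-count-unique logic can be sketched in Python as follows. This is only an illustration: the function name `distinct_count` and the sample account and branch values are made up, since the original data is not shown. Note also that COUNT in Excel counts numeric values only, so the formula as written assumes numeric account numbers.

```python
def distinct_count(accounts, branches, branch):
    """Count distinct accounts on rows whose branch equals `branch`."""
    # FILTER keeps the rows for this branch; the set plays the role of UNIQUE
    return len({acc for acc, br in zip(accounts, branches) if br == branch})


# Hypothetical sample data: account numbers and their branch codes
accounts = [1001, 1001, 1002, 2001, 2001, 2001]
branches = ["X", "X", "X", "Y", "Y", "Y"]
print(distinct_count(accounts, branches, "X"))  # 2
print(distinct_count(accounts, branches, "Y"))  # 1
```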
I have updated my answer to work for Office 2007, combining @Rory's comment on the original post with the OFFSET part of my previous Office 365 solution.
Formula
Place the following formula in E2:
=SUM(1/COUNTIFS(OFFSET($A$3,MATCH(D3,$A$3:$A$12,0)-1,1,COUNTIF($A$3:$A$12,D3),1),OFFSET($A$3,MATCH(D3,$A$3:$A$12,0)-1,1,COUNTIF($A$3:$A$12,D3),1)))
Explanation
The OFFSET part provides a list of consecutive rows that feature the Acc No and belong to the same Br code. It uses MATCH to determine the first occurrence of the given Br code. As the anchor cell for the formula is the cell with the first content ($A$3), I subtract 1 from the MATCH result. To determine the height, I use a COUNTIF statement that counts how many rows feature the current Br code.
The cell range provided by OFFSET is then used as input for @Rory's COUNTIFS solution.
The text and non-text (numeric) unique and distinct values in the column are quickly listed separately using VBA macro.
Sub GetCountDistinctValues()
    Dim sayi As Long, rang As Range
    ' The dictionary keys hold each distinct value once
    With CreateObject("Scripting.Dictionary")
        For Each rang In Range("A2:A" & Cells(Rows.Count, "A").End(xlUp).Row)
            If rang <> Empty Then
                If Not .Exists(rang.Value) Then
                    .Add rang.Value, Nothing
                    ' sayi counts the distinct numeric values
                    If IsNumeric(rang.Value) Then sayi = sayi + 1
                End If
            End If
        Next
        Range("C2").Value = .Count - sayi   ' distinct text values
        Range("C3").Value = sayi            ' distinct numeric values
    End With
End Sub
Source : Find count of unique-distinct values
I am looking for reverse vlookup with more than 255 characters in Excel VBA.
This is the formula based one which I took from this website.
=INDEX(F2:F10,MATCH(TRUE,INDEX(D2:D10=A2,0),0))
I have tried to convert it to VBA. Here is the sample code:
Sub test()
'concat
Range("i1") = WorksheetFunction.TextJoin(" ", True, Range("g1:h1"))
'lookup
Sal1 = Application.WorksheetFunction.Index(Sheets("sheet1").Range("a1:a2"), Application.WorksheetFunction.Match(True, Application.WorksheetFunction.Index(Sheets("sheet1").Range("i1:i1") = Range("i1").Value, 0), 0))
'=INDEX($W$3:$W$162,MATCH(TRUE,INDEX($W$3:$W$162=U3,0),0))
End Sub
It works well, but it fails when I change Range("i1:i1") to Range("i1:i2").
I'm not sure what that worksheet formula does that =INDEX(F2:F11,MATCH(A2,D2:D11,FALSE)) doesn't do.
This part Index(Sheets("sheet1").Range("i1:i2") = Range("i1").Value, 0) is comparing a 2-d array to a single value, which should result in a Type Mismatch error. Whenever you reference a multi-cell range's Value property (Value is the default property in this context), you get a 2-d array even if the range is a single column or row.
You could fix that problem with Application.WorksheetFunction.Transpose(Range("D1:D10")) to turn it into a 1-d array, but I still don't think you can compare a 1-d array to a single value and have it return something that's suitable for passing into INDEX.
You could use VBA to create the arrays of Trues and Falses, but if you're going to go to that trouble, you should just use VBA to do the whole thing and ditch the WorksheetFunction approach.
I couldn't get it to work when comparing a single cell to a single cell like you said it did.
Here's one way to reproduce the formula
Public Sub test()
    Dim rFound As Range
    'find A2 in D
    Set rFound = Sheet1.Range("D1:D10").Find(Sheet1.Range("A2").Value, , xlValues, xlWhole)
    If Not rFound Is Nothing Then
        MsgBox rFound.Offset(0, 2).Value 'read column f - same position as d
    End If
End Sub
If that simpler formula works and you want to use WorksheetFunction, it would look like this
Public Sub test2()
    Dim wf As WorksheetFunction
    Set wf = Application.WorksheetFunction
    MsgBox wf.Index(Sheet1.Range("F2:F11"), wf.Match(Sheet1.Range("A2").Value, Sheet1.Range("D2:D11"), False))
End Sub
Function betterSearch(searchCell, A As Range, B As Range)
For Each cell In A
If cell.Value = searchCell Then
betterSearch = B.Cells(cell.Row, 1)
Exit For
End If
betterSearch = "Not found"
Next
End Function
I found this code via the link above and it is useful for my current search. In the example below I try to get a value.
Please treat rows 1 to 5 as empty in columns A and B, because my table always starts at row 6.
Row   A Column   B Column
6     54         a
7     55         b
8     56         c
VBA Code:
Sub look_up ()
Ref = "b"
look_up = betterSearch(Ref, Range("B6:B8"), Range("A6:A8"))
End Sub
It returns Empty when I use Range("B6:B8"), Range("A6:A8"),
but when I change the ranges to start at B1 and A1 (Range("B1:B8"), Range("A1:A8")), it returns the value.
My question is: how can I get the values from the desired range?
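The behaviour described is consistent with an indexing mismatch in betterSearch: cell.Row is the row number on the worksheet, while B.Cells(n, 1) counts n from the top of range B. The Python below is a small model of that effect, not the VBA itself; the `sheet` dictionary and both function names are hypothetical.

```python
# Model of the worksheet: (row, column letter) -> value; all other cells are empty
sheet = {
    (6, "A"): 54, (6, "B"): "a",
    (7, "A"): 55, (7, "B"): "b",
    (8, "A"): 56, (8, "B"): "c",
}


def search_with_sheet_row(ref, first_row, last_row):
    """Mimics B.Cells(cell.Row, 1): the sheet row is used as an offset into the range."""
    for row in range(first_row, last_row + 1):
        if sheet.get((row, "B")) == ref:
            # Counting `row` cells from the top of a range that starts at first_row
            target_row = first_row + row - 1
            return sheet.get((target_row, "A"))  # None models an Empty cell
    return "Not found"


def search_with_relative_row(ref, first_row, last_row):
    """Reads the value on the same sheet row as the match."""
    for row in range(first_row, last_row + 1):
        if sheet.get((row, "B")) == ref:
            return sheet.get((row, "A"))
    return "Not found"


print(search_with_sheet_row("b", 6, 8))     # None (reads A12, which is empty)
print(search_with_sheet_row("b", 1, 8))     # 55 (range starts at row 1, so offsets agree)
print(search_with_relative_row("b", 6, 8))  # 55
```

If this is indeed the cause, one possible change in the VBA function would be to index relative to the start of the search range, for example B.Cells(cell.Row - A.Row + 1, 1).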
Expressing matches via VBA
I like to know if there (are) any possibilities to convert this formula.
=INDEX(F2:F10,MATCH(TRUE,INDEX(D2:D10=A2,0),0))
So "reverse VLookUp" in the title simply meant expressing the (single) formula result via VBA (btw, I stuck to the cell references in the OP, as you mention different range addresses in comments).
This can be done by simple evaluation to give you a starting idea:
'0) define formula string
Dim BaseFormula As String
BaseFormula = "=INDEX($F$2:$F$10,MATCH(TRUE,INDEX($D$2:$D$10=$A2,0),0))"
'1) display single result in VB Editor's immediate
Dim result
result = Evaluate(BaseFormula)
Debug.Print IIf(IsError(result), "Not found!", result)
On the other hand, it seems that you intend to extend the search string range
from A2 to more inputs (e.g. down to cell A4). The base formula wouldn't return a results array in that case,
but you could proceed as follows by copying the start formula over e.g. 3 rows (note the relative row in ...=$A2, which allows the row to increment in the following rows):
'0) define formula string
Dim BaseFormula As String
BaseFormula = "=INDEX($F$2:$F$10,MATCH(TRUE,INDEX($D$2:$D$10=$A2,0),0))"
'2) write result(s) to any (starting) target cell
'a)Enter formulae extending search cells over e.g. 3 rows (i.e. from $A2 to $A4)
Sheet3.Range("H2").Resize(3).Formula2 = BaseFormula
'b) optional overwriting all formulae, if you prefer values instead
'Sheet3.Range("H2").Resize(3).Value = Sheet3.Range("H2").Resize(3).Value
Of course you can modify the formula string by any dynamic replacements (e.g. via property .Address(True,True,External:=True) applied to some predefined ranges to obtain absolute fully qualified references in this example).
Some explanations to the used formulae
The formula in the cited link
=INDEX(F2:F10,MATCH(TRUE,INDEX(D2:D10=A2,0),0))
describes a way to avoid an inevitable #NA error when matching strings with more than 255 characters directly.
Basically it is "looking up A2 in D2:D10 and returning a result from F2:F10" similar to the (failing) direct approach in such cases:
=INDEX(F2:F11,MATCH(A2,D2:D11,FALSE))
The trick is to offer a set of True|False elements (INDEX(D2:D10=A2,0))
which can then be matched without problems for an occurrence of True.
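The True|False trick can be modelled in a few lines of Python. This is only a sketch with hypothetical names (`lookup_exact`, `lookup_col`, `return_col`) and made-up data; unlike Excel's `=` comparison, Python's `==` is case-sensitive.

```python
def lookup_exact(key, lookup_col, return_col):
    """Return the item of return_col at the first position where lookup_col equals key."""
    # Build the list of True/False flags, like INDEX(D2:D10=A2,0)
    flags = [cell == key for cell in lookup_col]
    if True not in flags:  # MATCH(TRUE, ...) would give #N/A here
        return None
    # Position of the first True, then read the same position in the return column
    return return_col[flags.index(True)]


long_key = "x" * 300  # longer than 255 characters
lookup_col = ["short", long_key, "other"]
return_col = ["first", "second", "third"]
print(lookup_exact(long_key, lookup_col, return_col))  # second
```

Because the lookup value is only ever compared for equality, its length does not matter; what gets matched is the flag True.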
Full power by Excel/MS 365
If, however, you have Excel/MS 365 available, you might even use the following much simpler function instead
and profit from the dynamic display of results in a so-called spill range.
That means that matches can be based not only on one search string, but on several (e.g. A1:A2),
which seems to solve your additional issue (cf. last sentence in OP) of extending the search range as well.
=XLOOKUP(A1:A2,D2:D10,F2:F10,"Not found")
I have the following data:
Example:
A B C
EmployeeID EmployeeName EmployeeSalary
-------------------------------------------
E101 JAK 20000
E102 SAM 25000
E103 John 20000
E104 Shawn 30000
I type an employee's salary in cell H1, and the cells below (H2, I2, J2) should list the details of the employees with that salary.
I have used VLOOKUP function for this.
For cell H2:
=IFERROR(VLOOKUP(H1,C2:A5,1,FALSE),"EmployeeID not found")
For cell I2:
=IFERROR(VLOOKUP(H1,C2:B4,2,FALSE),"EmployeeName not found")
For cell J2:
=IFERROR(VLOOKUP(H1,C2:C4,3,FALSE),"EmployeeSalary not found")
Note: The above works fine when there is a single result, but when I enter 20000 it shows only one record, NOT all records that meet the given criterion.
There are three ways to deal with this:
First the Formula:
I set up the Field as such which will become apparent with another method:
So in J4 I put the following formula:
=IFERROR(AGGREGATE(14,6,$C$2:INDEX(C:C,MATCH(1E+99,C:C))/($C$2:INDEX(C:C,MATCH(1E+99,C:C))=$H$2),ROW(1:1)),"")
In H4 I put:
=IF($J4<>"",INDEX(A$2:INDEX(A:A,MATCH(1E+99,$C:$C)),AGGREGATE(15,6,(ROW($C$2:INDEX($C:$C,MATCH(1E+99,$C:$C)))-1)/($C$2:INDEX($C:$C,MATCH(1E+99,$C:$C))=$J4),COUNTIF($J$4:$J4,$J4))),"")
Which I then drag across to I4. Then drag all three formulas down till you are sure you have covered all the possible results.
This is a non-CSE array formula. Array formula calculation time grows quickly with the size of the referenced range, so we need to limit the range to the minimum needed. Each INDEX($C:$C,MATCH(1E+99,$C:$C)) finds the last cell with numeric data in column C and sets it as the end of the range.
The IFERROR() wrapper on the first allows the formula to be copied down further than the list will return and avoid the #N/A. In the picture the formulas occupy the first 8 rows.
Second we use the Advanced Filter:
First we set up the area around H1 like this:
Then we navigate to Advanced Filter which is on the Data tab. This window pops open:
Then we enter the Information:
Mark the Copy to another location.
List Range is $A$1:$C$5
Criteria Range is $H$1:$H$2
Copy to range is $H$3:$J$3
Then hit okay.
The third is vba which mimic the Advanced Filter:
Sub atfilt()
    Dim ws As Worksheet
    Dim rng As Range
    Dim critrng As Range
    Dim cpytorng As Range
    Dim lstrow As Long

    Set ws = Sheets("Sheet9")
    lstrow = ws.Range("A" & ws.Rows.Count).End(xlUp).Row
    Set rng = ws.Range("A1:C" & lstrow)
    Set critrng = ws.Range("H1:H2")
    Set cpytorng = ws.Range("H3:J3")
    rng.AdvancedFilter Action:=xlFilterCopy, CriteriaRange:=critrng, CopyToRange:=cpytorng, Unique:=False
End Sub
Each has their disadvantages:
Formula: If the data set is large (1,000 rows or more), the calculations will be slow.
Advanced Filter: Each step must be redone each time a new filter is wanted. It is not automatic.
VBA: It is VBA and requires a certain understanding on how to use it.
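Whichever of the three methods is used, the result is the same: every row whose salary equals the criterion, not just the first one. A minimal Python sketch of that outcome, using the sample data from the question (the function name `rows_with_salary` is hypothetical):

```python
# Sample data from the question: (EmployeeID, EmployeeName, EmployeeSalary)
employees = [
    ("E101", "JAK", 20000),
    ("E102", "SAM", 25000),
    ("E103", "John", 20000),
    ("E104", "Shawn", 30000),
]


def rows_with_salary(rows, salary):
    """Return all rows whose salary equals the criterion, in their original order."""
    return [row for row in rows if row[2] == salary]


for row in rows_with_salary(employees, 20000):
    print(row)
# ('E101', 'JAK', 20000)
# ('E103', 'John', 20000)
```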
I agree with @Scott Craner's comment that autofilter would be great here to allow you to find multiple values based on a criterion. Unfortunately (maybe someone can fill this bit in :)) I don't know much about autofilter in VBA for this purpose (I have only used it once or twice).
I can tell you about left lookups with INDEX(MATCH()) which should work in place of your VLOOKUPS.
Format:
INDEX("column of values to return",MATCH("lookup value","column to find lookup value", 0))
so in your example for cell H2 you could use:
=IFERROR(INDEX($A:$A,MATCH(H1,$C:$C,0)),"EmployeeID not found")
Note the "0" in the formula is to find an exact match!
First time posting for me and hoping to get some help with VBA for selective hardcoding.
I currently have a column into which a formula is set which returns either blank or a variety of text strings (the status of our company's orders).
I need to make a macro that looks into all the cells of that column and copy/pastes as value into that same cell only if the formula in that cell returns text string "Received". It should not affect the other cells where the formula is returning either blank or a different text string.
Would really appreciate your help. Please let me know if you need more info.
Thanks in advance,
Olivier
Put the following in the VBA project of your workbook:
Option Compare Text

Sub replaceThem()
    Dim r As Range
    Dim c
    Set r = Range("B1:B3") ' use the actual range here
    For Each c In r
        If c.Value = "Received" Then c.Formula = "Received"
    Next
End Sub
This will do what you asked. c.Value returns the value of the formula in the cell c, c.Formula replaces the formula. The Option Compare Text makes the comparison case-insensitive.
The two columns look like on this image.
When I want to show only the cells which contain a letter 'b', I can no longer see the text "Title1" and "Title2" which is normally visible in the column B.
I guess although the cells in column B are merged, the text is still bound to A3, respectively to A7.
So how can I at the same time filter the visible content and preserve the merged text? In simple words, I want to filter content by letter 'b' and I still want to see the text "title 1/2" in the column B.
You tagged excel so here is a solution in excel:
You need to click on that column with the merged cells and unmerge all cells.
Then you need to put this formula at the top of your list and enter it with Ctrl+Shift+Enter (this will enter it as an array formula):
=OFFSET(C3,MAX(IF(NOT(ISBLANK(C$3:C3)),ROW(C$3:C3),0))-ROW(C3),0)
Then you need to autofill that down. (This function seems a little verbose, but I just got it online; there is probably a simpler way to do this. It finds the last nonblank cell in a range.)
I think openoffice has similar functions so you should be able do the same or something similar in openoffice.
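What the formula achieves after unmerging is a "fill down": each row takes the value of the last nonblank cell at or above it. A minimal Python sketch of that idea (the function name `fill_down` and the sample column are hypothetical):

```python
def fill_down(values):
    """Replace each blank entry with the last nonblank entry above it."""
    filled = []
    last = ""  # last nonblank value seen so far
    for value in values:
        if value != "":
            last = value
        filled.append(last)
    return filled


# Column as it looks after unmerging: the title sits only in the first cell of each block
column = ["Title1", "", "", "", "Title2", ""]
print(fill_down(column))
# ['Title1', 'Title1', 'Title1', 'Title1', 'Title2', 'Title2']
```

With the title repeated on every row, filtering on another column no longer hides it.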
Alternatively if you are using excel you could click on the column you want to unmerge and run this macro:
Sub UnMergeSelectedColumn()
    Dim C As Range, CC As Range
    Dim MA As Range, RepeatVal As Variant
    For Each C In Range(ActiveCell, Cells(Rows.Count, ActiveCell.Column).End(xlUp))
        If C.MergeCells = True Then
            Set MA = C.MergeArea
            If RepeatVal = "" Then RepeatVal = C.Value
            MA.MergeCells = False
            For Each CC In MA
                CC.Value = RepeatVal
            Next
        End If
        RepeatVal = ""
    Next
End Sub
Good Luck.
EDIT:
I found a Non-VBA solution that will work in both excel and openoffice and doesn't require you to enter it as an array formula(with ctrl+shift+enter):
=INDEX(B:B,ROUND(SUMPRODUCT(MAX((B$1:B1<>"")*(ROW(B$1:B1)))),0),1)
In open office I think you want to enter it like this:
=INDEX(B:B;ROUND(SUMPRODUCT(MAX((B$1:B2<>"")*(ROW(B$1:B2))));0);1)
or maybe like this:
=INDEX(B:B;ROUND(SUMPRODUCT(MAX((B$1:B2<>"")*(ROW(B$1:B2))));0))
You just need to autofill that formula down:
Your main problem seems to be the one "blank row" that you have left after the filter fields.
Remove it, and it will work fine.