How to use VLOOKUP in Excel

I have a sheet that looks something like this:
A   B   C   D
1   2   2
2   3   3
4   5   5
5   7   9
        10
        11
        12
I would like column D to show the value from column A whenever the value in column B exists anywhere in column C.
Example:
A B C D
1 2 2 1
5 7 9 -
In row 1, D would have a value of 1, since the column B value (2) is in column C. In row 4, D would have no value at all, since 7 is not in column C.
As asked in the comments: yes, A, B, C and D are column labels.

You don't need VLOOKUP here. I think MATCH is a better choice.
Try this:
D1:D4 =IF(ISERROR(MATCH(B1,$C$1:$C$7,0)),"",A1)
(This assumes that your numerical values start in row 1.)
The output looks like this:
+---+---+---+----+---+
| | A | B | C | D |
+---+---+---+----+---+
| 1 | 1 | 2 | 2 | 1 |
| 2 | 2 | 3 | 3 | 2 |
| 3 | 4 | 5 | 5 | 4 |
| 4 | 5 | 7 | 9 | |
| 5 | | | 10 | |
| 6 | | | 11 | |
| 7 | | | 12 | |
+---+---+---+----+---+
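For readers who want to check the logic outside Excel, here is a minimal Python sketch of what the MATCH formula does. The function name `lookup_column_d` and the list variables are made up for illustration; they are not part of the original answer.

```python
def lookup_column_d(col_a, col_b, col_c, missing=""):
    """Return A[i] when B[i] occurs anywhere in column C, else `missing`."""
    c_values = set(col_c)  # plays the role of MATCH(..., $C$1:$C$7, 0)
    return [a if b in c_values else missing for a, b in zip(col_a, col_b)]


# Sample data copied from the question (rows 1 to 4 for A and B, rows 1 to 7 for C)
col_a = [1, 2, 4, 5]
col_b = [2, 3, 5, 7]
col_c = [2, 3, 5, 9, 10, 11, 12]

col_d = lookup_column_d(col_a, col_b, col_c)
print(col_d)  # [1, 2, 4, '']
```

The empty string in the last position corresponds to the blank cell in D4, because 7 does not appear in column C.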

You can do this with a combination of VLOOKUP, OFFSET and IFERROR, like so:
=IFERROR(IF(VLOOKUP(B2,C:C,1,0)=B2,OFFSET(B2,0,-1)),"-")
OFFSET used with a column offset of -1 returns the cell one column to the left, so you do not need to rearrange the columns in your actual worksheet. IFERROR catches a failed lookup and returns the specified default value instead. Finally, you can also specify the exact range to be looked up, in this case:
VLOOKUP(B2,$C$2:$C$8,1,0)

Related

filter and get rows between the conditions in a dataframe

My DataFrame looks something like this:
+----------------------------------+---------+
| Col1 | Col2 |
+----------------------------------+---------+
| Start A | 1 |
| value 1 | 2 |
| value 2 | 3 |
| value 3 | 4 |
| value 5 | 5 |
| End A | 6 |
| value 6 | 3 |
| value 7 | 4 |
| value 8 | 5 |
| Start B | 1 |
| value 1 | 2 |
| value 2 | 3 |
| value 3 | 4 |
| value 5 | 5 |
| End B | 6 |
| value 6 | 3 |
| value 7 | 4 |
| value 8 | 5 |
| Start C | 1 |
| value 1 | 2 |
| value 2 | 3 |
| value 3 | 4 |
| value 5 | 5 |
| End C | 6 |
+----------------------------------+---------+
What I am trying to achieve: if the substrings Start and End are present, I want those rows and the rows between them.
Expected Result is:
+----------------------------------+---------+
| Col1 | Col2 |
+----------------------------------+---------+
| Start A | 1 |
| value 1 | 2 |
| value 2 | 3 |
| value 3 | 4 |
| value 5 | 5 |
| End A | 6 |
| Start B | 1 |
| value 1 | 2 |
| value 2 | 3 |
| value 3 | 4 |
| value 5 | 5 |
| End B | 6 |
| Start C | 1 |
| value 1 | 2 |
| value 2 | 3 |
| value 3 | 4 |
| value 5 | 5 |
| End C | 6 |
+----------------------------------+---------+
I tried the code from this question: How to filter dataframe columns between two rows that contain specific string in column?
m = df['To'].isin(['Start A', 'End A']).cumsum().eq(1)
df[m|m.shift()]
But this only returns the first Start/End set, and it expects the exact strings.
Output:
+----------------------------------+---------+
| Col1 | Col2 |
+----------------------------------+---------+
| Start A | 1 |
| value 1 | 2 |
| value 2 | 3 |
| value 3 | 4 |
| value 5 | 5 |
| End A | 6 |
+----------------------------------+---------+
The answer you linked to was designed to work with a single pair of Start/End.
A more generic variant of it would be to check for the parity of the group (assuming strictly alternating Start/End):
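# True from each Start up to (but not including) the matching End
m1 = df['Col1'].str.match(r'Start|End').cumsum().mod(2).eq(1)
# boolean indexing; the shifted mask adds the End rows
out = df[m1 | m1.shift(fill_value=False)]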
Alternatively, use each Start as a flag to keep the following rows and each End as a flag to drop them. Unlike the nice answer of @Quang, however, this doesn't take the A/B/C letter after Start/End into account:
# extract Start/End
s = df['Col1'].str.extract(r'^(Start|End)', expand=False)
# flag 1 at each Start and 0 at each End, forward-fill, then test for 1
m1 = s.map({'Start': 1, 'End': 0}).ffill().eq(1)
# boolean slicing; the shifted mask adds the End rows
out = df[m1 | m1.shift(fill_value=False)]
Output:
Col1 Col2
0 Start A 1
1 value 1 2
2 value 2 3
3 value 3 4
4 value 5 5
5 End A 6
9 Start B 1
10 value 1 2
11 value 2 3
12 value 3 4
13 value 5 5
14 End B 6
18 Start C 1
19 value 1 2
20 value 2 3
21 value 3 4
22 value 5 5
23 End C 6
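The flag idea can also be traced without pandas. The following is only a sketch in plain Python: the function name `rows_between_markers` and the shortened sample rows are illustrative assumptions, not taken from the answer.

```python
def rows_between_markers(rows, start="Start", end="End"):
    """Keep each Start row, its matching End row, and everything between them."""
    kept = []
    inside = False
    for label, value in rows:
        if label.startswith(start):
            inside = True          # a Start switches the flag on
        if inside:
            kept.append((label, value))
        if label.startswith(end):
            inside = False         # an End switches it off after being kept
    return kept


# Shortened version of the question's data
rows = [
    ("Start A", 1), ("value 1", 2), ("End A", 6),
    ("value 6", 3),
    ("Start B", 1), ("value 2", 3), ("End B", 6),
]
kept = rows_between_markers(rows)
print([label for label, _ in kept])
```

As with the pandas version, the letter after Start/End is ignored, so this assumes the markers strictly alternate.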
Let's try:
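# extract the label after `Start`/`End`
# (a non-capturing group is needed here: `[Start|End]` would be a character class)
groups = df['Col1'].str.extract(r'^(?:Start|End) (.*)', expand=False)
# keep rows whose forward-filled and backward-filled labels agree
df[groups.bfill() == groups.ffill()]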
Output:
Col1 Col2
0 Start A 1
1 value 1 2
2 value 2 3
3 value 3 4
4 value 5 5
5 End A 6
9 Start B 1
10 value 1 2
11 value 2 3
12 value 3 4
13 value 5 5
14 End B 6
18 Start C 1
19 value 1 2
20 value 2 3
21 value 3 4
22 value 5 5
23 End C 6
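To see why comparing the forward fill with the backward fill works, here is a small plain-Python sketch of the same idea. The helpers `extract_labels`, `ffill` and `bfill` are hand-written stand-ins for the pandas methods, and the sample column is shortened and invented for illustration.

```python
import re


def extract_labels(col):
    """Return the text after 'Start ' or 'End ', or None for other rows."""
    pattern = re.compile(r"^(?:Start|End) (.*)")
    labels = []
    for text in col:
        m = pattern.match(text)
        labels.append(m.group(1) if m else None)
    return labels


def ffill(values):
    """Propagate the last non-None value forward."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def bfill(values):
    """Propagate the next non-None value backward."""
    return ffill(values[::-1])[::-1]


col1 = ["value 0", "Start A", "value 1", "End A", "value 6",
        "Start B", "value 2", "End B", "value 9"]
labels = extract_labels(col1)
# A row is kept only when the nearest label above and below agree
mask = [f is not None and f == b for f, b in zip(ffill(labels), bfill(labels))]
kept = [text for text, keep in zip(col1, mask) if keep]
print(kept)
```

Rows between `End A` and `Start B` see the label A above and B below, so they are dropped; rows inside a block see the same label in both directions.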
One option is with an interval index:
Get the positions of the starts and ends:
starts = df.Col1.str.startswith("Start").to_numpy().nonzero()[0]
ends = df.Col1.str.startswith("End").to_numpy().nonzero()[0]
Build an interval index, and get matches where the index lies between Start and End:
intervals = pd.IntervalIndex.from_arrays(starts, ends, closed='both')
intervals = intervals.get_indexer(df.index)
Filter the original dataframe with the intervals, where intervals are not less than 0:
df.loc[intervals >= 0]
Col1 Col2
0 Start A 1
1 value 1 2
2 value 2 3
3 value 3 4
4 value 5 5
5 End A 6
9 Start B 1
10 value 1 2
11 value 2 3
12 value 3 4
13 value 5 5
14 End B 6
18 Start C 1
19 value 1 2
20 value 2 3
21 value 3 4
22 value 5 5
23 End C 6

MS Excel's alternative for ={A:A} formula in Google Sheets

This must be a simple thing to do, but somehow I am unable to find an answer to this question. In Google Sheets, if you want to reference an entire column (e.g. column A), you put ={A:A} and the entire column is referenced. How do you achieve a similar thing in MS Excel?
EDIT (a specific example was requested in the comments):
Let's assume the Google Sheet contains the following data:
| A | B | C |
| 1 | 5 | 9 |
| 2 | 6 | 0 |
| 3 | 7 | 9 |
| 4 | 8 | 0 |
Now if in cell D1 I type ={A:A}, the entire column A will be shown in column D.
| A | B | C | D |
| 1 | 5 | 9 |={A:A}
| 2 | 6 | 0 |
| 3 | 7 | 9 |
| 4 | 8 | 0 |
becomes
| A | B | C | D |
| 1 | 5 | 9 | 1 |
| 2 | 6 | 0 | 2 |
| 3 | 7 | 9 | 3 |
| 4 | 8 | 0 | 4 |
I don't have to drag the formula to the bottom or anything. It just shows the entire column.
How do I do the exact same thing in Excel?
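It depends on how the reference is used. Inside a function, Excel accepts a whole-column reference, for example:
=COUNTIF(A:A,"gold")
Excel does not, however, support open-ended references like:
=COUNTIF(A12:A,"gold")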

Assigning ranks to items that vary in order

I am trying to build a dataset from an online questionnaire. In this questionnaire, participants were asked to name 6 items. These items are represented with numbers from 1 to 6 (order of mention does not matter). Afterwards, participants were asked to rank those items from most important to least important (order here matters). Right now I have three columns: "Named items", "Items ranked" and "Rank". The last column holds the position at which each item was ranked. Thus, the idea is to take the number in the first column, "Named items", find its position in the second column, "Items ranked", and write that position to the corresponding row of the third column.
Since the numbers go from 1 to 6, the process has to start again every six rows. I have a total of 186 participants, which means there is a total of 1116 items. What would be the most efficient way of doing this while preventing human error?
Here is an example of what the sheet looks like when filled in manually:
+----------------------+-----------------------------+------+
| Order of named items | Items ranked (# = Identity) | Rank |
+----------------------+-----------------------------+------+
| 1 | 2 | 4 |
| 2 | 5 | 1 |
| 3 | 6 | 6 |
| 4 | 1 | 5 |
| 5 | 4 | 2 |
| 6 | 3 | 3 |
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
| 5 | 5 | 5 |
| 6 | 6 | 6 |
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
| 5 | 5 | 5 |
| 6 | 6 | 6 |
| 1 | 5 | 3 |
| 2 | 6 | 4 |
| 3 | 1 | 5 |
| 4 | 2 | 6 |
| 5 | 3 | 1 |
| 6 | 4 | 2 |
| 1 | 2 | 2 |
| 2 | 1 | 1 |
| 3 | 6 | 4 |
| 4 | 3 | 5 |
| 5 | 4 | 6 |
| 6 | 5 | 3 |
+----------------------+-----------------------------+------+
You can use this non-volatile formula:
=MATCH(A2,INDEX(B:B,INT((ROW(1:1)-1)/6)*6+2):INDEX(B:B,INT((ROW(1:1)-1)/6)*6+7),0)
Assuming the first column starts at A2 and the second column at B2, use this formula in C2 and copy it down:
=MATCH(A2,OFFSET(B$2,6*INT((ROWS(C$2:C2)-1)/6),0,6),0)
OFFSET returns the required 6-cell range, and MATCH finds the position of the relevant item within it.
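The block-wise lookup that both formulas perform can be sketched in Python as follows. This is only an illustration of the logic: the function name `ranks_in_blocks` and its `block_size` parameter are assumptions, and the sample data is the first and fourth participant from the question's table.

```python
def ranks_in_blocks(named, ranked, block_size=6):
    """For each named item, return its 1-based position within the
    ranked items of the same participant (one block of rows)."""
    if len(named) != len(ranked):
        raise ValueError("both columns must have the same length")
    ranks = []
    for start in range(0, len(named), block_size):
        block = ranked[start:start + block_size]
        # item -> position inside this participant's block
        position = {item: pos for pos, item in enumerate(block, start=1)}
        for item in named[start:start + block_size]:
            ranks.append(position[item])
    return ranks


named = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
ranked = [2, 5, 6, 1, 4, 3, 5, 6, 1, 2, 3, 4]
ranks = ranks_in_blocks(named, ranked)
print(ranks)  # [4, 1, 6, 5, 2, 3, 3, 4, 5, 6, 1, 2]
```

The integer division in the formulas, `INT((row-1)/6)`, serves the same purpose as stepping through the rows in slices of six here.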

Ordering the third column in excel according to first column (and fill blanks)

I have three columns. The first two columns have identical values; the third one has some values missing (in the example below, 2 and 4 are missing).
So how can I "order" these:
+---+---+---+
| A | B | C |
+---+---+---+
| 1 | 1 | 1 |
| 2 | 2 | 3 |
| 3 | 3 | 5 |
| 4 | 4 | |
| 5 | 5 | |
+---+---+---+
To become:
+---+---+---+
| A | B | C |
+---+---+---+
| 1 | 1 | 1 |
| 2 | 2 | 0 |
| 3 | 3 | 3 |
| 4 | 4 | 0 |
| 5 | 5 | 5 |
+---+---+---+
As you can see, the missing values should be filled with zero.
The numbers above are unique (i.e. I can't have two 4's in the same column). So how can I get each value in column C to sit right next to the same value in column B, with the empty cells filled with zero?
Assuming you have row 1 as headers...
Create a new column D with this formula in D2:
=IFERROR(VLOOKUP(A2,C:C,1,FALSE),0)
Drag it down, then copy column D, paste it back with Paste Special > Values, and delete column C.
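The effect of the IFERROR/VLOOKUP formula can be sketched in Python like this. The function name `align_to_reference` is made up for the example, and the data is taken from the question's tables.

```python
def align_to_reference(reference, present, fill=0):
    """For each reference value, return it if it exists in `present`,
    otherwise `fill` (the role IFERROR(..., 0) plays in the formula)."""
    available = set(present)
    return [value if value in available else fill for value in reference]


col_a = [1, 2, 3, 4, 5]
col_c = [1, 3, 5]  # values 2 and 4 are missing

new_col = align_to_reference(col_a, col_c)
print(new_col)  # [1, 0, 3, 0, 5]
```

This relies on the values being unique, as stated in the question.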

Excel Creating a List from Beginning and End number AND tags

I am trying to create a list from an index of grouped values.
This is very similar to this question; however, my groups also have "tags" on them that complicate the listings.
Here is an example of my INDEX tab:
| A | B | C | D |
-------------------------
1 | 1 | 1 | 1 | CV |
2 | 1 | 2 | 2 | IS |
3 | 1 | 3 | 3 | IS |
4 | 2 | 4 | 5 | GN |
5 | 2 | 6 | 7 | PS |
6 | 4 | 8 | 11 | SQ |
7 | 2 | 12 | 13 | SS |
8 | 1 | 14 | 14 | AT |
9 | 15 | 15 | 29 | AT |
10| 4 | 30 | 33 | TYP |
Where A is the number of pages, B is the first page, C is the last page and D is the tag. I would also like to add columns such that I can keep a running tally of the tags.
| A | B | C | D | E | F |
---------------------------------------
1 | 1 | 1 | 1 | CV | CV1 | CV1 |
2 | 1 | 2 | 2 | IS | IS1 | IS1 |
3 | 1 | 3 | 3 | IS | IS2 | IS2 |
4 | 2 | 4 | 5 | GN | GN1 | GN2 |
5 | 2 | 6 | 7 | PS | PS1 | PS2 |
6 | 4 | 8 | 11 | SQ | SQ1 | SQ4 |
7 | 2 | 12 | 13 | SS | SS1 | SS2 |
8 | 1 | 14 | 14 | AT | AT1 | AT1 |
9 | 15 | 15 | 29 | AT | AT2 | AT16 |
10| 4 | 30 | 33 | TYP | TYP1 | TYP4 |
Note that the tag could occur multiple times and it may not be in sequential rows.
Here is what I want this to look like for my LIST tab:
| A |
---------
1 | CV1 |
2 | IS1 |
3 | IS2 |
4 | GN1 |
5 | GN2 |
6 | PS1 |
7 | PS2 |
8 | SQ1 |
9 | SQ2 |
10| SQ3 |
11| SQ4 |
and so on...
How do I add the additional columns to the INDEX tab via formulas?
How do I create the LIST via formulas? (...is this even possible?)
The formulas should be pretty simple to write. Just consider what you're trying to accomplish.
Your first formula (in column E) is just taking a running count of the tags (in column D). So you want to count all cells from the first tag up to the corresponding tag where the tag names are the same. That count is to be appended to the tag name.
=$D1 & COUNTIF($D$1:$D1, $D1)
The second formula (in column F) is just taking a running sum of the page counts (in column A). So you want to take the sum of all corresponding page counts from the first tag up to the corresponding tag where the tag names are the same. The sum is to be appended to the tag name.
=$D1 & SUMIF($D$1:$D1, $D1, $A$1:$A1)
Note that the column doesn't change, nor do the starting rows of the ranges (hence the need for absolute references). The only things that change are the row of the tag and the end row of the range.
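As a cross-check of the two formulas, here is a short Python sketch that computes the same running count and running page sum per tag. The function name `running_tags` and the `(page_count, tag)` row layout are assumptions made for the example; the data is the INDEX tab from the question.

```python
from collections import defaultdict


def running_tags(rows):
    """rows: (page_count, tag) pairs in sheet order.
    Returns (column E, column F) pairs: tag + running count,
    and tag + running sum of page counts."""
    counts = defaultdict(int)   # like COUNTIF($D$1:$D1, $D1)
    pages = defaultdict(int)    # like SUMIF($D$1:$D1, $D1, $A$1:$A1)
    out = []
    for page_count, tag in rows:
        counts[tag] += 1
        pages[tag] += page_count
        out.append((f"{tag}{counts[tag]}", f"{tag}{pages[tag]}"))
    return out


index_rows = [(1, "CV"), (1, "IS"), (1, "IS"), (2, "GN"), (2, "PS"),
              (4, "SQ"), (2, "SS"), (1, "AT"), (15, "AT"), (4, "TYP")]
for e, f in running_tags(index_rows):
    print(e, f)
```

The last two AT rows come out as `AT1 AT1` and `AT2 AT16`, matching the table in the question.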
I don't think it would be possible to generate that list through simple formulas. As far as I know, formulas need to have a 1-to-1 correspondence with another range. Here a single row can yield multiple values, so a formula just won't cut it. You'll need to write a VBA script to generate that.
Sub GenerateList()
    Dim usedRange As Range
    Dim count As Dictionary
    Set usedRange = Worksheets("Index").usedRange
    ' Total page count per tag, keyed by tag name
    Set count = CountValues(usedRange)
    Dim output As Range
    Dim row As Long
    Dim key As Variant
    Set output = Worksheets("List").Columns("A").Rows
    output.ClearContents
    row = 1
    ' Write tag1..tagN for each tag, where N is its total page count
    For Each key In count.Keys()
        Dim i As Long
        For i = 1 To count(key)
            output(row) = key & i
            row = row + 1
        Next i
    Next key
End Sub
Function CountValues( _
        usedRange As Range, _
        Optional tagsColumn As String = "D", _
        Optional valuesColumn As String = "A") As Dictionary
    Dim tags As Range
    Dim values As Range
    Set tags = usedRange.Columns(tagsColumn).Rows
    Set values = usedRange.Columns(valuesColumn).Rows
    Dim map As New Dictionary
    Dim tag As Range
    ' Sum the page counts (values column) for each tag (tags column)
    For Each tag In tags
        map(tag.Value) = map(tag.Value) + values(tag.row)
    Next tag
    Set CountValues = map
End Function
This uses a Dictionary, so you'll have to add a reference to the Microsoft Scripting Runtime.
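For comparison, the same expansion can be sketched in a few lines of Python. This is not a replacement for the VBA inside Excel, only an illustration of the logic; the function name `expand_tags` and the `(page_count, tag)` row layout are assumptions.

```python
def expand_tags(rows):
    """rows: (page_count, tag) pairs in sheet order.
    Returns tag1..tagN for each tag, where N is the tag's total page count,
    with tags in order of first appearance."""
    totals = {}  # dicts preserve insertion order in Python 3.7+
    for page_count, tag in rows:
        totals[tag] = totals.get(tag, 0) + page_count
    return [f"{tag}{i}" for tag, total in totals.items()
            for i in range(1, total + 1)]


index_rows = [(1, "CV"), (1, "IS"), (1, "IS"), (2, "GN"), (2, "PS"),
              (4, "SQ"), (2, "SS"), (1, "AT"), (15, "AT"), (4, "TYP")]
tag_list = expand_tags(index_rows)
print(tag_list[:11])
```

The first eleven entries are CV1, IS1, IS2, GN1, GN2, PS1, PS2, SQ1, SQ2, SQ3 and SQ4, matching the LIST tab shown in the question.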
It sounds like you're just trying to get a list of "Unique Values" on a separate sheet that you can use as your list. Try these pages, there are multiple VBA methods to paste unique items in a range.
Also, Advanced Filter has an option to paste unique values to another location. So none of your repeat tags would appear in this list, only unique ones for your "LIST" tab.
Anyway, not sure if that's what you're wanting, but the question was a smidge vague.
Links here:
Create Unique list
Create Unique list 2

Resources