I have a column whose entries follow a format like the examples below:
ABC - Adam Smith (T) (ABCadasmi)
ABC - John Carter (V) (ABCjohcar)
I'm looking to extract the "ABCadasmi" and "ABCjohcar" strings from these entries. Is there an Excel formula that can do this?
I have a pandas dataframe with a name column as below
name
Dr. Maso Guilani
Paul Dupey
Mrs. Sarah Kant
Cathay Pane
Canine Paul
I want to remove prefixes such as "Dr." and "Mrs." from the "name" column.
I tried the following:
df['name']=df.name.replace({"Mrs.": ""},regex=True).replace({"Dr.": ""},regex=True)
But I want to generalize this, as I am not sure how many prefixes like "Dr." and "Mrs." occur
in the huge dataset. Basically, I want to remove every prefix that ends with a dot. Thanks.
Expected output:
name
Maso Guilani
Paul Dupey
Sarah Kant
Cathay Pane
Canine Paul
With your shown samples, please try the following, which uses the pandas str.replace function. The regex matches lazily from the start of the value up to the first dot that is followed by one or more whitespace characters, and replaces that match with an empty string in the name column. Pass regex=True explicitly, because recent pandas versions treat the pattern as a literal string by default.
df['name'].str.replace(r'^.*?\.\s+', '', regex=True)
Output will be as follows.
Maso Guilani
Paul Dupey
Sarah Kant
Cathay Pane
Canine Paul
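To see what the pattern does outside pandas, here is a minimal sketch using Python's standard re module on a few of the sample names. The extra name "Paul J. Dupey" and the stricter pattern are my own assumptions, added only to show that the lazy pattern also strips a middle initial, which may or may not be what you want.

```python
import re

# Sample names from the question, plus one hypothetical name with a middle initial.
names = ["Dr. Maso Guilani", "Paul Dupey", "Mrs. Sarah Kant", "Paul J. Dupey"]

# Pattern from the answer: lazily match from the start of the string up to
# the first dot that is followed by whitespace.
loose = re.compile(r"^.*?\.\s+")

# Hypothetical stricter variant: only remove a single leading word that ends in a dot.
strict = re.compile(r"^[A-Za-z]+\.\s+")

loose_result = [loose.sub("", name) for name in names]
strict_result = [strict.sub("", name) for name in names]

print(loose_result)
print(strict_result)
```

If names with middle initials can occur in the data, the stricter pattern is probably the safer choice.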
One way of doing this:
Using the split() and apply() methods:
df['name']=df['name'].str.split('.', n=1).apply(lambda x: x[1].strip() if len(x)>1 else x[0])
Output of df:
0 Maso Guilani
1 Paul Dupey
2 Sarah Kant
3 Cathay Pane
4 Canine Paul
Let's say I've got two tables with two columns each. In both tables, the first column holds a name and the second column holds a string of characters that follows a similar pattern. It looks like this:
Table 1
Peter xxxxx01
John xxxxx01
Bill xxxxx01
William xxxxx01
Table 2
Richard xxxxx02
John xxxxx02
Bill xxxxx02
Arthur xxxxx02
Now, I'd like to compare these two tables, find values where the names are duplicated and display data stored in second columns, just like this:
(Peter excluded)
John xxxxx01 xxxxx02
Bill xxxxx01 xxxxx02
(William, Arthur excluded)
I am familiar with pivot tables, however, it won't allow doing this.
I've also tried messing with index match formulas but without much success.
Any advice?
You can use the VLOOKUP function for this.
If your "Table 1" is in B3:C6 and your "Table 2" is in F3:G6, then you can use the following formula in D3:D6 to look up the values in Table 2:
Cell D3: =IFERROR(VLOOKUP(B3,$F$3:$G$6,2,FALSE),"")
This first looks up the name from Table 1 (cell B3) in Table 2 (F3:G6) and returns the second column of Table 2 if it finds the name. If it doesn't find the name, VLOOKUP returns an error, so we wrap it in an IFERROR function and replace any error with an empty string, which looks a bit friendlier. This results in the following table:
     A   B         C          D          E   F         G
1
2        Table 1              Result         Table 2
3        Peter     xxxxxx01                  Richard   xxxxxx02
4        John      xxxxxx01   xxxxxx02       John      xxxxxx02
5        Bill      xxxxxx01   xxxxxx02       Bill      xxxxxx02
6        William   xxxxxx01                  Arthur    xxxxxx02
You can then filter on the (Non-Blanks) in column D to only get the results you're interested in.
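For readers who want to check the logic outside Excel, the lookup-then-filter steps can be sketched in Python. This is only an illustration of the same idea, not part of the Excel solution: the dictionaries stand in for the two tables from the question, and the variable names are made up for this example.

```python
# The two tables from the question, keyed by name (hypothetical variable names).
table1 = {"Peter": "xxxxx01", "John": "xxxxx01", "Bill": "xxxxx01", "William": "xxxxx01"}
table2 = {"Richard": "xxxxx02", "John": "xxxxx02", "Bill": "xxxxx02", "Arthur": "xxxxx02"}

# For each Table 1 name, look it up in Table 2.
# The "" default plays the role of IFERROR(VLOOKUP(...), "").
lookup = {name: table2.get(name, "") for name in table1}

# Dropping the blanks corresponds to the (Non-Blanks) filter on column D.
matches = [(name, table1[name], code) for name, code in lookup.items() if code]

for row in matches:
    print(*row)
```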
I am creating helper columns to assist me in reviewing our data, but I am running across an issue with one. What I am trying to accomplish is to create a helper column that tells me, by month, what type of medications a person is prescribed, and then combines multiple selections for the same name into a new name.
A sample data set would be:
A B C
1/1/2016 Doe, John Oral
1/1/2016 Doe, John Compound
1/1/2016 Doe, John Oral
2/1/2016 Smith, Jane Oral
2/1/2016 Smith, Jane Oral
2/1/2016 Adams, Tom Compound
2/1/2016 Doe, John Oral
So, for example, if John Doe was prescribed 2 oral medications and 1 compounded medication on 1/1/2016, the helper column would work out that the three medications belong to the same person and are of two different types, and so would label them Combined. It would end up something akin to "1-Doe, John-Combined", displayed here:
D
1-Doe, John-Combined
1-Doe, John-Combined
1-Doe, John-Combined
2-Smith, Jane-Oral
2-Smith, Jane-Oral
2-Adams, Tom-Compound
2-Doe, John-Oral
So far, all I have is the concatenation by month:
=MONTH(A2)&"-"&B2&"-"
But I am not certain how to tackle the portion of the formula that will present the type of medication and combine (if required). Also, if necessary, more than one column can be created.
Thank you in advance.
Use SUMPRODUCT to test:
=MONTH(A1) & "-" & B1& "-" & IF(SUMPRODUCT((MONTH($A$1:$A$7)=MONTH(A1))*($B$1:$B$7=B1)*($C$1:$C$7<>C1))>0,"Combined",C1)
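To make the SUMPRODUCT test easier to follow, here is a rough Python sketch of the same check on the sample data: for each row, count the rows with the same month and the same name but a different medication type, and label the row "Combined" if that count is above zero. The function and variable names are invented for this illustration, and the dates are assumed to be month/day/year.

```python
from datetime import date

# Sample rows from the question: (date, name, medication type).
rows = [
    (date(2016, 1, 1), "Doe, John", "Oral"),
    (date(2016, 1, 1), "Doe, John", "Compound"),
    (date(2016, 1, 1), "Doe, John", "Oral"),
    (date(2016, 2, 1), "Smith, Jane", "Oral"),
    (date(2016, 2, 1), "Smith, Jane", "Oral"),
    (date(2016, 2, 1), "Adams, Tom", "Compound"),
    (date(2016, 2, 1), "Doe, John", "Oral"),
]

def helper_label(row, all_rows):
    """Build the helper-column text for one row (hypothetical helper)."""
    day, name, med_type = row
    # Same month, same name, different type: the three factors multiplied
    # together inside SUMPRODUCT.
    conflicting = sum(
        1
        for other_day, other_name, other_type in all_rows
        if other_day.month == day.month
        and other_name == name
        and other_type != med_type
    )
    shown_type = "Combined" if conflicting > 0 else med_type
    return f"{day.month}-{name}-{shown_type}"

labels = [helper_label(row, rows) for row in rows]
print("\n".join(labels))
```

Like the formula, this compares only the month number, so rows from the same month in different years would be grouped together.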
I have 3 columns
a b c
jon ben 2
ben jon 2
roy jack 1
jack roy 1
I'm trying to retrieve all unique permutations e.g. ben and jon = jon and ben so they should only appear once. Expected output:
a b c
jon ben 2
roy jack 1
Any ideas of a function that could do this? The order in the output does not matter. I've tried concatenating and then removing duplicates, but obviously this only considers the string order.
I've created a fourth column by joining all three columns together =a1&","&b1&","&c1 and used Excel's built-in Remove Duplicates function. This doesn't work, as the order of the strings is different.
In your fourth column, use the formula
=IF(A1<B1,A1&","&B1&","&C1,B1&","&A1&","&C1)
This joins A and B in alphabetical order, so you can then remove duplicates as you have done.
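The same idea, building a key in which the two names always appear in the same order and then dropping repeated keys, can be sketched in Python as follows. This is only an illustration with made-up variable names; it keeps the first row seen for each pair.

```python
# Rows from the question: (a, b, c).
rows = [("jon", "ben", 2), ("ben", "jon", 2), ("roy", "jack", 1), ("jack", "roy", 1)]

seen = set()
unique_rows = []
for a, b, c in rows:
    # Sorting the two names makes ("jon", "ben") and ("ben", "jon") produce
    # the same key, just as the IF(A1<B1, ...) formula does.
    key = (tuple(sorted((a, b))), c)
    if key not in seen:
        seen.add(key)
        unique_rows.append((a, b, c))

print(unique_rows)
```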
I am struggling to solve the below problem:
I have a list of users who have attended various numbers of courses. Now I want to find which courses each person has attended and list them in a new sheet. Below is a picture of my sheet:
Names | Courses
--------------------------------------------------------------------------------------
Farnaz Hossein Zadeh, Elena Pak, Mehran Behzadi, Atefeh Ghorbani, John Smith | AP01
John Smith, Farnaz Hossein Zadeh, Tom green | AP03
John Smith | AP05
And I need to get:
F G H
Farnaz Hossein Zadeh AP01 AP03
As far as I know, this is not quite possible with Excel formulas alone.
First, you need to clean up your data. You can use Data > Text to Columns to separate the comma-separated names. Then, make that data vertical so that you effectively have a list of course-student pairs. Then you can list the unique courses for each student by building a pivot table on that data.
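If doing this outside Excel is an option, the steps above (split the names, turn them into course-student pairs, then group by student) can be sketched in a few lines of Python. The variable names are assumptions for this example, and the data is the sample from the question.

```python
# Sample sheet from the question: (comma-separated names, course).
sheet = [
    ("Farnaz Hossein Zadeh, Elena Pak, Mehran Behzadi, Atefeh Ghorbani, John Smith", "AP01"),
    ("John Smith, Farnaz Hossein Zadeh, Tom green", "AP03"),
    ("John Smith", "AP05"),
]

# Steps 1 and 2: split each names cell on commas and build (student, course) pairs.
pairs = [
    (name.strip(), course)
    for names, course in sheet
    for name in names.split(",")
]

# Step 3: group the courses by student, which is what the pivot table does.
courses_by_student = {}
for student, course in pairs:
    courses_by_student.setdefault(student, []).append(course)

for student, courses in courses_by_student.items():
    print(student, *courses)
```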