Pulling name and title from strings in Python 3.5

I'm working through a list of legal notices and need to pull the name and title from the strings.
"NO. 17-1354 (1) THOMAS A. GOLDBERG ADMINISTRATOR"
"NO. 17-1355 (1) ASHLEY MARIE BAKER EXECUTOR"
"CAUSE NO. _________ TIMOTHY WIMBERLY PETITIONER"
"ERIC SMITH, PETITIONER NO. 17-1048 MLF"
I've been trying various combinations using .split(), but can't seem to find one combo to fit them all.
For each one I want to identify the name and title, so it would look like:
['THOMAS A. GOLDBERG', 'ADMINISTRATOR']
['ASHLEY MARIE BAKER', 'EXECUTOR']
etc.
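One possible approach, sketched here rather than offered as a definitive answer, is to skip .split() and search each string for "a run of capital letters followed by a known title". The TITLES list and the name_and_title helper are assumptions made for illustration; the list would need to cover every title that appears in the real notices. The pattern also relies on the name being preceded by something that is not a letter (a digit, a parenthesis, an underscore) or sitting at the start of the string, which holds for the four samples above.

```python
import re

# Assumed list of titles; extend it to cover the notices you actually have.
TITLES = ["ADMINISTRATOR", "EXECUTOR", "PETITIONER"]

# The name is a run of capitals, periods, apostrophes, hyphens and spaces,
# followed by an optional comma and then one of the known titles.
PATTERN = re.compile(
    r"([A-Z][A-Z.'\- ]*?),?\s+(" + "|".join(TITLES) + r")\b"
)


def name_and_title(notice):
    """Return [name, title] for one notice, or None if nothing matches."""
    match = PATTERN.search(notice)
    if match is None:
        return None
    return [match.group(1).strip(), match.group(2)]


notices = [
    "NO. 17-1354 (1) THOMAS A. GOLDBERG ADMINISTRATOR",
    "NO. 17-1355 (1) ASHLEY MARIE BAKER EXECUTOR",
    "CAUSE NO. _________ TIMOTHY WIMBERLY PETITIONER",
    "ERIC SMITH, PETITIONER NO. 17-1048 MLF",
]
for notice in notices:
    print(name_and_title(notice))
```

Strings that match none of the listed titles come back as None, so they can be collected and reviewed by hand.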

Related

Combining IF Statement along with CONCATENATE if it exists

I'm trying to combine IF and CONCATENATE. I'm running a report right now for my company where we grab samples from different water locations, but due to COVID-19 we aren't allowed in some specific locations and therefore have to get a water sample from a nearby hydrant.
I have all the locations and hydrants in one spreadsheet as data, and in my main tab I have an empty cell where someone may put YES or NO. If they put YES, another cell fills with the hydrant name along with the location.
My issue is that I need both pieces of data combined in one static cell if "YES" is entered, for example...
Location: LOC-3 John Street
Hydrant used?: YES
Hydrant (auto filled): LOC-3 HYDRANT 3333
Full location name (if YES): LOC-3 John Street LOC-3 Hydrant 3333
Full location name (if NO): LOC-3 John Street
Below is the formula I'm using to return the location name. I can't figure out where or how to add CONCATENATE without getting an error back. Thank you in advance for your help.
=IF(OR((AND((A6<>""),(D6<>""))),(AND((B6<>""),(D6<>"")))),IF(A6="",B6,A6),"")
(Not a complete answer, but too large for a comment)
The first part of your logical expression is quite large, so let's have a look:
[(a6<>"") AND (d6<>"")] OR [(b6<>"") AND (d6<>"")]
=[(a6<>"") OR (b6<>"")] AND (d6<>"")
=[(a6&b6) <> ""] AND (d6<>"")
Where a6&b6 has the Excel meaning (concatenation of a6 and b6).
This is already a significant simplification of your formula. You might try to simplify even further and go on from there.
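The three forms above can be checked against each other by brute force. The sketch below is in Python rather than Excel, treats each cell as a text string, and uses "x" as a stand-in for any non-empty value; the function names are made up for illustration.

```python
from itertools import product


def original(a6, b6, d6):
    # [(a6<>"") AND (d6<>"")] OR [(b6<>"") AND (d6<>"")]
    return (a6 != "" and d6 != "") or (b6 != "" and d6 != "")


def factored(a6, b6, d6):
    # [(a6<>"") OR (b6<>"")] AND (d6<>"")
    return (a6 != "" or b6 != "") and d6 != ""


def concatenated(a6, b6, d6):
    # [(a6&b6) <> ""] AND (d6<>""), with + standing in for Excel's &
    return (a6 + b6) != "" and d6 != ""


# Every combination of empty / non-empty for the three cells.
cases = list(product(["", "x"], repeat=3))
all_agree = all(
    original(*case) == factored(*case) == concatenated(*case)
    for case in cases
)
print(all_agree)
```

All eight combinations agree, so the shorter condition can replace the original one inside the IF.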

Transposing Rows in Openrefine

I am using OpenRefine (openrefine-2.6-rc.2) running on Windows and opening it in the Chrome browser (65.033225.181).
I have data in text format (.txt) that I have imported into OpenRefine for cleaning and processing. The data entries reside in rows under one column. I would like to "transpose" (pivot) the items in the rows so they appear in columns.
Following is an example of the current state:
Column 1
Mary Smith
Company Name IBM
Location New York
John Davis
Company Name Lockheed-Martin
Location Los Angeles
Jane Segal
Company Name Microsoft
Location Boston
Ideally, by transposing the entries the result would look like this:
Last Name   First Name   Company Name   Location
Smith       Mary         IBM            New York
Davis       John         Lockheed       Los Angeles
Segal       Jane         Microsoft      Boston
I'm just not sure how to do this in OpenRefine.
When creating your OpenRefine project, make sure that empty rows are not imported.
You can delete them later, but it's a little more complicated (see screencast).
Then:
1° Apply the function Transpose -> Transpose cells in rows into columns, with a value of 3.
2° Delete the words "Company Name" and "Location" using a Transform with formulas like value.replace('Company Name', '').trim() and value.replace('Location', '').trim()
3° Rename the columns.
Here is a visual tutorial.
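For readers who want to see what the three steps do to the data, here is a rough equivalent in plain Python. It is not OpenRefine code; the transpose_rows function is a hypothetical helper, and it assumes every record is exactly three non-empty lines in the order name, company, location, with a "First Last" name.

```python
def transpose_rows(lines):
    """Group every 3 non-empty lines into one record (name, company, location)."""
    # Equivalent of not importing empty rows.
    cells = [line.strip() for line in lines if line.strip()]
    records = []
    # Step 1: transpose cells in rows into columns, 3 at a time.
    # A trailing incomplete group is ignored.
    for start in range(0, len(cells) - 2, 3):
        name, company, location = cells[start:start + 3]
        first, _, last = name.partition(" ")
        records.append({
            "Last Name": last,
            "First Name": first,
            # Step 2: strip the field labels, as in value.replace(...).trim()
            "Company Name": company.replace("Company Name", "").strip(),
            "Location": location.replace("Location", "").strip(),
        })
    return records


column_1 = [
    "Mary Smith", "Company Name IBM", "Location New York",
    "John Davis", "Company Name Lockheed-Martin", "Location Los Angeles",
    "Jane Segal", "Company Name Microsoft", "Location Boston",
]
for record in transpose_rows(column_1):
    print(record)
```

Splitting the name into first and last is an extra step that the OpenRefine recipe above does not cover; there it would need a separate column split.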

Cleaning full names into first name, last name, etc columns

I have a CSV file that has a single column of full names that are in different formats. Some include suffixes and initials. There are thousands of records.
I want to break each record apart into separate columns for each part of the full name that exists. The final columns would be:
Title
First Name
Middle Name
Last Name
Suffix
Here is an example of what some of the different names look like:
John Smith
Doe, Jane, MBA
Mrs. Sarah Johnson
Steven P Little
Fredericks, J S, D.D.S.
S Morrison, Dr Oscar
Fred Jones, M.B.A.
T. H. Gallatin
Morris Jr, Gary B.
What is a good way to break those out into separate columns given there is no standard format to the full names?

Match Two Inconsistent Lists in Excel

I have a sort of complicated listing issue in excel and hopefully someone is up for the challenge. I appreciate any and all responses.
I have two lists of about 50,000 names. My actual workbook has longer strings of data but to keep it simple, I'll use this:
LIST A     LIST B
Joe        Michael
John
Kim        Matt
Carl
Mike       Joey
Matthew    Kimberly
The goal is to rearrange column B to match the appropriate nickname with column A, ie:
LIST A     LIST B
Joe        Joey
John
Kim        Kimberly
Carl
Mike       Michael
Matthew    Matt
The relevance of the name is less important than it matching similar characters. I can manually correct any extraneous or odd nicknames.
The other caveat is that names without akas/nicknames are left blank in the other column on both sides.
I have seen other sorting operations that could work, but don't due to the fact that the values in the two columns are technically different.
Overall, a simpler way to put it: the aim is to stack both lists more or less alphabetically, line up the similar names, and ignore anything that doesn't match.
Let me know if any further clarification is necessary.
Thank you!
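Outside Excel, the alignment described here can be sketched in Python. This is only an illustration under a simple assumption: two names count as "similar" when they share at least the first two letters, and longer shared prefixes win. The align_nicknames function and the min_prefix threshold are invented for this example; a fuzzier similarity measure could be swapped in for common_prefix_len.

```python
def common_prefix_len(a, b):
    """Number of leading characters two names share, ignoring case."""
    count = 0
    for x, y in zip(a.lower(), b.lower()):
        if x != y:
            break
        count += 1
    return count


def align_nicknames(list_a, list_b, min_prefix=2):
    """Reorder list_b so each entry sits next to its best match in list_a."""
    names_b = [b for b in list_b if b]
    candidates = []
    for i, a in enumerate(list_a):
        for j, b in enumerate(names_b):
            score = common_prefix_len(a, b)
            if score >= min_prefix:
                candidates.append((score, i, j))
    # Strongest matches are assigned first; ties keep list order.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    aligned = [""] * len(list_a)
    used_b = set()
    for score, i, j in candidates:
        if aligned[i] == "" and j not in used_b:
            aligned[i] = names_b[j]
            used_b.add(j)
    # Names from list_b that matched nothing, for manual review.
    leftovers = [b for j, b in enumerate(names_b) if j not in used_b]
    return aligned, leftovers


list_a = ["Joe", "John", "Kim", "Carl", "Mike", "Matthew"]
list_b = ["Michael", "", "Matt", "", "Joey", "Kimberly"]
aligned, leftovers = align_nicknames(list_a, list_b)
for a, b in zip(list_a, aligned):
    print(a, b)
```

With 50,000 names on each side, comparing every pair is slow; grouping names by first letter before comparing would cut the work considerably.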

Creating tables from email outputs - what is the easiest way?

I'm new here, I hope my question will be clear enough. I have around 200 email messages with some text inside, an output from a contact form on my website.
Name: Lorem Ipsum
Email: something#gmail.com
Mailing List: Yes
Israeli artist: Ido B & Zooki
artist of the year: Avicii
The phenomenon of the year: Martin Garrix - Animals
Israeli Discovery of the Year Dance: Matierro
Selected songs: 10
1. Dimitri Vegas & Like Mike & Moguai Mammoth
2. Martin Garrix Animals
3. Offer Nissim & Asi Tal Breath (feat. Maya Simantov)
4. Robin Thicke feat. T.I & Pharrell Blurred Lines (JRMX Remix)
5. Tiesto Take Me (feat. Kyler England)
6. Scream & Shout will.i.am (feat. Britney Spears)
7. Yinon Yahel Reach Out (feat. Alon Sharr)
8. Ylvis The Fox
9. Zedd Clarity (feat. Foxes)
10. Zedd Stay The Night (feat. Hayley Williams)
I would like to arrange all the information inside (First name, last name, email, etc.) in a table.
What is the easiest way to do it?
Given that the mail formats could differ, and that the number of songs selected and the values vary from message to message, I don't think there is an "easy" way to do this. You could import each message into an Excel workbook and then use a pivot table to summarize the results, but this will require about 2-4 hours of manual work formatting the imports and writing formulas to parse the data so a pivot can be generated.
Here's an example based on the one sample provided.
B2:B9 uses the following formula: =LEFT(A2,FIND(":",A2)-1)
C2:C9 uses the following formula: =RIGHT(A2,LEN(A2)-FIND(":",A2))
B10:B19 uses the following formula: =LEFT(A10,FIND(".",A10)-1)
C10:C19 uses the following formula: =RIGHT(A10,LEN(A10)-FIND(".",A10))
The differences come from the use of : vs . as delimiters.
In the future I'd avoid emails based on form data and write it to a consistent file format. It's more work up front but it will save you time on the back end parsing data.
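The same split-on-delimiter idea can also be sketched outside Excel. The Python below is an illustration only, not part of the Excel approach: parse_message is a hypothetical helper, and it assumes each message looks like the sample, with "label: value" lines and numbered song lines such as "1. ...".

```python
import re


def parse_message(text):
    """Split one message into a dict of labelled fields and a list of songs."""
    fields = {}
    songs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Numbered lines ("1. ...") are songs; check these first so a colon
        # inside a song title is not mistaken for a field separator.
        song = re.match(r"(\d+)\.\s+(.*)", line)
        if song:
            songs.append(song.group(2))
        elif ":" in line:
            # Same split the LEFT/RIGHT formulas perform, on the first colon.
            label, value = line.split(":", 1)
            fields[label.strip()] = value.strip()
    return fields, songs


sample = """Name: Lorem Ipsum
Mailing List: Yes
artist of the year: Avicii
Selected songs: 2
1. Martin Garrix Animals
2. Ylvis The Fox
"""
fields, songs = parse_message(sample)
print(fields["Name"], fields["artist of the year"], songs)
```

Each message then becomes one row: the field dict supplies the columns, and the song list can go into further columns or a separate table keyed by email address.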
