Excel column split by criteria - excel

I have a column with the following text:
PLEASANT AVENUE
PATTERSON DRIVE
I would like to separate the road type ("Avenue", "Drive", etc.) from the address (or road name, like "Pleasant" or "Patterson").
I need to end up with Col1 as the street name and Col2 as the type, as follows:
col1      | col2
----------+-------
PLEASANT  | AVENUE
PATTERSON | DRIVE
How can I do this?

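Select the column that contains the text, then go to Data => Text to Columns => Delimited => Space. Note that this splits on every space, so it only gives two columns when the street name is a single word.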

Adding on to @nicolaesse's post, you could first replace text to create a better delimiter. This helps when the street name itself contains spaces.
For example: replace " AVENUE" with ";AVENUE", and then use ";" as the delimiter to split into columns.
P.S. Do the same for " DRIVE" and any other road types.
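For comparison, the same split can be sketched outside Excel. The snippet below is only an illustration in Python, assuming the road type is always the last word of the cell. The function name split_street and the third sample row are made up for the example.

```python
def split_street(address):
    """Split an address into (street name, road type) at the last space."""
    name, sep, road_type = address.strip().rpartition(" ")
    if not sep:
        # No space at all: treat the whole cell as the street name.
        return address.strip(), ""
    return name, road_type


# The first two rows come from the question; the third is a hypothetical
# multi-word street name that a plain split on every space would break up.
rows = ["PLEASANT AVENUE", "PATTERSON DRIVE", "MARTIN LUTHER KING BOULEVARD"]
split_rows = [split_street(row) for row in rows]

for name, road_type in split_rows:
    print(name, "|", road_type)
```

Splitting at the last space rather than at every space keeps multi-word street names in one column, which is the same goal the ";" replacement trick serves.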


How to join related country and city in Google Sheets

I have a problem joining two columns based on related city and country names in Google Sheets (or in Excel, if that helps).
I have two columns like this:
Column O holds the city names and column P holds the country names.
I want to join each city to its related country with a character such as | in between, with the pairs separated by commas,
so the result would be like Buenos Aires | Argentina, Lima | Peru, ..... in the corresponding rows.
Example: Google Sheets Demo
How can I do that?
For each of columns O and P (excluding row 1), do the following:
Join everything in the column using a comma, as that's the delimiter that's already present. (Use a filter to exclude blanks.) JOIN(",", FILTER(O2:O, NOT(ISBLANK(O2:O))))
Split the join so that every value is in its own cell. SPLIT(JOIN(",", FILTER(O2:O, NOT(ISBLANK(O2:O)))), ",", FALSE)
Transpose the output for easier manipulation. (Operations are typically easier on columns than rows.) TRANSPOSE(SPLIT(JOIN(",", FILTER(O2:O, NOT(ISBLANK(O2:O)))), ",", FALSE))
Trim each cell to get rid of spaces on either side of the values. ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(JOIN(",", FILTER(O2:O, NOT(ISBLANK(O2:O)))), ",", FALSE))))
Get rid of duplicate values. UNIQUE(ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(JOIN(",", FILTER(O2:O, NOT(ISBLANK(O2:O)))), ",", FALSE)))))
Then you can join the two using the template JOIN(", ", ARRAYFORMULA("city"&" | "&"country")). Ultimately, this formula should give you what you want.
=JOIN(", ",
ARRAYFORMULA(
UNIQUE(ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(JOIN(",", FILTER(O2:O, NOT(ISBLANK(O2:O)))), ",", FALSE)))))
&" | "&
UNIQUE(ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(JOIN(",", FILTER(P2:P, NOT(ISBLANK(P2:P)))), ",", FALSE)))))
)
)
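The steps above (join, split on commas, trim, de-duplicate, then glue city and country together) can also be sketched in Python to make the data flow easier to follow. This is only an illustration under assumptions: the sample cell contents are invented, and the names flatten and join_pairs are hypothetical. Unlike the formula, the sketch pairs each city with its country before removing duplicates, so two cities in the same country stay aligned.

```python
def flatten(column):
    """Split every non-blank cell on commas and trim each value."""
    values = []
    for cell in column:
        for part in cell.split(","):
            part = part.strip()
            if part:
                values.append(part)
    return values


def join_pairs(cities, countries):
    """Pair values positionally, drop repeated pairs, join with ', '."""
    pairs = []
    for city, country in zip(flatten(cities), flatten(countries)):
        pair = f"{city} | {country}"
        if pair not in pairs:
            pairs.append(pair)
    return ", ".join(pairs)


# Hypothetical contents of O2:O and P2:P, including a blank row and a repeat.
column_o = ["Buenos Aires, Lima", "", "Lima"]
column_p = ["Argentina, Peru", "", "Peru"]

result = join_pairs(column_o, column_p)
print(result)
```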

Need initial N characters of column in Postgres where N is unknown

I have one column in my table in Postgres let's say employeeId. We do some modification based on the employee type and store it in DB. Basically, we append strings from these 4 strings ('ACR','AC','DCR','DC'). Now we can have any combination of these 4 strings appended after employeeId. For example, EMPIDACRDC, EMPIDDCDCRAC etc. These are valid combinations. I need to retrieve EMPID from this. EMPID length is not fixed. The column is of varying length type. How can this be done in Postgres?
I am not entirely sure I understand the question, but regexp_replace() seems to do the trick:
with sample (employeeid) as (
values
('1ACR'),
('2ACRDCR'),
('100DCRAC')
)
select employeeid,
       regexp_replace(employeeid, '(ACR|AC|DCR|DC)+$', '', 'i') as clean_id
from sample
returns:
employeeid | clean_id
-----------+---------
1ACR       | 1
2ACRDCR    | 2
100DCRAC   | 100
The regular expression says "one or more of those strings, in any order, immediately before the end of the string", and that match is then replaced with nothing. This won't work, however, if the actual empid itself ends with one of the codes that are appended.
It would be much cleaner to store this information in two columns: one for the empid and one for those "codes".
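To experiment with the suffix-stripping idea outside the database, here is a small Python sketch of the same technique: an end-anchored pattern that removes a trailing run of the four codes. The function name strip_codes and the sample ids beyond those in the question are assumptions for illustration, and the sketch is case-sensitive.

```python
import re

# One or more of the known codes, anchored at the end of the string.
SUFFIX_CODES = re.compile(r"(?:ACR|AC|DCR|DC)+$")


def strip_codes(employee_id):
    """Remove the trailing run of appended codes, if any."""
    return SUFFIX_CODES.sub("", employee_id)


for value in ["EMPIDACRDC", "EMPIDDCDCRAC", "100DCRAC", "ACME7DC"]:
    print(value, "->", strip_codes(value))
```

Because the pattern is anchored at the end, a code that appears in the middle of an id (as in the hypothetical ACME7DC) is left alone. An id that itself ends in one of the codes is still ambiguous, which is why storing the codes in a separate column is the more robust design.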

How can I merge 2 spotfire tables by a regex match?

I am working on a Spotfire tool, and I am using a calculated column in my main data table to group data rows into 'families' through a regex match. For example, one row might have a 'name' of ABC1234xyz, so it would be part of the ABC family because it contains the string 'ABC'. Another row could be something like AQRST31x2af, and belong to the QRST family. The main point is that the 'family' is decided by matching a substring in the name, but that substring could be any length, and isn't necessarily at the beginning of the name string.
Right now I am doing this by a large nested If statement with a calculated column. However, this is tedious for adding new families, and maintaining the current list of families. What I would like to do is create a table with 2 columns, the string match and the family name. Then, I would like to match from this table to determine family instead of the nested if. So, it might look like the below tables:
Match Table:
id_string | family
----------------------
ABC | ABC
QRST | QRST
SUP | Super
Main Data Table:
name | data | family
---------------------------------------
ABC1234 | 1.02342 | ABC
ABC1215 | 1.23749 | ABC
AQRST31x2af | 1.04231 | QRST
BQRST32x2ac | 1.12312 | QRST
1903xSUP | 1.51231 | Super
1204xSUP | 1.68123 | Super
If you have any suggestions, I would appreciate it.
Thanks.
@wcase6 - As far as I know, you cannot add columns from one table to another based on an expression. When you add columns, the value in one matching column must exactly match the value in the other.
Instead, you can try the below solution on your 'Main Data Table'.
Note: This solution is based on the scenarios posted. If there are more/different scenarios, you might have to tweak the custom expressions provided.
Step 1: Add a calculated column 'ID_string' which ignores lower case letters and digits.
Trim(RXReplace([Name],"[a-z0-9]","","g"))
Step 2: Add a calculated column 'family'.
If([ID_string]="SUP","Super",If(Len([ID_string])>3,right([ID_string],4),[ID_string]))
Hope this helps!
As @ksp585 mentioned, it doesn't seem like Spotfire can do exactly what I want, so I have come up with a solution using IronPython. Essentially, here is what I have done:
Created a table called FAMILIES, with the columns IDString and Family, which looks like this (using the same example strings above):
IDString | Family
------------------------
ABC | ABC
SUP | Super
QRST | QRST
Created a table called NAMES, as a pivot off of my main data table, with the only column being NAME. This just creates a list of unique names (since the data table has many rows for each name):
NAME
------------------------
ABC1234
ABC1215
AQRST31x2af
BQRST32x2ac
...
Created a Text Area with a button labeled Match Families, which calls an IronPython script. That script reads the NAMES table, and the FAMILIES table, compares each name to the IDString column with a regex, and associates each name with a family from the results. Any names that don't match a single IDString get the family name 'Other'. Then, it generates a new table called NAME_FAMILY_MAP, with the columns NAME and FAMILY.
With this new table, I can then add a column back to the original data table using a left outer join from NAME_FAMILY_MAP, matching on NAME. Because NAME_FAMILY_MAP is not directly linked to the NAMES table (generated by a button press), it does not create a circular dependency.
I can then add families to the FAMILIES table using another script, or by just replacing the FAMILIES table with an updated list. It's slightly more tedious than what I was hoping, but it works, so I'm happy.
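The matching step of that script can be sketched roughly as follows. This is plain Python rather than the actual IronPython script: it leaves out all Spotfire table reading and writing, and the names FAMILIES, match_family and name_family_map, as well as the sample name ZZZ999, are illustrative assumptions. When a name matches more than one IDString, this sketch simply takes the first match in table order.

```python
import re

# Stand-in for the FAMILIES table: (IDString, Family) rows.
FAMILIES = [("ABC", "ABC"), ("SUP", "Super"), ("QRST", "QRST")]


def match_family(name, families=FAMILIES, default="Other"):
    """Return the family of the first IDString found anywhere in the name."""
    for id_string, family in families:
        # re.escape treats the IDString as literal text, not as a pattern.
        if re.search(re.escape(id_string), name):
            return family
    return default


# Stand-in for the NAMES table, plus one name that matches nothing.
names = ["ABC1234", "ABC1215", "AQRST31x2af", "BQRST32x2ac", "1903xSUP", "ZZZ999"]

# Stand-in for the generated NAME_FAMILY_MAP table.
name_family_map = {name: match_family(name) for name in names}

for name, family in name_family_map.items():
    print(name, "|", family)
```

Keeping the match rules in a data table means a new family is just one more row, with no change to the matching code.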

How can I convert rows with three columns into SQL insert statements?

I have a spreadsheet looking like this:
あう to meet
青 あお blue
青い あおい blue
Is there a way that I could convert the data in these columns into three SQL statements that I could then use to enter the data into a database? I am not so much concerned in asking how to form the statement but I would like to know if there's a way I can take data in columns and make it into a script?
If I could convert this into:
col1: 青い col2: あおい col3: blue
I could add and modify this for the correct format
INSERT INTO JLPT (col1, col2, col3) VALUES ('青い', 'あおい', 'blue')
etc
Use the formula
="('"&A1&"', '"&B1&"', '"&C1&"'), "
in column D and copy the formula down for all the rows. Then prepend
INSERT INTO JLPT (col1, col2, col3) VALUES
and you are done. Each row then yields one line of the form ('青い', 'あおい', 'blue'), under the INSERT line.
Just don't forget to delete the last comma (and optionally exchange it for a semicolon) when you copy over the data from Excel. Also note that the formula does not escape single quotes inside the cell values.
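If the data can be exported from the spreadsheet, the same multi-row INSERT can be generated with a short script instead of a formula. The sketch below is an illustration, assuming the table and column names from the question. The helper names sql_literal and build_insert are made up, and only the two complete rows from the question are used.

```python
def sql_literal(value):
    """Quote a string for SQL, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_insert(table, columns, rows):
    """Build one multi-row INSERT statement from a list of row tuples."""
    header = "INSERT INTO {} ({}) VALUES".format(table, ", ".join(columns))
    tuples = ["(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows]
    # Joining with ",\n" avoids the trailing comma that the formula leaves.
    return header + "\n" + ",\n".join(tuples) + ";"


rows = [("青", "あお", "blue"), ("青い", "あおい", "blue")]
statement = build_insert("JLPT", ["col1", "col2", "col3"], rows)
print(statement)
```

When a database driver is available, passing the values as query parameters is generally safer than building the SQL text by hand.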

Excel: how to group and then sort groups in a custom order?

I have a table of data. I want to group this data and then sort the groups of rows in a custom way.
Example:
I have a table of data like this:
key | group
-------------
BC.AA | BC
AA.AA | AA
CC.DE | CC
AA.CD | AA
And a list of groups like this
group | no. of items
-------------------
BC | 1
CC | 1
AA | 2
How do I create a new table where the rows of the first table are grouped and ordered in the same way the second table is ordered? So like this:
key | group
-------------
BC.AA | BC
CC.DE | CC
AA.CD | AA
AA.AA | AA
I would like to do this with Excel formulas, so it updates automatically when the original table is changed. I hope to avoid using macros, but I could write a custom Excel worksheet function.
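You could add a column to your first table containing =MATCH(B1, GroupSheet!A:A, 0), which returns the row in GroupSheet that matches your group column, and then sort by that. The third argument 0 forces an exact match; without it, MATCH assumes the lookup list is sorted in ascending order, which a custom order is not.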
You can do this in Excel 2010 by selecting the data you want to sort, going to the Data tab, clicking the Sort icon and then choosing Custom List... under Order. This will be fine for small sorts, but you might need something more powerful for longer lists...
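The idea behind the MATCH helper column (look up each group's position in the ordered list, then sort by that position) can be sketched in Python as follows. This is only an illustration of the technique using the sample data from the question. The variable names are assumptions, and unknown groups are placed last.

```python
# Rows of the first table: (key, group).
rows = [("BC.AA", "BC"), ("AA.AA", "AA"), ("CC.DE", "CC"), ("AA.CD", "AA")]

# Custom group order taken from the second table.
group_order = ["BC", "CC", "AA"]

# Equivalent of the MATCH column: position of each group in the custom order.
rank = {group: position for position, group in enumerate(group_order)}

# Groups missing from the list sort after all known groups.
sorted_rows = sorted(rows, key=lambda row: rank.get(row[1], len(group_order)))

for key, group in sorted_rows:
    print(key, "|", group)
```

Python's sort is stable, so rows within the same group keep their original relative order. A secondary sort key would be needed if a particular order inside each group matters.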
