How to separate Phone Numbers in the Name Column - excel

I have a few problems that I have been stuck with for a few days now.
I have a table as below:
| Full Name | Atlanta_Email_Only
| 16788889999 | random#gmail.com
| 14045556666 | notreal#gmail.com
| John Harris | johnharris#atlanta.com
| Sarah Smith | sarahsmith#atlanta.com
How can I use the Power Query Editor to split the Full Name column into two columns: one named Join By Phone and one named Full Name?
And for the email column, how can I delete all the emails that do not contain the word Atlanta?
I tried Split Column -> By Digit to Non-Digit / By Non-Digit to Digit on the Full Name column, but it didn't work.
I also tried Add Column -> Conditional Column to drop the emails that do not contain the word Atlanta, but that didn't work either.
Thank you for your help.

In Power Query, right-click the Full Name column and duplicate it.
Click the new column, then Transform .. Data Type .. Whole Number.
Right-click the new column, choose Replace Errors, and enter null.
That column now holds the phone numbers.
Add Column .. Custom Column to compare the new column with the original column, using a formula similar to:
= if [#"Full Name - Copy"] = null then [Full Name] else null
That column holds the names.
Right-click and remove the original Full Name column.
To filter the emails, right-click the email column, Transform .. lowercase.
Then edit the code in the formula bar (or in Home .. Advanced Editor), changing
, Text.Lower, type text}})
to
, each if Text.Contains(_,"atlanta") then _ else null , type text}})
Full code sample below:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Copy Full Name so one copy can be converted to numbers
    #"Duplicated Column" = Table.DuplicateColumn(Source, "Full Name", "Join By Phone"),
    // Names fail the conversion to whole number and become errors
    #"Changed Type1" = Table.TransformColumnTypes(#"Duplicated Column",{{"Join By Phone", Int64.Type}}),
    // Turn those errors into null, leaving only the phone numbers
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type1", {{"Join By Phone", null}}),
    // Keep the original text only where no phone number was found
    #"Added Custom" = Table.AddColumn(#"Replaced Errors", "FullName2", each if [#"Join By Phone"] = null then [Full Name] else null),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Full Name"}),
    // Blank out emails that do not contain "atlanta"
    #"FilterEmail" = Table.TransformColumns(#"Removed Columns",{{"Atlanta_Email_Only", each if Text.Contains(_,"atlanta") then _ else null , type text}})
in
    #"FilterEmail"
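If the goal is to drop the non-Atlanta rows entirely rather than blank the email, a row filter may be closer to what the question asks. The following is only a sketch: it builds a small inline table in place of the workbook table, assumes the Full Name values arrive as text, and uses step names (WithPhone, WithName, AtlantaOnly) that are made up for illustration.

```m
let
    // Inline sample standing in for Table1 (assumption: Full Name is stored as text)
    Source = #table(
        {"Full Name", "Atlanta_Email_Only"},
        {
            {"16788889999", "random#gmail.com"},
            {"14045556666", "notreal#gmail.com"},
            {"John Harris", "johnharris#atlanta.com"},
            {"Sarah Smith", "sarahsmith#atlanta.com"}
        }),
    // A value made up only of digits is treated as a phone number
    WithPhone = Table.AddColumn(Source, "Join By Phone",
        each if Text.Select([Full Name], {"0".."9"}) = [Full Name] then [Full Name] else null),
    // Everything else is treated as a name
    WithName = Table.AddColumn(WithPhone, "FullName2",
        each if [Join By Phone] = null then [Full Name] else null),
    Cleaned = Table.RemoveColumns(WithName, {"Full Name"}),
    // Keep only rows whose email contains "atlanta", ignoring case
    AtlantaOnly = Table.SelectRows(Cleaned,
        each [Atlanta_Email_Only] <> null
            and Text.Contains([Atlanta_Email_Only], "atlanta", Comparer.OrdinalIgnoreCase))
in
    AtlantaOnly
```

With the sample data from the question, the filter also removes the phone-number rows, because their emails are not Atlanta addresses. If those rows should stay, blanking the email as in the answer above is the safer choice.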

Related

Replace text within a table for all cells that contain a given word for n columns

I have data in a table where cells have occasionally been filled with text such as not available or No Data. I want to replace every cell that contains no with null, across n columns. I don't know every word that has been entered, but each cell that should become null appears to contain the characters no, so I will go with that.
i.e. is there any way to apply something like if Text.Contains([n columns],"no") then null else [n columns] to all of those columns?
In Power Query, the steps below clear any cell containing no in any casing (No, NO, no, nO) and replace it with null.
Click to select the first column, right-click, Unpivot Other Columns.
Click to select the Value column, then Transform .. Data Type .. Text.
Right-click the Value column, then Transform .. lowercase.
We don't actually want lowercase; the step is only there to generate code to edit. In the formula bar, change this
= Table.TransformColumns(#"Changed Type1",{{"Value", Text.Lower, type text}})
to resemble this instead (which also ignores the case of the no)
= Table.TransformColumns(#"Changed Type1",{{"Value", each if Text.Contains(_,"no", Comparer.OrdinalIgnoreCase) then null else _, type text}})
Click to select the Attribute column.
Transform .. Pivot Column.
Values column: Value; Advanced .. Don't Aggregate.
Sample full code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Unpivot every column except the first, giving one cell per row
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Column"}, "Attribute", "Value"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Unpivoted Other Columns",{{"Value", type text}}),
    // Replace any value containing "no" (any casing) with null
    #"CheckForNo" = Table.TransformColumns(#"Changed Type1",{{"Value", each if Text.Contains(_,"no", Comparer.OrdinalIgnoreCase) then null else _, type text}}),
    // Pivot back to the original layout
    #"Pivoted Column" = Table.Pivot(#"CheckForNo", List.Distinct(#"CheckForNo"[Attribute]), "Attribute", "Value")
in
    #"Pivoted Column"
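The same replacement can probably be done without the unpivot/pivot round trip, by giving Table.TransformColumns an empty operation list and a default transformation that is applied to every column. This is a sketch under assumptions: the inline table, its column names A and B, and its values are invented for illustration.

```m
let
    // Inline sample (assumption); in a workbook this would be
    // Excel.CurrentWorkbook(){[Name="Table1"]}[Content]
    Source = #table(
        {"Column", "A", "B"},
        {
            {"x", "No Data", 5},
            {"y", "12", "not available"}
        }),
    // Empty operation list + default transformation = apply to all columns.
    // Non-text values are passed through untouched.
    Cleaned = Table.TransformColumns(Source, {},
        each if _ is text and Text.Contains(_, "no", Comparer.OrdinalIgnoreCase) then null else _)
in
    Cleaned
```

Unlike the unpivot approach, this version also checks the first column. Either way, matching on the bare characters no will also clear legitimate values such as Norway or unknown, so a stricter test may be needed for real data.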

Can I extract a number with a specific format from a column, and copy the number to a new column?

I have a spreadsheet that is used for scheduling employees, and for our purposes we use a job name and number in the same cell like so:
| Employee | 1/1/22 | 1/2/22 |
| John A | ABC Job 21-1111 | XYZ Job 21-2222/ABC Job 21-1111 |
| Mike D | XYZ Job 21-2222 | JKL Job 21-3333 |
Sometimes we have employees going to multiple jobs in one day. With the way it's currently set up I can use power query, and then unpivot the data so that I can filter by job number, and see how many employees we had at a specific job on a certain date.
The issue is that when an employee goes to two jobs in one day, I get a count for "ABC Job 21-0101" and a separate count for "XYZ Job 21-0202/ABC Job 21-0101"
I'm looking for a way to pull the number "21-0101" and "21-0202" into a new row associated with each unpivoted record.
So I'd like it to look like this:
| Date | Employee Name | Job # |
| 1/1/22 | John A | 21-1111 |
| 1/1/22 | Mike D | 21-2222 |
| 1/2/22 | John A | 21-2222 |
| 1/2/22 | John A | 21-1111 |
| 1/2/22 | Mike D | 21-3333 |
I hope the question makes sense! Any help is appreciated!
You can obtain your desired format from your input using Power Query, available in Windows Excel 2010+ and Excel/Office 365
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
Code for Regex
Paste into a Blank Query and Rename it fnRegexExtr
//see http://www.thebiccountant.com/2018/04/25/regex-in-power-bi-and-power-query-in-excel-with-java-script/
// and https://gist.github.com/Hugoberry/4948d96b45d6799c47b4b9fa1b08eadf
let fx=(text,regex)=>
Web.Page(
"<script>
var x='"&text&"';
var y=new RegExp('"&regex&"','g');
var b=x.match(y);
document.write(b);
</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in
fx
Main M Code
let
//Read in data from Table
//Change table name in next line to actual table name in your workbook
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
//set the data types to text for all columns
#"Changed Type" = Table.TransformColumnTypes(Source,
List.Transform(Table.ColumnNames(Source), each {_, type text})),
//Unpivot the columns EXCEPT for the Employee column
// into a Date and a Value column
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Employee"}, "Date", "Value"),
//Use Regex to extract the job number as a List in the format nn-nnnn
//as written, the job number cannot be adjacent to any character in the set [A-Za-z0-9_]
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Job", each fnRegexExtr([Value], "\\b\\d\\d-\\d{4}\\b"), type text),
//expand the List column into rows
//then set the data type, remove unneeded columns, and put them in the desired order
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Added Custom", {{"Job", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Job"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Job", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Value"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Date", "Employee", "Job"})
in
#"Reordered Columns"
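If the job number is always the last space-separated token of each job entry, and multiple jobs are always separated by a slash, the regex helper may not be needed. The sketch below relies on both of those assumptions; the inline table copies the sample from the question, and the step names are illustrative.

```m
let
    // Inline copy of the question's sample data
    Source = #table(
        {"Employee", "1/1/22", "1/2/22"},
        {
            {"John A", "ABC Job 21-1111", "XYZ Job 21-2222/ABC Job 21-1111"},
            {"Mike D", "XYZ Job 21-2222", "JKL Job 21-3333"}
        }),
    // One row per employee per date
    Unpivoted = Table.UnpivotOtherColumns(Source, {"Employee"}, "Date", "Value"),
    // Split on "/" and keep the text after the last space of each job entry
    WithJobs = Table.AddColumn(Unpivoted, "Job #",
        each List.Transform(
            Text.Split([Value], "/"),
            (job) => Text.AfterDelimiter(Text.Trim(job), " ", {0, RelativePosition.FromEnd}))),
    // One row per job number
    Expanded = Table.ExpandListColumn(WithJobs, "Job #"),
    Result = Table.SelectColumns(Expanded, {"Date", "Employee", "Job #"})
in
    Result
```

The regex version above is more tolerant of free-form text around the number; this one trades that tolerance for having no dependency on Web.Page.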

Move Data in Vertical Cells To Horizontal Cells in Excel 2007

I am using Excel 2007.
I have an Excel sheet with around 1200 records with the following structure...
What is the easiest way to do this?
For easier understanding, I am adding an image:
As per your comment request, here is a Power Query solution.
To enter the code:
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Algorithm
Fill in (fill down) the blank rows for the District and Branch columns
Group by District and Branch
For each Group, extract as a delimited string the entries for President, Secretary and Treasurer.
Create the appropriate column names and split the delimited strings into separate columns.
If you have more officers, or more items per officer/position, or more columns before you get to the officer columns, it should be relatively simple to modify the code to take that into account.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"District", Text.Type}, {"Branch", type text},
{"President", type text}, {"Secretary", type text}, {"Treasurer", type text}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"District", "Branch"}),
#"Grouped Rows" = Table.Group(#"Filled Down", {"District", "Branch"},{
{"President", each Text.Combine([President],";")},
{"Secretary", each Text.Combine([Secretary],";")},
{"Treasurer", each Text.Combine([Treasurer],";")}
}),
colHeaderSuffix = {"","Addr","Mobile"},
PresidentCols = List.Accumulate(colHeaderSuffix, {}, (state, current) => List.Combine({state, {"President " & current}})),
#"Split Column by Delimiter" = Table.SplitColumn(#"Grouped Rows", "President",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), PresidentCols),
SecretaryCols = List.Accumulate(colHeaderSuffix, {}, (state, current) => List.Combine({state, {"Secretary " & current}})),
#"Split Column by Delimiter2" = Table.SplitColumn(#"Split Column by Delimiter", "Secretary",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), SecretaryCols),
TreasurerCols = List.Accumulate(colHeaderSuffix, {}, (state, current) => List.Combine({state, {"Treasurer " & current}})),
#"Split Column by Delimiter3" = Table.SplitColumn(#"Split Column by Delimiter2", "Treasurer",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), TreasurerCols)
in
#"Split Column by Delimiter3"
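The three List.Accumulate steps differ only in the officer title, so they could plausibly be folded into one small helper. This is a sketch, not part of the answer's code: OfficerCols is a hypothetical name, and unlike the original it trims the trailing space from the first column name.

```m
let
    colHeaderSuffix = {"", "Addr", "Mobile"},
    // Hypothetical helper: builds the column names for one officer title
    OfficerCols = (officer as text) as list =>
        List.Transform(colHeaderSuffix, each Text.Trim(officer & " " & _)),
    // Example call for one title
    PresidentCols = OfficerCols("President")
in
    PresidentCols
```

The same helper could then be called with "Secretary" and "Treasurer", or with any additional officer titles.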
Original Data
Results
The formulas I used are as follows (posting the solution here so future members can use it).
In the first empty cell after the Treasurer column, enter:
=If($a2="","",a2) // copy over the next 4 columns to give the District, Branch, President name, address and mobile elements: =If($a2="","",a2), =If($b2="","",b2), =If($c2="","",c2), =If($c2="","",c3), =If($c2="","",c4)
=if($d2="","",d2) // copy over the next 2 columns for the Secretary details: =if($d2="","",d2), =if($d2="","",d3), =if($d2="","",d4)
=if($g2="","",g2) // copy over the next 2 columns for the Treasurer details: =if($g2="","",g2), =if($g2="","",g3), =if($g2="","",g4)
Now select all the new formula cells in the row after the Treasurer column and drag down through all records.
Then copy all of these down to the bottom of your data.
Copy / Paste Special >> Values to somewhere else,
then sort by District / Branch / President to drop the blank rows.
I don't know if your PC will be able to handle it in Excel, but you can use Paste Transpose.
Copy everything (my advice is to go to a new spreadsheet, but you can use the same one),
and then paste it using Paste Special >> Transpose.
Edit:
Now that you have edited your question with the example, you might want to use Paste Transpose and then a pivot table.

How to number each occurrence of a substring in a cell in Power Query?

I'm fairly new to Power Query and have hit a hiccup that's been bothering me all day. I've read multiple threads here and on the Power BI community and none has really cleared my question, and my logic suggests a few different options to achieve what I want, but my lack of experience blocks any solution I attempt.
Context:
I'm building a database for product import/export into WooCommerce, eBay and other channels, which takes some inputs from the (non-tech-savvy) users in Excel and derives several of the required fields. One of those is the image file names for each product.
I have these columns (in a much larger query table):
| ImageBaseName | ImageQTY | ImageIDs |
| product-name.jpg | 3 | product-name.jpg product-name.jpg product-name.jpg |
| other-product.jpg| 5 |other-product.jpg other-product.jpg...other-product.jpg |
And my desired output would be:
| ImageBaseName | ImageQTY | ImageIDs |
| product-name.jpg | 3 | product-name-1.jpg product-name-2.jpg product-name-3.jpg |
| other-product.jpg| 5 |other-product-1.jpg other-product-2.jpg...other-product-5.jpg |
In fact I don't need the two first columns if I get the ImageIDs like that.
The ImageBaseName column is generated from the input product name.
The ImageQTY column is direct input by the user.
The ImageIDs column I got so far is from using:
= Table.AddColumn(#"previous step", "ImageIDs", each Text.Trim(Text.Repeat([ImageBaseName] & " ", [ImageQTY])))
And these are the options I've considered thus far:
Option 1: Text.Combine(Text.Split ImageIDs and (somehow) count and number each item in the list) and concatenate it all back... Which would probably start like this: Text.Combine(Text.Split,,,
Option 2: Using the UI, splitting the ImageIDs by each space into a high number of columns (I don't know how many images each product will have, but probably no more than 12), then assigning a number suffix to each of those columns and putting it all back together. It feels messy as hell.
Option 3: There's probably a clean calculated way to generate the numbered image base names from the number in the second column and then attach the .jpg at the end of each, but honestly I don't know how.
I'd like it to be on the same table as I am already dealing with different queries...
Any help would be gladly accepted.
Starting with this as Table1:
This M code...
let
Source = Table1,
SplitAndIndexImageIDs = Table.AddColumn(Source, "Custom", each Table.AddIndexColumn(Table.FromColumns({Text.Split([ImageIDs]," ")}),"Index",1)),
RenameImageIDs = Table.AddColumn(SplitAndIndexImageIDs, "NewImageIDs", each Text.Combine(Table.AddColumn([Custom],"newcolumn",each Text.BeforeDelimiter([Column1], ".") & "-" &Text.From([Index]) & "." & Text.AfterDelimiter([Column1], "."))[newcolumn],", ")),
#"Removed Other Columns1" = Table.SelectColumns(RenameImageIDs,{"ImageBaseName", "ImageQTY", "NewImageIDs"})
in
#"Removed Other Columns1"
Should give you this result:
Here's a chunky "uber step" piece of code you could put in a custom column, given the ImageBaseName and ImageQTY columns:
Text.Combine(
    List.Transform(
        List.Zip({
            List.Repeat({Text.BeforeDelimiter([ImageBaseName], ".", {0, RelativePosition.FromEnd})}, [ImageQTY]),
            List.Transform({1..[ImageQTY]}, each "-" & Number.ToText(_) & "."),
            List.Repeat({Text.AfterDelimiter([ImageBaseName], ".", {0, RelativePosition.FromEnd})}, [ImageQTY])
        }),
        each Text.Combine(_)
    ),
    " "
)
In summary, you create the components of the string as three lists (the text before the file type, the numbers 1 through the quantity, and the text after the file type). List.Zip then groups the matching items from the three lists into one small list per image, and List.Transform with Text.Combine joins each small list back into a single piece of text. The outer Text.Combine joins the results with spaces.
Let's assume the range Table1 contains two columns, ImageBaseName and Quantity.
Add column ... Index column...
Right Click ImageBaseName Split Column...By Delimiter... --Custom--, use a period as the delimiter and split at Right-most delimiter. That will pull the image suffix off
Add Column ... Custom Column ... name it list and use formula ={1..[Quantity]} which will create a list of values from 1 to the Quantity
Click the double arrow at the top of the new list column and choose expand to new rows
Click-Select the list, Quantity, ImageBaseName.2, ImageBaseName.1 columns and Transform ... Data Type...Text
Add Column .. Custom Column .. name it Custom and use formula =[ImageBaseName.1]&"-"&[list]&"."&[ImageBaseName.2] to put together all the parts
Right-click Index Group By ... [x] Basic, Group By index, new column name ImageIDs, Operation count rows
That will generate code like this:
Table.Group(#"Added Custom1", {"Index"}, {{"ImageIDs", each Table.RowCount(_), type number}})
Use formula bar to change the formula as shown below. It will combine rows using , as a separator
Table.Group(#"Added Custom1", {"Index"}, {{"ImageIDs", each Text.Combine([Custom], ", "), type text}})
Full sample code is below that you can paste into Home .. Advanced Editor...
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
#"Split Column by Delimiter" = Table.SplitColumn(#"Added Index", "ImageBaseName", Splitter.SplitTextByEachDelimiter({"."}, QuoteStyle.Csv, true), {"ImageBaseName.1", "ImageBaseName.2"}),
#"Added Custom" = Table.AddColumn(#"Split Column by Delimiter", "list", each {1..[Quantity]}),
#"Expanded list" = Table.ExpandListColumn(#"Added Custom", "list"),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded list",{{"list", type text}, {"Quantity", type text}, {"ImageBaseName.2", type text}, {"ImageBaseName.1", type text}}),
#"Added Custom1" = Table.AddColumn(#"Changed Type1", "Custom", each [ImageBaseName.1]&"-"&[list]&"."&[ImageBaseName.2]),
#"Grouped Rows" = Table.Group(#"Added Custom1", {"Index"}, {{"ImageIDs", each Text.Combine([Custom], ", "), type text}})
in #"Grouped Rows"
There are probably many ways to combine all this into one uber step, but I thought I'd show the parts
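For comparison, here is one possible shape for a single-step version that stays on the same table, along the lines of Option 3 in the question. It is a sketch only: the inline table stands in for the real query, and the names base, ext and AddIDs are illustrative.

```m
let
    // Inline sample standing in for the real query (assumption)
    Source = #table(
        {"ImageBaseName", "ImageQTY"},
        {{"product-name.jpg", 3}, {"other-product.jpg", 5}}),
    AddIDs = Table.AddColumn(Source, "ImageIDs", each
        let
            // Split the file name at its last period
            base = Text.BeforeDelimiter([ImageBaseName], ".", {0, RelativePosition.FromEnd}),
            ext = Text.AfterDelimiter([ImageBaseName], ".", {0, RelativePosition.FromEnd})
        in
            // Build base-1.ext, base-2.ext, ... and join them with spaces
            Text.Combine(
                List.Transform({1..[ImageQTY]}, (i) => base & "-" & Text.From(i) & "." & ext),
                " "),
        type text)
in
    AddIDs
```

It assumes ImageQTY is a whole number of at least 1 and that every base name contains a period.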

excel formula with multiple criteria (match and index?)

I have a table with following structure and it shows calendar entries:
| Title | Description | StartTime | EndTime | User |
I want to create a new table with the following structure. This table would show all users and their plans for the dates given in the first row:
| User | Date1 | Date2 | Date3 | …
My problem is something like this:
I want the second table to show the titles of the rows where Date1 (or Date2, ...) is between the start and end dates. So I need an Excel formula that I can write in all cells.
I could write an SQL statement like this (I know its syntax is not correct, but I want to show what I need):
SELECT Title
FROM Table1, Table2
WHERE Date1 > StartDate AND Date1 < EndDate and User.Table1 = User.Table2
Can you please help me?
Can't think of a simple way to do this.
First of all, how do you plan to display it if there are two titles that fall under the same date segment for the same user?
To me this looks like an effort to reverse engineer a summary table to a more detailed table, in which you will need to type in the individual column by dates - fill in all the missing data, then a simple pivot would do the job.
First you will need to keep only ONE date field, then populate all the dates in between start and end date.
From this:
*listing two titles - a and b for user ak to illustrate the problem where one user has multiple titles appearing within the same date segment.
To this: - populating all the dates where the title will appear
Then just pivot the new range to get this:
Instead of the title being listed out, we can see on which dates it occurred. Copy and paste the pivot as values, then replace the title count "1" with the title name "a" to get the result below:
Assuming you would want the title concatenated by user, just copy the blue part, and get the end result below:
Do you have Power Query? If you have Excel 2016 you already have it (as Get & Transform); for earlier versions you can download it as a free add-in.
Go to Data
Select From Table/Range
OK
The Query Editor will open; there you can:
Change the data type of the date columns to "Date"
Go to Add Column
In the Date options, select "Subtract Days"
Fix the negative results by changing the formula to Duration.Days([End] - [Start])
Add a "Custom Column" with List.Dates([Start],[Subtraction]+1,#duration(1,0,0,0))
Click the double arrow in the corner of the new column and choose "Expand to New Rows"
Select and delete the columns that you won't need
Go to Transform
Click "Pivot Column"
In "Advanced Options" select "Don't aggregate"
OK
Go to Home and select "Close & Load"
Finally you get a new sheet with the new information.
You can add some filters to see a specific period of time...
The great thing about this is that you can append as much data as you want; then a simple right-click and Refresh on the green table updates the result.
This is the query, if you just want to copy and paste it into the "Advanced Editor":
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Title", type text}, {"Start", type date}, {"End", type date}, {"User", type text}}),
#"Inserted Date Subtraction" = Table.AddColumn(#"Changed Type", "Subtraction", each Duration.Days([End] - [Start])),
#"Added Custom" = Table.AddColumn(#"Inserted Date Subtraction", "Days", each List.Dates([Start],[Subtraction]+1,#duration(1,0,0,0))),
#"Expanded Days" = Table.ExpandListColumn(#"Added Custom", "Days"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Days",{"Start", "End", "Subtraction"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Days", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Days", type text}}, "en-US")[Days]), "Days", "Title")
in
#"Pivoted Column"
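The first answer asks what should happen when one user has two titles on the same date. With "Don't aggregate", the pivot is likely to produce an error in such a cell. One option is to pass an aggregation function to Table.Pivot that joins the titles. The sketch below assumes that is the desired behavior; the inline rows, their dates, and the step names are invented for illustration.

```m
let
    // Inline sample (assumption): user "ak" has titles "a" and "b" overlapping on one day
    Source = #table(
        type table [Title = text, Start = date, End = date, User = text],
        {
            {"a", #date(2018, 1, 1), #date(2018, 1, 3), "ak"},
            {"b", #date(2018, 1, 2), #date(2018, 1, 2), "ak"}
        }),
    // One list entry per day between Start and End, inclusive
    AddDays = Table.AddColumn(Source, "Days",
        each List.Dates([Start], Duration.Days([End] - [Start]) + 1, #duration(1, 0, 0, 0))),
    Expanded = Table.ExpandListColumn(AddDays, "Days"),
    Trimmed = Table.SelectColumns(Expanded, {"User", "Days", "Title"}),
    // Pivot column headers must be text
    AsText = Table.TransformColumns(Trimmed,
        {{"Days", each Date.ToText(_, "yyyy-MM-dd"), type text}}),
    // Join multiple titles for the same user and day instead of failing
    Pivoted = Table.Pivot(AsText, List.Distinct(AsText[Days]), "Days", "Title",
        each Text.Combine(_, ", "))
in
    Pivoted
```

Days on which a user has no entry come out as null rather than an empty string.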

Resources