Power Query: Adding characters to a set limit across several columns/rows - excel

Very new to PQ, and I'm pretty sure it can do what I need in this situation, but I need help figuring out how to get there.
I have a timesheet report with 20 columns covering 50 rows that needs to be formatted into a Word document for uploading into a separate system. The original values in the cells range from 0 to any negative two-digit number (e.g. "-20"), but they need to be formatted as a seven-character string ending in ".00".
Examples:
0 will need to become "0000.00"
-4 will need to become "-004.00"
-25 will need to become "-025.00"
I think I should be able to use the Text.Insert function, but I'm not familiar enough with the M language to get it to do what I want.
Any solutions/suggestions?

Here's my previous answer revisited...set up to use a function. You can just invoke the function once for each column you want to reformat. You'll just pass the name of the column you want to reformat to the function as you invoke the function each time.
Create a new blank query:
Open the new query in Advanced Editor and highlight everything in it:
Paste this over the highlighted text in the Advanced Editor:
let
    FormatIt = (SourceColumn) =>
        let
            Base = Number.Round(SourceColumn,2)*.01,
            Source = try Text.Start(Text.Range(
                    if Base < 7 then Text.From(Base) & "001" else
                    Text.From(Base),0,7),2) & Text.Range(Text.Range(
                    if Base < 7 then Text.From(Base) & "001" else
                    Text.From(Base),0,7),3,2) & "." & Text.End(Text.Range(
                    if Base < 7 then Text.From(Base) & "001" else
                    Text.From(Base),0,7),2)
                otherwise "0000.00"
        in
            Source
in
    FormatIt
...and click Done.
You'll see a new function has been created and listed in the Queries list on the left side of the screen.
Then go to your query with the columns you want to reformat (click on the name of your query that has the numbers you want to change in it, on the left side of the screen) and...
Click Invoke Custom Function
And fill out the pop-up like this:
- You can make up a different New column name than Custom.1.
- Function Query is the name of your query you are calling (the one you just created when you pasted the code)...for me, it's called Query1.
- Source Column is the column with the numbers you want to format.
...and click OK.
You can invoke this function once for each column. It will create a new formatted column for each.

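You can use the formula = Text.PadStart(Text.From([Column1]),4,"0") & ".00" in a PQ custom column to add a new column that comes close to what you need. Note that the padding zeros go in front of the minus sign (-4 becomes "00-4.00"), so negative values still need the sign moved to the front.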
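For checking expected results outside Power Query, the target format from the question can be sketched in Python. This is only an illustration, not M code; the helper name format_hours is made up here, and it assumes whole-number inputs between -99 and 0, as the question describes.

```python
def format_hours(value: int) -> str:
    """Zero-pad a value to seven characters with two decimals, sign first."""
    # "07.2f": total width 7, zero fill inserted after the sign, 2 decimals.
    return format(value, "07.2f")


if __name__ == "__main__":
    for sample in (0, -4, -25):
        print(sample, "->", format_hours(sample))
```

The point of the sketch is that the zero fill has to be sign-aware: the minus sign counts toward the seven characters and stays in front of the padding.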

Here's an admittedly "busy" formula to do it:
= Table.AddColumn(#"Changed Type", "Custom", each Text.Start(Text.Range(if Number.Round([Column1],2)*.01 < 7 then Text.From(Number.Round([Column1],2)*.01) & "001" else Text.From(Number.Round([Column1],2)*.01),0,7),2) & Text.Range(Text.Range(if Number.Round([Column1],2)*.01 < 7 then Text.From(Number.Round([Column1],2)*.01) & "001" else Text.From(Number.Round([Column1],2)*.01),0,7),3,2) & "." & Text.End(Text.Range(if Number.Round([Column1],2)*.01 < 7 then Text.From(Number.Round([Column1],2)*.01) & "001" else Text.From(Number.Round([Column1],2)*.01),0,7),2))
It assumes your numbers that you want formatted are in Column1 to start. It creates a new column...Custom...with the formatted result.
To try it out, start with Column1 already populated and loaded into Power Query; then click the Add Column tab and then the Custom Column button, and populate the pop-up window like this:
...and click OK.
With more time, the repetitive parts could be made with variables to shorten this up a bit. This could also be turned into a function, given some time. But I don't have the time right now, so I figured I'd give you at least "something."

Related

Nesting error in Excel Formula - Changes to If

I'm currently trying to record a macro for my Excel spreadsheet, but I keep receiving the message "The specified formula cannot be entered because it uses more levels of nesting than are allowed in the current file format". Can anyone help me fix the formula by making it smaller?
=IF(ISNUMBER(SEARCH("Conductor + Surface",B3)),"Conductor + Surface",IF(OR(ISNUMBER(SEARCH("17 1/2",B3)),ISNUMBER(SEARCH("Drilling",B3)),ISNUMBER(SEARCH("12 1/4",B3)),ISNUMBER(SEARCH("8 1/2",B3)),ISNUMBER(SEARCH("Run Screens",B3)),ISNUMBER(SEARCH("Temporay",B3)),ISNUMBER(SEARCH("BOP Hop",B3)),(ISNUMBER(SEARCH("Data Acquisition",B3)))),"Inter, Res, Lower Comp., & TP&A",IF(ISNUMBER(SEARCH("Maintenance",B3)),"BOP Maintenance",IF(OR(ISNUMBER(SEARCH("Re-entry",B3)),ISNUMBER(SEARCH("Wellbore Prep",B3)),ISNUMBER(SEARCH("Run Completion",B3)),ISNUMBER(SEARCH("Install TH",B3)),ISNUMBER(SEARCH("BOP Pull",B3)),ISNUMBER(SEARCH("Subsea Move Off",B3)),ISNUMBER(SEARCH("BOP Run - Completion",B3))),"Upper Comp & TH",IF(ISNUMBER(SEARCH("Rig Move - N and C",B3)),"Rig Move - N and C",IF(ISNUMBER(SEARCH("Install XMT",B3)),"Install XMT w/ Rig",IF(ISNUMBER(SEARCH("Open Plugs",B3)),"Open Plugs",IF(ISNUMBER(SEARCH("Rig Move - S and B",B3)),"Rig Move - S and B",IF(ISNUMBER(SEARCH("Install VXT",B3)),"Install VXT","ERROR IN EXCEL FORMULA")))))))))
Currently there is a column with task names that are too detailed, e.g. "New Conductor + Surface" or "ADCO - DG2 8 1/2". I want to make a new column with a shorter name for each task, chosen by certain words that appear in the detailed column. I would also like to return an error if a detailed task does not match any of those words.
This can be done by modifying the setup in this link to a 2 column lookup table.
See below for sample:
Array formula is =IFERROR(INDEX(lookupList,MATCH(TRUE,ISNUMBER(SEARCH(list,D7)),0),2),"NOT FOUND")
Remember to press Ctrl+Shift+Enter when exiting cell edit mode.
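The idea behind the lookup-table approach (first keyword found in the task wins, otherwise a fallback) can be sketched in Python. The table below is a shortened, hypothetical sample built from a few keywords in the question, and short_name is a made-up helper; it uses plain case-insensitive substring matching and does not reproduce SEARCH's wildcard handling.

```python
# Hypothetical two-column lookup list: (keyword, short name), checked in order.
LOOKUP = [
    ("Conductor + Surface", "Conductor + Surface"),
    ("Drilling", "Inter, Res, Lower Comp., & TP&A"),
    ("8 1/2", "Inter, Res, Lower Comp., & TP&A"),
    ("Maintenance", "BOP Maintenance"),
    ("Install XMT", "Install XMT w/ Rig"),
]


def short_name(task: str, default: str = "NOT FOUND") -> str:
    """Return the short name of the first keyword contained in the task."""
    lowered = task.lower()  # SEARCH ignores case, so compare in lower case
    for keyword, name in LOOKUP:
        if keyword.lower() in lowered:
            return name
    return default  # plays the role of the IFERROR fallback


if __name__ == "__main__":
    for task in ("New Conductor + Surface", "ADCO - DG2 8 1/2", "Unknown task"):
        print(task, "->", short_name(task))
```

Adding a new task type then means adding one row to the table instead of another nesting level in the formula.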

VBA or function to connect 2x CSV files into 1x XLSX

Greets,
I have a scenario with three different files:
1) One CSV file has abbreviations in Column A (-first row) that need to be copied into an XLSX file (also in Column A).
+
2) Another CSV has many rows and columns containing the explanations for those abbreviations, and I have to look up each explanation inside that big file (so I used VLOOKUP).
=
3) A separate XLSX file has to combine both CSVs into one, with the abbreviations in Column A and the explanations of those terms in Column B.
I tried with functions and simply defining ranges:
Column A1 ='C:\Users\MirzaV\Desktop\1\[0528-matrix.csv]0528-matrix'!A3
Column B1 =VLOOKUP(A1;'C:\Users\MirzaV\Desktop\1\[variantendb.csv]variantendb'!$C:$D;2;0)
So nothing here seems hard, but the problem is that I have XXX of these CSV files plus one main CSV file with the explanations (it is named "varianten"), and all of the files are going to be updated periodically.
Instead of opening three files at the same time just to refresh my formulas, is there a quicker way using code or other functions? I would like to have the result in an XLSX file.
I tried to record a macro, but it didn't work well. I was hoping to reuse it for the rest of the files, but it always gives an error.
Application.Left = 2318.5
Application.Top = 89.5
Windows("0528-matrix1.xlsx").Activate
Range("A1").Select
ActiveCell.FormulaR1C1 = "='0528-matrix.csv'!R[1]C"
Range("A1").Select
Selection.AutoFill Destination:=Range("A1:A500"), Type:=xlFillDefault
Range("A1:A500").Select
ActiveWindow.Close
ActiveWindow.ScrollRow = 1
Application.Left = 2161
Application.Top = 1
Application.Width = 720
Application.Height = 780
Windows("0528-matrix1.xlsx").Activate
Range("B1").Select
ActiveCell.FormulaR1C1 = "=VLOOKUP(RC[-1],variantendb.csv!C3:C4,2,0)"
Range("B1").Select
Selection.AutoFill Destination:=Range("B1:B500")
Range("B1:B500").Select
Application.Left = 1896.25
Application.Top = 32.5
Application.Width = 864
Application.Height = 493.5
Windows("variantendb.xlsx").Activate
ActiveWindow.Close
Application.Left = 1669
Application.Top = 1
ChDir "C:\Users\MirzaV\Desktop\1"
Since you're using Office 365 we can use the Get & Transform feature to create links to your CSV files. As long as you maintain the same filenames on the CSVs, this will enable Excel to automatically update the data.
We'll complete this data merge in 3 stages:
Link the reference CSV (the second file you have listed) to a table
Link to the data CSV (the first file) to a table
Write an Index/Match function to pull the descriptions.
Stage 1: Linking the reference file to a table
In a new Excel workbook, click on the Data tab, then click on the New Query dropdown in the Get & Transform section. Mouse over "From File >" and select "From CSV"
Navigate to CSV 2 and click Import
On the next window that pops up, click "Load"
Your lookup data will now load into a table on a new sheet. Now let's clean up the references here:
Click on the Formulas tab, then Click on Name Manager
Select your new table (it will be named the same as your file)
Change the name to "Reference" and click Ok.
Go to your table and change the column names from "Column 1" and "Column 2" to "Abbr" and "Desc"
And that's it for stage 1! Now that we have the reference table set up and linked, we can move on to loading the data table we want to find the descriptions for.
Stage 2: Linking the data file to a table
We're going to link to the data file in the same way we did the reference file. Go to Data > Get & Transform > New Query > From File > From CSV. Select your file and click Import, then click Load.
On the new table, rename Column 1 to "Code" (I would use Abbr, but Code will help keep the next step looking clear).
Add another column to this table. The simplest way is to just click in B1, type "Desc" (or whatever name of your choosing) and hit Enter.
Stage 3: The Index function that makes the magic
On your new data table with the blank description column, click in the first data cell.
Type in the function =INDEX(Reference[Desc],MATCH([@Code],Reference[Abbr],0)) and press Enter.
Watch the magic happen as Excel copies our formula to every cell in that table column!
By setting up our CSV files as external connections in this manner, we're able to create a dynamic table that will always update with the CSVs.
By using Index/Match, we're able to get away from the constraints of VLookup (the lookup value must be in the left-most column, and approximate matches need sorted data), and move to a system that allows us to look for the value we need from any field in any order.
Breaking it down, Index returns the value of the cell at the target row and column of the specified array or table. Because we specified the target array as a single column of data, we can use Index([array], [row number]), or using the code above Index(Reference[Desc], [row number]). What really makes this work is the use of Match. Match returns the row number of a target value within an array, so we use MATCH([@Code],Reference[Abbr],0), where [@Code] is the Code value in the current table row. This returns the row number to Index, which then pulls the data from the desired cell.
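As a rough illustration of how the two functions cooperate, here is a small Python sketch. The helper names match, index and describe, and the sample abbreviations, are invented for this example; unlike Excel's MATCH, this comparison is case-sensitive.

```python
def match(value, array):
    """Like MATCH(value, array, 0): 1-based position of the first exact match."""
    return array.index(value) + 1  # raises ValueError where Excel would show #N/A


def index(array, row):
    """Like INDEX(array, row) on a single column: value at a 1-based row."""
    return array[row - 1]


# Made-up sample data standing in for the Reference table's two columns.
abbr = ["AB", "CD", "EF"]
desc = ["Alpha bravo", "Charlie delta", "Echo foxtrot"]


def describe(code):
    """Equivalent in spirit to INDEX(Reference[Desc], MATCH(code, Reference[Abbr], 0))."""
    return index(desc, match(code, abbr))


if __name__ == "__main__":
    print(describe("CD"))
```

Because the row number is computed from one column and applied to another, the two columns can sit in any order, which is the flexibility the answer describes.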
There are some additional steps we can do using the Power Query Editor to ensure the column headers always stay the same, but that's a tutorial for a different day. Hope this helps!

(Excel 2013/Non-VBA) Format Data column based on value of another cell?

We have a column that is query driven, and the query partially formats the values in the column using math based off the value of a "user entry cell" on another sheet.
For the really curious, our query looks like this:
DECLARE @rotationsNum INT
SET @rotationsNum = ?
SELECT t.Piece_ID, t.Linear_Location,
    ((ROW_NUMBER() OVER(ORDER BY Linear_Location) - 1) % @rotationsNum) * (360 / @rotationsNum) AS Rotation
FROM (
    SELECT Position.Feature_Key, Piece_ID, ((Place - 1) % (Places / @rotationsNum)) + 1 AS Linear_Location, Place, Measured_Value, Places
    FROM Fake.dbo.Position
    LEFT JOIN Fake.dbo.Features ON Position.Feature_Key = Features.Feature_Key
    WHERE Position.Inspection_Key_FK = (SELECT Inspection_Key FROM Fake.dbo.Inspection WHERE Op_Key = ?)
) AS t
ORDER BY Piece_ID, Linear_Location
The first parameter "@rotationsNum" is a cell that will always have a value between 1 and 4. If the value is 1, the entire column will show 0s, which we want to show as "N/A". However, it isn't as simple as "how to hide zero data", because if "@rotationsNum" is 2, 3, or 4, there will still be 0 values in the column that need to be shown.
A "@rotationsNum" value of 2 will have the query write the column as such: example
So I am trying to come up with a way to format the column =IF(cell>1, do nothing, overwrite entire column to say "NA"). But I don't think it is that straightforward since the column is query driven.
My resolution was to format the column so that if the cell that drives the "@rotationsNum" parameter is below 2, then the whole column just gets "grayed out". It kind of makes it look like a redaction, and isn't as desirable as "NA", but it works for our purposes. Hopefully this solution helps someone else who stumbles upon this problem.
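To see why zeros cannot simply be hidden, the Rotation arithmetic from the query can be replayed in a short Python sketch. The function names are made up for illustration, the row ordering of the real query is ignored, and display_values only shows the rule the question asks for, not something Excel formatting does by itself.

```python
def rotations(row_count: int, rotations_num: int) -> list:
    """Rotation per row: ((row - 1) % n) * (360 / n), using integer division."""
    step = 360 // rotations_num  # integer division, as with INT operands in T-SQL
    return [((row - 1) % rotations_num) * step for row in range(1, row_count + 1)]


def display_values(values: list, rotations_num: int) -> list:
    """Show "N/A" for the whole column only when the driving cell is 1."""
    if rotations_num == 1:
        return ["N/A"] * len(values)
    return [str(v) for v in values]


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, display_values(rotations(4, n), n))
```

With a value of 1 every row is 0, while with 2, 3, or 4 the zeros are genuine rotation values, so the decision has to depend on the driving cell rather than on the cell contents.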

How can I add a 1 to the most recent, repeated row in Excel?

I have a dataset with 60+ thousand rows in excel and about 20 columns. The "ID column" sometimes repeats itself and I want to add a column that will return 1 only in the row that is the most recent only IF it repeats itself.
Here is the example. I have…
ID DATE ColumnX
AS1 Jan-2013 DATA
AS2 Feb-2013 DATA
AS3 Jan-2013 DATA
AS4 Dec-2013 DATA
AS2 Dec-2013 DATA
I want…
ID DATE ColumnX New Column
AS1 Jan-2013 DATA 1
AS2 Feb-2013 DATA 0
AS3 Jan-2013 DATA 1
AS4 Dec-2013 DATA 1
AS2 Dec-2013 DATA 1
I've been trying with a combination of sort and nested if's, but it depends on my data being always in the same order (so that it looks up the ID in the previous row).
Bonus points: consider that my dataset is fairly large for Excel, so the most efficient code that won't eat up processor time would be appreciated!
An approach you could use is to point MSQuery at your table and use SQL to apply the business rules. On the positive side, this runs very quickly (a couple seconds in my tests against 64k rows). A huge minus is the query engine does not seem to support Excel tables exceeding 64k rows, but there might be ways to work around this. Regardless, I offer the solution in case it gives you some ideas.
To set up first give your data set a named range. I called it MYTABLE. Save. Next select a cell to the right of your table in row 1, and click through Data | From other sources | from Microsoft Query. Choose Excel Files* | OK, browse for your file. The Query Wiz should open, showing MYTABLE available, add all the columns. Click Cancel (really), and click Yes, you want to continue editing.
The MSQuery interface should open, click the SQL button and replace the code with the following. You will need to edit some specifics, such as the file path. (Also, note I used different column names. This was sheer paranoia on my part. The Jet engine is very finicky and I wanted to rule out conflicts with reserved words as I built this.)
SELECT
    MYTABLE.ID_X,
    MYTABLE.DATE_X,
    MYTABLE.COLUMN_X,
    IIF(MAXDATES.ID_x IS NULL,0,1) * IIF(DUPTABLE.ID_X IS NULL,0,1) AS NEW_DATA
FROM ((`C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE
    LEFT OUTER JOIN (
        SELECT MYTABLE1.ID_X, MAX(MYTABLE1.DATE_X) AS MAXDATE
        FROM `C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE1
        GROUP BY MYTABLE1.ID_X
    ) AS MAXDATES
    ON MYTABLE.ID_X = MAXDATES.ID_X
    AND MYTABLE.DATE_X = MAXDATES.MAXDATE)
    LEFT OUTER JOIN (
        SELECT MYTABLE2.ID_X
        FROM `C:\Users\andy3h\Desktop\SOTEST1.xlsx`.MYTABLE MYTABLE2
        GROUP BY MYTABLE2.ID_X
        HAVING COUNT(1) > 1
    ) AS DUPTABLE
    ON MYTABLE.ID_X = DUPTABLE.ID_X)
With the code in place MSQuery will complain the query can't be represented graphically. It's OK. The query will execute -- it might take longer than expected to run at this stage. I'm not sure why, but it should run much faster on subsequent refreshes. Once results return, File | Return data to Excel. Accept the defaults on the Import Data dialog.
That's the technique. To refresh the query against new data simply Data | Refresh. If you need to tweak the query you can get back to it through Excel via Data | Connections | Properties | Definition tab.
The code I provided returns your original data plus the NEW_DATA column, which has value 1 if the ID is duplicated and the date is the maximum date for that ID, otherwise 0. This code will not sort out ties if an ID's maximum date is on several rows. All such rows will be tagged 1.
Edit: The code is easily modified to ignore the duplication logic and show most recent row for all IDs. Simply change the last bit of the SELECT clause to read
IIF(MAXDATES.ID_x IS NULL,0,1) AS NEW_DATA
In that case, you could also remove the final LEFT JOIN with alias DUPTABLE.
Sort by ID, then by DATE (ascending). Define entries in new column to be 1 if previous row has the same ID and next row has a different ID or is empty (for last row), 0 otherwise.
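That rule can be sketched in Python without relying on row order. This is only an illustration: flag_latest_duplicates is a made-up name, the sample rows come from the question, and dates are assumed to be comparable values (here the first day of each month).

```python
from collections import Counter
from datetime import date


def flag_latest_duplicates(rows):
    """rows: list of (id, date) pairs. Returns one 0/1 flag per row, in input order."""
    counts = Counter(row_id for row_id, _ in rows)
    latest = {}
    for row_id, row_date in rows:
        if row_id not in latest or row_date > latest[row_id]:
            latest[row_id] = row_date
    # 1 only when the ID repeats and this row carries that ID's latest date.
    return [
        1 if counts[row_id] > 1 and row_date == latest[row_id] else 0
        for row_id, row_date in rows
    ]


if __name__ == "__main__":
    sample = [
        ("AS1", date(2013, 1, 1)),
        ("AS2", date(2013, 2, 1)),
        ("AS3", date(2013, 1, 1)),
        ("AS4", date(2013, 12, 1)),
        ("AS2", date(2013, 12, 1)),
    ]
    print(flag_latest_duplicates(sample))
```

Dropping the counts check would flag the latest row of every ID, including IDs that appear only once, which is what the sample output in the question shows.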
It could be done in VBA. I'd be interested to know if this is possible just using formulas, I had to do something similar once before.
Sub Macro1()
    Dim rowCount As Long
    Dim counter As Long ' Long, not Integer: the data has 60,000+ rows
    Sheets("Sheet1").Activate
    rowCount = Cells(Rows.Count, 1).End(xlUp).Row
    Columns("A:D").Select
    Selection.AutoFilter
    ' Clear the previous 0/1 flags in column D
    Range("D2:D" & rowCount).ClearContents
    ' Sort by ID (column A) first, then by date (column B);
    ' the first sort field added has the highest priority
    With ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort
        .SortFields.Clear
        .SortFields.Add Key:=Range("A1:A" & rowCount), SortOn:=xlSortOnValues
        .SortFields.Add Key:=Range("B1:B" & rowCount), SortOn:=xlSortOnValues
        .Apply
    End With
    ' Flag each row with 1, then reset it to 0 if the next row has the same ID
    For counter = 2 To rowCount
        Cells(counter, 4) = 1
        If Cells(counter, 1) = Cells(counter + 1, 1) Then Cells(counter, 4) = 0
    Next counter
End Sub
So you activate the sheet and get the count of rows.
Then select and autofilter the results, and clear out Column D which has the 0s or 1s. Then filter on the values mbroshi suggested that you say you're already using. Then execute a loop for each record, changing the value to 1, but then back to 0 if the value ahead of it has the same ID.
Depending on your processor, I don't think this would take more than a minute or two to run. If you do find something using formulas I would be interested to see it!

Multiply numbers in Excel or LibreOffice cell contents by a constant when they are mixed with text?

I have a long series of cells written like this (example text):
Example Number (3502, 456)
How would I multiply the numbers by 4 without having to delete the text?
I also have cells in the format [sic below]:
Example Number (3502,456) (4560,250) (2345,223)
et cetera, there are on average ten parentheses per text string.
Occasionally, the text might also be only one word long, e.g.
Example (3205, 456)
or
Example (3205,456) (4560,250) (2345,223)
et cetera.
(all above is [sic]).
As a sort of newbie to Excel (well, really Libre Office Calc but it's essentially the same), how would I do this? I don't want to go through and manually multiply all the numbers myself. The number I want to multiply by is 4. I've tried just running a find-and-replace to replace all ,'s and )'s with *4's, but the program I need these numbers for can't evaluate expressions, it needs single numbers.
There are some 110+ items on each list I need to change, and just one math error on any of the three lists (!) and the program won't run correctly (I'm resizing an image, and the points I plotted on the image didn't scale up with it). I don't want to risk it.
It should be possible to do this with a macro but unless I'm mistaken LibreOffice macro code is quite different from Excel VBA.
However if you can afford to use several columns of your spreadsheet to figure the values out, you can do so using formulae. If cell A1 contains
Example Number (3502,456) (4560,250) (2345,223)
and B1 contains
=MID(A1,FIND("(",A1)+1,9999)
then this formula will return the 3502 as a number:
=NUMBERVALUE(LEFT(B1,FIND(",",B1)-1))
(9999 is chosen to be much larger than the likely length of any line, so the MID function will always return the whole of the rest of the text after the search character).
You should be able to combine MID and FIND functions in further cells to isolate the other numbers, assuming these are always found in the format (xxx,yyy) as per your example. Then you can use a final formula to rebuild the string from the multiplied numbers:
="Example Number (" & 4*C1 & "," & 4*E1 & ")"
and so on.
If your data has a variable number of numbers to find, some of your FIND functions may return a #VALUE error. You may need to use an IF function to exclude these, for example:
=IF(ISERROR(G1),"",G1)
would return the value of G1 if it contains data, but blank if it contains an error.
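The same extract, multiply, and rebuild idea can be tried on a single string in plain Python before committing to spreadsheet formulas. This is a minimal sketch that assumes every pair has the form (digits,digits) with optional whitespace after the comma; scale_pairs is a made-up name, and the surrounding text is left untouched.

```python
import re

# A pair of whole numbers in parentheses, e.g. "(3502, 456)" or "(3502,456)".
PAIR = re.compile(r"\((\d+)(,\s*)(\d+)\)")


def scale_pairs(text: str, factor: int = 4) -> str:
    """Multiply both numbers of every (x,y) pair in text by factor."""

    def repl(m):
        x, sep, y = m.groups()
        # Keep the original separator so the spacing of each pair is preserved.
        return "({}{}{})".format(int(x) * factor, sep, int(y) * factor)

    return PAIR.sub(repl, text)


if __name__ == "__main__":
    print(scale_pairs("Example Number (3502, 456)"))
    print(scale_pairs("Example (3205,456) (4560,250) (2345,223)"))
```

Because the replacement works on every match in the string, the number of pairs per cell does not matter.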
Here is a Python LibreOffice macro that does what you want. It assumes all of the values are in column A, and it writes the results to column B.
import re

def do_calculations():
    # XSCRIPTCONTEXT is provided by LibreOffice when the macro runs.
    document = XSCRIPTCONTEXT.getDocument()
    sheet = document.getSheets().getByIndex(0)
    cellrange = sheet.getCellRangeByName("A1:A10000")
    row_tuples = cellrange.getDataArray()
    row = 1
    for row_tuple in row_tuples:
        # Skip empty and numeric cells; only text cells can hold pairs.
        if row_tuple and isinstance(row_tuple[0], str):
            row = output_values(row, row_tuple[0], sheet)

def output_values(row, pairs_string, sheet):
    """Multiply pairs of values by 4 and output each pair to B column.

    :param row: the row number in the B column
    :param pairs_string: a string like "Example Number (123, 456) (789, 1011)"
    :param sheet: the current spreadsheet
    Returns the next row number in the B column.
    """
    pairs = re.findall(r'\([^)]+\)', pairs_string)
    for pair in pairs:
        match_obj = re.match(r'\((\d+),\s*(\d+)\)', pair)
        if match_obj is None:
            continue  # parenthesised text that is not a pair of numbers
        x, y = match_obj.groups()
        result = "(%d,%d)" % (int(x) * 4, int(y) * 4)
        cell = sheet.getCellRangeByName("B" + str(row))
        cell.setString(result)
        row += 1
    return row

# Functions that can be called from Tools -> Macros -> Run Macro.
g_exportedScripts = do_calculations,
Save the code to a text file, for example calc_multiply_numbers.py. Put it in Scripts/python in your LibreOffice user directory. On my Windows system it is C:\Users\JimStandard\AppData\Roaming\LibreOffice\4\user\Scripts\python. If the python directory doesn't exist yet, create it.
To run it, open the spreadsheet and go to Tools -> Macros -> Run Macro. Under My Macros, click calc_multiply_numbers and then press the Run button.
EDIT:
I don't think you need to worry about the JRE error. On my system I can uncheck "Use a Java runtime environment" in Tools -> Options -> LibreOffice -> Advanced, and it still works. I just click "No" when it asks if I want to enable the use of a JRE now, and then it runs my python macro.
The reason it is not showing up under My Macros is because python is not able to interpret the file correctly. To find the error, test it with python using the following steps (assuming Windows):
Open a command prompt, for example by pressing Win, typing cmd, and clicking "Command Prompt" from the start menu.
Type cd "path-to-libreoffice/program". On my 64-bit system this is cd "C:\Program Files (x86)\LibreOffice 5\program" I use the normal Windows File Explorer to find the exact path.
Type "python.exe python-script". On my system it is python.exe "C:\Users\JimStandard\AppData\Roaming\LibreOffice\4\user\Scripts\python\calc_multiply_numbers.py"
The python interpreter will give an error message about the problem. If you are not able to figure out the message, write it in the comments below and I will help you.

Resources