Azure Data Factory - Import excel file with dynamic range2 - excel

I have a scenario where I'm converting Excel to CSV/TXT. My Excel file is formatted, which means the main content starts at, say, A12, but the end cell is dynamic (Y400, Y700, etc.) and not known in advance. In that case, is there a way in ADF to define the Excel range?

The Excel dataset in ADF has a range option where you can give only the starting cell:
A12
This starts from A12 and locates the end point dynamically.
The documentation for the range property says:
The cell range in the given worksheet to locate the selective data, e.g.:
Not specified: reads the whole worksheet as a table from the first non-empty row and column
A3: reads a table starting from the given cell, dynamically detects all the rows below and all the columns to the right
A3:H5: reads this fixed range as a table
A3:A3: reads this single cell
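The four cases above can be sketched as a small classifier. This is only an illustration of the rules as listed: describe_range is a hypothetical helper, not part of ADF or any SDK, and the validation it does is an assumption.

```python
import re

# Hypothetical helper, not an ADF API: it only restates the four documented cases.
_CELL = r"[A-Za-z]{1,3}[1-9][0-9]*"  # column letters followed by a row number


def describe_range(range_value):
    """Return a short description of how a range value would be read."""
    if not range_value:
        return "whole worksheet, from the first non-empty row and column"
    if re.fullmatch(_CELL, range_value):
        return f"table starting at {range_value.upper()}, end detected dynamically"
    match = re.fullmatch(rf"({_CELL}):({_CELL})", range_value)
    if match is None:
        raise ValueError(f"not a valid cell range: {range_value!r}")
    start, end = (cell.upper() for cell in match.groups())
    if start == end:
        return f"single cell {start}"
    return f"fixed range {start} to {end}"


for value in (None, "A12", "A3:H5", "A3:A3"):
    print(repr(value), "->", describe_range(value))
```

For the question above, "A12" falls into the second case, so the unknown end cell never has to be spelled out.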

Use a Data Flow and follow the steps below:
1. Add your Excel/CSV source dataset (add a 'FileName' column to store the file name).
2. Add a Filter transformation with the condition 'length(trim(replace(trim(replace(replace(replace(replace(replace( toString(array(columns()) ),'null','' ), ',',''),FileName,''),'[',''),']','')),'""',''))) != 0'
3. Boom! The empty rows are gone.
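The idea behind that filter condition, restated outside ADF: a row counts as empty when nothing is left once nulls, blanks and the file name are taken away. A minimal Python sketch of the same logic follows; is_empty_row and the sample rows are made up for illustration, and it ignores the FileName column instead of doing string replacement.

```python
def is_empty_row(row, ignore=("FileName",)):
    """True when every column except the ignored ones is None or blank."""
    return all(
        value is None or str(value).strip() == ""
        for name, value in row.items()
        if name not in ignore
    )


# Hypothetical sample data: two real rows and two empty ones.
rows = [
    {"FileName": "book.xlsx", "Name": "Name1", "Value": 123},
    {"FileName": "book.xlsx", "Name": None, "Value": None},
    {"FileName": "book.xlsx", "Name": "  ", "Value": ""},
    {"FileName": "book.xlsx", "Name": "Name2", "Value": 0},
]

kept = [row for row in rows if not is_empty_row(row)]
print([row["Name"] for row in kept])
```

Note that a numeric 0 is kept, because it is a value and not a blank.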

Related

How to automatically move an entire row when one cell moves on another sheet

I have two Excel spreadsheets. On the first sheet is a list of people's names with other data in the rest of the columns. In the second sheet, the first column is linked to the names in the first sheet (using "='Sheet1'!B1", etc); however, the rest of the columns in the second sheet are different types of data from the first sheet. If I want to move a name on the first sheet, this would automatically move the same name on the second sheet, but it won't bring the rest of the data with it. Is there a way to do this so that data follow the name?
I doubt there is a "canonical answer" because the "problem" is not canonical for Excel. In other words: Excel, being a spreadsheet application, is not made to solve such problems.
Your assumption "the first column is linked to the names in the first sheet (using "='Sheet1'!B1", etc); " is wrong. The formula ='Sheet1'!B1 does not link to names. Its result is whatever value is in 'Sheet1'!B1. If that value changes, the formula result changes too. That is exactly what you observe and call a "problem".
Linked tables are typical for a relational database system. There, one table may have foreign keys that link to another table. See Foreign key. But Excel is not a relational database system.
Power Query can create data queries from Excel tables and can also relate tables through a key column, but this is not really straightforward. So let's have a very simple example:
First create a workbook having two sheets having a data table each.
Example
Data of Sheet1:
Name Mail Value1
Name1 name1@example.com 123
Name2 name2@example.com 234
Name3 name3@example.com 234
Name of the Table is Data1
Data of Sheet2:
Name Value2
Name2 2345
Name1 1234
Name of the Table is Data2
Now create a Power Query from table Data1. See Import from an Excel Table:
1. Select a cell in table Data1 (on Sheet1).
2. Select Data > Get & Transform Data > From Table/Range.
3. Excel opens the Power Query Editor with your data displayed in a preview pane.
4. To return the transformed data to Excel, select Home > Close & Load.
You get an additional sheet holding the result of that data query. For me that sheet is named "Data1".
Now do the same with table Data2:
1. Select a cell in table Data2 (on Sheet2).
2. Select Data > Get & Transform Data > From Table/Range.
3. Excel opens the Power Query Editor with your data displayed in a preview pane.
4. To return the transformed data to Excel, select Home > Close & Load.
You get an additional sheet holding the result of that data query. For me that sheet is named "Data2".
Now select the sheet "Data1" (the sheet that holds the first query result) and edit the query. See Create, load, or edit a query in Excel (Power Query) > Edit a query from a worksheet: to edit a query, locate one previously loaded from the Power Query Editor, select a cell in the data, and then select Query > Edit.
Now merge the query from Data1 with the one from Data2. See Merge queries (Power Query) > Perform a Merge operation:
1. Select Home > Merge Queries. The Merge dialog box appears.
2. Select the primary table from the first drop-down list, and then select a join column by selecting the column header. This is column "Name" in our case.
3. Select the related table from the next drop-down list, and then select a matching column by selecting the column header. This is table "Data2" and column "Name" in our case.
4. Select OK.
The result should be a new column "Data2" added to the query "Data1".
Now select the handle at the right side of the column name "Data2" and leave only "Value2" selected. The column name should change to "Data2.Value2".
To return the transformed data to Excel, select Home > Close & Load.
The result should be that the data table on sheet "Data1" now shows the Value2 of Data2 too.
From now on, all changes in table "Data1" on Sheet1 and in table "Data2" on Sheet2 are combined in the query result on sheet "Data1" when you refresh that query.
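What the merge does to the example data can be sketched in a few lines of Python. This assumes the merge is a left outer join on "Name" (every row of Data1 is kept, and names missing from Data2 get an empty value); left_merge is a hypothetical helper, and the Mail column is left out for brevity.

```python
# Example data from the two tables above (Mail column omitted for brevity).
data1 = [
    {"Name": "Name1", "Value1": 123},
    {"Name": "Name2", "Value1": 234},
    {"Name": "Name3", "Value1": 234},
]
data2 = [
    {"Name": "Name2", "Value2": 2345},
    {"Name": "Name1", "Value2": 1234},
]


def left_merge(primary, related, key, column):
    """Keep every primary row and attach `column` from the matching related row."""
    lookup = {row[key]: row[column] for row in related}
    return [{**row, f"Data2.{column}": lookup.get(row[key])} for row in primary]


merged = left_merge(data1, data2, "Name", "Value2")
for row in merged:
    print(row)
```

The row order of Data2 does not matter, which is the point: values follow the name, not the cell position.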
Disclaimer: This answer is for the linked example in Automatically move an entire row of reference cell when one cell is moved or manipulated.
In Google Sheets, you can use the =QUERY() function to automatically move an entire row when one cell on another sheet changes.
Here is an example of how you can use the =QUERY() function to move an entire row from one sheet to another when a specific cell changes:
1. In the sheet where you want to move the row, create a new column and name it "Status".
2. In the sheet where the cell that will trigger the move is located, create a new column and name it "Move".
3. In the "Move" column, use an IF() statement to check if the cell you want to trigger the move is true or false.
For example: =IF(A1=TRUE,"Move","Stay")
4. In the sheet where you want to move the row, use the =QUERY() function to select the rows where "Move" is "Move" and "Status" is not "Moved".
For example: =QUERY(Sheet1!A1:Z,"select * where Move = 'Move' and Status != 'Moved'")
5. In the sheet where the cell that will trigger the move is located, use the =IF() statement to change the value of the "Status" column to "Moved" when "Move" is "Move".
For example: =IF(Move="Move","Moved",Status)
6. Use a script to automatically run the query and update the status every time the cell that triggers the move is changed.
Please note that this is an example and you will need to modify the formulas and sheet names to match your specific use case.
hope this helps
As the question is not really clear about this, there would be another approach if you have the following structure:
Given following data in Sheet1:
ID Name Mail Value1 Value2
1 Name1 Name1@example.com 123 1234
2 Name2 Name2@example.com 234 2345
3 Name3 Name3@example.com 345 3456
Now Sheet2 shall only pull different data from Sheet1 dependent on Name. Then Sheet2 could have following formulas filled downwards from row 2:
A2: =Sheet1!B2
B2:C4: =VLOOKUP($A2,Sheet1!$B$1:$E$998,MATCH(Sheet2!B$1,Sheet1!$B$1:$AAA$1,0),FALSE)
Now you can change data in Sheet1 and Sheet2 will always show the correct data for name because of the formulas.
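The VLOOKUP/MATCH combination boils down to two lookups: MATCH finds the column by its header, and VLOOKUP (with FALSE, an exact match) finds the row by name. A rough Python equivalent is shown below; lookup is a hypothetical helper, the Mail column is omitted, and returning None stands in for the #N/A that Excel would show.

```python
# Sheet1 from the example above, without the Mail column.
header = ["ID", "Name", "Value1", "Value2"]
sheet1 = [
    [1, "Name1", 123, 1234],
    [2, "Name2", 234, 2345],
    [3, "Name3", 345, 3456],
]


def lookup(name, wanted_header):
    """Find the row by exact name, then the column by its header text."""
    column = header.index(wanted_header)   # the MATCH part
    name_column = header.index("Name")
    for row in sheet1:                     # the VLOOKUP part, exact match only
        if row[name_column] == name:
            return row[column]
    return None                            # Excel would show #N/A here


print(lookup("Name2", "Value2"))
print(lookup("Name3", "Value1"))
```

Because the column is found by header text, Sheet2 can list the headers in any order.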

[@[COLUMN_NAME]] reference in Excel

I've inherited an MS Excel workbook from a stalled project. I now need to reverse engineer and experiment to see how it all works. How does the formula know which value to fetch? I was expecting to see a cell reference.
=CONCATENATE([@[HAL '#]],"-",[@[Data Dimension]],"-",A3)
"HAL #" is a column header in another worksheet (not the worksheet the formula is in)
"Data Dimension" is a column in the same sheet as the formula
"A3" is a cell in the same sheet as the formula; it's a 3-digit number
Context
This will eventually feed a Tableau workflow... that's all I know... for now
What I've tried
I've changed values in the HAL and Data Dimension columns. Sometimes the value in the last parameter fetches the row of that column, sometimes not...
There are no macros in this workbook
There are no named ranges that resemble HAL (or Data Dimension)
[@[HAL '#]] is a structured reference: it reads the value of the "HAL #" column in the same table row as the formula, and the apostrophe escapes the # in the column name. So HAL # must be a column in the same table. Excel won't let you save a formula with an invalid column reference. Check for hidden columns.

How to convert an Excel sheet filled with form-like data to tabular data?

How can I convert an Excel sheet filled with form-like data to tabular data? Please see the next screenshot:
All I need is to transpose the data so that the headers are limited to one row, with the data following below.
Note: the headers are in column A and the data is in column B.
Example of the required result:
I tried to copy all of columns A and B and Paste Special them, but it gives me an error, as in the next screenshots:
Please offer me some help or a workaround. Thanks in advance.
Say the original data is in a sheet named source. Copy the 5 descriptors and Paste Special > Transpose them into the first row of a new sheet.
Then in cell A2 of the new sheet enter:
=INDEX(source!$B:$B,COLUMNS($A:A)+5*(ROWS($1:1)-1),0)
Copy this formula both across and downward.
Source data:
New Sheet:
Note the constant 5 in the formula corresponds to the number of descriptors in the source data.
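To see why the formula works: COLUMNS($A:A) is the 1-based column of the output cell and ROWS($1:1) is the 1-based data row, so each output cell reads entry col + 5*(row-1) of column B. The sketch below mirrors that arithmetic in Python; form_to_table and the sample values are hypothetical, and it uses 3 fields per record instead of 5 to stay short.

```python
def index_for(col, row, fields_per_record):
    """1-based position in column B, as computed by the INDEX formula."""
    return col + fields_per_record * (row - 1)


def form_to_table(column_b, fields_per_record):
    """Reshape the stacked values of column B into one row per record."""
    n = fields_per_record
    if n < 1 or len(column_b) % n != 0:
        raise ValueError("column length must be a multiple of the field count")
    records = len(column_b) // n
    return [
        [column_b[index_for(col, row, n) - 1] for col in range(1, n + 1)]
        for row in range(1, records + 1)
    ]


# Hypothetical form data: 3 descriptors per record, two records stacked in column B.
column_b = ["Ann", 30, "Cairo", "Bob", 41, "Giza"]
for line in form_to_table(column_b, 3):
    print(line)
```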

Excel - Selecting from cell A1 to AX, where X is read in cell B1

I have an Excel workbook with 2 worksheets. The first will be filled dynamically with data and has a row with a combo box fed from row A on the second worksheet.
The second worksheet will also be filled dynamically and will contain:
Row A:
some values of variable number
B1 - Number of values in Row A to consider.
My question is: I'm using Data Validation > List to define the values of the ws1rowA combo box. Is it possible to use a range from A1 to A(value in B1)?
So far I have tried this in the Data Validation "source" field:
=Sheet2!$A$1:offset(Sheet2!$A$1,=Sheet2!$B$1,0,1,1)
but an error is returned
You can also use INDIRECT function for this.
=INDIRECT("Sheet2!$A$1:$A"&Sheet2!$B$1)
In my version of Excel, "You cannot use references to other worksheets or workbooks for Data Validation criteria", but you can use named ranges that have workbook scope. So define a name (e.g. DataValid) that refers to the range formula given by @Maxim Korneev, and then for Data Validation in Sheet1 use a list whose Source is =DataValid.
Sure, just use
=OFFSET(Sheet2!$A$1,0,0,Sheet2!$B$1,1)
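Both formulas above describe the same range, the first B1 cells of column A. INDIRECT gets there by building the address as text, OFFSET by giving a height. A small Python sketch of the two views, with made-up helper names and sample values:

```python
def indirect_address(count):
    """The address text that "Sheet2!$A$1:$A" & B1 evaluates to."""
    return f"Sheet2!$A$1:$A{count}"


def first_cells(column_a, count):
    """The values OFFSET(A1, 0, 0, count, 1) would cover: the first `count` cells."""
    if count < 1:
        raise ValueError("the count in B1 must be at least 1")
    return column_a[:count]


# Hypothetical contents of Sheet2: values in column A, the count in B1.
column_a = ["red", "green", "blue", "black", "white"]
b1 = 3

print(indirect_address(b1))
print(first_cells(column_a, b1))
```

Changing the value in B1 changes the list without touching the Data Validation setup.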

Excel charts - setting series end dynamically

I've got a spreadsheet with plenty of graphs in it and one sheet with loads of data feeding those graphs.
I've plotted the data on each graph using
=Sheet1!$C5:$C$3000
This basically just plots the values in C5 to C3000 on a graph.
Regularly though I just want to look at a subset of the data i.e. I might just want to look at the first 1000 rows for example. Currently to do this I have to modify the formula in each of my graphs which takes time.
Would you know a way to simplify this? Ideally, I would have a cell on a single sheet from which the row number is read, and all the graphs would plot from C5 to C 'row number'.
Any help would be much appreciated.
OK, I had to do a little more research. Here's how to make it work completely within the spreadsheet (without VBA), using A1 to hold the number of rows to plot and with the chart on the same sheet as the data:
1. Name the first cell of the data (C5) as a named range, say TESTRANGE.
2. Create a named range MYDATA with the following formula:
=OFFSET(TESTRANGE, 0, 0, Sheet1!$A$1, 1)
3. Go to the SERIES tab of the chart SOURCE DATA dialog and change your VALUES statement to:
=Sheet1!MYDATA
Now every time you change the A1 cell value, the chart changes.
Thanks to Robert Mearns for catching the flaws in my previous answer.
This can be achieved in two steps:
1. Create a dynamic named range
2. Add some VBA code to update the chart's data source to the named range
Create a dynamic named range
Enter the number of rows in your data range into a cell on your data sheet.
Create a named range on your data sheet (Insert - Name - Define) called MyRange that has a formula similar to this:
=OFFSET(Sheet1!$A$1,0,0,Sheet1!$D$1,3)
Update the formula to match your layout:
Sheet1!$A$1: set this to the top left corner of your data range
Sheet1!$D$1: set this to the cell containing the number of rows
3: set this value to the number of columns
Test that the named range is working:
Select Edit - Go To from the menus and type MyRange into the Reference field.
Your data area for the chart should be selected.
Add some VBA code
Open the VBA IDE (Alt-F11)
Select Sheet1 in the VBAProject window and insert this code
Private Sub Worksheet_Change(ByVal Target As Range)
    'Change $D$1 to the cell where you have entered the number of rows
    'When the sheet changes, the code checks whether cell $D$1 has changed
    If Target.Address <> "$D$1" Then Exit Sub

    'The first statement assumes that the chart is embedded in Sheet1
    ThisWorkbook.Sheets("Sheet1").ChartObjects(1).Chart.SetSourceData _
        Source:=ThisWorkbook.Sheets("Sheet1").Range("MyRange")

    'The second statement assumes that the chart is on its own chart sheet
    'Uncomment and change as required
    'ThisWorkbook.Sheets("Chart1").SetSourceData _
    '    Source:=ThisWorkbook.Sheets("Sheet1").Range("MyRange")

    'Add more code here to update all the other charts
End Sub
Things to watch for
Do not directly use the named range as the data source for the chart. If you enter the named range "MyRange" as the Source Data - Data Range for the chart, Excel will automatically convert the named range into an actual range. Any future changes to your named range will therefore not update your chart.
Performance might be impacted by the approaches listed above.
The OFFSET function in the named range is "volatile", which means that it recalculates whenever any cell in the workbook calculates. If performance is an issue, replace it with the equivalent INDEX formula (the last argument is the number of columns, 3 here to match the OFFSET above):
=Sheet1!$A$1:INDEX(Sheet1!$1:$65536,Sheet1!$D$1,3)
The code fires every time data is changed on Sheet1. If performance is an issue, change the code to run only when requested (i.e. via a button or menu).
You could look at dynamic ranges. If you use the OFFSET function, you can specify a starting cell and the number of rows and columns to select. This site has some useful information about assigning a name to an OFFSET range.
You can set the range for a chart dynamically in Excel. You can use something like the following VBA code to do it:
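Private Sub Worksheet_Change(ByVal Target As Range)
    'Compare addresses, not values, so that only an edit of B14 triggers the update
    Select Case Target.Address
        Case Cells(14, 2).Address
            Sheet1.ChartObjects(1).Chart.SetSourceData Range("$C$5:$C$" & Cells(14, 2).Value)
        ...
    End Select
End Sub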
In this case, the cell containing the number of the last row to include is B14 (remember row first when referring to the Cells object). You could also use a variable instead of the Cells reference if you wanted to do this entirely in code. (This works in both 2007 and 2003.) You can assign this procedure to a button and click it to refresh your chart once you update the cell containing the last row.
However, this may not be precisely what you want to do ... I am not aware of a way to use a formula directly within a chart to specify source data.
Edit: And as PConroy points out in a comment, you could put this code in the Change event for that worksheet, so that neither a button nor a key combination is necessary to run the code. You can also add code so that it updates each chart only when the matching cell is edited.
I've updated the example above to reflect this.
+1s for the name solution.
Note that names don't really reference ranges, they reference formulae. That's why you can set a name to something like "=OFFSET(...)" or "=COUNT(...)". You can create named constants too: just make the name reference something like "=42".
Named formulae and array formulae are the two worksheet techniques that I find myself applying to not-quite-power-user worksheets over and over again.
An easy way to do this is to just hide the rows/columns you don't want included - when you go to the graph it automatically excludes the hidden rows/columns
Enhancing the answer of @Robert Mearns, here's how to use dynamic cell ranges for graphs using only Excel formulas (no VBA required):
Create a dynamic named Range
Say you have 3 columns like:
A5 | Time | Data1 | Data2 |
A6 | 00:00 | 123123 | 234234 |
...
A3000 | 16:54 | 678678 | 987987 |
Now, the range of your data may change depending on how much data you have: 20 rows, 3000 rows or even 25000 rows. You want a graph that updates automatically, without having to reset the range every time you update the data itself.
Here's how to do it simply:
Define another cell whose value will be the number of cells occupied with data, and put the formula =COUNTIF(A:A,"<>"&"") in it. For example, this will be in cell D1.
Go to the "Formulas" tab -> "Define Name" to define a named range.
In the "New Name" window:
i. Give your data range a name, like DataRange for example.
ii. In the "Refers to" set the formula to: =OFFSET(Sheet1!$A$1, 0, 0,Sheet1!$D$1,3),
where:
Sheet1!$A$1 => Reference: is the Reference from which you want to base the offset.
0 => Rows: is the number of rows, up or down, that you want the upper-left cell of the results to refer to.
0 => Columns: is the number of columns, to the left or right, that you want the upper-left cell of the results to refer to.
Sheet1!$D$1 => Height: is the height, in number of rows, that you want the result to be.
3 => Width: is the width, in number of columns, that you want the result to be.
Add a Graph, and in the "Select Data Source" window, in the Chart data range, insert the formula as you created. For the example: =Sheet1!DataRange
The Cons: If you directly use the named range as the data source for the chart, Excel will automatically convert the named range into an actual range. Any future changes to your named range will therefore not update your chart.
For that you need to edit the chart and re-set the range to =Sheet1!DataRange every time. This may not be so usable, but it's better than editing the range manually...
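How the COUNTIF cell and the five OFFSET arguments fit together can be sketched in Python. This is a simplified model under stated assumptions: the sheet is a list of rows, column A has no gaps inside the data, and offset and count_non_empty are illustrative helpers, not Excel APIs.

```python
def count_non_empty(column):
    """Rough stand-in for =COUNTIF(A:A,"<>"&""): number of cells holding data."""
    return sum(1 for value in column if value not in (None, ""))


def offset(grid, rows, cols, height, width):
    """Block of `height` x `width` cells, starting `rows` down and `cols` right
    of the top-left cell, like OFFSET(reference, rows, cols, height, width)."""
    return [line[cols:cols + width] for line in grid[rows:rows + height]]


# Hypothetical sheet: a header row, two data rows, then an empty row.
sheet = [
    ["Time", "Data1", "Data2"],
    ["00:00", 123123, 234234],
    ["16:54", 678678, 987987],
    [None, None, None],
]

d1 = count_non_empty([line[0] for line in sheet])  # plays the role of cell D1
data_range = offset(sheet, 0, 0, d1, 3)            # plays the role of DataRange
print(d1)
print(data_range)
```

Appending a row to the sheet raises the count, and the block returned for the name grows with it, which is what makes the named range dynamic.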

Resources