Excel and Tableau don't detect a date dimension as a datetime column

I have a DIM Date dimension in my cube with a date column whose type in the SQL datamart is "date", and the data type for this attribute in the cube is also set to "date". But when I query the cube from Excel and Tableau, this attribute shows up as a string instead of a date, so I don't get the natural hierarchy of year -> month -> date in the client tools. Both Excel and Tableau format this column correctly if I connect directly to the datamart instead of the cube.
Is there any trick or tip to make these client tools format the date column as datetime instead of string?
I don't want to manually create these hierarchies in the cube because there are 60+ date columns across all its dimensions.
Thanks

In Tableau, you can change the field type to date and then save the data source as a reusable .tds file. The settings should then be retained.
From http://www.theinformationlab.co.uk/2013/12/02/tableau-file-types-and-extensions/ :
Tableau Datasource (.tds)
When you connect to your data for the first time, you may have a little bit of data ‘modelling’ to do – setting the right data types, changing default aggregations, setting default colours, creating some custom calculated fields etc. You are giving Tableau information about the data you will be using – you are setting up its ‘metadata’. When you want to connect to this data again, you don’t really want to go through all this data modelling a second time, so instead you can save your metadata as a .tds file (again, it is saved in XML format) and connect to your data through this file instead. You could also distribute this file so that your colleagues have access to the nice formatting and custom fields you have worked to set up.
Tableau is clever enough to pick up new columns/fields in the data source if they appear and column ordering does not matter but if column names change or disappear completely, you will need to reconfigure.
To create a .tds file, from Tableau Desktop, right click on your data source connection and select Add to Saved Data Sources. Alternatively you can publish the .tds to Tableau Server by right clicking and selecting Publish to Server instead
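
Since a .tds file is XML, one way to check many date columns at once is to list the fields that are still typed as strings. The sketch below is only an illustration: the sample XML, the `column` element, and the `datatype` attribute are assumptions about the file layout, so compare them with a .tds file you have saved yourself before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the contents of a saved .tds file. The element
# and attribute names are assumptions; check them against your own file.
SAMPLE_TDS = """<datasource>
  <column name="[Order Date]" datatype="string" role="dimension" />
  <column name="[Ship Date]" datatype="date" role="dimension" />
  <column name="[Sales]" datatype="real" role="measure" />
</datasource>"""


def columns_with_datatype(tds_xml, datatype):
    """Return the names of <column> elements whose datatype attribute matches."""
    root = ET.fromstring(tds_xml)
    return [
        col.get("name", "")
        for col in root.iter("column")
        if col.get("datatype") == datatype
    ]


if __name__ == "__main__":
    # Lists the fields that would still need to be switched to a date type.
    print(columns_with_datatype(SAMPLE_TDS, "string"))
```

The function only reads the XML; changing the types is still done in Tableau Desktop as described above.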

Related

Date field in Excel won't populate data for one user; all other users have success

I have a view in Snowflake that has a field with dates. The data type for the table in Snowflake is simply 'date'.
In Snowflake, the date shows as 'yyyy-mm-dd'.
From Excel, we connect to the Snowflake view via ODBC. There are five of us accessing the same data.
Four of the five users have data flowing into the date field.
One user can see all data except for the date field. From Excel, the user cannot reformat the date field to solve the problem.
While looking at a preview of the data before loading into Excel, the user does see the dates.
This happens with two separate tables. Only the one user has the problem.
My initial thought is to not change the data type in snowflake since it works for all but one, but maybe that is not the correct approach.
Any ideas?
Thank you.

Excel PowerPivot - change data source type

I have an Excel 2016 workbook with 30 graphs based on PowerPivot. PowerPivot fetches the data from another Excel sheet, but I want it to get the data from a SQL Server table instead.
How can I change the data source type in PowerPivot? I've tried looking in the Excel XML without any luck. It would be a lot of work to re-create all the graphs just to switch the data source.
Thanks
Dennis

One suggestion I would make for the future, if all the users are on 2016, is to use Power Query, which comes standard with that version of Excel. When Power Query loads data into Power Pivot, all Power Pivot cares about is the column names. This means that the query can be switched between data source types without causing issues, as long as the column names stay the same.
As an example, I have one file that, based on a parameter flag, pulls data out of a series of Excel files on either a shared network drive or SharePoint. These are two different data sources: the first opens a folder as the data source and then the Excel files listed within it; the other opens a SharePoint list as its data source and then navigates through the Excel files.
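
The point that only the column names matter can be illustrated outside Excel. This is a minimal sketch in plain Python, not Power Query: the column names and the three sample exports are hypothetical, and the check simply compares header names while ignoring column order.

```python
import csv
import io

# Hypothetical column names that the downstream model expects.
EXPECTED_COLUMNS = {"Date", "Division", "Revenue"}

# Hypothetical exports standing in for different source types.
FOLDER_EXPORT = "Date,Division,Revenue\n2016-01-01,Alpha,1000\n"
SHAREPOINT_EXPORT = "Revenue,Division,Date\n1500,Alpha,2016-01-02\n"
RENAMED_EXPORT = "Date,Dept,Revenue\n2016-01-01,Alpha,1000\n"


def column_names(csv_text):
    """Return the header names of a CSV document as a set."""
    reader = csv.reader(io.StringIO(csv_text))
    return set(next(reader, []))


def is_compatible(csv_text, expected=EXPECTED_COLUMNS):
    """True when the source exposes exactly the expected column names."""
    return column_names(csv_text) == set(expected)


if __name__ == "__main__":
    for name, text in [("folder", FOLDER_EXPORT),
                       ("sharepoint", SHAREPOINT_EXPORT),
                       ("renamed", RENAMED_EXPORT)]:
        print(name, is_compatible(text))
```

The first two sources can be swapped freely because they expose the same names; the third would break whatever depends on the `Division` column.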

Spreadsheet with relationships

I have to work with data in CSV files.
The data represents products with options/cars etc. in a web store.
It has a lot of columns with duplicated values, and in my work I often need to copy some part of this data to another sheet, deduplicate it, edit it, and then paste it back by matching on one of the columns that were left untouched. For this purpose I'm using the Ablebits Excel suite.
Is it possible to automate this process with any Excel function, or is there some other software that could handle it? Something not as complicated as a relational database like Access, but close to a spreadsheet editor with relationships.
I already tried Power Query in Excel and Power BI, but they seem to be analytics tools rather than data-editing tools.
Edit 2:
The data has a layered structure with duplicates:
Title1|Part number 1|Car1
Title1|Part number 1|Car2
Title2|Part number 2|Option1
Title2|Part number 3|Option2
I want to be able to:
Edit duplicated values without using "Replace All", or at least have a more flexible "Find & Replace".
Extract columns in deduplicated form while keeping a reference to where the values came from, so that editing a value there also changes it in the original place. For example, I have titles (a lot of titles) that I need to edit. Instead of copying them out with some ID to reference them, I want to open them the way they appear in filters, edit them, confirm, and have the edit applied across the whole column.

I would use Power Query (aka Get & Transform on the Data ribbon in Excel 2016). The only limitation I see with what you want to do is that Power Query will deliver a new Excel Table with the output of a Query - it can't update existing cells.
If you can get past that, Power Query is very flexible, easy to learn (WYSIWYG query editor), scales well and is integrated with other Microsoft products (as well as Power BI, there is integration with SQL Server Analysis Services in preview and hopefully SQL Server Integration Services one day).
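
The "extract unique values, edit them, apply the edit everywhere" workflow from the question can be sketched as follows. This is plain Python rather than Power Query, the helper names are made up for illustration, and like Power Query it produces a new table instead of updating the existing cells.

```python
# Sample rows taken from the question: title, part number, car/option.
ROWS = [
    ["Title1", "Part number 1", "Car1"],
    ["Title1", "Part number 1", "Car2"],
    ["Title2", "Part number 2", "Option1"],
    ["Title2", "Part number 3", "Option2"],
]


def unique_values(rows, col):
    """Return the distinct values of one column, in first-seen order."""
    return list(dict.fromkeys(row[col] for row in rows))


def apply_edits(rows, col, edits):
    """Return new rows where each value in `col` is replaced via `edits`.

    Values without an entry in `edits` are left unchanged.
    """
    return [
        row[:col] + [edits.get(row[col], row[col])] + row[col + 1:]
        for row in rows
    ]


if __name__ == "__main__":
    print(unique_values(ROWS, 0))
    for row in apply_edits(ROWS, 0, {"Title1": "Brake pad set"}):
        print(row)
```

The edit is keyed on the old value, so one change to "Title1" reaches every row that carries it, with no copy-and-paste matching.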

Querying single data points from the Excel Data Model / Power Query (Get & Transform Data)

I'm using an up-to-date version of Excel 2016 (via O365 E3 license) and using Power Query / Get & Transform Data. I can successfully create queries and load them to the page. I have also successfully created Power Pivot reports.
I would like to query single data points from the data loaded via Power Query. For instance, imagine a dataset called DivisionalRevenue with:
Date Division Revenue
2016-01-01 Alpha 1000
2016-01-02 Alpha 1500
2016-01-01 Beta 2000
2016-01-02 Beta 400
I could easily load that to an Excel workbook or include it in the data model and create a power pivot. However, Power Pivot doesn't always meet my requirements, particularly around how the data is displayed on the page. In order to achieve my goal I may want to be able to query individual data points.
I would like to have a cell on the page with a formula in it that I can use to query individual data points. If it was in a pivot table I could use something like:
=GETPIVOTDATA("Revenue",$A$3,"Date",DATE(2016,1,1),"Division","Alpha")
The lookup values (date and division) could be retrieved from a cell on the page or hard-coded into the formula. This is a requirement for several reports I'm working on.
Or, I could add a combined lookup column with Date and Division concatenated and use a vlookup to pull the values like:
=VLOOKUP("42371Alpha",I9:L13,4,FALSE)
Finally, I could use a combination of INDEX and MATCH to identify the correct row number and then pull the data.
All of these solutions require the data to be loaded onto a sheet. One requires a pivot table that has to be refreshed to work properly. The other two require creating arbitrary lookup columns so that you can match a row based on more than one field (date and division in this example), and you have to ensure that that lookup field's formula is properly extended down the length of the data table. In both cases I would have concerns when sharing this workbook with my colleagues in case someone affects the rather fragile setup of the pivot table or the lookup.
So, what I truly want to find is something equivalent to pivot table querying against a dataset.
** This doesn't exist, but I would like to know if something like it does **
=GETQUERYDATA("Revenue","DivisionalRevenue","Date",DATE(2016,1,1),"Division","Alpha")
Does such a thing exist? Can such a thing be done? Can I retrieve arbitrary data points from the dataset created through Power Query / Get & Transform Data?

I think that what you want are cube functions:
Some background
How to easily create cube functions from a pivot table

There is a feature in Excel that allows you to query off of a PowerPivot model, but it's not highly advertised for some reason.
Once you have the data in your PowerPivot model, go to your Excel -> Data tab -> Existing Connections -> Tables tab
From there, choose the table that you want to start with. Once that table's data is on your excel sheet, you can actually right click that table -> go to "Table" -> "Edit DAX"
From there you can enter the following DAX function, as an example
EVALUATE
FILTER(SampleData, SampleData[Date]=DATE(2016,1,1) && SampleData[Division]="Alpha")
Make sure to choose Command Type=DAX in the drop-down.
To further improve your querying power, you can install the optional "DAX Studio" plugin for Excel, which allows you to write custom DAX queries and then export the results directly back to an Excel sheet.
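
For comparison, the logic of the two-condition filter can be sketched in plain Python. The function name `get_query_data` is hypothetical (it mirrors the wished-for GETQUERYDATA from the question, which does not exist in Excel), and the rows are the sample data from the question.

```python
from datetime import date

# Sample data from the question.
DIVISIONAL_REVENUE = [
    {"Date": date(2016, 1, 1), "Division": "Alpha", "Revenue": 1000},
    {"Date": date(2016, 1, 2), "Division": "Alpha", "Revenue": 1500},
    {"Date": date(2016, 1, 1), "Division": "Beta", "Revenue": 2000},
    {"Date": date(2016, 1, 2), "Division": "Beta", "Revenue": 400},
]


def get_query_data(rows, value_field, **criteria):
    """Return `value_field` from the single row matching all criteria.

    Raises LookupError when zero or several rows match, so an ambiguous
    lookup fails loudly instead of returning an arbitrary value.
    """
    matches = [
        row for row in rows
        if all(row[field] == wanted for field, wanted in criteria.items())
    ]
    if len(matches) != 1:
        raise LookupError(f"expected 1 matching row, found {len(matches)}")
    return matches[0][value_field]


if __name__ == "__main__":
    print(get_query_data(DIVISIONAL_REVENUE, "Revenue",
                         Date=date(2016, 1, 1), Division="Alpha"))
```

Matching on several fields directly avoids the concatenated lookup column that the VLOOKUP approach needs.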

SSIS Excel Data Source - Is it possible to override column data types?

When an excel data source is used in SSIS, the data types of each individual column are derived from the data in the columns. Is it possible to override this behaviour?
Ideally we would like every column delivered from the excel source to be string data type, so that data validation can be performed on the data received from the source in a later step in the data flow.
Currently, the Error Output tab can be used to ignore conversion failures - the data in question is then null, and the package will continue to execute. However, we want to know what the original data was so that an appropriate error message can be generated for that row.

According to this blog post, the problem is that the SSIS Excel driver determines the data type for each column based on reading values of the first 8 rows:
If the top 8 records contain equal number of numeric and character types – then the priority is numeric
If the majority of top 8 records are numeric then it assigns the data type as numeric and all character values are read as NULLs
If the majority of top 8 records are of character type then it assigns the data type as string and all numeric values are read as NULLs
The post outlines two things you can do to fix this:
First, add IMEX=1 to the end of your Excel driver connection string. This will allow Excel to read the values as Unicode. However, this is not sufficient if the data in the first 8 rows are numeric.
In the registry, change the value for HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows to 0. This will ensure that the driver looks at all the rows to determine the data type for the column.
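
The sampling rules quoted above can be simulated to see why values go missing. This is a rough sketch of the described behaviour only, not the driver's actual implementation; the function names are made up, and "numeric" here simply means that Python's `float()` accepts the value.

```python
def is_numeric(value):
    """True when the value can be read as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def guess_column_type(values, sample_rows=8):
    """Guess a column type from the first `sample_rows` non-empty values.

    A tie between numeric and character values resolves to numeric,
    as in the rules quoted above.
    """
    sample = [v for v in values[:sample_rows] if v is not None]
    numeric = sum(is_numeric(v) for v in sample)
    text = len(sample) - numeric
    return "numeric" if numeric >= text else "string"


def read_column(values, sample_rows=8):
    """Read a column the way the rules describe: minority values become None."""
    if guess_column_type(values, sample_rows) == "numeric":
        return [float(v) if is_numeric(v) else None for v in values]
    return [None if is_numeric(v) else v for v in values]


if __name__ == "__main__":
    column = [1, 2, 3, 4, 5, "A1", "B2", "C3", "D4"]
    print(guess_column_type(column))
    print(read_column(column))
```

In the example, five of the first eight values are numeric, so every text code in the column is read as None, including the one in row nine that was never sampled.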

Yes, you can. Just go into the output column list on the Excel source and set the type for each of the columns.
To get to the output columns list, right-click the Excel source, select 'Show Advanced Editor', and click the tab labeled 'Input and Output Properties'.
A potentially better solution is to use the Derived Column component, where you can build "new" columns for each column in Excel. This has the following benefits:
You have more control over what you convert to.
You can put in rules that control the change (i.e. if null give me an empty string, but if there is data then give me the data as a string)
Your data source is not tied directly to the rest of the process (i.e. you can change the source and the only place you will need to do work is in the derived column)
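
The kind of rule mentioned above (null becomes an empty string, anything else becomes its string form) can be sketched like this. It is plain Python rather than an SSIS expression, the names are hypothetical, and the handling of whole-number floats is an added assumption meant to keep codes such as 43567192 free of a trailing ".0".

```python
def to_text(value):
    """Convert a source value to text: None -> "", everything else -> str.

    Whole-number floats are written without a decimal part (an assumption
    for illustration, so numeric codes keep their original digits).
    """
    if value is None:
        return ""
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)


def add_derived_column(row, source, target):
    """Return a copy of `row` with a new text column derived from `source`."""
    derived = dict(row)
    derived[target] = to_text(row.get(source))
    return derived


if __name__ == "__main__":
    print(add_derived_column({"Site": "A", "NDCCode": 43567192.0},
                             "NDCCode", "NDCCodeText"))
    print(add_derived_column({"Site": "B", "NDCCode": None},
                             "NDCCode", "NDCCodeText"))
```

Keeping the original column next to the derived one means a later validation step can still report what the source actually contained.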

If your Excel file contains a number in the column in question in the first row of data, it seems that the SSIS engine will reset the type to a numeric type. It kept resetting mine. I went into my Excel file and changed the numbers to "Numbers stored as text" by placing a single quote in front of them. They are now read as text.
I also noticed that SSIS uses the first row to IGNORE what the programmer has indicated is the actual type of the data (I even told Excel to format the entire column as TEXT, but SSIS still used the data, which was a bunch of digits), and reset it. Once I fixed that by putting a single-quote in my Excel file in front of the number in the first row of data, I thought it would get it right, but no, there is additional work.
In fact, even though the SSIS External DataSource Column now has the type DT_WSTR, it will still read 43567192 as 4.35671E+007. So you have to go back into your Excel file and put single quotes in front of all the numbers.
Pretty LAME, Microsoft! But there's your solution. I have no idea what to do if the Excel file is not under your control.

I was looking for a solution to a similar issue, but didn't find anything on the internet. Although most of the solutions I found work at design time, they don't work when you want to automate your SSIS package.
I resolved the issue by changing the properties of the Excel Source. By default the AccessMode property is set to OpenRowSet. If you change it to SQL Command, you can write your own SQL to convert any column as you wish.
For me, SSIS was treating the NDCCode column as float, but I needed it as a string, so I used the following SQL:
Select [Site], Cstr([NDCCode]) as NDCCode From [Sheet1$]
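
With many columns, a statement like the one above can be generated rather than typed by hand. The helper below is hypothetical and only builds the text of the query in the same shape as the answer's example; it does not connect to Excel or SSIS.

```python
def build_excel_select(sheet, columns, text_columns):
    """Build a SELECT that wraps the chosen columns in CStr().

    `sheet` is the worksheet name without the trailing "$";
    `text_columns` holds the column names that must come back as strings.
    """
    parts = []
    for col in columns:
        if col in text_columns:
            parts.append(f"CStr([{col}]) AS [{col}]")
        else:
            parts.append(f"[{col}]")
    return f"SELECT {', '.join(parts)} FROM [{sheet}$]"


if __name__ == "__main__":
    print(build_excel_select("Sheet1", ["Site", "NDCCode"], {"NDCCode"}))
```

The resulting text would then be pasted into the Excel Source once its AccessMode is set to SQL Command, as described above.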

The Excel source in SSIS behaves erratically. SSIS determines the data type of a particular column by reading the first 10 rows, hence the issue. If you have a text column with null values in the first 10 rows, SSIS takes the data type as Int. After a bit of struggle, here is a workaround:
Insert a dummy row (preferably as the first row) in the worksheet. I prefer doing this through a Script Task; you may also consider using some service to preprocess the file before SSIS connects to it.
With the dummy row in place, you can be sure that the data types will be set as you need.
Read the data using the Excel source and filter out the dummy row before further processing.
I know it is a bit shabby, but it works :)
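
The dummy-row steps can be sketched on in-memory rows. This is an illustration in plain Python, not a Script Task: the marker value and function names are assumptions, and writing the row into the actual worksheet is left out.

```python
# Hypothetical marker; pick a value that can never occur in the real data.
DUMMY_MARKER = "__dummy__"


def add_dummy_row(rows, text_columns):
    """Prepend a row holding text in every column that must be typed as text."""
    width = max((len(row) for row in rows), default=0)
    dummy = [DUMMY_MARKER if i in text_columns else None for i in range(width)]
    return [dummy] + rows


def drop_dummy_row(rows):
    """Remove the dummy row again before further processing."""
    return [row for row in rows if DUMMY_MARKER not in row]


if __name__ == "__main__":
    data = [[None, "x"], [None, "y"], ["note", "z"]]
    padded = add_dummy_row(data, text_columns={0})
    print(padded[0])
    print(drop_dummy_row(padded) == data)
```

Column 0 starts with empty cells, so the dummy row puts a text value where the type sampling will see it first.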

I was able to fix this issue. While creating the SSIS package, I manually changed the specific column to text (open the Excel file, select the column, right-click it, select Format Cells, choose Text on the Number tab, and save the file).
Now create the SSIS package and test it. It works. Then try the Excel file where this column was not set as text.
It worked for me and I could execute the package successfully.

This can be resolved simply: untick the "First row has column names" box and all data will be read as text. The only downside is that you have to map the auto-generated column names (column 1, 2 etc.) yourself and handle the first row, which contains the column names.
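
Handling the header row yourself looks roughly like this. The sketch uses Python's csv module on a hypothetical CSV sample instead of an Excel sheet: every cell arrives as text, and the first row is peeled off and used as the column names.

```python
import csv
import io

# Hypothetical sample; the leading zeros only survive because cells stay text.
SAMPLE = "Site,NDCCode\nA,00012345\nB,43567192\n"


def read_all_as_text(csv_text):
    """Read every cell as a string and map the first row to column names."""
    raw = list(csv.reader(io.StringIO(csv_text)))
    if not raw:
        return []
    header, data = raw[0], raw[1:]
    return [dict(zip(header, row)) for row in data]


if __name__ == "__main__":
    for record in read_all_as_text(SAMPLE):
        print(record)
```

Type conversion and validation can then happen in a later step, where a bad value can still be reported in its original form.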

I had trouble implementing the solution here - I could follow the instructions, but it only gave new errors.
I solved my conversion issues by using a Data Conversion transformation. This can be found in the SSIS Toolbox under Data Flow Transformations. I placed the Data Conversion between my Excel Source and OLE DB Destination, linked the Excel Source to the Data Conversion and the Data Conversion to the OLE DB Destination, then double-clicked the Data Conversion to bring up the list of data columns. I gave the problem column a new alias and changed its entry in the Data Type column.
Lastly, in the Mappings of the OLE DB Destination, use the Alias column name, rather than the original Excel column name. Job done.

You can use a Data Conversion component to convert to the desired data types.
