I have a data set stored in an excel file, when i importing data using matlab function :
A=xlread(xls -filename)
matrix A only stored numeric values of my table.. and when i used another function such as:
B= readtable(xls-filename)
then table will view complete data include rows and columns headers but when i apply such operation on it like
Bnorm=normc(B)
its unable to perform normalization on it due to the rows and columns headers ..
my question are:
is there any way to avoid rows and columns header in table B.
is there any way to store rows and columns headers when read table using xlread function .. such that
column header = store first row in (xls-filename)
row headers = store first column in (xls-filename)
thanks for any suggestion
dataset table
normalized matrix when apply xlread(xls-filename
The answers to your specific questions are:
With a table, you can avoid row labels but column labels always exist.
As per the doc for xlsread, the first output is the numeric data, and the second output is the text data, which in this case would include your header information.
But, in this case, you just need to learn how to work with tables properly. You want something like,
>> Bnorm = normc(B{:,2:end});
which extracts all the numeric elements of table B and uses them as input to normc.
If you want the result to be a table then use
Bnorm = B;
Bnorm{:,2:end} = normc(B{:,2:end}));
Related
Suppose I have a csv containing columns with parametric information written as a string, something like "Title, temperature=25, voltage=0.8V, x=0" for each column. the rows are statistical data, lets say 10 samples for each set of title, temperature, voltage and x.
x is some variable (0 to 10 for example) that I want to use as my x axis of the new tables
I want to find a fay to group all columns with the same title, temperature and voltage into separate tables.
each column of the new table is an iteration of the statistical data, each row is the x values.
my input csv is:
my output should be 2 csv's (because there are 2 different sets of title, temperature and voltage)
csv out 1
csv out 2
In general there can be more sets so there is no limit on how many outputs I will get.
I was thinking of doing some kind of a loop that goes through all the columns, writes down all the different combinations (excluding the x value) into a set. so in our example it will be
test_set = set()
#first loop to create all the table names
for i in range(csv_in.shape[1]):
col_name = csv_in.columns[s]
#here i need some string manipulation to extract only the title, temp and voltage from the column name
test_set.add(col_name)
then another loop goes over that set and create a sub-table , then pivot it so that the x value is now the rows of the table, and saves the file with the appropriate name using the title, temp and voltage parameters.
practically I don't really know where to start since I'm pretty new to python so I would like some suggestions and any pointers you can give me would be great :)
thanks!
Briefly I am doing this steps.
tables = camelot.read_pdf(doc_file)
tables[0].df
I am using tables[0].df.columns to get column names from the extracted table.
But it does not give the column names.
Camelot extracted tables have no alphabetic column names.
tables[0].df.columns returns, for example, for three columns table:
RangeIndex(start=0, stop=3, step=1)
Instead, you can try to read the first row and get a list from it: tables[0].df.iloc[0].tolist().
The output could be:
['column1', 'column2', 'column3']
I have a SharePoint list as a datasource in Power Query.
It has a "AttachmentFiles" column, that is a table, in that table i want the values from the column "ServerRelativeURL".
I want to split that column so each value in "ServerRelativeURL"gets its own column.
I can get the values if i use the expand table function, but it will split it into multiple rows, I want to keep it in one row.
I only want one row per unique ID.
Example:
I can live with a fixed number of columns as there are usually no more than 3 attachments per ID.
I'm thinking that I can add a custom column that refers to "AttachmentFiles ServerRelativeURL Value(1)" but I don't know how.
Can anybody help?
Try this code:
let
fn = (x)=> {x, #table({"ServerRelativeUrl"},List.FirstN(List.Zip({{"a".."z"}}), x*2))},
Source = #table({"id", "AttachmentFiles"},{fn(2),fn(3),fn(1)}),
replace = Table.ReplaceValue(Source,0,0,(a,b,c)=>a[ServerRelativeUrl],{"AttachmentFiles"}),
cols = List.Transform({1..List.Max(List.Transform(replace[AttachmentFiles], List.Count))}, each "url"&Text.From(_)),
split = Table.SplitColumn(replace, "AttachmentFiles", (x)=>List.Transform({0..List.Count(x)-1}, each x{_}), cols)
in
split
I manged to solve it myself.
I added 3 custom columns like this
CustomColumn1: [AttachmentFiles]{0}
CustomColumn2: [AttachmentFiles]{1}
CustomColumn3: [AttachmentFiles]{2}
And expanded them with only the "ServerRelativeURL" selected.
It would be nice to have a dynamic solution. But this will work fine for now.
I would like to grab the first rows of all CSV files in a folder. I have read that power query would probably be best.
I have gone to Excel > Data > Get Data > From Folder > OK. That has brought me to a table of all the csvs in the folder. I would like to grab the first row of all of these files. I do not want to import all rows of the tables because it was way too many rows. It is also too many tables to do one by one. Please tell me what I should do next. Thank you!
First image is where I am, Second image is where I would like to be
The approach below should give you a single table, wherein each column contains a given CSV's first row's values. It's not exactly what you've shown in your second image (namely, there are no blank columns in between each column of values), but it might still be okay for you.
You can parse a CSV with Csv.Document function (which should give you a table).
You can get the first row of the table (from the previous step) using:
Table.First and Record.FieldValues
or Table.PromoteHeaders and Table.ColumnNames
(It would make sense to create a custom function to do above the steps for you and then invoke the function for each CSV. See GetFirstRowOfCsv in code below.)
The function above returns a list (containing the CSV's first row's values). Calling the function for all your CSVs should give you a list of lists, which you can then combine into a single table with Table.FromColumns.
Overall, starting from the Folder.Files call, the code looks like:
let
filesInFolder = Folder.Files("C:\Users\"),
GetFirstRowOfCsv = (someFile as binary) as list =>
let
csv = Csv.Document(someFile, [Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.Csv]),
promoted = Table.PromoteHeaders(csv, [PromoteAllScalars=true]),
firstRow = Table.ColumnNames(promoted)
in firstRow,
firstRowExtracted = Table.AddColumn(filesInFolder, "firstRowExtracted", each GetFirstRowOfCsv([Content]), type list),
combined =
let
columns = firstRowExtracted[firstRowExtracted],
headers = List.Transform(firstRowExtracted[Name], each Text.BeforeDelimiter(_, ".csv")),
toTable = Table.FromColumns(columns, headers)
in toTable
in
combined
which gives me:
The null values are because there were more values in the first row of my ActionLinkTemplate.csv than the first rows of the other CSVs.
You will need to change the folder path in the above code to whatever it is on your machine.
In the GUI, you can select the top N row(s) where you choose N. Then you can expand all remaining rows.
We have a spreadsheet that gets updated monthly, which queries some data from our server.
The query url looks like this:
http://example.com/?2016-01-31
The returned data is in a json format, like below:
{"CID":"1160","date":"2016-01-31","rate":{"USD":1.22}}
We only need the value of 1.22 from the above and I can get that inserted into the worksheet with no problem.
My questions:
1. How to use a cell value [contain the date] to pass the date parameter [2016-01-31] in the query and displays the result in the cell next to it.
2. There's a long list of dates in a column, can this query be filled down automatically per each date?
3. When I load the query result to the worksheet, it always load in pairs. [taking up two cells, one says "Value", the other contains the value which is "1.22" in my case]. Ideally I would only need "1.22", not the title, can this be removed? [Del won't work, will give you a "Column 1" instead, or you have to hide the entire row which will mess up with the layout].
I know this is a lot to ask but I've tried a lot of search and reading in the last few days and I have to say the M language beats me.
Thanks in advance.
Convert your Web.Contents() request into a function:
let
myFunct = ( param as date ) => let
x = Web.Contents(.... & Date.ToText(date) & ....)
in
x
in
myFunct
Reference your data request function from a new query, include any transformations you need (in this case JSON.Document, table expansions, remove extraneous data. Feel free to delete all the extra data here, including columns that just contain the label 'value'.
(assuming your table of domain values already exists) add a custom column like
=Expand(myFunct( [someparameter] ))
edit: got home and got into my bookmarks. Here is a more detailed reference for what you are looking to do: http://datachix.com/2014/05/22/power-query-functions-some-scenarios/
For a table - Add column where you get data and parse JSON
let
tt=#table(
{"date"},{
{"2017-01-01"},
{"2017-01-02"},
{"2017-01-03"}
}),
add_col = Table.AddColumn(tt, "USD", each Json.Document(Web.Contents("http://example.com/?date="&[date]))[rate][USD])
in
add_col
If you need only one value
Json.Document(Web.Contents("http://example.com/?date="&YOUR_DATE_STRING))[rate][USD]