Parsing a long text string in Excel

I have a column of text data in a csv file that looks something like this for each row:
{variable_1=result1, variable2=null, variable_three=result_3, variable_4=[{var4.1=result4.1, var4.2=result4.2, var4.3=result4.3}], variable_5=null}
I am trying to write some formulas to ultimately have the data look like this in Excel:
raw data | variable_1 | variable2 | variable_three | var4.1 | var4.2 | var4.3 | variable_5
{variable_1=result1, variable2=null, variable_three=result_3, variable_4=[{var4.1=result4.1, var4.2=result4.2, var4.3=result4.3}], variable_5=null} | result1 | null | result_3 | result4.1 | result4.2 | result4.3 | null
The variable names will be the same for each run of the query that fetches this, but the results will vary in character length, which is why I formatted my example the way I did. There are multiple rows of this, too.
What's the best way to go about this?
Edit: There are approx 30 variables in my actual data
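No formula-based answer is recorded for this question here, but since the variable names are fixed, one hedged approach is to flatten each raw string into name=value pairs outside Excel before opening the file. The Python sketch below is an illustration only: the file names raw.csv and parsed.csv are hypothetical, and it assumes each raw string sits, properly quoted, in the first column of the file.
import csv
import re

# Hypothetical input: each row of raw.csv holds one raw string like
# {variable_1=result1, variable2=null, ..., variable_4=[{var4.1=..., ...}], variable_5=null}
with open('raw.csv', newline='') as f:
    rows = [line[0] for line in csv.reader(f)]

parsed = []
for raw in rows:
    # Drop the outer braces and the "variable_4=[{ ... }]" wrapper so the
    # nested var4.x pairs become top-level, matching the desired columns
    flat = re.sub(r'\w+=\[\{', '', raw.strip('{}')).replace('}]', '')
    parsed.append(dict(p.split('=', 1) for p in flat.split(', ')))

# The variable names are constant per the question, so the column order
# can be taken from the first parsed row
header = list(parsed[0])
with open('parsed.csv', 'w', newline='') as f:
    w = csv.writer(f)
    w.writerow(['raw data'] + header)
    for raw, pairs in zip(rows, parsed):
        w.writerow([raw] + [pairs.get(h, '') for h in header])
Newer Excel versions could attempt the same split natively with TEXTSPLIT/TEXTAFTER, but with roughly 30 variables a one-off script keeps the workbook free of a long formula chain.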

Related

Convert ITAB to XSTRING and back

I need to save an itab as an xstring or something similar and store it in a dbtab.
Later I need to read this xstring from the dbtab and convert it back into an itab with exactly the same content as before.
I tried a lot of function modules, like:
SCMS_STRING_TO_XSTRING or SCMS_XSTRING_TO_BINARY, but I didn't find anything to convert it back.
Has anybody tried something like this before and has some samples for me?
Unfortunately I didn't find anything on other blogs either.
An easy solution to convert the table into an xstring:
CALL TRANSFORMATION id SOURCE root = it_table RESULT XML DATA(lv_xstring).
Converting back would be:
CALL TRANSFORMATION id SOURCE XML lv_xstring RESULT root = it_table.
For more information, see the ABAP documentation about data serialization and deserialization by using the XSL Identity Transformation.
Use
IMPORT ... FROM DATA BUFFER
and
EXPORT ... TO DATA BUFFER
to (re)store any variable as an xstring.
Or you can use
IMPORT|EXPORT ... FROM|TO DATABASE ...
I wrote some methods to do this:
First I loop over the table and concatenate it into a string, then convert the string into an xstring.
LOOP AT IT_TABLE ASSIGNING FIELD-SYMBOL(<LS_TABLE>).
  " Append each line, separated by a newline
  CONCATENATE LV_STRING <LS_TABLE> INTO LV_STRING SEPARATED BY CL_ABAP_CHAR_UTILITIES=>NEWLINE.
ENDLOOP.

CALL FUNCTION 'SCMS_STRING_TO_XSTRING'
  EXPORTING
    TEXT   = LV_STRING
  IMPORTING
    BUFFER = LV_XSTRING.
Converting back works in two steps: first the xstring back into a string, then the string into a table.
TRY.
    CL_BCS_CONVERT=>XSTRING_TO_STRING(
      EXPORTING
        IV_XSTR   = LV_XSTRING
        IV_CP     = 1100 " SAP character set identification
      RECEIVING
        RV_STRING = LV_STRING ).
  CATCH CX_BCS.
ENDTRY.

SPLIT LV_STRING AT CL_ABAP_CHAR_UTILITIES=>NEWLINE INTO TABLE <LT_TABLE>.
" The first line is empty because the concatenation above starts with the separator
READ TABLE <LT_TABLE> ASSIGNING FIELD-SYMBOL(<LS_TABLE>) INDEX 1.
IF <LS_TABLE> IS INITIAL.
  DELETE <LT_TABLE> INDEX 1.
ENDIF.

Converting Issue date with download to CSV

I have a problem with the function module GUI_DOWNLOAD because of date conversion.
I want to get the date exactly as I have it in my internal table, but CSV (Excel) keeps converting it every time.
The internal table contains a line like this: 12345678;GroupDate;2021-12-31;
The output in the .csv file should be "2021-12-31" but it keeps getting converted to "31.12.2021".
I also tried to put an ' (apostrophe) before the date, but then the output is '2021-12-31.
Does anybody have an idea?
LV_CONV = '2021-12-31'.
CONCATENATE TEXT-001 LV_CONV INTO LV_CONV.

CALL FUNCTION 'GUI_DOWNLOAD'
  EXPORTING
    FILENAME = IV_PATH
  TABLES
    DATA_TAB = LT_FILE.

LT_FILE is a string table.
Thanks for the help.
As Suncatcher and Sandra said, the file itself is correct; it is only Excel's settings that convert the date for display.
If the output file is not needed for anything other than viewing, the code could be something like this:
CONCATENATE '=("' LV_CONV '")' INTO LV_CONV.
The cell then displays a date like 1960-01-01, while the underlying cell value looks like =("1960-01-01").
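The =("...") wrapper is not ABAP-specific; whatever program writes the CSV can apply it. A minimal sketch in Python, assuming a hypothetical out.csv and the semicolon-separated record layout from the question:
# Record layout from the question, one semicolon-separated line per row
rows = [["12345678", "GroupDate", "2021-12-31"]]

with open("out.csv", "w", newline="") as f:
    for r in rows:
        # Wrap the date in a formula so Excel displays the text verbatim
        # instead of reformatting it as a locale-dependent date
        r[2] = '=("{}")'.format(r[2])
        f.write(";".join(r) + "\n")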

Reordering data by manipulating column wise in Python

I have data in a csv file as follows:
60,27702,1938470,13935,18513,8
60,32424,1933740,16103,15082,11
60,20080,1946092,9335,14970,2
60,28236,1937936,13799,16871,6
60,22717,1943455,10809,16726,4
120,37702,2938470,23935,28513,8
120,42424,2933740,26103,25082,11
120,30080,2946092,2335,24970,2
120,38236,2937936,23799,26871,6
120,32717,2943455,20809,26726,4
180,47702,3938470,33935,8513,8
180,52424,3933740,36103,5082,11
180,40080,3946092,3335,4970,2
180,48236,3937936,33799,6871,6
180,42717,3943455,30809,6726,4
I then used the following code to insert column headings:
df = pd.read_csv("contikiMAC_new_out.csv", names=['Energest','CPU','LPM','Transmit','Listen','ID'])
I used df.groupby(['ID']) to view the data grouped by the column 'ID'.
The problem is that the data in the 'LPM' column gets reset after some time, so I would like to add the previous value to the new value whenever the new value in the LPM column is smaller, for each specific 'ID'.
I tried doing:
for x in df.groupby(['ID']):
    for i in df.ID:
        if (df.loc[i, 'LPM'] < df.loc[i - 1, 'LPM']):
            df.loc[i, 'LPM'] = df.loc[i, 'LPM'] + df.loc[i - 1, 'LPM']
But this does not give the result I want, because it mixes the 'LPM' values of different 'ID's and the process takes a long time. Can anyone suggest a way to write the data group-wise to a csv file based on 'ID', after performing the sum operation?
The data structure I would like to see is as follows:
60,27702,1938470,13935,18513,8
120,37702,2938470,23935,28513,8
180,47702,3938470,33935,37026,8
60,32424,1933740,16103,15082,11
120,42424,2933740,26103,25082,11
180,52424,3933740,36103,30164,11
60,20080,1946092,9335,14970,2
120,30080,2946092,2335,24970,2
180,40080,3946092,3335,29940,2
60,28236,1937936,13799,16871,6
120,38236,2937936,23799,26871,6
180,48236,3937936,33799,33742,6
60,22717,1943455,10809,16726,4
120,32717,2943455,20809,26726,4
180,42717,3943455,30809,33452,4
If I understood your problem correctly, DataFrame.shift is what you're looking for.
Something like:
df['LPM_prev'] = df.groupby(['ID'])['LPM'].shift(1)
And then you can work with that column, as in the sketch below.
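To tie this back to the desired output above, here is a hedged sketch of how the shifted column could drive the correction and the group-wise write. It assumes at most one reset per 'ID', as in the sample data, and the output name contikiMAC_grouped.csv is made up. (The question names the 'LPM' column, though in the sample rows it is the fifth column, 'Listen', that resets; the same logic applies to either.)
import pandas as pd

# Column names as given in the question
df = pd.read_csv("contikiMAC_new_out.csv",
                 names=['Energest', 'CPU', 'LPM', 'Transmit', 'Listen', 'ID'])

# Previous LPM value within each ID group (NaN on the first row of a group)
prev = df.groupby('ID')['LPM'].shift(1)

# Wherever the value dropped below its predecessor, fold the predecessor back in
reset = df['LPM'] < prev
df['LPM'] = df['LPM'].mask(reset, df['LPM'] + prev).astype(int)

# Write the rows grouped by ID, keeping the IDs in order of first appearance
out = pd.concat(g for _, g in df.groupby('ID', sort=False))
out.to_csv("contikiMAC_grouped.csv", index=False, header=False)
Working on the whole column at once avoids the slow row-by-row loop, and grouping before the write keeps the 'LPM' values of different 'ID's from mixing.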

Replace all error values of all columns after importing datas (while keeping the rows)

An Excel table used as a data source may contain error values (#N/A, #DIV/0!), which can disturb later steps of the transformation process in Power Query.
Depending on the following steps, we may get no output at all, just an error. So how can these cases be handled?
I found two standard steps in Power Query to catch them:
Remove errors (UI: Home/Remove Rows/Remove Errors) -> all rows containing an error will be removed
Replace error values (UI: Transform/Replace Errors) -> the columns first have to be selected to perform this operation
The first possibility is not a solution for me, since I want to keep the rows and only replace the error values.
In my case the data table will change over time: column names may change (e.g. years) and new columns may appear. So the second possibility is too static, since I do not want to adjust the script every time.
So I looked for a dynamic way to clean all columns, independent of the column names (and the number of columns). The query below replaces the errors with an empty value.
let
    Source = Excel.CurrentWorkbook(){[Name="Tabelle1"]}[Content],
    // Replace the errors of all columns of the data source; the column names don't play any role
    Cols = Table.ColumnNames(Source),
    ColumnListWithParameter = Table.FromColumns({Cols, List.Repeat({""}, List.Count(Cols))}, {"ColName", "ErrorHandling"}),
    ParameterList = Table.ToRows(ColumnListWithParameter),
    ReplaceErrorSource = Table.ReplaceErrorValues(Source, ParameterList)
in
    ReplaceErrorSource
Here are the messages of the three different queries, after I added two new columns (with errors) to the source.
If anybody has another solution for this kind of data cleaning, please post it here.
let
    src = Excel.CurrentWorkbook(){[Name="Tabelle1"]}[Content],
    cols = Table.ColumnNames(src),
    replace = Table.ReplaceErrorValues(src, List.Transform(cols, each {_, "!"}))
in
    replace
Just for novices like me in Power Query:
"!" can be any string used as the substitute for error values; I initially thought it was a wildcard.
List.Transform(cols, each {_, "!"}) generates the list of error-handling pairs, one per column, for the main function:
Table.ReplaceErrorValues(table_with_errors, {{col1, error_str1}, {col2, error_str2}, ..., {coln, error_strn}})
Nice elegant solution, Sergei

MATLAB export data stored in a double array and cell array to a CSV file

I have a MATLAB structure with 19 fields. The main field is a 1 x 108033 double with all values numeric. It looks like this, basically 108033 numbers:
pnum: 5384940 5437561 5570271 5661637 5771155 ...
I have another field called inventors which is a 1 x 108033 cell array. Every cell contains a different number of strings. Columns 1 to 5, for example, are
inventors: {2x1 cell} {4x1 cell} {1x1 cell} {1x1 cell} {1x1 cell}
For the first column, the 2 x 1 cell consists of the values
5012491-01 and 2035147-03, and so on.
I'd like to jointly export these two fields to a CSV file. The ideal outcome would repeat the number in pnum so that it establishes a clear link between pnum and the inventors. Thus, the ideal outcome would look something like this (with the contents of the inventors cells displayed):
pnum inventors
5384940 5012491-01
5384940 2035147-03
5437561 5437561-01
5437561 5437561-02
5437561 5437561-03
5437561 5012491-02
5570271 5437561-03
5661637 1885634-08
5771155 5012491-01
I asked a more complex version of this question before but it was not clear enough what the problem was. Hope it is now.
I'm assuming each cell in inventors is a cell array of strings. It wouldn't make sense for these to be actual floating point or integer numbers, because the dash would be interpreted as subtracting the two numbers it separates. Now, because you're writing to a CSV file, the easiest thing I can think of is to iterate over each number and cell, then repeat the ID number for as many times as there are elements in the cell. First create the right headers, then write your results. Something like this comes to mind:
f = fopen('data.csv', 'w'); %// Open up data for writing
fprintf(f, 'pnum,inventors\n'); %// Write headers
for ii = 1 : numel(pnum) %// For each unique number
inventor = inventors{ii};
for jj = 1 : numel(inventor) %// For each inventor ID
fprintf(f, '%d,%s\n', pnum(ii), inventor{jj}); %// Write the right combo to file
end
end
fclose(f); %// Close the file
fopen here opens up a file called data.csv so we can write things to it. What is returned is a file pointer called f, which we use to write to this file. After that, we write the headers of the file, consisting of pnum and inventors; this is a CSV file, so a comma separates the two. Now, for each unique number, we access the right slot in inventors, and for each inventor in that cell, we write a line pairing the same unique ID with that inventor ID. I use fprintf to write to the file through the file pointer established earlier. Once we're done, we close the file with fclose.
To double check that this works, I've used the small example you've provided in your post:
pnum = [5384940 5437561 5570271 5661637 5771155];
inventors = {{'5012491-01', '2035147-03'}.', {'5437561-01', '5437561-02', '5437561-03', '5012491-02'}.', {'5437561-03'}, {'1885634-08'}, {'5012491-01'}};
Bear in mind that I don't have access to your struct, so you'll have to access the right fields and assign them to the corresponding variables seen above. So if your struct is called something like data, then you'd do this before you run the above code:
pnum = data.pnum;
inventors = data.inventors;
Running the above code I just wrote and opening up the CSV file (which is called data.csv), I get this:
pnum,inventors
5384940,5012491-01
5384940,2035147-03
5437561,5437561-01
5437561,5437561-02
5437561,5437561-03
5437561,5012491-02
5570271,5437561-03
5661637,1885634-08
5771155,5012491-01
