I am trying to parse an input file given the following format.
file = "Begin 'big section header'
#... section contents ...
sub 1: value
sub 2: value
....
Begin 'interior section header'
....
End 'interior section header'
End 'big section header'"
to return a list that greedily grabs everything between the labeled section header value
['section header', ['section contents']]
my current attempt looks like this
import pyparsing as pp
begin = pp.Keyword('Begin')
header = pp.Word(pp.alphanums+'_')
end = pp.Keyword('End')
content = begin.suppress() + header + pp.SkipTo(end + header)
content.searchString(file).asList()
returns
['section header', ['section contents terminated at the first end and generic header found']]
i suspect my grammar needs to be changed to some form of
begin = pp.Keyword('Begin')
header = pp.Word(pp.alphanums+'_')
placeholder = pp.Forward()
end = pp.Keyword('End')
placeholder << begin.suppress() + header
content = placeholder + pp.SkipTo(end + header)
but I cant for the life of me figure out the correct assignment to the Forward object that doesn't give me what I already have.
Even easier than Forward in this case would be to use matchPreviousLiteral:
content = begin.suppress() + header + pp.SkipTo(end + matchPreviousLiteral(header))
You are matching any end, but what you want is the end that matches the previous begin.
Related
I have a sigle column ComboBox that I have populated with a dynamic array; however, I'd like the code to replace the value of the combo box with respective column from my ListBox.
'Prepares the Active Escorts list box.
ivb = 0
i = 0
With frmEntry
.listboxActiveEscorts.Clear
.listboxActiveEscorts.ColumnHeads = False
.listboxActiveEscorts.ColumnCount = "15"
.listboxActiveEscorts.ColumnWidths = "0,100,100,0,0,100,100,0,0,0,0,0,100,100,80"
i = 0
'Badge # combobox properties
ReDim vbArray(0 To vbArrayCount - 1)
For i = 0 To vbArrayCount - 1
ivb = ivb + 1
vbArray(ivb - 1) = loVisBadge.Range.Cells(i + 2, 1).Value
Next i
.cbxVisitorBadgeNumber.List = vbArray
End With
This population of the ComboBox executes beautifully, in my belief, but if you see a better way to implement the dynamic list, I'm all ears.
I have a button that I will use to reset the form, which also works for the most part. When an item is selected from the list, I am able to clear all controls except for the ComboxBox containing the array of values. I will be adding two additional arrays to each of the other ComboBoxes on the form, and I imagine I am going to have the same problem: "Could not set the value property. Invalid property value"
This is the code assigned to click event of the ListBox:
Private Sub listboxActiveEscorts_Click()
With frmEntry
.cbxEscortSelectName.Value = .listboxActiveEscorts.Column(1) + " " + .listboxActiveEscorts.Column(2)
.txtCredential.Value = .listboxActiveEscorts.Column(4)
.txtEscortCompany.Value = .listboxActiveEscorts.Column(3)
.cbxVisitorName.Value = .listboxActiveEscorts.Column(5) + " " + .listboxActiveEscorts.Column(6)
.txtVECompany.Value = .listboxActiveEscorts.Column(7)
.txtVEDOB.Value = .listboxActiveEscorts.Column(8)
.txtVEIdentification.Value = .listboxActiveEscorts.Column(9)
.txtVEIDNumber.Value = .listboxActiveEscorts.Column(10)
.txtVEExpirationDate.Value = .listboxActiveEscorts.Column(11)
.txtVEStart.Value = .listboxActiveEscorts.Column(12)
.txtVEEnd.Value = .listboxActiveEscorts.Column(13)
'.cbxVisitorBadgeNumber.Value = ""
.cbxVisitorBadgeNumber = vbNullString
.cbxVisitorBadgeNumber.Value = .listboxActiveEscorts.Column(14)
End With
End Sub
What am I missing here. I tried to ReDim the array that assigned the values, but that didn't work. Is it a data type thing, perhaps?
In the picture below, you'll see the values populated in the controls, all except the visitor badge # (ComboBox which throws the error...I have commented out the line for illustration purposes, so you'll see the visitor badge # is blank).
I have a text file c:\test.txt with lot of information and I need only a few details from the file. Here is how my data looks
an army of ants
bchskkkk/kk/kl
id: intyst#abc.com
subject: this is an email
to xyz
kkdkdlkadkadk;kd;
jjdjsjdlasjdaljdljd
<st> This is my actual content
klfjaakjalkjflajflajefljalkfj
daklkajflkjalfkjaljflkajfkl
kkdlkal;dka;ldk <st>
In the above the rows starting with id, subject, and a few rows with start and end as are what I need in my dataset
This is what I have tried
filename data 'c:\test.txt';
data want;
infile data lrecl=1000 missover;
input #3 $ id 3-25 #4 $ sub1 10-25 #8 $ cmc 4-55
run;
The above doesn't solve my purpose. I have some 10k lines in each file with the above format and the text between and can be more than 10 rows.
Is there any better way of solving this?
Thank you
You cannot grab the data using a single input statement.
You will need to do 'landmark' detection in order to discover and extract the data portions you want, all the while retaining the parts you find as you scan the file.
Presuming id: is always a new data set row indicator and always present as the final part for a row, the following could work (not tested):
data messages;
length id $50;
length subject $100;
length message $1000;
retain id subject message;
infile data lrecl=1000 _infile_=line;
input;
if line =: "id" then do;
id_line_number = _n_;
id = substr(line,length("id:")+1);
subject = "";
content = "";
end;
if subject = "" and line =: "subject:" then do;
subject = substr(line,length("subject:")+1);
end;
if message = "" then do;
if line =: "<st>" then do;
* initial line in <st> block;
message = line;
end;
end;
else do;
* accumulate lines within <st> block;
message = trim(message) || ' ' line;
end;
* termination of <st> block, triggers a complete record and output to data set;
if length(message)>4 and substr(message,length(message)-3) = "<st>" then do;
message = substr(message,5,length(message)-8);
output;
message = "";
end;
run;
Some additional coding would be needed if the subject can be wrapped and is continued in subsequent adjacent lines as indented content
I'm using Matlab R2014b (that's why I cannot use strings, but only char vectors). Working inside a class, I have to take data from a table variable, format it following my needs, and then insert it into a GUI table (an instance of uitable, to be exact):
function UpdateTable(this)
siz = size(mydata);
tab = cell(siz);
tab(:,1) = num2cell(this.Data.ID);
tab(:,2) = cellstr(datestr(this.Data.Date,'dd/mm/yyyy'));
tab(:,3) = arrayfun(#(x){MyClass.TypeDef1{x,1}},this.Data.Type1);
tab(:,4) = arrayfun(#(x){MyClass.TypeDef2{x,1}},this.Data.Type2);
tab(:,5) = arrayfun(#(x){MyClass.FormatNumber(x)},this.Data.Value);
this.UITable.Data = tab;
end
Where:
properties (Access = private, Constant)
TypeDef1 = {
'A1' 'Name A1';
'B1' 'Name B1';
'C1' 'Name C1';
'D1' 'Name D1';
...
}
TypeDef2 = {
'A2' 'Name A2';
'B2' 'Name B2';
'C2' 'Name C2';
'D2' 'Name D2';
...
}
end
methods (Access = private, Static)
function str = FormatNumber(num)
persistent df;
if (isempty(df))
dfs = java.text.DecimalFormatSymbols();
dfs.setDecimalSeparator(',');
dfs.setGroupingSeparator('.');
df = java.text.DecimalFormat();
df.setDecimalFormatSymbols(dfs);
df.setMaximumFractionDigits(2);
df.setMinimumFractionDigits(2);
end
str = char(df.format(num));
end
end
Everything is working fine. Now I would like to right justify the strings to be inserted in columns 1 and 5, to improve the table readability. I found the Matlab function that suits my needs, strjust. Reading the documentation, I saw that it can be used with cell arrays of char vectors, so I modified part of my UpdateTable code as follows:
tab(:,1) = cellstr(num2str(this.Data.ID));
tab(:,5) = strjust(arrayfun(#(x){MyClass.FormatNumber(x)},this.Data.Value));
TThe second one produces no changes (strings are still not justified). Should the strings already contain enough whitespace to be all the same length?
Ok, I solved the problem by myself using the following code:
function UpdateTable(this)
siz = size(this.Data);
los = arrayfun(#(x){MyClass.FormatNumber(x)},this.Data.Value);
los_lens = cellfun(#(x)numel(x),los);
pad = cellfun(#blanks,num2cell(max(los_lens) - los_lens),'UniformOutput',false);
tab = cell(siz);
tab(:,1) = cellstr(num2str(this.Data.ID));
tab(:,2) = cellstr(datestr(this.Data.Date,'dd/mm/yyyy'));
tab(:,3) = arrayfun(#(x){MyClass.TypeDef1{x,1}},this.Data.Type1);
tab(:,4) = arrayfun(#(x){MyClass.TypeDef2{x,1}},this.Data.Type2);
tab(:,5) = cellstr(strcat(pad,los));
this.UITable.Data = tab;
end
It's probably not the most elegant solution, but it works. Starting from Matlab 2016, the padding can be performed using the built-in pad function.
I'm quite new to Matlab and I'm struggling trying to figure out how to properly preprocess my data in order to make some calculations with it.
I have an Excel table with financial log returns of many companies such that every row is a day and every column is a company:
I imported everything correctly into Matlab like this:
Now I have to create what's caled "rolling windows". To do this I use the following code:
function [ROLLING_WINDOWS] = setup_returns(RETURNS)
bandwidth = 262;
[rows, columns] = size(RETURNS);
limit_rows = rows - bandwidth;
for i = 1:limit_rows
ROLLING_WINDOWS(i).SYS = RETURNS(i:bandwidth+i-1,1);
end
end
Well if I run this code for the first column of returns everything works fine... but my aim is to produce the same thing for every column of log returns. So basically I have to add a second for loop... but what I don't get is which syntax I need to use in order to make that ".SYS" dynamic and based on my array of string cells containing company names so that...
ROLLING_WINDOWS(i)."S&P 500" = RETURNS(i:bandwidth+i-1,1);
ROLLING_WINDOWS(i)."AIG" = RETURNS(i:bandwidth+i-1,2);
and so on...
Thanks for your help guys!
EDIT: working function
function [ROLLING_WINDOWS] = setup_returns(COMPANIES, RETURNS)
bandwidth = 262;
[rows, columns] = size(RETURNS);
limit_rows = rows - bandwidth;
for i = 1:limit_rows
offset = bandwidth + i - 1;
for j = 1:columns
ROLLING_WINDOWS(i).(COMPANIES{j}) = RETURNS(i:offset, j);
end
end
end
Ok everything is perfect... just one question... matlab intellissense tells me "ROLLING_WINDOWS appears to change size on every loop iteration bla bla bla consider preallocating"... how can I perform this?
You're almost there. Use dynamic field names by building strings for fields. Your fields are in a cell array called COMPANIES and so:
function [ROLLING_WINDOWS] = setup_returns(COMPANIES, RETURNS)
bandwidth = 262;
[rows, columns] = size(RETURNS);
limit_rows = rows - bandwidth;
%// Preallocate to remove warnings
ROLLING_WINDOWS = repmat(struct(), limit_rows, 1);
for i = 1:limit_rows
offset = bandwidth + i - 1;
for j = 1:columns
%// Dynamic field name referencing
ROLLING_WINDOWS(i).(COMPANIES{j}) = RETURNS(i:offset, j);
end
end
end
Here's a great article by Loren Shure from MathWorks if you want to learn more: http://blogs.mathworks.com/loren/2005/12/13/use-dynamic-field-references/ ... but basically, if you have a string and you want to use this string to create a field, you would do:
str = '...';
s.(str) = ...;
s is your structure and str is the string you want to name your field.
EDIT: I figured out the problem. claim was listed as a cell class. Used cell2mat to convert it to char, and the code worked. Thank you, everybody!
I have a string variable set to a file pathway/location. I would like to use this variable as the input to the xlsread function, but Matlab tells me that xlsread cannot take a variable input. I'm having lots of trouble figuring whether a workaround is even possible. Can someone help me out?
function C = claimReader()
inp = csv2struct(['C:\Documents and Settings\nkulczy\My Documents\085 Starry Sky\','Starry_Sky_inputs_vert.csv']);
inputTitles = [{'Output_Dir'};{'Claimz'};{'Prodz'};{'Fault_Locations'};{'Fault_Type'};{'Primary_Failed_Part'};{'Part_Group'};{'Selected_SEAG'};{'Change_Point'};{'Date_Compile'};{'Minimum_Date'}];
claim = inp.(cell2mat(inputTitles(2))); %returns a file path/location string
C = csv2struct(claim);
end
function Out = csv2struct(filename)
%% read xls file with a single header row
[~, ~, raw] = xlsread(filename);
[~ , ~, ext] = fileparts(filename);
if ~strcmpi(ext, '.csv') %convert non .csv files to .csv, so blanks stay blank
filename=[pwd,'tempcsv',datestr(now,'yymmddHHMMSSFFF'),'.csv'];
xlswrite(filename,raw);
[~ , ~, raw] = xlsread(filename);
delete(filename);
end
if size(raw,1)==11 && size(raw,2)==2 %transpose SS inputs (must match dimensions of input matrix EXACTLY!!!)
raw = raw';
end
nRow = size(raw,1);
nCol = size(raw,2);
header = raw(1,:);
raw(1,:) = [];
end
Use the syntax as following and there shouldn't be any problems:
pathname = 'c:\...\filename.xlsx';
A = xlsread(pathname);
Edit: regarding your code:
I can't see where you define filename - you should pass claim (it contains the desired path?) to the xlsread-function.
Probably you get a cell with chars. So your input needs to be claim{1}
Check the documentation of xlsread, the first output (yourNums) will only return the numerical values in the sheet. txt will only return text. rawData will return the raw data in the sheet.
flNm = 'c:\myFolder\myFile.xlsx';
[yourNums, txt, rawData] = xlsread(flNm);
Update after TS:
claim is a cell array. So you need to pass claim{1} in order to let it be a string.