how do I get rid of leading/trailing spaces in SAS search terms? - excel

I have had to look up hundreds (if not thousands) of free-text answers on google, making notes in Excel along the way and inserting SAS-code around the answers as a last step.
The output looks like this:
This output contains an unnecessary number of blank spaces, which seems to confuse SAS's search to the point where the observations can't be properly located.
It works if I manually erase superflous spaces, but that will probably take hours. Is there an automated fix for this, either in SAS or in excel?
I tried using the STRIP-function, to no avail:
else if R_res_ort_txt=strip(" arild ") and R_kom_lan=strip(" skåne ") then R_kommun=strip(" Höganäs " );

If you want to generate a string like:
if R_res_ort_txt="arild" and R_kom_lan="skåne" then R_kommun="Höganäs";
from three variables, let's call them A B C, then just use code like:
string=catx(' ','if R_res_ort_txt=',quote(trim(A))
,'and R_kom_lan=',quote(trim(B))
,'then R_kommun=',quote(trim(C)),';') ;
Or if you are just writing that string to a file just use this PUT statement syntax.
put 'if R_res_ort_txt=' A :$quote. 'and R_kom_lan=' B :$quote.
'then R_kommun=' C :$quote. ';' ;

A saner solution would be to continue using the free-text answers as data and perform your matching criteria for transformations with a left join.
proc import out=answers datafile='my-free-text-answers.xlsx';
data have;
attrib R_res_ort_txt R_kom_lan length=$100;
input R_res_ort_txt ...;
datalines4;
... whatever all those transforms will be performed on...
;;;;
proc sql;
create table want as
select
have.* ,
answers.R_kommun_answer as R_kommun
from
have
left join
answers
on
have.R_res_ort_txt = answers.res_ort_answer
& have.R_kom_lan = abswers.kom_lan_answer
;

I solved this by adding quotes in excel using the flash fill function:
https://www.youtube.com/watch?v=nE65QeDoepc

Related

Struct name from variable in Matlab

I have created a structure containing a few different fields. The fields contain data from a number of different subjects/participants.
At the beginning of the script I prompt the user to enter the "Subject number" like so:
prompt='Enter the subject number in the format SUB_n: ';
SUB=input(prompt,'s');
Example SUB_34 for the 34th subject.
I want to then name my structure such that it contains this string... i.e. I want the name of my structure to be SUB_34, e.g. SUB_34.field1. But I don't know how to do this.
I know that you can assign strings to a specific field name for example for structure S if I want field1 to be called z then
S=struct;
field1='z';
S.(field1);
works but it does not work for the structure name.
Can anyone help?
Thanks
Rather than creating structures named SUB_34 I would strongly recommend just using an array of structures instead and having the user simply input the subject number.
number = input('Subject Number')
S(number) = data_struct
Then you could simply find it again using:
subject = S(number);
If you really insist on it, you could use the method proposed in the comment by #Sembei using eval to get the struct. You really should not do this though
S = eval([SUB, ';']);
Or to set the structure
eval([SUB, ' = mydata;']);
One (of many) reasons not to do this is that I could enter the following at your prompt:
>> prompt = 'Enter the subject number in the format SUB_n: ';
>> SUB = input(prompt, 's');
>> eval([SUB, ' = mydata;']);
And I enter:
clear all; SUB_34
This would have the unforeseen consequence that it would remove all of your data since eval evaluates the input string as a command. Using eval on user input assumes that the user is never going to ever write something malformed or malicious, accidentally or otherwise.

How to input string in a Matlab function

I want to write a function that loads a text file and plots its content with time. I have 20 text files so I want to be able to choose from them.
My current not working code:
TextFile is a generic variable
text123.txt is the actual name of one of the files i want to load
function []= PlotText(TextFile)
text(1,:)=load('text123.txt') ;
t=0:10;
plot(t,text)
end
I appreciate any help!!
use importdata instead of load with appropriate delimiter. I assume you used Tab.
filename = 'num.txt';
delimiterIn = '\t';
text = importdata(filename,delimiterIn)
t=1:10;
plot(t,text);
Firstly, you can also use dlmread if your file contains only numeric data separated by the same symbol (called a delimiter) such as a comma (,), semicolon (;), space ( ), or tab ( ). This would look like:
function []= PlotText(TextFile)
text(1,:)=dlmread('text123.txt');
t=0:10;
plot(t,text)
end
Keep in mind that your code is written in a way that expects the contents of text123.txt to have 11 values in a single row. Also, if you are using multiple files, then I suggest having the file name be another input to the function:
function []= PlotText(TextFile,filename)
text(1,:)=load(filename) ;
t=0:10;
plot(t,text)
end

Storing Matlab data and strings in a tabulated file

I am creating a program which opens an image, and uses the MATLAB ginput command to store x and y coordinates, which are operated on in the loop to fulfill requirements of an if statement and output a number or string corresponding to the region clicked during the ginput session. At the same time, I am using the input command to input a string from the command window relating to these numbers. The ginput session is placed in a while loop so a click in a specific area will end the input session. For each session (while loop), only one or two inputs from the command window are needed. Finally, I am trying to store all the data in a csv or txt file, but I would like it to be tabulated so it is easy to read, i.e. rows and columns with headers. I am including some sample code. My questions are: 1, how can an input of x and y coordinates be translated to a string? It is simple to do this for a number, but I cannot get it to work with a string. 2, any help on printing the strings and number to a tabulated text or cdv file would be appreciated.
Command line input:
prompt='Batter:';
Batter=input(prompt,'s');
While Loop:
count=1;
flag=0;
while(flag==0)
[x,y]= ginput(1);
if (y>539)
flag=1;
end
if x<594 && x>150 && y<539 && y>104
%it's in the square
X=x;
Y=y;
end
if x<524 && x>207 && y<480 && y>163
result='strike'
else
result='ball'
end
[x,y]= ginput(1);
pitch=0;
if x<136 && x>13
%its' pitch column
if y<539
pitch=6;
end
if y<465
pitch=5;
end
if y<390
pitch=4;
end
if y<319
pitch=3;
end
if y<249
pitch=2;
end
if y<175
pitch=1;
end
end
if pitch==0
else
plot(X,Y,'o','MarkerFaceColor',colors(pitch),'MarkerSize',25);
text(X,Y,mat2str(count));
end
count=count+1
M(count,:)=[X,Y,pitch];
end
For the above series of if statements, I would prefer a string output rather than the numbers 1-6 if the condition is satisfied.
The fprintf function is used to print to a file, but I have issues combining the strings and numbers using it:
fileID = fopen('pitches.csv','w');
fid = fopen('gamedata.txt','w');
fmtString = [repmat('%s\t',1,size(Batter,2)-1),'%s\n'];
fprintf(fid,fmtString,Batter,result);
fclose(fid);
for i=1:length(M)
fprintf(fileID,'%6.2f %6.2f %d\n',M(i,1),M(i,2),M(i,3));
end
fclose(fileID);
I have tried adding the string handles to the fprintf command along with the columns of M, but get errors. I either need to store them in an array (How?) and print all the array columns to the file, or use some other method. I also tried a version of the writetable method:
writetable(T,'tabledata2.txt','Delimiter','\t','WriteRowNames',true)
but I can't get everything to work right. Thanks very much for any help.
Let's tackle your questions one at a time:
1, how can an input of x and y coordinates be translated to a string?
You can use the sprintf command in MATLAB. This takes exactly the same syntax as fprintf, but the output of this function will give you a string / character array of whatever you desire.
2, any help on printing the strings and number to a tabulated text or cdv file would be appreciated.
You can still use fprintf but you can specify a matrix as the input. As such, you can do this:
fprintf(fileID,'%6.2f %6.2f %d\n', M.');
This will write the entire matrix to file. However, care must be taken here because MATLAB writes to files in column major format. This means that it will traverse along the rows before going to the next column. If you want to write data row by row, you will need to transpose the matrix first so that when you are traversing down the rows, it will basically do what you want. You will need to keep this in mind before you start trying to write strings to an file. What I would recommend is that you place each string in a cell array, then loop through each element in the cell array and write each string individually line by line.
Hopefully this helps push you in the right direction. Reply back to me in a comment and we can keep talking if you need more help.

Oracle/SQL - Removing undefined chars from string

I currently have an assignemnt where i have to handle data from a lot of countries. My customer have given me a list of acceptable characters, lets call it:
'aber =*'
All other characters should just be changed to '_'.
I know the conversion for my country's specific chars (æøå), easily done with something like
select replace ('Ål', 'Å', 'AA') from dual;
But how would i go about removing all unwanted "noise" without splitting it up in char-by-char comparison?
For example "bear*2 = fear" should become "bear*_ = _ear" as 2 and f are not in the accepted list.
Oracle 10g and up. As one of the approaches, you can use regular expression function regexp_replace():
select regexp_replace('bear*2 = fear', '[^aber =*]', '_') as res
from dual
res
------------------------------
bear*_ = _ear
Find out more about regexp_replace() function.

How to access to specific cells in vtkPolyData?

Question : Is there a way to have direct access to a specific cell in vtkPolyData structure?
I'm using vtkPolyData to store sets of lines, say L.
Currently, I'm using GetLines() to know the number of lines in L. Then, I have to use GetNextCell to go through this set of lines using a "while" loop.
Current code is something like:
vtkSmartPointer<vtkPolyData> a;
...
vtkSmartPointer<vtkCellArray> lines = a->GetLines();
...
while(lines->GetNextCell(numberOfPoints, pointIds) != 0)
-> I'd like to be able to work directly on a specific line by doing something like:
myline = a[10];
doSomething(myline);
you could get an access to a specific cell using vtkDataSet::GetCell(vtkIdType cellId) function

Resources