R string comparison - string

I am new to R and am trying to bring together two datasets (here answc and diagc) based on matching contents. Since the string "1 - Tester1" occurs twice in answc, I would expect answc==diagc to yield 1 (= true) at least twice in res; see the example below.
Where did I go wrong?
head(answc)
[1] "1 - Tester1" "2 - Tester2" "3 - Tester3" "1 - Tester1" "2 - Tester2"
[6] "3 - Tester3"
is.character(answc)
[1] TRUE
head(diagc)
[1] "1 - Tester1"
is.character(diagc)
[1] TRUE
res<-ifelse(answc==diagc, 1, 0)
head(res)
[1] 0 0 0 0 0 0

Thank you for the feedback
The hint about str() confirmed that the problem was probably in the data types. I redid the whole process with data from ANSI-formatted CSV files, read them with "stringsAsFactors=FALSE", and made sure that the relevant answc and diagc really are "chr".
The second attempt produced the desired matches, and although I can't pinpoint the exact error, I would like to close this question.
Thank you
Christian
PS: From now on I'll always check the encoding and the classes of the elements that are involved in a comparison/match...


Print "n" results of a function "n" times

I want to print out a sort of pyramid. User inputs an integer value 'i', and that is displayed i-times.
Like if input=5
1
22
333
4444
55555
I have tried this:
input=5
for i in range(input+1):
    print("i"*i)
    i=i+1
The result of which is
i
ii
iii
iiii
iiiii
The problem is that (as far as I know), only a string can be printed out 'n' times, but if I take out the inverted commas around "i", it becomes (i*i) and gives out squares:
0
1
4
9
16
25
Is there a simple way around this?
Thanks!
Just convert your int loop variable to str before building the output string by multiplying:
input = 5
for i in range(1, input+1):
    print(str(i) * i)
Try this:
a = 5
for i in range(a): # <-- this causes i to go from 0,1,2,3,...,a-1
    print("{}".format(i+1)*(i+1)) # <-- this creates a new string in each iteration; an alternative would be str(i+1)*(i+1)
    i=i+1 # <-- this is unnecessary, i already goes from 0 to a-1 and will be re-assigned in the next iteration of the loop.
This creates a new string in each iteration of the loop.
Note that for i in range(a) will go through the range by itself. There is no need to additionally increment i at the end. In general, it is considered bad practice to change the index you are looping over.
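A small sketch to check the point about the loop variable. The function name pyramid_rows and the deliberately exaggerated i = i + 100 are made up for illustration only. Reassigning i inside the body does not change what range() hands out on the next pass:

```python
def pyramid_rows(n):
    """Return the rows of the number pyramid for 1..n as strings."""
    rows = []
    for i in range(1, n + 1):
        rows.append(str(i) * i)  # repeat the digit string, not the integer
        i = i + 100  # no lasting effect: range() supplies the next value anyway
    return rows


for row in pyramid_rows(5):
    print(row)
```

Returning the rows as a list instead of printing inside the loop is just a convenience here; it makes the result easy to inspect.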

Looking to check if part of a user input can be in a range of integers

I'm fairly new to Python,
I'm trying to check if the user input can be checked in a range of integers
The following is the code I have already written
#LL DD LLL
#where L is a letter
#where D is a digit
#eg SG 61 ABC
area_codes = ["SG", "PV", "LJ", "EX"]
reg = input("Enter registration: ")
if reg[0:2] in area_codes:
    print(reg[0:2])
    if reg[2:3] in range(0,18):
        print(reg[2:3])
    else:
        print("nope")
And this is the response I am given,
Enter registration: SG15
SG
nope
How do I check this properly?
I have tried a few things but I don't even know if this is possible.
Thank you in advance,
Donberry.
reg[2:3] is a slice of your input string, so it is a digit stored as a string, not an integer.
When you do:
if reg[2:3] in range(0,18):
you're checking whether the string is contained in the range object (Python 3) or list object (Python 2), which contains integers. So the test fails every time.
Had you done
if 0 <= reg[2:3] < 18:
you'd have gotten an explicit error in Python 3. Besides, it avoids building a range or list object just for the sake of testing. A chained comparison like this is way faster.
So I'm suggesting:
if 0 <= int(reg[2:3]) < 18:
You should convert the string to an integer before checking whether it's in the range. Also (and I don't know if you did this), you should verify that you really want numbers between 0 and 17, which is what your code checks for.
That is, range(0, 18) - equivalent to range(18), by the way - generates the list of numbers starting at 0 and ending at 17, including both 0 and 17.
Anyway, you would check it like this:
if int(reg[2:3]) in range(0,18):
    print(reg[2:3])
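One thing worth noting: reg[2:3] picks up a single character, so for "SG15" it only sees "1". A minimal sketch of the whole check, assuming the intent is the two digits right after the area code (reg[2:4]) and input typed without spaces; the function name check_registration is hypothetical:

```python
AREA_CODES = ["SG", "PV", "LJ", "EX"]


def check_registration(reg):
    """Return (area, number) if reg starts with a known area code followed
    by two digits in the range 0..17, otherwise None."""
    area = reg[0:2]
    digits = reg[2:4]  # assumption: two digits directly after the area code
    if area not in AREA_CODES:
        return None
    if len(digits) != 2 or not digits.isdecimal():
        return None  # avoids the ValueError that int() raises on non-digits
    number = int(digits)
    if 0 <= number < 18:
        return area, number
    return None


print(check_registration("SG15"))  # ('SG', 15)
print(check_registration("SG61"))  # None
```

Testing isdecimal() before calling int() keeps bad input such as "SGAB" from crashing the script.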

Pattern Matching BASIC programming Language and Universe Database

I need to identify the following patterns in a string.
- "2N':'2N':'2N"
- "2N'-'2N'-'2N"
- "2N'/'2N'/'2N"
- "2N'/'2N'-'2N"
and so on.
Basically, written in plain language, I want this pattern:
2 NUMBERS [: / -] 2 NUMBERS [: / -] 2 NUMBERS
So is there any way to write one pattern that covers all the possible scenarios? Otherwise I have to write a total of 9 patterns and match all 9 against the string. And that is not even the actual scenario in my code: there I have to match four 2-digit numbers separated by [: / -], for which I would have to write a total of 27 patterns. For ease of understanding I have used the scenario with three 2-digit numbers here.
Please help me. Thank you.
Maybe you could try something like (Pick R83 style)
OK = X MATCH "2N1X2N1X2N" AND X[3,1]=X[6,1] AND INDEX(":/-",X[3,1],1) > 0
Where variable X is some input string like: 12-34-56
Should set variable OK to 1 if validation passes, else 0 for any invalid format.
This seems to get all your required validation into a single statement. I have assumed that the non-numeric characters have to be the same. If this is not true, the check could be changed to something like:
OK = X MATCH "2N1X2N1X2N" AND INDEX(":/-",X[3,1],1) > 0 AND INDEX(":/-",X[6,1],1) > 0
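For comparison only, here is how the same two checks might look with a regular expression in Python. This is not UniVerse BASIC, and the names ANY_SEPARATOR, SAME_SEPARATOR and is_valid are invented for this sketch. The backreference \1 plays the role of X[3,1]=X[6,1]:

```python
import re

# Two digits, a separator from ":/-", two digits, a separator, two digits.
ANY_SEPARATOR = re.compile(r"\d{2}[:/-]\d{2}[:/-]\d{2}")
# Same idea, but the second separator must repeat the first (backreference).
SAME_SEPARATOR = re.compile(r"\d{2}([:/-])\d{2}\1\d{2}")


def is_valid(text, same_separator=True):
    """Return True if the whole string has the form NN?NN?NN."""
    pattern = SAME_SEPARATOR if same_separator else ANY_SEPARATOR
    return pattern.fullmatch(text) is not None


print(is_valid("12-34-56"))                        # True
print(is_valid("12/34-56"))                        # False
print(is_valid("12/34-56", same_separator=False))  # True
```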
Ok, I guess the requirement of surrounding characters was not obvious to me. Still, it does not make it much harder. You just need to 'parse' the string looking for the first (I assume) such pattern (if any) in the input string. This can be done in a couple of lines of code. Here is a (rather untested) R83 style test program:
PROMPT ":"
LOOP
   LOOP
      CRT 'Enter test string':
      INPUT S
   WHILE S # "" AND LEN(S) < 8 DO
      CRT "Invalid input! Hit RETURN to exit, or enter a string with >= 8 chars!"
   REPEAT
UNTIL S = "" DO
   *
   * Look for 1st occurrence of pattern in string..
   CARDNUM = ""
   FOR I = 1 TO LEN(S)-7 WHILE CARDNUM = ""
      IF S[I,8] MATCH "2N1X2N1X2N" THEN
         IF INDEX(":/-",S[I+2,1],1) > 0 AND INDEX(":/-",S[I+5,1],1) > 0 THEN
            CARDNUM = S[I,8] ;* Found it!
         END ELSE I = I + 8
      END
   NEXT I
   *
   CRT CARDNUM
REPEAT
There are only 7 or 8 lines here that actually look for the card number pattern in the source/test string.
Not quite perfect, but how about 2N1X2N1X2N? This gets you 2 numbers followed by 1 character of any kind, followed by 2 numbers, and so on.
This might help:
BIG.STRING ="HELLO TILDE ~ CARD 12:34:56 IS IN THIS STRING"
TEMP.STRING = BIG.STRING
CONVERT "~:/-" TO "*~~~" IN TEMP.STRING
IF TEMP.STRING MATCHES '0X2N"~"2N"~"2N0X' THEN
   FIRST.TILDE.POSN = INDEX(TEMP.STRING,"~",1)
   CARD.STRING = BIG.STRING[FIRST.TILDE.POSN-2,8]
   PRINT CARD.STRING
END
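To illustrate pulling the pattern out of a longer string, here is a rough Python equivalent (again not BASIC; find_card and CARD_PATTERN are made-up names). A regex search returns the position of the whole pattern rather than of the first separator, so an earlier stray ":" in the text should not throw it off:

```python
import re

CARD_PATTERN = re.compile(r"\d{2}[:/-]\d{2}[:/-]\d{2}")


def find_card(text):
    """Return the first NN?NN?NN occurrence in text, or "" if there is none."""
    match = CARD_PATTERN.search(text)
    return match.group(0) if match else ""


print(find_card("HELLO TILDE ~ CARD 12:34:56 IS IN THIS STRING"))  # 12:34:56
print(find_card("NOTE: CARD 12/34-56 FOUND"))                      # 12/34-56
```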

Recognize relevant string information by checking the first characters

I have a table with 2 columns. In column 1 I have string information; in column 2 I have a logical index.
%% Tables and their use
T={'A2P3';'A2P3';'A2P3';'A2P3 with (extra1)';'A2P3 with (extra1) and (extra 2)';'A2P3 with (extra1)';'B2P3';'B2P3';'B2P3';'B2P3 with (extra 1)';'A2P3'};
a={1 1 0 1 1 0 1 1 0 1 1 }
T(:,2)=num2cell(1);
T(3,2)=num2cell(0);
T(6,2)=num2cell(0);
T(9,2)=num2cell(0);
T=table(T(:,1),T(:,2));
class(T.Var1);
class(T.Var2);
T.Var1=categorical(T.Var1)
T.Var2=cell2mat(T.Var2)
class(T.Var1);
class(T.Var2);
if T.Var1=='A2P3' & T.Var2==1
    disp 'go on'
else
    disp 'change something'
end
UPDATES:
I will update this section as soon as I know how to copy my workspace into a code format
** still don't know how to do that but here it goes
*** Why working with tables is a double-edged sword (but still cool): I have to be very aware of the class of each column in the table in order to refer to it in an if/else construct. Here I had to convert one column to categorical and the other from cell to double to make it work...
Here is what my data looks like:
I want to have this:
if T.Var1=='A2P3*************************' & T.Var2==1
    disp 'go on'
else
    disp 'change something'
end
I managed to get MATLAB to do what I want, but the whole point of this post is: how do I tell MATLAB to ignore whatever comes after A2P3 in the string, where the string length is variable? Otherwise it would be very tiring to look up every single variant of the string that starts with A2P3 (and with B2P3 etc.) just to say that.
How do I do that?
Assuming you are working with T (cell array) as listed in your code, you may use this code to detect the successful matches -
%%// Slightly different than yours
T={'A2P3';'NotA2P3';'A2P3';'A2P3 with (extra1)';'A2P3 with (extra1) and (extra 2)';'A2P3 with (extra1)';'B2P3';'B2P3';'NotA2P3';'B2P3 with (extra 1)';'A2P3'};
a={1 1 0 1 1 0 1 1 0 1 1 }
T(:,2)=num2cell(1);
T(3,2)=num2cell(0);
T(6,2)=num2cell(0);
T(9,2)=num2cell(0);
%%// Get the comparison results
col1_comps = ismember(char(T(:,1)),'A2P3') | ismember(char(T(:,1)),'B2P3');
comparisons = ismember(col1_comps(:,1:4),[1 1 1 1],'rows').*cell2mat(T(:,2))
One quick solution would be to make a function that takes 2 strings and checks whether the first one starts with the second one.
Later Edit:
The function will look like this:
for i = 0, i < second string's length, i = i + 1
    if the first string's character at index i doesn't equal the second string's character at index i
        return false
after the for, return true
This assumes the second string's length is never greater than the first's. Otherwise, call the function with the arguments swapped.
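The pseudocode above can be sketched in Python as follows. The name starts_with and the sample data are invented for the example, and instead of swapping the arguments this version simply returns False when the prefix is longer than the text:

```python
def starts_with(text, prefix):
    """Return True if text begins with prefix, comparing character by character."""
    if len(prefix) > len(text):
        return False  # a longer prefix can never match
    for i in range(len(prefix)):
        if text[i] != prefix[i]:
            return False
    return True


labels = ["A2P3", "A2P3 with (extra1)", "B2P3", "NotA2P3"]
flags = [1, 0, 1, 1]
matches = [starts_with(label, "A2P3") and flag == 1
           for label, flag in zip(labels, flags)]
print(matches)  # [True, False, False, False]
```

In practice a built-in would do the same job: Python has str.startswith, and in MATLAB strncmp compares only the first n characters, which may be the more direct route for the table in the question.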

Add leading zero to numbers

I need to get values like 05:00 and -05:00 when the input is a number such as 5 or -5. If the input is a value like 10 or -12, no leading zero is needed. I could write a function that checks how many digits the number has and adds a "0" character if needed, but maybe someone has a neater solution?
The closest thing I can find to help with this is the FormatNumber function in VBScript. W3 has a good example and test tool here: http://www.w3schools.com/vbscript/func_formatnumber.asp
You will most likely have to wrap this function to handle your specific case of prepending a 0. This should be pretty simple, though; just use an If...Then statement, sort of like:
IF x > 0 And x < 10 THEN "0" & x
ELSEIF x > -10 And x < 0 THEN "-0" & Abs(x)
Or something of that nature. Again, this has to be done as string formatting, since the integer itself will always be 5 or -5, not 05 or -05.
Hope that helps
Try
Function FormatHour (input)
    Dim sign
    If (input < 0) Then
        sign = "-"
    End If
    FormatHour = sign & FormatDateTime(TimeSerial(Abs(input), 0, 0), vbShortTime)
End Function
How about this lpad and rpad example from Microsoft?
http://support.microsoft.com/kb/96458
Alexey, next time please post what you've got; now we have to guess what you really want. So here is my guess:
a = array(0, 5, -5, 15, -55)
for each e in a
    wscript.echo mid(" -",instr(e,"-")+1,1)&right("0"&abs(e),2)&":00"
next
00:00
05:00
-05:00
15:00
-55:00
OR
for i = 0 to uBound(a)
    a(i) = mid(" -",instr(a(i),"-")+1,1)&right("0"&abs(a(i)),2)&":00"
next
wscript.echo join(a,",")
'00:00, 05:00,-05:00, 15:00,-55:00
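For readers outside VBScript, the same "sign, then zero-padded absolute value" idea can be sketched in Python. The function name format_hour is made up here, and the input is assumed to be an integer:

```python
def format_hour(value):
    """Format an integer hour offset as HH:00, zero-padded to two digits."""
    sign = "-" if value < 0 else ""
    # Pad the absolute value, then put the sign in front, so -5 gives -05.
    return f"{sign}{abs(value):02d}:00"


for n in [0, 5, -5, 15, -55]:
    print(format_hour(n))
```

Padding abs(value) and adding the sign separately avoids the trap where padding -5 directly to width 2 leaves it as -5.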
