Extract substring using index in Pharo Smalltalk

I'm trying to get a substring from an initial string in Smalltalk. I'm wondering if there's a way to do it. For example in Java, the method aStringObject.substring(index), allows you to trim a String object using an index (or its position in the array). I've been looking in the browser for something that works in a similar way, but couldn't find it. So far every trimming method uses a character or string to do the separation.
As an example of what I'm looking for:
initialString:='Hello'.
finalString:=initialString substring: 1
The value of finalString should be 'ello'.

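In Smalltalk a String is a kind of SequenceableCollection, so you can use the messages in its copying protocol as well.
For example, you could use:
copyFrom: start to: stop (indices are 1-based and inclusive)
allButFirst (answers a copy without the first character)
allButFirst: n (more generally, answers a copy of the receiver containing all but the first n elements)
With your example, initialString allButFirst answers 'ello', as does initialString copyFrom: 2 to: initialString size.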

Related

LUA -- gsub problems -- passing a variable to the match string isn't working [duplicate]

Been stuck on this for over a day.
I'm trying to use gsub to extract a portion of an input string. The exact pattern of the input varies in different cases, so I'm trying to use a variable to represent that pattern, so that the same routine - which is otherwise identical - can be used in all cases, rather than separately coding each.
So, I have something along the lines of:
newstring , n = oldstring:gsub(matchstring[i],"%1");
where matchstring[] is an indexed table of the different possible pattern matches, set up so that "%1" will match the target sequence in each matchstring[].
For instance, matchstring[1] might be
"\[User\] <code:%w*>([^<]*)<\\code>.*" -- extract user name from within the <code>...<\code>
while matchstring[2] could be
"\[World\] (%w)* .*" -- extract user name as first word after prefix '[World] '
and matchstring[3] could be
"<code:%w*>([^<]*)<\\code>.*" -- extract username from within <code>...<\code> at start
This does not work.
Yet when, debugging one of the cases, I replace matchstring[i] with the exact same string -- only now passed as a string literal rather than saved in a variable -- it works.
So.. I'm guessing there must be some 'processing' of the string - stripping out special characters or something - when it's sent as a variable rather than a string literal ... but for the life of me I can't figure out how to adjust the matchstring[] entries to compensate!
Help much appreciated...
FACEPALM
Thank you, Piglet, you got me on the right track.
Given how this particular platform processes and passes strings, anything within <...> needed the escape character \ for downstream use, but of course (duh) Lua's gsub itself needs the standard % escape for its magic characters, e.g. %[User%] rather than \[User\].
Much obliged.

squeak(smalltalk) how to use method `findSubstring: in: startingAt: matchTable:`?

What should I send as the matchTable: argument?
The implementation has no examples or detailed explanation, so
I don't understand which object receives the message if I pass the string as the in: argument.
The matchTable: keyword provides a way to identify characters so that they become equivalent in comparisons. The argument is usually a ByteArray of 256 entries, containing at position i the code point of the ith character to be considered when comparing.
The main use of the table is to implement case-insensitive searches, where, e.g., A=a. Thus, instead of comparing the characters at hand during the search, what are compared are the elements found in the matchTable at their respective code points. So, instead of
(string1 at: i) = (string2 at: j)
the testing becomes something on the lines of
cp1 := string1 basicAt: i.
cp2 := string2 basicAt: j.
(table at: cp1) = (table at: cp2).
In other words, the matchTable: argument is used to map actual characters to the ones that actually matter for the comparisons.
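As a rough illustration of the idea (a Python sketch, not the Squeak implementation), the table lookup can be written as below. The function names, the 0-based indices, and the -1 result for "not found" are assumptions of this sketch rather than Squeak's conventions.

```python
def build_case_insensitive_table():
    """Return a 256-entry table; position i holds the code point used when comparing."""
    table = bytearray(range(256))          # start from the identity mapping
    for code in range(ord("A"), ord("Z") + 1):
        table[code] = code + 32            # fold 'A'..'Z' onto 'a'..'z'
    return table


def find_substring(key, body, start, table):
    """Return the 0-based index of the first match at or after start, or -1."""
    key_bytes = key.encode("latin-1")
    body_bytes = body.encode("latin-1")
    if not key_bytes:
        return -1                          # assumption: an empty key never matches
    last = len(body_bytes) - len(key_bytes)
    for i in range(start, last + 1):
        # Compare the table entries, not the raw characters.
        if all(table[body_bytes[i + j]] == table[k]
               for j, k in enumerate(key_bytes)):
            return i
    return -1


identity = bytearray(range(256))           # case-sensitive: every byte maps to itself
folded = build_case_insensitive_table()    # case-insensitive: A..Z compare as a..z

print(find_substring("WORLD", "hello world", 0, identity))  # -1
print(find_substring("WORLD", "hello world", 0, folded))    # 6
```

Swapping the table is the only difference between the two searches; the search loop itself stays the same.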
Note that the same technique can be applied for case-sensitive/insensitive sorting.
Finally, bear in mind that this is a rather low-level method that non-system programmers would rarely need. You should be using instead higher level versions for finding substrings such as findString:startingAt:caseSensitive:, where the argument of the last keyword is a Boolean.

Regex or IndexOf?

I have a long string "AB100123485;AB10064279293-IP-1-KNPO;AB473898487-MM41". I have to extract the integer value after "IP-", i.e. only the 1. What is the most efficient way? I am using C#.
Thanks
The 'most-efficient' way depends on how consistent your string is in terms of length and appearance. You can surely do this with a regular expression as a quick solution if you just want to get the digit directly following IP-.
You can utilize the RegularExpressions API, passing in your regular expression and input string.
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.match?view=netframework-4.8#System_Text_RegularExpressions_Regex_Match_System_String_System_String_
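The pattern IP-[0-9] should get you started; wrapping the digits in a capture group, IP-([0-9]+), lets you read the number directly. Refine it to your use case as needed.
For example:
using System.Text.RegularExpressions;
// Regex.Match is the static method; Match is the type of its result.
Match matched = Regex.Match(
    "AB100123485;AB10064279293-IP-1-KNPO;AB473898487-MM41",
    "IP-([0-9]+)"   // capture the digits that follow "IP-"
);
string value = matched.Success ? matched.Groups[1].Value : null;   // "1" for this input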

Concatenation with empty string raises ERR:INVALID DIM

In TI-BASIC, the + operation is overloaded for string concatenation (in this, if nothing else, TI-BASIC joins the rest of the world).
However, any attempt to concatenate with an empty string raises an ERR:INVALID DIM error:
"Fizz"+"Buzz"
FizzBuzz
"Fizz"+""
Error
""+"Buzz"
Error
""+""
Error
Why does this occur, and is there an elegant workaround? I've been using a starting space and truncating the string when necessary (doesn't always work well) or using a loop to add characters one at a time (slow).
The best way depends on what you are doing.
If you have a string (in this case, Str1) that may be empty and you need to concatenate it with another (Str2), then this is a good general-case solution; the result is left in Ans:
Str2
If length(Str1
Str1+Str2
If you need to loop and add stuff to the string each time, then this is your best solution:
Before the loop:
" →Str1
In the loop:
Str1+<stuff_that_isn't_an_empty_string>→Str1
After the loop:
sub(Str1,2,length(Str1)-1→Str1
There are other situations, too, and if you have a specific situation, then you should post a simplified version of the relevant code.
Hope this helps!
It is very unfortunate that TI-Basic doesn't support empty strings. If you are starting with an empty string and adding chars, you have to do something like this:
"?
For(I,1,3
Prompt Str1
Ans+Str1
End
sub(Ans,2,length(Ans)-1
Another useful trick is that if you have a string that you are eventually going to evaluate using expr(, you can do "("+Str1+")"→Str1 and then freely do search and replace on the string. This is a necessary workaround since you can't search and replace any text involving the first or last character in a string.

issues while serializing to YAML file

I have started using the .NET API for YAML and it seems to be helpful. However, I have a few questions and am wondering if you can provide a sample or workaround for them.
(1) I have an object consisting of 4 strings and would like to serialize a collection of them (List or String[]). I wrote a helper method to return the strings in the format I want; however, it adds an extra single quote before and after each string. So I am getting
-'{str1: str2, str3: str4}'
-'{str5: str6, str7: str8}'
instead of
-{str1: str2, str3: str4}
-{str5: str6, str7: str8}
Can you suggest any workarounds?
(2) I am trying to insert XAML as a string in a YAML document. My XAML is well-formed XML, but when I serialize it, the output is cut off before the third-to-last element. Any idea why?
Regarding the first question, if you are serializing an array of strings, then it is normal that each element is quoted because it starts with a '{'. In this case, you should be serializing the list of objects directly instead of converting them to string first.
Regarding the second question, you should add some code to the question to clarify what you are doing.

Resources