I find that there is no built-in trim (strip) method to remove leading and trailing spaces from strings in the built-in String class. I want to extend it with my own functions. Is it possible? Using the example here, I tried the following code:
String extend [
trimleading: str [ |ch ret flag|
ret := str. "make a copy of sent string"
flag := true.
[flag] whileTrue: [ "while first char is space"
ch := ret first: 1. "again test first char"
ch = ' ' "check if space remaining"
ifTrue: [ ret := (ret copyFrom: 2 to: ret size)] "copy from 2nd char"
ifFalse: [flag := false]
].
^ret "return modified string"
]
trim: str [ |ret|
ret := str.
ret := (self trimleading: ret). "trim leading spaces"
ret := (self trimleading: (ret reverse)). "reverse string and repeat trim leading"
^(ret reverse) "return re-reversed string"
]
].
oristr := ' this is a test '
('>>',oristr,'<<') displayNl.
('>>',(oristr trim),'<<') displayNl.
The above code does not work and gives the following error:
$ gst string_extend_trim.st
>> this is a test <<
Object: ' this is a test ' error: did not understand #trim
MessageNotUnderstood(Exception)>>signal (ExcHandling.st:254)
String(Object)>>doesNotUnderstand: #trim (SysExcept.st:1448)
UndefinedObject>>executeStatements (string_extend_trim.st:23)
Where is the problem and how can it be corrected? Thanks.
Edit: Following code works but it does not change original string:
String extend [
trimleading [ |ch ret flag|
ret := self. "make a copy of sent string"
flag := true.
[flag] whileTrue: [ "while first char is space"
ch := ret first: 1. "again test first char"
ch = ' ' "check if space remaining"
ifTrue: [ ret := (ret copyFrom: 2 to: ret size)] "copy from 2nd char"
ifFalse: [flag := false]
].
^ret "return modified string"
]
trim [ |ret|
ret := self.
ret := (self trimleading). "trim leading spaces"
ret := ((ret reverse) trimleading ). "reverse string and repeat trim leading"
^(ret reverse) "return re-reverse string"
]
].
oristr := ' this is a test '
('>>',oristr,'<<') displayNl.
('>>',(oristr trim),'<<') displayNl.
('>>',oristr,'<<') displayNl.
oristr := (oristr trim).
('>>',oristr,'<<') displayNl.
How can oristr trim change oristr? I do not want to write oristr := oristr trim.
The first problem you already solved: originally you defined a method trim: with one argument but sent trim with no arguments.
The second problem is modifying the String in place. You can change the characters with self at: index put: aCharacter and some other methods that copy and overwrite ranges, but you won't be able to change the size (length) of the String. In the Smalltalks I know, objects cannot change their size after they have been created. Therefore I propose that you stick to making a new String with fewer characters in trim.
There is a method to exchange one object for another everywhere in the System. It is called become:. But I think you should not use it here. Depending on the Smalltalk implementation you might end up with unwanted side effects, such as replacing a String literal in a method (so the next method invocation would actually run with a different, trimmed string in place of the literal).
The difference between your code and the example you have linked is that the example extends a custom class, whereas you are extending a core class. This affects how you should load and run your code: you should use Packages in GNU Smalltalk to build it. There is an excellent answer on SO by @lurker on how to use extended classes in gst; please read it and upvote it if you like it. I don't want to duplicate the information here.
To adapt your code to String extend:
String extend [
trimleading: str [ |ch ret flag|
ret := str. "make a copy of sent string"
flag := true.
[flag] whileTrue: [ "while first char is space"
ch := ret first: 1. "again test first char"
ch = ' '
ifTrue: [ ret := (ret copyFrom: 2 to: ret size) ] "copy from 2nd char"
ifFalse: [flag := false ]
].
^ ret "value is modified string"
]
trim [ | ret |
ret := self trimleading: self. "trim leading spaces"
ret := self trimleading: (ret copy reverse). "reverse string and repeat trim leading"
^ (ret reverse) "return re-reverse string"
]
].
oristr := ' this is a test '.
('>>',oristr,'<<') displayNl.
('>>',(oristr trim),'<<') displayNl.
('>>',oristr,'<<') displayNl.
oristr := (oristr trim).
('>>',oristr,'<<') displayNl.
You are sending the message #trim to the oristr variable, so you must define it without any parameters. However, that does not apply to #trimleading:, so I have taken your previous code and put it there.
Note: You should really read about the self keyword and understand what it does; you are using it incorrectly. You assign ret := self but never use that value; you just overwrite it with the next assignment.
I make a function that contains if in NI Kontakt:
on init
message(Add(1,2))
end on
function Add(x,y) -> output
if x > 0
output := x + y
else
output := 0
end if
end function
And I get the error message:
The definition of function Add needs to consist of a single line (eg.
"result := ") in order to be used in this context
How do I make a function with if?
There are several things wrong. I'm sure reading some example code helps to avoid too much trial and error with this exotic language. But that's probably done after almost 4 months? ;-)
Firstly, you need to declare all variables in the on init and always use their corresponding prefix (for integers it's "$"), like so:
on init
declare $x
declare $y
declare $output
end on
Secondly, you cannot call a function in the on init. For this example I use the on note callback, which triggers every time you play a note. Additionally, use "call" to execute a function.
on note
$x := 1
$y := 2
call Add
message($output)
end on
And lastly use brackets around your conditions:
function Add
if ($x > 0)
$output := $x + $y
else
$output := 0
end if
end function
As in most programming languages, it is important to declare all your functions before they are executed. Since you cannot use them in the on init, you can always place this callback on top, followed by your functions.
This would be the full code:
on init
declare $x
declare $y
declare $output
end on
function Add
if ($x > 0)
$output := $x + $y
else
$output := 0
end if
end function
on note
$x := 1
$y := 2
call Add
message($output)
end on
Enjoy ;-)
Often in my AutoHotkey scripts I need to compare a variable to several values:
if (myString == "val1" || myString == "val2" || myString == "val3" || myString == "val4")
; Do stuff
In most languages, there are ways to make this comparison a bit more concise.
Java
if (myString.matches("val1|val2|val3"))
// Do stuff
Python
if myString in ["val1","val2","val3","val4"]:
# Do stuff
Does AutoHotkey have anything similar? Is there a better way to compare a variable against multiple strings?
Many different ways.
Autohotkey way (if Var in MatchList)
if myString in val1,val2,val3,val4
Similar to your java regex based way
if (RegExMatch(myString, "val1|val2|val3|val4"))
Similar to your python though not as nice way based on Associative Arrays
if ({val1:1,val2:1,val3:1,val4:1}.hasKey(myString))
; If-Var-In way. Case-insensitive.
Ext := "txt"
If Ext In txt,jpg,png
MsgBox,,, % "Foo"
; RegEx way. Case-insensitive. To make it case-sensitive, remove i).
Ext := "txt"
If (RegExMatch(Ext, "i)^(?:txt|jpg|png)$"))
MsgBox,,, % "Foo"
; Array way 1. Array ways are case-insensitive.
Ext := "txt"
If ({txt: 1, jpg: 1, png: 1}.HasKey(Ext))
MsgBox,,, % "Foo"
; Array way 2.
Extensions := {txt: 1, jpg: 1, png: 1}
Ext := "txt"
If (Extensions[Ext])
MsgBox,,, % "Foo"
If-Var-In is the most native way. However, you should be aware that it isn't an expression, and therefore it cannot be a part of another expression.
Broken:
SomeCondition := True
Extension := "exe"
If (SomeCondition && Extension In "txt,jpg,png")
MsgBox,,, % "Foo"
Else
MsgBox,,, % "Bar"
Works correctly:
SomeCondition := True
Extension := "exe"
If (SomeCondition && RegExMatch(Extension, "i)^(?:txt|jpg|png)$"))
MsgBox,,, % "Foo"
Else
MsgBox,,, % "Bar"
For the same reason (i.e. because it isn't an expression), you cannot use K&R brace style.
Works correctly:
Ext := "txt"
If Ext In txt,jpg,png
MsgBox,,, % "Foo"
Ext := "txt"
If Ext In txt,jpg,png
{
MsgBox,,, % "Foo"
}
Broken:
Ext := "txt"
If Ext In txt,jpg,png {
MsgBox,,, % "Foo"
}
I have two variables in the makefile; each can be unset (empty), hold an empty quoted string, or hold a valid string.
I need make to exit with an error if either variable is empty or an empty string.
Here is simple makefile I am using
ABC := ""
XYZ := hello
all:
ifeq ($(and $(ABC),$(XYZ)),)
$(error "Either of var is null")
endif
@echo "Done"
With this I get Done as output, while I want it to fail.
If I change the ifeq condition as follows,
ifeq ($(and $(ABC),$(XYZ)),"") then make does not exit with an error in the following case:
ABC :=
XYZ := hello
all:
ifeq ($(and $(ABC),$(XYZ)),"")
$(error "Either of var is null")
endif
@echo "Done"
One solution could be as follows,(?)
ABC := hello
XYZ := hello
all:
ifeq ($(and $(ABC),$(XYZ)),)
$(error "Var is null")
endif
ifeq ($(and $(ABC),$(XYZ)),"")
$(error "Var is null2")
endif
@echo "Done"
However, I feel there could be a better way of doing it. Any suggestions?
EDIT
Just to explain what I want is,
if ABC is empty string(ABC := "") OR empty(ABC := ) OR
XYZ is empty string(XYZ := "") OR empty(XYZ := )
$(error "empty string or null")
endif
Just to be clear, make doesn't care about quotes in any way. When make talks about "empty" variables it means variable with no value. If you write:
ABC := ""
then that variable has a value, the literal characters "". To make, that's no different from assigning ab etc. (at least in how make interprets those values).
For your problem you can use something like:
ifeq (,$(subst ",,$(ABC)$(XYZ)))
$(error empty string or null)
endif
which will replace all quotes with nothing; if the result of that is an empty string then you know the variables were either empty or contained nothing but quotes.
Note that this will also cause variables that contain only one quote, or more than two quotes, to be considered empty; e.g.,
ABC := "
XYZ := """""""""""""
will also be considered empty. If you really want only to consider exactly two quotes to be empty then you need something more fancy.
Ack, I don't have enough reputation to make comments, but it appears as though none of the above actually answer your question. Concatenation is the functional equivalent of $(or, not $(and, so $(ABC)$(XYZ) is not equivalent to $(and $(ABC),$(XYZ)) (and $(subst ",,""xxx) will not be blank either).
Also, the second example in your question will not work: if $(ABC) is "", then $(and $(ABC),xxx) will be xxx, not "".
What you need is a macro, say unquote, to remove the quotes, and then do:
unquote=$(subst ",,$(1))
ifeq ($(and $(call unquote,$(ABC)),$(call unquote,$(XYZ))),)
Which is a bit ugly. You could of course change unquote to only simplify empty quoted strings.
Perl usually converts numeric to string values and vice versa transparently. Yet there must be something which allows e.g. Data::Dumper to discriminate between both, as in this example:
use Data::Dumper;
print Dumper('1', 1);
# output:
$VAR1 = '1';
$VAR2 = 1;
Is there a Perl function which allows me to discriminate in a similar way whether a scalar's value is stored as number or as string?
A scalar has a number of different fields. When using Perl 5.8 or higher, Data::Dumper inspects if there's anything in the IV (integer value) field. Specifically, it uses something similar to the following:
use B qw( svref_2object SVf_IOK );
sub create_data_dumper_literal {
my ($x) = @_; # This copying is important as it "resolves" magic.
return "undef" if !defined($x);
my $sv = svref_2object(\$x);
my $iok = $sv->FLAGS & SVf_IOK;
return "$x" if $iok;
$x =~ s/(['\\])/\\$1/g;
return "'$x'";
}
Checks:
Signed integer (IV): ($sv->FLAGS & SVf_IOK) && !($sv->FLAGS & SVf_IVisUV)
Unsigned integer (IV): ($sv->FLAGS & SVf_IOK) && ($sv->FLAGS & SVf_IVisUV)
Floating-point number (NV): $sv->FLAGS & SVf_NOK
Downgraded string (PV): ($sv->FLAGS & SVf_POK) && !($sv->FLAGS & SVf_UTF8)
Upgraded string (PV): ($sv->FLAGS & SVf_POK) && ($sv->FLAGS & SVf_UTF8)
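As a rough sketch of how the checks above could be combined, the following reports which value slots a scalar currently has flagged as valid. The helper name value_slots is hypothetical, and the sketch assumes the B constants SVf_NOK and SVf_POK can be imported the same way as SVf_IOK above.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK);

# Hypothetical helper: lists the value slots currently flagged as valid.
sub value_slots {
    my ($x) = @_;    # copy the argument, as in the snippet above
    my $flags = svref_2object(\$x)->FLAGS;
    my @slots;
    push @slots, 'integer' if $flags & SVf_IOK;
    push @slots, 'float'   if $flags & SVf_NOK;
    push @slots, 'string'  if $flags & SVf_POK;
    return join ',', @slots;
}

print value_slots(1),   "\n";    # integer
print value_slots('1'), "\n";    # string
print value_slots(1.5), "\n";    # float
```

A scalar that has been used in both numeric and string context may report more than one slot, which is why the result is a list rather than a single type.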
You could use similar tricks. But keep in mind,
It'll be very hard to stringify floating point numbers without loss.
You need to properly escape certain bytes (e.g. NUL) in string literals.
A scalar can have more than one value stored in it. For example, !!0 contains a string (the empty string), a floating point number (0) and a signed integer (0). As you can see, the different values aren't even always equivalent. For a more dramatic example, check out the following:
$ perl -E'open($fh, "non-existent"); say for 0+$!, "".$!;'
2
No such file or directory
It is more complicated. Perl changes the internal representation of a variable depending on the context the variable is used in:
perl -MDevel::Peek -e '
$x = 1; print Dump $x;
$x eq "a"; print Dump $x;
$x .= q(); print Dump $x;
'
SV = IV(0x794c68) at 0x794c78
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 1
SV = PVIV(0x7800b8) at 0x794c78
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 1
PV = 0x785320 "1"\0
CUR = 1
LEN = 16
SV = PVIV(0x7800b8) at 0x794c78
REFCNT = 1
FLAGS = (POK,pPOK)
IV = 1
PV = 0x785320 "1"\0
CUR = 1
LEN = 16
There's no way to find this out using pure perl. Data::Dumper uses a C library to achieve it. If forced to use Perl it doesn't discriminate strings from numbers if they look like decimal numbers.
use Data::Dumper;
$Data::Dumper::Useperl = 1;
print Dumper(['1',1])."\n";
#output
$VAR1 = [
1,
1
];
Based on your comment that this is to determine whether quoting is needed for an SQL statement, I would say that the correct solution is to use placeholders, which are described in the DBI documentation.
As a rule, you should not interpolate variables directly in your query string.
One simple solution that wasn't mentioned was Scalar::Util's looks_like_number. Scalar::Util is a core module since 5.7.3 and looks_like_number uses the perlapi to determine if the scalar is numeric.
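A minimal sketch of how looks_like_number behaves, with sample values chosen only for illustration. Note that it reports whether a value can be used as a number, not how it is stored, so the string '1' and the number 1 are both reported as numeric.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Sample values for illustration; '1' is a string, 1 is a number.
for my $value ('1', 1, '1.5e3', 'abc', '') {
    my $verdict = looks_like_number($value) ? 'numeric' : 'not numeric';
    print "'$value': $verdict\n";
}
```

Of these samples, only 'abc' and the empty string are reported as not numeric.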
The autobox::universal module, which comes with autobox, provides a type function which can be used for this purpose:
use autobox::universal qw(type);
say type("42"); # STRING
say type(42); # INTEGER
say type(42.0); # FLOAT
say type(undef); # UNDEF
When a variable is used as a number, that causes the variable to be presumed numeric in subsequent contexts. However, the reverse isn't exactly true, as this example shows:
use Data::Dumper;
my $foo = '1';
print Dumper $foo; #character
my $bar = $foo + 0;
print Dumper $foo; #numeric
$bar = $foo . ' ';
print Dumper $foo; #still numeric!
$foo = $foo . '';
print Dumper $foo; #character
One might expect the third operation to put $foo back in a string context (reversing $foo + 0), but it does not.
If you want to check whether something is a number, the standard way is to use a regex. What you check for varies based on what kind of number you want:
if ($foo =~ /^\d+$/) { print "positive integer" }
if ($foo =~ /^-?\d+$/) { print "integer" }
if ($foo =~ /^\d+\.\d+$/) { print "Decimal" }
And so on.
It is not generally useful to check how something is stored internally--you typically don't need to worry about this. However, if you want to duplicate what Dumper is doing here, that's no problem:
if ((Dumper $foo) =~ /'/) {print "character";}
If the output of Dumper contains a single quote, that means it is showing a variable that is represented in string form.
You might want to try Params::Util::_NUMBER:
use Params::Util qw<_NUMBER>;
unless ( _NUMBER( $scalar ) or $scalar =~ /^'.*'$/ ) {
$scalar =~ s/'/''/g;
$scalar = "'$scalar'";
}
The following function returns true (1) if the input is numeric and false ("") if it is a string. The function also returns true (-1) if the input is a numeric Inf or NaN. Similar code can be found in the JSON::PP module.
sub is_numeric {
my $value = shift;
no warnings 'numeric';
# string & "" -> ""
# number & "" -> 0 (with warning)
# nan and inf can detect as numbers, so check with * 0
return unless length((my $dummy = "") & $value);
return unless 0 + $value eq $value;
return 1 if $value * 0 == 0; # finite number
return -1; # inf or nan
}
I don't think there is a Perl function to find the type of a value. One can find the type of a data structure (scalar, array, hash). You can use a regex to find the type of a value.
Given pairs of string like this.
my $s1 = "ACTGGA";
my $s2 = "AGTG-A";
# Note the string can be longer than this.
I would like to find the position and character in $s1 where it differs from $s2.
In this case the answer would be:
#String Position 0-based
# First col = Base in S1
# Second col = Base in S2
# Third col = Position in S1 where they differ
C G 1
G - 4
I can achieve that easily with substr(). But it is horribly slow.
Typically I need to compare millions of such pairs.
Is there a fast way to achieve that?
Stringwise ^ is your friend:
use strict;
use warnings;
my $s1 = "ACTGGA";
my $s2 = "AGTG-A";
my $mask = $s1 ^ $s2;
while ($mask =~ /[^\0]/g) {
print substr($s1,$-[0],1), ' ', substr($s2,$-[0],1), ' ', $-[0], "\n";
}
EXPLANATION:
The ^ (exclusive or) operator, when used on strings, returns a string composed of the result of an exclusive or on each bit of the numeric value of each character. Breaking down an example into equivalent code:
"AB" ^ "ab"
( "A" ^ "a" ) . ( "B" ^ "b" )
chr( ord("A") ^ ord("a") ) . chr( ord("B") ^ ord("b") )
chr( 65 ^ 97 ) . chr( 66 ^ 98 )
chr(32) . chr(32)
" " . " "
" "
The useful feature of this here is that a nul character ("\0") occurs when and only when the two strings have the same character at a given position. So ^ can be used to efficiently compare every character of the two strings in one quick operation, and the result can be searched for non-nul characters (indicating a difference). The search can be repeated using the /g regex flag in scalar context, and the position of each character difference found using $-[0], which gives the offset of the beginning of the last successful match.
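The same idea can be wrapped in a reusable routine. This is only a sketch: the helper name diff_positions is hypothetical, and it assumes the default stringwise behaviour of ^ (that is, the bitwise feature is not enabled). The last line shows a way to count mismatches without locating them, using tr with the /c modifier to count non-NUL bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [char_in_s1, char_in_s2, position] triples.
sub diff_positions {
    my ($s1, $s2) = @_;
    my $mask = $s1 ^ $s2;    # "\0" wherever the two strings agree
    my @diffs;
    while ($mask =~ /[^\0]/g) {
        my $pos = $-[0];     # offset of the current mismatch
        push @diffs, [ substr($s1, $pos, 1), substr($s2, $pos, 1), $pos ];
    }
    return @diffs;
}

my @diffs = diff_positions("ACTGGA", "AGTG-A");
print "@$_\n" for @diffs;
# C G 1
# G - 4

# Counting only: tr///c counts the bytes that are not "\0".
my $count = ("ACTGGA" ^ "AGTG-A") =~ tr/\0//c;
print "$count\n";    # 2
```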
Use binary bit ops on the complete strings.
Things like $s1 & $s2 or $s1 ^ $s2 run incredibly fast, and work with strings of arbitrary length.
I was bored on Thanksgiving break 2012 and answered the question and more. It will work on strings of equal length. It will work if they are not. I added a help, opt handling just for fun. I thought someone might find it useful.
If you are new to Perl and don't know: don't add any code to the program below __DATA__.
Have fun.
./diftxt -h
usage: diftxt [-v ] string1 string2
-v = Verbose
diftxt [-V|--version]
diftxt [-h|--help] "This help!"
Examples: diftxt test text
diftxt "This is a test" "this is real"
Place Holders: space = "·" , no charater = "ζ"
cat ./diftxt
----------- cut ✂----------
#!/usr/bin/perl -w
use strict;
use warnings;
use Getopt::Std;
my %options=();
getopts("Vhv", \%options);
my $helptxt='
usage: diftxt [-v ] string1 string2
-v = Verbose
diftxt [-V|--version]
diftxt [-h|--help] "This help!"
Examples: diftxt test text
diftxt "This is a test" "this is real"
Place Holders: space = "·" , no charater = "ζ"';
my $Version = "inital-release 1.0 - Quincey Craig 11/21/2012";
print "$helptxt\n\n" if defined $options{h};
print "$Version\n" if defined $options{V};
if (@ARGV == 0 ) {
if (not defined $options{h}) {usage()};
exit;
}
my $s1 = "$ARGV[0]";
my $s2 = "$ARGV[1]";
my $mask = $s1 ^ $s2;
# setup unicode output to STDOUT
binmode DATA, ":utf8";
my $ustring = <DATA>;
binmode STDOUT, ":utf8";
my $_DIFF = '';
my $_CHAR1 = '';
my $_CHAR2 = '';
sub usage
{
print "\n";
print "usage: diftxt [-v ] string1 string2\n";
print " -v = Verbose \n";
print " diftxt [-V|--version]\n";
print " diftxt [-h|--help]\n\n";
exit;
}
sub main
{
print "\nOrig\tDiff\tPos\n----\t----\t----\n" if defined $options{v};
while ($mask =~ /[^\0]/g) {
### redirect stderr to allow for test of empty variable with error message from substr
open STDERR, '>/dev/null';
if (substr($s2,$-[0],1) eq "") {$_CHAR2 = "\x{03B6}";close STDERR;} else {$_CHAR2 = substr($s2,$-[0],1)};
if (substr($s2,$-[0],1) eq " ") {$_CHAR2 = "\x{00B7}"};
$_CHAR1 = substr($s1,$-[0],1);
if ($_CHAR1 eq "") {$_CHAR1 = "\x{03B6}"} else {$_CHAR1 = substr($s1,$-[0],1)};
if ($_CHAR1 eq " ") {$_CHAR1 = "\x{00B7}"};
### Print verbose Data
print $_CHAR1, "\t", $_CHAR2, "\t", $+[0], "\n" if defined $options{v};
### Build difference list
$_DIFF = "$_DIFF$_CHAR2";
### Build mask
substr($s1,"$-[0]",1) = "\x{00B7}";
} ### end loop
print "\n" if defined $options{v};
print "$_DIFF, ";
print "Mask: \"$s1\"\n";
} ### end main
if ($#ARGV == 1) {main()};
__DATA__
This is the easiest form you can get
my $s1 = "ACTGGA";
my $s2 = "AGTG-A";
my @s1 = split //,$s1;
my @s2 = split //,$s2;
my $i = 0;
foreach (@s1) {
if ($_ ne $s2[$i]) {
print "$_, $s2[$i] $i\n";
}
$i++;
}