Question related to string

Question related to string - string

I have two statements:
String aStr = new String("ABC");
String bStr = "ABC";
I read in book that in first statement JVM creates two bjects and one reference variable, whereas second statement creates one reference variable and one object.
How is that? When I say new String("ABC") then It's pretty clear that object is created.
Now my question is that for "ABC" value to we do create another object?
Please clarify a bit more here.
Thank you

You will end up with two Strings.
1) the literal "ABC", used to construct aStr and assigned to bStr. The compiler makes sure that this is the same single instance.
2) a newly constructed String aStr (because you forced it to be new'ed, which is really pretty much non-sensical)

Using a string literal will only create a single object for the lifetime of the JVM - or possibly the classloader. (I can't remember the exact details, but it's almost never important.)
That means it's hard to say that the second statement in your code sample really "creates" an object - a certain object has to be present, but if you run the same code in a loop 100 times, it won't create any more objects... whereas the first statement would. (It would require that the object referred to by the "ABC" literal is present and create a new instance on each iteration, by virtue of calling the constructor.)
In particular, if you have:
Object x = "ABC";
Object y = "ABC";
then it's guaranteed (by the language specification) than x and y will refer to the same object. This extends to other constant expressions equal to the same string too:
private static final String A = "a";
Object z = A + "BC"; // x, y and z are still the same reference...
The only time I ever use the String(String) constructor is if I've got a string which may well be backed by a rather larger character array which I don't otherwise need:
String x = readSomeVeryLargeString();
String y = x.substring(5, 10);
String z = new String(y); // Copies the contents
Now if the strings that y and x refer to are eligible for collection but the string that z refers to isn't (e.g. it's passed on to other methods etc) then we don't end up holding all of the original long string in memory, which we would otherwise.

Related

kotlin string concatenation no side effect

Why is string concatenation kept without side-effects?
fun main() {
var s1 = "abc"
val s2 = "def"
s1.plus(s2)
println(s1)
// s1 = s1.plus(s2)
// println(s1)
}
I expected abcdef, but got just abc. The commented code works fine, but seems awkward.

The plus() method (or the + operator, which is basically the same thing in kotlin) is returning the concatenated string. So by calling
s1.plus(s2)
//or
s1 + s2
you concatenate these strings but you throw away the result.

As this answer shows. Your plus operation returns the concatenated string independent from the given strings. This is because strings are immutable in Kotlin. An alternative is to use CharLists or buffers.
Note that immutability has some serious benefits. With immutablility it is no problem to give away your references to immutable objects because nobody can change the object. You need to change the reference in your model to hold a new value. This way it helps a lot to guarantee consistency and integrity in your model. Also note that immutable objects are threadsafe for that reason.

When a GString will change its toString representation

I am reading the Groovy closure documentation in https://groovy-lang.org/closures.html#this. Having a question regarding with GString behavior.
Closures in GStrings
The document mentioned the following:
Take the following code:
def x = 1
def gs = "x = ${x}"
assert gs == 'x = 1'
The code behaves as you would expect, but what happens if you add:
x = 2
assert gs == 'x = 2'
You will see that the assert fails! There are two reasons for this:
a GString only evaluates lazily the toString representation of values
the syntax ${x} in a GString does not represent a closure but an expression to $x, evaluated when the GString is created.
In our example, the GString is created with an expression referencing x. When the GString is created, the value of x is 1, so the GString is created with a value of 1. When the assert is triggered, the GString is evaluated and 1 is converted to a String using toString. When we change x to 2, we did change the value of x, but it is a different object, and the GString still references the old one.
A GString will only change its toString representation if the values it references are mutating. If the references change, nothing will happen.
My question is regarding the above-quoted explanation, in the example code, 1 is obviously a value, not a reference type, then if this statement is true, it should update to 2 in the GString right?
The next example listed below I feel also a bit confusing for me (the last part)
why if we mutate Sam to change his name to Lucy, this time the GString is correctly mutated??
I am expecting it won't mutate?? why the behavior is so different in the two examples?
class Person {
String name
String toString() { name }
}
def sam = new Person(name:'Sam')
def lucy = new Person(name:'Lucy')
def p = sam
def gs = "Name: ${p}"
assert gs == 'Name: Sam'
p = Lucy. //if we change p to Lucy
assert gs == 'Name: Sam' // the string still evaluates to Sam because it was the value of p when the GString was created
/* I would expect below to be 'Name: Sam' as well
* if previous example is true. According to the
* explanation mentioned previously.
*/
sam.name = 'Lucy' // so if we mutate Sam to change his name to Lucy
assert gs == 'Name: Lucy' // this time the GString is correctly mutated
Why the comment says 'this time the GString is correctly mutated? In previous comments it just metioned
the string still evaluates to Sam because it was the value of p when the GString was created, the value of p is 'Sam' when the String was created
thus I think it should not change here??
Thanks for kind help.

These two examples explain two different use cases. In the first example, the expression "x = ${x}" creates a GString object that internally stores strings = ['x = '] and values = [1]. You can check internals of this particular GString with println gs.dump():
<org.codehaus.groovy.runtime.GStringImpl#6aa798b strings=[x = , ] values=[1]>
Both objects, a String one in the strings array, and an Integer one in the values array are immutable. (Values are immutable, not arrays.) When the x variable is assigned to a new value, it creates a new object in the memory that is not associated with the 1 stored in the GString.values array. x = 2 is not a mutation. This is new object creation. This is not a Groovy specific thing, this is how Java works. You can try the following pure Java example to see how it works:
List<Integer> list = new ArrayList<>();
Integer number = 2;
list.add(number);
number = 4;
System.out.println(list); // prints: [2]
The use case with a Person class is different. Here you can see how mutation of an object works. When you change sam.name to Lucy, you mutate an internal stage of an object stored in the GString.values array. If you, instead, create a new object and assigned it to sam variable (e.g. sam = new Person(name:"Adam")), it would not affect internals of the existing GString object. The object that was stored internally in the GString did not mutate. The variable sam in this case just refers to a different object in the memory. When you do sam.name = "Lucy", you mutate the object in the memory, thus GString (which uses a reference to the same object) sees this change. It is similar to the following plain Java use case:
List<List<Integer>> list2 = new ArrayList<>();
List<Integer> nested = new ArrayList<>();
nested.add(1);
list2.add(nested);
System.out.println(list2); // prints: [[1]]
nested.add(3);
System.out.println(list2); // prints: [[1,3]]
nested = new ArrayList<>();
System.out.println(list2); // prints: [[1,3]]
You can see that list2 stores the reference to the object in the memory represented by nested variable at the time when nested was added to list2. When you mutated nested list by adding new numbers to it, those changes are reflected in list2, because you mutate an object in the memory that list2 has access to. But when you override nested with a new list, you create a new object, and list2 has no connection with this new object in the memory. You could add integers to this new nested list and list2 won't be affected - it stores a reference to a different object in the memory. (The object that previously could be referred to using nested variable, but this reference was overridden later in the code with a new object.)
GString in this case behaves similarly to the examples with lists I shown you above. If you mutate the state of the interpolated object (e.g. sam.name, or adding integers to nested list), this change is reflected in the GString.toString() that produces a string when the method is called. (The string that is created uses the current state of values stored in the values internal array.) On the other hand, if you override a variable with a new object (e.g. x = 2, sam = new Person(name:"Adam"), or nested = new ArrayList()), it won't change what GString.toString() method produces, because it still uses an object (or objects) that is stored in the memory, and that was previously associated with the variable name you assigned to a new object.

That's almost the whole story, as you can use a Closure for your GString evaluation, so in place of just using the variable:
def gs = "x = ${x}"
You can use a closure that returns the variable:
def gs = "x = ${-> x}"
This means that the value x is evaluated at the time the GString is changed to a String, so this then works (from the original question)
def x = 1
def gs = "x = ${-> x}"
assert gs == 'x = 1'
x = 2
assert gs == 'x = 2'

Allocation memory for string

In D string is alias on immutable char[]. So every operation on string processing with allocation of memory. I tied to check it, but after replacing symbol in string I see the same address.
string str = "big";
writeln(&str);
str.replace("i","a");
writeln(&str);
Output:
> app.exe
19FE10
19FE10
I tried use ptr:
string str = "big";
writeln(str.ptr);
str.replace(`i`,`a`);
writeln(str.ptr);
And got next output:
42E080
42E080
So it's showing the same address. Why?

You made a simple error in your code:
str.replace("i","a");
str.replace returns the new string with the replacement done, it doesn't actually replace the existing variable. So try str = str.replace("i", "a"); to see the change.
But you also made a too broadly general statement about allocations:
So every operation on string processing with allocation of memory.
That's false, a great many operations do not require allocation of new memory. Anything that can slice the existing string will do so, avoiding needing new memory:
import std.string;
import std.stdio;
void main() {
string a = " foo ";
string b = a.strip();
assert(b == "foo"); // whitespace stripped off...
writeln(a.ptr);
writeln(b.ptr); // but notice how close those ptrs are
assert(b.ptr == a.ptr + 2); // yes, b is a slice of a
}
replace will also return the original string if no replacement was actually done:
string a = " foo ";
string b = a.replace("p", "a"); // there is no p to replace
assert(a.ptr is b.ptr); // so same string returned
Indexing and iteration require no new allocation (of course). Believe it or not, but even appending sometimes will not allocate because there may be memory left at the end of the slice that is not yet used (though it usually will).
There's also various functions that return range objects that do the changes as you iterate through them, avoiding allocation. For example, instead of replace(a, "f", "");, you might do something like filter!(ch => ch != 'f')(a); and loop through, which doesn't allocate a new string unless you ask it to.
So it is a lot more nuanced than you might think!

In D all arrays are a length + a pointer to the start of the array values. These are usually stored on the stack which just so happens to be RAM.
When you go take an address of a variable (which is in a function body) what you really are doing is getting a pointer to the stack.
To get the address of an array values use .ptr.
So replace &str with str.ptr and you will get the correct output.

How arguments are passed into a cppFunction

I'm puzzled as to how arguments are passed into a cppFunction when we use Rcpp. In particular, I wonder if someone can explain the result of the following code.
library(Rcpp)
cppFunction("void test(double &x, NumericVector y) {
x = 2016;
y[0] = 2016;
}")
a = 1L
b = 1L
c = 1
d = 1
test(a,b)
test(c,d)
cat(a,b,c,d) #this prints "1 1 1 2016"

As stated before in other areas, Rcpp establishes convenient classes around R's SEXP objects.
For the first parameter, the double type does not have a default SEXP object. This is because within R, there is no such thing as a scalar. Thus, new memory is allocate making the & reference incompatible. Hence, the variable scope for the modification is limited to the function and there is never an update to the result. As a result, for both test cases, you will see 1.
For the second case, there is a mismatch between object classes. Within the first call object supplied is of type integer due to the L appended on the end, which conflicts with the C++ function expected type of numeric. The issue is resolved once the L is dropped as the object is instantiated as a numeric. Therefore, an intermediary memory location does not need to be created that is of the correct type to receive the value. Hence, the modification in the second case is able to propagate back to R.
e.g.
a = 1L
class(a)
# "integer"
a = 1
class(a)
# "numeric"

Groovy DSL: creating dynamic closures from Strings

There are some other questions on here that are similar but sufficiently different that I need to pose this as a fresh question:
I have created an empty class, lets call it Test. It doesn't have any properties or methods. I then iterate through a map of key/value pairs, dynamically creating properties named for the key and containing the value... like so:
def langMap = [:]
langMap.put("Zero",0)
langMap.put("One",1)
langMap.put("Two",2)
langMap.put("Three",3)
langMap.put("Four",4)
langMap.put("Five",5)
langMap.put("Six",6)
langMap.put("Seven",7)
langMap.put("Eight",8)
langMap.put("Nine",9)
langMap.each { key,val ->
Test.metaClass."${key}" = val
}
Now I can access these from a new method created like this:
Test.metaClass.twoPlusThree = { return Two + Three }
println test.twoPlusThree()
What I would like to do though, is dynamically load a set of instructions from a String, like "Two + Three", create a method on the fly to evaluate the result, and then iteratively repeat this process for however many strings containing expressions that I happen to have.
Questions:
a) First off, is there simply a better and more elegant way to do this (Based on the info I have given) ?
b) Assuming this path is viable, what is the syntax to dynamically construct this closure from a string, where the string references variable names valid only within a method on this class?
Thanks!

I think the correct answer depends on what you're actually trying to do. Can the input string be a more complicated expression, like '(Two + Six) / Four'?
If you want to allow more complex expressions, you may want to directly evaluate the string as a Groovy expression. Inside the GroovyConsole or a Groovy script, you can directly call evaluate, which will evaluate an expression in the context of that script:
def numNames = 'Zero One Two Three Four Five Six Seven Eight Nine'.split()
// Add each numer name as a property to the script.
numNames.eachWithIndex { name, i ->
this[name] = i
}
println evaluate('(Two + Six) / Four') // -> 2
If you are not in one of those script-friendly worlds, you can use the GroovyShell class:
def numNames = 'Zero One Two Three Four Five Six Seven Eight Nine'.split()
def langMap = [:]
numNames.eachWithIndex { name, i -> langMap[name] = i }
def shell = new GroovyShell(langMap as Binding)
println shell.evaluate('(Two + Six) / Four') // -> 2
But, be aware that using eval is very risky. If the input string is user-generated, i would not recommend you going this way; the user could input something like "rm -rf /".execute(), and, depending on the privileges of the script, erase everything from wherever that script is executed. You may first validate that the input string is "safe" (maybe checking it only contains known operators, whitespaces, parentheses and number names) but i don't know if that's safe enough.
Another alternative is defining your own mini-language for those expressions and then parsing them using something like ANTLR. But, again, this really depends on what you're trying to accomplish.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Question related to string - string

You will end up with two Strings. 1) the literal "ABC", used to construct aStr and assigned to bStr. The compiler makes sure that this is the same single instance. 2) a newly constructed String aStr (because you forced it to be new'ed, which is really pretty much non-sensical)

Related

kotlin string concatenation no side effect

When a GString will change its toString representation

Allocation memory for string

How arguments are passed into a cppFunction

Groovy DSL: creating dynamic closures from Strings

Categories

Resources