String created as literal and with the new operator

When I declare a String using:
String a = new String("Hello");
two objects are created: one resides on the heap and the other in the String literal pool.
So when I do:
String b = "Hello";
Is a new object created, or does b refer to the "Hello" that is already in the String pool?

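No new object is created; b refers to the "Hello" that is already in the String pool.
Explanation
new String("Hello") involves two String objects. The literal "Hello" is interned in the String literal pool, and the constructor then creates a second, distinct object on the heap that copies its contents. a refers to that heap object, not to the pooled one.
String b = "Hello" reuses the pooled "Hello" that the literal in the first statement already placed there, so b refers to the pooled object and nothing new is allocated.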
You can read more about String literals in Java Language Specification 3.10.5.
Test
We can test that the references are pointing to different objects:
String a = new String("Hello");
String b = "Hello";
System.out.println(a == b);
Prints false, as expected.
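As a slightly fuller check, here is a minimal sketch (the class name StringPoolDemo and the helper sameObject are made up for illustration) that compares the references directly and after intern():

```java
public class StringPoolDemo {
    // True when both references point at the very same object.
    static boolean sameObject(Object left, Object right) {
        return left == right;
    }

    public static void main(String[] args) {
        String a = new String("Hello"); // explicitly constructed, distinct object
        String b = "Hello";             // refers to the interned literal
        String c = "Hello";             // same literal, same interned object

        System.out.println(sameObject(a, b));          // false
        System.out.println(sameObject(b, c));          // true
        System.out.println(sameObject(a.intern(), b)); // true
        System.out.println(a.equals(b));               // true
    }
}
```

intern() returns the pooled instance with the same contents, which is why the third comparison succeeds even though a itself is a separate object.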

I am asking about the Java language. I have read in many places that
String a = new String("Hello")
creates 2 objects. One is on the heap. Could you please let me know where the other object is created?

Related

Why Assignment Error not caught at Compile Time in Nim?

The code below produces ObjectAssignmentError. As far as I understand, it's because the sequence is defined as seq[Doc] and not as seq[TextDoc].
But aren't such errors supposed to be caught at compile time by the compiler, rather than leaking into runtime?
type
  Doc = object of RootObj
  TextDoc* = object of Doc
    title*: string
    text*: string
  TodoDoc* = object of Doc
    todo*: string
var all_docs*: seq[Doc] = @[]
all_docs.add TextDoc(title: "", text: "")
The reason this isn't caught at compile time is that, in general, it is possible to add a TextDoc to a seq of Doc, as long as the types are declared sensibly: the elements of a heterogeneous seq have to be ref objects or variants (otherwise they can't all fit in the same space).
To put it another way: inheritance means any procedure that accepts a Doc will also accept a TextDoc, including the procedure
proc add(s: var seq[Doc], d: Doc)
Nim supports value semantics, which are important for cache-friendliness, but that freedom can be unsettling to those used to e.g. Java, where objects are always handled through references.
With reference semantics the above code runs with no errors:
type
  Doc = ref object of RootObj
  TextDoc* = ref object of Doc
    title*: string
    text*: string
  TodoDoc* = ref object of Doc
    todo*: string
var all_docs*: seq[Doc] = @[]
all_docs.add TextDoc(title: "", text: "blah")
assert TextDoc(all_docs[0]).text == "blah"
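For comparison with the Java model mentioned above, here is a rough sketch of the same idea in Java. The class names mirror the Nim types, and DocsDemo and firstText are made up for illustration. Because the list only stores references, a TextDoc placed in a List<Doc> keeps its extra fields and can be recovered with a downcast:

```java
import java.util.ArrayList;
import java.util.List;

public class DocsDemo {
    static class Doc {}

    static class TextDoc extends Doc {
        final String title;
        final String text;

        TextDoc(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    // The list holds references, so the element is still a full TextDoc.
    static String firstText(List<Doc> docs) {
        return ((TextDoc) docs.get(0)).text; // downcast, checked at run time
    }

    public static void main(String[] args) {
        List<Doc> allDocs = new ArrayList<>();
        allDocs.add(new TextDoc("", "blah"));
        System.out.println(firstText(allDocs)); // blah
    }
}
```

This corresponds to the ref object version: nothing is copied into a Doc-sized slot, so nothing is lost.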

When a GString will change its toString representation

I am reading the Groovy closure documentation at https://groovy-lang.org/closures.html#this and have a question about GString behavior.
Closures in GStrings
The document mentioned the following:
Take the following code:
def x = 1
def gs = "x = ${x}"
assert gs == 'x = 1'
The code behaves as you would expect, but what happens if you add:
x = 2
assert gs == 'x = 2'
You will see that the assert fails! There are two reasons for this:
a GString only evaluates lazily the toString representation of values
the syntax ${x} in a GString does not represent a closure but an expression to $x, evaluated when the GString is created.
In our example, the GString is created with an expression referencing x. When the GString is created, the value of x is 1, so the GString is created with a value of 1. When the assert is triggered, the GString is evaluated and 1 is converted to a String using toString. When we change x to 2, we did change the value of x, but it is a different object, and the GString still references the old one.
A GString will only change its toString representation if the values it references are mutating. If the references change, nothing will happen.
My question is about the explanation quoted above. In the example code, 1 is obviously a value, not a reference type, so if this statement is true, shouldn't it update to 2 in the GString?
The next example, listed below, is also a bit confusing to me (the last part).
Why, if we mutate Sam to change his name to Lucy, is the GString correctly mutated this time?
I expected that it wouldn't mutate. Why is the behavior so different in the two examples?
class Person {
    String name
    String toString() { name }
}
def sam = new Person(name:'Sam')
def lucy = new Person(name:'Lucy')
def p = sam
def gs = "Name: ${p}"
assert gs == 'Name: Sam'
p = lucy // if we change p to Lucy
assert gs == 'Name: Sam' // the string still evaluates to Sam because it was the value of p when the GString was created
/* I would expect below to be 'Name: Sam' as well
 * if previous example is true. According to the
 * explanation mentioned previously.
 */
sam.name = 'Lucy' // so if we mutate Sam to change his name to Lucy
assert gs == 'Name: Lucy' // this time the GString is correctly mutated
Why does the comment say 'this time the GString is correctly mutated'? The previous comment just mentioned
the string still evaluates to Sam because it was the value of p when the GString was created, the value of p is 'Sam' when the String was created
so I think it should not change here?
Thanks for your kind help.
These two examples explain two different use cases. In the first example, the expression "x = ${x}" creates a GString object that internally stores strings = ['x = '] and values = [1]. You can check internals of this particular GString with println gs.dump():
<org.codehaus.groovy.runtime.GStringImpl#6aa798b strings=[x = , ] values=[1]>
Both objects, the String in the strings array and the Integer in the values array, are immutable. (The values are immutable, not the arrays.) When the x variable is assigned a new value, a new object is created in memory that is not associated with the 1 stored in the GString.values array. x = 2 is not a mutation; it is the creation of a new object. This is not Groovy-specific; it is how Java works. You can try the following pure Java example to see how it works:
List<Integer> list = new ArrayList<>();
Integer number = 2;
list.add(number);
number = 4;
System.out.println(list); // prints: [2]
The use case with a Person class is different. Here you can see how mutation of an object works. When you change sam.name to Lucy, you mutate the internal state of an object stored in the GString.values array. If you instead create a new object and assign it to the sam variable (e.g. sam = new Person(name:"Adam")), it does not affect the internals of the existing GString object. The object that was stored internally in the GString did not mutate. The variable sam in this case just refers to a different object in memory. When you do sam.name = "Lucy", you mutate the object in memory, so the GString (which uses a reference to the same object) sees this change. It is similar to the following plain Java use case:
List<List<Integer>> list2 = new ArrayList<>();
List<Integer> nested = new ArrayList<>();
nested.add(1);
list2.add(nested);
System.out.println(list2); // prints: [[1]]
nested.add(3);
System.out.println(list2); // prints: [[1, 3]]
nested = new ArrayList<>();
System.out.println(list2); // prints: [[1, 3]]
You can see that list2 stores the reference to the object in the memory represented by nested variable at the time when nested was added to list2. When you mutated nested list by adding new numbers to it, those changes are reflected in list2, because you mutate an object in the memory that list2 has access to. But when you override nested with a new list, you create a new object, and list2 has no connection with this new object in the memory. You could add integers to this new nested list and list2 won't be affected - it stores a reference to a different object in the memory. (The object that previously could be referred to using nested variable, but this reference was overridden later in the code with a new object.)
GString in this case behaves similarly to the examples with lists I showed you above. If you mutate the state of the interpolated object (e.g. sam.name, or adding integers to the nested list), this change is reflected in GString.toString(), which produces a string when the method is called. (The string that is created uses the current state of the values stored in the internal values array.) On the other hand, if you override a variable with a new object (e.g. x = 2, sam = new Person(name:"Adam"), or nested = new ArrayList()), it won't change what the GString.toString() method produces, because it still uses the object (or objects) stored in memory that was previously associated with the variable name you assigned a new object to.
That's almost the whole story, as you can use a Closure for your GString evaluation, so in place of just using the variable:
def gs = "x = ${x}"
You can use a closure that returns the variable:
def gs = "x = ${-> x}"
This means that the value x is evaluated at the time the GString is changed to a String, so this then works (from the original question)
def x = 1
def gs = "x = ${-> x}"
assert gs == 'x = 1'
x = 2
assert gs == 'x = 2'
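The difference between ${x} and ${-> x} can be approximated in plain Java with a Supplier. This is only a sketch: the Template class is a hypothetical stand-in for a GString with a single placeholder, not how Groovy implements it. The eager template captures the value at creation time, while the lazy one reads the variable each time toString() runs:

```java
import java.util.function.Supplier;

public class LazyInterpolation {
    // Hypothetical stand-in for a GString with one placeholder.
    static final class Template {
        private final String prefix;
        private final Supplier<Object> value;

        Template(String prefix, Supplier<Object> value) {
            this.prefix = prefix;
            this.value = value;
        }

        @Override
        public String toString() {
            return prefix + value.get(); // the supplier is consulted on every call
        }
    }

    // A field rather than a local, so the lambda can read its current value.
    static int x = 1;

    static String[] renderAfterReassignment() {
        x = 1;
        final Integer captured = x;                            // like "${x}": value fixed now
        Template eager = new Template("x = ", () -> captured);
        Template lazy = new Template("x = ", () -> x);         // like "${-> x}": read later
        x = 2;
        return new String[] { eager.toString(), lazy.toString() };
    }

    public static void main(String[] args) {
        String[] rendered = renderAfterReassignment();
        System.out.println(rendered[0]); // x = 1
        System.out.println(rendered[1]); // x = 2
    }
}
```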

Allocation memory for string

In D, string is an alias for immutable(char)[], so I assumed that every string-processing operation allocates memory. I tried to check this, but after replacing a character in the string I see the same address.
string str = "big";
writeln(&str);
str.replace("i","a");
writeln(&str);
Output:
> app.exe
19FE10
19FE10
I tried using ptr:
string str = "big";
writeln(str.ptr);
str.replace(`i`,`a`);
writeln(str.ptr);
And got the following output:
42E080
42E080
So it's showing the same address. Why?
You made a simple error in your code:
str.replace("i","a");
str.replace returns a new string with the replacement done; it doesn't actually modify the existing variable. So try str = str.replace("i", "a"); to see the change.
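The same pitfall exists in other languages with immutable strings. As a comparison sketch in Java (not D; the class name ReplaceDemo and the helper replaced are made up for illustration), the result of replace has to be assigned back for the variable to change:

```java
public class ReplaceDemo {
    // Returns a string with every "i" replaced by "a"; the input is untouched.
    static String replaced(String input) {
        return input.replace("i", "a");
    }

    public static void main(String[] args) {
        String str = "big";
        str.replace("i", "a");       // result discarded, str is unchanged
        System.out.println(str);     // big

        str = str.replace("i", "a"); // keep the returned string
        System.out.println(str);     // bag
    }
}
```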
But you also made too broad a statement about allocations:
So every operation on string processing with allocation of memory.
That's false: a great many operations do not require allocation of new memory. Anything that can slice the existing string will do so, avoiding the need for new memory:
import std.string;
import std.stdio;
void main() {
    string a = "  foo  "; // two spaces on each side
    string b = a.strip();
    assert(b == "foo"); // whitespace stripped off...
    writeln(a.ptr);
    writeln(b.ptr); // but notice how close those ptrs are
    assert(b.ptr == a.ptr + 2); // yes, b is a slice of a
}
replace will also return the original string if no replacement was actually done:
string a = " foo ";
string b = a.replace("p", "a"); // there is no p to replace
assert(a.ptr is b.ptr); // so same string returned
Indexing and iteration require no new allocation (of course). Believe it or not, but even appending sometimes will not allocate because there may be memory left at the end of the slice that is not yet used (though it usually will).
There are also various functions that return range objects that do the changes as you iterate through them, avoiding allocation. For example, instead of replace(a, "f", "");, you might do something like filter!(ch => ch != 'f')(a); and loop through, which doesn't allocate a new string unless you ask it to.
So it is a lot more nuanced than you might think!
In D, every array is a length plus a pointer to the start of the array's values. That length/pointer pair is usually stored on the stack, which just so happens to be in RAM.
When you take the address of a variable declared in a function body, what you are really getting is a pointer into the stack.
To get the address of an array's values, use .ptr.
So replace &str with str.ptr and you will get the correct output.

How a property, of type string, is passed

I have the following code (note that the code below doesn't update the property)
private void queryResultsFilePath_Click(object sender, EventArgs e)
{
    Library.SProc.Browse browser = new Browse();
    browser.selectFile(QueryResultFilePath);
}
and
public class Browse
{
    public void selectFile(string propertyName)
    {
        ...
        propertyName = browserWindow.FileName;
    }
}
Now I realise that I need to change the second method so that it returns a string, and manually assign that to the property in the first example.
What I'm unsure of is this: I thought that when I passed a ref type as an actual parameter of a method, its value on the stack (i.e. its memory address in the heap) was copied to a new location on the stack for the method's formal parameter, so both point to the same memory address on the heap. So when I changed the value of the formal parameter, it would actually change the value stored on the heap and thus the actual parameter's value.
Obviously I'm missing something, since I'm having to return a string and manually assign it to the property. If someone could point out what I've misunderstood, I'd appreciate it.
Thanks.
I believe the missing piece here is: strings are immutable.
Although you pass it by reference, as soon as anything attempts to mutate the string, a new string is created leaving the old one intact.
I believe it is the only reference type that has enforced immutability.
From MSDN:
Strings are immutable--the contents of a string object cannot be
changed after the object is created, although the syntax makes it
appear as if you can do this. For example, when you write this code,
the compiler actually creates a new string object to hold the new
sequence of characters, and that new object is assigned to b. The
string "h" is then eligible for garbage collection.
Further reading:
http://social.msdn.microsoft.com/Forums/en/netfxbcl/thread/e755cbcd-4b09-4a61-b31f-e46e48d1b2eb
If you wish the method to "change" the caller's string then you can simulate this using the ref keyword:
public void SelectFile(ref string propertyName)
{
    propertyName = browserWindow.FileName;
}
In this example, the parameter propertyName is assigned to in the method and, because ref is used, this also changes which string the caller's variable points to. Note that immutability is still enforced. propertyName used to point to string A, but after the assignment it points to string B. The old string A is now unreferenced and will be garbage collected (but, importantly, it still exists and wasn't changed, because it is immutable). If the ref keyword weren't used, the caller would still point at A and the method would point at B. However, because the ref keyword was used, the caller's variable now points to string B.
This is the same effect as the following example:
static void Main(string[] args)
{
    MyClass classRef = new MyClass("A");
    PointToANewClass(ref classRef);
    // classRef now points to a brand new instance containing "B".
}
public static void PointToANewClass(ref MyClass classRef)
{
    classRef = new MyClass("B");
}
If you try the above without the ref keyword, classRef would still point to an object containing "A", even though MyClass is a reference type.
Don't confuse string semantics with ref semantics, and don't confuse passing something by reference with assignment. Without ref, nothing is passed by reference: the pointer to the object on the heap is passed by value, which is why ref on a reference type behaves as described above. It is also why, without ref, a new assignment is not "shared" between caller and method. The method receives its own copy of the pointer to the object on the heap. Dereferencing that pointer has the usual effect (both sides look at the same underlying object), but assigning to it does not affect the caller's copy of the pointer.
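Java passes object references by value in the same way and has no ref keyword, so the non-ref behaviour described above can be sketched there as a comparison. The names PassByValueDemo, Box, pointToNewBox and mutate are made up for illustration:

```java
public class PassByValueDemo {
    static class Box {
        String value;

        Box(String value) {
            this.value = value;
        }
    }

    // Reassigning the parameter only rebinds the method's own copy of the reference.
    static void pointToNewBox(Box box) {
        box = new Box("B");
    }

    // Mutating through the parameter changes the object the caller also sees.
    static void mutate(Box box) {
        box.value = "B";
    }

    public static void main(String[] args) {
        Box box = new Box("A");

        pointToNewBox(box);
        System.out.println(box.value); // A

        mutate(box);
        System.out.println(box.value); // B
    }
}
```

Assignment to the parameter never reaches the caller; only changes made through the shared object do.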
I'm really grateful to Adam Houldsworth, because I've finally understood how the .NET framework uses reference parameters and what happens with the string.
In .NET there are two kinds of data types:
value type: primitive types like int, float, bool, and so on
reference type: all the other objects, including string
In the case of a reference type, the object is stored on the heap, and a variable only holds a reference pointing to this object. You can access the object's properties through the reference and modify them. When you pass one of these variables as a parameter, a copy of the reference pointing to the same object is passed on to the method body. So, when you access and modify properties, you are modifying the same object stored on the heap. For example, this class is a reference type:
public class ClassOne
{
public string Desc { get; set; }
}
When you do this
ClassOne one = new ClassOne { Desc = "I'm a class one!" };
there's an object on the heap pointed to by the reference one. If you do this:
one.Desc = "Changed value!";
the object on the heap has been modified. If you pass this reference as a parameter:
public void ChangeOne(ClassOne one)
{
    one.Desc = "Changed value!";
}
The original object on the heap is also changed, because one holds a copy of the original reference that points to the same object on the heap.
But if you do this:
public void ChangeOne(ClassOne one)
{
    one = new ClassOne { Desc = "Changed value!" };
}
The original object is unchanged. That's because one was a copy of the reference, and that copy is now pointing to a different object.
If you pass it explicitly by reference:
public void ChangeOne(ref ClassOne one)
{
    one = new ClassOne { Desc = "Changed value!" };
}
one inside this method is not a copy of the outer reference but the reference itself, so the original reference now points to this new object.
Strings are immutable. This means that you cannot change a string; if you try to do so, a new string is created. So, if you do this:
string s = "HELL";
s = s + "O";
The second line creates a new instance of string, with the value "HELLO" and "HELL" is abandoned on the heap (left to be garbage collected).
So it's not possible to change it if you pass it as a parameter like this:
public void AppendO(string one)
{
    one = one + "O";
}
string original = "HELL";
AppendO(original);
the original string is left as is. The code inside the function creates a new object and assigns it to one, which is a copy of the original reference. But original keeps pointing to "HELL".
In the case of value types, when they are passed as parameters to a function, they are passed by value, i.e. the function receives a copy of the original value. So, any modification done to the object inside the function body won't affect the original value outside the function.
The problem is that, although string is a reference type, it looks as if it behaves like a value type (this applies to comparisons, passing parameters, and so on).
However, as explained above, it's possible to make the compiler pass a reference type by reference using the ref keyword. This also works for strings.
You can check this code, and you'll see that the string is modified (this would also apply to an int, float or any other value type):
public static class StringTest
{
    public static void AppendO(ref string toModify)
    {
        toModify = toModify + "O";
    }
}
// test:
string hell = "HELL";
StringTest.AppendO(ref hell);
if (hell == "HELLO")
{
    // here, hell is "HELLO"
}
Note that, to avoid errors, when you define a parameter as ref, you also have to pass the argument with this modifier.
Anyway, for this case (and similar cases) I'd recommend that you use the more natural functional syntax:
hell = StringTest.AppendO(hell);
Of course, in this case the function would have this signature and corresponding implementation:
public static string AppendO(string value)
{
    return value + "O";
}
If you're going to make many changes to a string, you should use the StringBuilder class, which works with "mutable strings".
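Java also has a StringBuilder class, so the contrast between an immutable string and a mutable builder can be sketched there as a comparison (this is not the .NET API, and the names BuilderDemo, appendToString and appendToBuilder are made up for illustration). Passing the builder to a method lets the method change the caller's data without ref and without a return value:

```java
public class BuilderDemo {
    // Rebinds only the local parameter; the caller's string is untouched.
    static void appendToString(String text) {
        text = text + "O";
    }

    // Mutates the builder object that the caller also references.
    static void appendToBuilder(StringBuilder builder) {
        builder.append("O");
    }

    public static void main(String[] args) {
        String original = "HELL";
        appendToString(original);
        System.out.println(original); // HELL

        StringBuilder builder = new StringBuilder("HELL");
        appendToBuilder(builder);
        System.out.println(builder);  // HELLO
    }
}
```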
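Strings are immutable, and a method receives a copy of the reference to the string rather than the caller's variable itself. Assigning a new string to the parameter therefore changes only that copy; the caller's variable (here, the property) stays the same.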

Question related to string

I have two statements:
String aStr = new String("ABC");
String bStr = "ABC";
I read in a book that for the first statement the JVM creates two objects and one reference variable, whereas the second statement creates one reference variable and one object.
How is that? When I say new String("ABC"), it's pretty clear that an object is created.
Now my question is: for the "ABC" value, do we create another object?
Please clarify a bit more here.
Thank you
You will end up with two Strings.
1) the literal "ABC", used to construct aStr and assigned to bStr. The compiler makes sure that this is the same single instance.
2) a newly constructed String aStr (because you forced it to be new'ed, which is really pretty much nonsensical)
Using a string literal will only create a single object for the lifetime of the JVM - or possibly the classloader. (I can't remember the exact details, but it's almost never important.)
That means it's hard to say that the second statement in your code sample really "creates" an object - a certain object has to be present, but if you run the same code in a loop 100 times, it won't create any more objects... whereas the first statement would. (It would require that the object referred to by the "ABC" literal is present and create a new instance on each iteration, by virtue of calling the constructor.)
In particular, if you have:
Object x = "ABC";
Object y = "ABC";
then it's guaranteed (by the language specification) that x and y will refer to the same object. This extends to other constant expressions equal to the same string too:
private static final String A = "A";
Object z = A + "BC"; // x, y and z are still the same reference...
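A small sketch can contrast a compile-time constant expression with concatenation done at run time (the class name ConstantDemo and the helper runtimeConcat are made up for illustration). Only the constant expression is guaranteed to yield the interned instance; the run-time result is a new object until it is interned:

```java
public class ConstantDemo {
    static final String A = "A"; // compile-time constant

    // The parameter is not a constant, so this concatenation happens at run time.
    static String runtimeConcat(String prefix) {
        return prefix + "BC";
    }

    public static void main(String[] args) {
        String x = "ABC";
        String z = A + "BC";           // constant expression, same object as the literal
        String r = runtimeConcat("A"); // newly created String with equal contents

        System.out.println(x == z);          // true
        System.out.println(x == r);          // false
        System.out.println(x == r.intern()); // true
        System.out.println(x.equals(r));     // true
    }
}
```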
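The only time I ever use the String(String) constructor is if I've got a string which may well be backed by a rather larger character array which I don't otherwise need (as substring results were on JVMs before Java 7 update 6, where substring shared the parent's array):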
String x = readSomeVeryLargeString();
String y = x.substring(5, 10);
String z = new String(y); // Copies the contents
Now if the strings that y and x refer to are eligible for collection but the string that z refers to isn't (e.g. it's passed on to other methods etc) then we don't end up holding all of the original long string in memory, which we would otherwise.
