Is string a reference type or a value type - string

I was recently reading the Go source code. There is a file called string.go in it, but string is also a predeclared identifier that the source code itself uses directly.
I found some articles saying that string is a reference type, but then I ran the following code:
func TestString(t *testing.T) {
	s := "abc"
	fmt.Println("address of s: ", &s)
	xx := func(sss string) {
		fmt.Println("address of sss: ", &sss)
		sss = "123"
	}
	xx(s)
	fmt.Println("value of s after sss modified the content: ", s)
}
output:
=== RUN TestString
address of s: 0xc00010a560
address of sss: 0xc00010a570
value of s after sss modified the content: abc
--- PASS: TestString (0.00s)
If string is a reference type, then when I pass s to func(sss string), the address of sss should be the same as that of s, and s should be modified. But it wasn't. Why?
Does Go do something special with string to make it behave like a value type? If so, where is that code? If string is a reference type and its actual type is stringStruct, shouldn't that behavior be defined somewhere?
This really confuses me.

&s and &sss are the addresses of variables. Since they are 2 distinct variables of non-zero size, their addresses must be different, which is what you observe.
There are no reference types in Go in the classic C++ sense. A string is a small struct-like value described by reflect.StringHeader:
type StringHeader struct {
	Data uintptr
	Len  int
}
It contains a pointer where the UTF-8 encoded bytes of the string are stored, and a byte-length.
When you assign something to a variable of string type, you change the value of the variable (the small StringHeader struct above), but not the pointed data. When you assign something to the sss variable, the original s variable is unchanged: it still contains the same data pointer, pointing to the same bytes.
Read The Go Blog: Strings, bytes, runes and characters in Go
See related questions:
Immutable string and pointer address
What is the difference between the string and []byte in Go?

Related

How to convert String to Primitive.ObjectID in Golang?

There are similar questions, but most of them use Hex() for the ObjectID-to-string conversion. I'm using String() for the conversion. How do I convert the result back to a primitive.ObjectID?
The String() method of types may result in an arbitrary string representation. Parsing it may not always be possible, as it may not contain all the information the original value holds, or it may not be "rendered" in a way that is parsable unambiguously. There's also no guarantee the "output" of String() doesn't change over time.
Current implementation of ObjectID.String() does this:
func (id ObjectID) String() string {
	return fmt.Sprintf("ObjectID(%q)", id.Hex())
}
Which results in a string like this:
ObjectID("4af9f070cc10e263c8df915d")
This is parsable, you just have to take the hex number, and pass it to primitive.ObjectIDFromHex():
For example:
id := primitive.NewObjectID()
s := id.String()
fmt.Println(s)
hex := s[10:34]
id2, err := primitive.ObjectIDFromHex(hex)
fmt.Println(id2, err)
This will output (try it on the Go Playground):
ObjectID("4af9f070cc10e263c8df915d")
ObjectID("4af9f070cc10e263c8df915d") <nil>
This solution could be improved to locate the " characters in the string representation and use their indices instead of the fixed 10 and 34, but you shouldn't be transferring and parsing the result of ObjectID.String() in the first place. Use ObjectID.Hex() instead: its result can be passed as-is to primitive.ObjectIDFromHex().
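For completeness, the quote-based extraction mentioned above could be sketched with the standard library only. hexFromObjectIDString is a hypothetical helper name, and the sketch assumes the ObjectID("...") format shown earlier, which is not guaranteed to stay stable. The returned hex string would then be handed to primitive.ObjectIDFromHex().

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// hexFromObjectIDString is a hypothetical helper. It extracts the text
// between the first and last double quote of s and checks that it decodes
// to 12 bytes (24 hex digits), the size of an ObjectID.
func hexFromObjectIDString(s string) (string, error) {
	start := strings.IndexByte(s, '"')
	end := strings.LastIndexByte(s, '"')
	if start < 0 || end <= start {
		return "", errors.New("no quoted value found")
	}
	h := s[start+1 : end]
	if b, err := hex.DecodeString(h); err != nil || len(b) != 12 {
		return "", errors.New("quoted value is not a 24-digit hex string")
	}
	return h, nil
}

func main() {
	h, err := hexFromObjectIDString(`ObjectID("4af9f070cc10e263c8df915d")`)
	fmt.Println(h, err) // 4af9f070cc10e263c8df915d <nil>
}
```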

How does type conversion internally work? What is the memory utilization for the same?

How does Go type conversion internally work?
What is the memory utilisation for a type cast?
For example:
var str1 string
str1 = "26MB string data"
byt := []byte(str1)
str2 := string(byt)
Whenever I convert a variable to another type, will it consume more memory?
I am concerned about this because when I try to unmarshal, I get "fatal error: runtime: out of memory"
err = json.Unmarshal([]byte(str1), &obj)
The str1 value comes from an HTTP response, but it is read using ioutil.ReadAll, hence it contains the complete response.
It's called conversion in Go (not casting), and this is covered in Spec: Conversions:
Specific rules apply to (non-constant) conversions between numeric types or to and from a string type. These conversions may change the representation of x and incur a run-time cost. All other conversions only change the type but not the representation of x.
So generally converting does not make a copy, only changes the type. Converting to / from string usually does, as string values are immutable, and for example if converting a string to []byte would not make a copy, you could change the content of the string by changing elements of the resulting byte slice.
See related question: Does convertion between alias types in Go create copies?
There are some exceptions (compiler optimizations) when converting to / from string does not make a copy, for details see golang: []byte(string) vs []byte(*string).
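The copy made in the general case is easy to observe: changing the converted byte slice leaves the original string untouched. A minimal sketch (convertAndModify is a hypothetical helper used only for illustration):

```go
package main

import "fmt"

// convertAndModify converts s to a byte slice, changes the first byte of
// the slice, and returns both so the caller can compare them.
func convertAndModify(s string) (string, []byte) {
	b := []byte(s) // allocates a new backing array and copies the bytes
	if len(b) > 0 {
		b[0] = 'X'
	}
	return s, b
}

func main() {
	s, b := convertAndModify("hello")
	fmt.Println(s)         // hello
	fmt.Println(string(b)) // Xello
}
```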
If you already have your JSON content as a string value which you want to unmarshal, you should not convert it to []byte just for the sake of unmarshaling. Instead use strings.NewReader() to obtain an io.Reader which reads from the passed string value, and pass this reader to json.NewDecoder(), so you can unmarshal without having to make a copy of your big input JSON string.
This is what it could look like:
input := "BIG JSON INPUT"
dec := json.NewDecoder(strings.NewReader(input))
var result YourResultType
if err := dec.Decode(&result); err != nil {
	// Handle error
}
Also note that this solution can further be optimized if the big JSON string is read from an io.Reader, in which case you can completely omit reading it first, just pass that to json.NewDecoder() directly, e.g.:
dec := json.NewDecoder(jsonSource)
var result YourResultType
if err := dec.Decode(&result); err != nil {
	// Handle error
}

What are the differences between a *string and a string in Golang?

Aim: understanding the difference between *string and string in Golang
Attempt
func passArguments() {
	username := flag.String("user", "root", "Username for this server")
	flag.Parse()
	fmt.Printf("Your username is %q.", *username)
	fmt.Printf("Your username is %q.", username)
}
results in:
Your username is "root".Your username is %!q(*string=0xc820072200)
but when the *string is assigned to a string:
bla:=*username
fmt.Printf("Your username is %q.", bla)
it is able to print the string again:
Your username is "root".Your username is %!q(*string=0xc8200781f0).Your username is "root".
Questions
1. Why is a *string != string, e.g. display of: "root" vs. %!q(*string=0xc8200781f0)?
2. In what other cases should a *string be used instead of a string and why?
3. Why is it possible to assign a *string to a string variable, while the display of the string is different, e.g. display of: "root" vs. %!q(*string=0xc8200781f0)?
A *string is a pointer to a string. If you're not familiar with pointers, let's just say that it's a value that holds the address of another value, instead of the value itself (it's a level of indirection).
When a * is used in a type, it denotes a pointer to that type. *int is a pointer to an integer. ***bool is a pointer to a pointer to a pointer to a bool.
flag.String returns a pointer to a string because it can then modify the string value (after the call to flag.Parse), and you are able to retrieve that value using the dereference operator. That is, when using * on a variable, it dereferences it, or retrieves the value pointed to instead of the value of the variable itself (which in the case of a pointer would just be a memory address).
So to answer your specific questions:
The %q verb in the fmt package understands strings (and slices of bytes), not pointers, hence the apparent gibberish displayed. When a value is not of the expected type for the matching verb (here %q), the fmt functions display %!q along with the actual type and value passed.
A pointer to a string is very rarely used. A string in Go is immutable (https://golang.org/ref/spec#String_types) so in cases like flag.String where you need to return a string that will be mutated later on, you have to return a pointer to a string. But you won't see that very often in idiomatic Go.
You are not assigning a *string (pointer to a string) to a string. What you are doing, as I mentioned earlier, is dereferencing the *string variable, extracting its string value. So you are in fact assigning a string to a string. Try removing the * on that line, you'll see the compiler error message. (actually, because you're using the short variable declaration notation, :=, you won't see a compiler error, but your variable will be declared as a pointer-to-a-string. Try this instead, to better understand what's going on:
var s string
s = username
That will raise the compiler error).
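The pointer mechanics described above can be sketched without the flag package. setValue is a hypothetical stand-in for what flag.Parse does with the pointer returned by flag.String: it stores a new string in the variable the pointer refers to.

```go
package main

import "fmt"

// setValue (hypothetical) overwrites the string variable that p points to.
// The old string's bytes are not modified; the variable gets a new value.
func setValue(p *string, v string) {
	*p = v
}

func main() {
	username := "root"
	p := &username // p has type *string

	setValue(p, "admin")

	fmt.Printf("%q\n", *p)       // "admin"
	fmt.Printf("%q\n", username) // "admin"
}
```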

What is the difference between the string and []byte in Go?

s := "some string"
b := []byte(s) // convert string -> []byte
s2 := string(b) // convert []byte -> string
What is the difference between string and []byte in Go?
When should I use one or the other?
Why?
bb := []byte{'h','e','l','l','o',127}
ss := string(bb)
fmt.Println(ss)
hello
The output is just "hello"; the 127 does not show up, which seems weird to me.
string and []byte are different types, but they can be converted to one another:
3. Converting a slice of bytes to a string type yields a string whose successive bytes are the elements of the slice.
4. Converting a value of a string type to a slice of bytes type yields a slice whose successive elements are the bytes of the string.
Blog: Arrays, slices (and strings): The mechanics of 'append':
Strings are actually very simple: they are just read-only slices of bytes with a bit of extra syntactic support from the language.
Also read: Strings, bytes, runes and characters in Go
When to use one over the other?
Depends on what you need. Strings are immutable, so they can be shared and you have guarantee they won't get modified.
Byte slices can be modified (meaning the content of the backing array).
Also if you need to frequently convert a string to a []byte (e.g. because you need to write it into an io.Writer), you should consider storing it as a []byte in the first place.
Also note that you can have string constants but there are no slice constants. This may be a small optimization. Also note that:
The expression len(s) is constant if s is a string constant.
Also if you are using code already written (either standard library, 3rd party packages or your own), in most of the cases it is given what parameters and values you have to pass or are returned. E.g. if you read data from an io.Reader, you need to have a []byte which you have to pass to receive the read bytes, you can't use a string for that.
This example:
bb := []byte{'h','e','l','l','o',127}
What happens here is that you used a composite literal (slice literal) to create and initialize a new slice of type []byte (using Short variable declaration). You specified constants to list the initial elements of the slice. You also used a byte value 127 which - depending on the platform / console - may or may not have a visual representation.
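One way to confirm that the byte is still there is to check the length and print with the %q verb, which escapes non-printable bytes. A small sketch (quoted is a hypothetical helper):

```go
package main

import "fmt"

// quoted returns the %q rendering of the bytes, which escapes
// non-printable characters instead of sending them to the terminal raw.
func quoted(b []byte) string {
	return fmt.Sprintf("%q", b)
}

func main() {
	bb := []byte{'h', 'e', 'l', 'l', 'o', 127}
	ss := string(bb)

	fmt.Println(len(ss))    // 6: the byte 127 is part of the string
	fmt.Println(quoted(bb)) // "hello\x7f"
}
```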
Late, but I hope this helps.
In simple words
Bit: 0 or 1, the unit in which machines represent all information
Byte: 8 bits; in UTF-8 text, a character is encoded as one to four bytes
[]type: a slice of a given data type. Slices are dynamically sized views of arrays.
[]byte: a byte slice, i.e. a dynamically sized sequence of bytes. For ASCII text, each element is one character.
string: a read-only slice of bytes, i.e. immutable
With all this in mind:
s := "Go"
bs := []byte(s)
fmt.Printf("%s", bs) // Output: Go
fmt.Printf("%d", bs) // Output: [71 111]
or
bs := []byte{71, 111}
fmt.Printf("%s", bs) // Output: Go
%s prints the byte slice as a string
%d prints the decimal value of each byte
IMPORTANT:
As strings are immutable, they cannot be changed in memory; each time you add something to or remove something from a string, Go creates a new string. Byte slices, on the other hand, are mutable, so updating an element of a byte slice changes it in place without allocating a new one.
So choosing the right structure could make a difference in your app's performance.

Go - Comparing strings/byte slices input by the user

I am getting input from the user, but when I later compare it to a string literal, the comparison does not work. That is just a test though.
I would like to set it up so that when a blank line is entered (just hitting the enter/return key) the program exits. I don't understand why the strings do not compare equal, because when I print the input it looks identical.
in := bufio.NewReader(os.Stdin)
input, err := in.ReadBytes('\n')
if err != nil {
	fmt.Println("Error: ", err)
}
if string(input) == "example" {
	os.Exit(0)
}
string vs []byte
string definition:
string is the set of all strings of 8-bit bytes, conventionally but not necessarily representing UTF-8-encoded text. A string may be empty, but not nil. Values of string type are immutable.
byte definition:
byte is an alias for uint8 and is equivalent to uint8 in all ways. It is used, by convention, to distinguish byte values from 8-bit unsigned integer values.
What does it mean?
[]byte is a byte slice. A slice can be empty or nil.
Indexing a string yields bytes; ranging over it yields Unicode code points (runes), each of which can occupy more than 1 byte.
A string conventionally carries a meaning for its data (an encoding, usually UTF-8); a []byte does not.
The equality operator is defined for the string type but not for slice types.
As you can see, they are two different types with different properties.
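These differences can be checked with a short sketch. countBytesAndRunes is a hypothetical helper; the sample text uses the escape \u00e9 (the letter é) so that its UTF-8 encoding is unambiguous.

```go
package main

import (
	"bytes"
	"fmt"
)

// countBytesAndRunes returns the number of bytes in s and the number of
// Unicode code points obtained by ranging over it.
func countBytesAndRunes(s string) (nBytes, nRunes int) {
	for range s {
		nRunes++
	}
	return len(s), nRunes
}

func main() {
	s := "h\u00e9llo" // U+00E9 takes two bytes in UTF-8

	fmt.Println(countBytesAndRunes(s)) // 6 5
	fmt.Printf("%T\n", s[1])           // uint8: indexing yields a byte

	// Strings support ==; byte slices need bytes.Equal.
	fmt.Println(s == "h\u00e9llo")                             // true
	fmt.Println(bytes.Equal([]byte(s), []byte("h\u00e9llo"))) // true
}
```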
There is a great blog post explaining different string related types [1]
Regarding the issue in your code snippet:
Bear in mind that in.ReadBytes(char) returns a byte slice that includes the delimiter char. So in your code, input ends with '\n'. If you want your code to work as intended, try this:
if string(input) == "example\n" { // or "example\r\n" when on windows
	os.Exit(0)
}
Also make sure that your terminal's code page matches the encoding of your .go source file. Be aware of different line-ending styles (Windows uses "\r\n"). The standard Go compiler treats source files as UTF-8.
[1] Comparison of Go data types for string processing.

Resources