How do I override the printing of arrays? - pharo

printOn: aStream
| normalized |
normalized := self normalized.
aStream nextPut: ${.
self isEmpty ifFalse: [
normalized printElem: 1 on: aStream.
2 to: self size do: [ :i |
aStream nextPutAll: ' . '.
normalized printElem: i on: aStream
].
].
aStream nextPut: $}
This printOn: method works, but the Inspector is using some other route to print the array. How do I tell the Inspector to use the above method for my class that inherits from Array?

Inspector uses gtDisplayOn: to represent objects.
In Object it is implemented as:
gtDisplayOn: stream
"This offers a means to customize how the object is shown in the inspector"
^ self printOn: stream
However, Collection overrides it as:
gtDisplayOn: stream
self printNameOn: stream.
stream
space;
nextPut: $[;
print: self size;
nextPutAll: (' item' asPluralBasedOn: self size);
nextPut: $];
space.
self size <= self gtCollectionSizeThreshold
ifTrue: [ self printElementsOn: stream ]
Just override it again in your class to use printOn: as Object does.

Convert byte to string using reflect.StringHeader still allocates new memory?

I've got this small code snippet to test two ways of converting a byte slice to a string: one function allocates a new string, the other uses an unsafe pointer cast to construct the string header directly, which doesn't allocate new memory:
package main
import (
"fmt"
"reflect"
"unsafe"
)
func byteToString(b []byte) string {
return string(b)
}
func byteToStringNoAlloc(b []byte) string {
if len(b) == 0 {
return ""
}
sh := reflect.StringHeader{uintptr(unsafe.Pointer(&b[0])), len(b)}
return *(*string)(unsafe.Pointer(&sh))
}
func main() {
b := []byte("hello")
fmt.Printf("1st element of slice: %v\n", &b[0])
str := byteToString(b)
sh := (*reflect.StringHeader)(unsafe.Pointer(&str))
fmt.Printf("New alloc: %v\n", sh)
toStr := byteToStringNoAlloc(b)
shNoAlloc := (*reflect.StringHeader)(unsafe.Pointer(&toStr))
fmt.Printf("No alloc: %v\n", shNoAlloc) // why different from &b[0]
}
I run this program under go 1.13:
1st element of slice: 0xc000076068
New alloc: &{824634204304 5}
No alloc: &{824634204264 5}
I expect that the "1st element of slice" should print the same address as "No alloc", but actually they're very different. Where did I go wrong?
First of all, type conversions call internal runtime functions; in this case it's slicebytetostring.
https://golang.org/src/runtime/string.go?h=slicebytetostring#L75
It copies the slice's content into newly allocated memory.
In the second case you're creating a new header that points at the slice's data and casting it to a string header, which becomes a new, unofficial holder of the slice's content.
The problem is that the garbage collector does not treat the uintptr in reflect.StringHeader as a reference to the slice's backing array. The header is just a plain structure with no relation to the slice that holds the actual content, so the resulting string is only valid while the actual content holders are alive (the string header itself doesn't count).
So once the garbage collector sweeps the actual content, your string will still point to the same address, which is now freed memory, and you'll get a panic or undefined behavior if you touch it.
By the way, there's no need to use the reflect package and its headers, because a direct cast already produces a new header as a result:
*(*string)(unsafe.Pointer(&byte_slice))
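On the original question of why the two addresses look different: &b[0] is printed as a pointer (hexadecimal), while the Data field of the header is a uintptr and is printed in decimal. The decimal value from the "No alloc" line is the same address as the one shown for the first element. The sketch below checks that, and also shows a no-copy conversion using unsafe.String, assuming Go 1.20 or later. The helper names (noAllocString, hexAddr, sharesMemory) are made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// noAllocString reinterprets b as a string without copying (needs Go 1.20+).
// The caller must not modify b while the returned string is in use.
func noAllocString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(&b[0], len(b))
}

// hexAddr formats a numeric address in hexadecimal, like a printed pointer.
func hexAddr(addr uint64) string {
	return fmt.Sprintf("%#x", addr)
}

// sharesMemory reports whether s starts at the first byte of b.
func sharesMemory(s string, b []byte) bool {
	return len(b) > 0 && unsafe.StringData(s) == &b[0]
}

func main() {
	// Decimal Data value taken from the "No alloc" line in the question.
	fmt.Println(hexAddr(824634204264)) // prints 0xc000076068

	b := []byte("hello")
	s := noAllocString(b)
	fmt.Println(s, sharesMemory(s, b)) // prints hello true
}
```

Because the string shares the slice's memory, any later write to b would be visible through s, which breaks the usual immutability guarantee for strings.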

Ada - (Streams) How to correctly call String'Read () without knowing string length beforehand

I'm attempting to write a quick program to send AT commands to a serial-port modem. I have opened the port with the right settings (B115200, 8N1 etc) and the String'Write call in the below code sample does actually work correctly.
Now I'm adding the code to read the modem's response back as a string. However I cannot know the length of the response beforehand and hence I cannot create a String variable to pass in to the out String parameter unless I do know the length.
package GSC renames GNAT.Serial_Communications;
SP : aliased GSC.Serial_Port;
function Send (Port : in GSC.Serial_Port; S : in String) return String is
begin
String'Write (SP'Access, S);
delay 0.1;
declare
Retval : String; -- NOT VALID - needs to be initialised
begin
String'Read (SP'Access, Retval);
return Retval;
end;
end Send;
I have a chicken / egg situation here.
The answer is probably to read the input one character at a time until you reach the terminator.
You could allocate a buffer long enough to hold the longest possible response (e.g. 1024 bytes!) (or maybe use recursion - but that’d be more complicated and make it difficult to diagnose possible overrun errors).
If the string is terminated by a specific character, you could use Interfaces.C.Pointers:
function Receive (Port : in GSC.Serial_Port) return String is
package Character_Pointers is new Interfaces.C.Pointers (
Index => Positive, Element => Character, Element_Array => String,
Default_Terminator => Character'Val (13)); -- CR-Terminated
function Convert is new Ada.Unchecked_Conversion (
Source => access all Streams.Stream_Element,
Target => Character_Pointers.Pointer);
-- assuming no more than 1023 characters + terminator can be given.
Max_Elements : constant Streams.Stream_Element_Offset :=
1024 * Character'Size / Streams.Stream_Element'Size;
Buffer : Streams.Stream_Element_Array (1 .. Max_Elements);
Last : Streams.Stream_Element_Offset;
begin
Port.Read (Buffer, Last);
return Character_Pointers.Value (Convert (Buffer (1)'Access));
end Receive;
This code makes several assumptions:
String is terminated with CR (can be modified by setting Default_Terminator appropriately).
The response contains nothing other than the string (additional content that may have been read after the string is silently discarded).
The whole content will never be longer than 1024 bytes.
The typical way to achieve this is to send the length first, then read the value. (This is what things like bencode do.) -- Something like:
-- A stream from Standard-Input; for passing to example parameters:
Some_Stream: not null access Ada.Streams.Root_Stream_Type'Class :=
Ada.Text_IO.Text_Streams.Stream( Ada.Text_IO.Standard_Input );
-- The simple way, use the 'Input attribute; this calls the appropriate
-- default deserializations to return an unconstrained type.
-- HOWEVER, if you're reading from an already extant data-stream, you may
-- need to customize the type's Input function.
Some_Value : Constant String := String'Input( Some_Stream );
-- If the stream places a length into the stream first, you can simply read
-- it and use that value to preallocate the proper size and fill it with the
-- 'Read attribute.
Function Get_Value( Input : not null access Ada.Streams.Root_Stream_Type'Class ) return String is
Length : Constant Natural := Natural'Input( Input );
Begin
Return Result : String(1..Length) do
String'Read( Input, Result );
End Return;
End Get_Value;
-- The last method is to use when you're dealing with buffered information.
-- (Use this if you're dealing with idiocy like null-terminated strings.)
Function Get_Buffered_Value( Input : not null access Ada.Streams.Root_Stream_Type'Class;
Buffer_Size : Positive := 1024;
Full_Buffer : Boolean := True;
Terminator : Character:= ASCII.NUL
) return String is
Buffer : String(1..Buffer_Size);
Begin
-- Full_Buffer means we can read the entire buffer-size w/o
-- "overconsuming" -- IOW, the stream is padded to buffer-length.
if full_buffer then
String'Read(Input, Buffer);
declare
Index : Natural renames Ada.Strings.Fixed.Index(
Source => Buffer,
Pattern => (1..1 => Terminator),
From => Buffer'First
);
begin
Return Buffer(Buffer'First..Index);
end;
else
declare
Index : Positive := Buffer'First;
begin
-- Read characters.
loop
Character'Read( Input, Buffer(Index) );
exit when Buffer(Index) = Terminator;
Index:= Positive'Succ( Index );
exit when Index not in Buffer'Range;
end loop;
-- We're returning everything but the terminator.
Return Buffer(1..Positive'Pred(Index));
end;
end if;
End Get_Buffered_Value;

How to use map[string]*string

I'm trying to use sarama (Admin mode) to create a topic.
Without ConfigEntries it works fine, but I need to define some configs.
I set up the topic config (this is where the error occurs):
tConfigs := map[string]*string{
"cleanup.policy": "delete",
"delete.retention.ms": "36000000",
}
But then I get an error:
./main.go:99:28: cannot use "delete" (type string) as type *string in map value
./main.go:100:28: cannot use "36000000" (type string) as type *string in map value
I'm trying to use the admin mode like this:
err = admin.CreateTopic(t.Name, &sarama.TopicDetail{
NumPartitions: 1,
ReplicationFactor: 3,
ConfigEntries: tConfigs,
}, false)
Here is the line from the sarama module that defines CreateTopic()
https://github.com/Shopify/sarama/blob/master/admin.go#L18
Basically, I don't understand how a map of string pointers works :)
To initialize a map having string pointer value type with a composite literal, you have to use string pointer values. A string literal is not a pointer, it's just a string value.
An easy way to get a pointer to a string value is to take the address of a variable of string type, e.g.:
s1 := "delete"
s2 := "36000000"
tConfigs := map[string]*string{
"cleanup.policy": &s1,
"delete.retention.ms": &s2,
}
To make it convenient when used many times, create a helper function:
func strptr(s string) *string { return &s }
And using it:
tConfigs := map[string]*string{
"cleanup.policy": strptr("delete"),
"delete.retention.ms": strptr("36000000"),
}
Try the examples on the Go Playground.
See background and other options here: How do I do a literal *int64 in Go?
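As one more option, on Go 1.18 or later the helper can be written once with generics so it works for any value type, not only strings. This is a minimal sketch; the names ptr and topicConfig are hypothetical and not part of sarama.

```go
package main

import "fmt"

// ptr returns a pointer to a copy of v.
// Hypothetical helper name; requires Go 1.18+ for type parameters.
func ptr[T any](v T) *T { return &v }

// topicConfig builds the same config map as in the question.
func topicConfig() map[string]*string {
	return map[string]*string{
		"cleanup.policy":      ptr("delete"),
		"delete.retention.ms": ptr("36000000"),
	}
}

func main() {
	for k, v := range topicConfig() {
		fmt.Println(k, *v)
	}
}
```

Each call to ptr returns the address of its own copy of the argument, so the map values never alias each other.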

Garbage collection and correct usage of pointers in Go

I come from a Python/Ruby/JavaScript background. I understand how pointers work, however, I'm not completely sure how to leverage them in the following situation.
Let's pretend we have a fictitious web API that searches some image database and returns a JSON describing what's displayed in each image that was found:
[
{
"url": "https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg",
"description": "Ocean islands",
"tags": [
{"name":"ocean", "rank":1},
{"name":"water", "rank":2},
{"name":"blue", "rank":3},
{"name":"forest", "rank":4}
]
},
...
{
"url": "https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg",
"description": "Bridge over river",
"tags": [
{"name":"bridge", "rank":1},
{"name":"river", "rank":2},
{"name":"water", "rank":3},
{"name":"forest", "rank":4}
]
}
]
My goal is to create a data structure in Go that will map each tag to a list of image URLs that would look like this:
{
"ocean": [
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg"
],
"water": [
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg",
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
],
"blue": [
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg"
],
"forest":[
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg",
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
],
"bridge": [
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
],
"river":[
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
]
}
As you can see, each image URL can belong to multiple tags at the same time. If I have thousands of images and even more tags, this data structure can grow very large if image URL strings are copied by value for each tag. This is where I want to leverage pointers.
I can represent the JSON API response by two structs in Go, func searchImages() mimics the fake API:
package main
import "fmt"
type Image struct {
URL string
Description string
Tags []*Tag
}
type Tag struct {
Name string
Rank int
}
// this function mimics json.NewDecoder(resp.Body).Decode(&parsedJSON)
func searchImages() []*Image {
parsedJSON := []*Image{
&Image {
URL: "https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg",
Description: "Ocean islands",
Tags: []*Tag{
&Tag{"ocean", 1},
&Tag{"water", 2},
&Tag{"blue", 3},
&Tag{"forest", 4},
},
},
&Image {
URL: "https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg",
Description: "Bridge over river",
Tags: []*Tag{
&Tag{"bridge", 1},
&Tag{"river", 2},
&Tag{"water", 3},
&Tag{"forest", 4},
},
},
}
return parsedJSON
}
Now the less optimal mapping function that results in a very large in-memory data structure can look like this:
func main() {
result := searchImages()
tagToUrlMap := make(map[string][]string)
for _, image := range result {
for _, tag := range image.Tags {
// fmt.Println(image.URL, tag.Name)
tagToUrlMap[tag.Name] = append(tagToUrlMap[tag.Name], image.URL)
}
}
fmt.Println(tagToUrlMap)
}
I can modify it to use pointers to the Image struct URL field instead of copying it by value:
// Version 1
tagToUrlMap := make(map[string][]*string)
for _, image := range result {
for _, tag := range image.Tags {
// fmt.Println(image.URL, tag.Name)
tagToUrlMap[tag.Name] = append(tagToUrlMap[tag.Name], &image.URL)
}
}
It works and my first question is what happens to the result data structure after I build the mapping in this way? Will the Image URL string fields be left in memory somehow and the rest of the result will be garbage collected? Or will the result data structure stay in memory until the end of the program because something points to its members?
Another way to do this would be to copy the URL to an intermediate variable and use a pointer to it instead:
// Version 2
tagToUrlMap := make(map[string][]*string)
for _, image := range result {
imageUrl := image.URL
for _, tag := range image.Tags {
// fmt.Println(image.URL, tag.Name)
tagToUrlMap[tag.Name] = append(tagToUrlMap[tag.Name], &imageUrl)
}
}
Is this better? Will the result data structure be garbage collected correctly?
Or perhaps I should use a pointer to string in the Image struct instead?
type Image struct {
URL *string
Description string
Tags []*Tag
}
Is there a better way to do this? I would also appreciate any resources on Go that describe various uses of pointers in depth. Thanks!
https://play.golang.org/p/VcKWUYLIpH7
UPDATE: I'm worried about optimal memory consumption and not generating unwanted garbage the most. My goal is to use the minimal amount of memory possible.
Foreword: I released the presented string pool in my github.com/icza/gox library, see stringsx.Pool.
First some background. string values in Go are represented by a small struct-like data structure reflect.StringHeader:
type StringHeader struct {
Data uintptr
Len int
}
So basically passing / copying a string value passes / copies this small struct value, which is 2 words only regardless of the length of the string. On 64-bit architectures, it's only 16 bytes, even if the string has a thousand characters.
So basically string values already act as pointers. Introducing another pointer like *string just complicates usage, and you won't really save any noticeable memory. For the sake of memory optimization, forget about using *string.
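The size claim can be checked with unsafe.Sizeof, which reports the size of the string value itself (the header), not the bytes it points to. This is a small sketch; the function names are made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of one machine word (a pointer) in bytes.
const wordSize = unsafe.Sizeof(uintptr(0))

// headerSize returns the size in bytes of the string value itself,
// which does not depend on the length of its content.
func headerSize(s string) uintptr { return unsafe.Sizeof(s) }

// stringHeaderWords reports how many words a string value occupies.
func stringHeaderWords() uintptr { return headerSize("") / wordSize }

// pointerWords reports how many words a *string value occupies.
func pointerWords() uintptr {
	var p *string
	return unsafe.Sizeof(p) / wordSize
}

func main() {
	fmt.Println("string value:", stringHeaderWords(), "words")
	fmt.Println("*string value:", pointerWords(), "word")
	fmt.Println(headerSize("a") == headerSize("a much, much longer string value"))
}
```

Storing a *string instead of a string therefore saves at most one word per entry, and costs an extra indirection on every access.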
It works and my first question is what happens to the result data structure after I build the mapping in this way? Will the Image URL string fields be left in memory somehow and the rest of the result will be garbage collected? Or will the result data structure stay in memory until the end of the program because something points to its members?
If you have a pointer value pointing to a field of a struct value, then the whole struct will be kept in memory; it can't be garbage collected. Note that although it would be possible in principle to release the memory reserved for the other fields of the struct, the current Go runtime and garbage collector do not do so. So to achieve optimal memory usage, you should forget about storing addresses of struct fields (unless you also need the complete struct values, but even then, storing field addresses and slice/array element addresses always requires care).
The reason for this is that memory for a struct value is allocated as a contiguous segment, so keeping only a single referenced field would strongly fragment the available / free memory, and would make optimal memory management even harder and less efficient. Defragmenting such areas would also require copying the referenced field's memory area, which would require "live-changing" pointer values (changing memory addresses).
So while using pointers to string values may save you a tiny amount of memory, the added complexity and additional indirections make it not worthwhile.
So what to do then?
"Optimal" solution
So the cleanest way is to keep using string values.
And there is one more optimization we didn't talk about earlier.
You get your results by unmarshaling a JSON API response. This means that if the same URL or tag value is included multiple times in the JSON response, different string values will be created for them.
What does this mean? If you have the same URL twice in the JSON response, after unmarshaling, you will have 2 distinct string values which will contain 2 different pointers pointing to 2 different allocated byte sequences (string content which otherwise will be the same). The encoding/json package does not do string interning.
Here's a little app that proves this:
var s []string
err := json.Unmarshal([]byte(`["abc", "abc", "abc"]`), &s)
if err != nil {
panic(err)
}
for i := range s {
hdr := (*reflect.StringHeader)(unsafe.Pointer(&s[i]))
fmt.Println(hdr.Data)
}
Output of the above (try it on the Go Playground):
273760312
273760315
273760320
We see 3 different pointers. They could be the same, as string values are immutable.
The json package does not detect repeating string values because the detection adds memory and computational overhead, which is usually unwanted. But in our case we are aiming for optimal memory usage, so an additional up-front computation is worth the big memory gain.
So let's do our own string interning. How to do that?
After unmarshaling the JSON result, during building the tagToUrlMap map, let's keep track of string values we have come across, and if the subsequent string value has been seen earlier, just use that earlier value (its string descriptor).
Here's a very simple string interner implementation:
var cache = map[string]string{}
func interned(s string) string {
if s2, ok := cache[s]; ok {
return s2
}
// New string, store it
cache[s] = s
return s
}
Let's test this "interner" in the example code above:
var s []string
err := json.Unmarshal([]byte(`["abc", "abc", "abc"]`), &s)
if err != nil {
panic(err)
}
for i := range s {
hdr := (*reflect.StringHeader)(unsafe.Pointer(&s[i]))
fmt.Println(hdr.Data, s[i])
}
for i := range s {
s[i] = interned(s[i])
}
for i := range s {
hdr := (*reflect.StringHeader)(unsafe.Pointer(&s[i]))
fmt.Println(hdr.Data, s[i])
}
Output of the above (try it on the Go Playground):
273760312 abc
273760315 abc
273760320 abc
273760312 abc
273760312 abc
273760312 abc
Wonderful! As we can see, after using our interned() function, only a single instance of the "abc" string is used in our data structure (which is actually the first occurrence). This means all other instances (given no one else uses them) can be–and will be–properly garbage collected (by the garbage collector, some time in the future).
One thing to not forget here: the string interner uses a cache dictionary which stores all previously encountered string values. So to let those strings go, you should "clear" this cache map too, simplest done by assigning a nil value to it.
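One way to make that cleanup harder to forget is to wrap the cache in a small type, so that dropping the interner value drops the cache with it. The following is only a sketch of that idea: the Interner type and its methods are hypothetical names, not the API of any library, and sameData relies on unsafe.StringData, which assumes Go 1.20 or later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Interner is a hypothetical wrapper around the cache map shown above.
type Interner struct {
	cache map[string]string
}

// NewInterner returns an empty interner.
func NewInterner() *Interner {
	return &Interner{cache: make(map[string]string)}
}

// Intern returns the first string seen with the same content as s.
func (in *Interner) Intern(s string) string {
	if s2, ok := in.cache[s]; ok {
		return s2
	}
	// New string, store it.
	in.cache[s] = s
	return s
}

// Len reports how many distinct strings are cached.
func (in *Interner) Len() int { return len(in.cache) }

// sameData reports whether a and b point at the same bytes (Go 1.20+).
func sameData(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

func main() {
	in := NewInterner()
	// Build equal strings at run time so they are not one shared constant.
	a := string([]byte{'a', 'b', 'c'})
	b := string([]byte{'a', 'b', 'c'})
	x, y := in.Intern(a), in.Intern(b)
	fmt.Println(x == y, sameData(x, y), in.Len())
}
```

Once the build phase is finished and nothing references the Interner value any more, its map becomes garbage like any other object, so there is no package-level variable to reset.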
Without further ado, let's see our solution:
result := searchImages()
tagToUrlMap := make(map[string][]string)
for _, image := range result {
imageURL := interned(image.URL)
for _, tag := range image.Tags {
tagName := interned(tag.Name)
tagToUrlMap[tagName] = append(tagToUrlMap[tagName], imageURL)
}
}
// Clear the interner cache:
cache = nil
To verify the results:
enc := json.NewEncoder(os.Stdout)
enc.SetIndent("", " ")
if err := enc.Encode(tagToUrlMap); err != nil {
panic(err)
}
Output is (try it on the Go Playground):
{
"blue": [
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg"
],
"bridge": [
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
],
"forest": [
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg",
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
],
"ocean": [
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg"
],
"river": [
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
],
"water": [
"https://c8.staticflickr.com/4/3707/11603200203_87810ddb43_o.jpg",
"https://c3.staticflickr.com/1/48/164626048_edeca27ed7_o.jpg"
]
}
Further memory optimizations:
We used the builtin append() function to add new image URLs to tags. append() may (and usually does) allocate bigger slices than needed (thinking of future growth). After our "build" process, we may go through our tagToUrlMap map and "trim" those slices to the minimum needed.
This is how it could be done:
for tagName, urls := range tagToUrlMap {
if cap(urls) > len(urls) {
urls2 := make([]string, len(urls))
copy(urls2, urls)
tagToUrlMap[tagName] = urls2
}
}
Will the [...] be garbage collected correctly?
Yes.
You never need to worry that something will be collected which is still in use and you can rely on everything being collected once it is no longer used.
So the question about GC is never "Will it be collected correctly?" but "Do I generate unnecessary garbage?". That question depends less on the data structure than on the number of new objects created (on the heap). So it is a question about how the data structures are used, and much less about the structure itself. Use benchmarks and run go test with -benchmem.
(High end performance might also consider how much work the GC has to do: Scanning pointers might take time. Forget that for now.)
The other relevant question is about memory consumption. Copying a string copies just two words (a pointer and a length), while copying a *string copies one word. So there is not much to save here by using *string.
So unfortunately there are no clear answers to the relevant questions (amount of garbage generated and total memory consumption). Don't overthink the problem, use what fits your purpose, measure and refactor.

Calling method with array of arguments using QueuedConnection

I want to call an arbitrary slot of a QObject that lives in another thread.
I have:
                          | Arguments    | Can use QueuedConnection?
QMetaObject::invokeMethod | fixed number | YES
qt_metacall               | array        | NO
I want:
<something>               | array        | YES
I don't want to do things like duplicating invokeMethod code based on the number of arguments.
Where to get invokeMethod that accepts array of arguments or how to make qt_metacall queued?
You can either:
write a signal with the same default parameters as the slot you want to call, connect it to the slot with Qt::QueuedConnection and call the signal with qt_metacall and your array, or
write a QObject derived class that:
takes your parameter array as parameter for its constructor, and stores it internally,
calls QMetaObject::invokeMethod in the constructor with Qt::QueuedConnection to invoke a slot without parameter which will call qt_metacall with the stored parameter array before deleting the QObject.
Internally Qt uses the second method, but with an internal class, QMetaCallEvent (in corelib/kernel/qobject_p.h), and postEvent instead of a signal/slot connection.
A workaround is to create an array of QGenericArgument values and fill it in:
QGenericArgument args[] = {
QGenericArgument(), ....... ,QGenericArgument(),};
for (int p = 0; p < parameterTypes.count(); ++p) {
QVariant::Type type = QVariant::nameToType(parameterTypes.at(p));
switch(type) {
case QVariant::String:
args[p] = Q_ARG(QString, obtainTheNextStringArgument());
break;
// the rest needed types here
}
}
mm.invoke(object, Qt::QueuedConnection, args[0], args[1], args[2], args[3], args[4], args[5], args[6], args[7], args[8],args[9]);
