Difference between String and StaticString - string

I was browsing the docs, and I found StaticString. It states:
An simple string designed to represent text that is "knowable at compile-time".
I originally thought that String has the same behaviour as NSString, which is known at compile time, but it looks like that I was wrong. So my question is when should we use StaticString instead of a String, and is the only difference is that StaticString is known at compile-time?
One thing I found is
var a: String = "asdf" //"asdf"
var b: StaticString = "adsf" //{(Opaque Value), (Opaque Value), (Opaque Value)}
sizeofValue(a) //24
sizeofValue(b) //17
So it looks like StaticString has a little bit less memory footprint.

It appears that StaticString can hold string literals. You can't assign a variable of type String to it, and it can't be mutated (with +=, for example).
"Knowable at compile time" doesn't mean that the value held by the variable will be determined at compile time, just that any value assigned to it is known at compile time.
Consider this example which does work:
var str: StaticString
for _ in 1...10 {
switch arc4random_uniform(3) {
case 0: str = "zero"
case 1: str = "one"
case 2: str = "two"
default: str = "default"
}
print(str)
}
Any time you can give Swift more information about how a variable is to be used, it can optimize the code using it. By restricting a variable to StaticString, Swift knows the variable won't be mutated so it might be able to store it more efficiently, or access the individual characters more efficiently.
In fact, StaticString could be implemented with just an address pointer and a length. The address it points to is just the place in the static code where the string is defined. A StaticString doesn't need to be reference counted since it doesn't (need to) exist in the heap. It is neither allocated nor deallocated, so no reference count is needed.
"Knowable at compile time" is pretty strict. Even this doesn't work:
let str: StaticString = "hello " + "world"
which fails with error:
error: 'String' is not convertible to 'StaticString'

StaticString is knowable at compile time. This can lead to optimizations. Example:
EDIT: This part doesn't work, see edit below
Suppose you have a function that calculates an Int for some String values for some constants that you define at compile time.
let someString = "Test"
let otherString = "Hello there"
func numberForString(string: String) -> Int {
return string.stringValue.unicodeScalars.reduce(0) { $0 * 1 << 8 + Int($1.value) }
}
let some = numberForString(someString)
let other = numberForString(otherString)
Like this, the function would be executed with "Test" and "Hello there" when it really gets called in the program, when the app starts for example. Definitely at runtime. However if you change your function to take a StaticString
func numberForString(string: StaticString) -> Int {
return string.stringValue.unicodeScalars.reduce(0) { $0 * 1 << 8 + Int($1.value) }
}
the compiler knows that the passed in StaticString is knowable at compile time, so guess what it does? It runs the function right at compile time (How awesome is that!). I once read an article about that, the author inspected the generated assembly and he actually found the already computed numbers.
As you can see this can be useful in some cases like the one mentioned, to not decrease runtime performance for stuff that can be done at compile time.
EDIT: Dániel Nagy and me had a conversation. The above example of mine doesn't work because the function stringValue of StaticString can't be known at compile time (because it returns a String). Here is a better example:
func countStatic(string: StaticString) -> Int {
return string.byteSize // Breakpoint here
}
func count(string: String) -> Int {
return string.characters.count // Breakpoint here
}
let staticString : StaticString = "static string"
let string : String = "string"
print(countStatic(staticString))
print(count(string))
In a release build only the second breakpoint gets triggered whereas if you change the first function to
func countStatic(string: StaticString) -> Int {
return string.stringValue.characters.count // Breakpoint here
}
both breakpoints get triggered.
Apparently there are some methods which can be done at compile time while other can't. I wonder how the compiler figures this out actually.

Related

How to use map[string]*string

I'm trying to use sarama (Admin mode) to create a topic.
Without the ConfigEntries works fine. But I need to define some configs.
I set up the topic config (Here is happening the error):
tConfigs := map[string]*string{
"cleanup.policy": "delete",
"delete.retention.ms": "36000000",
}
But then I get an error:
./main.go:99:28: cannot use "delete" (type string) as type *string in map value
./main.go:100:28: cannot use "36000000" (type string) as type *string in map value
I'm trying to use the admin mode like this:
err = admin.CreateTopic(t.Name, &sarama.TopicDetail{
NumPartitions: 1,
ReplicationFactor: 3,
ConfigEntries: tConfigs,
}, false)
Here is the line from the sarama module that defines CreateTopic()
https://github.com/Shopify/sarama/blob/master/admin.go#L18
Basically, I didn't understand how the map of pointers strings works :)
To initialize a map having string pointer value type with a composite literal, you have to use string pointer values. A string literal is not a pointer, it's just a string value.
An easy way to get a pointer to a string value is to take the address of a variable of string type, e.g.:
s1 := "delete"
s2 := "36000000"
tConfigs := map[string]*string{
"cleanup.policy": &s1,
"delete.retention.ms": &s2,
}
To make it convenient when used many times, create a helper function:
func strptr(s string) *string { return &s }
And using it:
tConfigs := map[string]*string{
"cleanup.policy": strptr("delete"),
"delete.retention.ms": strptr("36000000"),
}
Try the examples on the Go Playground.
See background and other options here: How do I do a literal *int64 in Go?

Function Overloading in golang [duplicate]

I'm porting a C library to Go. A C function (with varargs) is defined like this:
curl_easy_setopt(CURL *curl, CURLoption option, ...);
So I created wrapper C functions:
curl_wrapper_easy_setopt_str(CURL *curl, CURLoption option, char* param);
curl_wrapper_easy_setopt_long(CURL *curl, CURLoption option, long param);
If I define function in Go like this:
func (e *Easy)SetOption(option Option, param string) {
e.code = Code(C.curl_wrapper_easy_setopt_str(e.curl, C.CURLoption(option), C.CString(param)))
}
func (e *Easy)SetOption(option Option, param long) {
e.code = Code(C.curl_wrapper_easy_setopt_long(e.curl, C.CURLoption(option), C.long(param)))
}
The Go compiler complains:
*Easy·SetOption redeclared in this block
So does Go support function (method) overloading, or does this error mean something else?
No it does not.
See the Go Language FAQ, and specifically the section on overloading.
Method dispatch is simplified if it doesn't need to do type matching as well. Experience with other languages told us that having a variety of methods with the same name but different signatures was occasionally useful but that it could also be confusing and fragile in practice. Matching only by name and requiring consistency in the types was a major simplifying decision in Go's type system.
Update: 2016-04-07
While Go still does not have overloaded functions (and probably never will), the most useful feature of overloading, that of calling a function with optional arguments and inferring defaults for those omitted can be simulated using a variadic function, which has since been added. But this comes at the loss of type checking.
For example: http://changelog.ca/log/2015/01/30/golang
According to this, it doesn't: http://golang.org/doc/go_for_cpp_programmers.html
In the Conceptual Differences section, it says:
Go does not support function overloading and does not support user defined operators.
Even though this question is really old, what I still want to say is that there is a way to acheive something close to overloading functions. Although it may not make the code so easy to read.
Say if you want to overload the funtion Test():
func Test(a int) {
println(a);
}
func Test(a int, b string) {
println(a);
println(b);
}
The code above will cause error. However if you redefine the first Test() to Test1() and the second to Test2(), and define a new function Test() using go's ..., you would be able to call the function Test() the way it is overloaded.
code:
package main;
func Test1(a int) {
println(a);
}
func Test2(a int, b string) {
println(a);
println(b);
}
func Test(a int, bs ...string) {
if len(bs) == 0 {
Test1(a);
} else {
Test2(a, bs[0]);
}
}
func main() {
Test(1);
Test(1, "aaa");
}
output:
1
1
aaa
see more at: https://golangbyexample.com/function-method-overloading-golang/ (I'm not the author of this linked article but personally consider it useful)
No, Go doesn't have overloading.
Overloading adds compiler complexity and will likely never be added.
As Lawrence Dol mentioned, you could use a variadic function at the cost of no type checking.
Your best bet is to use generics and type constraints that were added in Go 1.18
To answer VityaSchel's question, in the comments of Lawrence's answer, of how to make a generic sum function, I've written one below.
https://go.dev/play/p/hRhInhsAJFT
package main
import "fmt"
type Number interface {
int | int8 | int16 | int32 | int64 | uint | uint8 | uint16 | uint32 | uint64 | float32 | float64
}
func Sum[number Number](a number, b number) number {
return a + b
}
func main() {
var a float64 = 5.1
var b float64 = 3.2
println(Sum(a, b))
var a2 int = 5
var b2 int = 3
println(Sum(a2, b2))
}

C++ : Strings, Structures and Access Violation Writing Locations

I'm attempting to try and use a string input from a method and set that to a variable of a structure, which i then place in a linked list. I didn't include, all of code but I did post constructor and all that good stuff. Now the code is breaking at the lines
node->title = newTitle;
node->isbn = newISBN;
So newTitle is the string input from the method that I'm trying to set to the title variable of the Book structure of the variable node. Now, I'm assuming this has to do with a issue with pointers and trying to set data to them, but I can't figure out a fix/alternative.
Also, I tried using
strcpy(node->title, newTitle)
But that had an issue with converting the string into a list of chars because strcpy only uses a list of characters. Also tried a few other things, but none seemed to pan out, help with an explanation would be appreciated.
struct Book
{
string title;
string isbn;
struct Book * next;
};
//class LinkedList will contains a linked list of books
class LinkedList
{
private:
Book * head;
public:
LinkedList();
~LinkedList();
bool addElement(string title, string isbn);
bool removeElement(string isbn);
void printList();
};
//Constructor
//It sets head to be NULL to create an empty linked list
LinkedList::LinkedList()
{
head = NULL;
}
//Description: Adds an element to the link in alphabetical order, unless book with
same title then discards
// Returns true if added, false otherwise
bool LinkedList::addElement(string newTitle, string newISBN)
{
struct Book *temp;
struct Book *lastEntry = NULL;
temp = head;
if (temp==NULL) //If the list is empty, sets data to first entry
{
struct Book *node;
node = (Book*) malloc(sizeof(Book));
node->title = newTitle;
node->isbn = newISBN;
head = node;
}
while (temp!=NULL)
{
... //Rest of Code
Note that your Book struct is already a linked list implementation, so you don't need the LinkedList class at all, or alternatively you don't need the 'next' element of the struct.
But there's no reason from the last (long) code snippet you pasted to have an error at the lines you indicated. node->title = newTitle should copy the string in newTitle to the title field of the struct. The string object is fixed size so it's not possible to overwrite any buffer and cause a seg fault.
However, there may be memory corruption from something you do further up the code, which doesn't cause an error until later on. The thing to look for is any arrays, including char[], that you might be overfilling. Another idea is you mention you save method parameters. If you copy, it's ok, but if you do something like
char* f() {
char str[20];
strcpy(str, "hello");
return str;
}
...then you've got a problem. (Because str is allocated on the stack and you return only the pointer to a location that won't be valid after the function returns.) Method parameters are local variables.
The answer you seek can be found here.
In short: the memory malloc returns does not contain a properly constructed object, so you can't use it as such. Try using new / delete instead.

Another weird issue with Garbage Collection?

OK, so here's the culprit method :
class FunctionDecl
{
// More code...
override void execute()
{
//...
writeln("Before setting... " ~ name);
Glob.functions.set(name,this);
writeln("After setting." ~ name);
//...
}
}
And here's what happens :
If omit the writeln("After setting." ~ name); line, the program crashes, just at this point
If I keep it in (using the name attribute is the key, not the writeln itself), it works just fine.
So, I suppose this is automatically garbage collected? Why is that? (A pointer to some readable reference related to GC and D would be awesome)
How can I solve that?
UPDATE :
Just tried a GC.disable() at the very beginning of my code. And... automagically, everything works again! So, that was the culprit as I had suspected. The thing is : how is this solvable without totally eliminating Garbage Collection?
UPDATE II :
Here's the full code of functionDecl.d - "unnecessary" code omitted :
//================================================
// Imports
//================================================
// ...
//================================================
// C Interface for Bison
//================================================
extern (C)
{
void* FunctionDecl_new(char* n, Expressions i, Statements s) { return cast(void*)(new FunctionDecl(to!string(n),i,s)); }
void* FunctionDecl_newFromReference(char* n, Expressions i, Expression r) { return cast(void*)(new FunctionDecl(to!string(n),i,r)); }
}
//================================================
// Functions
//================================================
class FunctionDecl : Statement
{
// .. class variables ..
this(string n, Expressions i, Statements s)
{
this(n, new Identifiers(i), s);
}
this(string n, Expressions i, Expression r)
{
this(n, new Identifiers(i), r);
}
this(string n, Identifiers i, Statements s)
{
// .. implementation ..
}
this(string n, Identifiers i, Expression r)
{
// .. implementation ..
}
// .. other unrelated methods ..
override void execute()
{
if (Glob.currentModule !is null) parentModule = Glob.currentModule.name;
Glob.functions.set(name,this);
}
}
Now as for what Glob.functions.set(name,this); does :
Glob is an instance holding global definitions
function is the class instance dealing with defined functions (it comes with a FunctionDecl[] list
set simply does that : list ~= func;
P.S. I'm 99% sure it has something to do with this one : Super-weird issue triggering "Segmentation Fault", though I'm still not sure what went wrong this time...
I think the problem is that the C function is allocating the object, but D doesn't keep a reference. If FunctionDecl_new is called back-to-back in a tight memory environment, here's what would happen:
the first one calls, creating a new object. That pointer goes into the land of C, where the D GC can't see it.
The second one goes, allocating another new object. Since memory is tight (as far as the GC pool is concerned), it tries to run a collection cycle. It finds the object from (1), but cannot find any live pointers to it, so it frees it.
The C function uses that freed object, causing the segfault.
The segfault won't always run because if there's memory to spare, the GC won't free the object when you allocate the second one, it will just use its free memory instead of collecting. That's why omitting the writeln can get rid of the crash: the ~ operator allocates, which might just put you over the edge of that memory line, triggering a collection (and, of course, running the ~ gives the gc a chance to run in the first place. If you never GC allocate, you never GC collect either - the function looks kinda like gc_allocate() { if(memory_low) gc_collect(); return GC_malloc(...); })
There's three solutions:
Immediately store a reference in the FunctionDecl_new function in a D structure, before returning:
FunctionDecl[] fdReferences;
void* FunctionDecl_new(...) {
auto n = new FunctionDecl(...);
fdReferences ~= n; // keep the reference for later so the GC can see it
return cast(void*) n;
}
Call GC.addRoot on the pointer right before you return it to C. (I don't like this solution, I think the array is better, a lot simpler.)
Use malloc to create the object to give to C:
void* FunctionDecl_new(...) {
import std.conv : emplace;
import core.stdc.stdlib : malloc;
enum size = __traits(classInstanceSize, FunctionDecl);
auto memory = malloc(size)[0 .. size]; // need to slice so we know the size
auto ref = emplace!FunctionDecl(memory, /* args to ctor */); // create the object in the malloc'd block
return memory.ptr; // give the pointer to C
}
Then, of course, you ought to free the pointer when you know it is no longer going to be used, though if you don't, it isn't really wrong.
The general rule I follow btw is any memory that crosses language barriers for storage (usage is different) ought to be allocated similarly to what that language expects: So if you pass data to C or C++, allocate it in a C fashion, e.g. with malloc. This will lead to the least surprising friction as it gets stored.
If the object is just being temporarily used, it is fine to pass a plain pointer to it, since a temp usage isn't stored or freed by the receiving function so there's less danger there. Your reference will still exist too, if nothing else, on the call stack.

duck typing in D

I'm new to D, and I was wondering whether it's possible to conveniently do compile-time-checked duck typing.
For instance, I'd like to define a set of methods, and require that those methods be defined for the type that's being passed into a function. It's slightly different from interface in D because I wouldn't have to declare that "type X implements interface Y" anywhere - the methods would just be found, or compilation would fail. Also, it would be good to allow this to happen on any type, not just structs and classes. The only resource I could find was this email thread, which suggests that the following approach would be a decent way to do this:
void process(T)(T s)
if( __traits(hasMember, T, "shittyNameThatProbablyGetsRefactored"))
// and presumably something to check the signature of that method
{
writeln("normal processing");
}
... and suggests that you could make it into a library call Implements so that the following would be possible:
struct Interface {
bool foo(int, float);
static void boo(float);
...
}
static assert (Implements!(S, Interface));
struct S {
bool foo(int i, float f) { ... }
static void boo(float f) { ... }
...
}
void process(T)(T s) if (Implements!(T, Interface)) { ... }
Is is possible to do this for functions which are not defined in a class or struct? Are there other/new ways to do it? Has anything similar been done?
Obviously, this set of constraints is similar to Go's type system. I'm not trying to start any flame wars - I'm just using D in a way that Go would also work well for.
This is actually a very common thing to do in D. It's how ranges work. For instance, the most basic type of range - the input range - must have 3 functions:
bool empty(); //Whether the range is empty
T front(); // Get the first element in the range
void popFront(); //pop the first element off of the range
Templated functions then use std.range.isInputRange to check whether a type is a valid range. For instance, the most basic overload of std.algorithm.find looks like
R find(alias pred = "a == b", R, E)(R haystack, E needle)
if (isInputRange!R &&
is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{ ... }
isInputRange!R is true if R is a valid input range, and is(typeof(binaryFun!pred(haystack.front, needle)) : bool) is true if pred accepts haystack.front and needle and returns a type which is implicitly convertible to bool. So, this overload is based entirely on static duck typing.
As for isInputRange itself, it looks something like
template isInputRange(R)
{
enum bool isInputRange = is(typeof(
{
R r = void; // can define a range object
if (r.empty) {} // can test for empty
r.popFront(); // can invoke popFront()
auto h = r.front; // can get the front of the range
}));
}
It's an eponymous template, so when it's used, it gets replaced with the symbol with its name, which in this case is an enum of type bool. And that bool is true if the type of the expression is non-void. typeof(x) results in void if the expression is invalid; otherwise, it's the type of the expression x. And is(y) results in true if y is non-void. So, isInputRange will end up being true if the code in the typeof expression compiles, and false otherwise.
The expression in isInputRange verifies that you can declare a variable of type R, that R has a member (be it a function, variable, or whatever) named empty which can be used in a condition, that R has a function named popFront which takes no arguments, and that R has a member front which returns a value. This is the API expected of an input range, and the expression inside of typeof will compile if R follows that API, and therefore, isInputRange will be true for that type. Otherwise, it will be false.
D's standard library has quite a few such eponymous templates (typically called traits) and makes heavy use of them in its template constraints. std.traits in particular has quite a few of them. So, if you want more examples of how such traits are written, you can look in there (though some of them are fairly complicated). The internals of such traits are not always particularly pretty, but they do encapsulate the duck typing tests nicely so that template constraints are much cleaner and more understandable (they'd be much, much uglier if such tests were inserted in them directly).
So, that's the normal approach for static duck typing in D. It does take a bit of practice to figure out how to write them well, but that's the standard way to do it, and it works. There have been people who have suggested trying to come up with something similar to your Implements!(S, Interface) suggestion, but nothing has really come of that of yet, and such an approach would actually be less flexible, making it ill-suited for a lot of traits (though it could certainly be made to work with basic ones). Regardless, the approach that I've described here is currently the standard way to do it.
Also, if you don't know much about ranges, I'd suggest reading this.
Implements!(S, Interface) is possible but did not get enough attention to get into standard library or get better language support. Probably if I won't be the only one telling it is the way to go for duck typing, we will have a chance to have it :)
Proof of concept implementation to tinker around:
http://dpaste.1azy.net/6d8f2dc4
import std.traits;
bool Implements(T, Interface)()
if (is(Interface == interface))
{
foreach (method; __traits(allMembers, Interface))
{
foreach (compareTo; MemberFunctionsTuple!(Interface, method))
{
bool found = false;
static if ( !hasMember!(T, method) )
{
pragma(msg, T, " has no member ", method);
return false;
}
else
{
foreach (compareWhat; __traits(getOverloads, T, method))
{
if (is(typeof(compareTo) == typeof(compareWhat)))
{
found = true;
break;
}
}
if (!found)
{
return false;
}
}
}
}
return true;
}
interface Test
{
bool foo(int, double);
void boo();
}
struct Tested
{
bool foo(int, double);
// void boo();
}
pragma(msg, Implements!(Tested, Test)());
void main()
{
}

Resources