I have an array that holds strings of a maximum of 20 characters:
subtype c_string is string(1..20);
type string_array is array (natural range 1..100) of c_string;
When I try to assign a string to a position of string_array, I get the following error if the string is not exactly 20 characters long:
raised CONSTRAINT_ERROR : (...) length check failed
This is the line of code that causes the problem:
str_a: string_array;
(....)
str_a(n) := "stringToAssign"; -- Causes error
What would the best way to avoid this be?
Your c_string can’t hold up to 20 characters; it holds exactly 20 characters, hence the Constraint_Error.
You could use Ada.Strings.Bounded if it’s important to have an upper limit, or Ada.Strings.Unbounded if you don’t actually care.
In the bounded case, that’d be something like
package B_Strings is new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 20);
type String_Array is array (1 .. 100) of B_Strings.Bounded_String;
and then
Str_A : String_Array;
Str_A (N) := B_Strings.To_Bounded_String ("stringToAssign");
There’s more in the Ada Wikibook.
I'm writing a function that takes a string the user was asked to type and returns its last half. If you type Overflow, it should print the last half of the string, so in this case flow. If you type an odd number of characters, like Stack, it should print the last half of the string plus the middle letter, so in this case "ack".
Let me make it clearer (text in bold is user input):
Type a string that's not longer than 7 characters: Candy
The other half of the string is: ndy
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
function Split_String (S : in String) return String is
begin
Mid := 1 + (S'Length / 2);
return S(Mid .. S'Last);
end Split_String;
S : String(1 .. 7);
I : Integer;
begin
Put("Type a string that's no longer than 7 characters: ");
Get_Line(S, I);
Put(Split_String(S));
end Split;
Let me tell you how I've been thinking. I do a Get_Line to see how many characters the string contains. I then pass I to my subprogram to determine whether it's evenly divisible by two. If it is, the remainder is 0, so printing the middle character along with the other half of the string is not needed. In all other cases, I have to print the other half of the string plus the middle character. But now I've stumbled upon a big problem in my main program: I don't know how to print the other half of a string. If a string contains 4 characters I can just write Put(S(3 .. 4)); but I don't know a general formula for this. Help is appreciated! :) Have a good day!
You need a more general approach to your problem. Also, try to understand how Get_Line works for you.
For example, if you declare an input string with a large size such as
Input : String (1..1024);
You will have a string large enough to work with any likely input values.
Next, you need a variable to indicate how many characters were actually read by Get_Line.
Length : Natural;
The data returned by Get_Line will then be in the slice of the input string designated as
Input (1 .. Length);
Pass that slice to your function to return the second half of the string.
function last_half(S : string) return string;
last_half(Input(1..Length));
Now all you need is to calculate the last half of the string passed to the function last_half. The function will output a slice of the string passed to it. To find the first index of the last half of the input string you must perform the calculation
mid : Positive := 1 + (S'length / 2);
Then simply return the string S(mid .. S'Last).
It appears that the goal of this exercise is to learn how to use array slices. Concentrate on how slices work for you in the problem and the solution will be very simple.
One possible solution is
with Ada.Text_IO; use Ada.Text_IO;
procedure Main is
Input : String (1 .. 1_024);
Length : Natural;
function last_half (S : in String) return String is
Mid : Positive := 1 + (S'Length / 2);
begin
return S (Mid .. S'Last);
end last_half;
begin
Put ("Enter a string: ");
Get_Line (Input, Length);
Put_Line (Input (1 .. Length) & " : " & last_half (Input (1 .. Length)));
end Main;
Study how the solution uses array slices on the return value of Get_Line, on the parameter of the function last_half, and in its return statement. It is also important to remember that the type String is defined as an unconstrained array of Character. This means that every slice of a string is also a string.
type String is array ( Positive range <> ) of Character;
Aside from being an untidy mess, your latest code edit (as of 20:11 GMT on 15 Nov 2021) doesn’t even compile. Please don’t show us code like this! (unless, of course, that’s the problem).
I’d like to strongly suggest this alternate way of inputting strings:
declare
S : constant String := Get_Line;
begin
-- do things with S, which is exactly as long as
-- the input you typed: no undefined characters at
-- the end to confuse the result, no need to worry
-- about overrunning an input buffer
end;
With this change, and obvious syntactic changes, your current code will do what you want.
Can someone explain why I get a different capacity when converting the same string to []rune?
Take a look at this code
package main
import (
"fmt"
)
func main() {
input := "你好"
runes := []rune(input)
fmt.Printf("len %d\n", len(input))
fmt.Printf("len %d\n", len(runes))
fmt.Printf("cap %d\n", cap(runes))
fmt.Println(runes[:3])
}
Which returns
len 6
len 2
cap 2
panic: runtime error: slice bounds out of range [:3] with capacity 2
But when commenting out the fmt.Println(runes[:3]) line, it returns:
len 6
len 2
cap 32
See how the []rune capacity has changed in main from 2 to 32. How? Why?
If you want to test => Go playground
The capacity may be anything, as long as the slice resulting from the conversion contains the runes of the input string. This is the only thing the spec requires and guarantees. The compiler may decide to use a lower capacity if you pass the slice to fmt.Println(), as this signals that the slice may escape. Again, the decision made by the compiler is out of your hands.
Escape means the value may escape from the function, and as such, it must be allocated on the heap (and not on the stack), because the stack may get destroyed / overwritten once the function returns, and if the value "escapes" from the function, its memory area must be retained as long as there is a reference to the value. The Go compiler performs escape analysis, and if it can't prove a value does not escape the function it's declared in, the value will be allocated on the heap.
See related question: Calculating sha256 gives different results after appending slices depending on if I print out the slice before or not
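Since the capacity after the conversion is unspecified, code should rely only on len. If a predictable capacity is needed, one option is a full slice expression that clamps the capacity to the length. The following is a minimal sketch; exactRunes is a hypothetical helper name, not something from the standard library.

```go
package main

import "fmt"

// exactRunes (hypothetical helper) converts s to a rune slice and clamps
// the capacity to the length with a full slice expression, so the result
// does not depend on whatever capacity the conversion happened to pick.
func exactRunes(s string) []rune {
	r := []rune(s)
	return r[:len(r):len(r)]
}

func main() {
	r := exactRunes("你好")
	fmt.Println(len(r), cap(r)) // 2 2

	// Appending beyond the clamped capacity allocates a new backing array.
	r = append(r, '!')
	fmt.Println(string(r))
}
```

With the capacity clamped, runes[:3] panics consistently whether or not the slice is later passed to fmt.Println.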
The reason the string and the []rune return different results from len is that len counts different things: len(string) returns the length in bytes (which may be more than the number of characters, for multi-byte characters), while len([]rune) returns the length of the rune slice, which is the number of Unicode code points (generally the number of characters).
This blog post goes into detail how exactly Go treats text in various forms: https://blog.golang.org/strings
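To illustrate the difference between the two counts, here is a small sketch. byteAndRuneCount is a hypothetical helper; utf8.RuneCountInString is the standard library function that counts runes without building a []rune.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount (hypothetical helper) returns the length of s in bytes
// and the number of runes (Unicode code points) it contains.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	b, r := byteAndRuneCount("你好")
	fmt.Println(b, r) // 6 2
}
```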
I believe there are no LeftStr(str,n) (take at most n first characters), RightStr(str,n) (take at most n last characters) and SubStr(str,pos,n) (take first n characters after pos) functions in Go, so I tried to write my own:
// take at most n first characters
func Left(str string, num int) string {
if num <= 0 {
return ``
}
if num > len(str) {
num = len(str)
}
return str[:num]
}
// take at most last n characters
func Right(str string, num int) string {
if num <= 0 {
return ``
}
max := len(str)
if num > max {
num = max
}
num = max - num
return str[num:]
}
But I believe those functions will give incorrect output when the string contains Unicode characters. What's the fastest solution for those functions? Is using a for range loop the only way?
As already mentioned in the comments, combining characters, modifying runes, and other multi-rune "characters" can cause difficulties.
Anyone interested in Unicode handling in Go should probably read the Go Blog articles "Strings, bytes, runes and characters in Go" and "Text normalization in Go".
In particular, the latter talks about the golang.org/x/text/unicode/norm package, which can help in handling some of this.
You can consider several increasingly accurate (or increasingly Unicode-aware) levels of splitting the first (or last) "n characters" from a string.
1. Just use n bytes.
This may split in the middle of a rune but is O(1), is very simple, and in many cases you know the input consists of only single byte runes.
E.g. str[:n].
2. Split after n runes.
This may split in the middle of a character. This can be done easily, but at the expense of copying and converting with just string([]rune(str)[:n]).
You can avoid the conversion and copying by using the unicode/utf8 package's DecodeRuneInString (and DecodeLastRuneInString) functions to get the length of each of the first n runes in turn and then return str[:sum] (O(n), no allocation).
3. Split after the n'th "boundary".
One way to do this is to use norm.NFC.FirstBoundaryInString(str) repeatedly or norm.Iter to find the byte position to split at and then return str[:pos].
Consider the displayed string "cafés", which could be represented in Go code as "cafés", "caf\u00E9s", or "caf\xc3\xa9s", all of which result in the identical six bytes. Alternatively, it could be represented as "cafe\u0301s" or "cafe\xcc\x81s", which both result in the identical seven bytes.
The first "method" above may split those into "caf\xc3"+"\x a9s" and "cafe\xcc"+"\x81s".
The second may split them into "caf\u00E9"+"s" ("café"+"s") and "cafe"+"\u0301s" ("cafe"+"́s").
The third should split them into "caf\u00E9"+"s" and "cafe\u0301"+"s" (both shown as "café"+"s").
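The allocation-free variant of the second level (split after n runes) can be sketched as follows. LeftRunes and RightRunes are hypothetical names chosen for this illustration; they use only utf8.DecodeRuneInString and utf8.DecodeLastRuneInString from the standard library, and they return the whole string when it has fewer than n runes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// LeftRunes (hypothetical name) returns at most the first n runes of s.
// It walks the string rune by rune and slices the original, so it does
// not allocate.
func LeftRunes(s string, n int) string {
	end := 0
	for i := 0; i < n && end < len(s); i++ {
		_, size := utf8.DecodeRuneInString(s[end:])
		end += size
	}
	return s[:end]
}

// RightRunes (hypothetical name) returns at most the last n runes of s,
// walking backwards from the end of the string.
func RightRunes(s string, n int) string {
	start := len(s)
	for i := 0; i < n && start > 0; i++ {
		_, size := utf8.DecodeLastRuneInString(s[:start])
		start -= size
	}
	return s[start:]
}

func main() {
	fmt.Println(LeftRunes("caf\u00e9s", 4)) // café
	fmt.Println(RightRunes("你好吗", 2))       // 好吗

	// Rune-level splitting still separates a base letter from its
	// combining accent: the result here is the four ASCII bytes "cafe".
	fmt.Printf("%q\n", LeftRunes("cafe\u0301s", 4))
}
```

This never cuts a rune in half, but as the last call shows, it can still split a multi-rune character; that case needs the boundary-based approach from golang.org/x/text/unicode/norm.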
I want to make a constant string that looks like:
My_Null_String : constant String(1 .. 50) := "NULL***********************";
with all of the *s being Ascii.Nul characters. It is not possible to do this via the others keyword, as in:
My_Null_String : constant String(1 .. 50) := "NULL" & (others => Ascii.Nul);
Is there an elegant solution to this that doesn't involve a huge block of Ascii.Nul characters to fill out the rest of my string?
Thanks
My_Null_String : constant String(1 .. 50) := "NULL" & (5 .. 50 => ASCII.NUL);
The problem with your original attempt is that in order to evaluate
(others => ASCII.NUL)
the program has to have a way to determine the bounds. It doesn't, and it's not smart enough to make calculations such as figuring out that this is formed by concatenating two strings and therefore we can figure out that the bounds should be whatever is left over after the first string is evaluated. The language would have to make a special case just for this (array concatenation), and it doesn't.
How about:
My_Null_String : constant String(1 .. 50) := ('N','U','L','L', others => ASCII.Nul);
I only started Go today, so this may be obvious but I couldn't find anything on it.
What does var x uint64 = 0x12345678; y := string(x) give y?
I know var x uint8 = 65; y := string(x) would give y the byte 65, character A, and common sense would suggest (since types larger than uint8 are allowed to be cast to strings) that they would simply be packed into native byte order (i.e. little endian) and assigned to the variable.
This does not seem to be the case:
hex.EncodeToString([]byte(y)) ==> "efbfbd"
First thought says this is an address with the last byte being left off because of some weird null terminator thingy, but if I allocate two x and y variables with two different values and print them out I get the same result.
var x, x2 uint64 = 0x10000000, 0x20000000
y, y2 := string(x), string(x2)
fmt.Println(hex.EncodeToString([]byte(y))) // "efbfbd"
fmt.Println(hex.EncodeToString([]byte(y2))) // "efbfbd"
Maddeningly I can't find the implementation for the string type anywhere although I probably haven't looked hard enough.
This is covered in the Spec: Conversions: Conversions to and from a string type:
Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD".
So effectively when you convert a numeric value to string, it can only yield a string having one rune (character). And since Go stores strings as the UTF-8 encoded byte sequences in memory, that is what you will see if you convert your string to []byte:
Converting a value of a string type to a slice of bytes type yields a slice whose successive elements are the bytes of the string.
When you try to convert the 0x12345678, 0x10000000 and 0x20000000 values to string, since they are outside of the range of valid Unicode code points, as per the spec they are converted to "\uFFFD", which in UTF-8 encoding is []byte{239, 191, 189}; when encoded to a hex string:
fmt.Println(hex.EncodeToString([]byte("\uFFFD"))) // Output: efbfbd
Or simply:
fmt.Printf("%x", "\uFFFD") // Output: efbfbd
Read the blog post Strings, bytes, runes and characters in Go for more details about string internals.
And btw since Go 1.5 the Go runtime is implemented (mostly) in Go, so these conversions are now implemented in Go and can be found in the runtime package: runtime/string.go, look for the intstring() function.