How to detect an integer overflow while parsing a string in Kotlin? - string

In Java, the language designers used suffixes such as L/l for long literals to make parsing easier and to distinguish Int from Long constants. I am writing a deserialisation library for TOML and received a request from a user to support parsing not only to Long, but also to Int, depending on the string input.
So during parsing I need to work out whether the number in the input string fits in a Byte, Short, Int or Long, and select the proper type automatically.
This leads me to a question: is there any library in Kotlin (my library is MPP) that can help me check whether an overflow occurred, the way Math does in Java? I am pretty sure there should be some obvious library or algorithm for this, so I do not want to implement yet another one of my own.
If you propose not a library but a good algorithm for determining the type of an integer from its string input, that is also fine with me, though a well-known algorithm would be better.
We should also not forget about the unsigned integer types, which exist in Kotlin but are missing in Java.

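You can parse the digits yourself and accumulate the value. Int arithmetic in Kotlin wraps around silently, so once the value exceeds Int.MAX_VALUE it typically becomes negative, and a negative result tells you there was an overflow. Be aware that multiplying by 10 can wrap all the way back to a non-negative value (for example, accumulating "4294967296" ends at 0), so checking the bound before each step is more reliable than checking the sign afterwards.
Int.MAX_VALUE + 1 // wraps around to Int.MIN_VALUE, a negative value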
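A minimal sketch of the accumulate-and-check idea, in case it helps. The function name parseIntChecked is made up for this example and is not from any library. It accumulates the value as a negative number (the negative range is one larger than the positive one) and checks the bound before every multiplication and subtraction, so nothing ever wraps.

```kotlin
// Hypothetical helper (not a library function): parses an optionally signed
// decimal string into an Int, returning null on overflow or malformed input.
fun parseIntChecked(text: String): Int? {
    if (text.isEmpty()) return null
    val negative = text[0] == '-'
    val start = if (negative || text[0] == '+') 1 else 0
    if (start == text.length) return null
    // Accumulate as a negative number so that Int.MIN_VALUE needs no special case.
    val limit = if (negative) Int.MIN_VALUE else -Int.MAX_VALUE
    var result = 0
    for (i in start until text.length) {
        val digit = text[i] - '0'
        if (digit !in 0..9) return null
        // Bound check before multiplying: result * 10 would pass the limit.
        if (result < limit / 10) return null
        result *= 10
        // Bound check before subtracting: result - digit would pass the limit.
        if (result < limit + digit) return null
        result -= digit
    }
    return if (negative) result else -result
}
```

With this, parseIntChecked("2147483647") gives Int.MAX_VALUE, while "2147483648" and "4294967296" both give null. The same pattern should carry over to Long by swapping the type and the limits.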
You can also try this library for parsing: https://github.com/tiksem/KotlinSpirit
val errorCode = int.compile().parseWithResult("3453534435543543345345").errorCode
Here errorCode will be ParseCode.INT_OUT_OF_BOUNDS.
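For picking the narrowest type automatically, one option that needs no extra dependency is to chain the standard library's toByteOrNull, toShortOrNull, toIntOrNull, toLongOrNull and toULongOrNull, which return null when the value does not fit. This is only a sketch: the name parseNarrowest and the choice of ULong as the last fallback are assumptions for illustration, and TOML-specific syntax such as underscores or hex prefixes is not handled.

```kotlin
// Hypothetical helper: returns the value in the narrowest type that can hold
// it (Byte, Short, Int, Long, then ULong), or null if nothing fits.
// The return type is Any? because ULong is not a subtype of Number.
fun parseNarrowest(text: String): Any? =
    text.toByteOrNull()
        ?: text.toShortOrNull()
        ?: text.toIntOrNull()
        ?: text.toLongOrNull()
        ?: text.toULongOrNull()
```

For example, "300" comes back as a Short, "3000000000" as a Long, and "3453534435543543345345" as null because it fits none of the types.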

Related

How can I bit-convert between Int and Word quickly?

The Haskell base documentation says that "A Word is an unsigned integral type, with the same size as Int."
How can I take an Int and cast its bit representation to a Word, so I get a Word value with the same bit representation as the original Int (even though the number values they represent will be different)?
I can't use fromIntegral because that will change the bit representation.
I could loop through the bits with the Bits class, but I suspect that will be very slow - and I don't need to do any kind of bit manipulation. I want some kind of function that will be compiled down to a no-op (or close to it), because no conversion is done.
Motivation
I want to use IntSet as a fast integer set implementation - however, what I really want to store in it are Words. I feel that I could create a WordSet which is backed by an IntSet, by converting between them quickly. The trouble is, I don't want to convert by value, because I don't want to truncate the top half of Word values: I just want to keep the bit representation the same.
int2Word#/word2Int# in GHC.Prim perform bit casting. You can implement wrapper functions which cast between boxed Int/Word using them easily.

perl6 "P6opaque, Str" vs simple "Str" types

I was trying to obtain a list from user input doing my usual codes, but sometimes it fails unpredictably due to this error:
This type cannot unbox to a native integer: P6opaque, Str
The code line is
my @a = prompt("Enter list: ").words || (1,2,3);
It fails only if I enter a single number.
When is a Str converted to "P6opaque, Str" without user awareness? I cannot use +@a[0] or @a[0].Int to convert this "P6opaque, Str" to an Int. What am I missing here?
TL;DR The mention of P6opaque is mostly a red herring. Some code is trying to assign a string to an int. You'll need to coerce it to an Int first. I know you've tried that. All that's left is to find out where it needs to be done. Hopefully this answer will guide us there.
You can only assign an integer to an integer variable
It's an error to assign a string to an Int or an int:
my Int $a = '1'; # Type check failed ... expected Int but got Str
my int $a = '1'; # This type cannot unbox to a native integer: P6opaque, Str
The error on assigning to an Int is caught by high level machinery which responds with a high level error message. For int it's low level machinery which responds with a low level message. We'll take a closer look at this difference below, but it's a red herring as far as fixing your problem is concerned.
To fix this problem you'll need to find where a string is being assigned or bound to a variable with a native integer type constraint like int and then coerce before the assignment with something like this:
my int $a = +'1' # Works
I know you've tried something like that. I don't know why it hasn't worked because you haven't yet shared the part of your code that's causing the problem.
Finding the problem
There must be some use of a native integer that's either directly in your code (i.e. you explicitly specified a native integer type, an all lowercase type like int, int32, uint etc.) or in some code your code uses.
So, search your code first.
If you still haven't found it, then please share enough of your code that we can reproduce the problem, preferably after reading StackOverflow's freshly Named/URLed page How to create a Minimal, Reproducible Example. TIA.
Red herring or LTA?
“P6opaque, Str” vs simple “Str” types
They're the same. P6opaque, Str is a reference to exactly the same type as Str.
When is a Str converted to "P6opaque, Str" without user awareness?
It isn't.
Quoting is repr and native representations:
P6opaque is the default representation used for all objects in Perl 6.
A representation is a set of rules for representing a type in a computer's memory.
Errors related to P6 objects are generally handled by the high level "front end" of the P6 language/compiler. High level error messages don't mention representations because most ordinary P6 objects have the same one (P6opaque), and even when they don't, the representation still won't be relevant.
But here we're dealing with an error handled by MoarVM.
MoarVM's error messages don't mention the representation if it's deemed irrelevant. For example:
my int64 $a = 2⁶³
displays a MoarVM exception with an error message about the bigint type whose representation is P6bigint:
Cannot unbox 64 bit wide bigint into native integer
This error message doesn't mention the representation (P6bigint).
But the MoarVM response to trying to put anything other than an integer into a native integer is a MoarVM exception which does mention the representation. For example, if you attempt to assign an Str it's:
This type cannot unbox to a native integer: P6opaque, Str
If someone doesn't know about representations, this message is a bit opaque, aka LTA. But while removing the representation removes the confusion, it also removes information that might be important:
This type cannot unbox to a native integer: Str
I'm not convinced that that is actually better and/or worthwhile but if you feel strongly about it, feel free to file a MoarVM bug about this with an LTA tag.

Public fixed-length Strings

I am just summarizing info about implementing a digital tree (Trie) in VBA. I am not asking how to do that so please do not post your solutions - my specific question regarding fixed-length Strings in class modules comes at the end of this post.
A Trie is all about efficiency and performance, therefore most other programming languages use a Char data type to represent members of TrieNodes. Since VBA does not have a Char datatype, I was thinking about faking it and using a fixed-length String with 1 character.
Note: I can come up with a work-around to this, i.e. use Byte and a simple function to convert between Chr() and Asc(), or an Enum, or declare a private str as String * 1 and take advantage of get/let properties, but that's not the point. Stay tuned though because...
According to Public Statement on Microsoft Help Page you can't declare a fixed-length String variable in class modules.
I can't find any reasonable explanation for this constraint.
Can anyone give some insight why such a restriction applies to fixed-length Strings in class modules in VBA?
The VBA/VB6 runtime is heavily reliant on the COM system (oleaut32 et al) and this enforces some rules.
You can export a class file between VB "stuff", but if you publish (or could theoretically publish) it as a COM object, it must be able to describe a "fixed length string" in its interface description/type library so that, say, a C++ client can consume it.
A fixed length string is "special" because it has active behaviour, i.e. it's not a dumb datatype; it behaves somewhat like a class. For example, it's always padded: if you assign to it, it will have trailing spaces, and in VBA the compiler adds generated code to get that behaviour. A C++ consumer would be unaware of the fixed-length nature of the string because the interface can't describe it/does not support a corresponding type (a String is a BSTR), which could lead to problems.
Strings are of type BSTR and like a byte array you would still lose the padding semantics if you used one of those instead.

Fortran77 Casting integer to a String

Out of interest, I was trying to find a way to cast an integer to a String in Fortran77.
I came across CHAR(I), but this converts the ASCII index I into the character in that position.
Is there a way to just simply cast an integer to a String in Fortran77?
How about vice versa?
The Fortran way is to write the value of the integer into a string variable; this operation is known as an internal write. I'm heading out the door now so won't check this, and I have an ethical objection to writing FORTRAN77 or helping anyone else write it, so I make no guarantee that the following doesn't contain bits of more modern Fortran.
First declare a character variable to receive the integer
character(len=12) :: int_as_string
then write the integer into it as you would normally write an integer to any other channel such as stdout
write(int_as_string,'(i12.12)') my_int
I expect you'll want to set the format for writing the integer to something that suits you better.

How to have a bigint hash for a string

We have an alphanumeric string (up to 32 characters) and we want to transform it to an integer (bigint). Now we're looking for an algorithm to do that. Collisions aren't bad (that is why we use a bigint, to reduce them a little). The important thing is that the calculated integers are evenly distributed over the bigint range and that the calculated integer is always the same for a given string.
This page has a few. You'll need to port to 64bit, but that should be trivial. A C# port of the SDBM hash is here. Another page of hash functions here
Most programming languages come with a built-in construct or a standard library call to do this. Without knowing the language, I don't think anyone can help you.
Yes, a "hash" should be the right description for my problem. I know that there is CRC32, but it only provides a 32-bit int (in PHP) and these 32-bit integers are at most 10 characters long, so a huge range of integer numbers is unused!?
Mostly, we have a short string like "PX38IEK" or a 36-character UUID like "24868d36-a150-11df-8882-d8d385ffc39c", so the strings are arbitrary, yes.
It doesn't have to be reversible (so collisions aren't bad). It also doesn't matter what int a string is converted to; my only wish is that the full bigint range is used as well as possible.
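As an illustration of what such a 64-bit port could look like, here is a sketch of FNV-1a in its 64-bit variant, written in Kotlin. The thread does not name this algorithm; it is swapped in here only as one well-known non-cryptographic string hash, and the function name fnv1a64 is made up for the example. The result is deterministic for a given string and uses all 64 bits, so it can go straight into a signed bigint column.

```kotlin
// Sketch of 64-bit FNV-1a over the UTF-8 bytes of the string.
// Returned as a signed Long, so roughly half of the values are negative.
fun fnv1a64(text: String): Long {
    var hash = 0xcbf29ce484222325uL        // FNV 64-bit offset basis
    for (b in text.encodeToByteArray()) {
        // XOR in the byte (masked to 0..255), then multiply by the FNV prime.
        hash = hash xor (b.toULong() and 0xffuL)
        hash *= 0x100000001b3uL            // FNV 64-bit prime; ULong math wraps
    }
    return hash.toLong()
}
```

Calling fnv1a64("PX38IEK") twice returns the same Long both times. How evenly the outputs spread for your actual keys is something to measure rather than assume.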
