Finding string that matches specific requirements

Finding string that matches specific requirements - string

There is a function that should return true:
func accessible(agent string) bool {
a := strings.Split(agent, " ")
if len(a) != 3 { return false }
b := a[0]
c := a[1]
d := a[2]
x := strings.EqualFold(b, c)
y := b != strings.ToLower(c)
z := strings.Index(d, b+c) == 1 && len(d) == 5
return x && y && z
}
However I can't figure out which string input will match these requirements. Am I missing something?
PS: This is task #3 from gocode.io

agent must be 3 "words", 3 parts separated by spaces:
a := strings.Split(agent, " ")
if len(a) != 3 { return false }
1st and 2nd words must match case insensitive:
x := strings.EqualFold(b, c)
But not case sensitive:
y := b != strings.ToLower(c)
And 3rd word must contain the first 2 concatenated:
z := strings.Index(d, b+c) == 1 && len(d) == 5
Starting at index 1 (prepend with any character) and must contain 5 chars (5 bytes) (postpend to have 5 chars/bytes).
Example:
fmt.Println(accessible("A a _Aa__"))
Prints:
true

Related

Efficient reverse indexing of number (type string) in Go

Background
TLDR (and simplified): Given a string s, where s is any positive integer, reverse the order and summate each digit multiplied by it's new index (+1).
For example, the value returned from "98765" would be: (1x5) + (2x6) + (3x7) + (4x8) + (5*9)= 115.
My current working solution can be found here: Go playground. I'd like to know whether there's a better way of doing this, be it readability or efficiency. For example, I decided in favour of a count variable instead utilising i and len as it seemed clearer. I'm also not very familiar with int/string conversions but I'm assuming making use of strconv is required.
func reverseStringSum(s string) int {
total := 0
count := 1
for i := len(s) - 1; i >= 0; i-- {
char := string([]rune(s)[i])
num, _ := strconv.Atoi(char)
total += count * num
count++
}
return total
}

Here's an efficient way to solve the complete problem: sum("987-65") = 115. The complete problem is documented in your working solution link: https://go.dev/play/p/DJ1ZYYDFnfq.
package main
import "fmt"
func reverseSum(s string) int {
sum := 0
for i, j := len(s)-1, 0; i >= 0; i-- {
d := int(s[i]) - '0'
if 0 <= d && d <= 9 {
j++
sum += j * d
}
}
return sum
}
func main() {
s := "987-65"
sum := reverseSum(s)
fmt.Println(sum)
}
https://go.dev/play/p/bx7wfmtXaie
115
Since we are talkng about efficient Go code, we need some Go benchmarks.
$ go test reversesum_test.go -bench=. -benchmem
BenchmarkSumTBJ-8 4001182 295.8 ns/op 52 B/op 6 allocs/op
BenchmarkSumA2Q-8 225781720 5.284 ns/op 0 B/op 0 allocs/op
Your solution (TBJ) is slow.
reversesum_test.go:
package main
import (
"strconv"
"strings"
"testing"
)
func reverseSumTBJ(s string) int {
total := 0
count := 1
for i := len(s) - 1; i >= 0; i-- {
char := string([]rune(s)[i])
num, _ := strconv.Atoi(char)
total += count * num
count++
}
return total
}
func BenchmarkSumTBJ(b *testing.B) {
for n := 0; n < b.N; n++ {
rawString := "987-65"
stringSlice := strings.Split(rawString, "-")
numberString := stringSlice[0] + stringSlice[1]
reverseSumTBJ(numberString)
}
}
func reverseSumA2Q(s string) int {
sum := 0
for i, j := len(s)-1, 0; i >= 0; i-- {
d := int(s[i]) - '0'
if 0 <= d && d <= 9 {
j++
sum += j * d
}
}
return sum
}
func BenchmarkSumA2Q(b *testing.B) {
for n := 0; n < b.N; n++ {
rawString := "987-65"
reverseSumA2Q(rawString)
}
}
The reverse sum is part of a larger problem, computing a CAS Registry Number check digit.
package main
import "fmt"
// CASRNCheckDigit returns the computed
// CAS Registry Number check digit.
func CASRNCheckDigit(s string) string {
// CAS Registry Number
// https://en.wikipedia.org/wiki/CAS_Registry_Number
//
// The check digit is found by taking the last digit times 1,
// the preceding digit times 2, the preceding digit times 3 etc.,
// adding all these up and computing the sum modulo 10.
//
// The CAS number of water is 7732-18-5:
// the checksum 5 is calculated as
// (8×1 + 1×2 + 2×3 + 3×4 + 7×5 + 7×6)
// = 105; 105 mod 10 = 5.
//
// Check Digit Verification of CAS Registry Numbers
// https://www.cas.org/support/documentation/chemical-substances/checkdig
for i, sep := 0, 0; i < len(s); i++ {
if s[i] == '-' {
sep++
if sep == 2 {
s = s[:i]
break
}
}
}
sum := 0
for i, j := len(s)-1, 0; i >= 0; i-- {
d := int(s[i]) - '0'
if 0 <= d && d <= 9 {
j++
sum += j * d
}
}
return string(rune(sum%10 + '0'))
}
func main() {
var rn, cd string
// 987-65-5: Adenosine 5'-triphosphate disodium salt
// https://www.chemicalbook.com/CASEN_987-65-5.htm
rn = "987-65"
cd = CASRNCheckDigit(rn)
fmt.Println("CD:", cd, "\tRN:", rn)
// 732-18-5: Water
// https://www.chemicalbook.com/CASEN_7732-18-5.htm
rn = "7732-18-5"
cd = CASRNCheckDigit(rn)
fmt.Println("CD:", cd, "\tRN:", rn)
// 7440-21-3: Silicon
// https://www.chemicalbook.com/CASEN_7440-21-3.htm
rn = "7440-21-3"
cd = CASRNCheckDigit(rn)
fmt.Println("CD:", cd, "\tRN:", rn)
}
https://go.dev/play/p/VYh-5LuGpCn
BenchmarkCD-4 37187641 30.29 ns/op 4 B/op 1 allocs/op

Maybe this is more efficient
func reverseStringSum(s string) int {
total := 0
count := 1
for i := len(s) - 1; i >= 0; i-- {
num, _ := strconv.Atoi(string(s[i]))
total += count * num
count++
}
return total
}

Is there a better way to insert "|' into binary string rep to get this 10|000|001

Is there a better way to insert "|" into a string
given a binary string representation of decimal 200 = 11001000
this function returns a string = 11|001|000
While this function works, it seems very kludgy!! Why is it so
hard in GO to do a simple character insertion???
func (i Binary) FString() string {
a := strconv.FormatUint(i.Get(), 2)
y := make([]string, len(a), len(a)*2)
data := []rune(a)
r := []rune{}
for i := len(data) - 1; i >= 0; i-- {
r = append(r, data[i])
}
for j := len(a) - 1; j >= 0; j-- {
y = append(y, string(r[j]))
if ((j)%3) == 0 && j > 0 {
y = append(y, "|")
}
}
return strings.Join(y, "")
}

Depends on what you call better. I'd use regular expressions.
In this case, the complexity arises from inserting separators from the right. If we padded the string so that its length was a multiple of 3, we could insert the separator from the left. And we could easily use a regular expression to insert | before every three characters. Then, we can just strip off the leading | + padding.
func (i Binary) FString() string {
a := strconv.FormatUint(i.Get(), 2)
pad_req := len(a) % 3
padding := strings.Repeat("0", (3 - pad_req))
a = padding + a
re := regexp.MustCompile("([01]{3})")
a = re.ReplaceAllString(a, "|$1")
start := len(padding) + 1
if len(padding) == 3 {
// If we padded with "000", we want to remove the `|` before *and* after it
start = 5
}
a = a[start:]
return a
}
Snippet on the Go Playground

If performance is not critical and you just want a compact version, you may copy the input digits to output, and insert a | symbol whenever a group of 2 has been written to the output.
Groups are counted from right-to-left, so when copying the digits from left-to-right, the first group might be smaller. So the counter of digits inside a group may not necessarily start from 0 in case of the first group, but from len(input)%3.
Here is an example of it:
func Format(s string) string {
b, count := &bytes.Buffer{}, len(s)%3
for i, r := range s {
if i > 0 && count == i%3 {
b.WriteRune('|')
}
b.WriteRune(r)
}
return b.String()
}
Testing it:
for i := uint64(0); i < 10; i++ {
fmt.Println(Format(strconv.FormatUint(i, 2)))
}
fmt.Println(Format(strconv.FormatInt(1234, 2)))
Output (try it on the Go Playground):
0
1
10
11
100
101
110
111
1|000
1|001
10|011|010|010
If you have to do this many times and performance does matter, then check out my answer to the question: How to fmt.Printf an integer with thousands comma
Based on that a fast solution can be:
func Format(s string) string {
out := make([]byte, len(s)+(len(s)-1)/3)
for i, j, k := len(s)-1, len(out)-1, 0; ; i, j = i-1, j-1 {
out[j] = s[i]
if i == 0 {
return string(out)
}
if k++; k == 3 {
j, k = j-1, 0
out[j] = '|'
}
}
}
Output is the same of course. Try it on the Go Playground.

This is a partitioning problem. You can use this function:
func partition(s, separator string, pLen int) string {
if pLen < 1 || len(s) == 0 || len(separator) == 0 {
return s
}
buffer := []rune(s)
L := len(buffer)
pCount := L / pLen
result := []string{}
index := 0
for ; index < pCount; index++ {
_from := L - (index+1)*pLen
_to := L - index*pLen
result = append(result, string(buffer[_from:_to]))
}
if L%pLen != 0 {
result = append(result, string(buffer[0:L-index*pLen]))
}
for h, t := 0, len(result)-1; h < t; h, t = h+1, t-1 {
result[t], result[h] = result[h], result[t]
}
return strings.Join(result, separator)
}
And s := partition("11001000", "|", 3) will give you 11|001|000.
Here is a little test:
func TestSmokeTest(t *testing.T) {
input := "11001000"
s := partition(input, "|", 3)
if s != "11|001|000" {
t.Fail()
}
s = partition(input, "|", 2)
if s != "11|00|10|00" {
t.Fail()
}
input = "0111001000"
s = partition(input, "|", 3)
if s != "0|111|001|000" {
t.Fail()
}
s = partition(input, "|", 2)
if s != "01|11|00|10|00" {
t.Fail()
}
}

Verifying a Dafny method that shifts a region of an array

I'm using Dafny to make a delete method where you receive:
char array line
the length of the array l
a position at
the number of characters to delete p
First you delete the characters of line from at to at + p, and then you must move all the characters on the right of at + p to at.
For example, if you have [e][s][p][e][r][m][a], and at = 3, and p = 3, then the final result should be [e][s][p][a]
I'm trying to prove a postcondition that makes sense like:
ensures forall j :: (at<=j<l) ==> line[j] == old(line[j+p]);
To ensure that all chars from the right of at + p are in the new positions.
But Dafny outputs two errors:
index out of range 7 53
postcondition might not hold on this return path. 19 2
method delete(line:array<char>, l:int, at:int, p:int)
requires line!=null;
requires 0 <= l <= line.Length && p >= 0 && at >= 0;
requires 0 <= at+p <= l;
modifies line;
ensures forall j :: (at<=j<l) ==> line[j] == old(line[j+p]) ;
{
var tempAt:int := at;
var tempAt2:int := at;
var tempPos:int := at+p;
while(tempAt < at + p)
invariant at<=tempAt<=at + p;
{
line[tempAt] := ' ';
tempAt := tempAt + 1;
}
while(tempPos < line.Length && tempAt2 < at + p)
invariant at + p<=tempPos<=line.Length;
invariant at<=tempAt2<=at+p;
{
line[tempAt2] := line[tempPos];
tempAt2 := tempAt2 + 1;
line[tempPos] := ' ';
tempPos := tempPos + 1;
}
}
Here is the program on rise4fun

I don't think it is necessary to use quantifiers to express such postconditions. They are usually better expressed by slicing the array into sequences.
When you are trying to verify a loop you need to provide a loop invariant which is strong enough to imply the postcondition when combined with the negation of the loop condition.
A good strategy for picking a loop invariant is to use the method postcondition, but with the loop index substituted for the array length.
Your loop invariant also needs to be strong enough for the induction to work. In this case, you need to say not only how the loop modifies line, but also which parts of line remain the same in each iteration.
Solution on rise4fun.
// line contains string of length l
// delete p chars starting from position at
method delete(line:array<char>, l:nat, at:nat, p:nat)
requires line!=null
requires l <= line.Length
requires at+p <= l
modifies line
ensures line[..at] == old(line[..at])
ensures line[at..l-p] == old(line[at+p..l])
{
var i:nat := 0;
while(i < l-(at+p))
invariant i <= l-(at+p)
invariant at+p+i >= at+i
invariant line[..at] == old(line[..at])
invariant line[at..at+i] == old(line[at+p..at+p+i])
invariant line[at+i..l] == old(line[at+i..l]) // future is untouched
{
line[at+i] := line[at+p+i];
i := i+1;
}
}
Overwriting with spaces
If you want to overwrite the trailing part of the old string with spaces you can do this:
method delete(line:array<char>, l:nat, at:nat, p:nat)
requires line!=null
requires l <= line.Length
requires at+p <= l
modifies line
ensures line[..at] == old(line[..at])
ensures line[at..l-p] == old(line[at+p..l])
ensures forall i :: l-p <= i < l ==> line[i] == ' '
{
var i:nat := 0;
while(i < l-(at+p))
invariant i <= l-(at+p)
invariant at+p+i >= at+i
invariant line[..at] == old(line[..at])
invariant line[at..at+i] == old(line[at+p..at+p+i])
invariant line[at+i..l] == old(line[at+i..l]) // future is untouched
{
line[at+i] := line[at+p+i];
i := i+1;
}
var j:nat := l-p;
while(j < l)
invariant l-p <= j <= l
invariant line[..at] == old(line[..at])
invariant line[at..l-p] == old(line[at+p..l])
invariant forall i :: l-p <= i < j ==> line[i] == ' '
{
line[j] := ' ';
j := j+1;
}
}
Extended solution on rise4fun.

Split string by length in Golang

Does anyone know how to split a string in Golang by length?
For example to split "helloworld" after every 3 characters, so it should ideally return an array of "hel" "low" "orl" "d"?
Alternatively a possible solution would be to also append a newline after every 3 characters..
All ideas are greatly appreciated!

Make sure to convert your string into a slice of rune: see "Slice string into letters".
for automatically converts string to rune so there is no additional code needed in this case to convert the string to rune first.
for i, r := range s {
fmt.Printf("i%d r %c\n", i, r)
// every 3 i, do something
}
r[n:n+3] will work best with a being a slice of rune.
The index will increase by one every rune, while it might increase by more than one for every byte in a slice of string: "世界": i would be 0 and 3: a character (rune) can be formed of multiple bytes.
For instance, consider s := "世a界世bcd界efg世": 12 runes. (see play.golang.org)
If you try to parse it byte by byte, you will miss (in a naive split every 3 chars implementation) some of the "index modulo 3" (equals to 2, 5, 8 and 11), because the index will increase past those values:
for i, r := range s {
res = res + string(r)
fmt.Printf("i %d r %c\n", i, r)
if i > 0 && (i+1)%3 == 0 {
fmt.Printf("=>(%d) '%v'\n", i, res)
res = ""
}
}
The output:
i 0 r 世
i 3 r a <== miss i==2
i 4 r 界
i 7 r 世 <== miss i==5
i 10 r b <== miss i==8
i 11 r c ===============> would print '世a界世bc', not exactly '3 chars'!
i 12 r d
i 13 r 界
i 16 r e <== miss i==14
i 17 r f ===============> would print 'd界ef'
i 18 r g
i 19 r 世 <== miss the rest of the string
But if you were to iterate on runes (a := []rune(s)), you would get what you expect, as the index would increase one rune at a time, making it easy to aggregate exactly 3 characters:
for i, r := range a {
res = res + string(r)
fmt.Printf("i%d r %c\n", i, r)
if i > 0 && (i+1)%3 == 0 {
fmt.Printf("=>(%d) '%v'\n", i, res)
res = ""
}
}
Output:
i 0 r 世
i 1 r a
i 2 r 界 ===============> would print '世a界'
i 3 r 世
i 4 r b
i 5 r c ===============> would print '世bc'
i 6 r d
i 7 r 界
i 8 r e ===============> would print 'd界e'
i 9 r f
i10 r g
i11 r 世 ===============> would print 'fg世'

Here is another variant playground.
It is by far more efficient in terms of both speed and memory than other answers. If you want to run benchmarks here they are benchmarks. In general it is 5 times faster than the previous version that was a fastest answer anyway.
func Chunks(s string, chunkSize int) []string {
if len(s) == 0 {
return nil
}
if chunkSize >= len(s) {
return []string{s}
}
var chunks []string = make([]string, 0, (len(s)-1)/chunkSize+1)
currentLen := 0
currentStart := 0
for i := range s {
if currentLen == chunkSize {
chunks = append(chunks, s[currentStart:i])
currentLen = 0
currentStart = i
}
currentLen++
}
chunks = append(chunks, s[currentStart:])
return chunks
}
Please note that the index points to a first byte of a rune on iterating over a string. The rune takes from 1 to 4 bytes. Slicing also treats the string as a byte array.
PREVIOUS SLOWER ALGORITHM
The code is here playground. The conversion from bytes to runes and then to bytes again takes a lot of time actually. So better use the fast algorithm at the top of the answer.
func ChunksSlower(s string, chunkSize int) []string {
if chunkSize >= len(s) {
return []string{s}
}
var chunks []string
chunk := make([]rune, chunkSize)
len := 0
for _, r := range s {
chunk[len] = r
len++
if len == chunkSize {
chunks = append(chunks, string(chunk))
len = 0
}
}
if len > 0 {
chunks = append(chunks, string(chunk[:len]))
}
return chunks
}
Please note that these two algorithms treat invalid UTF-8 characters in a different way. First one processes them as is when second one replaces them by utf8.RuneError symbol ('\uFFFD') that has following hexadecimal representation in UTF-8: efbfbd.

Also needed a function to do this recently, see example usage here
func SplitSubN(s string, n int) []string {
sub := ""
subs := []string{}
runes := bytes.Runes([]byte(s))
l := len(runes)
for i, r := range runes {
sub = sub + string(r)
if (i + 1) % n == 0 {
subs = append(subs, sub)
sub = ""
} else if (i + 1) == l {
subs = append(subs, sub)
}
}
return subs
}

Here is another example (you can try it here):
package main
import (
"fmt"
"strings"
)
func ChunkString(s string, chunkSize int) []string {
var chunks []string
runes := []rune(s)
if len(runes) == 0 {
return []string{s}
}
for i := 0; i < len(runes); i += chunkSize {
nn := i + chunkSize
if nn > len(runes) {
nn = len(runes)
}
chunks = append(chunks, string(runes[i:nn]))
}
return chunks
}
func main() {
fmt.Println(ChunkString("helloworld", 3))
fmt.Println(strings.Join(ChunkString("helloworld", 3), "\n"))
}

An easy solution using regex
re := regexp.MustCompile((\S{3}))
x := re.FindAllStringSubmatch("HelloWorld", -1)
fmt.Println(x)
https://play.golang.org/p/mfmaQlSRkHe

I tried 3 version to implement the function, the function named "splitByWidthMake" is fastest.
These functions ignore the unicode but only the ascii code.
package main
import (
"fmt"
"strings"
"time"
"math"
)
func splitByWidthMake(str string, size int) []string {
strLength := len(str)
splitedLength := int(math.Ceil(float64(strLength) / float64(size)))
splited := make([]string, splitedLength)
var start, stop int
for i := 0; i < splitedLength; i += 1 {
start = i * size
stop = start + size
if stop > strLength {
stop = strLength
}
splited[i] = str[start : stop]
}
return splited
}
func splitByWidth(str string, size int) []string {
strLength := len(str)
var splited []string
var stop int
for i := 0; i < strLength; i += size {
stop = i + size
if stop > strLength {
stop = strLength
}
splited = append(splited, str[i:stop])
}
return splited
}
func splitRecursive(str string, size int) []string {
if len(str) <= size {
return []string{str}
}
return append([]string{string(str[0:size])}, splitRecursive(str[size:], size)...)
}
func main() {
/*
testStrings := []string{
"hello world",
"",
"1",
}
*/
testStrings := make([]string, 10)
for i := range testStrings {
testStrings[i] = strings.Repeat("#", int(math.Pow(2, float64(i))))
}
//fmt.Println(testStrings)
t1 := time.Now()
for i := range testStrings {
_ = splitByWidthMake(testStrings[i], 2)
//fmt.Println(t)
}
elapsed := time.Since(t1)
fmt.Println("for loop version elapsed: ", elapsed)
t1 = time.Now()
for i := range testStrings {
_ = splitByWidth(testStrings[i], 2)
}
elapsed = time.Since(t1)
fmt.Println("for loop without make version elapsed: ", elapsed)
t1 = time.Now()
for i := range testStrings {
_ = splitRecursive(testStrings[i], 2)
}
elapsed = time.Since(t1)
fmt.Println("recursive version elapsed: ", elapsed)
}

Not the most efficient, will work for most use-cases.
Go playground: https://play.golang.org/p/0JSqv3OMdCR
// splitBy splits a string s by int n.
func splitBy(s string, n int) []string {
var ss []string
for i := 1; i < len(s); i++ {
if i%n == 0 {
ss = append(ss, s[:i])
s = s[i:]
i = 1
}
}
ss = append(ss, s)
return ss
}
// test
s := "helloworld"
ss := splitBy(s, 3)
fmt.Println(ss)
# output
$ go run main.go
[hel low orl d]

Counting substring that begin with character 'A' and ends with character 'X'

PYTHON QN:
Using just one loop, how do I devise an algorithm that counts the number of substrings that begin with character A and ends with character X? For example, given the input string CAXAAYXZA there are four substrings that begin with A and ends with X, namely: AX, AXAAYX, AAYX, and AYX.
For example:
>>>count_substring('CAXAAYXZA')
4

Since you didn't specify a language, im doing c++ish
int count_substring(string s)
{
int inc = 0;
int substring_count = 0;
for(int i = 0;i < s.length();i++)
{
if(s[i] == 'A') inc++;
if(s[i] == 'X') substring_count += inc;
}
return substring_count;
}
and in Python
def count_substring(s):
inc = 0
substring_count = 0
for c in s:
if(c == 'A'): inc = inc + 1
if(c == 'X'): substring_count = substring_count + inc
return substring_count

First count number of "A" in the string
Then count "X" in the string
using
Public Function CountCharacter(ByVal value As String, ByVal ch As Char) As Integer
Dim cnt As Integer = 0
For Each c As Char In value
If c = ch Then cnt += 1
Next
Return cnt
End Function
then take each "A" as a start position and "X" as an end position and get the substring. Do this for each "X" and then start with second "A" and run that for "X" count times. Repeat this and you will get all the substrings starting with "A" and ending with "X".

Just another solution In python:
def count_substring(str):
length = len(str) + 1
found = []
for i in xrange(0, length):
for j in xrange(i+1, length):
if str[i] == 'A' and str[j-1] == 'X':
found.append(str[i:j])
return found
string = 'CAXAAYXZA'
print count_substring(string)
Output:
['AX', 'AXAAYX', 'AAYX', 'AYX']

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Finding string that matches specific requirements - string

Related

Efficient reverse indexing of number (type string) in Go

Is there a better way to insert "|' into binary string rep to get this 10|000|001

Verifying a Dafny method that shifts a region of an array

Split string by length in Golang

Counting substring that begin with character 'A' and ends with character 'X'

Categories

Resources