How to remove the first n characters from a string in Elixir? - string

I have a list of strings. Each of those strings starts with n characters I want to get rid of.
I can't use something like "123" <> new_string = old_string because the characters can be anything.
So I'd like to do something like this:
my_list |> Enum.map(fn(str) ->
  # Remove the n leading characters from str
end)
Do you know how I could achieve this?

You can use String.slice/2 to remove the first N graphemes of a string, and binary_part/3 or pattern matching to remove the first N bytes of a string.
Setup:
iex(1)> a = "abc"
"abc"
iex(2)> b = "πr²"
"πr²"
Removing the first 2 graphemes of a string:
iex(3)> String.slice(a, 2..-1)
"c"
iex(4)> String.slice(b, 2..-1)
"²"
Removing the first 2 bytes of a string:
iex(5)> binary_part(a, 2, byte_size(a) - 2)
"c"
iex(6)> binary_part(b, 2, byte_size(b) - 2)
"r²"
iex(7)> remove = 2
2
iex(8)> <<_::binary-size(remove), rest::binary>> = a; rest
"c"
iex(9)> <<_::binary-size(remove), rest::binary>> = b; rest
"r²"
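For comparison, the same distinction between characters and bytes can be sketched in Python. This is only an illustration and not part of the Elixir answer: the helper names drop_chars and drop_bytes are hypothetical, and Python slices strings by code point rather than by grapheme, so it agrees with String.slice only when every grapheme is a single code point.

```python
def drop_chars(s: str, n: int) -> str:
    """Drop the first n code points of s (hypothetical helper)."""
    return s[n:]


def drop_bytes(s: str, n: int) -> str:
    """Drop the first n bytes of the UTF-8 encoding of s (hypothetical helper).

    Raises UnicodeDecodeError if the cut lands inside a multi-byte character.
    """
    return s.encode("utf-8")[n:].decode("utf-8")


print(drop_chars("πr²", 2))  # ²
print(drop_bytes("πr²", 2))  # r²
```

As in the IEx session, "π" takes two bytes in UTF-8, so dropping two bytes removes only that one character.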

Another alternative:
defmodule StringExtensions do
  def remove_first_n_chars(s, n) do
    {_, new_string} = s |> String.codepoints() |> Enum.split(n)
    new_string |> Enum.join()
  end
end
Which would then be used like so:
l = ["abcdefg","hijklmno","pqrstuv"]
l2 = l |> Enum.map(fn str -> StringExtensions.remove_first_n_chars(str,2) end) # l2 -> ["cdefg", "jklmno", "rstuv"]
Just wanted to offer a potential alternative, FWIW.

Related

How to convert the first char to lower case

I'm playing with strings and have values like "Hello.Word" or "stackOver.Flow",
and I want to convert the first char of each part to lower case: "hello.word" and "stackOver.flow".
For snake case it is easy: we only need to change upper case to lower case and add '_',
but for camelCase (with the first char in lower case) I don't know how to do this.
open System
let convertToSnakeCase (value:string) =
    String [|
        Char.ToLower value.[0]
        for ch in value.[1..] do
            if Char.IsUpper ch then '_'
            Char.ToLower ch |]
Who can help?
module Identifier =
    open System
    let changeCase (str : string) =
        if String.IsNullOrEmpty(str) then str
        else
            let isUpper = Char.IsUpper
            let n = str.Length
            let builder = new System.Text.StringBuilder()
            let append (s:string) = builder.Append(s) |> ignore
            let rec loop i j =
                let k =
                    if i = n || (isUpper str.[i] && (not (isUpper str.[i - 1]) ||
                                                     ((i + 1) <> n && not (isUpper str.[i + 1]))))
                    then
                        if j = 0 then
                            append (str.Substring(j, i - j).ToLower())
                        elif (i - j) > 2 then
                            append (str.Substring(j, 1))
                            append (str.Substring(j + 1, i - j - 1).ToLower())
                        else
                            append (str.Substring(j, i - j))
                        i
                    else
                        j
                if i = n then builder.ToString()
                else loop (i + 1) k
            loop 1 0
    type System.String with
        member x.ToCamelCase() = changeCase x
printfn "%s" ("StackOver.Flow".ToCamelCase()) //stackOver.Flow
//need stackOver.flow
I suspect there are much more elegant and concise solutions. I sense you are learning functional programming, so I think it's best to do this with a recursive function rather than some magic library function. I notice that your question does use a recursive function, but it also indexes into an array. Lists and recursive functions work together much more easily than arrays do, so a recursive solution is usually simpler when it works on a list.
I'd also avoid using a string builder. Assuming you are learning FP, string builders are imperative, and whilst they obviously work, they won't help you get your head around using immutable data.
The key then is to use pattern matching to match the scenario that should trigger the lower-case logic, as it depends on 2 consecutive characters.
I think you want this to happen for the 1st char, and after a '.'?
(I've inserted a '.' as the 1st char so the recursive function only has to handle the '.' scenario, rather than making a special case for the start of the string.)
open System
let convertToCamelCase (value : string) =
    let rec convertListToCamelCase (value : char list) =
        match value with
        | [] -> []
        | '.' :: second :: rest ->
            '.' :: convertListToCamelCase (Char.ToLower second :: rest)
        | c :: rest ->
            c :: convertListToCamelCase rest
    // put a '.' on the front to simplify the logic (and take it off after)
    let convertAsList = convertListToCamelCase ('.' :: (value.ToCharArray() |> Array.toList))
    String ((convertAsList |> List.toArray).[1..])
The piece to worry about is the recursive piece; the rest of it is just flipping an array to a list and back again.
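The rule described above (lower-case the first char, and the char after each '.') can also be sketched without recursion. The following is a minimal Python illustration under that assumption; the function name lower_segment_starts is hypothetical and is not taken from either answer.

```python
def lower_segment_starts(value: str) -> str:
    """Lower-case the first character of every '.'-separated segment."""
    segments = value.split(".")
    # seg[:1] is safe for empty segments (e.g. "a..b"), unlike seg[0]
    return ".".join(seg[:1].lower() + seg[1:] for seg in segments)


print(lower_segment_starts("Hello.Word"))      # hello.word
print(lower_segment_starts("StackOver.Flow"))  # stackOver.flow
```

Splitting on '.' plays the same role as prepending a '.' in the recursive version: every segment start is handled by one rule, with no special case for the beginning of the string.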

How to split string by odd length

Let's say I have a string = "AABBAAAAABBBBAAABBBBAA".
I want to split the string after each run of A's that has an odd length (i.e. when the run of A's has length 5 or 3).
What I want returned is 1) AABBAAAAA 2) BBBBAAA 3) BBBBAA.
How can I do that?
I tried using the regex [A]+[B]+ for a slightly different case.
One option might be to regex iterate using re.finditer with the following pattern:
.*?(?:AAA(?:AA)?|$)
This pattern will non-greedily consume until reaching either 3 A's, 5 A's, or the end of the string. Then, we can print out each complete match as we iterate.
import re

text = 'AABBAAAAABBBBAAABBBBAA'
pattern = r'.*?(?:AAA(?:AA)?|$)'
for match in re.finditer(pattern, text):
    if match.group():  # skip the empty match found at the very end of the string
        print(match.group())
This prints:
AABBAAAAA
BBBBAAA
BBBBAA
You can use itertools.groupby:
s = 'BBAAAAABBBBAAABBBBAA'
from itertools import groupby
out = ['']
for v, g in groupby(s):
    l = [*g]
    out[-1] += ''.join(l)
    if v == 'A' and len(l) in (3, 5):
        out.append('')
print(out)
Prints:
['BBAAAAA', 'BBBBAAA', 'BBBBAA']
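If "odd length" is meant literally rather than just 3 or 5, the groupby approach can be generalized. The sketch below assumes that reading of the question; the function name split_after_odd_a_runs is hypothetical, and it also drops the empty trailing piece that appears when the string ends with an odd run.

```python
from itertools import groupby
from typing import List


def split_after_odd_a_runs(s: str) -> List[str]:
    """Split s after every run of 'A' whose length is odd."""
    pieces = [""]
    for char, group in groupby(s):
        run = "".join(group)
        pieces[-1] += run
        if char == "A" and len(run) % 2 == 1:
            pieces.append("")  # start a new piece after an odd run of A's
    if pieces[-1] == "":
        pieces.pop()  # the string ended on a split point (or was empty)
    return pieces


print(split_after_odd_a_runs("AABBAAAAABBBBAAABBBBAA"))
# ['AABBAAAAA', 'BBBBAAA', 'BBBBAA']
```

Note that under this reading a single 'A' also counts as an odd run, which may or may not be what the question intends.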

How to split a string into sub strings of n length?

How would I split a string into sub-arrays of length n in Matlab?
eg.
Input: "ABCDEFGHIJKL", with sub arrays of length 3
Output: {ABC}, {DEF}, {GHI}, {JKL}
If the string length is not a multiple of n you probably need a loop or arrayfun:
x = 'ABCDEFGHIJK'; % length 11
n = 3;
result = arrayfun(@(k) x(k:min(k+n-1, end)), 1:n:numel(x), 'UniformOutput', false)
Alternatively, accumarray can be used as well:
x = 'ABCDEFGHIJK';
n = 3;
result = accumarray(floor((0:numel(x)-1).'/n)+1, x, [], @(t) {t.'}).';
Either of the above gives, in this example,
result =
1×4 cell array
{'ABC'} {'DEF'} {'GHI'} {'JK'}
A regular expression can do the job here:
str = 'abcdefgh'
exp = '.{1,3}' %the regular expression (get all the group of 3 char, if number of char left < 3, take the rest)
res = regexp(str,exp,'match')
which gives:
res =
1×3 cell array
{'abc'} {'def'} {'gh'}
If you only want to match complete groups of 3 chars:
exp = '.{3}' %this will output {'abc'} {'def'} but no {'gh'}
This should do it :) Note that reshape only works when the string length is a multiple of 3.
string = cellstr(reshape(string, 3, [])')
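For readers outside MATLAB, the same chunking idea (step through the string n characters at a time and let the last chunk be shorter) can be sketched in Python. The function name chunks is hypothetical.

```python
def chunks(s: str, n: int) -> list:
    """Split s into consecutive pieces of length n; the last piece may be shorter."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Slicing past the end of a string is safe in Python, so no min() is needed
    return [s[i:i + n] for i in range(0, len(s), n)]


print(chunks("ABCDEFGHIJK", 3))  # ['ABC', 'DEF', 'GHI', 'JK']
```

This corresponds to the arrayfun version above, where min(k+n-1, end) guards the final, shorter chunk.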

F#: count how many times a substring occurs within a string

How could one count how many times a substring exists within a string?
I mean if you have a String "one, two, three, one, one, two" how could you make it count "one" being present 3 times?
I thought String.Contains would be able to do the job, but that only checks whether the substring is present at all. String.forall works on chars and is therefore not an option either.
So I am really at a complete halt here. Can someone enlighten me?
You can use Regex.Escape to turn the string you're searching for into a regex, then use regex functions:
open System.Text.RegularExpressions
let countMatches wordToMatch (input : string) =
Regex.Matches(input, Regex.Escape wordToMatch).Count
Test:
countMatches "one" "one, two, three, one, one, two"
// Output: 3
Here's a simple implementation that walks through the string, using String.IndexOf to skip through to the next occurrence of the substring, and counts up how many times it succeeds.
let substringCount (needle : string) (haystack : string) =
    let rec loop count (index : int) =
        if index >= String.length haystack then count
        else
            match haystack.IndexOf(needle, index) with
            | -1 -> count
            | idx -> loop (count + 1) (idx + 1)
    if String.length needle = 0 then 0 else loop 0 0
Bear in mind, this counts overlapping occurrences, e.g., substringCount "aa" "aaaa" = 3. If you want non-overlapping, simply replace idx + 1 with idx + String.length needle.
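The difference between overlapping and non-overlapping counting comes down to how far the search position advances after each hit. A minimal Python sketch of the same IndexOf-style loop, using str.find, makes that step explicit; the function name count_occurrences and its overlapping parameter are hypothetical.

```python
def count_occurrences(needle: str, haystack: str, overlapping: bool = True) -> int:
    """Count occurrences of needle in haystack by repeatedly searching forward."""
    if not needle:
        return 0  # an empty needle would otherwise match everywhere
    # Advance by 1 to allow overlaps, or by len(needle) to forbid them
    step = 1 if overlapping else len(needle)
    count = 0
    index = haystack.find(needle)
    while index != -1:
        count += 1
        index = haystack.find(needle, index + step)
    return count


print(count_occurrences("aa", "aaaa"))                     # 3
print(count_occurrences("aa", "aaaa", overlapping=False))  # 2
print(count_occurrences("one", "one, two, three, one, one, two"))  # 3
```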
Create a sequence of tails of the string to search in, that is, all substring slices anchored at its end. Then you can use forall functionality to determine the number of matches against the beginning of each of them. It's just golfier than (fun s -> s.StartsWith needle).
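let count needle haystack =
    // Only take tails at least as long as the needle: Seq.forall2 stops at the
    // shorter sequence, so a shorter tail that is a prefix of the needle would match.
    [ for i in 0..String.length haystack - String.length needle -> haystack.[i..] ]
    |> Seq.filter (Seq.forall2 (=) needle)
    |> Seq.length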
count "aba" "abacababac"
// val it : int = 3
A fellow student of mine came up with the simplest solution I have seen so far.
let countNeedle (haystack :string) (needle : string) =
    match needle with
    | "" -> 0
    | _ -> (haystack.Length - haystack.Replace(needle, "").Length) / needle.Length
// This approach assumes the data is comma-delimited.
let data = "one, two, three, one, one, two"
let dataArray = data.Split([|','|]) |> Array.map (fun x -> x.Trim())
let countSubstrings searchTerm = dataArray |> Array.filter (fun x -> x = searchTerm) |> Array.length
let countOnes = countSubstrings "one"
let data' = "onetwothreeoneonetwoababa"
// This recursive approach makes no assumptions about a delimiter,
// and it will count overlapping occurrences (e.g., "aba" twice in "ababa").
// This is similar to Jake Lishman's answer.
let rec countSubstringFromI s i what =
    let len = String.length what
    if i + len - 1 >= String.length s then 0
    else (if s.Substring(i, len) = what then 1 else 0) + countSubstringFromI s (i + 1) what
let countSubStrings' = countSubstringFromI data' 0 "one"

Find substring of string w/o knowing the length of string

I have a string x: x = "{abc}{def}{ghi}"
And I need to print the string between the second { and the second }, in this case def. How can I do this without knowing the length of the string? For example, the string x could also be "{abcde}{fghij}{klmno}".
This is where pattern matching is useful:
local x = "{abc}{def}{ghi}"
local result = x:match(".-{.-}.-{(.-)}")
print(result)
.- matches zero or more characters, non-greedy. The whole pattern .-{.-}.-{(.-)} captures what's between the second { and the second }.
Try also x:match(".-}{(.-)}"), which is simpler.
I would go about it in a different manner:
local i, x, result = 1, "{abc}{def}{ghi}"
for w in x:gmatch '{(.-)}' do
  if i == 2 then
    result = w
    break
  else
    i = i + 1
  end
end
print( result )
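The gmatch loop above collects brace groups one by one and stops at the second. The same idea can be sketched in Python with a non-greedy regular expression; the function name nth_braced is hypothetical, and returning None for a missing group is an assumption rather than something either answer specifies.

```python
import re
from typing import Optional


def nth_braced(text: str, n: int) -> Optional[str]:
    """Return the contents of the n-th {...} group (1-based), or None if absent."""
    # .*? is the non-greedy counterpart of Lua's .- in the pattern '{(.-)}'
    groups = re.findall(r"\{(.*?)\}", text)
    return groups[n - 1] if 0 < n <= len(groups) else None


print(nth_braced("{abc}{def}{ghi}", 2))        # def
print(nth_braced("{abcde}{fghij}{klmno}", 2))  # fghij
```

Unlike the Lua loop, this collects every group before picking one, which is simpler but does slightly more work on long strings.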
