split string logic error - haskell

Hello, this function should take a String and return a list of Strings split at the Char c. I know I should define some helper functions, but currently the caller must initialize arguments that should be hidden from them.
xs = output list, i = start index for the substring, j = end index for the substring
example: split "123,456,789" ',' [] 0 0
should yield ["789", "456", "123"]
split s c xs i j =
  if j == length s
    then (subStr s i j) : xs
    else if head (drop j s) == c
      then split s c (subStr s i j : xs) (j + 1) (j + 1)
      else split s c xs i (j + 1)

subStr s i j = take j (drop i s)
When I apply the function with the following args: split "123,456,789" ',' [] 0 0
I'm getting the result: ["789", "456,789", "123"]

I already mentioned this on your other post, but the problem is your subStr function: take j (drop i s) takes j characters after dropping i, rather than the j - i characters between the two indices. If you change it to subStr s i j = take (j-i) (drop i s) it should work. And if that's all you want, great. But it could be written more clearly and easily using takeWhile, or using split from Data.Text.
Also, type signatures please. (Although I do appreciate that you defined the inputs this time.) Not only do they make it easier for us to help, you can often solve your own problems in the process of figuring them out.
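To see why the original subStr over-reads, here is a small Python sketch of the two variants. The function names are hypothetical, and Python slicing stands in for Haskell's take and drop, so treat it as an illustration rather than a translation of the full split function.

```python
def sub_str_buggy(s, i, j):
    # Mirrors take j (drop i s): drop i characters, then take j of what is left.
    return s[i:][:j]


def sub_str_fixed(s, i, j):
    # Mirrors take (j - i) (drop i s): only the characters between i and j.
    return s[i:][:j - i]


# The second field of "123,456,789" spans indices 4 to 7.
print(sub_str_buggy("123,456,789", 4, 7))
print(sub_str_fixed("123,456,789", 4, 7))
```

The buggy variant reproduces the "456,789" element reported in the question, because j counts from the start of the whole string but is used as a length.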

Related

How to convert the first char to lower case

I'm playing with strings and I have strings like "Hello.Word" or "stackOver.Flow",
and I want to convert the first char of each part to lower case: "hello.word" and "stackOver.flow".
For snake_case it is easy: we only need to change upper case to lower case and add '_',
but for camelCase (with the first char in lower case) I don't know how to do this.
open System
let convertToSnakeCase (value:string) =
    String [|
        Char.ToLower value.[0]
        for ch in value.[1..] do
            if Char.IsUpper ch then '_'
            Char.ToLower ch |]
Who can help?
module Identifier =
    open System
    let changeCase (str : string) =
        if String.IsNullOrEmpty(str) then str
        else
            let isUpper = Char.IsUpper
            let n = str.Length
            let builder = new System.Text.StringBuilder()
            let append (s:string) = builder.Append(s) |> ignore
            let rec loop i j =
                let k =
                    if i = n || (isUpper str.[i] && (not (isUpper str.[i - 1]) || ((i + 1) <> n && not (isUpper str.[i + 1]))))
                    then
                        if j = 0 then
                            append (str.Substring(j, i - j).ToLower())
                        elif (i - j) > 2 then
                            append (str.Substring(j, 1))
                            append (str.Substring(j + 1, i - j - 1).ToLower())
                        else
                            append (str.Substring(j, i - j))
                        i
                    else
                        j
                if i = n then builder.ToString()
                else loop (i + 1) k
            loop 1 0
    type System.String with
        member x.ToCamelCase() = changeCase x
    printfn "%s" ("StackOver.Flow".ToCamelCase()) //stackOver.Flow
    //need stackOver.flow
I suspect there are much more elegant and concise solutions. I sense you are learning functional programming, so I think it's best to do this kind of thing with a recursive function rather than some magic library function. I notice that in your question you ARE using a recursive function, but also an index into an array. Lists and recursive functions work together much more easily than arrays do, so if you use recursion the solution is usually simpler with a list.
I'd also avoid using a string builder. Assuming you are learning FP, string builders are imperative, and whilst they obviously work, they won't help you get your head around using immutable data.
The key then is to use a pattern match to detect the scenario that should trigger the upper/lower case logic, as it depends on 2 consecutive characters.
I THINK you want this to happen for the 1st char, and after a '.'?
(I've inserted a '.' as the 1st char so that the recursive function only has to handle the '.' scenario, rather than needing a special case.)
open System

let convertToCamelCase (value : string) =
    let rec convertListToCamelCase (value : char list) =
        match value with
        | [] -> []
        | '.' :: second :: rest ->
            // lower-case the char that follows a '.', then keep scanning from it
            '.' :: convertListToCamelCase (Char.ToLower second :: rest)
        | c :: rest ->
            c :: convertListToCamelCase rest
    // put a '.' on the front to simplify the logic (and take it off after)
    let convertAsList = convertListToCamelCase ('.' :: (value.ToCharArray() |> Array.toList))
    String ((convertAsList |> List.toArray).[1..])
The piece to worry about is the recursive piece; the rest of it is just flipping an array to a list and back again.
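The same idea (lower-case the first character and any character that follows a '.') can be sketched in Python by remembering the previous character. This is only an illustration under that reading of the question; to_camel_case is a hypothetical name, and the initial "." plays the role of the dot that the F# answer prepends.

```python
def to_camel_case(value):
    out = []
    prev = "."  # pretend a '.' precedes the string so the first char is lowered
    for ch in value:
        # Lower-case only characters that directly follow a '.'.
        out.append(ch.lower() if prev == "." else ch)
        prev = ch
    return "".join(out)


print(to_camel_case("StackOver.Flow"))
print(to_camel_case("Hello.Word"))
```

Tracking one previous character is the iterative counterpart of matching on two consecutive list elements.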

String out of range [Python]

I'm trying to make a program that combines two words together in Python.
For example, if I am combining "hello" and "chadd" it will return "hcehlaldod" by alternating letters.
Here's my code:
string1 = "hey"
string2 = "hii"
len1 = len(str(string1))
len2 = len(str(string2))
x = 0
final = ""
while (x <= len1):
    final = final + string1[x] + string2[x]
    x = x + 1
any help?
There is a simpler way to do that:
string1 = "hey"
string2 = "hii"
new_str = ""
for char1, char2 in zip(string1, string2):
    new_str += char1 + char2
if __name__ == '__main__':
    print(new_str)
Change while (x <= len1) to while (x < len1) if you only care about the length of the first string.
If you care about the length of both strings, do while (x < len1 and x < len2) instead.
The immediate problem with your loop is the condition while (x <= len1):
Let me explain. The length of your string is 3. The characters (and their indexes) are as follows:
0 1 2
h e y
You will see your string ends at index position 2. So now go back to your condition. You have set it to continue looping while (x <= len1):. So your loop will operate when x=0, x=1, x=2 and x=3. The x=3 is out of bounds since the indexes for your string end at index position 2.
What you should use is while (x < len1): which will stop at the correct point in your string.
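Putting that fix into the original loop gives something like the following sketch. Wrapping it in a function called combine and stopping at the shorter of the two strings are assumptions made here for safety, not part of the question's code.

```python
def combine(string1, string2):
    # Stop at the shorter string so neither index can run out of range.
    limit = min(len(string1), len(string2))
    x = 0
    final = ""
    while x < limit:  # strictly less than: valid indexes are 0 .. limit - 1
        final = final + string1[x] + string2[x]
        x = x + 1
    return final


print(combine("hey", "hii"))
```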
You can pay attention to the lengths, as others have suggested, but you can also take a more functional approach with the built in zip function:
string1 = "hello"
string2 = "chadd"
string3 = ''.join(t[0] + t[1] for t in zip(string1, string2)) # hcehlaldod
zip works by pairing its inputs:
print(list(zip(string1, string2))) # note that you should turn it into a list to print it
# [('h', 'c'), ('e', 'h'), ('l', 'a'), ('l', 'd'), ('o', 'd')]
And you can then just combine those into a string (like my first code snippet does).
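One thing worth knowing is that zip stops at the shorter input, so trailing letters of the longer word are dropped. The sketch below shows that behaviour next to a variant based on itertools.zip_longest that keeps them; both function names are hypothetical, and whether the leftover letters should be kept is an assumption the question does not settle.

```python
from itertools import zip_longest


def interleave(a, b):
    # zip stops at the shorter input, so extra letters are dropped.
    return "".join(x + y for x, y in zip(a, b))


def interleave_keep_rest(a, b):
    # zip_longest pads the shorter input with "" so nothing is lost.
    return "".join(x + y for x, y in zip_longest(a, b, fillvalue=""))


print(interleave("hey", "hi"))
print(interleave_keep_rest("hey", "hi"))
```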

Cut K sequences of length L to obtain the biggest number

We have a number of N digits (it can start with 0). We must find the biggest number which can be obtained cutting K disjoint sequences of length L.
N can be very big so our number should be stored as a string.
Example 1)
nr = 12122212212212121222
K = 2, L = 3
answer: 22212212221222
We can cut "121" (from 0th digit) and "121" (from 12th digit).
Example 2)
nr = 0739276145
K = 3, L = 3
answer: 9
We can cut "073", "276" and "145".
I have tried something like this:
void cut(string str, int K, int L) {
if (K == 0)
return;
// here we cut a single sequence of length L
// in a way that the new number is the biggest
cut(str, K - 1, L);
}
But this way I can cut 2 sequences that are not disjoint in the initial number, so my method is not correct. Please help me solve the problem!
You can define cuts recursively:
cuts(s, 0, L) = s
cuts(s, K, L) = max(s[:j] + cuts(s[j+L:], K-1, L) for j = 0..len(s)-K*L)
As is normal in these problems, you can use dynamic programming to avoid an exponential runtime. You can probably avoid so much string slicing and appending, but this is an example solution in Python:
def cuts(s, K, L):
    # dp[i] is the best result for the suffix s[i:] with the current number of cuts
    dp = [s[i:] for i in range(len(s)+1)]
    for k in range(1, K+1):
        dp = [max(s[i:j] + dp[j+L] for j in range(i, len(dp)-L))
              for i in range(len(dp)-L)]
    return dp[0]

print(cuts('12122212212212121222', 2, 3))
print(cuts('0739276145', 3, 3))
Output:
22212212221222
9
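The recurrence can also be written top-down with memoization, which some readers find easier to check against the definition. This is a sketch, not the answer's own code: cuts_recursive and best are hypothetical names, and it assumes len(s) >= K * L (otherwise max() is called on an empty sequence and raises ValueError).

```python
from functools import lru_cache


def cuts_recursive(s, K, L):
    @lru_cache(maxsize=None)
    def best(start, k):
        # Largest string obtainable from s[start:] after cutting k blocks of length L.
        if k == 0:
            return s[start:]
        # Keep s[start:j], cut s[j:j+L], and solve the rest with one fewer cut.
        # All candidates have the same length, so string comparison is numeric.
        return max(s[start:j] + best(j + L, k - 1)
                   for j in range(start, len(s) - k * L + 1))

    return best(0, K)


print(cuts_recursive('12122212212212121222', 2, 3))
print(cuts_recursive('0739276145', 3, 3))
```

Because every cut is taken strictly to the right of the previous one, the blocks are disjoint in the original number, which is the property the question's attempt was missing.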

F# Count how Many times a substring Contains within a string

How could one count how many times a substring exists within a string?
I mean if you have a String "one, two, three, one, one, two" how could you make it count "one" being present 3 times?
I thought String.Contains would be able to do the job, but that only checks whether the substring is present at all. String.forall is for chars and is therefore not an option either.
So I am really at a complete halt here. Can someone enlighten me?
You can use Regex.Escape to turn the string you're searching for into a regex, then use regex functions:
open System.Text.RegularExpressions
let countMatches wordToMatch (input : string) =
    Regex.Matches(input, Regex.Escape wordToMatch).Count
Test:
countMatches "one" "one, two, three, one, one, two"
// Output: 3
Here's a simple implementation that walks through the string, using String.IndexOf to skip through to the next occurrence of the substring, and counts up how many times it succeeds.
let substringCount (needle : string) (haystack : string) =
    let rec loop count (index : int) =
        if index >= String.length haystack then count
        else
            match haystack.IndexOf(needle, index) with
            | -1 -> count
            | idx -> loop (count + 1) (idx + 1)
    if String.length needle = 0 then 0 else loop 0 0
Bear in mind, this counts overlapping occurrences, e.g., substringCount "aa" "aaaa" = 3. If you want non-overlapping, simply replace idx + 1 with idx + String.length needle.
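For comparison, the same skip-ahead loop can be sketched in Python with str.find, with a flag that switches between the overlapping and non-overlapping step. The function name and the overlapping parameter are hypothetical additions for illustration.

```python
def substring_count(needle, haystack, overlapping=True):
    if not needle:
        return 0  # an empty needle would otherwise match at every position
    # Advance by 1 to allow overlaps, or by the needle length to forbid them.
    step = 1 if overlapping else len(needle)
    count = 0
    index = haystack.find(needle)
    while index != -1:
        count += 1
        index = haystack.find(needle, index + step)
    return count


print(substring_count("aa", "aaaa"))
print(substring_count("aa", "aaaa", overlapping=False))
```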
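Create a sequence of tails of the string to search in, that is, all substring slices anchored at its end. Then you can use forall2 to test the needle against the beginning of each tail and count the matches. It's just golfier than (fun s -> s.StartsWith needle). Note that Seq.forall2 ignores the extra elements of the longer sequence, so a tail shorter than the needle still counts as a match when it agrees with the start of the needle; the StartsWith version does not have that problem.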
let count needle haystack =
    [ for i in 0..String.length haystack - 1 -> haystack.[i..] ]
    |> Seq.filter (Seq.forall2 (=) needle)
    |> Seq.length
count "aba" "abacababac"
// val it : int = 3
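A Python sketch of the same "test every tail" idea is below. It uses str.startswith with a start offset instead of building the tails, and, unlike a pairwise comparison that stops at the shorter sequence, it only matches when the whole needle fits. count_by_tails is a hypothetical name.

```python
def count_by_tails(needle, haystack):
    if not needle:
        return 0
    # startswith(needle, i) checks whether haystack[i:] begins with the full needle.
    return sum(haystack.startswith(needle, i) for i in range(len(haystack)))


print(count_by_tails("aba", "abacababac"))
```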
A fellow student of mine came up with the simplest solution I have seen so far.
let countNeedle (haystack : string) (needle : string) =
    match needle with
    | "" -> 0
    | _ -> (haystack.Length - haystack.Replace(needle, "").Length) / needle.Length
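The length-difference trick carries over directly to Python, as in this sketch (count_needle is a hypothetical name). Because replacement scans left to right without revisiting removed text, it counts non-overlapping occurrences, the same result Python's built-in str.count gives.

```python
def count_needle(haystack, needle):
    if not needle:
        return 0
    # Every removed occurrence shortens the string by len(needle) characters.
    return (len(haystack) - len(haystack.replace(needle, ""))) // len(needle)


print(count_needle("one, two, three, one, one, two", "one"))
```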
// This approach assumes the data is comma-delimited.
let data = "one, two, three, one, one, two"
let dataArray = data.Split([|','|]) |> Array.map (fun x -> x.Trim())
let countSubstrings searchTerm = dataArray |> Array.filter (fun x -> x = searchTerm) |> Array.length
let countOnes = countSubstrings "one"
let data' = "onetwothreeoneonetwoababa"
// This recursive approach makes no assumptions about a delimiter,
// and it will count overlapping occurrences (e.g., "aba" twice in "ababa").
// This is similar to Jake Lishman's answer.
let rec countSubstringFromI s i what =
    let len = String.length what
    if i + len - 1 >= String.length s then 0
    else (if s.Substring(i, len) = what then 1 else 0) + countSubstringFromI s (i + 1) what
let countSubStrings' = countSubstringFromI data' 0 "one"

Reading a file of lists of integers in Fortran

I would like to read a data file with a Fortran program, where each line is a list of integers.
Each line has a variable number of integers, separated by a given character (space, comma...).
Sample input:
1,7,3,2
2,8
12,44,13,11
I have a solution to split lines, which I find rather convoluted:
module split
  implicit none
contains
  function string_to_integers(str, sep) result(a)
    integer, allocatable :: a(:)
    integer :: i, j, k, n, m, p, r
    character(*) :: str
    character :: sep, c
    character(:), allocatable :: tmp

    !First pass: find number of items (m), and maximum length of an item (r)
    n = len_trim(str)
    m = 1
    j = 0
    r = 0
    do i = 1, n
      if(str(i:i) == sep) then
        m = m + 1
        r = max(r, j)
        j = 0
      else
        j = j + 1
      end if
    end do
    r = max(r, j)
    allocate(a(m))
    allocate(character(r) :: tmp)

    !Second pass: copy each item into temporary string (tmp),
    !read an integer from tmp, and write this integer in the output array (a)
    tmp(1:r) = " "
    j = 0
    k = 0
    do i = 1, n
      c = str(i:i)
      if(c == sep) then
        k = k + 1
        read(tmp, *) p
        a(k) = p
        tmp(1:r) = " "
        j = 0
      else
        j = j + 1
        tmp(j:j) = c
      end if
    end do
    k = k + 1
    read(tmp, *) p
    a(k) = p
    deallocate(tmp)
  end function
end module
My question:
Is there a simpler way to do this in Fortran? I mean, reading a list of values where the number of values to read is unknown. The above code looks awkward, and file I/O does not look easy in Fortran.
Also, the main program has to read lines with unknown and unbounded length. I am able to read lines if I assume they are all the same length (see below), but I don't know how to read unbounded lines. I suppose it would need the stream features of Fortran 2003, but I don't know how to write this.
Here is the current program:
program read_data
  use split
  implicit none
  integer :: q
  integer, allocatable :: a(:)
  character(80) :: line
  open(unit=10, file="input.txt", action="read", status="old", form="formatted")
  do
    read(10, "(A80)", iostat=q) line
    if(q /= 0) exit
    if(line(1:1) /= "#") then
      a = string_to_integers(line, ",")
      print *, ubound(a), a
    end if
  end do
  close(10)
end program
A comment about the question: usually I would do this in Python, for example converting a line would be as simple as a = [int(x) for x in line.split(",")], and reading a file is likewise almost a trivial task. And I would do the "real" computing stuff with a Fortran DLL. However, I'd like to improve my Fortran skills on file I/O.
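For reference, the Python version alluded to above might look like the following sketch. The function name read_integer_lists is hypothetical, and skipping lines that start with "#" is an assumption carried over from the Fortran program; an in-memory stream stands in for input.txt so the example is self-contained.

```python
import io


def read_integer_lists(stream, sep=","):
    rows = []
    for line in stream:
        line = line.strip()
        # Skip blank lines and comment lines, as the Fortran program skips "#".
        if not line or line.startswith("#"):
            continue
        rows.append([int(x) for x in line.split(sep)])
    return rows


sample = io.StringIO("# sample input\n1,7,3,2\n2,8\n12,44,13,11\n")
print(read_integer_lists(sample))
```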
I don't claim it is the shortest possible, but it is much shorter than yours. And once you have it, you can reuse it. I don't completely agree with the claims that Fortran is bad at string processing. I do tokenization, recursive descent parsing and similar stuff just fine in Fortran, although it is easier in some other languages with richer libraries. Sometimes you can use libraries written in other languages (especially C and C++) in Fortran too.
If you always use the comma as the separator, you can remove the step that replaces the separator with a comma and thus shorten it even more.
function string_to_integers(str, sep) result(a)
  integer, allocatable :: a(:)
  character(*) :: str
  character :: sep
  integer :: i, n_sep

  n_sep = 0
  do i = 1, len_trim(str)
    if (str(i:i)==sep) then
      n_sep = n_sep + 1
      str(i:i) = ','
    end if
  end do
  allocate(a(n_sep+1))
  read(str,*) a
end function
Potential for shortening: view the str as a character array using equivalence or transfer and use count() inside of allocate to get the size of a.
The code assumes that there is just one separator between each number and that there is no separator before the first one. If multiple separators are allowed between two numbers, you have to check whether the preceding character is a separator or not:
do i = 2, len_trim(str)
  if (str(i:i)==sep .and. str(i-1:i-1)/=sep) then
    n_sep = n_sep + 1
    str(i:i) = ','
  end if
end do
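The counting rule in that loop (a separator only counts when the character before it is not a separator) can be checked quickly with a Python sketch. count_items is a hypothetical helper, and like the Fortran loop it assumes the line does not begin or end with a separator.

```python
def count_items(line, sep=","):
    # Count separators that start a run; repeated separators count once.
    n_sep = sum(1 for i in range(1, len(line))
                if line[i] == sep and line[i - 1] != sep)
    return n_sep + 1


print(count_items("1,7,3,2"))
print(count_items("1  2 3", sep=" "))
```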
My answer is probably too simplistic for your goals, but I have spent a lot of time recently reading in strange text files of numbers. My biggest problem is finding where the numbers start (not hard in your case); after that, my best friend is the list-directed read.
read(unit=10,fmt=*) a
will read all of the data into vector 'a', done deal. With this method you will not know which line any piece of data came from. If you want to allocate 'a' first, you can read the file once and use some rule that makes the array at least as large as it needs to be, for example count the number of lines and multiply by a known maximum number of values per line (say 21).
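status = 0
line_counter = 0
do while (status == 0)
  line_counter = line_counter + 1
  read(unit=10, iostat=status, fmt=*)   ! skip one record; status /= 0 at end of file
end do
allocate(a(line_counter*21))   ! over-allocates: line_counter ends one past the number of lines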
If you then want to eliminate the unused entries, you can remove the zero values, or pre-seed the 'a' vector with a negative number (if you don't expect any in the data) and remove all of those afterwards.
Another approach, stemming from the other suggestion, is to first count the commas on each line and then do a second read where the loop is controlled by the line count:
do j = 1, line_counter              ! You determined this on your first read
  read(unit=11,fmt=*) a(j,:)        ! a is now a 2 dimensional array (line_counter, maxNumberPerLine)
  ! You have a separate vector numberOfCommas(j) from before
end do
And now you can do whatever you want with these two arrays because you know all the data, which line it came from, and how many data were on each line.
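The end result of that two-pass idea, a rectangular table padded with a sentinel plus a per-line count, can be sketched in Python as follows. The names read_padded and fill are hypothetical, and the sentinel -1 assumes the data contains no negative numbers, as suggested above.

```python
def read_padded(lines, sep=",", fill=-1):
    # Parse every non-blank line into a list of integers.
    rows = [[int(x) for x in ln.split(sep)] for ln in lines if ln.strip()]
    # Widest row decides the number of columns in the padded table.
    width = max((len(r) for r in rows), default=0)
    counts = [len(r) for r in rows]
    # Pad short rows with the sentinel so every row has the same length.
    table = [r + [fill] * (width - len(r)) for r in rows]
    return table, counts


table, counts = read_padded(["1,7,3,2", "2,8", "12,44,13,11"])
print(table)
print(counts)
```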

Resources