Haskell incorrect indentation - haskell

I have an error saying "Possibly incorrect indentation"
boyerMooreSearch :: [Char] -> [Char] -> [Int] -> Int
boyerMooreSearch string pattern skipTable
    | skip == 0 = 0
    | skip > 0 && (string length > pattern length) = boyerMooreSearch (substring string skip (string length)) pattern skipTable
    | otherwise = -1
    where
        subStr = (substring 0 (pattern length))
        skip = (calculateSkip subStr pattern skipTable)
What's wrong with it? Can anyone explain the indentation rules in Haskell?

On the line with subStr, you have a string of whitespace followed by a literal tab character, and on the line with skip you have the same string followed by four spaces. These are incompatible; one robust, flexible way to get this right is to start every line in a block with exactly the same string of whitespace.
The real rule, though, since you asked, is that tabs increase the indentation level to the next multiple of eight, and all other characters increase the indentation level by one. Different lines in a block must be at the same indentation level. do, where, let, and of introduce blocks (I may be forgetting a few).
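A minimal sketch of that column rule, written in Python purely for illustration. The helper name indent_column is hypothetical (it is not part of any Haskell tooling), and the sample lines assume eight leading spaces before the tab or the four spaces:

```python
def indent_column(line: str, tab_stop: int = 8) -> int:
    """Return the indentation level of a line under the tab-stop rule."""
    column = 0
    for ch in line:
        if ch == "\t":
            # A tab jumps to the next multiple of tab_stop.
            column = (column // tab_stop + 1) * tab_stop
        elif ch == " ":
            column += 1
        else:
            break  # the first non-whitespace character ends the indentation
    return column


tab_line = " " * 8 + "\tsubStr = ..."      # whitespace followed by a tab
space_line = " " * 8 + "    skip = ..."    # same whitespace followed by four spaces
print(indent_column(tab_line), indent_column(space_line))  # 16 12
```

Under these assumptions the two lines would look aligned in an editor that renders tabs four columns wide, yet they sit at different indentation levels under the eight-column rule, so they are not treated as part of the same block.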

Related

Use scanf to split a string on a non-whitespace separator

I want to scan a string that uses a colon as a separator and save both parts in a tuple.
For example:
input: "a:b"
output: ("a", "b")
My approach so far keeps getting the error message:
"scanf: bad input at char number 9: looking for ':', found '\n'".
Scanf.bscanf Scanf.Scanning.stdin "%s:%s" (fun x y -> (x,y));;
My approach works with integers, so I'm confused about why it does not work with strings.
Scanf.bscanf Scanf.Scanning.stdin "%d:%d" (fun x y -> (x,y));;
4:3
- : int * int = (4, 3)
The reason for the issue you're seeing is that the first %s is going to keep consuming input until one of the following conditions holds:
a whitespace has been found,
a scanning indication has been encountered,
the end-of-input has been reached.
Note that seeing a colon isn't going to satisfy any of these (if you don't use a scanning indication). This means that the first %s is going to consume everything up to, in your case, the newline character in the input buffer, and then the : is going to fail.
You don't have this same issue for %d:%d because %d isn't going to consume the colon as part of matching an integer.
You can fix this by instead using a format string that will not consume the colon, e.g., %[^:]:%s. You could also use a scanning indication, like so: %s@:%s.
Additionally, your current format won't consume any trailing whitespace in the buffer, which might result in newlines being added to the first element on subsequent uses, so you might prefer %s@:%s\n to consume the newline.
So, in all,
Scanf.bscanf Scanf.Scanning.stdin "%s@:%s\n" (fun x y -> (x,y));;
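The stopping rule described above can be modelled with a small Python sketch. This is a simplified model of the behaviour, not OCaml's actual implementation; scan_string and its stop_at parameter are hypothetical names introduced for illustration:

```python
from typing import Optional, Tuple


def scan_string(text: str, pos: int, stop_at: Optional[str] = None) -> Tuple[str, int]:
    """Consume characters until whitespace, the stop character, or end of input."""
    end = pos
    while end < len(text) and not text[end].isspace() and text[end] != stop_at:
        end += 1
    return text[pos:end], end


line = "a:b\n"

# Plain %s: the colon is swallowed, so the literal ':' that follows cannot match.
token, pos = scan_string(line, 0)
print(repr(token), repr(line[pos]))   # 'a:b' '\n'

# %s with ':' as the scanning indication: stop before the colon, then skip it.
first, pos = scan_string(line, 0, stop_at=":")
second, _ = scan_string(line, pos + 1)
print((first, second))                # ('a', 'b')
```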
The %s specifier is greedy and will read the string up to whitespace or a scanning indication. The indication is specified using @<indicator> just after the %s specifier, where <indicator> is a single character, e.g.,
let split str =
  Scanf.sscanf str "%s@:%s" (fun x y -> x,y)
This will instruct scanf to read everything up to : into the first string, drop : and then read the rest into the second string.
The string specifier %s is eager by default and will swallow all your content until the next space. You need to add a scanning indication (https://ocaml.org/api/Scanf.html#indication) to tell Scanf.sscanf that you expect the first string to end at the first colon.
For instance,
Scanf.sscanf "a:b"
  "%s@:%s"
  (fun x y -> x,y)
returns ("a", "b"). Here the scanning indication is the @: just after the first %s specifier. In general, scanning indications are written @c for a character c.

return only chars from the string in python

I want to extract only the letters from a given string, but my code does exactly the opposite:
s = "A man, a plan, a canal: Panama"
newS = ''.join(re.findall("[^a-zA-Z]*", s))
print(newS)  # my output: , , :
The expected output string is:
"A man a plan a canal Panama"
Your regular expression is inverting the match - that's what the caret symbol (^) does inside square brackets (negated character class). You first need to remove that.
Next, you should be matching a sequence of one or more characters (+) rather than zero or more characters (*) -- using * will match the empty string, which you don't want in this case.
Finally, your join should use a space as the separator rather than an empty string, which would not retain the spaces between the words.
newS = ' '.join(re.findall(r'[a-zA-Z]+', s))
Though not essential in this case, it's advised to use raw strings (the r prefix) for regular expressions. More in this post.
Full working code:
import re
s = 'A man, a plan, a canal: Panama'
newS = ' '.join(re.findall(r'[a-zA-Z]+', s))
print(newS)
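To see the difference between * and + mentioned above, here is a small comparison on a made-up sample string (the variable names are just for illustration):

```python
import re

sample = "a, b"
star_matches = re.findall(r"[a-zA-Z]*", sample)
plus_matches = re.findall(r"[a-zA-Z]+", sample)

print(star_matches)  # contains '' entries wherever no letter could be matched
print(plus_matches)  # ['a', 'b']
```

With *, the pattern also succeeds on zero letters, so empty strings appear in the result; with +, only runs of at least one letter are returned.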

Why is Haskell complaining about this plus sign?

From this code intended to convert a balanced ternary representation to a Haskell Integer:
frombal3 :: String -> Integer
frombal3 "+" = 1
frombal3 "0" = 0
frombal3 "-" = -1
frombal3 current:therest = \
  (*) frombal3 current (^) 3 length therest \
  + frombal3 therest
I got the error:
main.hs:7:3: error: parse error on input ‘+’
  |
7 |   + frombal3 therest
  |   ^
<interactive>:3:1: error:
    • Variable not in scope: main
    • Perhaps you meant ‘min’ (imported from Prelude)
It is not clear what you are trying to achieve, but I can already point out some mistakes.
Problems
You don't need \ to continue a line; that is only needed inside string literals. Indentation is enough in Haskell.
You need to wrap your pattern match in parentheses: (current:therest). Furthermore, this pattern makes current a Char and not a String, so you cannot pass it directly to your function, which takes a String.
You need to parenthesize your function arguments as well: if you want to multiply frombal3 current by 3, you need (*) (frombal3 current) 3, or the much better frombal3 current * 3. Function application binds tighter than any infix operator, so the infix form needs no extra parentheses and is clearer.
Suggestions
I am not sure what you want to achieve, but this looks like something that can be done with a fold or a simple list comprehension.
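As a rough illustration of the fold idea, here is a sketch in Python (using functools.reduce as the fold). It assumes the digits are written most significant first, which is what the recursion in the question suggests; from_bal3 and DIGIT_VALUES are names made up for this example:

```python
from functools import reduce

# Value of each balanced ternary digit.
DIGIT_VALUES = {"+": 1, "0": 0, "-": -1}


def from_bal3(digits: str) -> int:
    """Fold over the digits, most significant first: acc * 3 + digit."""
    return reduce(lambda acc, ch: acc * 3 + DIGIT_VALUES[ch], digits, 0)


print(from_bal3("+-0"))  # 1*9 - 1*3 + 0 = 6
```

The same shape carries over to a left fold in Haskell: start from 0, and for each digit multiply the accumulator by 3 and add the digit's value.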
Don't use backslashes, and remember to properly bracket pattern matches:
frombal3 :: String -> Integer
frombal3 "+" = 1
frombal3 "0" = 0
frombal3 "-" = -1
frombal3 (current:therest) =   -- note the brackets around the pattern
  (*) frombal3 current (^) 3 length therest
  + frombal3 therest
This still causes a problem due to how you're using operators, but I think you can solve this on your own, especially since I can't work out what you're trying to do here.
You appear to be trying to use backslashes to continue onto the next line; don't do that. If you just delete all the backslashes, the error will go away. (You'll get several other errors, but this particular one will go away.)
Haskell uses indentation to detect where one part ends and the next begins. You don't need to manually add backslashes to the end of each line to continue an expression.

Backslash in string changing output

I am currently trying to implement a function that counts the number of letters and digits in a string. However, if I use a string that contains the '\' character, I get strange results. I am guessing it's because the backslash is an escape character.
Here is the method:
import Data.Char
countLettersAndDigits :: String -> Int
countLettersAndDigits [] = 0
countLettersAndDigits (x:xs) = if isDigit x == True || isLetter x == True
                                 then 1 + countLettersAndDigits xs
                                 else countLettersAndDigits xs
Here is a set of inputs with their respective results:
"1234fd" -> 6 (Doesn't contain '\')
"1234f\d" -> lexical error in string/character literal at character 'd'
"1234\fd" -> 5
"123\4fd" -> 5
"12\34fd" -> 4
"1\234fd" -> 4
"\1234fd" -> 3
I find it strange that, for example, "1234\fd" and "123\4fd" both give 5 as a result.
Any help explaining why this may be the case, and how to get around this problem, would be great!
Cheers.
Edit
I forgot to mention that the string I used above was just an example I was playing with. The actual string that is causing a problem is generated by QuickCheck. The string was "\178". So I need a way to handle this case in my code when there is only one backslash and the string is being generated for me. Cheers.
You are correct that \ is Haskell's escape character. If you print out the generated strings, the answer may be more obvious:
main = mapM_ putStrLn [ "1234fd"
                      , "1234\fd"
                      , "123\4fd"
                      , "12\34fd"
                      , "1\234fd"
                      , "\1234fd"
                      ]
yields...
1234fd
1234d
123fd
12"fd
1êfd
Ӓfd
If you actually intended to include a backslash character in your string, you need to double it up: "\\" will result in a single \ being printed.
You can read up on escape sequences here.
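The counts from the question can be reproduced with a short Python sketch. Because a numeric escape such as \34 in a Haskell string literal is a decimal code point, each literal is rebuilt here with chr() rather than with Python's own escapes. The function name is made up, and Python's character classification is only an approximation of Haskell's isDigit and isLetter:

```python
def count_letters_and_digits(text: str) -> int:
    # Approximates Haskell's isDigit (ASCII '0'..'9' only) and isLetter.
    return sum(1 for ch in text if (ch.isascii() and ch.isdigit()) or ch.isalpha())


# Keys show the Haskell literal; values are the characters it denotes.
samples = {
    r'"1234\fd"': "1234" + "\f" + "d",     # \f is a form feed
    r'"123\4fd"': "123" + chr(4) + "fd",   # \4 is a control character
    r'"12\34fd"': "12" + chr(34) + "fd",   # \34 is a double quote
    r'"1\234fd"': "1" + chr(234) + "fd",   # \234 is the letter ê
    r'"\1234fd"': chr(1234) + "fd",        # \1234 is a single Cyrillic letter
    r'"\178"': chr(178),                   # \178 is a superscript two
}
for literal, value in samples.items():
    print(literal, len(value), count_letters_and_digits(value))
```

In this model, "1234\fd" and "123\4fd" both contain one control character plus five letters and digits, which is why they both give 5, and "\178" is a single character that is neither an ASCII digit nor a letter, so the string generated by QuickCheck does not contain a backslash at all.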

Standard ML string to a list

Is there a way in ML to take in a string and output a list of its substrings, where the separator is a space, newline, or EOF, while keeping quoted strings inside the string intact?
Example: hello world "my id" is 5555
-> [hello, world, my id, is, 5555]
I am then working on tokenizing these into:
-> [word, word, string, word, int]
Sure you can! Here's the idea:
If we take a string like "Hello World, \"my id\" is 5555", we can split it at the quote marks, ignoring the spaces for now. This gives us ["Hello World, ", "my id", " is 5555"]. The important thing to notice here is that the list contains three elements - an odd number. As long as the string only contains pairs of quotes (as it will if it's properly formatted), we'll always get an odd number of elements when we split at the quote marks.
A second important thing is that all the even-numbered elements of the list will be strings that were unquoted (if we start counting from 0), and the odd-numbered ones were quoted. That means that all we need to do is tokenize the ones that were unquoted, and then we're done!
I put some code together; adapt the separators as needed:
fun foo s =
  let
    (* String.fields keeps empty pieces, so the odd/even count survives a leading or trailing quote *)
    val quoteSep = String.fields (fn c => c = #"\"") s
    (* split the unquoted parts on any whitespace, including newlines *)
    val spaceSep = String.tokens Char.isSpace
    fun sepEven [] = []
      | sepEven [x] = spaceSep x  (* last piece, always unquoted *)
      | sepEven (x::y::xs) = spaceSep x @ (y :: sepEven xs)  (* x was unquoted, y was quoted *)
  in
    if length quoteSep mod 2 = 0
    then raise Fail "uneven number of quote marks"  (* something is wrong! *)
    else sepEven quoteSep
  end
String.tokens brings you halfway there. But if you really want to handle quotes like you are sketching then there is no way around writing an actual lexer. MLlex, which comes with SML/NJ and MLton (but is usable with any SML) could help. Or you just write it by hand, which should be easy enough in this case as well.
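For a sense of what writing it by hand involves, here is a minimal lexer sketch in Python rather than SML. It assumes the three token kinds from the question (word, string, int) and that quoted strings contain no escaped quotes; tokenize is a name chosen for illustration:

```python
from typing import List, Tuple


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Split text into (kind, lexeme) pairs, keeping quoted strings intact."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1  # skip separators (spaces, newlines, tabs)
        elif ch == '"':
            end = text.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated string literal")
            tokens.append(("string", text[i + 1:end]))
            i = end + 1
        else:
            start = i
            while i < len(text) and not text[i].isspace() and text[i] != '"':
                i += 1
            lexeme = text[start:i]
            # A lexeme made only of digits is an int, anything else is a word.
            kind = "int" if lexeme.isdigit() else "word"
            tokens.append((kind, lexeme))
    return tokens


print(tokenize('hello world "my id" is 5555'))
```

The same character-by-character loop translates fairly directly to a recursive SML function over String.explode s.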
