Scala: convert each digit in a string to an integer - string

I want to convert each digit in a number to an int. Here is my code
for (in <- lines) {
for (c <- in) {
val ci = c.toInt
if (ci == 0) {
// do stuff
}
}
}
The result I get is the ascii code, i.e. a 1 gives 49. I'm looking for the value 1.
The answer is trivial, I know. I'm trying to pull myself up with my own bootstraps until my Scala course begins in two weeks. Any assistance gratefully accepted.

One possible solution is:
for(in <- lines) {
in.toString.map(_.asDigit).foreach { i =>
if(i == 1) {
//do stuff
}
}
}
And more compact w/ output:
lines.foreach(in => in.toString.map(_.asDigit).filter(_ == 1).foreach(i => println(s"found $i in $in.")))
If lines is already a collection of Strings, omit the .toString on in.toString.

You can have this:
val number = 123456
//convert Int to String and do transformation for each character to Digit(Int)
val digitsAsList = number.toString.map(_.asDigit)
This will result to digitizing the number. Then with that Collection, you can do anything from filtering, mapping, zipping: you can checkout the the List api on this page: http://www.scala-lang.org/api/2.11.8/#scala.collection.immutable.List
Hope that's help.

Related

Making sure every Alphabet is in a string (Kotlin)

So I have a question where I am checking if a string has every letter of the alphabet in it. I was able to check if there is alphabet in the string, but I'm not sure how to check if there is EVERY alphabet in said string. Here's the code
fun isPangram (pangram: Array<String>) : String {
var panString : String
var outcome = ""
for (i in pangram.indices){
panString = pangram[i]
if (panString.matches(".^*[a-z].*".toRegex())){
outcome = outcome.plus('1')
}
else {outcome = outcome.plus('0')}
}
return outcome
}
Any ideas are welcomed Thanks.
I think it would be easier to check if all members of the alphabet range are in each string than to use Regex:
fun isPangram(pangram: Array<String>): String =
pangram.joinToString("") { inputString ->
when {
('a'..'z').all { it in inputString.lowercase() } -> "1"
else -> "0"
}
}
Hi this is how you can make with regular expression
Kotlin Syntax
fun isStrinfContainsAllAlphabeta( input: String) {
return input.lowercase()
.replace("[^a-z]".toRegex(), "")
.replace("(.)(?=.*\\1)".toRegex(), "")
.length == 26;
}
In java:
public static boolean isStrinfContainsAllAlphabeta(String input) {
return input.toLowerCase()
.replace("[^a-z]", "")
.replace("(.)(?=.*\\1)", "")
.length() == 26;
}
the function takes only one string. The first "replaceAll" removes all the non-alphabet characters, The second one removes the duplicated character, then you check how many characters remained.
Just to bounce off Tenfour04's solution, if you write two functions (one for the pangram check, one for processing the array) I feel like you can make it a little more readable, since they're really two separate tasks. (This is partly an excuse to show you some Kotlin tricks!)
val String.isPangram get() = ('a'..'z').all { this.contains(it, ignoreCase = true) }
fun checkPangrams(strings: Array<String>) =
strings.joinToString("") { if (it.isPangram) "1" else "0" }
You could use an extension function instead of an extension property (so it.isPangram()), or just a plain function with a parameter (isPangram(it)), but you can write stuff that almost reads like English, if you want!

Replacing the number in a string

if my string is lets say "Alfa1234Beta"
how can I convert all the number in to "_"
for example "Alfa1234Beta"
will be "Alfa____Beta"
Going with the Regex approach pointed out by others is possibly OK for your scenario. Mind you however, that Regex sometimes tend to be overused. A hand rolled approach could be like this:
static string ReplaceDigits(string str)
{
StringBuilder sb = null;
for (int i = 0; i < str.Length; i++)
{
if (Char.IsDigit(str[i]))
{
if (sb == null)
{
// Seen a digit, allocate StringBuilder, copy non-digits we might have skipped over so far.
sb = new StringBuilder();
if (i > 0)
{
sb.Append(str, 0, i);
}
}
// Replace current character (a digit)
sb.Append('_');
}
else
{
if (sb != null)
{
// Seen some digits (being replaced) already. Collect non-digits as well.
sb.Append(str[i]);
}
}
}
if (sb != null)
{
return sb.ToString();
}
return str;
}
It is more light weight than Regex and only allocates when there is actually something to do (replace). So, go ahead use the Regex version if you like. If you figure out during profiling that is too heavy weight, you can use something like the above. YMMV
You can run for loop on the string and then use the following method to replace numbers with _
if (!System.Text.RegularExpressions.Regex.IsMatch(i, "^[0-9]*$"))
Here variable i is the character in the for loop .
You can use this:
var s = "Alfa1234Beta";
var s2 = System.Text.RegularExpressions.Regex.Replace(s, "[0-9]", "_");
s2 now contains "Alfa____Beta".
Explanation: the regex [0-9] matches any digit from 0 to 9 (inclusive). The Regex.Replace then replaces all matched characters with an "_".
EDIT
And if you want it a bit shorter AND also match non-latin digits, use \d as a regex:
var s = "Alfa1234Beta๓"; // ๓ is "Thai digit three"
var s2 = System.Text.RegularExpressions.Regex.Replace(s, #"\d", "_");
s2 now contains "Alfa____Beta_".

Find matches in different indices in String. Kotlin

I have two lines of the same length. I need to get the number of letters that match as letters and have different index in the string (without nesting loop into loop). How I can do it?
Would the following function work for you?
fun check(s1: String, s2: String): Int {
var count = 0
s2.forEachIndexed { index, c ->
if (s1.contains(c) && s1[index] != c) count++
}
return count
}
See it working here
Edit:
Alternatively, you could do it like this if you want a one-liner
val count = s1.zip(s2){a,b -> s1.contains(b) && a != b}.count{it}
where s1 and s2 are the 2 strings

String Matching: Matching words with or without spaces

I want to find a way by which I can map "b m w" to "bmw" and "ali baba" to "alibaba" in both the following examples.
"b m w shops" and "bmw"
I need to determine whether I can write "b m w" as "bmw"
I thought of this approach:
remove spaces from the original string. This gives "bmwshops". And now find the Largest common substring in "bmwshop" and "bmw".
Second example:
"ali baba and 40 thieves" and "alibaba and 40 thieves"
The above approach does not work in this case.
Is there any standard algorithm that could be used?
It sounds like you're asking this question: "How do I determine if string A can be made equal to string B by removing (some) spaces?".
What you can do is iterate over both strings, advancing within both whenever they have the same character, otherwise advancing along the first when it has a space, and returning false otherwise. Like this:
static bool IsEqualToAfterRemovingSpacesFromOne(this string a, string b) {
return a.IsEqualToAfterRemovingSpacesFromFirst(b)
|| b.IsEqualToAfterRemovingSpacesFromFirst(a);
}
static bool IsEqualToAfterRemovingSpacesFromFirst(this string a, string b) {
var i = 0;
var j = 0;
while (i < a.Length && j < b.Length) {
if (a[i] == b[j]) {
i += 1
j += 1
} else if (a[i] == ' ') {
i += 1;
} else {
return false;
}
}
return i == a.Length && j == b.Length;
}
The above is just an ever-so-slightly modified string comparison. If you want to extend this to 'largest common substring', then take a largest common substring algorithm and do the same sort of thing: whenever you would have failed due to a space in the first string, just skip past it.
Did you look at Suffix Array - http://en.wikipedia.org/wiki/Suffix_array
or Here from Jon Bentley - Programming Pearl
Note : you have to write code to handle spaces.

Is There a Groovier Way To Add Dashes to a String?

I have the following code, which works, but I'm wondering if there is a "groovier" way of doing this:
/**
* 10 digit - #-######-##-#
* 13 digit - ###-#-######-##-#
* */
private formatISBN(String isbn) {
if (isbn?.length() == 10) {
def part1 = isbn.substring(0, 1)
def part2 = isbn.substring(1, 7)
def part3 = isbn.substring(7, 9)
def part4 = isbn.substring(9, 10)
return "${part1}-${part2}-${part3}-${part4}"
} else if (isbn?.length() == 13) {
def part1 = isbn.substring(0, 3)
def part2 = isbn.substring(3, 4)
def part3 = isbn.substring(4, 10)
def part4 = isbn.substring(10, 12)
def part5 = isbn.substring(12, 13)
return "${part1}-${part2}-${part3}-${part4}-${part5}"
} else {
return isbn
}
}
You could first use the [] string operator to get the substrings instead of substring and drop the intermediate variables. For example in the case for length == 10:
"${isbn[0]}-${isbn[1..6]}-${isbn[7..8]}-${isbn[9]}"
Now, there is a bit of repetition there. You can get instead first get all the isbn segments and then .join them with '-':
[isbn[0], isbn[1..6], isbn[7..8], isbn[9]].join('-')
And, even further, instead of referencing isbn every time, you can make a list of the ranges you want to get and then get them all the same time using collect:
[0, 1..6, 7..8, 9].collect { isbn[it] }.join('-')
If you're going for code golfing, you can also do:
('-'+isbn)[1, 0, 2..7, 0, 8..9, 0, 10]
I'll leave it to you to figure out how that works, but i guess it's probably not a good idea to leave that on production code, unless you want to surprise future maintainers hehe.
Also, notice that the format when length == 13 is the same as for length == 10 but with a different prefix, you can then reuse the same function in that case. The whole function (with a couple of tests) would be:
/**
* 10 digit - #-######-##-#
* 13 digit - ###-#-######-##-#
**/
def formatIsbn(isbn) {
switch (isbn?.length()) {
case 10: return [0, 1..6, 7..8, 9].collect { isbn[it] }.join('-')
case 13: return isbn.take(3) + '-' + formatIsbn(isbn.drop(3))
default: return isbn
}
}
assert formatIsbn('abcdefghij') == 'a-bcdefg-hi-j'
assert formatIsbn('abcdefghijklm') == 'abc-d-efghij-kl-m'
Now, i think there are some bad smells in that code. Can isbn be null? At least to me, this doesn't look like a function that needs to bother about the nullity of its argument, or at least that's not clear by reading its name (it should be called something like formatIsbnOrNull instead if both ISBN strings and null values are accepted). If null values are not valid, then let it blow up with a NullPointerException when accessing isbn.length() so the caller know they have passed a wrong argument, instead of silently returning the same null.
The same goes for the return ISBN at the end. Is it expected for that function to receive a string that's neither 10 nor 13 characters long? If not, better throw new IllegalArgumentException() and let the caller know they have called it wrongly.
Finally, i'm not sure if this is the most "readable" solution. Another possible solution is having a string for the format, like '###-#-######-##-#' and then replace the #s by the isbn characters. I think it might be more self-documenting:
def formatIsbn(isbn) {
def format = [
10: '#-######-##-#',
13: '###-#-######-##-#'
][isbn.length()]
def n = 0
format.replaceAll(/#/) { isbn[n++] }
}
Consider adding the method to the String class, as shown here. Note that this answer is a spin on a clever suggestion in epidemian's answer (re: collect).
Note:
This code augments String with asIsbn().
The range [0..2] does not need the call to asIsbn(), but the symmetry of using collect twice is irresistable.
Groovy returns the last expression in if/else, so 'return' is not necessary
/**
* 10 digit - #-######-##-#
* 13 digit - ###-#-######-##-#
**/
String.metaClass.asIsbn = { ->
if (delegate.length() == 10) {
[0, 1..6, 7..8, 9].collect { delegate[it] }.join('-')
} else if (delegate.length() == 13) {
[0..2, 3..12].collect { delegate[it].asIsbn() }.join('-')
} else {
delegate
}
}
assert "abcdefghij".asIsbn() == 'a-bcdefg-hi-j'
assert "abcdefghijklm".asIsbn() == 'abc-d-efghij-kl-m'
assert "def".asIsbn() == "def"
String s = null
assert s?.asIsbn() == null
I would try using Regex... I think it's pretty much readable if you know how to use regex, and it's javascript inspired syntax in groovy is pretty cool also.
One more thing: it's pretty clear, looking at the capture groups, what your string looks like for the desired formatting.
private formatISBN(String isbn) {
if (isbn?.length() == 10) {
m = isbn =~ /(\d{1})(\d{6})(\d{2})(\d{1})/
return "${m[0][1]}-${m[0][2]}-${m[0][3]}-${m[0][4]}"
} else if (isbn?.length() == 13) {
m = isbn =~ /(\d{3})(\d{1})(\d{6})(\d{2})(\d{1})/
return "${m[0][1]}-${m[0][2]}-${m[0][3]}-${m[0][4]}-${m[0][5]}"
} else {
return isbn
}
}
Btw, #epidemian suggestion using backreferences is great! I think the code would look like:
private formatISBN(String isbn) {
if (isbn?.length() == 10) {
return isbn.replaceAll(/(\d{1})(\d{6})(\d{2})(\d{1})/, '$1-$2-$3-$4')
} else if (isbn?.length() == 13) {
return isbn.replaceAll(/(\d{3})(\d{1})(\d{6})(\d{2})(\d{1})/, '$1-$2-$3-$4-$5')
} else {
return isbn
}
}
Dunno if I like this any better. I'd make the position map a static final, too.
private isbnify(String isbn) {
def dashesAt = [ 10: [[0,1], [1,7], [7,9], [9,10]],
13: [[0,3], [3,4], [4,10], [10,12], [12,13]]]
def dashes = dashesAt[isbn?.length()]
(dashes == null) ? isbn
: dashes.collect { isbn.substring(*it) }.join('-')
}
Ranges make for a bit less clutter, IMO:
private isbnify3(String isbn) {
def dashesAt = [ 10: [0, 1..6, 7..8, 9],
13: [0..2, 3, 4..9, 10..11, 12]]
def dashes = dashesAt[isbn?.length()]
dashes == null ? isbn : dashes.collect { isbn[it] }.join("-")
}
With an inject-with-two-accumulators it should be easy to do a list-of-dash-positions version, too.
This should be a comment to #everton, but I don't have the 50 reputation needed to do that yet. So this answer is really just a suggested variation on #everton's answer.
One less regex by making the first 3 digits optional. The downside is having to remove a leading '-' if the ISBN is 10 characters. (I also prefer \d over \d{1}.)
private formatISBN(String isbn) {
String result = isbn.replaceAll(/^(\d{3})?(\d)(\d{6})(\d{2})(\d)$/,
'$1-$2-$3-$4-$5')
if (result) {
return result.startsWith('-') ? result[1..-1] : result
} else {
return isbn // return value unchanged, pattern didn't match
}
}
println formatISBN('1234567890')
println formatISBN('9991234567890')
println formatISBN('123456789') // test an ISBN that's too short
println formatISBN('12345678901234') // test an ISBN that's too long

Resources