Remove words of a string that start with uppercase characters in Scala

I want to write an algorithm that removes every word that starts with an uppercase character from a string.
For example:
Original string: "Today is Friday the 29Th."
Desired result: "is the 29Th."
I wrote this algorithm, but it is not complete:
def removeUpperCaseChars(str: String) = {
for (i <- 0 to str.length - 1) {
if (str.charAt(i).isUpper) {
var j = i
var cont = i
while (str.charAt(j) != " ") {
cont += 1
}
val subStr = str.substring(0, i) + str.substring(cont, str.length - 1)
println(subStr)
}
}
}
It (supposedly) removes every word with uppercase characters instead of removing only the words that start with uppercase characters. And worse than that, Scala doesn't give any result.
Can anyone help me with this problem?

With some assumptions, such as words always being separated by whitespace, you can implement it like this:
scala> "Today is Friday the 29Th.".split("\\s+").filterNot(_.head.isUpper).mkString(" ")
res2: String = is the 29Th.
We don't really want to write algorithms the way you did in Scala. That is rather the way you would do it in C.

How about string.replaceAll("""\b[A-Z]\w+""", "")?
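For comparison, the split/filter/join idea from the first answer can be sketched in Python. This is a minimal sketch that assumes words are separated by whitespace; the function name `remove_capitalized_words` is made up for illustration.

```python
def remove_capitalized_words(text: str) -> str:
    """Drop every whitespace-separated word whose first character is uppercase."""
    # str.split() with no argument splits on runs of whitespace and never
    # yields empty strings, so word[0] is always safe here.
    kept = [word for word in text.split() if not word[0].isupper()]
    return " ".join(kept)


print(remove_capitalized_words("Today is Friday the 29Th."))  # is the 29Th.
```

Note that "29Th." is kept because its first character is a digit, which matches the desired result in the question.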

Related

(dart) replaceAll method in a string in a loop (cypher)

I'm attempting to do CS50 courses in dart, so for week 2 substitution test i'm stuck with this:
void main(List<String> args) {
String alphabet = 'abcdefghijklmnopqrstuvwxyz';
String cypher = 'qwertyuiopasdfghjklzxcvbnm';
int n = alphabet.length;
print('enter text:');
String text = stdin.readLineSync(encoding: utf8)!;
for (int i = 0; i < n; i++) {
text = text.replaceAll(alphabet[i], cypher[i]);
}
print(text);
}
Expected result: abcdef = qwerty
Actual result: jvmkmn
Any ideas why this is happening? I'm a total beginner by the way
It is because you first substitute the letter a with the letter q, but when i = 16 you replace every q with j. This is why your a ends up as j, and so forth.
Best of luck to you :)
For the record, the (very direct and) safer approach would be:
import 'dart:convert';
import 'dart:io';
void main(List<String> args) {
String alphabet = 'abcdefghijklmnopqrstuvwxyz';
String cypher = 'qwertyuiopasdfghjklzxcvbnm';
assert(alphabet.length == cypher.length);
// Pattern matching any character in `alphabet`.
var re = RegExp('[${RegExp.escape(alphabet)}]');
print('enter text:');
String text = stdin.readLineSync(encoding: utf8)!;
// Replace each character matched by `re` with the corresponding
// character in `cypher`.
text = text.replaceAllMapped(re, (m) => cypher[alphabet.indexOf(m[0]!)]);
print(text);
}
(This is not an efficient approach. It does a linear lookup in the alphabet for each character. A more efficient approach would either recognize that the alphabet is a contiguous range of character codes and do some arithmetic to find the position in the alphabet, or, if it wasn't a contiguous range, build a more efficient lookup table for the alphabet first.)
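The difference between the two approaches can be sketched in Python. This is an illustrative sketch, not a translation of the Dart code: `naive_substitute` and `substitute` are made-up names, and the lookup table is built with Python's `str.maketrans`, so every character is translated exactly once.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
CYPHER = "qwertyuiopasdfghjklzxcvbnm"


def naive_substitute(text: str) -> str:
    """Reproduces the bug: letters replaced early get replaced again later."""
    for plain, coded in zip(ALPHABET, CYPHER):
        text = text.replace(plain, coded)
    return text


# Lookup table built once; translate() maps each input character a single time.
TABLE = str.maketrans(ALPHABET, CYPHER)


def substitute(text: str) -> str:
    """Substitutes each character exactly once using the lookup table."""
    return text.translate(TABLE)


print(naive_substitute("abcdef"))  # jvmkmn
print(substitute("abcdef"))        # qwerty
```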

Node split string by last space

var str="72 tocirah abba tesh sneab";
I currently have this string and want a new string, "72 tocirah abba tesh". What is the best way to do this in Node/JavaScript?
Another one-liner:
"72 tocirah abba tesh sneab".split(" ").slice(0, -1).join(" ")
Basically, you split the string using the space separator, slice the resulting array removing the last item, and join the array using the space separator.
You can use replace, like:
"72 tocirah abba tesh sneab".replace(/\s\w+$/, '')
(This replaces the last space and word with an empty string)
I would solve it like that:
let str = "72 tocirah abba tesh sneab"
Split by " ":
let list = str.split(" ") // [72,tocirah,abba,tesh,sneab]
Remove the last element using pop():
list.pop() // [72,tocirah,abba,tesh]
Then join it back together:
str = list.join(" ") // 72 tocirah abba tesh
The following is one way to handle this, using the lastIndexOf function:
let lastSpacePosition = str.lastIndexOf(" ");
// If there is no space in the string, take the whole string as the result.
if (lastSpacePosition < 0) {
  lastSpacePosition = str.length;
}
str.substring(0, lastSpacePosition);
There are other ways to handle this as well, such as a regex or split followed by join.
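The "cut at the last space" idea from the answers above can be sketched in Python with `str.rpartition`. The function name `drop_last_word` is hypothetical; the sketch assumes a single space is the separator and returns the input unchanged when there is no space.

```python
def drop_last_word(text: str) -> str:
    """Return everything before the last space, or the whole string if none."""
    # rpartition splits around the LAST occurrence of the separator and
    # returns ("", "", text) when the separator is absent.
    head, separator, _last_word = text.rpartition(" ")
    return head if separator else text


print(drop_last_word("72 tocirah abba tesh sneab"))  # 72 tocirah abba tesh
```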

How do I get from string "3+10" to strings "3" "+" "10"?

I'm making a graphing calculator in Unity and I have input with strings like "3+10" and I want to split it to "3","+" and "10".
I can figure out a way to deal with them once I've got them to this form, but I really need a way to split the string to the left and right of key characters such as plus, times, exponent, etc.
I'm doing this in Unity, but a way to do this in any language should help.
C#
The following code will do what you asked for (and nothing more).
string input = "3+10-5";
string pattern = @"([-+^*\/])";
string[] substrings = Regex.Split(input, pattern);
// results in substrings = {"3", "+", "10", "-", "5"}
By using Regex.Split instead of String.Split you are able to retrieve the math operators as well. This is done by putting the math operators in a capture group ( ). If you're not familiar with regular expressions you should google the basics.
The code above will stubbornly use the math operators to split your string. If the string doesn't make sense, the method doesn't care and may produce unexpected results. For example "5//10-" will result in {"5", "/", "", "/", "10", "-", ""}. Note that empty strings are added wherever two operators are adjacent or an operator ends the string.
You can use a more complex regular expression to check that your string is a valid mathematical expression before you try to split it. For example, ^\d+(?:\.\d+)?(?:[-+*^\/]\d+(?:\.\d+)?)*$ checks that your string consists of a decimal number followed by zero or more combinations of an operator and another decimal number.
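The same split-with-a-capture-group technique, combined with a validation step, can be sketched in Python. The names `tokenize`, `OPERATOR` and `VALID` are invented for this sketch, and the validation pattern is an assumption about what counts as a well-formed expression (unsigned decimal numbers joined by single operators).

```python
import re

# The capture group makes re.split keep the operators in the result.
OPERATOR = re.compile(r"([-+^*/])")
# Assumed grammar: number (operator number)*, numbers being unsigned decimals.
VALID = re.compile(r"\d+(?:\.\d+)?(?:[-+^*/]\d+(?:\.\d+)?)*")


def tokenize(expression: str) -> list[str]:
    """Split an expression into numbers and operators, rejecting malformed input."""
    if not VALID.fullmatch(expression):
        raise ValueError(f"not a valid expression: {expression!r}")
    return OPERATOR.split(expression)


print(tokenize("3+10-5"))  # ['3', '+', '10', '-', '5']
```

Validating first means the splitter never has to deal with empty tokens from doubled or trailing operators.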
Here is the C# way -- which I mention because you are using Unity.
words = phrase.Split(default(string[]),StringSplitOptions.RemoveEmptyEntries);
https://msdn.microsoft.com/en-us/library/tabh47cf%28v=vs.110%29.aspx
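Here is Java code for splitting a String by math operators:
import java.util.ArrayList;
import java.util.List;
static List<String> splitByOperators(String input) {
    List<String> output = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    // A String is not Iterable, so walk over its characters as an array.
    for (char c : input.toCharArray()) {
        if (c == '+' || c == '-' || c == '*' || c == '/') {
            // Emit the operand collected so far, then the operator itself.
            output.add(current.toString());
            output.add(String.valueOf(c));
            current.setLength(0);
        } else {
            current.append(c);
        }
    }
    // Emit the final operand.
    output.add(current.toString());
    return output;
}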
Using Python regular expressions:
>>> import re
>>> match = re.search(r'(\d+)([-+*/^])(\d+)', "3+10")
>>> match.group(1)
'3'
>>> match.group(2)
'+'
>>> match.group(3)
'10'
The reason for using regular expressions is for greater flexibility in handling a variety of simple arithmetic expressions.
R: EDITED
Take your input vector as x<-c("3+10", "4/12" , "8-3" ,"12*1","1+2-3*4/8").
We can use the following string split based on regex:
> strsplit(x,split="(?<=\\d)(?=[-+*/])|(?<=[-+*/])(?=\\d)",perl=T)
[[1]]
[1] "3" "+" "10"
[[2]]
[1] "4" "/" "12"
[[3]]
[1] "8" "-" "3"
[[4]]
[1] "12" "*" "1"
[[5]]
[1] "1" "+" "2" "-" "3" "*" "4" "/" "8"
How it works:
Split the string when one of two things is found:
A digit followed by an arithmetic operator. (?<=\\d) finds something immediately preceded by a digit, while (?=[-+*/]) finds something immediately followed by an arithmetic operator, i.e. +, *, -, or /. The hyphen is placed first in the character class so that it is read literally rather than as a range. The "something" in both cases is the empty string "" found between a digit and an operator, and the string is split at such a point.
An arithmetic operator followed by a digit. This is just the reverse of the above.
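The same lookaround-based split can be sketched in Python. This assumes Python 3.7 or later, where `re.split` splits on empty matches; the names `BOUNDARY` and `split_at_boundaries` are made up for the example.

```python
import re

# Zero-width boundary: digit|operator or operator|digit.
# The hyphen comes first in each class so it is taken literally.
BOUNDARY = re.compile(r"(?<=\d)(?=[-+*/])|(?<=[-+*/])(?=\d)")


def split_at_boundaries(expression: str) -> list[str]:
    """Split between digits and operators without consuming any character."""
    return BOUNDARY.split(expression)


for item in ["3+10", "4/12", "8-3", "12*1", "1+2-3*4/8"]:
    print(split_at_boundaries(item))
```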

Trimming strings in Scala

How do I trim the starting and ending character of a string in Scala?
For inputs such as ",hello" or "hello,", I need the output as "hello".
Is there any built-in method to do this in Scala?
Try
val str = " foo "
str.trim
and have a look at the documentation. If you need to get rid of the , character, too, you could try something like:
str.stripPrefix(",").stripSuffix(",").trim
Another way to clean up the front-end of the string would be
val ignorable = ", \t\r\n"
str.dropWhile(c => ignorable.indexOf(c) >= 0)
which would also take care of strings like ",,, ,,hello"
And for good measure, here's a tiny function, which does it all in one sweep from left to right through the string:
def stripAll(s: String, bad: String): String = {
  @scala.annotation.tailrec def start(n: Int): String =
    if (n == s.length) ""
    else if (bad.indexOf(s.charAt(n)) < 0) end(n, s.length)
    else start(1 + n)
  @scala.annotation.tailrec def end(a: Int, n: Int): String =
    if (n <= a) s.substring(a, n)
    else if (bad.indexOf(s.charAt(n - 1)) < 0) s.substring(a, n)
    else end(a, n - 1)
  start(0)
}
Use like
stripAll(stringToCleanUp, charactersToRemove)
e.g.,
stripAll(" , , , hello , ,,,, ", " ,") => "hello"
To trim the start and ending character in a string, use a mix of drop and dropRight:
scala> " hello,".drop(1).dropRight(1)
res4: String = hello
The drop call removes the first character, dropRight removes the last. Note that this isn't "smart" like trim is. If you don't have any extra character at the start of "hello,", you will trim it to "ello". If you need something more complicated, regex replacement is probably the answer.
If you want to trim only commas and might have more than one on either end, you could do this:
str.dropWhile(_ == ',').reverse.dropWhile(_ == ',').reverse
The use of reverse here is because there is no dropRightWhile.
If you're looking at a single possible comma, stripPrefix and stripSuffix are the way to go, as indicated by Dirk.
Given you only want to trim off invalid characters from the prefix and the suffix of a given string (not scan through the entire string), here's a tiny trimPrefixSuffixChars function to quickly perform the desired effect:
def trimPrefixSuffixChars(
string: String
, invalidCharsFunction: (Char) => Boolean = (c) => c == ' '
): String =
if (string.nonEmpty)
string
.dropWhile(char => invalidCharsFunction(char)) //trim prefix
.reverse
.dropWhile(char => invalidCharsFunction(char)) //trim suffix
.reverse
else
string
This function provides a default for the invalidCharsFunction defining only the space (" ") character as invalid. Here's what the conversion would look like for the following input strings:
trimPrefixSuffixChars(" Tx ") //returns "Tx"
trimPrefixSuffixChars(" . Tx . ") //returns ". Tx ."
trimPrefixSuffixChars(" T x ") //returns "T x"
trimPrefixSuffixChars(" . T x . ") //returns ". T x ."
If you would prefer to specify your own invalidCharsFunction, pass it in the call like so:
trimPrefixSuffixChars(",Tx. ", (c) => !c.isLetterOrDigit) //returns "Tx"
trimPrefixSuffixChars(" ! Tx # ", (c) => !c.isLetterOrDigit) //returns "Tx"
trimPrefixSuffixChars(",T x. ", (c) => !c.isLetterOrDigit) //returns "T x"
trimPrefixSuffixChars(" ! T x # ", (c) => !c.isLetterOrDigit) //returns "T x"
This attempts to simplify a number of the example solutions provided in other answers.
Someone requested a regex-version, which would be something like this:
val result = " , ,, hello, ,,".replaceAll("""[,\s]+(|.*[^,\s])[,\s]+""", "$1")
Result is: result: String = hello
The drawback of regexes (not just in this case, but always) is that they are quite hard to read for someone who is not already intimately familiar with the syntax. The code is nice and concise, though.
Another tailrec function:
def trim(s: String, char: Char): String = {
if (s.stripSuffix(char.toString).stripPrefix(char.toString) == s)
{
s
} else
{
trim(s.stripSuffix(char.toString).stripPrefix(char.toString), char)
}
}
scala> trim(",hello",',')
res12: String = hello
scala> trim(",hello,,,,",',')
res13: String = hello
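For comparison with the Scala answers above, Python's built-in `str.strip` takes a set of characters and removes them from both ends only, which is the behaviour the hand-written `stripAll` and `trimPrefixSuffixChars` functions implement. A small sketch, with `trim_chars` as a made-up wrapper name:

```python
def trim_chars(text: str, chars: str = ",") -> str:
    """Remove any of `chars` from both ends of `text`; the middle is untouched."""
    return text.strip(chars)


print(trim_chars(",hello"))                        # hello
print(trim_chars("hello,"))                        # hello
print(trim_chars(" , , , hello , ,,,, ", " ,"))    # hello
```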

Pattern matching a string in linear time

Given two strings S and T, where T is the pattern string, find whether any scrambled form of the pattern string exists as a substring of S and, if present, return the start index.
Example:
String S: abcdef
String T: efd
String S has "def", a combination of search string T: "efd".
I have found a solution with a run time of O(m*n). I am working on a linear-time solution where I used two HashMaps (a static one, maintained for string T, and a dynamic copy of it used for checking the current substring of S). I'd start checking again at the next character where it fails. But this runs in O(m*n) in the worst case.
I'd like to get some pointers to make it work in O(m+n) time. Any help would be appreciated.
First of all, I would like to know the bounds on the length of string S (m) and of pattern T (n).
There is one general idea, but the complexity of a solution based on it depends on the pattern length. Complexity varies from O(m) to O(m*n^2) for short patterns with length<=100 and O(n) for long patterns.
The fundamental theorem of arithmetic states that every integer greater than 1 can be uniquely represented as a product of prime numbers.
Idea: I assume your alphabet is the English letters, so the alphabet size is 26. Replace the first letter with the first prime, the second letter with the second prime, and so on, that is: a->2, b->3, c->5, d->7, e->11 and so on.
Let's denote the product of the primes corresponding to the letters of a string as primeProduct(string). For example, primeProduct(z) is 101, as 101 is the 26th prime number; primeProduct(abc) is 2*3*5=30, and primeProduct(cba) is also 5*3*2=30.
Why choose prime numbers? If we replaced a->2, b->3, c->4, we would not be able to tell, for example, whether 4 stands for "c" or "aa".
Solution for the short patterns case:
For the string S, we should calculate in linear time the prime product of every prefix. That is, we create an array A such that A[0] = primeProduct(S[0]), A[1] = primeProduct(S[0]S[1]), ..., A[N-1] = primeProduct(S), where N is the length of S. Sample implementation:
A[0] = getPrime(S[0]);
for(int i=1;i<S.length;i++)
A[i]=A[i-1]*getPrime(S[i]);
Searching for pattern T: calculate primeProduct(T). For every 'window' in S that has the same length as the pattern, compare its primeProduct with primeProduct(pattern). If the current window is equal to the pattern or is a scrambled form (anagram) of the pattern, the primeProducts will be the same.
Important note! We prepared array A to compute primeProduct quickly for any substring of S: primeProduct(S[i],S[i+1],...,S[j]) = getPrime(S[i])*...*getPrime(S[j]) = A[j]/A[i-1] for i>=1, and simply A[j] for i=0.
Complexity: if the pattern length is <=9, even 'zzzzzzzzz' gives 101^9<=MAX_LONG_INT, so the pattern and window products fit in a standard long type and the complexity is O(N)+O(M), where N is for calculating primeProduct of the pattern and M is for iterating over all windows in S. If length<=100, you have to add the cost of multiplying and dividing long numbers, which is why the complexity becomes O(m*n^2): the length of 101^length is O(N), and mul/div of such long numbers is O(N^2).
For long patterns with length>=1000 it is better to store a hash map (prime, degree). The array of prefixes becomes an array of hash maps, and the A[j]/A[i-1] trick becomes differenceBetween(A[j] and A[i-1] hash maps' key sets).
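The prime-product idea can be sketched in Python as follows. Note that this sketch swaps the prefix array A for a sliding window (multiply by the incoming letter's prime, divide by the outgoing one), which keeps every intermediate value bounded by the product for a single window. It assumes lowercase English letters only; `find_anagram`, `prime_product` and `PRIME_OF` are names invented here.

```python
from string import ascii_lowercase

# The first 26 primes, one per lowercase letter (a->2, b->3, ..., z->101).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
PRIME_OF = dict(zip(ascii_lowercase, PRIMES))


def prime_product(text: str) -> int:
    """Product of the primes assigned to each letter; equal for anagrams."""
    product = 1
    for ch in text:
        product *= PRIME_OF[ch]
    return product


def find_anagram(s: str, t: str) -> int:
    """Return the first index in s where some anagram of t starts, or -1."""
    n = len(t)
    if n == 0 or n > len(s):
        return -1
    target = prime_product(t)
    window = prime_product(s[:n])
    if window == target:
        return 0
    for i in range(n, len(s)):
        # Slide the window: drop s[i - n] (exact division), add s[i].
        window = window // PRIME_OF[s[i - n]] * PRIME_OF[s[i]]
        if window == target:
            return i - n + 1
    return -1


print(find_anagram("abcdef", "efd"))  # 3
```

Python integers do not overflow, but for long patterns each multiplication and division still gets more expensive, which is the cost the answer describes.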
Would this JavaScript example be linear time?
<script>
function matchT(t,s){
var tMap = [], answer = []
//map the character count in t
for (var i=0; i<t.length; i++){
var chr = t.charCodeAt(i)
if (tMap[chr]) tMap[chr]++
else tMap[chr] = 1
}
//traverse string
for (var i=0; i<s.length; i++){
if (tMap[s.charCodeAt(i)]){
var start = i, j = i + 1, tmp = []
tmp[s.charCodeAt(i)] = 1
while (tMap[s.charCodeAt(j)]){
var chr = s.charCodeAt(j++)
if (tmp[chr]){
if (tMap[chr] > tmp[chr]) tmp[chr]++
else break
}
else tmp[chr] = 1
}
if (areEqual (tmp,tMap)){
answer.push(start)
i = j - 1
}
}
}
return answer
}
//function to compare arrays
function areEqual(arr1,arr2){
if (arr1.length != arr2.length) return false
for (var i in arr1)
if (arr1[i] != arr2[i]) return false
return true
}
</script>
Output:
console.log(matchT("edf","ghjfedabcddef"))
[3, 10]
If the alphabet is not too large (say, ASCII), then there is no need to use a hash to take care of strings.
Just use a big array which is of the same size as the alphabet, and the existence checking becomes O(1). Thus the whole algorithm becomes O(m+n).
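A minimal sketch of the array-based approach in Python, assuming ASCII input (the 128-entry array is that assumption; `find_scrambled` and `ALPHABET_SIZE` are names made up for the example). It keeps one array of count differences between the current window and T, plus a counter of how many entries are nonzero, so each step of the slide costs O(1).

```python
ALPHABET_SIZE = 128  # assumption: every character is ASCII


def find_scrambled(s: str, t: str) -> int:
    """Return the first index in s where a permutation of t starts, or -1."""
    n = len(t)
    if n == 0 or n > len(s):
        return -1
    # diff[c] = (occurrences of c in the window) - (occurrences of c in t)
    diff = [0] * ALPHABET_SIZE
    for ch in t:
        diff[ord(ch)] -= 1
    for ch in s[:n]:
        diff[ord(ch)] += 1
    nonzero = sum(1 for d in diff if d != 0)
    if nonzero == 0:
        return 0
    for i in range(n, len(s)):
        # Add the incoming character, then remove the outgoing one.
        for ch, delta in ((s[i], 1), (s[i - n], -1)):
            code = ord(ch)
            if diff[code] == 0:
                nonzero += 1  # this entry is about to become nonzero
            diff[code] += delta
            if diff[code] == 0:
                nonzero -= 1  # this entry is balanced again
        if nonzero == 0:
            return i - n + 1
    return -1


print(find_scrambled("abcdef", "efd"))  # 3
```

Building the arrays takes O(n) plus a constant for the alphabet, and the slide takes O(m), so the whole search is O(m+n).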
Let us consider for the given example,
String S: abcdef
String T: efd
Create a HashSet that contains the characters present in the substring T. So, the set consists of {e, f, d}.
Generate a label for the substring T: 1e1f1d (the number of occurrences of each character plus the character itself; this can be done with a technique similar to counting sort).
Now we have to generate labels for windows of the input that have the substring's length.
Let us start from the first position, which has character a. Since it is not present in the set, we do not create any substring and move to the next character, b. The same happens for character c, and then we stop at d.
Since d is present in the HashSet, start generating labels (of the substring's length) each time such a character appears. We can do this in a separate function to avoid clearing the count array (doing this reduces the complexity from O(m*n) to O(m+n)). If at any point the input string does not contain a character of substring T, we can restart the label generation from the next position (since the positions up to the break cannot be part of the anagram).
So, by generating the labels we can solve the problem in linear O(m+n) time complexity.
m: length of the input string,
n: length of the sub string.
I used the code below for the pattern-searching (strstr) question on GFG. It was accepted on all test cases and works in linear time.
// { Driver Code Starts
import java.util.*;
class Implement_strstr
{
public static void main(String args[])
{
Scanner sc = new Scanner(System.in);
int t = sc.nextInt();
sc.nextLine();
while(t>0)
{
String line = sc.nextLine();
String a = line.split(" ")[0];
String b = line.split(" ")[1];
GfG g = new GfG();
System.out.println(g.strstr(a,b));
t--;
}
}
}// } Driver Code Ends
class GfG
{
//Function to locate the occurrence of the string x in the string s.
int strstr(String a, String d)
{
if(a.equals("") && d.equals("")) return 0;
if(a.length()==1 && d.length()==1 && a.equals(d)) return 0;
if(d.length()==1 && a.charAt(a.length()-1)==d.charAt(0)) return a.length()-1;
int t=0;
int pl=-1;
boolean b=false;
int fl=-1;
for(int i=0;i<a.length();i++)
{
if(pl!=-1)
{
if(i==pl+1 && a.charAt(i)==d.charAt(t))
{
t++;
pl++;
if(t==d.length())
{
b=true;
break;
}
}
else
{
fl=-1;
pl=-1;
t=0;
}
}
else
{
if(a.charAt(i)==d.charAt(t))
{
fl=i;
pl=i;
t=1;
}
}
}
return b?fl:-1;
}
}
Here is the link to the question https://practice.geeksforgeeks.org/problems/implement-strstr/1
