Finding sub-strings in Java 6 - string

I looked through the String API in Java 6 and I did not find any method for computing how many times a specific sub-string appears within a given String.
For example, I would like to know how many times "is" or "not" appears in the string "noisxxnotyynotxisi".
I can do it the long way with a loop, but I would like to know whether there is a simpler way.
Thanks.
Edit: I'm using Java 6.

The org.apache.commons.lang.StringUtils.countMatches method from Apache Commons Lang is probably the simplest option.

Without using an external library, you can call String.indexOf(String str, int fromIndex) in a loop.
Update: this example fully works.
/**
* @author The Elite Gentleman
* @since 31 March 2011
*
*/
public class Test {
private static final String STR = "noisxxnotyynotxisi";
public static int count(String str) {
int count = 0;
int index = -1;
//if (STR.lastIndexOf(str) == -1) {
// return count;
//}
while ((index = STR.indexOf(str, index + 1)) != -1) {
count++;
}
return count;
}
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
System.out.println(Test.count("is"));
System.out.println(Test.count("no"));
}
}
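The class above hard-codes the text in STR and, because it advances one character at a time, also counts overlapping matches. A minimal sketch of a reusable variant follows. The names SubstringCounter and countOccurrences are made up for illustration, and the sketch assumes non-overlapping matches are wanted, so it skips past each match before searching again.

```java
public class SubstringCounter {
    /**
     * Counts non-overlapping occurrences of needle in haystack.
     * Hypothetical helper, not part of the String API.
     */
    public static int countOccurrences(String haystack, String needle) {
        if (needle.isEmpty()) {
            return 0; // an empty needle would otherwise match at every position
        }
        int count = 0;
        int index = haystack.indexOf(needle);
        while (index != -1) {
            count++;
            // Resume searching right after the match just found.
            index = haystack.indexOf(needle, index + needle.length());
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "noisxxnotyynotxisi";
        System.out.println(countOccurrences(text, "is"));  // 2
        System.out.println(countOccurrences(text, "not")); // 2
    }
}
```

Replacing `index + needle.length()` with `index + 1` gives back the overlapping behaviour of the original loop.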

You can do this, but a loop would be faster.
String text = "noisxxnotyynotxisinono";
String search = "no";
int count = text.split(search,-1).length-1;
System.out.println(Arrays.toString(text.split(search,-1)));
System.out.println("count= " + count);
prints
[, isxx, tyy, txisi, , ]
count= 5
As you can see, this is also correct when the text starts or ends with the search value. The -1 argument stops split from discarding trailing empty strings. Note that split treats its first argument as a regular expression, so a search value containing regex metacharacters has to be escaped first.
You can use a loop with indexOf(), which is more efficient but not as simple.
BTW: Java 5.0 has been EOL since Aug 2007. Perhaps it is time to look at Java 6 (though the docs are very similar).
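Because String.split interprets its first argument as a regular expression, a search string such as "." or "(" would not be matched literally by the snippet above. A small sketch of a safer wrapper, assuming a hypothetical helper named countBySplit, quotes the search string with Pattern.quote first:

```java
import java.util.regex.Pattern;

public class SplitCounter {
    /** Counts occurrences of search in text by splitting on the literal search string. */
    public static int countBySplit(String text, String search) {
        if (search.isEmpty()) {
            throw new IllegalArgumentException("search must not be empty");
        }
        // Pattern.quote makes split treat search as plain text, not as a regex.
        // The -1 limit keeps trailing empty strings, so matches at the end are counted.
        return text.split(Pattern.quote(search), -1).length - 1;
    }

    public static void main(String[] args) {
        System.out.println(countBySplit("noisxxnotyynotxisinono", "no")); // 5
        System.out.println(countBySplit("a.b.c", "."));                   // 2
    }
}
```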

Related

I need to create a function in Groovy that has a single integer as a parameter and returns the number of significant figures it contains

Long story short, I'm working in a system that only works with Groovy in its expression editor, and I need to create a function that returns the number of significant figures an integer has. I've found the following function on Stack Overflow for Java, but it doesn't seem like Groovy (or the system itself) likes the regex:
String myfloat = "0.0120";
String [] sig_figs = myfloat.split("(^0+(\\.?)0*|(~\\.)0+$|\\.)");
int sum = 0;
for (String fig : sig_figs)
{
sum += fig.length();
}
return sum;
I've since tried to convert it into a more Groovy-esque syntax to be compatible, and have produced the following:
def sum = 0;
def myint = toString(mynum);
def String[] sig_figs = myint.split(/[^0+(\\.?)0*|(~\\.)0+$|\\.]/);
for (int i = 0; i <= sig_figs.size();i++)
{
sum += sig_figs[i].length();
}
return(sum);
Note that 'mynum' is the parameter of the method.
It should also be noted that this system offers very little visibility into which Groovy functions are available, so the solution likely needs to be as basic as possible.
Any help would be greatly appreciated. Thanks!
I think this is the regex you need:
def num = '0.0120'
def splitted = num.split(/(^0+(\.?)0*|(~\.)0+$|\.)/)
def sf = splitted*.length().sum()
It's been a while since I've had to think about significant figures, so sorry if I have the wrong idea. I've made two regular expressions that, combined, should count the number of significant figures in a string representing a decimal (sorry, I'm no regex wizard). It doesn't handle commas; you would have to strip those out first.
This first regex matches all significant figures before the decimal point
([1-9]+\d*[1-9]|[1-9]+)
And this second regex matches all significant figures after the decimal point:
\.((\d*[1-9]+)+)?
If you add up the lengths of the first capture group (or 0 when no match) for both matches, then it should give you the number of significant figures.
Example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class SigFigs {
private static final Pattern pattern1 = Pattern.compile("([1-9]+\\d*[1-9]|[1-9]+)");
private static final Pattern pattern2 = Pattern.compile("\\.((\\d*[1-9]+)+)?");
public static int getSignificantFigures(String number) {
int sigFigs = 0;
for (int i=0; i < 2; i++) {
Matcher matcher = (i == 0 ? pattern1 : pattern2).matcher(number);
if (matcher.find()) {
try {
String s = matcher.group(1);
if (s != null) sigFigs += s.length();
} catch (IndexOutOfBoundsException ignored) { }
}
}
return sigFigs;
}
public static void main(String[] args) {
System.out.println(getSignificantFigures("0305.44090")); // 7 sig. figs
}
}
Of course, using two matches is suboptimal (like I've said, I'm not crazy good at regex like some I could mention), but it's fairly robust and readable.
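Since the question is about an integer parameter rather than a decimal string, the regex can also be avoided altogether. The following is a minimal sketch in plain Java, which Groovy generally accepts too, though that is untested in the asker's system. The names IntSigFigs and countSigFigs are invented here, and the sketch assumes the common convention that trailing zeros of an integer are not significant and that 0 itself counts as one figure.

```java
public class IntSigFigs {
    /** Counts significant figures of an integer, ignoring sign and trailing zeros. */
    public static int countSigFigs(int number) {
        // Widen to long first so Math.abs also works for Integer.MIN_VALUE.
        String digits = Long.toString(Math.abs((long) number));
        int end = digits.length();
        // Drop trailing zeros, but always keep at least one digit.
        while (end > 1 && digits.charAt(end - 1) == '0') {
            end--;
        }
        return end;
    }

    public static void main(String[] args) {
        System.out.println(countSigFigs(1200)); // 2
        System.out.println(countSigFigs(1020)); // 3
        System.out.println(countSigFigs(-450)); // 2
    }
}
```

It only uses toString, length and charAt, which fits the constraint that the solution be as basic as possible.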

Find the minimum number of messages that need to be sent

I recently encountered a problem in one of my coding interview tests. The problem is as follows.
Suppose there is a service to send messages to a user. Each message has the length of maximum 30 characters. This service receives a complete message and then breaks it into sub-messages, each of size 30 characters at most. But there is an issue with the service. It doesn't guarantee the order in which the sub-messages are received by the user. Hence, for every sub-message, it appends a suffix (k/n) where k denotes the kth sub-message out of the n sub-messages. This suffix is also considered when counting the number of characters in the sub-message which cannot exceed 30. Find the minimum number of sub-messages required to send.
Eg-1:
message: The quick brown fox jumps over the lazy dog
The first sub-message could be: The quick brown fox jumps (1/2)
but that is incorrect, as it has 31 characters and so exceeds the limit of 30.
So, the correct sub-messages are:
The quick brown fox (1/2)
jumps over the lazy dog (2/2)
So, the answer is 2.
Eg-2:
message: The quick brown fox jumps over the lazy tortoise
So, the correct sub-messages are:
The quick brown fox (1/3)
jumps over the lazy (2/3)
tortoise (3/3)
So, the answer is 3.
Eg-3:
message: Hello My name is
sub-message: Hello My name is
Answer = 1.
Note: A word cannot be broken across sub-messages. Assume no word is greater than 30 characters in length. If it's a single message, there is no need to use the suffix.
My approach: If the total character length of the string is at most 30, then return 1. If not, then build each sub-message until the character count reaches 30, checking word by word. But now it gets complicated, as I don't know the value of n in the suffix. Is there a simpler way to approach the problem?
Thanks for posting this, I do enjoy these sorts of problems.
As domen mentioned above, there is a bit of a challenge here in that you do not know how many lines are required. Thus you do not know whether to allow for 2 (or more) digits for the message number / total message count. Also, could you use hexadecimal (16 messages require a single digit), or even a base 62 number format (0-9, then A-Z, followed by a-z)?
You could of course guess and say that if the input is more than, say, 200 characters, then you use a two digit message number. But if the message was a single letter followed by a single space repeated 100 times, you could probably get away with single digit message numbers.
So, you might find that you need to run the algorithm a couple of times. I will assume that a single digit message number is acceptable for this problem; you can enhance my solution to use base 62 message numbers if you like.
My approach uses 2 classes:
MessageLine: a class that represents a single line of the message.
MessageSender: a class that collects the MessageLine(s). It has a helper method that processes the message, and a getter that returns the resulting list of MessageLines.
Here is the main MessageSender class. If you run it, you can pass a message on the command line for it to process.
package com.gtajb.stackoverflow;
import java.util.LinkedList;
public class MessageSender {
public static void main(String[] args) {
if (args.length == 0) {
System.out.println("Please supply a message to send");
System.exit(1);
}
// Collect the command line parameters into a single string.
StringBuilder sb = new StringBuilder();
boolean firstWord = true;
for (String s: args) {
if (!firstWord) {
sb.append(" ");
}
firstWord = false;
sb.append(s);
}
// Process the input String and create the MessageSender object.
MessageSender ms = new MessageSender(sb.toString());
System.out.println("Input message: " + sb.toString());
// Retrieve the blocked message and output it.
LinkedList<MessageLine> msg = ms.getBlockedMessage();
int lineNo = 0;
for (MessageLine ml : msg) {
lineNo += 1;
System.out.printf("%2d: %s\n", lineNo, ml.getFormattedLine(msg.size()));
}
}
private String msg;
public MessageSender(String msg) {
this.msg = msg;
processMessage();
}
private LinkedList<MessageLine> blockedMessage = new LinkedList<MessageLine> ();
public LinkedList<MessageLine> getBlockedMessage() {
return blockedMessage;
}
private static final int LINE_MAX_SIZE = 30;
/**
* A private helper method that processes the supplied message when
* the object is constructed.
*/
private void processMessage() {
// Split the message into words and work out how long the message is.
String [] words = msg.split("\\s+");
int messageLength = 0;
for (String w: words) {
messageLength += w.length();
}
messageLength += words.length - 1; // Add in the number of words minus one to allow for the single spaces.
// Can we get away with a single MessageLine? A message of exactly LINE_MAX_SIZE characters still fits.
if (messageLength <= LINE_MAX_SIZE) {
// A single message line is good enough.
MessageLine ml = new MessageLine(1);
blockedMessage.add(ml);
for (String w: words) {
ml.add(w);
}
} else {
// Multiple MessageLines will be required.
int lineNo = 1;
MessageLine ml = new MessageLine(lineNo);
blockedMessage.add(ml);
for (String w: words) {
// check if this word will blow the max line length.
// The maximum number of lines is 2. It can be anything that is > 1.
if (ml.getFormattedLineLength(2) + w.length() + 1 > LINE_MAX_SIZE) {
// The word will blow the line length, so create a new line.
lineNo += 1;
ml = new MessageLine(lineNo);
blockedMessage.add(ml);
}
ml.add(w);
}
}
}
}
and here is the Message Line class:
package com.gtajb.stackoverflow;
import java.util.LinkedList;
public class MessageLine extends LinkedList<String> {
private int lineNo;
public MessageLine(int lineNo) {
this.lineNo = lineNo;
}
/**
* Add a new word to this message line.
* @param word the word to add
* @return true if the collection is modified.
*/
public boolean add(String word) {
if (word == null || word.trim().length() == 0) {
return false;
}
return super.add(word.trim());
}
/**
* Return the formatted message length.
* @param totalNumLines the total number of lines in the message.
* @return the length of this line when formatted.
*/
public int getFormattedLineLength(int totalNumLines) {
return getFormattedLine(totalNumLines).length();
}
/**
* Return the formatted line optionally with the line count information.
* @param totalNumLines the total number of lines in the message.
* @return the formatted line.
*/
public String getFormattedLine(int totalNumLines) {
boolean firstWord = true;
StringBuilder sb = new StringBuilder();
for (String w : this) {
if (! firstWord) {
sb.append (" ");
}
firstWord = false;
sb.append(w);
}
if (totalNumLines > 1) {
sb.append (String.format(" (%d/%d)", lineNo, totalNumLines));
}
return sb.toString();
}
}
I tested your scenarios and it seems to produce the correct result.
Let me know if we get the job. :-)
You can binary-search on the total number of submessages. That is, start with two numbers L and H, such that you know that L submessages are not enough, and that H submessages are enough, and see whether their average (L+H)/2 is enough by trying to construct a solution under the assumption that that many submessages are involved: If it is, make that the new H, otherwise make it the new L. Stop as soon as H = L+1: H is then the smallest number of submessages that works, so construct an actual solution using that many submessages. This will require O(n log n) time.
To get initial values for L and H, you could start at 1 and keep doubling until you get a high enough number. The first value that is large enough to work becomes your H, and the previous one your L.
BTW, the constraints you give are not enough to ensure a solution exists: For example, an input consisting of two 29-letter words separated by a space has no solution.
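The suffix length depends on n only through its number of digits, so one alternative to the binary search described above is to try each digit count in turn: pack the words greedily assuming n has d digits, and accept the result once the number of lines actually fits in d digits. The following is a sketch under that assumption, not either answerer's code. The names SubMessageCounter, pack and minSubMessages are invented here, and -1 is returned when no valid split exists.

```java
public class SubMessageCounter {
    private static final int MAX_LEN = 30;

    /** Number of decimal digits in a positive integer. */
    private static int digits(int n) {
        return String.valueOf(n).length();
    }

    /**
     * Greedily packs the words, assuming the total n is written with totalDigits digits.
     * Returns the number of lines used, or -1 if a word cannot fit next to its suffix.
     */
    private static int pack(String[] words, int totalDigits) {
        int line = 1;
        int used = 0; // characters of text already placed on the current line
        for (String w : words) {
            // The suffix " (k/n)" has 4 fixed characters plus the digits of k and n.
            int cap = MAX_LEN - (4 + digits(line) + totalDigits);
            int needed = (used == 0) ? w.length() : used + 1 + w.length();
            if (needed <= cap) {
                used = needed;
                continue;
            }
            if (used == 0) {
                return -1; // the word does not fit even on an empty line
            }
            line++;
            cap = MAX_LEN - (4 + digits(line) + totalDigits);
            if (w.length() > cap) {
                return -1;
            }
            used = w.length();
        }
        return line;
    }

    public static int minSubMessages(String message) {
        String normalized = message.trim().replaceAll("\\s+", " ");
        if (normalized.length() <= MAX_LEN) {
            return 1; // a single message needs no suffix
        }
        String[] words = normalized.split(" ");
        for (int d = 1; d <= 9; d++) {
            int lines = pack(words, d);
            if (lines == -1) {
                return -1; // more digits only shrink the room, so give up
            }
            if (digits(lines) <= d) {
                return lines;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(minSubMessages("The quick brown fox jumps over the lazy dog"));      // 2
        System.out.println(minSubMessages("The quick brown fox jumps over the lazy tortoise")); // 3
    }
}
```

For an input of two 29-letter words the sketch returns -1, which matches the observation that such an input has no solution.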

How do I delete a word with Recursion and count the times it deletes?

I've completed about half of my assignment where I have to count the "chickens" in a string, remove the chickens, and return the amount of times I have to remove them.
public static int countChickens(String word)
{
int val = word.indexOf("chicken");
int count = 0;
if(val > -1){
count++;
word = word.substring(val + 1);
//I'm aware the following line doesn't work. It's my best guess.
//word.remove.indexOf("chicken");
val = word.indexOf("chicken");
}
return count;
}
As is, the program counts the correct amount of chickens in the word itself. (Sending it "afunchickenhaschickenfun" returns 2.) However, I need it to be able to return 2 if I send it something like "chichickencken" because it removed the first chicken, and then the second chicken came into play. How do I do the remove part?
Not tested and written as a rough sketch, but it should give you a better idea of how to approach this.
static int numberOfChickens = 0;
public static void countAndReplaceChicken(String word)
{
int initCheck = word.indexOf("chicken");
if (initCheck > -1)
{
// cut out the first "chicken" by joining the text before and after it
word = word.substring(0, initCheck) + word.substring(initCheck + "chicken".length());
numberOfChickens++;
int recursionCheck = word.indexOf("chicken");
if (recursionCheck > -1)
countAndReplaceChicken(word);
}
}
Okay, the teacher showed us how to do it a few days later. If I understood David Lee's code right, this is just a simplified way of what he did.
public static int countChickens(String word)
{
int val = word.indexOf("chicken");
if(val > -1){
return 1 + countChickens(word.substring(0, val) + word.substring(val + 7));
}
return 0;
}
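The same recursion works for any word, not just "chicken". Here is a small sketch that generalizes the method above; the names RemovalCounter and countRemovals are hypothetical, and the hard-coded 7 is replaced by the length of the target.

```java
public class RemovalCounter {
    /** Removes the first occurrence of target, then recurses on what is left. */
    public static int countRemovals(String text, String target) {
        if (target.isEmpty()) {
            return 0; // guard: an empty target would recurse forever
        }
        int pos = text.indexOf(target);
        if (pos == -1) {
            return 0; // base case: nothing left to remove
        }
        // Join the text before and after the match, which may form a new match.
        String remaining = text.substring(0, pos) + text.substring(pos + target.length());
        return 1 + countRemovals(remaining, target);
    }

    public static void main(String[] args) {
        System.out.println(countRemovals("chichickencken", "chicken"));           // 2
        System.out.println(countRemovals("afunchickenhaschickenfun", "chicken")); // 2
    }
}
```

For "chichickencken", the first call removes the inner "chicken" and leaves "chi" + "cken", which is a second "chicken" for the next call to remove.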

Noob, creating string method's indexof and substring

As an assignment we are supposed to create methods that copy what string methods do. We are just learning methods and I understand them, but am having trouble getting it to work.
given:
private String st = "";
public void setString(String p){
st = p;
}
public String getString(){
return st;
}
I need to create public int indexOf(char index){} and public String substring(int start, int end){}. I've successfully made charAt and equals, but I need some help. We are only allowed to use the String methods charAt() and length(), and the + operator. No arrays or anything more advanced either. This is how I'm guessing you start these methods:
public int indexOf(char index){
for(int i = 0; i < st.length(); i++){
return index;
}
return 0;
}
public String substring(int start, int end){
for(int i = 0; i < st.length(); i++){
}
return new String(st + start);
}
thanks!
here's my two working methods:
public boolean equals(String index){
for(int a = 0; a < index.length() && a < st.length(); a++){
if(index.charAt(a) == st.charAt(a) && index.length() == st.length()){
return true;
}
else{
return false;
}
}
return false;
}
public char charAt(int index){
if(index >= 0 && index <= st.length() - 1)
return st.charAt(index);
else
return 0;
}
For your indexOf method, you're on the right track. You'll want to modify the code in the loop. Since you're looping through the whole String, and you only have two methods available, which one will help you most to get the characters from the String? Look at your other methods (equals and charAt) to see how you did them; that might give you a hint. Remember, you want to find a single character in your String and return the index at which you found it.
For your substring method, what you need to do is collect all the characters from the start index up to the end index. A loop is a good start, but you will need a base String to hold your progress in (you will need an empty String). The beginning and end points of your loop need looking at. For substring, you want to get everything starting at start and everything before end. For instance, if I do the following:
String myString = "Racecar";
String sub = myString.substring(1, 4);
System.out.println(sub);
I should get the output ace.
I would give you the answer, but I think helping guide your reasoning will give you more benefit. Enjoy your assignment!
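For readers who want to compare their attempt afterwards, here is one possible sketch that follows the hints above and uses only charAt(), length() and the + operator. The wrapper class name MyString is an assumption, and no bounds checking is done, so an invalid start or end simply lets charAt throw.

```java
public class MyString {
    private String st = "";

    public void setString(String p) {
        st = p;
    }

    public String getString() {
        return st;
    }

    /** Returns the index of the first occurrence of target, or -1 if it is absent. */
    public int indexOf(char target) {
        for (int i = 0; i < st.length(); i++) {
            if (st.charAt(i) == target) {
                return i;
            }
        }
        return -1;
    }

    /** Returns the characters from start (inclusive) up to end (exclusive). */
    public String substring(int start, int end) {
        String result = ""; // the empty base String that collects the characters
        for (int i = start; i < end; i++) {
            result = result + st.charAt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        MyString s = new MyString();
        s.setString("Racecar");
        System.out.println(s.substring(1, 4)); // ace
        System.out.println(s.indexOf('c'));    // 2
    }
}
```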

Sorting a string using another sorting order string [closed]

I saw this in an interview question:
Given a sorting order string, you are asked to sort the input string based on the given sorting order string.
For example, if the sorting order string is dfbcae
and the input string is abcdeeabc
the output should be dbbccaaee.
Any ideas on how to do this in an efficient way?
The Counting Sort option is pretty cool, and fast when the string to be sorted is long compared to the sort order string.
create an array where each index corresponds to a letter in the alphabet, this is the count array
for each letter in the sort target, increment the index in the count array which corresponds to that letter
for each letter in the sort order string
add that letter to the end of the output string a number of times equal to its count in the count array
Algorithmic complexity is O(n + m), where n is the length of the string to be sorted and m is the length of the sort order string. As the Wikipedia article explains, we're able to beat the lower bound on standard comparison-based sorting because this isn't a comparison-based sort.
Here's some pseudocode.
int[26] countArray; // one counter per lowercase letter, all starting at 0
foreach(char c in sortTarget)
{
countArray[c - 'a']++;
}
int head = 0;
foreach(char c in sortOrder)
{
while(countArray[c - 'a'] > 0)
{
sortTarget[head] = c;
head++;
countArray[c - 'a']--;
}
}
Note: this implementation requires that both strings contain only lowercase characters.
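The pseudocode above translates almost directly into Java. The following is a minimal sketch with made-up names (OrderSort, sortByOrder). Like the pseudocode, it assumes both strings contain only the lowercase letters a to z, and it builds a new string, so letters that never appear in the sort order are dropped from the result.

```java
public class OrderSort {
    /** Counting sort of target according to the letter order given in order. */
    public static String sortByOrder(String target, String order) {
        int[] counts = new int[26]; // one counter per lowercase letter
        for (int i = 0; i < target.length(); i++) {
            counts[target.charAt(i) - 'a']++;
        }
        StringBuilder out = new StringBuilder(target.length());
        for (int i = 0; i < order.length(); i++) {
            char c = order.charAt(i);
            // Emit the letter as many times as it was counted.
            while (counts[c - 'a'] > 0) {
                out.append(c);
                counts[c - 'a']--;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortByOrder("abcdeeabc", "dfbcae")); // dbbccaaee
    }
}
```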
Here's a nice easy to understand algorithm that has decent algorithmic complexity.
For each character in the sort order string
scan string to be sorted, starting at first non-ordered character (you can keep track of this character with an index or pointer)
when you find an occurrence of the specified character, swap it with the first non-ordered character
increment the index for the first non-ordered character
This is O(n*m), where n is the length of the string to be sorted and m is the length of the sort order string. We're able to beat the lower bound on comparison based sorting because this algorithm doesn't really use comparisons. Like Counting Sort it relies on the fact that you have a predefined finite external ordering set.
Here's some pseudocode:
int head = 0;
foreach(char c in sortOrder)
{
for(int i = head; i < sortTarget.length; i++)
{
if(sortTarget[i] == c)
{
// swap i with head
char temp = sortTarget[head];
sortTarget[head] = sortTarget[i];
sortTarget[i] = temp;
head++;
}
}
}
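A runnable Java rendering of the swap-based pseudocode might look like the sketch below. The names SwapSort and sortBySwapping are invented here. Because Java strings are immutable, it works on a char array copy. Letters that are missing from the sort order are not dropped; they end up after the ordered part.

```java
public class SwapSort {
    /** Moves the letters of target into the order given by order, using swaps. */
    public static String sortBySwapping(String target, String order) {
        char[] chars = target.toCharArray();
        int head = 0; // index of the first character not yet in its final place
        for (int k = 0; k < order.length(); k++) {
            char c = order.charAt(k);
            for (int i = head; i < chars.length; i++) {
                if (chars[i] == c) {
                    // Swap the match with the first non-ordered character.
                    char temp = chars[head];
                    chars[head] = chars[i];
                    chars[i] = temp;
                    head++;
                }
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(sortBySwapping("abcdeeabc", "dfbcae")); // dbbccaaee
    }
}
```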
In Python, you can just create an index and use it as the sort key:
order = 'dfbcae'
input = 'abcdeeabc'
index = dict([ (y,x) for (x,y) in enumerate(order) ])
# Python 3: sorted() takes a key function (the cmp argument from Python 2 is gone)
output = sorted(input, key=lambda ch: index[ch])
print('input=', ''.join(input))
print('output=', ''.join(output))
gives this output:
input= abcdeeabc
output= dbbccaaee
Use binary search to find all the "split points" between different letters, then use the length of each segment directly. This only works if equal letters are already contiguous in the input, for example because it is already sorted alphabetically. In that case it will be asymptotically faster than naive counting sort, but harder to implement:
Use an array of size 26*2 to store the begin and end of each letter;
Inspect the middle element and see if it is different from the element to its left. If so, this is the begin for the middle element and the end for the element before it;
Throw away the segments with identical begin and end (if there are any) and recursively apply this algorithm to the rest.
Since there are at most 25 "split"s, you won't have to do the search for more than 25 segments, and each search is O(log n). Since this is a constant times O(log n), the algorithm is O(log n).
And of course, just using counting sort will be easier to implement:
Use an array of size 26 to record the number of different letters;
Scan the input string;
Output the string in the given sorting order.
This is O(n), n being the length of the string.
Interview questions are generally about thought process and don't usually care too much about language features, but I couldn't resist posting a VB.Net 4.0 version anyway.
"Efficient" can mean two different things. The first is "what's the fastest way to make a computer execute a task" and the second is "what's the fastest that we can get a task done". They might sound the same but the first can mean micro-optimizations like int vs short, running timers to compare execution times and spending a week tweaking every millisecond out of an algorithm. The second definition is about how much human time would it take to create the code that does the task (hopefully in a reasonable amount of time). If code A runs 20 times faster than code B but code B took 1/20th of the time to write, depending on the granularity of the timer (1ms vs 20ms, 1 week vs 20 weeks), each version could be considered "efficient".
Dim input = "abcdeeabc"
Dim sort = "dfbcae"
Dim SortChars = sort.ToList()
Dim output = New String((From c In input.ToList() Select c Order By SortChars.IndexOf(c)).ToArray())
Trace.WriteLine(output)
Here is my solution to the question
import java.util.*;
import java.io.*;
class SortString
{
public static void main(String arg[])throws IOException
{
BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
// System.out.println("Enter 1st String :");
// String s1=br.readLine();
// System.out.println("Enter 2nd String :");
// String s2=br.readLine();
String s1="tracctor";
String s2="car";
String com="";
String uncom="";
for(int i=0;i<s2.length();i++)
{
if(s1.contains(""+s2.charAt(i)))
{
com=com+s2.charAt(i);
}
}
System.out.println("Com :"+com);
for(int i=0;i<s1.length();i++)
if(!com.contains(""+s1.charAt(i)))
uncom=uncom+s1.charAt(i);
System.out.println("Uncom "+uncom);
System.out.println("Combined "+(com+uncom));
HashMap<String,Integer> h1=new HashMap<String,Integer>();
for(int i=0;i<s1.length();i++)
{
String m=""+s1.charAt(i);
if(h1.containsKey(m))
{
int val=(int)h1.get(m);
val=val+1;
h1.put(m,val);
}
else
{
h1.put(m,new Integer(1));
}
}
StringBuilder x=new StringBuilder();
for(int i=0;i<com.length();i++)
{
if(h1.containsKey(""+com.charAt(i)))
{
int count=(int)h1.get(""+com.charAt(i));
while(count!=0)
{x.append(""+com.charAt(i));count--;}
}
}
x.append(uncom);
System.out.println("Sort "+x);
}
}
Here is my version, which is O(n) in time. Instead of unordered_map, I could have just used an array of constant size, i.e. int char_count[256] (and done ++char_count[ch - 'a']), assuming the input strings contain only lowercase ASCII letters.
#include <string>
#include <unordered_map>
using namespace std;
string SortOrder(const string& input, const string& sort_order) {
unordered_map<char, int> char_count;
for (auto ch : input) {
++char_count[ch];
}
string res = "";
for (auto ch : sort_order) {
unordered_map<char, int>::iterator it = char_count.find(ch);
if (it != char_count.end()) {
string s(it->second, it->first);
res += s;
}
}
return res;
}
private static String sort(String target, String reference) {
final Map<Character, Integer> referencesMap = new HashMap<Character, Integer>();
for (int i = 0; i < reference.length(); i++) {
char key = reference.charAt(i);
if (!referencesMap.containsKey(key)) {
referencesMap.put(key, i);
}
}
List<Character> chars = new ArrayList<Character>(target.length());
for (int i = 0; i < target.length(); i++) {
chars.add(target.charAt(i));
}
Collections.sort(chars, new Comparator<Character>() {
@Override
public int compare(Character o1, Character o2) {
return referencesMap.get(o1).compareTo(referencesMap.get(o2));
}
});
StringBuilder sb = new StringBuilder();
for (Character c : chars) {
sb.append(c);
}
return sb.toString();
}
In C# I would just use the IComparer interface and leave it to Array.Sort:
void Main()
{
// we define the IComparer class to define the sort order
var sortOrder = new SortOrder("dfbcae");
var testOrder = "abcdeeabc".ToCharArray();
// sort the array using Array.Sort
Array.Sort(testOrder, sortOrder);
// new string(char[]) prints the characters; ToString() on a char[] would print the type name
Console.WriteLine(new string(testOrder));
}
}
public class SortOrder : IComparer
{
string sortOrder;
public SortOrder(string sortOrder)
{
this.sortOrder = sortOrder;
}
public int Compare(object obj1, object obj2)
{
var obj1Index = sortOrder.IndexOf((char)obj1);
var obj2Index = sortOrder.IndexOf((char)obj2);
if(obj1Index == -1 || obj2Index == -1)
{
throw new Exception("character not found");
}
if(obj1Index > obj2Index)
{
return 1;
}
else if (obj1Index == obj2Index)
{
return 0;
}
else
{
return -1;
}
}
}

Resources