When trying to add a space back into a tokenized String, why are two spaces added into my output?

Beginner here. Sorry for the vague title but the code should put my question into perspective.
public static void main(String [] args)
{
String sentence = "hi. how are you! i'm just dandy.";
String tokenSent;
tokenSent = sentenceCapitalizer(sentence);
System.out.println(tokenSent);
}
public static String sentenceCapitalizer(String theSentence)
{
StringTokenizer strTokenizer = new StringTokenizer(theSentence, ".!", true);
String token = null;
String totalToken = "";
String ch = "";
while(strTokenizer.hasMoreTokens())
{
token = strTokenizer.nextToken().trim();
token = token.replace(token.charAt(0), Character.toUpperCase(token.charAt(0)));
StringBuilder str = new StringBuilder(token);
str.append(" ");
totalToken += str;
}
return totalToken;
}
OUTPUT AS IS: Hi . How are you ! I'm just dandy .
I was able to capitalize the first letter of each sentence but I'm wanting the output to keep the same format as the original String. The problem is that it puts a space before and after the ending punctuation. Is there any way to fix this problem using only a StringBuilder and/or StringTokenizer? Thank you for your time.

You can do this.
String delim = ".!";
StringTokenizer strTokenizer = new StringTokenizer(theSentence, delim, true);
Then, inside the loop, append the space only when the token is one of the delimiters:
if(delim.contains(token))
str.append(" ");

When you declare new StringTokenizer(theSentence, ".!", true), the third argument (true) tells the tokenizer to return the delimiter characters as tokens as well. Since you append a space to every token, including the delimiters, that output is expected.
Check whether the token is a delimiter character and append the space only in that case (delims here is the delimiter string ".!"):
if(token.length() == 1 &&
delims.indexOf(token.charAt(0)) != -1)
str.append(" "); //we just hit the delim char, add space
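Putting the two answers together, a complete version of the method could look roughly like the sketch below. The class name SentenceCapitalizerSketch is made up for illustration, and the final trim() is my own assumption about how to drop the space after the last delimiter. Note that String.replace(char, char) replaces every occurrence of the character, not just the first one, so the sketch upper-cases the first character explicitly instead.

```java
import java.util.StringTokenizer;

// Hypothetical class name, used only for this sketch.
class SentenceCapitalizerSketch {

    // Capitalizes the first letter of each sentence and puts a single
    // space after each delimiter, using only StringTokenizer and StringBuilder.
    static String sentenceCapitalizer(String theSentence) {
        String delims = ".!";
        StringTokenizer tokenizer = new StringTokenizer(theSentence, delims, true);
        StringBuilder result = new StringBuilder();
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken().trim();
            if (token.isEmpty()) {
                continue; // only whitespace between two delimiters
            }
            if (token.length() == 1 && delims.indexOf(token.charAt(0)) != -1) {
                // Delimiter token: append it, followed by one separating space.
                result.append(token).append(' ');
            } else {
                // Sentence token: upper-case only its first character.
                result.append(Character.toUpperCase(token.charAt(0)))
                      .append(token.substring(1));
            }
        }
        // Assumption: the space after the final delimiter is not wanted.
        return result.toString().trim();
    }

    public static void main(String[] args) {
        // prints: Hi. How are you! I'm just dandy.
        System.out.println(sentenceCapitalizer("hi. how are you! i'm just dandy."));
    }
}
```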

Related

Replacing the number in a string

If my string is, let's say, "Alfa1234Beta", how can I convert all the digits into "_"?
For example, "Alfa1234Beta" should become "Alfa____Beta".
Going with the Regex approach pointed out by others is possibly OK for your scenario. Mind you, however, that Regex tends to be overused. A hand-rolled approach could look like this:
static string ReplaceDigits(string str)
{
StringBuilder sb = null;
for (int i = 0; i < str.Length; i++)
{
if (Char.IsDigit(str[i]))
{
if (sb == null)
{
// Seen a digit, allocate StringBuilder, copy non-digits we might have skipped over so far.
sb = new StringBuilder();
if (i > 0)
{
sb.Append(str, 0, i);
}
}
// Replace current character (a digit)
sb.Append('_');
}
else
{
if (sb != null)
{
// Seen some digits (being replaced) already. Collect non-digits as well.
sb.Append(str[i]);
}
}
}
if (sb != null)
{
return sb.ToString();
}
return str;
}
It is more lightweight than Regex and only allocates when there is actually something to replace. So go ahead and use the Regex version if you like. If profiling shows that it is too heavyweight, you can use something like the above. YMMV.
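You can run a for loop over the string and test each character with the following check. The condition is true for non-digits, so keep those characters and replace all the others with _:
if (!System.Text.RegularExpressions.Regex.IsMatch(i.ToString(), "^[0-9]$"))
Here the variable i is the current character in the for loop. Regex.IsMatch expects a string, hence the ToString() call.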
You can use this:
var s = "Alfa1234Beta";
var s2 = System.Text.RegularExpressions.Regex.Replace(s, "[0-9]", "_");
s2 now contains "Alfa____Beta".
Explanation: the regex [0-9] matches any digit from 0 to 9 (inclusive). The Regex.Replace then replaces all matched characters with an "_".
EDIT
And if you want it a bit shorter AND also match non-latin digits, use \d as a regex:
var s = "Alfa1234Beta๓"; // ๓ is "Thai digit three"
var s2 = System.Text.RegularExpressions.Regex.Replace(s, @"\d", "_");
s2 now contains "Alfa____Beta_".

How can I make this more elegant and applicable to any String?

public class JavaIntern{
public static void main(String []args){
String str = "JavaIntern"; //hardcoded, but not the problem
char[] s = str.toCharArray();
String result = new String (s,0,1); //this is where the dilemma begins
System.out.println(result);
String result1 = new String (s,0,2);
System.out.println(result1);
String result2 = new String (s,0,3);
System.out.println(result2);
String result3 = new String (s,0,4);
System.out.println(result3);
String result4 = new String (s,0,5);
System.out.println(result4);
String result5 = new String (s,0,6);
System.out.println(result5);
String result6 = new String (s,0,7);
System.out.println(result6);
String result7 = new String (s,0,8);
System.out.println(result7);
String result8 = new String (s,0,9);
System.out.println(result8);
String result9 = new String (s,0,10);
System.out.println(result9); //and this is where it ends... how can I get rid of this?
}
}
//but still get this:
J
Ja
Jav
Java
JavaI
JavaIn
JavaInt
JavaInte
JavaInter
JavaIntern
I guess you want to improve the code and also not depend on the length of the string.
What about something like this?
public class JavaIntern{
public static void main(String []args){
String str = "JavaIntern"; //hardcoded, but not the problem
String substring = "";
for (char ch: str.toCharArray()) {
substring += ch;
System.out.println(substring);
}
}
}
This will also print:
J
Ja
Jav
Java
JavaI
JavaIn
JavaInt
JavaInte
JavaInter
JavaIntern
The loop gets one character of the string at a time and concatenates it to the substring before printing it.
I'm assuming you want to print one more letter each time.
A for loop makes this fairly simple:
public class MainClass {
public static void main(String[] args) {
String str = "JavaIntern";
for (int i = 1; i <= str.length(); i++) {
System.out.println(str.substring(0, i));
}
}
}
We set i to 1 in the loop, keep iterating while i is less than or equal to the length of the string, and add one to i after each iteration.
We use the substring method to take the part of the string from the first letter up to, but not including, index i.

String Tokenizer requirement

String str = "-PT31121936-1-0069902679870--BLUECH";
I want to divide the above string using StringTokenizer.
The output should look like this:
amount=" ";
txnNo = PT31121936;
SeqNo = 1;
AccNo = 0069902679870;
Cldflag=" ";
FundOption= BLUECH;
Here is a solution in Java using String.split, which is a better fit here than StringTokenizer.
There are two approaches:
1) The first assumes the input string is always in a specific order.
2) The second is more dynamic: it can accommodate a change in the order of the input fields and in the number of parameters. My preference would be the second approach.
public class StringSplitExample {
public static void main(String[] args) {
// This solution is based on the order of the input
// is Amount-Txn No-Seq No-Acc No-Cld Flag-Fund Option
String str= "-PT31121936-1-0069902679870--BLUECH";
String[] tokens = str.split("-");
System.out.println("Amount :: "+tokens[0]);
System.out.println("Txn No :: "+tokens[1]);
System.out.println("Seq No :: "+tokens[2]);
System.out.println("Acc No :: "+tokens[3]);
System.out.println("Cld Flag :: "+tokens[4]);
System.out.println("Fund Option :: "+tokens[5]);
// End of First Solution
// The below solution can take any order of input, but we need to provide the order of input
String[] tokensOrder = {"Txn No", "Amount", "Seq No", "Cld Flag", "Acc No", "Fund Option"};
String inputString = "PT31121936--1--0069902679870-BLUECH";
String[] newTokens = inputString.split("-");
// Check whether both arrays are having equal count - To avoid index out of bounds exception
if(newTokens.length == tokensOrder.length) {
for(int i=0; i<tokensOrder.length; i++) {
System.out.println(tokensOrder[i]+" :: "+newTokens[i]);
}
}
}
}
Reference: String Tokenizer vs String split
Scanner vs. StringTokenizer vs. String.Split
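One caveat worth sketching: StringTokenizer does not return empty tokens between consecutive delimiters, and split("-") silently drops trailing empty strings, so a record whose last field is empty would come back with too few tokens. Passing a negative limit to split keeps them. The class name TxnRecordSketch and the parse helper below are hypothetical; the field names are taken from the question and are assumed to always arrive in that order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name, used only for this sketch.
class TxnRecordSketch {

    // Field names as listed in the question; the order is an assumption.
    static final String[] FIELD_NAMES = {
        "amount", "txnNo", "SeqNo", "AccNo", "Cldflag", "FundOption"
    };

    // Splits one record into named fields, keeping empty fields.
    static Map<String, String> parse(String record) {
        // A negative limit keeps trailing empty strings in the result.
        String[] tokens = record.split("-", -1);
        if (tokens.length != FIELD_NAMES.length) {
            throw new IllegalArgumentException(
                "Expected " + FIELD_NAMES.length + " fields but found " + tokens.length);
        }
        // LinkedHashMap preserves the insertion order of the fields.
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < FIELD_NAMES.length; i++) {
            fields.put(FIELD_NAMES[i], tokens[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        parse("-PT31121936-1-0069902679870--BLUECH")
            .forEach((name, value) -> System.out.println(name + " = \"" + value + "\""));
    }
}
```

With this version, an input such as "-PT31121936-1-0069902679870--" (empty FundOption) still yields six fields instead of failing the length check.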

Capitalize the last letter of each word in a string

def headAndFoot(s):
"""precondition: s is a string containing lower-case words and spaces
The string s contains no non-alpha characters.
postcondition: For each word, capitalize the first and last letter.
If a word is one letter, just capitalize it. """
last = len(s) - 1
x = s.title()
y = x.split()
return y
What changes do I need to make?
Take a look at my code below and try to use it in the context of your question.
s = 'The dog crossed the street'
result = " ".join([x[:-1].title()+x[len(x)-1].upper() for x in s.split()])
Then result will look like
'ThE DoG CrosseD ThE StreeT'
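import java.io.*;
public class ConverT_CapS
{
public static void main(String[] args) throws IOException
{
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter a sentence ");
String str = br.readLine();
if(str == null || str.isEmpty()) return; // nothing to convert
str= str.toLowerCase();
StringBuffer sb = new StringBuffer(str);
// Upper-case every character that sits next to a space, i.e. the first and last letter of each word
for(int i=1;i<str.length()-1;i++)
{
if(str.charAt(i+1)==' '||str.charAt(i-1)==' ')
{
// Character.toUpperCase leaves spaces and other non-letters unchanged, unlike subtracting 32
sb.setCharAt(i,Character.toUpperCase(str.charAt(i)));
}
}
// The first and last characters of the sentence are not covered by the loop
sb.setCharAt(0,Character.toUpperCase(str.charAt(0)));
sb.setCharAt(str.length()-1,Character.toUpperCase(str.charAt(str.length()-1)));
System.out.println(sb);
}
}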
input : whats ur name
output: WhatS UR NamE

Read specific string from each line of text file using BufferedReader in java

My text file:
3.456 5.234 Saturday 4.15am
2.341 6.4556 Saturday 6.08am
From the first line, I want to read only 3.456 and 5.234.
From the second line, I want to read only 2.341 and 6.4556.
The same goes for any following lines.
Here's my code so far:
InputStream instream = openFileInput("myfilename.txt");
if (instream != null) {
InputStreamReader inputreader = new InputStreamReader(instream);
BufferedReader buffreader = new BufferedReader(inputreader);
String line=null;
while (( line = buffreader.readLine()) != null) {
}
}
Thanks for showing some effort. Try this
while (( line = buffreader.readLine()) != null) {
String[] parts = line.split(" ");
double x = Double.parseDouble(parts[0]);
double y = Double.parseDouble(parts[1]);
}
I typed this from memory, so there might be syntax errors.
int linenumber = 1;
while((line = buffreader.readLine()) != null){
String [] parts = line.split(Pattern.quote(" "));
System.out.println("Line "+linenumber+"-> First Double: "+parts[0]+" Second Double:"
+parts[1]);
linenumber++;
}
Bilbert's code is almost right. You can also pass the delimiter through Pattern.quote() (from java.util.regex), which makes split treat it as a literal string rather than as a regular expression. For a single space this behaves the same as split(" "), so if the fields may be separated by more than one space or by tabs, split on "\\s+" instead; otherwise you end up with empty strings in the array. I also added a line number to my output, so you can see which line contains what. It should work fine.
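For completeness, here is a small self-contained sketch of the same idea that tolerates runs of spaces or tabs between the fields. The class name LeadingDoublesSketch and the helper firstTwoDoubles are made up for illustration, and the sample text is read from a StringReader instead of openFileInput so that the example runs outside Android.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical class name, used only for this sketch.
class LeadingDoublesSketch {

    // Parses the first two whitespace-separated fields of a line as doubles.
    static double[] firstTwoDoubles(String line) {
        // trim() avoids an empty first element when the line starts with spaces;
        // "\\s+" treats any run of spaces or tabs as a single separator.
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected at least two fields: " + line);
        }
        return new double[] {
            Double.parseDouble(parts[0]),
            Double.parseDouble(parts[1])
        };
    }

    public static void main(String[] args) throws IOException {
        // Sample data from the question; the second line has extra spaces on purpose.
        String sample = "3.456 5.234 Saturday 4.15am\n2.341   6.4556 Saturday 6.08am\n";
        BufferedReader reader = new BufferedReader(new StringReader(sample));
        String line;
        int lineNumber = 1;
        while ((line = reader.readLine()) != null) {
            if (!line.trim().isEmpty()) { // skip blank lines
                double[] pair = firstTwoDoubles(line);
                System.out.println("Line " + lineNumber + " -> " + pair[0] + " and " + pair[1]);
            }
            lineNumber++;
        }
    }
}
```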
