Using std C++, I would like to split a string delimited by commas but ignore commas in strings that are surrounded by single quotes. For example:
1,'2,3',4,5,'6,7',8
when split becomes
1
'2,3'
4
5
'6,7'
8
I think this might be best handled with regex, but I'm not sure how to construct the pattern. Any solution without regex is welcome, too. Thanks.
I'm not sure what the C++ syntax would be, but here's some pseudocode:
vector<string> split(const string& value)
{
bool is_escaped = false;
vector<char> current;
vector<string> result;
for (char c : value)
{
if (c == '\'')
{
is_escaped = !is_escaped;
}
if (c == ',' && !is_escaped)
{
result.push_back(string(current.begin(), current.end());
current.clear();
}
else
{
current.push_back(c);
}
}
result.push_back(string(current.begin(), current.end());
return result;
}
Obviously you'll need to adjust it to be valid C++ but it should do the trick.
Related
if my string is lets say "Alfa1234Beta"
how can I convert all the number in to "_"
for example "Alfa1234Beta"
will be "Alfa____Beta"
Going with the Regex approach pointed out by others is possibly OK for your scenario. Mind you however, that Regex sometimes tend to be overused. A hand rolled approach could be like this:
static string ReplaceDigits(string str)
{
StringBuilder sb = null;
for (int i = 0; i < str.Length; i++)
{
if (Char.IsDigit(str[i]))
{
if (sb == null)
{
// Seen a digit, allocate StringBuilder, copy non-digits we might have skipped over so far.
sb = new StringBuilder();
if (i > 0)
{
sb.Append(str, 0, i);
}
}
// Replace current character (a digit)
sb.Append('_');
}
else
{
if (sb != null)
{
// Seen some digits (being replaced) already. Collect non-digits as well.
sb.Append(str[i]);
}
}
}
if (sb != null)
{
return sb.ToString();
}
return str;
}
It is more light weight than Regex and only allocates when there is actually something to do (replace). So, go ahead use the Regex version if you like. If you figure out during profiling that is too heavy weight, you can use something like the above. YMMV
You can run for loop on the string and then use the following method to replace numbers with _
if (!System.Text.RegularExpressions.Regex.IsMatch(i, "^[0-9]*$"))
Here variable i is the character in the for loop .
You can use this:
var s = "Alfa1234Beta";
var s2 = System.Text.RegularExpressions.Regex.Replace(s, "[0-9]", "_");
s2 now contains "Alfa____Beta".
Explanation: the regex [0-9] matches any digit from 0 to 9 (inclusive). The Regex.Replace then replaces all matched characters with an "_".
EDIT
And if you want it a bit shorter AND also match non-latin digits, use \d as a regex:
var s = "Alfa1234Beta๓"; // ๓ is "Thai digit three"
var s2 = System.Text.RegularExpressions.Regex.Replace(s, #"\d", "_");
s2 now contains "Alfa____Beta_".
I like to find all words in a List string equal input word, but 2 characters has variation. I like to find all words equal:
xxxV1xxx;
xxxV2xxx;
xxxV3xxx;...
I do not care if the word include V1, V2, V3; but has to have the same characters before and after.
Use mystring.StartsWith("xxx") && mystring.EndsWith("xxx")
Here is an example:
string[] str = { "xxxv1xxx", "xxxV2xxx", "xxxv3xxx", "xxv4xx", "xxV5xxx"};
foreach (string s in str)
{
if( s.StartsWith("xxx") && s.EndsWith("xxx"))
Console.WriteLine(s); //do whatever you want here
}
Fiddle: https://dotnetfiddle.net/STnyWE
I'm stuck on a task of trying to print words that contain only lowercase letters a-z. I have already stripped out an inputted string if it contains any number 0-9 and if it contains an Uppercase letter:
String[] textParts;
textParts = text.Split(delimChars);
for (int i = 0; i < textParts.Length; i++) //adds s to words list and checks for capitals
{
String s = textParts[i];
bool valid = true;
foreach (char c in textParts[i])
{
if (char.IsUpper(c))
{
valid = false;
break;
}
if (c >= '0' && c <= '9')
{
valid = false;
break;
}
if (char.IsPunctuation(c))
{
valid = false;
break;
}
}
if (valid) pageIn.words.Add(s);
This is my code so far. The last part I'm trying to check to see if a word contains any punctuation (it's not working) is there an easier way I could do this and how could I get the last part of my code to work?
P.S. I'm not that comfortable with using Regex.
Many Thanks,
Ellie
Without regex, you can use LINQ (might be less performant)
bool isOnlyLower = s.Count(c => Char.IsLower(c)) == s.Length;
Count will retrieve the number of char that are in lower in the following string. If it match the string's length, string is composed only of lowercase letters.
An alternative would be to check if there's any UpperCase :
bool isOnlyLower = !s.Any(c => Char.IsUpper(c));
var regex = new Regex("^[a-z]+$");
if (!regex.IsMatch(input))
{
// is't not only lower case letters, remove input
}
I'm not sure whether I get your question right, but shouldn't the following work?
for (int i = 0; i < textParts.Length; i++) //adds s to words list and checks for capitals
{
String s = textParts[i];
if(s.Equals(s.ToLower()))
{
// string is all lower
}
}
I asked this question in a few interviews. I want to know from the Stackoverflow readers as to what should be the answer to this question.
Such a seemingly simple question, but has been interpreted quite a few different ways.
if your definition of a "word" is a series of non-whitespace characters surrounded by a whitespace character, then in 5 second pseudocode you do:
var words = split(inputString, " ")
var reverse = new array
var count = words.count -1
var i = 0
while count != 0
reverse[i] = words[count]
count--
i++
return reverse
If you want to take into consideration also spaces, you can do it like that:
string word = "hello my name is";
string result="";
int k=word.size();
for (int j=word.size()-1; j>=0; j--)
{
while(word[j]!= ' ' && j>=0)
j--;
int end=k;
k=j+1;
int count=0;
if (j>=0)
{
int temp=j;
while (word[temp]==' '){
count++;
temp--;
}
j-=count;
}
else j=j+1;
result+=word.substr(k,end-k);
k-=count;
while(count!=0)
{
result+=' ';
count--;
}
}
It will print out for you "is name my hello"
Taken from something called "Hacking a Google Interview" that was somewhere on my computer ... don't know from where I got it but I remember I saw this exact question inside ... here is the answer:
Reverse the string by swapping the
first character with the last
character, the second with the
second-to-last character, and so on.
Then, go through the string looking
for spaces, so that you find where
each of the words is. Reverse each of
the words you encounter by again
swapping the first character with the
last character, the second character
with the second-to-last character, and
so on.
This came up in LessThanDot Programmer Puzzles
#include<stdio.h>
void reverse_word(char *,int,int);
int main()
{
char s[80],temp;
int l,i,k;
int lower,upper;
printf("Enter the ssentence\n");
gets(s);
l=strlen(s);
printf("%d\n",l);
k=l;
for(i=0;i<l;i++)
{
if(k<=i)
{temp=s[i];
s[i]=s[l-1-i];
s[l-1-i]=temp;}
k--;
}
printf("%s\n",s);
lower=0;
upper=0;
for(i=0;;i++)
{
if(s[i]==' '||s[i]=='\0')
{upper=i-1;
reverse_word(s,lower,upper);
lower=i+1;
}
if(s[i]=='\0')
break;
}
printf("%s",s);
return 0;
}
void reverse_word(char *s,int lower,int upper)
{
char temp;
//int i;
while(upper>lower)
{
temp=s[lower];
s[lower]=s[upper];
s[upper]=temp;
upper=upper-1;
lower=lower+1;
}
}
The following code (C++) will convert a string this is a test to test a is this:
string reverseWords(string str)
{
string result = "";
vector<string> strs;
stringstream S(str);
string s;
while (S>>s)
strs.push_back(s);
reverse(strs.begin(), strs.end());
if (strs.size() > 0)
result = strs[0];
for(int i=1; i<strs.size(); i++)
result += " " + strs[i];
return result;
}
PS: it's actually a google code jam question, more info can be found here.
How do I format a string to title case?
Here is a simple static method to do this in C#:
public static string ToTitleCaseInvariant(string targetString)
{
return System.Threading.Thread.CurrentThread.CurrentCulture.TextInfo.ToTitleCase(targetString);
}
I would be wary of automatically upcasing all whitespace-preceded-words in scenarios where I would run the risk of attracting the fury of nitpickers.
I would at least consider implementing a dictionary for exception cases like articles and conjunctions. Behold:
"Beauty and the Beast"
And when it comes to proper nouns, the thing gets much uglier.
Here's a Perl solution http://daringfireball.net/2008/05/title_case
Here's a Ruby solution http://frankschmitt.org/projects/title-case
Here's a Ruby one-liner solution: http://snippets.dzone.com/posts/show/4702
'some string here'.gsub(/\b\w/){$&.upcase}
What the one-liner is doing is using a regular expression substitution of the first character of each word with the uppercase version of it.
To capatilise it in, say, C - use the ascii codes (http://www.asciitable.com/) to find the integer value of the char and subtract 32 from it.
This is a poor solution if you ever plan to accept characters beyond a-z and A-Z.
For instance: ASCII 134: å, ASCII 143: Å.
Using arithmetic gets you: ASCII 102: f
Use library calls, don't assume you can use integer arithmetic on your characters to get back something useful. Unicode is tricky.
In Silverlight there is no ToTitleCase in the TextInfo class.
Here's a simple regex based way.
Note: Silverlight doesn't have precompiled regexes, but for me this performance loss is not an issue.
public string TitleCase(string str)
{
return Regex.Replace(str, #"\w+", (m) =>
{
string tmp = m.Value;
return char.ToUpper(tmp[0]) + tmp.Substring(1, tmp.Length - 1).ToLower();
});
}
In Perl:
$string =~ s/(\w+)/\u\L$1/g;
That's even in the FAQ.
If the language you are using has a supported method/function then just use that (as in the C# ToTitleCase method)
If it does not, then you will want to do something like the following:
Read in the string
Take the first word
Capitalize the first letter of that word 1
Go forward and find the next word
Go to 3 if not at the end of the string, otherwise exit
1 To capitalize it in, say, C - use the ascii codes to find the integer value of the char and subtract 32 from it.
There would need to be much more error checking in the code (ensuring valid letters etc.), and the "Capitalize" function will need to impose some sort of "title-case scheme" on the letters to check for words that do not need to be capatilised ('and', 'but' etc. Here is a good scheme)
In what language?
In PHP it is:
ucwords()
example:
$HelloWorld = ucwords('hello world');
In Java, you can use the following code.
public String titleCase(String str) {
char[] chars = str.toCharArray();
for (int i = 0; i < chars.length; i++) {
if (i == 0) {
chars[i] = Character.toUpperCase(chars[i]);
} else if ((i + 1) < chars.length && chars[i] == ' ') {
chars[i + 1] = Character.toUpperCase(chars[i + 1]);
}
}
return new String(chars);
}
Excel-like PROPER:
public static string ExcelProper(string s) {
bool upper_needed = true;
string result = "";
foreach (char c in s) {
bool is_letter = Char.IsLetter(c);
if (is_letter)
if (upper_needed)
result += Char.ToUpper(c);
else
result += Char.ToLower(c);
else
result += c;
upper_needed = !is_letter;
}
return result;
}
http://titlecase.com/ has an API
There is a built-in formula PROPER(n) in Excel.
Was quite pleased to see I didn't have to write it myself!
Here's an implementation in Python: https://launchpad.net/titlecase.py
And a port of this implementation that I've just done in C++: http://codepad.org/RrfcsZzO
Here is a simple example of how to do it :
public static string ToTitleCaseInvariant(string str)
{
return System.Threading.Thread.CurrentThread.CurrentCulture.TextInfo.ToTitleCase(str);
}
I think using the CultureInfo is not always reliable, this the simple and handy way to manipulate string manually:
string sourceName = txtTextBox.Text.ToLower();
string destinationName = sourceName[0].ToUpper();
for (int i = 0; i < (sourceName.Length - 1); i++) {
if (sourceName[i + 1] == "") {
destinationName += sourceName[i + 1];
}
else {
destinationName += sourceName[i + 1];
}
}
txtTextBox.Text = desinationName;
In C#
using System.Globalization;
using System.Threading;
protected void Page_Load(object sender, EventArgs e)
{
CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
TextInfo textInfo = cultureInfo.TextInfo;
Response.Write(textInfo.ToTitleCase("WelcometoHome<br />"));
Response.Write(textInfo.ToTitleCase("Welcome to Home"));
Response.Write(textInfo.ToTitleCase("Welcome#to$home<br/>").Replace("#","").Replace("$", ""));
}
In C# you can simply use
CultureInfo.InvariantCulture.TextInfo.ToTitleCase(str.ToLowerInvariant())
Invariant
Works with uppercase strings
Without using a ready-made function, a super-simple low-level algorithm to convert a string to title case:
convert first character to uppercase.
for each character in string,
if the previous character is whitespace,
convert character to uppercase.
This asssumes the "convert character to uppercase" will do that correctly regardless of whether or not the character is case-sensitive (e.g., '+').
Here you have a C++ version. It's got a set of non uppercaseable words like prononuns and prepositions. However, I would not recommend automating this process if you are to deal with important texts.
#include <iostream>
#include <string>
#include <vector>
#include <cctype>
#include <set>
using namespace std;
typedef vector<pair<string, int> > subDivision;
set<string> nonUpperCaseAble;
subDivision split(string & cadena, string delim = " "){
subDivision retorno;
int pos, inic = 0;
while((pos = cadena.find_first_of(delim, inic)) != cadena.npos){
if(pos-inic > 0){
retorno.push_back(make_pair(cadena.substr(inic, pos-inic), inic));
}
inic = pos+1;
}
if(inic != cadena.length()){
retorno.push_back(make_pair(cadena.substr(inic, cadena.length() - inic), inic));
}
return retorno;
}
string firstUpper (string & pal){
pal[0] = toupper(pal[0]);
return pal;
}
int main()
{
nonUpperCaseAble.insert("the");
nonUpperCaseAble.insert("of");
nonUpperCaseAble.insert("in");
// ...
string linea, resultado;
cout << "Type the line you want to convert: " << endl;
getline(cin, linea);
subDivision trozos = split(linea);
for(int i = 0; i < trozos.size(); i++){
if(trozos[i].second == 0)
{
resultado += firstUpper(trozos[i].first);
}
else if (linea[trozos[i].second-1] == ' ')
{
if(nonUpperCaseAble.find(trozos[i].first) == nonUpperCaseAble.end())
{
resultado += " " + firstUpper(trozos[i].first);
}else{
resultado += " " + trozos[i].first;
}
}
else
{
resultado += trozos[i].first;
}
}
cout << resultado << endl;
getchar();
return 0;
}
With perl you could do this:
my $tc_string = join ' ', map { ucfirst($\_) } split /\s+/, $string;