ANRLR4 lexer semantic predicate issue - antlr4

I'm trying to use a semantic predicate in the lexer to look ahead one token but somehow I can't get it right. Here's what I have:
lexer grammar
lexer grammar TLLexer;
DirStart
: { getCharPositionInLine() == 0 }? '#dir'
;
DirEnd
: { getCharPositionInLine() == 0 }? '#end'
;
Cont
: 'contents' [ \t]* -> mode(CNT)
;
WS
: [ \t]+ -> channel(HIDDEN)
;
NL
: '\r'? '\n'
;
mode CNT;
CNT_DirEnd
: '#end' [ \t]* '\n'?
{ System.out.println("--matched end--"); }
;
CNT_LastLine
: ~ '\n'* '\n'
{ _input.LA(1) == CNT_DirEnd }? -> mode(DEFAULT_MODE)
;
CNT_Line
: ~ '\n'* '\n'
;
parser grammar
parser grammar TLParser;
options { tokenVocab = TLLexer; }
dirs
: ( dir
| NL
)*
;
dir
: DirStart Cont
contents
DirEnd
;
contents
: CNT_Line* CNT_LastLine
;
Essentially each line in the stuff in the CNT mode is free-form, but it never begins with #end followed by optional whitespace. Basically I want to keep matching the #end tag in the default lexer mode.
My test input is as follows:
#dir contents
..line..
#end
If I run this in grun I get the following
$ grun TL dirs test.txt
--matched end--
line 3:0 extraneous input '#end\n' expecting {CNT_LastLine, CNT_Line}
So clearly CNT_DirEnd gets matched, but somehow the predicate doesn't detect it.
I know that this this particular task doesn't require a semantic predicate, but that's just the part that doesn't work. The actual parser, while it may be written without the predicate, will be a lot less clean if I simply move the matching of the the #end tag into the mode CNT.
Thanks,
Kesha.

I think I figured it out. The member _input represents the characters of the original input, thus _input.LA returns characters, not lexer token IDs (is that the correct term?). Either way, the numbers returned by the lexer to the parser have nothing to do with the values returned by _input.LA, hence the predicate fails unless by some weird luck the character value returned by _input.LA(1) is equal to the lexer ID of CNT_DirEnd.
I modified the lexer as shown below and now it works, even though it is not as elegant as I hoped it would be (maybe someone knows a better way?)
lexer grammar TLLexer;
#lexer::members {
private static final String END_DIR = "#end";
private boolean isAtEndDir() {
StringBuilder sb = new StringBuilder();
int n = 1;
int ic;
// read characters until EOF
while ((ic = _input.LA(n++)) != -1) {
char c = (char) ic;
// we're interested in the next line only
if (c == '\n') break;
if (c == '\r') continue;
sb.append(c);
}
// Does the line begin with #end ?
if (sb.indexOf(END_DIR) != 0) return false;
// Is the #end followed by whitespace only?
for (int i = END_DIR.length(); i < sb.length(); i++) {
switch (sb.charAt(i)) {
case ' ':
case '\t':
continue;
default: return false;
}
}
return true;
}
}
[skipped .. nothing changed in the default mode]
mode CNT;
/* removed CNT_DirEnd */
CNT_LastLine
: ~ '\n'* '\n'
{ isAtEndDir() }? -> mode(DEFAULT_MODE)
;
CNT_Line
: ~ '\n'* '\n'
;

Related

bug in python target of antlr

The python code generated by antlr-4.9 has some syntax problems. Eg, for the following antlr grammar:
e returns [ObjExpr v]
: a=e op=('*'|'/') b=e {
$v = ObjExpr($op.type)
$v.e1 = $a.v
$v.e2 = $b.v
}
| INT {
$v = ObjExpr(21)
$v.i = $INT.int
}
;
MUL : '*' ;
DIV : '/' ;
INT : [0-9]+ ;
NEWLINE:'\r'? '\n' ;
WS : [ \t]+ -> skip ;
The code generated is:
localctx.v = ObjExpr((0 if localctx.op is None else localctx.op.type())
localctx.v.e1 = localctx.a.v
localctx.v.e2 = localctx.b.v
Whereas, the correct code should be:
localctx.v = ObjExpr((0 if localctx.op is None else localctx.op.type))
localctx.v.e1 = localctx.a.v
localctx.v.e2 = localctx.b.v
i.e., the code indentation is wrong, and the number of braces dont match. Manually editing the generated parser file to fix these errors makes the code run properly. How do I report this bug and get it fixed?

Extraneous input error when using "lexer rule actions" and "lexer commands"

I'm seeing an "extraneous input" error with input "\aa a" and the following grammar:
Cool.g4
grammar Cool;
import Lex;
expr
: STR_CONST # str_const
;
Lex.g4
lexer grammar Lex;
#lexer::members {
public static boolean initial = true;
public static boolean inString = false;
public static boolean inStringEscape = false;
}
BEGINSTRING: '"' {initial}? {
inString = true;
initial = false;
System.out.println("Entering string");
} -> more;
INSTRINGSTARTESCAPE: '\\' {inString && !inStringEscape}? {
inStringEscape = true;
System.out.println("The next character will be escaped!");
} -> more;
INSTRINGAFTERESCAPE: ~[\n] {inString && inStringEscape}? {
inStringEscape = false;
System.out.println("Escaped a character.");
} -> more;
INSTRINGOTHER: (~[\n\\"])+ {inString && !inStringEscape}? {
System.out.println("Consumed some other characters in the string!");
} -> more;
STR_CONST: '"' {inString && !inStringEscape}? {
inString = false;
initial = true;
System.out.println("Exiting string");
};
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines
ID: [a-z][_A-Za-z0-9]*;
Here's the output:
$ grun Cool expr -tree
"\aa a"
Entering string
The next character will be escaped!
Escaped a character.
Consumed some other characters in the string!
Exiting string
line 1:0 extraneous input '"\aa' expecting STR_CONST
(expr "\aa a")
Interestingly, if I remove the ID rule, antlr parses the input fine. Here's the output when I remove the ID rule:
$ grun Cool expr -tree
"\aa a"
Entering string
The next character will be escaped!
Escaped a character.
Consumed some other characters in the string!
Exiting string
(expr "\aa a")
Any idea what might be going on? Why does antlr throw an error when ID is one of the Lexer rules?
That's a surprisingly complex way to parse strings with escape sequences. Did you print the resulting tokens to see what your lexer produced?
I recommond a different (and much simpler) approach:
STR_CONST: '"' ('\\"' | .)*? '"';
Then in your semantic phase, when you post process your parse tree, examine the matched text to find escape sequences. Convert them to the real chars and print a good error message, when an invalid escape sequence was found (something you cannot do when trying to match escape sequences in the lexer).
Copying the answer I received from #sharwell on GitHub.
"Your ID rule is unpredicated, so it matches aa following the \ (aa is longer than the a matched by INSTRINGAFTERESCAPE, so it's preferred even though it's later in the grammar). If you add a println to WS and ID you'll see the strange behavior in the output."

non-fragment lexer rule x can match the empty string

What's wrong with the following antlr lexer?
I got an error
warning(146): MySQL.g4:5685:0: non-fragment lexer rule VERSION_COMMENT_TAIL can match the empty string
Attached source code
VERSION_COMMENT_TAIL:
{ VERSION_MATCHED == False }? // One level of block comment nesting is allowed for version comments.
((ML_COMMENT_HEAD MULTILINE_COMMENT) | . )*? ML_COMMENT_END { self.setType(MULTILINE_COMMENT); }
| { self.setType(VERSION_COMMENT); IN_VERSION_COMMENT = True; }
;
You are trying to convert my ANTLR3 grammar for MySQL to ANTLR4? Remove all the comment rules in the lexer and insert this instead:
// There are 3 types of block comments:
// /* ... */ - The standard multi line comment.
// /*! ... */ - A comment used to mask code for other clients. In MySQL the content is handled as normal code.
// /*!12345 ... */ - Same as the previous one except code is only used when the given number is a lower value
// than the current server version (specifying so the minimum server version the code can run with).
VERSION_COMMENT_START: ('/*!' DIGITS) (
{checkVersion(getText())}? // Will set inVersionComment if the number matches.
| .*? '*/'
) -> channel(HIDDEN)
;
// inVersionComment is a variable in the base lexer.
MYSQL_COMMENT_START: '/*!' { inVersionComment = true; setChannel(HIDDEN); };
VERSION_COMMENT_END: '*/' {inVersionComment}? { inVersionComment = false; setChannel(HIDDEN); };
BLOCK_COMMENT: '/*' ~[!] .*? '*/' -> channel(HIDDEN);
POUND_COMMENT: '#' ~([\n\r])* -> channel(HIDDEN);
DASHDASH_COMMENT: DOUBLE_DASH ([ \t] (~[\n\r])* | LINEBREAK | EOF) -> channel(HIDDEN);
You need a local inVersionComment member and a function checkVersion() in your lexer (I have it in the base lexer from which the generated lexer derives) which returns true or false, depending on whether the current server version is equal to or higher than the given version.
And for your question: you cannot have actions in alternatives. Actions can only appear at the end of an entire rule. This differs from ANTLR3.

ANTLR4 lexer rule with #init block

I have this lexer rule defined in my ANTLR v3 grammar file - it maths text in double quotes.
I need to convert it to ANTLR v4. ANTLR compiler throws an error 'syntax error: mismatched input '#' expecting COLON while matching a lexer rule' (in #init line). Can lexer rule contain a #init block ? How this should be rewritten ?
DOUBLE_QUOTED_CHARACTERS
#init
{
int doubleQuoteMark = input.mark();
int semiColonPos = -1;
}
: ('"' WS* '"') => '"' WS* '"' { $channel = HIDDEN; }
{
RecognitionException re = new RecognitionException("Illegal empty quotes\"\"!", input);
reportError(re);
}
| '"' (options {greedy=false;}: ~('"'))+
('"'|';' { semiColonPos = input.index(); } ('\u0020'|'\t')* ('\n'|'\r'))
{
if (semiColonPos >= 0)
{
input.rewind(doubleQuoteMark);
RecognitionException re = new RecognitionException("Missing closing double quote!", input);
reportError(re);
input.consume();
}
else
{
setText(getText().substring(1, getText().length()-1));
}
}
;
Sample data:
" " -> throws error "Illegal empty quotes!";
"asd -> throws error "Missing closing double quote!"
"text" -> returns text (valid input, content of "...")
I think this is the right way to do this.
DOUBLE_QUOTED_CHARACTERS
:
{
int doubleQuoteMark = input.mark();
int semiColonPos = -1;
}
(
('"' WS* '"') => '"' WS* '"' { $channel = HIDDEN; }
{
RecognitionException re = new RecognitionException("Illegal empty quotes\"\"!", input);
reportError(re);
}
| '"' (options {greedy=false;}: ~('"'))+
('"'|';' { semiColonPos = input.index(); } ('\u0020'|'\t')* ('\n'|'\r'))
{
if (semiColonPos >= 0)
{
input.rewind(doubleQuoteMark);
RecognitionException re = new RecognitionException("Missing closing double quote!", input);
reportError(re);
input.consume();
}
else
{
setText(getText().substring(1, getText().length()-1));
}
}
)
;
There are some other errors as well in above like WS .. => ... but I am not correcting them as part of this answer. Just to keep things simple. I took hint from here
Just to hedge against that link moving or becoming invalid after sometime, quoting the text as is:
Lexer actions can appear anywhere as of 4.2, not just at the end of the outermost alternative. The lexer executes the actions at the appropriate input position, according to the placement of the action within the rule. To execute a single action for a role that has multiple alternatives, you can enclose the alts in parentheses and put the action afterwards:
END : ('endif'|'end') {System.out.println("found an end");} ;
The action conforms to the syntax of the target language. ANTLR copies the action’s contents into the generated code verbatim; there is no translation of expressions like $x.y as there is in parser actions.
Only actions within the outermost token rule are executed. In other words, if STRING calls ESC_CHAR and ESC_CHAR has an action, that action is not executed when the lexer starts matching in STRING.
I in countered this problem when my .g4 grammar imported a lexer file. Importing grammar files seems to trigger lots of undocumented shortcomings in ANTLR4. So ultimately I had to stop using import.
In my case, once I merged the LEXER grammar into the parser grammar (one single .g4 file) my #input and #after parsing errors vanished. I should submit a test case + bug, at least to get this documented. I will update here once I do that.
I vaguely recall 2-3 issues with respect to importing lexer grammar into my parser that triggered undocumented behavior. Much is covered here on stackoverflow.

How to check if the given string is palindrome? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
Definition:
A palindrome is a word, phrase, number or other sequence of units that has the property of reading the same in either direction
How to check if the given string is a palindrome?
This was one of the FAIQ [Frequently Asked Interview Question] a while ago but that mostly using C.
Looking for solutions in any and all languages possible.
PHP sample:
$string = "A man, a plan, a canal, Panama";
function is_palindrome($string)
{
$a = strtolower(preg_replace("/[^A-Za-z0-9]/","",$string));
return $a==strrev($a);
}
Removes any non-alphanumeric characters (spaces, commas, exclamation points, etc.) to allow for full sentences as above, as well as simple words.
Windows XP (might also work on 2000) or later BATCH script:
#echo off
call :is_palindrome %1
if %ERRORLEVEL% == 0 (
echo %1 is a palindrome
) else (
echo %1 is NOT a palindrome
)
exit /B 0
:is_palindrome
set word=%~1
set reverse=
call :reverse_chars "%word%"
set return=1
if "$%word%" == "$%reverse%" (
set return=0
)
exit /B %return%
:reverse_chars
set chars=%~1
set reverse=%chars:~0,1%%reverse%
set chars=%chars:~1%
if "$%chars%" == "$" (
exit /B 0
) else (
call :reverse_chars "%chars%"
)
exit /B 0
Language agnostic meta-code then...
rev = StringReverse(originalString)
return ( rev == originalString );
C# in-place algorithm. Any preprocessing, like case insensitivity or stripping of whitespace and punctuation should be done before passing to this function.
boolean IsPalindrome(string s) {
for (int i = 0; i < s.Length / 2; i++)
{
if (s[i] != s[s.Length - 1 - i]) return false;
}
return true;
}
Edit: removed unnecessary "+1" in loop condition and spent the saved comparison on removing the redundant Length comparison. Thanks to the commenters!
C#: LINQ
var str = "a b a";
var test = Enumerable.SequenceEqual(str.ToCharArray(),
str.ToCharArray().Reverse());
A more Ruby-style rewrite of Hal's Ruby version:
class String
def palindrome?
(test = gsub(/[^A-Za-z]/, '').downcase) == test.reverse
end
end
Now you can call palindrome? on any string.
Unoptimized Python:
>>> def is_palindrome(s):
... return s == s[::-1]
Java solution:
public class QuickTest {
public static void main(String[] args) {
check("AmanaplanacanalPanama".toLowerCase());
check("Hello World".toLowerCase());
}
public static void check(String aString) {
System.out.print(aString + ": ");
char[] chars = aString.toCharArray();
for (int i = 0, j = (chars.length - 1); i < (chars.length / 2); i++, j--) {
if (chars[i] != chars[j]) {
System.out.println("Not a palindrome!");
return;
}
}
System.out.println("Found a palindrome!");
}
}
Using a good data structure usually helps impress the professor:
Push half the chars onto a stack (Length / 2).
Pop and compare each char until the first unmatch.
If the stack has zero elements: palindrome.
*in the case of a string with an odd Length, throw out the middle char.
C in the house. (not sure if you didn't want a C example here)
bool IsPalindrome(char *s)
{
int i,d;
int length = strlen(s);
char cf, cb;
for(i=0, d=length-1 ; i < length && d >= 0 ; i++ , d--)
{
while(cf= toupper(s[i]), (cf < 'A' || cf >'Z') && i < length-1)i++;
while(cb= toupper(s[d]), (cb < 'A' || cb >'Z') && d > 0 )d--;
if(cf != cb && cf >= 'A' && cf <= 'Z' && cb >= 'A' && cb <='Z')
return false;
}
return true;
}
That will return true for "racecar", "Racecar", "race car", "racecar ", and "RaCe cAr". It would be easy to modify to include symbols or spaces as well, but I figure it's more useful to only count letters(and ignore case). This works for all palindromes I've found in the answers here, and I've been unable to trick it into false negatives/positives.
Also, if you don't like bool in a "C" program, it could obviously return int, with return 1 and return 0 for true and false respectively.
Here's a python way. Note: this isn't really that "pythonic" but it demonstrates the algorithm.
def IsPalindromeString(n):
myLen = len(n)
i = 0
while i <= myLen/2:
if n[i] != n[myLen-1-i]:
return False
i += 1
return True
Delphi
function IsPalindrome(const s: string): boolean;
var
i, j: integer;
begin
Result := false;
j := Length(s);
for i := 1 to Length(s) div 2 do begin
if s[i] <> s[j] then
Exit;
Dec(j);
end;
Result := true;
end;
I'm seeing a lot of incorrect answers here. Any correct solution needs to ignore whitespace and punctuation (and any non-alphabetic characters actually) and needs to be case insensitive.
A few good example test cases are:
"A man, a plan, a canal, Panama."
"A Toyota's a Toyota."
"A"
""
As well as some non-palindromes.
Example solution in C# (note: empty and null strings are considered palindromes in this design, if this is not desired it's easy to change):
public static bool IsPalindrome(string palindromeCandidate)
{
if (string.IsNullOrEmpty(palindromeCandidate))
{
return true;
}
Regex nonAlphaChars = new Regex("[^a-z0-9]");
string alphaOnlyCandidate = nonAlphaChars.Replace(palindromeCandidate.ToLower(), "");
if (string.IsNullOrEmpty(alphaOnlyCandidate))
{
return true;
}
int leftIndex = 0;
int rightIndex = alphaOnlyCandidate.Length - 1;
while (rightIndex > leftIndex)
{
if (alphaOnlyCandidate[leftIndex] != alphaOnlyCandidate[rightIndex])
{
return false;
}
leftIndex++;
rightIndex--;
}
return true;
}
EDIT: from the comments:
bool palindrome(std::string const& s)
{
return std::equal(s.begin(), s.end(), s.rbegin());
}
The c++ way.
My naive implementation using the elegant iterators. In reality, you would probably check
and stop once your forward iterator has past the halfway mark to your string.
#include <string>
#include <iostream>
using namespace std;
bool palindrome(string foo)
{
string::iterator front;
string::reverse_iterator back;
bool is_palindrome = true;
for(front = foo.begin(), back = foo.rbegin();
is_palindrome && front!= foo.end() && back != foo.rend();
++front, ++back
)
{
if(*front != *back)
is_palindrome = false;
}
return is_palindrome;
}
int main()
{
string a = "hi there", b = "laval";
cout << "String a: \"" << a << "\" is " << ((palindrome(a))? "" : "not ") << "a palindrome." <<endl;
cout << "String b: \"" << b << "\" is " << ((palindrome(b))? "" : "not ") << "a palindrome." <<endl;
}
boolean isPalindrome(String str1) {
//first strip out punctuation and spaces
String stripped = str1.replaceAll("[^a-zA-Z0-9]", "");
return stripped.equalsIgnoreCase((new StringBuilder(stripped)).reverse().toString());
}
Java version
Here's my solution, without using a strrev. Written in C#, but it will work in any language that has a string length function.
private static bool Pal(string s) {
for (int i = 0; i < s.Length; i++) {
if (s[i] != s[s.Length - 1 - i]) {
return false;
}
}
return true;
}
Here's my solution in c#
static bool isPalindrome(string s)
{
string allowedChars = "abcdefghijklmnopqrstuvwxyz"+
"1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ";
string compareString = String.Empty;
string rev = string.Empty;
for (int i = 0; i <= s.Length - 1; i++)
{
char c = s[i];
if (allowedChars.IndexOf(c) > -1)
{
compareString += c;
}
}
for (int i = compareString.Length - 1; i >= 0; i--)
{
char c = compareString[i];
rev += c;
}
return rev.Equals(compareString,
StringComparison.CurrentCultureIgnoreCase);
}
Here's a Python version that deals with different cases, punctuation and whitespace.
import string
def is_palindrome(palindrome):
letters = palindrome.translate(string.maketrans("",""),
string.whitespace + string.punctuation).lower()
return letters == letters[::-1]
Edit: Shamelessly stole from Blair Conrad's neater answer to remove the slightly clumsy list processing from my previous version.
C++
std::string a = "god";
std::string b = "lol";
std::cout << (std::string(a.rbegin(), a.rend()) == a) << " "
<< (std::string(b.rbegin(), b.rend()) == b);
Bash
function ispalin { [ "$( echo -n $1 | tac -rs . )" = "$1" ]; }
echo "$(ispalin god && echo yes || echo no), $(ispalin lol && echo yes || echo no)"
Gnu Awk
/* obvious solution */
function ispalin(cand, i) {
for(i=0; i<length(cand)/2; i++)
if(substr(cand, length(cand)-i, 1) != substr(cand, i+1, 1))
return 0;
return 1;
}
/* not so obvious solution. cough cough */
{
orig = $0;
while($0) {
stuff = stuff gensub(/^.*(.)$/, "\\1", 1);
$0 = gensub(/^(.*).$/, "\\1", 1);
}
print (stuff == orig);
}
Haskell
Some brain dead way doing it in Haskell
ispalin :: [Char] -> Bool
ispalin a = a == (let xi (y:my) = (xi my) ++ [y]; xi [] = [] in \x -> xi x) a
Plain English
"Just reverse the string and if it is the same as before, it's a palindrome"
Ruby:
class String
def is_palindrome?
letters_only = gsub(/\W/,'').downcase
letters_only == letters_only.reverse
end
end
puts 'abc'.is_palindrome? # => false
puts 'aba'.is_palindrome? # => true
puts "Madam, I'm Adam.".is_palindrome? # => true
An obfuscated C version:
int IsPalindrome (char *s)
{
char*a,*b,c=0;
for(a=b=s;a<=b;c=(c?c==1?c=(*a&~32)-65>25u?*++a,1:2:c==2?(*--b&~32)-65<26u?3:2:c==3?(*b-65&~32)-(*a-65&~32)?*(b=s=0,a),4:*++a,1:0:*++b?0:1));
return s!=0;
}
This Java code should work inside a boolean method:
Note: You only need to check the first half of the characters with the back half, otherwise you are overlapping and doubling the amount of checks that need to be made.
private static boolean doPal(String test) {
for(int i = 0; i < test.length() / 2; i++) {
if(test.charAt(i) != test.charAt(test.length() - 1 - i)) {
return false;
}
}
return true;
}
Another C++ one. Optimized for speed and size.
bool is_palindrome(const std::string& candidate) {
for(std::string::const_iterator left = candidate.begin(), right = candidate.end(); left < --right ; ++left)
if (*left != *right)
return false;
return true;
}
Lisp:
(defun palindrome(x) (string= x (reverse x)))
Three versions in Smalltalk, from dumbest to correct.
In Smalltalk, = is the comparison operator:
isPalindrome: aString
"Dumbest."
^ aString reverse = aString
The message #translateToLowercase returns the string as lowercase:
isPalindrome: aString
"Case insensitive"
|lowercase|
lowercase := aString translateToLowercase.
^ lowercase reverse = lowercase
And in Smalltalk, strings are part of the Collection framework, you can use the message #select:thenCollect:, so here's the last version:
isPalindrome: aString
"Case insensitive and keeping only alphabetic chars
(blanks & punctuation insensitive)."
|lowercaseLetters|
lowercaseLetters := aString
select: [:char | char isAlphabetic]
thenCollect: [:char | char asLowercase].
^ lowercaseLetters reverse = lowercaseLetters
Note that in the above C++ solutions, there was some problems.
One solution was inefficient because it passed an std::string by copy, and because it iterated over all the chars, instead of comparing only half the chars. Then, even when discovering the string was not a palindrome, it continued the loop, waiting its end before reporting "false".
The other was better, with a very small function, whose problem was that it was not able to test anything else than std::string. In C++, it is easy to extend an algorithm to a whole bunch of similar objects. By templating its std::string into "T", it would have worked on both std::string, std::wstring, std::vector and std::deque. But without major modification because of the use of the operator <, the std::list was out of its scope.
My own solutions try to show that a C++ solution won't stop at working on the exact current type, but will strive to work an anything that behaves the same way, no matter the type. For example, I could apply my palindrome tests on std::string, on vector of int or on list of "Anything" as long as Anything was comparable through its operator = (build in types, as well as classes).
Note that the template can even be extended with an optional type that can be used to compare the data. For example, if you want to compare in a case insensitive way, or even compare similar characters (like è, é, ë, ê and e).
Like king Leonidas would have said: "Templates ? This is C++ !!!"
So, in C++, there are at least 3 major ways to do it, each one leading to the other:
Solution A: In a c-like way
The problem is that until C++0X, we can't consider the std::string array of chars as contiguous, so we must "cheat" and retrieve the c_str() property. As we are using it in a read-only fashion, it should be ok...
bool isPalindromeA(const std::string & p_strText)
{
if(p_strText.length() < 2) return true ;
const char * pStart = p_strText.c_str() ;
const char * pEnd = pStart + p_strText.length() - 1 ;
for(; pStart < pEnd; ++pStart, --pEnd)
{
if(*pStart != *pEnd)
{
return false ;
}
}
return true ;
}
Solution B: A more "C++" version
Now, we'll try to apply the same solution, but to any C++ container with random access to its items through operator []. For example, any std::basic_string, std::vector, std::deque, etc. Operator [] is constant access for those containers, so we won't lose undue speed.
template <typename T>
bool isPalindromeB(const T & p_aText)
{
if(p_aText.empty()) return true ;
typename T::size_type iStart = 0 ;
typename T::size_type iEnd = p_aText.size() - 1 ;
for(; iStart < iEnd; ++iStart, --iEnd)
{
if(p_aText[iStart] != p_aText[iEnd])
{
return false ;
}
}
return true ;
}
Solution C: Template powah !
It will work with almost any unordered STL-like container with bidirectional iterators
For example, any std::basic_string, std::vector, std::deque, std::list, etc.
So, this function can be applied on all STL-like containers with the following conditions:
1 - T is a container with bidirectional iterator
2 - T's iterator points to a comparable type (through operator =)
template <typename T>
bool isPalindromeC(const T & p_aText)
{
if(p_aText.empty()) return true ;
typename T::const_iterator pStart = p_aText.begin() ;
typename T::const_iterator pEnd = p_aText.end() ;
--pEnd ;
while(true)
{
if(*pStart != *pEnd)
{
return false ;
}
if((pStart == pEnd) || (++pStart == pEnd))
{
return true ;
}
--pEnd ;
}
}
A simple Java solution:
public boolean isPalindrome(String testString) {
StringBuffer sb = new StringBuffer(testString);
String reverseString = sb.reverse().toString();
if(testString.equalsIgnoreCase(reverseString)) {
return true;
else {
return false;
}
}
Many ways to do it. I guess the key is to do it in the most efficient way possible (without looping the string). I would do it as a char array which can be reversed easily (using C#).
string mystring = "abracadabra";
char[] str = mystring.ToCharArray();
Array.Reverse(str);
string revstring = new string(str);
if (mystring.equals(revstring))
{
Console.WriteLine("String is a Palindrome");
}
In Ruby, converting to lowercase and stripping everything not alphabetic:
def isPalindrome( string )
( test = string.downcase.gsub( /[^a-z]/, '' ) ) == test.reverse
end
But that feels like cheating, right? No pointers or anything! So here's a C version too, but without the lowercase and character stripping goodness:
#include <stdio.h>
int isPalindrome( char * string )
{
char * i = string;
char * p = string;
while ( *++i ); while ( i > p && *p++ == *--i );
return i <= p && *i++ == *--p;
}
int main( int argc, char **argv )
{
if ( argc != 2 )
{
fprintf( stderr, "Usage: %s <word>\n", argv[0] );
return -1;
}
fprintf( stdout, "%s\n", isPalindrome( argv[1] ) ? "yes" : "no" );
return 0;
}
Well, that was fun - do I get the job ;^)
Using Java, using Apache Commons String Utils:
public boolean isPalindrome(String phrase) {
phrase = phrase.toLowerCase().replaceAll("[^a-z]", "");
return StringUtils.reverse(phrase).equals(phrase);
}

Resources