StrSubstitutor replacement with JRE libraries - string

At the moment I am using org.apache.commons.lang.text.StrSubstitutor for doing:
Map m = ...
substitutor = new StrSubstitutor(m);
result = substitutor.replace(input);
Since I want to remove the commons-lang dependency from my project, what would be a working, minimal implementation of StrSubstitutor using only standard JRE libraries?
Note:
StrSubstitutor works like this:
Map map = new HashMap();
map.put("animal", "quick brown fox");
map.put("target", "lazy dog");
StrSubstitutor sub = new StrSubstitutor(map);
String resolvedString = sub.replace("The ${animal} jumped over the ${target}.");
yielding resolvedString = "The quick brown fox jumped over the lazy dog."

If performance is not a priority, you can use the appendReplacement method of the Matcher class:
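import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StrSubstitutor {
    private final Map<String, String> map;
    private static final Pattern p = Pattern.compile("\\$\\{(.+?)\\}");

    public StrSubstitutor(Map<String, String> map) {
        this.map = map;
    }

    public String replace(String str) {
        Matcher m = p.matcher(str);
        // appendReplacement(StringBuilder, ...) needs Java 9+; use StringBuffer on older JREs
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            String var = m.group(1);
            String replacement = map.get(var);
            // Unknown variables are left untouched; quoteReplacement keeps '$' and '\' literal
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement != null ? replacement : m.group()));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}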
A more performant but uglier version, just for fun :)
public String replace(String str) {
    StringBuilder sb = new StringBuilder();
    char[] strArray = str.toCharArray();
    int i = 0;
    while (i < strArray.length - 1) {
        if (strArray[i] == '$' && strArray[i + 1] == '{') {
            int end = str.indexOf('}', i + 2);
            if (end < 0) break; // unterminated "${": the tail is copied verbatim below
            sb.append(map.get(str.substring(i + 2, end))); // note: a missing key appends "null"
            i = end + 1;
        } else {
            sb.append(strArray[i]);
            ++i;
        }
    }
    sb.append(str, i, str.length()); // whatever is left: the last char or an unterminated tail
    return sb.toString();
}
In my tests it is about 2x as fast as the regex version and 3x faster than the Apache Commons version, so the plain regex approach is actually better optimized than the Apache one. The extra speed is usually not worth the loss in readability, of course. This was just for fun, so let me know if you can optimize it further.
Edit: As @kmek points out, there is a caveat: the Apache version resolves variables transitively. For example, if ${animal} maps to ${dog} and dog maps to Golden Retriever, the Apache version will map ${animal} to Golden Retriever. As I said, you should use libraries as far as possible. The solution above is only meant for cases where a special constraint prevents you from using a library.
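If the transitive behaviour matters, one way to approximate it with JRE classes only is to repeat the substitution until nothing changes. The sketch below is an assumption about how that could look, not how commons-lang implements it: the class name TransitiveSubstitutor and the MAX_PASSES limit are made up for illustration, and re-scanning the whole string may behave differently from Apache's resolution in edge cases.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of commons-lang or the JRE.
final class TransitiveSubstitutor {
    private static final Pattern VAR = Pattern.compile("\\$\\{(.+?)\\}");
    private static final int MAX_PASSES = 10; // arbitrary guard against cyclic definitions

    static String replace(String input, Map<String, String> values) {
        String current = input;
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            Matcher m = VAR.matcher(current);
            StringBuffer sb = new StringBuffer(); // StringBuffer overload exists on all Java versions
            boolean changed = false;
            while (m.find()) {
                String value = values.get(m.group(1));
                if (value != null) {
                    changed = true;
                }
                // Unknown variables are copied back unchanged.
                m.appendReplacement(sb, Matcher.quoteReplacement(value != null ? value : m.group()));
            }
            m.appendTail(sb);
            current = sb.toString();
            if (!changed) {
                return current; // nothing left to resolve
            }
        }
        throw new IllegalStateException("Too many substitution passes; cyclic definition?");
    }
}
```

With a map where animal is "${dog}" and dog is "Golden Retriever", the first pass turns "${animal}" into "${dog}" and the second pass resolves it fully. A definition that refers to itself hits the pass limit and throws instead of looping forever.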

There's nothing like this in the JRE that I know of, but writing one is simple enough.
Pattern p = Pattern.compile("\\$\\{([a-zA-Z]+)\\}");
Matcher m = p.matcher(inputString);
int pos = 0;
while (m.find(pos)) {
    String varName = m.group(1);
    // look up the value in the map and substitute; assumes every variable has an entry
    String replacement = map.get(varName);
    inputString = inputString.substring(0, m.start()) + replacement + inputString.substring(m.end());
    pos = m.start() + replacement.length(); // continue right after the inserted text
    m.reset(inputString); // the matcher has to see the modified string
}
This is of course horribly inefficient; you should probably write the result into a StringBuilder instead of rebuilding inputString on every match.
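On Java 9 or later, the manual index bookkeeping can be avoided altogether with Matcher.replaceAll(Function), which builds the result in a single pass. This is only a sketch: the class name LambdaSubstitutor is made up, and leaving unknown variables in place is a choice made for this example rather than anything StrSubstitutor requires.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; requires Java 9+ for Matcher.replaceAll(Function).
final class LambdaSubstitutor {
    private static final Pattern VAR = Pattern.compile("\\$\\{([a-zA-Z]+)\\}");

    static String replace(String input, Map<String, String> values) {
        // For each match, look up group(1); fall back to the original "${name}" text.
        // quoteReplacement keeps '$' and '\' in the values from being interpreted.
        return VAR.matcher(input).replaceAll(mr ->
                Matcher.quoteReplacement(values.getOrDefault(mr.group(1), mr.group())));
    }
}
```

Called with the map from the question, LambdaSubstitutor.replace("The ${animal} jumped over the ${target}.", map) yields "The quick brown fox jumped over the lazy dog."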

Related

Eclipse JDT resolve unknown kind from annotation IMemberValuePair

I need to retrieve the value from an annotation such as this one that uses a string constant:
@Component(property = Constants.SERVICE_RANKING + ":Integer=10")
public class NyServiceImpl implements MyService {
But I am getting a kind of K_UNKNOWN and the doc says "the value is an expression that would need to be further analyzed to determine its kind". My question then is how do I perform this analysis? I could even manage to accept getting the plain source text value in this case.
The other answer looks basically OK, but let me suggest a way to avoid using the internal class org.eclipse.jdt.internal.core.Annotation and its method findNode():
ISourceRange range = annotation.getSourceRange();
ASTNode annNode = org.eclipse.jdt.core.dom.NodeFinder.perform(cu, range);
From here on you should be safe, using DOM API throughout.
Googling differently I found a way to resolve the expression. Still open to other suggestions if any. For those who might be interested, here is a snippet of code:
if (valueKind == IMemberValuePair.K_UNKNOWN) {
Annotation ann = (Annotation)annotation;
CompilationUnit cu = getAST(ann.getCompilationUnit());
ASTNode annNode = ann.findNode(cu);
NormalAnnotation na = (NormalAnnotation)annNode;
List<?> naValues = na.values();
Optional<?> optMvp = naValues.stream()
.filter(val-> ((MemberValuePair)val).getName().getIdentifier().equals(PROPERTY))
.findAny();
if (optMvp.isPresent()) {
MemberValuePair pair = (MemberValuePair)optMvp.get();
if (pair.getValue() instanceof ArrayInitializer) {
ArrayInitializer ai = (ArrayInitializer)pair.getValue();
for (Object exprObj : ai.expressions()) {
Expression expr = (Expression)exprObj;
String propValue = (String)expr.resolveConstantExpressionValue();
if (propValue.startsWith(Constants.SERVICE_RANKING)) {
return true;
}
}
}
else {
Expression expr = pair.getValue();
String propValue = (String)expr.resolveConstantExpressionValue();
if (propValue.startsWith(Constants.SERVICE_RANKING)) {
return true;
}
}
}
//report error
}
private CompilationUnit getAST(ICompilationUnit compUnit) {
final ASTParser parser = ASTParser.newParser(AST.JLS8);
parser.setKind(ASTParser.K_COMPILATION_UNIT);
parser.setSource(compUnit);
parser.setResolveBindings(true); // we need bindings later on
CompilationUnit unit = (CompilationUnit)parser.createAST(null);
return unit;
}

How do I reverse a String in Dart?

I have a String, and I would like to reverse it. For example, I am writing an AngularDart filter that reverses a string. It's just for demonstration purposes, but it made me wonder how I would reverse a string.
Example:
Hello, world
should turn into:
dlrow ,olleH
I should also consider strings with Unicode characters. For example: 'Ame\u{301}lie'
What's an easy way to reverse a string, even if it has Unicode characters?
The question is not well defined. Reversing arbitrary strings does not make sense and will lead to broken output. The first (surmountable) obstacle is UTF-16. Dart strings are encoded as UTF-16, and reversing just the code units leads to invalid strings:
var input = "Music \u{1d11e} for the win"; // Music 𝄞 for the win
print(input.split('').reversed.join()); // "niw eht rof", then the two halves of 𝄞 in the wrong order (invalid UTF-16), then "cisuM"
The split function explicitly warns against this problem (with an example):
Splitting with an empty string pattern ('') splits at UTF-16 code unit boundaries and not at rune boundaries[.]
There is an easy fix for this: instead of reversing the individual code-units one can reverse the runes:
var input = "Music \u{1d11e} for the win"; // Music 𝄞 for the win
print(new String.fromCharCodes(input.runes.toList().reversed)); // niw eht rof 𝄞 cisuM
But that's not all. Runes, too, can have a specific order. This second obstacle is much harder to solve. A simple example:
var input = 'Ame\u{301}lie'; // Amélie
print(new String.fromCharCodes(input.runes.toList().reversed)); // eiĺemA
Note that the accent is on the wrong character.
There are probably other languages that are even more sensitive to the order of individual runes.
If the input is severely restricted (for example to ASCII or ISO Latin-1), then reversing strings is technically possible. However, I have yet to see a single use case where this operation made sense.
Using this question as an example to show that strings have List-like operations is not a good idea, either. Except for a few use cases, strings have to be treated with respect to a specific language, using highly complex methods that have language-specific knowledge.
In particular native English speakers have to pay attention: strings can rarely be handled as if they were lists of single characters. In almost every other language this will lead to buggy programs. (And don't get me started on toLowerCase and toUpperCase ...).
Here's one way to reverse an ASCII String in Dart:
input.split('').reversed.join('');
split the string at every code unit, creating a List
get an iterable that walks the list in reverse
join the elements (creating a new string)
Note: this is not necessarily the fastest way to reverse a string. See other answers for alternatives.
Note: this does not properly handle all unicode strings.
I've made a small benchmark for a few different alternatives:
String reverse0(String s) {
return s.split('').reversed.join('');
}
String reverse1(String s) {
var sb = new StringBuffer();
for(var i = s.length - 1; i >= 0; --i) {
sb.write(s[i]);
}
return sb.toString();
}
String reverse2(String s) {
return new String.fromCharCodes(s.codeUnits.reversed);
}
String reverse3(String s) {
var sb = new StringBuffer();
for(var i = s.length - 1; i >= 0; --i) {
sb.writeCharCode(s.codeUnitAt(i));
}
return sb.toString();
}
String reverse4(String s) {
var sb = new StringBuffer();
var i = s.length - 1;
while (i >= 3) {
sb.writeCharCode(s.codeUnitAt(i-0));
sb.writeCharCode(s.codeUnitAt(i-1));
sb.writeCharCode(s.codeUnitAt(i-2));
sb.writeCharCode(s.codeUnitAt(i-3));
i -= 4;
}
while (i >= 0) {
sb.writeCharCode(s.codeUnitAt(i));
i -= 1;
}
return sb.toString();
}
String reverse5(String s) {
var length = s.length;
var charCodes = List<int>.filled(length, 0); // the unnamed List(length) constructor is gone in null-safe Dart
for(var index = 0; index < length; index++) {
charCodes[index] = s.codeUnitAt(length - index - 1);
}
return new String.fromCharCodes(charCodes);
}
main() {
var s = "Lorem Ipsum is simply dummy text of the printing and typesetting industry.";
time('reverse0', () => reverse0(s));
time('reverse1', () => reverse1(s));
time('reverse2', () => reverse2(s));
time('reverse3', () => reverse3(s));
time('reverse4', () => reverse4(s));
time('reverse5', () => reverse5(s));
}
Here is the result:
reverse0: => 331,394 ops/sec (3 us) stdev(0.01363)
reverse1: => 346,822 ops/sec (3 us) stdev(0.00885)
reverse2: => 490,821 ops/sec (2 us) stdev(0.0338)
reverse3: => 873,636 ops/sec (1 us) stdev(0.03972)
reverse4: => 893,953 ops/sec (1 us) stdev(0.04089)
reverse5: => 2,624,282 ops/sec (0 us) stdev(0.11828)
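Try this function:
String reverse(String s) {
  var chars = s.split(''); // replaces the long-removed splitChars()
  var len = s.length - 1;
  var i = 0;
  // swap characters from both ends until the indices meet
  while (i < len) {
    var tmp = chars[i];
    chars[i] = chars[len];
    chars[len] = tmp;
    i++;
    len--;
  }
  return chars.join(); // replaces the long-removed Strings.concatAll()
}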
void main() {
var s = "Hello , world";
print(s);
print(reverse(s));
}
(or)
String reverse(String s) {
  StringBuffer sb = StringBuffer();
  for (int i = s.length - 1; i >= 0; i--) {
    sb.write(s[i]); // StringBuffer.add() was renamed to write()
  }
  return sb.toString();
}
main() {
print(reverse('Hello , world'));
}
The library More Dart contains a light-weight wrapper around strings that makes them behave like an immutable list of characters:
import 'package:more/iterable.dart';
void main() {
print(string('Hello World').reversed.join());
}
There is a utils package that covers this function. It has some more nice methods for operation on strings.
Install it with :
dependencies:
basic_utils: ^1.2.0
Usage :
String reversed = StringUtils.reverse("helloworld");
Github:
https://github.com/Ephenodrom/Dart-Basic-Utils
Here is a function you can use to reverse strings. It takes a string as input and uses the Dart package characters to split it into user-perceived characters (grapheme clusters), which are then reversed and joined again to form the reversed string.
import 'package:characters/characters.dart';

String reverse(String string) {
  if (string.length < 2) {
    return string;
  }
  final characters = Characters(string);
  return characters.toList().reversed.join();
}
Create this extension:
extension Ex on String {
String get reverse => split('').reversed.join();
}
Usage:
void main() {
String string = 'Hello World';
print(string.reverse); // dlroW olleH
}

Convert pinyin to Chinese Character

I want to take pinyin (Latin letters) as input and return Chinese characters that the user can choose from. I have seen this implemented in many places (OS keyboards and various websites), but I can't find a library that does it.
I would even consider doing it myself if it is not too complex and does not require a large amount of data.
The simplest way to do this is to use javachinesepinyin, a lightweight Chinese Pinyin input method.
You can find related code here.
private String[] pinyinToWord(String[] o) {
Result ret = null;
try {
ret = ptw.labelStateOfNodes(Arrays.asList(o));
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
Map<Double, String> results = new HashMap<Double, String>();
if (null != ret && ret.states() != null) {
for (int pos = 0; pos < ret.states()[o.length - 1].length; pos++) {
StringBuilder sb = new StringBuilder();
int[] statePath = Viterbi.getStatePath(ret.states(), ret.psai(), o.length - 1, o.length, pos);
for (int state : statePath) {
Character name = ptw.getStateBy(state);
sb.append(name).append(" ");
}
results.put(ret.delta()[o.length - 1][pos], sb.toString());
}
List<Double> list = new ArrayList<Double>(results.keySet());
Collections.sort(list);
Collections.reverse(list);
return results.get(list.get(0)).trim().split(" ");
}
return null;
}
Intro Slides in English: http://docs.google.com/present/edit?id=0AbbbdNFzwcADZGR3Z3N0NG1fMTk4M2hraGZjNmRw&hl=en
Live Demo: http://951438.appspot.com/pinyin.jsp?txt=zhongwenpinyinshurufa
If you need advanced features, consider using the Rime Input Method Engine or sunpinyin.
FYI, Python Binding for sunpinyin.

How to convert from ArrayList to String?

After compiling an ArrayList in java, how do I print it as a string?
Using ArrayList.toString() gives the values with brackets around them and commas between them.
I want to print them without brackets and only spaces between them.
(Assuming Java)
You can write your own method to do that:
public static <T> String listToString(List<T> list) {
StringBuilder sb = new StringBuilder();
boolean b = false;
for (T o : list) {
if (b)
sb.append(' ');
sb.append(o);
b = true;
}
return sb.toString();
}
Or, if you're using Guava, you can use Joiner:
Joiner.on(' ').join(list)
Similarly, if you are only interested in printing, you can avoid creating a new string altogether:
public static <T> void printList(List<T> list) {
for (T o : list) {
System.out.print(o);
System.out.print(' ');
}
System.out.println();
}
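On Java 8 or later, the JRE can do the joining itself, so neither a hand-written loop nor a third-party library is needed. The sketch below shows both standard options; the class name JoinExample and the method joinWithSpaces are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; only String.join and Collectors.joining are JRE APIs.
final class JoinExample {
    // For lists of strings (or any CharSequence), String.join is enough.
    static String joinStrings(List<String> list) {
        return String.join(" ", list);
    }

    // For arbitrary element types, convert each element first, then join.
    static String joinWithSpaces(List<?> list) {
        return list.stream()
                .map(e -> String.valueOf(e)) // handles null elements as "null"
                .collect(Collectors.joining(" "));
    }
}
```

For a list containing "one", "two" and "three", both methods return "one two three" with no brackets, commas or trailing space.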
If you're using Eclipse Collections, you can use the makeString() method.
ArrayList<String> list = new ArrayList<String>();
list.add("one");
list.add("two");
list.add("three");
Assert.assertEquals(
"one two three",
ArrayListAdapter.adapt(list).makeString(" "));
If you can convert your ArrayList to a FastList, you can get rid of the adapter.
Assert.assertEquals(
"one two three",
FastList.newListWith("one", "two", "three").makeString(" "));
Note: I am a committer for Eclipse Collections.
for c#
string.Join(" ", _list);
Not sure what language you're using, but try either:
ArrayList.join()
or
ArrayList.toArray().join()
for(int i = 0; i < arraylist.size(); i++){
System.out.print(arraylist.get(i).toString() + " ");
}
???

System.getProperty("line.separator") equivalent in j2me

I need to have a cross-platform newline reference to parse files, and I'm trying to find a way to do the equivalent of the usual
System.getProperty("line.separator");
but trying that in J2ME, I get a null String returned, so I'm guessing line.separator isn't included here. Are there any other direct ways to get a universal newline sequence in J2ME as string?
edit: clarified question a bit
It seems I forgot to answer my own question. I used a piece of code that lets me pass "\r\n" as the delimiter, which also treats \r and \n on their own as delimiters:
public class Tokenizer {
public static String[] tokenize(String str, String delimiter) {
StringBuffer strtok = new StringBuffer();
Vector buftok = new Vector();
char[] ch = str.toCharArray(); //convert to char array
for (int i = 0; i < ch.length; i++) {
if (delimiter.indexOf(ch[i]) != -1) { //if i-th character is a delimiter
if (strtok.length() > 0) {
buftok.addElement(strtok.toString());
strtok.setLength(0);
}
}
else {
strtok.append(ch[i]);
}
}
if (strtok.length() > 0) {
buftok.addElement(strtok.toString());
}
String[] splitArray = new String[buftok.size()];
for (int i=0; i < splitArray.length; i++) {
splitArray[i] = (String)buftok.elementAt(i);
}
buftok = null;
return splitArray;
}
}
I don't think "line.separator" is a system property of JME. Take a look at this documentation at SDN FAQ for MIDP developers: What are the defined J2ME system property names?
Why do you need to get the line separator anyway? As far as I know, you can simply use "\n" in JME.
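When the goal is to parse files that may come from any platform, it can be simpler not to look up a separator at all and instead accept "\r\n", "\r" and "\n" as line endings. The following is a rough sketch under that assumption. The class name LineSplitter is made up, and the code sticks to String, StringBuffer and a raw Vector so that it should also fit a CLDC environment. Unlike the tokenizer above, it keeps empty lines.

```java
import java.util.Vector;

// Hypothetical helper; raw Vector is used on purpose because CLDC has no generics.
final class LineSplitter {
    static String[] splitLines(String text) {
        Vector lines = new Vector();
        StringBuffer current = new StringBuffer();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r' || c == '\n') {
                // Treat "\r\n" as one terminator by skipping the '\n'.
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
                lines.addElement(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // A final line without a terminator is still a line.
        if (current.length() > 0) {
            lines.addElement(current.toString());
        }
        String[] result = new String[lines.size()];
        lines.copyInto(result);
        return result;
    }
}
```

For the input "a\r\nb\n\nc\rd" this produces the five lines "a", "b", "", "c" and "d", regardless of which platform wrote the file.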
