How to find position of nth token


We have a string with a maximum limit of 20 words. If the user enters more than 20 words, we need to truncate the string at its 20th word. How can we automate this? We are able to find the 20th token with #GetToken(myString, 20, ' ')#, but are unsure how to find its position in order to left-trim. Any ideas?
Thanks in advance.

The UDF ListLeft() should do what you want. It takes a list and returns the list with the number of elements you define. "Space" is fine as a delimiter.
/**
 * A Left() function for lists. Returns the n leftmost elements from the specified list.
 *
 * @param list List you want to return the n leftmost elements from.
 * @param numElements Number of leftmost elements you want returned.
 * @param delimiter Delimiter for the list. Default is the comma.
 * @return Returns a string.
 * @author Rob Brooks-Bilson (rbils@amkor.com)
 * @version 1, April 24, 2002
 */
function ListLeft(list, numElements){
    var tempList="";
    var i=0;
    var delimiter=",";
    if (ArrayLen(arguments) gt 2){
        delimiter = arguments[3];
    }
    if (numElements gte ListLen(list, delimiter)){
        return list;
    }
    for (i=1; i LTE numElements; i=i+1){
        tempList=ListAppend(tempList, ListGetAt(list, i, delimiter), delimiter);
    }
    return tempList;
}
p.s. CFLIB.org is an outstanding resource, and is usually my first stop when I'm looking for something like this. I recommend it highly.
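
For readers who do not use ColdFusion, here is a rough Python sketch of the same idea as ListLeft(): split on the delimiter, keep the first n elements, and join them again. The name list_left is hypothetical, and dropping empty elements is an assumption meant to mirror how ColdFusion list functions treat consecutive delimiters by default.

```python
def list_left(items: str, count: int, delimiter: str = ",") -> str:
    """Return the count leftmost elements of a delimited list."""
    # Assumption: empty elements (from repeated delimiters) are ignored.
    elements = [element for element in items.split(delimiter) if element]
    if count >= len(elements):
        # Nothing to cut: hand the input back unchanged, as ListLeft() does.
        return items
    return delimiter.join(elements[:count])


print(list_left("one two three four", 2, " "))
```

With a space as the delimiter and a count of 20, this performs the truncation asked for in the question.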

Can also use a regular expression (group #1 contains match): ^(?:\w+\s+){19}(\w+)
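
A minimal sketch of how a pattern like this can give the cut position directly, written in Python rather than ColdFusion. The helper name truncate_words is hypothetical. One deliberate change from the pattern above: it uses \S+ instead of \w+ so that words containing punctuation (such as "don't") still count as single words.

```python
import re


def truncate_words(text: str, limit: int = 20) -> str:
    """Cut text after its limit-th word; shorter texts come back unchanged."""
    # (limit - 1) words each followed by whitespace, then one final word.
    pattern = r"^\s*(?:\S+\s+){%d}\S+" % (limit - 1)
    match = re.match(pattern, text)
    if match is None:
        # Fewer than `limit` words, so there is nothing to trim.
        return text
    # match.end() is the character offset just past the last kept word.
    return text[:match.end()]


print(truncate_words("a b c d e", 3))
```

The end offset of the match is the position the question asks about, so a left-trim at that offset keeps exactly the first 20 words.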

Maybe you could avoid trimming and instead rebuild the result from scratch, something like (pseudo-code, I don't know ColdFusion):
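result = ''
// GetToken is 1-based in ColdFusion, so count from 1 to 20
for (i = 1; i <= 20; ++i)
{
    // put the delimiter back between the tokens
    if (i > 1) result = result + ' ';
    result = result + GetToken(myString, i, ' ');
}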
Would that work?

Not sure if CF provides this, but generally there is a LastIndexOf(string token) method. Use that combined with a substring function. For instance (pseudocode):
string lastWord = GetToken(myString, 20, ' ');
string output = Substring(myString, 0, LastIndexOf(myString, lastWord)+StrLength(lastWord));
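
One thing to watch with searching for the word's text: if the same word occurs more than once, the search may land on a different occurrence than the nth token. A sketch that avoids this by walking the tokens and keeping their character offsets is shown here in Python; nth_token_span is a hypothetical helper name.

```python
import re


def nth_token_span(text: str, n: int):
    """Return (start, end) offsets of the n-th whitespace-delimited token (1-based), or None."""
    for count, match in enumerate(re.finditer(r"\S+", text), start=1):
        if count == n:
            return match.span()
    return None


text = "to be or not to be"
span = nth_token_span(text, 2)                  # the first "be"
truncated = text[:span[1]] if span else text    # cut right after that token
print(span, repr(truncated))
print(text.rfind("be"))                         # a last-index search finds the later "be" instead
```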

Related

Longest common prefix - comparing time complexity of two algorithms

If you compare these two solutions, the time complexity of the first is O(array-len * shortest-string-len), which you may shorten to O(n*m) or even O(n^2). The second seems to be O(n * log n): it has a sort, and the subsequent comparison of the first and last items is O(n), which does not affect the overall bound.
But what about comparing the string items in the list? Sorting a list of integer values is O(n * log n), but don't we need to compare the characters of the strings in order to sort them? So am I wrong to say that the time complexity of the second solution is O(n * log n * longest-string-len)?
Also, since it does not check the prefixes while sorting, it does the full sort (most of the time) anyway, so isn't its best case far worse than the other option? And in the worst case, given the point above, wouldn't it still be worse than the first solution?
public string longestCommonPrefix(List<string> input) {
    if(input.Count == 0) return "";
    if(input.Count == 1) return input[0];
    var sb = new System.Text.StringBuilder();
    for(var charIndex = 0; charIndex < input[0].Length; charIndex++)
    {
        for(var itemIndex = 1; itemIndex < input.Count; itemIndex++)
        {
            // stop as soon as any string is too short to have a character at charIndex
            if(input[itemIndex].Length <= charIndex)
                return sb.ToString();
            if(input[0][charIndex] != input[itemIndex][charIndex])
                return sb.ToString();
        }
        sb.Append(input[0][charIndex]);
    }
    return sb.ToString();
}
static string longestCommonPrefix(String[] a)
{
    int size = a.Length;
    /* if size is 0, return empty string */
    if (size == 0)
        return "";
    if (size == 1)
        return a[0];
    /* sort the array of strings */
    Array.Sort(a);
    /* find the minimum length from first
       and last string */
    int end = Math.Min(a[0].Length,
                       a[size-1].Length);
    /* find the common prefix between the
       first and last string */
    int i = 0;
    while (i < end && a[0][i] == a[size-1][i] )
        i++;
    string pre = a[0].Substring(0, i);
    return pre;
}

First of all, unless I am missing something obvious, the first method runs in O(N * shortest-string-length); shortest, not longest.
Second, you may not reduce O(n*m) to O(n^2): the number of strings and their length are unrelated.
Finally, you are absolutely right. Sorting indeed takes O(n*log(n)*m), so in no case would it improve the performance.
As a side note, it may be beneficial to find the shortest string beforehand. This would make the input[itemIndex].Length check against charIndex unnecessary.
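
The side note can be sketched as follows, in Python for brevity. This is an illustration of the column-by-column scan with the shortest length computed up front, not the asker's code; the function name is chosen for the example.

```python
def longest_common_prefix(strings):
    """Column-by-column scan: O(n * m) for n strings, m = shortest length."""
    if not strings:
        return ""
    # Bounding the scan by the shortest string removes the per-character length check.
    limit = min(len(s) for s in strings)
    first = strings[0]
    for i in range(limit):
        ch = first[i]
        if any(s[i] != ch for s in strings):
            return first[:i]
    return first[:limit]


print(longest_common_prefix(["flower", "flow", "flight"]))
```

Finding the minimum length costs one extra O(n) pass, which does not change the overall O(n*m) bound.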

How to generate number, format "AG-00001" to "AG-99999"?

I want to generate numbers in this format: "AG-00001" to "AG-99999" (8 characters). Can you help me?

Since your first three characters are "AG-" you can keep them constant and just create random numbers and add them to "AG-".
function generate(){
    // random integer from 1 to 99999, so "AG-00000" is never produced
    const n = Math.floor(Math.random() * 99999) + 1;
    // left-pad with zeros to five digits
    return "AG-" + String(n).padStart(5, "0");
}
console.log(generate());
If you want the generated strings unique, you can just add them to a list or database and check if the string already exists.
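
As an alternative to checking each new string against a stored list, the numbers can be drawn without replacement so duplicates cannot occur within one batch. The sketch below is in Python; format_code and unique_codes are hypothetical names, and it assumes all codes are generated in a single call (codes issued earlier would still need to be tracked somewhere).

```python
import random


def format_code(n: int) -> str:
    """Format an integer from 1 to 99999 as 'AG-' plus five zero-padded digits."""
    if not 1 <= n <= 99999:
        raise ValueError("n must be between 1 and 99999")
    return f"AG-{n:05d}"


def unique_codes(count: int):
    """Return `count` distinct codes in random order."""
    # random.sample draws without replacement, so no value repeats.
    return [format_code(n) for n in random.sample(range(1, 100000), count)]


print(format_code(1), format_code(99999))
print(unique_codes(3))
```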

Add comma sequentially to string in C#

I have a string.
string str = "TTFTTFFTTTTF";
How can I break this string up and insert the character ","?
The result should be: TTF,TTF,FTT,TTF

You could use String.Join after you've grouped by 3-chars:
var groups = str.Select((c, ix) => new { Char = c, Index = ix })
    .GroupBy(x => x.Index / 3)
    .Select(g => String.Concat(g.Select(x => x.Char)));
string result = string.Join(",", groups);
Since you're new to programming: that's a LINQ query, so you need to add using System.Linq to the top of your code file.
The Select extension method creates an anonymous type containing the char and the index of each char.
GroupBy groups them by the result of index / 3 which is an integer division that truncates decimal places. That's why you create groups of three.
String.Concat creates a string from the 3 characters.
String.Join concatenates them and inserts a comma delimiter between each.

Here is a really simple solution using StringBuilder
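var stringBuilder = new StringBuilder();
for (int i = 0; i < str.Length; i += 3)
{
    // Math.Min keeps Substring in range when the length is not a multiple of 3
    stringBuilder.AppendFormat("{0},", str.Substring(i, Math.Min(3, str.Length - i)));
}
// drop the trailing comma (nothing to drop for an empty string)
if (stringBuilder.Length > 0)
    stringBuilder.Length -= 1;
str = stringBuilder.ToString();
I'm not sure if the following is better.
stringBuilder.Append(str.Substring(i, Math.Min(3, str.Length - i))).Append(',');
I would suggest avoiding LINQ in this case, as it will perform a lot more operations and this is a fairly simple task.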

You can use Insert
Insert places one string into another. This forms a new string in your C# program. We use the string Insert method to place one string in the middle of another one—or at any other position.
Tip 1:
We can insert one string at any index into another. IndexOf can return a suitable index.
Tip 2:
Insert can be used to concatenate strings. But this is less efficient—concat, as with + is faster.
for(int i=3;i<=str.Length - 1;i+=4)
{
    str=str.Insert(i,",");
}
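
For comparison, the same grouping can be written as a slice-and-join. This is a sketch in Python rather than C#, and insert_commas is a hypothetical name; a trailing group shorter than three characters is kept as is.

```python
def insert_commas(s: str, size: int = 3) -> str:
    """Join consecutive chunks of `size` characters with commas."""
    return ",".join(s[i:i + size] for i in range(0, len(s), size))


print(insert_commas("TTFTTFFTTTTF"))  # TTF,TTF,FTT,TTF
```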

Google Sheets multiple search and replace from a list

I am looking for a solution to search for certain strings in a Google Sheet and, when found, replace them with another string from a list in another sheet.
For better understanding, I prepared a Sheet for you:
https://docs.google.com/a/vicampo.de/spreadsheets/d/1mETtAY72K6ST-hg1qOU9651265nGq0qvcgvzMRqHDO8/edit?usp=sharing
So here's the exact task I want to achieve:
In every single cell in column A of sheet "Text", look for the strings given in column A in sheet "List" and, when found, replace it with the corresponding string in column B of the sheet "List".
See my Example: Look in cell A1 for the string "Lorem" and replace it with "Xlorem", then look for the string "Ipsum" and replace it with "Xipsum", then look for the string "amet" and replace it with "Xamet" then move on to cell B1 and start again looking for the strings...
I have tried different functions and managed to do this with a function for one cell. But how to do it in a loop?
Thanks everyone who is interested in helping out with this problem!

Although there must be 'nicer' solutions, a quick solution (as long as the number of cells with the words you want replaced is not too large) would be:
=ArrayFormula(regexreplace(regexreplace(regexreplace(A1:A; List!A1; List!B1); List!A2; List!B2); List!A3; List!B3))

Probably the best option for you in this case is to create a new custom function for your Google Spreadsheet. In general it tends to be simpler, clearer and more powerful than complex formulas that do the same thing.
I had the same problem, so you can use the same function:
Click on the "Tools" menu, then click on the "Script Editor" option. In the script editor, erase the draft and paste this function:
function preg_quote( str ) {
    // http://kevin.vanzonneveld.net
    // + original by: booeyOH
    // + improved by: Ates Goral (http://magnetiq.com)
    // + improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // + bugfixed by: Onno Marsman
    // * example 1: preg_quote("$40");
    // * returns 1: '\$40'
    // * example 2: preg_quote("*RRRING* Hello?");
    // * returns 2: '\*RRRING\* Hello\?'
    // * example 3: preg_quote("\\.+*?[^]$(){}=!<>|:");
    // * returns 3: '\\\.\+\*\?\[\^\]\$\(\)\{\}\=\!\<\>\|\:'
    return (str+'').replace(/([\\\.\+\*\?\[\^\]\$\(\)\{\}\=\!\<\>\|\:])/g, "\\$1");
}
function ARRAYREPLACE(input,fromList,toList,caseSensitive){
    /* default behavior: not case sensitive */
    if( caseSensitive === undefined ){
        caseSensitive = false;
    }
    /* if the from list is not a list, make it a list */
    if( typeof fromList != "object" ) {
        fromList = [ fromList ];
    }
    /* if the to list is not a list, make it a list */
    if( typeof toList != "object" ) {
        toList = [ toList ];
    }
    /* force the input to be a string */
    var result = input.toString();
    /* iterate using the max size */
    var bigger = Math.max( fromList.length, toList.length) ;
    /* define the word separators */
    var arrWordSeparator = [ ".", ",", ";", " " ];
    /* iterate over the lists */
    for(var i = 0; i < bigger; i++ ) {
        /* get the word that should be replaced */
        var fromValue = fromList[ ( i % ( fromList.length ) ) ];
        /* get the new word that should replace it */
        var toValue = toList[ ( i % ( toList.length ) ) ];
        /* do not replace undefined */
        if ( fromValue === undefined ) {
            continue;
        }
        if ( toValue == undefined ) {
            toValue = "";
        }
        /* apply case sensitive rule */
        var caseRule = "g";
        if( !caseSensitive ) {
            /* make the regex case insensitive */
            caseRule = "gi";
        }
        /* for each word separator, make the replacement and update the result */
        for ( var j = 0; j < arrWordSeparator.length; j++ ) {
            /* from value being the first word of the string */
            result = result.replace( new RegExp( "^(" + preg_quote( fromValue + arrWordSeparator[ j ] ) + ")" , caseRule ), toValue + arrWordSeparator[ j ] );
            /* from value being the last word of the string */
            result = result.replace( new RegExp( "(" + preg_quote( arrWordSeparator[ j ] + fromValue ) + ")$" , caseRule ), arrWordSeparator[ j ] + toValue );
            /* from value in the middle of the string between two word separators */
            for ( var k = 0; k < arrWordSeparator.length; k++ ) {
                result = result.replace(
                    new RegExp(
                        "(" + preg_quote( arrWordSeparator[ j ] + fromValue + arrWordSeparator[ k ] ) + ")" ,
                        caseRule
                    ),
                    /* need to keep the same word separators */
                    arrWordSeparator[ j ] + toValue + arrWordSeparator[ k ]
                );
            }
        }
        /* from value is the only thing in the string */
        result = result.replace( new RegExp( "^(" + preg_quote( fromValue ) + ")$" , caseRule ), toValue );
    }
    /* return the new result */
    return result;
}
Just save your script and the new function will be available to you. You now have a function that replaces every value in the first list with the corresponding value in the second list.
=ARRAYREPLACE(C2;A1:A4;B1:B4)
for example, takes the text in C2 and replaces every element found in the A1:A4 list with its equivalent in the B1:B4 list.

Copy Sample File With Explanation
Problem
The challenge is:
Find & Replace multiple values in the input of multiple cells.
ArrayFormula's
Solutions that I count as array solutions must meet these requirements:
based on open ranges
no need to drag the formula down
no need to modify the formula when new items appear in the lists
These tests must be passed:
Is ArrayFormula
User can set Case Sensitivity
Replaces Emojis
Replaces Special Chars $\[]. etc.
CrashTest. Works for 10K rows of data
CrashTest. Works for 2K replacements
Script
I recommend using the not-regex-based script in this case. This algorithm finds and replaces text by chars:
Usage
Use as a regular formula from sheet:
=substitutes(A12:A;List!A1:B)
Code
Save this code to use the formula above:
/**
 * Substitutes in every entry in array
 * Text from prefilled array
 *
 * @param {array} input The array of strings.
 * @param {array} subTable The array of string pairs: search texts / replace texts.
 * @param {boolean} caseSensitive [optional=false]
 * TRUE to match Apple and apple as different words
 * @return The input with all replacement made
 * @customfunction
 */
function substitutes(input, subTable,caseSensitive) {
    // default behavior it is not case sensitive
    caseSensitive = caseSensitive || false;
    // if the input is not a list, become a list
    if( typeof input != "object" ) {
        input = [ input ];
    }
    var res = [], text;
    for (var i = 0; i < input.length; i++) {
        // force each array element in the input be a string
        text = input[i].toString();
        for (var ii = 0; ii < subTable.length; ii++) {
            text = replaceAll_(
                text,
                subTable[ii][0],
                subTable[ii][1],
                caseSensitive);
        }
        res.push(text);
    }
    return res;
}
/***
 * JavaScript Non-regex Replace
 *
 * Original code source:
 * https://stackoverflow.com/a/56989647/5372400
 */
function replaceAll_(str, find, newToken, caseSensitive) {
    var i = -1;
    // sanity check & defaults
    if (!str) {
        // Instead of throwing, act as
        // COALESCE if find == null/empty and str == null
        if ((str == null) && (find == null))
            return newToken;
        return str;
    }
    if (!find || find === ''){ return str; }
    if (find === newToken) { return str; }
    caseSensitive = caseSensitive || false;
    find = !caseSensitive ? find.toLowerCase() : find;
    // search process, search by char
    while ((
        i = (!caseSensitive ? str.toLowerCase() : str).indexOf(
            find, i >= 0 ? i + newToken.length : 0
        )) !== -1
    ) {
        str = str.substring(0, i) +
            newToken +
            str.substring(i + find.length);
    }
    return str;
}
Monster Formula
I've used the RegEx algorithm to solve it with native functions. This method is not recommended as it slows down your Worksheet.
The formula is:
=INDEX(SUBSTITUTE(REGEXREPLACE(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(SPLIT(SUBSTITUTE(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(VLOOKUP(SPLIT(REGEXREPLACE(A12:A;SUBSTITUTE(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1");"𑇡";"(.*)");INDEX(REGEXREPLACE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IF(SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2));MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2)))-(SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2)))-1)*MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2))<=INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2);"𑇣"&SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 
";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2));MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2)))-(SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2)))-1)*MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡");"[^𑇡]";""))/2))&"𑇤";));;2^99)));" ?𑇣";"$")));"𑇤");{List!A1:A\List!B1:B};2;)&"𑇩"));;2^99));"𑇩 ";"𑇩")&"𝅘";"𑇩")&SPLIT(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇡")&"𝅘";"𑇡")))&"𝅗";;2^99));"𝅗 *";"");"𝅘";""))
Other Solutions
Nested formulas
Nested SUBSTITUTE or REGEXREPLACE formulas as was noted in other answers.
Formulas you need to drag down for the result
Here's a sample formula. Basic logic: split the text into parts → modify the parts individually → join them into the new result.
This formula must be copied down:
=JOIN(" ";
ArrayFormula(
IFERROR(VLOOKUP(TRANSPOSE(SPLIT(A1;" "));List!A:B;2;0);TRANSPOSE(SPLIT(A1;" ")))))

An improvement on JPV's answer, which is orders of magnitude faster and works with arbitrary query and replacement strings:
=ArrayFormula(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1:A, List!A1, List!B1), List!A2, List!B2), List!A3, List!B3))
Using this format, a 15,000 cell spreadsheet with an 85-length replacement list will update in just a few seconds. Simply assemble the formula string using your scripting language of choice and you're good to go!
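
As a sketch of the "assemble the formula string" step, the following Python helper builds the nested SUBSTITUTE formula for any number of replacement rows. The function name and its default arguments (the A1:A data range and the "List" sheet name) are assumptions taken from the example formula above.

```python
def build_substitute_formula(pair_count: int,
                             data_range: str = "A1:A",
                             list_sheet: str = "List") -> str:
    """Wrap data_range in one SUBSTITUTE call per replacement row."""
    expr = data_range
    for row in range(1, pair_count + 1):
        # Each pass wraps the previous expression, so row 1 ends up innermost.
        expr = f"SUBSTITUTE({expr}, {list_sheet}!A{row}, {list_sheet}!B{row})"
    return f"=ArrayFormula({expr})"


print(build_substitute_formula(3))
```

Called with 3, it reproduces the three-level formula shown above; calling it with 85 would give the formula for an 85-row replacement list.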

With the new LAMBDA and friends:
=LAMBDA(data,re,with,BYROW(data,LAMBDA(r,if(r="","",REDUCE(r,SEQUENCE(counta(re)),LAMBDA(ini,v,REGEXREPLACE(ini,INDEX(re,v),INDEX(with,v))))))))(C5:C6,E5:E7,F5:F7)
=> Named function
=SUBSTITUTES_RE(list0,list_re,list_with)
↑ This will substitute using regular expressions
SUBSTITUTES
The definition is the same, but REGEXREPLACE is replaced with SUBSTITUTE
Other examples here:
https://docs.google.com/spreadsheets/d/1IMymPZlibT6DX4yzDX4OXj2XBZ48zEl6vBUzIHJIzVE/edit#gid=0

Here is a somewhat simpler script than Thiago Mata's. I modified the script from https://webapps.stackexchange.com/a/46895 to support either single-cell or range input:
function MSUBSTITUTE(input, subTable)
{
    var searchArray = [], subArray = [], outputArray = [];
    for (var i = 0, length = subTable.length; i < length; i++)
    {
        if (subTable[i][0])
        {
            searchArray.push(subTable[i][0]);
            subArray.push(subTable[i][1]);
        }
    }
    var re = new RegExp(searchArray.join('|'), 'g');
    /* Check if we got just a single string */
    if (typeof( input ) == "string")
    {
        outputArray.push(input.replace(re, function (match) {return subArray[searchArray.indexOf(match)];}));
    }
    else /* we got an array of strings */
    {
        for (var i = 0; i < input.length; i++)
        {
            /* force each array element in the input be a string */
            var text = input[i].toString();
            outputArray.push(text.replace(re, function (match) {return subArray[searchArray.indexOf(match)];}));
        }
    }
    return outputArray;
}
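
The single-pass idea in this script (one alternation pattern plus a lookup for each match) can be sketched in Python as below. Two things are added here that the script above does not do and that are assumptions of this sketch: search strings are escaped so characters like "$" or "." match literally, and longer search strings are tried first so that one entry being a prefix of another does not hide the longer one. The name multi_substitute is hypothetical.

```python
import re


def multi_substitute(text: str, pairs) -> str:
    """Replace every search string in one pass over text; pairs is (find, replacement) tuples."""
    table = {find: repl for find, repl in pairs if find}
    if not table:
        return text
    # Longest keys first, each escaped so regex metacharacters are literal.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # Replaced text is not rescanned, so replacements cannot chain into each other.
    return pattern.sub(lambda match: table[match.group(0)], text)


pairs = [("Lorem", "Xlorem"), ("Ipsum", "Xipsum"), ("amet", "Xamet")]
print(multi_substitute("Lorem Ipsum dolor sit amet", pairs))
```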

I've found a simple way to do this with "ARRAYFORMULA"
You must have one list with the text to find and, in an adjacent column, the list of replacement values, for example:
#  D        E
1  ToFind   ToReplace
2  Avoc4do  Avocado
3  Tomat3   Tomate
4  On1on    Onion
5  Sug4r    Sugar
then use this formula
=ARRAYFORMULA(FIND(A1:A1000,D1:D5,E1:E5))
A1:A1000 is the original column where you have multiple rows with the words "Avoc4do, Tomat3, On1on, Sug4r". ArrayFormula works with a matrix where other formulas can't (FIND by itself can't search across a matrix, so we use ArrayFormula).
Then you will have a column with the 1000 rows, but now with the "ToReplace" text in order, so now cut and paste it into column A, and that's it.

Got it
Lorem ipsum dolor sit xamet Lorem ipsum
= textjoin("";true;ARRAYFORMULA(ifna(vlookup(REGEXEXTRACT(A1;"("&REGEXREPLACE(A1;"("&(textJOIN("|";true;lookuprange))&")";")($1)(")&")");lookuprange;2;false);REGEXEXTRACT(A1;"("&REGEXREPLACE(A1;"("&(textJOIN("|";true;lookuprange))&")";")($1)(")&")"))))
Xlorem ipsum dolor sit Xamet Xlorem ipsum

dart efficient string processing techniques?

I have strings in the format name:key:dataLength:data, and these strings can often be chained together. For example, "aNum:n:4:9879aBool:b:1:taString:s:2:Hi" would map to an object something like:
{
aNum: 9879,
aBool: true,
aString: "Hi"
}
I have a method for parsing a string in this format, but I'm not sure whether its use of substring is the most efficient way of processing the string. Is there a more efficient way of processing strings in this fashion (repeatedly chopping off the front section)?
Map<String, dynamic> fromString(String s){
    Map<String, dynamic> _internal = new Map();
    int start = 0;
    while(start < s.length){
        int end;
        // 0 is name, 1 is key, 2 is data length, 3 is data
        List<String> parts = new List<String>.filled(4, '');
        for(var i = 0; i < 4; i++){
            // search from the current position; the data field ends dataLength characters after start
            end = i < 3 ? s.indexOf(':', start) : start + int.parse(parts[2]);
            parts[i] = s.substring(start, end);
            start = i < 3 ? end + 1 : end;
        }
        var tranType = _tranTypesByKey[parts[1]]; //this is just a map to an object which has a function that can convert the data section of the string into an object
        _internal[parts[0]] = tranType._fromStr(parts[3]);
    }
    return _internal;
}

I would try s.split(':') and process the resulting list.
If you do a lot of such operations you should consider creating benchmarks tests, try different techniques and compare them.
If you would still need this line
s = i < 3 ? s.substring(idx + 1) : s.substring(idx);
I would avoid creating a new substring in each iteration but instead just keep track of the next position.

You have to decide how important performance is relative to readability and maintainability of the code.
That said, you should not be cutting off the head of the string repeatedly. That is guaranteed to be inefficient - it'll take time that is quadratic in the number of records in your string, just creating those tail strings.
For parsing each field, you can avoid doing substrings on the length and type fields. For the length field, you can build the number yourself:
const int CHAR_COLON = 0x3A; // code unit of ':'
int index = ...;
// index points to first digit of length.
int length = 0;
int charCode = source.codeUnitAt(index++);
while (charCode != CHAR_COLON) {
    length = 10 * length + charCode - 0x30;
    charCode = source.codeUnitAt(index++);
}
// index points to the first character of content.
Since lengths are usually small integers (less than 2<<31), this is likely to be more efficient than creating a substring and calling int.parse.
The type field is a single ASCII character, so you could use codeUnitAt to get its ASCII value instead of creating a single-character string (and then your content interpretation lookup will need to switch on character code instead of character string).
For parsing content, you could pass the source string, start index and length instead of creating a substring. Then the boolean parser can also just read the code unit instead of the singleton character string, the string parser can just make the substring, and the number parser will likely have to make a substring too and call double.parse.
It would be convenient if Dart had a double.parseSubstring(source, [int from = 0, int to]) that could parse a substring as a double without creating the substring.
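
The "keep track of the next position" advice can be illustrated with a short Python sketch of the same record format. It is not Dart and not the asker's code: parse_records and the converters table are hypothetical, and it assumes "n" means an integer, "b" uses "t" for true, and "s" is a plain string. The point is that the source string is never re-sliced from the front; only a position advances.

```python
def parse_records(source: str) -> dict:
    """Parse chained name:key:dataLength:data records by advancing an index."""
    # Assumed meaning of the type keys, for illustration only.
    converters = {
        "n": int,
        "b": lambda text: text == "t",
        "s": str,
    }
    result = {}
    pos = 0
    while pos < len(source):
        # Locate the three colons of the record header, searching from pos onward.
        name_end = source.index(":", pos)
        key_end = source.index(":", name_end + 1)
        length_end = source.index(":", key_end + 1)
        name = source[pos:name_end]
        key = source[name_end + 1:key_end]
        length = int(source[key_end + 1:length_end])
        data_start = length_end + 1
        # The declared length, not a delimiter, decides where the data ends.
        pos = data_start + length
        result[name] = converters[key](source[data_start:pos])
    return result


print(parse_records("aNum:n:4:9879aBool:b:1:taString:s:2:Hi"))
```

Because the data field is read by length, it may itself contain colons, which a plain split on ':' would not handle.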
