Algorithms for "shortening" strings? - string

I am looking for elegant ways to "shorten" the (user-provided) names of objects. More precisely:
My users can enter free text (used as the "name" of some object) of up to 64 characters (including whitespace, punctuation marks, ...).
In addition to that "long" name, we also have a "reduced" name (exactly 8 characters), required for some legacy interface.
Now I am looking for thoughts on how to generate these "reduced" names from the 64-char name.
By "elegant" I mean any useful idea that might let the user recognize something meaningful in the shortened string.
For example, if the name is "Production Test Item A5", then maybe "PTIA5" might (or might not) tell the user something useful.

Take a substring of the long version and trim it in case there is whitespace at the end. Optionally remove any special characters (such as dashes) from the very end, and finally add a dot if you want to mark the result as an abbreviation.
Just a quick hack to get you started:
String longVersion = "Aswaghtde-5d";
// Take at most the first 8 characters
String shortVersion = longVersion.substring(0, Math.min(longVersion.length(), 8));
// Remove whitespace from both ends of the string
shortVersion = shortVersion.trim();
// Remove any trailing run of characters that are not letters, digits or whitespace (such as dashes)
shortVersion = shortVersion.replaceAll("[^a-zA-Z0-9\\s]+$", "");
// Append a dot; if the string is already 8 characters long, drop its last character first
shortVersion = shortVersion.substring(0, (shortVersion.length() < 8 ? shortVersion.length() : shortVersion.length() - 1)) + ".";
System.out.println(shortVersion);
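As an alternative to plain truncation, the initials idea from the question ("PTIA5") could be sketched roughly as follows. The class name `Initialism`, the method `reduce` and the rule of keeping digits from the rest of each word are assumptions made for illustration, not something the question specifies. Padding short results to exactly 8 characters is left out.

```java
final class Initialism {
    // Maximum length of the reduced name, taken from the question.
    static final int MAX_LEN = 8;

    /** Keeps the first character of each word plus any later digits, capped at MAX_LEN. */
    static String reduce(String longName) {
        StringBuilder sb = new StringBuilder();
        // Split on runs of anything that is not an ASCII letter or digit (assumption).
        for (String word : longName.trim().split("[^A-Za-z0-9]+")) {
            if (word.isEmpty()) {
                continue; // happens when the name starts with punctuation
            }
            sb.append(Character.toUpperCase(word.charAt(0)));
            // Digits often carry meaning ("A5"), so keep them as well.
            for (int i = 1; i < word.length(); i++) {
                char c = word.charAt(i);
                if (Character.isDigit(c)) {
                    sb.append(c);
                }
            }
        }
        return sb.length() > MAX_LEN ? sb.substring(0, MAX_LEN) : sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reduce("Production Test Item A5"));
    }
}
```

Whether such initials are recognizable depends heavily on the names users actually enter, so it may be worth combining this with truncation as a fallback.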

I needed to shorten names to function as column names in a database. Ideally, the names should remain recognizable for users. I set up a dictionary of patterns for commonly occurring words with corresponding "abbreviations". This was applied ONLY to those names which were over the limit of 30 characters.
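A minimal sketch of such a dictionary approach, in Java for consistency with the snippet above. The table entries, the class name `DictionaryShortener` and the choice to stop replacing as soon as the name fits are all hypothetical; the real patterns would come from the words that actually occur in your data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DictionaryShortener {
    // Hypothetical abbreviation table; insertion order decides which word is shortened first.
    static final Map<String, String> ABBREVIATIONS = new LinkedHashMap<>();
    static {
        ABBREVIATIONS.put("Production", "Prod");
        ABBREVIATIONS.put("Number", "No");
        ABBREVIATIONS.put("Description", "Desc");
    }

    /** Applies abbreviations only to names longer than the limit. */
    static String shorten(String name, int limit) {
        if (name.length() <= limit) {
            return name; // names within the limit are left untouched
        }
        String result = name;
        for (Map.Entry<String, String> entry : ABBREVIATIONS.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
            if (result.length() <= limit) {
                break; // stop as soon as the name fits
            }
        }
        return result; // may still be too long if no pattern matched
    }

    public static void main(String[] args) {
        System.out.println(shorten("Production Order Description", 25));
    }
}
```

Because the result can still exceed the limit, a final truncation step would be needed for names the dictionary does not cover.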

Related

How to remove Letters AFTER-BEFORE specific Letters in dart

I found many questions here about how to remove letters AFTER specific letters, but none about removing letters both before and after specific letters.
I don't know if it is possible in Dart.
Sample:
String test = 'HelloAflutterBHello';
How can I output the following result?
print(test) => 'flutter'
That means I want to delete everything before 'A' and everything after 'B'.
I tried this:
print(test.substring(0, test.indexOf('B')));
but this only deletes everything after 'B'. I couldn't find a way to delete the letters before 'A' too.
I hope for a good answer. Thanks.
You can use regular expressions to do the job. This way you can check for more than one character.
Consider this code:
void main() {
String test = 'HelloABCflutterDEFHello';
//regex match all characters between two (or more) specified characters
RegExp exp = RegExp(r"(?<=ABC).*(?=DEF)");
//store all results from searching within a string.
Iterable<RegExpMatch> matches = exp.allMatches(test);
// access the captured value
print(matches.first.group(0));
}
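The same extraction can also be done without regular expressions by locating the two markers with index searches. The idea is language-independent; it is sketched here in Java, and the helper name `between` is made up for illustration. Note one difference from the greedy `.*` in the regex above: this version stops at the first end marker that follows the start marker, not the last one.

```java
final class Between {
    /** Returns the text between the first start marker and the next end marker, or null if either is missing. */
    static String between(String s, String start, String end) {
        int from = s.indexOf(start);
        if (from < 0) {
            return null; // start marker not found
        }
        from += start.length(); // skip past the start marker itself
        int to = s.indexOf(end, from);
        if (to < 0) {
            return null; // end marker not found after the start marker
        }
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        System.out.println(between("HelloABCflutterDEFHello", "ABC", "DEF"));
    }
}
```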

Remove first space if string contains exactly 2 spaces

I'm having issues when trying to remove the first space of a string if that string has 2 spaces in it. For example it should be turning "Fully Functional Method" into "FullyFunctional Method", but "Functional Method" should not be changed because it only has 1 space. I can't really think of a way to remove first space if the string contains 2 spaces.
I don't know exactly what you want to do, but you could look into RegExp and String.replace() to replace parts of a String.
The documentation on characters, metacharacters, and metasequences is also worth reading.
var myPattern1:RegExp = /  /g;
var str1:String = "This  is  a  string  that  contains  double  spaces.";
trace(str1.replace(myPattern1, " "));
//this replaces all "  " (two spaces) by " " (one space)...
//outputs : This is a string that contains double spaces.
Or in your case (I suppose) something like this
var myPattern2:RegExp = / /;
var str2:String = "Fully Functional Method";
trace(str2.replace(myPattern2, ""));
//If you omit the g, only the first space will be replaced by ""
//outputs : FullyFunctional Method
There are so many things you can do with RegExp that I will not explain them all here...
Just check the Adobe website...
This is a quick and efficient way to work on Strings.
I hope this helps.
Once you have read the documentation, you will understand that my example is rough and needs to be modified to get a FullyFunctional Method. :D
Do a linear scan through the string. Count the number of spaces and record the index of the first space, if any. If there are two spaces, return a string that is the concatenation of the characters up to but not including the first space, and the characters after the first space.
Keep it simple. It is possible to solve your problem with regex, but keep in mind that the worst case time complexity of finding a particular character in an unsorted set is always going to be O(N), so it won't be faster.
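The linear scan described above could look roughly like this. It is written in Java only to match the other sketches in this thread (the question itself is about ActionScript), and the method name is made up.

```java
final class FirstSpace {
    /** Removes the first space if, and only if, the string contains exactly two spaces. */
    static String removeFirstSpaceIfExactlyTwo(String s) {
        int count = 0;   // number of spaces seen so far
        int first = -1;  // index of the first space, if any
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ' ') {
                if (count == 0) {
                    first = i;
                }
                count++;
            }
        }
        if (count != 2) {
            return s; // leave every other string unchanged
        }
        // Concatenate the part before the first space with the part after it.
        return s.substring(0, first) + s.substring(first + 1);
    }

    public static void main(String[] args) {
        System.out.println(removeFirstSpaceIfExactlyTwo("Fully Functional Method"));
    }
}
```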

How to match a part of string before a character into one variable and all after it into another

I have a problem with splitting a string into two parts on a special character.
For example:
12345#data
or
1234567#data
The first part has 5-7 characters and is separated by "#" from the second part, which holds other data (characters, numbers, it doesn't matter what).
I need to store the two parts on either side of # in two variables:
x = 12345
y = data
without "#" character.
I was looking for some Lua string function like splitOn("#") or substring until character, but I haven't found that.
Use string.match and captures.
Try this:
s = "12345#data"
a,b = s:match("(.+)#(.+)")
print(a,b)
See the Lua reference manual for string.match and patterns.
First of all, although Lua does not have a split function in its standard library, it does have string.gmatch, which can be used instead of a split function in many cases. Unlike a split function, string.gmatch takes a pattern to match the non-delimiter text, instead of the delimiters themselves.
This is easily achieved with a negated character class and string.gmatch:
local example = "12345#data"
for i in string.gmatch(example, "[^#]+") do
print(i)
end
The [^#]+ pattern matches one or more characters other than #, so it effectively "splits" the string on the single character #.

Matlab: How to delete prefix from strings

Problem: From TrajCompact, I find every prefix and the value that follows it, using regexp, with this code:
[digits{1:2}] = ndgrid(0:4);
for k=1:25
matches(:,k)=regexp(TrajCompact(:,1),sprintf('%d%d.*',digits{1}(k),digits{2}(k)),'match','once');
end
I want only the postfix of each match. How can I delete the prefix from matches?
Method using regular expressions
You can put the .* section in a group by enclosing it in parentheses (i.e. (.*)). Matlab has some peculiar 'token' nomenclature for this. In any case, an example of how it works:
[match, group] = regexp('25blah',sprintf('%d%d(.*)',2,5),'match','once','tokens');
Then:
match would be a char array containing '25blah'
group would be a 1x1 cell array containing the string 'blah'.
That is, the variable group would hold what you're looking for.
Hack method
Since your prefix is always two digits, you could also just take everything from the 3rd character of the match onwards:
my_string = match(3:end);
Other comments
You may want to require the prefix to occur at the beginning of the string by adding ^ to the beginning of your regular expression. E.g., make the line:
[match, group] = regexp('25blah',sprintf('^%d%d(.*)',2,5),'match','once','tokens');
As it is, your current regular expression would match strings like zzzzzzzzz25stuff. I'm not sure if you want that (assuming it can occur in your data).

function to confirm the presence of both letters and numbers/ Ignoring excedents

So, I'm trying to build a program in MATLAB following some instructions from my teacher, and I've run into a couple of obstacles that would get me a better grade if I could solve them. Here they are:
The user is asked to enter a string, but it can't have more than 20 characters. If it does, the excess characters are ignored and the string is saved with the first 20 characters the user entered. How do I ignore the excess characters in a string and save it anyway?
isletter is a function that tells us, for each element, whether it is a letter. In this program, the user is asked to enter a string that needs to include both numbers and letters, so that strings with just letters or just numbers are excluded, and then I'll use a while loop to keep asking for a string with these characteristics.
Could you please help me? This is my first semester with MATLAB. Thank you!
If you want to disallow characters other than letters and numbers (i.e. '/#!' or whitespace) and require that the string they enter has to have at least 1 letter and 1 number, then you can use the ISSTRPROP function (which is more general than ISLETTER) to check for other types of characters. The idea to use INPUTDLG to prompt for the string (as suggested in Aabaz's answer) is a good one, so here's a nice condensed solution using INPUTDLG that achieves what you want:
answer = ''; %# Initialize answer to be an empty string
while any(~isstrprop(answer, 'alphanum')) || ... %# Check for alphanumeric chars
~any(isletter(answer)) || ... %# Check for at least 1 letter
~any(isstrprop(answer, 'digit')) %# Check for at least 1 number
answer = inputdlg('Enter string:'); %# Prompt for input
answer = answer{1}(1:min(20, end)); %# Trim answer to max of 20 chars
end
Note how the functions MIN and END are used to trim the string to 20 characters.
For the first part of your problem you can use the Matlab function inputdlg which prompts a dialog box asking for user input. Then you can trim the input as you like.
For the second part of your problem, the isletter function that you mentioned tells you, for each character individually, whether it is an alphabetic letter. You could sum that result and check that it is at least 1 and less than the length of the string, which tells you that the string contains letters but not only letters. To be sure the remaining characters include numbers, count the digits as well.
Finally, you can put your code inside a while loop and change a variable when your conditions are met so that you can break outside of the loop.
This example code demonstrates this:
tryagain=1;
while(tryagain)
answer=inputdlg('Insert a 20 character string that contains both letters and numbers','User input');
answer=answer{1};
if(numel(answer)>20)
answer=answer(1:20);
end
letters=sum(isletter(answer));
numbers=sum(isstrprop(answer,'digit')); % count digit characters (str2num would also treat 'i' and 'j' as numbers)
if(letters>0 && numbers>0)
tryagain=0;
end
end
