Find ksh functions using nodejs regex

I want to write a regex that matches the following: any number of characters followed by an opening and a closing bracket. It also should not match if there is a # in front of it.
Here is what I tried so far:
\s*^(?!#)*[A-Za-z0-9_]*\(\)
The problem is that this regex seems to match every line in the test file, not just the functions that I want.
Have a great day
EDIT:
function match(array, regex) {//Rematcher
back = [];
for (let i = 0; i < array.length; i++) {
let line = array[i];
if (regex.test(line)) {
back.push(line);
}
}
return back
}
function find(text) {//function finder
let reg = new RegExp("^(?!#)*\w*\(\)");
return reMatcher.match(text, reg);
}
let content = fs.readFileSync(file, "UTF-8");//starting point
let functions = find_functions.find(content);
content = content.split("\n");
//Testfile
meldet()
{
if true
then
if true
then
echo Pseudocode
fi
echo Pseudocode
fi
}
The regex should only match the first line, but instead it matches every line. It also shouldn't match lines where a # comes before the function header.

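In the pattern \s*^(?!#)*[A-Za-z0-9_]*\(\) you can omit the \s* before the start-of-line anchor ^.
As # is not in the character class, a line starting with # cannot match anyway, so you can also omit the negative lookahead (?!#). Use + instead of * after the character class so that the function name cannot be empty.
Instead of using reMatcher.match(text, reg);, you could call match on the text variable directly. Because text holds the whole file, add the m flag so that ^ matches at the start of every line.
Note that backslashes must be doubled in the string passed to the RegExp constructor. Without that, "\w" becomes "w" and "\(\)" becomes "()", an empty group, so the resulting pattern matches every line.
You could update the function find to:
function find(text) {
    let reg = new RegExp("^[A-Za-z0-9_]+\\(\\)", "m");
    let result = text.match(reg);
    return result ? result[0] : null; // match returns null when nothing matches
}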
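
The line-by-line filtering described in the question can be sketched as follows. This is a minimal sketch, not a full ksh parser: the helper name findKshFunctions and the sample text are made up for illustration, and only the name() header form is covered (the "function name {" form is not).

```javascript
// Hypothetical helper: returns the lines of `text` that look like ksh function headers.
function findKshFunctions(text) {
  // ^\s*            optional indentation
  // [A-Za-z_]\w*    function name
  // \s*\(\s*\)      "()" with optional spaces
  // A line starting with "#" can never match, because "#" is not a name character.
  const header = /^\s*[A-Za-z_]\w*\s*\(\s*\)/;
  return text.split(/\r?\n/).filter((line) => header.test(line));
}

// Made-up sample input, similar to the test file in the question.
const sample = [
  "meldet()",
  "{",
  "  echo Pseudocode",
  "}",
  "# disabled()",
  "  helper ()",
].join("\n");

console.log(findKshFunctions(sample)); // [ 'meldet()', '  helper ()' ]
```

Using a regex literal avoids the double-escaping problem of the RegExp constructor altogether.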

Related

Replacing the number in a string

If my string is, say, "Alfa1234Beta",
how can I convert all the digits to "_"?
For example, "Alfa1234Beta"
should become "Alfa____Beta".

Going with the Regex approach pointed out by others is possibly OK for your scenario. Mind you, however, that regexes tend to be overused. A hand-rolled approach could look like this:
static string ReplaceDigits(string str)
{
StringBuilder sb = null;
for (int i = 0; i < str.Length; i++)
{
if (Char.IsDigit(str[i]))
{
if (sb == null)
{
// Seen a digit, allocate StringBuilder, copy non-digits we might have skipped over so far.
sb = new StringBuilder();
if (i > 0)
{
sb.Append(str, 0, i);
}
}
// Replace current character (a digit)
sb.Append('_');
}
else
{
if (sb != null)
{
// Seen some digits (being replaced) already. Collect non-digits as well.
sb.Append(str[i]);
}
}
}
if (sb != null)
{
return sb.ToString();
}
return str;
}
It is more lightweight than Regex and only allocates when there is actually something to replace. So go ahead and use the Regex version if you like; if profiling shows that it is too heavyweight, you can use something like the above. YMMV.

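You can run a for loop over the string and use the following check to decide whether the current character is a digit that should be replaced with _:
if (System.Text.RegularExpressions.Regex.IsMatch(i.ToString(), "^[0-9]$"))
Here the variable i is the current character in the for loop. Regex.IsMatch expects a string, hence the ToString() call.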

You can use this:
var s = "Alfa1234Beta";
var s2 = System.Text.RegularExpressions.Regex.Replace(s, "[0-9]", "_");
s2 now contains "Alfa____Beta".
Explanation: the regex [0-9] matches any digit from 0 to 9 (inclusive). The Regex.Replace then replaces all matched characters with an "_".
EDIT
And if you want it a bit shorter AND also match non-latin digits, use \d as a regex:
var s = "Alfa1234Beta๓"; // ๓ is "Thai digit three"
var s2 = System.Text.RegularExpressions.Regex.Replace(s, @"\d", "_");
s2 now contains "Alfa____Beta_".
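
For comparison, here is a small sketch of the same replacement in JavaScript, the language used elsewhere on this page. One difference is worth knowing: in JavaScript, \d matches only the ASCII digits 0-9, so covering non-Latin digits requires the Unicode property escape \p{Nd} together with the u flag. The variable names are illustrative.

```javascript
const s = "Alfa1234Beta๓"; // ๓ is "Thai digit three"

// In JavaScript, \d only matches the ASCII digits 0-9.
const asciiOnly = s.replace(/\d/g, "_");

// \p{Nd} with the "u" flag matches any Unicode decimal digit.
const anyDigit = s.replace(/\p{Nd}/gu, "_");

console.log(asciiOnly); // Alfa____Beta๓
console.log(anyDigit);  // Alfa____Beta_
```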

Node - Test if string contain element of array replace him by random element of same array (synonym)

I want to change each word that matches the synonym list randomly by another synonym or itself (to randomly keep this keyword).
I test if a string (input) contains one element of an array (words). If it's true, I want to randomly replace this with the element of this same list.
var input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
words_synonym = ["amazing", "formidable", "great", "smart"];
// first condition --> true if "input" contain one element of "words_synonym"
input = input.toLowerCase();
console.log(words_synonym.some(word => input.includes(word)));
After that, I want to replace the element that satisfied the condition with a random element of the same array (words_synonym).
But I can't select this element; all I get is true or false.
var random_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))]
input = input.replace(element, random_word, 0)
thanks

The way you have it right now, you're checking if any of the synonyms match any of the words (via words_synonym.some(word => input.includes(word))). In order to do what you want, you'll need both the position of the target word and the new word, neither of which you have now. To do this, you'll want to break apart your nested loops.
The code words_synonym.some(word => input.includes(word)) is equivalent to:
let has_synonym = false;
for (const word of words_synonym) { // this is a loop
    if (input.includes(word)) { // this is also a loop
        has_synonym = true;
        break;
    }
}
console.log(has_synonym);
So to fix your main issue, just replace includes with indexOf.
To handle the case of replacing all of the tokens, I would suggest keeping track of the token you have replaced outside of the loop, otherwise you end up replacing each token many times which may become very expensive. To do this, just keep track of your starting position outside of the loop and increment it with the end index of the replacement word. indexOf already takes a start argument for exactly this use case!
const input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
const words_synonym = ["amazing", "formidable", "great", "smart"];
let output = input;
let start = 0; // index of the end of the last replaced token
for (const word of words_synonym) {
    let index = output.indexOf(word, start);
    while (index >= 0) {
        const new_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))];
        // slice replaces the deprecated substr; the behavior is the same here
        output = output.slice(0, index) + new_word + output.slice(index + word.length);
        start = index + new_word.length + 1; // increment the start
        index = output.indexOf(word, start);
    }
}
console.log("input: ", input);
console.log("output: ", output);
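
An alternative that avoids tracking positions by hand is a single replace call with a callback, so that each occurrence is visited exactly once and a freshly inserted synonym is never replaced again. The following is a minimal sketch under a few assumptions: the helper name replaceSynonyms and its injectable random parameter are made up for illustration, and matching is restricted to whole words, case-insensitively.

```javascript
// Hypothetical helper; `random` is injectable so the behaviour can be tested.
function replaceSynonyms(input, synonyms, random = Math.random) {
  // Escape regex metacharacters so every synonym is matched literally.
  const escaped = synonyms.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  // One alternation, whole words only, case-insensitive, all occurrences.
  const pattern = new RegExp(`\\b(?:${escaped.join("|")})\\b`, "gi");
  // The callback runs once per match, so each occurrence gets its own random pick.
  return input.replace(pattern, () => synonyms[Math.floor(random() * synonyms.length)]);
}

const synonymList = ["amazing", "formidable", "great", "smart"];
const text = "This is an amazing text blob where this word amazing is replaced. Isn't this amazing!";
console.log(replaceSynonyms(text, synonymList));
```

Because the pick is random, the output differs between runs; passing a fixed function as the third argument makes it deterministic.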

You can use method find:
words_synonym.find(word => input.includes(word))
Which, according to the docs, returns:
"The value of the first element in the array that satisfies the provided testing function. Otherwise, undefined is returned."
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find

I have modified dantiston's answer and added a loop in order to replace every word that matches "words_synonym".
But there is a problem: the program doesn't check all the words of "words_synonym", only the first one found with indexOf.
var input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
words_synonym = ["amazing", "formidable", "great", "smart"];
let output = input;
for (word of words_synonym) {
let index = output.indexOf(word);
if (index >= 0) {
console.log(word);
var indexes = [], i = -1;
while ((i = output.indexOf(word, i+1)) != -1){
index=output.indexOf(word, i);
var new_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))];
output = output.substr(0, index) + new_word + output.substr(index + word.length, output.length);
}
}
}
console.log("input: ", input);
console.log("output: ", output);

Get the first line of a string in haxe

Let's assume we have a multiline string, like
var s:String = "my first line\nmy second line\nmy third line\nand so on!";
What is the best way to get (only) the first line of this string in Haxe? I know I can do something like:
static function getFirstLine(s:String):String {
var t:String = s.split("\n")[0];
if(t.charAt(t.length - 1) == "\r") {
t = t.substring(0, t.length - 1);
}
return t;
}
However I'm wondering if there is any easier (predefined) method for this ...

Caveat that @Gama11's answer works well and is more elegant than this.
If your string is long, split will iterate over the whole thing and allocate an array containing every line in your string, both of which are unnecessary here. Another option would be indexOf:
static function getFirstLine(s:String):String {
var i = s.indexOf("\n");
if (i == -1) return s;
if (i > 0 && s.charAt(i - 1) == "\r") --i;
return s.substr(0, i);
}

There's no built-in utility in the standard library for this that I know of, but you can make it a bit more elegant and avoid the substring() handling for \r by splitting on a regex:
static function getFirstLine(s:String):String {
return ~/\r?\n/.split(s)[0];
}
The regex \r?\n optionally matches a carriage return followed by a line feed character.
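
Both approaches carry over almost directly to JavaScript. The sketch below is only an illustrative port (it is not Haxe code, and the function names are made up); it shows the trade-off between scanning up to the first line break and splitting the whole string.

```javascript
// indexOf variant: stops at the first line break and allocates no array.
function firstLineIndexOf(s) {
  let i = s.indexOf("\n");
  if (i === -1) return s;              // single-line string
  if (i > 0 && s[i - 1] === "\r") i--; // drop the "\r" of a "\r\n" pair
  return s.slice(0, i);
}

// Regex-split variant: shorter, but splits the whole string first.
function firstLineSplit(s) {
  return s.split(/\r?\n/)[0];
}

const multiline = "my first line\r\nmy second line\nmy third line";
console.log(firstLineIndexOf(multiline)); // my first line
console.log(firstLineSplit(multiline));   // my first line
```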

Remove function from file using sed or awk

I want to remove the function engine "map" { ... "foobar" ... }.
I tried in so many ways. It's hard because the block has empty lines and a '}' at the end, so simple delimiters don't work.
mainfunc {
var = "baz"
engine "map" {
func {
var0 = "foo"
border = { 1, 1, 1, 1 }
var1 = "bar"
}
}
}
mainfunc {
var = "baz"
engine "map" {
func {
var0 = "foo"
border = { 1, 1, 1, 1 }
var1 = "foobar"
}
}
}
... # more functions like 'mainfunc'
I tried
sed '/engine/,/^\s\s}$/d' file
but that removes every engine function. I only need to remove the one containing "foobar". Maybe a pattern could match everything, even newlines, up to foobar, something like this:
sed '/engine(.*)foobar/,/^\s\s}$/d' file
Is it possible?

Try:
sed '/engine/{:a;N;/foobar/{N;N;d};/ }/b;ba}' filename
or:
awk '/engine/{c=1}c{b=b?b"\n"$0:$0;if(/{/)a++;if(/}/)a--;if(!a){if(b!~/foobar/)print b;c=0;b="";next}}!c' filename

I would simply count the number of open and close brackets once engine "map" is matched. I cannot say whether this only works in gawk.
awk '
/^[ \t]*engine "map"/ {
ship=1; # ship is used as a boolean
b=0 # The factor between open / close brackets
}
ship {
b += split($0, tmp, "{"); # Count numbers of { in line
b -= split($0, tmp, "}"); # Count numbers of } in line
# If open / close brackets are equal the function ends
if(b==0) {
ship = 0;
}
# Ship the rest (printing)
next;
}
1 # Print line
' file
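split(string, array [, fieldsep [, seps ] ]) returns the number of pieces it created, which is one more than the number of separators found. Because the script adds the result for "{" and subtracts the result for "}", the extra 1 cancels out and b ends up as the difference between open and close brackets. The closely related patsplit() is described in the gawk manual as follows:
Divide string into pieces defined by fieldpat and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The third argument, fieldpat, is a regexp describing the fields in string (just as FPAT is a regexp describing the fields in input records). It may be either a regexp constant or a string. If fieldpat is omitted, the value of FPAT is used. patsplit() returns the number of elements created.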
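
Note that the brace-counting awk script above removes every engine "map" block, whether or not it contains "foobar". The same counting idea, extended to buffer each block and drop it only when it contains the search string, can be sketched in JavaScript as follows. The helper name removeBlocksContaining is made up for illustration, and the sketch assumes that the opening brace sits on the same line as the block header.

```javascript
// Hypothetical helper: drops every block that starts on a line matching `startRe`
// and contains `needle`; all other lines are kept unchanged.
function removeBlocksContaining(text, startRe, needle) {
  const out = [];
  let block = null; // lines of the block being buffered, or null outside a block
  let depth = 0;    // open braces minus close braces seen inside the block

  for (const line of text.split("\n")) {
    if (block === null && startRe.test(line)) {
      block = [];
      depth = 0;
    }
    if (block === null) {
      out.push(line);
      continue;
    }
    block.push(line);
    depth += (line.match(/\{/g) || []).length;
    depth -= (line.match(/\}/g) || []).length;
    if (depth <= 0) {
      // Braces are balanced: keep the block only if the needle never appeared.
      if (!block.some((l) => l.includes(needle))) out.push(...block);
      block = null;
    }
  }
  if (block !== null) out.push(...block); // unterminated block: keep it as-is
  return out.join("\n");
}

// Made-up input modelled on the file in the question.
const config = [
  'mainfunc {',
  '  engine "map" {',
  '    func {',
  '      var1 = "bar"',
  '    }',
  '  }',
  '}',
  'mainfunc {',
  '  engine "map" {',
  '    func {',
  '      border = { 1, 1, 1, 1 }',
  '      var1 = "foobar"',
  '    }',
  '  }',
  '}',
].join("\n");

console.log(removeBlocksContaining(config, /^\s*engine "map"/, "foobar"));
```

Only the second engine "map" block is removed; the first one, which contains "bar" but not "foobar", is printed unchanged.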

C++/CLI - Split a string with a unknown number of spaces as separator?

I'm wondering how (and in which way it's best to do it) to split a string with a unknown number of spaces as separator in C++/CLI?
Edit: The problem is that the space number is unknown, so when I try to use the split method like this:
String^ line;
StreamReader^ SCR = gcnew StreamReader("input.txt");
while ((line = SCR->ReadLine()) != nullptr && line != nullptr)
{
if (line->IndexOf(' ') != -1)
for each (String^ SCS in line->Split(nullptr, 2))
{
//Load the lines...
}
}
And this is a example how Input.txt look:
ThisISSomeTxt<space><space><space><tab>PartNumberTwo<space>PartNumber3
When I run the program, the first item loaded is "ThisISSomeTxt", the second, third and fourth are "" (nothing), the fifth is " PartNumberTwo" and the sixth is PartNumber3.
I only want ThisISSomeTxt and PartNumberTwo to be loaded. How can I do this?

Why not just use System::String::Split(..)?

The following code example taken from http://msdn.microsoft.com/en-us/library/b873y76a(v=vs.80).aspx#Y0 , demonstrates how you can tokenize a string with the Split method.
using namespace System;
using namespace System::Collections;
int main()
{
String^ words = "this is a list of words, with: a bit of punctuation.";
array<Char>^chars = {' ',',','.',':'};
array<String^>^split = words->Split( chars );
IEnumerator^ myEnum = split->GetEnumerator();
while ( myEnum->MoveNext() )
{
String^ s = safe_cast<String^>(myEnum->Current);
if ( !s->Trim()->Equals( "" ) )
Console::WriteLine( s );
}
}

I think you can do what you need to do with the String.Split method.
First, I think you're expecting the 'count' parameter to work differently: you're passing in 2 and expecting the first and second results to be returned and the third result to be thrown out. What it actually returns is the first result, plus the second and third results concatenated into one string. If all you want is ThisISSomeTxt and PartNumberTwo, you'll have to manually throw away the results after the first 2.
As far as I can tell, you don't want any whitespace included in your return strings. If that's the case, I think this is what you want:
String^ line = "ThisISSomeTxt \tPartNumberTwo PartNumber3";
array<String^>^ split = line->Split((array<String^>^)nullptr, StringSplitOptions::RemoveEmptyEntries);
for(int i = 0; i < split->Length && i < 2; i++)
{
Debug::WriteLine("{0}: '{1}'", i, split[i]);
}
Results:
0: 'ThisISSomeTxt'
1: 'PartNumberTwo'
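
The underlying idea, treating any run of whitespace as a single separator and then keeping only the first two fields, is not specific to .NET. As a rough illustration in JavaScript (the language used elsewhere on this page), with illustrative variable names:

```javascript
const line = "ThisISSomeTxt   \tPartNumberTwo PartNumber3";

// \s+ treats any run of spaces and tabs as one separator; trim() avoids an
// empty first or last field when the line has leading or trailing whitespace.
const fields = line.trim().split(/\s+/);

// Keep only the first two fields.
const firstTwo = fields.slice(0, 2);

console.log(firstTwo); // [ 'ThisISSomeTxt', 'PartNumberTwo' ]
```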
