Manipulate or split string - string

I have a quick question whose answer would solve some problems for me:
Is it possible to split / manipulate the request URL with nginx?
What I mean is that a URL like this: sub.somewhere.com/something/somethingelse
Is turned into:
subsomethingsomethingelse
And then further into:
sub/som/eth/ing/som/eth/ing/els/e
The resulting path is then used to retrieve a file (so it probably has to be stored in a variable that can be reused, or be used directly).
Is this possible somehow? If not, what exactly would be possible, and where are the limitations?
(edit) Are there native ways to do this, without including the Perl module? Or is that the only way? (Maybe a smaller module that only does string handling?)
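To make the intended transformation concrete, here is a minimal Python sketch of the string handling itself, outside nginx. The function name shard_path and the width parameter are made up for illustration, and it assumes that only the first host label (the subdomain) is kept, as in the example above.

```python
def shard_path(url: str, width: int = 3) -> str:
    """Flatten subdomain + path and re-split into fixed-width directory chunks.

    Hypothetical helper: assumes the URL has no scheme and that only the
    first host label (the subdomain) contributes to the result.
    """
    host, _, path = url.partition("/")
    subdomain = host.split(".")[0]
    # "sub" + "something" + "somethingelse" -> "subsomethingsomethingelse"
    flat = subdomain + path.replace("/", "")
    # Cut the flat string into pieces of `width` characters and join with "/".
    return "/".join(flat[i:i + width] for i in range(0, len(flat), width))


if __name__ == "__main__":
    print(shard_path("sub.somewhere.com/something/somethingelse"))
```

For the example URL this yields the sub/som/eth/ing/som/eth/ing/els/e layout described in the question.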

It is possible and relatively easy. All you need to do is match your location against a regexp with the appropriate backreferences:
location ~ (sub).(somewhere).(com)/(some)(thing)/(something)(else) {
    set $var1 $1; # = sub in the above example
    set $var2 $2; # = somewhere in the above example
    set $var3 $3; # = com in the above example
    set $var4 $4; # = some in the above example
    set $var5 $5; # = thing in the above example
    set $var6 $6; # = something in the above example
    set $var7 $7; # = else in the above example
    rewrite ^ $var1/$var2 last; # would be sub/somewhere
}
You need to save the backreferences before the rewrite, because the rewrite directive resets them to the captures of the regexp in its own first argument. If you use some other directive that doesn't do that, such as try_files, you could use the backreferences directly without saving them.

This can be done easily with the nginx Perl module.
I don't know of another way or of a native module that does this.
You could write your own nginx module for it, but that is not really necessary.
The Perl module is quite simple and fast for string manipulation. I have used it successfully for a similar task in production.

Related

How to extract substring (path to folder) from string using batch script or PowerShell

I have been trying to remove the "=" character from a string in a batch script using this:
set Path_var=%Path_var:^==%
Unfortunately this does not work... I tried also some other common solutions like:
set Path_var=%Path_var:"="=%
set Path_var=%Path_var:'='=%
But without success. It may also be worth explaining what I need this for, since you may be able to suggest a better solution. I extract one line from an XML configuration file. The line is the following:
<burning addDicomViewer="true" finalizeMedium="true" dicomViewer="C:\user\App_folder\App-name_subfolder_1.1.1_Setup" burnVerification="true" numberOfCopies="0" cleanupProjectData="false" volumeName="Patient Medium"/>
From this line I need to extract this path: "C:\user\App_folder\App-name_subfolder_1.1.1_Setup" (the path will not always be the same).
My strategy was simply to remove a fixed number of characters before the path, since I know the preceding settings will always be the same and therefore the length of that part of the string won't change.
set /p Path_var= < temp_file01.txt
set Path_var=%Path_var:~81,100%
Then I wanted to use simple substitution to remove the rest. For example:
set Path_var=%Path_var:burnVerification=%
But I ran into the problem that my string contains characters like "=", which I cannot remove with this method (obviously because they are handled as operators). I was also wondering what I should do if there is a space character in my path: when I then try to remove the trailing blank characters, I also invalidate my path.
I know batch scripts are not the best tool for string manipulation, but I have no other choice, as my boss wants me to use a scripting language that does not need compiling.
I asked a colleague for help and he came up with the following PowerShell solution:
$path_temp_file01 = "C:\user\temp\tmpFile_backup_script01.txt"
$path_temp_file02 = "C:\user\temp\tmpFile_backup_script02.txt"
$string = [IO.File]::ReadAllText($path_temp_file01)
$Start = $string.IndexOf("C:")
$string = $string.substring($Start)
$End =$string.IndexOf("""")
$string = $string.substring(0,$End)
$string > $path_temp_file02
It works for me. I am posting it here in case someone needs a similar solution or has a better idea of how to do this.
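As an alternative to counting characters or searching for "C:", the attribute value can be pulled out by name with a regular expression. The following is only a sketch in Python (another language that needs no compiling); the helper name extract_attribute is hypothetical, and it assumes the value is always enclosed in double quotes and contains none itself.

```python
import re


def extract_attribute(line: str, name: str):
    """Return the double-quoted value of attribute `name`, or None if absent.

    Hypothetical helper: assumes the value itself contains no double quotes.
    """
    # \b plus a case-sensitive match keeps "dicomViewer" from matching
    # inside "addDicomViewer".
    match = re.search(r'\b' + re.escape(name) + r'="([^"]*)"', line)
    return match.group(1) if match else None


if __name__ == "__main__":
    line = (r'<burning addDicomViewer="true" finalizeMedium="true" '
            r'dicomViewer="C:\user\App_folder\App-name_subfolder_1.1.1_Setup" '
            r'burnVerification="true" numberOfCopies="0"/>')
    print(extract_attribute(line, "dicomViewer"))
```

Because the match is anchored on the attribute name and the closing quote, spaces inside the path and a changing path length do not matter.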

How to copy part of a URL to a redirect path

I am trying to redirect a path, e.g. www.something.com/apple/pie to www.something.com/tickets/pie-details
This is what I have tried, but it doesn't work:
if (req.url ~ "^/apple/.*") {
set req.url = "^/tickets/.*-details";
error 701 req.url;
}
Am I missing something?
You either need to capture the matches from the regex or, if the case is simple, just replace using regsub().
However, I have read that these are no longer bundled in core Varnish, so you might need a vmod. This one appears to be the one you need: https://gitlab.com/uplex/varnish/libvmod-re
Here are some docs on how this can be used: https://docs.fastly.com/en/guides/vcl-regular-expression-cheat-sheet#capturing-matches
Basically, the re object lets you take the matched portion and then assemble the new URL using string operations.
All of the above is speculative, based on my knowledge of VCL and regex; I have not personally tried it.
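The capture-and-reassemble idea is independent of Varnish, so here is a rough sketch of it in Python rather than VCL. The function name redirect_target is made up for illustration; the point is only that the part after /apple/ is captured by the regex and reused when building the new path, instead of assigning a pattern string to the URL.

```python
import re


def redirect_target(path: str):
    """Map /apple/<name> to /tickets/<name>-details; None if the path does not match.

    Hypothetical helper illustrating capture groups, not Varnish's API.
    """
    match = re.fullmatch(r"/apple/(.+)", path)
    if match is None:
        return None
    # group(1) holds whatever followed "/apple/".
    return f"/tickets/{match.group(1)}-details"


if __name__ == "__main__":
    print(redirect_target("/apple/pie"))
```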

Variables assignment better approach

I am a beginner and wrote a shell script that works fine, but it takes a number of parameters.
I assign their values to variables as shown below.
url=$2
name=$3
ipadd=$5
netmask=$6
vlanid=$4
vlname=$7
Is there a better approach I can use?
Thanks.
You can use read instead of multiple assignments:
f=$'\6' # or any other control character
IFS=$f read -d '' -r _ url name vlanid ipadd netmask vlname _ < <(printf "%s$f" "$@")
The first _ discards $1, and the last _ absorbs $8 and anything after it.
The only approach I can see doing a really better job would be to change to a --flag=value setup, if only so that the order of the arguments no longer matters.
./myscript.sh --url=http://www.example.com --ip=10.42.56.23 --netmask=24
This would then require checking each argument for the --flag part and, if it is found, splitting the argument at the = and assigning the value to the corresponding internal variable. It is worth it for something you are shipping to users, but maybe not so much for something you only use yourself.
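The parsing step described above can be sketched as follows. This is written in Python only to show the logic; the function name parse_flags is hypothetical, and a shell version would do the same thing with a loop over the arguments and a split at the first =.

```python
def parse_flags(args):
    """Split --flag=value arguments into a dict; keep everything else as positional.

    Hypothetical helper: the flag names are whatever the caller passes in.
    """
    options = {}
    positional = []
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            # Split at the first "=" only, so values may contain "=" themselves.
            flag, _, value = arg[2:].partition("=")
            options[flag] = value
        else:
            positional.append(arg)
    return options, positional


if __name__ == "__main__":
    opts, rest = parse_flags(
        ["--url=http://www.example.com", "--ip=10.42.56.23", "--netmask=24"]
    )
    print(opts, rest)
```

Since the values are looked up by name, the caller can pass the flags in any order.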

Removal of trailing dot in RewriteRule of .htaccess

The .htaccess rewrite rule applied in a restful database application:
RewriteRule ^author/([A-z.]+)/([A-z]+)$ get_author.php?first_name=$1&last_name=$2
applied to
http://localhost:8080/API/author/J./Doe
removes the period from "J." and the resulting name "J Doe" is obviously not in the database (while "J. Doe" is). This rewrite rule only removes a trailing period, e.g. "J.O" translates correctly to "J.O". I use XAMPP 7.0.6 plus Apache under Windows 10. What to do in order to NOT remove the trailing dot on the initial?
Update:
Apparently my question wasn't clear, I give it another try.
The regexp (RewriteRule) above is supposed to assign "J." to the variable $1. Instead it assigns "J" to $1, in other words, the regex drops the trailing dot. Secondly, the regex assigns "Doe" to the variable $2, this assignment is as expected and correct. The variables $1 (with incorrect value "J") and $2 (with correct value "Doe") are used in a database search. This search fails because of the missing dot. The database contains "J. Doe", but not "J Doe".
When a dot is not trailing, as in "J.O", the variable $1 gets the correct value "J.O". In other words, the regex does not remove all dots, only the trailing ones.
My question is: how can I tell (the rewrite engine of) .htaccess to apply the regexp correctly?
For comparison, the following piece of JS code does what I want:
var regexp = "^author/([A-z.]+)/([A-z]+)$";
var result = "author/J./Doe".match(regexp);
alert(result[1] + " " + result[2]);
This is apparently (still) a "feature": https://bz.apache.org/bugzilla/show_bug.cgi?id=20036
Problem: Apache strips all trailing dots and spaces unless the path segment is exactly "." or "..".
I ran into the problem because I tried to map a URL from get/a/b/c to get.php?param1=a&param2=b&param3=c, but c can legitimately have trailing dots. The issue is not actually related to mod_rewrite; it happens with regular URLs too. Example URL of a file that's definitely not named this way: Example favicon file. Other servers don't do this. Example: Stackoverflow favicon file. This turns the behaviour into a way to detect an Apache server when the HTTP Server header is stripped.
To work around this problem, I still map the URL using mod_rewrite, but then in the PHP script, I use the exact same regex to manually map the parameters:
if(preg_match('#/get/([^/]+)/([^/]+)/(.+)$#',$_SERVER['REQUEST_URI'],$matches)){
$param1=$matches[1];
$param2=$matches[2];
$param3=$matches[3];
}
Instead of using the PATH_INFO, I use the REQUEST_URI because it's untouched.
This means that if you absolutely need to pass trailing dots in a path string to a backend behind Apache, your best bet right now is to write an intermediate script that extracts the proper parameters and then makes the proxy request for you.
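For such an intermediate script, the extraction step amounts to running the same pattern over the raw request URI. A minimal sketch in Python, assuming the script receives the untouched URI as a string; the name params_from_request_uri is made up for illustration:

```python
import re

# Same pattern as in the PHP snippet above.
PATTERN = re.compile(r"/get/([^/]+)/([^/]+)/(.+)$")


def params_from_request_uri(uri: str):
    """Return (param1, param2, param3) from the raw URI, or None if it does not match."""
    match = PATTERN.search(uri)
    return match.groups() if match else None


if __name__ == "__main__":
    print(params_from_request_uri("/get/a/b/c.."))
```

The regex itself does not touch trailing dots, so the third parameter arrives exactly as it was sent.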

perl byte code generation with too many file.pl

How can I generate Perl bytecode when a sub lives in another file.pl, so that I can turn all my Perl scripts into a binary to hand out for use? At the moment I get a core dump.
Here is an example of what I have done:
File: add.pl
require "util.pl";
$a = 1;
$b = 2;
$res = add($a,$b);
print $res;
File: util.pl
sub add()
{
my ($a,$b) = @_;
my $c = $a + $b;
return $c;
}
1; #to return true
Then when I run:
perlcc add.pl
./a.out
I get
Segmentation fault (core dumped)
I also tried
perlcc add.pl util.pl
but it says
/usr/bin/perlcc: using add.pl as input file, ignoring util.pl
Note:
If both are in single file
perlcc file.pl
and
./a.out
will work
I cannot answer for the actual compiler problem, but let me make a few notes.
<Edit> The more I look at this, the more I think that the problem is the namespacing of the add function. When both are in the same file, the function is declared in the main namespace. I think that would be true of the require-d file too, since there was no package declaration. Either way, these are still some good notes that I hope help. </Edit>
You really should use the strict and warnings pragmas.
You shouldn't use $a and $b, because they are semi-magical in Perl and should be avoided (yeah, that's a weird one).
Perl prototypes are not the same as in most languages, and even then the empty prototype () on your add function is incorrect; it is best to leave it off.
Those things said, here is how I would format my files.
File: add.pl
use strict;
use warnings;
use MyUtils;
my $x = 1;
my $y = 2;
my $res = add($x,$y);
print $res;
File: MyUtils.pm
package MyUtils;
use strict;
use warnings;
use parent 'Exporter';
our @EXPORT = ('add');
sub add
{
my ($x,$y) = @_;
my $c = $x + $y;
return $c;
}
1;
This uses the more modern module/package formalism for reusable libraries. The use directive contains a require directive, but does it at compile-time rather than run-time.
The Exporter module (and the @EXPORT variable) correctly import the function into the script's namespace (typically main).
Perhaps perlcc will like these changes better; but even if not, these are good practices to get used to.
perlcc was removed from Perl in version 5.10.0 (almost five years ago). The perldelta manual page has this to say:
perlcc, the byteloader and the supporting modules (B::C, B::CC,
B::Bytecode, etc.) are no longer distributed with the perl sources.
Those experimental tools have never worked reliably, and, due to the
lack of volunteers to keep them in line with the perl interpreter
developments, it was decided to remove them instead of shipping a
broken version of those.
Seeing that, I have to suggest that using perlcc with any version of Perl is probably a rather bad idea. It was an experimental feature that never really worked. You probably want to move away from using it.

Resources