Regular Expression validating hyperlink for export to Excel

I have a web application that takes input from a user, usually in the form of a filepath, hyperlink, or fileshare, but not always. A user may enter "\my.fileshare.com", "http://www.msdn.com", or "In my file cabinet". These inputs are exported to an Excel file. However, if the input is in the form of "\look on my desk" or "http://here it is" (notice the spaces), then after the file is exported and opened, Excel raises the ever-so-descriptive error message of, and I quote, "Error".
I'm adding a regular expression validator to the textbox where the user enters and edits these locations. Because there are a large number of existing entries, the validator needs to be as specific as possible and only reject the inputs that cause the Excel export to break. For example, "\Will Work" will work, as will "Will Work" and "\This\will also work". I need a regular expression that requires the server or fileshare name to contain no spaces if the string starts with \, http://, https://, ftp://, or ftps://, and that accepts anything at all if the string does not start with one of those prefixes.
I've been able to write the first part
^(\\)[^ \\]+(\\.*)$|^(((ht|f)tp(s)?)://)[^ /]+(/.*)$
but I can't figure out how to say ignore everything if it does not start with \, http://, https://, ftp://, ftps://.

^(?:(?:\\|(?:ht|f)tps?://)\S+|(?!\\|(?:ht|f)tps?://).*)$
Explained:
^ # start-of string
(?: # begin non-capturing group
(?:\\|(?:ht|f)tps?://)\S+ # "\, http, ftp" followed by non-spaces
| # or
(?!\\|(?:ht|f)tps?://).* # NOT "\, http, ftp" followed by anything
) # end non-capturing group
$ # end-of-string
This is pure, unescaped regex. Add character escaping according to the rules of your environment.

EDIT: Ooops premature.
This expression still doesn't allow "http://www.google.com/hello world" :/
EDIT FOR A THIRD TIME
Here we go!
^(?:(?:\\|(?:ht|f)tps?://)[^ /\\]+([/\\].*)?|(?!\\|(?:ht|f)tps?://).*)$
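A quick way to sanity-check this kind of validator is to run it against the inputs from the question. The sketch below uses Python's re module only as a test harness; the function name is_exportable is made up for illustration. It assumes the two-branch idea from this answer: a prefixed input needs a space-free server part, and anything without a prefix passes. The backslash prefix is written as it appears on this page (\\ in the regex matches one literal backslash); a double-backslash fileshare prefix would need \\\\ instead.

```python
import re

# Assumed form of the final pattern: prefix + space-free server part
# (optionally followed by a path that may contain spaces), OR any
# text that does not start with one of the prefixes.
PATTERN = re.compile(
    r"^(?:(?:\\|(?:ht|f)tps?://)[^ /\\]+([/\\].*)?|(?!\\|(?:ht|f)tps?://).*)$"
)


def is_exportable(text: str) -> bool:
    """Return True if the input passes the validator (hypothetical helper)."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    samples = [
        "http://www.msdn.com",
        "In my file cabinet",
        "http://here it is",
        "http://www.google.com/hello world",
        r"\my.fileshare.com",
        r"\look on my desk",
    ]
    for sample in samples:
        print(f"{sample!r}: {is_exportable(sample)}")
```

Only the two inputs with a space inside the server part ("http://here it is" and "\look on my desk") are rejected; a space later in the path is allowed.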

Related

How to split Rsyslog MSG by REGEX expression

I have syslog messages from my device. I am using Rsyslog and want to collect specific messages into a specific folder using a regular expression.
The configuration with the old syntax works as intended:
:msg, regex, "hostname0. %%01SHELL" /var/log/tel/hostname.log
... which produces the following logs (example):
Feb 1 17:41:18 hostname01 %%01SHELL/6/DISPLAY_CMDRECORD(s)[5461]: Recorded display command information. (Task=FW, Ip=**, VpnName=, User=_system_, AuthenticationMethod="Null", Command="display engine statistics system")
Feb 1 17:42:18 hostname02 %%01SHELL/6/DISPLAY_CMDRECORD(s)[5461]: Recorded display command information. (Task=FW, Ip=**, VpnName=, User=_system_, AuthenticationMethod="Null", Command="display engine statistics system")
My template in the new RainerScript syntax is not working:
template (name="HOST_SHELL" type="string" string="/var/log/tel/%$YEAR%-%$MONTH%-%$DAY%-HOST-SHELL.log")
if re_match($msg, 'hostname0. %%01SHELL')
then {action(type="omfile" dynaFile="HOST_SHELL")
stop
}
But nothing happens. Maybe there is another way to solve the problem, or a way to correct my template.
In the future I also plan to filter:
hostname0. %%01ERRORS into the file /var/log/tel/%$YEAR%-%$MONTH%-%$DAY%-HOST-ERRORS.log
In your RainerScript configuration, check how the if block is laid out: then { opens the block, and both the action and the stop statement belong inside it, before the closing }. Written out one statement per line:
template (name="HOST_SHELL" type="string" string="/var/log/tel/%$YEAR%-%$MONTH%-%$DAY%-HOST-SHELL.log")
if re_match($msg, 'hostname0. %%01SHELL') then {
action(type="omfile" dynaFile="HOST_SHELL")
stop
}
Also, you may want to modify the regex pattern to match a literal dot in the hostname. The dot character has a special meaning in regex: it matches any single character. To match a literal dot, escape it with a backslash (\.). Here is an updated regex pattern:
if re_match($msg, 'hostname0\..* %%01SHELL') then {
...
}
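This matches any message that contains the literal text "hostname0." followed by any characters, then " %%01SHELL". Note that the sample log lines above contain hostname01 and hostname02, with no literal dot after hostname0, so keep the unescaped . if it is meant to stand for that final digit.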
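To see how the two patterns in this thread behave on the sample log lines, they can be tried outside rsyslog. The sketch below uses Python's re module as a stand-in for rsyslog's regex engine, which is an assumption; for patterns this simple (a wildcard dot, an escaped dot, and .*) the two should agree. The sample lines are shortened copies of the ones in the question, and the helper name matching_lines is made up.

```python
import re

# Shortened copies of the two sample log lines from the question.
SAMPLE_LINES = [
    "Feb 1 17:41:18 hostname01 %%01SHELL/6/DISPLAY_CMDRECORD(s)[5461]: "
    "Recorded display command information.",
    "Feb 1 17:42:18 hostname02 %%01SHELL/6/DISPLAY_CMDRECORD(s)[5461]: "
    "Recorded display command information.",
]

# Original pattern: the dot is a wildcard standing for one character.
WILDCARD_DOT = re.compile(r"hostname0. %%01SHELL")
# Suggested pattern: a literal dot, then anything, then the marker.
ESCAPED_DOT = re.compile(r"hostname0\..* %%01SHELL")


def matching_lines(pattern, lines):
    """Return the lines in which the pattern is found anywhere."""
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    print("wildcard dot:", len(matching_lines(WILDCARD_DOT, SAMPLE_LINES)))
    print("escaped dot: ", len(matching_lines(ESCAPED_DOT, SAMPLE_LINES)))
```

The wildcard version finds both sample lines, while the escaped version only matches hosts whose name really contains "hostname0." with a dot.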

How to make apache treat query string as file name?

I mirrored a site to a local server with wget, and the file names locally look like this:
comments
comments?id=123
Locally these are static files that show unique content.
But when I access the second file in a browser, it keeps showing the content of the file comments and treats ?id=123 as a query string, so the content of the file comments?id=123 is never shown.
It loads the correct file if I manually encode the ? as %3F in the browser's address bar and type:
comments%3Fid=123
Is there a way to fix this? Maybe make Apache stop treating ? as the query separator and treat it as a file name character? Or use a URL rewrite to change ? into %3F?
Edit: Indeed, ? in file names and requests causes too many problems. I ended up using the wget option --restrict-file-names=windows, which converts ? into @ when saving the file name.
The short answer is "don't do that."
The longer answer is that ? is a reserved character in URLs, using it as part of a filename is going to cause problems forever, and the recommended solution is to pick a different character to use in those filenames. There are many to choose from: just avoid ?, & and # and you'll probably be fine.
If you insist on keeping the file name (or if you don't have an option) try:
RewriteCond %{QUERY_STRING} (.*)
RewriteRule (.*) $1%%3F%1 [NE]
However, this is going to fire any time you have a query string, which is likely not what you want.
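If the links to the mirrored files are generated by a script, the ? can also be percent-encoded before the request ever reaches Apache. The sketch below uses Python's urllib.parse to show why the two URLs are treated differently and how to build the encoded form; the host name localhost and the helper name link_for_mirrored_file are assumptions for illustration.

```python
from urllib.parse import quote, unquote, urlsplit


def link_for_mirrored_file(filename: str) -> str:
    """Percent-encode a local file name so '?' is not read as a query separator.

    '/' and '=' are left alone; '?' becomes %3F.
    """
    return quote(filename, safe="/=")


if __name__ == "__main__":
    # Unencoded: everything after '?' is parsed as the query string.
    plain = urlsplit("http://localhost/comments?id=123")
    print(plain.path, plain.query)

    # Encoded: the whole name stays in the path and decodes to the file name.
    encoded = urlsplit("http://localhost/" + link_for_mirrored_file("comments?id=123"))
    print(encoded.path, unquote(encoded.path))
```

This only helps for links you control; requests typed or followed with a raw ? still need the server-side rewrite or renamed files.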

file transfer Extra attachmate appends username to host file name

Hi, when I try to download a file from the mainframe using Attachmate Extra, it adds the username to the front of the file name. I don't know where to turn this off.
For example, if the file name is yyyy.file.name, the transfer uses username.yyyy.file.name.
In 3.4 the option to append the user name is turned off. Still, it's happening.
Enclose the entire dataset name (including the high-level qualifier) in single quotes. This is a TSO (not JCL) convention - if you refer to a dataset without single quotes, it pre-pends your user ID as the high-level qualifier; however if you place single quotes around the dataset name it will take it 'as is' (well, it will uppercase it, since all z/OS dataset names are uppercase, but otherwise it will be 'as is').
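The naming rule described above can be modelled in a few lines. This is only an illustration of the convention as stated in this answer, written in Python; it is not an Attachmate or TSO API, and the function name resolve_dataset_name is made up.

```python
def resolve_dataset_name(name: str, user_id: str) -> str:
    """Model of the quoting rule described above (illustrative only).

    A name in single quotes is taken as-is (uppercased); an unquoted
    name gets the user ID prepended as the high-level qualifier.
    """
    name = name.strip()
    if len(name) >= 2 and name.startswith("'") and name.endswith("'"):
        return name[1:-1].upper()
    return f"{user_id}.{name}".upper()


if __name__ == "__main__":
    print(resolve_dataset_name("yyyy.file.name", "username"))
    print(resolve_dataset_name("'yyyy.file.name'", "username"))
```

The first call reproduces the unwanted username.yyyy.file.name form from the question (in uppercase); the quoted call yields the dataset name without the prefix.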

How to replace every occurrences except the first one on every line - VIM

I have data that looks like this:
母音,vowel
備考,note, remarks, NB
基本形,,fundamental form, basic form, basic pattern, basic model, basic type, prototype
受身,,the defensive, passive attitude, passivity, passiveness, the passive, passive voice, ukemi (the art of falling safely)
受身形,passive voice, passive form
否定,negation, denial, repudiation, NOT operation
不規則,irregularity, unsteadiness, disorderly
How can I replace every occurrence except the first one on each line?
I want to replace every , on each line except the first , on that line.
Result:
母音,vowel
備考,noteREPLACED remarksREPLACED NB
基本形,REPLACEDfundamental formREPLACED basic formREPLACED basic patternREPLACED basic modelREPLACED basic typeREPLACED prototype
受身,REPLACEDthe defensiveREPLACED passive attitudeREPLACED passivityREPLACED passivenessREPLACED the passiveREPLACED passive voiceREPLACED ukemi (the art of falling safely)
受身形,passive voiceREPLACED passive form
否定,negationREPLACED denialREPLACED repudiationREPLACED NOT operation
不規則,irregularityREPLACED unsteadinessREPLACED disorderly
You can use positive lookbehind for this:
:%s/\m\%(,.*\)\@<=,/REPLACED/g
Since there is a space character after each comma except the first on each line you could do the following.
%s/, /REPLACED /g
%s/,\+/,/g
The second command takes care of multiple consecutive commas, like on the third line.
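Outside Vim, the same "keep the first, replace the rest" rule can be expressed by splitting each line at its first comma. The following is a minimal Python sketch, not a translation of either Vim command; the function name replace_after_first is made up.

```python
def replace_after_first(line: str, old: str = ",", new: str = "REPLACED") -> str:
    """Replace every occurrence of `old` except the first one on the line."""
    # partition() splits at the first occurrence only; if `old` is absent,
    # sep and tail are empty and the line is returned unchanged.
    head, sep, tail = line.partition(old)
    return head + sep + tail.replace(old, new)


if __name__ == "__main__":
    for line in [
        "母音,vowel",
        "備考,note, remarks, NB",
        "基本形,,fundamental form, basic form",
    ]:
        print(replace_after_first(line))
```

Because only the text after the first comma is touched, a doubled comma such as the one in the third line keeps its first comma and has the second one replaced.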

Issue with filepath name, possible corrupt characters

Perl and html, CGI on Linux.
Issue with file path name, being passed in a form field, to a CGI on server.
The issue is with the Linux file path, not the PC side.
I am using 2 programs,
1) A program written years ago: dynamic HTML generated in a Perl program and presented to the user as a form. I modified it by inserting the code needed to let the user select a file from their PC, to be placed on the Linux machine.
Because this program already knew the filepath, needed on the linux side, I pass this filepath in a hidden form field, to program 2.
2) CGI program on Linux side, to run when form on (1) is posted.
Strange issue.
The filepath that I pass, has a very strange issue.
I can extract it using
my $filepath = $query->param("serverfpath");
The above does populate $filepath with what looks like exactly the correct path.
But it fails, and not in a way that takes me to the file open error block, but such that the call to the CGI script gives an error.
However, if I populate $filepath with EXACTLY the same string, via hard coding it, it works, and my file successfully uploads.
For example:
$fpath1 = $query->param("serverfpath");
$fpath2 = "/opt/webhost/ims/DOCURVC/data";
A comparison of $fpath1 and $fpath2 reveals that they are exactly equal.
A length check of $fpath1 and $fpath2 reveals that they are exactly the same length.
I have tried many methods of cleaning the data in $fpath1.
I chomp it.
I remove any non standard characters.
$fpath1 =~ s/[^A-Za-z0-9\-\.\/]//g;
and this:
my $safe_filepath_characters = "a-zA-Z0-9_.-/";
$fpath1 =~ s/[^$safe_filepath_characters]//g;
But no matter what I do, using $fpath1 causes an error, using $fpath2 works.
What could be wrong with the data in the $fpath1, that would cause it to successfully compare to $fpath2, yet not be equal, visually look exactly equal, show as having the exact same length, but not work the same?
For the below file open block.
$upload_dir = $fpath1
causes complete failure of the CGI to load, as if it cannot find the CGI (which I know is sometimes caused by a syntax error in the CGI script).
$upload_dir = $fpath2
I get a successful file upload.
$upload_dir = ""
The call to the CGI does not fail; it executes the else block of the if below, as expected.
Here is the file open block:
if (open ( UPLOADFILE, ">$upload_dir/$filename" ))
{
binmode UPLOADFILE;
while ( <$upload_filehandle> )
{
print UPLOADFILE;
}
close UPLOADFILE;
$msgstr="Done with Upload: upload_dir=$upload_dir filename=$filename";
}
else
{
$msgstr="ERROR opening for upload: upload_dir=$upload_dir filename=$filename";
}
What other tests should I be performing on $fpath1 to find out why it does not work the same as its hard-coded equivalent $fpath2?
I did try character replacement, a single character at a time, from $fpath2 to $fpath1.
Even doing this with a single character caused $fpath2 to have the same error as $fpath1, although the character looked exactly the same.
Is your CGI perhaps running perl with the -T (taint mode) switch (e.g., #!/usr/bin/perl -T)? If so, any value coming from untrusted sources (such as user input, URIs, and form fields) is not allowed to be used in system operations, such as open, until it has been untainted by using a regex capture. Note that using s/// to modify it in-place will not untaint the value.
if ($fpath1 =~ /^([A-Za-z0-9\-\.\/]*)$/) {
    $fpath1 = $1;    # only the captured value is untainted
} else {
    die "Illegal character in fpath1";    # $1 is not reset when a match fails
}
should work if taint mode is your issue.
But it fails, and not in a way that takes me to the file open error block, but such that the call to the CGI script gives an error.
Premature end of script headers? Try running the CGI from the command line:
perl your_upload_script.cgi serverfpath=/opt/webhost/ims/DOCURVC/data
