Having command line arguments to flex as search strings - linux

I use flex (the Linux/Unix scanner generator, not the Adobe product) to generate small scanners. Until now I have always used static search strings. I now want to take the search string from the command line, reading it via getopt and then using it in the scanner's rules.
The old way of searching was:
.*"_"\n ECHO;
To find lines that ended with an underscore.
Now I want to search this way:
.*<arbitrary string>.*\n ECHO;
I don't know how to get flex to accept the <arbitrary string>. I can get it via getopt, but I haven't been able to get flex to accept my syntax.
What I am doing is a special purpose very limited grep for a special problem I am having.
Any help would be appreciated.

.*\n { if(strstr(yytext, "arbitrary string")) ECHO; else REJECT; }
The REJECT statement skips to the next rule if yytext doesn't contain "arbitrary string". To search for a string given on the command line, replace the literal with a char * variable that you set from optarg before calling yylex(). This will of course not perform as well as a search string known at compile time. regcomp()/regexec() in glibc might be faster than flex if you are implementing your own grep program.

Related

Can ArchUnit check for certain string patterns in method calls?

In our code we again and again have the issue that somebody forgot to adapt the usage of placeholders when switching between the use of the logger and String.format(...) methods.
For log statements one has to use '{}' as placeholders, like so:
logger.info("File {} successfully opened: {} bytes read, {} objects created", file, nrBytes, nrObjects);
But when using String.format(...) to compose a message one has to use '%s' as placeholders for strings and the statement has to read:
logger.info(String.format("File %s successfully opened: %s bytes read, %s objects created", file, nrBytes, nrObjects));
The second form is often used, when logging an error where the second argument is the Throwable that one wants to log.
Too often people forget about these details and then we end up with wrong log statements that output nothing reasonable.
I know and agree that this is absolutely not an architecture issue but rather a simple programming error, but it would be great if one could (ab-)use ArchUnit to check for the use of '%s' (or the absence of '{}') in the first String argument of the String.format()-method. Is something like that possible?
ArchUnit, currently at version 0.16.0, does not analyze parameter values for method calls.
The sonar rule "Printf-style format strings should be used correctly" might however catch these bugs.
As already noted, ArchUnit can't do this. PMD's InvalidLogMessageFormat rule is useful though (and I find PMD easier to deal with than Sonar).

What is the Lua "replacement" for the pre_exec command in Conky files?

I'm not great at programming, but I was trying to fiddle around with a conky_rc file I liked that I found that seemed pretty straight-forward.
As the title states, I have now learned that the previous command of pre_exec has been long removed and superseded by Lua.
Unfortunately, I cannot seem to find anything directly related to this other than https://github.com/brndnmtthws/conky/issues/62. The thread https://github.com/brndnmtthws/conky/issues/146 references it, and its "solution" states: Basically, there is no replacement and you should use Lua or use a very large interval and execi.
I have found a few more threads that all include the question as to why this function was discontinued, but with no actual answers. So, to reiterate mine, I have absolutely no knowledge of Lua (I've heard of it before, and I've now added a few websites to look at tomorrow since I have spent most of the evening trying to figure out this Conky thing), and I'll probably just give up and do the execi option (my computer can handle it but, I just think it's so horribly inefficient).
Is there an appropriate Lua option? If so, would someone please direct me to either the manual or wiki for it, or explain it? Or is the "proper" Lua solution this?
@Vincent-C It's not working for your script is because the function ain't getting call. from the quick few tests I did, it seem lua_startup_hook need the function to be in another file that is loaded using lua_load, not really sure how the hook function thingy all works cause I rather just directly use the config as lua since it is lua.
Basically just call the io.popen stuff and concat it into conky.text
conky.text = [[ a lot of stuff... ${color green} ]];
o = io.popen('fortune -s | cowsay', 'r')
conky.text = conky.text .. o:read('*a')
The comment by asl97 on the first page you cited appears to provide an answer, but a bit of explanation would probably help.
asl97 provides the following general purpose Lua function to use as a substitute for $pre_exec, preceded by a require statement to make io available for use by the function:
require 'io'
function pre_exec(cmd)
    local handle = io.popen(cmd)
    local output = handle:read("*a")
    handle:close()
    return output
end
Adding this block of code to your conky configuration file will make the function available for use therein. For testing, I added it above the conky.config = { ... } section.
Calling the Lua pre_exec function will return a string containing the output of the command passed to it. The conky.text section from [[ to ]] is also a string, so it can then be concatenated to the string returned by pre_exec using the .. operator, as shown in the usage section provided by asl97.
In my test, I did the following silly bit, which worked as expected, to display "Hello World!" and the output of the date function with spacing above and below each at the top of my conky display:
conky.text = pre_exec("echo; echo Hello World!; echo; date; echo")..[[
-- lots of boring conky stuff --
]]
More serious commands can, of course, be used with pre_exec, as shown by asl97.
One thing that asl97 didn't explain was how to concatenate so that the pre_exec output is in the middle of the conky display rather than just at the beginning. I tested and found that you can do it like the following:
conky.text = [[
-- some conky stuff --
]]..pre_exec("your_important_command")..[[
-- more conky stuff --
]]

Perl critic policy violation in checking index of substring in a string

for my $item (@array) {
    if (index($item, '$n') != -1) {
        print "HELLO\n";
    }
}
Problem is: Perl critic gives below policy violation.
String may require interpolation at line 168, near '$item, '$n''. (Severity: 1)
Please advise how do I fix this?
In this case the analyzer either found a bug or is plain wrong in flagging your code.
Are you looking for a literal "$n" in $item, or for what $n variable evaluates to?
If you want to find the literal $n characters then there is nothing wrong with your code.
If you expect $item to contain the value stored in $n variable then allow it to be evaluated,
if (index($item, $n) != -1)
If this is indeed the case but $n may also contain other escape sequences or encodings which you need as literal characters (so as to suppress their evaluation) then you may need to do a bit more, depending on what exactly may be in that variable.
In case you do need to find the character $ followed by n (which would explain a deliberate act of putting single quotes around a variable) you need to handle the warning.
For the particular policy that is violated see Perl::Critic::Policy::ValuesAndExpressions::RequireInterpolationOfMetachars
This policy warns you if you use single-quotes or q// with a string that has unescaped metacharacters that may need interpolation.
To satisfy the policy you'd need to use double quotes and escape the $, for example qq(\$n). In my opinion this would change the fine original code segment into something strange to look at.
If you end up wanting to simply silence the warning, see the Perl::Critic documentation, under Bending The Rules.
A comment. The tool perlcritic is useful but you have to use it right. It's a static code analyzer and it doesn't know what your program is doing, so to say; it can catch bad practices but can't tell you how to write programs. Many of its "policies" are unsuitable for particular code.
The book that it is based on says all this very nicely in its introduction. Use sensibly.
When I look at the question where this comes from it appears that you are looking for index at which substrings were matched, so you need the content of $n variable, not literal "$n". Then perlcritic identified a bug in the code, good return for using it!

Looking for the best way in bash shell to extract a string

I have the following string being exported from a program that is analyzing the certificate on a website which will be part of a bugfix analysis
CERT_SUMMARY:127.0.0.1:127.0.0.1:631:sha256WithRSAEncryption:
/O=bfcentos7-test/CN=bfcentos7-test/emailAddress=root$bfcentos7-
test:/O=bfcentos7-test/CN=bfcentos7-test/emailAddress=root$bfcentos7-
test:170902005715Z:270831005715Z:self signed certificate
(consider output above to be a single line)
What I need is the best way in a bash shell to extract the sha256WithRSAEncryption. This could be anything like sha384withRSAEncryption or something else.
After CERT_SUMMARY it will always be 127.0.0.1:127.0.0.1:portnum. In the example above the port is 631, but it could be anything.
This runs internally on a system and returns this string along with SSL or TLS (not pictured)
Here is another example of a return
CERT_SUMMARY:127.0.0.1:127.0.0.1:52311:sha256WithRSAEncryption:
/CN=ServerSigningCertificate_0/name=Type`Administrator
/name=DBName`ServerSigningCertificate_0:/C=US/CN=BLAHBLAH/
ST=California/L=Address, Emeryville CA 94608/O=IBM BigFix Evaluation
License/OU=Customer/emailAddress=blahblay@gmail.com/name=
Hash`sha1/name=Server`bigfix01/name=CustomActions`Enable
/name=LicenseAllocation`999999/name=CustomRetrievedProperties`Enable:
170702212459Z:270630212459Z:unable to get local issuer certificate
Thanks in advance.
Novice at shell programming, but learning!!
You ask for the best way and yet do not seem to provide the best description: "This could be anything like sha384withRSAEncryption or something else."
Given the examples, the string you are looking for is the 5th field when : is the separator (the port number is the 4th), so this command should be OK:
cut -f5 -d":"
If the output string has a strict length format, one easy option is the 'cut' command with -c. This is not the case though since there is a port number.
CERT_SUMMARY:127.0.0.1:127.0.0.1:631:sha256WithRSAEncryption:
As @cyrus pointed out, this was as simple as picking the right column with awk... I am learning.
This worked
awk -F ":" '/CERT_SUMMARY/ {print $5}'
Thanks for the help!!
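For anyone who wants to try this without the real program, here is a minimal sketch. It assumes a shortened, made-up CERT_SUMMARY line (the certificate fields are trimmed, and the variable names line and sig_alg are my own). It first numbers the leading fields to show why $5 is the right column, then runs the same awk command:

```shell
# Shortened, hypothetical sample line; the real output is longer.
line='CERT_SUMMARY:127.0.0.1:127.0.0.1:631:sha256WithRSAEncryption:/O=example/CN=example:170902005715Z:270831005715Z:self signed certificate'

# Number the first five colon-separated fields to see which one holds the algorithm.
printf '%s\n' "$line" | awk -F ':' '{ for (i = 1; i <= 5; i++) print i ": " $i }'

# Field 5 is the signature algorithm, whatever the port in field 4 happens to be.
sig_alg=$(printf '%s\n' "$line" | awk -F ':' '/CERT_SUMMARY/ {print $5}')
echo "$sig_alg"
```

Counting by field rather than by character position is what keeps this working when the port number changes length.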
| sed -E 's/^([^:]*:){4}([^:]*):.*/\2/'
Regular expressions are your friend. If there is one thing one really should be familiar with if one needs to do a lot of string parsing or string processing, it's definitely regular expressions.
echo 'CERT_SUMMARY:127.0.0.1:127.0.0.1:52311:sha256WithRSAEncryption:
/CN=ServerSigningCertificate_0/name=Type`Administrator
/name=DBName`ServerSigningCertificate_0:/C=US/CN=BLAHBLAH/ST=California
/L=Address, Emeryville CA 94608/O=IBM BigFix Evaluation
License/OU=Customer/emailAddress=blahblay@gmail.com/name=Hash`sha1
/name=Server`bigfix01/name=CustomActions`Enable
/name=LicenseAllocation`999999
/name=CustomRetrievedProperties
`Enable:170702212459Z:270630212459Z:unable to get local issuer
certificate' | sed -E 's/^([^:]*:){4}([^:]*):.*/\2/'
prints (when the quoted input is a single line, as in the real output)
sha256WithRSAEncryption
It's probably a bit overkill here, but there is almost nothing that cannot be done with regular expressions and as you have also built-in regex support in many languages today, knowing regex is never going to be a waste of time.
See also here to get a nice explanation of what each regex expression actually means, including an interactive editing view. Basically I'm telling the regex parser to skip the first 4 groups, each consisting of any number of characters that are not : followed by a single :, then capture the 5th group, which consists of any number of characters that are not :, and finally match anything else (no matter what) to the end of the string. The whole regex is part of a sed "replace" operation, where I replace the whole string by just the content that has been captured by the second capture group (everything in round parentheses is a capture group).
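To see the pieces of that expression at work, here is a small sketch run against a shortened, made-up sample line (the variable names sample and result are my own, and the certificate fields are trimmed):

```shell
# Shortened, hypothetical sample; the real line carries many more fields.
sample='CERT_SUMMARY:127.0.0.1:127.0.0.1:52311:sha256WithRSAEncryption:/CN=Example:170702212459Z:270630212459Z:unable to get local issuer certificate'

# ^([^:]*:){4}  skips four "field plus colon" groups (tag, two addresses, port)
# ([^:]*)       captures the fifth field as group 2
# :.*           consumes the rest of the line
# \2            replaces the whole line with the captured fifth field
result=$(printf '%s\n' "$sample" | sed -E 's/^([^:]*:){4}([^:]*):.*/\2/')
echo "$result"
```

Because [^:]* can never cross a colon, each repetition of the first group consumes exactly one field, so the count of 4 is what selects the column.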
You could also use the following, which does not print by field number, so it still works if the position of the sha256 field in your Input_file differs from the one shown.
awk '{match($0,/sha.*Encryption:/);if(substr($0,RSTART,RLENGTH)){print substr($0,RSTART,RLENGTH-1)}}' Input_file
Pipe the output to:
awk 'BEGIN{FS=":"} {print $5}'
You could also take a step back to the openssl x509 command 'name options'. Using sep_comma_plus avoids the slashes in the output and therefore your regex will be simpler.

How to get (translatable) strings from specific domain with POEdit

I have been trying for hours finding a way to setup POEdit so that it can grab the text from specific domain only
My gettext function looks like this:
function ri($id, $parameters = array(), $domain = 'default', $locale = null)
A sample call:
echo ri('Text %xyz%', array('%xyz%'=>100), 'myDomain');
I will need to grab only the text with the domain myDomain to translate, or at least I want POEdit to put these texts into domain specific files. Is there a way to do it?
I found several questions that are similar but the answers don't really tell me what to do (I think I'm such a noob it must be explained in plain English for me to understand):
How to set gettext text domain in Poedit?
How to get list of translatable messages
So I finally figured it out after days of searching. I found the answer here:
http://sourceforge.net/mailarchive/message.php?msg_id=27691818
xgettext recognizes context in strings, and gives a msgctxt field in the *.pot file, which is recognized by translation software as a context and is shown as such (check image of Pootle showing context below)
This can be done in 3 ways:
String in code should be in the format _t('context','string'); and xgettext invocation should be in the form --keyword=_t:1c,2 (this basically explains to xgettext that there are 2 arguments in the keyword function, 1st one is context, 2nd one is string)
String in code in the format _t('string','context'); and xgettext invocation should be in the form --keyword=_t:1,2c
String in the code should be as _t('context|string') and xgettext invocation should be in the form --keyword=_t:1g
So to answer my own question, I added this to the "sources keywords" tab of Poedit:
ri:1,3c
ri is the function name, 1 is the location of the stringid, 3 is the location of the context/domain
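Since Poedit hands its keyword list to xgettext, the same setting can be tried from a shell. This is only a sketch under assumptions: the temporary directory, sample.php and messages.pot are made-up names, and xgettext from GNU gettext needs to be installed.

```shell
# Hypothetical demo: write a one-line PHP file containing the sample call.
workdir=$(mktemp -d)
cat > "$workdir/sample.php" <<'EOF'
<?php
echo ri('Text %xyz%', array('%xyz%'=>100), 'myDomain');
EOF

# ri:1,3c -> argument 1 is the msgid, argument 3 is the context (msgctxt).
xgettext --language=PHP --keyword=ri:1,3c \
    --output="$workdir/messages.pot" "$workdir/sample.php"

# Show the extracted entry; the msgctxt line should carry the domain name.
grep -A1 '^msgctxt' "$workdir/messages.pot"
```

Note that this records the domain as a context inside one .pot file; it does not split the strings into per-domain files.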
Hope this helps someone else, I hate all these cryptic documents
(This is a repost of my answer to the same thing here.)
Neither GNU gettext tools nor Poedit (which uses them) support this particular misuse of gettext.
In gettext, domain is roughly “a piece of software” — a program, a library, a plugin, a theme. As such, it typically resides in a single directory tree and is alone there — or at the very least, if you have multiple pieces=domains, you have them organized sanely into some subdirectories that you can limit the extraction to.
Mixing and matching domains within a single file as you do is not how gettext was intended to be used, and there’s no reasonable solution to handle it other than using your own helper function, e.g. by wrapping all myDomain texts into __mydomain (which you must define, obviously) and adding that to the list of keywords in Poedit when extracting for myDomain and not adding that to the list of keywords for other domains' files.
