LL_EXACT_AMBIG_DETECTION - Interpretation - antlr4

When using PredictionMode::LL_EXACT_AMBIG_DETECTION I get the following error messages:
line 186:7 reportAttemptingFullContext d=30, input='ON REPORT HEAD
How am I to interpret the d attribute? Does it reference a rule in my grammar, and how can I find out which one?
According to the code:
@Override
public void reportAttemptingFullContext(@NotNull Parser recognizer,
                                        @NotNull DFA dfa,
                                        int startIndex, int stopIndex,
                                        @NotNull ATNConfigSet configs)
{
    recognizer.notifyErrorListeners("reportAttemptingFullContext d=" +
        dfa.decision + ", input='" +
        recognizer.getTokenStream().getText(Interval.of(startIndex, stopIndex)) + "'");
}
the attribute d is the decision number of a DFA. But I have not found out how to map that information back to a rule in the grammar.
Thanks for your help.
Kind regards,
Wolfgang Hämmer

The following helper methods can convert a decision number to a rule name. You can create your own error listener implementation based on DiagnosticErrorListener and use these methods to include the name of the rule in each message.
If a rule has more than one decision, you can pass the -atn flag to ANTLR when you generate code for your grammar. Once you have the name of the rule, look at the graph produced in ruleName.dot (where ruleName is the rule's name), and you'll see a node in the graph labeled d=decisionNumber (where decisionNumber is the number you're currently seeing). That will point you to the exact location where the problem is occurring.
Keep in mind that rule and decision numbers change when you change your grammar, so when you open ruleName.dot you'll want to verify the actual decision number each time you regenerate the code for your grammar.
public static int getDecisionRule(Recognizer<?, ?> recognizer, int decision) {
    if (recognizer == null || decision < 0) {
        return -1;
    }
    if (decision >= recognizer.getATN().decisionToState.size()) {
        return -1;
    }
    // each decision maps to an ATN state, which knows its enclosing rule
    return recognizer.getATN().decisionToState.get(decision).ruleIndex;
}

public static String getRuleDisplayName(Recognizer<?, ?> recognizer, int ruleIndex) {
    if (recognizer == null || ruleIndex < 0) {
        return Integer.toString(ruleIndex);
    }
    String[] ruleNames = recognizer.getRuleNames();
    if (ruleIndex >= ruleNames.length) {
        return Integer.toString(ruleIndex);
    }
    return ruleNames[ruleIndex];
}
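For illustration, here is a minimal sketch of such a listener. This is a sketch, not part of the original answer: the class name and message format are made up, it assumes the two helpers above are in scope (e.g. via static import), and note that newer ANTLR 4 versions add a BitSet conflictingAlts parameter to this callback, as used here.
import java.util.BitSet;
import org.antlr.v4.runtime.DiagnosticErrorListener;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;
import org.antlr.v4.runtime.misc.Interval;

// Sketch: a diagnostic listener that adds the rule name to each message
// by translating the DFA decision number via the helpers above.
public class RuleNameDiagnosticListener extends DiagnosticErrorListener {
    @Override
    public void reportAttemptingFullContext(Parser recognizer, DFA dfa,
                                            int startIndex, int stopIndex,
                                            BitSet conflictingAlts,
                                            ATNConfigSet configs) {
        int ruleIndex = getDecisionRule(recognizer, dfa.decision);
        String ruleName = getRuleDisplayName(recognizer, ruleIndex);
        recognizer.notifyErrorListeners("reportAttemptingFullContext d=" +
            dfa.decision + " (" + ruleName + "), input='" +
            recognizer.getTokenStream().getText(Interval.of(startIndex, stopIndex)) + "'");
    }
}
Register it with parser.addErrorListener(new RuleNameDiagnosticListener()) before parsing.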

Related

how to check which constraints would be violated by a presumed solution?

In some cases the solver fails to find a solution for my model, even though I believe one exists.
So I would like to populate a solution and then check which constraints are violated.
How to do that with choco-solver?
Using choco-solver 4.10.6.
Forcing a solution
I ended up adding constraints to force variables to values of my presumed solution:
e.g.
// constraints to force given solution
vehicle2FirstStop[0].eq(model.intVar(4)).post();
vehicle2FirstStop[1].eq(model.intVar(3)).post();
nextStop[1].eq(model.intVar(0)).post();
nextStop[2].eq(model.intVar(1)).post();
...
and then
model.getSolver().showContradiction();
if (model.getSolver().solve()) { ....
This shows the first contradiction of the presumed solution, e.g.
/!\ CONTRADICTION (PropXplusYeqZ(sum_exp_49, mul_exp_51, ...
So the next step is to find out where terms such as sum_exp_49 come from.
Matching the contradiction terms with the code
Here is a simple fix which will hopefully provide enough information. We can override the post() and associates() methods of Model so that they dump the Java source file name and line number whenever a constraint is posted or a variable is created.
Model model = new Model("Vrp1RpV") {
    /**
     * Retrieve the file name and line number of the first caller outside
     * of choco-solver from the stack trace.
     */
    String getSource() {
        String source = null;
        StackTraceElement[] stackTraceElements = Thread.currentThread().getStackTrace();
        // start from 3: thread.getStackTrace() + this.getSource() + caller (post() or associates())
        for (int i = 3; i < stackTraceElements.length; i++) {
            // keep rewinding until we get out of choco-solver packages
            if (!stackTraceElements[i].getClassName().startsWith("org.chocosolver")) {
                source = stackTraceElements[i].getFileName() + ":" + stackTraceElements[i].getLineNumber();
                break;
            }
        }
        return source;
    }

    @Override
    public void post(Constraint... cs) throws SolverException {
        String source = getSource();
        // dump each constraint along with its source location
        for (Constraint c : cs) {
            System.err.println(source + " post: " + c);
        }
        super.post(cs);
    }

    @Override
    public void associates(Variable variable) {
        System.err.println(getSource() + " associates: " + variable.getName());
        super.associates(variable);
    }
};
This will dump things like:
Vrp1RpV2.java:182 post: ARITHM ([prop(EQ_exp_47.EQ.mul_exp_48)])
Vrp1RpV2.java:182 associates: sum_exp_49
Vrp1RpV2.java:182 post: ARITHM ([prop(mul_exp_48.EQ.sum_exp_49)])
Vrp1RpV2.java:182 associates: EQ_exp_50
Vrp1RpV2.java:182 post: BASIC_REIF ([(stop2vehicle[2] = 1) <=> EQ_exp_50])
...
From there it is possible to see where sum_exp_49 comes from.
EDIT: added associates() thanks to @cprudhom's suggestion on https://gitter.im/chocoteam/choco-solver

Get the most possible token types according to line and column number in ANTLR4

I would like to get a list of the most probable token types for a given location in the text (line and column number), to determine what should be offered for auto code completion. Can this be easily achieved using the ANTLR 4 API?
I want the possible list of tokens for a given location because the user might be writing/editing somewhere in the middle of the text, and the completion should still offer the tokens that are valid there.
Please give me some guidelines because I was unable to find an online resource on this topic.
One way to get tokens by line number is to create a ParseTreeListener for your grammar, use it to walk a given ParseTree, and index TerminalNodes by line number. I don't know C#, but here is how I've done it in Java; the logic should be similar.
public class MyLineIndexer extends MyGrammarParserBaseListener {

    // org.antlr.v4.runtime.misc.MultiMap from the ANTLR runtime;
    // initialized here so visitTerminal() doesn't hit a null map
    protected MultiMap<Integer, TerminalNode> filelineTokenIndex = new MultiMap<>();

    @Override
    public void visitTerminal(@NotNull TerminalNode node) {
        // map every token to its file line for searching later...
        if (node.getSymbol() != null) {
            List<TerminalNode> tokens;
            Integer line = node.getSymbol().getLine();
            if (!filelineTokenIndex.containsKey(line)) {
                tokens = new ArrayList<>();
                filelineTokenIndex.put(line, tokens);
            } else {
                tokens = filelineTokenIndex.get(line);
            }
            tokens.add(node);
        }
        super.visitTerminal(node);
    }
}
then walk the parse tree the usual way...
ParseTree parseTree = ... ; // parse it how you want to
MyLineIndexer indexer = new MyLineIndexer();
ParseTreeWalker walker = new ParseTreeWalker();
walker.walk(indexer, parseTree);
Getting the token at a given line and character position is now reasonably straightforward and efficient, assuming you have a reasonable number of tokens per line. For example, you can add another method to the listener like this:
public TerminalNode findTerminalNodeAtCaret(int caretPos, int caretLine) {
    if (caretPos <= 0) return null;
    if (this.filelineTokenIndex.containsKey(caretLine)) {
        List<TerminalNode> nodes = filelineTokenIndex.get(caretLine);
        if (nodes.size() == 0) return null;
        int tokenEndPos, tokenStartPos;
        for (TerminalNode n : nodes) {
            if (n.getSymbol() != null) {
                tokenEndPos = n.getSymbol().getCharPositionInLine() + n.getText().length();
                tokenStartPos = n.getSymbol().getCharPositionInLine();
                // if the caret is within this token, return this token
                if (caretPos >= tokenStartPos && caretPos <= tokenEndPos) {
                    return n;
                }
            }
        }
    }
    return null;
}
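As a usage sketch (the generated parser class name MyGrammarParser and the caret variables are assumptions, not from the original answer):
// Hypothetical usage: find the token under the caret and look up its
// display name in the generated parser's vocabulary, then feed that to
// the completion logic.
TerminalNode node = indexer.findTerminalNodeAtCaret(caretColumn, caretLine);
if (node != null) {
    int tokenType = node.getSymbol().getType();
    String typeName = MyGrammarParser.VOCABULARY.getDisplayName(tokenType);
    // hand tokenType / typeName to the auto-completion engine
}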
You will also need to ensure your parser allows for 'loose' parsing. While a language construct is being typed, it is likely not to be valid. Your Parser rules should allow for this.
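For example, you can rely on ANTLR's default error recovery so an incomplete input still yields a walkable tree. A minimal sketch, assuming ANTLR 4.7+ (for CharStreams) and placeholder names MyGrammarLexer, MyGrammarParser and startRule for your generated classes and entry rule:
// 'Loose parsing' sketch: recover from syntax errors instead of bailing,
// so a partially typed input still produces a parse tree with error nodes.
ParseTree parseLeniently(String editorText) {
    MyGrammarLexer lexer = new MyGrammarLexer(CharStreams.fromString(editorText));
    MyGrammarParser parser = new MyGrammarParser(new CommonTokenStream(lexer));
    parser.removeErrorListeners();                      // keep the console quiet while typing
    parser.setErrorHandler(new DefaultErrorStrategy()); // the default: recover and continue
    return parser.startRule();                          // error nodes mark incomplete spots
}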

Sentiment Analysis (SentiWordNet) - Judging the context of a sentence

I am trying to find whether a sentence is Positive or Negative in the following steps:
1.) Retrieving the parts of speech (verbs, nouns, adjectives, etc.) from the sentence using the Stanford NLP parser.
2.) Using the SentiWordNet to find the Positive and Negative values related to each Part of Speech.
3.) Summing up the Positive and Negative values obtained to calculate a Net Positive and Net Negative value related to a sentence.
But the problem is that SentiWordNet returns a list of positive/negative values based on different senses/contexts. Is it possible to pass a particular sentence along with the part of speech to the SentiWordNet parser, so that it can judge the sense/context automatically and return only one pair of positive and negative values?
Or is there any other alternate solution to this problem?
Thanks.
SentiWordNet Demo Code
This may help you.
// Copyright 2013 Petter Törnberg
//
// This demo code has been kindly provided by Petter Törnberg <pettert@chalmers.se>
// for the SentiWordNet website.
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class SentiWordNetDemoCode {

    private Map<String, Double> dictionary;

    public SentiWordNetDemoCode(String pathToSWN) throws IOException {
        // This is our main dictionary representation
        dictionary = new HashMap<String, Double>();

        // From String to list of doubles.
        HashMap<String, HashMap<Integer, Double>> tempDictionary = new HashMap<String, HashMap<Integer, Double>>();

        BufferedReader csv = null;
        try {
            csv = new BufferedReader(new FileReader(pathToSWN));
            int lineNumber = 0;
            String line;
            while ((line = csv.readLine()) != null) {
                lineNumber++;
                // If it's a comment, skip this line.
                if (!line.trim().startsWith("#")) {
                    // We use tab separation
                    String[] data = line.split("\t");
                    String wordTypeMarker = data[0];

                    // Example line:
                    // POS ID PosS NegS SynsetTerm#sensenumber Desc
                    // a 00009618 0.5 0.25 spartan#4 austere#3 ascetical#2
                    // ascetic#2 practicing great self-denial;...etc

                    // Is it a valid line? Otherwise, throw an exception.
                    if (data.length != 6) {
                        throw new IllegalArgumentException(
                                "Incorrect tabulation format in file, line: "
                                        + lineNumber);
                    }

                    // Calculate synset score as score = PosS - NegS
                    Double synsetScore = Double.parseDouble(data[2])
                            - Double.parseDouble(data[3]);

                    // Get all synset terms
                    String[] synTermsSplit = data[4].split(" ");

                    // Go through all terms of the current synset.
                    for (String synTermSplit : synTermsSplit) {
                        // Get synterm and synterm rank
                        String[] synTermAndRank = synTermSplit.split("#");
                        String synTerm = synTermAndRank[0] + "#"
                                + wordTypeMarker;
                        int synTermRank = Integer.parseInt(synTermAndRank[1]);

                        // What we get here is a map of the type:
                        // term -> {score of synset#1, score of synset#2...}

                        // Add map to term if it doesn't have one
                        if (!tempDictionary.containsKey(synTerm)) {
                            tempDictionary.put(synTerm,
                                    new HashMap<Integer, Double>());
                        }

                        // Add synset link to synterm
                        tempDictionary.get(synTerm).put(synTermRank,
                                synsetScore);
                    }
                }
            }

            // Go through all the terms.
            for (Map.Entry<String, HashMap<Integer, Double>> entry : tempDictionary.entrySet()) {
                String word = entry.getKey();
                Map<Integer, Double> synSetScoreMap = entry.getValue();

                // Calculate the weighted average. Weigh the synsets according
                // to their rank:
                // score = 1/1*first + 1/2*second + 1/3*third ... etc.
                // sum   = 1/1 + 1/2 + 1/3 ...
                double score = 0.0;
                double sum = 0.0;
                for (Map.Entry<Integer, Double> setScore : synSetScoreMap.entrySet()) {
                    score += setScore.getValue() / (double) setScore.getKey();
                    sum += 1.0 / (double) setScore.getKey();
                }
                score /= sum;

                dictionary.put(word, score);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (csv != null) {
                csv.close();
            }
        }
    }

    public double extract(String word, String pos) {
        // return 0.0 for words missing from the dictionary instead of
        // unboxing a null Double (which would throw a NullPointerException)
        return dictionary.getOrDefault(word + "#" + pos, 0.0);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Usage: java SentiWordNetDemoCode <pathToSentiWordNetFile>");
            return;
        }
        String pathToSWN = args[0];
        SentiWordNetDemoCode sentiwordnet = new SentiWordNetDemoCode(pathToSWN);

        System.out.println("good#a " + sentiwordnet.extract("good", "a"));
        System.out.println("bad#a " + sentiwordnet.extract("bad", "a"));
        System.out.println("blue#a " + sentiwordnet.extract("blue", "a"));
        System.out.println("blue#n " + sentiwordnet.extract("blue", "n"));
    }
}
We can pass the POS to the SentiWordNet parser.
Download the pattern Python module:
from pattern.en import wordnet
print wordnet.synsets("kill", pos="VB")[0].weight
wordnet.synsets returns a list of synsets, and from that we select the first item. The output will be a tuple of (polarity, subjectivity).
Hope this helps...

How to fix "Path Manipulation Vulnerability" in some Java Code?

The simple Java code below is getting a Fortify Path Manipulation error. Please help me resolve it; I have been struggling with this for a long time.
public class Test {
    public static void main(String[] args) {
        File file = new File(args[0]);
    }
}
Try to normalize the URL before using it
https://docs.oracle.com/javase/7/docs/api/java/net/URI.html#normalize()
Path path = Paths.get("/foo/../bar/../baz").normalize();
or use normalize from org.apache.commons.io.FilenameUtils
https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/FilenameUtils.html#normalize(java.lang.String)
String path = FilenameUtils.normalize("/foo/../bar/../baz");
In both cases the result will be /baz (FilenameUtils uses the platform's separator, so \baz on Windows).
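Normalization alone only collapses the dot segments; to actually reject escapes you can combine it with a base-directory check. A minimal sketch, where the base directory /app/data and the method name are assumptions:
import java.nio.file.Path;
import java.nio.file.Paths;

public class SafeResolve {
    // Hypothetical base directory; adjust to your application.
    private static final Path BASE = Paths.get("/app/data").toAbsolutePath().normalize();

    static Path resolveSafely(String userInput) {
        Path requested = BASE.resolve(userInput).normalize();
        // after normalization, a path that escapes BASE no longer starts with it
        if (!requested.startsWith(BASE)) {
            throw new IllegalArgumentException("Path escapes base directory: " + userInput);
        }
        return requested;
    }
}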
Looking at the OWASP page for Path Manipulation, it says
An attacker can specify a path used in an operation on the filesystem
You are opening a file as defined by a user-given input. Your code is almost a perfect example of the vulnerability! Either:
- Don't use the above code (don't let the user specify the input file as an argument)
- Let the user choose from a list of files that you supply (an array of files with an integer choice)
- Don't let the user supply the filename at all; remove the configurability
- Accept the vulnerability but protect against it by checking the filename (although this is the worst thing to do - someone may get round it anyway)
Or re-think your application's design.
Fortify will flag the code even if the path/file doesn't come from user input like a property file. The best way to handle these is to canonicalize the path first, then validate it against a white list of allowed paths.
Bad:
public class Test {
    public static void main(String[] args) {
        File file = new File(args[0]);
    }
}
Good:
public class Test {
    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);
        if (!isInSecureDir(file)) {
            throw new IllegalArgumentException();
        }
        String canonicalPath = file.getCanonicalPath();
        if (!canonicalPath.equals("/img/java/file1.txt") &&
                !canonicalPath.equals("/img/java/file2.txt")) {
            // Invalid file; handle error
        }
        FileInputStream fis = new FileInputStream(file);
    }
}
Source: https://www.securecoding.cert.org/confluence/display/java/FIO16-J.+Canonicalize+path+names+before+validating+them
Only allow alphanumeric characters and a period in the input. That means you filter out the control characters, "..", "/", and "\", which would make your files vulnerable. For example, one should not be able to enter /path/password.txt.
Once done, rescan and then run Fortify AWB.
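For illustration, a minimal sketch of that whitelist check; the method name and the single-period rule are assumptions based on the advice above:
// Hypothetical whitelist check: accept only letters, digits and a single
// period, so "..", "/" and "\" can never appear in the file name.
private static boolean isSafeFileName(String name) {
    return name != null && name.matches("[A-Za-z0-9]+(\\.[A-Za-z0-9]+)?");
}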
I have a solution to the Fortify Path Manipulation issues.
What it is complaining about is that if you take data from an external source, then an attacker can use that source to manipulate your path, enabling the attacker to delete files or otherwise compromise your system.
The suggested remedy to this problem is to use a whitelist of trusted directories as valid inputs; and, reject everything else.
This solution is not always viable in a production environment, so I suggest an alternative: filter the input against a whitelist of acceptable characters. Reject from the input any character you don't want in the path; it can be either removed or replaced.
Below is an example. This does pass the Fortify review. It is important to remember here to return the literal and not the char being checked. Fortify keeps track of the parts that came from the original input. If you use any of the original input, you may still get the error.
public class CleanPath {

    public static String cleanString(String aString) {
        if (aString == null) return null;
        String cleanString = "";
        for (int i = 0; i < aString.length(); ++i) {
            cleanString += cleanChar(aString.charAt(i));
        }
        return cleanString;
    }

    private static char cleanChar(char aChar) {
        // 0 - 9
        for (int i = 48; i < 58; ++i) {
            if (aChar == i) return (char) i;
        }
        // 'A' - 'Z'
        for (int i = 65; i < 91; ++i) {
            if (aChar == i) return (char) i;
        }
        // 'a' - 'z'
        for (int i = 97; i < 123; ++i) {
            if (aChar == i) return (char) i;
        }
        // other valid characters
        switch (aChar) {
            case '/':
                return '/';
            case '.':
                return '.';
            case '-':
                return '-';
            case '_':
                return '_';
            case ' ':
                return ' ';
        }
        return '%';
    }
}
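Hypothetical usage, sanitizing the argument before touching the filesystem:
// Sanitize the user-supplied path before handing it to File.
String safePath = CleanPath.cleanString(args[0]);
File file = new File(safePath);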
Assuming you're running Fortify against a web application, during your triage of Fortify vulnerabilities this would likely get marked as "Not an issue". The reasoning being: A) this is obviously test code, and B) unless you have multiple personality disorder, you're not going to run a path manipulation exploit against yourself when you run that test app.
It's very common to see little test utilities committed to a repository which produce this style of false positive.
As for your compilation errors, that generally comes down to classpath issues.
We have code like the one below, which was raising a Path Manipulation high-category issue in Fortify:
String.join(delimeter,string1,string2,string2,string4);
Our program deals with an AWS S3 bucket, so we changed it as below and it worked:
com.amazonaws.util.StringUtils.join(delimeter,string1,string2,string2,string4);
Using the Tika library's FilenameUtils.normalize solves the Fortify issue.
import org.apache.tika.io.FilenameUtils;

public class Test {
    public static void main(String[] args) {
        String filePath = FilenameUtils.normalize(args[0]); // this line solves the issue
        File file = new File(filePath);
    }
}
Try this for replacing FileInputStream. You will need to close your project and open it again to accurately see whether the changes worked.
File to byte[] in Java
Use the Normalize() function in C#; it resolved the Fortify vulnerability in the next scan.
string s = @"c:\temp\scan.log".Normalize();
Use a regex to validate the file path and file name:
String fileName = args[0];
final String regularExpression = "([\\w\\:\\\\w ./-]+\\w+(\\.)?\\w+)";
Pattern pattern = Pattern.compile(regularExpression);
boolean isMatched = pattern.matcher(fileName).matches();

How do Struts tag attributes work?

In this code:
<html:text property="txtItem5" disabled="disTxtItem5">
I can see that "txtItem5" comes from a getTxtItem5() in the ActionForm, but searching the project for substrings of "disTxtItem5" seemingly reveals nothing remotely related. Yet the framework is clearly pulling a boolean from this string, which means it's more complicated than my current understanding.
Could someone give a good explanation of how these expressions are evaluated, or point me in the direction of one?
EDIT: In my original response (see below) I said that Struts manages the conversion, but I was wrong. I didn't remember exactly what was going on, so I pulled out the Struts sources and took a look. It turns out that the conversion is made by the server: the JSP is converted to a servlet before execution, and it is there that false is used for non-boolean values.
For example, I used the following tag:
<html:text property="nr" disabled="BlaBla" />
Which was translated to the following html (no disabled):
<input type="text" name="nr" value="123">
This happened in the servlet. Here is what my servlet contains for the above tag:
// html:text
org.apache.struts.taglib.html.TextTag _jspx_th_html_005ftext_005f0 = (org.apache.struts.taglib.html.TextTag) _005fjspx_005ftagPool_005fhtml_005ftext_005fproperty_005fdisabled_005fnobody.get(org.apache.struts.taglib.html.TextTag.class);
_jspx_th_html_005ftext_005f0.setPageContext(_jspx_page_context);
_jspx_th_html_005ftext_005f0.setParent((javax.servlet.jsp.tagext.Tag) _jspx_th_html_005fform_005f0);
_jspx_th_html_005ftext_005f0.setProperty("nr");
_jspx_th_html_005ftext_005f0.setDisabled(false);
int _jspx_eval_html_005ftext_005f0 = _jspx_th_html_005ftext_005f0.doStartTag();
As can be seen, the disabled value is generated directly as false. I did some more digging into the Jasper compiler (I used Tomcat), and I think it is the org.apache.jasper.compiler.JspUtil class that is responsible for the conversion, with the following code:
public static boolean booleanValue(String s) {
    boolean b = false;
    if (s != null) {
        if (s.equalsIgnoreCase("yes")) {
            b = true;
        } else {
            b = Boolean.valueOf(s).booleanValue();
        }
    }
    return b;
}
Since I inserted BlaBla in the disabled field, this falls back to Boolean.valueOf(s).booleanValue(), which does the following:
public static Boolean valueOf(String s) {
    return toBoolean(s) ? TRUE : FALSE;
}

private static boolean toBoolean(String name) {
    return ((name != null) && name.equalsIgnoreCase("true"));
}
This way, BlaBla results in false.
ORIG: The following was my original response, but it is incorrect. What I was describing is in fact what happens when request parameters are bound to the action form.
The disabled attribute is of type boolean, so it should receive only values that map to a boolean. disabled="disTxtItem5" would throw a ConversionException because the text disTxtItem5 does not map to a boolean.
Struts uses Commons BeanUtils to make conversions, so a BooleanConverter will be used, with code like the following:
String stringValue = value.toString();
if (stringValue.equalsIgnoreCase("yes") ||
        stringValue.equalsIgnoreCase("y") ||
        stringValue.equalsIgnoreCase("true") ||
        stringValue.equalsIgnoreCase("on") ||
        stringValue.equalsIgnoreCase("1")) {
    return (Boolean.TRUE);
} else if (stringValue.equalsIgnoreCase("no") ||
        stringValue.equalsIgnoreCase("n") ||
        stringValue.equalsIgnoreCase("false") ||
        stringValue.equalsIgnoreCase("off") ||
        stringValue.equalsIgnoreCase("0")) {
    return (Boolean.FALSE);
} else if (useDefault) {
    return (defaultValue);
} else {
    throw new ConversionException(stringValue);
}
At this point I don't remember whether Struts just logs the exception and silently sets false as the value of the parameter, or whether the exception is propagated (it's been a while since I used Struts :D, but I'm more inclined to think it just sets false and continues).
The logs should point out the exception even if it is ignored. Setting a logger for org.apache.commons.beanutils or org.apache.struts should indicate any conversion errors.
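For example, with log4j 1.x (common in classic Struts applications) the levels can be raised programmatically at startup; a sketch, assuming log4j is the logging backend in use:
// Raise the log level for the converter packages so any ConversionException
// swallowed during form population shows up in the logs.
org.apache.log4j.Logger.getLogger("org.apache.commons.beanutils")
        .setLevel(org.apache.log4j.Level.DEBUG);
org.apache.log4j.Logger.getLogger("org.apache.struts")
        .setLevel(org.apache.log4j.Level.DEBUG);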
