Number name entity recognition in Stanford

I am trying to recognize number named entities in text using Stanford CoreNLP from Java. For input such as "20 million", the result comes back as "Number":["20-5","million-6"]. How can I get "20 million" back as a single entity, and how can I drop the index suffixes (the 5 and 6 in the example above)?
public void extractNumbers(String text) throws IOException {
    number = new HashMap<String, ArrayList<String>>();
    n = new ArrayList<String>();
    edu.stanford.nlp.pipeline.Annotation document = new edu.stanford.nlp.pipeline.Annotation(text);
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
            if (!token.get(CoreAnnotations.NamedEntityTagAnnotation.class).equals("O")) {
                if (token.get(CoreAnnotations.NamedEntityTagAnnotation.class).equals("NUMBER")) {
                    n.add(token.toString());
                    number.put("Number", n);
                }
            }
        }
    }
}

To get the exact text of any CoreLabel object, use token.originalText() instead of token.toString(); the latter is what appends the index suffix such as "-5".
If you need anything else from these tokens, take a look at CoreLabel's javadoc.
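
The answer above handles the index suffix but not the split between "20" and "million". One possible approach is to join consecutive tokens that carry the NUMBER tag. The following is a minimal, self-contained sketch rather than CoreNLP code: the two lists stand in for what token.originalText() and the NamedEntityTagAnnotation value would give for each token, and the class and method names (NumberMerger, mergeNumbers) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CoreNLP.
class NumberMerger {

    // words.get(i) stands in for token.originalText(),
    // tags.get(i) for the token's NamedEntityTagAnnotation value.
    static List<String> mergeNumbers(List<String> words, List<String> tags) {
        List<String> merged = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            if ("NUMBER".equals(tags.get(i))) {
                // Extend the current run of NUMBER tokens.
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(words.get(i));
            } else if (current.length() > 0) {
                // A non-NUMBER token ends the run.
                merged.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush a run that reaches the end of the input.
        if (current.length() > 0) {
            merged.add(current.toString());
        }
        return merged;
    }

    public static void main(String[] args) {
        // Assumed sample data; the tags are written by hand for illustration.
        List<String> words = Arrays.asList("We", "sold", "20", "million", "units", "in", "5", "regions");
        List<String> tags = Arrays.asList("O", "O", "NUMBER", "NUMBER", "O", "O", "NUMBER", "O");
        System.out.println(mergeNumbers(words, tags));
    }
}
```

In the original loop, the same idea would mean building the run inside the token loop and flushing it at the end of each sentence, so that numbers from adjacent sentences are not joined.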

Related

java 8 - iterate 2 hash maps and create new hash map with records only matching keys

We have a requirement to implement in Java 8; any help is appreciated.
We have a method that takes two parameters, both of type HashMap<String,Dog>.
We want to iterate over both maps and return a single HashMap.
The result should contain only the keys present in both maps. For each matched key, the value is a Dog whose attributes are taken partly from the Dog in the first map and partly from the Dog in the second.
Please suggest how we can achieve this in Java 8.

You can implement the iteration simply, like this:
public Map<String, Dog> combineMap(Map<String, Dog> first, Map<String, Dog> second) {
Map<String, Dog> result = new HashMap<>();
for (Map.Entry<String, Dog> entry : first.entrySet()) {
if(second.containsKey(entry.getKey())){
Dog dogFirst = entry.getValue();
Dog dogSecond = second.get(entry.getKey());
Dog combineDog = new Dog();
// Do what ever you want with combineDog
result.put(entry.getKey(), combineDog);
}
}
return result;
}
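
Since the question asks specifically about Java 8, the same matching can also be written with streams. This is only a sketch: the Dog class below, with its name and age fields, is a hypothetical stand-in for the real one, and the choice of which attribute comes from which map is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical Dog with two attributes, for illustration only.
class Dog {
    final String name;
    final int age;

    Dog(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class DogMaps {

    // Keeps only keys present in both maps.
    // Assumption: name is taken from the first map, age from the second.
    static Map<String, Dog> combine(Map<String, Dog> first, Map<String, Dog> second) {
        return first.entrySet().stream()
                .filter(e -> second.containsKey(e.getKey()))
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> new Dog(e.getValue().name, second.get(e.getKey()).age)));
    }

    public static void main(String[] args) {
        Map<String, Dog> first = new HashMap<>();
        first.put("rex", new Dog("Rex", 1));
        first.put("fido", new Dog("Fido", 2));

        Map<String, Dog> second = new HashMap<>();
        second.put("rex", new Dog("Other", 7));
        second.put("spot", new Dog("Spot", 4));

        Map<String, Dog> result = combine(first, second);
        for (Map.Entry<String, Dog> e : result.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().name + ", " + e.getValue().age);
        }
    }
}
```

Note that Collectors.toMap does not accept null values, so this sketch assumes neither map stores null for a shared key.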

Cucumber V5-V6 - passing complex object in feature file step

So I have recently migrated to v6 and I will try to simplify my question
I have the following class
@AllArgsConstructor
public class Songs {
String title;
List<String> genres;
}
In my scenario I want to have something like:
Then The results are as follows:
|title |genre |
|happy song |romance, happy|
And the implementation should be something like:
@Then("Then The results are as follows:")
public void theResultsAreAsFollows(Songs song) {
//Some code here
}
I have the default transformer
@DefaultParameterTransformer
@DefaultDataTableEntryTransformer(replaceWithEmptyString = "[blank]")
@DefaultDataTableCellTransformer
public Object transformer(Object fromValue, Type toValueType) {
ObjectMapper objectMapper = new ObjectMapper();
return objectMapper.convertValue(fromValue, objectMapper.constructType(toValueType));
}
My current issue is that I get the following error: Cannot construct instance of java.util.ArrayList (although at least one Creator exists)
How can I tell Cucumber to interpret specific cells as lists, while keeping everything in the same step rather than splitting it apart? Or better, how can I pass an object to a step when it contains fields of different types such as List, HashSet, etc.?
If I replace the list with a String, everything works as expected.

@M.P.Korstanje thank you for your idea. If anyone is looking for a solution to this, here is how I did it, following the suggestions I received. I inspected the type of fromValue and updated the transformer method to something like:
if (fromValue instanceof LinkedHashMap) {
    Map<String, Object> map = (LinkedHashMap<String, Object>) fromValue;
    Set<String> keys = map.keySet();
    for (String key : keys) {
        if (key.equals("genres")) {
            // Replace the comma-separated cell with a list of its parts.
            List<String> genres = Arrays.asList(map.get(key).toString().split(",", -1));
            map.put("genres", genres);
        }
    }
    // Convert only after every key has been inspected.
    return objectMapper.convertValue(map, objectMapper.constructType(toValueType));
}
It is rather specific, but I could not find a better solution :)
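
A slightly more general variant of this workaround could take the names of the list-valued columns as a parameter and trim each item, since splitting "romance, happy" on "," alone leaves a leading space on " happy". The sketch below uses only the standard library; the class and method names (ListCells, splitListCells) and the idea of passing a set of column names are assumptions, and the resulting map would still need to be handed to objectMapper.convertValue as in the transformer above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper; which columns hold lists is an assumption supplied by the caller.
class ListCells {

    // Returns a copy of the row in which every column named in listColumns
    // is replaced by a list of its trimmed, comma-separated parts.
    static Map<String, Object> splitListCells(Map<String, String> row, Set<String> listColumns) {
        Map<String, Object> converted = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (listColumns.contains(cell.getKey()) && cell.getValue() != null) {
                List<String> items = Arrays.stream(cell.getValue().split(","))
                        .map(String::trim)
                        .filter(s -> !s.isEmpty())
                        .collect(Collectors.toList());
                converted.put(cell.getKey(), items);
            } else {
                // Leave every other cell untouched.
                converted.put(cell.getKey(), cell.getValue());
            }
        }
        return converted;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("title", "happy song");
        row.put("genres", "romance, happy");
        System.out.println(splitListCells(row, Collections.singleton("genres")));
    }
}
```

For the conversion to populate the Songs class, the table header has to match the field name (genres), which is also the key the transformer above checks for.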

eliminating embedded actions from antlr4 grammar

I have an antlr grammar in which embedded actions are used to collect data bottom up and build aggregated data structures. A short version is given below, where the aggregated data structures are only printed (ie no classes are created for them in this short sample code).
grammar Sample;
top returns [ArrayList l]
@init { $l = new ArrayList<String>(); }
: (mid { $l.add($mid.s); } )* ;
mid returns [String s]
: i1=identifier 'hello' i2=identifier
{ $s = $i1.s + " bye " + $i2.s; }
;
identifier returns [String s]
: ID { $s = $ID.getText(); } ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
Its corresponding Main program is:
public class Main {
public static void main( String[] args) throws Exception
{
SampleLexer lexer = new SampleLexer( new ANTLRFileStream(args[0]));
CommonTokenStream tokens = new CommonTokenStream( lexer );
SampleParser parser = new SampleParser( tokens );
ArrayList<String> top = parser.top().l;
System.out.println(top);
}
}
And a sample test is:
aaa hello bbb
xyz hello pqr
Since one of the objectives of ANTLR is to keep the grammar file reusable and action-independent, I am trying to remove the actions from this file and move them to a tree walker. I took a first stab at it with the following code:
public class Main {
public static void main( String[] args) throws Exception
{
SampleLexer lexer = new SampleLexer( new ANTLRFileStream(args[0]));
CommonTokenStream tokens = new CommonTokenStream( lexer );
SampleParser parser = new SampleParser( tokens );
ParseTree tree = parser.top();
ParseTreeWalker walker = new ParseTreeWalker();
walker.walk( new Walker(), tree );
}
}
public class Walker extends SampleBaseListener {
public void exitTop(SampleParser.TopContext ctx ) {
System.out.println( "Exit Top : " + ctx.mid() );
}
public String exitMid(SampleParser.MidContext ctx ) {
return ctx.identifier() + " bye "; // ignoring the 2nd instance here
}
public String exitIdentifier(SampleParser.IdentifierContext ctx ) {
return ctx.ID().getText() ;
}
}
But obviously this is wrong: at the very least, the return types of the Walker methods must be void, so they have no way to return aggregated values upstream. Secondly, I don't see how to access "i1" and "i2" from the walker code, so I cannot differentiate between the two instances of "identifier" in that rule.
Any suggestions on how to separate the actions from the grammar for this purpose?
Should I use a visitor instead of a listener here, since the visitor has the capability of returning values? If I use a visitor, how do I solve the problem of differentiating between "i1" and "i2" (as mentioned above)?
Does a visitor perform its action only at the exit of a rule (unlike listeners, which have both entry and exit methods)? For example, if I have to initialize the list at the entry of rule "top", how can I do that with a visitor, which executes only at the conclusion of a rule? Do I need an enterTop listener for that purpose?
EDIT: After the initial post, I have modified the rule "top" to create and return a list, and pass this list back to the main program for printing. This is to illustrate why I need an initialization mechanism for the code.

Based on what you are trying to do I think you may benefit from using ANTLR's BaseVisitor Class rather than the BaseListener Class.
Assuming your grammar is this (I generalized it and I'll explain the changes below):
grammar Sample;
top : mid* ;
mid : i1=identifier 'hello' i2=identifier ;
identifier : ID ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
Then your Walker would look like this:
public class Walker extends SampleBaseVisitor<Object> {
public ArrayList<String> visitTop(SampleParser.TopContext ctx) {
ArrayList<String> arrayList = new ArrayList<>();
for (SampleParser.MidContext midCtx : ctx.mid()) {
arrayList.add(visitMid(midCtx));
}
return arrayList;
}
public String visitMid(SampleParser.MidContext ctx) {
return visitIdentifier(ctx.i1) + " bye " + visitIdentifier(ctx.i2);
}
public String visitIdentifier(SampleParser.IdentifierContext ctx) {
return ctx.getText();
}
}
This allows you to visit and get the result of any rule you want.
You are able to access i1 and i2, as you labeled them through the visitor methods. Note that you don't really need the identifier rule since it contains only one token and you can access a token's text directly in the visitMid, but really it's personal preference.
You should also note that SampleBaseVisitor is a generic class, where the generic parameter determines the return type of the visit methods. For your example I set the generic parameter Object, but you could even make your own class which contains the information you want to preserve and use that for your generic parameter.
Here are some more useful methods which BaseVisitor inherits which may help you out.
Lastly, your main method would end up looking something like this:
public static void main( String[] args) throws IOException {
FileInputStream fileInputStream = new FileInputStream(args[0]);
SampleLexer lexer = new SampleLexer(CharStreams.fromStream(fileInputStream));
CommonTokenStream tokens = new CommonTokenStream(lexer);
SampleParser parser = new SampleParser(tokens);
for (String string : new Walker().visitTop(parser.top())) {
System.out.println(string);
}
}
As a side note, the ANTLRFileStream class is deprecated in ANTLR4.
It is recommended to use CharStreams instead.

As Terence Parr points out in the Definitive Reference, one main difference between Visitor and Listener is that the Visitor can return values. And that can be convenient. But Listener has a place too! What I do for listener is exemplified in this answer. Granted, there are simpler ways of parsing a list of numbers, but I made that answer to show a complete and working example of how to aggregate return values from a listener into a public data structure that can be consumed later.
public class ValuesListener : ValuesBaseListener
{
public List<double> doubles = new List<double>(); // <<=== SEE HERE
public override void ExitNumber(ValuesParser.NumberContext context)
{
doubles.Add(Convert.ToDouble(context.GetChild(0).GetText()));
}
}
Looking closely at the Listener class, I include a public data collection -- a List<double> in this case -- to collect values parsed or calculated in the listener events. You can use any data structure you like: another custom class, a list, a queue, a stack (great for calculations and expression evaluation), whatever you like.
So while the Visitor is arguably more flexible, the Listener is a strong contender too, depending on how you want to aggregate your results.

How do I get a song name and a rating inside an arraylist?

How do I make an ArrayList that holds both a song name and a rating? Right now I just put a number in front of the song name and then use substring to extract the rating, but I need to sort by rating and find the songs with the highest rating, which is very complicated (in my world) with this method. Is there any way to link a song and a rating (1-5)? I was thinking about linked lists, but that would be too complicated for my skills unless it is the only way. All advice is very welcome. Thanks in advance.
Here is my main method:
// Create band
Band Beatles = new Band("Beatles");
// Add musicians
Beatles.musician.add("John Lennon");
Beatles.musician.add("Paul McCartney");
Beatles.musician.add("George Harrison");
Beatles.musician.add("Ringo Starr");
// Add songs (first number is popularity of the song)
Beatles.songs.add("5Yesterday");
Beatles.songs.add("5Let it be");
Beatles.songs.add("3I Saw Her Standing There");
Beatles.songs.add("2Misery");
Beatles.songs.add("4Love Me Do");
//Prints out all songs by Beatles, and their rating
System.out.println(Beatles.musician);
Beatles.getBandSongs();
Here is my Band class:
import java.util.ArrayList;
public class Band {
public String bandName;
public ArrayList<String> musician = new ArrayList<String>();
public ArrayList<String> songs = new ArrayList<String>();
// Constructor
public Band(String bandName) {
this.bandName = bandName;
}
public void getBandSongs(){
for (String s : songs) {
int rating = Integer.parseInt(s.substring(0,1));
s = s.substring(1);
System.out.println("Rating: " + rating + " - Song name: " + s);
}
}
}

I would define a new class named Song with two fields: a String for the song's name and an int for its rating. Have the class implement Comparable<Song> and write a compareTo() method so that songs are compared by their ratings. You can then put the songs into an ArrayList<Song> and use a method such as Collections.max() to get the song with the highest rating, or Collections.sort() to order them. Best of luck.
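
A Song class along these lines could look roughly like the sketch below. The accessor names, the toString() format (borrowed from the question's print statement), and the SongDemo class are illustrative choices, not something the answer prescribes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// A song with a name and a rating; songs compare by rating.
class Song implements Comparable<Song> {
    private final String name;
    private final int rating; // expected range 1-5 (not enforced in this sketch)

    Song(String name, int rating) {
        this.name = name;
        this.rating = rating;
    }

    public String getName() {
        return name;
    }

    public int getRating() {
        return rating;
    }

    @Override
    public int compareTo(Song other) {
        // Lower rating sorts first; Collections.max() therefore returns a top-rated song.
        return Integer.compare(rating, other.rating);
    }

    @Override
    public String toString() {
        return "Rating: " + rating + " - Song name: " + name;
    }
}

class SongDemo {
    public static void main(String[] args) {
        List<Song> songs = new ArrayList<>();
        songs.add(new Song("Yesterday", 5));
        songs.add(new Song("I Saw Her Standing There", 3));
        songs.add(new Song("Misery", 2));
        songs.add(new Song("Love Me Do", 4));

        // A highest-rated song.
        System.out.println(Collections.max(songs));

        // All songs, best first.
        Collections.sort(songs, Collections.reverseOrder());
        for (Song s : songs) {
            System.out.println(s);
        }
    }
}
```

In the Band class, the songs field would then be declared as ArrayList<Song> instead of ArrayList<String>, and the substring parsing in getBandSongs() is no longer needed.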

Declarative hyperlinking with Jersey and JAXB

I am trying to use Jersey declarative hyperlinking and JAXB to achieve something that seems fairly simple to me, but despite reading all the docs and examples I could find I can't get things to work.
I have a bookstore with books, each of which has just a title.
I would like GET /bookstore to return just an array of hyperlinks to books,
while GET /bookstore/some-title would return the actual serialized book attributes.
I am getting completely confused with resources and "representations", and with the way @Ref is supposed to work. What would be the cleanest way to design this?
The icing on the cake would be the ability to get either version of the bookstore collection (shallow with just book URIs or deep with actual book attributes) based on a query parameter...
I have tried to add this method to bookstore:
@XmlElement
public BookRef[] getBookReferences()
{
BookRef[] refs = new BookRef[_books.size()];
for (int i = 0; i < refs.length; i++) {
refs[i] = new BookRef(_books.get(i).getTitle());
}
return refs;
}
with this BookRef class:
@XmlRootElement(name="book")
public class BookRef
{
private String _title;
public BookRef()
{
}
public BookRef(@PathParam("title") String title)
{
_title= title;
}
@Ref(resource=Book.class,
style = Ref.Style.ABSOLUTE,
bindings=@Binding(title="title", value="${instance.title}")
)
private URI _self;
@XmlElement
public URI getURI()
{
return _self;
}
}
... but that just yields (in JSON): {"bookReferences":null}
