Match each keyword at least once - search

I'm trying to configure the scoring logic in Apache Solr 4.10.
I want to maximize the number of DISTINCT keywords matched.
In other words, each keyword should be matched at least once.
For example, this is what I'm currently seeing:
q=foo+bar+baz
Result:
doc1: foo foo foo foo foo foo foo
doc2: foo bar bar bar
doc3: foo bar baz
This is not what I want. I want doc3 to appear at the top (since all keywords are matched), then doc2, then doc1. I tried setting mm=100% but then ONLY doc3 is returned and doc1 and doc2 are not displayed at all.
Any ideas?

If you set omitTermFreqAndPositions="true" in your field definition, you will get the result you want. The number of times a search term occurs in a document will no longer change the score; the score will then be driven by the number of different search terms that match the document. Note that omitting positions also means phrase queries no longer work on that field.
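To see why ignoring term frequency produces the desired order, here is a minimal Python sketch that ranks the three example documents purely by how many distinct query terms they contain. It only illustrates the idea: the helper name distinct_match_count is made up for this example, and Solr's real score also involves other factors such as IDF and field norms.

```python
def distinct_match_count(query, text):
    """Count how many distinct query terms occur in the text (frequency ignored)."""
    return len(set(query.split()) & set(text.split()))


docs = {
    "doc1": "foo foo foo foo foo foo foo",
    "doc2": "foo bar bar bar",
    "doc3": "foo bar baz",
}

query = "foo bar baz"

# Sort document names by the number of distinct matched terms, best first.
ranked = sorted(
    docs,
    key=lambda name: distinct_match_count(query, docs[name]),
    reverse=True,
)
print(ranked)  # ['doc3', 'doc2', 'doc1']
```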


How can I prepend a common sequence of characters to a consecutive sequence of lines in Vim?

Let's say I have the following:
foo bar bub
baz qux doo
And I want to end up with:
* foo bar bub
* baz qux doo
Is there any way short of qqI* <Esc>jq3@q, or using a macro? I can't imagine that there isn't. Thanks
qqI* <Esc>jq3@q is a macro.
Another way:
:,+norm I*<space><CR>
(as benji commented, ,+ is a short notation for the .,+1 range, meaning "from this line to the next one")
Another way:
:,+s/^/*<space><CR>
Another way:
<C-v>jI*<space><Esc>
Another way:
I*<space><Esc>j^.

How to add only non-null item to a list in Groovy

I want to add non null items to a List. So I do this:
List<Foo> foos = []
Foo foo = makeFoo()
if (foo)
    foos << foo
But is there a way to do it in a single operation (without using findAll after the creation of the list). Like:
foos.addNonNull(makeFoo())
Another alternative is to use a short circuit expression:
foo && foos << foo
The foo variable must evaluate to true for the second part to be evaluated. This is common practice in some other languages, but I'd hesitate to use it widely in Groovy because of readability and convention.
No, you'd need to use an if, or write your own addNonNull method (which just uses an if).
Also:
if( foo ) {
probably isn't enough, as this will also skip empty strings, or 0 if makeFoo() returns integers.
You'd need:
if( foo != null ) {
Yes, we can get rid of the variable assignment:
Foo foo = makeFoo() // we can ditch this
No, we can't get rid of the condition, but we can make it more compact.
Here's how:
List<Foo> foos = []
foos += (makeFoo()?:[]);
The trick is Groovy's "+" operator, which behaves differently depending on what is on its left and what is on its right. If the left operand is a list and the right operand is an empty list, nothing gets added to the list on the left.
Pros: it is quick to type and compact.
Cons: it is not instantly obvious what is happening, and we have replaced the variable assignment with an extra operation. Groovy is going to apply += to foos no matter what; in the null case the result of that operation just happens to be what we want. Note also that ?: uses Groovy truth, so an empty string or 0 would be skipped too, and if makeFoo() returned a list, its elements would be appended individually.
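The difference between a truthiness check and an explicit null check, raised in the answers above, can be sketched in Python, whose truthiness rules treat 0 and the empty string as false in the same way. This is an analogy, not Groovy code, and the helper names add_non_null and add_truthy are hypothetical.

```python
def add_non_null(items, value):
    """Append value unless it is None; falsy values such as 0 or "" are kept."""
    if value is not None:
        items.append(value)
    return items


def add_truthy(items, value):
    """Append value only if it is truthy; this also drops 0 and ""."""
    if value:
        items.append(value)
    return items


candidates = [1, None, 0, "", "x"]
kept_non_null = []
kept_truthy = []
for candidate in candidates:
    add_non_null(kept_non_null, candidate)
    add_truthy(kept_truthy, candidate)

print(kept_non_null)  # [1, 0, '', 'x']
print(kept_truthy)    # [1, 'x']
```

If only null values should be filtered out, the explicit comparison is the safer of the two.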

How do I restrict the types in a heterogeneous list?

I am currently trying to create a (sort of) typesafe xml like syntax embedded into Haskell. In the end I am hoping to achieve something like this:
tree = group [arg1 "str", arg2 42]
         [item [foo, bar] []
         ,item [foo, bar] []
         ]
where group and item have a type of the form [Arg t] -> [Node c] -> Node t. If this doesn't make any sense, it is most probably because I have no idea what I am doing :)
My question now is how to make the type system prevent me from giving 'wrong' arguments to a Node. E.g., nodes of type Group may only have arguments of type Arg1 and Arg2, but Items may have arguments of type Foo and Bar.
I guess the bottom-line question is: how do I restrict the types in a heterogeneous list?
Example of the (user) syntax I am trying to achieve:
group .: arg1 "str" .: arg2 42
    item .: foo .: bar
    item .: foo .: bar
where (.:) is a function that sets the parameter in the node. This would represent a group with some parameters containing two items.
Additionally there would be some (pseudo) definition like:
data Node = Node PossibleArguments PossibleChildNodes
type Group = Node [Arg1, Arg2] [Item]
type Item = Node [Foo, Bar] []
I am searching for a way to catch usage errors by the typechecker.
What you have doesn't sound to me like you need a heterogeneous list. Maybe you're looking for something like this?
data Foo = Foo Int
data Bar = Bar Int
data Arg = StringArg String | IntArg Int | DoubleArg Double
data Tree = Group Arg Arg [Item]
data Item = Item Foo Bar
example :: Tree
example = Group (StringArg "str") (IntArg 42)
            [Item (Foo 1) (Bar 2), Item (Foo 12) (Bar 36)]
Note that we could even create a list of Args of different "sub-types". For example, [StringArg "hello", IntArg 3, DoubleArg 12.0]. It would still be a homogeneous list, though.
===== EDIT =====
There are a few ways you could handle the "default argument" situation. Suppose the Bar argument in an item is optional. My first thought is that while it may be optional for the user to specify it, when I store the data I want to include the default argument. That way, determining a default is separated from the code that actually does something with it. So, if the user specifies a Foo of 3, but doesn't supply a Bar, and the default is Bar 77, then I create my item as:
Item (Foo 3) (Bar 77)
This has the advantage that functions that operate on this object don't need to worry about defaults; both parameters will always be present as far as they are concerned.
However, if you really want to omit the default arguments in your data structure, you could do something like this:
data Bar = Bar Int | DefaultBar
example = Group (StringArg "str") (IntArg 42)
            [Item (Foo 1) (Bar 2), Item (Foo 12) DefaultBar]
Or even:
data Item = Item Foo Bar | ItemWithDefaultBar Foo
===== Edit #2 =====
So perhaps you could use something like this:
data ComplicatedItem = ComplicatedItem
  { location :: (Double, Double)
  , size :: Int
  , rotation :: Double
  -- . . . and so on . . .
  }
defaultComplicatedItem = ComplicatedItem { location = (0.0,0.0), size = 1, rotation = 0.0, ... }
To create a ComplicatedItem, the user only has to specify the non-default parameters:
myComplicatedItem = defaultComplicatedItem { size=3 }
If you add new parameters to the ComplicatedItem type, you need to update defaultComplicatedItem, but the definition for myComplicatedItem doesn't change.
You could also write a custom Show instance so that it omits the default parameters when printing.
Based on the ensuing discussion, it sounds like what you want is to create a DSL (Domain-Specific Language) to represent XML.
One option is to embed your DSL in Haskell so it can appear in Haskell source code. In general, you can do this by defining the types you need, and providing a set of functions to work with those types. It sounds like this is what you're hoping to do. However, as an embedded DSL, it will be subject to some constraints, and this is the problem you're encountering. Perhaps there is a clever trick to do what you want, maybe something involving type functions, but I can't think of anything at present. If you want to keep trying, maybe add the tags dsl and gadt to your question, to catch the attention of people who know more about this stuff than I do. Alternatively, you might be able to use something like Template Haskell or Scrap Your Boilerplate to allow your users to omit some information, which would then be "filled in" before Haskell "sees" it.
Another option is to have an external DSL, which you parse using Haskell. You could define a DSL, but maybe it would be easier to just use XML directly with a suitable DTD. There are Haskell libraries for parsing XML, of course.

How to use the term position parameter in Xapian query constructors

Xapian docs talk about a query constructor that takes a term position parameter, to be used in phrase searches:
Quote:
This constructor actually takes a couple of extra parameters, which
may be used to specify positional and frequency information for terms
in the query:
Xapian::Query(const string & tname_,
Xapian::termcount wqf_ = 1,
Xapian::termpos term_pos_ = 0)
The term_pos represents the position of the term in the query. Again,
this isn't useful for a single term query by itself, but is used for
phrase searching, passage retrieval, and other operations which
require knowledge of the order of terms in the query (such as
returning the set of matching terms in a given document in the same
order as they occur in the query). If such operations are not
required, the default value of 0 may be used.
And in the reference, we have:
Xapian::Query::Query ( const std::string & tname_,
Xapian::termcount wqf_ = 1,
Xapian::termpos pos_ = 0
)
A query consisting of a single term.
And:
typedef unsigned termpos
A term position within a document or query.
So, say I want to build a query for the phrase: "foo bar baz", how do I go about it?!
Does term_pos_ provide relative position values, ie define the order of terms within the document:
(I'm using here the python bindings API, as I'm more familiar with it)
q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 1),xapian.Query("bar", wqf,2),xapian.Query("baz", wqf,3)] )
And just for the sake of testing, suppose we did:
q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 3),xapian.Query("bar", wqf, 4),xapian.Query("baz", wqf, 5)] )
So this would give the same results as the previous example?!
And suppose we have:
q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 2),xapian.Query("bar", wqf, 4),xapian.Query("baz", wqf, 5)] )
So now this would match where documents have "foo" "bar" separated with one term, followed by "baz" ??
Is it as such, or is it that this parameter is referring to absolute positions of the indexed terms?!
Edit:
And how is OP_PHRASE related to this? I find some online samples using OP_PHRASE as such:
q = xapian.Query(xapian.Query.OP_PHRASE, term_list)
This makes obvious sense, but then what is the role of the said term_pos_ constructor in phrase searches - is it a more surgical way of doing things!?
int pos = 1;
std::list<Xapian::Query> subs;
subs.push_back(Xapian::Query("foo", 1, pos++));
subs.push_back(Xapian::Query("bar", 1, pos++));
querylist.push_back(Xapian::Query(Xapian::Query::OP_PHRASE, subs.begin(), subs.end()));

Manipulate strings in prolog

I have a list of strings I need to manipulate and write out.
I get the strings the usual way with H|Tail recursion.
H will look something like "statement(foo, foo2, foo3, foo4, foo5)"
I want to be able to write out only foo, foo2, foo3 on separate lines
out: foo
bar: foo2
...
...
div: foo5
Convert string to codes, codes to a term, then destructure the term:
/* SWI Prolog
*/
read_from_string(String, Term) :-
    string_to_list(String, List),
    read_from_chars(List, Term).
demo :-
    String = "statement(foo, foo2, foo3,foo4,foo5)",
    read_from_string(String, Term),
    Term =.. [Fst,Snd,Thr|Rest],
    write(functor:Fst), nl,
    write(arg1:Snd), nl,
    write(arg2:Thr), nl,
    write(rest:Rest), nl.
Demo session:
?- demo.
functor:statement
arg1:foo
arg2:foo2
rest:[foo3,foo4,foo5]
true.
Choose the items to print according to their respective positions in the list that resulted from univ(=..). Here they were all printed.
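For comparison, the same destructuring idea (split a term into its functor followed by its arguments, then pick items by position) can be sketched in Python. This is only an analogy to what univ does, not Prolog: the helper parse_flat_term is hypothetical and, unlike a real Prolog reader, handles only flat terms with no nested arguments.

```python
import re


def parse_flat_term(text):
    """Split 'name(a, b, c)' into [name, a, b, c]; nested terms are not handled."""
    match = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", text)
    if match is None:
        raise ValueError(f"not a flat compound term: {text!r}")
    functor, arg_text = match.groups()
    args = [arg.strip() for arg in arg_text.split(",")]
    return [functor] + args


# Mirror the Prolog pattern [Fst, Snd, Thr | Rest].
functor, first, second, *rest = parse_flat_term("statement(foo, foo2, foo3,foo4,foo5)")
print(functor, first, second, rest)
```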
