Solr Autosuggest & Spell check - search

I am using Solr 3.6 (sorry to say!) and am having a hard time implementing autosuggest and spellcheck simultaneously. I am using Suggester for autosuggest and do not want to use IndexBasedSpellChecker for spell checking. Is it possible to configure autosuggest and spellcheck in a single request handler?
For example: if I search for 'blan', Solr suggests 'blanket' and retrieves search results. However, if I type 'blantet' or 'blanpet', I get 0 results and no suggestions or spelling corrections. I just need a spelling correction from 'blantet' to 'blanket' so that I can show 'Did you mean ?' on my page.
I am using the standard query parser.
Thanks in advance.

I am not sure about version 3.6, but the following configuration works for me on the later version 6.
solrconfig.xml:
<searchComponent name="spellchecktest" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">text</str>
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">name</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="distanceMeasure">internal</str>
<float name="accuracy">0.5</float>
<str name="payloadField">address</str>
</lst>
<lst name="spellchecker">
<str name="name">wordbreak</str>
<str name="classname">solr.WordBreakSolrSpellChecker</str>
<str name="field">name</str>
<str name="combineWords">true</str>
<str name="breakWords">true</str>
<int name="maxChanges">10</int>
<int name="minBreakLength">2</int>
</lst>
</searchComponent>
<requestHandler name="/selectCheck" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">name</str>
<str name="spellcheck">on</str>
<str name="spellcheck.extendedResults">false</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.alternativeTermCount">2</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">5</str>
<str name="spellcheck.maxCollations">3</str>
</lst>
<arr name="last-components">
<str>spellchecktest</str>
</arr>
</requestHandler>
schema.xml:
<field name="name" type="text" indexed="true" stored="true" multiValued="false"/>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
Example from a local Solr instance (the letter n is missing from 'davider'):
Query:
http://localhost:8983/solr/basic/selectCheck?q=davider
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">40</int>
<lst name="params">
<str name="q">davider</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" />
<lst name="spellcheck">
<lst name="suggestions">
<lst name="davider">
<int name="numFound">1</int>
<int name="startOffset">0</int>
<int name="endOffset">7</int>
<arr name="suggestion">
<str>davinder</str>
</arr>
</lst>
</lst>
<lst name="collations">
<lst name="collation">
<str name="collationQuery">davinder</str>
<int name="hits">1</int>
<lst name="misspellingsAndCorrections">
<str name="davider">davinder</str>
</lst>
</lst>
</lst>
</lst>
</response>
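To turn a response like the one above into a 'Did you mean ?' hint, the page only needs the collationQuery value. The following is a minimal sketch in Python, assuming the XML response layout shown above (it may differ between Solr versions and response writers); the function name did_you_mean and the trimmed sample are illustrative, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown above (illustrative sample only).
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="0" start="0"/>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="davider">
        <int name="numFound">1</int>
        <arr name="suggestion"><str>davinder</str></arr>
      </lst>
    </lst>
    <lst name="collations">
      <lst name="collation">
        <str name="collationQuery">davinder</str>
        <int name="hits">1</int>
      </lst>
    </lst>
  </lst>
</response>"""


def did_you_mean(response_xml):
    """Return the first collated query, or None when the search already
    returned documents or no collation is present."""
    root = ET.fromstring(response_xml)
    result = root.find("result[@name='response']")
    if result is not None and int(result.get("numFound", "0")) > 0:
        return None  # the original query matched, no hint needed
    collation = root.find(
        ".//lst[@name='collations']/lst[@name='collation']"
        "/str[@name='collationQuery']"
    )
    return collation.text if collation is not None else None


print(did_you_mean(SAMPLE_RESPONSE))  # davinder
```

The hint is only offered when numFound is 0, which matches the use case in the question; a page could just as well show it whenever a collation exists.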

Apache Nutch 1.17 indexer rabbit not working

I am trying to push crawled documents to RabbitMQ. I have followed all the docs available:
IndexWriters Mapping
RabbitMQ README
However, I can't manage to run indexer-rabbit. Looking at the logs, there is not even a mention of indexer-rabbit. I am just trying to make it work before further configuration. I tried connecting to RabbitMQ with a small custom program, and everything works.
I have included the indexer in nutch-site.xml as well.
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-rabbit|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
<property>
<name>rabbitmq.publisher.server.uri</name>
<value>amqp://guest:guest@172.17.0.2:5672/</value>
</property>
<property>
<name>publisher.queue.type</name>
<value>RabbitMQ</value>
</property>
Also, the mappings are default and seem quite right for testing.
<writer id="indexer_solr_1" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
<parameters>
<param name="type" value="http"/>
<param name="url" value="http://localhost:8983/solr/nutch"/>
<param name="collection" value=""/>
<param name="weight.field" value=""/>
<param name="commitSize" value="1000"/>
<param name="auth" value="false"/>
<param name="username" value="username"/>
<param name="password" value="password"/>
</parameters>
<mapping>
<copy>
<!-- <field source="content" dest="search"/> -->
<!-- <field source="title" dest="title,search"/> -->
</copy>
<rename>
<field source="metatag.description" dest="description"/>
<field source="metatag.keywords" dest="keywords"/>
</rename>
<remove>
<field source="segment"/>
</remove>
</mapping>
</writer>
<writer id="indexer_rabbit_1" class="org.apache.nutch.indexwriter.rabbit.RabbitIndexWriter">
<parameters>
<param name="server.uri" value="amqp://guest:guest@172.17.0.2:5672/"/>
<param name="binding" value="false"/>
<param name="binding.arguments" value=""/>
<param name="exchange.name" value=""/>
<param name="exchange.options" value="type=direct,durable=true"/>
<param name="queue.name" value="nutch.queue"/>
<param name="queue.options" value="durable=true,exclusive=false,auto-delete=false"/>
<param name="routingkey" value=""/>
<param name="commit.mode" value="multiple"/>
<param name="commit.size" value="250"/>
<param name="headers.static" value=""/>
<param name="headers.dynamic" value=""/>
</parameters>
<mapping>
<copy>
<field source="title" dest="title,search"/>
</copy>
<rename>
<field source="metatag.description" dest="description"/>
<field source="metatag.keywords" dest="keywords"/>
</rename>
<remove>
<field source="content"/>
<field source="segment"/>
<field source="boost"/>
</remove>
</mapping>
</writer>
Does anybody have any idea what I am missing here?
It turned out to be my own mistake, and a small one: I had not added the index parameter to the crawl command. The previous command looked like this:
./bin/crawl -s ./urls --hostdbupdate --hostdbgenerate --size-fetchlist 20 ./crawl 3
This command has no index parameter (-i), so indexing was skipped. The new command should be:
./bin/crawl -i -s ./urls --hostdbupdate --hostdbgenerate --size-fetchlist 20 ./crawl 3

How can the schema of solr be integrated with node?

I have set up the Solr schema.xml and it is working well.
But I don't understand why it is not being picked up by my Express app.
That is, the suggesters and filters I have defined are not applied in the Express (Node) app.
I am using the solr-client package for this purpose.
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">fname</str>
<str name="weightField">price</str>
<str name="suggestAnalyzerFieldType">text</str>
<str name="buildOnStartup">false</str>
</lst>
<lst name="suggester">
<str name="name">altSuggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">lname</str>
<str name="weightField">price</str>
<str name="suggestAnalyzerFieldType">text</str>
<str name="buildOnStartup">false</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler"
startup="lazy" >
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.dictionary">mySuggester</str>
<str name="suggest.dictionary">altSuggester</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
This is my suggester, which I have applied to the fields 'fname' and 'lname'.
My doubt is this: the suggester works pretty well when I query from the Solr admin, but when I run client.search.q(query), the suggesters do not seem to be applied in my Express (Node) app.
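One thing that may be worth checking: in the config above, the suggest component is attached only to the /suggest request handler, so a request sent to a different handler (a plain search call normally goes to the select handler) would not run it. The sketch below, in Python purely for illustration, shows the shape of a request aimed at /suggest. The base URL and the core name mycore are assumptions, as is the helper name suggest_url; suggest.q, suggest.dictionary and suggest.count are the standard suggester parameters.

```python
from urllib.parse import urlencode

# Hypothetical base URL and core name; adjust to the real deployment.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"


def suggest_url(term, dictionaries=("mySuggester", "altSuggester"), count=10):
    """Build a URL for the /suggest handler defined in the config above."""
    params = [
        ("suggest", "true"),
        ("suggest.q", term),          # the text typed so far
        ("suggest.count", str(count)),
        ("wt", "json"),
    ]
    # One suggest.dictionary parameter per suggester to consult.
    params += [("suggest.dictionary", name) for name in dictionaries]
    return f"{SOLR_CORE_URL}/suggest?{urlencode(params)}"


print(suggest_url("joh"))
```

Since buildOnStartup is false in the config, the suggesters may also need to be built once (for example with suggest.build=true) before they return anything.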

How can I get CruiseControl.NET to attach error log in email?

This is a sample of my publisher setting:
<publishers>
<statistics />
<xmllogger logDir="F:\ccnet\Project\xxxxxx\Artifacts\buildlogs" />
<buildpublisher>
<sourceDir>$(buildDir)\_PublishedWebsites\$(projectName)</sourceDir>
<publishDir>$(webDir)</publishDir>
<useLabelSubDirectory>false</useLabelSubDirectory>
<alwaysPublish>false</alwaysPublish>
</buildpublisher>
<email mailport="25"
mailhostUsername="xxx@xx.xx"
mailhostPassword="xxxxxxxxx"
includeDetails="TRUE"
useSSL="FALSE">
<includeDetails>TRUE</includeDetails>
<from>xxxx@xx.xx</from>
<mailhost>xxxx.xxxx.xxx</mailhost>
<users>
<user name="Flemming" group="buildmaster" address="xx@xx.xxu" />
</users>
<groups>
<group name="buildmaster">
<notifications>
<notificationType>Always</notificationType>
</notifications>
</group>
</groups>
</email>
</publishers>
In the web dashboard everything is fine. It shows all the information from the standard XSLT list.
After each build I get an email, but it only shows information from header.xsl, unittests.xsl (which shows no unit tests) and modifications.xsl. It doesn't show anything from compile.xsl.
xslFiles from ccnet.exe.config:
<xslFiles>
<file name="xsl\header.xsl"/>
<file name="xsl\compile.xsl"/>
<file name="xsl\msbuild.xsl"/>
<file name="xsl\modifications.xsl"/>
<!-- <file name="xsl\unittests.xsl"/>
<file name="xsl\fit.xsl"/>
<file name="xsl\fxcop-summary_1_36.xsl"/> -->
</xslFiles>
What am I missing here?
I tried setting includeDetails to TRUE both as an element and as an attribute, but it made no difference.
I found the solution.
Instead of using compile.xsl in the xslFiles list, I now use compile_msbuild.xsl.
Now I get all errors and warnings in the emails.
From CCNET's EmailPublisher documentation:
Make sure that all of the Merge Publishers, along with the Xml Log Publisher task are done before the publisher, or else you won't be able to include output from the build in the email.

JAXB External Custom Binding XJC Issue - Parsing results in empty node

Forgive me if this is a duplicate. Here is my binding.xjb file. I am getting the usual error that the complex type target "AddBankVaultRplyType" is not found. I don't see any issue. Can somebody help me with this? I am also listing the XSD that I am trying to customize.
<jxb:bindings
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns:pd="http://chubb.com/cpi/polsvc/xmlobj"
xmlns:inheritance="http://jaxb2-commons.dev.java.net/basic/inheritance"
jxb:extensionBindingPrefixes="inheritance"
jxb:version="2.1"
>
<jxb:bindings node="/xs:schema/xs:ServiceReply/xs:complexType[@name='AddBankVaultRplyType']">
<inheritance:extends>com.print.poc.AddressTypeHelper</inheritance:extends>
</jxb:bindings>
Here is the piece of the XSD that I am trying to customize:
<xs:schema xmlns:pd="http://com/polsvc/xmlobj" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://com/polsvc/xmlobj" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:complexType name="AddBankVaultRplyType">
</xs:complexType>
<xs:element name="ServiceReply">
<xs:complexType>
<xs:sequence>
<xs:element name="ReplyHeader" type="pd:MsgHeaderType"/>
<xs:element name="RequestHeader" type="pd:MsgHeaderType"/>
<xs:choice>
<xs:element name="AddBankVaultReply" type="pd:AddBankVaultRplyType"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Now if I run XJC, it tells me that the target "/xs:schema/xs:ServiceReply/xs:complexType[@name='AddBankVaultRplyType']" results in an empty node. What mistake am I making here?
You will need to wrap it in a bindings element that has the schemaLocation set. It should be something like:
<jxb:bindings
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns:pd="http://chubb.com/cpi/polsvc/xmlobj"
xmlns:inheritance="http://jaxb2-commons.dev.java.net/basic/inheritance"
jxb:extensionBindingPrefixes="inheritance"
version="2.1">
<jxb:bindings schemaLocation="your-schema.xsd">
<jxb:bindings node="//xs:complexType[@name='AddBankVaultRplyType']">
<inheritance:extends>com.print.poc.AddressTypeHelper</inheritance:extends>
</jxb:bindings>
</jxb:bindings>
</jxb:bindings>
For more information:
http://jaxb.java.net/guide/Dealing_with_errors.html
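To see why the path from the question selects nothing, it can help to evaluate both paths against the schema outside of XJC. The following is a rough sketch using Python's ElementTree, which supports only a subset of XPath, so it approximates what XJC does rather than reproducing it. The XSD is a trimmed copy of the one in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the schema from the question.
XSD = """<xs:schema xmlns:pd="http://com/polsvc/xmlobj"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://com/polsvc/xmlobj">
  <xs:complexType name="AddBankVaultRplyType"/>
  <xs:element name="ServiceReply">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="AddBankVaultReply" type="pd:AddBankVaultRplyType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}
root = ET.fromstring(XSD)  # root is the xs:schema element

# Path from the question, written relative to xs:schema. There is no element
# whose tag is xs:ServiceReply (ServiceReply is the name attribute of an
# xs:element), so nothing matches.
original = root.findall(
    "xs:ServiceReply/xs:complexType[@name='AddBankVaultRplyType']", NS)

# Path from the answer: any named complexType, wherever it sits.
corrected = root.findall(
    ".//xs:complexType[@name='AddBankVaultRplyType']", NS)

print(len(original), len(corrected))  # 0 1
```

The named complexType is a direct child of xs:schema, so /xs:schema/xs:complexType[@name='AddBankVaultRplyType'] would presumably work as well once the schemaLocation wrapper is in place.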
I finally got mine working with subclassing, as well as adding @XmlRootElement to those dang complexTypes that are used by a root element. (I don't get why JAXB doesn't add it for me, but this does the trick.)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jaxb:bindings
xmlns:jaxb="http://java.sun.com/xml/ns/jaxb" xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:annox="http://annox.dev.java.net"
xsi:schemaLocation="http://java.sun.com/xml/ns/jaxb http://java.sun.com/xml/ns/jaxb/bindingschema_2_0.xsd
http://annox.dev.java.net "
jaxb:extensionBindingPrefixes="xjc annox"
version="2.1">
<jaxb:globalBindings>
<jaxb:serializable uid="1"/>
<!-- All generated classes must have MySignature interface (supplied in dependencies) -->
<xjc:superClass name="com.cigna.framework.DataObject"/>
<xjc:superInterface name="com.cigna.framework.InterfaceTest"/>
<!-- All temporal fields are implemented as Joda DateTime and use DateUtils as an adapter -->
<jaxb:javaType
name="org.joda.time.DateTime"
xmlType="xs:time"
parseMethod="com.cigna.framework.util.DateUtil.stringToDateTime"
printMethod="com.cigna.framework.util.DateUtil.dateTimeToString"
/>
</jaxb:globalBindings>
<!-- Application of annotations to selected classes within schemas -->
<!-- org.example.SomeRootType @XmlRootElement -->
<jaxb:bindings schemaLocation="../schemas/externalaction_2012_03.xsd" node="/xs:schema">
<jaxb:schemaBindings >
<jaxb:package name="com.framework.action"></jaxb:package>
</jaxb:schemaBindings>
</jaxb:bindings>
<jaxb:bindings schemaLocation="../schemas/common_2012_04.xsd" node="/xs:schema">
<jaxb:schemaBindings >
<jaxb:package name="com.framework.common"></jaxb:package>
</jaxb:schemaBindings>
<jaxb:bindings node="xs:complexType[@name='PersonNameType']">
<annox:annotate>
<annox:annotate annox:class="javax.xml.bind.annotation.XmlRootElement" name="SomeRootType"/>
</annox:annotate>
</jaxb:bindings>
</jaxb:bindings>
<jaxb:bindings schemaLocation="../schemas/utilities_2012_03.xsd" node="/xs:schema">
<jaxb:schemaBindings >
<jaxb:package name="com.framework.util"></jaxb:package>
</jaxb:schemaBindings>
</jaxb:bindings>
</jaxb:bindings>
Of course I struggled with the pom.xml a lot, but finally came to this solution, which worked for me.
<plugin>
<groupId>org.jvnet.jaxb2.maven2</groupId>
<artifactId>maven-jaxb2-plugin</artifactId>
<version>0.8.1</version>
<executions>
<execution>
<id>process-xsd</id>
<goals>
<goal>generate</goal>
</goals>
<phase>generate-sources</phase>
<configuration>
<schemaIncludes>
<include>schemas/*.xsd</include>
</schemaIncludes>
<bindingIncludes>
<include>schemas/*.xjb.xml</include>
</bindingIncludes>
<generateDirectory>${project.build.directory}/generated-sources</generateDirectory>
<extension>true</extension>
<args>
<arg>-Xannotate</arg>
</args>
<plugins>
<plugin>
<groupId>org.jvnet.jaxb2_commons</groupId>
<artifactId>jaxb2-basics-annotate</artifactId>
<version>0.6.3</version>
</plugin>
<plugin>
<groupId>org.jvnet.jaxb2_commons</groupId>
<artifactId>jaxb2-basics</artifactId>
<version>0.6.3</version>
</plugin>
</plugins>
</configuration>
</execution>
</executions>
</plugin>

Compiling multiple schemas into different packages using JAXB 2.1

I have a CommonTypes.xsd which I'm including in all my other XSDs using xs:include. I get
Multiple <schemaBindings> are defined for the target namespace ""
when I try to compile it into different packages using binding files. Please tell me whether there is a way to compile them into different packages. I'm using JAXB 2.1.
Yeah, there is a way.
Assuming:
xsd/common/common.xsd
xsd/foo/foo.xsd
In the common directory place common.xjb:
<jxb:schemaBindings>
<jxb:package name="mypkg.common">
</jxb:package>
</jxb:schemaBindings>
In the foo directory place foo.xjb:
<jxb:schemaBindings>
<jxb:package name="mypkg.foo">
</jxb:package>
</jxb:schemaBindings>
In the build.xml file, create one xjc task for each:
<xjc destdir="${app.src}" readOnly="true" package="mypkg.common">
<schema dir="${app.schema}/common" includes="*.xsd" />
<binding dir="${app.schema}/common" includes="*.xjb" />
</xjc>
<xjc destdir="${app.src}" readOnly="true" package="mypkg.foo">
<schema dir="${app.schema}/foo" includes="*.xsd" />
<binding dir="${app.schema}/foo" includes="foo.xjb" />
</xjc>
You need to make sure that common.xsd has a targetNamespace that is different from foo.xsd's targetNamespace.
As stated already by Ben there is no way to do that if they have the same namespace.
But how to do it if you do have different namespaces?
<jxb:bindings xmlns:jxb="http://java.sun.com/xml/ns/jaxb" version="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
<jxb:bindings namespace="http://www.openapplications.org/oagis/9/unqualifieddatatypes/1.1" schemaLocation="oagi/Common/UNCEFACT/ATG/CoreComponents/UnqualifiedDataTypes.xsd" >
<jxb:schemaBindings>
<jxb:package name="com.test.oagi.udt"/>
</jxb:schemaBindings>
</jxb:bindings>
<jxb:bindings namespace="http://www.openapplications.org/oagis/9/codelists" schemaLocation="oagi/Common/OAGi/Components/CodeLists.xsd" >
<jxb:schemaBindings>
<jxb:package name="com.test.oagi.cl"/>
</jxb:schemaBindings>
</jxb:bindings>
</jxb:bindings>
But be sure you do not use the command line parameter -p, since that will override your config.
I've met the same problem and haven't solved it yet, but I'm afraid it may not be possible to compile the XSDs into different packages:
It is not legal to have more than one <jaxb:schemaBindings> per namespace, so it is impossible to have two schemas in the same target namespace compiled into different Java packages
(from Compiler Restrictions, at the end of this page)
But if someone finds a workaround, please let us know.
I know it is an old post, but, as there is no answer for the exact question, here is my proposal:
As mmoossen explained, the trick is to specify different namespaces for the XSDs.
But, adding a namespace attribute in the jxb:bindings tag doesn't work:
<jxb:bindings namespace="http://www.openapplications.org/oagis/9/unqualifieddatatypes/1.1" schemaLocation="oagi/Common/UNCEFACT/ATG/CoreComponents/UnqualifiedDataTypes.xsd" >
Instead of that, you need to add a targetNamespace attribute to the xs:schema tags of your XSDs:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified"
targetNamespace="some.namespace"
version="1.0">
Once done, you will be able to have 1 external customization file (.xjb) declaring different schemaBindings, each of them possibly using a different package:
<?xml version="1.0" encoding="UTF-8"?>
<jaxb:bindings version="2.1"
xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/jaxb http://java.sun.com/xml/ns/jaxb/bindingschema_2_0.xsd"
jaxb:extensionBindingPrefixes="xjc annox inherit">
<jaxb:bindings schemaLocation="MyFirstXSD.xsd">
<jaxb:schemaBindings>
<jaxb:package name="com.test.a" />
</jaxb:schemaBindings>
</jaxb:bindings>
<jaxb:bindings schemaLocation="MySecondXSD.xsd">
<jaxb:schemaBindings>
<jaxb:package name="com.test.b" />
</jaxb:schemaBindings>
</jaxb:bindings>
<jaxb:bindings schemaLocation="MyThirdXSD.xsd">
<jaxb:schemaBindings>
<jaxb:package name="com.test.c" />
</jaxb:schemaBindings>
</jaxb:bindings>
</jaxb:bindings>
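Since only one schemaBindings is allowed per target namespace, it can save time to check which XSDs share a namespace before writing the binding file. Here is a small sketch of such a check in Python; the file names match the example above, but their contents and the urn:a / urn:b namespaces are made up for illustration, and namespace_clashes is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical schema contents, keyed by file name; in practice these would
# be read from disk.
XS = 'xmlns:xs="http://www.w3.org/2001/XMLSchema"'
SCHEMAS = {
    "MyFirstXSD.xsd": f'<xs:schema {XS} targetNamespace="urn:a"/>',
    "MySecondXSD.xsd": f'<xs:schema {XS} targetNamespace="urn:b"/>',
    "MyThirdXSD.xsd": f'<xs:schema {XS} targetNamespace="urn:a"/>',
}


def namespace_clashes(schemas):
    """Map each targetNamespace used by more than one schema to its files.

    A schema without a targetNamespace is grouped under the empty string,
    which is the namespace named in the error message from the question.
    """
    by_namespace = defaultdict(list)
    for filename, text in schemas.items():
        namespace = ET.fromstring(text).get("targetNamespace", "")
        by_namespace[namespace].append(filename)
    return {ns: files for ns, files in by_namespace.items() if len(files) > 1}


print(namespace_clashes(SCHEMAS))
```

Any namespace reported here would need either a single shared package or a different targetNamespace per schema before separate schemaBindings can apply.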
This can be done as described on the jaxb2-maven-plugin usage page, for the case of multiple schemas with different configurations.
Separate packages can be configured for each schema.
<packageName>se.west</packageName>
Complete example configuration below:
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>jaxb2-maven-plugin</artifactId>
<version>${project.version}</version>
<executions>
<execution>
<id>xjc-schema1</id>
<goals>
<goal>xjc</goal>
</goals>
<configuration>
<!-- Use all XSDs under the west directory for sources here. -->
<sources>
<source>src/main/xsds/west</source>
</sources>
<!-- Package name of the generated sources. -->
<packageName>se.west</packageName>
</configuration>
</execution>
<execution>
<id>xjc-schema2</id>
<goals>
<goal>xjc</goal>
</goals>
<configuration>
<!-- Use all XSDs under the east directory for sources here. -->
<sources>
<source>src/main/xsds/east</source>
</sources>
<!-- Package name of the generated sources. -->
<packageName>se.east</packageName>
<!--
Don't clear the output directory before generating the sources.
Clearing the output directory removes the se.west schema from above.
-->
<clearOutputDir>false</clearOutputDir>
</configuration>
</execution>
</executions>
</plugin>
