I've been trying various things, to no avail. My configuration of Nutch/Solr is based on this:
http://ubuntuforums.org/showthread.php?t=1532230
Now that I have Nutch and Solr up and running, I would like to use Solr to index the crawl data. Nutch successfully crawls the domain I specified but fails when I run the command to communicate that data to Solr. Here's the command:
bin/nutch solrindex http://solr:8181/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
Here's the output:
Indexer: starting at 2013-09-12 10:34:43
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance (mandatory)
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/usr/share/apache-nutch-1.7/crawl/linkdb/crawl_fetch
Input path does not exist: file:/usr/share/apache-nutch-1.7/crawl/linkdb/crawl_parse
Input path does not exist: file:/usr/share/apache-nutch-1.7/crawl/linkdb/parse_data
Input path does not exist: file:/usr/share/apache-nutch-1.7/crawl/linkdb/parse_text
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:197)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:40)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1353)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)
I've also tried another command, after much Googling:
bin/nutch solrindex http://solr:8181/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*
With this output:
Indexer: starting at 2013-09-12 10:45:51
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance (mandatory)
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication
Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)
Does anyone have any ideas of how to overcome these errors?
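For what it's worth, the difference between the two commands above is the -linkdb flag. If I read the Nutch 1.x indexer usage correctly (treat the exact option list as approximate), it is:
bin/nutch solrindex <solr url> <crawldb> [-linkdb <linkdb>] (<segment> ... | -dir <segments>)
Without -linkdb, crawl/linkdb is parsed as a segment, which matches the "Input path does not exist: .../crawl/linkdb/crawl_fetch" errors from the first run. The second run got past argument parsing, so its generic "Job failed!" most likely has a Solr-side cause; logs/hadoop.log should show the underlying exception.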
I was getting the same error on a fresh Solr 5.2.1 and Nutch 1.10:
2015-07-30 20:56:23,015 WARN mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found
Not Found
request: http://127.0.0.1:8983/solr/update?wt=javabin&version=2
So I created a collection (or core; I am not a Solr expert):
bin/solr create -c demo
And changed the URL in the Nutch indexing command:
bin/nutch solrindex http://127.0.0.1:8983/solr/demo crawl/crawldb -linkdb crawl/linkdb crawl/segments/*
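Before re-running the indexer, it may be worth confirming that the new core actually answers; a quick check, assuming the default port and the core name created above:
curl "http://127.0.0.1:8983/solr/demo/select?q=*:*&wt=json"
A JSON response instead of a 404 means the earlier "Not Found" should be gone.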
I know the question is rather old, but maybe this will help somebody...
Did you check the Solr log for the underlying error? I once had the same problem with Nutch, and Solr's log showed the message "unknown field 'host'". After I modified Solr's schema.xml, the problem vanished.
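As a rough sketch of that kind of change (the field type and flags here are assumptions; adjust them to your schema), the missing field can be declared in Solr's schema.xml like this:
<!-- field expected by Nutch's solrindex-mapping.xml; type and flags assumed -->
<field name="host" type="string" stored="false" indexed="true"/>
Solr needs a restart (or a core reload) after editing schema.xml for the change to take effect.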
Related
I'm trying to connect Spark to my Elasticsearch cluster with SSL.
Setup
Spark 2.4.0 from CDH 6.3.2 (Cloudera)
ElasticSearch 7.6.1 (Open Distro)
elasticsearch-hadoop-7.6.1.jar
Considering
1) I already managed to authenticate Logstash with SSL and a manually created PKCS12 keystore
2) Connecting Spark to ES works without security
Here is the Spark conf provided:
spark.es.nodes=node1
spark.es.port=9200
spark.es.net.ssl=true
spark.es.net.ssl.keystore.location= ===> see below for what I tried
spark.es.net.ssl.keystore.type=PKCS12
spark.es.net.ssl.cert.allow.self.signed=true
spark.es.net.http.auth.user=admin
spark.es.net.http.auth.pass=admin
spark.es.nodes.wan.only=false //tried true
Doing
spark.read.format("org.elasticsearch.spark.sql")
.option("es.query", "?q=*:*")
.load("spark/docs")
.show
====================================================
FileSystem values tried with spark.es.net.ssl.keystore.location (after copying admin.pkcs12 to all nodes)
file:///PATH/certs/admin.pkcs12
Error :
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
... elided
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cannot initialize SSL - Get Key failed: null
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:175)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.getSSLContext(SSLSocketFactory.java:160)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSocket(SSLSocketFactory.java:129)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.doExecute(CommonsHttpTransport.java:685)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:664)
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:116)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:432)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:388)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:392)
at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:168)
at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:745)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:330)
... 61 more
Caused by: java.security.UnrecoverableKeyException: Get Key failed: null
at sun.security.pkcs12.PKCS12KeyStore.engineGetKey(PKCS12KeyStore.java:435)
at java.security.KeyStore.getKey(KeyStore.java:1023)
at sun.security.ssl.SunX509KeyManagerImpl.<init>(SunX509KeyManagerImpl.java:133)
at sun.security.ssl.KeyManagerFactoryImpl$SunX509.engineInit(KeyManagerFactoryImpl.java:70)
at javax.net.ssl.KeyManagerFactory.init(KeyManagerFactory.java:256)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.loadKeyManagers(SSLSocketFactory.java:217)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:173)
... 78 more
Caused by: java.lang.NullPointerException
at sun.security.pkcs12.PKCS12KeyStore.engineGetKey(PKCS12KeyStore.java:374)
... 84 more
====================================================
I copied a valid keystore, admin.pkcs12, to HDFS => /user/company/ with 777 permissions (as I write this: is that too permissive, as with SSH keys?)
//returns true
FileSystem.get(spark.sparkContext.hadoopConfiguration).exists(new Path("hdfs://namenode:8020/user/company/admin.pkcs12"))
HDFS Values tried with spark.es.net.ssl.keystore.location
hdfs:///namenode:8020/user/company/admin.pkcs12
hdfs://namenode:8020/user/company/admin.pkcs12
/user/company/admin.pkcs12
Error :
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
... elided
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cannot initialize SSL - Expected to find keystore file at [...] but was unable to. Make sure that it is available on the classpath, or if not, that you have specified a valid URI.
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:175)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.getSSLContext(SSLSocketFactory.java:160)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSocket(SSLSocketFactory.java:129)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.doExecute(CommonsHttpTransport.java:685)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:664)
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:116)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:432)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:388)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:392)
at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:168)
at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:745)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:330)
... 61 more
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Expected to find keystore file at [...] but was unable to. Make sure that it is available on the classpath, or if not, that you have specified a valid URI.
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.loadKeyStore(SSLSocketFactory.java:195)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.loadKeyManagers(SSLSocketFactory.java:215)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:173)
I tried JKS too.
What am I missing?
//Works
file:///PATH/certs/admin.pkcs12
I was getting this error because of the missing keystore password:
spark.es.net.ssl.keystore.pass=PASSWORD
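If the password itself is in doubt, it can be verified against the keystore directly with keytool, which ships with the JDK (the path below reuses the placeholder from the question):
# prompts for the store password; lists the entries only if it is correct
keytool -list -storetype PKCS12 -keystore /PATH/certs/admin.pkcs12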
I just started using Nutch on Windows 10 with Cygwin. When I run the command "bin/nutch inject crawl/crawldb urls" from Cygwin, I get the error below:
$ bin/nutch inject crawl/crawldb urls
Injector: starting at 2020-04-22 01:05:45
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injecting seed URL file file:/F:/Academic/Spring20/IR/apache-nutch-1.16-bin/apache-nutch-1.16/urls/seed.txt
Injector job did not succeed, job status: FAILED, reason: NA
Injector: java.lang.RuntimeException: Injector job did not succeed, job status: FAILED, reason: NA
at org.apache.nutch.crawl.Injector.inject(Injector.java:443)
at org.apache.nutch.crawl.Injector.run(Injector.java:569)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.nutch.crawl.Injector.main(Injector.java:534)
When I checked logs/hadoop.log, I found the error messages below:
2020-04-22 01:05:47,806 ERROR output.FileOutputCommitter - Mkdirs failed to create file:/F:/Academic/Spring20/IR/apache-nutch-1.16-bin/apache-nutch-1.16/crawl/crawldb/crawldb-1609992588/_temporary/0
2020-04-22 01:05:48,378 INFO regex.RegexURLNormalizer - can't find rules for scope 'inject', using default
2020-04-22 01:05:48,589 WARN mapred.LocalJobRunner - job_local456428503_0001
java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:491)
Can someone please help how to resolve this?
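A common cause of "Mkdirs failed" and local shuffle errors when running Hadoop-based tools on Windows (and this is only a guess for this particular case) is that the Hadoop native Windows binaries, winutils.exe and hadoop.dll, are missing. A sketch of the usual workaround, with an assumed install location:
# assumes winutils.exe and hadoop.dll were placed in C:\hadoop\bin
export HADOOP_HOME=/cygdrive/c/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
If that is not the cause, logs/hadoop.log usually carries a more specific error further up.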
I am using Nutch 2.x with Cassandra as storage. Currently I am crawling only one website, and the data is loaded into Cassandra in byte-code format.
When I use Nutch's readdb command, I don't get any useful crawl data.
Below are the details of different files and output I am getting:
========== command to run crawler =====================
bin/crawl urls/ crawlDir/ http://localhost:8983/solr/ 3
======================== seed.txt data ==========================
http://www.ft.com
=== Output of readdb command to read data from cassandra webpage.f table======
~/Documents/Softwares/apache-nutch-2.3/runtime/local$ bin/nutch readdb -dump data -content
~/Documents/Softwares/apache-nutch-2.3/runtime/local/data$ cat part-r-00000
http://www.ft.com/ key: com.ft.www:http/
baseUrl: null
status: 4 (status_redir_temp)
fetchTime: 1426888912463
prevFetchTime: 1424296904936
fetchInterval: 2592000
retriesSinceFetch: 0
modifiedTime: 0
prevModifiedTime: 0
protocolStatus: (null)
parseStatus: (null)
title: null
score: 1.0
marker _injmrk_ : y
marker dist : 0
reprUrl: null
batchId: 1424296906-20007
metadata _csh_ :
===============content of regex-urlfilter.txt ======================
# skip file: ftp: and mailto: urls
-^(file|ftp|mailto):
# skip image and other suffixes we can't yet parse
# for a more extensive coverage use the urlfilter-suffix plugin
-\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|ppt|PPT|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV|exe|EXE|jpeg|JPEG|bmp|BMP|js|JS)$
# skip URLs containing certain characters as probable queries, etc.
-[?*!#=]
# skip URLs with slash-delimited segment that repeats 3+ times, to break loops
-.*(/[^/]+)/[^/]+\1/[^/]+\1/
# accept anything else
+.
===========content of log file which is bothering me ======================
2015-02-18 13:57:51,253 ERROR store.CassandraStore -
2015-02-18 13:57:51,253 ERROR store.CassandraStore - [Ljava.lang.StackTraceElement;@653e3e90
2015-02-18 14:01:45,537 INFO connection.CassandraHostRetryService - Downed Host Retry service started with queue size -1 and retry delay 10s
Please let me know if you need more information.
Can someone please help me?
Thanks in advance.
-Sumant
I just started using Nutch and Cassandra today. I am not receiving the same errors in my log file during a crawl.
Did you double check your nutch-site.xml and gora.properties settings? This is how I currently have my files configured.
nutch-site.xml
<configuration>
<property>
<name>http.agent.name</name>
<value>My Spider</value>
</property>
<property>
<name>storage.data.store.class</name>
<value>org.apache.gora.cassandra.store.CassandraStore</value>
<description>Default class for storing data</description>
</property>
</configuration>
gora.properties
#############################
# CassandraStore properties #
#############################
gora.datastore.default=org.apache.gora.cassandra.store.CassandraStore
gora.cassandrastore.servers=localhost:9160
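As a quick sanity check that Cassandra is actually reachable on the Thrift port configured above, something like the following can be run on the Cassandra host (both commands assume a default install):
# shows whether the node is up and accepting clients
nodetool status
# confirms something is listening on the port from gora.properties
netstat -an | grep 9160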
I am trying to run the Nutch crawler on my local machine and want to index the retrieved data using Solr.
I am using apache-nutch-1.9 and solr-4.10.1.
As of now they are installed and seem to run fine for depth = 1.
I get the following error when depth = 2:
bin/crawl urls/ crawl http://localhost:8983/solr 2
.....
Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
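The "Job failed!" message from JobClient is generic; in local mode the indexer writes the underlying exception to logs/hadoop.log under the Nutch runtime directory. A first step, assuming that is where the crawl was started:
tail -n 50 logs/hadoop.log
The real cause there is often a Solr-side error such as an unknown field, as in the schema.xml answer earlier in this thread.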
I have had exactly the same issue for 5 days: impossible to import users and groups from OpenLDAP into Liferay CE 6.1...
Here is the structure of my OpenLDAP:
World
--tn
----com
------domain
--------admin
--------Computers
--------domain
--------Groups
----------IT
------------cn
------------gidNumber
------------objectClass
------------memberUid (n member)
--------Users
----------uid0001
------------cn
------------gidNumber
------------homeDirectory
------------objectClass
------------sn
------------uid
------------uidNumber
------------givenName
------------loginShell
------------userPassword
And here is my portal-ext.properties file:
## LDAP Server Settings
ldap.base.provider.url=ldap://controller.domain.com.tn:389
ldap.base.dn=DC=domain,DC=com,DC=tn
# authentication
ldap.security.principal=cn=admin,dc=domain,dc=com,dc=tn
ldap.security.credentials=My Password !
# search from this point in the tree
ldap.users.dn=DC=domain,DC=com,DC=tn
# You can write your own class that implements
# com.liferay.portal.security.ldap.AttributesTransformer to transform the
# LDAP attributes before a user or group is imported to the LDAP store.
ldap.attrs.transformer.impl=com.liferay.portal.security.ldap.DefaultAttributesTransformer
# standard mappings, must be present in LDAP or we get an exception
#ldap.user.mappings=screenName=cn\npassword=userPassword\nemailAddress=\nfirstName=givenName\nlastName=sn\njobTitle=\ngroup=
ldap.user.mappings=screenName=cn\npassword=userPassword\nfirstName=givenName\nlastName=sn\njobTitle=title\ngroup=groupMembership\nemailAddress=uid
ldap.auth.search.filter=(mail=#user_id#)
ldap.import.user.search.filter=(objectClass=inetOrgPerson)
## Import, users can be imported on demand at login or at startup and at regular intervals.
ldap.import.enabled=true
ldap.import.interval=360
ldap.import.on.startup=true
ldap.export.enabled=false
ldap.user.default.object.classes=inetOrgPerson,organizationalPerson
## Custom Mappings, same format as ldap.user.mappings
##Commented by ME : ldap.user.custom.mappings=nickname=mailNickname\ndisplay=cn
## Added from this link : http://www.liferay.com/community/forums/-/message_boards/message/5681334
users.screen.name.validator=com.liferay.portal.security.auth.LiberalScreenNameValidator
users.screen.name.allow.numeric=true
##added from this link : http://issues.liferay.com/browse/LPS-14519
users.screen.name.always.autogenerate=true
##added from this link : http://vkbardia.blogspot.com/2012/05/liferay-authentication-fails-for-screen.html?showComment=1345199625453#c3592789922325172023
users.email.address.required= false
#Groups
ldap.group.mappings=groupName=cn\ndescription=description\nuser=memberUid
ldap.import.create.role.per.group=false
PS: I don't have an email field for my users; that's why I want them to log in with their UID.
PS: When running in Eclipse I get this:
13:35:47,450 ERROR [PortalLDAPImporterImpl:196] Error importing LDAP users and groups
java.lang.NullPointerException
at com.liferay.portal.kernel.io.unsync.UnsyncStringReader.<init>(UnsyncStringReader.java:33)
at com.liferay.portal.kernel.util.PropertiesUtil.load(PropertiesUtil.java:199)
at com.liferay.portal.kernel.util.PropertiesUtil.load(PropertiesUtil.java:192)
at com.liferay.portal.security.ldap.LDAPSettingsUtil.getUserExpandoMappings(LDAPSettingsUtil.java:124)
at com.liferay.portal.security.ldap.PortalLDAPImporterImpl.importFromLDAP(PortalLDAPImporterImpl.java:169)
at com.liferay.portal.security.ldap.PortalLDAPImporterImpl.importFromLDAP(PortalLDAPImporterImpl.java:128)
at com.liferay.portal.security.ldap.PortalLDAPImporterUtil.importFromLDAP(PortalLDAPImporterUtil.java:34)
at com.liferay.portal.util.PortalInstances._initCompany(PortalInstances.java:448)
at com.liferay.portal.util.PortalInstances.initCompany(PortalInstances.java:92)
at com.liferay.portal.servlet.MainServlet.initCompanies(MainServlet.java:766)
at com.liferay.portal.servlet.MainServlet.init(MainServlet.java:336)
at javax.servlet.GenericServlet.init(GenericServlet.java:160)
at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1266)
at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1185)
at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1080)
at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:5001)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5289)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:866)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:842)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:615)
at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:649)
at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1581)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
PS: I'm writing all this in detail because it is so important to me.
Waiting for your help.
Best regards
I hope this might help you:
http://www.liferay.com/community/forums/-/message_boards/message/8246721
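Reading the stack trace, the NullPointerException is thrown in LDAPSettingsUtil.getUserExpandoMappings when PropertiesUtil.load is handed a null string, which points at the user custom mappings property that is commented out in the portal-ext.properties above. Purely as a guess from that trace, not a verified fix, defining the property explicitly, even as empty, may get past the NPE:
## hedged guess: give the importer a non-null value for the custom mappings
ldap.user.custom.mappings=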