I tried to create the folder in file system of HDFS by means of a command
./hadoop fs -mkdir /user/hadoop
and as a result I received the following messages
13/02/17 09:45:50 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 0 time(s).
13/02/17 09:45:51 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 1 time(s).
13/02/17 09:45:52 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 2 time(s).
13/02/17 09:45:53 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 3 time(s).
13/02/17 09:45:54 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 4 time(s).
13/02/17 09:45:55 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 5 time(s).
13/02/17 09:45:56 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 6 time(s).
13/02/17 09:45:57 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 7 time(s).
13/02/17 09:45:58 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 8 time(s).
13/02/17 09:45:59 INFO ipc.Client: Retrying connect to server: one/192.168.1.8:9000. Already tried 9 time(s).
Bad connection to FS. command aborted. exception: Call to one/192.168.1.8:9000 failed on connection exception: java.net.ConnectException: Connection refused
In the /export/hadoop-1.0.1/conf/core-site.xml file the following is specified:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.1.8:9000</value>
</property>
</configuration>
Wanted to specify - whether correct I specified port? If isn't present, tell what it is necessary?
seems like hadoop daemons are not running. check with jps
Related
Am trying to bring Jfrog up, in local tomcat is running and artifactory service also looking fine. But in UI jfrog is not coming up.
Getting 502 Bad Gateway error. I have shared the console log details below.
Below is the console log
[TRACE] [Service registry ping] operation attempt #94 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] running retry attempt #95
[INFO ] Cluster join: Retry 95: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] operation attempt #95 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
2022-09-10T06:14:20.271Z [jffe ] [INFO ] [ ] [ ] [main ] - pinging artifactory, attempt number 90
2022-09-10T06:14:20.274Z [jffe ] [INFO ] [ ] [ ] [main ] - pinging artifactory attempt number 90 failed with code : ECONNREFUSED
[TRACE] [Service registry ping] running retry attempt #96
[DEBUG] Cluster join: Retry 96: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] operation attempt #96 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
2022-09-10T06:14:21.188Z [jfrou] [INFO ] [2b4bfed554e45cf6] [join_executor.go:169 ] [main ] [] - Cluster join: Retry 100: Service registry ping failed, will retry. Error: could not parse error from service registry, status code: 404, raw body: <!doctype html><html lang="en"><head><title>HTTP Status 404 – Not Found</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 404 – Not Found</h1></body></html>
[TRACE] [Service registry ping] running retry attempt #97
[DEBUG] Cluster join: Retry 97: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] operation attempt #97 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
2022-09-10T06:14:22.016Z [jfmd ] [INFO ] [ ] [accessclient.go:60 ] [main ] - Cluster join: Retry 100: Service registry ping failed, will retry. Error: Error while trying to connect to local router at address 'http://localhost:8046/access': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused [access_client]
[TRACE] [Service registry ping] running retry attempt #98
[DEBUG] Cluster join: Retry 98: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
and this is the error am getting in UI.
502 Bad Gateway error
Is it a newly installed Artifactory instance? If yes, we need to verify whether the required ports are in place (open at the firewall level). If the ports are already available, disable the IPv6 address from the VM where Artifactory is installed and restart the Artifactory. There are chances of this error occurrence if the application is trying to pick up the IPv6 address for initialisation instead of Ipv4.
Im trying to run a simple query on a column with only 10 rows:
select MAX(Column3) from table;
However the spark application runs infinitely with the following message:
> 2017-05-10T16:23:40,397 DEBUG [IPC Parameter Sending Thread #0]
> ipc.Client: IPC Client (1360312263) connection to /0.0.0.0:8032 from
> ubuntu sending #1841 2017-05-10T16:23:40,397 DEBUG [IPC Client
> (1360312263) connection to /0.0.0.0:8032 from ubuntu] ipc.Client: IPC
> Client (1360312263) connection to /0.0.0.0:8032 from ubuntu got value
> #1841 2017-05-10T16:23:40,397 DEBUG [main] ipc.ProtobufRpcEngine: Call: getApplicationReport took 0ms 2017-05-10T16:23:41,397 DEBUG
> [main] security.UserGroupInformation: PrivilegedAction as:ubuntu
> (auth:SIMPLE)
> from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
> 2017-05-10T16:23:41,398 DEBUG [IPC Parameter Sending Thread #0]
> ipc.Client: IPC Client (1360312263) connection to /0.0.0.0:8032 from
> ubuntu sending #1842 2017-05-10T16:23:41,398 DEBUG [IPC Client
> (1360312263) connection to /0.0.0.0:8032 from ubuntu] ipc.Client: IPC
> Client (1360312263) connection to /0.0.0.0:8032 from ubuntu got value
> #1842 2017-05-10T16:23:41,398 DEBUG [main] ipc.ProtobufRpcEngine: Call: getApplicationReport took 1ms 2017-05-10T16:23:41,399 DEBUG
> [main] security.UserGroupInformation: PrivilegedAction as:ubuntu
> (auth:SIMPLE)
> from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
> 2017-05-10T16:23:41,399 DEBUG [IPC Parameter Sending Thread #0]
> ipc.Client: IPC Client (1360312263) connection to /0.0.0.0:8032 from
> ubuntu sending #1843 2017-05-10T16:23:41,399 DEBUG [IPC Client
> (1360312263) connection to /0.0.0.0:8032 from ubuntu] ipc.Client: IPC
> Client (1360312263) connection to /0.0.0.0:8032 from ubuntu got value
> #1843 2017-05-10T16:23:41,399 DEBUG [main] ipc.ProtobufRpcEngine: Call: getApplicationReport took 0ms
The issue was related to an unhealthy node, therefore it was not able to assign the task. The solution was to increase the yarn maximum disk utilization percentage in yarn-site.xml because my disk was at 97% used :
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>99</value>
</property>
we are testing hadoop HA cluster on Azure v2 cloud running on linux. We are trying to switch to Azure BLOB storage. We are not sure how we should configure name nodes using Blob storage. We are getting following error:
2015-12-22 13:05:50,193 INFO ha.StandbyCheckpointer (StandbyCheckpointer.java:start(129)) - Starting standby checkpoint thread...
Checkpointing active NN at http://bd-azure-qa-nn2:50070
Serving checkpoints at http://bd-azure-qa-nn1:50070
2015-12-22 13:07:50,240 INFO ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(269)) - Triggering log roll on remote NameNode bd-azure-qa-nn2/10.0.0.7:8020
2015-12-22 13:07:51,387 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:52,391 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:53,400 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:54,416 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:55,425 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:56,450 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:57,456 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:58,462 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:59,473 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:00,478 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:01,482 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:02,490 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:03,501 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:04,515 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,520 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,966 WARN ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(274)) - Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1719)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1352)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6339)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:933)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:139)
at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:11214)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy14.rollEditLog(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:145)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
We are not pretty sure how to configure name nodes with Azure Blob since Blob effectively handover the HDFS functionality.
We have standard HA name node configuration in hdfs-site.xml and modified core-site.xml with the following properties to enable BLOB storage.
<property>
<name>fs.AbstractFileSystem.wasb.impl</name>
<value>org.apache.hadoop.fs.azure.Wasb</value>
</property>
<property>
<name>fs.azure.account.key.OUR_STORAGE_ACCOUNT.blob.core.windows.net</name>
<value>"OUR_KEY"</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>wasb://blob-hdfs#OUR_STORAGE_ACCOUNT.blob.core.windows.net</value>
<final>true</final>
</property>
<property>
<name>fs.azure.page.blob.dir</name>
<value>/datadir</value>
</property>
<property>
<name>fs.azure.selfthrottling.read.factor</name>
<value>1.000000</value>
</property>
<property>
<name>fs.azure.selfthrottling.write.factor</name>
<value>1.000000</value>
<property>
Our HA name in the original cluster were:
<!-- <property>
<name>fs.defaultFS</name>
<value>hdfs://namenodeha</value>
</property> -->
hdfs-site.xml we didn't touch at all.
We are not sure about name node settings. Two name node from original settings are probably overkill since underlying BLOB should handle all replication etc.
Could someone please clarify?
Following properties we were missing in hdfs-site.xml in oreder to make it work.
<property>
<name>dfs.datanode.https.address</name>
<value>0.0.0.0:50475</value>
</property>
<property>
<name>dfs.namenode.https-address.namenodeha.nn1</name>
<value>bd-azure-qa-nn1:50470</value>
</property>
<property>
<name>dfs.namenode.https-address.namenodeha.nn2</name>
<value>bd-azure-qa-nn2:50470</value>
</property>
<property>
<name>dfs.journalnode.https-address</name>
<value>0.0.0.0:8481</value> :
</property>
For sure with appropriate host names.
I am trying to configure a two node cluster with cassandra in windows r2 2008 So i installed cassandra community version in one server (10.xxx.0.1,10.xxx.0.2) And then I stopped the service and then edited the configuraton.yaml file in the conf folder.
The changes are:
cluster_name
commented the num_tokens
gave the tokens in initial_token,
seeds as 10.xxx.0.1,10.xxx.0.2,
listen_addresses are their respective ip addresses which are 10.xxx.0.1,10.xxx.0.2,
rpc_addresses are same as listen_address,
endpointsnitch as gossip
I also changed the cassandra rackdc.properties file to dc=DC1 rack=RAC1.
and changed the cassandra-env file - uncommented: JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=10.xxx.0.1
I then saved and started back the service and ran nodetool status and got
Datacenter: DC1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns (effective) Host ID
Token Rack
UN 10.104.0.15 46.65 KB 100.0% bc41a884-baaf-4a52-85f3-f3270c2ec9
57 -9223372036854775808 RAC1
and opened the cqlsh, but it is not connecting. Below is the error:
ERROR [Initialization] 2015-10-12 17:10:21,353 Error connecting via JMX: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused: connect]
INFO [main] 2015-10-12 17:10:24,612 Reconnecting to a backup OpsCenter instance
INFO [main] 2015-10-12 17:10:24,615 SSL communication is disabled
INFO [main] 2015-10-12 17:10:24,615 Creating stomp connection to 127.0.0.1:61620
INFO [main] 2015-10-12 17:10:24,628 Starting Jetty server: {:join? false, :ssl? false, :host nil, :port 61621}
INFO [Initialization] 2015-10-12 17:10:24,632 Sleeping for 2s before trying to determine IP over JMX again
INFO [StompConnection receiver] 2015-10-12 17:10:24,640 Reconnecting in 0s.
INFO [main] 2015-10-12 18:25:56,347 Waiting for the config from OpsCenter
INFO [main] 2015-10-12 18:25:56,349 Attempting to determine Cassandra's broadcast address through JMX
INFO [main] 2015-10-12 18:25:56,350 Starting Stomp
INFO [main] 2015-10-12 18:25:56,350 Starting up agent communcation with OpsCenter.
INFO [Initialization] 2015-10-12 18:25:56,356 New JMX connection (127.0.0.1:7199)
ERROR [Initialization] 2015-10-12 18:25:57,652 Error connecting via JMX: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused: connect]
INFO [main] 2015-10-12 18:26:00,768 Reconnecting to a backup OpsCenter instance
INFO [main] 2015-10-12 18:26:00,770 SSL communication is disabled
INFO [main] 2015-10-12 18:26:00,770 Creating stomp connection to 127.0.0.1:61620
INFO [main] 2015-10-12 18:26:00,779 Starting Jetty server: {:join? false, :ssl? false, :host nil, :port 61621}
INFO [Initialization] 2015-10-12 18:26:00,782 Sleeping for 2s before trying to determine IP over JMX again
INFO [StompConnection receiver] 2015-10-12 18:26:00,945 Reconnecting in 0s.
INFO [StompConnection receiver] 2015-10-12 18:26:01,136 Connected to 127.0.0.1:61620
I have installed hadoop 2.6.0 multi-node cluster in EC2 instance(ubuntu 14.04 64 bit). All demons(NameNode,SecondaryNameNode,ResourceManager) in master is up,but in slave machine only DataNode is up NodeManager is shutting down due to connection refuse.
Kindly help me in this regard. Thanks in advance
The log file of my NodeManager is below:
2015-09-08 07:59:36,606 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: NodeManager configured with 8 G physical memory allocated to containers, which is more than 80% of the total physical memory available (992.5 M). Thrashing might happen.
2015-09-08 07:59:36,613 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Initialized nodemanager for null: physical-memory=8192 virtual-memory=17204 virtual-cores=8
2015-09-08 07:59:36,646 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-09-08 07:59:36,666 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 53949
2015-09-08 07:59:36,688 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ContainerManagementProtocolPB to the server
2015-09-08 07:59:36,688 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Blocking new container-requests as container manager rpc server is still starting.
2015-09-08 07:59:36,691 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-09-08 07:59:36,692 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 53949: starting
2015-09-08 07:59:36,707 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Updating node address : ec2-52-88-167-9.us-west-2.compute.amazonaws.com:53949
2015-09-08 07:59:36,713 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-09-08 07:59:36,713 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8040
2015-09-08 07:59:36,716 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB to the server
2015-09-08 07:59:36,717 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-09-08 07:59:36,717 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8040: starting
2015-09-08 07:59:36,717 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer started on port 8040
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager started at ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154:53949
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager bound to 0.0.0.0/0.0.0.0:0
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer: Instantiating NMWebApp at 0.0.0.0:8042
2015-09-08 07:59:36,790 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-09-08 07:59:36,793 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.nodemanager is not defined
2015-09-08 07:59:36,805 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-09-08 07:59:36,806 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context node
2015-09-08 07:59:36,806 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-09-08 07:59:36,807 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-09-08 07:59:36,812 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /node/*
2015-09-08 07:59:36,812 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2015-09-08 07:59:36,820 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 8042
2015-09-08 07:59:36,820 INFO org.mortbay.log: jetty-6.1.26
2015-09-08 07:59:36,863 INFO org.mortbay.log: Extract jar:file:/home/ubuntu/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-common-2.6.0.jar!/webapps/node to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
2015-09-08 07:59:37,358 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:8042
2015-09-08 07:59:37,359 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /node started at 8042
2015-09-08 07:59:37,879 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2015-09-08 07:59:37,885 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8031
2015-09-08 07:59:37,913 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
2015-09-08 07:59:37,917 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
**2015-09-08 07:59:38,951 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:39,956 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:40,957 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:41,957 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:42,958 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)**
2015-09-08 08:19:48,256 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED; cause: **org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused**
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:197)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:264)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:463)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy27.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy28.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:191)
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 18 more
2015-09-08 08:19:48,257 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:197)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:264)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:463)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy27.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy28.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:191)
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 18 more
2015-09-08 08:19:48,263 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:8042
2015-09-08 08:19:48,264 INFO org.apache.hadoop.ipc.Server: Stopping server on 53949
2015-09-08 08:19:48,266 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 53949
2015-09-08 08:19:48,267 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-09-08 08:19:48,267 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2015-09-08 08:19:48,267 INFO org.apache.hadoop.ipc.Server: Stopping server on 8040
2015-09-08 08:19:48,268 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8040
2015-09-08 08:19:48,268 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-09-08 08:19:48,269 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2015-09-08 08:19:48,269 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2015-09-08 08:19:48,270 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2015-09-08 08:19:48,270 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2015-09-08 08:19:48,270 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://ec2-52-26-161-203.us-west-2.compute.amazonaws.com:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ubuntu/hdfstmp</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://ec2-52-26-161-203.us-west-2.compute.amazonaws.com:8021</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
Master machine:
ubuntu#ec2-52-26-161-203:~$ vim /etc/hosts
172.31.23.167 ec2-52-26-161-203.us-west-2.compute.amazonaws.com
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
ubuntu#ec2-52-26-161-203:~$ vim /etc/hadoop/masters
ec2-52-26-161-203.us-west-2.compute.amazonaws.com
ubuntu#ec2-52-26-161-203:~$ vim /etc/hadoop/slaves
ec2-52-88-167-9.us-west-2.compute.amazonaws.com
Slave Machine :
ubuntu#ec2-52-88-167-9:~ vim /etc/hosts
172.31.29.154 ec2-52-88-167-9.us-west-2.compute.amazonaws.com
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
ubuntu#ec2-52-88-167-9:~ vim /etc/hadoop/slaves
ec2-52-88-167-9.us-west-2.compute.amazonaws.com
ubuntu#ec2-52-26-161-203:~$ sudo netstat -lpten | grep java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 1000 569904 19910/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 1000 570916 20136/java
tcp 0 0 172.31.23.167:8020 0.0.0.0:* LISTEN 1000 569911 19910/java
tcp6 0 0 :::8088 :::* LISTEN 1000 571699 20278/java
tcp6 0 0 :::8030 :::* LISTEN 1000 571690 20278/java
tcp6 0 0 :::8031 :::* LISTEN 1000 571683 20278/java
tcp6 0 0 :::8032 :::* LISTEN 1000 571695 20278/java
tcp6 0 0 :::8033 :::* LISTEN 1000 571702 20278/java
Telnet command:
ubuntu#ec2-52-26-161-203:~$ telnet localhost 8031
Trying ::1...
Connected to localhost.
Escape character is '^]'.
How is it taking 8031 port for resource manager ? I haven't giving in my hadoop configuration files(coresite.xml,mapred-site.xml,hdfs-site.xml) which is above.
I have made modifications in mapred-site.xml and yarn-site.xml which solved my issue.Since I haven't mentioned the host name property value for resource manager in yarn-site.xml it was trying to connect with the address 0.0.0.0 which was the cause for connection refused exception.
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>ec2-52-26-161-203.us-west-2.compute.amazonaws.com</value>
</property>
The hadoop documentation http://wiki.apache.org/hadoop/ConnectionRefused
clearly says:
Make sure the destination address in the exception isn't 0.0.0.0 -this
means that you haven't actually configured the client with the real
address for that
Could you please try adding master ip entry in slave machine's host and slave's ip entry to master as well. Also comment out all the entries in the hosts file, if not needed.