TimeoutException when trying to download a file from Azure Blob Storage - azure

I'm trying to download a file from Azure Blob Storage with the following code:
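blobServiceClient = new BlobServiceClientBuilder()
    .connectionString(connectionString)
    .buildClient();
// containerName is assumed here; the original snippet does not show how blobContainerClient was obtained
blobContainerClient = blobServiceClient.getBlobContainerClient(containerName);
BlobClient b = blobContainerClient.getBlobClient(remotePath);
b.downloadToFile(localPath, true);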
But sometimes I get this exception:
Caused by: java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 60000ms in 'map' (and no fallback has been configured)
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.handleTimeout(FluxTimeout.java:288)
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.doTimeout(FluxTimeout.java:273)
at reactor.core.publisher.FluxTimeout$TimeoutTimeoutSubscriber.onNext(FluxTimeout.java:390)
at reactor.core.publisher.StrictSubscriber.onNext(StrictSubscriber.java:89)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:73)
at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:117)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:50)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:27)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Is there any way to make this stable?
Version:
<azure-storage-blob.version>12.6.0</azure-storage-blob.version>
<azure-core.version>1.3.0</azure-core.version>
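One possible mitigation for an intermittent timeout like this, offered only as a sketch and not as a confirmed fix, is to retry the blocking download with an increasing delay. The helper below is plain Java and does not use the Azure SDK; the class name RetrySketch, the method withRetry, and the attempt and delay values are all hypothetical names and numbers chosen for illustration. The main method uses a stand-in action that fails twice with a simulated TimeoutException instead of a real download.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the Azure SDK.
public class RetrySketch {

    /**
     * Runs the action up to maxAttempts times. After each failed attempt
     * except the last, sleeps for the current delay and then doubles it.
     * If every attempt fails, rethrows the last exception.
     */
    public static <T> T withRetry(Callable<T> action, int maxAttempts, Duration initialDelay)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Duration delay = initialDelay;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay.toMillis());
                    delay = delay.multipliedBy(2);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real download: fails twice, then succeeds.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new TimeoutException("simulated timeout");
            }
            return "downloaded";
        }, 5, Duration.ofMillis(10));
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the code from the question, the action would be something like () -> { b.downloadToFile(localPath, true); return null; }. Since the overwrite flag is already true, a retry simply rewrites the local file. Whether retrying is enough depends on why the stream stalls, which the stack trace alone does not show.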

Related

Using pubsub lite library in spark getting error

I am getting an error while publishing messages to GCP Pub/Sub Lite using Spark Structured Streaming.
I cannot use writeStream because I want to publish from a foreachBatch sink, so I am using foreachPartition and foreach and publishing a message inside foreach for each DataFrame row.
Below is the error I get. Some messages are published, but for others I see the exception below:
2022-06-07 10:08:17 WARN PartitionCountWatcherImpl:101 - Failed to refresh partition count
com.google.api.gax.rpc.ApiException:
at com.google.cloud.pubsublite.internal.CheckedApiException.<init>(CheckedApiException.java:51)
at com.google.cloud.pubsublite.internal.CheckedApiException.<init>(CheckedApiException.java:55)
at com.google.cloud.pubsublite.internal.ExtractStatus.toCanonical(ExtractStatus.java:49)
at com.google.cloud.pubsublite.internal.wire.PartitionCountWatcherImpl.pollTopicConfig(PartitionCountWatcherImpl.java:92)
at com.google.cloud.pubsublite.internal.wire.PartitionCountWatcherImpl.onAlarm(PartitionCountWatcherImpl.java:71)
at com.google.cloud.pubsublite.internal.AlarmFactory.lambda$null$0(AlarmFactory.java:41)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.InterruptedException
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:456)
at com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:100)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:73)
at com.google.cloud.pubsublite.internal.wire.PartitionCountWatcherImpl.pollTopicConfig(PartitionCountWatcherImpl.java:81)
... 9 more

Logstash + Azure Events Hubs

While following the link to add Azure events to Logstash, I ran into the issue below:
[2020-02-13T14:06:28,886][ERROR][com.microsoft.azure.eventprocessorhost.PartitionManager] host logstash-5fdbcee8-e368-44de-bc13-c640a36f646f: Exception while initializing stores, not starting partition manager com.microsoft.azure.eventhubs.IllegalEntityException: Failure getting partition ids for event hub
at com.microsoft.azure.eventprocessorhost.PartitionManager.lambda$cachePartitionIds$4(PartitionManager.java:80) ~[azure-eventhubs-eph-2.1.0.jar:?]
at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) ~[?:1.8.0_242]
at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811) ~[?:1.8.0_242]
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) [?:1.8.0_242]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_242]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
Can someone help?
I got the hint from this question. It appears the consumer SAS policy still needs manage privileges.

Nodetool repair erroring on cassandra 3.9 cluster complaining of dead nodes

I have a Cassandra 3.9 cluster. I initiated a repair from one of the nodes in the cluster. The repair went nowhere. The logs on the initiating node are filled with errors like this:
ERROR [GossipTasks:1] 2018-02-16 23:27:36,949 RepairSession.java:347 - [repair #cadf6f11-1342-11e8-8d73-6767c6890f70] session completed with the following error
java.io.IOException: Endpoint /**.**.**.52 died
at org.apache.cassandra.repair.RepairSession.convict(RepairSession.java:346) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:306) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:782) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.Gossiper.access$800(Gossiper.java:66) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:181) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) [apache-cassandra-3.9.jar:3.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
On the other hand, if I look at the logs on the nodes claimed to be dead, I see one of three symptoms:
The node claims to have successfully sent the requested merkle tree over.
The node does not have any trace of the repair session, and thus doesn't appear to have received any repair request.
The node shows an exception like this:
ERROR [ValidationExecutor:3] 2018-02-16 23:29:06,548 Validator.java:261 - Failed creating a merkle tree for [repair #cac2bf50-1342-11e8-8d73-6767c6890f70 on somekeyspace/sometable, [(-3531087107126953137,-3495591103116433105], (1424707151780052485,1425479237398192865], (-3533012126945497873,-3531087107126953137], (1425479237398192865,1429220273719165251], (-4991682772598302168,-4984938905452900436], (-7686750611814623539,-7685228552629222537], (7554301216433235881,7559623046999138658], (334796420453180909,342318143371667659], (-3538876023288368831,-3533012126945497873], (1409514567521922418,1424707151780052485], (5391546013321073004,5393284101537339558], (590921410556013711,593440512568877190]]], /..**.43 (see log for details)
ERROR [ValidationExecutor:3] 2018-02-16 23:29:06,549 CassandraDaemon.java:226 - Exception in thread Thread[ValidationExecutor:3,1,main]
java.lang.RuntimeException: Parent repair session with id = c8bf7540-1342-11e8-8d73-6767c6890f70 has failed.
at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:377) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1313) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1222) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:81) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager$11.call(CompactionManager.java:844) ~[apache-cassandra-3.9.jar:3.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Is this a known issue?

spark-hbase connector expired ticket kerberos

I have a cluster with CDH 5.8.4. I'm running a Spark Streaming application which reads and writes data from/to HBase using the Cloudera spark-hbase connector, namely the HBaseContext.
When I start the application I pass the principal and the kinit to the spark-submit script.
After 7 days the application crashed with an error about the expiration of the Kerberos ticket used by the HBase context. This is the error from the executor log:
ERROR executor.Executor: Exception in task 0.0 in stage 544265.0 (TID 1149098)
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:326)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:157)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:867)
at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.restart(TableRecordReaderImpl.java:91)
at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.initialize(TableRecordReaderImpl.java:169)
at org.apache.hadoop.hbase.mapreduce.TableRecordReader.initialize(TableRecordReader.java:134)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.initialize(TableInputFormatBase.java:211)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:164)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:129)
at org.apache.hadoop.hbase.spark.NewHBaseRDD.compute(NewHBaseRDD.scala:34)
at org.apache.hadoop.hbase.spark.NewHBaseRDD.compute(NewHBaseRDD.scala:25)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: Token has expired
at sun.reflect.GeneratedConstructorAccessor58.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:327)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1593)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1398)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1199)
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:315)
... 30 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Token has expired
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.readStatus(HBaseSaslRpcClient.java:155)
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:222)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1242)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:34070)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1589)
Does anyone know how to solve this issue?
Thanks in advance,
Beniamino
We (Splice Machine) had the same issue with a customer. Our issue was caused by https://issues.apache.org/jira/browse/SPARK-12646. We wrote some code to fix the _HOST issue and we also upgraded to Spark 2.2 to get around this issue.
You should not rely on an external ticket cache for distributed jobs. The best solution is to ship a keytab with your application or rely on a keytab being deployed on all nodes where your Spark task may be executed.
UserGroupInformation.loginUserFromKeytab("name@xyz.com", keyTab);
connection=ConnectionFactory.createConnection(conf);
With your approach above, you would need to do something like the following after obtaining the UserGroupInformation instance:
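ugi.doAs(new PrivilegedExceptionAction<Void>() {
    public Void run() throws Exception {
        // createConnection throws IOException, so PrivilegedExceptionAction is used instead of PrivilegedAction
        connection = ConnectionFactory.createConnection(conf);
        ...
        return null;
    }
});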

After creating a new jhipster project, unable to launch the application

After creating a JHipster project, I tried to launch it with the following command:
mvnw
I am getting the following error. I am facing the same issue with an existing project as well.
Error:
com.netflix.discovery.shared.transport.TransportException: Cannot execute request on any known server
at com.netflix.discovery.shared.transport.decorator.RetryableEurekaHttpClient.execute(RetryableEurekaHttpClient.java:111)
at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.register(EurekaHttpClientDecorator.java:56)
at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator$1.execute(EurekaHttpClientDecorator.java:59)
at com.netflix.discovery.shared.transport.decorator.SessionedEurekaHttpClient.execute(SessionedEurekaHttpClient.java:77)
at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.register(EurekaHttpClientDecorator.java:56)
at com.netflix.discovery.DiscoveryClient.register(DiscoveryClient.java:815)
at com.netflix.discovery.InstanceInfoReplicator.run(InstanceInfoReplicator.java:104)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Your question is not detailed enough, but it looks like you are using a microservice architecture and did not start the registry. Check the documentation.
