Grid start error after upgrading from GridGain 6.1.6 to 6.2.0 - gridgain

I'm attempting to upgrade our GridGain install from 6.1.6 to 6.2.0 (I tried 6.2.1 too, but the same issue persists). All I did was change the version number in our pom file, add the attribute, and implement a few new methods in our MockGrid.
<dependency>
<groupId>org.gridgain</groupId>
<artifactId>gridgain-platform</artifactId>
<version>6.2.0</version>
<type>pom</type>
<exclusions>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.springframework</groupId>
<artifactId>spring-asm</artifactId>
</exclusion>
</exclusions>
</dependency>
When I run the relevant test cases, there appears to be an issue starting the grid.
Here's the relevant error (stack trace available at https://gist.github.com/anonymous/a34f9a37a67ea98623b3). What more do I need to do? Did any constructors or properties change that I need to reflect in our Spring config?
Thanks,
Vinay
[2014-09-09 17:08:16,481] INFO GridNodeMonitorImpl - Monitoring Grid: GridKernal [cfg=GridConfiguration [gridName=master-1, execSvc=org.gridgain.grid.thread.GridThreadPoolExecutor#fbf7b8f[Running, pool size = 16, active threads = 0, queued tasks = 0, completed tasks = 8], sysSvc=org.gridgain.grid.thread.GridThreadPoolExecutor#6448f15c[Running, pool size = 16, active threads = 0, queued tasks = 0, completed tasks = 7], mgmtSvc=org.gridgain.grid.thread.GridThreadPoolExecutor#70de0273[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0], ggfsSvc=org.gridgain.grid.thread.GridThreadPoolExecutor#7f7305e8[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0], restExecSvc=null, p2pSvcShutdown=true, execSvcShutdown=true, sysSvcShutdown=true, mgmtSvcShutdown=true, ggfsSvcShutdown=true, restSvcShutdown=true, lifeCycleEmailNtf=true, p2pSvc=org.gridgain.grid.thread.GridThreadPoolExecutor#588903b6[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0], ggHome=null, ggWork=null, mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer#21ab7757, nodeId=e99a635d-a5ff-48ca-8486-3a9184d072b1, marsh=org.gridgain.grid.marshaller.optimized.GridOptimizedMarshaller#b52cf23, marshLocJobs=false, daemon=false, jettyPath=null, restEnabled=true, p2pEnabled=false, netTimeout=5000, sndRetryDelay=1000, sndRetryCnt=3, clockSyncSamples=8, clockSyncFreq=120000, metricsHistSize=10000, metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, discoSpi=GridTcpDiscoverySpi [addrRslvr=null, locPort=47500, locPortRange=100, statsPrintFreq=0, netTimeout=5000, sockTimeout=60000, ackTimeout=5000, maxAckTimeout=600000, joinTimeout=0, hbFreq=5000, maxMissedHbs=10, threadPri=10, storesCleanFreq=60000, reconCnt=10, topHistSize=1000, gridName=master-1, locNodeId=e99a635d-a5ff-48ca-8486-3a9184d072b1, marsh=GridJdkMarshaller [], gridMarsh=org.gridgain.grid.marshaller.optimized.GridOptimizedMarshaller#b52cf23, 
locNode=GridTcpDiscoveryNode [id=e99a635d-a5ff-48ca-8486-3a9184d072b1, addrs=[192.168.1.117, 0:0:0:0:0:0:0:1%1, 127.0.0.1], sockAddrs=[/192.168.1.117:47502, /0:0:0:0:0:0:0:1%1:47502, /127.0.0.1:47502], discPort=47502, order=3, loc=true, ver=GridProductVersion [major=6, minor=2, maintenance=0, stage=, revTs=1408990498]], locAddr=null, locHost=0.0.0.0/0.0.0.0, ipFinder=GridTcpDiscoveryIpFinderAdapter [shared=false], metricsStore=null, spiState=CONNECTED, ipFinderHasLocAddr=false, recon=true, joinRes=GridTuple [val=null], nodeAuth=org.gridgain.grid.kernal.managers.discovery.GridDiscoveryManager$3#76f073b3, gridStartTime=1410300490132], segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true, allResolversPassReq=true, segChkFreq=10000, commSpi=GridTcpCommunicationSpi [srvLsnr=org.gridgain.grid.spi.communication.tcp.GridTcpCommunicationSpi$2#243c29c6, locNodeId=e99a635d-a5ff-48ca-8486-3a9184d072b1, marsh=org.gridgain.grid.marshaller.optimized.GridOptimizedMarshaller#b52cf23, locAddr=null, locHost=0.0.0.0/0.0.0.0, locPort=47100, locPortRange=100, shmemPort=48100, gridName=master-1, directBuf=true, directSndBuf=false, idleConnTimeout=30000, connBufFlushFreq=100, connBufSize=0, connTimeout=1000, maxConnTimeout=600000, reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=1024, minBufferedMsgCnt=512, bufSizeRatio=0.8, dualSockConn=false, nioSrvr=GridNioServer [filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.gridgain.grid.util.nio.GridDirectParser#3970a5ee, directMode=true], GridConnectionBytesVerifyFilter], closed=false, directBuf=true, tcpNoDelay=true, sockSndBuf=32768, sockRcvBuf=32768, writeTimeout=5000, idleTimeout=30000, skipWrite=false, locAddr=0.0.0.0/0.0.0.0:47102, order=LITTLE_ENDIAN, sndQueueLimit=1024, directMode=true, metricsLsnr=org.gridgain.grid.spi.communication.tcp.GridTcpCommunicationSpi$3#129dc9b8, msgWriter=org.gridgain.grid.spi.communication.tcp.GridTcpCommunicationSpi$6#35265894, sslFilter=null], shmemSrv=null, 
tcpNoDelay=true, asyncSnd=true, lsnr=org.gridgain.grid.kernal.managers.communication.GridIoManager$3#688177ce, boundTcpPort=47102, boundTcpShmemPort=-1, selectorsCnt=2, addrRslvr=null, nodeIdMsg=org.gridgain.grid.spi.communication.tcp.GridTcpCommunicationSpi$NodeIdMessage#be0aafc, rcvdMsgsCnt=7, sentMsgsCnt=4, rcvdBytesCnt=128650, sentBytesCnt=23197, ctxInitLatch=java.util.concurrent.CountDownLatch#4af98c7b[Count = 0], metricsLsnr=org.gridgain.grid.spi.communication.tcp.GridTcpCommunicationSpi$3#129dc9b8, locks=GridKeyLock [locksSize=0], msgReader=org.gridgain.grid.spi.communication.tcp.GridTcpCommunicationSpi$5#476fcb17, msgWriter=org.gridgain.grid.spi.communication.tcp.GridTcpCommunicationSpi$6#35265894], evtSpi=GridMemoryEventStorageSpi [expireAgeMs=9223372036854775807, expireCnt=10000, filter=null], colSpi=GridPriorityQueueCollisionSpi [parallelJobsNum=95, waitJobsNum=2147483647, runningCnt=0, waitingCnt=0, heldCnt=0, taskPriAttrKey=grid.task.priority, jobPriAttrKey=grid.job.priority, dfltPri=0, starvationInc=1, preventStarvation=true, priComp=null], authSpi=GridNoopAuthenticationSpi [], sesSpi=GridNoopSecureSessionSpi [], deploySpi=GridLocalDeploymentSpi [lsnr=org.gridgain.grid.kernal.managers.deployment.GridDeploymentLocalStore$LocalDeploymentListener#2da1f51c], swapSpaceSpi=GridNoopSwapSpaceSpi [], addrRslvr=null, cacheSanityCheckEnabled=true, discoStartupDelay=60000, deployMode=SHARED, p2pMissedCacheSize=100, smtpHost=null, smtpPort=25, smtpUsername=null, smtpPwd=null, smtpFromEmail=info#gridgain.com, smtpSsl=false, smtpStartTls=false, locHost=null, timeSrvPortBase=31100, timeSrvPortRange=100, restSecretKey=null, licUrl=null, metricsLogFreq=60000, restTcpHost=null, restTcpPort=11211, restTcpNoDelay=true, restTcpDirectBuf=false, restTcpSndBufSize=0, restTcpRcvBufSize=0, restTcpSndQueueLimit=0, restTcpSelectorCnt=2, restIdleTimeout=7000, restTcpSslEnabled=false, restTcpSslClientAuth=false, restTcpSslCtxFactory=null, restPortRange=100, 
clientMsgInterceptor=null, drRcvHubCfg=null, drSndHubCfg=null, dataCenterId=0, securityCred=null, hadoopCfg=null, clientCfg=null, portableCfg=null], log=GridLoggerProxy [gridName=master-1, id8=e99a635d], gridName=master-1, kernalMBean=org.gridgain:grid=master-1,group=Kernal,name=GridKernal, locNodeMBean=org.gridgain:grid=master-1,group=Kernal,name=GridLocalNodeMetrics, pubExecSvcMBean=org.gridgain:grid=master-1,group=Thread Pools,name=GridExecutionExecutor, sysExecSvcMBean=org.gridgain:grid=master-1,group=Thread Pools,name=GridSystemExecutor, mgmtExecSvcMBean=org.gridgain:grid=master-1,group=Thread Pools,name=GridManagementExecutor, p2PExecSvcMBean=org.gridgain:grid=master-1,group=Thread Pools,name=GridClassLoadingExecutor, restExecSvcMBean=null, startTime=1410300494574, rsrcCtx=org.gridgain.grid.kernal.processors.resource.GridSpringResourceContextImpl#5268b5c8, updateNtfTimer=null, starveTimer=null, licTimer=null, metricsLogTimer=null, errOnStop=false, scheduler=org.gridgain.grid.kernal.GridSchedulerImpl#587687bc, security=null, portables=null, drPool=null, gw=GridKernalGatewayImpl [state=STARTED, gridName=master-1, stackTrace=java.lang.Throwable
at org.gridgain.grid.kernal.GridKernalGatewayImpl.stackTrace(GridKernalGatewayImpl.java:137)
at org.gridgain.grid.kernal.GridKernalGatewayImpl.writeLock(GridKernalGatewayImpl.java:105)
at org.gridgain.grid.kernal.GridKernal.start(GridKernal.java:515)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start0(GridGainEx.java:1898)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start(GridGainEx.java:1232)
at org.gridgain.grid.kernal.GridGainEx.start0(GridGainEx.java:775)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:472)
at org.gridgain.grid.GridGainSpring.start(GridGainSpring.java:73)
at org.gridgain.grid.GridSpringBean.afterPropertiesSet(GridSpringBean.java:145)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1612)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1549)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:539)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:475)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:304)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:228)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:300)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:195)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:703)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:760)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:482)
at com.mycompany.enterprise.gridgain.license.GridInstance.initialize(GridInstance.java:88)
at com.mycompany.enterprise.gridgain.license.GridNodeMonitorIT.testMasterNodeDiscoversWorkerNodesWhenStartedAfterWorkers(GridNodeMonitorIT.java:173)

There is no exception during startup. The stack trace you see is just a field that is printed when Grid.toString is called. Thanks for pointing this out; we will fix it in the next version.
Thanks!
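To illustrate the point made in the answer, here is a minimal sketch of how a diagnostic Throwable kept in a field can appear in log output without anything ever being thrown. The class GatewaySketch and its fields are hypothetical and are not GridGain code; they only mimic the stackTrace= field visible at the end of the log line above.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class GatewaySketch {
    private final String state = "STARTED";

    // Captured for diagnostics only (records where the object was created); never thrown.
    private final Throwable createdAt = new Throwable();

    @Override
    public String toString() {
        // Render the stored Throwable the same way a logged exception would look.
        StringWriter sw = new StringWriter();
        createdAt.printStackTrace(new PrintWriter(sw, true));
        return "GatewaySketch [state=" + state + ", stackTrace=" + sw + "]";
    }

    public static void main(String[] args) {
        // Logging the object prints "java.lang.Throwable" followed by "at ..." frames,
        // even though startup completed normally.
        System.out.println("Monitoring: " + new GatewaySketch());
    }
}
```

A quick way to tell the two cases apart in a real log is to check whether the trace is part of an object dump such as "[state=STARTED, ..., stackTrace=...]" at INFO level, as it is here, rather than a standalone ERROR entry.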

Related

How to add a delay in a Netty request flow without blocking on inbound main thread?

I have a high qps Netty app (2500-3000qps per vm).
The EventLoop threads look like below.
bossGroup = new NioEventLoopGroup(2, new DefaultThreadFactory("inbound-netty-boss"));
workerGroup = new NioEventLoopGroup(32 * 5, new DefaultThreadFactory("inbound-netty-worker"));
The request lifecycle looks like this:
Incoming request--> Do X --> Do Y
What I want to do is :
Incoming request --> Do X(params1) --> Do X(params2) after 100ms delay -->Do Y
So far I have tried:
CompletableFuture.runAsync(RunnableX, CompletableFuture.delayedExecutor(
100,
TimeUnit.MILLISECONDS, new ForkJoinPool(32 * 5)));
Executors.newScheduledThreadPool(32 * 5,
new ThreadFactoryBuilder().build()).schedule(RunnableX, 100, TimeUnit.MILLISECONDS);
And finally
channelHandlerContext.executor().schedule(RunnableX, 100, TimeUnit.MILLISECONDS);
All of these have caused a tremendous bottleneck, with the qps dropping to almost 1/10th of the original.
I keep seeing the event loop group threads in RUNNABLE state and all threads in the dedicated executor in TIMED_WAITING state.
What am I missing? I need to find a way to not block the inbound thread, so that it is free to serve other requests.
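As a point of comparison, here is a minimal JDK-only sketch of the desired flow (X with params1, then X with params2 after a delay, then Y) in which the calling thread returns immediately. It is not Netty code: DelayedFlowSketch, doX, doY and handle are hypothetical names, and the single shared timer is an assumption. If the pools in the attempts above are constructed per request rather than once, as the snippets are written, that alone could be worth ruling out as a source of overhead.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DelayedFlowSketch {
    // Assumption: one timer shared by all requests, created once at startup.
    private static final ScheduledExecutorService TIMER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "delay-timer");
                t.setDaemon(true); // do not keep the JVM alive for this sketch
                return t;
            });

    // Hypothetical stand-ins for "Do X" and "Do Y"; they only record the call.
    static void doX(String params, List<String> trace) { trace.add("X(" + params + ")"); }
    static void doY(List<String> trace) { trace.add("Y"); }

    /** Runs X(params1) now, X(params2) after delayMs, then Y; returns without waiting. */
    static CompletableFuture<List<String>> handle(long delayMs) {
        List<String> trace = new CopyOnWriteArrayList<>();
        CompletableFuture<List<String>> done = new CompletableFuture<>();
        doX("params1", trace);
        TIMER.schedule(() -> {
            try {
                doX("params2", trace);
                doY(trace);
                done.complete(trace);
            } catch (Throwable t) {
                done.completeExceptionally(t);
            }
        }, delayMs, TimeUnit.MILLISECONDS);
        return done; // the calling thread is free again at this point
    }

    public static void main(String[] args) {
        // join() is used only to print the result; a request handler would attach a callback.
        System.out.println(handle(100).join());
    }
}
```

In this sketch the delayed steps run on the timer thread, so they must be short and non-blocking themselves. If X or Y block (for example on I/O), the delay mechanism is probably not the bottleneck and the blocking work would need its own pool.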

What does -1000 mean in spark exit status

I'm doing something with Spark-SQL and got the error below:
YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to
remove executor 1 for reason Container marked as failed:
container_1568946404896_0002_02_000002 on host: worker1. Exit status:
-1000. Diagnostics: [2019-09-20 10:43:11.474]Task java.util.concurrent.ExecutorCompletionService$QueueingFuture@76430b7c
rejected from
org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@16970b[Terminated,
pool size = 0, active threads = 0, queued tasks = 0, completed tasks =
1]
I'm trying to figure this out by looking up the meaning of exit status -1000, but googling returned nothing useful.
In this thread, -1000 is not even mentioned.
Any comments are welcome, thanks.
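For reference, YARN defines its container exit codes in org.apache.hadoop.yarn.api.records.ContainerExitStatus, where -1000 is the INVALID constant. That suggests no real process exit code was recorded for the container, so the Diagnostics text is likely the more useful lead. The sketch below copies a few of those constants by hand into a standalone lookup so that it runs without Hadoop on the classpath. The class YarnExitStatusSketch and the describe method are hypothetical, and the values should be checked against the Hadoop version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class YarnExitStatusSketch {
    // Values copied by hand from Hadoop's ContainerExitStatus
    // (assumption: unchanged in the Hadoop version you run).
    private static final Map<Integer, String> NAMES = new LinkedHashMap<>();
    static {
        NAMES.put(0, "SUCCESS");
        NAMES.put(-1000, "INVALID");
        NAMES.put(-100, "ABORTED");
        NAMES.put(-101, "DISKS_FAILED");
        NAMES.put(-102, "PREEMPTED");
    }

    /** Maps a YARN container exit status to its constant name, if known to this sketch. */
    static String describe(int exitStatus) {
        return NAMES.getOrDefault(exitStatus, "UNKNOWN(" + exitStatus + ")");
    }

    public static void main(String[] args) {
        System.out.println(describe(-1000));
    }
}
```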

Error:java: Annotation processor 'org.mapstruct.ap.MappingProcessor' not found

When I run application.java:
Information:Using javac 1.8.0_45 to compile java sources
Information:java: Errors occurred while compiling module 'bookstore'
Information:2015/5/29 17:35 - Compilation completed with 1 error and 0 warnings in 5 sec
Error:java: Annotation processor 'org.mapstruct.ap.MappingProcessor' not found
You can add this to your pom:
<dependency>
<groupId>org.mapstruct</groupId>
<artifactId>mapstruct-processor</artifactId>
<version>${org.mapstruct.version}</version>
<scope>provided</scope>
</dependency>
That fixes the error for me.
@Mappings({
@Mapping(target = "", source = ""),
@Mapping(target = "", source = "")
})
This fixed my error

spring integration task executor queue filled with more records

I started to build a Spring Integration app in which the input gateway generates a fixed number (50) of records and then stops generating new records. There are basic filters/routers/transformers in the middle, and the ending service activator and task executor config are as follows:
<int:service-activator input-channel="inChannel" output-channel="outChannel" ref="svcProcessor">
<int:poller fixed-rate="100" task-executor="myTaskExecutor"/>
</int:service-activator>
<task:executor id = "myTaskExecutor" pool-size="5" queue-capacity="100"/>
I tried to put some debug info at the beginning of the svcProcessor method:
@Qualifier(value="myTaskExecutor")
@Autowired
ThreadPoolTaskExecutor executor;
@ServiceActivator
public Order processOrder(Order order) {
log.debug("---- " + "executor size: " + executor.getActiveCount() +
" q: " + executor.getThreadPoolExecutor().getQueue().size() +
" r: " + executor.getThreadPoolExecutor().getQueue().remainingCapacity()+
" done: " + executor.getThreadPoolExecutor().getCompletedTaskCount() +
" task: " + executor.getThreadPoolExecutor().getTaskCount()
);
//
//process order takes up to 5 seconds.
//
return order;
}
After the program runs for a while, the log shows the queue growing past 50, and eventually I get a rejection exception:
23:38:31.096 DEBUG [myTaskExecutor-2] ---- executor size: 5 q: 44 r: 56 done: 11 task: 60
23:38:31.870 DEBUG [myTaskExecutor-5] ---- executor size: 5 q: 51 r: 49 done: 11 task: 67
23:38:33.600 DEBUG [myTaskExecutor-4] ---- executor size: 5 q: 69 r: 31 done: 11 task: 85
23:32:46.792 DEBUG [myTaskExecutor-1] ---- executor size: 5 q: 72 r: 28 done: 11 task: 88
The active count and the sum of queue size and remaining capacity look right for the configured 5 and 100, but I don't understand why there are more than 50 records in the queue, or why the task count is also larger than the limit of 50.
Am I looking at the wrong info from the executor and the queue?
Thanks
UPDATE:
(not sure if I should open another question)
I tried the XML version of the cafeDemo from spring-integration (branch SI3.0.x) and used the pool provided in the documentation, but with a 100-millisecond rate and an added queue capacity:
<int:service-activator input-channel="hotDrinks" ref="barista" method="prepareHotDrink" output-channel="preparedDrinks">
<int:poller task-executor="pool" fixed-rate="100"/>
</int:service-activator>
<task:executor id="pool" pool-size="5" queue-capacity="200"/>
After I ran it, it also got a rejection exception around the 20th delivery:
org.springframework.core.task.TaskRejectedException: Executor [java.util.concurrent.ThreadPoolExecutor@6c31732b[Running, pool size = 5, active threads = 5, queued tasks = 200, completed tasks = 0]]
Only about 32 orders had been placed before the exception, so I am not sure why queued tasks = 200 and completed tasks = 0.
Thanks
getTaskCount() returns the total number of tasks that have been assigned to the executor since it started, so it keeps increasing over time.
According to the Java documentation, the other values are approximate rather than exact:
getCompletedTaskCount()
Returns the approximate total number of tasks that have completed execution.
public int getActiveCount()
Returns the approximate number of threads that are actively executing tasks.
Ideally, getTaskCount() and getCompletedTaskCount() increase linearly with time, because they include all tasks assigned since your code started executing. activeCount should stay below 50, but because it is an approximate number it may occasionally exceed 50 by a small margin.
Reference:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
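The counters discussed above can be observed directly with a small standalone sketch that saturates a ThreadPoolExecutor using the same pool size and queue capacity as the question. It does not use Spring Integration; ExecutorCountersSketch and saturate are hypothetical names, and the blocking tasks stand in for slow order processing. It rests on the assumption that work is submitted to the executor faster than it completes (for example, if every poll at the 100 ms fixed rate is handed to the executor as its own task, which is an assumption about the poller and not something shown in the question).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecutorCountersSketch {

    /**
     * Submits `submissions` blocking tasks (assumed >= poolSize) and returns
     * {activeCount, queueSize, remainingCapacity, taskCount, rejected}.
     */
    static long[] saturate(int poolSize, int queueCapacity, int submissions) {
        ThreadPoolExecutor ex = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        CountDownLatch started = new CountDownLatch(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        long rejected = 0;
        try {
            for (int i = 0; i < submissions; i++) {
                try {
                    ex.execute(() -> {
                        started.countDown();
                        try {
                            release.await(); // stands in for slow order processing
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // queue is full and every thread is busy
                }
            }
            started.await(5, TimeUnit.SECONDS); // wait until every worker is busy
            return new long[] {
                ex.getActiveCount(), ex.getQueue().size(),
                ex.getQueue().remainingCapacity(), ex.getTaskCount(), rejected
            };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the blocked tasks finish
            ex.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] c = saturate(5, 100, 110);
        System.out.println("active=" + c[0] + " q=" + c[1] + " r=" + c[2]
                + " task=" + c[3] + " rejected=" + c[4]);
    }
}
```

With a pool of 5 and a queue capacity of 100, submitting 110 blocking tasks leaves 5 active threads, 100 queued tasks and 5 rejections. The queue size counts submitted tasks, not input records, which is one way the queue could exceed the 50 records generated by the gateway.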

How to increase resolution of gif image?

How can I increase the resolution of a GIF image generated by the rgl package of R (plot3d and movie3d functions), either externally or through R?
R code:
MyX<-rnorm(10,5,1)
MyY<-rnorm(10,5,1)
MyZ<-rnorm(10,5,1)
plot3d(MyX, MyY, MyZ, xlab="X", ylab="Y", zlab="Z", type="s", box=T, axes=F)
text3d(MyX, MyY, MyZ, text=c(1:10), cex=5, adj=1)
movie3d(spin3d(axis = c(0, 0, 1), rpm = 4), duration=15, movie="TestMovie",
type="gif", dir=("~/Desktop"))
Output: (GIF image not included)
Update
Adding this line at the beginning of the code solved the problem:
r3dDefaults$windowRect <- c(0, 100, 1400, 1400)
I don't think you can do much about the resolution of the gif itself. I think you have to make the image much larger as an alternative, and then when you display it smaller it looks better. This is untested as a recent upgrade broke a thing or two for me, but this did work under 2.15:
par3d(windowRect = c(0, 0, 500, 500)) # make the window large
par3d(zoom = 1.1) # larger values make the image smaller
# you can test your settings interactively at this point
M <- par3d("userMatrix") # save your settings to pass to the movie
movie3d(par3dinterp(userMatrix=list(M,
rotate3d(M, pi, 1, 0, 0),
rotate3d(M, pi, 0, 1, 0) ) ),
duration = 5, fps = 50,
movie = "MyMovie")
HTH. If it doesn't quite work for you, check out the functions used and tune up the settings.
