how to deal with large linnet object - spatstat

I am trying to use an entire city street network for a particular analysis, and I know the network is very large. I have also built it as a sparse network.
library(maptools)
library(rgdal)
StreetsUTM=readShapeSpatial("cityIN_UTM")
#plot(StreetsUTM)
library(spatstat)
SS_StreetsUTM =as.psp(StreetsUTM)
SS_linnetUTM = as.linnet(SS_StreetsUTM, sparse=TRUE)
> SS_linnetUTM
Linear network with 321631 vertices and 341610 lines
Enclosing window: rectangle = [422130.9, 456359.7] x [4610458, 4652536] units
> SS_linnetUTM$sparse
[1] TRUE
I have the following problems:
It took 15-20 minutes to build the psp object.
It took almost 5 hours to build the linnet object.
Every time I try to analyse a point pattern on it or compute an envelope, R crashes.
I understand I should try to reduce the network size, but:
I was wondering if there is a smart way to overcome this problem. Would rescaling help?
How can I throw more processing power at it?
I am also curious to know whether spatstat can be used with the parallel package.
In the end, what are the limitations on network size for spatstat?
R crashes
R crashes when I use the instructions from the spatstat book (on my network, of course):
KN <- linearK(spiders, correction="none")
envelope(spiders, linearK, correction="none", nsim=39)
I do not think RAM is the problem: I have 16 GB RAM and a 2.5 GHz dual-core i5 processor in a machine with an SSD.
Could someone guide me, please?

Please be more specific about the commands you used.
Did you build the linnet object from a psp object using as.linnet.psp (in which case the connectivity of the network must be guessed, and this can take a long time), or did you have information about the connectivity of the network that you passed to the linnet() command?
Exactly what commands to "analyse it for a point pattern or envelope" cause a crash, and what kind of crash?
The code for linear networks in spatstat is research code which is still under development. Faster algorithms for the K-function will be released soon.
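On the parallel-package question above: pointwise Monte Carlo envelopes are built from nsim independent simulations, so the simulations themselves can be generated in parallel and the summary functions pooled afterwards. Below is a minimal Python sketch of that structure, with a toy random statistic standing in for linearK; in R the same pattern is what parallel::mclapply or parLapply over the simulations would give you. This is an illustration of the idea, not spatstat's own implementation.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def simulated_statistic(seed):
    """One simulation: generate a random pattern and summarise it.
    Toy stand-in for 'simulate CSR on the network, compute linearK(r)'."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(11)]   # one value per r in the grid

def envelope(nsim=39, workers=4):
    # The nsim simulations are independent, so they can be farmed out
    # to workers and the envelope assembled from the pooled results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sims = list(pool.map(simulated_statistic, range(nsim)))
    # Pointwise envelope: min and max over simulations at each r.
    lo = [min(s[i] for s in sims) for i in range(11)]
    hi = [max(s[i] for s in sims) for i in range(11)]
    return lo, hi
```

The expensive part (each simulation plus its K-function) dominates, so the scheduling overhead is negligible for realistic networks.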

I could only resolve this by simplifying my network in QGIS with the Douglas-Peucker algorithm (the Simplify Geometries tool). So it is a slight compromise on the geometry of the linear network in the shapefile.
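For reference, the simplification that QGIS's Simplify Geometries tool applies is the classic Douglas-Peucker algorithm: points closer than a tolerance to the chord between the segment endpoints are dropped, recursively. A minimal Python sketch:

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)

def douglas_peucker(points, tol):
    """Simplify a polyline: drop points within tol of the endpoint chord."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord between the endpoints.
    dists = [perpendicular_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    imax = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[imax - 1] <= tol:
        return [points[0], points[-1]]       # whole span within tolerance
    left = douglas_peucker(points[:imax + 1], tol)
    right = douglas_peucker(points[imax:], tol)
    return left[:-1] + right                 # avoid duplicating the split point
```

This is why it reduces the vertex count (and hence linnet build time) at a small cost in geometric fidelity.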


Synchronization problem while executing Simulink FMU in ROS 2-Gazebo (TF_OLD_DATA warning)

I'm working in a co-simulation project between Simulink and Gazebo. The aim is to move a robot model in Gazebo with the trajectory coordinates computed from Simulink. I'm using MATLAB R2022a, ROS 2 Dashing and Gazebo 9.9.0 in a computer running Ubuntu 18.04.
The problem is that when launching the FMU with the fmi_adapter, I obtain the following output. It is tagged as [INFO], but it is actually messing up my whole project.
[fmi_adapter_node-1] [INFO] [fmi_adapter_node]: Simulation time 1652274762.959713 is greater than timer's time 1652274762.901340. Is your step size to large?
Note that the simulation time is greater than the timer's time. Even if I try to change the step size with the optional argument of the fmi_adapter_node, the same log appears with small differences in the times. I'm using the following commands:
ros2 launch fmi_adapter fmi_adapter_node.launch.py fmu_path:=FMI/Trajectory/RobotMARA_SimulinkFMU_v2.fmu # default step size: 0.2
ros2 launch fmi_adapter fmi_adapter_node.launch.py fmu_path:=FMI/Trajectory/RobotMARA_SimulinkFMU_v2.fmu _step_size:=0.001
As you would expect, the outputs of the FMU are the xyz coordinates of the robot trajectory at each time step. Since the fmi_adapter_node creates topics for both inputs and outputs, I'm reading the output xyz values by means of 3 subscribers with the following code. Those coordinates are then used to program the robot trajectories with the MoveIt Python API.
When I run the previous Python code, I obtain the following warning over and over, and the robot manipulator doesn't actually move.
[ WARN] [1652274804.119514250]: TF_OLD_DATA ignoring data from the past for frame motor6_link at time 870.266 according to authority unknown_publisher
Possible reasons are listed at http://wiki.ros.org/tf/Errors%20explained
The previous warning is explained here, but I'm not able to fix it. I've tried clicking Reset in RViz, but nothing changes. I've also tried the following without success:
ros2 param set /fmi_adapter_node use_sim_time true # it just sets the timer's time to 0
It seems that the clock is taking negative values, so there is a synchronization problem.
Any help is welcome.
The warning message from the FMIAdapterNode is emitted if the timer's period is only slightly greater than the simulation step size and the timer is preempted by other processes or threads.
I created an issue at https://github.com/boschresearch/fmi_adapter/issues/9 which explains this in more detail and lists two possible fixes. It would be great if you could contribute to this discussion.
I assume that the TF_OLD_DATA error is not related to the fmi_adapter. Looking at the code snippet at ROS Answers, I wondered whether x,y,z values are re-published at all given that the lines
pose.position.x = listener_x.value
pose.position.y = listener_y.value
pose.position.z = listener_z.value
are not inside a callback and executed even before rospy.spin(), but maybe that's just truncated.
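To illustrate the point above without ROS: a value copied from a listener once, before the spin loop, never tracks later messages; the copy has to happen inside the subscriber callback. The class and function names below are illustrative stand-ins, not the rospy API:

```python
# Minimal sketch of the callback pattern. A pose field assigned once
# before spinning (the suspected bug) stays stale; assigning inside the
# message callback (the fix) tracks the latest value.
class Listener:
    def __init__(self):
        self.value = 0.0
    def callback(self, msg):        # invoked on every incoming message
        self.value = msg

class Pose:
    def __init__(self):
        self.x = self.y = self.z = 0.0

listener_x = Listener()
pose = Pose()

pose.x = listener_x.value           # BUG: copied once, before any message

def on_message(msg):
    listener_x.callback(msg)
    pose.x = listener_x.value       # FIX: refresh inside the callback

on_message(1.5)                     # a message arrives
```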

linearK - large time difference between empirical and acceptance envelopes in spatstat

I am interested in the correlation between points at distances of 0 to 2 km on a linear network. I am using the following statement for the empirical data; it completes in 2 minutes.
obs <- linearK(c, r=seq(0, 2, by=0.20))
Now I want to test the point pattern for randomness, so I used envelopes over the same r range.
acceptance_enve <- envelope(c, linearK, nsim=19, fix.n=TRUE, funargs=list(r=seq(0, 2, by=0.20)))
But this shows an estimated time of a little under 3 hours. Is this large time difference normal? Is my syntax correct for passing the extra r argument through to the envelope function call?
Is there some efficient way to shorten this 3-hour execution time for the envelopes?
I have the road network of a whole city, so it is quite large, and I have checked that there are no disconnected subgraphs.
c
Point pattern on linear network
96 points
Linear network with 13954 vertices and 19421 lines
Enclosing window: rectangle = [559.653, 575.4999] x [4174.833, 4189.85] Km
thank you.
EDIT AFTER COMMENT
system.time({s <- runiflpp(npoints(c), as.linnet(c));
             linearK(s, r=seq(0,2,by=0.20))})
   user  system elapsed
343.047 104.428 449.650
EDIT 2
I made some really small changes by deleting some peripheral network segments that seemed to have little or no effect on the overall network. This also led to splitting some long segments into smaller ones. But now, on the same network with a different point pattern, the estimated time is even longer:
> month1envelope=envelope(months[[1]], linearK ,nsim = 39, r=seq(0,2,0.2))
Generating 39 simulations of CSR ...
1, 2, [etd 12:03:43]
The new network is
> months[[1]]
Point pattern on linear network
310 points
Linear network with 13642 vertices and 18392 lines
Enclosing window: rectangle = [560.0924, 575.4999] x [4175.113, 4189.85] Km
System config: macOS 10.9, 2.5 GHz, 16 GB RAM, R 3.3.3, RStudio 1.0.143
You don't need to use funargs in this context. Arguments can be passed directly through the ... argument. So I suggest
acceptance_enve <- envelope(c, linearK, nsim=19,
fix.n = TRUE, r=seq(0,2,by=0.20))
Please try this to see if it accelerates the execution.

Hazelcast High Response Times

We have a Java 1.6 application that uses Hazelcast 3.7.4, with a two-node topology. The application works mainly with 3 maps.
In normal operation, response times when querying the maps are generally a few tens of milliseconds.
I have observed that in some circumstances, such as network cuts, the response time increases to huge values, for example 20 or 30 seconds, and this impacts the application's performance.
I would like to know whether this kind of network micro-cut can increase query response times in this manner, whether some concrete configuration can minimize it, and which other factors can provoke such high times.
Here are some examples of the executed queries:
Example 1:
String sqlPredicate = "acui='"+acui+"'";
Collection<Agent> agents =
(Collection<Agent>) data.getMapAgents().values(new SqlPredicate(sqlPredicate));
Example 2:
boolean exist = data.getMapAgents().containsKey(agent);
Thanks so much for your help.
Best Regards,
Jorge
The map operations are all TCP-socket based and are thus subject to your operating system's TCP driver implementation.
See TCP_NODELAY
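TCP_NODELAY disables Nagle's algorithm, which batches small writes and can add latency to small request/response messages such as map lookups. Hazelcast is a Java system and exposes its socket options through its own network configuration, but the underlying OS-level knob looks like this (shown in Python for brevity):

```python
import socket

# Create a TCP socket and disable Nagle's algorithm on it. The option
# can be set before the socket is connected; small writes will then be
# sent immediately instead of being coalesced.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
enabled = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
s.close()
```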

multithreading or shared memory - Architecture

There are 3 parts to my application:
A numerical simulator solving a 21-variable differential equation with the Runge-Kutta method, taken directly from Numerical Recipes in C; the step size is 0.0001 s.
A C program pinging a PIC-based microprocessor every 1 s and receiving data at about 3600 samples per second over the USB-COM port; it sends the relevant data to the front end over TCP/IP.
A Java front end reading the data from the numerical simulator via SWIG (for the C code) and JNI, modifying the parameters with input from the microprocessor, and finally plotting it in the GUI.
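For context, part 1's integrator is the classical fourth-order Runge-Kutta step. A minimal sketch (a single toy equation dy/dt = -y stands in for the 21-variable system, using the question's step size of 0.0001 s):

```python
def rk4_step(f, t, y, h):
    """One classical RK4 step, as in Numerical Recipes' rk4 routine."""
    k1 = f(t, y)
    k2 = f(t + h/2, y + h/2 * k1)
    k3 = f(t + h/2, y + h/2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

# Toy system: dy/dt = -y, exact solution y(t) = exp(-t).
f = lambda t, y: -y
y, t, h = 1.0, 0.0, 1e-4
for _ in range(10000):      # integrate from t = 0 to t = 1
    y = rk4_step(f, t, y, h)
    t += h
# y now approximates exp(-1)
```

At 10,000 steps per simulated second, the cost per step is what determines whether the simulator can share a process with the I/O code.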
I want to recode the Java front end in C++ now, with the option of using HTML/JavaScript for plotting.
Would rewriting the front end in C++ so that the numerical simulator runs on a separate thread be a good approach?
I don't really understand threading, though I have used it for the listening and plotting functions in the Java code. It seems like having everything run on multiple threads instead of in separate processes would slow down my simulations.
Can I combine 1, 2 and 3 into a single program, or should they remain separate to retain the 0.0001 s simulation step and the ability to handle the large amount of microprocessor data?
Please help me pick a path forward!
Thanks in Advance!
On a multicore platform, multithreading will generally improve performance. However, a general-purpose OS such as Linux or Windows is not deterministic, so there are no guarantees.
That said, the computational performance of a modern PC is such that it will hardly be stretched by this task and data rate, so perhaps it hardly matters.
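One common way to structure this, in the spirit of the answer above, is to run the simulator on its own thread and hand results to the front end through a queue, so a slow consumer (plotting, TCP) never blocks the integration loop. A minimal sketch (Python threading stands in for the C++/Java equivalents):

```python
import queue
import threading

data = queue.Queue(maxsize=1000)     # bounded buffer between the two threads

def simulator():
    for step in range(100):          # stand-in for the RK integration loop
        data.put(step * 0.0001)      # publish the latest state
    data.put(None)                   # sentinel: simulation finished

def front_end():
    received = []
    while True:
        item = data.get()
        if item is None:
            break
        received.append(item)        # stand-in for plotting / TCP send
    return received

t = threading.Thread(target=simulator)
t.start()
results = front_end()
t.join()
```

The bounded queue also gives natural back-pressure: if the consumer falls behind, the simulator blocks briefly instead of consuming unbounded memory.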

How to search for Possibilities to parallelize?

I have some serial code that I have started to parallelize using Intel's TBB. My first aim was to parallelize almost all the for loops in the code (I have even parallelized a for loop within a for loop), and having done that I get some speedup. I am looking for more places/ideas/options to parallelize. I know this might sound a bit vague without much reference to the problem, but I am looking for generic ideas here which I can explore in my code.
Overview of algo( the following algo is run over all levels of the image starting with shortest and increasing width and height by 2 each time till you reach actual height and width).
For all image pairs, starting with the smallest pair:
    For height = 2 to image_height - 2:
        Create a 5 x image_width ROI of both the left and right images.
        For width = 2 to image_width - 2:
            Create a 5 x 5 window of the left ROI centered at width and find the best match in the right ROI using NCC.
            Create a 5 x 5 window of the right ROI centered at width and find the best match in the left ROI using NCC.
            Disparity = current_width - best match
    Edge pixels that did not receive a disparity get the disparity of their neighbors.
    For height = 0 to image_height:
        For width = 0 to image_width:
            Check the smoothness, uniqueness and order constraints (parallelized separately).
    For height = 0 to image_height:
        For width = 0 to image_width:
            For disparities that failed the constraints, use the average disparity of the neighbors that passed.
    Normalize all disparities and output to screen.
Just for some perspective, it may not always be worthwhile to parallelize something.
Just because you have a for loop where each iteration can be done independently of each other, doesn't always mean you should.
TBB has some overhead for starting those parallel_for loops, so unless you're looping a large number of times, you probably shouldn't parallelize it.
But, if each loop is extremely expensive (Like in CirrusFlyer's example) then feel free to parallelize it.
More specifically, look for times where the overhead of the parallel computation is small relative to the cost of having it parallelized.
Also, be careful about nesting parallel_for loops, as this can get expensive. You may want to just stick with parallelizing the outer for loop.
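The "parallelize the outer loop only" advice can be sketched as follows: each task gets a whole band of rows, so the per-task work dwarfs the scheduling overhead, instead of spawning one tiny task per inner-loop iteration. (Python stands in for TBB's parallel_for here; the row/width structure mirrors the disparity loops above.)

```python
from concurrent.futures import ThreadPoolExecutor

HEIGHT, WIDTH = 64, 64

def process_row(h):
    # Stand-in for the NCC matching across one image row.
    return sum((h * WIDTH + w) % 7 for w in range(WIDTH))

def process_band(rows):
    # One task = a whole band of rows, not a single pixel or row.
    return [process_row(h) for h in rows]

def run_parallel(workers=4):
    bands = [range(i, HEIGHT, workers) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_band, bands))
    out = [0] * HEIGHT
    for rows, vals in zip(bands, results):
        for h, v in zip(rows, vals):
            out[h] = v
    return out
```

With four tasks instead of HEIGHT x WIDTH tasks, the fixed cost of dispatching work is paid only a handful of times.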
The silly answer is: anything that is time-consuming or iterative. I use Microsoft's .NET 4.0 Task Parallel Library (TPL), and one of the interesting things about it is its "expressed parallelism": an interesting term for "attempted parallelism". Your code may say "use the TPL here", but if the host platform doesn't have the necessary cores it will simply invoke the old-fashioned serial code in its place.
I have begun to use the TPL on all my projects, especially any place there are loops (this requires designing my classes and methods so that there are no dependencies between loop iterations). But for any place that might have been plain old multithreaded code, I now look to see whether it is something I can spread across different cores.
My favorite so far has been an application that downloads ~7,800 different URLs to analyze the contents of the pages and, if it finds the information it's looking for, does some additional processing. This used to take between 26 and 29 minutes to complete. My Dell T7500 workstation, with dual quad-core 3 GHz Xeon processors, 24 GB of RAM, and Windows 7 Ultimate 64-bit, now crunches the entire thing in about 5 minutes. A huge difference for me.
I also have a publish/subscribe communication engine that I have been refactoring to take advantage of the TPL (especially for "pushing" data from the server to clients: you may have 10,000 client computers that have registered interest in specific things, and once such an event occurs, I need to push data to all of them). I don't have this done yet, but I'm really looking forward to seeing the results on this one.
Food for thought ...
