velox package point extraction failure

velox package point extraction failure - geospatial

I am trying to work with the velox package in R 3.4.1, using the current (0.2.0) velox package version. I want to extract raster pixel values using the VeloxRaster_extract_points functionality and after failures with my own data, I ran the exact code provided on page 19 of the current reference manual. This returned the error shown (below). I have been unable to find any relevant references to this error online. Any suggestions?
Thanks
> ## Make VeloxRaster with two bands
> set.seed(0)
> mat1 <- matrix(rnorm(100), 10, 10)
> mat2 <- matrix(rnorm(100), 10, 10)
> vx <- velox(list(mat1, mat2), extent=c(0,1,0,1), res=c(0.1,0.1),crs="+proj=longlat +datum=WGS84 +no_defs")
> ## Make SpatialPoints
> library(sp)
> library(rgeos)
> coord <- cbind(runif(10), runif(10))
> spoint <- SpatialPoints(coords=coord)
> ## Extract
> vx$extract_points(sp=spoint)
Error in envRefInferField(x, what, getClass(class(x)), selfEnv) :
‘extract_points’ is not a valid field or method name for reference class “VeloxRaster”

When trying, it worked fine for my case:
library('velox')
## Make VeloxRaster with two bands
set.seed(0)
mat1 <- matrix(rnorm(100), 10, 10)
mat2 <- matrix(rnorm(100), 10, 10)
vx <- velox(list(mat1, mat2), extent=c(0,1,0,1), res=c(0.1,0.1),
crs="+proj=longlat +datum=WGS84 +no_defs")
## Make SpatialPoints
library(sp)
library(rgeos)
coord <- cbind(runif(10), runif(10))
spoint <- SpatialPoints(coords=coord)
## Extract
vx$extract_points(sp=spoint)
[,1] [,2]
[1,] 0.76359346 -0.4125199
[2,] 0.35872890 0.3178857
[3,] 0.25222345 -1.1195991
[4,] 0.00837096 2.0247614
[5,] 0.77214219 -0.5922254
[6,] 0.00837096 2.0247614
[7,] 1.10096910 0.5989751
[8,] 1.15191175 -0.9558391
[9,] 0.14377148 -1.5236149
[10,] 1.27242932 0.0465803
I think you may need to reinstall the package.

Related

Tabula-py for borderless table extraction

Can anyone please suggest me how to extract tabular data from a PDF using python/java program for the below borderless table present in a pdf file?

This table might be difficult one for tabla. How about using guess=False, stream=True ?
Update: As of tabula-py 1.0.3, guess and stream should work together. No need to set guess=False to use stream or lattice option.

I solved this problem via tabula-py
conda install tabula-py
and
>>> import tabula
>>> area = [70, 30, 750, 570] # Seems to have to be done manually
>>> page2 = tabula.read_pdf("nar_2021_editorial-2.pdf", guess=False, lattice=False,
stream=True, multiple_tables=False, area=area, pages="all",
) # `tabula` doc explains params very well
>>> page2
and I got this result
> 'pages' argument isn't specified.Will extract only from page 1 by default. [
> ShortTitle Text \ 0
> Arena3Dweb 3D visualisation of multilayered networks 1
> Aviator Monitoring the availability of web services 2
> b2bTools Predictions for protein biophysical features and 3
> NaN their conservation 4
> BENZ WS Four-level Enzyme Commission (EC) number ..
> ... ... 68
> miRTargetLink2 miRNA target gene and target pathway
> 69 NaN networks
> 70 mmCSM-PPI Effects of multiple point mutations on
> 71 NaN protein-protein interactions
> 72 ModFOLD8 Quality estimates for 3D protein models
>
>
> URL 0 http://bib.fleming.gr/Arena3D 1
> https://www.ccb.uni-saarland.de/aviator 2
> https://bio2byte.be/b2btools/ 3
> NaN 4 https://benzdb.biocomp.unibo.it/ ..
> ... 68 https://www.ccb.uni-saarland.de/mirtargetlink2 69
> NaN 70 http://biosig.unimelb.edu.au/mmcsm ppi 71
> NaN 72 https://www.reading.ac.uk/bioinf/ModFOLD/ [73
> rows x 3 columns]]
This is an iterable obj, so you can manipulate it via for row in page2:
Hope it help you

Tabula-py borderless table extraction:
Tabula-py has stream which on True detects table based on gaping.
from tabula convert_into
src_pdf = r"src_path"
des_csv = r"des_path"
convert_into(src_pdf, des_csv, guess=False, lattice=False, stream=True, pages="all")

linearK error in seq. default() cannot be NA, NaN

I am trying to learn linearK estimates on a small linnet object from the CRC spatstat book (chapter 17) and when I use the linearK function, spatstat throws an error. I have documented the process in the comments in the r code below. The error is as below.
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I do not understand how to resolve this. I am following this process:
# I have data of points for each data of the week
# d1 is district 1 of the city.
# I did the step below otherwise it was giving me tbl class
d1_data=lapply(split(d1, d1$openDatefactor),as.data.frame)
# I previously create a linnet and divided it into districts of the city
d1_linnet = districts_linnet[["d1"]]
# I create point pattern for each day
d1_ppp = lapply(d1_data, function(x) as.ppp(x, W=Window(d1_linnet)))
plot(d1_ppp[[1]], which.marks="type")
# I am then converting the point pattern to a point pattern on linear network
d1_lpp <- as.lpp(d1_ppp[[1]], L=d1_linnet, W=Window(d1_linnet))
d1_lpp
Point pattern on linear network
3 points
15 columns of marks: ‘status’, ‘number_of_’, ‘zip’, ‘ward’,
‘police_dis’, ‘community_’, ‘type’, ‘days’, ‘NAME’,
‘DISTRICT’, ‘openDatefactor’, ‘OpenDate’, ‘coseDatefactor’,
‘closeDate’ and ‘instance’
Linear network with 4286 vertices and 6183 lines
Enclosing window: polygonal boundary
enclosing rectangle: [441140.9, 448217.7] x [4640080, 4652557] units
# the errors start from plotting this lpp object
plot(d1_lpp)
"show.all" is not a graphical parameter
Show Traceback
Error in plot.window(...) : need finite 'xlim' values
coords(d1_lpp)
x y seg tp
441649.2 4649853 5426 0.5774863
445716.9 4648692 5250 0.5435492
444724.6 4646320 677 0.9189631
3 rows
And then consequently, I also get error on linearK(d1_lpp)
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I feel lpp object has the problem, but I find it hard to interpret the errors and how to resolve them. Could someone please guide me?
Thanks

I can confirm there is a bug in plot.lpp when trying to plot the marked point pattern on the linear network. That will hopefully be fixed soon. You can plot the unmarked point pattern using
plot(unmark(d1_lpp))
I cannot reproduce the problem with linearK. Which version of spatstat are you running? In the development version on my laptop spatstat_1.51-0.073 everything works. There has been changes to this code recently, so it is likely that this will be solved by updating to development version (see https://github.com/spatstat/spatstat).

The n-th "Invalid parent value" error

I hope you can help me with a very simple model I'm running right now in Rjags.
The data I have are as follows:
> print(data)
$R
225738 184094 66275 24861 11266
228662 199379 70308 27511 12229
246808 224814 78255 30447 13425
254823 236063 83099 33148 13961
263772 250706 89182 35450 14750
272844 262707 96918 37116 15715
280101 271612 102604 38692 16682
291493 283018 111125 40996 18064
310474 299315 119354 44552 19707
340975 322054 126901 47757 21510
347597 332946 127708 49103 21354
354252 355994 130561 51925 22421
366818 393534 140628 56562 23711
346430 400629 146037 59594 25313
316438 399545 150733 62414 26720
303294 405876 161793 67060 29545
$N
9597000 8843000 9154000 9956000 11329000
9854932 9349814 9532373 10195193 11357751
9908897 9676950 9303113 10263930 11141510
9981879 9916245 9248586 10270193 10903446
10086567 10093723 9307104 10193818 10660101
10242793 10190641 9479080 10041145 10453320
10434789 10222806 9712544 9835154 10411620
10597293 10238784 10014422 9611918 10489448
10731326 10270163 10229259 9559334 10502839
10805148 10339566 10393532 9625879 10437809
10804571 10459413 10466871 9800559 10292169
10696317 10611599 10477448 10030407 10085603
10540942 10860363 10539271 10245334 9850488
10411836 11053751 10569913 10435763 9797028
10336667 11152428 10652017 10613341 9850533
10283624 11172747 10826549 10719741 9981814
$n
[1] 16
$na
[1] 5
$pbeta
[1] 0.70 0.95
and the model is as follows:
cat('model{
## likelihoods ##
for(k in 1:na){ for(w in 1:n){ R[w,k] ~ dbin( theta[w,k], N[w,k] ) }}
for(k in 1:na){ for(w in 1:n){ theta[w,k] <- 0.5*beta[w,k]*0.5 }}
for(k in 1:na){
beta[1,k] ~ dunif(pbeta[1], pbeta[2])
beta.plus[1,k] <- beta[1,k]
for (w in 2:n){
beta.plus[w,k] ~ dunif(beta[(w-1),k], 0.95)
beta[w,k] <- beta.plus[w,k]} } }',
file='model1.bug')
######## initial random values for beta
bbb=bb.plus=matrix(rep(NA, 16*5), byrow=T, ncol=5);
for(k in 1:5){
bbb[1,k]=runif(1, 0.7,0.95);
for (w in 2:16){
bb.plus[w,k] = runif(1, bbb[w-1,k], 0.95);
bbb[w,k]=bb.plus[w,k]} }
## data & initial values
inits1 <- list('beta'= bbb )
jags_mod <- jags.model('model1.bug', data=data, inits=inits1, n.chains=1, n.adapt=1000)
update(jags_mod, n.iter=1000)
posts=coda.samples(model=jags_mod,variable.names=c('beta','theta'), n.iter=niter, thin=1000)
Super easy. This is actually a scaled down model from a more complex one which gives me exactly the same error message I get here.
Whenever I run this model, no problems at all.
You will notice that the priors for parameter beta are written in such a way to be increasing from 0.7 to 0.95.
Now I would like to "shut off" the likelihood for R by commenting out the first line of the model. I'd like to do so, to see how the parameter theta gets estimated in any case (basically I should find theta=beta/4 in this case, but that would be fine with me)
When I do that, I get an "Invalid parent" error for parameter beta, generally in the bottom rows (rows 15 or 16) of the matrix.
Actually it's more sophisticated than this: sometimes I get an error, and sometimes I don't (mostly, I do).
I don' t understand why this happens: shouldn't the values of beta generated independently from the presence/absence of a likelihood?
Sorry if this is a naive question, I really hope you can help me sort it out.
Thanks, best!
Emanuele

After playing around with the model a bit more I think I found the cause of your problem. One necessary aspect of the uniform distribution (i.e., unif(a,b)) is that a<b. When you are making the uniform distribution smaller and smaller within your model you are bringing a closer and closer to b. At times, it does not reach it, but other times a equals b and you get the invalid parent values error. For example, in your model if you include:
example ~ dunif(0.4,0.4)
You will get "Error in node example, Invalid parent values".
So, to solve this I think it will be easier to adjust how you specify your priors instead of assigning them randomly. You could do this with the beta distribution. At the first step, beta(23.48, 4.98) covers most of the range from 0.7 to 0.95, but we could truncate it to make sure it lies between that range. Then, as n increases you can lower 4.98 so that the prior shrinks towards 0.95. The model below will do this. After inspecting the priors, it does turn out that theta does equal beta/4.
data.list <- list( n = 16, na = 5,
B = rev(seq(0.1, 4.98, length.out = 16)))
cat('model{
## likelihoods ##
#for(k in 1:na){ for(w in 1:n){ R[w,k] ~ dbin( theta[w,k], N[w,k] ) }}
for(k in 1:na){ for(w in 1:n){ theta[w,k] <- 0.5*beta[w,k]*0.5 }}
for(k in 1:na){
for(w in 1:n){
beta[w,k] ~ dbeta(23.48, B[w]) T(0.7,0.95)
} } }',
file='model1.bug')
jags_mod <- jags.model('model1.bug', data=data.list,
inits=inits1, n.chains=1, n.adapt=1000)
update(jags_mod, n.iter=1000)
posts=coda.samples(model=jags_mod,
variable.names=c('beta','theta'), n.iter=10000, thin=10)
Looking at some of the output from this model we get
beta[1,1] theta[1,1]
[1,] 0.9448125 0.2362031
[2,] 0.7788794 0.1947198
[3,] 0.9498806 0.2374702
0.9448125/4
[1] 0.2362031
Since I don't really know what you are trying to use the model for I do not know if the beta distribution would suit your needs, but the above method will mimic what you are trying to do.

Error using writeOGR {rgdal} to write gpx file

I am trying to use writeOGR to create a gpx file of points. writeOGR() will create a shp file with no error, but if I try to write a KML or GPX file I get this error. I'm using R 3.1.1 and rgdal 0.8-16 on Windows (I've tried it on 7 and 8, same issue).
writeOGR(points, driver="KML", layer="random_2014",dsn="C:/Users/amvander/Downloads")
Error in writeOGR(points, driver = "KML", layer = "random_2014", dsn = "C:/Users/amvander/Downloads") :
Creation of output file failed
It is in geographic coordinates, I already figured out that that was important
summary(points)
Object of class SpatialPointsDataFrame
Coordinates:
min max
x -95.05012 -95.04392
y 40.08884 40.09588
Is projected: FALSE
proj4string :
[+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0]
Number of points: 20
Data attributes:
x y ID
Min. :-95.05 Min. :40.09 Length:20
1st Qu.:-95.05 1st Qu.:40.09 Class :character
Median :-95.05 Median :40.09 Mode :character
Mean :-95.05 Mean :40.09
3rd Qu.:-95.05 3rd Qu.:40.09
Max. :-95.04 Max. :40.10
str(points)
Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
..# data :'data.frame': 20 obs. of 3 variables:
.. ..$ x : num [1:20] -95 -95 -95 -95 -95 ...
.. ..$ y : num [1:20] 40.1 40.1 40.1 40.1 40.1 ...
.. ..$ ID: chr [1:20] "nvsanc_1" "nvsanc_2" "nvsanc_3" "nvsanc_4" ...
..# coords.nrs : num(0)
..# coords : num [1:20, 1:2] -95 -95 -95 -95 -95 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "x" "y"
..# bbox : num [1:2, 1:2] -95.1 40.1 -95 40.1
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "x" "y"
.. .. ..$ : chr [1:2] "min" "max"
..# proj4string:Formal class 'CRS' [package "sp"] with 1 slots
.. .. ..# projargs: chr "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
can anyone provide any guidance on how to get around this error?
Here are the files I used and the code.
https://www.dropbox.com/sh/r7kz3p68j58c189/AACH0U_PLH7Y6cZW1wdFLQTOa/random_2014

you already figured out these formats will only accept geographic coordinates (lat-long, not projected) and at least for GPX files there are very limited fields allowed, like "name" for the name, "ele" for elevation and "time" for date-time information. The #data fields in your file do not match those and thus cause an error.
It is possible to write those extra fields using
dataset_options="GPX_USE_EXTENSIONS=yes"
in that case they will be added as subclasses in an "extensions" field, but many simple gps receivers will not read or use those fields though. To create a very simple waypoint file with names use the following procedure.
#I will use your dataset points
#if not already re-project your points as lat-long
ll_points <- spTransform(points, CRS("+proj=longlat + ellps=WGS84"))
# use the ID field for the names
ll_points#data$name <- ll_points#data$ID
#Now only write the "name" field to the file
writeOGR(ll_points["name"], driver="GPX", layer="waypoints",
dsn="C:/whateverdir/gpxfile.gpx")
for me this executed and created a working gpx file that my gps accepted with displayed names.

I had some problems implementing the code for input from a simple data.frame and wanted to provide the full code for someone working with that kind of data (instead of from a shapefile). This is simply a very slightly modified answer from #Wiebe's answer, without having to go search out what #Auriel Fournier's original data looked like.Thank you #Wiebe - your answer helped me solve my problem too.
The data look like this:
dat
name Latitude Longitude
1 AP1402_C110617 -78.43262 45.45142
2 AP1402_C111121 -78.42433 45.47371
3 AP1402_C111617 -78.41053 45.45600
4 AP1402_C112200 -78.42115 45.53047
5 AP1402_C112219 -78.41262 45.50071
6 AP1402_C112515 -78.42140 45.43471
Code to get it into GPX for mapsource using writeOGR:
setwd("C:\\Users\\YourfileLocation")
dat <- structure(list(name = c("AP1402_C110617", "AP1402_C111121", "AP1402_C111617",
"AP1402_C112200", "AP1402_C112219", "AP1402_C112515"), Latitude = c(-78.4326169598409,
-78.4243276812641, -78.4105301310195, -78.4211498660601, -78.4126208020092,
-78.4214041610924), Longitude = c(45.4514150332163, 45.4737126348589,
45.4560042609868, 45.5304703938887, 45.5007103937952, 45.4347135938299
)), .Names = c("name", "Latitude", "Longitude"), row.names = c(NA,
6L), class = "data.frame")
library(rgdal)
library(sp)
dat <- SpatialPointsDataFrame(data=dat, coords=dat[,2:3], proj4string=CRS("+proj=longlat +datum=WGS84"))
writeOGR(dat,
dsn="TEST3234.gpx", layer="waypoints", driver="GPX",
dataset_options="GPX_USE_EXTENSIONS=yes")

R convert all rows to strings

I need to convert all rows of a dataframe to strings.
Here's a sample data:
1.12331,4.331123,4.12335435,1,"asd"
1.123453345,5.654456,4.889999,1.45456,"qwe"
2.00098,5.5445,4.768799,1.999999,"ttre"
I read this data into R, got a dataframe.
td<-read.table("test.csv", sep=',')
When I run apply(td, 2, as.character) on this data, I got
V1 V2 V3 V4 V5
[1,] "1.1233" "4.3311" "4.1234" "1.0000" "asd"
[2,] "1.1235" "5.6545" "4.8900" "1.4546" "qwe"
[3,] "2.0010" "5.5445" "4.7688" "2.0000" "ttre"
But when I do the same only on numeric columns, I got the different result:
apply(td[,1:4], 2, as.character)
V1 V2 V3 V4
[1,] "1.12331" "4.331123" "4.12335435" "1"
[2,] "1.123453345" "5.654456" "4.889999" "1.45456"
[3,] "2.00098" "5.5445" "4.768799" "1.999999"
As a result I need a dataframe with values exactly the same as in source file. What I'm doing wrong?

You can set colClasses in read.table() to make all columns as character.
td <- read.table("test.csv", sep=',',colClasses="character")
td
V1 V2 V3 V4 V5
1 1.12331 4.331123 4.12335435 1 asd
2 1.123453345 5.654456 4.889999 1.45456 qwe
3 2.00098 5.5445 4.768799 1.999999 ttre
str(td)
'data.frame': 3 obs. of 5 variables:
$ V1: chr "1.12331" "1.123453345" "2.00098"
$ V2: chr "4.331123" "5.654456" "5.5445"
$ V3: chr "4.12335435" "4.889999" "4.768799"
$ V4: chr "1" "1.45456" "1.999999"
$ V5: chr "asd" "qwe" "ttre"

The best way to do this is the read the data in as character in the first place. You can do this with the colClasses argument to read.table:
td <- read.table("test.csv", sep=',', colClasses="character")

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

velox package point extraction failure - geospatial

Related

Tabula-py for borderless table extraction

linearK error in seq. default() cannot be NA, NaN

The n-th "Invalid parent value" error

Error using writeOGR {rgdal} to write gpx file

R convert all rows to strings

Categories

Resources