Error using writeOGR {rgdal} to write a GPX file

I am trying to use writeOGR to create a GPX file of points. writeOGR() will create a shapefile with no error, but if I try to write a KML or GPX file I get the error below. I'm using R 3.1.1 and rgdal 0.8-16 on Windows (I've tried it on both 7 and 8, same issue).
writeOGR(points, driver="KML", layer="random_2014",dsn="C:/Users/amvander/Downloads")
Error in writeOGR(points, driver = "KML", layer = "random_2014", dsn = "C:/Users/amvander/Downloads") :
Creation of output file failed
The data are in geographic coordinates; I already figured out that that was important.
summary(points)
Object of class SpatialPointsDataFrame
Coordinates:
min max
x -95.05012 -95.04392
y 40.08884 40.09588
Is projected: FALSE
proj4string :
[+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0]
Number of points: 20
Data attributes:
x y ID
Min. :-95.05 Min. :40.09 Length:20
1st Qu.:-95.05 1st Qu.:40.09 Class :character
Median :-95.05 Median :40.09 Mode :character
Mean :-95.05 Mean :40.09
3rd Qu.:-95.05 3rd Qu.:40.09
Max. :-95.04 Max. :40.10
str(points)
Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame': 20 obs. of 3 variables:
  .. ..$ x : num [1:20] -95 -95 -95 -95 -95 ...
  .. ..$ y : num [1:20] 40.1 40.1 40.1 40.1 40.1 ...
  .. ..$ ID: chr [1:20] "nvsanc_1" "nvsanc_2" "nvsanc_3" "nvsanc_4" ...
  ..@ coords.nrs : num(0)
  ..@ coords     : num [1:20, 1:2] -95 -95 -95 -95 -95 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "x" "y"
  ..@ bbox       : num [1:2, 1:2] -95.1 40.1 -95 40.1
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "x" "y"
  .. .. ..$ : chr [1:2] "min" "max"
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
  .. .. ..@ projargs: chr "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
Can anyone provide guidance on how to get around this error?
Here are the files I used and the code.
https://www.dropbox.com/sh/r7kz3p68j58c189/AACH0U_PLH7Y6cZW1wdFLQTOa/random_2014

You already figured out that these formats only accept geographic coordinates (lat-long, not projected), and at least for GPX files only a very limited set of fields is allowed, such as "name" for the name, "ele" for elevation and "time" for date-time information. The @data fields in your file do not match those and thus cause the error.
It is possible to write those extra fields using
dataset_options="GPX_USE_EXTENSIONS=yes"
In that case they will be added as sub-elements of an "extensions" field, but many simple GPS receivers will not read or use those fields. To create a very simple waypoint file with names, use the following procedure.
# I will use your dataset points
# if not already, re-project your points as lat-long
ll_points <- spTransform(points, CRS("+proj=longlat +ellps=WGS84"))
# use the ID field for the names
ll_points@data$name <- ll_points@data$ID
# now write only the "name" field to the file
writeOGR(ll_points["name"], driver="GPX", layer="waypoints",
         dsn="C:/whateverdir/gpxfile.gpx")
For me this executed and created a working GPX file that my GPS accepted, with the names displayed.
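If you do want to keep the extra attribute fields, here is a minimal sketch of the extensions route (not part of the original answer; it assumes the same ll_points object and a writable output path of your choosing):
# hypothetical variant: write all attribute fields into the GPX <extensions> element
# (as noted above, many simple GPS receivers will ignore these fields)
writeOGR(ll_points, driver="GPX", layer="waypoints",
         dsn="C:/whateverdir/gpxfile_ext.gpx",
         dataset_options="GPX_USE_EXTENSIONS=yes")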

I had some problems implementing the code for input from a simple data.frame and wanted to provide the full code for someone working with that kind of data (instead of with a shapefile). This is a slightly modified version of @Wiebe's answer, without having to go search out what @Auriel Fournier's original data looked like. Thank you @Wiebe - your answer helped me solve my problem too.
The data look like this:
dat
name Latitude Longitude
1 AP1402_C110617 -78.43262 45.45142
2 AP1402_C111121 -78.42433 45.47371
3 AP1402_C111617 -78.41053 45.45600
4 AP1402_C112200 -78.42115 45.53047
5 AP1402_C112219 -78.41262 45.50071
6 AP1402_C112515 -78.42140 45.43471
Code to get it into GPX for MapSource using writeOGR:
setwd("C:\\Users\\YourfileLocation")
dat <- structure(list(name = c("AP1402_C110617", "AP1402_C111121", "AP1402_C111617",
"AP1402_C112200", "AP1402_C112219", "AP1402_C112515"), Latitude = c(-78.4326169598409,
-78.4243276812641, -78.4105301310195, -78.4211498660601, -78.4126208020092,
-78.4214041610924), Longitude = c(45.4514150332163, 45.4737126348589,
45.4560042609868, 45.5304703938887, 45.5007103937952, 45.4347135938299
)), .Names = c("name", "Latitude", "Longitude"), row.names = c(NA,
6L), class = "data.frame")
library(rgdal)
library(sp)
dat <- SpatialPointsDataFrame(data=dat, coords=dat[,2:3], proj4string=CRS("+proj=longlat +datum=WGS84"))
writeOGR(dat,
         dsn="TEST3234.gpx", layer="waypoints", driver="GPX",
         dataset_options="GPX_USE_EXTENSIONS=yes")
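As a quick sanity check (my addition, not in the original answer, and assuming the working directory set above), you can read the waypoints back in and inspect the names:
# hypothetical check: read the GPX file back and look at the name field
chk <- readOGR(dsn="TEST3234.gpx", layer="waypoints")
head(chk@data$name)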

Related

Need help in aligning the content in python for self automation

I am trying to create an anime series search using the tool anilistpython, but I am not able to ignore the newline characters in the plot tag and need help aligning the output in a proper view format.
The code I tried:
from AnilistPython import Anilist
import pandas as pd
import re

# db access online
anilist = Anilist()

# User input
ani_search = anilist.get_anime(input('Enter the Anime Name\t:\t'), manual_select=True)
df = ani_search

# for Genres split
cate = []
for gen in df['genres']:
    cate.append(gen)
cate1 = (' , '.join(cate))

# for Checking Episode
if df['airing_status'] == 'RELEASING':
    print('Ongoing')
    x = 'Ongoing'
    y = df['next_airing_ep']
    print(y['episode'])
    y1 = y['episode']
elif df['airing_status'] == 'FINISHED':
    print('Ended')
    x = 'Ended'
    y = df['airing_episodes']
    print(y)
    y1 = y
else:
    print('None')

# print other details
print(f"\nTitle_Name\t:\t{df['name_english']}\nRomji_Title\t:\t{df['name_romaji']}\nPlot\t:\t{re.split('<br>', df['desc'])}\nAiring_Format\t:\t{df['airing_format']}\nStatus\t:\t{x}\nEpisodes_Count\t:\t{y1}\nGenres\t:\t{cate1}\nRating\t:\t{df['average_score']}/100\n")
The output it generated:
Enter the Anime Name : Bleach
1. BLEACH
2. BEACH
3. Akkanbee da
Please select the anime that you are searching for in number: 1
Title_Name : Bleach
Romji_Title : BLEACH
Plot : ["Ichigo Kurosaki is a rather normal high school student apart from the fact he has the ability to see ghosts. This ability never impacted his life in a major way until the day he encounters the Shinigami Kuchiki Rukia, who saves him and his family's lives from a Hollow, a corrupt spirit that devours human souls. \n", '', '\nWounded during the fight against the Hollow, Rukia chooses the only option available to defeat the monster and passes her Shinigami powers to Ichigo. Now forced to act as a substitute until Rukia recovers, Ichigo hunts down the Hollows that plague his town. \n\n\n']
Airing_Format : TV
Status : Ended
Episodes_Count : 366
Genres : Action , Adventure , Supernatural
Rating : 76/100
I want the output to look like this:
Title_Name : Bleach
Romji_Title : BLEACH
Plot : Ichigo Kurosaki is a rather normal high school student apart from the fact he has the ability to see ghosts. This ability never impacted his life in a major way until the day he encounters the Shinigami Kuchiki Rukia, who saves him and his family's lives from a Hollow, a corrupt spirit that devours human souls. Wounded during the fight against the Hollow, Rukia chooses the only option available to defeat the monster and passes her Shinigami powers to Ichigo. Now forced to act as a substitute until Rukia recovers, Ichigo hunts down the Hollows that plague his town.
Airing_Format : TV
Status : Ended
Episodes_Count : 366
Genres : Action , Adventure , Supernatural
Rating : 76/100

GATK GnarlyGenotyper limit of alleles

I am joint calling 167 samples with GATK GenomicsDBImport, but I got this error:
Sample/Callset 45( TileDB row idx 107) at Chromosome Chr1 position
1320197 (TileDB column 247913574) has too many genotypes in the
combined VCF record : 1081 : current limit : 1024 (num_alleles,
ploidy) = (46, 2). Fields, such as PL, with length equal to the
number of genotypes will NOT be added for this sample for this
location.
Following the advice I found at the link below, I decided to use GnarlyGenotyper to call the variants, as it seems to handle more alleles.
https://gatk.broadinstitute.org/hc/en-us/community/posts/360072168712-GenomicsDBImport-Attempting-to-genotype-more-than-50-alleles?page=1#community_comment_360012343671
I ran the following script, with the option that should allow more alleles:
~/gatk-4.2.0.0/gatk GnarlyGenotyper \
-R "$reference" \
-V gendb://GenomicsDBImport_GATK \
--max-alternate-alleles 100 \
-O GenotypeGVCFs_gnarly.vcf
Unfortunately I got the following error as well:
Chromosome Chr1 position 198912 (TileDB column 246792289) has too many
alleles in the combined VCF record : 7 : current limit : 6. Fields,
such as PL, with length equal to the number of genotypes will NOT be
added for this location.
Has anyone already used this tool? Is it possible to input more alleles?

Why does gnuplot round the data in the vertical-axis column?

My old script worked fine years ago.
set terminal png background "#ffffff" enhanced fontscale 2.0 size 1800, 1400
set output 'delete.png'
w=1
x=1
z = 60
y=2
plot 'plot.in.tmp' using (column(x)/z):(column(y)) axis x1y1 with lines
exit gnuplot
reset
Now the result is a graph with only rounded integer values on the y (vertical) axis. I don't understand why.
Example data in file:
0 -0,00 0,5 570,2 11,98 -0,121 0,000 9,6
5 -0,00 0,7 570,2 11,97 -0,002 0,012 13,2
10 -0,00 0,9 570,3 11,98 -0,004 -0,000 16,1
15 0,24 35,9 570,4 11,96 0,001 0,000 18,4
20 0,56 87,0 570,1 11,99 -0,001 -0,000 20,5
25 1,03 173,5 570,4 11,97 -0,000 0,000 23,2
30 1,61 296,4 570,3 11,96 0,002 0,000 12,4
35 2,17 422,6 570,2 11,68 0,004 0,000 8,8
40 2,81 571,6 570,2 11,37 0,010 0,001 7,5
45 3,52 752,3 570,3 11,26 0,015 0,000 7,1
50 3,97 905,0 570,2 11,69 0,075 0,006 7,4
55 4,36 1048,4 570,1 11,36 0,081 0,001 8,6
60 4,59 1156,8 570,2 11,22 0,087 0,001 10,7
Result graph:
Welcome to Stack Overflow! Maybe the locale setting of your system (or something in gnuplot) has changed?
The following works for me with your data.
Add a line
set decimalsign locale "german"
or
set decimalsign locale "french"
Check help decimalsign.
Syntax:
set decimalsign {<value> | locale {"<locale>"}}
Correct typesetting in most European countries requires:
set decimalsign ','
Please note: If you set an explicit string, this affects only numbers
that are printed using gnuplot's gprintf() formatting routine,
including axis tics. It does not affect the format expected for input
data, and it does not affect numbers printed with the sprintf()
formatting routine.
The answer given by theozh is correct, but it does not point out the unfortunate lack of standardization in how different operating systems report the current locale setting. For Linux machines the locale strings are less human-friendly. For example, instead of using something generic like "french", they subdivide into "fr_FR.UTF-8", "fr_BE.UTF-8", "fr_LU.UTF-8", etc. to account for slight differences in the conventions used in France, Belgium, Luxembourg, and so on.
I cannot tell you the exact set of locale descriptions on your machine, but here is what works for me on a linux machine:
set decimalsign locale "fr_FR.UTF-8"
w=1
x=1
z = 60
y=2
plot 'plot.in.tmp' using (column(x)/z):(column(y)) axis x1y1 with lines

linearK error in seq.default(): 'to' cannot be NA, NaN

I am trying to learn linearK estimation on a small linnet object from the CRC spatstat book (chapter 17), and when I use the linearK function, spatstat throws an error. I have documented the process in the comments in the R code below. The error is:
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I do not understand how to resolve this. I am following this process:
# I have data of points for each day of the week
# d1 is district 1 of the city.
# I did the step below because otherwise it was giving me a tbl class
d1_data = lapply(split(d1, d1$openDatefactor), as.data.frame)
# I previously created a linnet and divided it into districts of the city
d1_linnet = districts_linnet[["d1"]]
# I create a point pattern for each day
d1_ppp = lapply(d1_data, function(x) as.ppp(x, W=Window(d1_linnet)))
plot(d1_ppp[[1]], which.marks="type")
# I then convert the point pattern to a point pattern on a linear network
d1_lpp <- as.lpp(d1_ppp[[1]], L=d1_linnet, W=Window(d1_linnet))
d1_lpp
Point pattern on linear network
3 points
15 columns of marks: ‘status’, ‘number_of_’, ‘zip’, ‘ward’,
‘police_dis’, ‘community_’, ‘type’, ‘days’, ‘NAME’,
‘DISTRICT’, ‘openDatefactor’, ‘OpenDate’, ‘coseDatefactor’,
‘closeDate’ and ‘instance’
Linear network with 4286 vertices and 6183 lines
Enclosing window: polygonal boundary
enclosing rectangle: [441140.9, 448217.7] x [4640080, 4652557] units
# the errors start from plotting this lpp object
plot(d1_lpp)
"show.all" is not a graphical parameter
Error in plot.window(...) : need finite 'xlim' values
coords(d1_lpp)
x y seg tp
441649.2 4649853 5426 0.5774863
445716.9 4648692 5250 0.5435492
444724.6 4646320 677 0.9189631
3 rows
And then, consequently, I also get an error from linearK(d1_lpp):
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I feel the lpp object has the problem, but I find it hard to interpret these errors and work out how to resolve them. Could someone please guide me?
Thanks
I can confirm there is a bug in plot.lpp when trying to plot the marked point pattern on the linear network. That will hopefully be fixed soon. You can plot the unmarked point pattern using
plot(unmark(d1_lpp))
I cannot reproduce the problem with linearK. Which version of spatstat are you running? In the development version on my laptop, spatstat_1.51-0.073, everything works. There have been changes to this code recently, so it is likely that this will be solved by updating to the development version (see https://github.com/spatstat/spatstat).
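For completeness, a minimal sketch of updating to the development version mentioned above (my addition; it assumes installing from GitHub via the remotes package is acceptable on your system):
# hypothetical update route: install the development spatstat from GitHub,
# then confirm the version and retry the plot on the unmarked pattern
install.packages("remotes")
remotes::install_github("spatstat/spatstat")
packageVersion("spatstat")
plot(unmark(d1_lpp))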

svm-train output file has fewer lines than the input file

I am currently building a binary classification model and have created an input file for svm-train (svm_input.txt). This input file has 453 lines, 4 features and 2 classes [0, 1].
i.e
0 1:15.0 2:40.0 3:30.0 4:15.0
1 1:22.73 2:40.91 3:36.36 4:0.0
1 1:31.82 2:27.27 3:22.73 4:18.18
0 1:22.73 2:13.64 3:36.36 4:27.27
1 1:30.43 2:39.13 3:13.04 4:17.39 ...
My problem is that when I count the number of lines in the output model generated by svm-train (svm_train_model.txt), it has 12 fewer lines than the input file. The line count here shows 450, although there are also 9 lines at the beginning showing the generated parameters,
i.e.
svm_type c_svc
kernel_type rbf
gamma 1
nr_class 2
total_sv 441
rho -0.156449
label 0 1
nr_sv 228 213
SV
Therefore 12 lines in total from the original input of 453 have gone. I am new to SVMs and was hoping that someone could shed some light on why this might have happened.
Thanks in advance
Update:
I now believe that, in generating the model, it has removed lines where the label and all the feature values are exactly the same.
To explain: my input is a set of miRNAs which have been classified as 1 or 0 depending on whether or not they are involved in a particular process (i.e. 1 = Yes, 0 = No). The input file looks something like:
0 1:22 2:30 3:14 4:16
1 1:26 2:15 3:17 4:25
0 1:22 2:30 3:14 4:16
Here, lines one and three are exactly the same, and as a result one of them is removed from the output model. My question is then both why the output model would do this and how I can get around it (whilst using the same features).
Whilst some of the labels and their corresponding feature values are identical within the input file, these are still different miRNAs.
NOTE: The input file does not have a feature for the miRNA name (which would clearly show the differences between lines). However, in terms of the features used (i.e. nucleotide percentage content), some of the miRNAs do have exactly the same percentage content of A, U, G and C, and as a result they are treated as duplicates and removed from the output model, even though they are not duplicates (hence there are fewer lines in the output model).
The format of the input file is:
Where:
Column 0 - label (i.e 1 or 0): 1=Yes & 0=No
Column 1 - Feature 1 = Percentage Content "A"
Column 2 - Feature 2 = Percentage Content "U"
Column 3 - Feature 3 = Percentage Content "G"
Column 4 - Feature 4 = Percentage Content "C"
The input file actually looks something like this (see the very first two lines below: they appear identical, but each represents a different miRNA):
1 1:23 2:36 3:23 4:18
1 1:23 2:36 3:23 4:18
0 1:36 2:32 3:5 4:27
1 1:14 2:41 3:36 4:9
1 1:18 2:50 3:18 4:14
0 1:36 2:23 3:23 4:18
0 1:15 2:40 3:30 4:15
In terms of software, I am using libsvm-3.22 and Python 2.7.5.
My first observation is to align your input file properly. The libsvm code does not look for exactly 4 features; it identifies them by the string literals you have provided separating the features from the labels. I suggest manually converting your input file to create the desired input.
Try the following Python code. Requirement: h5py, if your input comes from MATLAB (.mat files):
pip install h5py
import h5py
import numpy as np

f = h5py.File('traininglabel.mat', 'r')  # give label.mat file for training
variables = f.items()
labels = []
c = []
for var in variables:
    data = var[1]
    lables = (data.value[0])

trainlabels = []
for i in lables:
    trainlabels.append(str(i))

finaltrain = []
trainlabels = np.array(trainlabels)
for i in range(0, len(trainlabels)):
    if trainlabels[i] == '0.0':
        trainlabels[i] = '0'
    if trainlabels[i] == '1.0':
        trainlabels[i] = '1'
    print trainlabels[i]

f = h5py.File('training_features.mat', 'r')  # give features here
variables = f.items()
lables = []
file = open('traindata.txt', 'w+')
for var in variables:
    data = var[1]
    lables = data.value

for i in range(0, 1000):  # no of training samples in file features.mat
    file.write(str(trainlabels[i]))
    file.write(' ')
    for j in range(0, 49):
        file.write(str(lables[j][i]))
        file.write(' ')
    file.write('\n')
