I have an array of Polygons. I need to convert the array into a MultiPolygon.
["POLYGON ((-93.8153401599999 31.6253224010001, -93.8154545089999 31.613245482, -93.8256952309999 31.6133096470001, -93.8239846819999 31.6142335050001, -93.822649241 31.614534889, -93.819589744 31.6141266810001, -93.8187199179999 31.6145615630001, -93.818796329 31.6166099970001, -93.8191396409999 31.616805696, -93.822160944 31.6185287610001, -93.8259606669999 31.6195415540001, -93.827173805 31.6202834370001, -93.826861 31.621054014, -93.826721397 31.6210996090001, -93.825838469 31.621387795, -93.823763302 31.620645804, -93.8224278609999 31.620880388, -93.8207344099999 31.6214468590001, -93.817712918 31.621645233, -93.8171636009999 31.6218779230001, -93.8170138 31.622175612, -93.816896795 31.622408104, -93.816843193 31.622514901, -93.8172703129999 31.623758464, -93.817027909 31.6250143240001, -93.816942408 31.624910524, -93.8153401599999 31.6253224010001))", "POLYGON ((-93.827875499 31.6135011530001, -93.8276549939999 31.6133218590001, -93.830593683 31.613340276, -93.827860513 31.616556659, -93.825911348 31.6159317660001, -93.825861447 31.615915767, -93.826296355 31.6149087000001, -93.8272805829999 31.614407122, -93.827341685 31.6143140250001, -93.827875499 31.6135011530001))"]
I am using the following code to build the MultiPolygon with Apache Sedona:
select FID,ST_Multi(ST_GeomFromText(collect_list(polygon))) polygon_list group by 1
I get the error "org.apache.spark.sql.catalyst.util.GenericArrayData cannot be cast to org.apache.spark.unsafe.types.UTF8String". How can I overcome this issue? Can the same thing be achieved using GeoPandas or Shapely?
The answer given by @Antoine B is a very good attempt, but it won't work with polygons that have holes in them. There is another approach that works with such polygons, and the code is easier to comprehend.
from shapely.geometry import MultiPolygon
from shapely.wkt import loads
# List of strings representing polygons
poly_string = ["POLYGON ((-93.8153401599999 31.6253224010001, -93.8154545089999 31.613245482, -93.8256952309999 31.6133096470001, -93.8239846819999 31.6142335050001, -93.822649241 31.614534889, -93.819589744 31.6141266810001, -93.8187199179999 31.6145615630001, -93.818796329 31.6166099970001, -93.8191396409999 31.616805696, -93.822160944 31.6185287610001, -93.8259606669999 31.6195415540001, -93.827173805 31.6202834370001, -93.826861 31.621054014, -93.826721397 31.6210996090001, -93.825838469 31.621387795, -93.823763302 31.620645804, -93.8224278609999 31.620880388, -93.8207344099999 31.6214468590001, -93.817712918 31.621645233, -93.8171636009999 31.6218779230001, -93.8170138 31.622175612, -93.816896795 31.622408104, -93.816843193 31.622514901, -93.8172703129999 31.623758464, -93.817027909 31.6250143240001, -93.816942408 31.624910524, -93.8153401599999 31.6253224010001))", "POLYGON ((-93.827875499 31.6135011530001, -93.8276549939999 31.6133218590001, -93.830593683 31.613340276, -93.827860513 31.616556659, -93.825911348 31.6159317660001, -93.825861447 31.615915767, -93.826296355 31.6149087000001, -93.8272805829999 31.614407122, -93.827341685 31.6143140250001, -93.827875499 31.6135011530001))"]
# Create a list of polygons from the list of strings
all_pgons = [loads(pgon) for pgon in poly_string]
# Create the required multipolygon
multi_pgon = MultiPolygon(all_pgons)
Here is a list of polygon strings with holes:
# List of polygons with hole
poly_string = ['POLYGON ((1 2, 1 5, 4 4, 1 2), (1.2 3, 3 4, 1.3 4, 1.2 3))',
'POLYGON ((11 12, 11 15, 14 14, 11 12), (11.2 13, 13 14, 11.3 14, 11.2 13))']
The code above also works well in this case.
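To illustrate why holes matter when parsing, here is a minimal plain-Python sketch of how a WKT polygon string breaks down into rings: the first parenthesised ring is the exterior and any further rings are holes. The helper name `wkt_polygon_rings` is hypothetical and only handles simple 2D `POLYGON` strings like the ones above; `loads` remains the more robust choice.

```python
import re


def wkt_polygon_rings(wkt):
    """Split a 2D 'POLYGON ((...), (...))' string into a list of rings.

    Hypothetical helper for illustration only; it does no validation.
    """
    rings = []
    # Each innermost parenthesised group is one ring of "x y" pairs.
    for ring_text in re.findall(r"\(([^()]+)\)", wkt):
        ring = [tuple(float(v) for v in pair.split())
                for pair in ring_text.split(",")]
        rings.append(ring)
    return rings


rings = wkt_polygon_rings(
    "POLYGON ((1 2, 1 5, 4 4, 1 2), (1.2 3, 3 4, 1.3 4, 1.2 3))"
)
shell, holes = rings[0], rings[1:]  # exterior ring, then interior rings
```

With Shapely installed, `Polygon(shell, holes)` would rebuild the polygon with its hole. A parser that appends every coordinate to a single list loses that distinction between the exterior and the holes.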
A MultiPolygon is just a list of Polygons, so you need to reconstruct every Polygon in a list and then pass that list to MultiPolygon.
With the string format you gave, I got it to work like this:
from shapely.geometry import Polygon, MultiPolygon
poly_string = ["POLYGON ((-93.8153401599999 31.6253224010001, -93.8154545089999 31.613245482, -93.8256952309999 31.6133096470001, -93.8239846819999 31.6142335050001, -93.822649241 31.614534889, -93.819589744 31.6141266810001, -93.8187199179999 31.6145615630001, -93.818796329 31.6166099970001, -93.8191396409999 31.616805696, -93.822160944 31.6185287610001, -93.8259606669999 31.6195415540001, -93.827173805 31.6202834370001, -93.826861 31.621054014, -93.826721397 31.6210996090001, -93.825838469 31.621387795, -93.823763302 31.620645804, -93.8224278609999 31.620880388, -93.8207344099999 31.6214468590001, -93.817712918 31.621645233, -93.8171636009999 31.6218779230001, -93.8170138 31.622175612, -93.816896795 31.622408104, -93.816843193 31.622514901, -93.8172703129999 31.623758464, -93.817027909 31.6250143240001, -93.816942408 31.624910524, -93.8153401599999 31.6253224010001))", "POLYGON ((-93.827875499 31.6135011530001, -93.8276549939999 31.6133218590001, -93.830593683 31.613340276, -93.827860513 31.616556659, -93.825911348 31.6159317660001, -93.825861447 31.615915767, -93.826296355 31.6149087000001, -93.8272805829999 31.614407122, -93.827341685 31.6143140250001, -93.827875499 31.6135011530001))"]
polygons = []
for poly in poly_string:
    coordinates = []
    for s in poly.split('('):
        if len(s.split(')')) > 1:
            for c in s.split(')')[0].split(','):
                coordinates.append((float(c.lstrip().split(' ')[0]),
                                    float(c.lstrip().split(' ')[1])))
    polygons.append(Polygon(coordinates))
multipoly = MultiPolygon(polygons)
The resulting MultiPolygon looks like this:
I would try
select
FID,
ST_Multi(ST_Collect(ST_GeomFromText(polygon))) polygon_list
group by 1
I have 3000 lines of data like this:
['OFFD.271818,271818,"LINESTRING (16.303895355263016 48.18772778239529, 16.304571765172827 48.18758202488568, 16.30482300975865 48.18755484403183, 16.305031079294384 48.187546649202545, 16.30536730486924 48.187533407177206, 16.307523452290432 48.18753396398144, 16.309072536732444 48.18748514596115, 16.312777938045286 48.18734458451529, 16.313426882251083 48.18727411748434, 16.315405366265555 48.186920966444205, 16.316609208646593 48.18670268519608, 16.317260447683868 48.18652861710351, 16.31853471535412 48.186166775088815)",U4,4,U-Bahn,']
I want to use matplotlib to create a plot, but I need the X and Y coordinates.
The target is: U4 (from the line)
Coordinates are:
16.303895355263016 48.18772778239529, 16.304571765172827 48.18758202488568, 16.30482300975865 48.18755484403183, 16.305031079294384 48.187546649202545, 16.30536730486924 48.187533407177206, 16.307523452290432 48.18753396398144, 16.309072536732444 48.18748514596115, 16.312777938045286 48.18734458451529, 16.313426882251083 48.18727411748434, 16.315405366265555 48.186920966444205, 16.316609208646593 48.18670268519608, 16.317260447683868 48.18652861710351, 16.31853471535412 48.186166775088815
I do not understand how to parse this string with numpy and create the dataset:
U4: coordinates for X: ... and for Y:....
Any hints?
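One possible approach, sketched with only the standard library: read each record with `csv` (so the quoted LINESTRING field stays in one piece), then split the coordinate text into X and Y lists. The field positions (geometry in the third column, line name in the fourth) are an assumption based on the single sample record above, and `parse_record` is a hypothetical helper name. The sample string is shortened to three points.

```python
import csv

# One raw record in the same shape as the sample above (shortened to 3 points).
raw = ('OFFD.271818,271818,"LINESTRING (16.303895355263016 48.18772778239529, '
       '16.304571765172827 48.18758202488568, '
       '16.30482300975865 48.18755484403183)",U4,4,U-Bahn,')


def parse_record(line):
    """Return (line_name, xs, ys) for one CSV record holding a LINESTRING."""
    fields = next(csv.reader([line]))
    wkt, name = fields[2], fields[3]  # assumed column positions
    # Keep only the text between the outer parentheses.
    body = wkt[wkt.index("(") + 1:wkt.rindex(")")]
    pairs = [pair.split() for pair in body.split(",")]
    xs = [float(x) for x, _ in pairs]
    ys = [float(y) for _, y in pairs]
    return name, xs, ys


name, xs, ys = parse_record(raw)
```

The resulting lists can be passed straight to matplotlib, for example `plt.plot(xs, ys, label=name)`, or wrapped in `numpy.array(...)` if arrays are needed.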
Previously I used the following to calculate the ewma
dataset['26ema'] = pd.ewma(dataset['price'], span=26)
But in the latest version of pandas, pd.ewma has been removed. How do I calculate it using the new method dataframe.ewma?
dataset['26ema'] = dataset['price'].ewma(span=26)
This gives the error: AttributeError: 'Series' object has no attribute 'ewma'
Use Series.ewm:
dataset['26ema'] = dataset['price'].ewm(span=26).mean()
See GH11603 for the relevant PR and mapping of the old API to new ones.
Minimal Code Example
import pandas as pd
s = pd.Series(range(5))
s.ewm(span=3).mean()
0 0.000000
1 0.666667
2 1.428571
3 2.266667
4 3.161290
dtype: float64
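For intuition about what `ewm(span=...).mean()` computes, here is a plain-Python sketch of the weighted average pandas uses with its default `adjust=True`, where `alpha = 2 / (span + 1)` and each older observation is down-weighted by a further factor of `(1 - alpha)`. The function name `ewma_adjusted` is hypothetical; this is an illustration, not a replacement for the pandas call.

```python
def ewma_adjusted(values, span):
    """Exponentially weighted mean, normalised by the sum of weights.

    Hypothetical helper mirroring ewm(span=span).mean() with adjust=True.
    """
    alpha = 2.0 / (span + 1.0)
    result = []
    numerator = 0.0    # running sum of weight * value
    denominator = 0.0  # running sum of weights
    for x in values:
        numerator = x + (1.0 - alpha) * numerator
        denominator = 1.0 + (1.0 - alpha) * denominator
        result.append(numerator / denominator)
    return result


smoothed = ewma_adjusted(range(5), span=3)
```

For `range(5)` and `span=3` this gives the same values as the `s.ewm(span=3).mean()` output shown above.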
I am trying to learn linearK estimates on a small linnet object from the CRC spatstat book (chapter 17), and when I use the linearK function, spatstat throws an error. I have documented the process in the comments in the R code below. The error is as follows:
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I do not understand how to resolve this. I am following this process:
# I have data of points for each day of the week
# d1 is district 1 of the city.
# I did the step below otherwise it was giving me tbl class
d1_data=lapply(split(d1, d1$openDatefactor),as.data.frame)
# I previously created a linnet and divided it into districts of the city
d1_linnet = districts_linnet[["d1"]]
# I create point pattern for each day
d1_ppp = lapply(d1_data, function(x) as.ppp(x, W=Window(d1_linnet)))
plot(d1_ppp[[1]], which.marks="type")
# I am then converting the point pattern to a point pattern on linear network
d1_lpp <- as.lpp(d1_ppp[[1]], L=d1_linnet, W=Window(d1_linnet))
d1_lpp
Point pattern on linear network
3 points
15 columns of marks: ‘status’, ‘number_of_’, ‘zip’, ‘ward’,
‘police_dis’, ‘community_’, ‘type’, ‘days’, ‘NAME’,
‘DISTRICT’, ‘openDatefactor’, ‘OpenDate’, ‘coseDatefactor’,
‘closeDate’ and ‘instance’
Linear network with 4286 vertices and 6183 lines
Enclosing window: polygonal boundary
enclosing rectangle: [441140.9, 448217.7] x [4640080, 4652557] units
# the errors start from plotting this lpp object
plot(d1_lpp)
"show.all" is not a graphical parameter
Error in plot.window(...) : need finite 'xlim' values
coords(d1_lpp)
x y seg tp
441649.2 4649853 5426 0.5774863
445716.9 4648692 5250 0.5435492
444724.6 4646320 677 0.9189631
3 rows
Consequently, I also get an error from linearK(d1_lpp):
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I suspect the lpp object is the problem, but I find it hard to interpret the errors and work out how to resolve them. Could someone please guide me?
Thanks
I can confirm there is a bug in plot.lpp when trying to plot the marked point pattern on the linear network. That will hopefully be fixed soon. You can plot the unmarked point pattern using
plot(unmark(d1_lpp))
I cannot reproduce the problem with linearK. Which version of spatstat are you running? In the development version on my laptop, spatstat_1.51-0.073, everything works. There have been changes to this code recently, so it is likely that this will be solved by updating to the development version (see https://github.com/spatstat/spatstat).
I'm struggling with getting a simple correlation done. I've tried all that was suggested under similar questions.
Here are the relevant parts of the code, the various attempts I've made and their results.
import numpy as np
import pandas as pd
try01 = data[['ESA Index_close_px', 'CCMP Index_close_px' ]].corr(method='pearson')
print (try01)
Out:
Empty DataFrame
Columns: []
Index: []
try04 = data['ESA Index_close_px'][5:50].corr(data['CCMP Index_close_px'][5:50])
print (try04)
Out:
AttributeError: 'float' object has no attribute 'sqrt'
Using numpy:
try05 = np.corrcoef(data['ESA Index_close_px'],data['CCMP Index_close_px'])
print (try05)
Out:
AttributeError: 'float' object has no attribute 'sqrt'
Converting the columns to lists:
ESA_Index_close_px_list = list()
start_value = 1
end_value = len (data['ESA Index_close_px']) +1
for items in data['ESA Index_close_px']:
    ESA_Index_close_px_list.append(items)
    start_value = start_value+1
    if start_value == end_value:
        break
    else:
        continue
CCMP_Index_close_px_list = list()
start_value = 1
end_value = len (data['CCMP Index_close_px']) +1
for items in data['CCMP Index_close_px']:
    CCMP_Index_close_px_list.append(items)
    start_value = start_value+1
    if start_value == end_value:
        break
    else:
        continue
try06 = np.corrcoef(['ESA_Index_close_px_list','CCMP_Index_close_px_list'])
print (try06)
Out:
TypeError: cannot perform reduce with flexible type
I also tried .astype, but it made no difference.
data['ESA Index_close_px'].astype(float)
data['CCMP Index_close_px'].astype(float)
Using Python 3.5, pandas 0.18.1 and numpy 1.11.1
I would really appreciate any suggestions.
Edit 1:
The data comes from an Excel spreadsheet:
data = pd.read_excel('C:\\Users\\Ako\\Desktop\\ako_files\\for_corr_tool.xlsx')
Prior to the correlation attempts, there are only column renames and
data = data.drop(data.index[0])
to get rid of a line.
Regarding the types:
print (type (data['ESA Index_close_px']))
print (type (data['ESA Index_close_px'][1]))
Out:
Edit 2:
Part of the data:
print (data['ESA Index_close_px'][1:10])
print (data['CCMP Index_close_px'][1:10])
Out:
2 2137
3 2138
4 2132
5 2123
6 2127
7 2126.25
8 2131.5
9 2134.5
10 2159
Name: ESA Index_close_px, dtype: object
2 5241.83
3 5246.41
4 5243.84
5 5199.82
6 5214.16
7 5213.33
8 5239.02
9 5246.79
10 5328.67
Name: CCMP Index_close_px, dtype: object
I encountered the same problem today.
Try using .astype('float64') to get the correct dtype:
data['ESA Index_close_px'][5:50].astype('float64').corr(data['CCMP Index_close_px'][5:50].astype('float64'))
This works well for me. Hope it can help you as well.
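Some background on why the cast helps: the printed columns show `dtype: object`, and NumPy's math routines fall back to calling methods such as `.sqrt` on each element of an object array, which is where the `'float' object has no attribute 'sqrt'` message comes from. Also note that `astype` returns a new Series, so calling it without assigning or chaining the result leaves the original column unchanged. As a rough plain-Python sketch of the same idea (cast every value to float first, then compute Pearson's r), using the nine sample rows from the question; the function name `pearson` is hypothetical:

```python
import math


def pearson(xs, ys):
    """Pearson correlation after casting every value to float."""
    xs = [float(x) for x in xs]
    ys = [float(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Sample values copied from the question; note the mix of ints and floats.
esa = [2137, 2138, 2132, 2123, 2127, 2126.25, 2131.5, 2134.5, 2159]
ccmp = [5241.83, 5246.41, 5243.84, 5199.82, 5214.16,
        5213.33, 5239.02, 5246.79, 5328.67]
r = pearson(esa, ccmp)
```

In pandas itself, `pd.to_numeric(column, errors='coerce')` is another way to get a numeric column; it turns values that cannot be parsed into NaN instead of raising.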
You can try the following:
Top15['Citable docs per capita']=(Top15['Citable docs per capita']*100000)
Top15['Citable docs per capita'].astype('int').corr(Top15['Energy Supply per Capita'].astype('int'))
It worked for me.