How can I optimize a Dijkstra path-finding query for speed - node.js

Using PostgreSQL, PostGIS, pgRouting and Node.js, I am working on a project that finds a path between shops.
There are three tables in my database:
1. CREATE TABLE public."edges" (id int, name varchar(100), highway varchar(100), oneway varchar(100), surface varchar(100), the_geom geometry, source int, target int);
2. CREATE TABLE public."edges_noded" (id bigint, old_id int, sub_id int, source bigint, target bigint, the_geom geometry, name varchar(100), type varchar(100), distance double precision);
3. CREATE TABLE public."edges_noded_vertices_pgr" (id bigint, cnt int, chk int, ein int, eout int, the_geom geometry);
And this is the query by which I am finding a path:
client.query(
  `WITH dijkstra AS (
     SELECT * FROM pgr_dijkstra(
       'SELECT id, source, target, distance AS cost FROM edges_noded',
       $1, $2, FALSE)
   )
   SELECT seq,
          CASE WHEN dijkstra.node = edges_noded.source
               THEN ST_AsGeoJSON(edges_noded.the_geom)
               ELSE ST_AsGeoJSON(ST_Reverse(edges_noded.the_geom))
          END AS route_geom_x,
          CASE WHEN dijkstra.node = edges_noded.source
               THEN ST_AsGeoJSON(edges_noded.the_geom)
               ELSE ST_AsGeoJSON(ST_Reverse(edges_noded.the_geom))
          END AS route_geom_y
   FROM dijkstra
   JOIN edges_noded ON (dijkstra.edge = edges_noded.id)
   ORDER BY seq`,
  [source, target],
  (err, res) => { })
This query works for me, but it takes too much time: for example, finding a path between 30 shops takes almost 25 to 30 seconds, which is too much.
While searching for a solution I found this link:
https://gis.stackexchange.com/questions/16886/how-can-i-optimize-pgrouting-for-speed/16888
There, Délawen suggests using ST_Buffer so the inner query doesn't load all ways, but only the "nearby" ones.
I tried to apply ST_Buffer to the query above but had no success.
If someone has any idea, please help me with this problem.
If this approach is wrong, please also tell me the right way.
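For reference, a minimal sketch of Délawen's buffer idea applied to the tables above: restrict the edge set passed to pgr_dijkstra to edges near the straight line between the start and end vertices. The vertex ids 1 and 2 and the 10000 buffer radius are placeholder assumptions (the radius is in your SRID's units, so tune it to your data), and a GiST index on edges_noded.the_geom is assumed so the && bounding-box filter is fast:

```sql
-- Assumed index: CREATE INDEX ON edges_noded USING gist (the_geom);
SELECT seq, node, edge, cost
FROM pgr_dijkstra(
  'SELECT e.id, e.source, e.target, e.distance AS cost
   FROM edges_noded e,
        (SELECT ST_Buffer(ST_MakeLine(a.the_geom, b.the_geom), 10000) AS geom
         FROM edges_noded_vertices_pgr a, edges_noded_vertices_pgr b
         WHERE a.id = 1 AND b.id = 2) box  -- 1 and 2: your source/target ids
   WHERE e.the_geom && box.geom',
  1, 2, FALSE);
```

With the edge subquery reduced this way, Dijkstra only builds a graph from the nearby ways instead of the whole network, which is where most of the time usually goes.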

Related

Can we use more than one Cassandra CQL collection (set, list, map) in a single query?

create table seller(
seller_id int primary key,
seller_name text,
seller_email set<text>,
seller_address map<text>,
seller_phone list<text>,
product_id int,
product_title_text,
product_description text,
product_trackno int,
product_bidoption text,
bid_startdate date,
bid_closedate date,
bid_startprice int,
bid_withdrawdate date);
SyntaxException: line 1:110 mismatched input '>' expecting ',' (...<text>,
seller_address map<text[>],...)
What changes should be made in order to execute?
Of course you can, with some adjustments:
1) The column name must be separated from its type by a space, not joined to it with an underscore. Instead of:
product_title_text,
This will work:
product_title text,
2) You'll also need to provide both types for the map collection. Instead of:
seller_address map<TEXT>,
This will work:
seller_address map<TEXT,TEXT>,
Full CQL:
create table seller(
seller_id int primary key,
seller_name text,
seller_email set<TEXT>,
seller_address map<TEXT,TEXT>,
seller_phone list<TEXT>,
product_id int,
product_title text,
product_description text,
product_trackno int,
product_bidoption text,
bid_startdate date,
bid_closedate date,
bid_startprice int,
bid_withdrawdate date);
Also, are you really only ever going to query this table by seller_id? If not, you may want to rethink the primary key definition.

SyntaxException: while creating Cassandra Table

I'm trying to create a simple table in cassandra, this is the command I run,
create table app_instance(app_id int, app_name varchar, proc_id varchar, os_priority int, cpu_time int, num_io_ops int, primary_key (host_id, proc_id)) with clustering order by (proc_id DESC) ;
I get the following error,
SyntaxException: line 1:132 no viable alternative at input '(' (...int, num_io_ops int, primary_key [(]...)
What am I doing wrong here?
It should be primary key, with a space, not primary_key, as ernest_k already noted in a comment.
The way you wrote it,
...cpu_time int, num_io_ops int, primary_key (host_id, proc_id)
the CQL parser thinks that "primary_key" is the name of yet another column, just as num_io_ops was, and now expects to see the name of the type - and doesn't expect an open parenthesis after the "primary_key", and this is exactly what the error message told you (albeit vaguely).
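Putting both fixes together, a corrected statement might look like the sketch below. Note one assumption: the original key references host_id, which is never declared as a column, so it is added here (adjust if app_id was actually intended). Also, CLUSTERING ORDER BY applies to a clustering column, which proc_id is in this key:

```sql
CREATE TABLE app_instance (
    host_id varchar,   -- assumed: referenced in the key but never declared
    app_id int,
    app_name varchar,
    proc_id varchar,
    os_priority int,
    cpu_time int,
    num_io_ops int,
    PRIMARY KEY (host_id, proc_id)   -- "PRIMARY KEY", with a space
) WITH CLUSTERING ORDER BY (proc_id DESC);
```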

PL/Python3 - Concatenate PLyResults

I'm wondering if it's possible to concatenate PLyResults somehow inside a function. For example, let's say that firstly I have a function _get_data that, given a tuple (id, index) returns a table of values:
CREATE OR REPLACE FUNCTION _get_data(id bigint, index bigint)
RETURNS TABLE(oid bigint, id bigint, val double precision) AS
$BODY$
#...process for fetching the data, irrelevant for the question.
return recs
$BODY$
LANGUAGE plpython3u;
Now I would like to be able to create a generic function defined as such, that fetches data between two boundaries for a given ID, and uses the previous function to fetch data individually and then aggregate the results somehow:
CREATE OR REPLACE FUNCTION get_data(id bigint, lbound bigint, ubound bigint)
RETURNS TABLE(oid bigint, id bigint, val double precision) AS
$BODY$
concatenated_recs = [] #<-- For the sake of argument.
plan = plpy.prepare("SELECT oid, id, val FROM _get_data($1, $2);", ['bigint', 'bigint'])
for i in range(lbound, ubound+1):
recs = plpy.execute(plan, [id, i]) # <-- Records fetched individually
concatenated_recs += [recs] #<-- Not sure how to concatenate them...
return concatenated_recs
$BODY$
LANGUAGE plpython3u;
Perhaps I am missing something, but the answer you gave looks like a slower, more complicated version of this query:
SELECT oid, id, val
FROM generate_series(your_lower_bound, your_upper_bound) AS g(i),
_get_data(your_id, i);
You could put that in a simple SQL function with no loops or temporary tables:
CREATE OR REPLACE FUNCTION get_data(id bigint, lbound bigint, ubound bigint)
RETURNS TABLE(oid bigint, id bigint, val double precision) AS
$BODY$
SELECT oid, id, val
FROM generate_series(lbound, ubound) AS g(i),
_get_data(id, i);
$BODY$ LANGUAGE SQL;
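Calling it then works like any set-returning function (42, 1 and 10 are placeholder arguments):

```sql
SELECT * FROM get_data(42, 1, 10);
```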
I wasn't able to find a way to concatenate the results in the PL/Python documentation, and as of 06-2019 I'm not sure the language supports it. However, I could solve it by creating a temp table, inserting the records into it on each iteration, and then returning the full table:
CREATE OR REPLACE FUNCTION get_data(id bigint, lbound bigint, ubound bigint)
RETURNS TABLE(oid bigint, id bigint, val double precision) AS
$BODY$
#Creates the temp table
plpy.execute("""CREATE TEMP TABLE temp_results(oid bigint, id bigint, val double precision)
ON COMMIT DROP""")
plan = plpy.prepare("INSERT INTO temp_results SELECT oid, id, val FROM _get_data($1, $2);",
['bigint', 'bigint'])
#Inserts the results in the temp table
for i in range(lbound, ubound+1):
plpy.execute(plan, [id, i])
#Returns the whole table
recs = plpy.execute("SELECT * FROM temp_results")
return recs
$BODY$
LANGUAGE plpython3u;

Is using all fields of a table as the partition key a drawback in Cassandra?

My aim is to get the msgAddDate using the query below:
select max(msgAddDate)
from sampletable
where reportid = 1 and objectType = 'loan' and msgProcessed = 1;
Design 1 :
Here reportid, objectType and msgProcessed may not be unique, so to add uniqueness I have added msgAddDate and msgProcessedDate (an additional unique value).
I use this design because I don't perform range queries.
Create table sampletable (
reportid INT,
objectType TEXT,
msgAddDate TIMESTAMP,
msgProcessed INT,
msgProcessedDate TIMESTAMP,
PRIMARY KEY ((reportid, msgProcessed, objectType, msgAddDate, msgProcessedDate))
);
Design 2 :
create table sampletable (
reportid INT,
objectType TEXT,
msgAddDate TIMESTAMP,
msgProcessed INT,
msgProcessedDate TIMESTAMP,
PRIMARY KEY ((reportid, msgProcessed, objectType), msgAddDate, msgProcessedDate)
);
Please advise which one to use, and what the pros and cons of each are with regard to performance.
Design 2 is the one you want.
In Design 1, the whole primary key is the partition key, which means you need to provide all of the attributes (reportid, msgProcessed, objectType, msgAddDate, msgProcessedDate) to be able to query your data with a SELECT statement (which wouldn't be useful, as you would not retrieve any attributes beyond the ones you already provided in the WHERE clause).
In Design 2, your partition key is (reportid, msgProcessed, objectType), which are the three attributes you want to query by. Great. msgAddDate is the first clustering column, which will be automatically sorted for you. So you don't even need to run a max since it is sorted. All you need to do is use LIMIT 1:
SELECT msgAddDate FROM sampletable WHERE reportid = 1 and objectType = 'loan' and msgProcessed = 1 LIMIT 1;
Of course, make sure to define a DESC sorted order on msgAddDate (I think by default it is ascending...)
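To make that concrete, here is a sketch of Design 2 with the descending order declared (the DESC on msgProcessedDate is an assumption; only msgAddDate matters for this query):

```sql
CREATE TABLE sampletable (
    reportid INT,
    objectType TEXT,
    msgAddDate TIMESTAMP,
    msgProcessed INT,
    msgProcessedDate TIMESTAMP,
    PRIMARY KEY ((reportid, msgProcessed, objectType), msgAddDate, msgProcessedDate)
) WITH CLUSTERING ORDER BY (msgAddDate DESC, msgProcessedDate DESC);

-- Newest msgAddDate first, so LIMIT 1 returns the maximum:
SELECT msgAddDate FROM sampletable
WHERE reportid = 1 AND objectType = 'loan' AND msgProcessed = 1
LIMIT 1;
```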
Hope it helps!

Row level user permissions, help with design

Say I am creating a forums application, I understand how to design a forum level permission system with Groups.
i.e. you create a forum to group mapping, and assign users to a group to give them access to a particular forum.
How can I refine the permissions to allow for row-level permissions (or, in forum terms, post-level)?
You would do so in a similar manner as you've already described. It'll require a few more joins. Let's say you have a structure like so (I've intentionally kept off the constraints to make it generic and reduce the amount of code):
CREATE TABLE ForumPost (
PostID int,
ForumID int,
PostText varchar(255)
);
CREATE TABLE ForumUser (
ForumUserID int,
ForumUserName varchar(255),
NumofPosts int
);
CREATE TABLE ForumGroups (
ForumGroupID int,
ForumGroupName varchar(255)
);
CREATE TABLE ForumGroupMembership (
ForumUserID int,
ForumGroupID int
);
CREATE TABLE ForumPermissions (
ForumID int,
ForumGroupID int,
MinPosts int
);
Then you could do several joins to ensure you restrict the content accordingly:
SELECT FPost.PostID, FPost.ForumID, FPost.PostText
FROM ForumPost FPost
JOIN ForumPermissions FPerm
ON FPost.ForumID = FPerm.ForumID
JOIN ForumGroupMembership FGM
ON FPerm.ForumGroupID = FGM.ForumGroupID
JOIN ForumUser FUser
ON FUser.ForumUserID = FGM.ForumUserID
WHERE FUser.NumOfPosts >= FPerm.MinPosts
AND FPost.PostID = <Some Number>
