node.js pg query returns fewer rows than native PostgreSQL client

I have an SQL query that returns 356 rows when run in a native Postgresql client (psql, Navicat, etc.), but only returns 214 rows when run inside the node.js service I'm developing. Here's the query:
SELECT discs.id AS id,
discs.is_streamable AS is_streamable,
discs.updated_at AS updated_at,
albums.title AS album_title,
'https://www.slurpie.com/albums/' || albums.slug AS album_url,
artists.name AS main_artist,
genres.name AS genre,
albums.cover_remote_url AS album_art
FROM discs
JOIN albums
ON albums.id = discs.album_id
JOIN artists
ON artists.id = albums.main_artist_id
JOIN genres
ON genres.id = albums.genre_id
JOIN users
ON users.id = discs.user_id
WHERE users.authentication_token = 'itsasecret'
ORDER BY main_artist
The node.js service is using restify and pg-query (although I've tested it with the underlying "pg" module as well with the same results).
Looking at the output from the query, I can't find any similarities between the rows that are left out when the query is run inside of node (I thought perhaps a null value in a column, or an extremely large amount of column data, special characters, etc.).

Yieldsfalsehood was on the right track.
Turns out that the node code was pointed at a recent copy of the database, similar enough that it wasn't obvious that it was a separate database, but out-of-date enough that the number of rows was different.
It wasn't until I noticed a small difference in one of the rows that was returned by both psql and the node app that the cause jumped out at me.
Occam's Razor FTW! :)
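For anyone hitting the same symptom, a quick sanity check is to log which database the node client is actually connected to right next to the row count. This is a minimal sketch with the plain pg module, assuming the connection details come from the standard PG* environment variables and using the discs table from the query above:
const { Client } = require('pg');
async function sanityCheck() {
  // With no arguments, Client reads PGHOST/PGDATABASE/PGUSER/etc. from the environment.
  const client = new Client();
  await client.connect();
  // Confirm which database and server this client is really talking to.
  const where = await client.query('SELECT current_database() AS db, inet_server_addr() AS host');
  console.log(where.rows[0]);
  // Compare the raw row count against what psql reports.
  const count = await client.query('SELECT count(*) FROM discs');
  console.log('discs rows:', count.rows[0].count);
  await client.end();
}
sanityCheck().catch(console.error);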

Related

Node-Postgres/Knex returning CITEXT[] as a string in JS, instead of an array of strings

I'm using Knex, which itself uses package "pg" (aka "node-postgres").
If you SELECT some rows from a table with a TEXT[] column, all is well... in JS you get an array of strings.
But if you're using a CITEXT[] column, instead you just get back a string in JS like:
"{First-element,Second-element}"
Normally when you want to instruct the pg package on how to return specific postgres types, you can do something like this:
import {types} from 'pg';
types.setTypeParser(types.builtins.TIMESTAMPTZ, 'text');
types.setTypeParser(types.builtins.TIMESTAMP, 'text');
types.setTypeParser(types.builtins.DATE, 'text');
types.setTypeParser(types.builtins.TIME, 'text');
types.setTypeParser(types.builtins.TIMETZ, 'text');
The types.builtins.* constants have values that are hardcoded OID numbers for the known built-in types in postgres. Those OID numbers are the same across all postgres installations.
However, because citext is provided by an extension, the OIDs for the citext and citext[] types will be different on every server. For example, with the following SQL query:
SELECT typname, oid, typarray FROM pg_type WHERE typname like '%citext%';
On my development server I get:
typname | oid   | typarray
--------+-------+---------
citext  | 17459 | 17464
_citext | 17464 | 0
But on my production server I get:
typname | oid   | typarray
--------+-------+---------
citext  | 18618 | 18623
_citext | 18623 | 0
How can I solve this?
Some hacky options that I really don't want to do are:
Find out the OID values on all my servers and hard-code them in - very hacky and I really don't want to do this.
Write code for every table/column that manually converts the string to an array - also hacky and repetitive.
When the node process initializes, query the server for the OID and then call types.setTypeParser() with that dynamic value - also not very good.
How can I solve this for all tables/columns without these hacky solutions?
I don't believe there is any way to do it without querying the DB.
I would probably query the correct OID before starting the node app, store it in an environment variable, and then initialize the pg types with the value from process.env.
That is also a bit hacky, but at least the hack is mostly encapsulated out of the application code.
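A minimal sketch of that approach, assuming the typarray OID has been looked up beforehand with the pg_type query above and exported in an environment variable (CITEXT_ARRAY_OID here is an arbitrary name, not something pg defines); the actual array parsing is delegated to the parser pg already uses for text[] (OID 1009):
import { types } from 'pg';
// CITEXT_ARRAY_OID is assumed to hold the result of:
//   SELECT typarray FROM pg_type WHERE typname = 'citext';
// run against the target server before the app starts.
const citextArrayOid = Number(process.env.CITEXT_ARRAY_OID);
// Reuse the array parser pg ships for text[] (OID 1009) so citext[] columns
// come back as arrays of strings instead of "{a,b}" literals.
const parseTextArray = types.getTypeParser(1009);
if (Number.isFinite(citextArrayOid) && citextArrayOid > 0) {
  types.setTypeParser(citextArrayOid, parseTextArray);
}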

Knex + SQL Server whereIn query 8-12s -- raw version returns NO results but if I input the .toQuery() result directly I get results

The database is in the Azure cloud and is not currently used in production. There are 80,000 rows and a uprn is a VARCHAR(100).
I'm already validating each UPRN with Joi as well.
I'm using KNEX with a SQL Server database with the following whereIn query:
knex(LOCATIONS.table).whereIn(LOCATIONS.uprn, req.body.uprns)
but this takes 8-12s to complete and sometimes times out. If I run the output of .toQuery() for the same query, SSMS returns the result within 1-2 seconds.
If I do a raw query, the resulting .toQuery() or toString() works in SSMS and returns results. But if I try to use the raw query directly, it returns 0 results.
I'm looking to either fix what's making whereIn so slow or get the raw query working.
EDIT 1:
After much debugging and trying, it seems that the bug is due to how knex deals with arrays, so I made a for-of loop that adds a ? placeholder for each array element and then passed the array in as the parameters (see the sketch below).
This led me to realize the performance issue is due to the way SQL Server parameterises queries.
I ended up building a raw query string with all of the parameters and validating the input with Joi string/regex config:
Joi.string()
.min(1)
.max(35)
.regex(/^[a-z\d\-_\s]+$/i)
allowing only alphanumeric characters, dashes, underscores and whitespace, which should prevent SQL injection.
I'm going to look deeper into security issues with this and might make a separate login that can only SELECT data from that table and nothing more to run with these queries.
Needed to just handle it raw and validate separately.
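For reference, a rough sketch of the intermediate workaround mentioned in the edit, with hypothetical locations table and uprn column names; it builds one ? placeholder per array element and passes the array as bindings so the values stay parameterised:
// Sketch only: the table and column names are placeholders.
const uprns = req.body.uprns;
// One "?" per element, e.g. "?, ?, ?" for three UPRNs.
const placeholders = uprns.map(() => '?').join(', ');
const rows = await knex.raw(
  `SELECT * FROM locations WHERE uprn IN (${placeholders})`,
  uprns
);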

NodeJS - azure-storage-node - how to retrieve the addition of two columns and apply a filtering condition

Sorry for being a newbie with NodeJS and table queries; my question is:
How can I create a query using the NodeJS package "azure-storage-node" that selects the sum of two columns, 'start' and 'period', and returns the whole row if the sum is greater than a threshold? My tries, which didn't work, look something like this:
var query = new azure.TableQuery();
total = query.select(['start']) + query.select(['period']);
query.where('total > ?' , 50000);
or maybe something like this,
var query = new azure.TableQuery()
.where('start + period gt 50000');
but it throws an error on the '+'.
Thanks
What you're trying to accomplish is not possible with Azure Tables, at least as of today, as Azure Tables has limited querying support and support for computed columns (if I may say so) is not there.
There are two possible solutions:
Have an attribute called total in your entities that will contain the value i.e. start + period. You calculate this value when you're inserting or updating the entity and store it at that time.
Do this filtering on the client side. For this you will need to download all related entities and then apply the filtering on the client side to the data that you fetched (a rough sketch follows below).
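A rough sketch of the second option, assuming the azure-storage package, a hypothetical table named 'jobs', and numeric start and period properties (the names are placeholders); a real implementation would also need to follow continuation tokens to page through all entities:
// Sketch only: the table name and property names are placeholders.
const azure = require('azure-storage');
const tableService = azure.createTableService(); // uses AZURE_STORAGE_CONNECTION_STRING
// No server-side filter: fetch the entities and filter locally.
const query = new azure.TableQuery();
tableService.queryEntities('jobs', query, null, (err, result) => {
  if (err) throw err;
  // Each property comes back wrapped as { _: value, $: edmType }.
  const matches = result.entries.filter(
    (entity) => Number(entity.start._) + Number(entity.period._) > 50000
  );
  console.log('entities over threshold:', matches.length);
});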

KnexJS giving different response

So I have a NodeJS+KnexJS setup on a PostgreSQL DB, and am using the .whereRaw() method so I can use a CASE statement in my WHERE clause.
The query was tested in my CLI before migrating to code. Here is the code that is being used.
var qry = knex.select(....); // ignore the select, not important.
qry.with('daspecs', function(qy) {
qy.select('spec_id').from('drawings').where('uid', query.d);
}).whereRaw('CASE WHEN (select "spec_id" from "daspecs") IS NULL THEN true ELSE c.spec_id = (select "spec_id" from "daspecs") END');
The SQL that KnexJS is generating (output using qry.toString()) is correct, and I can even copy and paste this into my psql CLI and it returns the results I want (12 records), but for some weird reason the KnexJS query seems to return a completely different set of results (1106 records).
Not sure where to go next, since KnexJS is giving me the right SQL, but seems like it's executing something else, and not sure how else to diagnose what it is actually doing (I've tried the knex.on('query'...) event).
Any alteration to the final SQL results in an error (I've tested), so I'm at the point of ruling out missing pieces.
Has anyone had any experience or issues with KnexJS saying one thing, but doing another, in particular, with the whereRaw command?
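The question mentions the knex.on('query') event; for what it's worth, a minimal version that logs both the SQL text and the bindings (the bindings are exactly what .toString() hides by interpolating them) looks like this, which at least makes a runtime difference in the bound values visible:
// Diagnostic sketch: log the exact SQL and bindings knex executes so they
// can be compared against the output of qry.toString().
knex.on('query', (data) => {
  console.log('SQL:     ', data.sql);
  console.log('Bindings:', data.bindings);
});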

Why is GetPaged() Executing two database calls?

I'm a bit new to SubSonic (i.e. evaluating 3.0.0.3) and have come across a strange behavior in GetPaged(int pageIndex, int pageSize). When I execute the method it does two SQL calls. Any ideas why?
Details
Let's say I have a "Cultures" table with 200 rows. In my code I do something like ...
var sonicCollection = from c in RTE.Culture.GetPaged(1, 25)
select c;
Now, I would expect this to execute a single query returning the first 25 entries in my Cultures table. When I watch SQL Profiler I see two queries go by.
First this--
SELECT [dbo].[Cultures].[cultureCode], [dbo].[Cultures].[cultureName]
FROM [dbo].[Cultures]
Then This--
SELECT *
FROM (SELECT ROW_NUMBER() OVER (
ORDER BY cultureID ASC) AS Row,
[dbo].[Cultures].[cultureCode], [dbo].[Cultures].[cultureName]
FROM [dbo].[Cultures]
)
AS PagedResults
WHERE Row >= 1 AND Row <= 25
I expect the 2nd query to roll by, as it is the one returning the 25 rows I politely requested of subsonic. The first query, however, appears to return 200 rows (at least according to SQL profiler).
Any ideas what's going on?
It's a bug in the code. The code actually queries every record and then iterates over each one for the count. I've created an issue in the GitHub repo here:
https://github.com/subsonic/SubSonic-3.0/issues/259
You can download the source, fix the issue and recompile pretty easily. I've done this and it fixed my issue.
You just want to use RTE.Culture.GetPaged() - it runs the paged query for you.
