I'm using Hapi to implement a backend for an AngularJS-based web application providing access to non-trivial workflows.
The database behind the backend is PostgreSQL. Because the data is also used by other components, I have limited control over the schema (I can add tables, views, columns etc., but I cannot restructure everything to fit an ORM). The workflows must be atomic, so I need to be able to do SELECT ... FOR UPDATE to avoid transactionality / locking issues. Optimistic locking would also be an option, but doesn't seem necessary so far.
In my ideal world, there would be a hapi plugin providing
- generic reading of JavaScript objects from query results: I do a SELECT * FROM myview, I get a JavaScript object for every result row, and the columns have miraculously turned into fields
- a way to save me from worrying about parameter escaping - e.g. I do 'WHERE column=%s' and a parameter and it just works
- requests are automagically wrapped in pg transactions - and there is a hook to retry if the commit went wrong (especially for optimistic locking)
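To make the wishlist concrete, here is a rough sketch of the kind of helpers I have in mind, written against plain node-postgres (pg); the view, the findOrders/withTransaction names and the connection setup are just illustrative, not an existing plugin:

```javascript
// Sketch only - assumes the plain node-postgres (pg) driver.
const { Pool } = require('pg');
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Parameterized query: $1 placeholders take care of escaping,
// and each result row already comes back as a plain JavaScript object.
async function findOrders(customerId) {
  const { rows } = await pool.query(
    'SELECT * FROM my_order_view WHERE customer_id = $1',
    [customerId]
  );
  return rows;
}

// Wrap a unit of work in a transaction on a single client.
async function withTransaction(work) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    const result = await work(client); // work() can issue SELECT ... FOR UPDATE here
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}
```

Inside withTransaction the work callback runs on the same client, so SELECT ... FOR UPDATE row locks are held until COMMIT - but I'd rather not maintain this plumbing myself in every route.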
I have looked at all the Node.js and hapi modules/plugins I could find that seemed relevant to the issue, but they all seem to either
- leave me having to worry about a lot of low-level stuff
- require me to deal with any errors at every single db call
- not support ALL pg features I might require (views, stored procedures - I don't really want to be limited to a subset of SQL)
But then - I lack practical experience with this scenario so far (though I've built plenty of backends with postgresql, usually in conjunction with either Java EE or python/django).
I like my business logic to deal with business logic - and not be interspersed with low-level stuff. These things should be separated in a clean architecture.
What is a good way to achieve that in the described scenario?
Related
I have two separate cloud-based APIs that I am working on integrating together. Neither piece of software talks directly to the other, so I am creating something in the middle to get them to communicate. I have had trouble finding examples or documentation on how exactly to do this. Does anyone know of any resources that could help me out?
My plan going in was to use a MERN Stack, running on a local server to do GET and POST requests to both APIs, use some mapping and logic to transpose the data into the correct format and send it to the other software. I do not have a client per se (other than myself) on my end, so I really will be skipping the React part of MERN, at least that is what I'm thinking. I'll be using Mongo to keep track of both sets of data for redundancy. I also considered using a LAMP Stack but felt that MERN would be faster in handling the data, and Mongo is more flexible in handling different data formats. If there is another process or technology that could help me that I'm not thinking of, I would be grateful to hear about it.
Has anyone encountered something like this before? Thank you.
As with most architecture questions, there's no completely right or wrong answer here. You could certainly design a well-built system for this purpose with either stack; even more so given that you mention your front-end framework is not an important consideration. Instead, ask yourself questions like this:
Which stack do you have more experience with, and is this an appropriate time to learn a new set of technologies, or is it important to do the best work you're capable of right now (how important is time, cost, or quality in this case)?
Another generalization I'll stick my neck out for is a data-first approach; what sort of data are you dealing with from each cloud integration, and what kind of data do you need to support and/or create in order to make your system work? Mongo, being a NoSQL persistence layer, will allow you to change your data model and handle more varied data in a quicker and easier manner than a SQL solution will. This is a double-edged sword, however, as the lack of validation and of a strongly constrained (typed) data model will make your application harder to work with and debug as it grows. In short - how big might this application grow?
If you have a handy and familiar way to manage the three different data models you're dealing with (cloud service 1, cloud service 2, and your app) via MySQL, then that's a compelling reason to use it. However, if your style is to start dumping data into your database and you're comfortable with a more iterative approach (which may require more, albeit shorter rounds of refactoring), then Mongo with MERN may be the preferable choice.
Finally, will others ever be working on this application? If so, which language would you prefer to be collaborating with them in - PHP or JavaScript?
To speed up development for my next Node-API I was looking for a suitable Framework. In the past I was building my APIs with express only.
One design pattern I have always found useful is to completely separate the business logic from route handling by putting it in services. Those services only accept the required information (like a user id or data) and return a promise resolving to the result of the operation.
This way it is easy to reuse these services in other routes, to combine them, test them, or call them based on schedules or other events - totally independent of endpoint calls. Routing and middleware take care of access control, error handling and responding.
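As a rough illustration of the pattern (Express here, since that's what I've used; userService and userRepository are made-up module names):

```javascript
// services/userService.js - business logic only, no req/res objects
const userRepository = require('../data/userRepository'); // assumed data-access module

async function getUserById(userId) {
  const user = await userRepository.findById(userId);
  if (!user) {
    const err = new Error('User not found');
    err.statusCode = 404;
    throw err;
  }
  return user;
}

module.exports = { getUserById };

// routes/users.js - routing only delegates to the service
const express = require('express');
const { getUserById } = require('../services/userService');
const router = express.Router();

router.get('/users/:id', async (req, res, next) => {
  try {
    res.json(await getUserById(req.params.id));
  } catch (err) {
    next(err); // central error-handling middleware builds the response
  }
});

module.exports = router;
```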
Looking at the documentation of those frameworks (sailsjs, keystonejs, ...), I mostly see the business logic tightly coupled to individual routes, directly accepting request objects and handling the responses. Only as an afterthought does there sometimes seem to be a way to extract "often used code" into helper functions.
Am I missing something? How come this pattern seems to be the standard of API design? Is this a best practice for a reason?
It might have to do with Node.js services being smaller in size. If you're coming from an enterprise background, you're well aware that mixing business logic with controller code doesn't fly in the long run. Perhaps small projects can get away with defying that, but once the size increases, you can't avoid the laws of physics. It's best to separate concerns and keep the codebase maintainable.
I'd also add that below services, it's good to have a separate layer that handles talking to outside process boundaries. That way, you can test business logic in isolation by providing appropriate test doubles for integrations. Here's a longer explanation of how it would work in a Node project: Organize Node.js API project using 3-layer architecture.
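A minimal sketch of that lower layer, with made-up names (createOrderService, paymentGateway, orderRepository) purely for illustration:

```javascript
// The integrations are injected, so the business logic never touches
// HTTP clients or database drivers directly.
function createOrderService({ paymentGateway, orderRepository }) {
  return {
    async placeOrder(order) {
      const charge = await paymentGateway.charge(order.total, order.cardToken);
      return orderRepository.save({ ...order, chargeId: charge.id });
    },
  };
}

// In a test, the outside-process boundaries become simple in-memory doubles:
const service = createOrderService({
  paymentGateway: { charge: async () => ({ id: 'fake-charge' }) },
  orderRepository: { save: async (order) => order },
});
```

Because the boundary layer is passed in rather than imported directly, the tests stay fast and deterministic, and the real integrations can be swapped without touching the business logic.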
I'm currently involved in an app project, and I'm in charge of setting up the backend.
What I'm used to using is a MySQL database + PHP for cleaning and managing the data sent to and from the front end, which I have much more experience with. However, because of certain preferences of my bosses, on this project I've found myself looking at IBM's Bluemix and Cloudant software. Cloudant is a NoSQL database (like CouchDB) and my experience with NoSQL is severely lacking. All I've managed to do so far is create a few JSON documents and some basic views.
What I need to figure out is how to perform CRUD (create, read, update, delete) actions on a NoSQL database, or at least what that would look like.
In addition to this, I need to know whether there are ways to implement security measures (security and anti-hacking functions) on a NoSQL database without an external source, or whether I will need to learn how to reroute the data through some sort of PHP function first, if I want it cleaned, before sending it to the Cloudant server where my database sits.
Let me know if my attempt to explain my problem is lacking in clarity. I'll try my best to state it a different way, if need be.
Generally speaking, there is no equivalent of the ANSI SQL standard for NoSQL databases. In other words, NoSQL databases are not as standardized as SQL databases; standards are only starting to appear. You can think of it as a technology still in the making.
What you have in general is an API with methods such as put_record or delete_record, or a REST interface that is logically equivalent. Also, in general you CRUD the whole record, not parts of the record.
Take a look at the reference: Cloudant - Reading and Writing
That said, in your case I would recommend abstracting away from the specific implementation of the NoSQL database you want to use if you care about avoiding vendor lock-in. So I would suggest you wrap the CRUD calls in PHP functions that can later be replaced if you want to change the NoSQL database flavor.
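For illustration, thin wrappers over Cloudant's HTTP API might look roughly like this (sketched in JavaScript with Node 18+'s global fetch for brevity; the same three functions translate directly into PHP as suggested above, and the account URL, database name and credentials are placeholders):

```javascript
// Placeholders - configure for your own Cloudant account and database.
const BASE_URL = 'https://ACCOUNT.cloudant.com';
const DB = 'mydb';
const headers = {
  'Content-Type': 'application/json',
  Authorization: 'Basic ' + Buffer.from('user:password').toString('base64'),
};

// Create or update: the whole document is written in one request;
// updates must include the current _rev of the document.
async function saveRecord(id, doc) {
  const res = await fetch(`${BASE_URL}/${DB}/${id}`, {
    method: 'PUT',
    headers,
    body: JSON.stringify(doc),
  });
  return res.json();
}

async function readRecord(id) {
  const res = await fetch(`${BASE_URL}/${DB}/${id}`, { headers });
  return res.json();
}

async function deleteRecord(id, rev) {
  const res = await fetch(`${BASE_URL}/${DB}/${id}?rev=${rev}`, {
    method: 'DELETE',
    headers,
  });
  return res.json();
}
```

If you later move to a different NoSQL database, only these wrapper functions need to change.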
This approach has the additional advantage of providing an abstraction point where you can implement your own security. Some important NoSQL databases have no concept of multi-tenancy, or have only just introduced it. Again, it is a technology in the making.
When your mindset is the relational one, you tend to think of the database as something that will help you guarantee data consistency as much as possible. But NoSQL databases are not like that. Think of them as a simple repository of documents (in a JSON or XML structure, for instance), without cross references.
Then the obvious question is perhaps: why would anyone want such a thing? One of the possible answers is because NoSQL databases may hold an aggregate of consolidated data. You can then retrieve aggregates to save time reprocessing or re-retrieving data unnecessarily.
As for security, most (if not all) NoSQL databases have some pretty good authentication mechanisms.
Recently I've been playing around with Node.js a little bit. In my particular case I wound up using MongoDB, partly because it made sense for that project since it was very simple, and partly because Mongoose seemed to be an extremely simple way to get started with it.
I've noticed that there seems to be a degree of antipathy towards relational databases when using Node.js. They seem to be poorly supported compared to non-relational databases within the Node.js ecosystem, but I can't seem to find a concise reason for this.
So, my question is, is there a solid technical reason why relational databases are a poorer fit for working with Node.js than alternatives such as MongoDB?
EDIT: Just want to clarify a few things:
I'm specifically not looking for details relating to a specific application I'm building
Nor am I looking for non-technical reasons (for example, I'm not after answers like "Node and MongoDB are both new so developers use them together")
What I am looking for is entirely technical reasons, ONLY. For instance, if there were a technical reason why relational databases performed unusually poorly when used with Node.js, then that would be the kind of thing I'm looking for (note that from the answers so far it doesn't appear that is the case)
No, there isn't a technical reason. It's mostly just opinion and using NoSQL with Node.js is currently a popular choice.
Granted, Node's ecosystem is largely community-driven. Everything beyond Node's core API requires community involvement. And, certainly, people will be more likely to support what aligns with their personal preferences.
But, many still use and support relational databases with Node.js. Some notable projects include:
mysql
pg
sequelize
I love Node.js, but with Node it actually makes more sense to use an RDBMS, as opposed to a non-relational DB. With a NoSQL/non-relational solution you often need to do manual joins in your Node.js code and sometimes work without transactions, a technical feature of RDBMSs with their commit/rollback facilities. Here are some potential problems with using non-relational DBs + Node.js servers:
(a) the joins are slower and responses are slower, because Node is not C/C++
(b) the expensive joins block your event loop, because the join is happening in your Node.js code, not on some database server
(c) manually writing joins is often difficult and error-prone; your NoSQL queries could easily be incorrect, or your join code might be incorrect or suboptimal; optimized joins have been done before by the masters of RDBMSs, and joins in RDBMSs are proven to be correct, mathematically, in most cases.
(d) Some non-relational databases, like MongoDB, do not support transactions - in my team's case, that means we have to use an external distributed lock so that multiple queries can be grouped together into an atomic transaction. It would be somewhat easier if we could just use transactions and avoid application level locks.
With a more powerful relational database system that can do optimized joins in C/C++ on the database server rather than in your Node.js code, you let your Node.js server do what it's best at.
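To make points (a)-(c) concrete, here is a rough comparison of a manual application-level join against its relational equivalent; the collections, tables and the db/pool handles are made up for illustration (db being a connected MongoDB Db handle, pool a node-postgres Pool):

```javascript
// Illustration only - 'orders' and 'users' are invented collections/tables.
async function openOrdersWithUsers(db, pool) {
  // Manual "join" in application code: two round trips plus an in-process merge.
  const orders = await db.collection('orders').find({ status: 'open' }).toArray();
  const ids = orders.map((o) => o.userId);
  const users = await db.collection('users').find({ _id: { $in: ids } }).toArray();
  const byId = new Map(users.map((u) => [String(u._id), u]));
  const manualJoin = orders.map((o) => ({ ...o, user: byId.get(String(o.userId)) }));

  // Relational equivalent: one query, joined inside the database engine.
  const { rows: sqlJoin } = await pool.query(
    `SELECT o.*, u.name AS user_name
       FROM orders o JOIN users u ON u.id = o.user_id
      WHERE o.status = 'open'`
  );
  return { manualJoin, sqlJoin };
}
```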
With that being said, I think it's pretty f*ing stupid that many major NoSQL vendors don't support joins (?). Complete de-normalization is only a dream as far as I can see. And the lack of transactions can be a bit weird: without transactions, only a single query is atomic; you cannot make multiple queries atomic without an application-level locking mechanism :/
Take-aways:
If you want non-relational persistence - why not simply de-normalize a relational database? There is nobody forcing you to use a traditional database in a relational manner.
If you use a relational DB with Node.js I recommend this ORM:
https://github.com/typeorm/typeorm
As an aside, I prefer the term "non-relational" as opposed to "noSQL".
In my experience Node tends to be popular with databases that have a stateless API, which fits very nicely with Node's async nature. Most relational databases utilize stateful connections for transactions, and this minimizes the primary advantages of async, non-blocking I/O.
Can you explain exactly what specific problems you are facing with your chosen database and node.js?
A few reasons why MongoDB could be more popular than relational databases:
MongoDB is essentially a JSON document store, so it translates very well to a JavaScript application, and MongoDB's query functions are JavaScript functions (see the small snippet after this list).
I am just guessing here, but since NoSQL databases are newer and have more enthusiastic programmers experimenting with them, you probably see more involvement in those NPM modules.
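On the first point, a small example of how directly plain JavaScript objects map onto MongoDB documents (using the official mongodb driver; the connection string and collection name are made up):

```javascript
const { MongoClient } = require('mongodb');

async function demo() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const users = client.db('app').collection('users');

  // A plain JavaScript object goes in and comes back as a plain object.
  await users.insertOne({ name: 'Ada', roles: ['admin'] });
  console.log(await users.findOne({ name: 'Ada' }));

  await client.close();
}

demo().catch(console.error);
```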
Apart from this, Node.js technically is a perfect choice for any sort of database application. I have personally worked on a small Node.js/MySQL application and I didn't face any hurdles.
But back to my main point, we could talk about this all day, and that is not what this forum is for. If you have any specific issues in any code with Node.js and your database of choice, please ask those questions instead.
Edit: Strictly technical reasons, apart from the JSON compatibility on both sides: There are none.
For anyone wondering about the same question in 2021:
Node has nothing to do with the type of database you choose.
You can choose whichever database fits your requirements.
If you need to maintain a strict data structure then choose a relational DB; otherwise you can go for NoSQL.
There are NPM packages for PostgreSQL, MySQL and other databases which are non-blocking. These DB clients will not block the Node process while performing queries.
We have several legacy SQL Server databases that we occasionally make schema changes to. We currently have a utility written in C++ that allows users to update their DBs with these schema changes. The utility currently generates dynamic SQL to create all DB objects.
I am looking into redoing this and thought EF migrations might be a good way to go. I have read up a bit on the subject and I have a general idea of how it works, but I'm having a bit of a hard time figuring out how I would set it up to replace our current procedure (or if it is even possible). Currently, a client could be on any one of a number of previous versions. I'm assuming I would have to go back to the oldest possible version and create my model/initial migration from that, then generate incremental migrations for each version change in order to support updates from all versions. Is that a correct assumption? Also, currently our clients could be using SQL Server 2000, 2005, or 2008. Would this have any effect on how I would set things up (or if I even could)?
Further, the goal is to create a utility with a (C# - probably WPF) UI that the user can use to manipulate the migrations (up or down, preferably). I've seen a lot of examples of how to manipulate migrations from the command line within the package manager, but not a lot on how to create a utility with a friendly UI for upgrading/downgrading DBs in production. Also, I have not seen anything that shows how to create stored procedures in a migration (our DBs rely on some stored procedures). I'm assuming that, if nothing else, I can use the Sql() method to generate a SQL query to create an SP. Is that correct? Is there a better way?
I know my questions are a bit non-specific and I apologize for that. But I'm still in the beginning processes of learning this and I'd like to get an idea of whether or not this is a good way to go. Any guidance would be greatly appreciated.
Thanks,
Dennis
Firstly, on SQL Server support, Entity Framework doesn't really support SQL Server 2000. See this question:
EntityFramework SQL Server 2000?
On the question of supporting all the multiple versions, you have the right idea about needing to generate an initial migration for the oldest version first then incrementally altering the model and generating migrations to support the later versions. This will be a pain as the migrations are opinionated about how they represent the model in the database and you will be doing a lot of messing about to end up with a model and a set of migrations that fully represent that. Specific concerns are indexes, column lengths, data types, stored procedures, triggers, functions, partitioning.
The Sql() method gets you around most issues; methods like CreateIndex and AlterColumn are also helpful in migrations.
For automating this, the migration commands are available as PowerShell cmdlets, which are themselves just .NET objects and so can be called programmatically.
As this question is a year old, I assume you will have made a decision on whether to do this. My opinion is that it is hard to see that it's worth the effort. If you were re-platforming the code base that uses this database to Entity Framework then it would make sense. Otherwise there are bound to be better tools out there for database version management. My first port of call would be Redgate.