How to unit test Gremlin Queries on JanusGraph using FastAPI and GremlinPython - python-3.x

I have written a Python REST API using FastAPI. It connects to JanusGraph on a remote machine and runs some Gremlin queries using the GremlinPython API. While writing my unit tests with FastAPI's built-in test client, I haven't been able to mock JanusGraph to test my APIs. In the worst case I could run JanusGraph in Docker in my local setup and test there, but I would like to do a pure unit test. I've not come across any useful documentation so far. Can anyone please help?

I think running Gremlin Server locally is how a lot of people do local testing. If you do not need to test data persistence you could configure JanusGraph to use the "inmemory" backend and avoid the need to provision any storage nodes.
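For a pure unit test without any server at all, one common pattern is to inject the traversal source into your query functions and replace it with a mock. A minimal sketch using only the standard library (the query, labels, and function name here are hypothetical, not from your app):

```python
from unittest.mock import MagicMock

# Hypothetical query function under test; in the real app, `g` would be a
# GraphTraversalSource from gremlinpython, passed in (e.g. via a FastAPI
# dependency) so that tests can substitute a mock.
def get_person_names(g):
    return g.V().hasLabel("person").values("name").toList()

# Pure unit test: no JanusGraph, no network, no Docker.
def test_get_person_names():
    g = MagicMock()
    # MagicMock auto-creates the whole call chain; we only pin the final value.
    g.V.return_value.hasLabel.return_value.values.return_value.toList.return_value = [
        "alice", "bob",
    ]
    assert get_person_names(g) == ["alice", "bob"]
    g.V.return_value.hasLabel.assert_called_once_with("person")

test_get_person_names()
```

With FastAPI specifically, if the traversal source is provided as a dependency, `app.dependency_overrides` lets the TestClient swap in the mock for the whole app. The trade-off is that a mock only verifies your call chain, not the query's semantics, so the "inmemory" backend mentioned above is still the way to test actual Gremlin behaviour.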

Related

Switching between Databricks Connect and local Spark environment

I am looking to use Databricks Connect for developing a pyspark pipeline. DBConnect is really awesome because I am able to run my code on the cluster where the actual data resides, so it's perfect for integration testing, but I also want to be able, during development and unit testing (pytest with pytest-spark), to simply use a local Spark environment.
Is there any way to configure DBConnect so for one use-case I simply use a local Spark environment, but for another it uses DBConnect?
My 2 cents, since I've been doing this type of development for some months now:
Work with two Python environments: one with databricks-connect (and thus, no pyspark installed), and another one with only pyspark installed. When you want to execute the tests, just activate the "local" virtual environment and run pytest as usual. Make sure, as some commenters pointed out, that you are initializing the pyspark session using SparkConf().setMaster("local").
PyCharm helps immensely to switch between environments during development. I am always on the "local" venv by default, but whenever I want to execute something using databricks-connect, I just create a new Run configuration from the menu. Easy peasy.
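As a sketch of what the "local" side of that setup can centralize (the helper name and the conf values are illustrative assumptions, not a fixed recipe):

```python
# Illustrative settings for the "local" venv's test session; the keys are
# standard Spark conf names, the values are assumptions tuned for fast tests.
def local_spark_conf() -> dict:
    return {
        "spark.master": "local[2]",           # run Spark in-process, 2 threads
        "spark.app.name": "unit-tests",
        "spark.sql.shuffle.partitions": "2",  # small shuffles for tiny test data
    }

# In the pyspark-only venv you would feed this into SparkConf, e.g.:
#   from pyspark import SparkConf
#   conf = SparkConf().setAll(local_spark_conf().items())
# In the databricks-connect venv you would skip setMaster entirely and let
# databricks-connect's own configuration point at the cluster.
```

Keeping these values in one place means the two venvs differ only in which packages are installed, not in test code.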
Also, be aware of some of databricks-connect's limitations:
It is no longer officially supported, and Databricks recommends moving towards dbx whenever possible.
UDFs just won't work in databricks-connect.
MLflow integration is not reliable. In my use case, I am able to download and use models, but unable to log a new experiment or track models using the Databricks tracking URI. This might depend on your Databricks Runtime, MLflow, and local Python versions.

How to setup local cluster for MBrace

I'm trying to follow tutorials on using MBrace with F# (one of them is a YouTube video). The problem is that in all the videos I've seen, they are either using Azure or running some form of local cluster on the machine.
Since I won't be using Azure for now, how do I set up a local cluster which I can use to test MBrace locally without having to go online?
If you want to test MBrace with a local cluster on your machine, you can clone https://github.com/mbraceproject/MBrace.Core; for a sample, check https://github.com/mbraceproject/MBrace.Core/blob/master/samples/wordcount.fsx
One important note is that we are currently working towards MBrace 1.0, and you may find some API differences between MBrace.Core and MBrace.StarterKit (https://github.com/mbraceproject/MBrace.StarterKit).

Different database for production and development in nodejs

I know that Ruby on Rails has this feature, and the railstutorial specifically encourages it. However, I have not found such a thing in Node.js. If I want to run SQLite3 on my machine so I can have easy-to-use database access, but Postgres in production on Heroku, how would I do this in Node.js? I can't seem to find any tutorials on it.
Thank you!
EDIT: I meant to include Node.js + Express.
It's possible of course, but be aware that this is probably a bad idea: http://12factor.net/dev-prod-parity
If you don't want to go through the hassle of setting up postgres locally, you could instead use a free postgres plan on Heroku and connect to it from your local machine:
DATABASE_URL=url node server.js
A .env file can make this easier:
https://devcenter.heroku.com/articles/heroku-local#copy-heroku-config-vars-to-your-local-env-file
To switch between the production and development DBs, you can use different ports for running your application locally and on Heroku.
Since Heroku by default serves the application on port 80, you will have some other port while running your app locally.
This will help you figure out at run time whether your application is running locally or in production, and you can switch the databases accordingly.
You could use something like jugglingdb to do this:
JugglingDB(3) is cross-db ORM for nodejs, providing common interface to access most popular database formats. Currently supported are: mysql, sqlite3, postgres, couchdb, mongodb, redis, neo4j and js-memory-storage (yep, self-written engine for test-usage only). You can add your favorite database adapter, checkout one of the existing adapters to learn how, it's super-easy, I guarantee.
Jugglingdb also works on client-side (using WebService and Memory adapters), which allows to write rich client-side apps talking to server using JSON API.
I personally haven't used it, but having a common API to access all your database instances would make it super simple to use one locally and one in production - you could wire up some location detection without too much trouble as well and have it automatically select the target db depending on the environment it's in.

Remote javascript interaction with arangodb

Our production environment doesn't provide a shell, only a JavaScript engine and a REST interface. Our ArangoDB server will be installed at a remote location. Since all of our users are comfortable with JavaScript, we are looking for a solution where we could provide them an interface in which they write queries for ArangoDB in JavaScript (the way we do in arangosh) and we can execute them remotely and get the result.
Is that somehow possible?
I am new to ArangoDB, and so far I have found that there is only a REST interface available for interacting remotely.
arangosh is not available and cannot be used.
You can use arangosh to connect to the remote server as it uses the REST interface to work. All information on connecting to your server is available via arangosh --help. The default behaviour of arangosh is to connect to a local ArangoDB instance, but it can connect to remote ones as well.
You probably want to do something like, where 1.2.3.4 is the IP of your remote server:
arangosh --server.endpoint tcp://1.2.3.4:8529
If you want to execute arbitrary JavaScript code in ArangoDB from an application, you can use the endpoint /_admin/execute described here, which takes JavaScript code as its body and executes it in ArangoDB. Be aware that this is a potential security risk.
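As a minimal sketch of calling that endpoint from Python using only the standard library (the server URL and query are made-up examples, authentication is omitted, and note that recent ArangoDB versions keep this endpoint disabled unless it is explicitly enabled at server startup):

```python
import urllib.request

def build_execute_request(server: str, js_code: str) -> urllib.request.Request:
    # POST the raw JavaScript source as the request body to /_admin/execute.
    return urllib.request.Request(
        url=server.rstrip("/") + "/_admin/execute",
        data=js_code.encode("utf-8"),
        method="POST",
    )

req = build_execute_request("http://1.2.3.4:8529", "return db._collections().length;")
# urllib.request.urlopen(req) would send it; a real deployment will also
# need an auth header and should restrict who can reach this endpoint.
```

The request body is plain JavaScript, not JSON, which is what makes it a good fit for users who already write arangosh-style queries.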

Running IIS server with Coypu and SpecFlow

I have already spent a lot of time googling for a solution, but I'm stuck!
I have an MVC application and I'm trying to do "integration testing" for my views using Coypu and SpecFlow. But I don't know how I should manage the IIS server for this. Is there a way to actually run the server (at the first start of the tests) and make the server use a special "test" DB (for example an in-memory RavenDB) emptied after each scenario (and filled during the background)?
Is there a better or simpler way to do this?
I'm fairly new to this too, so take the answers with a pinch of salt, but as no one else has answered...
Is there a way to actually run the server (first start of tests) ...
You could use IIS Express, which can be called via the command line. You can spin up your website before any tests run (which I believe you can do with the [BeforeTestRun] attribute in SpecFlow) with a call via System.Diagnostics.Process.
The actual command line would be something like:
iisexpress.exe /path:c:\iisexpress\<your-site-published-to-filepath> /port:<anyport> /clr:v2.0
... and making the server use a special "test" DB (for example an in-memory RavenDB) emptied after each scenario (and filled during the background).
In order to use a special test DB, I guess it depends how your data access is working. If you can swap in an in-memory DB fairly easily then I guess you could do that. Although my understanding is that integration tests should be as close to production env as possible, so if possible use the same DBMS you're using in production.
What I'm doing is just doing a data restore to my test DB from a known backup of the prod DB, each time before the tests run. I can again call this via command-line/Process before my tests run. For my DB it's a fairly small dataset, and I can restore just the tables relevant to my tests, so this overhead isn't too prohibitive for integration tests. (It wouldn't be acceptable for unit tests however, which is where you would probably have mock repositories or in-memory data.)
Since you're already using SpecFlow take a look at SpecRun (http://www.specrun.com/).
It's a test runner which is designed for SpecFlow tests and adds all sorts of capabilities, from small conveniences like better formatting of the Test names in the Test Explorer to support for running the same SpecFlow test against multiple targets and config file transformations.
With SpecRun you define a "Profile" which will be used to run your tests, not dissimilar to the VS .runsettings file. In there you can specify:
<DeploymentTransformation>
<Steps>
<IISExpress webAppFolder="..\..\MyProject.Web" port="5555"/>
</Steps>
</DeploymentTransformation>
SpecRun will then start up an IISExpress instance running that Website before running your tests. In the same place you can also set up custom Deployment Transformations (using the standard App.Config transformations) to override the connection strings in your app's Web.config so that it points to the in-memory DB.
The only problem I've had with SpecRun is that the documentation isn't great, there are lots of video demonstrations but I'd much rather have a few written tutorials. I guess that's what StackOverflow is here for.
