Pull/Import Data into ArangoDB from an External Database

I've produced a proof of concept that automates exporting data out of an external database, saving it as a file, and then importing that file into ArangoDB using arangoimp.exe. While this method is certainly functional, it unfortunately won't work for us: my company cannot use the file system on the external database server or the local file system that ArangoDB has access to.
The Question: Is there an alternative method or mechanism to import data from within ArangoDB?
Is there any built-in tool I can make use of, such as:
User Defined Functions (UDFs)
Foxx
I've read about both of these features in ArangoDB; however, I'm curious whether either of them can do what I need.
I need to automate, from within ArangoDB, a procedure that connects to an external database, runs SQL queries or stored procedures there, and stores the results directly into ArangoDB collections.
Alternative: Do I need to code/develop my own program?
Many years ago I created a WinForms app that could connect to several databases. It was basically my first attempt at learning about connection strings and SQL injection. That project never went beyond that, but I've had a nagging thought that I might have to develop an intermediary application to facilitate the data transfer I'm attempting to make happen.
My fear with the latter is that it opens up a brand-new project that must be maintained and developed internally, which means devoting resources to it.

arangoimp has a --server.endpoint parameter, which allows you to import data from a remote machine into an ArangoDB server. The two machines just have to be on the same network.
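If the file system is off limits entirely, a small intermediary script can also stream rows from the source database straight into a collection over HTTP, never touching a file. Here is a minimal sketch in Python, assuming pyodbc for the external database and the python-arango driver; the connection strings, query, and collection name are all placeholders:

    # Sketch: stream rows from an external SQL database into ArangoDB
    # without writing any intermediate file. pyodbc and python-arango
    # are assumptions; DSN, query, and collection name are placeholders.
    import pyodbc
    from arango import ArangoClient

    # Connect to the external database and run a query there.
    src = pyodbc.connect("DSN=ExternalDb;UID=reader;PWD=secret")
    cursor = src.cursor()
    cursor.execute("SELECT id, name, price FROM products")

    # Connect to ArangoDB over HTTP; no local file system involved.
    db = ArangoClient(hosts="http://localhost:8529").db(
        "_system", username="root", password=""
    )
    products = db.collection("products")

    # Stream rows in batches so large result sets don't exhaust memory.
    columns = [col[0] for col in cursor.description]
    while True:
        rows = cursor.fetchmany(1000)
        if not rows:
            break
        products.insert_many([dict(zip(columns, row)) for row in rows])

    src.close()

Scheduled with cron or the Windows Task Scheduler, a script like this provides the automation without a whole new application to maintain.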

Related

Some input on how to proceed with the migration from SQL Server

I'm migrating from SQL Server to Azure SQL, and I'd like to ask those of you with more experience in Azure (I have basically none) some questions, just to understand what I need to do to get the best migration.
Today I do a lot of cross-database queries in tasks that run once a week. I execute SPs and run selects, inserts, and updates across the databases. I solved the execution of SPs by using external data sources and sp_execute_remote. But as far as I can see, it's only possible to select from an external database, meaning I won't be able to do any inserts or updates across the databases. Is that correct? If so, what's the best way to solve this problem?
I also read that cross-database calls are slow. Does this mean they're slower than in SQL Server? I want to know whether I'll face a slower process compared to what I have today.
What I really need is some good guidelines on how to do the best migration without spending loads of time on trial and error. I appreciate any help in this matter.
Cross-database transactions are not supported in Azure SQL DB. You connect to a specific database and can't use three-part names or the USE statement.
You could open two different connections from your program, one to each database. That doesn't give you any kind of transactional consistency, but it does let you retrieve data from one Azure SQL DB and insert it into another.
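For what it's worth, a rough sketch of that two-connection copy in Python with pyodbc (the connection strings and table names are made-up placeholders; the same pattern applies with two SqlConnection objects in .NET):

    # Sketch: copy rows between two Azure SQL databases over two separate
    # connections, since cross-database statements aren't available.
    # Connection strings and table names are hypothetical.
    import pyodbc

    source = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=myserver.database.windows.net;DATABASE=DbOne;UID=user;PWD=secret"
    )
    target = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=myserver.database.windows.net;DATABASE=DbTwo;UID=user;PWD=secret"
    )

    rows = source.cursor().execute("SELECT id, total FROM dbo.Orders").fetchall()
    cur = target.cursor()
    cur.executemany(
        "INSERT INTO dbo.OrdersCopy (id, total) VALUES (?, ?)",
        [(r.id, r.total) for r in rows],
    )
    target.commit()  # note: no atomicity across the two databases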
So, at least now, if you want your database in Azure and you can't avoid cross-database transactions, you'll be using an Azure VM to host SQL Server.

NoSQL database: ArangoDB

I have been looking for a database that can be embedded and also be file-based, like Sqlite.
I wanted a NoSQL type of database with this kind of feature.
The language is Python, and ArangoDB has bindings for Python and many other languages.
I am finding conflicting facts about ArangoDB.
Some articles say it is not an embedded DB, or can't be embedded, while others imply that it is embedded.
Also, the website says that it stores its data in a special binary format, yet I've seen an article saying it's mainly an in-memory database.
So it's been very confusing.
1) So the question is: can this database run embedded in a Python app? If not, and it runs as a separate process or server, can it be started and managed from Python with "zero configuration" on the part of the user, for the sake of deploying a desktop app based on it?
2) Does the database data etc. get stored on disk?
So that is it!
No, you can't embed ArangoDB the way you embed SQLite.
ArangoDB offers the Foxx framework, which you can use to implement RESTful microservices in JavaScript close to the database core, much as you would use Python with SQLite. With AQL, ArangoDB also offers a query language, just as SQLite offers SQL.
There are currently several Python drivers available that give you comfortable access to ArangoDB from Python.
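To give a flavour, here is a minimal sketch of running an AQL query through one of those drivers (python-arango in this case; the host, credentials, collection name, and filter are assumptions):

    # Sketch: querying ArangoDB with AQL from Python via python-arango.
    # Host, credentials, and the 'users' collection are assumptions.
    from arango import ArangoClient

    client = ArangoClient(hosts="http://localhost:8529")
    db = client.db("_system", username="root", password="")

    # AQL plays the role SQL plays in SQLite: declarative queries over collections.
    cursor = db.aql.execute(
        "FOR doc IN users FILTER doc.age >= @min_age RETURN doc",
        bind_vars={"min_age": 18},
    )
    for doc in cursor:
        print(doc)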
The ArangoDB download page offers several packages that you could use to deploy ArangoDB alongside your app. We offer a Windows zip package that can be installed without user interaction; for Linux distributions you'd probably want to use the respective package for that distribution. Easy deployability is one of our core goals.
Regarding the database and your data itself: it is persisted to disk via memory-mapped files. However, indexes and other structures are rebuilt during startup, which is why we describe ArangoDB as mostly in-memory.
Regular access to ArangoDB (and Foxx) goes through the HTTP interface, and you get JSON documents as responses. The drivers abstract that interface for you. If you implement Foxx apps, you may need to formulate requests yourself.
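That also means you can talk to the server without any driver at all; here is a sketch using the requests library (the host, credentials, and collection are assumptions):

    # Sketch: using ArangoDB's HTTP API directly, no driver involved.
    # Host, credentials, and collection name are placeholders.
    import requests

    base = "http://localhost:8529/_db/_system"
    auth = ("root", "")

    # Create a document; the response body is a JSON description of the result.
    resp = requests.post(f"{base}/_api/document/users",
                         json={"name": "alice"}, auth=auth)
    print(resp.json())  # e.g. {"_id": "users/...", "_key": "...", "_rev": "..."}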
ArangoDB datafiles aren't intended to be moved across machines, though it may work as long as both sides have the same OS and architecture. The proper way is to run arangodump on the first machine and arangorestore on the second. The dump files are mostly JSON inside (one JSON document per line), so they're portable and even simple to load in Python; you could access the dump facility directly from Python and, say, prepare an email for the user with the content.
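Since each line of a dump datafile is a JSON object, loading one from Python takes only a few lines (the file name below is a placeholder, and depending on the ArangoDB version each line may wrap the document in an envelope with a "data" field):

    # Sketch: reading an arangodump datafile, one JSON object per line.
    # The file name is a hypothetical placeholder.
    import json

    docs = []
    with open("users_abc123.data.json") as fh:
        for line in fh:
            line = line.strip()
            if line:
                docs.append(json.loads(line))

    print(len(docs), "documents loaded")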
The most sustainable way of running ArangoDB is as a service; note that you may need elevated privileges to register and (re)start services on Windows. The service then binds a TCP port, which you can access from other nodes on the network.

Single Shared Database, Fluent NHibernate, Many clients

I am working on an inventory application (C# .NET 4.0) that will simultaneously inventory dozens of workstations and write the results to a central database. To save myself having to write a DAL, I am thinking of using Fluent NHibernate, which I have never used before.
Is it safe and good practice to let the inventory application, which runs as a standalone application, talk directly to the database using NHibernate? Or should I use a client-server model where all database access goes through a server that reads/writes the database? In other words, if 50 workstations were being inventoried concurrently, there would be 50 active DB sessions. I am thinking of using GUID comb for the PK IDs.
Depending on the environment in which your application will be deployed, you should also consider that direct database connections to a central server might not always be allowed for security reasons.
Creating a simple REST Service with WCF (using WebServiceHost) and simply POST'ing or PUT'ing your inventory data (using HttpClient) might provide a good alternative.
As a result, clients can be very simple and can easily be written for other systems (Linux? Android?), and the server keeps full control over how and where data is stored.
It depends ;)
NHibernate has optimistic concurrency control out of the box, which is good enough for many situations. So if you just create data on 50 different stations, there should be no problem. If creating data on one station depends on data from all stations, it gets tricky, and a central server would help.

SubSonic-based app that connects to multiple databases

I developed an app that connects to a SQL Server 2005 database, so my DAL objects were generated using information from that DB.
It will also be possible to connect to Oracle and MySQL databases, all with the same table structures (aside from the normal differences in types, such as varbinary(max) in SQL Server and BLOB in Oracle, and so on). For this purpose, I have already defined multiple connection strings and multiple SubSonic providers for the different DBs the app will run on.
My question is: since I generated my objects using a SQL Server database, should the generated objects work transparently with the other DBs, or do I need to generate a different DAL for each database engine I use? Should I be aware of any bugs I may encounter while performing these operations?
Thanks in advance for any advice on this issue.
I'm using SubSonic 2.2 by the way....
From what I've been able to test so far, I can't see an easy way to achieve what I'm trying to do.
The ideal situation for me would have been to generate SubSonic objects using SQL Server, for example, and then switch dynamically to MySQL just by creating the correct provider for it at runtime, along with its connection string. I got to a point where my app would correctly connect to a MySQL DB instead of SQL Server, but there's a point where the app fails, since SubSonic internally generates queries of the form
SELECT * FROM dbo.MyTable
which MySQL obviously doesn't support. I also noticed queries that enclose table names in brackets ([]), so it seems a number of factors limit the use of one provider across multiple DB engines.
I guess my only other option is to sort it out with multiple generated providers, although I must admit I'm not comfortable knowing that I'll have N copies of basically the same classes in my project.
I would really love to hear from anyone else if they've had similar experiences. I'll be sure to post my results once I get everything sorted out and working for my project.
Has any of this changed in 3.0? This would definitely be a worthy reason for me to upgrade if life is any easier on this matter...

How to use an MS-Access file from Linux?

I'm studying an introductory course in databases, and one of the exercises is to work with MS Access. However, I'm using Linux at home, and although I can use the computer labs at the university, it is far from convenient (limited opening hours; my studying time is mostly at night).
So how can I use an Access file (*.mdb) in Linux? By use I mean changing tables, writing queries and so on.
Are there tools to convert it to another database format (mysql, postgresql or even gadfly)?
Also what problems may I encounter?
Although it's a bit dated, I've had good success with mdbtools, a set of command-line tools for accessing Access databases and converting them to other formats. I've used it to import databases into PostgreSQL.
If you're running an Ubuntu variant you can install it with:
sudo apt-get install mdbtools
or you can download it from here.
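For example, to pull a table into Python you can shell out to mdb-export, which prints the table as CSV (the file and table names below are placeholders):

    # Sketch: exporting a table from an Access .mdb file with mdbtools'
    # mdb-export, which writes CSV to stdout. Names are placeholders.
    import csv
    import io
    import subprocess

    out = subprocess.run(
        ["mdb-export", "university.mdb", "Students"],
        capture_output=True, text=True, check=True,
    )

    # The first CSV row is the header; DictReader maps it onto each data row.
    for row in csv.DictReader(io.StringIO(out.stdout)):
        print(row)

mdb-tables lists the tables in a file, and mdb-schema emits DDL you can adapt for PostgreSQL or MySQL.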
You're out of luck. Access has no real equivalent on Linux, and while Kexi is an interesting alternative that can import Access files and aims to provide similar functionality, it doesn't actually use Access files once the data is imported.
If your assignment is to develop an Access application with forms etc., as opposed to just using an mdb database as a store, then you can try a recent release of Wine with a compatible Access version (see the compatibility list) or, even better, find a Windows machine where you're sure it's going to work.
Not to be forgotten, a virtual machine running Windows would let you achieve the same thing on your Linux box.
I am currently trying Access with Wine on Ubuntu, and I seem to be getting there. I have found that I need to copy various DLLs manually, but that could easily be down to my lack of reading up on the subject.
From the documentation: Connecting To Microsoft Access. However, this seems to indicate that you need Access running on a Windows host and have to connect via ODBC... See also Known Problems.
I recently discovered https://dbeaver.io/, a Java application for managing different database types (MySQL, PostgreSQL, ...), a bit like phpMyAdmin (but a desktop application, so no server required). It can manage MS Access, except if the version is too old (which is probably my case).
You can work with Access through a connection (ODBC or OLEDB), as long as you only need to manage the "database" dimension of the file (tables and views, which are called "queries" in Access).
Once the connection is open (see here for connection strings), you can send SQL commands to your mdb database, such as (where cn is a connection object):
cn.execute "CREATE TABLE myTableName (myTable_id autoNumber, myTable_code Text, ...)"
Please note that MS Access uses a specific DDL that looks like standard T-SQL but isn't quite the same. Check the syntax in the MS Access help.
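The same idea works from Python over ODBC, assuming an Access ODBC driver is installed (Microsoft's driver on Windows, or the MDBTools ODBC driver on Linux); the driver name and file path below are placeholders:

    # Sketch: sending Access DDL to an .mdb file via ODBC from Python.
    # Requires an Access ODBC driver; driver name and path are placeholders.
    import pyodbc

    cn = pyodbc.connect(
        r"DRIVER={Microsoft Access Driver (*.mdb)};DBQ=C:\data\school.mdb"
    )
    cur = cn.cursor()
    # COUNTER is Jet's autonumber type; TEXT(50) is a 50-character text field.
    cur.execute(
        "CREATE TABLE myTableName (myTable_id COUNTER, myTable_code TEXT(50))"
    )
    cn.commit()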
Depending on your database (and its constraints, default values, primary keys, relations, data validation rules, and so on), transferring out of Access can be easy and straightforward, or it might not even be possible. You will encounter a problem each time your database implements an Access-specific/non-standard SQL rule.
If you really need to convert your Access data to something else, I'd advise you to (1) export it to MS SQL (the free version will be fine; an upsizing wizard is available in Access or on this site), (2) use an additional tool like this one to generate a "CREATE DATABASE" SQL script, with or without data inserts, and (3) use this script to try to create the database and its data on another database server.
If you've got an assignment to work with Access, then frigging find a Windows computer and do your exercise on the native platform for Access. It's completely senseless to do anything else, as you won't be learning anything useful about Access.
If the assignment is to use a Jet data store, then that's something of a different story, and if it is, you should have worded your question differently. I wouldn't recommend using Jet on anything but a native Windows file system. Certainly, if the project is to actually read/write data in a Jet data file, you're not really fulfilling the assignment unless you're at least using Windows as the ODBC host.
