SchemaCrawler user name and roles

I am using SchemaCrawler in my project. Is there any way to get the users and their roles? The Catalog object doesn't contain any such information.
I am using this configuration:
val schemaCrawlerOptions = SchemaCrawlerOptionsBuilder
  .builder
  .withSchemaInfoLevel(SchemaInfoLevelBuilder.maximum())
  .toOptions

SchemaCrawler does not provide a way to obtain users and roles as part of the database metadata catalog. However, you can execute arbitrary SQL queries and show the results in SchemaCrawler output. Depending on your database, you may be able to obtain user and role information by means of a SQL query, and then include it as part of the SchemaCrawler output. See the SchemaCrawler examples (included with the download) for an idea of how to run SQL queries with SchemaCrawler.
Sualeh Fatehi, SchemaCrawler.
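As an illustration of the database-specific query approach (this sketch is not part of SchemaCrawler; it assumes PostgreSQL, the psycopg2 Python driver, and placeholder connection details):

# Minimal sketch: reading users and roles straight from PostgreSQL's
# pg_roles system catalog. This is independent of SchemaCrawler, and
# the connection parameters below are placeholders.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="mydb",
                        user="admin", password="secret")
try:
    with conn.cursor() as cur:
        # rolcanlogin distinguishes login users from pure roles
        cur.execute("SELECT rolname, rolcanlogin FROM pg_roles ORDER BY rolname")
        for rolname, can_login in cur.fetchall():
            print(f"{rolname}: {'user' if can_login else 'role'}")
finally:
    conn.close()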

Related

Elasticsearch using my PostgreSQL database

By default, Elasticsearch seems to query its own database for the indexes defined during the search.
Is it possible for Elasticsearch to query not its own database but mine, in PostgreSQL?
No.
Elasticsearch is a database in its own right; it's not an interface/middleman for other backends.
If you want to conditionally query different databases, you need to implement that logic at the application level, as sketched below.
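A minimal sketch of that application-level routing, assuming Elasticsearch's HTTP _search endpoint, the psycopg2 driver for PostgreSQL, and placeholder index/table names and credentials:

# Hypothetical app-level routing between Elasticsearch and PostgreSQL.
# URLs, index/table names, and credentials are placeholders.
import requests
import psycopg2

def search_users(term, use_elasticsearch):
    if use_elasticsearch:
        # Query Elasticsearch over its HTTP API
        resp = requests.post(
            "http://localhost:9200/users/_search",
            json={"query": {"match": {"name": term}}},
        )
        return [hit["_source"] for hit in resp.json()["hits"]["hits"]]
    # Otherwise query PostgreSQL directly
    conn = psycopg2.connect(host="localhost", dbname="mydb",
                            user="app", password="secret")
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT name, email FROM users WHERE name = %s", (term,))
            return cur.fetchall()
    finally:
        conn.close()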

Database within a Database in Databricks

Is it possible to have a folder or database within a database in Azure Databricks? I know you can use "create database if not exists xxx" to get a database, but I want to have folders within that database where I can put tables.
Thanks.
If I understand your question correctly, you want to have a database stored in a specific location? Then you can use CREATE DATABASE with an explicit LOCATION parameter (see the docs), like this:
CREATE DATABASE IF NOT EXISTS <name>
LOCATION 'path to folder';
And then all data for tables in this database will be stored under the given location.
P.S. You can also specify the path for a table explicitly when writing, using the path option:
df.write.format("delta") \
  .option("path", "path_to_table") \
  .saveAsTable("database.table")

Spark results accessible through API

We would really like some input here about how the results of a Spark query can be made accessible to a web application. Given that Spark is widely used in industry, I would have thought this part would have lots of answers/tutorials, but I didn't find anything.
Here are a few options that come to mind:
1. Spark results are saved in another DB (perhaps a traditional one), and a request for a query returns the new table name, for access through a paginated query. That seems doable, although a bit convoluted, as we need to handle the completion of the query.
2. Spark results are pumped into a messaging queue, from which a socket-server-like connection is made.
What confuses me is that other connectors to Spark, like those for Tableau, using something like JDBC, seem to have access to all the data (not just the top 500 rows that we typically get via Livy or other REST interfaces to Spark). How do those connectors get all the data through a single connection?
Can someone with expertise help us in that sense?
The standard way I think would be to use Livy, as you mention. Since it's a REST API you wouldn't expect to get a JSON response containing the full result (could be gigabytes of data, after all).
Rather, you'd use pagination with ?from=500 and issue multiple requests to get the number of rows you need. A web application would only need to display or visualize a small part of the data at a time anyway.
But from what you mentioned in your comment to Raphael Roth, you didn't mean to call this API directly from the web app (with good reason). So you'll have an API layer that is called by the web app and which then invokes Spark. But in this case, you can still use Livy+pagination to achieve what you want, unless you specifically need to have the full result available. If you do need the full results generated on the backend, you could design the Spark queries so they materialize the result (ideally to cloud storage) and then all you need is to have your API layer access the storage where Spark writes the results.
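To make that concrete, here is a minimal sketch of the pagination idea against a hypothetical API layer; the /results endpoint, its from/size parameters, and the base URL are assumptions for illustration, not part of Livy itself:

# Hypothetical paginated results endpoint exposed by your own API layer.
# The URL and the from/size parameters are assumptions for illustration.
import requests

BASE_URL = "http://api.example.com/results"
PAGE_SIZE = 500

def fetch_page(query_id, offset):
    resp = requests.get(
        f"{BASE_URL}/{query_id}",
        params={"from": offset, "size": PAGE_SIZE},
    )
    resp.raise_for_status()
    return resp.json()["rows"]

# The web app only ever pulls the page it needs to display
first_page = fetch_page("query-123", offset=0)
second_page = fetch_page("query-123", offset=PAGE_SIZE)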

Does Azure SQL API support xml query?

I know it is better to convert XML to JSON to store it in Cosmos DB, especially for querying documents using the SQL API. But would it be OK to flatten XML data to store it inside a document and then query it using the SQL API? I'm not even sure whether the SQL API supports XML queries or not.
The simple answer is NO :(
Cosmos DB stores JSON documents:
https://learn.microsoft.com/en-us/azure/cosmos-db/introduction
Depending on the chosen API, you can handle these JSON documents in a graph model (Gremlin) or as document collections (MongoDB, DocumentDB), or query them with SQL, etc. But the result is always a JSON document.
But there are a lot of tools to convert XML to JSON and back. Here is a discussion about it:
How to convert JSON to XML or XML to JSON?
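For example, a minimal sketch using the third-party xmltodict package (the sample XML is made up for illustration):

# Minimal sketch: converting XML to JSON before storing it in Cosmos DB.
# Uses the third-party xmltodict package; the sample XML is made up.
import json
import xmltodict

xml = '<order id="42"><item>book</item><item>pen</item></order>'

doc = xmltodict.parse(xml)         # XML -> nested dicts
print(json.dumps(doc, indent=2))   # JSON, ready to store as a document

# And back again, if needed:
print(xmltodict.unparse(doc))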
I hope it helps.
Regards
gy
To my knowledge, it is not possible. More information for your reference: Query Azure Cosmos DB data with SQL queries

Retrieving data from couchDB

I am new to CouchDB but have good experience working with relational databases. Can anyone tell me how to connect to a CouchDB database and retrieve the data stored in it? To give an equivalent example from a relational database: in MySQL, we use a connector to connect to the database and then issue a query, for example select username from tablename where password = 'abc'.
CouchDB talks HTTP and JSON, so you can use any HTTP client and JSON parser/generator. You can find a nice introduction in The Definitive Guide.
Try this URL: http://localhost:5984/_utils/. It will open Futon, CouchDB's web administration interface.
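For example, a minimal sketch using Python's requests as the HTTP client (the database name and document id are placeholders):

# Minimal sketch: talking to CouchDB over plain HTTP with requests.
# The database name and document id below are placeholders.
import requests

BASE = "http://localhost:5984"

# List all databases on the server
print(requests.get(f"{BASE}/_all_dbs").json())

# Fetch a single document by its id
doc = requests.get(f"{BASE}/user_credentials/some_doc_id").json()
print(doc)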
CouchDB is a NoSQL database, so it works using HTTP requests (URL based). Data stored in CouchDB is in the form of JSON documents, so there is no concept of tables. In short, a database in SQL corresponds to a database in CouchDB, and the rows of a SQL table correspond to documents in CouchDB.
Coming back to your question: to retrieve data from CouchDB, there is a concept called views, which use map and reduce functions (JavaScript functions). Using these views, CouchDB indexes your search across the complete database (all its documents), so you need to write a map function specifying the condition to search on. Here's an example -
// Map function: index documents by password, returning the username
function(doc) {
  if (doc.password) {
    emit(doc.password, doc.username);
  }
}
The above example is a simple map function: it indexes every document in the database that has a password field, using the password as the key and the username as the value. The password input value (in this case "abc") is then supplied as the key in the query string you send to the CouchDB view URL. Now, you might ask: where is the database to be searched specified? As I said, views are created in order to search, and they are stored in the particular database you want to search. So, if you want to search a database named "User_Credentials", create a view in "User_Credentials" with the above map function, and query it as in the sketch below. More details on how this can be done are here: CouchDB Guide to Views
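For example, assuming the map function above is saved in a design document named auth under the view name by_password (both names are hypothetical), the view could be queried with Python's requests like this:

# Hypothetical usage: look up usernames by password via the view above.
# The design document (auth) and view name (by_password) are assumptions,
# and the database name is lowercased because CouchDB requires lowercase
# database names.
import json
import requests

url = "http://localhost:5984/user_credentials/_design/auth/_view/by_password"
# View keys are JSON-encoded in the query string, so "abc" is sent as "\"abc\""
resp = requests.get(url, params={"key": json.dumps("abc")})

for row in resp.json()["rows"]:
    print(row["value"])  # the username emitted by the map function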
