Can I get SchemaCrawler to ignore the schema name?

I'm attempting to compare two Oracle DBs by running a report on two different schema names (in my case, a schema prefix), e.g. using:
-schemas=FOO.*
then
-schemas=BAR.*
Is there a way of hiding this prefix from the report, so that it isn't shown as an obvious difference when comparing the two reports?
I know I can use the 'unimportant' text feature in Beyond Compare, but it would be nice to cover this upfront.
I have a feeling that I'm missing something obvious, or maybe no one ever requires this as the schema name is fairly fundamental. I suppose I am just comparing across schemas.
If it is in the help, I have probably misunderstood what I have read.
Any hints would be welcome.
Many thanks.

Of course, this was answered in an obvious place...
SchemaCrawler HowTo
How to hide catalog and schema names in text output
Change the configuration for the SchemaCrawler
schemacrawler.format.show_unqualified_names=true in the
schemacrawler.config.properties file. This setting will show
unqualified names of database objects such as tables and procedures.
That is, the catalog and schema names will not be displayed. Use with
care, especially if you have foreign keys that reference tables in
other schemas, or synonyms.
However, in my situation the schema names actually appeared inside the returned SQL, procedures, etc., so they are fundamental to what the DB is holding.
As far as I can see, my best way is to use Beyond Compare or something similar to strip these small strings out to aid in the comparison.
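If you would rather do that stripping up front than rely on the comparison tool, a small pre-processing step may be enough. The following is a minimal Python sketch, not a SchemaCrawler feature: the function name `strip_schema_prefix` and the sample SQL are hypothetical, and it assumes the schema name always appears as `SCHEMA.` (optionally double-quoted) directly in front of an object name.

```python
import re


def strip_schema_prefix(report_text: str, schema: str) -> str:
    """Remove `SCHEMA.` or `"SCHEMA".` qualifiers from report text.

    Hypothetical helper: it only removes the schema name when it is
    immediately followed by a dot and is not part of a longer identifier.
    """
    pattern = re.compile(
        r'(?<![\w$#"])"?' + re.escape(schema) + r'"?\.',
        re.IGNORECASE,
    )
    return pattern.sub("", report_text)


# Normalize both reports, then compare (or write them out for a diff tool).
foo_report = 'CREATE VIEW FOO.V AS SELECT * FROM FOO.ORDERS JOIN "FOO"."ITEMS"'
bar_report = 'CREATE VIEW BAR.V AS SELECT * FROM BAR.ORDERS JOIN "BAR"."ITEMS"'
reports_match = (
    strip_schema_prefix(foo_report, "FOO") == strip_schema_prefix(bar_report, "BAR")
)
```

Note that a plain textual replacement like this can also hit string literals or comments that happen to contain `FOO.`, so it is only suitable for producing a comparison copy, never for rewriting the real DDL.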

Related

DBT Duplication Check Ignores Schemas

During dbt compile, there is a model duplication check to be sure models aren’t stepping on top of each other. This check is causing me problems.
Our Architecture
Our system delineates the stages of processing into different schemas, and we want to begin using dbt. So, say we’re importing a raw dataset we’re calling jaffles, we’ll have a raw.jaffles table, a clean.jaffles table, and so on. Note that raw and clean in this example are different schemas.
The Problem
This breaks the duplication check. No matter how I customize the schema names, or how I call ref, the duplication check happens before touching any of that, notices we have two models named “jaffles”, ignores that they wouldn’t actually collide from being in different schemas, and throws an error.
Possible Solutions
Ideally, I'd customize how it solves for the paths it uses to check duplication to include schema. But I can't find how to customize that part.
Possibly I could skip this check altogether and do the integrity check myself. But I couldn't find options to disable this.
The only solution I'm seeing that could work is to rename each of the views to be unique, and this would be a lot of work polluting an otherwise super-clean naming convention we already have established.
As stated in the docs, "model names need to be unique, even if they are in distinct folders".
What you could do, though, is to use custom aliases (see the docs), where you can re-use the same table/view name within 2 or more different schemas. In your example, you could have two different models that have a specific schema assigned each:
-- models/.../raw_jaffles.sql
{{ config(alias='jaffles', schema='raw') }}
-- models/.../clean_jaffles.sql
{{ config(alias='jaffles', schema='clean') }}
Nevertheless, the file names still need to differ from one another.

PouchDB structure

I am new to the NoSQL concept, and when I started to learn PouchDB I found this conversion chart. My confusion is: how does PouchDB handle the case where I have multiple tables? Does it mean that I need to create multiple databases? From my understanding, a PouchDB database can store a lot of documents, but does a document correspond to a row in SQL, or have I misunderstood?
The answer to this question seems to be surprisingly under-documented. While @llabball clearly gave a decent answer, I don't think that views are always the way to go.
As you can read here in the section When not to use map/reduce, Nolan explains that for simpler applications, the key is to abuse _ids, and leverage the power of allDocs().
In other words, if you had two separate types (say artists, and albums), then you could prefix the id of each type to obtain an easily searchable data set. For example _id: 'artist_name' & _id: 'album_title', would allow you to easily retrieve artists in name order.
Laying out the data this way will result in better performance due to not requiring extra indexes, and less code. Clearly however, if your data requirements are more complex, then views are the way to go.
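The reason prefixed ids work is that documents are kept sorted by _id, so all ids sharing a prefix form one contiguous range. The sketch below illustrates that idea in plain Python with a sorted list; it is not PouchDB code, and `ids_with_prefix`, `HIGH_SENTINEL` and the sample ids are made up for illustration. In PouchDB itself, the same kind of range is typically requested through the `startkey` and `endkey` options of allDocs().

```python
from bisect import bisect_left, bisect_right

# Assumption: a very high code point that sorts after any character used in the ids.
HIGH_SENTINEL = "\uffff"


def ids_with_prefix(sorted_ids: list[str], prefix: str) -> list[str]:
    """Return every id starting with `prefix` from an already sorted list."""
    start = bisect_left(sorted_ids, prefix)
    end = bisect_right(sorted_ids, prefix + HIGH_SENTINEL)
    return sorted_ids[start:end]


# Two "types" living in one database, distinguished only by their id prefix.
doc_ids = sorted([
    "artist_Nirvana",
    "album_Nevermind",
    "artist_Bowie",
    "album_Hunky Dory",
])

artists = ids_with_prefix(doc_ids, "artist_")  # artists, already in name order
albums = ids_with_prefix(doc_ids, "album_")    # albums, already in title order
```

Because the range lookup relies only on the primary sort order, no secondary index has to be built or maintained, which is the performance argument made above.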
... does it mean that i need to create multiple databases?
No.
... a document mean a row in sql or am i misunderstood?
That's right. A SQL table defines column headers (name and type); these correspond to the JSON property names of the doc.
So, all docs (rows) with the same properties (a so-called "schema") are the equivalent of your SQL table. You can have as many different schemata in one database as you want (visit json-schema.org for some inspiration).
How to request them separately? Create CouchDB views! You can get all or some "rows" of your tabular data (docs with the same schema) with one request, as you know it from SQL.
To make such views easy to write, it is very common for CouchDB docs to carry a type property. The name you would have given the SQL table can be your type, like doc.type: "animal"
Your view names might then be animalByName or animalByWeight, depending on your needs.
Sometimes a multiple-database plan is a good option, like a database per user or even a database per user-feature. Take a look at this conversation on the CouchDB mailing list.

secondary index on column store dbs

Is there any column store database that supports secondary index ?
I know HBase does, but it's not there yet.
Haggai.
By storing overlapping projections in different sort orders, column stores based on the C-Store architecture (so, as far as commercial implementations go, Vertica) natively support secondary indexes.
See http://db.csail.mit.edu/projects/cstore/vldb.pdf
Also check out MonetDB, which treats "create index" statements as hints for its self-organizing engine.
Take a look at the IndexSpecification class, which is part of r0.19.3.
Here you can see how to use it (maybe they have a test for that as well)
I've never used it and don't know if it performs well. Please share your results with us.
good luck
-- Yonatan
Sybase IQ supports as many indexes as you might ever desire on every column and even within a column (e.g. the word index which lets you stay with defaults or specify your own delimiter)

Optional or boolean elements to specify characteristics in XML schema?

I'm trying to create an XML schema to describe some aspects of hospitals. A hospital may have 24 hour coverage on: emergency services, operating room, pharmacist, etc. The entire list is relatively short - around 10. The coverage may be on more than one of these services.
My question is how best to represent this. I'm thinking along the lines of:
<coverage>
<emergencyServices/>
<operatingRoom/>
</coverage>
Basically, the services are optional and, if they exist, the coverage is offered by the hospital.
Alternatively, I could have:
<coverage>
<emergencyServices>true</emergencyServices>
<operatingRoom>true</operatingRoom>
<pharmacist>false</pharmacist>
</coverage>
In this case, I require all the elements, but a value of false means that the coverage isn't offered.
There are probably other approaches.
What's the best practice for something like this? And, if I use the first option, what type should the elements be in the schema?
Best practice here depends really on the consumer.
The short and simple rule is that markup is for structure, and content is for data. So having them contain xs:boolean values is generally the best course.
Now, on to the options:
Having separate untyped elements is simple and clear; sometimes processing systems may have difficulty reading them, because some XML-relational mappers may not see any data in the elements to put in relational tables. But if they had values, like <emergencyServices>true</emergencyServices>, then the relational table would have a value to hold.
Again, if you have fixed element names, it means if your consumer is using a system that maps the XML to a database, every time you add a service, a schema change will have to be made.
There are several other ways; each has trade-offs:
Use an xs:string with an enumeration, and allow multiple occurrences. Then you could have <coverage>emergencyServices</coverage><coverage>operatingRoom</coverage>. It makes adding to the list simpler, but allows duplicates. This scheme does not require schema changes in the database for the consumer.
You could use attributes on the <coverage> element. They would have an xs:boolean type, but adding a service would still require a schema change. Of course, this evokes the attribute vs. element argument.
One good resource is Chapter 11 of Effective XML. At least this should be read before making a final decision.

Alternative Data Access pattern to Repository

I have certain objects in my domain which are not aggregate roots/entities, yet I still need to retrieve them from a database. I don't want to confuse things by creating repositories for these things. So, what are alternative data access patterns? Would you simply create a DAO for them, while still of course separating the interface?
Edit:
Some more detail on what I'm doing. I need to create a code. This code has certain rules as to its format. One of the rules is that the final character must be a unique number incremented by one from the last code generated. For example:
ABCD1
ABCD2
ABCD3
So, I'm keeping a table with one row and one column to store the number in question. Now, I don't want to consider this number an entity and create a repository for it - that's overkill. I just need a way of retrieving the number, adding 1 to it, and saving it. I know there are myriad ways I could do it, but I'm wondering if there's a customary way.
There are several data access patterns that could apply, in theory. You'd need to provide more detail though if you want us to suggest a specific pattern.
Without more detail, all I can suggest is to consider looking into Martin Fowler's Patterns of Enterprise Application Architecture book.
Edit: Customary way? No, not that I can think of - it really depends on where and how you're using this unique code in your domain. If I were doing this, I'd probably create a small service that speaks directly to the database to perform this function - not as heavy-weight as a repository, and very focused on the problem at hand.
Based on the edit: I would look first at the context in which you need to create that code. Perhaps there are some related entities or something that you are missing.
btw, I find the question really interesting as it comes up from time to time while coding specific features. I usually end up finding I was missing something on the scenario and it ends up fitting well with the normal repository pattern.
After surveying the options I'm going with the Table Gateway pattern.
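For illustration, here is a minimal sketch of what a table-gateway-style class for the one-row counter might look like. It is written in Python with SQLite purely as an assumption (the question names no language or database), and the table name `code_sequence`, the column `last_number`, the class `CodeSequenceGateway` and the helper `next_code` are all hypothetical.

```python
import sqlite3


class CodeSequenceGateway:
    """Gateway to a one-row, one-column table holding the last number used."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS code_sequence (last_number INTEGER NOT NULL)"
        )
        # Seed the single row the first time the table is used.
        row_count = self._conn.execute(
            "SELECT COUNT(*) FROM code_sequence"
        ).fetchone()[0]
        if row_count == 0:
            self._conn.execute("INSERT INTO code_sequence (last_number) VALUES (0)")
        self._conn.commit()

    def next_number(self) -> int:
        """Increment the stored number and return the new value."""
        with self._conn:  # commits on success, rolls back on an exception
            self._conn.execute(
                "UPDATE code_sequence SET last_number = last_number + 1"
            )
            row = self._conn.execute(
                "SELECT last_number FROM code_sequence"
            ).fetchone()
        return row[0]


def next_code(gateway: CodeSequenceGateway, prefix: str) -> str:
    """Build a code such as ABCD1 from a prefix and the next number."""
    return f"{prefix}{gateway.next_number()}"


gateway = CodeSequenceGateway(sqlite3.connect(":memory:"))
codes = [next_code(gateway, "ABCD") for _ in range(3)]
```

The gateway keeps all SQL for the table in one place without pretending the counter is an entity. This sketch only covers a single connection; with several writers, the increment and read would need whatever locking or atomic update mechanism the actual database offers.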
