How to search column_names in Vertica?

Anyone know of a handy function to search through column_names in Vertica? From the documentation, it seems like \d only queries table_names. I'm looking for something like MySQL's information_schema.columns, but can't find any information about a similar table of meta-data.
Thanks!

In 5.1, if you have sufficient permissions, you can run
SELECT * FROM v_catalog.columns;
to get column information. For some things you'll need to join with
v_catalog.tables

The answer may differ depending on the version of Vertica you are using.
In the latest version, 5.1, there is a COLUMNS system table. Judging from the online documentation, these seem to be the most useful columns, with their types:
TABLE_SCHEMA VARCHAR
TABLE_NAME VARCHAR
DATA_TYPE VARCHAR
That should give you what you need. If your version doesn't have the system table, let me know what version you're running and I'll see what we can do.
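If you want to run that lookup from a script, here is a minimal sketch of a helper that builds the search query. The function name `build_column_search` and the `:pattern` named placeholder are my own assumptions; placeholder syntax depends on the client driver you use, so adjust it to match. It only builds the SQL text and the bind values, it does not connect to Vertica.

```python
def build_column_search(pattern):
    """Return SQL text and bind values for a case-insensitive column-name search."""
    # ":pattern" is an assumed placeholder style; change it to suit your driver.
    sql = (
        "SELECT table_schema, table_name, column_name, data_type "
        "FROM v_catalog.columns "
        "WHERE column_name ILIKE :pattern "
        "ORDER BY table_schema, table_name, column_name"
    )
    return sql, {"pattern": pattern}

sql, params = build_column_search("%customer%")
print(sql)
print(params)
```

Pass the two returned values to your driver's execute call, or paste the SQL into vsql with the pattern filled in by hand.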

Wrap this Python script in a shell function and you'll be able to see all tables that contain all of the given columns:
import argparse
parser = argparse.ArgumentParser(description='Find Vertica attributes in tables')
parser.add_argument('names', metavar='N', type=str, nargs='+', help='attribute names')
args = parser.parse_args()
def vert_attributes(*names):
    # Build a query listing the tables that contain every given column name.
    first_name = names[0].lower()
    first = "select root.table_name, root.column_name from v_catalog.columns root "
    last = " where root.column_name like '%s' " % first_name
    names = names[1:]
    if len(names) >= 1:
        # One inner join per additional column name.
        return first + " ".join([" inner join (select table_name from v_catalog.columns where column_name like '%s') q%s on root.table_name = q%s.table_name " % (name.lower(), index, index) for index, name in enumerate(names)]) + last
    else:
        return first + last
print(vert_attributes(*args.names))

How to apply DISTINCT on only date part of datetime field in sqlalchemy python?

I need to query my database and return the result by applying Distinct on only date part of datetime field.
My code is:
@blueprint.route('/<field_id>/timeline', methods=['GET'])
@blueprint.response(field_timeline_paged_schema)
def get_field_timeline(
    field_id,
    page=1,
    size=10,
    order_by=['capture_datetime desc'],
    **kwargs
):
    session = flask.g.session
    field = fetch_field(session, parse_uuid(field_id))
    if field:
        query = session.query(
            func.distinct(cast(Capture.capture_datetime, Date)),
            Capture.capture_datetime.label('event_date'),
            Capture.tags['visibility'].label('visibility')
        ).filter(Capture.field_id == parse_uuid(field_id))
        return paginate(
            query=query,
            order_by=order_by,
            page=page,
            size=size
        )
However this returns the following error:
(psycopg2.errors.InvalidColumnReference) for SELECT DISTINCT, ORDER BY expressions must appear in select list
The resulting query is:
SELECT distinct(CAST(tenant_resson.capture.capture_datetime AS DATE)) AS distinct_1, CAST(tenant_resson.capture.capture_datetime AS DATE) AS event_date, tenant_resson.capture.tags -> %(tags_1)s AS visibility
FROM tenant_resson.capture
WHERE tenant_resson.capture.field_id = %(field_id_1)s
Error is:
Query error - {'error': ProgrammingError('(psycopg2.errors.InvalidColumnReference) SELECT DISTINCT ON expressions must match initial ORDER BY expressions\nLINE 2: FROM (SELECT DISTINCT ON (CAST(tenant_resson.capture.capture...\n ^\n',)
How to resolve this issue? Cast is not working for order_by.
I am not familiar with SQLAlchemy, but the query below works as you expect. Please note the DISTINCT ON.
Maybe there is a way in SQLAlchemy to execute non-trivial parameterized queries? That would give you the extra benefit of being able to test and optimize the query upfront.
SELECT DISTINCT ON (CAST(tenant_resson.capture.capture_datetime AS DATE))
CAST(tenant_resson.capture.capture_datetime AS DATE) AS event_date,
tenant_resson.capture.tags -> %(tags_1)s AS visibility
FROM tenant_resson.capture
WHERE tenant_resson.capture.field_id = %(field_id_1)s;
You can order by event_date if your business logic needs it.
The query posted by @Stefanov.sm is correct. In SQLAlchemy terms it would be
query = (
    session.query(
        Capture.capture_datetime.label('event_date'),
        Capture.tags['visibility'].label('visibility')
    ).distinct(cast(Capture.capture_datetime, Date))
    .filter(Capture.field_id == parse_uuid(field_id))
)
See the docs for more information
I needed to add order_by to my query. Now it works fine.
query = session.query(
    cast(Capture.capture_datetime, Date).label('event_date'),
    Capture.tags['visibility'].label('visibility')
).filter(Capture.field_id == parse_uuid(field_id)) \
    .distinct(cast(Capture.capture_datetime, Date)) \
    .order_by(cast(Capture.capture_datetime, Date).desc())
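For anyone wondering why the ORDER BY has to start with the DISTINCT ON expression: PostgreSQL's DISTINCT ON keeps the first row of each group, and the ORDER BY is what decides which row comes first. Below is a plain-Python emulation of that behaviour, not SQLAlchemy code. The helper name `distinct_on_date` and the sample rows are made up for illustration.

```python
from datetime import datetime
from itertools import groupby

def distinct_on_date(rows):
    """Emulate DISTINCT ON (date) with ORDER BY date DESC, datetime DESC."""
    # Sort by the DISTINCT ON key (the date) first, then by the tie-breaker.
    ordered = sorted(rows, key=lambda r: (r[0].date(), r[0]), reverse=True)
    # Keep only the first row of each date group.
    return [next(group) for _, group in groupby(ordered, key=lambda r: r[0].date())]

rows = [
    (datetime(2021, 3, 1, 9, 0), "clear"),
    (datetime(2021, 3, 1, 17, 30), "cloudy"),
    (datetime(2021, 3, 2, 8, 15), "clear"),
]
result = distinct_on_date(rows)
for captured_at, visibility in result:
    print(captured_at.date(), visibility)
```

If the sort did not lead with the date, rows of the same date would not be adjacent and "the first row per date" would not be well defined, which is roughly what the database is complaining about.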

Need help fetching data from a column

Sorry for this, but I'm really new to SQLite: I've created a database from an Excel sheet I had, and I can't seem to fetch the values of the column I need.
query = """ SELECT GNCR from table"""
cur.execute(query)
This actually works, but
query = """ SELECT ? from table"""
cur.execute(query, my_tuple)
doesn't
Here's my code:
def print_col(to_print):
    db = sqlite3.connect('my_database.db')
    cur = db.cursor()
    query = " SELECT ? FROM my_table "
    cur.execute(query, to_print)
    results = cur.fetchall()
    print(results)
print_col(('GNCR',))
The result is:
[('GNCR',), ('GNCR',), ('GNCR',), ('GNCR',), [...]]
instead of the actual values
What's the problem? I can't figure it out.
The "?" character in a query is used for parameter substitution. SQLite binds the parameter you pass as a value in place of the "?". So in effect your query after substitution is SELECT 'GNCR' FROM my_table, where 'GNCR' is treated as a text literal, so you get that text for each row returned by the query instead of the value of the column.
Basically, you should use a query parameter where you want to substitute a value, as in a WHERE clause. You can't use it for a column name.
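A minimal sketch of how the two can be combined, assuming you know up front which column names are acceptable. The `ALLOWED_COLUMNS` set, the `fetch_column` helper and the in-memory sample table are all invented for this example; the column name is checked and then formatted into the SQL text, while the filter value goes through "?".

```python
import sqlite3

# Hypothetical set of column names that callers are allowed to select.
ALLOWED_COLUMNS = {"GNCR", "NAME"}

def fetch_column(conn, column, min_value):
    """Return the values of one allowed column, filtered by a bound parameter."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unknown column: %r" % column)
    # The validated column name is interpolated; the value is bound with "?".
    query = 'SELECT "{0}" FROM my_table WHERE GNCR >= ? ORDER BY GNCR'.format(column)
    return [row[0] for row in conn.execute(query, (min_value,))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (GNCR INTEGER, NAME TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", [(1, "a"), (5, "b"), (9, "c")])
values = fetch_column(conn, "GNCR", 5)
print(values)
```

Calling `fetch_column(conn, "NAME", 5)` returns the NAME values for the same rows, and any name outside the set is rejected before it reaches the SQL text.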

python oracle where clause containing date greater than comparison

I am trying to use cx_Oracle to query a table in oracle DB (version 11.2) and get rows with values in a column between a datetime range.
I have tried the following approaches:
Tried between clause as described here, but cursor gets 0 rows
parameters = (startDateTime, endDateTime)
query = "select * from employee where joining_date between :1 and :2"
cur = con.cursor()
cur.execute(query, parameters)
Tried the TO_DATE() function and Date'' qualifiers. Still no result for Between or >= operator. Noteworthy is that < operator works. I also got the same query and tried in a sql client, and the query returns results. Code:
#returns no rows:
query = "select * from employee where joining_date >= TO_DATE('" + startDateTime.strftime("%Y-%m-%d") + "','yyyy-mm-dd')"
cur = con.cursor()
cur.execute(query)
#tried following just to ensure that some query runs fine, it returns results:
query = query.replace(">=", "<")
cur.execute(query)
Any pointers about why the between and >= operators are failing for me? (my second approach was in line with the answer in Oracle date comparison in where clause but still doesn't work for me)
I am using python 3.4.3 and used cx_Oracle 5.3 and 5.2 with oracle client 11g on windows 7 machine
Assume that your employee table contains the field emp_id and the row with emp_id=1234567 should be retrieved by your query.
Make two copies of your program that execute the following queries
query = "select to_char(:1,'YYYY-MM-DD HH24:MI:SS')||' >= '||to_char(joining_date,'YYYY-MM-DD HH24:MI:SS')||' >= '||to_char(:2,'YYYY-MM-DD HH24:MI:SS') resultstring from employee where emp_id=1234567"
and
query="select to_char(joining_date,'YYYY-MM-DD HH24:MI:SS')||' >= '||to_char(TO_DATE('" + startDateTime.strftime("%Y-%m-%d") + "','yyyy-mm-dd'),'YYYY-MM-DD HH24:MI:SS') resultstring from employee where emp_id=1234567"
Show us the code and the value of the column resultstring
You are constructing SQL queries as strings when you should be using parameterized queries. You can't use parameterization to substitute the comparison operators, but you should use it for the dates.
Also, note that the referenced answer uses the PostgreSQL parameterisation format, whereas Oracle requires you to use the ":name" format.
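As a rough sketch of the parameterized form with named bind variables: the helper name `build_joining_date_query` and the bind names `start_dt` and `end_dt` are my own choices, and the sample dates are arbitrary. It only builds the statement and the bind dictionary; it does not by itself explain why the original comparison returned no rows.

```python
from datetime import datetime

def build_joining_date_query(start, end):
    """Return the SQL text and bind values for a joining_date range query."""
    # The datetimes travel as bind variables, so no date formatting is needed.
    sql = (
        "select * from employee "
        "where joining_date >= :start_dt and joining_date <= :end_dt"
    )
    params = {"start_dt": start, "end_dt": end}
    return sql, params

sql, params = build_joining_date_query(
    datetime(2017, 1, 1), datetime(2017, 12, 31, 23, 59, 59)
)
print(sql)
# With an open cx_Oracle connection `con` (not created here), this would run as:
#   cur = con.cursor()
#   cur.execute(sql, params)
```

Passing datetime objects this way leaves the conversion to the driver, which removes the TO_DATE format string as a possible source of mismatch.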

Dynamic variable parameter in static mysql query using groovy soap ui

I would like to generate a query that returns results for the BeneficiaryID 'ABC123', along with some other inputs if they are also given. If the currency value is given, I would like to include the Currency condition in the JOIN query as well, and likewise for the Category. I have the following code snippet in the SoapUI Groovy script.
query= " CORR.BeneficiaryID LIKE 'ABC123'"
if (currencyValue!=""){
    query=query + " and CORR.Currency LIKE '${currencyValue}'"
}
if (CategoryValue!=""){
    query=query + " and CORR.Category LIKE '${CategoryValue}'"
}
log.info("Query" + query)
Outputrows = sql.rows("select CORR.Preferred as preferred ,CORR.Category as category,CORR.Currency as currency\
from BENEFICIARY CORR \
JOIN LOCATION LOC on CORR.UID=LOC.UID and ${query}")
log.info("Output rows size" + Outputrows.size())
When currency and category are not given, I would like to have the following query run and get me the results.
select CORR.Preferred as preferred ,CORR.Category as category,CORR.Currency as currency\
from BENEFICIARY CORR \
JOIN LOCATION LOC on CORR.UID=LOC.UID and CORR.BeneficiaryID LIKE 'ABC123'
and when the currency and category are given(say USD & Commercial), then the following query.
select CORR.Preferred as preferred ,CORR.Category as category,CORR.Currency as currency\
from BENEFICIARY CORR \
JOIN LOCATION LOC on CORR.UID=LOC.UID and CORR.BeneficiaryID LIKE 'ABC123' and CORR.Currency LIKE 'USD' and CORR.Category LIKE 'Commercial'
All I could see on the result for Outputrows.size() is zero(0).
Can you please tell me what I am doing wrong?
Thanks.
Here is the changed script.
Since the issue is just building the query, I am showing only that part and leaving out the SQL execution, which is not really the problem.
//Define the values or remove if you get those value from somewhere else
//Just put them here to demonstrate
//You may also try by empty value to make sure you are getting the right query
def currencyValue = 'USD'
def categoryValue = 'Commercial'
def query = 'select CORR.Preferred as preferred, CORR.Category as category,CORR.Currency as currency from BENEFICIARY CORR JOIN LOCATION LOC on CORR.UID = LOC.UID and CORR.BeneficiaryID LIKE \'ABC123\''
currencyValue ? (query += " and CORR.Currency LIKE '${currencyValue}'") : query
categoryValue ? (query += " and CORR.Category LIKE '${categoryValue}'") : query
log.info "Final query is \n ${query}"
You can then pass query on to wherever you need to run the SQL, say sql.rows(query).
You may quickly try Demo
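The same build-it-up idea also works with bound parameters instead of values pasted into the SQL text. Here is a rough sketch in Python with sqlite3 rather than Groovy, purely to show the shape of it. The helper name `build_beneficiary_query`, the in-memory tables and the sample rows are invented for illustration.

```python
import sqlite3

def build_beneficiary_query(beneficiary_id, currency=None, category=None):
    """Build the SELECT with optional conditions; values are bound, not interpolated."""
    sql = (
        "select CORR.Preferred as preferred, CORR.Category as category, "
        "CORR.Currency as currency "
        "from BENEFICIARY CORR join LOCATION LOC "
        "on CORR.UID = LOC.UID and CORR.BeneficiaryID like ?"
    )
    params = [beneficiary_id]
    # Append a condition and its value only when the optional input is given.
    if currency:
        sql += " and CORR.Currency like ?"
        params.append(currency)
    if category:
        sql += " and CORR.Category like ?"
        params.append(category)
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("create table BENEFICIARY (UID, BeneficiaryID, Preferred, Category, Currency)")
conn.execute("create table LOCATION (UID)")
conn.execute("insert into BENEFICIARY values (1, 'ABC123', 'Y', 'Commercial', 'USD')")
conn.execute("insert into BENEFICIARY values (2, 'ABC123', 'N', 'Retail', 'EUR')")
conn.executemany("insert into LOCATION values (?)", [(1,), (2,)])

all_rows = conn.execute(*build_beneficiary_query("ABC123")).fetchall()
usd_rows = conn.execute(*build_beneficiary_query("ABC123", "USD", "Commercial")).fetchall()
print(len(all_rows), len(usd_rows))
```

The SQL text and the parameter list grow together, so the number of placeholders always matches the number of values.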

Count number of rows in Pysqlite3

I have to write a function in Python with sqlite3 that counts the rows of a table.
The thing is that the user should input the name of that table once the function is executed.
So far I have the following. However, I don't know how to "connect" the variable (table) with the function, once it's executed.
Any help would be great.
Thanks
def RT():
    import sqlite3
    conn= sqlite3.connect ("MyDB.db")
    table=input("enter table name: ")
    cur = conn.cursor()
    cur.execute("Select count(*) from ?", [table])
    for row in cur:
        print str(row[0])
    conn.close()
Columns and Tables Can't be Parameterized
As explained in this SO answer, columns and tables can't be parameterized. This fact might not be documented by any authoritative source (I couldn't find one, so if you know of one please edit this answer and/or the one linked above); instead it has been learned through people trying exactly what was attempted in the question.
The only way to dynamically insert a column or table name is through standard Python string formatting:
cur.execute("Select count(*) from {0}".format(table))
Unfortunately, this opens you up to the possibility of SQL injection.
Whitelist Acceptable Column/Table Names
This SO answer explains that you should use a whitelist to check against acceptable table names. This is what it would look like for you:
import sqlite3
def RT():
    conn = sqlite3.connect("MyDB.db")
    table = input("enter table name: ")
    cur = conn.cursor()
    if table not in ['user', 'blog', 'comment', ...]:
        raise ... #Include your own error here
    cur.execute("Select count(*) from {0}".format(table))
    for row in cur:
        print(row[0])
    conn.close()
The same SO answer cautions against accepting submitted names directly "because the validation and the actual table could go out of sync, or you could forget the check." Meaning, you should only derive the name of the table yourself. You could do this by making a clear distinction between accepting user input and the actual query. Here is an example of what you might do.
import sqlite3
acceptable_table_names = ['user', 'blog', 'comment', ...]
def RT():
    """
    Client side logic: Prompt the user to enter table name.
    You could also give a list of names that you associate with ids
    """
    table = input("enter table name: ")
    if table in acceptable_table_names:
        table_index = acceptable_table_names.index(table)
        RT_index(table_index)
def RT_index(table_index):
    """
    Backend logic: Accept table index instead of querying user for
    table name.
    """
    conn = sqlite3.connect("MyDB.db")
    cur = conn.cursor()
    table = acceptable_table_names[table_index]
    cur.execute("Select count(*) from {0}".format(table))
    for row in cur:
        print(row[0])
    conn.close()
This may seem frivolous, but this keeps the original interface while addressing the potential problem of forgetting to check against a whitelist. The validation and the actual table could still go out of sync; you'll need to write tests to fight against that.
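One way to reduce the out-of-sync risk is to check the name against the tables the database itself reports. This is a sketch under my own assumptions, not part of the linked answers: the helper name `count_rows` and the in-memory sample table are made up, and it treats every existing table as acceptable, which may be broader than you want.

```python
import sqlite3

def count_rows(conn, table):
    """Count rows in `table` after checking the name against sqlite_master."""
    cur = conn.cursor()
    # sqlite_master lists the tables that exist; here the name is bound as a value.
    cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    )
    if cur.fetchone() is None:
        raise ValueError("unknown table: %r" % table)
    # The name matched an existing table exactly; quote it as an identifier.
    quoted = '"{0}"'.format(table.replace('"', '""'))
    cur.execute("SELECT count(*) FROM {0}".format(quoted))
    return cur.fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog (title TEXT)")
conn.executemany("INSERT INTO blog VALUES (?)", [("first",), ("second",)])
n = count_rows(conn, "blog")
print(n)
```

If only some tables should be countable, intersect this check with your own list instead of relying on it alone.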
