I'm reading a text file in Python that has one keyword per line, then running a SQL SELECT for each keyword and fetching the results. I don't know the right list comprehension or code to collect the keywords for which SQL found nothing. My code only prints the last word from the loop, and I want both unmatched keywords to be included.
So I have these keywords in a random textfile:
Bohemian Rhapsody
You're
Thriller
Just some random words
The database found tracks for the first two but not for the 3rd and 4th lines in the file. I want a print statement that says: --- No tracks found for Thriller, Just some random words ---
My code:
import sqlite3, sys
conn = sqlite3.connect(r'C:\Users\Just\Downloads\chinook.db')
cur = conn.cursor()
import_file = input ('Enter file name: ')
with open(import_file, 'r') as f:
    unfiltered = f.read().splitlines()
keywords = [filter_empty for filter_empty in unfiltered if filter_empty]
for keyword in keywords:
    cur.execute('''SELECT tracks.TrackId, tracks.Name, artists.Name
                   FROM tracks
                   INNER JOIN albums ON tracks.AlbumId = albums.AlbumId
                   INNER JOIN artists ON albums.ArtistId = artists.ArtistId
                   WHERE tracks.name LIKE (?||'%') ''',(keyword,))
    found_tracks = cur.fetchall()
    unknown_tracks = []
    if len(found_tracks) == 0:
        print (keyword)
        unknown_tracks += [keyword]
If keyword is not found in the database, the result of this cur.fetchall() will be an empty list. Add a test for that condition and output the desired message.
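As a minimal sketch of that advice: create the list of unmatched keywords once, before the loop (the code in the question resets it on every pass, which is why only the last keyword survives), append to it inside the loop, and print one message at the end. The in-memory table and its two rows are stand-ins for chinook.db, the query is reduced to the tracks table only, and find_unknown is a hypothetical helper name.

```python
import sqlite3

def find_unknown(cur, keywords):
    """Return the keywords for which no track name starts with the keyword."""
    unknown = []  # created once, before the loop
    for keyword in keywords:
        cur.execute("SELECT Name FROM tracks WHERE Name LIKE (? || '%')", (keyword,))
        if not cur.fetchall():  # an empty list means nothing matched
            unknown.append(keyword)
    return unknown

# Stand-in data (assumption): a tiny in-memory table instead of chinook.db.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT)")
cur.executemany("INSERT INTO tracks (Name) VALUES (?)",
                [("Bohemian Rhapsody",), ("You're My Best Friend",)])

keywords = ["Bohemian Rhapsody", "You're", "Thriller", "Just some random words"]
unknown_tracks = find_unknown(cur, keywords)
if unknown_tracks:
    print("--- No tracks found for " + ", ".join(unknown_tracks) + " ---")
conn.close()
```

Joining the list with ", " after the loop gives the single summary line asked for, instead of one print per keyword.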
I have a database from which I have to display all the rows where a column chosen by the user contains a given substring.
The following code works:
c.execute("select *from Transactions where Description like '%sh%' ")
conn.commit()
print(c.fetchall())
conn.close()
But when I try to run this code it returns me an empty list:
def search(col,val):
    conn = sqlite3.connect('test.db')
    c = conn.cursor()
    c.execute("Select *from Transactions where ? Like ? ",(col,'%'+val+'%'))
    print(c.fetchall())
search('description',"sh")
Also, the result is always an empty list, even if the column name is wrong, as opposed to the usual error which says the column was not found.
Please Help
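A likely explanation, sketched below: a ? placeholder binds a value, never an identifier, so the query compares the literal string 'description' against the pattern instead of looking at the column. That also explains why a wrong column name raises no error. One possible workaround is to check the column name against a fixed set and only then put it into the SQL text, while still binding the search value. ALLOWED_COLUMNS, the Amount column, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical whitelist: only these names may be interpolated into the SQL text.
ALLOWED_COLUMNS = {"description", "amount"}

def search(conn, col, val):
    """Return rows of Transactions where column `col` contains `val`."""
    if col.lower() not in ALLOWED_COLUMNS:
        raise ValueError("unknown column: " + col)
    # The column name is now known to be safe; the value is still bound with ?.
    query = "SELECT * FROM Transactions WHERE {} LIKE ?".format(col)
    return conn.execute(query, ("%" + val + "%",)).fetchall()

# Stand-in data (assumption) so the sketch runs without test.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Transactions (Description TEXT, Amount REAL)")
conn.executemany("INSERT INTO Transactions VALUES (?, ?)",
                 [("cash deposit", 100.0), ("rent", -500.0), ("shopping", -42.5)])

rows = search(conn, "description", "sh")
print(rows)

# The original form: the first ? is bound as the string 'description',
# which does not contain 'sh', so no row ever matches.
bound_as_value = conn.execute(
    "SELECT * FROM Transactions WHERE ? LIKE ?", ("description", "%sh%")
).fetchall()
print(bound_as_value)
```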
I'm running the Linux terminal command 'strings' on a file and storing the result, which is a series of readable strings, in a variable. The results are also written to a text file. I then need to upload the results to a database. The problem is that the result often contains ' and " characters in an unpredictable order, and these cause an SQL error. I've tried string.replace with both an empty string and a \ escape. I've also tried wrapping the string in """ or ''', but neither works, as I don't know which type of quotation mark will come first. Any suggestions would be appreciated.
fileName = filePath.rsplit('/', 1)[1]
stream = os.popen('strings ' + filePath)
output = stream.readlines()
file = open(filePath + "/" + fileName + "_StringsResults.txt", "w+")
for o in output:
    file.write(str(o))
    results += str(o)
file.close()
dsn = 'postgresql://############localhost:########/test?sslmode=disable'
conn = psycopg2.connect(dsn)
with conn.cursor() as cur:
    cur.execute(
        "UPSERT INTO test (testID) VALUES ('%s')" % (results))
    conn.commit()
Yes, that worked, thanks a million. For anyone else who's interested, the solution was roughly:
query = """UPSERT INTO test (testID) VALUES (%s)"""
#Connection code etc.
with conn.cursor() as cur:
    cur.execute(query, [results])
    conn.commit()
The [] around the parameter was necessary: cur.execute expects a sequence of parameters, and passing the bare string raised a type error.
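To illustrate why binding sidesteps the quoting problem, here is a small sketch. It uses the standard-library sqlite3 module as a stand-in for psycopg2 (an assumption, made so the example runs without a server), so the placeholder is ? rather than %s and a plain INSERT replaces UPSERT. The sample text is invented. The point is the same in both drivers: the value travels separately from the SQL text, so quotes inside it never need escaping.

```python
import sqlite3

# Invented sample text mixing both kinds of quotation mark.
results = "it's a \"mixed\" bag of 'quotes'\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (testID TEXT)")

# The value is passed as a one-element list, not formatted into the SQL string.
conn.execute("INSERT INTO test (testID) VALUES (?)", [results])
stored = conn.execute("SELECT testID FROM test").fetchone()[0]
print(stored == results)

# Passing the bare string makes the driver treat each character as a
# separate parameter, which fails because the statement has one placeholder.
try:
    conn.execute("INSERT INTO test (testID) VALUES (?)", results)
    bare_string_ok = True
except sqlite3.Error:
    bare_string_ok = False
print(bare_string_ok)
```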
I'm trying to extract some legacy data from a Teradata server, but some of the records contain weird characters that don't register in Python, such as "U+ffffffc2".
Currently, I:
1. Use pyodbc to extract the data from Teradata.
2. Place the results into a numpy array (because when I put them directly into pandas, it interprets all of the columns as a single column of type string).
3. Turn the numpy array into a pandas DataFrame to change things like Decimal("09809") and Date("2015,11,14") into [09809,"11,14,2015"].
4. Try to write it to a file, which is where this error occurs:
ValueError: character U+ffffffc2 is not in range [U+0000; U+10ffff]
I don't have access to edit this data, so from a client perspective what can I do to skip or, preferably, remove the character before trying to write it to a file and getting the error?
Currently, I have a "try and except" block to skip queries with erroneous data, but I have to query the data in row chunks of at least 100. So if I just skip it, I lose 100 or more lines at a time. As I mentioned before, however, I would prefer to keep the line, but remove the character.
Here's my code. (Feel free to point out any bad practices as well!)
#Python 3.4
#Python Teradata Extraction
#Created 01/28/16 by Maz Baig
#dependencies
import pyodbc
import numpy as np
import pandas as pd
import sys
import os
import psutil
from datetime import datetime
#create a global variable for start time
start_time=datetime.now()
#create global process variable to keep track of memory usage
process=psutil.Process(os.getpid())
def ResultIter(curs, arraysize):
    #Get the specified number of rows at a time
    while True:
        results = curs.fetchmany(arraysize)
        if not results:
            break
        #for result in results:
        yield results
def WriteResult(curs,file_path,full_count):
    rate=100
    rows_extracted=0
    for result in ResultIter(curs,rate):
        table_matrix=np.array(result)
        #Get shape to make sure its not a 1d matrix
        rows, length = table_matrix.shape
        #if it is a 1D matrix, add a row of nothing to make sure pandas doesn't throw an error
        if rows < 2:
            dummyrow=np.zeros((1,length))
            dummyrow[:]=None
        df = pd.DataFrame(table_matrix)
        #give the user a status update
        rows_extracted=rows+rows_extracted
        StatusUpdate(rows_extracted,full_count)
        with open(file_path,'a') as f:
            try:
                df.to_csv(file_path,sep='\u0001',encoding='latin-1',header=False,index=False)
            except ValueError:
                #pass afterwards
                print("This record was giving you issues")
                print(table_matrix)
                pass
    print('\n')
    if (rows_extracted < full_count):
        print("All of the records were not extracted")
        #print the run duration
        print("Duration: "+str(datetime.now() - start_time))
        sys.exit(3)
    f.close()
def StatusUpdate(rows_ex,full_count):
    print(" ::Rows Extracted:"+str(rows_ex)+" of "+str(full_count)+" | Memory Usage: "+str(process.memory_info().rss/78
def main(args):
    #get Username and Password
    usr = args[1]
    pwd = args[2]
    #Define Table
    view_name=args[3]
    table_name=args[4]
    run_date=args[5]
    #get the select statement as an input
    select_statement=args[6]
    if select_statement=='':
        select_statement='*'
    #create the output filename from tablename and run date
    file_name=run_date + "_" + table_name +"_hist.dat"
    file_path="/prod/data/cohl/rfnry/cohl_mort_loan_perfnc/temp/"+file_name
    if ( not os.path.exists(file_path)):
        #create connection
        print("Logging In")
        con_str = 'DRIVER={Teradata};DBCNAME=oneview;UID='+usr+';PWD='+pwd+';QUIETMODE=YES;'
        conn = pyodbc.connect(con_str)
        print("Logged In")
        #Get number of records in the file
        count_query = 'select count (*) from '+view_name+'.'+table_name
        count_curs = conn.cursor()
        count_curs.execute(count_query)
        full_count = count_curs.fetchone()[0]
        #Generate query to retrieve all of the table data
        query = 'select '+select_statement+' from '+view_name+'.'+table_name
        #create cursor
        curs = conn.cursor()
        #execute query
        curs.execute(query)
        #save contents of the query into a matrix
        print("Writing Result Into File Now")
        WriteResult(curs,file_path,full_count)
        print("Table: "+table_name+" was successfully extracted")
        #print the script's run duration
        print("Duration: "+str(datetime.now() - start_time))
        sys.exit(0)
    else:
        print("AlreadyThere Exception\nThe file already exists at "+file_path+". Please remove it before continuing\n")
        #print the script's run duration
        print("Duration: "+str(datetime.now() - start_time))
        sys.exit(2)
main(sys.argv)
Thanks,
Maz
If the only thing raising errors is 4-byte Unicode code points, this may help.
One solution is to register a custom error handler using codecs.register_error, which would filter out error points and then just try to decode:
import codecs
def error_handler(error):
    return '', error.end+6
codecs.register_error('nonunicode', error_handler)
b'abc\xffffffc2def'.decode(errors='nonunicode')
# gives you 'abcdef', which is exactly what you want
You may further improve your handler to catch more complicated errors; see https://docs.python.org/3/library/exceptions.html#UnicodeError and https://docs.python.org/3/library/codecs.html#codecs.register_error for details.
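As one possible refinement, here is a sketch of a handler that does not rely on a hard-coded offset. It drops only the bytes the decoder reports as invalid (error.start up to error.end) and resumes right after them, recording what was removed. The handler name drop_undecodable, the dropped list, and the sample bytes are assumptions for illustration; whether this fits the Teradata data depends on what the raw bytes actually look like.

```python
import codecs

dropped = []  # bytes removed by the handler, kept for later inspection

def drop_undecodable(error):
    """Skip the bytes the decoder could not handle and continue after them."""
    if not isinstance(error, UnicodeDecodeError):
        raise error  # this sketch only handles decoding problems
    dropped.append(error.object[error.start:error.end])
    return ("", error.end)  # (replacement text, position to resume at)

codecs.register_error("drop_undecodable", drop_undecodable)

raw = b"abc\xffdef"  # invented sample: 0xff is never valid in UTF-8
cleaned = raw.decode("utf-8", errors="drop_undecodable")
print(cleaned)
print(dropped)
```

For plain dropping, the built-in errors='ignore' behaves the same way; a custom handler is mainly useful when you also want to log or count what was discarded.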
I have this code that simply creates a list from user input. I want to load the input into a SQLite database instead of the list shown, but I'm not familiar with SQLite. Please help.
Here is the code:
listQ = []
while True:
    read = input("Type in a line. ").lower().split()
    for item in read:
        listQ.append( input("Type in a line. ") )
    for line in listQ:
        import sqlite3
        conn = sqlite3.connect('/C/project/new/sqlite_file.db')
        c = conn.cursor()
        for item in listQ:
            c.execute('insert into tablename values (?,?,?)', item)
        #print(line)
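One way this could be approached, as a rough sketch: collect the lines first, then insert them in one batch, wrapping each line in a one-element tuple so it fills a single placeholder. The table name lines, its content column, and the helper save_lines are hypothetical, and an in-memory database with fixed sample lines stands in for the real file and for input().

```python
import sqlite3

def save_lines(conn, lines):
    """Store each line as one row in a single-column table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lines (id INTEGER PRIMARY KEY, content TEXT)"
    )
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO lines (content) VALUES (?)",
            [(line,) for line in lines],  # one tuple per row
        )

# Stand-in for the list built from input() in the question.
lines = ["first line", "second line"]

conn = sqlite3.connect(":memory:")  # replace with a file path to persist
save_lines(conn, lines)
saved = [row[0] for row in conn.execute("SELECT content FROM lines ORDER BY id")]
print(saved)
```

The number of ? placeholders has to match the number of values in each tuple, which is why the three-placeholder insert in the question cannot take a single string.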