Extract Salesforce Objects and Load Them into SQLite Database Tables - Python 3

I am trying to collect data from Salesforce and then load it into SQLite tables.
Here is my code:
from simple_salesforce import Salesforce, SalesforceLogin
import pandas as pd

# Connect to the Salesforce site (username, password, and
# security_token are defined elsewhere)
session_id, instance = SalesforceLogin(username=username, password=password, security_token=security_token)
# Create the API client
sf = Salesforce(instance=instance, session_id=session_id)
# Describe the Opportunity object to get all of its field names
desc = sf.Opportunity.describe()
field_names = [field['name'] for field in desc['fields']]
# Build the SOQL query; note that join must be called on the field list
soql = "SELECT {} FROM Opportunity".format(','.join(field_names))
results = sf.query_all(soql)
# Drop the per-record 'attributes' metadata and save to CSV
sf_df = pd.DataFrame(results['records']).drop(columns='attributes')
sf_df.to_csv('/Users/ma/test1.csv')
This collects the Opportunity object and writes it to a CSV file. Any suggestions on how to improve this step, and also the next step, which is to create SQLite tables out of the Salesforce-generated CSV files? I am new to Salesforce and SQLite and am stuck on these steps.
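Since the next step is SQLite anyway, one suggestion is to skip the intermediate CSV and let pandas write straight into the database with to_sql, wrapping the per-object logic in a loop so each Salesforce object lands in its own table. Below is a minimal sketch assuming the sf client from above; the object list and database path are examples, not fixed names:
import sqlite3
# Objects to extract; extend this list as needed (example names)
object_names = ['Opportunity', 'Account', 'Contact']
conn = sqlite3.connect('/Users/ma/salesforce.db')  # example path
for name in object_names:
    # Describe the object to get all of its field names
    desc = getattr(sf, name).describe()
    field_names = [field['name'] for field in desc['fields']]
    # Query every field and load the records into a DataFrame
    soql = "SELECT {} FROM {}".format(','.join(field_names), name)
    results = sf.query_all(soql)
    df = pd.DataFrame(results['records']).drop(columns='attributes')
    # One SQLite table per object; 'replace' recreates it on each run
    df.to_sql(name.lower(), conn, if_exists='replace', index=False)
conn.close()
One caveat: compound fields (such as addresses on Account) come back as nested dicts, which to_sql cannot store directly, so you may need to flatten or drop those columns first.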

Related

Using "UPDATE" and "SET" in Python to Update Snowflake Table

I have been using Python for some time to read and write data to a Snowflake table I have full update rights to, using a Snowflake helper class my colleague found on the internet. Below is the class I have been using, with my personal Snowflake connection information abstracted, and a simple read query that works provided you have a 'TEST' table in your schema.
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine
import keyring
import pandas as pd
from sqlalchemy import text
# Pull the username and password to be used to connect to snowflake
stored_username = keyring.get_password('my_username', 'username')
stored_password = keyring.get_password('my_password', 'password')
class SNOWDBHelper:
    def __init__(self):
        self.user = stored_username
        self.password = stored_password
        self.account = 'account'
        self.authenticator = 'authenticator'
        self.role = stored_username + '_DEV_ROLE'
        self.warehouse = 'warehouse'
        self.database = 'database'
        self.schema = 'schema'

    def __connect__(self):
        self.url = URL(
            user=self.user,
            password=self.password,
            account=self.account,
            authenticator=self.authenticator,
            role=self.role,
            warehouse=self.warehouse,
            database=self.database,
            schema=self.schema
        )
        self.engine = create_engine(self.url)
        self.connection = self.engine.connect()

    def __disconnect__(self):
        self.connection.close()

    def read(self, sql):
        self.__connect__()
        result = pd.read_sql_query(sql, self.engine)
        self.__disconnect__()
        return result

    def write(self, wdf, tablename):
        self.__connect__()
        wdf.to_sql(tablename.lower(), con=self.engine, if_exists='append', index=False)
        self.__disconnect__()

# Instantiate the SNOWDBHelper and run a simple read
SNOWDB = SNOWDBHelper()
query = "SELECT * FROM TEST"
snow_table = SNOWDB.read(query)
I now have the need to update an existing Snowflake table, and my colleague suggested I could use the read function to send a query containing the update SQL to my Snowflake table. So I adapted an update query that works in the Snowflake UI and sent it through the read function. Snowflake tells me the relevant rows have been updated, but they have not been. Below is the update query I use to attempt to change the field "FIELD" in the "Test" table to 'X', along with the success message I get back. I am not thrilled with this hacky update method overall (the table update is a side effect of a read?), but could someone please help with a method to update within this framework?
# Query I actually store in file: '0-Query-Update-Effective-Dating.sql'
UPDATE "Database"."Schema"."Test" AS UP
SET UP.FIELD = 'X'
# Read the query in from file and run it
update_test = open('0-Query-Update-Effective-Dating.sql')
update_query = text(update_test.read())
SNOWDB.read(update_query)
# Returns a message of updated rows, but no rows are actually updated:
   number of rows updated  number of multi-joined rows updated
0                     316                                    0
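One likely cause: read ultimately calls pd.read_sql_query, which is intended for SELECTs, and depending on the SQLAlchemy version the UPDATE can run inside a transaction that is never committed, so Snowflake reports the row counts and the work is then rolled back. A hedged sketch of a separate execute-style method you could add to the helper class (the name run is my own, not part of the original class); engine.begin() commits the transaction when the block exits cleanly:
from sqlalchemy import text

# A hypothetical companion to read()/write() inside SNOWDBHelper
def run(self, sql):
    self.__connect__()
    # begin() opens a transaction and commits it on successful exit
    with self.engine.begin() as conn:
        conn.execute(text(sql))
    self.__disconnect__()
With that in place, SNOWDB.run(open('0-Query-Update-Effective-Dating.sql').read()) should apply and commit the update instead of treating it as a read.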

Why are utf-8 emojis not getting rendered in my pandas dataframe when I read from the SQL database?

I have the following line to read from a CSV file:
coronavirus_df = pd.read_csv('Path\coronavirus_March-3-2020.csv')
I have these other lines to read from MSSQL:
import pandas as pd
import pyodbc
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=MyServer;'
                      'Database=Mydb;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()
sql_tweets_df = pd.read_sql_query('SELECT * FROM my table', conn)
In both cases, I can get the data from the data sources and create a data frame, but there is an important difference:
coronavirus_df['text'].loc[9] gives the result:
-> 'YEP 👍some more text.'
sql_tweets_df['Text'].loc[9] gives this other result:
-> 'YEP ðŸ‘\x8d some more text'
Why is this happening? The emoji is not rendered when I get the information from the database.
In both the database and in the Excel file, that record appears to be exactly the same.
I'm using Python 3 and Jupyter notebooks.
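This looks like classic mojibake: the text is stored as UTF-8 but is being decoded as a single-byte Windows codepage ('ðŸ‘' is exactly what the UTF-8 bytes of 👍 look like through a Windows-1252 lens). Two hedged options, assuming the data in SQL Server itself is intact: tell pyodbc to decode character columns as UTF-8, or repair already-garbled strings with the ftfy library:
import pandas as pd
import pyodbc
import ftfy  # pip install ftfy

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=MyServer;'
                      'Database=Mydb;'
                      'Trusted_Connection=yes;')
# Option 1: decode CHAR/VARCHAR data as UTF-8 instead of the ANSI codepage
conn.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
sql_tweets_df = pd.read_sql_query('SELECT * FROM my table', conn)
# Option 2: if strings still come back garbled, ftfy can usually undo
# the UTF-8-read-as-cp1252 damage after the fact
sql_tweets_df['Text'] = sql_tweets_df['Text'].map(ftfy.fix_text)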

How to fetch and save collection data from MongoDB to a CSV file more quickly in Python?

I have a collection in MongoDB with 4,040,989 records.
I am fetching it and saving it to a CSV in the following way:
import pandas as pd
import pymongo
connection_string=''
myclient = pymongo.MongoClient(connection_string)
mydb = myclient[""]
mycol = mydb[""]
x = mycol.find()
df = pd.DataFrame(list(x))
print(df)
df.to_csv('Db_collection.csv')
The above approach is taking a very long time (it has been running for 30 minutes and is still going).
Is there any built-in or custom, more efficient way to complete this process faster with Python?
I am using Python 3.7, PyMongo 3.10.1 and Windows 10 with a GPU.
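Most of that time is likely spent materializing four million documents into one Python list and then into a single DataFrame. A hedged sketch that streams the cursor and appends to the CSV in chunks instead, keeping memory flat (the 50,000 batch size is arbitrary):
import pandas as pd
import pymongo

connection_string = ''
myclient = pymongo.MongoClient(connection_string)
mycol = myclient[""][""]

batch_size = 50000
batch, first = [], True
# Stream documents instead of building one giant list
for doc in mycol.find().batch_size(batch_size):
    batch.append(doc)
    if len(batch) >= batch_size:
        pd.DataFrame(batch).to_csv('Db_collection.csv', mode='a',
                                   header=first, index=False)
        batch, first = [], False
if batch:
    pd.DataFrame(batch).to_csv('Db_collection.csv', mode='a',
                               header=first, index=False)
One caveat: this assumes the documents share the same fields, otherwise column order can drift between chunks. If staying in Python is not a requirement, MongoDB's own mongoexport command-line tool can also dump a collection to CSV directly and is usually faster still.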

Binding a Teradata query in Python not returning anything

I am trying to automate some routine DB queries via Python and was testing SQL parameterization.
import teradata
import pyodbc
import sys
from pandas import DataFrame
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
udaExec = teradata.UdaExec(appName="HelloWorld", version="1.0",
                           logConsole=False)
session = udaExec.connect(method="odbc", system="db",
                          username="username", password="password")
t = 'user_id'  # dynamic column to be selected
cursor = session.cursor()
"""The below query returned only the user_id column
>>> sw_overall1
          0
0   user_id
"""
sw_overall1 = cursor.execute("""select distinct ? from
table""", (t,)).fetchall()
sw_overall1 = DataFrame(sw_overall1)
cursor = session.cursor()
# The below query returned the correct result
sw_overall2 = cursor.execute("""select distinct user_id from
table""").fetchall()
Am I doing the binding incorrectly? Without binding I get the correct output.
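Parameter markers bind values, not identifiers: the driver sends the ? as the string literal 'user_id', so the first query effectively becomes select distinct 'user_id' from table and returns that literal back. Column names have to go into the SQL text itself; a hedged sketch that validates the dynamic name against a whitelist before formatting it in (the whitelist contents are examples):
# Columns allowed to be selected dynamically (example whitelist)
ALLOWED_COLUMNS = {'user_id', 'account_id', 'created_date'}

t = 'user_id'
if t not in ALLOWED_COLUMNS:
    raise ValueError('unexpected column name: {}'.format(t))

# Identifiers cannot be bound, so format the validated name into the SQL
cursor = session.cursor()
sw_overall1 = cursor.execute(
    "select distinct {} from table".format(t)
).fetchall()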

Querying with cqlengine

I am trying to hook the cqlengine CQL 3 object mapper into my web application running on CherryPy. Although the documentation is very clear about querying, I am still not sure how to run queries against an existing table (in an existing keyspace) in my Cassandra database. For instance, I already have a table Movies containing the fields Title, rating, Year. I want to run the CQL query
SELECT * FROM Movies
How do I go ahead with the query after establishing the connection with
from cqlengine import connection
connection.setup(['127.0.0.1:9160'])
The KEYSPACE is called "TEST1".
Abhiroop Sarkar,
I highly suggest that you read through all of the documentation at:
Current Object Mapper Documentation
Legacy CQLEngine Documentation
Installation: pip install cassandra-driver
And take a look at this example project by the creator of CQLEngine, rustyrazorblade:
Example Project - Meat bot
Keep in mind, CQLEngine has been merged into the DataStax Cassandra-driver:
Official Python Cassandra Driver Documentation
You'll want to do something like this:
CQLEngine <= 0.21.0:
from cqlengine.connection import setup
setup(['127.0.0.1'], 'keyspace_name', retry_connect=True)
If you need to create the keyspace still:
from cqlengine.management import create_keyspace
create_keyspace(
    'keyspace_name',
    replication_factor=1,
    strategy_class='SimpleStrategy'
)
Setup your Cassandra Data Model
You can do this in the same .py or in your models.py:
import datetime
import uuid
from cqlengine import columns, Model

class YourModel(Model):
    __key_space__ = 'keyspace_name'    # Not Required
    __table_name__ = 'columnfamily_name'  # Not Required
    some_int = columns.Integer(
        primary_key=True,
        partition_key=True
    )
    time = columns.TimeUUID(
        primary_key=True,
        clustering_order='DESC',
        default=uuid.uuid1,
    )
    some_uuid = columns.UUID(primary_key=True, default=uuid.uuid4)
    created = columns.DateTime(default=datetime.datetime.utcnow)
    some_text = columns.Text(required=True)

    def __str__(self):
        return self.some_text

    def to_dict(self):
        data = {
            'text': self.some_text,
            'created': self.created,
            'some_int': self.some_int,
        }
        return data
Sync your Cassandra ColumnFamilies
from cqlengine.management import sync_table
from .models import YourModel
sync_table(YourModel)
Considering everything above, you can put the connection and syncing together, as many examples have outlined. Say this is connection.py in our project:
from cqlengine.connection import setup
from cqlengine.management import sync_table
from .models import YourTable

def cass_connect():
    setup(['127.0.0.1'], 'keyspace_name', retry_connect=True)
    sync_table(YourTable)
Actually Using the Model and Data
from __future__ import print_function
from .connection import cass_connect
from .models import YourTable

def add_data():
    cass_connect()
    YourTable.create(
        some_int=5,
        some_text='Test0'
    )
    YourTable.create(
        some_int=6,
        some_text='Test1'
    )
    YourTable.create(
        some_int=5,
        some_text='Test2'
    )

def query_data():
    cass_connect()
    query = YourTable.objects.filter(some_int=5)
    # This will output each YourTable entry where some_int = 5
    for item in query:
        print(item)
Feel free to ask for further clarification, if necessary.
The most straightforward way to achieve this is to make model classes which mirror the schema of your existing CQL tables, then run queries on them.
cqlengine is primarily an object mapper for Cassandra. It does not interrogate an existing database in order to create objects for existing tables; rather, it is usually used in the opposite direction (i.e. creating tables from Python classes). If you want to query an existing table using cqlengine, you will need to create Python models that exactly correspond to your existing tables.
For example, if your current Movies table had 3 columns (id, title, and release_date), you would need to create a cqlengine model with those three columns. Additionally, you would need to ensure that the __table_name__ attribute on the class exactly matches the table name in the database.
from cqlengine import columns, Model

class Movie(Model):
    __table_name__ = "movies"
    id = columns.UUID(primary_key=True)
    title = columns.Text()
    release_date = columns.Date()
The key thing is to make sure that model exactly mirrors the existing table. If there are small differences you may be able to use sync_table(MyModel) to update the table to match your model.
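Once the model mirrors the table and the connection is set up against the existing keyspace, the equivalent of SELECT * FROM Movies is just an unfiltered queryset. A minimal sketch using the Movie model above and the 'TEST1' keyspace from the question:
from cqlengine import connection

# Legacy cqlengine: host list plus the existing keyspace
connection.setup(['127.0.0.1'], 'TEST1')

# Equivalent of SELECT * FROM movies
for movie in Movie.objects.all():
    print(movie.id, movie.title, movie.release_date)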
