Python df2gspread.upload() silently failing - python-3.x

I have this code that runs smoothly in PyCharm but doesn't seem to export anything to Google Sheets.
import pandas as pd
import gspread
from df2gspread import df2gspread as d2g
from oauth2client.service_account import ServiceAccountCredentials
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
credentials = ServiceAccountCredentials.from_json_keyfile_name('credentials_file_here.json', scope)
gc = gspread.authorize(credentials)
spreadsheet_key = 'spread_sheet_key_entered_here'
wks_name = 'Sheet1'
d2g.upload(df, spreadsheet_key, wks_name, credentials=credentials, row_names=True)
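For reference, one way to narrow this down (a sketch of my own, not something from the df2gspread docs) is to bypass df2gspread and write the same values through gspread directly, reusing the gc client and the placeholder key and worksheet name above. If this also changes nothing, the spreadsheet has probably not been shared with the service account's client_email; if it works, the problem is on the df2gspread side. The sketch assumes gspread 3.3 or later for Worksheet.update.
# Sketch: write the DataFrame via gspread alone to check credentials/sharing.
sh = gc.open_by_key(spreadsheet_key)
ws = sh.worksheet(wks_name)
ws.update('A1', [df.columns.tolist()] + df.values.tolist())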

Related

How to change the datatype of the output

I want the output of this code in int64 format, but the output is float. How can I change it? Please suggest.
import pandas as pd
import numpy as np
df = pd.read_csv('https://query.data.world/s/HqjNNadqEnwSq1qnoV_JqyRJkc7o6O')
df = df[df.isnull().sum(axis=1) < 5]
print(round(100*(df.isnull().sum()/len(df.index))),2)
Something like this should do the trick...
import pandas as pd
import numpy as np
df = pd.read_csv('https://query.data.world/s/HqjNNadqEnwSq1qnoV_JqyRJkc7o6O')
df = df[df.isnull().sum(axis=1) < 5]
x = round(100*(df.isnull().sum()/len(df.index)))
y = x.astype(np.int64)
print(y)
The key bit is x.astype(np.int64), which converts the dtype.
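A side note of mine, not part of the original answer: astype(np.int64) raises if the Series contains NaN. If missing values are possible, the nullable integer dtype introduced in pandas 0.24 is one alternative:
# Hypothetical variant: "Int64" (capital I) is pandas' nullable integer dtype,
# so missing values survive the cast instead of raising an error.
y = x.astype("Int64")
print(y)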

Python - configuration file

I have code:
import matplotlib.pyplot as plt
from configparser import ConfigParser
cfg = ConfigParser()
cfg.read('file.cfg')
plt.plot([1, 10],[2, 2], color_4, ls = "dashed")
plt.xlim(1,10)
plt.ylim(1,4)
plt.savefig('image.pdf')
and I would like to control it via a configuration file:
[a]
color_4 = c = 'silver'
What is wrong, please? It gives an error:
NameError: name 'color_4' is not defined
I guess you need to read the value from the parser to get color_4:
cfg['a']['color_4']
import matplotlib.pyplot as plt
from configparser import ConfigParser
cfg = ConfigParser()
cfg.read('file.cfg')
plt.plot([1, 10], [2, 2], cfg['a']['color_4'], ls="dashed")
plt.xlim(1, 10)
plt.ylim(1, 4)
plt.savefig('image.pdf')
Ref: ConfigParser
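One additional point the answer does not spell out (my note, not the answerer's): configparser returns the value verbatim as a string, so with the file shown above cfg['a']['color_4'] evaluates to the string "c = 'silver'", which matplotlib will reject as a color. The entry presumably needs to hold just the color name:
[a]
color_4 = silver
With that, cfg['a']['color_4'] returns 'silver', which plt.plot accepts as a named color.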

Is there any alternative for pd.notna (pandas 0.24.2)? It is not working in pandas 0.20.1

"Code was developed in pandas=0.24.2, and I need to make the code work in pandas=0.20.1. What is the alternative for pd.notna as it is not working in pandas version 0.20.1.
df.loc[pd.notna(df["column_name"])].query(....).drop(....)
I need an alternative to pd.notna that fits into this line of code and works in pandas 0.20.1. For context, the surrounding code is:
import os
import subprocess
import pandas as pd
import sys
from io import StringIO
cmd = 'NSLOOKUP email.fullcontact.com'
df = pd.DataFrame()
a = subprocess.Popen(cmd, stdout=subprocess.PIPE)
b = StringIO(a.communicate()[0].decode('utf-8'))
df = pd.read_csv(b, sep=",")
column = list(df.columns)
name = list(df.iloc[1])[0].strip('Name:').strip()
name
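For what it's worth (my note, not an answer from the page): pd.notna was only added in pandas 0.21 as an alias of pd.notnull, which already exists in 0.20.1 and behaves the same. A drop-in replacement for the line above, keeping the elided calls as they are, could therefore be:
# pd.notnull is the pre-0.21 name for pd.notna and works in pandas 0.20.1.
df.loc[pd.notnull(df["column_name"])].query(....).drop(....)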

How to upload pandas dataframe to IBM Db2 database

I am trying to upload a pandas dataframe to an IBM Db2 database. However, I could not find a way to load the complete dataset at once.
import ibm_db
dsn_driver = "IBM DB2 ODBC DRIVER"
dsn_database = "BLUDB"
dsn_hostname= "dashdb-txn-xxxxx.eu-gb.bluemix.net"
dsn_port="5xx00"
dsn_protocol="TCPIP"
dsn_uid="xxxxx"
dsn_pwd="xxxx"
dsn = (
    "DRIVER={{IBM DB2 ODBC DRIVER}};"
    "DATABASE={0};"
    "HOSTNAME={1};"
    "PORT={2};"
    "PROTOCOL=TCPIP;"
    "UID={3};"
    "PWD={4};").format(dsn_database, dsn_hostname, dsn_port, dsn_uid, dsn_pwd)
try:
    conn = ibm_db.connect(dsn, "", "")
    print('Connected')
except:
    print('Unable to connect to database', dsn)
d = {'col1': [1, 2,3,4,5,6,7,8,9,10], 'col2': [3, 4,3,4,5,6,7,8,2,34], 'col3': [1, 2,3,14,5,36,72,8,9,10],}
import pandas as pd
df = pd.DataFrame(data=d)
df
So far I have managed to connect successfully to the IBM Db2 database, but the remaining steps for uploading the pandas dataframe are not clear to me; I tried several options found via Google, and none seem to work.
To keep the problem simple, I created a sample pandas dataframe (df, above). Any help page or documentation is appreciated.
Thank you,
pooja
The code below worked for me with both Python 3.5.2 and 2.7.12, ibm_db_sa 0.3.4, and Db2 v11.1.4.4.
Adjust the parameters for .to_sql to suit your requirements.
Add exception handling as required.
import ibm_db
import pandas as pd
from sqlalchemy import create_engine
dsn_driver = "IBM DB2 ODBC DRIVER"
dsn_database = "..."
dsn_hostname= "..."
dsn_port="60000"
dsn_protocol="TCPIP"
dsn_uid="..."
dsn_pwd="..."
d = {'col1': [1, 2,3,4,5,6,7,8,9,10], 'col2': [3, 4,3,4,5,6,7,8,2,34], 'col3': [1, 2,3,14,5,36,72,8,9,10],}
df = pd.DataFrame(data=d)
# SQLAlchemy connection URL format: user:password@host:port/database
engine = create_engine('ibm_db_sa://' + dsn_uid + ':' + dsn_pwd + '@' + dsn_hostname + ':' + dsn_port + '/' + dsn_database)
df.to_sql('pandas1', engine)
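To make "adjust the parameters" a bit more concrete, here is a hypothetical variant of that last line; if_exists, index and chunksize are standard DataFrame.to_sql arguments, and the values shown are only examples:
# Replace the table if it already exists, skip the DataFrame index column,
# and insert rows in batches of 1000.
df.to_sql('pandas1', engine, if_exists='replace', index=False, chunksize=1000)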

Quandl code is not working

I have just started using Quandl and pandas, and I came across this code.
import quandl
import pandas as pd
api_key=open('quandlapi.txt','r').read()
df = quandl.get("FMAC/HPI_TX", authtoken=api_key)
fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
main_df = pd.DataFrame()
for abbv in fiddy_states[0][0][1:]:
    query = "FMAC/HPI_" + str(abbv)
    df = quandl.get(query, authtoken=api_key)
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)
But when I run it I get the following error :
ValueError: columns overlap but no suffix specified: Index(['Value'], dtype='object')
Can anyone tell me what I am doing wrong here?
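A likely explanation (mine, not from the page): each frame returned by quandl.get for these FMAC/HPI_* codes carries a column named Value, as the error message shows, so joining the second state onto main_df produces overlapping column names and pandas raises. One sketch of a fix, assuming each frame really has just that one column, is to rename it to the state abbreviation before joining:
for abbv in fiddy_states[0][0][1:]:
    query = "FMAC/HPI_" + str(abbv)
    df = quandl.get(query, authtoken=api_key)
    df.columns = [abbv]  # avoid the overlapping 'Value' column on join
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)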
