How can I convert from SQLite3 format to dictionary - python-3.x

How can I convert my SQLite3 table to a Python dictionary, where the column names and values of the table become the keys and values of the dictionary?

I have made a package to solve this issue, in case anyone else runs into this problem:
aiosqlitedict
Here is what it can do:
Easy conversion between sqlite table and Python dictionary and vice-versa.
Get values of a certain column in a Python list.
Order your list ascending or descending.
Insert any number of columns to your dict.
Getting Started
We start by connecting to our database and specifying the reference column:
from aiosqlitedict.database import Connect
countriesDB = Connect("database.db", "user_id")
Make a dictionary
The dictionary should be inside an async function.
async def some_func():
    countries_data = await countriesDB.to_dict("my_table_name", 123, "col1_name", "col2_name", ...)
You can pass any number of columns, or you can get them all by specifying '*' as the column name:
countries_data = await countriesDB.to_dict("my_table_name", 123, "*")
So now you have made some changes to your dictionary and want to export it back to SQL format?
Convert dict to sqlite table
async def some_func():
    ...
    await countriesDB.to_sql("my_table_name", 123, countries_data)
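Putting these pieces together, here is a minimal end-to-end sketch (not the package's documented example, just an illustration built only from the Connect, to_dict and to_sql calls shown above; the table and column names are placeholders, and the package is assumed to be installed):
import asyncio
from aiosqlitedict.database import Connect

countriesDB = Connect("database.db", "user_id")

async def main():
    # read the row whose user_id is 123 into a plain Python dict
    countries_data = await countriesDB.to_dict("my_table_name", 123, "col1_name")
    # change a value in the dict
    countries_data["col1_name"] = "new value"
    # write the modified dict back to the table
    await countriesDB.to_sql("my_table_name", 123, countries_data)

asyncio.run(main())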
But what if you want a list of values for a specific column?
Select method
You can get a list of all values of a certain column:
country_names = await countriesDB.select("my_table_name", "col1_name")
To limit your selection, use the limit parameter:
country_names = await countriesDB.select("my_table_name", "col1_name", limit=10)
You can also arrange your list by using the ascending parameter and/or the order_by parameter, specifying a certain column to order your list by:
country_names = await countriesDB.select("my_table_name", "col1_name", order_by="col2_name", ascending=False)

Related

creating python pop function for sqlite3

I'm trying to create a pop function getting a row of data from a sqlite database and deleting that same row. I would like to not have to create an ID column so I am using ROWID. I want to always get the first row and return it. This is the code I have:
import sqlite3

db = sqlite3.connect("Test.db")
c = db.cursor()

def sqlpop():
    c.execute("SELECT * from DATA WHERE ROWID=1")
    data = c.fetchall()
    c.execute("DELETE from DATA WHERE ROWID=1")
    db.commit()
    return data
When I call the function it gets the first item correctly, but after the first call it returns nothing, like this:
>>> sqlpop()
[(1603216325, 'placeholder IP line 124', 'placeholder Device line 124', '1,2,0', 1528, 1564)]
>>> sqlpop()
[]
>>> sqlpop()
[]
>>> sqlpop()
[]
What do I need to change for this function to work correctly?
Update:
Using what Schwern said, I got the function to work:
def sqlpop():
    c.execute("SELECT * from DATA ORDER BY ROWID LIMIT 1")
    data = c.fetchone()
    c.execute("DELETE from DATA ORDER BY ROWID LIMIT 1")
    db.commit()
    return data
rowid is not the row order, it is a unique identifier for the row created by SQLite unless you say otherwise.
SQL rows have no inherent order. You could grab just one row...
select * from table limit 1;
But you'll get them in no guaranteed order. And without a rowid you have no way to identify it again to delete it.
If you want to get the "first" row you must define what "first" means. To do that you need something to order by. For example, a timestamp. Or perhaps an auto-incrementing integer. You cannot use rowid, rowids are not guaranteed to be assigned in any particular order.
select *
from table
where created_at = (select max(created_at) from table)
limit 1
So long as created_at is indexed, that should work fine. Then delete by its rowid.
You also don't need to use fetchall to fetch one row, use fetchone. In general, fetchall should be avoided as it risks consuming all your memory by slurping all the data in at once. Instead, use iterators.
for row in c.execute(...)
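Putting those suggestions together, a minimal sketch of a pop function might look like this. It assumes the table has an explicit ordering column, here a hypothetical INTEGER PRIMARY KEY named id that is the first column (so row[0] is its value); adjust the table and column names to your schema:
import sqlite3

db = sqlite3.connect("Test.db")
c = db.cursor()

def sqlpop():
    # fetch the single "first" row, where "first" is defined by the ordering column
    c.execute("SELECT * FROM DATA ORDER BY id LIMIT 1")
    row = c.fetchone()    # fetchone avoids loading the whole table
    if row is None:
        return None       # table is empty
    # delete exactly the row we just read, identified by its key
    c.execute("DELETE FROM DATA WHERE id = ?", (row[0],))
    db.commit()
    return row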

Got multiple Arguments in one function call but not in another function call

I am working with Robot Framework; one of my base methods is written in Python for building an SQL query with n columns and multiple where conditions. The function looks like this:
from pypika import Query, Table, Field
def get_query_with_filter_conditions(table_name, *column, **where):
    table_name_with_no_lock = table_name + ' with (nolock)'
    table = Table(table_name_with_no_lock)
    where_condition = get_where_condition(**where)
    sql_query = Query.from_(table).select(
        *column
    ).where(
        Field(where_condition)
    )
    return str(sql_query).replace('"', '')
I am calling this method in my Robot keywords as:
Get Query With Filter Conditions ${tableName} ${column} &{tableFilter}
This function is called from two other keywords. For one it works fine; for the other it keeps throwing this error:
Keyword 'queryBuilderUtility.Get Query With Filter Conditions' got multiple values for argument 'table_name'.
The keyword that works fine looks like this:
Verify the ${element} in ${grid} is fetched from ${column} column in ${tableName} table from DB
[Documentation] Verifies Monetary values in the View Sale Grid
${feature}= Get Variable Value ${FEATURE_NAME}
${filterValue}= Get Variable value ${FILTER_VALUE}
${queryFilter}= Get the Test Data valid ${filterValue} ${feature}
&{tableFilter}= Create Dictionary
Set To Dictionary ${tableFilter} ${filterValue}=${queryFilter}
Set To Dictionary ${tableFilter} form_of_payment_type=${element}
${tableName}= Catenate SEPARATOR=. SmartPRASales ${tableName}
${query}= Get query with Filter Conditions ${tableName} ${column} &{tableFilter}
Log ${query}
#{queryResult}= CommonPage.Get a Column values from DB ${query}
The keyword that always throws the error looks like this:
Verify ${element} drop down contains all values from ${column} column in ${tableName} table
[Documentation] To verify the drop down has all values from DB
${feature}= Get Variable Value ${FEATURE_NAME}
${filterElement}= Run Keyword If '${element}'=='batch_type' Set Variable transaction_type
... ELSE IF '${element}'=='channel' Set Variable agency_type
... ELSE Set Variable ${element}
&{tableFilter}= Create Dictionary
Set To Dictionary ${tableFilter} table_name=GENERAL
Set To Dictionary ${tableFilter} column_name=${filterElement}
Set To Dictionary ${tableFilter} client_id=QR
Log ${tableFilter}
Log ${tableName}
Log ${column}
${tableName}= Catenate SEPARATOR=. SmartPRAMaster ${tableName}
${query}= Get Query With Filter Conditions ${tableName} ${column} &{tableFilter}
Log ${query}
#{expectedvalues}= CommonPage.Get a Column values from DB ${query}
Can someone help me figure out what mistake I am making here?
The issue is due to a key/value pair in the dictionary. One of the keys in the dictionary,
&{tableFilter}= Create Dictionary
Set To Dictionary ${tableFilter} table_name=GENERAL
is the same as one of the arguments of
def get_query_with_filter_conditions(table_name, *column, **where):
I changed the argument in the get_query_with_filter_conditions function from table_name to p_table_name and it worked. Because the function's positional argument can also be supplied as a named parameter, Python confused the table_name parameter I passed positionally with the table_name key in the dictionary, which gets expanded into keyword arguments.
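For illustration, here is the same collision reproduced in plain Python (the table and filter values below are made up; only the parameter names matter):
def get_query_with_filter_conditions(table_name, *column, **where):
    return table_name, column, where

# works: no keyword argument clashes with the positional parameter name
get_query_with_filter_conditions('SmartPRASales.some_table', 'col1', form_of_payment_type='CASH')

# fails: table_name is already bound positionally, and the expanded dictionary
# tries to bind it again, so Python raises:
# TypeError: get_query_with_filter_conditions() got multiple values for argument 'table_name'
try:
    get_query_with_filter_conditions('SmartPRAMaster.some_table', 'col1',
                                     table_name='GENERAL', column_name='transaction_type', client_id='QR')
except TypeError as error:
    print(error)

# renaming the positional parameter (e.g. to p_table_name) removes the clash,
# so table_name can safely travel inside **where
def get_query_with_filter_conditions_fixed(p_table_name, *column, **where):
    return p_table_name, column, where

get_query_with_filter_conditions_fixed('SmartPRAMaster.some_table', 'col1',
                                        table_name='GENERAL', column_name='transaction_type', client_id='QR')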

Getting all acceptable string arguments to DataFrameGroupby.aggregate

So, I have a piece of code that takes a groupby object and a dictionary mapping columns in the groupby to strings, indicating aggregation types. I want to validate that all the values in the dictionary are strings that pandas accepts in its aggregation. However, I don't want to use a try/except (which, without a loop, will only catch a single problem value). How do I do this?
I've already tried importing the SelectionMixin from pandas.core.generic and checking against the values in SelectionMixin._cython_table, but this clearly isn't an exhaustive list. My version of pandas is 0.20.3.
Here's an example of how I want to use this:
class SomeModule:
    ALLOWED_AGGREGATIONS = ...  # this is where I would save the collection of allowed values

    @classmethod
    def aggregate(cls, df, groupby_cols, aggregation_dict):
        disallowed_aggregations = list(
            set(aggregation_dict.values()) - set(cls.ALLOWED_AGGREGATIONS)
        )
        if len(disallowed_aggregations):
            val_str = ', '.join(disallowed_aggregations)
            raise ValueError(
                f'Unallowed aggregations found: {val_str}'
            )
        return df.groupby(groupby_cols).agg(aggregation_dict)

Access third value of first key in dictionary python

I have created a dictionary where one key has multiple values - start_time_C, duration_pre_val, value_T. All are input from an excel sheet.
Then I have sorted the dictionary.
import operator

pre_dict = {}
pre_dict.setdefault(rows, []).append(start_time_C)
pre_dict.setdefault(rows, []).append(duration_pre_val)
pre_dict.setdefault(rows, []).append(value_T)
pre_dict_sorted = sorted(pre_dict.items(), key=operator.itemgetter(1))
Now, I want to compare a value (Column T of the excel sheet) with value_T.
How do I access value_T from the dictionary?
Many thanks!
Let's break this into two parts:
Reading in the spreadsheet
I/O stuff like this is best handled with pandas; if you'll be working with spreadsheets and other tabular data in Python, get acquainted with this package. You can do something like
import pandas as pd
#read the excel file into a pandas dataframe
my_data = pd.read_excel('/your/path/filename.xlsx', sheetname='Sheet1')
Accessing elements of the data, creating a dict
Your spreadsheet's content is now in the pandas DataFrame "my_data". From here you can reference DataFrame elements like this
#pandas: entire column
my_data['value_T']
#pandas: 2nd row, 0th column
my_data.iloc[2, 0]
and create Python data structures
#create a dict from the dataframe
my_dict = my_data.set_index(my_data.index).to_dict()
#access the values associated with the 'value_T' key of the dict
my_dict['value_T']
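If you prefer to stay with the original dictionary of lists, value_T is simply the third item appended for a key, i.e. index 2 of that key's list. A small sketch with made-up values:
import operator

pre_dict = {}
row = 1                                          # hypothetical row key
pre_dict.setdefault(row, []).append('08:00')     # start_time_C
pre_dict.setdefault(row, []).append(30)          # duration_pre_val
pre_dict.setdefault(row, []).append(42.5)        # value_T

value_T = pre_dict[row][2]                       # third value of that key

# after sorting, items() gives (key, value_list) tuples, so the third
# value of the first key is:
pre_dict_sorted = sorted(pre_dict.items(), key=operator.itemgetter(1))
first_key_value_T = pre_dict_sorted[0][1][2]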

Spark DataFrame created from JavaRDD<Row> copies all columns data into first column

I have a DataFrame which I need to convert into JavaRDD<Row> and back to a DataFrame. I have the following code:
DataFrame sourceFrame = hiveContext.read().format("orc").load("/path/to/orc/file");
//I do order by in above sourceFrame and then I convert it into JavaRDD
JavaRDD<Row> modifiedRDD = sourceFrame.toJavaRDD().map(new Function<Row, Row>() {
    public Row call(Row row) throws Exception {
        if (row != null) {
            //updated row by creating new Row
            return RowFactory.create(updateRow);
        }
        return null;
    }
});
//now I convert above JavaRDD<Row> into DataFrame using the following
DataFrame modifiedFrame = sqlContext.createDataFrame(modifiedRDD,schema);
The sourceFrame and modifiedFrame schemas are the same. When I call sourceFrame.show() the output is as expected: every column has its corresponding values and no column is empty. But when I call modifiedFrame.show() I see all the column values merged into the first column. For example, assume the source DataFrame has 3 columns as shown below:
_col1 _col2 _col3
ABC 10 DEF
GHI 20 JKL
When I print modifiedFrame, which I converted from the JavaRDD, it comes out like this:
_col1 _col2 _col3
ABC,10,DEF
GHI,20,JKL
As shown above, _col1 has all the values while _col2 and _col3 are empty. I don't know what is wrong.
As I mentioned in the question's comments, it probably occurs because you pass the whole list as a single parameter:
return RowFactory.create(updateRow);
Looking at the Apache Spark docs and source code, the schema-specification examples pass one value per column. If you glance at the RowFactory.java and GenericRow classes, a single list parameter is not spread across the columns. So try passing the row's columns as separate parameters:
return RowFactory.create(updateRow.get(0),updateRow.get(1),updateRow.get(2)); // List Example
You may also convert your list to an array and then pass it as a parameter:
YourObject[] updatedRowArray= new YourObject[updateRow.size()];
updateRow.toArray(updatedRowArray);
return RowFactory.create(updatedRowArray);
By the way, the RowFactory.create() method creates Row objects. From the Apache Spark documentation on the Row object and RowFactory.create():
Represents one row of output from a relational operator. Allows both generic access by ordinal, which will incur boxing overhead for
primitives, as well as native primitive access. It is invalid to use
the native primitive interface to retrieve a value that is null,
instead a user must check isNullAt before attempting to retrieve a
value that might be null.
To create a new Row, use RowFactory.create() in Java or Row.apply() in
Scala.
A Row object can be constructed by providing field values. Example:
import org.apache.spark.sql._
// Create a Row from values.
Row(value1, value2, value3, ...)
// Create a Row from a Seq of values.
Row.fromSeq(Seq(value1, value2, ...))
According to the documentation, you can also apply your own logic to split a row into its columns while creating the Row objects. But I think converting the list to an array and passing it as a parameter will work for you (I couldn't try it; please post your feedback, thanks).
