How to write a batch of data to Django's sqlite db from a custom written file?

For a pet project I am working on, I need to import a list of people into a sqlite db. I have a 'Staff' model, as well as a users.csv file with a list of users. Here is how I am doing it:
import csv
from staff.models import Staff

with open('users.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        firstname = row['firstname']
        lastname = row['lastname']
        email = row['email']
        staff = Staff(firstname=firstname, lastname=lastname, email=email)
        staff.save()
csv_file.close()
However, I am getting the below error message:
raise ImproperlyConfigured(
django.core.exceptions.ImproperlyConfigured: Requested setting INSTALLED_APPS, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.
Is what I am doing correct? If yes, what am I missing here?

Django needs some environment variables when it is being bootstrapped to run. DJANGO_SETTINGS_MODULE is one of these; it tells Django which settings module to configure itself from. Typically many developers don't even notice, because if you stay in Django-land it isn't a big deal. Take a look at manage.py and you'll notice it sets that variable there.
The simplest thing is to stay in Django-land and run your script within its framework. I recommend creating a management command. Perhaps a more proper way is to create a data migration and put the data in a storage place like S3, if this is something many people need to do for their local databases... but it seems like a management command is the way to go for you. Another option (and the simplest if this is really just a one-time thing) is to run the code from the Django shell. I'll put that at the bottom.
It's very simple and you can drop in your code almost as you have it. Here are the docs :) https://docs.djangoproject.com/en/3.2/howto/custom-management-commands/
For you it might look something like this:
/app/management/commands/load_people.py <-- the file name here is what manage.py will use to run the command later.
from django.core.management.base import BaseCommand, CommandError
import csv
from staff.models import Staff

class Command(BaseCommand):
    help = 'load people from csv'

    def handle(self, *args, **options):
        with open('users.csv') as csv_file:
            csv_reader = csv.DictReader(csv_file, delimiter=',')
            line_count = 0
            for row in csv_reader:
                firstname = row['firstname']
                lastname = row['lastname']
                email = row['email']
                staff = Staff(firstname=firstname, lastname=lastname, email=email)
                staff.save()
            # csv_file.close()  # you don't need this since you used `with`
which you would call like this:
python manage.py load_people
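Side note (not something the error requires, and not part of the original code): since the question is about writing a batch of rows, Django's bulk_create() can cut the number of queries compared to saving one row at a time. A minimal sketch, assuming the same Staff model and CSV columns:
import csv
from staff.models import Staff

with open('users.csv') as csv_file:
    rows = csv.DictReader(csv_file, delimiter=',')
    # Build unsaved Staff instances first, then insert them in a single batch.
    staff_members = [
        Staff(firstname=row['firstname'], lastname=row['lastname'], email=row['email'])
        for row in rows
    ]
    Staff.objects.bulk_create(staff_members)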
Finally, the simplest solution is to just run the code in the Django shell.
python manage.py shell
will open up an interactive shell with everything loaded properly. You can execute your code there and it should work.
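If you would rather keep a standalone script outside manage.py, you can also bootstrap Django yourself before importing any models, which is what the error message is hinting at. A rough sketch, where the settings module name myproject.settings is an assumption to adjust for your project:
import os
import django

# Tell Django where the settings live before importing any models.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')  # adjust to your project
django.setup()

from staff.models import Staff  # safe to import only after django.setup()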

Related

Changing subdirectory of MLflow artifact store

Is there anything in the Python API that lets you alter the artifact subdirectories? For example, I have a .json file stored here:
s3://mlflow/3/1353808bf7324824b7343658882b1e45/artifacts/feature_importance_split.json
MLflow creates a 3/ key in s3. Is there a way to modify this key to something else (a date or the name of the experiment)?
As I commented above, yes, mlflow.create_experiment() does allow you to set the artifact location using the artifact_location parameter.
However, sort of related: the problem with setting the artifact_location using the create_experiment() function is that once you create an experiment, MLflow will throw an error if you run the create_experiment() function again.
I didn't see this in the docs, but it's confirmed that if an experiment already exists in the backend store, MLflow will not allow you to run the same create_experiment() function again. And as of this post, MLflow does not have a check_if_exists flag or a create_experiments_if_not_exists() function.
To make things more frustrating, you cannot set the artifact_location in the set_experiment() function either.
So here is a pretty easy workaround; it also avoids the "ERROR mlflow.utils.rest_utils..." stdout logging:
import os
from random import random, randint

import mlflow
from mlflow import log_metric, log_param, log_artifacts
from mlflow.exceptions import MlflowException

try:
    experiment = mlflow.get_experiment_by_name('oof')
    experiment_id = experiment.experiment_id
except AttributeError:
    experiment_id = mlflow.create_experiment('oof', artifact_location='s3://mlflow-minio/sample/')

with mlflow.start_run(experiment_id=experiment_id) as run:
    mlflow.set_tracking_uri('http://localhost:5000')
    print("Running mlflow_tracking.py")
    log_param("param1", randint(0, 100))
    log_metric("foo", random())
    log_metric("foo", random() + 1)
    log_metric("foo", random() + 2)
    if not os.path.exists("outputs"):
        os.makedirs("outputs")
    with open("outputs/test.txt", "w") as f:
        f.write("hello world!")
    log_artifacts("outputs")
If it is the user's first time creating the experiment, the code runs into an AttributeError, since get_experiment_by_name() returns None and None has no experiment_id attribute, so the except block gets executed and creates the experiment.
If it is the second, third, etc. time the code is run, only the code under the try statement executes, since the experiment now exists. MLflow will now create a 'sample' key in your s3 bucket. Not fully tested, but it works for me at least.
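If this pattern is needed in more than one place, it can be wrapped in a small helper; the function name below is just illustrative, not an MLflow API:
import mlflow

def get_or_create_experiment(name, artifact_location=None):
    """Return the experiment id, creating the experiment only if it does not exist yet."""
    experiment = mlflow.get_experiment_by_name(name)
    if experiment is not None:
        return experiment.experiment_id
    return mlflow.create_experiment(name, artifact_location=artifact_location)

# Usage:
# experiment_id = get_or_create_experiment('oof', artifact_location='s3://mlflow-minio/sample/')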

How to load all config files in a pythonic way?

Just want to know whether there is a proper way to load multiple config files into Python scripts.
Directory structure as below.
dau
|-APPS
|---kafka
|---brokers
|-ENVS
As per the above, my base directory is dau. I'm planning to hold the scripts in the kafka and brokers directories. All global environments are stored in the ENVS directory in ".ini" format. I want to load those ini files into all the scripts without adding them one by one, because we may have to add more environment files in the future, and in that case we shouldn't have to add them manually in each and every script.
Sample env.ini
[DEV]
SERVER_NAME = dev123.abcd.net
I was trying to use the answer in the below link, but we still have to add them manually, or if the parent path of the dau directory changes, we have to edit the code.
Stack-flow-answer
Hi, I came up with the below solution. Thanks for the support.
The below code will gather all the .ini files as a list and return it.
import os

def All_env_files():
    try:
        BASE_PATH = os.path.abspath(os.path.join(__file__, "../.."))
        ENV_INI_FILES = [os.path.join(BASE_PATH + '/ENVS/', each) for each in os.listdir(BASE_PATH + '/ENVS') if each.endswith('.ini')]
        return ENV_INI_FILES
    except ValueError:
        raise ValueError('Issue with Gathering Files from ENVS Directory')
The below code will take the list of ini files and provide it to ConfigParser.
import ConfigParser, sys, os

"""
This is for kafka broker status check
"""

# Get Base path
Base_PATH = os.path.abspath(os.path.join(__file__, "../../.."))
sys.path.insert(0, Base_PATH)

# Importing configs python file on ../Configs.py
import Configs, edpCMD

# Taking all the ENVS ini files as a list
List_ENVS = Configs.All_env_files()
Feel free to suggest any shorter way to do this.
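For what it's worth, ConfigParser's read() accepts a list of filenames, so the gathered list can be loaded in one call. A minimal sketch of tying the two pieces together (shown with Python 3's configparser; the [DEV] section and SERVER_NAME key are just the env.ini example above):
import configparser  # `import ConfigParser` on Python 2, as in the snippet above

import Configs

config = configparser.ConfigParser()
# read() accepts a list of filenames and silently skips any it cannot open.
loaded_files = config.read(Configs.All_env_files())

# Values can then be read exactly as defined in env.ini, e.g. the [DEV] section:
dev_server = config.get('DEV', 'SERVER_NAME')
print(loaded_files, dev_server)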

Opening another .py file in a function to pass arguments in Python 3.5

I'm pretty new to Python and the overall goal of the project I am working on is to set up a SQLite DB that will allow easy entries in the future for non-programmers (this is for a small group of people who are all technically competent). The way I am trying to accomplish this right now is to have people save their new data entry as a .py file through a simple text editor and then open that .py file within the function that enters the values into the DB. So far I have:
def newEntry(material=None, param=None, value=None):
    if param == 'density':
        print('The density of %s is %s' % (material, value))
    import fileinput
    for line in fileinput.input(files=('testEntry.py')):
        process(line)
Then I have created, with a simple text editor, a file called testEntry.py that will hopefully be called by newEntry.py when newEntry is executed in the terminal. The idea here is that some user would just have to put in the function name with the arguments they are inputting within the parentheses. testEntry.py is simply:
# Some description for future users
newEntry(material='water', param='density', value='1')
When I run newEntry.py in my terminal nothing happens. Is there some other way to open and execute a .py file within another that I do not know of? Thank you very much for any help.
Your solution works, but as a commenter said, it is very insecure and there are better ways. Presuming your process(...) method is just executing some arbitrary Python code, this could be abused to execute system commands, such as deleting files (very bad).
Instead of using a .py file consisting of a series of newEntry(...) on each line, have your users produce a CSV file with the appropriate column headers. I.e.
material,param,value
water,density,1
Then parse this csv file to add new entries:
import csv

with open('entries.csv') as entries:
    csv_reader = csv.reader(entries)
    header = True
    for row in csv_reader:
        if header:  # Skip header
            header = False
            continue
        material = row[0]
        param = row[1]
        value = row[2]
        if param == 'density':
            print('The density of %s is %s' % (material, value))
Your users could use Microsoft Excel, Google Sheets, or any other spreadsheet software that can export .csv files to create/edit these files, and you could provide a template to the users with predefined headers.
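Since the end goal is a SQLite DB, the parsed rows can then be inserted with the standard sqlite3 module. This is only a rough sketch; the database file, table, and column names are assumptions to adapt to your schema:
import csv
import sqlite3

conn = sqlite3.connect('materials.db')  # hypothetical database file
conn.execute('CREATE TABLE IF NOT EXISTS properties (material TEXT, param TEXT, value TEXT)')

with open('entries.csv') as entries:
    reader = csv.DictReader(entries)  # uses the header row, so no manual skipping needed
    for row in reader:
        conn.execute(
            'INSERT INTO properties (material, param, value) VALUES (?, ?, ?)',
            (row['material'], row['param'], row['value']),
        )

conn.commit()
conn.close()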

How to update a record in the middle of a file in Python

We are trying to implement a file-based student record program. We have implemented the reading and writing (append records) functionalities. However, now we are interested in updating a particular record via student id. We have tried using the replace() function to replace the record, but this ends up replacing all occurrences of the data in the file, which makes it ambiguous. We have also tried overwriting data using seek(), but it does not help much. Example code is as follows.
import fileinput
import sys

for i, line in enumerate(fileinput.input('file_name.txt', inplace=1)):
    sys.stdout.write(line.replace('old', 'new'))
Any help shall be great.
Assume your input file is
1:Naveen Kumar:Chennai
2:Joseph Nirmal:Bangalore
and you want to change the name of student id 2 to Joseph Nirmal Kumar, then the following code should do it:
import fileinput
import sys

# change name for student 2
for line in fileinput.input('records.txt', inplace=1):
    id, name, location = line.split(':')
    if id == '2':
        sys.stdout.write('{}:{}:{}'.format(id, 'Joseph Nirmal Kumar', location))
    else:
        sys.stdout.write(line)
then the file will become
1:Naveen Kumar:Chennai
2:Joseph Nirmal Kumar:Bangalore
Hope that helps. I would recommend trying other data persistence options available in Python.
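For example, as one such option (this is only a sketch, not part of the original answer), the standard sqlite3 module makes updating a single record by student id straightforward; the file, table, and column names are assumptions:
import sqlite3

conn = sqlite3.connect('students.db')  # hypothetical database file
conn.execute('CREATE TABLE IF NOT EXISTS students (id INTEGER PRIMARY KEY, name TEXT, location TEXT)')
conn.execute("INSERT OR IGNORE INTO students VALUES (1, 'Naveen Kumar', 'Chennai')")
conn.execute("INSERT OR IGNORE INTO students VALUES (2, 'Joseph Nirmal', 'Bangalore')")

# Update only the record with student id 2; no string replacement involved.
conn.execute("UPDATE students SET name = ? WHERE id = ?", ('Joseph Nirmal Kumar', 2))
conn.commit()
conn.close()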

How to make a config file that stores global variables but can also be easily edited from the level of the code?

I have a Python project that uses some global variables. The project has multiple .py files and all of them need to be able to read those variables easily. All of them should also be able to edit them.
Right now I am using a .py file that stores the variables and I simply import them in other files as a module. This solution is far from optimal, though, because I can't edit them easily. I naively tried to write setters for the variables, but obviously that doesn't work either.
I thought about using JSON for the variables, but then reading them wouldn't be as easy as 'import cfg'.
Is there a standard solution for this problem that I'm not aware of because I'm a noob?
Thanks for every answer.
Store information in a CSV and then use csv.reader to write the key, value pairs to something like a dictionary. Here's some example code to get you started:
csv file:
firstname Garrett
something_else blah
Python:
import csv

def load_config(path='csv.csv'):
    output = []
    with open(path) as csvfile:
        reader = csv.reader(csvfile, delimiter=' ')
        for row in reader:
            output.append(tuple(row))
    return {key: value for (key, value) in output}
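The question also asks about editing the values from code; a possible counterpart, purely a sketch in the same space-delimited format, is to write the dictionary back out (load_config/save_config are just illustrative names, not a standard API):
import csv

def save_config(config, path='csv.csv'):
    # Write each key/value pair back out in the same space-delimited format.
    with open(path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=' ')
        for key, value in config.items():
            writer.writerow([key, value])

# Usage:
# config = load_config()
# config['firstname'] = 'Garrett'
# save_config(config)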
