django-haystack - No module named search_sites

I downloaded django-haystack-1.1.0.tar.gz, unzipped it, copied the haystack directory inside it into my apps directory, and added haystack to my INSTALLED_APPS (also whoosh, because I copied that too). But when I restart the server I get a 500 internal error. Then, as an experiment, I deleted handle_registrations() from haystack.__init__ and the site started working, but when I try to search via haystack I get "No fields were found in any search_indexes. Please correct this before attempting to search." In settings.py I also have:
HAYSTACK_SITECONF = 'search_sites'
HAYSTACK_SEARCH_ENGINE = 'whoosh'
HAYSTACK_WHOOSH_PATH = os.path.join(PROJECT_ROOT, 'mysite_search_sites')
Then I restored handle_registrations(), removed haystack from INSTALLED_APPS, restarted the server, and now I am getting "No module named search_sites".
Also, import haystack and haystack.__version__ work, but haystack.management.commands doesn't.
Could someone help me with this, please?
EDIT
My traceback:
/lib/python2.7/django/core/handlers/base.py in get_response
    response = callback(request, *callback_args, **callback_kwargs)
/myproject/apps/djangobb_forum/util.py in wrapper
    output = function(request, *args, **kwargs)
/myproject/apps/djangobb_forum/util.py in wrapper
    result = func(request, *args, **kwargs)
/myproject/apps/djangobb_forum/views.py in search
    for post in posts:
/myproject/apps/haystack/query.py in _manual_iter
    if not self._fill_cache(current_position, current_position + ITERATOR_LOAD_PER_QUERY):
/myproject/apps/haystack/query.py in _fill_cache
    results = self.query.get_results()
/myproject/apps/haystack/backends/__init__.py in get_results
    self.run()
/myproject/apps/haystack/backends/__init__.py in run
    results = self.backend.search(final_query, **kwargs)
/myproject/apps/haystack/backends/__init__.py in wrapper
    return func(obj, query_string, *args, **kwargs)
/myproject/apps/haystack/backends/whoosh_backend.py in search
    self.setup()
/myproject/apps/haystack/backends/whoosh_backend.py in setup
    self.content_field_name, self.schema = self.build_schema(self.site.all_searchfields())
/myproject/apps/haystack/backends/whoosh_backend.py in build_schema
    raise SearchBackendError("No fields were found in any search_indexes. Please correct this before attempting to search.")

From the install steps you've listed, it sounds like you're missing a few of them.
Definitely revisit the Haystack setup instructions, with a particular eye on the "Create a Search Site" and "Creating Indexes" sections.
The long and short of it is that you seem to be missing an indexes file. Haystack registers a bunch of stuff from your indexes when it is first imported, which explains why you're getting errors from haystack.__init__.
Add a file called 'search_indexes.py' to your application directory. This file contains a list of the indexes you want to generate for different models. A simple example would be:
from haystack.indexes import *
from haystack import site
from myapp.models import MyModel

class MyModelIndex(SearchIndex):
    text = CharField(document=True, use_template=True)

    def prepare(self, obj):
        self.prepared_data = super(MyModelIndex, self).prepare(obj)
        self.prepared_data['text'] = obj.my_field
        return self.prepared_data

site.register(MyModel, MyModelIndex)
This will add a free text search field called 'text' to your index. When you search for free text without a field to search specified, haystack will search this field by default. The property my_field from the model MyModel is added to this text field and is made searchable. This could, for example, be the name of the model or some appropriate text field. The example is a little naive, but it will do for now to help you get something up and running, and you can then read up a bit and expand it.
The call site.register registers this index against the model MyModel so haystack can discover it.
You'll also need a file called search_sites.py (named as per your settings) in your project directory to point to the index files you've just made. Adding the following will make it look through your apps and auto-discover any indexes you've registered:
import haystack
haystack.autodiscover()
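Once search_indexes.py and search_sites.py are in place (and haystack is back in INSTALLED_APPS), the Whoosh index still needs to be built before searches will return anything; Haystack ships management commands for this:
python manage.py rebuild_index
rebuild_index clears and repopulates the index in one go; update_index refreshes it without clearing first.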

You need to create search_sites.py in your project root directory (matching your settings.py) and add:
import haystack
haystack.autodiscover()
This will fix the "No module named search_sites" error.
Also see the latest docs for django-haystack configuration.

Related

How to write batch of data to Django's sqlite db from a custom written file?

For a pet project I am working on, I need to import a list of people into a sqlite db. I have a 'Staff' model, as well as a users.csv file with a list of users. Here is how I am doing it:
import csv
from staff.models import Staff

with open('users.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        firstname = row['firstname']
        lastname = row['lastname']
        email = row['email']
        staff = Staff(firstname=firstname, lastname=lastname, email=email)
        staff.save()
    csv_file.close()
However, I am getting the error message below:
raise ImproperlyConfigured(
django.core.exceptions.ImproperlyConfigured: Requested setting INSTALLED_APPS, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.
Is what I am doing correct? If yes, what am I missing here?
Django needs some environment variables when it is bootstrapped to run. DJANGO_SETTINGS_MODULE is one of these; it tells Django which settings to configure itself from. Typically many developers don't even notice, because if you stay in Django-land it isn't a big deal. Take a look at manage.py and you'll notice it sets this variable in that file.
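If you'd rather keep it a standalone script, you can bootstrap Django yourself before touching any models. A minimal sketch, assuming the project's settings module is myproject.settings (a placeholder, adjust to yours):
import os
import django

# Point Django at the settings module before importing any models.
# 'myproject.settings' is a placeholder for your real settings path.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')
django.setup()

from staff.models import Staff  # only safe to import after django.setup()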
The simplest thing is to stay in django-land and run your script in its framework. I recommend creating a management command. Perhaps a more proper way is to create a data migration and put the data in a storage place like S3 if this is something many many people need to do for local databases... but it seems like a management command is the way to go for you. Another option (and the simplest if this is really just a one-time thing) is to just run this from the django shell. I'll put that at the bottom.
It's very simple and you can drop in your code almost as you have it. Here are the docs :) https://docs.djangoproject.com/en/3.2/howto/custom-management-commands/
For you it might look something like this:
/app/management/commands/load_people.py <-- the file name here is what manage.py will use to run the command later.
from django.core.management.base import BaseCommand, CommandError
import csv
from staff.models import Staff

class Command(BaseCommand):
    help = 'load people from csv'

    def handle(self, *args, **options):
        with open('users.csv') as csv_file:
            csv_reader = csv.DictReader(csv_file, delimiter=',')
            line_count = 0
            for row in csv_reader:
                firstname = row['firstname']
                lastname = row['lastname']
                email = row['email']
                staff = Staff(firstname=firstname, lastname=lastname, email=email)
                staff.save()
            # csv_file.close()  # you don't need this since you used `with`
which you would call like this:
python manage.py load_people
Finally, the simplest solution is to just run the code in the Django shell.
python manage.py shell
will open up an interactive shell with everything loaded properly. You can execute your code there and it should work.
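For instance, the CSV loop can be pasted in more or less verbatim; a sketch, assuming the shell is launched from the directory containing users.csv:
$ python manage.py shell
>>> import csv
>>> from staff.models import Staff
>>> with open('users.csv') as csv_file:
...     for row in csv.DictReader(csv_file):
...         Staff(firstname=row['firstname'], lastname=row['lastname'], email=row['email']).save()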

Reusing common step definitions between feature files in behave python

I have some checks that need to be included in multiple feature files, and I don't want to duplicate the step definitions across step-definition files.
e.g.:
from behave import when

@when(u'parquet files exist in "{container}" container in the data lake')
def step_imp(context, container):
    parquet_files_array = []
    for parquet_file in context.list_of_files:
        parquet_files_array.append(parquet_file.name)
    check_parquet_files_are_present_in_the_container_area_data_lake(parquet_files_array)
I have to use this check in other step-definition files too.
I have created a common_steps.py module and put all the common steps there; I wonder how I can reuse them without duplicating them across multiple features.
Did you try importing them?
# in <step definitions>.py
from behave import when

import common_steps

@when(u'parquet files exist in "{container}" container in the data lake')
def step_imp(*args, **kwargs):
    common_steps.step_imp(*args, **kwargs)

# in common_steps.py
def step_imp(context, container):
    # implementation
When common_steps.py is imported, we don't have to define the step in each step-definition file; when the feature file executes, the step implementation is picked up from common_steps automatically.
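Another option worth noting: behave imports every module in the steps/ directory, and registered steps are global across feature files. So if common_steps.py itself lives under features/steps/ and applies the decorators directly, no wrapper functions are needed at all. A sketch:
# features/steps/common_steps.py
from behave import when

@when(u'parquet files exist in "{container}" container in the data lake')
def step_impl(context, container):
    # shared implementation, registered once, usable from any feature file
    ...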

Does pytest provide a way of automatically reading properties for a test from a yml or conf file?

I'm using Python 3.8 and pytest for unit testing. I have tests similar to the following
def test_insert_record(az_sql_db_inst):
    """."""
    foreign_key_id = 10
I don't like hard-coding "10" in the file, and would prefer this be stored in some external configuration file where other tests can also access the value. Does pytest provide a way to do this (to load the values from a properties or yml file)? I could just open a file on my own, e.g.
def test_insert_record(az_sql_db_inst):
    """."""
    file = open('test.txt')
    for line in file:
        fields = line.strip().split()
        foreign_key_id = fields[0]
but it seems like there is something that can automatically do that for me.
If you simply want to initialize some constants, a separate config.py would do the trick.
from config import foreign_key_id
And the PyYAML package would be good if you prefer general-purpose text files.
However, if you want to specify expected test results, or even different results depending on scenarios, @pytest.mark.parametrize may be a good option.
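A minimal sketch of the parametrize route (the value 10 is taken from the question; in practice the list could be built from a YAML or config file at collection time):
import pytest

@pytest.mark.parametrize("foreign_key_id", [10])
def test_insert_record(foreign_key_id):
    """pytest injects each parametrized value as an argument."""
    assert foreign_key_id == 10  # placeholder assertion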

How to load all config files in a pythonic way?

Just want to know whether there is a proper way to load multiple config files into Python scripts.
Directory structure as below.
dau
|-APPS
|---kafka
|---brokers
|-ENVS
As per the above, my base directory is dau. I'm planning to keep the scripts in the kafka and brokers directories. All global environments are stored in the ENVS directory in ".ini" format. I want to load those .ini files in all the scripts without adding them one by one, because we may have to add more environment files in the future, and in that case we shouldn't have to add them manually to each and every script.
Sample env.ini
[DEV]
SERVER_NAME = dev123.abcd.net
I was trying to use the answer at the link below, but we still have to add them manually, and if the parent path of the dau directory changes, we have to edit the code.
Stack-flow-answer
I came up with the solution below. Thanks for the support.
The code below gathers all the .ini files as a list and returns it.
import os

def All_env_files():
    try:
        BASE_PATH = os.path.abspath(os.path.join(__file__, "../.."))
        ENV_INI_FILES = [os.path.join(BASE_PATH + '/ENVS/', each)
                         for each in os.listdir(BASE_PATH + '/ENVS')
                         if each.endswith('.ini')]
        return ENV_INI_FILES
    except ValueError:
        raise ValueError('Issue with Gathering Files from ENVS Directory')
The code below takes the list of .ini files and hands it to ConfigParser.
import ConfigParser, sys, os

# This is for kafka broker status check

# Get the base path
Base_PATH = os.path.abspath(os.path.join(__file__, "../../.."))
sys.path.insert(0, Base_PATH)

# Import the Configs python file at ../Configs.py
import Configs, edpCMD

# Take all the ENVS ini files as a list
List_ENVS = Configs.All_env_files()
Feel free to suggest a shorter way to do this.
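One small addition: ConfigParser's read() accepts a list of filenames, so the gathered list can be loaded in a single call. A sketch using the env.ini sample above:
config = ConfigParser.ConfigParser()
config.read(List_ENVS)                      # read() accepts a list of paths
server = config.get('DEV', 'SERVER_NAME')   # 'dev123.abcd.net'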

Bulbs: g.scripts.update() : TypeError: sequence item 2: expected string or Unicode, NoneType found

I'm using TitanGraphDB + Cassandra. I'm starting Titan as follows
cd titan-cassandra-0.3.1
bin/titan.sh config/titan-server-rexster.xml config/titan-server-cassandra.properties
I have a Rexster shell that I can use to communicate with Titan + Cassandra above.
cd rexster-console-2.3.0
bin/rexster-console.sh
I'm attempting to model a network topology using Titan Graph DB, and I want to program it from my Python program. I'm using the bulbs package for that.
I create five types of vertices
- switch
- port
- device
- flow
- flow_entry
I create edges between vertices that are connected logically. The edges are not labelled.
Let us say I want to test the connectivity between Vertex A and Vertex B
I have a groovy script is_connected.groovy
def isConnected (portA, portB) {
    return portA.both().retain([portB]).hasNext()
}
Now from my rexster console
g = rexster.getGraph("graph")
==>titangraph[embeddedcassandra:null]
rexster[groovy]> g.V('type', 'flow')
==>v[116]
==>v[100]
rexster[groovy]> g.V('type', 'flow_entry')
==>v[120]
==>v[104]
As you can see above I have two vertices of type flow v[116] and v[100]
I have two vertices of type flow_entry v[120] and v[104]
I want to check for the connectivity between v[120] and v[116], for example:
rexster[groovy]> ?e is_connected.groovy
==>null
rexster[groovy]> is_connected(g.v(116),g.v(120))
==>true
So far so good. Now I want to be able to use this script from my Python program, which imports the bulbs package.
My directory structure is as follows.
Projects/ryu
--> ryu/app_simple_switch.py
Projects/ryu_extras
--> rexster-console-2.3.0
--> titan-cassandra-0.3.1
My script is_connected.groovy, which contains the isConnected() function, is kept in Projects/ryu_extras/rexster-console-2.3.0.
Now, from my Python program in Projects/ryu/ryu/app/simple_switch.py, I try to do the following:
self.g.scripts.update('Projects/ryu_extras/rexster-console-2.3.0/is_connected.groovy') # add file to scripts index
script = self.g.scripts.get('isConnected') # get a function by its name
params = dict(portA=flow,portB=fe1) # put function params in dict
items = self.g.gremlin.query(script, params)
self.create_outgoing_edge(flow,fe1)
self.create_outgoing_edge(fe1,sw_vertex)
I get the following error.
hub: uncaught exception: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/ryu/lib/hub.py", line 48, in _launch
    func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/ryu/base/app_manager.py", line 256, in _event_loop
    handler(ev)
  File "/home/karthik/Projects/ryu/ryu/app/simple_switch.py", line 322, in _packet_in_handler
    self.compute_path(src,dst,datapath)
  File "/home/karthik/Projects/ryu/ryu/app/simple_switch.py", line 289, in compute_path
    self.g.scripts.update('/home/karthik/Projects/ryu_extras/rexster-console-2.3.0/is_connected.groovy') # add file to scripts index
  File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 120, in update
    methods = self._get_methods(file_path)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 160, in _get_methods
    return Parser(file_path).get_methods()
  File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 255, in __init__
    Scanner(handlers).scan(groovy_file)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 246, in scan
    self.get_item(fin,line)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/groovy.py", line 236, in get_item
    content = "\n".join(sections).strip()
TypeError: sequence item 2: expected string or Unicode, NoneType found
As you can see, the error is in the scripts.update() function. I just can't seem to figure out what I am doing wrong. Any help would be appreciated.
You need to save your scripts in a file named gremlin.groovy or specify the script's namespace when you get it from the Bulbs script index.
Like Rexster, Bulbs uses the first part of the Groovy filename as a namespace.
For example, methods defined in gremlin.groovy files are added to the Bulbs gremlin namespace.
All of Bulbs' pre-defined Gremlin scripts are defined in gremlin.groovy files and thus gremlin is the default namespace:
https://github.com/espeed/bulbs/blob/master/bulbs/gremlin.groovy
https://github.com/espeed/bulbs/blob/master/bulbs/rexster/gremlin.groovy
https://github.com/espeed/bulbs/blob/master/bulbs/titan/gremlin.groovy
You can have multiple/additional gremlin.groovy files in your app. Do this if you want to keep everything under the same namespace or if you want to override a pre-defined method:
>>> g.scripts.update("/path/to/gremlin.groovy") # add scripts to gremlin namespace
See...
https://github.com/espeed/bulbs/blob/f666fa89b3c99bc0a6b4e964aa1bff70b05a2e96/bulbs/groovy.py#L112
You can create app-specific and model-specific namespaces by defining your Gremlin methods in files with names like myapp.groovy or somemodel.groovy:
>>> g.scripts.update("/path/to/myapp.groovy") # add scripts to myapp namespace
>>> g.scripts.update("/path/to/somemodel.groovy") # add scripts to somemodel namespace
And then to get a script under a particular namespace, do:
>>> script = g.scripts.get('myapp:isConnected') # use prefix notation, or...
>>> script = g.scripts.get('isConnected', 'myapp') # specify namespace as arg
See...
https://github.com/espeed/bulbs/blob/f666fa89b3c99bc0a6b4e964aa1bff70b05a2e96/bulbs/groovy.py#L77
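Applied to the question's setup, a sketch (assuming the script file is renamed to myapp.groovy; flow and fe1 are the vertices from the asker's code):
self.g.scripts.update('/home/karthik/Projects/ryu_extras/rexster-console-2.3.0/myapp.groovy')  # myapp namespace
script = self.g.scripts.get('myapp:isConnected')
params = dict(portA=flow, portB=fe1)
items = self.g.gremlin.query(script, params)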
To generate concatenated server-side script files for each namespace, use the g.make_script_files() method:
>>> g.make_script_files() # write files to the current dir, or...
>>> g.make_script_files("/path/to/scripts/dir") # write files to specified dir
The make_script_files() method will create a separate .groovy file for each namespace. If you override a method in a namespace, only the latest definition will be included in the generated file.
See...
https://github.com/espeed/bulbs/blob/f666fa89b3c99bc0a6b4e964aa1bff70b05a2e96/bulbs/rexster/graph.py#L62
For more details, see...
Using Rexster Server-Side Scripts with Bulbs
https://groups.google.com/d/topic/gremlin-users/Up3JQUwrq-A/discussion
There might be a "bulbs way" to do this, but you could try to expose your function globally by putting it on the server with Rexster. Then in the <script-engine><init-scripts> section you can just add your is_connected.groovy. The sample rexster.xml should already have an example of this:
<script-engines>
  <script-engine>
    <name>gremlin-groovy</name>
    <reset-threshold>-1</reset-threshold>
    <init-scripts>config/is_connected.groovy</init-scripts>
    <imports>com.tinkerpop.rexster.client.*</imports>
    <static-imports>java.lang.Math.PI</static-imports>
  </script-engine>
</script-engines>
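With the function exposed globally this way, you could then call it from bulbs with a raw Gremlin script instead of the scripts index; a sketch (vertex ids 116 and 120 taken from the question; params are bound as script variables):
script = "isConnected(g.v(portA), g.v(portB))"
params = dict(portA=116, portB=120)
items = self.g.gremlin.query(script, params)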
