Solr Shard Configuration

Is there a way to configure a set of Solr shards other than via the shards query-string parameter?

I'm pretty sure you can also configure this server-side, in the search handler configuration in solrconfig.xml.
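For example, with pre-SolrCloud distributed search you can bake the shard list into the handler's default parameters. A minimal sketch; host, port, and core names are placeholders:

    <requestHandler name="/select" class="solr.SearchHandler">
      <lst name="defaults">
        <!-- every query through this handler is distributed across these
             shards, with no shards= parameter needed on the query string -->
        <str name="shards">solr1:8983/solr/core1,solr2:8983/solr/core1</str>
      </lst>
    </requestHandler>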

Secure Elasticsearch installation retrospectively

I have an Elasticsearch installation (v7.3.2). Is it possible to secure it retrospectively? This link states that a password can only be set "during the initial configuration of the Elasticsearch". Basically, I need consumers of the RESTful API to provide a password (?) going forward.
The elastic bootstrap password is only used to initialize the internal/reserved users required by the components and features of the Elastic Stack (Kibana, Logstash, Beats, monitoring, ...).
If you want to secure the API, you need to create users/roles for your scenario on top.
Please use TLS in your cluster when handling passwords, and don't expose the cluster directly, for security reasons.
Here is all the information regarding securing a cluster, including some tutorials: https://www.elastic.co/guide/en/elasticsearch/reference/7.3/secure-cluster.html
EDIT: Added links as requested. Feel free to raise a new question here at SO if you're facing serious problems!
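To illustrate, creating a role and a user for your API consumers could look something like this (a sketch using the 7.x security API; the role name, user name, index pattern, and passwords are placeholders):

    curl -u elastic -X PUT "localhost:9200/_security/role/api_reader" \
      -H 'Content-Type: application/json' -d '
    {
      "indices": [
        { "names": ["my-index-*"], "privileges": ["read"] }
      ]
    }'

    curl -u elastic -X PUT "localhost:9200/_security/user/api_consumer" \
      -H 'Content-Type: application/json' -d '
    {
      "password": "a-strong-password",
      "roles": ["api_reader"]
    }'

Consumers of the REST API would then authenticate as api_consumer, which can read the matching indices but not write to them.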
Here you can find a complete guide to installing and securing Elasticsearch.
Basically, the bootstrap password is used initially to set up the built-in Elasticsearch users (like "elastic", "kibana"). Once this is done, you won't be able to access Elasticsearch anonymously, only as one of the built-in users, e.g. "elastic".
Then you can use the "elastic" user to create additional users (with their own passwords) and roles (e.g. to access specific indices only in read-only mode).
As @ibexit wrote, it's highly recommended to secure your cluster and not expose it directly (use a proxy server, secured with SSL).
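For reference, turning security on retrospectively amounts to enabling it in elasticsearch.yml and then initializing the built-in users. A sketch, assuming TLS certificates have already been generated (the certificate file name is a placeholder):

    # elasticsearch.yml
    xpack.security.enabled: true
    # transport TLS is required on multi-node clusters once security is enabled
    xpack.security.transport.ssl.enabled: true
    xpack.security.transport.ssl.verification_mode: certificate
    xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
    xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

Then, after restarting the nodes, set the built-in users' passwords once with:

    bin/elasticsearch-setup-passwords interactive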

Varnish Multi-Site configuration with varying caching

I have 3 groups of APIs.
Each of the 3 has a unique requirement for caching.
So group 1 can be cached "normally", as in only the URL matters.
Group 2 requires an auth header to be passed, so I would like to cache based on that header plus the URL.
Group 3 generates responses based on the User-Agent and the URL.
Now I can easily do any of those on their own, but because all of the APIs are "small" I would like them to share one cache system and reduce costs.
From what I understand, using multiple VCLs and vcl.load in varnishadm would allow me to specify a custom vcl_hash (among other subroutines) for each. Or is there a better solution, as having an army of if statements just seems awful?
If I use vcl.load, is there a way of having Varnish do this automatically at startup, so that the servers can be in an auto-scaling group? (currently using systemctl)
Cheers
It sounds like you're looking for VCL labels. Please check https://varnish-cache.org/docs/trunk/users-guide/vcl-separate.html or https://info.varnish-software.com/blog/one-vcl-per-domain for documentation and some examples.
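A sketch of how the pieces could fit together (host names, file paths, and label names are all placeholders). Each group gets its own VCL with its own vcl_hash; for example, group 2 can hash on the auth header:

    # /etc/varnish/group2.vcl
    vcl 4.0;
    backend default { .host = "group2-backend.internal"; .port = "8080"; }

    sub vcl_hash {
        hash_data(req.url);                  # cache key: the URL...
        hash_data(req.http.Authorization);   # ...plus the auth header
        return (lookup);
    }

    # /etc/varnish/main.vcl - routes requests to the labelled VCLs
    vcl 4.0;
    backend default { .host = "127.0.0.1"; }   # unused, but a backend is mandatory

    sub vcl_recv {
        if (req.http.host == "api1.example.com") { return (vcl(group1)); }
        if (req.http.host == "api2.example.com") { return (vcl(group2)); }
        if (req.http.host == "api3.example.com") { return (vcl(group3)); }
    }

As for doing this automatically at startup: varnishd has a -I option that executes a CLI command file before it starts serving traffic, so you can add -I /etc/varnish/start.cli to the ExecStart line of your systemd unit, with the file containing something like:

    vcl.load group1_v1 /etc/varnish/group1.vcl
    vcl.label group1 group1_v1
    vcl.load group2_v1 /etc/varnish/group2.vcl
    vcl.label group2 group2_v1
    vcl.load group3_v1 /etc/varnish/group3.vcl
    vcl.label group3 group3_v1
    vcl.load main_v1 /etc/varnish/main.vcl
    vcl.use main_v1

The labels are loaded before main.vcl because the labels must exist when the routing VCL that references them is compiled.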

CouchDB _replicator database requires a password for the local target?

I'm using the CouchDB _replicator database and am surprised to find that I have to put a full URL to localhost:5984 with username and password in the "target" field; just the database name by itself doesn't work. Does CouchDB just work this way or am I doing something wrong?
Part of CouchDB's real power is the consistency of its approach. Replication just uses standard REST/HTTP(S) requests to do its work. That's why it's so easy to replicate locally or across the world.
The only gotcha here is that CouchDB cheats slightly for (unsecured) local DBs by allowing you to provide just the DB name, not a full URL - although the actual replication calls prepend the rest of the URL to the DB name and go through the same process as any other request.
So think of replication the same way you'd think of running curl from the command line of your local machine; that way, having to provide the auth credentials should feel more intuitive.
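For instance, a typical document in the _replicator database looks something like this (database names and credentials are placeholders):

    {
      "_id": "my_local_rep",
      "source": "http://admin:secret@localhost:5984/source_db",
      "target": "http://admin:secret@localhost:5984/target_db",
      "continuous": true
    }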

neo4j: authentication - only allow reading cypher queries

I'm using neo4j 1.9.4 and I would like to display some information about the graph on a (public) website using neo4jphp. In order to fetch some data I use cypher queries within neo4jphp. Those queries obviously only read data from the graph.
I have to make sure that visitors of the website are unable to modify any data in the graph. Therefore, I set up the authentication-extension plugin and created two users (one with read-only 'RO' and one with read-write 'RW' access rights) as documented there. However, the cypher queries within neo4jphp only work for the user with RW rights but not for the one with RO rights.
I know that http://docs.neo4j.org/chunked/stable/security-server.html#_security_in_depth pretty much explains how to secure neo4j, but I absolutely can't figure out how to do that. Especially the section "arbitrary_code_execution" seems to be interesting, but I don't know how to make use of it.
How can I allow read-only Cypher queries to be executed from the web server? BTW: the web server (to display some results) and neo4j are running on different machines.
I would appreciate any help, thank you!
EDIT: My scenario is actually not that complicated, so I'm sure there must be a solution: from localhost, any access (read-write) is granted, whereas access from the remote web server is restricted to reading from the graph. How can I achieve that? If that is not possible: how could I restrict access from the remote web server to some predefined (Cypher) queries, where only some parameters can be supplied by the user?
You should use an Apache proxy, as explained in http://docs.neo4j.org/chunked/stable/security-server.html#_security_in_depth
The information you need is the URL to post a cypher query:
http://localhost:7474/db/data/cypher
neo4jphp is only a wrapper and will end up posting to that URL. You can find more details here: http://docs.neo4j.org/chunked/milestone/rest-api-cypher.html
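In other words, every query from the wrapper boils down to a plain HTTP POST like this (a sketch against the 1.9 REST API; the query itself is just an example):

    curl -X POST http://localhost:7474/db/data/cypher \
      -H "Accept: application/json" \
      -H "Content-Type: application/json" \
      -d '{ "query": "START n=node(*) RETURN count(n)", "params": {} }'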
So basically this means that you only allow requests for the cypher URL through to the neo4j server.
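A sketch of what that proxy rule could look like (Apache 2.2-era syntax; the web server's IP is a placeholder, and neo4j itself stays bound to localhost so the proxy is the only way in):

    ProxyRequests Off
    # expose only the cypher endpoint; no other neo4j URL is proxied
    ProxyPass        /db/data/cypher http://localhost:7474/db/data/cypher
    ProxyPassReverse /db/data/cypher http://localhost:7474/db/data/cypher

    <Location /db/data/cypher>
        Order deny,allow
        Deny from all
        Allow from 203.0.113.10   # the remote web server's IP
    </Location>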
Regarding read-only Cypher queries:
I didn't check with neo4jphp, but if you use the REST API directly, you can set the database to read-only by adding the following to conf/neo4j.properties:
read_only=true
You can check in the webadmin that the server is indeed in read_only mode.
I just tested it: the server will accept only read queries; a write query will return the following response:
{
  "message": "Expected to be in a transaction at this point",
  "exception": "InternalException",
  "fullname": "org.neo4j.cypher.InternalException",
  "stacktrace": [...],
  "fullname": "org.neo4j.graphdb.NotInTransactionException"
}
An alternative is to use the Cypher-RS plugin; there is a 1.9 branch.
This allows you to create endpoints that are in essence a single Cypher query (so the query must be predefined).
You could use mod_proxy to restrict access to only these predefined queries. I'm not sure whether mod_proxy allows you to restrict to GET requests only, but if it does, you could allow GET requests for the plugin, because it won't accept modification queries as GET requests.
https://github.com/jexp/cypher-rs

How does a MOSS web front end route search requests to query servers?

Does the web front end accept a search request coming from my own program?
If yes, how does the request get routed to a particular query server, given that I have multiple query servers?
Is any particular algorithm used (e.g. round-robin)?
If you have multiple WFEs, SharePoint will route the request based on its load balancer.
As for your own program, it depends greatly on how you want to issue the search request. The short answer is "yes"; the long answer is, "I don't know what you're coding in, so I can't help you with specifics."
Why did you invest in dedicated query servers? A common and good deployment practice is to install the query server role on the WFE servers and the index server role on a separate server. You only need dedicated query servers if you expect heavy query traffic.
