Retrieve details of a run_list from a node using Chef attributes

How do I retrieve the names of all the cookbooks/profiles listed in a specific node's run_list using Chef?
The node['recipes'] attribute doesn't give me the list of recipes that are in the run_list of a specified node.
Is there any other way of retrieving the run_list of a specific node so that certain checks can be applied to it?

Chef search can be done on the data indexed in the Chef Server using:
knife on workstations
The knife command can be used on a workstation to search node data. If the node name is known and we want to see the run_list attribute of the example node testbox1.home.net:
knife node show testbox1.home.net -a 'run_list'
If the node name is not known, but we know a cookbook that is in its run_list, the following returns nodes that have recipe[my_cookbook] in their run_list:
knife search node 'run_list:recipe\[my_cookbook\]'
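The two forms can also be combined; for example, to display the run_list of every matching node, the -a flag can be added to the search (a small sketch reusing the same example cookbook name):
knife search node 'run_list:recipe\[my_cookbook\]' -a run_list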
search from clients through Chef recipes
Similar searches can be performed from within recipes using the search API.
If the node name is known and we want to fetch the run_list for testbox1.home.net, in recipes/default.rb:
search(:node, 'name:testbox1*',
  :filter_result => {
    'name'    => [ 'name' ],
    'runlist' => [ 'run_list' ]
  }).each do |details|
  puts "#{details['name']}: #{details['runlist']}"
end
If the node name is not known, but we know that recipe[my_cookbook] is in its run_list, in recipes/default.rb:
search(:node, 'run_list:recipe\[my_cookbook\]',
  :filter_result => {
    'name'    => [ 'name' ],
    'runlist' => [ 'expanded_run_list' ]
  }).each do |details|
  puts "#{details['name']}: #{details['runlist']}"
end
Note: I have applied filters to extract only the data required for the examples above. If :filter_result is omitted, the complete node data is returned in details.
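Once the run_list has been retrieved this way, the checks mentioned in the question are plain Ruby. A minimal sketch, reusing the filtered search above (the recipe name is a placeholder for whatever you want to verify):
search(:node, 'name:testbox1*',
  :filter_result => {
    'name'    => [ 'name' ],
    'runlist' => [ 'run_list' ]
  }).each do |details|
  # Warn if the expected recipe is missing from the node's run_list
  unless Array(details['runlist']).map(&:to_s).include?('recipe[my_cookbook]')
    Chef::Log.warn("#{details['name']} is missing recipe[my_cookbook] in its run_list")
  end
end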

Related

How to add a hash of the whole content of an event in Logstash for OpenSearch?

The problem is the following: I'm investigating how to add some anti-tampering protection to events stored in OpenSearch that are parsed and sent there by Logstash. The info is composed of application logs collected from several hosts. The idea is to add a hashed field that's linked to the original content so that any modification of the fields breaks the hash and can be detected.
Currently, we have in place some grok filters that extract information from the received log lines and store it in different fields using several patterns. To make it more difficult for an attacker who modifies these logs to cover their tracks, I'm thinking of adding an extra field where the whole line is hashed and salted before splitting.
The initial part of my filter config looks like this. It was used primarily with ELK, but our project will be switching to OpenSearch:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:mytimestamp} (\[)*%{LOGLEVEL:loglevel}(\])* %{JAVACLASS:javaclass}(.)*(\[/])* %{DATA:component} %{DATA:version} - %{GREEDYDATA:message}"}
    overwrite => [ "message", "version" ]
    break_on_match => false
    keep_empty_captures => true
  }
  # do more stuff
}
OpenSearch has some info on field masking, but this is not exactly what I am after.
I'd appreciate a pointer or an idea on how to do this. I don't know whether the hash fields available in ELK are also available in OpenSearch, or whether the Logstash plugin that does the hashing of fields would be usable without licensing issues. Maybe there are other, better options that I'm not aware of. I was also looking for info on how to call an external script to do this during filter execution, but I couldn't find anything indicating that is possible.
Any ideas? Thank you!
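For what it's worth, Logstash has a fingerprint filter plugin (logstash-filter-fingerprint) that can produce a keyed (HMAC) digest of a field, which is one way to get a salted hash of the original line before grok splits it; whether it satisfies the OpenSearch and licensing constraints above would need checking. A minimal sketch (field and key names are placeholders):
filter {
  fingerprint {
    source => "message"       # hash the original line before it is split
    target => "message_hash"  # store the digest in its own field
    method => "SHA256"        # with a key set, this produces an HMAC
    key    => "my-secret-salt"
  }
}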

How can I use Foreman host groups with Puppet?

I have this manifest:
$foremanlogin = file('/etc/puppetlabs/code/environments/production/manifests/foremanlogin.txt')
$foremanpass = file('/etc/puppetlabs/code/environments/production/manifests/foremanpass.txt')
$query = foreman({foreman_user => "$foremanlogin",
  foreman_pass  => "$foremanpass",
  item          => 'hosts',
  search        => 'hostgroup = "Web Servers"',
  filter_result => 'name',
})
$quoted = regsubst($query, '(.*)', '"\1"')
$query6 = join($quoted, ",")
notify{"The value is: ${query6}": }
node ${query6} {
  package { 'atop':
    ensure => 'installed',
  }
}
When I execute this on an agent I get this error:
Server Error: Could not parse for environment production: Syntax error at ''
The error is in my node block:
node ${query6} {
  package { 'atop':
    ensure => 'installed',
  }
}
I see the correct output from notify; my variable looks like this:
"test-ubuntu1","test-ubuntu2"
That is the correct format for a node definition.
I don't understand what's wrong; the query6 variable looks correct.
How do I fix this?
I just want to apply this manifest to a Foreman host group. How do I do that correctly?
On the Puppet side, you create classes describing how to manage appropriate subunits of your machines' overall configuration, and organize those classes into modules. The details of this are far too broad to cover in an SO answer -- it would be analogous to answering "How do I program in [language X]?".
Having prepared your classes, the task is to instruct Puppet which ones to assign to each node. This is called "classification". Node blocks are one way to perform classification. Another is external node classifiers (ENCs). There are also alternatives based on ordinary top-level Puppet code in your site manifest. None of these are exclusive.
If you are running Puppet with The Foreman, however, then you should configure Puppet to use the ENC that Foreman provides. You then use Foreman to assign (Puppet) classes to nodes and / or node groups, and Foreman communicates the details to Puppet via its ENC. That does not require any classification code on the Puppet side at all.
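For reference, pointing Puppet at an ENC is a puppet.conf setting on the Puppet server; the exact script path depends on how Foreman was installed (the Foreman installer normally wires this up for you), so treat the following as a sketch:
[master]
  node_terminus  = exec
  external_nodes = /etc/puppetlabs/puppet/node.rb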
See also How does host groups work with foreman?

Case insensitive search in arrays for CosmosDB / DocumentDB

Let's say I have these documents in my CosmosDB (DocumentDB API, .NET SDK):
{
  // partition key of the collection
  "userId": "0000-0000-0000-0000",
  "emailAddresses": [
    "someaddress#somedomain.com", "Another.Address#someotherdomain.com"
  ]
  // some more fields
}
I now need to find out if I have a document for a given email address. However, I need the query to be case insensitive.
There are ways to do a case-insensitive search on a field (they do a full scan, however):
How to do a Case Insensitive search on Azure DocumentDb?
select * from json j where LOWER(j.name) = 'timbaktu'
e => e.Id.ToLower() == key.ToLower()
These do not work for arrays. Is there an alternative way? A user defined function looks like it could help.
I am mainly looking for a temporary low-effort solution to support the scenario (I have multiple collections like this). I probably need to switch to a data structure like this at some point:
{
  "userId": "0000-0000-0000-0000",
  // Option A
  "emailAddresses": [
    {
      "displayName": "someaddress#somedomain.com",
      "normalizedName": "someaddress#somedomain.com"
    },
    {
      "displayName": "Another.Address#someotherdomain.com",
      "normalizedName": "another.address#someotherdomain.com"
    }
  ],
  // Option B
  "emailAddressesNormalized": [
    "someaddress#somedomain.com", "another.address#someotherdomain.com"
  ]
}
Unfortunately, my production database already contains documents that would need to be updated to support the new structure.
My production collections contain only 100s of these items, so I am even tempted to just get all items and do the comparison in memory on the client.
If performance matters, you should consider one of the normalization solutions you proposed yourself in the question. You could then index the normalized field and get results without doing a full scan.
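For example, with the Option A structure above, a query along these lines can match the normalized value exactly instead of lowercasing at query time (a sketch; property names follow the example document):
SELECT c FROM c
WHERE ARRAY_CONTAINS(c.emailAddresses, { "normalizedName": "another.address#someotherdomain.com" }, true)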
If for some reason you really don't want to retouch the documents, then perhaps the feature you are missing is a simple JOIN.
An example query which does a case-insensitive search within the array, with a scan:
SELECT c FROM c
join email in c.emailAddresses
where lower(email) = lower('ANOTHER.ADDRESS#someotherdomain.com')
You can find more examples about joining from Getting started with SQL commands in Cosmos DB.
Note that the WHERE criterion in the given example cannot use an index, so consider using it only alongside another, more selective (indexed) criterion.
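The user-defined-function route the question mentions is also possible: register a small JavaScript UDF on the collection, e.g. function toLowerCase(s) { return s.toLowerCase(); }, and call it with the udf. prefix in the query. A sketch (the UDF name is a placeholder, and this still scans the array):
SELECT c FROM c
JOIN email IN c.emailAddresses
WHERE udf.toLowerCase(email) = 'another.address#someotherdomain.com'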

How to properly use exported resources with Puppet

I've been thinking for hours about how to solve a problem with Puppet 4.
Here is my case:
I have a "Cassandra" module, and I have 3 machines:
Cluster1 (hostname: CassandraCluster1)
Cluster2 (hostname: CassandraCluster2)
Cluster3 (hostname: CassandraCluster3)
I want to collect the hostnames of the three clusters in an array, so I can pass them to the Cassandra configuration file (which I'm using as an EPP template):
cassandra.yaml.epp
- seeds: "<%= $cluster::hostnames %>"
So the solution I came up with is exported resources :)
I've been playing with them all day, but I have no idea how to make this work. Here is what I've tried:
On each cluster I add this code to export and collect the hostnames:
@@file { "${hostname}":
  content => 'epp(puppet://modules/cassandra/cassandra.yaml.epp)',
}
# Collect:
File <<| |>>
But I'm not sure if this is a good idea?!
I've found a temporary solution that I'm using now:
I have a role fact which I add to each machine, and I retrieve the hosts from PuppetDB using the puppetdbquery module.
Then in Puppet code I use this query to find the nodes with that role attached:
$hosts = query_nodes("role=apache")
To the extent that the question is how to create the wanted file with the use of exported resources, it's important to understand that collecting exported resources causes them to be added to the target node's catalog. Exporting and collecting resources is not merely a data-transfer exercise. Your example exports several File resources; collecting these will result in the same number of separate files being managed on the target (or else a duplicate resource error if you happen to have a resource title collision).
For a long time, the usual way to build a file based on pieces contributed by multiple nodes was to use the puppetlabs/concat module, with each contributing node exporting a concat::fragment resource. For example:
@@concat::fragment { "${hostname} Cassandra seed":
  target  => '/path/to/file',
  content => " ${hostname}",
  order   => '10',
  tag     => 'cassandra',
}
Those would be collected into the catalog for the target node, to be used in conjunction with a corresponding concat resource declared there:
concat { '/path/to/file':
  ensure => 'present',
  # ...
}

concat::fragment { "Cassandra seed start":
  target  => '/path/to/file',
  content => ' - seeds: "',
  order   => '05',
}

Concat::Fragment <<| tag == 'cassandra' |>>

concat::fragment { "Cassandra seed tail":
  target  => '/path/to/file',
  content => '"',
  order   => '15',
}
That still works just fine, but it is becoming more common to rely instead on querying puppetdb, as you demonstrate in your own answer.
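A rough sketch of that PuppetDB-based approach, using the puppetdbquery module the question already mentions (the role value and class parameter are placeholders):
# Hostnames of all nodes carrying the cassandra role fact
$seeds = join(query_nodes('role=cassandra'), ',')  # join() from puppetlabs-stdlib

# Hand the comma-separated list to the template, e.g. via a class parameter
class { 'cassandra':
  seeds => $seeds,
}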

Integrate Elasticsearch with PostgreSQL while using Sails.js with Waterline ORM

I am trying to integrate Elasticsearch with Sails.js and my database isn't MongoDB: I use PostgreSQL, so this post doesn't help.
I have installed Elasticsearch on my Ubuntu box and now it's running successfully. I also installed this package on my Sails project, but I cannot create indexes on my existing models.
How can I define indexes on my models, and how can I search using Elasticsearch inside my Models?
What hooks do I need to define inside my models?
Here you can find a pretty straightforward package (sails-elastic). It is driven directly by configs from Elasticsearch itself.
Elasticsearch docs and index creation in particular
There are lots of approaches to solving this issue. The recommended way is to use Logstash (by Elastic), which I describe in detail below.
I'll list most of the approaches I know here:
Using Logstash
curl https://download.elastic.co/logstash/logstash/logstash-2.3.2.tar.gz > logstash.tar.gz
tar -xzf logstash.tar.gz
cd logstash-2.3.2
Install the jdbc input plugin:
bin/logstash-plugin install logstash-input-jdbc
Then download the PostgreSQL JDBC driver:
curl https://jdbc.postgresql.org/download/postgresql-9.4.1208.jre7.jar > postgresql-9.4.1208.jre7.jar
Now create a configuration file, input.conf, that tells Logstash to use the JDBC input:
input {
  jdbc {
    jdbc_driver_library => "/Users/khurrambaig/Downloads/logstash-2.3.2/postgresql-9.4.1208.jre7.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/khurrambaig"
    jdbc_user => "khurrambaig"
    jdbc_password => ""
    schedule => "* * * * *"
    statement => 'SELECT * FROM customer WHERE "updatedAt" > :sql_last_value'
    type => "customer"
  }
  jdbc {
    jdbc_driver_library => "/Users/khurrambaig/Downloads/logstash-2.3.2/postgresql-9.4.1208.jre7.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/khurrambaig"
    jdbc_user => "khurrambaig"
    jdbc_password => ""
    schedule => "* * * * *"
    statement => 'SELECT * FROM employee WHERE "updatedAt" > :sql_last_value'
    type => "employee"
  }
  # add more jdbc inputs to suit your needs
}

output {
  elasticsearch {
    index => "khurrambaig"
    document_type => "%{type}"  # <- use the type from each input
    document_id => "%{id}"      # <- to avoid duplicates
    hosts => "localhost:9200"
  }
}
Now run logstash using the above file:
bin/logstash -f input.conf
For every model (table) that you want to index as a document type in an index (the database, khurrambaig here), use an appropriate SQL statement (SELECT * FROM employee WHERE "updatedAt" > :sql_last_value here). I use sql_last_value so that only updated data is fetched. Logstash also supports scheduling and much more; here it runs every minute. For more details, refer to the Logstash JDBC input documentation.
To see the documents that have been inserted into the index for a particular type:
curl -XGET 'http://localhost:9200/khurrambaig/customer/_search?pretty=true'
This will list all the documents under the customer type in my case. Look into the Elasticsearch search API and use that, or use the official Node.js client.
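If you go the Node.js route, a minimal sketch with the official elasticsearch client of that era might look like this (index and type names follow the example above):
var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({ host: 'localhost:9200' });

// List customer documents in the khurrambaig index
client.search({
  index: 'khurrambaig',
  type: 'customer',
  body: { query: { match_all: {} } }
}).then(function (resp) {
  console.log(resp.hits.hits);
});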
Using jdbc input
https://github.com/jprante/elasticsearch-jdbc
You can read its README; it's quite straightforward. But it doesn't provide scheduling or many of the other things that Logstash provides.
Using sails-elastic
You need to use multiple adapters, as described in its README.
But this isn't recommended, because it will slow down your requests: for every create, update, and delete you will be calling two databases, Elasticsearch and PostgreSQL.
In Logstash, indexing of documents is independent of application requests. This approach is used by many, including Wikipedia. You also remain independent of the framework: today you are using Sails, tomorrow you might use something else, but nothing needs to change in Logstash as long as you still use PostgreSQL. (Even if you change databases, inputs for many databases are available, and moving from one SQL RDBMS to another only requires swapping the JDBC driver.)
There's also ZomboDB, but it currently works only with pre-2.0 Elasticsearch (support for ES 2.0+ is coming).
