Configure log collection in AKS - Azure

I'm trying to limit the AKS logs collected for the various containers. Following this guide https://learn.microsoft.com/en-us/azure/azure-monitor/containers/container-insights-agent-config I created my config map:
kind: ConfigMap
apiVersion: v1
data:
  schema-version:
    #string.used by agent to parse config. supported versions are {v1}. Configs with other schema versions will be rejected by the agent.
    v1
  config-version:
    #string.used by customer to keep track of this config file's version in their source control/repository (max allowed 10 chars, other chars will be truncated)
    ver1
  log-data-collection-settings: |-
    # Log data collection settings
    # Any errors related to config map settings can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.
    [log_collection_settings]
      [log_collection_settings.stdout]
        # In the absence of this configmap, default value for enabled is true
        enabled = false
        # exclude_namespaces setting holds good only if enabled is set to true
        # kube-system,gatekeeper-system log collection are disabled by default in the absence of 'log_collection_settings.stdout' setting. If you want to enable kube-system,gatekeeper-system, remove them from the following setting.
        # If you want to continue to disable kube-system,gatekeeper-system log collection keep the namespaces in the following setting and add any other namespace you want to disable log collection to the array.
        # In the absence of this configmap, default value for exclude_namespaces = ["kube-system","gatekeeper-system"]
        # exclude_namespaces = ["kube-system","gatekeeper-system","kube-node-lease","kube-public","default","nsbpo","nscommon","nsregistry","aks-command"]
      [log_collection_settings.stderr]
        # Default value for enabled is true
        enabled = true
        # exclude_namespaces setting holds good only if enabled is set to true
        # kube-system,gatekeeper-system log collection are disabled by default in the absence of 'log_collection_settings.stderr' setting. If you want to enable kube-system,gatekeeper-system, remove them from the following setting.
        # If you want to continue to disable kube-system,gatekeeper-system log collection keep the namespaces in the following setting and add any other namespace you want to disable log collection to the array.
        # In the absence of this configmap, default value for exclude_namespaces = ["kube-system","gatekeeper-system"]
        exclude_namespaces = []
      [log_collection_settings.env_var]
        # In the absence of this configmap, default value for enabled is true
        enabled = false
      [log_collection_settings.enrich_container_logs]
        # In the absence of this configmap, default value for enrich_container_logs is false
        enabled = false
        # When this is enabled (enabled = true), every container log entry (both stdout & stderr) will be enriched with container Name & container Image
      [log_collection_settings.collect_all_kube_events]
        # In the absence of this configmap, default value for collect_all_kube_events is false
        # When the setting is set to false, only the kube events with !normal event type will be collected
        enabled = false
        # When this is enabled (enabled = true), all kube events including normal events will be collected
      #[log_collection_settings.schema]
        # In the absence of this configmap, default value for containerlog_schema_version is "v1"
        # Supported values for this setting are "v1","v2"
        # See documentation at https://aka.ms/ContainerLogv2 for benefits of v2 schema over v1 schema before opting for "v2" schema
        # containerlog_schema_version = "v2"
  metric_collection_settings: |-
    # Metrics collection settings for metrics sent to Log Analytics and MDM
    [metric_collection_settings.collect_kube_system_pv_metrics]
      # In the absence of this configmap, default value for collect_kube_system_pv_metrics is false
      # When the setting is set to false, only the persistent volume metrics outside the kube-system namespace will be collected
      enabled = false
      # When this is enabled (enabled = true), persistent volume metrics including those in the kube-system namespace will be collected
  alertable-metrics-configuration-settings: |-
    # Alertable metrics configuration settings for container resource utilization
    [alertable_metrics_configuration_settings.container_resource_utilization_thresholds]
      # The threshold(Type Float) will be rounded off to 2 decimal points
      # Threshold for container cpu, metric will be sent only when cpu utilization exceeds or becomes equal to the following percentage
      container_cpu_threshold_percentage = 95.0
      # Threshold for container memoryRss, metric will be sent only when memory rss exceeds or becomes equal to the following percentage
      container_memory_rss_threshold_percentage = 95.0
      # Threshold for container memoryWorkingSet, metric will be sent only when memory working set exceeds or becomes equal to the following percentage
      container_memory_working_set_threshold_percentage = 95.0
    # Alertable metrics configuration settings for persistent volume utilization
    [alertable_metrics_configuration_settings.pv_utilization_thresholds]
      # Threshold for persistent volume usage bytes, metric will be sent only when persistent volume utilization exceeds or becomes equal to the following percentage
      pv_usage_threshold_percentage = 60.0
    # Alertable metrics configuration settings for completed jobs count
    [alertable_metrics_configuration_settings.job_completion_threshold]
      # Threshold for completed job count, metric will be sent only for those jobs which were completed earlier than the following threshold
      job_completion_threshold_time_minutes = 360
  integrations: |-
    [integrations.azure_network_policy_manager]
      collect_basic_metrics = false
      collect_advanced_metrics = false
    [integrations.azure_subnet_ip_usage]
      enabled = false
  # Doc - https://github.com/microsoft/Docker-Provider/blob/ci_prod/Documentation/AgentSettings/ReadMe.md
  agent-settings: |-
    # prometheus scrape fluent bit settings for high scale
    # buffer size should be greater than or equal to chunk size else we set it to chunk size.
    #[agent_settings.prometheus_fbit_settings]
      # tcp_listener_chunk_size = 10
      # tcp_listener_buffer_size = 10
      # tcp_listener_mem_buf_limit = 200
    # The following settings are "undocumented", we don't recommend uncommenting them unless directed by Microsoft.
    # They increase the maximum stdout/stderr log collection rate but will also cause higher cpu/memory usage.
    ## Ref for more details about Ignore_Older - https://docs.fluentbit.io/manual/v/1.7/pipeline/inputs/tail
    # [agent_settings.fbit_config]
    #   log_flush_interval_secs = "1"        # default value is 15
    #   tail_mem_buf_limit_megabytes = "10"  # default value is 10
    #   tail_buf_chunksize_megabytes = "1"   # default value is 32kb (comment out this line for default)
    #   tail_buf_maxsize_megabytes = "1"     # default value is 32kb (comment out this line for default)
    #   tail_ignore_older = "5m"             # default value same as fluent-bit default i.e. 0m
metadata:
  name: container-azm-ms-agentconfig
  namespace: kube-system
Reading the agent logs I found a couple of weird things. In the figure below it says that the config map has been changed, but it also shows an exclusion for both stderr and stdout; since stdout is disabled, how is that possible? The logs also contain the message config :: No ADX database name set, using default value: containerinsights, and I can't find any information about what it means.
Also, in the Log Analytics workspace I see that the stdout logs are still being collected into the ContainerLog table.
I wonder whether I have misinterpreted the guide or misconfigured something.
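This is the kind of query I use in the Log Analytics workspace to check whether stdout entries keep arriving after the config map change (a minimal sketch, assuming the standard ContainerLog schema with its LogEntrySource column; the 30-minute window is just an example):

ContainerLog
| where TimeGenerated > ago(30m)
| where LogEntrySource == "stdout"
| summarize count() by ContainerID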

I tried to reproduce the same issue in my environment and got the expected results.
I created and deployed the config file:
vi container-azm-ms-agentconfig.yaml
kubectl apply -f container-azm-ms-agentconfig.yaml
We can list the agent pods using the command below:
kubectl get pods -n kube-system
and check their logs using:
kubectl logs <pod_name> -n kube-system
When I check the logs I get the same message: config :: No ADX database name set, using default value: containerinsights
This is not an error: no ADX (Azure Data Explorer) database was created here, so containerinsights is used as the default value.
If needed, you can create the ADX sample database and then the message will not be shown.
You can refer to this link.
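To confirm the config map was actually picked up, these are the kinds of commands that can be used; the agent pod prefix depends on the agent version (omsagent-* on older clusters, ama-logs-* on newer ones), so treat the names below as assumptions:

# find the Container insights agent pods (prefix varies by agent version)
kubectl get pods -n kube-system | grep -E 'omsagent|ama-logs'
# look for config map processing messages or errors in an agent pod's logs
kubectl logs <agent_pod_name> -n kube-system | grep -i config

Any config map parsing errors also end up in the KubeMonAgentEvents table mentioned in the config map comments.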

Related

AWS Elastic Beanstalk: terraform plan does not show sensitive setting

I am using Terraform to provision Elastic Beanstalk. There have been no changes in my template, but when I run a plan it still shows me the following:
  # module.abc.aws_elastic_beanstalk_environment.this will be updated in-place
  ~ resource "aws_elastic_beanstalk_environment" "this" {
        id   = "abc"
        name = "abc"
        tags = {}
        # (19 unchanged attributes hidden)

      ~ setting {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }
    }

Plan: 0 to add, 3 to change, 0 to destroy.
I do not want to apply until I know which setting change it is referring to. Can someone help me get that setting to show in the terraform plan output?
You can see the configuration changes in the AWS Elastic Beanstalk console under Change History.
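If you need to see the actual values from Terraform itself, one workaround (not part of the original answer; note that the saved plan and its JSON contain the sensitive values in plain text, so handle them accordingly) is to save the plan and inspect its JSON representation, for example with jq:

terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json
# the resource address below is the one from the plan output above
jq '.resource_changes[] | select(.address == "module.abc.aws_elastic_beanstalk_environment.this") | .change | {before: .before.setting, after: .after.setting}' tfplan.json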

Understanding the difference between ~ and -

I'm working on importing one of our RDS instances into Terraform.
Terraform plan shows ~ and -
~ maintenance_window = "sat:06:10-sat:06:40" -> (known after apply)
- max_allocated_storage = 0 -> null
Neither of these values is defined in the configuration. I would like to understand why it is showing -, and whether we should also configure null values in the module.
Using Terraform 0.12.28
Basically:
~ means the value is in the state and will change when the plan is applied;
- means the value is in the state and you are removing it (setting it to null).
maintenance_window is showing ~ because its value is going to change, and in your specific case its value is computed and hence known after applying the changes. From the docs:
maintenance_window - (Optional) The window to perform maintenance in. Syntax: "ddd:hh24:mi-ddd:hh24:mi". Eg: "Mon:00:00-Mon:03:00". See RDS Maintenance Window docs for more information.
If that window is fine for you, you can specify that as an argument or let Terraform change it to its default value.
max_allocated_storage is showing - because when you imported the resource into the state, Terraform imported all the arguments it knows about, but you are not specifying that one in your configuration. In particular, from the docs:
max_allocated_storage - (Optional) When configured, the upper limit to which Amazon RDS can automatically scale the storage of the DB instance. Configuring this will automatically ignore differences to allocated_storage. Must be greater than or equal to allocated_storage or 0 to disable Storage Autoscaling.
In this case you can set max_allocated_storage = 0 so that the plan no longer shows a change for that argument.
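A minimal sketch of how the resource block could look after the import, assuming an aws_db_instance named "this" (the resource name is illustrative):

resource "aws_db_instance" "this" {
  # ... the arguments you already manage ...

  # keep the window that was imported so the plan no longer reports a change
  maintenance_window = "sat:06:10-sat:06:40"

  # 0 matches the imported state and keeps Storage Autoscaling disabled
  max_allocated_storage = 0
}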

StormCrawler: pages with noindex,nofollow are crawled

We are using StormCrawler 1.13 to crawl site pages. In one environment it does not crawl pages that have the robots meta tag noindex,nofollow, but when we deploy the same modules in another environment, pages with noindex,nofollow are crawled as well. Below is our crawler-conf.yaml.
# Custom configuration for StormCrawler
# This is used to override the default values from crawler-default.xml and provide additional ones
# for your custom components.
# Use this file with the parameter -conf when launching your extension of ConfigurableTopology.
# This file does not contain all the key values but only the most frequently used ones. See crawler-default.xml for an extensive list.
config:
  topology.workers: 1
  topology.message.timeout.secs: 300
  topology.max.spout.pending: 100
  topology.debug: false
  fetcher.threads.number: 50
  # give 2gb to the workers
  worker.heap.memory.mb: 2048
  # mandatory when using Flux
  topology.kryo.register:
    - com.digitalpebble.stormcrawler.Metadata
  # metadata to transfer to the outlinks
  # used by Fetcher for redirections, sitemapparser, etc...
  # these are also persisted for the parent document (see below)
  # metadata.transfer:
  #  - customMetadataName
  # lists the metadata to persist to storage
  # these are not transfered to the outlinks
  metadata.persist:
    - _redirTo
    - error.cause
    - error.source
    - isSitemap
    - isFeed
  http.agent.name: "Anonymous Coward"
  http.agent.version: "1.0"
  http.agent.description: "built with StormCrawler Archetype ${version}"
  http.agent.url: "http://someorganization.com/"
  http.agent.email: "someone@someorganization.com"
  # The maximum number of bytes for returned HTTP response bodies.
  # The fetched page will be trimmed to 65KB in this case
  # Set -1 to disable the limit.
  http.content.limit: -1
  # FetcherBolt queue dump => comment out to activate
  # if a file exists on the worker machine with the corresponding port number
  # the FetcherBolt will log the content of its internal queues to the logs
  # fetcherbolt.queue.debug.filepath: "/tmp/fetcher-dump-{port}"
  parsefilters.config.file: "parsefilters.json"
  urlfilters.config.file: "urlfilters.json"
  # revisit a page daily (value in minutes)
  # set it to -1 to never refetch a page
  fetchInterval.default: 1440
  # revisit a page with a fetch error after 2 hours (value in minutes)
  # set it to -1 to never refetch a page
  fetchInterval.fetch.error: 120
  # never revisit a page with an error (or set a value in minutes)
  fetchInterval.error: -1
  # custom fetch interval to be used when a document has the key/value in its metadata
  # and has been fetched successfully (value in minutes)
  # fetchInterval.FETCH_ERROR.isFeed=true: 30
  # fetchInterval.isFeed=true: 10
  # configuration for the classes extending AbstractIndexerBolt
  # indexer.md.filter: "someKey=aValue"
  indexer.url.fieldname: "url"
  indexer.text.fieldname: "content"
  indexer.canonical.name: "canonical"
  indexer.md.mapping:
    - parse.title=title
    - parse.keywords=keywords
    - parse.description=description
    - domain=domain
  # Metrics consumers:
  topology.metrics.consumer.register:
    - class: "org.apache.storm.metric.LoggingMetricsConsumer"
      parallelism.hint: 1
Please let me know if I need to change anything in the above configuration or in any other StormCrawler settings.
Thank you.
The behaviour of meta noindex is not configurable in 1.13 so any difference between your environments can't be due to a difference in configuration.
How did you generate the topology? Did you use the archetype?
PS: it is good practice to set the http.agent.* configs.
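For reference, a 1.13 topology generated from the archetype would be created with something like the command below (coordinates as published for the 1.x releases; double-check the version against your build):

mvn archetype:generate \
  -DarchetypeGroupId=com.digitalpebble.stormcrawler \
  -DarchetypeArtifactId=storm-crawler-archetype \
  -DarchetypeVersion=1.13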

How to suppress "Number of stages in the query (nnn) exceeds the soft limit" warning in Presto DB

I've tried:
passing --session query_max_stage_count=150 to the Presto CLI client,
running SET SESSION query_max_stage_count = 150 inside a REPL session,
running SET SESSION query_max_stage_count = 150 as the first command of a script passed using -f.
All to no avail. query_max_stage_count does seem to be recognized to some extent, since passing it an invalid (say, non-numeric) value triggers an error.
The query_max_stage_count session property governs only the hard stage-count limit.
You can observe this by setting it to a low value:
presto> SET SESSION query_max_stage_count = 2;
SET SESSION
presto> SELECT DISTINCT name FROM (SELECT name FROM tpch.tiny.nation UNION ALL SELECT name FROM tpch.tiny.nation);
Query 20200621_080512_00011_gd9gz failed: Number of stages in the query (4) exceeds the allowed maximum (2). [...]
Currently, the "soft stages limit" (the threshold above which a warning is issued) is configurable only in the config.properties with query.stage-count-warning-threshold property and there is no session property to override this setting.
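For example, to raise the warning threshold to the 150 you were aiming for, the coordinator's config.properties would contain something like the line below (value taken from your question; a restart is needed for it to take effect):

# etc/config.properties on the coordinator
query.stage-count-warning-threshold=150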
If you feel the warning should also be controlled by a session property, please file a new issue at https://github.com/prestosql/presto/issues/new.

Conditionally creating fields depending on filter results in the Logstash InfluxDB output

I'm using Logstash to collect sar metrics from the server and store them in InfluxDB.
Metrics from different sources (CPU, Memory, Network) should be inserted into different series in InfluxDB. Of course, the number and names of fields in those series depend on the type of metric source.
This is my config file: https://github.com/evgygor/test/blob/master/logstash.conf
For each [type] of metric I have to configure a separate influxdb output. In this example I configured two types of metrics, but I'm planning to use it for sar metrics, JMX metrics, and CSV metrics from JMeter, which means I would need to configure the appropriate output for each of them (tens of outputs).
Questions:
How can I elaborate desired configuration?
Is there any option to use conditionals inside the plugin? Example:
if [type] == "system.cpu" {
  data_points => {
    "time" => "%{time}"
    "user" => "%{user}"
  }
}
else {
  data_points => {
    "time" => "%{time}"
    "kbtotalmemory" => "%{kbtotalmemory}"
    "kbmemfree" => "%{kbmemfree}"
    "kbmemused" => "%{kbmemused}"
  }
}
Is there any flag to tell the influxdb plugin to use the field names/data types from the input by default?
Is there any flag/ability to define a default data type?
Is there any way to reserve the field name "time" with data type integer?
Thanks a lot.
I cooked up a solution.
This fork permits creating fields on the fly, according to the field names and data types of the events that arrive at the output plugin.
I added 2 configuration parameters:
# This setting removes the need to use the data_points and coerce_values configuration
# to build an appropriate insert into InfluxDB. Should be used together with the fields_to_skip configuration.
# It takes the data point (column) names and values from the event that arrives at the plugin.
config :use_event_fields_for_data_points, :validate => :boolean, :default => true

# The array of keys to exclude from further processing.
# By default the event that arrives at the output plugin contains the keys "@version" and "@timestamp"
# and can contain other fields such as "command" added by the exec input plugin.
# Of course we don't need those fields to be processed and inserted into InfluxDB when
# use_event_fields_for_data_points is true.
# We don't delete the keys from the event itself; we create a new Hash from the event and then delete the unwanted keys.
config :fields_to_skip, :validate => :array, :default => []
This is my example config file: I'm retrieving different numbers of fields with different names from CPU, memory, and disks, but I don't need a different configuration per data type as in the master branch. I create the relevant field names and data types at the filter stage and just skip the unwanted fields in the output plugin.
https://github.com/evgygor/logstash-output-influxdb
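As an illustration, an output section using the fork could look like the sketch below. The two fork-specific options are the ones described above; the host and database values are placeholders, and the remaining option names follow the stock plugin and may differ between plugin versions:

output {
  if [type] == "system.cpu" {
    influxdb {
      host => "localhost"                                        # placeholder InfluxDB host
      db => "sarmetrics"                                         # placeholder database name
      use_event_fields_for_data_points => true                   # fork option: take field names/values from the event
      fields_to_skip => ["@version", "@timestamp", "command"]    # fork option: drop bookkeeping fields
    }
  }
}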
