Puppet local compile failure due to requesting access to the Puppet Master - puppet

I want to perform a Puppet (version 6.16) local compile for testing purposes using the following command:
puppet catalog compile testhost.int.test.com --environmentpath="xxx" --environment="ccc" --modulepath="ttt" --manifest="hhh" --vardir="eee"
It went well until I hit one custom module which calls the following Puppet function:
Puppet::FileServing::Content.indirection.find(file, :environment => compiler.environment)
The error is as below:
Error: Request to https://test.int.test.com:8140/puppet/v3 failed after 0.002 seconds: The private key is missing from '/etc/puppetlabs/puppet/ssl/private_keys/testhost.int.test.com.pem'
which suggests it is trying to connect to the Puppet Master to query the file.
The thing is, I only want to perform a local compile and do NOT want to talk to the Puppet Master for any information.
So, is there any workaround to make it look only at the local environment instead of checking with the Puppet Master?
By the way, the following method seems OK; it checks the local environment rather than the Master side:
Puppet::Parser::Files.find_file(file, compiler.environment)
I'm relatively new to Puppet. Thanks in advance.
I expect to be able to run the puppet catalog compile command purely locally, without talking to the Puppet Master, as a sanity check to see whether we are missing anything during the compile phase before we push anything to the production branch.

Airflow can't reach logs from webserver due to 403 error

I use Apache Airflow for daily ETL jobs. I installed it in Azure Kubernetes Service using the provided Helm chart. It had been running fine for half a year, but recently I became unable to access the logs in the webserver (this always used to work fine).
I'm getting the following error:
*** Log file does not exist: /opt/airflow/logs/dag_id=analytics_etl/run_id=manual__2022-09-26T09:25:50.010763+00:00/task_id=copy_device_table/attempt=18.log
*** Fetching from: http://airflow-worker-0.airflow-worker.default.svc.cluster.local:8793/dag_id=analytics_etl/run_id=manual__2022-09-26T09:25:50.010763+00:00/task_id=copy_device_table/attempt=18.log
*** !!!! Please make sure that all your Airflow components (e.g. schedulers, webservers and workers) have the same 'secret_key' configured in 'webserver' section and time is synchronized on all your machines (for example with ntpd) !!!!!
****** See more at https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#secret-key
****** Failed to fetch log file from worker. Client error '403 FORBIDDEN' for url 'http://airflow-worker-0.airflow-worker.default.svc.cluster.local:8793/dag_id=analytics_etl/run_id=manual__2022-09-26T09:25:50.010763+00:00/task_id=copy_device_table/attempt=18.log'
For more information check: https://httpstatuses.com/403
What have I tried:
I've made sure that the log file exists (I can exec into the airflow-worker-0 pod and read the file on command line in the location specified in the error).
I've rolled back my deployment to an earlier commit from when I know for sure it was still working, but it made no difference.
I was using webserverSecretKeySecretName in the values.yaml configuration. I changed the secret to which that name was pointing (deleted it and created a new one, as described here: https://airflow.apache.org/docs/helm-chart/stable/production-guide.html#webserver-secret-key) but it didn't work (no difference, same error).
I changed the config to use a webserverSecretKey instead (in plain text), no difference.
My thoughts/observations:
The error states that the log file doesn't exist, but that's not true. It probably just can't access it.
The time is the same in all pods (I double-checked by exec-ing into them and typing date on the command line).
The webserver secret is the same in the worker, the scheduler, and the webserver (I double-checked by exec-ing into them and inspecting the corresponding env variable).
Any ideas?
Turns out this was a known bug with the latest release (2.4.0) of the official Airflow Helm chart, reported here:
https://github.com/apache/airflow/discussions/26490
It should be resolved in version 2.4.1, which should be available in the next couple of days.

tfplugindocs terraform document generation error

I am new to Golang and Terraform plugin development. I was trying to generate plugin documentation with 'tfplugindocs' and I keep getting errors.
Below is the output of the 'tfplugindocs' execution.
My plugin is still under development and not registered with Terraform.
What could be the cause of this error?
% go run github.com/hashicorp/terraform-plugin-docs/cmd/tfplugindocs
rendering website for provider "hashicups"
exporting schema from Terraform
compiling provider "hashicups"
getting Terraform binary
running terraform init
getting provider schema
Error executing command: unable to generate website:
Error: Could not load plugin
Plugin reinitialization required. Please run "terraform init".
Plugins are external binaries that Terraform uses to access and manipulate
resources. The configuration provided requires plugins which can't be
located,
don't satisfy the version constraints, or are otherwise incompatible.
Terraform automatically discovers provider requirements from your
configuration, including providers used in child modules. To see the
requirements and constraints, run "terraform providers".
failed to instantiate provider "registry.terraform.io/hashicorp/hashicups" to
obtain schema: Unrecognized remote plugin message: open : no such file or
directory
This usually means that the plugin is either invalid or simply
needs to be recompiled to support the latest protocol.
exit status 1
I noticed that I started getting this error.
It appears to have started when I commented out the Address field in the providerserver.ServeOpts struct in main.go like this:
opts := providerserver.ServeOpts{
    // TODO: Update this string with the published name of your provider.
    // Address: "registry.terraform.io/davidalpert/myproject",
    Debug: debug,
}
Once I uncommented that Address field I was able to run the build (which runs generate, which runs tfplugindocs) without error.
opts := providerserver.ServeOpts{
    // TODO: Update this string with the published name of your provider.
    Address: "registry.terraform.io/davidalpert/myproject",
    Debug: debug,
}
My plugin is not published yet either, so that Address value is not valid, but that doesn't seem to matter so long as the Address field is set to override the default.

"Error: Key not loaded" in h2o deployed through a K3s cluster, using python3 client

I can confirm the 3-replica cluster of h2o inside K3s is correctly deployed, as executing h2o.init(ip="x.x.x.x") in the Python3 interpreter works as expected. I followed the instructions noted here: https://www.h2o.ai/blog/running-h2o-cluster-on-a-kubernetes-cluster/
Nevertheless, I had to modify the service.yaml and comment out the line that says clusterIP: None, as K3s was complaining about its inability to set the clusterIP to None. Even so, I can confirm it is working correctly, and I am able to use an external IP to connect to the cluster.
If I try to load the dataset using the h2o cluster inside the K3s cluster, following the exact same steps as described here http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html, this is the output that I get:
>>> train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
...
h2o.exceptions.H2OResponseError: Server error java.lang.IllegalArgumentException:
Error: Key not loaded: Key<Frame> https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv
Request: POST /3/ParseSetup
data: {'check_header': '0', 'source_frames': '["https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv"]'}
The same error occurs if I use the h2o.upload_file("x.csv") method.
There is a clue about what may be happening here: Key not loaded: Key<Frame> while POSTing source frame through ParseSetup in H2O API call, but I am not using curl, and I cannot find any parameter that could help me overcome this issue: http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/h2o.html?highlight=import_file#h2o.import_file
I need to use the Python client inside the same K3s cluster for various technical reasons, so I am not able to launch either Flow or Firebug to find out what may be happening.
I can confirm it is working correctly when I simply issue an h2o.init(), using the local Java instance.
UPDATE 1:
I have tried in different K3s clusters without success. I changed the service.yaml to a NodePort, and now this is the error traceback:
>>> train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
...
h2o.exceptions.H2OResponseError: Server error java.lang.IllegalArgumentException:
Error: Job is missing
Request: GET /3/Jobs/$03010a2a016132d4ffffffff$_a2366be93ec99a78d7bc161de8c54d67
UPDATE 2:
I have tried using different services (NodePort, LoadBalancer, ClusterIP) and none of them work. I have also tried using Minikube with the official image, and with a custom image made by me, without success. I suspect this is related either to h2o itself or to the clustering between pods. I will keep digging, and hopefully there will be some gold in it.
UPDATE 3:
I also found out that the post about running H2O in Docker is really outdated (https://www.h2o.ai/blog/h2o-docker/), and the Dockerfile present on GitHub does not work either (I changed it to uncomment the ENTRYPOINT section, without success): https://github.com/h2oai/h2o-3/blob/master/Dockerfile
Even so, I tried the custom image I built for h2o-k8s and it works seamlessly in pure Docker. I am wondering why it still does not work in K8s...
UPDATE 4:
I have tried modifying the environment variable called H2O_KUBERNETES_SERVICE_DNS without success.
In the meantime, the cluster started to become unavailable; that is, the readinessProbes would not complete successfully. No matter what I change now, it does not work.
I spun up a local K3d cluster to see what happened and, surprisingly, the readinessProbes were not failing with v3.30.0.6. I then started testing with R instead of Python. I am glad I tried, because I may have pinpointed what was wrong: there is a version mismatch between the client and the server. So I updated the image accordingly to v3.30.0.1.
But now, again, the readinessProbe is not working in my K3d cluster, so I am unable to test it.
It seems to be working now: R client version 3.30.0.1 with server version 3.30.0.1. I also tried Python client version 3.30.0.7 with server version 3.30.0.7 and it started working. Marvelous. The problem was caused by a version mismatch between the client and the server, as the Python client had been updated to 3.30.0.7 while the latest server image for Docker was 3.30.0.6.
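For anyone hitting the same symptom, a quick sanity check is to compare the client and server version strings from the Python side before importing any frames. Below is a minimal sketch under the assumption that the cluster is reachable at a placeholder IP/port; h2o.__version__ and h2o.cluster().version are the version accessors in the h2o Python package.

import h2o

# Placeholder endpoint: use the external IP / NodePort exposed by the K3s service.
h2o.init(ip="x.x.x.x", port=54321)

client_version = h2o.__version__        # version of the installed Python package
server_version = h2o.cluster().version  # version reported by the running H2O cluster

print("client:", client_version, "| server:", server_version)
if client_version != server_version:
    # Align the two (e.g. pip install h2o==<server version>, or rebuild the server image)
    # before calling h2o.import_file / h2o.upload_file.
    print("Version mismatch between client and server")

If the two strings differ, aligning them is the first thing to try before digging into the Kubernetes networking.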

Puppet - Could not find environment 'none

We have been seeing the following errors for puppet-server and puppet-agent:
Jun 22 19:26:30 node puppet-agent[12345]: Local environment: "production" doesn't match server specified environment "none", restarting agent run with environment "none"
Jun 22 19:44:55 node INFO [puppet-server] Puppet Not Found: Could not find environment 'none
The configuration was verified a couple of times and it looks fine. The production environment exists.
Has anyone experienced a similar issue?
We have enabled debug logging for puppet-server; however, it doesn't seem to point us to the root cause.
What part of the code could be related to what we see here?
Regards
The master is overriding the agent's requested environment with a different one, but the environment the master chooses is either empty or explicitly "none", and, either way, no such environment is actually known to it. This is a problem with the external node classifier (ENC) the master is using. Check the master's external_nodes setting if you're uncertain which ENC is in play, and see the Puppet ENC documentation for a summary of Puppet's expectations for such a program.
If the ENC emits an environment attribute for the node in question, then the value of that attribute must be the name of an existing environment ('production', for example). If you want to let the agent choose, then the ENC should avoid emitting any environment attribute at all.
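To make that concrete, here is a minimal, hypothetical ENC sketch (not the actual classifier in use): Puppet invokes the executable configured in external_nodes with the node's certname as the first argument and expects a YAML hash on stdout. The class list and environment name below are placeholders.

#!/usr/bin/env python3
# Hypothetical ENC sketch: print a YAML node definition for the given certname.
import sys
import yaml

certname = sys.argv[1] if len(sys.argv) > 1 else "unknown"

# A real classifier would look certname up in an inventory; this sketch
# returns a fixed answer for every node.
node = {
    "classes": ["profile::base"],   # placeholder class list
    # Must be the name of an environment that actually exists on the master.
    # Omit this key entirely if the agent should keep its own environment.
    "environment": "production",
}

yaml.safe_dump(node, sys.stdout, default_flow_style=False)

Running the configured ENC by hand against one of the failing certnames is a quick way to see exactly which environment value, if any, the master is being handed.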

GitLab - CI/CD: external public resources issue

I've built a test scenario for GitLab (.gitlab-ci.yml), but I have an issue with some tests, as they need access to public resources (available over the internet without restriction, but on a specific URL + port).
A simple test confirmed that the access is blocked by the GitLab Docker container during the validation phase:
$ wget http://url:port
--2018-07-03 13:42:06-- http://url:port/
Connecting to url:port... failed: Connection refused.
ERROR: Job failed: exit code 1
The access is, of course, validated on another server/PC/...
Does someone have an idea on how to proceed to open this network flow?
Thanks for the help and have a nice day ;-) !
Guillain
