I followed https://learn.microsoft.com/en-us/azure/aks/certificate-rotation to rotate certificates in AKS. The certificates were updated, but my cluster is now in a failed state, and because of this my application is down.
I get the error below when I run this command: az aks rotate-certs -g $RESOURCE_GROUP_NAME -n $CLUSTER_NAME
ERROR: "error": { "code": "ErrorCodeRotateClusterCertificates", "message": "VMASAgentPoolReconciler retry failed: Category: ClientError; SubCode: OutboundConnFailVMExtensionError; Dependency: Microsoft.Compute/virtualMachines/extensions; OrginalError: Code=\"VMExtensionProvisioningError\" Message=\"VM has reported a failure when processing extension 'cse-agent-0'. Error message: \\\"Enable failed: failed to execute command: command terminated with exit status=50\\n[stdout]\\n\\n[stderr]\\ncurl: option --proxy-insecure: is unknown\\ncurl: try 'curl --help' or 'curl --manual' for more information\\nCommand exited with non-zero status 2\\n0.00user 0.00system 0:00.00elapsed 100%!!(MISSING)C(string=VMAS agent pools reconciling)PU (0avgtext+0avgdata 7044maxresident)k\\n0inputs+8outputs (0major+372minor)pagefaults 0swaps\\n\\\"\\r\\n\\r\\nMore information on troubleshooting is available at https://aka.ms/VMExtensionCSELinuxTroubleshoot \"; AKSTeam: NodeProvisioning, Retriable: false" } }
Kubernetes version: 1.14.8
Please help me resolve this issue.
What version of Ubuntu are you running on your nodes? From that error, I'm guessing Ubuntu 16.04 or older.
I'm not sure if it will work, but instead of trying to rotate certificates, can you try upgrading the nodes?
You might also want to consider just creating a new cluster, and using VMSS instead of VMAS.
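The `curl: option --proxy-insecure: is unknown` line in the error suggests the curl on the node is older than the provisioning script expects. As far as I know, `--proxy-insecure` was added in curl 7.52.0, and Ubuntu 16.04 ships an older curl, which would fit the guess above. A minimal sketch for checking this on a node (the `version_ge` helper is hypothetical, and the 7.52.0 threshold is my assumption, not something stated in the error):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 >= version $2.
# Relies on GNU `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: --proxy-insecure first appeared in curl 7.52.0.
required="7.52.0"

if command -v curl >/dev/null 2>&1; then
  # First line of `curl --version` looks like "curl 7.58.0 (x86_64-...) ..."
  installed="$(curl --version | awk 'NR==1 {print $2}')"
  if version_ge "$installed" "$required"; then
    echo "curl $installed should accept --proxy-insecure"
  else
    echo "curl $installed predates $required; --proxy-insecure will be rejected"
  fi
fi

# Node OS image and kernel, as seen from the cluster (if the API server is reachable):
#   kubectl get nodes -o wide
```

If the installed curl turns out to be too old, that points back to the node image, which is why upgrading the nodes or recreating the cluster is suggested rather than retrying the rotation.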
Can someone please help me with this problem?
error [connectors/v2/FabricGateway] Failed to perform query transaction [ReadAsset] using arguments [2_4], with error: Error: error in simulation: failed to execute transaction 9ca49b08603ab086104fec8777546bbbc24d826a3900136b4a0e66aadf4bb6e4: could not launch chaincode basic_1:9820659c595e662a849033ca23b4424e87a126e8f40b5f81ace59820b81fe8e7: chaincode registration failed: error starting container: error starting container: API error (404): network _test not found
The report has been generated, but all the transactions have failed.
It looks like the chaincode's Docker container failed to start for some reason. You will need to use the docker logs command to inspect the logs for the failure reason. Use the docker ps -a command to see what containers are available, including stopped / failed containers. Both the chaincode container (if it exists) and peer container logs may hold useful information.
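The inspection described above could be sketched as follows. `show_failed_container_logs` is a hypothetical helper name; the `docker` subcommands it calls are standard ones. It also lists the Docker networks, since the error complains that a network named `_test` was not found.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last log lines of every exited container,
# then list the Docker networks that currently exist.
show_failed_container_logs() {
  local id
  for id in $(docker ps -a --filter "status=exited" --format '{{.ID}}'); do
    echo "== container $id =="
    docker logs --tail 50 "$id" 2>&1
  done
  echo "== networks =="
  docker network ls
}

# Usage (on the host running the peers):
#   show_failed_container_logs
```

A chaincode container that never started may not show up as exited at all, so a plain `docker ps -a` and the peer container's own logs are still worth a look.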
I have deployed devstack for my OpenStack using the default configuration and am trying to deploy KYPO. I am running ./create-base.sh and getting the following error:
[kypo-proxy-jump-stack]: CREATE_FAILED Resource CREATE failed: ResourceInError: resources.kypo-proxy-jump: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
[kypo-proxy-jump-stack.kypo-proxy-jump]: CREATE_FAILED ResourceInError: resources.kypo-proxy-jump: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
My devstack config (content of local.conf):
[[local|localrc]]
#Enable heat services
enable_service h-eng h-api h-api-cfn h-api-cw
[[local|localrc]]
#Enable heat plugin
enable_plugin heat https://opendev.org/openstack/heat
IMAGE_URL_SITE="https://download.fedoraproject.org"
IMAGE_URL_PATH="/pub/fedora/linux/releases/33/Cloud/x86_64/images/"
IMAGE_URL_FILE="Fedora-Cloud-Base-33-1.2.x86_64.qcow2"
IMAGE_URLS+=","$IMAGE_URL_SITE$IMAGE_URL_PATH$IMAGE_URL_FILE
There is a workaround: you need to reduce the kypo-proxy-jump's flavor.
Something like this:
openstack flavor create --ram 2048 --disk 10 --vcpus 1 standard.medium
However, check your OpenStack resources and logs; the host is probably short of some resource (disk, memory, or CPU).
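To compare what the flavor asks for with what the hypervisor actually has, something like the following may help. This is only a sketch: `show_capacity` is a hypothetical helper, `standard.medium` is the flavor name from the command above, and the columns you get back depend on your OpenStack release.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show a flavor's requirements next to hypervisor capacity.
show_capacity() {
  local flavor="${1:-standard.medium}"
  echo "== flavor $flavor =="
  openstack flavor show "$flavor" -c ram -c disk -c vcpus
  echo "== hypervisors =="
  openstack hypervisor list --long
}

# Usage (with admin credentials loaded, e.g. `source openrc admin admin` in devstack):
#   show_capacity standard.medium
```

If the flavor needs more RAM, disk, or vCPUs than any hypervisor has free, the scheduler reports "No valid host was found".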
I have an AKS cluster in a DEV environment which was working fine. Today I noticed that some pods that were being removed/uninstalled via Helm were stuck in the Terminating state.
I found out that none of the 3 nodes are ready. When I stopped the cluster and started it again, the VMs failed to create in the VMSS with this message:
VM has reported a failure when processing extension 'vmssCSE'. Error message: "Enable failed: failed to execute command: command terminated with exit status=50
From what I have found, it looks like the VMs in the scale set are missing outbound internet connectivity; however, the associated NSG has only the default rules.
When inspecting the VMSS status, it says the following:
VM has reported a failure when processing extension 'vmssCSE'. Error message: "Enable failed: failed to execute command: command terminated with exit status=50 [stdout] [stderr] nc: connect to mcr.microsoft.com port 443 (tcp) failed: Connection timed out Command exited with non-zero status 1 0.00user 0.00system 2:10.07elapsed 0%CPU (0avgtext+0avgdata 2360maxresident)k 0inputs+8outputs (0major+113minor)pagefaults 0swaps " More information on troubleshooting is available at https://aka.ms/VMExtensionCSELinuxTroubleshoot
The linked troubleshooting guide doesn't seem helpful, as it states:
When restricting egress traffic from an AKS cluster, there are required and optional recommended outbound ports / network rules and FQDN / application rules for AKS. If your settings are in conflict with any of these rules, certain kubectl commands won't work correctly. You may also see errors when creating an AKS cluster.
Verify that your settings aren't conflicting with any of the required or optional recommended outbound ports / network rules and FQDN / application rules.
But the default rules have not changed, so I'm lost at this point.
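The extension output shows the provisioning script failing on a TCP connection to mcr.microsoft.com on port 443. The same check can be repeated by hand from a node, or from a VM in the same subnet, to confirm whether egress is really blocked. A minimal sketch; `check_egress` is a hypothetical helper and the 5-second timeout is arbitrary:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a TCP connection to host:port succeeds.
check_egress() {
  local host="$1" port="$2"
  if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
    echo "$host:$port reachable"
  else
    echo "$host:$port NOT reachable"
    return 1
  fi
}

# Usage (run on a node or a VM in the same subnet):
#   check_egress mcr.microsoft.com 443
```

If the connection times out even though the NSG has only default rules, other parts of the outbound path (route tables, a firewall, DNS resolution, or the cluster's outbound configuration) may be worth checking.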
I am fairly new to Linux (and brand new to Chef) and I have run into an issue when setting up my Chef server. I am trying to create an admin user with the command
sudo chef-server-ctl user-create admin Admin Ladmin admin#example.com examplepass -f admin.pem
but I keep getting this error:
ERROR: Connection refused connecting...
ERROR: Connection refused connecting to https://127.0.0.1/users/, retry 5/5
ERROR: Network Error: Connection refused - Connection refused
connecting to https://..., giving up
Check your knife configuration and network settings
I also noticed that when I ran chef-server-ctl I got this output:
[2016-12-21T13:24:59-05:00] ERROR: Running exception handlers Running
handlers complete
[2016-12-21T13:24:59-05:00] ERROR: Exception
handlers complete Chef Client failed. 0 resources updated in 01 seconds
[2016-12-21T13:24:59-05:00] FATAL: Stacktrace dumped to
/var/opt/opscode/local-mode-cache/chef-stacktrace.out
[2016-12-21T13:24:59-05:00] FATAL: Please provide the contents of the
stacktrace.out file if you file a bug report
[2016-12-21T13:24:59-05:00] FATAL:
Chef::Exceptions::CannotDetermineNodeName: Unable to determine node
name: configure node_name or configure the system's hostname and fqdn
I read that this error is caused by a missed prerequisite, but I'm not sure what that means or how to fix it. Any input would be greatly appreciated.
Your server does not have a valid FQDN (aka full host name). You'll have to fix this before installing Chef server.
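A quick way to check this is to see what the machine reports as its fully qualified name. The sketch below rests on a few assumptions: `is_fqdn` is a hypothetical helper that only tests for a dot in the name, and `chef.example.com` is a placeholder hostname.

```shell
#!/usr/bin/env bash
# Hypothetical helper: treat a name as an FQDN if it has at least one dot
# with non-empty text on both sides (a rough check, not full validation).
is_fqdn() {
  case "$1" in
    ?*.?*) return 0 ;;
    *)     return 1 ;;
  esac
}

name="$(hostname -f 2>/dev/null || hostname)"
if is_fqdn "$name"; then
  echo "FQDN looks fine: $name"
else
  echo "'$name' is not a fully qualified name"
fi

# One possible fix on a systemd-based distro (placeholder name):
#   sudo hostnamectl set-hostname chef.example.com
#   # make sure /etc/hosts maps the server's IP to chef.example.com
#   sudo chef-server-ctl reconfigure
```

Once `hostname -f` returns a full name, rerunning the reconfigure step and then the user-create command should show whether the "Connection refused" errors were a side effect of the failed configuration.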
I am trying to bootstrap an EC2 instance using knife against a Chef server hosted on the Azure Marketplace. The chef-client run fails with the following error:
PS C:\Users\xyz\chef-repo> knife ec2 server create -I ami-25c00c46 -f t2.micro
--region ap-southeast-1 -N ec2module
-x ubuntu -i abc.pem -r "role[ec2], role[jenkinserver]" -g sg-9f1b31fa sudo
.ap-southeast-1.compute.amazonaws.com Chef encountered an error attempting to create the client "ec2module"
.ap-southeast-1.compute.amazonaws.com Running handlers:
.ap-southeast-1.compute.amazonaws.com [2016-01-20T11:39:26+00:00] ERROR: Running exception handlers
.ap-southeast-1.compute.amazonaws.com Running handlers complete
.ap-southeast-1.compute.amazonaws.com [2016-01-20T11:39:26+00:00] ERROR: Exception handlers complete
.ap-southeast-1.compute.amazonaws.com Chef Client failed. 0 resources updated in 03 seconds
.ap-southeast-1.compute.amazonaws.com [2016-01-20T11:39:26+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
.ap-southeast-1.compute.amazonaws.com [2016-01-20T11:39:26+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
.ap-southeast-1.compute.amazonaws.com [2016-01-20T11:39:26+00:00] ERROR: undefined method `length' for nil:NilClass
.ap-southeast-1.compute.amazonaws.com [2016-01-20T11:39:26+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
There seems to be a problem with the server SSL certificate and perhaps it is related to Chef issue #4301 (read it).
Try downloading the SSL certificate from the Chef Server:
> knife ssl fetch
Then, you can check it with:
> knife ssl check
I hope this helps.