Issue with load balancing in Openshift - python-3.x

I have deployed a Python API into OCP with 3 pod replicas. All the incoming requests seem to be going to only one pod while the other 2 are idle all the time.
The configuration I have is:
haproxy.router.openshift.io/timeout: 1800s
haproxy.router.openshift.io/balance: roundrobin
haproxy.router.openshift.io/disable_cookies: "true"
I need help to resolve this issue.
I tried changing balance above with leastconn and roundrobin. I don't see any difference.

I found the fix for my issue. Actually, I was making the API requests to these pods from another pod in the same namespace. I used the name of the service in my URL instead of using the OCP URL in the API call, e.g. http://ocpservicename:port

Related

How does Application Gateway prevent requests being sent to recently terminated pods?

I'm currently researching and experimenting with Kubernetes in Azure. I'm playing with AKS and the Application Gateway ingress. As I understand it, when a pod is added to a service, the endpoints are updated and the ingress controller continuously polls this information. As new endpoints are added AG is updated. As they're removed AG is also updated.
As pods are added there will be a small delay whilst that pod is added to the AG before it receives requests. However, when pods are removed, does that delay in update result in requests being forwarded to a pod that no longer exists?
If not, how does AG/K8S guarantee this? What behaviour could the end client potentially experience in this scenario?
Azure Application Gateway ingress is an ingress controller for your Kubernetes deployment which allows you to use the native Azure Application Gateway to expose your application to the internet. Its purpose is to route the traffic to pods directly. At the same time, all questions about pod availability, scheduling and, generally speaking, management are up to Kubernetes itself.
When a pod receives a command to be terminated, it doesn't happen instantly. Right after, kube-proxies will update iptables to stop directing traffic to the pod. Also, there may be ingress controllers or load balancers forwarding connections directly to the pod (which is the case with an application gateway). It's impossible to solve this issue completely, while adding a 5-10 second delay can significantly improve the user experience.
If you need to terminate or scale down your application, you should consider the following steps:
1. Wait for a few seconds and then stop accepting connections
2. Close all keep-alive connections not in the middle of a request
3. Wait for all active requests to finish
4. Shut down the application completely
Here are the exact Kubernetes mechanics which will help you to resolve your questions:
preStop hook - this hook is called immediately before a container is terminated. This is very helpful for graceful shutdowns of an application. For example, a simple sh command with "sleep 5" in a preStop hook can prevent users from seeing "Connection refused" errors. After the pod receives an API request to be terminated, it takes some time to update iptables and let an application gateway know that this pod is out of service. Since the preStop hook is executed prior to the SIGTERM signal, it will help to resolve this issue.
(example can be found in attach lifecycle event)
readiness probe - this type of probe always runs on the container and defines whether the pod is ready to accept and serve requests or not. When the container's readiness probe returns success, it means the container can handle requests and it will be added to the endpoints. If a readiness probe fails, the pod is not capable of handling requests and it will be removed from the endpoints object. It works very well with newly created pods when an application takes some time to load, as well as for already running pods if an application takes some time for processing.
Before the pod is removed from the endpoints, the readiness probe should fail several times. It's possible to lower this number to only one failure using the failureThreshold field; however, it still needs to detect one failed check.
(additional information on how to set it up can be found in configure liveness readiness startup probes)
startup probe - for some applications which require additional time on their first initialisation, it can be tricky to set up the readiness probe parameters correctly and not compromise a fast response from the application.
Using the failureThreshold * periodSeconds fields will provide this flexibility.
terminationGracePeriod - may also be considered if an application requires more than the default 30 seconds delay to gracefully shut down (e.g. this is important for stateful applications)
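
Added example (not part of the original answer): the four shutdown steps listed above, implemented in a Node.js HTTP server together with the endpoint a readiness probe would poll. The helper name createGracefulServer, the /healthz path and the 5-second default delay are assumptions made for this sketch.

```javascript
const http = require('http');

// Hypothetical helper: wraps a request handler with the shutdown steps above.
// The '/healthz' path and the 5 s default delay are assumptions.
function createGracefulServer(handler, { drainDelayMs = 5000 } = {}) {
  let ready = true;      // what the readiness probe sees
  let inFlight = 0;      // application requests still being served
  let onDrained = null;  // resolver for "last active request finished"

  const server = http.createServer((req, res) => {
    if (req.url === '/healthz') {
      // Readiness probe target: 503 as soon as shutdown begins, so the pod
      // leaves the endpoints while it can still serve late requests.
      res.writeHead(ready ? 200 : 503, { Connection: 'close' });
      return res.end();
    }
    inFlight++;
    res.on('close', () => {
      if (--inFlight === 0 && onDrained) onDrained();
    });
    // Late requests are answered, but the proxy is told not to reuse the socket.
    if (!ready) res.setHeader('Connection', 'close');
    handler(req, res);
  });

  // closeIdleConnections() exists from Node 18.2; older versions skip it.
  const closeIdle = () => {
    if (typeof server.closeIdleConnections === 'function') server.closeIdleConnections();
  };

  async function shutdown() {
    ready = false;
    // Step 1: keep serving while iptables and the gateway catch up, then stop accepting.
    await new Promise((resolve) => setTimeout(resolve, drainDelayMs));
    const closed = new Promise((resolve) => server.close(resolve));
    // Step 2: close keep-alive connections that are not mid-request.
    closeIdle();
    // Step 3: wait for all active requests to finish.
    if (inFlight > 0) await new Promise((resolve) => { onDrained = resolve; });
    closeIdle(); // sockets that went idle once their response completed
    // Step 4: resolves when every connection is gone; the caller can then exit.
    await closed;
  }

  return { server, shutdown, isReady: () => ready, inFlight: () => inFlight };
}

// Typical wiring (not executed here):
//   process.on('SIGTERM', () => app.shutdown().then(() => process.exit(0)));
```

If a preStop "sleep" hook already provides the delay, drainDelayMs can be set to 0; in either case the total must stay below the pod's termination grace period.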

AWS EBS runs into "504 Gateway Time-out"

I'm new to using AWS EBS and ECS, so please bear with me if I ask questions that might be obvious for others. To the issue:
I've got a single-container Node/Express application that runs on EBS. The local docker container works as expected. On EBS, I can access one endpoint of the API and get the expected output. For the second endpoint, which runs longer (around 10-15 seconds), I get no response and run into a timeout after 60 seconds: "504 Gateway Time-out".
I wonder how I would approach debugging this as I can't connect to the container directly? Currently there isn't any debugging functionality included in the code either, as I'm not sure what the best Node approach for an EBS container is - any recommendations are highly appreciated.
Thank you in advance!
You can see the EC2 instances running on EBS in your AWS, and you can choose to give them IP addresses in your EBS options. That will let you SSH directly into them if you need to.
Otherwise check the keepAliveTimeout field in your server (the value returned by app.listen() if you're using Express).
I got a decent number of 504s when my Node server timeout was less than my load balancer timeout.
Your application takes longer than expected (> 60 seconds) to respond, so either nginx or the Load Balancer terminates your request.
See my answer here

scalable architecture with node.js

As a part of my next assignment, I need to prepare a scalable and fully concurrency-supporting Node architecture. I am confused by the kubernetes/containers concept and really need some help. And I cannot use any paid service! Just plain raw DO servers and load balancers.
Basically, a basic sketch/idea/explanation/pointers to the architecture, which should explain API endpoints, data service connectivity and data flows between database, server and client, is needed!
Here is what I have in my mind:
Client <-> NginX -> Nodejs <-> MongoDB
So above is a standard setup for nodejs based REST APIs I believe. Now how to add scalability to this and concurrency?
Any help would be appreciated!
Let me give you a quick overview and after that just ask more questions in the comments of my answer if you need to know more.
You need a docker image of all your services:
You will need an nginx image which contains your frontend code. (https://serversforhackers.com/c/dckr-nginx-image)
You will need a docker image which contains your backend code.
(https://nodejs.org/en/docs/guides/nodejs-docker-webapp/)
You will need a simple mongo-db base image.
(https://medium.com/#pablo_ezequiel/creating-a-docker-image-with-mongodb-4c8aa3f828f2)
Now for beginners I would go to Google Cloud Platform and set up a managed kubernetes cluster. This is done in 1 minute and you will have a fully functional kubernetes environment. (https://cloud.google.com/kubernetes-engine/docs/quickstart) - In the first year you will have 300$ for free usage. So this is more than enough to play around and set up an environment for your assignment.
Now you will need an Ingress API. The Ingress is the only access point to the Services you will later deploy on your cluster. Let's say your Ingress is listening to 14.304.233. When you write 14.304.233/customerBackend, it will redirect this request to the customerBackend Service (you need to define this of course). More information here: https://kubernetes.io/docs/concepts/services-networking/ingress/#what-is-ingress
Now you need to deploy the images you created. In Kubernetes you have the concept of Pods (see here: https://kubernetes.io/docs/concepts/workloads/pods/pod/). Normally in each Pod there runs only one container. Each Pod-Group (e.g. all Pods which have a Node container inside) has one so-called Service, which manages the access to the pods. Let's say you want to have 3 instances of your NodeJS backend. Each of the 3 containers will run in an individual pod. If you want to send a request to the backend, it will go through the Service, which then redirects the request to one of the pods. When you need to scale, you simply deploy more pods. The Service automatically balances the load over the deployed pods.
How many pods you want to have deployed is defined in a so-called deployment.yaml
(see: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/).
This is very similar to a docker-compose.yaml but with some more attributes you can configure.

How to paginate logs from a Kubernetes pod?

I have a service that displays logs from pods running in my Kubernetes cluster. I receive them via k8s /pods/{name}/log API. The logs tend to grow big so I'd like to be able to paginate the response to avoid loading them whole every time. Result similar to how Kubernetes dashboard displays logs would be perfect.
This dashboard however seems to solve the problem by running a separate backend service that loads the logs, chops them into pieces and prepares for the frontend to consume.
I'd like to avoid that and use only the API with its query parameters like limitBytes and sinceSeconds but those seem to be insufficient to make proper pagination work.
Does anyone have a good solution for that? Or maybe know if k8s plans to implement pagination in logs API?

Azure WebSites / App Service Unexplained 502 errors

We have a stateless (with shared Azure Redis Cache) WebApp that we would like to automatically scale via the Azure auto-scale service. When I activate the auto-scale-out, or even when I activate 3 fixed instances for the WebApp, I get the opposite effect: response times increase exponentially or I get Http 502 errors.
This happens whether I use our configured traffic manager url (which worked fine for months with single instances) or the native url (.azurewebsites.net). Could this have something to do with the traffic manager? If so, where can I find info on this combination (having searched)? And how do I properly leverage auto-scale with traffic-manager failovers/perf? I have tried putting the traffic manager in both failover and performance mode with no evident effect. I can gladly provide links via private channels.
UPDATE: We have reproduced the situation now the "other way around": On the account where we were getting the frequent 5XX errors, we have removed all load balanced servers (only one server per app now) and the problem disappeared. And, on the other account, we started to balance across 3 servers (no traffic manager configured) and soon got the frequent 502 and 503 show stoppers.
Related hypothesis here: https://ask.auth0.com/t/health-checks-response-with-500-http-status/446/8
Possibly the cause? Any takers?
UPDATE
After reverting all WebApps to single instances to rule out any relationship to load balancing, things ran fine for a while. Then the same "502" behavior reappeared across all servers for a period of approx. 15 min on 04.Jan.16, then disappeared again.
UPDATE
Problem reoccurred for a period of 10 min at 12.55 UTC/GMT on 08.Jan.16 and then disappeared again after a few min. Checking logfiles now for more info.
UPDATE
Problem reoccurred for a period of 90 min at roughly 11.00 UTC/GMT on 19.Jan.16 also on .scm. page. This is the "reference-client" Web App on the account with a Web App named "dummy1015". "502 - Web server received an invalid response while acting as a gateway or proxy server."
I don't think Traffic Manager is the issue here. Since Traffic Manager works at the DNS level, it cannot be the source of the 5XX errors you are seeing. To confirm, I suggest the following:
Check if the increased response times are coming from the DNS lookup or from the web request.
Introduce Traffic Manager whilst keeping your single instance / non-load-balanced set up, and confirm that the problem does not re-appear
This will help confirm if the issue relates to Traffic Manager or some other aspect of the load-balancing.
Regards,
Jonathan Tuliani
Program Manager
Azure Networking - DNS and Traffic Manager
