We have a k8s deployment of several services including Apache Spark. All services seem to be operational. Our application connects to the Spark master to submit a job using the k8s DNS service for the cluster where the master is called spark-api so we use master=spark://spark-api:7077 and we use spark.submit.deployMode=cluster. We submit the job through the API not by the spark-submit script.
This will run the "driver" and all "executors" on the cluster and this part seems to work but there is a callback to the launching code in our app from some Spark process. For some reason it is trying to connect to harness-64d97d6d6-4r4d8, which is the pod ID, not the k8s cluster IP or DNS.
How could this pod ID be getting into the system? Spark somehow seems to think it is the address of the service that called it. Needless to say any connection to the k8s pod ID fails and so does the job.
Any idea how Spark could think the pod ID is an IP address or DNS name?
BTW if we run a small sample job with master=local all is well, but the same job executed with the above config tries to connect to the spurious pod ID.
BTW2: the k8s DNS for the calling pod is harness-api
You can consider to use Headless service for harness-64etcetc Pod in order to accomplish backward DNS discovery. Actually, it will create particular endpoint for the relevant service by matching appropriate selector inside your application Pod and as result A record expects to be added into Kubernetes DNS configuration.
Eventually, I've found related #266 Github issue, which probably can bring some useful information for further investigation.
Related
KOPS lets us create a Kubernetes cluster along with a bastion that has ssh access to the cluster nodes
With this setup is it still considered safe to use kubectl to interact with the Kubernetes API server?
kubectl can also be used to interact with shell on the pods? Does this need any restrictions?
What are the precautionary steps that need to be taken if any?
Should the Kubernetes API server also be made accessible only through the bastion?
Deploying a Kubernetes cluster with the default Kops settings isn’t secure at all and shouldn’t be used in production as such. There are multiple configuration settings that can be done using kops edit command. Following points should be considered after creating a Kubnertes Cluster via Kops:
Cluster Nodes in Private Subnets (existing private subnets can be specified using --subnets with the latest version of kops)
Private API LoadBalancer (--api-loadbalancer-type internal)
Restrict API Loadbalancer to certain private IP range (--admin-access 10.xx.xx.xx/24)
Restrict SSH access to Cluster Node to particular IP (--ssh-access xx.xx.xx.xx/32)
Hardened Image can also be provisioned as Cluster Nodes (--image )
Authorization level must be RBAC. With latest Kubernetes version, RBAC is enabled by default.
The Audit logs can be enabled via configuration in Kops edit cluster.
kubeAPIServer:
auditLogMaxAge: 10
auditLogMaxBackups: 1
auditLogMaxSize: 100
auditLogPath: /var/log/kube-apiserver-audit.log
auditPolicyFile: /srv/kubernetes/audit.yaml
Kops provides reasonable defaults, so the simple answer is : it is reasonably safe to use kops provisioned infrastructure as is after provisioning.
We have a cluster which is built by docker swarm
Cluster consists of 1 Manager 3 Worker nodes.
it can be seen as follow:
and we have run Apache Spark on the cluster. It consists of a master and four workers. It is seen as follow on master web ui
The problem is that I can not access the details of worker node. It wants to connect to an ip(10.0.0.5:8081). But I can not access the link from my local machine.
you need to bind the port of the spark webui service and access the webui using localhost:8081 (if you are binding the localport 8081)
example in docker-compose.yml file
in the spark webui service put something like this
https://docs.docker.com/compose/compose-file/#ports
the ip you specified(10.0.0.5) is the subnet created by docker you cannot access using that ip from your machine
I am migrating from GCP to Azure. My use case is simple: I have a k8s cluster running some web crawlers which needs to talk to Elastic and Cassandra clusters (not in the k8s cluster) using internal IPs. All of these components can be in the same Azure Region (e.g: East US). I understand from this discussion that VNET peering is the way to go.
This solution did not work for me. I am still unable to reach my Cass/ES cluster from the pods. I believe this solution is outdated, is there some other approach to accomplish this, that I am missing ?
We can use Azure route table to achieve that, then we can ping pod IP address out of your k8s Vnet.
I have answered your another question here, please check it.
If you would like further assistance, please let me know:)
Some of my data is in Mongo replicas that are hosted in docker containers running in kubernetes cluster. I need to access this data from the AWS lambda that is running in the same VPC and subnet (as the kubernetes minions with mongo db). lambda as well as the kubernetes minions (hosting mongo containers) are run under the same security group. I am trying to connect using url "mongodb://mongo-rs-1-svc,mongo-rs-2-svc,mongo-rs-3-svc/res?replicaSet=mongo_rs" where mongo-rs-x-svc are three kubernetes services that enables access to the appropriate replicas. When I try to connect using this url, it fails to resolve the mongo replica url (e.g. mongo-rs-2-svc). Same URL works fine for my web service that is running in its own docker container in the same kubernetes cluster.
Here is the error I get from mongo client that I use...
{\"name\":\"MongoError\",\"message\":\"failed to connect to server [mongo-rs-1-svc:27017] on first connect [MongoError: getaddrinfo ENOTFOUND mongo-rs-1-svc mongo-rs-1-svc:27017]\"}". I tried replacing mongo-rs-x-svc to their internal ip addresses in the url. In this case the above name resolution error disappeared but got another error - {\"name\":\"MongoError\",\"message\":\"failed to connect to server [10.0.170.237:27017] on first connect [MongoError: connection 5 to 10.0.170.237:27017 timed out]\"}
What should I be doing to enable this access successfully?
I understand that I can use the webservice to access this data as intermediary but since my lambda is in VPC, I have to deploy NAT gateways and that would increase the cost. Is there a way to access the webservice using the internal endpoint instead of public url? May be that is another way to get data.
If any of you have a solution for this scenario, please share. I went through many threads that showed up as similar questions or in search results but neither had a solution for this case.
This is a common confusion with Kubernetes. The Service object in Kubernetes is only accessible from inside Kubernetes by default (i.e. when type: ClusterIP is set). If you want to be able to access it from outside the cluster you need to edit the service so that it is type: NodePort or type: LoadBalancer.
I'm not entirely sure, but it sounds like your network setup would allow you to use type: NodePort for your Service in Kubernetes. That will open a high-numbered port (e.g. 32XXX) on each of the Nodes in your cluster that forwards to your Mongo Pod(s). DNS resolution for the service names (e.g. mongo-rs-1-svc) will only work inside the Kubernetes cluster, but by using NodePort I think you should be able to address them as mongodb://ec2-instance-1-ip:32XXX,ec2-instance-2-ip:32XXX,....
Coreyphobrien's answer is correct. Subsequently you were asking for how to keep the exposure private. For that I want to add some information:
You need to make the Lambdas part of your VPC that your cluster is in. For this you use the --vpc-config parameter when creating the lambdas or updating. This will create a virtual network interface in the VPC that allows the Lambda access. For Details see this.
After that you should be able to set the AWS security group for your instances so that the NodePort will only be accessible from another security group that is used for your Lambdas network interface.
This blog discusses an example in more detail.
I have a Spark standalone master running on a machine in the AWS VPC binding to the private IP address. I'm able to run spark jobs from the machine inside the VPC but not from my laptop that is connecting to the cluster via VPN. I checked executor logs on a spark worker and got Cannot receive any reply in 120 seconds.. Looks like networking issue.
Anybody knows how to solve that?