AWS Linux sudo: unable to resolve host - linux

I have a problem when launching any instance (from an AMI) within a particular VPC. Every time I run a "sudo" command, I get the following:
sudo: unable to resolve host ip-xxx-xx-x-xxx
I have seen many posts on the internet about this and they all mention the following two solutions:
Manually edit the /etc/hosts file on every server. This is not practical (especially when using Auto Scaling to generate new instances), and I shouldn't have to do it, since I don't need to for instances in my other VPCs.
Turn on "DNS hostnames" on the VPC. I have done this and it hasn't made a difference. I have gone through all the settings on the VPC (route tables, subnets, etc.) and they are set exactly the same way as on a "working" VPC in my account. Still no luck.
Any help here?

This appears to be an AWS-specific problem that others have run into before. It stems from enableDnsHostnames not being enabled in your VPC configuration.
Here is a link to the AWS documentation that talks about this.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-dns.html#vpc-dns-updating

This is a really late answer, and I'm not sure if it's still relevant for newer EC2 instances, but I've just taken over an account running Ubuntu 16, and I solved this annoyance by simply adding the DNS hostname to the hosts file. Presumably this could be automated for auto-scaling if desired. It's as simple as (run as root):
echo 127.0.0.1 ip-10-x-x-x >> /etc/hosts
Or any variant using tee, sed, or your Unix tool of choice. You could include the IPv6 version as well if you really want to be thorough, but IPv4 was enough to stop the warning in my case.
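
For auto-scaled instances, the same fix could be scripted so that it is safe to run more than once. The following is a minimal sketch rather than a tested production script: the function name add_host_entry is made up for illustration, and it assumes the caller has write access to the hosts file (for example, running as root at boot).

```shell
# add_host_entry FILE HOSTNAME (hypothetical helper)
# Appends "127.0.0.1 HOSTNAME" to FILE unless a non-comment line
# in FILE already lists HOSTNAME as one of its names.
add_host_entry() {
    hosts_file=$1
    host_name=$2
    # awk exits 0 if the hostname is already present, non-zero otherwise
    # (including when the file does not exist yet).
    if ! awk -v h="$host_name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit !found }
        ' "$hosts_file" 2>/dev/null; then
        printf '127.0.0.1 %s\n' "$host_name" >> "$hosts_file"
    fi
}
```

On an instance it might be called as: add_host_entry /etc/hosts "$(hostname)"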

Related

Bridge to Kubernetes doesn't add entries in /etc/hosts

I need help with the Bridge to Kubernetes setup in my Linux (WSL) environment.
The debug session starts as expected, but it doesn't change my /etc/hosts, so I can't connect to the other services in the cluster.
I believe the issue may be related to insufficient permissions, and I can't find endpointManager running in Linux.
https://learn.microsoft.com/en-us/visualstudio/bridge/overview-bridge-to-kubernetes#additional-configuration
Any idea what this could be related to?

AKS with Static IP and Custom Cert / AKS Ingress issues

Well, for the last 2 days I battled this documentation:
https://learn.microsoft.com/en-au/azure/aks/static-ip
and
https://learn.microsoft.com/en-au/azure/aks/ingress-own-tls
First of all, I made sure my AKS k8s cluster was upgraded to 1.11.5, so there is no question about having the static IP in a different resource group.
Overall, I could not get the static IP working. With a dynamic IP everything seems fine, but I cannot add an A record for a dynamic IP.
I managed to deploy everything successfully, but any curl against the IP does not work. I did run exec -ti locally, and locally everything is fine.
Could someone please point me to a GitHub config or article that has this configuration running? As a disclaimer, I know Azure very well, so the service principal assignments are done correctly, etc. However, I am new to k8s, with only a few months of experience.
Thanks in advance for any suggestion.
I can share logs if needed, but I believe I checked everything from DNS to ingress routes. I am worried that this doc is not good and I am just losing my time.

Answering my own question, after quite a journey, for when I get older and forget what I've done, and so that maybe my nephew will save some hours someday.
First, the important part:
In the values provided to the nginx-ingress chart, there are two settings that are important (the first is a Service annotation, the second a Service setting):
service.beta.kubernetes.io/azure-load-balancer-resource-group: "your IP's resource group"
externalTrafficPolicy: "Local"
Here are all the values documented: https://github.com/helm/charts/blob/master/stable/nginx-ingress/values.yaml
The chart can be deployed near your service's namespace; it should not be in kube-system (with my current knowledge I don't see a reason to have it there).
Second, something that could be misleading:
There is a delay of ~30+ seconds (in my case) between the moment the IP appears in kubectl get services --watch and the moment curl -i IP is able to answer the call. So, if you have automation or health probes, make sure you add 1-2 minutes of waiting time. Or maybe use better nodes, such as bare-metal machines.
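
For automation, that wait could be a bounded retry loop instead of a fixed sleep. This is only a sketch under stated assumptions: wait_until is a hypothetical helper name, and the probe command, number of tries, and delay are placeholders to adapt.

```shell
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...] (hypothetical helper)
# Runs COMMAND until it succeeds, at most MAX_TRIES times, sleeping
# DELAY_SECONDS between attempts. Returns 0 on success, 1 if every
# attempt failed.
wait_until() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$attempt" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example, wait_until 24 5 curl -s -o /dev/null --max-time 5 "http://$STATIC_IP/" would probe every 5 seconds for up to about 2 minutes, where STATIC_IP is a variable you set yourself.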
Look at the GCE and DO guides for the same setup, as they might help:
https://cloud.google.com/community/tutorials/nginx-ingress-gke
https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes
The guys at DO are good writers as well.
Good luck!

Based on your comments, it seems that you are trying to override externalIPs while keeping the Helm chart's default value for controller.service.type, which is LoadBalancer. What you might want to do instead is keep controller.service.type as LoadBalancer and set controller.service.loadBalancerIP to your static IP, rather than overriding externalIPs.
Here is some documentation from Microsoft.
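
Putting the settings named in these answers together, a values file for the stable/nginx-ingress chart might be generated as sketched below. The function name ingress_values is hypothetical, and the IP address and resource group in the usage line are placeholders; check the key names against the chart's values.yaml for the version you deploy.

```shell
# ingress_values STATIC_IP IP_RESOURCE_GROUP (hypothetical helper)
# Prints a Helm values snippet that keeps the service type as
# LoadBalancer, pins the static IP, and tells Azure which resource
# group holds that IP.
ingress_values() {
    cat <<EOF
controller:
  service:
    type: LoadBalancer
    loadBalancerIP: "$1"
    externalTrafficPolicy: "Local"
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-resource-group: "$2"
EOF
}
```

The output could be saved with ingress_values 203.0.113.10 myResourceGroup > values.yaml and then passed to helm install with -f values.yaml.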

Jenkins error trying to raise on-demand linux ec2 slave

Whenever I try to trigger a job that depends on that EC2 slave, it just sits in the queue. I looked at the logs and saw this exception:
com.amazonaws.services.ec2.model.AmazonEC2Exception: Network interfaces and an instance-level security groups may not be specified on the same request
Whenever I click on build executor status on the left, there is a button that says "provision via ". I click on it and see the correct amazon linux image name that I entered under cloud on Jenkins' System Configuration, but when I click on that, I see that same exception as well... I just don't know how to fix this and cannot find any helpful information on this.
Any help would be much appreciated.

OK, I'm not exactly sure what was causing the error, since I don't really know how the Jenkins plugin interfaces with the AWS API. But after a good amount of trial and error, I was able to provision the on-demand worker by adding more details/parameters in Configuration, under Cloud.
Adding a subnet ID for the VPC and an IAM instance profile did the trick (I already had everything else, including security groups, availability zone, instance type, etc.). So it seems like you either leave out security groups, or go all in and fill in pretty much everything.

As an FYI, if you see this with Jenkins EC2 Plugin v1.46, it looks like a genuine bug:
https://issues.jenkins-ci.org/browse/JENKINS-59543
The solution is to use 1.45 until it's fixed (see link above for more details).

Windows Active Directory Domain setup remotely through univention using samba4

I have a slight problem, but first a bit of the back story. Recently I've been testing Univention, a Linux distribution whose goal is to be able to replace Microsoft Active Directory.
I tested it locally and all went reasonably well after a few minor issues. I then decided to test it remotely, as the company wants to allow remote users to access it, so I used myhyve.com to host it. It has now been set up successfully and works reasonably well.
However, my main problem is DNS-based: when trying to connect to the domain, the only way Windows will recognize it is if I edit the network adapter and set the IPv4 DNS server address to the IP address of the server hosting the Univention Active Directory replacement. Although this allows everything to work, it's not ideal, and DNS lookups on the internet take considerably longer. I was wondering if anyone has any ideas, or has done something similar, encountered this problem before, and knows a workaround. I want to avoid setting up a VPN if possible.
After initially registering the computer on the domain, I am able to remove the DNS server address and just use a couple of amendments to the HOST file to keep it running, but this still sometimes leads to issues connecting to the domain controller and is not ideal. Any ideas and suggestions would be gratefully received.
Michael

For the HOST entries, the most likely issue is that a computer in the domain needs several service records. I'm not sure whether these can be provided via the HOST file or not, but you'll definitely have authentication issues if they are missing. To see the records your domain is using, run the following command on the UCS system:
/usr/share/univention-samba4/scripts/check_essential_samba4_dns_records.sh
For the slow resolution of the DNS records, there are several points where you could start looking. My first test would be whether or not you are using a forwarder for the web DNS requests, and whether or not the forwarder has a decent speed. To check if you are using one, type:
ucr search dns/forwarder
If you get a valid IP for any of the UCR variables dns/forwarder1, dns/forwarder2, or dns/forwarder3, you are forwarding your DNS requests to a different server. If all of them are empty or not valid IPs, then your server is doing the resolution itself.
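
To check a forwarder value from a script rather than by eye, a small validity test could look like the sketch below. The helper name is_ipv4 is an assumption for illustration, and it only covers IPv4 addresses.

```shell
# is_ipv4 VALUE (hypothetical helper)
# Succeeds if VALUE is four dot-separated decimal groups, each 0-255.
is_ipv4() {
    # Shape check: exactly four groups of one to three digits.
    printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' || return 1
    # Range check: split on dots and compare each group with 255.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}
```

On the UCS system it might be used as: is_ipv4 "$(ucr get dns/forwarder1)" && echo "forwarder1 is set"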
Not using a forwarder is often slow, as the DNS server's caching is optimized for AD operations, such as round-robin load balancing. Likewise, a number of ISPs require you to use a forwarder to minimize DNS traffic. You can simply define a forwarder using ucr; I use Google's IPv4 resolver for the example:
ucr set dns/forwarder1='8.8.8.8'
The other scenario might be a slow forwarder. To check, query the forwarder directly using the following command:
dig univention.com @$(ucr get dns/forwarder1)
If it takes a long time, there is nothing the UCS server can do; you'll simply have to choose a different forwarder using the ucr set command above.
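
To compare forwarders, the timing that dig reports can be pulled out of its output. This sketch assumes the usual ";; Query time: N msec" line in dig's default output; query_time_ms is a hypothetical helper name.

```shell
# query_time_ms (hypothetical helper)
# Reads dig output on stdin and prints the reported query time in
# milliseconds; prints nothing if no timing line is found.
query_time_ms() {
    awk '/^;; Query time:/ { print $4; exit }'
}
```

For example: dig univention.com @"$(ucr get dns/forwarder1)" | query_time_ms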
If neither of the above helps, the next step would be to check whether there are error messages for the named daemon in the syslog file. Normally these come when you are trying to manually remove software or if the firewall configuration got changed.
Kevin
Sponsored post, as I work for Univention North America, Inc.

Install Neo4j on Azure, cannot browse WebAdmin

I've just installed Neo4j 1.8.2 onto Azure by following this step-by-step process...
http://de.slideshare.net/neo4j/neo4j-on-azure-step-by-step-22598695
Unfortunately, when I browse to http://:7474/webadmin Fiddler says Error 10061 - No connection could be made because the target machine actively refused it.
I've followed the instructions exactly and haven't received any errors.
Any help much appreciated.

So, I think I got to the bottom of this. I think it was due to the size of the compute instance / VM I was creating. It looks like the problem occurs when running on Extra Small instances. I created a new installation using a Small instance and everything now works :).

Try setting the server to accept connections from all hosts, and maybe use a newer Neo4j, say 1.9.4:
http://docs.neo4j.org/chunked/stable/security-server.html#_secure_the_port_and_remote_client_connection_accepts

The way the VM Depot image is set up, it's pre-configured to allow all hosts to connect, and the Neo4j server will auto-start. The only thing you need to take care of when constructing your VM is to open an Input Endpoint, with any public port you want (preferably 7474, to stay true to Neo4j) and internal port 7474.
Note that the UI has changed a bit since the how-to was published: you can specify the endpoint as the last step before creating your virtual machine. Other than that, the instructions should be the same. Once the VM is up and running (it'll take about 5-10 minutes), you just visit http://yourservicename.cloudapp.net:7474 and you should see the web admin. Note: this is not the same as your VM name. If you named your VM something like 'neo', then you do not want http://neo:7474 or http://neo.cloudapp.net:7474. You need to use your cloud service name (you had to create a name for the service when you deployed the VM).
I've deployed that image several times in demos, and just tried again right now to make sure nothing wonky happened. Worked perfectly.
