It’s not your fault.
If your virtual network in Azure is having internal communication problems between virtual machines, I know why, and you are not to blame.
Recently, an Azure virtual network I have been using as a test system, started getting wonky like yours.
Each day I would complete my work and de-provision the virtual machines in an effort to manage costs. Each morning I would start machines and begin my day’s effort.
Then one day… BOOM!
Everything went haywire.
I thought I was going nuts. Hadn’t it worked just yesterday?!
Some of my machines simply would not communicate with any of the others. I ensured that all firewalls were completely disabled, and Internet Explorer enhanced security was off. No luck.
The next day, other machines would not communicate and some of the machines that had not worked the previous day were working just fine today.
I pinged, I nslookup’d, I tried everything I could think of. When I ran out of ideas, the Google machine provided nothing useful.
I finally gave in and created a ticket with Azure Support.
My ticket was assigned to an extremely helpful representative. For the purposes of this article, we will call this person “Pat”.
Pat and I embarked on several days of tests and troubleshooting, screen shares, and methodology reviews. Pat was doing a great deal behind the scenes to help resolve my problem. None of it to any avail.
Finally, I decided to act on a hunch.
What was this hunch?
Azure DNS delegation was broken.
I designed a series of simple tests which would help to prove, or disprove, my theory.
I would create a test virtual network, propagate two virtual machines into it and use those in the following methodology.
Two fresh VMs, on a fresh VN, with proper communication.
Azure does not properly delegate DNS to configured virtual machine.
DNS functions properly without Azure delegation.
So you see, everything points to something in the way Azure is managing DNS delegation.
I passed my methodology and results to Pat in Azure support. Pat replied in two separate emails:
“Thank you for updating me on this. That is what we were thinking too but confirmation is always good. Currently I am collaborating with another team on this issue and will let you know what we find.”
“… we have to troubleshoot the Azure code, please understand that this type of troubleshooting is a long running process and updates may come slower that what you are used to. I appreciate your understanding in this matter.”
I will update this article as I receive updates.