Windows Domain Controller/DNS Failure

I have a previous question up about this, but I’ve come across some new information, so I figured I would start a new post to stir up some fresh discussion.

To start, I will give you all a short description of our network setup (as I understand it). We have 2 stores; we’ll call them CP and HQ. HQ is a domain controller, and we have a local domain called billsgs.net. Each store basically operates on its own. They each have a firewall and their own server running Windows Server 2008 R2. The only time they interact is through replication. We have specified replicated directories, which are mostly user profiles and our database files. This is mostly for backup.

Now to get onto the problem… a few weeks ago (early June) we noticed the replication service on the HQ server was hogging a ton of memory, and by a ton, I mean ALL of the available memory it could get its hands on. We have 13 GB, and within 10 minutes of running DFS it was at about 98% memory usage. So we stopped it. We haven’t really been bothered by this, but if something crashes, we are pretty much screwed on the backups. We have applied some hotfixes, but nothing has worked. So as of right now, DFS is not running.

Now, a couple of weeks ago the firewall’s operating system became corrupted. I have no idea how; I wasn’t there when it happened. This was at the HQ store. So we had a broken firewall and DFS wasn’t working properly. We have since reinstalled the operating system on the firewall, which is pfSense. Everything seemed to be working fine, except we started noticing some DNS problems. We are at the point where we don’t know if this is related to DNS/AD/DFS issues or to firewall issues. We basically have the firewall wide open, so we have decided that it is not the problem; at least it doesn’t seem like it. So here are a few debugging things we have done…

Here is the dcdiag output…

    C:\Users\Administrator>dcdiag

    Directory Server Diagnosis

    Performing initial setup:
     Trying to find home server...
     Home Server = BGS-HQ-VRDSVR01
     * Identified AD Forest.
     Done gathering initial info.

    Doing initial required tests

     Testing server: BGS-HQ\BGS-HQ-VRDSVR01
      Starting test: Connectivity
       ......................... BGS-HQ-VRDSVR01 passed test Connectivity

    Doing primary tests

     Testing server: BGS-HQ\BGS-HQ-VRDSVR01
      Starting test: Advertising
       ......................... BGS-HQ-VRDSVR01 passed test Advertising
      Starting test: FrsEvent
       There are warning or error events within the last 24 hours after the SYSVOL has been shared. Failing SYSVOL replication problems may cause Group Policy problems.
       ......................... BGS-HQ-VRDSVR01 passed test FrsEvent
      Starting test: DFSREvent
       ......................... BGS-HQ-VRDSVR01 passed test DFSREvent
      Starting test: SysVolCheck
       ......................... BGS-HQ-VRDSVR01 passed test SysVolCheck
      Starting test: KccEvent
       A warning event occurred. EventID: 0x8000082C
       Time Generated: 08/05/2011 15:04:12
       Event String:
       A warning event occurred. EventID: 0x8000082C
       Time Generated: 08/05/2011 15:05:12
       Event String:
       ......................... BGS-HQ-VRDSVR01 passed test KccEvent
      Starting test: KnowsOfRoleHolders
       ......................... BGS-HQ-VRDSVR01 passed test KnowsOfRoleHolders
      Starting test: MachineAccount
       ......................... BGS-HQ-VRDSVR01 passed test MachineAccount
      Starting test: NCSecDesc
       ......................... BGS-HQ-VRDSVR01 passed test NCSecDesc
      Starting test: NetLogons
       ......................... BGS-HQ-VRDSVR01 passed test NetLogons
      Starting test: ObjectsReplicated
       ......................... BGS-HQ-VRDSVR01 passed test ObjectsReplicated
      Starting test: Replications
       [Replications Check,BGS-HQ-VRDSVR01] A recent replication attempt failed:
       From BGS-CP-VRDSVR01 to BGS-HQ-VRDSVR01
       Naming Context: DC=ForestDnsZones,DC=billsgs,DC=net
       The replication generated an error (1908):
       Could not find the domain controller for this domain.
       The failure occurred at 2011-08-05 14:34:49.
       The last success occurred at 2011-08-05 13:51:35.
       1 failures have occurred since the last success.
       Kerberos Error.
       A KDC was not found to authenticate the call.
       Check that sufficient domain controllers are available.
       [Replications Check,BGS-HQ-VRDSVR01] A recent replication attempt failed:
       From BGS-CP-VRDSVR01 to BGS-HQ-VRDSVR01
       Naming Context: DC=DomainDnsZones,DC=billsgs,DC=net
       The replication generated an error (1908):
       Could not find the domain controller for this domain.
       The failure occurred at 2011-08-05 14:34:48.
       The last success occurred at 2011-08-05 13:51:35.
       1 failures have occurred since the last success.
       Kerberos Error.
       A KDC was not found to authenticate the call.
       Check that sufficient domain controllers are available.
       [Replications Check,BGS-HQ-VRDSVR01] A recent replication attempt failed:
       From BGS-CP-VRDSVR01 to BGS-HQ-VRDSVR01
       Naming Context: CN=Schema,CN=Configuration,DC=billsgs,DC=net
       The replication generated an error (1908):
       Could not find the domain controller for this domain.
       The failure occurred at 2011-08-05 14:34:47.
       The last success occurred at 2011-08-05 13:51:34.
       1 failures have occurred since the last success.
       Kerberos Error.
       A KDC was not found to authenticate the call.
       Check that sufficient domain controllers are available.
       [Replications Check,BGS-HQ-VRDSVR01] A recent replication attempt failed:
       From BGS-CP-VRDSVR01 to BGS-HQ-VRDSVR01
       Naming Context: CN=Configuration,DC=billsgs,DC=net
       The replication generated an error (1908):
       Could not find the domain controller for this domain.
       The failure occurred at 2011-08-05 14:34:46.
       The last success occurred at 2011-08-05 13:51:34.
       1 failures have occurred since the last success.
       Kerberos Error.
       A KDC was not found to authenticate the call.
       Check that sufficient domain controllers are available.
       [Replications Check,BGS-HQ-VRDSVR01] A recent replication attempt failed:
       From BGS-CP-VRDSVR01 to BGS-HQ-VRDSVR01
       Naming Context: DC=billsgs,DC=net
       The replication generated an error (1908):
       Could not find the domain controller for this domain.
       The failure occurred at 2011-08-05 14:34:46.
       The last success occurred at 2011-08-05 13:51:34.
       1 failures have occurred since the last success.
       Kerberos Error.
       A KDC was not found to authenticate the call.
       Check that sufficient domain controllers are available.
       ......................... BGS-HQ-VRDSVR01 failed test Replications
      Starting test: RidManager
       ......................... BGS-HQ-VRDSVR01 passed test RidManager
      Starting test: Services
       Invalid service startup type: DFSR on BGS-HQ-VRDSVR01, current value DISABLED, expected value AUTO_START
       DFSR Service is stopped on [BGS-HQ-VRDSVR01]
       ......................... BGS-HQ-VRDSVR01 failed test Services
      Starting test: SystemLog
       A warning event occurred. EventID: 0x00000458
       Time Generated: 08/05/2011 14:08:10
       Event String:
       The Group Policy Client Side Extension Folder Redirection was unable to apply one or more settings because the changes must be processed before system startup or user logon. The system will wait for Group Policy processing to finish completely before the next startup or logon for this user, and this may result in slow startup and boot performance.
       An error event occurred. EventID: 0x00000456
       Time Generated: 08/05/2011 14:23:08
       Event String:
       The processing of Group Policy failed. Windows could not determine if the user and computer accounts are in the same forest. Ensure the user domain name matches the name of a trusted domain that resides in the same forest as the computer account.
       An error event occurred. EventID: 0xC0001B78
       Time Generated: 08/05/2011 14:28:16
       Event String:
       The Service Control Manager tried to take a corrective action (Restart the service) after the unexpected termination of the DFS Replication service, but this action failed with the following error:
       An error event occurred. EventID: 0xC000271A
       Time Generated: 08/05/2011 14:31:28
       Event String: The server {995C996E-D918-4A8C-A302-45719A6F4EA7} did not register with DCOM within the required timeout.
       A warning event occurred. EventID: 0x8000001D
       Time Generated: 08/05/2011 14:34:09
       Event String:
       The Key Distribution Center (KDC) cannot find a suitable certificate to use for smart card logons, or the KDC certificate could not be verified. Smart card logon may not function correctly if this problem is not resolved. To correct this problem, either verify the existing KDC certificate using certutil.exe or enroll for a new KDC certificate.
       A warning event occurred. EventID: 0x000003F6
       Time Generated: 08/05/2011 14:34:13
       Event String: Name resolution for the name billsgs.net timed out after none of the configured DNS servers responded.
       An error event occurred. EventID: 0xC0001B58
       Time Generated: 08/05/2011 14:34:48
       Event String: The DgiVecp service failed to start due to the following error:
       An error event occurred. EventID: 0x0000168E
       Time Generated: 08/05/2011 14:34:55
       Event String:
       The dynamic registration of the DNS record '6282bfca-ade1-41c8-84dc-516ce19b49be._msdcs.billsgs.net. 600 IN CNAME BGS-HQ-VRDSVR01.billsgs.net.' failed on the following DNS server:
       An error event occurred. EventID: 0x0000168E
       Time Generated: 08/05/2011 14:34:56
       Event String:
       The dynamic registration of the DNS record '_kpasswd._udp.billsgs.net. 600 IN SRV 0 100 464 BGS-HQ-VRDSVR01.billsgs.net.' failed on the following DNS server:
       A warning event occurred. EventID: 0x00002724
       Time Generated: 08/05/2011 14:34:56
       Event String: This computer has at least one dynamically assigned IPv6 address.For reliable DHCPv6 server operation, you should use only static IPv6 addresses.
       A warning event occurred. EventID: 0x000003F6
       Time Generated: 08/05/2011 14:34:55
       Event String: Name resolution for the name billsgs.net timed out after none of the configured DNS servers responded.
       An error event occurred. EventID: 0xC00110F1
       Time Generated: 08/05/2011 14:35:09
       Event String: The WINS Server could not initialize security to allow the read-only operations.
       An error event occurred. EventID: 0xC0002720
       Time Generated: 08/05/2011 14:36:05
       Event String: The application-specific permission settings do not grant Local Launch permission for the COM Server application with CLSID
       A warning event occurred. EventID: 0x000727AA
       Time Generated: 08/05/2011 14:38:30
       Event String: The WinRM service failed to create the following SPNs: WSMAN/BGS-HQ-VRDSVR01.billsgs.net; WSMAN/BGS-HQ-VRDSVR01.
       A warning event occurred. EventID: 0x0000043D
       Time Generated: 08/05/2011 14:47:48
       Event String:
       Windows failed to apply the Folder Redirection settings. Folder Redirection settings might have its own log file. Please click on the "More information" link.
       An error event occurred. EventID: 0x0000168E
       Time Generated: 08/05/2011 15:02:25
       Event String:
       The dynamic registration of the DNS record '6282bfca-ade1-41c8-84dc-516ce19b49be._msdcs.billsgs.net. 600 IN CNAME BGS-HQ-VRDSVR01.billsgs.net.' failed on the following DNS server:
       An error event occurred. EventID: 0x0000168E
       Time Generated: 08/05/2011 15:02:26
       Event String:
       The dynamic registration of the DNS record '_kpasswd._udp.billsgs.net. 600 IN SRV 0 100 464 BGS-HQ-VRDSVR01.billsgs.net.' failed on the following DNS server:
       ......................... BGS-HQ-VRDSVR01 failed test SystemLog
      Starting test: VerifyReferences
       ......................... BGS-HQ-VRDSVR01 passed test VerifyReferences


     Running partition tests on : ForestDnsZones
      Starting test: CheckSDRefDom
       ......................... ForestDnsZones passed test CheckSDRefDom
      Starting test: CrossRefValidation
       ......................... ForestDnsZones passed test CrossRefValidation

     Running partition tests on : DomainDnsZones
      Starting test: CheckSDRefDom
       ......................... DomainDnsZones passed test CheckSDRefDom
      Starting test: CrossRefValidation
       ......................... DomainDnsZones passed test CrossRefValidation

     Running partition tests on : Schema
      Starting test: CheckSDRefDom
       ......................... Schema passed test CheckSDRefDom
      Starting test: CrossRefValidation
       ......................... Schema passed test CrossRefValidation

     Running partition tests on : Configuration
      Starting test: CheckSDRefDom
       ......................... Configuration passed test CheckSDRefDom
      Starting test: CrossRefValidation
       ......................... Configuration passed test CrossRefValidation

     Running partition tests on : billsgs
      Starting test: CheckSDRefDom
       ......................... billsgs passed test CheckSDRefDom
      Starting test: CrossRefValidation
       ......................... billsgs passed test CrossRefValidation

     Running enterprise tests on : billsgs.net
      Starting test: LocatorCheck
       ......................... billsgs.net passed test LocatorCheck
      Starting test: Intersite
       ......................... billsgs.net passed test Intersite

Keep in mind that this output differs quite a bit every time we restart the server. Sometimes we get errors about DCOM being unable to reach our configured DNS servers. Here is the output of a DNS test…

C:\Users\Administrator>dcdiag /test:DNS

Directory Server Diagnosis

Performing initial setup:
 Trying to find home server...
 Home Server = BGS-HQ-VRDSVR01
 * Identified AD Forest.
 Done gathering initial info.

Doing initial required tests

 Testing server: BGS-HQ\BGS-HQ-VRDSVR01
  Starting test: Connectivity
   ......................... BGS-HQ-VRDSVR01 passed test Connectivity

Doing primary tests

 Testing server: BGS-HQ\BGS-HQ-VRDSVR01

  Starting test: DNS

   DNS Tests are running and not hung. Please wait a few minutes...
   ......................... BGS-HQ-VRDSVR01 passed test DNS

 Running partition tests on : ForestDnsZones

 Running partition tests on : DomainDnsZones

 Running partition tests on : Schema

 Running partition tests on : Configuration

 Running partition tests on : billsgs

 Running enterprise tests on : billsgs.net
  Starting test: DNS
   Test results for domain controllers:

   DC: BGS-HQ-VRDSVR01.billsgs.net
   Domain: billsgs.net


    TEST: Basic (Basc)
     Warning: adapter [00000007] Intel(R) PRO/1000 MT Network Connection has invalid DNS server: 192.168.40.254 ()

    TEST: Records registration (RReg)
     Network Adapter [00000007] Intel(R) PRO/1000 MT Network Connection:
      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.billsgs.net

      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.22017278-29d1-493a-b72d-e44b31411a70.domains._msdcs.billsgs.net

      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _kerberos._tcp.dc._msdcs.billsgs.net

      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.dc._msdcs.billsgs.net

      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _kerberos._tcp.billsgs.net

      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _kerberos._udp.billsgs.net

      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _kpasswd._tcp.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.BGS-HQ._sites.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _kerberos._tcp.BGS-HQ._sites.dc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.BGS-HQ._sites.dc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _kerberos._tcp.BGS-HQ._sites.billsgs.net

      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.gc._msdcs.billsgs.net

      Warning:
      Missing A record at DNS server 192.168.40.13:
      gc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _gc._tcp.BGS-HQ._sites.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.BGS-HQ._sites.gc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.pdc._msdcs.billsgs.net

      Warning:
      Missing CNAME record at DNS server 192.168.40.254:
      6282bfca-ade1-41c8-84dc-516ce19b49be._msdcs.billsgs.net

      Warning:
      Missing A record at DNS server 192.168.40.254:
      BGS-HQ-VRDSVR01.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.22017278-29d1-493a-b72d-e44b31411a70.domains._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _kerberos._tcp.dc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.dc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _kerberos._tcp.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _kerberos._udp.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _kpasswd._tcp.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.BGS-HQ._sites.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _kerberos._tcp.BGS-HQ._sites.dc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.BGS-HQ._sites.dc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _kerberos._tcp.BGS-HQ._sites.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.gc._msdcs.billsgs.net

      Warning:
      Missing A record at DNS server 192.168.40.254:
      gc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _gc._tcp.BGS-HQ._sites.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.BGS-HQ._sites.gc._msdcs.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.pdc._msdcs.billsgs.net

    Error: Record registrations cannot be found for all the network adapters

   Summary of test results for DNS servers used by the above domain controllers:

   DNS server: 192.168.40.254 ()
    1 test failure on this DNS server
    Name resolution is not functional. _ldap._tcp.billsgs.net. failed on the DNS server 192.168.40.254

   Summary of DNS test results:

           Auth Basc Forw Del Dyn RReg Ext
   _________________________________________________________________
   Domain: billsgs.net
    BGS-HQ-VRDSVR01    PASS WARN PASS PASS PASS FAIL n/a

   ......................... billsgs.net failed test DNS

C:\Users\Administrator>
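
To get a quicker read on the RReg section of that output, the missing-record lines can be tallied per DNS server. This is only a rough Python sketch: `tally_missing_records` and the `SAMPLE` excerpt are names made up for illustration, and it assumes the two-line layout shown above (a `Warning:` or `Error:` line followed by a `Missing ... record at DNS server ...:` line).

```python
import re
from collections import Counter

# Matches the line dcdiag prints under each "Warning:" / "Error:" heading, e.g.
#   "Missing SRV record at DNS server 192.168.40.13:"
MISSING_RE = re.compile(r"Missing (\w+) record at DNS server ([\d.]+):")


def tally_missing_records(dcdiag_text):
    """Count missing-record entries per (dns_server, severity, record_type)."""
    counts = Counter()
    severity = None
    for raw in dcdiag_text.splitlines():
        line = raw.strip()
        # A bare "Warning:" or "Error:" line sets the severity for the next entry.
        if line in ("Warning:", "Error:"):
            severity = line.rstrip(":")
            continue
        match = MISSING_RE.match(line)
        if match and severity:
            record_type, server = match.groups()
            counts[(server, severity, record_type)] += 1
            severity = None
    return counts


# Short excerpt copied from the output above (not the full log).
SAMPLE = """\
      Warning:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.13:
      _ldap._tcp.BGS-HQ._sites.billsgs.net

      Warning:
      Missing A record at DNS server 192.168.40.254:
      BGS-HQ-VRDSVR01.billsgs.net

      Error:
      Missing SRV record at DNS server 192.168.40.254:
      _ldap._tcp.billsgs.net
"""

if __name__ == "__main__":
    for (server, severity, rtype), n in sorted(tally_missing_records(SAMPLE).items()):
        print(f"{server}  {severity:<7}  {rtype:<5}  {n}")
```

To use it on the real thing, the full output of `dcdiag /test:DNS` would be saved to a text file and passed in place of `SAMPLE`.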

I believe these missing DNS registrations are our main issue, but I’m lost on the whole thing. I’ve given the Netlogon restart trick a few tries. I’ve even run the following sequence:

    net stop netlogon
    net stop dns
    ipconfig /flushdns
    net start dns
    net start netlogon

Nothing seems to work. Just today, we opened “Active Directory Users and Computers”, and under “Domain Controllers” the HQ server is not listed. It simply says unavailable.

Also, here is the ipconfig output…

Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Users\Administrator>ipconfig /all

Windows IP Configuration

 Host Name . . . . . . . . . . . . : BGS-HQ-VRDSVR01
 Primary Dns Suffix . . . . . . . : billsgs.net
 Node Type . . . . . . . . . . . . : Hybrid
 IP Routing Enabled. . . . . . . . : No
 WINS Proxy Enabled. . . . . . . . : No
 DNS Suffix Search List. . . . . . : billsgs.net

Ethernet adapter Local Area Connection:

 Connection-specific DNS Suffix . :
 Description . . . . . . . . . . . : Intel(R) PRO/1000 MT Network Connection
 Physical Address. . . . . . . . . : 00-0C-29-03-BA-38
 DHCP Enabled. . . . . . . . . . . : No
 Autoconfiguration Enabled . . . . : Yes
 IPv4 Address. . . . . . . . . . . : 192.168.40.13(Preferred)
 Subnet Mask . . . . . . . . . . . : 255.255.255.0
 Default Gateway . . . . . . . . . : 192.168.40.254
 DNS Servers . . . . . . . . . . . : 192.168.40.13
          192.168.40.254
 Primary WINS Server . . . . . . . : 192.168.40.13
 Secondary WINS Server . . . . . . : 192.168.41.17
 NetBIOS over Tcpip. . . . . . . . : Enabled

Tunnel adapter isatap.{ADEC15A8-2603-40EB-964C-489CCBD11E08}:

 Media State . . . . . . . . . . . : Media disconnected
 Connection-specific DNS Suffix . :
 Description . . . . . . . . . . . : Microsoft ISATAP Adapter
 Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
 DHCP Enabled. . . . . . . . . . . : No
 Autoconfiguration Enabled . . . . : Yes

Tunnel adapter Local Area Connection* 11:

 Media State . . . . . . . . . . . : Media disconnected
 Connection-specific DNS Suffix . :
 Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface
 Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
 DHCP Enabled. . . . . . . . . . . : No
 Autoconfiguration Enabled . . . . : Yes

C:\Users\Administrator>

192.168.40.13 is HQ and 192.168.41.17 is CP. Also 192.168.40.254 is the HQ firewall, and 192.168.41.254 is the CP firewall.

To tie this all together, the problem boils down to the servers not communicating. DNS seems to be the main issue, like I said. An example: from the HQ network, if I run nslookup billsgs.net, the address returned is 192.168.41.17, which is the CP server’s address. As a result, no one can “access” Active Directory from the HQ location, meaning \\billsgs.net is inaccessible from the HQ network.
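
For anyone who wants to repeat that kind of lookup for the DC locator records rather than just the domain name, here is a minimal Python sketch that builds the `nslookup` commands to run by hand. The record names are the ones listed in the `dcdiag /test:DNS` output above (the GUID-based ones are left out); `LOCATOR_TEMPLATES` and `srv_lookup_commands` are hypothetical names, and the script only generates command lines without querying anything itself.

```python
# Locator record names taken from the dcdiag /test:DNS output in this post.
# {domain} and {site} are placeholders filled in by srv_lookup_commands().
LOCATOR_TEMPLATES = [
    "_ldap._tcp.{domain}",
    "_kerberos._tcp.{domain}",
    "_kerberos._udp.{domain}",
    "_kpasswd._tcp.{domain}",
    "_ldap._tcp.dc._msdcs.{domain}",
    "_kerberos._tcp.dc._msdcs.{domain}",
    "_ldap._tcp.pdc._msdcs.{domain}",
    "_ldap._tcp.gc._msdcs.{domain}",
    "_ldap._tcp.{site}._sites.{domain}",
    "_kerberos._tcp.{site}._sites.{domain}",
]


def srv_lookup_commands(domain, site, dns_servers):
    """Build one 'nslookup -type=SRV <name> <server>' line per record and server."""
    commands = []
    for server in dns_servers:
        for template in LOCATOR_TEMPLATES:
            name = template.format(domain=domain, site=site)
            commands.append(f"nslookup -type=SRV {name} {server}")
    return commands


if __name__ == "__main__":
    # Domain, site name, and server address as they appear in this post.
    for command in srv_lookup_commands("billsgs.net", "BGS-HQ", ["192.168.40.13"]):
        print(command)
```

Pointing each query at a specific server makes it clear which DNS server is missing which record, instead of relying on whichever server the resolver happens to pick.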

  1. You are right: AD issues are almost always DNS issues. I think the problem is having the firewall set as a secondary DNS server in your DC’s IP settings. Remove it from the NIC configuration and instead add the firewall as a forwarder in the DNS server configuration.

    This forces all DNS resolution to start with the Windows DNS server; names it cannot resolve itself are then queried through the forwarder.

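    Once you reset the DNS settings, run ipconfig /registerdns on the DC to fix the AD registrations in DNS. The SRV records themselves are registered by the Netlogon service, so restart Netlogon afterwards as well.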

    Also, all your Windows servers and clients should point only to this DNS server. If you need an alternate DNS server, install the DNS role on another server (it does not need to be a DC to run DNS).
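
As a rough illustration of that rule, the check below flags any DNS server on a NIC that is not one of your Windows (AD) DNS servers. It is a sketch only: `misplaced_resolvers` is a made-up helper, the addresses come from the `ipconfig /all` output in the question, and treating 192.168.40.13 as the only AD DNS server is an assumption based on that output.

```python
def misplaced_resolvers(nic_dns_servers, ad_dns_servers):
    """Return NIC DNS entries that are not AD DNS servers, in their original order."""
    ad = set(ad_dns_servers)
    return [ip for ip in nic_dns_servers if ip not in ad]


# DNS servers configured on the HQ DC's NIC, from the ipconfig /all output.
HQ_NIC_DNS = ["192.168.40.13", "192.168.40.254"]

# Assumption: the HQ DC itself is the only AD DNS server known from the post.
AD_DNS = ["192.168.40.13"]

if __name__ == "__main__":
    for ip in misplaced_resolvers(HQ_NIC_DNS, AD_DNS):
        print(f"{ip}: remove from the NIC and add as a forwarder instead")
```

With the values from the question, the only address it flags is 192.168.40.254, the firewall.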