An error has occurred 0x8007….

This article is for those who don’t know that 0x80070002 is “The system cannot find the file specified” or that 0x80070020 is “The process cannot access the file because it is being used by another process”.  It seems impossible to memorize all the error codes in Windows and what they mean.  Thankfully there is no need to, as there is a utility built into Windows to decode them.

To find out what an error code means, open a command window and run slui 0x2a <error code>.  For instance, slui 0x2a 0x80070002.  You will get a popup similar to the following:

[Screenshot: slui 0x2a]

You will need to click Show details.  The description shown there is the text for the error code.
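For readers who want to see why both example codes start with 0x8007, here is a minimal Python sketch that splits an HRESULT into its standard fields.  It assumes the usual HRESULT layout (top bit for failure, an 11-bit facility, and a 16-bit code) and that facility 7 wraps a plain Win32 error.  The function names and the small lookup table are illustrative only; the table holds just the two messages quoted above, not a full list.

```python
# Sketch: split an HRESULT such as 0x80070002 into its standard fields.
FACILITY_WIN32 = 7  # facility used by HRESULTs that wrap a Win32 error code

# Only the two messages quoted in this article; not a complete table.
KNOWN_WIN32_MESSAGES = {
    0x0002: "The system cannot find the file specified.",
    0x0020: "The process cannot access the file because it is being used by another process.",
}


def decode_hresult(hresult):
    """Return (failed, facility, code) for a 32-bit HRESULT value."""
    hresult &= 0xFFFFFFFF                # also accepts negative (signed) input
    failed = bool(hresult >> 31)         # top bit set means failure
    facility = (hresult >> 16) & 0x7FF   # 11-bit facility field
    code = hresult & 0xFFFF              # low 16 bits hold the error code
    return failed, facility, code


def describe(hresult):
    """Look up the message for the codes this sketch knows about."""
    failed, facility, code = decode_hresult(hresult)
    if facility == FACILITY_WIN32 and code in KNOWN_WIN32_MESSAGES:
        return KNOWN_WIN32_MESSAGES[code]
    return "unknown here (decode it with slui 0x2a)"


for value in (0x80070002, 0x80070020):
    print(hex(value), "->", describe(value))
```

The 0x8007 prefix is simply "failure, facility 7", so the last four hex digits are the underlying Win32 error number (2 and 32 in decimal for the two examples).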

I hope you found this article informative.  If you have anything to add please do so in the comments below.

Can you have too many CPU cores?

As I found out today, the answer is yes, if you are deploying a Windows role that requires the WID (Windows Internal Database).  Below is the scenario I ran into and how to work around the issue.

I had a customer who was attempting to deploy RDS (Remote Desktop Services).  I say attempting because he was having no luck getting the connection broker to install properly.  The connection broker, session host, and RD Web roles would install, but the session collection was not being created.  Additionally, my customer was not able to manage RDS in Server Manager.  After several attempts at installing and removing the RDS components, I noticed that the WID service was taking a very long time to start at boot, and most times it would just hang.  I figured that we might have an issue with the existing OS or possibly a GPO (Group Policy Object), so we isolated the server in an OU (Organizational Unit) with blocked inheritance and then added the newly loaded server to the domain.  The deployment still failed.  We then reloaded Windows.  Our first attempt at loading RDS failed in exactly the same way.

At this point we knew the root of the issue was with the WID.  Searching the Internet turned up an article that alluded to a possible issue with configurations over 32 CPU cores.  My customer’s server was going to be used for a very CPU-intensive application, so it was configured with 48 CPU cores (96 logical cores).  Since I was fresh out of ideas on what to try next, I removed the WID and RDS components.  I then limited the server to 24 CPU cores through msconfig.  After a reboot we were able to deploy RDS without any problems.  To test, we removed the limit on CPU cores and rebooted.  The WID service then hung at startup exactly as before.

Now that we had the issue nailed down, it was time to find a more permanent fix.  Before I get into that, let me detail the symptoms we observed.  Hopefully this will help the next person who runs into this issue.

The primary behavior we observed was the WID service hanging in a starting state.

Additionally, we saw the following event in the application log when the WID finally started with more than 32 cores exposed:

Process 0:0:0 (0xee8) Worker 0x0000000000 appears to be non-yielding on Schedule 47....

Finally the SQL error log contained a similar event:

*******************************************************************************
*
* BEGIN STACK DUMP:
* 07/21/17 09:35:26 spid 4268
*
* Non-yielding Scheduler
*
* *******************************************************************************
Stack Signature for the dump is 0x000000000000009C
External dump process return code 0x20000001.
External dump process returned no errors.

Process 0:0:0 (0x780) Worker 0x0000003077802160 appears to be non-yielding on Scheduler 47. Thread creation time: 13145128446017. Approx Thread CPU Used: kernel 62171 ms, user 7281 ms. Process Utilization 4%. System Idle 96%. Interval: 70052 ms.
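If you need to check a large SQL error log for this symptom, a small Python sketch like the one below can pull the scheduler number and CPU times out of each matching line.  The pattern is written against the single log line quoted above, so treat it as an assumption that other builds word the message the same way.  The function name and the dictionary keys are made up for illustration.

```python
import re

# Pattern based only on the log line quoted in this article.
NON_YIELDING = re.compile(
    r"Worker (?P<worker>0x[0-9A-Fa-f]+) appears to be non-yielding "
    r"on Scheduler (?P<scheduler>\d+)\."
    r".*?kernel (?P<kernel_ms>\d+) ms, user (?P<user_ms>\d+) ms\."
    r".*?System Idle (?P<idle_pct>\d+)%"
)


def parse_non_yielding(line):
    """Return a dict of fields from a non-yielding scheduler line, or None."""
    match = NON_YIELDING.search(line)
    if match is None:
        return None
    return {
        "worker": match.group("worker"),
        "scheduler": int(match.group("scheduler")),
        "kernel_ms": int(match.group("kernel_ms")),
        "user_ms": int(match.group("user_ms")),
        "idle_pct": int(match.group("idle_pct")),
    }


SAMPLE = (
    "Process 0:0:0 (0x780) Worker 0x0000003077802160 appears to be "
    "non-yielding on Scheduler 47. Thread creation time: 13145128446017. "
    "Approx Thread CPU Used: kernel 62171 ms, user 7281 ms. "
    "Process Utilization 4%. System Idle 96%. Interval: 70052 ms."
)

print(parse_non_yielding(SAMPLE))
```

In the sample, the worker burned most of its time in kernel mode while the system was 96% idle, which fits a scheduler that is stuck rather than a server that is busy.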

 

So how did we fix this?

First we limited the number of CPUs exposed to Windows.  We then loaded SQL Server Management Studio, since my customer was going to load SQL Server on the machine.  We then connected to the WID (\\.\pipe\MICROSOFT##WID\tsql\query) and set the CPU affinity to use only CPU 0 and CPU 1.  Finally we allowed Windows to see all the CPUs and rebooted.
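CPU affinity is commonly expressed as a bitmask in which bit n stands for CPU n.  The article does not say whether the affinity was set with checkboxes or with a numeric mask, so the Python sketch below is only an illustration of how "CPU 0 and CPU 1" maps to a number.  Both helper names are hypothetical.

```python
def affinity_mask(cpu_indexes):
    """Build a bitmask with one bit set per CPU index (bit n = CPU n)."""
    mask = 0
    for cpu in cpu_indexes:
        if cpu < 0:
            raise ValueError("CPU indexes must be zero or greater")
        mask |= 1 << cpu
    return mask


def cpus_in_mask(mask):
    """List the CPU indexes whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


mask = affinity_mask([0, 1])
print(mask, bin(mask), cpus_in_mask(mask))
```

For CPU 0 and CPU 1 the mask is 3 (binary 11).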

Here are the steps I would recommend taking to correct this issue.

  1. If the WID and associated roles are loaded, remove them.  This may not be required depending on the role being installed, but it is better to be safe than sorry.
  2. Limit the CPUs exposed to Windows.  The easy way to do this is through msconfig.
    1. Launch msconfig (Start, Run, msconfig).
    2. Click on the Boot tab.
    3. Click Advanced options…
    4. Check the box for Number of processors:
    5. Set the number of processors to 16 or fewer.
    6. Click OK twice and reboot.
  3. Install the Windows role that requires the WID as you normally would.
  4. Add the -P2 parameter to the WID service.
    1. Open the Services console (Start, Run, services.msc).
    2. Locate the Windows Internal Database service.
    3. Right-click the Windows Internal Database service and choose Properties.
    4. In the Start parameters box, add “-P2” without quotes and click OK.  (This will limit the WID to 2 CPUs.  If you want more, change the number.)
  5. Remove the CPU limit imposed in step 2 and reboot.

I would like to thank my colleague Curt for the startup parameter for the WID.  It is far easier than loading SQL Server Management Studio Express.  I hope you found this article informative.  If you have anything to add or any suggestions, please do so in the comments below.

The case of the missing domain controller…

I want to talk about an issue today that I see with a great deal of regularity: statically setting an external or public DNS (Domain Name System) server in the DNS client settings of a machine that is joined to an Active Directory domain.

[Image: external-dns-server]

In the above picture we have the Active Directory domain controller as the Preferred DNS server.  However, we also have one of the Google public DNS servers as the Alternate DNS server.  At first glance, this might appear to be a good idea.  If the Active Directory domain controller goes down, this PC can still resolve names on the Internet.  However, there is a significant disadvantage to setting up the DNS client in this way.  To understand this disadvantage, we must first understand how the DNS name resolution process works in Windows.

When a Windows system, either client or server, needs to resolve a name it goes through the following process.*

  1. The client checks to see if the name queried is its own.
  2. The client queries the DNS client resolver cache.  Any entries from the hosts file are preloaded to the resolver cache.
  3. Domain Name System (DNS) servers are queried.
  4. If the name is still not resolved, the NetBIOS name resolution sequence is used.

*I have omitted WINS from the process as it is rarely used anymore.

Let’s dive a little deeper into step two.  There are two important takeaways for the DNS client cache.  The first is how long an answer, or the lack of one, stays in the cache.  This is typically referred to as TTL (Time To Live).  A positive answer is cached for its TTL or 24 hours, whichever is less.  A negative response, that is, when the record does not exist or cannot be found, is cached for 5 minutes.  The second takeaway is that the cache can only be cleared by restarting the DNS client service, running ipconfig /flushdns, or restarting the client.
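The caching rules above can be sketched as a toy cache in Python.  This is a model of the behavior as described in this article, not of the actual Windows implementation: the 24-hour cap and 5-minute negative lifetime are the figures quoted above, and the class and method names are invented for illustration.  Time is passed in explicitly (in seconds) so the example is easy to follow.

```python
MAX_POSITIVE_TTL = 24 * 60 * 60   # 24-hour cap on positive answers (from the text)
NEGATIVE_TTL = 5 * 60             # negative answers kept for 5 minutes (from the text)


class ResolverCache:
    """Toy model of a DNS client cache with positive and negative entries."""

    def __init__(self):
        self._entries = {}  # name -> (address or None, expiry time in seconds)

    def store_positive(self, name, address, record_ttl, now):
        # Cached for the record's TTL or 24 hours, whichever is less.
        ttl = min(record_ttl, MAX_POSITIVE_TTL)
        self._entries[name] = (address, now + ttl)

    def store_negative(self, name, now):
        # "No such record" is remembered too, for a shorter time.
        self._entries[name] = (None, now + NEGATIVE_TTL)

    def lookup(self, name, now):
        """Return (hit, address); a hit with address None is a cached failure."""
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            self._entries.pop(name, None)  # drop expired entries
            return False, None
        return True, entry[0]

    def flush(self):
        # Stands in for ipconfig /flushdns or a service restart.
        self._entries.clear()


cache = ResolverCache()
cache.store_negative("dc01.corp.example", now=0)     # hypothetical internal name
print(cache.lookup("dc01.corp.example", now=60))     # still a cached failure
print(cache.lookup("dc01.corp.example", now=300))    # expired, would query again
```

The point to notice is the negative entry: once a lookup fails, the client keeps answering "not found" from the cache until the entry expires or the cache is flushed.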

Now on to step three and our example above.  The Windows DNS client will use the Preferred DNS server first.  If that server fails to respond, even just for a second, the Windows DNS client will switch over to the Alternate DNS server.  The Windows DNS client will not switch back to the Preferred DNS server unless the alternate fails to respond.  In the case of a public DNS server this is unlikely to happen.  If the Windows DNS client does get “stuck” on the alternate server, there are three ways to get it to switch back: restart the DNS client service, restart the computer, or modify the DNS client configuration.
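Here is a toy Python model of the "sticky" server selection just described.  It follows the behavior as stated in this article rather than the real Windows resolver, and everything in it is hypothetical except 8.8.8.8, which is one of the Google public DNS servers: the domain controller address, the host names, and the class itself are made up for illustration.

```python
DC = "192.168.1.10"                  # hypothetical domain controller (Preferred)
PUBLIC = "8.8.8.8"                   # a Google public DNS server (Alternate)
INTERNAL_NAME = "dc01.corp.example"  # hypothetical internal host name

# What each server is able to answer in this model.
ANSWERS = {
    DC: {INTERNAL_NAME: "192.168.1.10", "www.example.com": "203.0.113.10"},
    PUBLIC: {"www.example.com": "203.0.113.10"},  # knows nothing internal
}


class DnsClientModel:
    """Keeps using the current server until that server fails to respond."""

    def __init__(self, servers, answers):
        self.servers = list(servers)
        self.answers = answers
        self.down = set()   # servers that are currently timing out
        self.active = 0     # index of the server in use

    def resolve(self, name):
        """Return (server asked, address or None)."""
        for _ in range(len(self.servers)):
            server = self.servers[self.active]
            if server in self.down:
                # Timeout: move to the next server and stay there.
                self.active = (self.active + 1) % len(self.servers)
                continue
            return server, self.answers[server].get(name)
        return None, None   # every server timed out

    def reset(self):
        # Stands in for restarting the DNS client service or the computer.
        self.active = 0


client = DnsClientModel([DC, PUBLIC], ANSWERS)
print(client.resolve(INTERNAL_NAME))  # answered by the domain controller
client.down.add(DC)                   # one timeout on the preferred server
print(client.resolve(INTERNAL_NAME))  # public server: name not found
client.down.discard(DC)               # the domain controller is healthy again
print(client.resolve(INTERNAL_NAME))  # still asking the public server
client.reset()
print(client.resolve(INTERNAL_NAME))  # back on the domain controller
```

The third call is the one that matters: the domain controller has recovered, but the client keeps asking the public server, so the internal name still fails.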

So now we have a better understanding of how DNS name resolution and the Windows DNS client work.  Let’s go over why public DNS servers should not be used.  In most environments there will be DNS timeouts on occasion.  As we now understand, a DNS timeout can cause the DNS client to switch to the next DNS server in the list.  If the DNS client switches to a public DNS server, then queries for internal resources, such as domain controller service records or other systems on the LAN (Local Area Network), will fail.  So the bottom line here is that using an external DNS server in the DNS client settings can and usually will cause unpredictable behavior.

One final thought.  The default configuration of the Microsoft DNS server will allow Internet names to be resolved.  This is accomplished using root hints.  If a public DNS server must be used for Internet queries, then a DNS forwarder can be added in the DNS server configuration.

I hope you found this article informative.  If you have anything to add or see something that needs a correction, please leave a comment below.

We couldn’t create a new partition or locate an existing one.

Good afternoon.  I ran into an issue today that I have seen quite a few times.  I had a customer who was trying to load Windows 2012 on a server.  No matter what he tried, he would always receive the same error.

“We couldn’t create a new partition or locate an existing one.”
[Image: partition error.png]

At first glance it might seem like there is an issue with the disk.  That is not the case though.  The problem has to do with the boot priority as set up in the BIOS of the system.  In this case my customer had a Dell server with an SD card.  He had ordered the server with ESXi loaded on the SD card.  Therefore Dell had put the SD card at the top of the boot priority when it was configured at the factory.  The RAID controller was second in the boot priority.  The reason this is a problem is that Windows setup needs to create or use an existing partition on the first device in the boot priority.  Compounding the issue is that, when using setup, Windows can only be loaded on a fixed disk.  Therefore the error is due to the inability to create a system reserved partition for the boot loader files.

Keep in mind this issue can happen on any system that has more than one entry in the boot priority.  For instance, I have also seen this problem when there were multiple hard disk controllers in a server and the wrong one was at the top of the list.

The fix is quite simple.  Go into the BIOS and change the boot priority to put the device that will have Windows loaded on it at the top.

I hope you have found this article informative.  If you have anything to add, please use the comments section below.

Why are my computers not showing up on the Network in Windows Explorer?

I ran across an interesting issue this morning.  I had a customer who was not able to browse for computers on most of his workstations and servers.  The problem seemed to start within the last two weeks.

I checked the Workstation, DNS Client, Network List, and Network Location Awareness services.  They were all running.  The problem turned out to be the Function Discovery Resource Publication service.  This service was not started and was set to Manual.  Without this service the computer will not advertise itself and will not be able to discover other computers on the network.

So if no computers are showing in Network in Windows Explorer, check the Function Discovery Resource Publication service and verify it is running.  I would also recommend setting it to Automatic start so that everything works correctly after a reboot.
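If you have many machines to check, the service state can be read from the command line with sc query and parsed in a script.  The Python sketch below assumes the service's short name is FDResPub and that sc query prints an English "STATE" line in its usual layout; localized systems may differ.  The sample output is illustrative, not captured from the customer's machine, and the function names are my own.

```python
import subprocess

SERVICE_NAME = "FDResPub"  # short name assumed for Function Discovery Resource Publication


def parse_sc_state(sc_output):
    """Pull the state word (for example RUNNING or STOPPED) out of sc query output."""
    for line in sc_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "STATE":
            parts = value.split()          # e.g. ["1", "STOPPED"]
            return parts[1] if len(parts) > 1 else None
    return None


def query_service_state(name=SERVICE_NAME):
    """Run sc query (Windows only) and return the parsed state."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)


# Illustrative sample of what a stopped service looks like.
SAMPLE_OUTPUT = """
SERVICE_NAME: FDResPub
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
"""

print(parse_sc_state(SAMPLE_OUTPUT))
```

On a Windows machine, calling query_service_state() would run the real command.  The startup type can then be changed from an elevated prompt with sc config FDResPub start= auto.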

I hope this article has been informative.  If you have anything to add, please use the comments section below.

Where are my file shares?

Good morning.  I ran into an issue that I see from time to time in support.  One of my customers was unable to find where his shares were located on the filesystem.  The problem is quite easy to solve with a single command.

Net Share

Running that command will display all shares on the server including hidden and administrative shares along with their paths.  This command will work in any version of Windows and does not require elevation.

I hope you found this article informative.  If you have anything to add, please do so by adding a comment below.

Outlook 2016 and Exchange 2010

Good morning.  It is another fine day in support.  I wanted to share an issue that I have seen a couple of times and want to have handy for future reference.  I had a customer with an SBS (Small Business Server) 2011 install.  He was adding Outlook 2016 clients, but could not get any of them to connect with autodiscover.  One key piece of information in this case is that Outlook 2010 and 2013 clients worked fine.  With this in mind I checked Google.  I found quite a few articles pointing to disabling MAPI/HTTP.  This should not keep Outlook 2016 from connecting, as it will drop down to RPC/HTTP.

In the end I set up an Outlook profile with IMAP.  I was then able to get into Outlook and run an autodiscover test.  When I ran the test I was able to get the error code from the server.  Here is what I saw:

Attempting URL https://myserver.mydomain.com/autodiscover/autodiscover.xml found through SCP
Autodiscover to https://myserver.mydomain.com/autodiscover/autodiscover.xml starting
GetLastError=2147954402; httpStatus=0.
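The GetLastError value is printed in decimal, which makes it hard to search for.  A minimal Python sketch like the one below converts it to the more familiar hexadecimal form and extracts the low 16 bits.  The helper names are my own, and the reading of the low 16 bits as a Win32-style error number assumes the standard HRESULT layout.

```python
def to_hresult_hex(value):
    """Format a decimal (signed or unsigned) 32-bit error value as 0xXXXXXXXX."""
    return "0x{:08X}".format(value & 0xFFFFFFFF)


def low_word(value):
    """Return the low 16 bits, which hold the underlying error number."""
    return value & 0xFFFF


last_error = 2147954402  # value from the autodiscover log above
print(to_hresult_hex(last_error))
print(low_word(last_error))
```

For this log entry the value works out to 0x80072EE2, and the low word is 12002, which as far as I know is the timeout error in the Windows HTTP client libraries.  That fits an autodiscover request that never received a response (httpStatus=0).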

Autodiscover then proceeded to try the default path and failed.  I did a search on this error code and found the following Microsoft article.  My customer was running the latest version of Outlook though.  I ended up doing the workaround at the bottom.  Here it is.

  1. Open Registry Editor.
  2. Locate and then click the following registry subkey:
    HKEY_CURRENT_USER\Software\Microsoft\Office\16.0\Outlook\Autodiscover
  3. On the Edit menu, point to New, and then click DWORD Value.
  4. Type ExcludeHttpsRootDomain, and then press Enter.
  5. On the Edit menu, click Modify, type 1 in the Value data box, and then click OK.
  6. Exit Registry Editor.
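If the same change has to be made for several users, the registry steps above can be scripted.  The following is a minimal Python sketch using the standard library winreg module, which only exists on Windows.  The function name is made up, and since the value lives under HKEY_CURRENT_USER it would need to run in each affected user's session.

```python
import sys

# Key and value taken from the steps above.
KEY_PATH = r"Software\Microsoft\Office\16.0\Outlook\Autodiscover"
VALUE_NAME = "ExcludeHttpsRootDomain"


def set_exclude_https_root_domain(enabled=True):
    """Create the DWORD value under HKEY_CURRENT_USER (Windows only)."""
    if sys.platform != "win32":
        raise OSError("the registry is only available on Windows")
    import winreg  # standard library module, present only on Windows

    # CreateKeyEx opens the key, creating it first if it does not exist.
    with winreg.CreateKeyEx(
        winreg.HKEY_CURRENT_USER, KEY_PATH, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, 1 if enabled else 0)
```

Calling set_exclude_https_root_domain() writes the same value of 1 as step 5, and passing enabled=False sets it back to 0.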

Outlook immediately worked after this, and much faster.

I hope this article has been informative.  If you have anything to add or just want to comment, please do so below.