The diskshadow command, a hidden gem

Good morning.  In case you haven’t guessed it already, I typically write these posts in the morning.  As I write this it is 6:30 AM.  Today I want to share a command-line utility I only recently discovered, although it has been part of Windows for quite some time, at least since Windows Server 2008.  The utility is called diskshadow, and it allows direct interaction with VSS (Volume Shadow Copy Service).  Microsoft documents it on TechNet.  In this article I will go over how I used it to troubleshoot a recent issue with VSS.

I was recently troubleshooting a VSS issue where the snapshot was failing on release.  As is typical, my customer was using third-party backup software.  I wanted to test outside of the backup software, so we installed the Windows Server Backup feature and tried that.  Unfortunately the symptoms were identical.  After quite a bit of digging I ran across the diskshadow utility.  With that utility I received a different error, which led me to the actual problem.  It turned out that the backup software’s filter driver was interfering with VSS and causing the failure.  After removing the backup software, VSS worked without issue.

So how is the diskshadow command used?  It can be used to create a snapshot, mount an existing snapshot, restore a snapshot and several other things.  Below I will cover the commands to take a VSS snapshot, as that is the functionality I find most useful.  To take a snapshot of the C: drive and test the majority of the VSS writers there are just 3 commands that need to be run.

  1. diskshadow (This starts the command and puts you at a diskshadow prompt.  This is similar to ntdsutil and nslookup.)
  2. add volume c: (This adds the C: drive to the snapshot.  You could substitute another drive letter if you want to test a specific writer.  The command can also be repeated with other drive letters to include them in the snapshot.)
  3. create (This starts the snapshot process with VSS.  It is important to note that the create command by itself will create a non-persistent snapshot.  That is, the snapshot will be removed on exit from the diskshadow utility.  A persistent snapshot can be created with additional parameters.)
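
The same three steps can also be run unattended from a script file.  What follows is only a minimal sketch, assuming an elevated PowerShell session; the file name vss-test.dsh and its location under %TEMP% are illustrative choices, not something diskshadow requires.

```powershell
# Write the diskshadow commands from steps 2 and 3 to a script file.
# The file name and location are assumptions made for this example.
$scriptPath = Join-Path $env:TEMP 'vss-test.dsh'
$commands = @(
    'add volume c:'
    'create'
)
Set-Content -Path $scriptPath -Value $commands -Encoding ASCII

# /s runs the script non-interactively. Because the snapshot is
# non-persistent, it is removed when diskshadow exits at the end of the script.
diskshadow.exe /s $scriptPath
```

Any error diskshadow reports is printed to the console, so the output can be redirected to a file if you want to keep it for comparison with the backup software’s error.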

This utility is considerably faster when troubleshooting VSS, taking only about 1-2 minutes to either take a snapshot or fail.  It also removes the requirement for a USB drive to temporarily store a backup.  For these reasons I will be using it whenever I troubleshoot VSS in the future.

I hope you found this article informative.  If you have anything to add or just want to leave a comment, please do so below.

The Network Location Awareness service

Good morning.  I wanted to share an issue I see on a regular basis.  This has to do with the NLA (Network Location Awareness) service.  For those not familiar with this service, it is responsible for determining the type and safety of the network(s) the computer is connected to.  There are 3 network classifications that are used.

  • Public – The NLA determines the computer is directly connected to the Internet or is on an unsafe network.  This is also the default profile assigned to a network adapter until one of the other profiles can be determined.
  • Private – The NLA determines the computer is isolated from the Internet by a NAT (Network Address Translation) device or router.
  • Domain – The NLA determines that the computer is connected to a domain.  It does this by attempting to contact a domain controller.  More specifically it performs a DNS (Domain Name System) query for a SRV (Service) record.  It will then make a connection to the domain controller.  If this is all successful, the domain profile is set.

So what is the purpose of the NLA and setting a network profile?  The primary purpose is for the Windows firewall.  Other applications and services can also access this data though.
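
To see which profile the NLA has assigned, the connection profiles can be listed from PowerShell.  This is a small sketch that assumes Windows 8 / Server 2012 or later, where the Get-NetConnectionProfile cmdlet is available.

```powershell
# List each connected network and the category NLA assigned to it.
# NetworkCategory reads Public, Private, or DomainAuthenticated.
Get-NetConnectionProfile |
    Select-Object InterfaceAlias, Name, NetworkCategory
```

A domain-joined machine that shows Public or Private here after a reboot is a candidate for the startup issue described next.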

Now that the NLA service is sufficiently explained, on to the common issue with it.  The NLA service by default is set to Automatic for its startup type.  Normally this works fine and the NLA properly detects the network.  There are some situations though where the service fails to set the profile correctly on startup.  I typically see this on domain controllers in a domain with just one domain controller.  In that case the network stack and the DNS Server service on that same machine have to fully initialize and start before the NLA queries the network.  If they do not, the NLA is not able to contact a domain controller and assumes the computer is connected to a private or public network.

Regardless of the reason why the NLA is failing at startup the solution is fairly simple.  I have seen a 100% fix rate with simply setting the service startup type to Automatic (Delayed Start).  Doing this forces the NLA service to wait until all Automatic services have started, giving DNS enough time to start.  I have seen this little trick work with other services when they are having trouble at startup.
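
The startup type can also be changed from the command line.  The sketch below assumes an elevated session and uses sc.exe so that it works on Windows PowerShell as well as newer versions; NlaSvc is the service name behind Network Location Awareness.

```powershell
# Set Network Location Awareness to Automatic (Delayed Start).
# The space after "start=" is required by sc.exe.
sc.exe config NlaSvc start= delayed-auto

# Show the service configuration; check the START_TYPE line.
sc.exe qc NlaSvc
```

The change takes effect at the next boot, so reboot and confirm that the domain profile is detected.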

I hope you found this article informative.  If I missed anything or you just want to comment, please feel free to do so below.

How to re-deploy VPN in 2016 Essentials in legacy mode

This is the third article in a series of articles covering VPN in Windows Essentials.  In the first article I covered an issue with VPN and DHCP.  In the second article I covered how to re-deploy VPN with PowerShell in 2016 Essentials.  In this article I will cover how to re-deploy VPN in legacy mode.

  1. First we must clear the configuration. Launch a PowerShell session as administrator.
  2. Run Uninstall-RemoteAccess.  Hit enter when prompted.
  3. Install the RRAS (Routing and Remote Access Service) console by running the following command: Install-WindowsFeature RSAT-RemoteAccess-Mgmt
  4. Run rrasmgmt.msc to launch the RRAS console.
  5. Right-click on the server name and choose “Configure and Enable Routing and Remote Access”
    [Screenshot: RRAS 1]
  6. Click Next.
  7. Ensure the Custom configuration radio button is selected and click Next.
    [Screenshot: RRAS 2]
  8. Check the box for VPN and click Next.
    [Screenshot: RRAS 3]
  9. Click Finish to complete the initial configuration.  You will get a popup indicating a policy was created.  Click OK to continue.
    [Screenshot: RRAS 4]
  10. When prompted to start the service, click Start service.
  11. RRAS is now running, but there are two more required steps to complete the configuration.  Right-click the server name and choose Properties.
    [Screenshot: RRAS 5]
  12. Click on the Security tab.  At the bottom of the screen, choose the correct certificate and click Apply.  Click Yes to restart RRAS.
    [Screenshot: RRAS certificate]
  13. Click the IPv4 tab.  Click the radio button for Static address pool and click the Add button.  Fill in the start IP address and end IP address and click OK twice.
    [Screenshot: RRAS static pool]
  14. Restart the RRAS service.

At this point RRAS should be configured properly.  Optionally you can disable the unused protocols in RRAS.  To do so right-click on Ports and click Properties.
[Screenshot: RRAS ports]

Only SSTP is used in Essentials by default, so the other protocols can be removed/minimized.  Highlight IKEv2 and click Configure.  Change the maximum ports to 0 (zero) and click OK.  Click Yes on the popup.  Repeat this with L2TP and GRE.  For PPTP you cannot reduce to zero, but you can reduce to 1 (one).  I also like to reduce the number of ports to match the number of IP addresses in the static pool.  This is to ensure that all connections get a valid IP address.  So I limited the ports to 20 for SSTP.  When complete it should look something like below.
[Screenshot: RRAS ports limited]
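
To match the port count to the static pool, it helps to count the addresses in the pool first.  The helper below is a hypothetical sketch (the function names are mine, not part of any RRAS module), and the range in the usage line is just the example range used elsewhere in this series.

```powershell
# Convert a dotted IPv4 address to a number so ranges can be compared.
function ConvertTo-IPv4Number {
    param([Parameter(Mandatory)][string]$Address)
    $b = [System.Net.IPAddress]::Parse($Address).GetAddressBytes()
    return (([long]$b[0] * 16777216) + ([long]$b[1] * 65536) + ([long]$b[2] * 256) + [long]$b[3])
}

# Count the addresses in an inclusive start/end range.
function Get-StaticPoolSize {
    param(
        [Parameter(Mandatory)][string]$First,
        [Parameter(Mandatory)][string]$Last
    )
    $size = (ConvertTo-IPv4Number $Last) - (ConvertTo-IPv4Number $First) + 1
    if ($size -lt 1) { throw 'The last address must not be lower than the first.' }
    return $size
}

Get-StaticPoolSize -First '192.168.16.100' -Last '192.168.16.120'
```

Compare the result with the maximum ports configured for SSTP and lower the port count if it exceeds what the pool can serve.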

I hope you found this article informative.  If you have anything to add or just want to comment, please do so below.

How to re-deploy VPN in 2016 Essentials with PowerShell

In my previous article I discussed an issue I see commonly with VPN in Essentials.  In that article I gave the fix for all versions of Essentials except 2016.  In this article I will cover the fix for 2016 Essentials.

As stated previously, 2016 Essentials uses PowerShell to configure the VPN.  Here is what the default configuration looks like:

[Screenshot: RemoteAccess Default]

If you try to manage it in the RRAS (Routing and Remote Access Service) console, you will see this:

[Screenshot: legacy mode]

The message would imply that you could turn on legacy mode.   This is true, but to turn on legacy mode requires clearing the configuration from RRAS.  Clearing the configuration must be done with PowerShell.  Re-deploying the VPN can be done with both PowerShell and the RRAS console.  Below are the PowerShell commands.

  1. Launch a PowerShell session as administrator.
  2. Run Uninstall-RemoteAccess.  Hit Enter when prompted.
  3. Run Install-RemoteAccess -VpnType Vpn -IPAddressRange 192.168.16.100,192.168.16.120
    Change the IP addresses to match the range you want to use.  In the command above the start IP address is 192.168.16.100 and the end IP address is 192.168.16.120.
  4. It may be necessary to modify the SSL certificate.  To check this run Get-RemoteAccess.  If the SSL certificate matches the one installed by the anywhere access wizard, then you are done.  If not, please proceed to the next step.
  5. Run Set-Location Cert:\LocalMachine\My; Get-ChildItem | Select-Object Subject,Thumbprint
    You should see output similar to the following:
    [Screenshot: certificate 1]
  6. Make note of the Thumbprint for the certificate that was created in the anywhere access wizard.
  7. Next assign the certificate to the VPN with the following command, run from the same Cert:\LocalMachine\My location and substituting your own thumbprint:
    Get-ChildItem | ? Thumbprint -eq "C39ED8D5ADC2F73A05A909BE9C4692B43B963FB2" | Set-RemoteAccess
  8. Finally verify the correct certificate is assigned to the VPN with the command:
    Get-RemoteAccess
    [Screenshot: RemoteAccess fixed]

Clients should be able to connect and access resources via the VPN now.
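
For reference, the steps above can be collected into one short script.  Treat it as a sketch rather than a finished tool: it assumes an elevated session on the Essentials server, the address range and thumbprint are the example values from this article and must be replaced with your own, and the $thumbprint and $cert variables are just names chosen here.

```powershell
# Assumption: replace with the thumbprint noted in step 6.
$thumbprint = 'C39ED8D5ADC2F73A05A909BE9C4692B43B963FB2'

# Clear the existing configuration (prompts for confirmation).
Uninstall-RemoteAccess

# Re-deploy the VPN with a static address range (example values).
Install-RemoteAccess -VpnType Vpn -IPAddressRange '192.168.16.100','192.168.16.120'

# Look up the certificate by thumbprint and assign it to the VPN.
$cert = Get-ChildItem -Path Cert:\LocalMachine\My |
    Where-Object Thumbprint -eq $thumbprint
if ($cert) {
    Set-RemoteAccess -SslCertificate $cert
} else {
    Write-Warning "No certificate with thumbprint $thumbprint was found."
}

# Review the resulting configuration, including the SSL certificate.
Get-RemoteAccess
```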

I hope you found this article informative.  If you have any suggestions or comments please leave them below.

Why am I unable to access any resources on my Essentials VPN?

Windows Server Essentials is a great product.  Easy to configure and it uses the existing network infrastructure to save money and resources. There is a situation that I see fairly regularly with the VPN (Virtual Private Network) on Essentials though.  I have seen this issue on all versions of Essentials from 2011 to 2016.

My customer will set up the VPN using the anywhere access wizard and it completes without any errors.  He/she will then test the connection with a client.  The client connects without a problem, but is unable to access any resources on the Essentials network.

The problem is that RRAS (Routing and Remote Access), the VPN server in Windows, is not able to lease an IP from the DHCP server running on the router.  Failing to lease an IP, Windows reverts to using an APIPA (Automatic Private IP Addressing) address.  This will be an IP in the 169.254.0.0/16 subnet.  More likely than not this is on a different subnet than the rest of the Essentials network.  This effectively isolates the VPN client from the Essentials network.
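
A quick way to confirm this symptom is to look at the address the VPN adapter received on the client (ipconfig shows it) and check whether it falls in the APIPA range.  The function below is a hypothetical helper written for this article, not a built-in cmdlet.

```powershell
# Return $true when an IPv4 address is in 169.254.0.0/16 (APIPA).
function Test-ApipaAddress {
    param([Parameter(Mandatory)][string]$Address)
    $bytes = [System.Net.IPAddress]::Parse($Address).GetAddressBytes()
    # A /16 prefix fixes the first two octets: 169 and 254.
    return (($bytes.Length -eq 4) -and ($bytes[0] -eq 169) -and ($bytes[1] -eq 254))
}

Test-ApipaAddress -Address '169.254.12.7'    # an APIPA address
Test-ApipaAddress -Address '192.168.16.101'  # a normal private address
```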

The fix is quite easy on Essentials 2011, 2012, and 2012 R2.  Simply add a static pool to the VPN server configuration.  Here are the steps:

  1. Install the RRAS management console, if not installed.
    • Run Windows PowerShell as administrator
    • Run the following command: Install-WindowsFeature RSAT-RemoteAccess-Mgmt
  2. Run rrasmgmt.msc to launch the RRAS console
  3. Right-click on the server name and choose properties
    [Screenshot: static pool]
  4. Click on the IPv4 tab
  5. Click the radio button for “Static address pool”
  6. Click the “Add” button
  7. Fill in the start and end IP address for the pool.  This should be a range that is not included in the router’s DHCP (Dynamic Host Configuration Protocol) range, but that is part of the same subnet.
  8. Click OK twice.
  9. Restart the Routing and Remote Access service. PowerShell: Restart-Service RemoteAccess
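
Before committing to a pool in step 7, it is worth checking that it does not collide with the router’s DHCP range.  The sketch below uses hypothetical helper functions and made-up example ranges; substitute the values from your own router.

```powershell
# Convert a dotted IPv4 address to a number so ranges can be compared.
function ConvertTo-IPv4Number {
    param([Parameter(Mandatory)][string]$Address)
    $b = [System.Net.IPAddress]::Parse($Address).GetAddressBytes()
    return (([long]$b[0] * 16777216) + ([long]$b[1] * 65536) + ([long]$b[2] * 256) + [long]$b[3])
}

# Return $true when the static pool and the DHCP range share any address.
function Test-RangeOverlap {
    param(
        [Parameter(Mandatory)][string]$PoolStart,
        [Parameter(Mandatory)][string]$PoolEnd,
        [Parameter(Mandatory)][string]$DhcpStart,
        [Parameter(Mandatory)][string]$DhcpEnd
    )
    $poolLow  = ConvertTo-IPv4Number $PoolStart
    $poolHigh = ConvertTo-IPv4Number $PoolEnd
    $dhcpLow  = ConvertTo-IPv4Number $DhcpStart
    $dhcpHigh = ConvertTo-IPv4Number $DhcpEnd
    # Two inclusive ranges overlap when each one starts before the other ends.
    return (($poolLow -le $dhcpHigh) -and ($dhcpLow -le $poolHigh))
}

# Example values only: router DHCP hands out .2 to .99, pool is .100 to .120.
Test-RangeOverlap -PoolStart '192.168.16.100' -PoolEnd '192.168.16.120' `
                  -DhcpStart '192.168.16.2' -DhcpEnd '192.168.16.99'
```

A result of True means the pool needs to move, or the DHCP range on the router needs to shrink.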

For Essentials 2016 the fix is a bit more involved.  Unfortunately the RRAS configuration cannot be edited to simply add a static pool.  The anywhere access wizard in 2016 uses PowerShell to configure RRAS and disables the RRAS console.  This would be fine, but Microsoft neglected to include a PowerShell command to modify the IP address management.  Since the configuration cannot be modified it must be torn down and re-deployed outside the anywhere access wizard.  I may add this to this article in the future.

I hope this article has been informative.  If you have any comments or suggestions, please post them below.

An error has occurred 0x8007….

This article is for those that don’t know that 0x80070002 is “The system cannot find the file specified.” or that 0x80070020 is “The process cannot access the file because it is being used by another process”.  It seems impossible to memorize all the error codes in Windows and what they mean.  Thankfully there is no need to do this, as there is a utility built into Windows to decode them.

To find out what an error code means, open a command window and run slui 0x2a <error code>.  For instance: slui 0x2a 0x80070002.  You will get a popup similar to the following:

[Screenshot: slui 0x2a]

You will need to click Show details.  The description shown is the error code text.
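
If you prefer to stay at the prompt, codes of the form 0x8007xxxx can also be decoded in PowerShell, because the low 16 bits of such a code carry an ordinary Win32 error number.  This is a rough sketch under that assumption; Convert-HResultToWin32 is a name invented for this example.

```powershell
# Split an HRESULT into its facility and its low 16-bit error code.
function Convert-HResultToWin32 {
    param([Parameter(Mandatory)][int]$HResult)
    [pscustomobject]@{
        Facility  = ($HResult -shr 16) -band 0x1FFF
        Win32Code = $HResult -band 0xFFFF
    }
}

$decoded = Convert-HResultToWin32 -HResult 0x80070002
$decoded

# Facility 7 means the code wraps a Win32 error, so the number can be
# turned into text by the operating system.
if ($decoded.Facility -eq 7) {
    (New-Object System.ComponentModel.Win32Exception -ArgumentList $decoded.Win32Code).Message
}
```

For other facilities the low 16 bits are not a Win32 error number, so slui remains the better tool for those.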

I hope you found this article informative.  If you have anything to add please do so in the comments below.

Can you have too many CPU cores?

As I found out today, the answer is yes if you are deploying a Windows role that requires the WID (Windows Internal Database).  Below is the scenario I ran into and how to work around the issue.

I had a customer that was attempting to deploy RDS (Remote Desktop Services).  I say attempting as he was having no luck getting the connection broker to install properly.  The connection broker, session host, and rdweb roles would install, but the session collection was not being created.  Additionally, my customer was not able to manage RDS in Server Manager.  After several attempts at installing and removing the RDS components I noticed that the WID service was taking a very long time to start at boot, and most times it would just hang.  I figured that we might have an issue with the existing OS or possibly a GPO (Group Policy Object), so we isolated the server in an OU (Organizational Unit) with blocked inheritance and then added the newly loaded server to the domain.  The deployment still failed.  We then reloaded Windows.  Upon our first attempt at loading RDS it failed in exactly the same way.

At this point we knew the root of the issue was with the WID.  Searching the Internet turned up an article that alluded to a possible issue with configurations over 32 CPU cores.  My customer’s server is going to be used for a very CPU intensive application, so it was configured with 48 CPU cores (96 logical cores).  Since I was fresh out of ideas on what to try next I removed the WID and RDS components.  I then limited the server to 24 CPU cores through msconfig.  After a reboot we were able to deploy RDS without any problems.  To test, we removed the limit on CPU cores and rebooted.  The WID service then behaved exactly as before.
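
To see how many cores a server exposes before deploying a WID-based role, the processor counts can be summed across sockets.  This is a small sketch using the Win32_Processor CIM class; note that it reports what Windows currently sees, so a limit set in msconfig will lower the numbers.

```powershell
# Sum physical cores and logical processors across all sockets.
$processors = Get-CimInstance -ClassName Win32_Processor
$cores   = ($processors | Measure-Object -Property NumberOfCores -Sum).Sum
$logical = ($processors | Measure-Object -Property NumberOfLogicalProcessors -Sum).Sum

'{0} cores, {1} logical processors' -f $cores, $logical
```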

Now that we had the issue nailed down it was time to find a more permanent fix.  Before I get into that, let me detail the symptoms that were observed.  Hopefully this should help the next person that runs into this issue.

The primary behavior we observed was the WID service hung in a starting state.

Additionally we saw the following event in the application log when the WID finally started while more than 32 cores were exposed:

Process 0:0:0 (0xee8) Worker 0x0000000000 appears to be non-yielding on Scheduler 47....

Finally the SQL error log contained a similar event:

*******************************************************************************
*
* BEGIN STACK DUMP:
* 07/21/17 09:35:26 spid 4268
*
* Non-yielding Scheduler
*
* *******************************************************************************
Stack Signature for the dump is 0x000000000000009C
External dump process return code 0x20000001.
External dump process returned no errors.

Process 0:0:0 (0x780) Worker 0x0000003077802160 appears to be non-yielding on Scheduler 47. Thread creation time: 13145128446017. Approx Thread CPU Used: kernel 62171 ms, user 7281 ms. Process Utilization 4%. System Idle 96%. Interval: 70052 ms.

So how did we fix this?

First we limited the number of CPUs exposed to Windows.  We then loaded SQL Management Studio as my customer was going to load SQL on the server.  We then connected to the WID (\\.\pipe\MICROSOFT##WID\tsql\query).  We set the CPU affinity to only use CPU 0 and CPU 1.  Finally we allowed Windows to see all the CPUs and rebooted.

Here are the steps I would recommend taking to correct this issue.

  1. If the WID and associated roles are loaded, remove them.  This may not be required depending on the role being installed, but it is better to be safe than sorry.
  2. Limit the CPUs exposed to Windows.  The easy way to do this is through msconfig.
    1. Launch msconfig.  Start, Run, msconfig
    2. Click on the Boot tab.
    3. Click Advanced options…
    4. Check the box for Number of processors:
    5. Set the number of processors to 16 or fewer.
    6. Click OK twice and reboot.
  3. Install the Windows role that requires the WID as you normally would.
  4. Add the -P2 parameter to the WID service
    1. Open the services console (start, run, services.msc)
    2. Locate the Windows Internal Database service
    3. Right-click on the Windows Internal Database service and choose properties
    4. In the Start parameters box add “-P2” without quotes and click OK.  (This will limit the WID to 2 CPUs.  If you want more, change the number.)
  5. Remove the CPU limit imposed in step 2.
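
Steps 2 and 5 can also be done without the msconfig GUI.  The sketch below assumes an elevated session and relies on the fact that the Number of processors setting in msconfig corresponds to the numproc boot option; the commands are meant to be run at different stages, not back to back, so the second one is commented out.

```powershell
# Step 2 alternative: limit Windows to 16 processors at the next boot.
# Reboot afterwards for the limit to take effect.
bcdedit.exe /set numproc 16

# After installing the role and adding the start parameter (steps 3 and 4),
# confirm the WID service is running rather than stuck in a starting state.
Get-Service -DisplayName 'Windows Internal Database' |
    Select-Object Name, DisplayName, Status

# Step 5 alternative: remove the limit again, then reboot.
# bcdedit.exe /deletevalue numproc
```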

I would like to thank my colleague Curt for the startup parameter for the WID.  Far easier than loading SQL Management Studio Express.  I hope you found this article informative.  If you have anything to add or any suggestions, please do so in the comments below.