Server 2008 R2 – Roles Failure

So, after performing some Windows Updates I was unable to add or remove roles on my Server 2008 R2 Standard Edition server. It seems that one of the updates causes a problem with some of the .mum files.

I’ve had this a few times in the past but never managed to crack exactly what was going on. Thankfully I found a website with decent instructions and I’ve managed to get the feature working again.

http://crosbysite.blogspot.com/2010/12/viewing-server-2008-r2-roles-shows.html

Well worth a read if you’re having the same problem.

Exchange 2003 and Double Take – The correct way to configure your replication

Exchange 2003 and Double Take can, and do, work very well together, provided you ignore all of the Microsoft best practices for Exchange. What do I mean? Well, if you are unfamiliar with the pain caused by our Exchange and Double Take pair, you can get an idea from this post. The long and short of it is that we had massive stability issues, applied all of the Microsoft best practices and even opened a call to Microsoft to get some answers. After much searching, and getting hold of a really good tech at Double Take, we have our answers and I wanted to share them.

Below are the guidelines for getting Exchange and Double Take working happily together (this seems to be required for all versions; we are happily running the latest build, 5.3.1.593.0, which was previously unstable). There are a few reg keys to check and amend if necessary, as well as a couple to add.

  • Set – “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management”
    Value name: SystemPages
    Value data: 0
  • Add – “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management”
    Value name: PoolUsageMaximum
    Data type: REG_DWORD
    Radix: Decimal
    Value data: 60
  • Add – “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\RepKap\Parameters”
    Value name: DisableKfaiCaching
    Data type: REG_DWORD
    Radix: Decimal
    Value data: 1
  • Remove the /3GB and USERVA=xxxx switches from your boot.ini file if they are present
  • Update your NIC drivers to the latest available
  • Under the adapter advanced properties (R-click, properties – Configure):
    Set Speed & Duplex to “Full”
    Set Checksum Offload to “None”
    Set Large Send Offload to “Disable”
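
For anyone who would rather review the registry changes as a file before applying them, here is a minimal sketch that renders the three values above as .reg text. It only builds a string and never touches the registry. The key paths and data come from the list above; treating SystemPages as a REG_DWORD is my assumption, since the list does not state its type, and the function and variable names are purely illustrative.

```python
# Sketch only: render the three registry values listed above as .reg text.
# Nothing here reads or writes the registry; review the output before importing it.

MEMORY_MANAGEMENT = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Memory Management"
)
REPKAP_PARAMETERS = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\RepKap\Parameters"
)

# (key, value name, decimal DWORD data) as given in the list above.
# Assumption: SystemPages is a REG_DWORD (the list does not state its type).
SETTINGS = [
    (MEMORY_MANAGEMENT, "SystemPages", 0),
    (MEMORY_MANAGEMENT, "PoolUsageMaximum", 60),
    (REPKAP_PARAMETERS, "DisableKfaiCaching", 1),
]


def render_reg_file(settings):
    """Return .reg file text; DWORD data is written as 8 hex digits."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    current_key = None
    for key, name, value in settings:
        if key != current_key:
            if current_key is not None:
                lines.append("")  # blank line between key sections
            lines.append(f"[{key}]")
            current_key = key
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_reg_file(SETTINGS))
```

Note that .reg files store DWORD data in hexadecimal, so the decimal 60 for PoolUsageMaximum appears as 0000003c in the output.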

Doing the above has resulted in a stable Exchange server while running our replication. It all boils down to getting the best distribution of the available memory between Exchange and the kernel while still having enough spare for Double Take to do its thing. I hope this helps someone out there.

The specific ins and outs of the above reg keys are beyond the scope of this blog post. Please do research their meanings before applying them, and make sure you are happy with your own conclusions before you use them. They have worked wonders for us, giving us back a stable, replicated Exchange 2003 server.

This post is provided “as is” and should be used for information only, as every environment is different.

VMware Tools Failure

Just a quick update for today. I had an issue with VMware Tools refusing to uninstall from a server. I ended up deleting registry keys and folders, but have since found a far easier way to manually uninstall VMware Tools, as follows:

  1. Right-click on the virtual machine.
  2. Click Guest > Install/Upgrade VMware Tools.
  3. Open a Console to the virtual machine and log into the guest operating system.
  4. Click Start > Run, type cmd, and click OK to open a command prompt in Windows.
  5. Change the drive to your CD-ROM drive. For example, D:\.
  6. Type setup /c and press Enter to force removal of all registry entries and delete the old version of VMware Tools.
  7. In My Computer, double click the CD-ROM that contains VMware Tools.
  8. After Auto-Run starts, follow the prompts to install.
     Note: This must be done from the GUI. Do not launch the install by running Setup from the Command Prompt.
  9. When the installation is complete, reboot the guest operating system.

Network List Service – Failure

Here’s an interesting one for you. You have Windows Server 2008 configured in an Exchange 2007 cluster. You can RDP to both the active and passive nodes, and the active node can browse the network. The passive node, however, cannot browse the network, and the connection icon in the system tray shows a red ‘X’, signifying that there is no network access. You know this not to be true, as you have already RDP’d to the server. What do you do?

Well, I opened the Network and Sharing Center and was presented with an error I’d never seen before: ‘Network Access may be limited. Your network is not compliant with corporate network requirements’. I instantly suspected a GPO setting was the culprit, until I clicked on ‘more information’ to see what Microsoft could suggest.

Upon clicking the ‘more information’ link I was told that the Network List Service had failed to start, and asked whether I would like to start it. Yes, I would! However, the service failed to start, citing that it stops automatically if it’s not needed, which is a bit of a red herring as that isn’t accurate.

Instead, I found that a DCOM permission needed to be set for the Local Service account so that it could start the service properly. After making the necessary changes the service started first time, I was once more able to browse the network, and the connectivity icon informed me that all was well with the world again.

I took these steps to allow the Local Service account permission on the DCOM object.

  1. Click Start, then Run, type ‘dcomcnfg’ and press OK.
  2. Component Services should start up.
  3. Double click Computers, then My Computer, then click DCOM Config.
  4. Look for the friendly name you’re having the issue with, in this case netprofile.
  5. Right click and select Properties.
  6. Select the Security tab and click Launch and Activation Permissions (you need to set them to customized first).
  7. Click Add, type in the account name, in this case ‘Local Service’, and press OK.
  8. Select the name you just added and see that all 4 boxes are checked on Allow, including:
     • Local Launch
     • Local Activation
  9. Press OK and then OK again, and then close the Component Services window.

So, should you be faced with the same problem you know where the likely culprit lies.

Folder Redirection – The Results

With some extensive testing done, I can finally reveal that folder redirection is faster and more reliable than roaming profiles for the organisation. I’ve conducted 2 tests on separate bits of hardware as a comparison, using the setup technique discussed in earlier blog entries.

  •  Machine: Toshiba Tecra M11-11K
  •  Network Connection: Wireless G – 54Mbps

Step          Roaming Profile    Folder Redirection
Logon #1      00:39              03:13
Logoff #1     00:32              00:12
Logon #2      00:11              00:17
(493MB folder placed on desktop.)
Logoff #2     04:34              00:10
(Local profile cleared from machine, to represent a new logon or alternative machine logon.)
Logon #3      04:30              01:15
Logoff #3     00:39              00:10

  •  Machine: Dell Optiplex 745
  •  Network Connection: Wired 1Gbps

Step          Roaming Profile    Folder Redirection
Logon #1      02:23              02:57
Logoff #1     00:18              00:11
Logon #2      00:35              00:31
(493MB folder placed on desktop.)
Logoff #2     01:26              00:11
(Local profile cleared from machine, to represent a new logon or alternative machine logon.)
Logon #3      02:13              02:32
Logoff #3     00:33              00:13
Logon #4      00:39              00:32
Logoff #4     00:23              00:11
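
To compare the two approaches at a glance, the timings above can be added up. The sketch below is just arithmetic on the figures as recorded (mm:ss converted to seconds); the dictionary layout and function names are my own and not part of any tool used in the tests.

```python
# Sketch: total the logon/logoff times recorded above, in seconds.

def to_seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def total_seconds(times):
    return sum(to_seconds(t) for t in times)


# Times listed in test order (Logon #1, Logoff #1, Logon #2, ...).
TIMINGS = {
    "Toshiba Tecra M11-11K (wireless)": {
        "Roaming Profile": ["00:39", "00:32", "00:11", "04:34", "04:30", "00:39"],
        "Folder Redirection": ["03:13", "00:12", "00:17", "00:10", "01:15", "00:10"],
    },
    "Dell Optiplex 745 (wired)": {
        "Roaming Profile": ["02:23", "00:18", "00:35", "01:26",
                            "02:13", "00:33", "00:39", "00:23"],
        "Folder Redirection": ["02:57", "00:11", "00:31", "00:11",
                               "02:32", "00:13", "00:32", "00:11"],
    },
}

if __name__ == "__main__":
    for machine, methods in TIMINGS.items():
        for method, times in methods.items():
            print(f"{machine}: {method} total {total_seconds(times)}s")
```

On these figures the totals come out at 665s (roaming) against 317s (FR) on the Toshiba, and 510s against 438s on the Dell. Bear in mind these are single runs, so treat the totals as indicative only.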

It’s worth noting that I did have problems with other GPOs clashing with the FR GPO and causing it not to run. Thankfully the GPO modelling wizard helped identify exactly which ones were causing the issue, and they’ve since been removed as they were redundant anyway.

Overall I’m pleased with the results from Folder Redirection. Once a user has logged on and the initial sync for Offline Files and Folders is complete, the entire experience is much quicker and more robust than roaming profiles. Plus, over time a roaming profile will bloat as users inevitably pile more and more useless data into it. Folder Redirection appears to be a consistent winner on logoff times, because the background sync means any locally stored data is already at the FR destination, preempting the logoff sync.

In the end I opted to redirect the following folders:

  • AppData
  • Contacts
  • Desktop
  • Favorites
  • My Documents
  • My Music
  • My Pictures
  • My Videos
No doubt if I exclude AppData I could increase the logon/off speed even further, but I’d prefer for now if that data followed the users around.
Next week I plan to implement this solution for a set of laptop users that connect via WiFi, to see if I can improve their logon/off times, some of which are currently reported to take from 10 minutes up to 2 hours in the worst cases.
To finish, here’s a picture of Monty Python, LEGO style.

Microsoft – Mouse Without Borders

It’s not very often that I get excited about a new piece of software that’s released, but today was an exception. I have just been shown a nifty little tool by Microsoft called Mouse Without Borders, and I love it!

So what is Mouse Without Borders? Well, it’s a little bit of software that Microsoft have put together that allows you to control multiple PCs from one keyboard and mouse. Why am I excited about that? Well, it’s simple: I have my work laptop with a dock that comes and goes with me, and a multi-screen server which has permanent residence under my desk. Until today that meant having two keyboards and two mice littering my desk so that I could work on both as needed. Not any more! I have just installed it, and the second keyboard and mouse have now been moved under my desk. No more typing on the wrong keyboard, or whizzing my mouse over and watching it clang against the side of my monitor, refusing to jump to the next screen.

Mouse Without Borders is light (1.1MB download) and runs quite happily in your system tray. Task Manager shows it using around 30MB of memory. You get a complex code for pairing up with your other PCs and you also gain the ability to drag from one PC to another (files that is, sadly not active windows ;) ), which is really handy (currently tested only with a text file, but the theory seems sound).

You can read more at the TechNet blog, which also has a link to download the installer. This is one that’s going to be added to my personal “toolbox” of bits that I keep on a flash drive, right alongside all the PSTools.

Exchange 2003 – Loss of stability and reliability

After spending some time tackling a stability issue on our Exchange 2003 cluster we have finally found some light at the end of the tunnel.

The symptoms

  • Cluster resources failing over to the 2nd node, with nothing logged in the event logs
  • Clients not being able to use Outlook, or actions taking in excess of 5 minutes (opening emails etc.)
  • HTTP Virtual Server Instance going into a failed state and refusing to start
  • Exchange Routing Services Instance failing and causing the cluster to fail over
  • Error 9582 logged in the event log a few days before the failure

With what few clues we were being left in the various logs, we agreed that there must be some kind of memory issue going on here. A fresh reboot of either node would allow the service to run for 1 – 5 days without issue; try to fail it back without a reboot and we were in trouble.

Leaning on the memory theory I came across this TechNet article (Optimizing Memory Usage for Exchange Server 2003), so I set about getting some performance logs set up, and boy were we shocked. Some of our counters were fine, others were cause for concern.

  • Pool Paged Bytes: okay, under the 200MB mark. Not a lot of headroom, but not a concern given the other counters below.
  • Pool Nonpaged Bytes: also okay. In fact this counter has been consistent all the way through this.
  • Working Set: good, a nice steady line, with the odd increase followed by a decrease once memory was released by processes.
  • Free System Page Table Entries: oh dear, there’s not much more to say than that. Microsoft advise that below 3000 indicates a problem, so boy did we have a problem! Ours was riding around the 2000 mark, sometimes dipping to 1800. This was going to be an area that needed some attention.
  • VM Largest Block Size: another cause for alarm. Our first sample recorded this at around 70MB. Okay, so it’s not at the 32MB “must restart the server ASAP” situation, but remember that this was a few hours after a fresh reboot. It’s not giving us much room to move in and is well below the 200MB warning level advised in the above article.
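
As a rough illustration of how those two worrying counters could be checked automatically, here is a minimal sketch that compares sampled values against the thresholds quoted above (3000 free PTEs, and 200MB / 32MB for VM Largest Block Size). The function name and the idea of feeding it sampled values are assumptions of mine; collecting the counters themselves is left to whatever performance logging you already have.

```python
# Sketch: flag counter samples against the thresholds quoted in the post.

MB = 1024 * 1024

FREE_PTE_PROBLEM = 3000          # below this indicates a problem
VM_BLOCK_WARNING = 200 * MB      # below this is the warning level
VM_BLOCK_CRITICAL = 32 * MB      # at or below this, restart the server ASAP


def assess_counters(free_ptes, vm_largest_block_bytes):
    """Return a list of findings; an empty list means both counters look healthy."""
    findings = []
    if free_ptes < FREE_PTE_PROBLEM:
        findings.append(f"Free System PTEs low: {free_ptes}")
    if vm_largest_block_bytes <= VM_BLOCK_CRITICAL:
        findings.append("VM Largest Block Size critical")
    elif vm_largest_block_bytes < VM_BLOCK_WARNING:
        findings.append("VM Largest Block Size below warning level")
    return findings


if __name__ == "__main__":
    # Values similar to the first samples described above.
    for finding in assess_counters(2000, 70 * MB):
        print(finding)
```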

With this data in hand we set about implementing the tweaks suggested above to get this server happy again. After much tweaking we finally got the server into a happy state: Free System Page Table Entries sitting around 20,000 and the VM Largest Block Size ranging between 800MB and 400MB depending on uptime (so far 400MB has been our lowest). This was great, but the problem didn’t go away. We even removed physical memory from the server (down to 8GB) and burnt a further 4GB to leave the server with 4GB in total, but it still wasn’t happy.

Time for a support call. We opened a call with Microsoft, explained our problem and got a technician on the phone less than 4 hours later. Part of the data gathering process involved going through all the installed applications and, for the non-Microsoft ones, what they were for. The light came a couple of days after this, while I was sitting down with my manager as Microsoft crunched their data scrape: Double Take. The technician mentioned in passing that he had had trouble with Double Take in the past, so he was going to flag it as a “needs looking at”. We pulled all of our alert history for these servers to track down the exact date that this all started: within a week of a Double Take version upgrade.

With the evidence in hand we went and got authorisation to stop replicating our Exchange over to our DR site for a couple of weeks, and removed Double Take from both nodes. That was Thursday 15th and, so far, Exchange has been up and processing mail ever since. We are going to leave it another week before we break out the bubbly, but the initial findings look good.

—Update—
See this post for information on how we finally got this issue resolved.

Folder Redirection and best practice

Continuing on from my initial work with FR and OF&F, I came across a TechNet article on best practice for FR.

http://technet.microsoft.com/en-us/library/cc739647(WS.10).aspx

The key point is that you really must use UNC paths for folder redirection and not mapped drives. The reason is that mapped drives are processed after FR in the logon process, and as they won’t be mapped at that point FR will in fact fall flat on its face. A caveat for UNC paths is that they must not exceed 260 characters in length; doing so will cause more problems and will prevent FR from taking place.
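
Both rules are easy to check before a path ever goes into a GPO. The following is a minimal sketch, assuming the target is available as a plain string; the function name and the example server and share names are hypothetical.

```python
# Sketch: check a folder redirection target against the two rules above.
import re

MAX_PATH_LENGTH = 260
_DRIVE_LETTER = re.compile(r"^[A-Za-z]:")


def check_redirection_target(path):
    """Return a list of problems; an empty list means the path passes both checks."""
    problems = []
    if _DRIVE_LETTER.match(path):
        problems.append("uses a drive letter (mapped drives are not available when FR runs)")
    elif not path.startswith("\\\\"):
        problems.append("is not a UNC path")
    if len(path) > MAX_PATH_LENGTH:
        problems.append(f"is {len(path)} characters long (limit {MAX_PATH_LENGTH})")
    return problems


if __name__ == "__main__":
    # Hypothetical example paths.
    for candidate in (r"\\fileserver\redirect\bcampbell\Documents", r"H:\Documents"):
        print(candidate, check_redirection_target(candidate) or "OK")
```

Remember that the length check needs to be run against the longest full path a user is likely to end up with, not just the root of the share.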

Another point was that client side caching should be enabled to ensure that files are synced correctly during the logoff process. This can be enabled as a user or computer GPO setting, but bear in mind that if you have both then the computer GPO setting takes precedence over the user setting. For my test lab I’ve kept the setting in the computer portion of the GPO and put the computer object in its own OU to avoid problems. (Note: I’ve since found out that this is only relevant to Server 2000/2003 and XP, so irrelevant for my Server 2008 R2 and Windows 7 lab. Instead you simply enable OF&F on the source share and specify what/where is to be made available. I did this on a CIFS server residing on an EMC SAN and made the entire structure available offline, something I will have to test and re-address further down the line.)

I’ve also enabled FR for the ‘My Pictures’ folder and specified that it follows the ‘My Documents’ folder, basically becoming a sub folder.

At this point I’ve hit my first problem. I’ve forced a GP update and have since rebooted the client machine. Now when I browse the documents folder I get the following error: ‘Some library features are unavailable due to unsupported library locations.’ Joy! This is because Windows 7 uses libraries rather than specific file locations (think of the library as an index of locations), which causes problems when you specify UNC paths as part of that library. Thankfully I can still see my documents, but the error is not going to help when users are faced with a message that they won’t understand.

There is a fix for this problem that involves creating a local folder, redirecting to that, then deleting it and relinking to a UNC path, which seems like an incredibly crappy solution on MS’s part.

MS Fix - http://social.technet.microsoft.com/Forums/en/w7itproui/thread/8975fb07-26ea-455e-b8b9-40bf33662502

Still, we can suppress that error message to prevent it from confusing users. The GPO setting is ‘User Configuration/Policies/Administrative Templates/Windows Components/Windows Explorer – Turn off Windows Libraries features that rely on indexed file data’.

Be aware that by doing this however you are disabling the library index/search feature. I am doing this because the UNC location I will be referring to resides on a SAN. If your location resides on a Windows File Server I would just enable searching and indexing, which will also get rid of the message (as it will be configured correctly, hurrah!).

Now, onto understanding the rest of the redirectable attributes of the user profile and whether or not they should be redirected and if they are to be, what’s best practice?

But before I do, it’s time for tea! (You should probably understand that I write this blog as I work, so I’ve actually been writing this for 2 hours so far and have deleted just as much text as I’ve written. If I wrote this all up at the end of the day it would probably be a one-sentence entry entitled ‘Enabled FR for the organisation today, lurp-a-durp’).

So, enabling ‘My Pictures’ to be redirected and follow ‘My Documents’ has caused an issue. If you click on the ‘Pictures’ entry in the library in W7 then you do indeed see all of your pictures. However, if you navigate to it via the ‘My Documents’ folder you in fact get nothing. This is because the ‘Pictures’ library entry actually points to 2 locations: the redirected folder and the public pictures library location of C:\Users\Public. I need to remove the local library location using a GPO entry so that only the redirected folder is available when navigating to Pictures, otherwise it will cause more user strife. I can see this being an issue for every redirected folder following the Documents folder.

Rather than have ‘My Pictures/Music/Videos’ all follow the documents folder I have instead linked them directly in the GPO. The ‘follow’ function makes more sense for XP workstations as in that iteration of Windows they did actually nest inside the documents folder. In Windows 7 they are a separate entity within the library. So for Music/Videos/Pictures I have set the following:

GPO Setting: User Configuration/Policies/Windows Settings/Folder Redirection/<folder>

Target

  • Target: Basic – Redirect everyone’s folder to the same location.
  • Target folder location: Create a folder for each user under the root path.
  • Root Path: \\<Servername>\<Sharename>\

Settings

  • Unselected – Grant the user exclusive rights to <folder>
  • Selected – Move the contents of <folder> to the new location.
  • Unselected – Also apply redirection policy to Windows 2000, Windows 2000 Server, Windows XP, and Windows 2003 operating systems.

Policy Removal

  • Selected – Leave the folder in the new location when policy is removed.
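
To make the ‘Create a folder for each user under the root path’ option a little more concrete, here is a small sketch of the path each user should end up with, on the assumption that the result takes the form root path, then user name, then folder name. The server, share and user names below are hypothetical placeholders, not values from my lab.

```python
# Sketch: the per-user target that "Create a folder for each user under the
# root path" is assumed to produce: <root>\<username>\<folder>.

def redirected_path(root, username, folder):
    """Join the root path, user name and folder name with single backslashes."""
    return root.rstrip("\\") + "\\" + username + "\\" + folder


if __name__ == "__main__":
    root = "\\\\fileserver\\redirect\\"   # hypothetical \\<Servername>\<Sharename>\
    for folder in ("Music", "Videos", "Pictures"):
        print(redirected_path(root, "bcampbell", folder))
```

Printing the expected paths like this is a handy sanity check when browsing the share afterwards to confirm the policy has applied.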

I have ended up enabling FR for the following: AppData (Roaming) / Documents / Pictures / Music / Videos / Favorites / Contacts. I did not enable FR for: Desktop / Downloads / Start Menu / Links / Searches / Saved Games.

I’m leaving FR there for the time being. I’ll continue to update the blog as I work on it, but for now it’s working very satisfactorily.

Roaming Profiles? Clearly not…

So, having worked in the education sector for quite some time now, one of the biggest banes of our existence has been Microsoft’s roaming profiles. I’ve battled with these in the past and lost countless hours/days/months/years to troubleshooting and configuring the beast that is the roaming profile. So much so, in fact, that I’ve finally had an ‘official’ response from Microsoft stating that “they don’t work as you intend to use them”. In other words, because we have so many users (students), and those users can use upwards of 5 different machines each day, we are not using the technology as Microsoft intended it to be used.

“Pardon?”

Yes, that’s right, how dare you ‘roam’ from machine to machine and expect roaming profiles to work. It’s a downright ridiculous idea! The basic premise is that at any time there could be an issue that will cause the roaming profile not to sync correctly. In fact, Microsoft recommended to me that roaming profiles should only be used by people that occasionally roam, and that we’d actually be better off using folder redirection and offline files and folders. So there you have it, after all these years I finally had a coherent response from the company themselves. Hurrah!

With that in mind I’ve started testing folder redirection and offline files and folders (hereafter referred to as FR and OF&F). After tackling the appropriate resource kit (Windows Server 2008 Active Directory, for those of you interested) I had a grasp of what it was MS wanted me to do, and I’ve since set about setting up a test lab for FR and OF&F. This webpage was particularly useful, as it detailed both best practice for share and security permissions and an idiot’s guide to the location of FR in a group policy object.

http://www.itechtalk.com/thread1958.html

So with that in mind my user (Bruce Campbell) was created and he was doubly blessed with a redirected documents folder that was made available offline. Groovy.

I’ll update this blog in stages as I progress with the FR and OF&F set-up, but for now I have the basics working correctly.

For info, if you are setting up share and security permissions, here’s one of the better methods of securing them (rather than using the ‘everyone = read’ copout).

Security Permissions:

  1. Configure the folder to not inherit permissions and remove all existing permissions.
  2. Add the file server’s local Administrators group with Full Control of This Folder, Subfolders, and Files.
  3. Add the Domain Admins domain security group with Full Control of This Folder, Subfolders, and Files.
  4. Add the System account with Full Control of This Folder, Subfolders, and Files.
  5. Add the Creator/Owner with Full Control of Subfolders and Files.
  6. Add the Authenticated Users group with both List Folder/Read Data and Create Folders/Append Data – This Folder Only rights. The Authenticated Users group can be replaced with the desired group, but do not choose the Everyone group as a best practice.
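
If you would rather script those permissions than click through the Security tab, the sketch below builds a set of icacls command lines that roughly correspond to the steps above. It only generates text and does not run anything. The folder path and domain name are hypothetical, the mapping of each step to icacls rights is my own interpretation, and /inheritance:r only strips inherited entries, so any other explicit entries already on the folder would still need removing by hand. Check the commands against a test folder before trusting them.

```python
# Sketch: build icacls command lines approximating the permission steps above.
# Nothing is executed; the commands are returned as strings for review.

def icacls_commands(folder, domain):
    grants = [
        # Full Control: this folder, subfolders and files.
        ("BUILTIN\\Administrators", "(OI)(CI)F"),
        (domain + "\\Domain Admins", "(OI)(CI)F"),
        ("NT AUTHORITY\\SYSTEM", "(OI)(CI)F"),
        # Full Control: subfolders and files only (inherit-only).
        ("CREATOR OWNER", "(OI)(CI)(IO)F"),
        # List Folder/Read Data and Create Folders/Append Data: this folder only.
        ("NT AUTHORITY\\Authenticated Users", "(RD,AD)"),
    ]
    commands = [
        f'icacls "{folder}" /grant "{who}:{rights}"' for who, rights in grants
    ]
    # Remove inherited entries last, so access is never lost part-way through.
    commands.append(f'icacls "{folder}" /inheritance:r')
    return commands


if __name__ == "__main__":
    # Hypothetical folder and domain names.
    for command in icacls_commands("D:\\Shares\\Redirect", "EXAMPLE"):
        print(command)
```

The grants are deliberately issued before inheritance is removed, which differs from the order of the steps above but avoids briefly locking yourself out of the folder.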

The share permissions of the folder can be configured to grant administrators Full Control and authenticated users Change permissions.

The GPO settings for FR can be found under the User Configuration node: expand Policies, expand Windows Settings, and select the Folder Redirection node.

Invalid Virtual Machine Configuration? No it’s not, I can see it right there!

After some network infrastructure maintenance at the weekend (firmware upgrades) I came in on Monday morning to my backup guy disturbing my first cup of tea of the day and telling me that a whole plethora of VMs were failing within VRanger (the VM backup software we use at the organisation). I did a quick investigation and found that not only were the backups failing, but other errors were also being logged by the VMs running on our ESXi infrastructure.

Errors in the Task log were reported as seen below:

Create virtual machine snapshot
<SERVERNAME>
Invalid virtual machine configuration.
<DOMAIN/USERNAME>
<FQDN SERVERNAME>
19/09/2011 14:00:18
19/09/2011 14:00:18
19/09/2011 14:00:20

And the event log was showing the following:

Clearly something was causing the problem: snapshots were failing and the guest VMs were also reporting that files were locked and not accessible.

At first I tried rebooting a single guest VM to see if that would clear away the locked .vmdk files, but that was unsuccessful. Upon further investigation I discovered that all the affected VMs were running on the same host, which led me to believe that was the problem. I put the host into maintenance mode, migrated the guest VMs, rebooted the host and finally migrated the guest VMs back again. I then made the call to my backup guy, who confirmed that backups had sprung back to life.
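
Spotting that common host is the step that cracked it, and it is easy to do mechanically if you have a list of failing VMs and know where each one runs. Here is a minimal sketch of that grouping; the VM and host names are made up, and in practice the two inputs would come from your backup job report and your inventory.

```python
# Sketch: count failed VMs per host to see whether one host stands out.
from collections import Counter


def failures_by_host(failed_vms, vm_to_host):
    """Return a Counter mapping host name to number of failed VMs on it."""
    return Counter(vm_to_host[vm] for vm in failed_vms)


if __name__ == "__main__":
    # Hypothetical inventory and failure list.
    vm_to_host = {
        "mail01": "esx-host-a",
        "file01": "esx-host-a",
        "web01": "esx-host-b",
        "sql01": "esx-host-a",
    }
    failed = ["mail01", "file01", "sql01"]
    for host, count in failures_by_host(failed, vm_to_host).most_common():
        print(host, count)
```

If every failure lands on one host, as it did here, that host is the first thing to investigate.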

So, whatever was causing the issue, it was certainly host related. Unfortunately the host logged no warnings or errors beyond the same individual errors being displayed on the guest VMs. All I do know is that the error was host specific and was ‘possibly’ caused by the network maintenance that was carried out over the weekend. Either way, I had enabled backups to continue after approximately 40 minutes of troubleshooting, all was well with the world again and I could get back to my cup of tea.

As a result of this I thought it was about time I wrote down any live fixes I implement from now on, more as a reference for myself than the outside world, but I do hope at least someone manages to fix a problem thanks to what they’ve read here.

Reading other websites after I solved the problem, I noticed that people experienced this error if they attempted to do anything with the guest VM, so it wasn’t just VRanger specific. Other people had also had problems powering on guests, adding and removing hardware, etc. The bottom line was that, with all these problems, everyone came to the same conclusion: the root cause was host specific and performing a clean reboot fixed the error.