Basic Troubleshooting of Windows 2008 Server


Boot Menu Options

Repair Your Computer
Along with the options that are usually presented when you press the F8 key, a new option has been introduced: Repair Your Computer. It appears only if Windows RE is installed locally on the server.

Safe Mode
This option starts the server with a reduced set of drivers and services. It is usually used when you are facing issues with device drivers. As the server boots, the drivers being loaded are listed on the screen. The drivers loaded in this mode include those for the CD drive and hard disk; the services include Event Log, Plug and Play, RPC, WMI, and Cryptographic Services.

Safe Mode with Networking
This is like Safe Mode, but it also loads the network drivers and the networking services.

Safe Mode with Command Prompt
This option gives you a command-line interface (cmd) instead of the graphical shell, for when you want to run commands.

Enable Boot Logging
This option is used for debugging. It records which drivers loaded and which failed to load. The log is written to the ntbtlog.txt file in the Windows folder.

Enable Low Resolution Video
This option starts the server in 640x480 mode. It is useful for troubleshooting display problems caused by a resolution that is set too high.

Last Known Good Configuration
If you made changes to the server and it no longer lets you log in to Windows, try this option. It is useful if you installed drivers and are unable to log in after the reboot.

Directory Services Restore Mode
This option is used only on Active Directory servers. It is used to restore AD with either method (authoritative or non-authoritative).

Debugging Mode
This mode is widely used by developers.

Disable Automatic Restart on System Failure
By default the server reboots when a system failure occurs. This option lets you prevent that automatic restart.

Disable Driver Signature Enforcement
With this option you can allow unsigned drivers to load on the server; however, the recommendation is to use signed drivers wherever possible.

Start Windows Normally
If you pressed F8 by mistake and want to start Windows in normal mode, choose this option.

What is Windows Recovery Environment?
Windows RE is part of the Windows Server 2008 installation media. To use it, boot from the CD and choose Repair your computer from the setup screens. It checks the integrity of hardware, drivers, and so on.

Tools Section

Reliability Monitor
Introduced in Windows Server 2008, Reliability Monitor indicates system stability based on hardware, Windows, and other failures.

Event Viewer
Event Viewer can give you a lot of information. There are three critical logs available in Windows Server 2008:
- Application log: where application-related information is written
- Security log: where security information is recorded
- System log: where system components write events (for example, hardware events)

MSconfig
This utility provides system configuration information. You can use it to choose which startup mode (for example, diagnostic startup) to use for troubleshooting.

Troubleshooting Windows Server 2008 R2 Failover Clusters
Top things to look for
Jul. 19, 2012
John Marlin | Windows IT Pro

I want to discuss some of the troubleshooting techniques that we use with Windows Server 2008 R2 failover clusters. There are many ways to troubleshoot clusters, and some engineers might do things that others might not. So I wanted to pass along some of the most common things to look for and where to look for them. With that in mind, let's first talk about the files that you'll generally be looking at and their descriptions.

One of the first things you'll be working with is Failover Cluster Manager, the new interface for managing a cluster. With this tool, you'll be managing groups and resources as well as performing some troubleshooting, which I'll explain as I go along. Failover Cluster Manager can be accessed from the Start menu under Administrative Tools.

Event Channels
You're probably familiar with the System event log. It's where we log critical, error, and warning events. However, it's not the only event log location that we write to. Starting in Server 2008, there are additional event channels. Figure 1 shows where to find the channels relevant to failover clustering. Here is where we'll log all the informational-type events and debug/diagnostic events. You'll find the following list of logs and their channels:

Figure 1: Channels relevant to failover clustering

FailoverClustering
- Diagnostic (if Show Analytic and Debug Logs is selected)
- Operational
- Performance-CSV (if Show Analytic and Debug Logs is selected)
FailoverClustering-Client
- Diagnostic (if Show Analytic and Debug Logs is selected)
FailoverClustering-Manager
- Admin
- Diagnostic (if Show Analytic and Debug Logs is selected)
FailoverClustering-WMIProvider
- Admin
- Diagnostic (if Show Analytic and Debug Logs is selected)

If you're starting or stopping the cluster service, moving groups, bringing groups online or offline, and so on, those events will be logged in the FailoverClustering\Operational log. For example:

Event ID: 1061
Description: The Cluster Service successfully formed the failover Cluster JohnsCluster

Any failures connecting to other nodes when opening Failover Cluster Manager are logged in FailoverClustering-Manager\Admin. For example:

Event ID: 4684
Description: Failover Cluster Manager could not contact the DNS Servers to resolve name W2K8-R2-NODE2.contoso.com. For more information see the Failover Cluster Manager Diagnostics channel.

If you look at the FailoverClustering-Manager\Diagnostic log, you would see this:

Event ID: 4609
Description: An error was encountered while attempting to ping W2K8-R2-NODE2.contoso.com. System.ApplicationException: Could not contact one or more DNS Servers. Please verify that DNS configuration is correct and the machine is fully connected to the network.

Event ID: 4612
Description: Server W2K8-R2-NODE2.contoso.com ping failed.

Just from these events, you can see that there is a problem with the node getting to the DNS server, and you can start troubleshooting this specific problem. Without looking at these logs, all you might see is W2K8-R2-NODE2 showing as down in Failover Cluster Manager. (One of the other logs mentioned above is the FailoverClustering\Diagnostic log. I'll discuss this log a bit later.)

Failover Cluster Manager
To make things a bit easier, you can also view system event errors and warnings from within Failover Cluster Manager. On the main page in the middle pane, there is a Recent Cluster Events link that you can select, as Figure 2 shows. This link provides a handy way to display all warnings and errors that have occurred with Failover Cluster as the source in the past 24 hours. It pulls these events from all nodes and gives you everything in one spot. So there's no need to go to multiple machines and have multiple event logs open that you must switch between.

Figure 2: Recent Cluster Events

You can use the Query option to look for specific events. On the main page in the left pane, you'll see Cluster Events. You can right-click Cluster Events and choose Query, or you can select Query from the Actions pane on the right. Figure 3 shows the Cluster Events Filter.

Figure 3: The Cluster Events Filter

This is also a good way to display everything in the same location. For example, suppose you're experiencing the failure of a disk resource. You can bring up Failover Cluster Manager and have it query all nodes, the System event log, the error level, and the specific date. On the main page, you can see when the disk failed, on what node(s) it failed, and any other pertinent data (such as disk events where a path failed). You also have the ability to save these queries for later use.

You have two more options for looking up events. You can look up all resource-failure events for anything in a group, or you can be resource-specific. In the Actions menu, which Figure 4 shows, you can select Show the critical events for this application (any resource in the group) or Show the critical events for this resource (only the specific resource). Doing so will bring up the query for any of the events in the current event logs on all nodes. This option can also be beneficial for determining history and whether the event can be narrowed down to a specific time period or node.

Figure 4: The Failover Cluster Manager's Actions menu

For those who remember the Windows 2003 Server Cluster days, the FailoverClustering\Diagnostic log mentioned earlier is the Cluster.Log equivalent. Starting in Server 2008 failover clustering, the functionality is more in line with Event Tracing for Windows (ETW). Instead of writing to a Cluster.Log text file, the cluster service writes to a diagnostic log located in the C:\Windows\System32\winevt\logs folder. There are three diagnostic logs that we write to (clusterlog.etl.001, clusterlog.etl.002, and clusterlog.etl.003). We're only going to write to one of these at a time on any given boot. For more information about these log files and how they're used, check out the "Understanding the Cluster Debug Log in 2008" blog post.

This log is enabled and always writing. If you right-click FailoverClustering\Diagnostic and select Disable log, you can see all the events it has written. If you disable this log, the system will no longer write to it and information won't be saved. If you do this, it's best to save the events out as an event log or text file and enable the log again. There are essentially three main events you'll see:
- Event 2049 is an informational event.
- Event 2050 is a warning.
- Event 2051 is an error.

These events will only be from the current diagnostic .ETL file being written to. You'll see the event information just as you would in the System or Application event log. However, each event will be only one line at a time, so going event by event through this diagnostic event log can be pretty tedious. You can create a Cluster.Log text file with commands that combine all three of these logs into one to make reviewing it much easier.

The PowerShell Get-ClusterLog command goes out to all nodes, generates a Cluster.Log on each node, and places it in the C:\Windows\Cluster\Reports folder. This would be the Cluster.Log you might be more familiar with from Windows 2003. There are Get-ClusterLog switches you might want to consider, depending on the circumstances. For example, say you can reproduce a failure at will and need to find the reason for the failure. Simply reproduce the problem and use the command Get-ClusterLog -TimeSpan 5 to get data from the past 5 minutes. Because you need only the log from the one node you reproduce the problem on, you could add the -Node Nodename switch to create the Cluster.Log on this single node. If you have a number of nodes and need to send these logs, it might take some time to connect to each node to get the file. In these circumstances, you could use the -Destination switch. This switch creates a Cluster.Log for each node, copies it to a folder you specify, and tags the name of the machine as part of the file name (e.g., W2K8-R2-Node1_Cluster.Log).

Remember that the Cluster.Log you're creating is a snapshot in time. It will take what's there right now and won't update with anything after it's generated. When it's generated, if there's already a Cluster.Log in the Reports folder, it will get deleted to make room for the new one.

Resource Host System
The next thing I wanted to discuss is the Resource Host System (RHS). One of its responsibilities is to monitor the health of all resources in the cluster. It does this through a series of checks (basic and thorough). If a resource doesn't respond to these checks, RHS will issue the following system event:

Event ID: 1230
Description: Cluster resource 'Cluster Disk 1' (resource type '', DLL 'clusres.dll') either crashed or deadlocked.

In this instance, the disk didn't respond to the health check that was made. What the cluster will do is fail the resource and restart it to get you back to production. If these checks weren't in place, it could lead to a hung machine or no connectivity from a client application.

When troubleshooting the RHS event, you must consider the resource. If a disk deadlocks, you would need to consider everything in the disk stack. Was there slow disk I/O? Did you lose a path to the drive? This would be the focus of your troubleshooting. So, next up is reviewing the System event log for disk-related events, looking at Performance Monitor, updating drivers, and so on. If the resource was an IP address or a network name, your focus would be the network stack and everything there.

Cluster Validate
The last thing I want to mention is the Cluster Validate Report. For a cluster to be certified, all components must be listed on the Windows Server Catalog, and it must pass a full Cluster Validate. Many people will run Cluster Validate before the cluster is created or just after. However, if there is a problem later on, few people remember to run Cluster Validate. You can use it as a troubleshooting tool! If you're having some disk problems, run the Storage Tests. If you're having network-communication problems, run the Network Tests. You can also use Cluster Validate to get information about groups, resources, and settings for your currently running failover cluster to be referenced at a later time.

The nice thing about Cluster Validate is that you can run it even while in production. When you run it and select the Storage Tests, it will ask if you want to take the running groups offline, as you see in Figure 5. The default setting is to leave the online groups alone, so production won't be affected. For the Storage Tests, it will test disks that are:
- In groups that are offline
- In the available storage group
- Not a part of the cluster

Figure 5: Running Cluster Validate

Each time you run Cluster Validate, it will create a file in the C:\Windows\Cluster\Reports directory and will tag the date and time as part of the file name. So, every time you run it, it will create a new file, and it will create the file on all nodes that Cluster Validate was run against.

There are other ways to troubleshoot failover clusters; I just don't have enough space to cover them all. However, this column should get you started for most of the problems you may face. For more information, check out the Ask the Core Team blog and the Clustering and High Availability blog. Happy clustering!
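Once Get-ClusterLog has produced a Cluster.Log text file, reviewing it usually starts with pulling out the warning and error lines. The following is a minimal Python sketch of that step, not part of the original column. It assumes each log line has the form processId.threadId::timestamp LEVEL message, with level tokens such as INFO, WARN, and ERR; the sample lines and the names LINE_PATTERN and important_entries are hypothetical and only illustrate the idea.

```python
import re

# Assumed line layout: <hex pid>.<hex tid>::<timestamp> <LEVEL> <message>
LINE_PATTERN = re.compile(
    r"^(?P<pid>[0-9a-fA-F]+)\.(?P<tid>[0-9a-fA-F]+)::"
    r"(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)

# Hypothetical sample lines; real Cluster.Log content will differ.
SAMPLE_LINES = [
    "00000a1c.00000f30::2012/07/19-14:03:10.120 INFO  [NM] Node 1 is up",
    "00000a1c.00000f30::2012/07/19-14:03:11.482 WARN  [RHS] Cluster Disk 1 is slow to respond",
    "00000a1c.00000f34::2012/07/19-14:03:12.001 ERR   [RCM] Cluster Disk 1 failed",
]


def important_entries(lines, levels=("WARN", "ERR")):
    """Return the parsed entries whose level is in `levels`."""
    entries = []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        if match.group("level") in levels:
            entries.append(
                {
                    "timestamp": match.group("timestamp"),
                    "level": match.group("level"),
                    "message": match.group("message"),
                }
            )
    return entries


entries = important_entries(SAMPLE_LINES)
for entry in entries:
    print(entry["timestamp"], entry["level"], entry["message"])
```

To run this against a real file, the list of sample lines could be replaced with the lines read from a log copied out with the -Destination switch, after confirming that the file's layout matches the assumption above.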

Five Quick Links: Windows Server troubleshooting

From service failures to blue screens of death, bringing a Windows Server back to life is a feat for any admin trying to maintain peace within the OS, and while experience helps streamline the process, a little guidance doesn't hurt either.

Here are five quick links that will help you return your Windows Server to good health in no time, including helpful tips and the best tools for server outages.

For more tips and tricks, check out our troubleshooting topic page.

1. Tackling the top Windows server crashes
Dated antivirus software, unsuitable storage drivers and excessive filter drivers are some of the most common causes of Windows Server crashes. Expert Bruce Mackenzie-Low breaks down each pitfall and how to avoid them.

2. How to reconcile a hung Windows server
There are several steps that go into troubleshooting Windows Server hangs, but before admins can figure out the cause of a hung server, they have to lay the groundwork.

3. Windows tools mend application crashes and hangs
Misbehaving Windows apps can cause headaches for admins trying to keep the OS running smoothly, so Microsoft has developed a tool belt of free utilities to help diagnose and repair Windows outages.

4. Debugging Windows print spooler crashes
While the cause of a Windows print spooler crash isn't always obvious, a utility called ADPlus may be the key to pinpointing the culprit.

5. The best utilities for plugging Windows memory leaks
Putting a cork in Windows memory leaks involves 24/7 monitoring, and tools like Perfmon and the Windows Kernel Debugger are just a few of the tools that can help get the job done.

Troubleshooting Failed Requests Using Tracing in IIS 7
By IIS Team
December 12, 2007

Introduction
Request-based tracing is available both in stand-alone IIS Servers and on Windows Azure Web Sites (WAWS). It provides a way to determine what exactly is happening with your requests and why, provided that you can reproduce the problem that you are experiencing. Problems like poor performance on some requests, authentication-related failures on other requests, or the server 500 error from ASP or ASP.NET can often be difficult to troubleshoot unless you have captured a trace of the problem when it occurs. This article discusses failed-request tracing on IIS Server.

Failed-request tracing is designed to buffer the trace events for a request and only flush them to disk if the request "fails," where you provide the definition of "failure". If you want to know why you're getting 404.2 error messages or why requests start hanging, use failed-request tracing.

The tasks that are illustrated in this article include:
- Enabling the failed-request tracing module
- Configuring failed-request tracing log-file semantics
- Defining the URL for which to keep failed request traces, including failure definitions and areas to trace
- Generating the failure condition and viewing the resulting trace

Prerequisites

INSTALL IIS
You must install IIS 7 or above before you can perform the tasks in this article. Browse to http://localhost/ to see if IIS is installed. If IIS is not installed, see Installing IIS on Windows Server 2008 for installation instructions. When installing IIS, make sure that you also install the following:
- ASP.NET (under World Wide Web Services - Application Development Features - ASP.NET)
- Tracing (under World Wide Web Services - Health and Diagnostics - Tracing)

LOG IN AS ADMINISTRATOR
Ensure that the account that you use to log in is the administrator account or is in the Administrators group.

Note: Being in the Administrators group does not grant you complete administrator user rights by default. You must run applications as Administrator, which you can do by right-clicking the application icon and selecting Run as administrator.

MAKE A BACKUP
You must make a backup of the configuration before doing the following tasks.

To make a backup of the configuration:
1. Click Start -> All Programs -> Accessories.
2. Right-click Command Prompt, and then click Run as administrator.

3. In a command prompt, run the following command:

    %windir%\system32\inetsrv\appcmd add backup cleanInstall

CREATE SAMPLE CONTENT
1. Navigate to %systemdrive%\inetpub\wwwroot.
2. Move the content to a secure location (in case you want to restore the existing content) or delete it.
3. Create a blank file and name it test.asp.
4. In the command prompt, navigate to the test.asp file in \inetpub\wwwroot.
5. In the test.asp file, paste the following content:

    <h2>Failed Request Tracing Lab</h2><br>
    <br>Today's date is <% response.write(Date()) %>

DISABLE ASP
ASP must be disabled for this task. ASP is disabled only as an example and for the purposes of the tasks in this article.

To disable ASP:
1. Open IIS Manager.
2. Double-click ISAPI and CGI Restrictions.
3. Select Active Server Pages. In the Actions pane, click Deny to disable ASP.

Enable Failed-Request Tracing
After you enable failed-request tracing, you need to configure where the log files will reside. In this task, you will enable failed-request tracing for the Default Web Site and specify where to put the log files. You will then configure the failure for which to generate failure logs.

STEP 1 : ENABLE FAILED-REQUEST TRACING FOR THE SITE AND CONFIGURE THE LOG FILE DIRECTORY
1. Open a command prompt with administrator user rights.
2. Launch inetmgr.
3. In the Connections pane, expand the machine name, expand Sites, and then click Default Web Site.
4. In the Actions pane, under Configure, click Failed Request Tracing.
5. In the Edit Web Site Failed Request Tracing Settings dialog box, configure the following:
   - Select the Enable check box.
   - Keep the defaults for the other settings.
6. Click OK.

Failed-request tracing logging is now enabled for the Default Web Site. Check the %windir%\system32\inetsrv\config\applicationHost.config file to confirm that the Default Web Site entry now contains a traceFailedRequestsLogging element with enabled set to true.
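The check in applicationHost.config can also be scripted. The following is a minimal Python sketch, not part of the original article. It assumes the site entry carries a traceFailedRequestsLogging child element with an enabled attribute; the trimmed SAMPLE_CONFIG fragment and the function name tracing_enabled are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for applicationHost.config.
SAMPLE_CONFIG = """<configuration>
  <system.applicationHost>
    <sites>
      <site name="Default Web Site" id="1">
        <traceFailedRequestsLogging enabled="true" />
      </site>
    </sites>
  </system.applicationHost>
</configuration>"""


def tracing_enabled(config_xml, site_name):
    """Return True if the named site has failed-request tracing enabled."""
    root = ET.fromstring(config_xml)
    for site in root.iter("site"):
        if site.get("name") != site_name:
            continue
        setting = site.find("traceFailedRequestsLogging")
        # A missing element or attribute is treated as "not enabled".
        if setting is None:
            return False
        return setting.get("enabled", "false").lower() == "true"
    return False


print(tracing_enabled(SAMPLE_CONFIG, "Default Web Site"))
```

Against a real server, the sample string would be replaced with the contents of the configuration file, read from an elevated process.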

STEP 2 : CONFIGURE YOUR FAILURE DEFINITIONS
In this step, you will configure the failure definitions for your URL, including what areas to trace. You will troubleshoot a 404.2 that is returned by IIS for any requests to extensions that have not yet been enabled. This will help you determine which particular extensions you will need to enable.

1. Open a command prompt with administrator user rights.
2. Launch inetmgr.
3. In the Connections pane, expand the machine name, expand Sites, and then click Default Web Site.
4. Double-click Failed Request Tracing Rules.
5. Click Finish.
6. In the Actions pane, click Add....
7. In the Add Failed Request Tracing Rule wizard, on the Specify Content to Trace page, select All content (*). Click Next.
8. On the Define Trace Conditions page, select the Status code(s) check box and enter 404.2 as the status code to trace.
9. Click Next.
10. On the Select Trace Providers page, under Providers, select the WWW Server check box. Under Areas, select the Security check box and clear all other check boxes. Under Verbosity, select Verbose. The problem that you are generating causes a security error trace event to be thrown. In general, authentication and authorization problems (including ISAPI restriction list issues) can be diagnosed by using the WWW Server Security area configuration for tracing. However, because the FREB.xsl style sheet helps highlight errors and warnings, you can still use the default configuration to log all events in all areas and providers.
11. Click Finish. You should now see the new rule listed for the Default Web Site.
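The rule created by the wizard ends up as a small piece of XML configuration, so it can be inspected with a script as well as in IIS Manager. The following is a minimal Python sketch, not part of the original article. It assumes the rule is stored as an add entry under traceFailedRequests, with traceAreas and failureDefinitions children; the trimmed SAMPLE_RULES fragment and the function name list_trace_rules are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed fragment mirroring the rule built in the steps above.
SAMPLE_RULES = """<configuration>
  <location path="Default Web Site">
    <system.webServer>
      <tracing>
        <traceFailedRequests>
          <add path="*">
            <traceAreas>
              <add provider="WWW Server" areas="Security" verbosity="Verbose" />
            </traceAreas>
            <failureDefinitions statusCodes="404.2" />
          </add>
        </traceFailedRequests>
      </tracing>
    </system.webServer>
  </location>
</configuration>"""


def list_trace_rules(config_xml):
    """Return one summary dict per failed-request tracing rule."""
    root = ET.fromstring(config_xml)
    rules = []
    for container in root.iter("traceFailedRequests"):
        for rule in container.findall("add"):
            failure = rule.find("failureDefinitions")
            areas = [
                (area.get("provider"), area.get("areas"), area.get("verbosity"))
                for area in rule.findall("traceAreas/add")
            ]
            rules.append(
                {
                    "path": rule.get("path"),
                    "status_codes": None if failure is None else failure.get("statusCodes"),
                    "areas": areas,
                }
            )
    return rules


rules = list_trace_rules(SAMPLE_RULES)
print(rules)
```

A summary like this makes it easy to confirm that the path, provider, area, verbosity, and status code match what was entered in the wizard.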

IIS Manager writes the configuration to the %windir%\system32\inetsrv\config\applicationHost.config file by using a <location> tag. The configuration should contain a traceFailedRequests rule for path * with the WWW Server provider, the Security area, Verbose verbosity, and a failure definition for status code 404.2.

Test and View the Failure Request Log File
In this task, you will generate a failed request and view the resulting trace log. You already configured IIS to capture trace logs for http://localhost/* requests that fail with an HTTP response code of 404.2. Now verify that it works.

STEP 1 : GENERATE AN ERROR AND THE FAILURE REQUEST LOG FILE
1. Open a new Internet Explorer window.
2. Type in the following address: http://localhost/test.asp.
3. You should see an HTTP 404.2 error page, because ASP is disabled.

STEP 2 : VIEW THE FAILURE REQUEST LOG FILE
1. Now that you have generated a failed request, open a command prompt with administrator user rights and navigate to %systemdrive%\inetpub\logs\FailedReqLogFiles\W3SVC1.
2. Run start to start an Internet Explorer window from the directory.

3. Notice a few things here:
   - When IIS writes the failed request log file, it writes one file per failed request.
   - A freb.xsl style sheet is also written, one per directory. This helps when you view the resulting failure request log files (such as fr000001.xml).
4. Right-click the log file for the 404.2 error, and click Open With -> Internet Explorer. If this is the first time that you are opening a Failed Request Tracing file, you must add about:internet to the list of trusted sites, because Internet Explorer's Enhanced Security Configuration is enabled by default. If this is the case, Internet Explorer displays a dialog box.
5. In the Internet Explorer dialog box, click Add to add about:internet to the list of trusted sites. This allows the XSL to work. After you add about:internet to the list of trusted sites, the formatted report is displayed. A summary of the failed request is logged at the top, with the Errors & Warnings table identifying any events that are WARNING, ERROR, or CRITICAL ERROR in severity. In this example, the WARNING severity level is due to ISAPI RESTRICTION. The image that you tried to load was %windir%\system32\inetsrv\asp.dll.
6. Open the raw XML file directly by using a text editor, and look at the contents of each event.

Summary
You have completed two tasks: you configured failed request tracing to capture traces for * (all content) if IIS returns a 404.2 status code, and you verified that IIS captured the trace for your request. You also verified that the log files did not contain traces for your other requests, because those requests did not have a 404.2 return code. When you consulted the failure log file, you determined that the cause of the failure was that the extension was disabled for that request. You can try other non-HTML pages (like gifs or jpgs) and note that no traces are added for them. You can also easily change the status code to 404, or capture the failure if the request takes longer than 30 seconds by setting the timeTaken field in your failureDefinitions.

Restore Your Backup
Now that you have completed the tasks in this article, you can restore the backup of the configuration. Run the following command with administrator user rights:

    %windir%\system32\inetsrv\appcmd restore backup cleanInstall
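Step 6 above suggests opening the raw XML log in a text editor. When there are many log files, a short script can give a first summary of each one. The following is a minimal Python sketch, not part of the original article. It assumes the root element is failedRequest with url and statusCode attributes, that each trace entry is an Event element in the Windows event namespace with a System/Level child, and that levels 1 to 3 mean critical, error, and warning. The SAMPLE_LOG content and the function name summarize_failed_request are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for the per-event elements inside a failed-request log.
EVENT_NS = "{http://schemas.microsoft.com/win/2004/08/events/event}"

# Hypothetical, heavily trimmed stand-in for a file such as fr000001.xml.
SAMPLE_LOG = """<failedRequest url="http://localhost:80/test.asp"
    statusCode="404.2" timeTaken="0">
  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System><Level>4</Level></System>
  </Event>
  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System><Level>3</Level></System>
  </Event>
</failedRequest>"""


def summarize_failed_request(xml_text):
    """Return the URL, status code, and count of warning-or-worse events."""
    root = ET.fromstring(xml_text)
    flagged = 0
    for event in root.iter(EVENT_NS + "Event"):
        level_text = event.findtext(f"{EVENT_NS}System/{EVENT_NS}Level")
        # Levels 1-3 are assumed to mean critical, error, and warning.
        if level_text is not None and 1 <= int(level_text) <= 3:
            flagged += 1
    return {
        "url": root.get("url"),
        "status_code": root.get("statusCode"),
        "flagged_events": flagged,
    }


summary = summarize_failed_request(SAMPLE_LOG)
print(summary)
```

For real files in the FailedReqLogFiles directory, each file's text would be passed to the function in turn; the attribute and element names should be checked against an actual log before relying on the result.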