
Problems with non-trusted computers reporting to ACS (Audit Collection Services)

by Doug Sawert 17. August 2010 22:36

When I installed the ACS forwarders on systems not joined to the domain, the installation appeared to succeed, but no data ever reached the server for auditing.

I started troubleshooting by enabling extended logging for the agent.

Dipesh’s blog explains some common issues associated with the agent. The following shows how to enable extended logging.

1. Browse to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AdtAgent\Parameters

2. Create a DWORD value named TraceFlags and set it to a decimal value of 524420.

3. Restart AdtAgent service.
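
If you need to apply the same setting on several forwarders, the registry change can be captured as a .reg file. The following is a minimal Python sketch that only builds the file text; the function name is my own, and saving and importing the file (and restarting the AdtAgent service afterwards) is left to you.

```python
# Sketch: build the text of a .reg file for the TraceFlags value described above.
# The key path and the decimal value 524420 come from the steps in this post;
# the function name is hypothetical.
KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AdtAgent\Parameters"


def build_trace_flags_reg(trace_flags: int = 524420) -> str:
    """Return .reg file content that sets TraceFlags as a DWORD."""
    # .reg files write DWORD values as eight hexadecimal digits.
    dword = f"{trace_flags:08x}"
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
        f'"TraceFlags"=dword:{dword}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_trace_flags_reg())
```

Note that decimal 524420 is 0x00080084 in hexadecimal, which is the form the .reg file uses.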

After following the procedures, I ended up seeing the following error frequently in the logfile:

[20100816 122255,058][Info   ]LookupServersReg(): Found HQSCOM01.DirectApps.int:51909.

[20100816 122255,065][Error  ]IoContext(0xC5C630)::OnRecvComplete(): Disconnecting: dwIo = 0, dwError = 0x00000040.

[20100816 122255,065][Warning]IoContext(0xC5C630)::OnRecvComplete(): Disconnecting.

[20100816 122255,065][Error  ]AgentClient::Connect(): Connection status 0x00000040.

[20100816 122255,065][Warning]ConnectToServer(): Calling Disconnect() with error 0x00000040.
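
When the log is long, it helps to count which error codes show up most often. Here is a rough Python sketch that does this; the line layout and the keywords in front of the codes are assumptions based only on the excerpt above, so adjust the patterns if your log looks different. If 0x00000040 is a standard Win32 error code, it is decimal 64 (ERROR_NETNAME_DELETED, "The specified network name is no longer available"), which would fit a connection being dropped by the collector.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt: [YYYYMMDD HHMMSS,mmm][Level]message
LINE_RE = re.compile(r"^\[(\d{8} \d{6},\d{3})\]\[(\w+)\s*\](.*)$")
# In the excerpt, codes follow "dwError =", "status" or "error".
CODE_RE = re.compile(r"(?:dwError\s*=|status|error)\s+(0x[0-9A-Fa-f]{8})")

SAMPLE = [
    "[20100816 122255,058][Info   ]LookupServersReg(): Found HQSCOM01.DirectApps.int:51909.",
    "[20100816 122255,065][Error  ]IoContext(0xC5C630)::OnRecvComplete(): Disconnecting: dwIo = 0, dwError = 0x00000040.",
    "[20100816 122255,065][Warning]IoContext(0xC5C630)::OnRecvComplete(): Disconnecting.",
    "[20100816 122255,065][Error  ]AgentClient::Connect(): Connection status 0x00000040.",
    "[20100816 122255,065][Warning]ConnectToServer(): Calling Disconnect() with error 0x00000040.",
]


def count_error_codes(lines):
    """Count hex error codes found on Error and Warning lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        level, message = match.group(2), match.group(3)
        if level not in ("Error", "Warning"):
            continue
        for code in CODE_RE.findall(message):
            counts[code.lower()] += 1
    return counts


if __name__ == "__main__":
    for code, count in count_error_codes(SAMPLE).most_common():
        print(code, count)
```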

 

Solution:

The fix has several parts. I cover some of them here and presume the rest have already been completed. Keep in mind that these errors were encountered on a non-trusted host, meaning the forwarding server was not listed in Active Directory. If you are getting the same errors on a trusted host (i.e., one joined to the domain), then you are likely facing a different problem.

We had already completed the steps of registering a non-trusted computer using certificates. However, the computer was not sending the security auditing information back to ACS and the agent was generating the 0x00000040 errors in the agent log. All other aspects of the agent were working correctly, such as performance monitoring, pings, health reports, etc.

The following steps were performed (as outlined in System Center Operations Manager 2007 Unleashed, p 782):

1.       On the ACS collector, perform the following task once –

a.       Stop the AdtServer service (net stop adtserver at a command prompt with admin privileges)

b.      Change directory to %WinDir%\System32\Security\adtserver (not in the book)

c.       Type “adtserver -c” at the command prompt

d.      You’ll be prompted to pick a certificate that was registered to the ACS collector host and generated by the domain certificate authority

e.      Start the Adtserver service (net start adtserver)

2.       For each certificate-based ACS forwarder, use the Certificates MMC snap-in on the forwarder to export the certificate (.cer) file for the local computer. (When you bring up the snap-in, it defaults to the user level, so make sure to change to the computer account.)

3.       On a domain controller for the ACS collector domain, open Active Directory Users and Computers (ADUC). In one of the OUs, create a computer account (using the defaults) with the same name as the forwarder that you exported the certificate from. (To keep management under control, we created a “foreign computers” OU and then made an OU for each client site or grouping of computers.)

4.       Right-click on the computer and select the “Name mapping” option. Add the exported certificate.

5.       In the operations manager console, enable the audit health collection option.

6.       On the forwarder, stop the adtagent service and run adtagent -c at the admin command prompt. Select the existing certificate associated with the name of the forwarding computer. Start the adtagent service again.

Once these steps were completed, a review of the adtagent logs showed no further errors and ACS events were now showing up in the audit reports.

DPM Compatible Tape Libraries

by Dave Murphy 12. August 2010 17:52

The list of tape libraries that are compatible with DPM (Data Protection Manager) can be difficult to find when searching.

I've enclosed a link to help people find out if their hardware has been tested with DPM. Keep in mind, even if your hardware hasn't been tested, the general rule is that if Windows can successfully see the appliance and access all the features, then DPM will be able to utilize the device. Another good way to test is to run the tape library through regular Windows Backup prior to assigning the device to DPM.

http://technet.microsoft.com/en-us/systemcenter/dm/cc678583.aspx

SCOM 2007 Reporting – Auditing – Accessing the Audit Reports

by Dave Murphy 11. August 2010 18:20

If you have tried to access the “Audit Reports” directly through the SCOM management interface, you have probably been met with a variety of errors. I started out the same way many others have. I’ve enclosed the progression of errors and one of many ways to get around the issue. I do want to credit the book, “System Center Operations Manager 2007 Unleashed”, which explained how ACS runs and how Microsoft initially envisioned the architecture. Some people have declared this a bug; however, I think that is a misclassification of the issue.

Microsoft has to design products that appeal to large audiences. SCOM is a robust tool, and in large organizations, teams would split the management functions of the SCOM suite, including auditing. Out of the box, the platform presumes there will be independent security auditors, thereby excluding the rest of the staff from seeing the security reports. For those of us on small teams, this exclusion becomes a nuisance.

On a freshly loaded SCOM system, if you attempt to access any of the Audit Reports, you should see an error to the following effect (you may see a failure with one of two connection strings):

“An error has occurred during report processing.

Cannot create a connection to data source ‘DB_Audit’

Cannot create a connection to data source ‘datasource1’

For more information about this error navigate to the report server on the local server machine, or enable remote errors”

 

This is where I started trying to figure out what was wrong. I began by enabling remote error reporting, which I cover in a previous blog entry (enabling remote errors does not require a reboot or a SQL service restart). With remote errors enabled, I received the following errors when I ran the report:

“An error has occurred during report processing.

Cannot create a connection to data source ‘DB_Audit’

Cannot create a connection to data source ‘datasource1’

Login failed for user ‘NT AUTHORITY\ANONYMOUS LOGON’”

So we have two common errors encountered at this point. The following steps can help you get around the security barrier.

First, create a domain user account in active directory. I called ours ‘acsauditor’ as an example.

Open SQL Server Management Studio

Expand the Security folder, then Logins folder

Right-Click on Logins and select “New Login”

Select the “Search” option, change location to the forest level (from the local computer level) and type in the name of the new account you created

Under the “User Mapping” section of the new login properties, select the “OperationsManagerAC” database

Next, Check the db_datareader role membership in the lower pane

OK out of the screen and now you’re done in SQL Server Management Studio

The next step is to open a browser window and navigate to http://<reportserver>/reports

Select the “Audit Reports” Datasource Folder

Select “Show Details” in the upper right corner (under the Search For box)

Now select “DB Audit”

Select “Credentials stored securely in the report server”

Enter the new account name in the format of <domain>\<username>

Select “Use as Windows credentials when connecting to the data source”

Apply and close the browser

You should now have access to the reports! If this isn't working for folks, I would definitely be interested in feedback.
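
The SQL Server Management Studio part of the steps above (creating the login, mapping it to the database and granting db_datareader) could also be done with a few T-SQL statements. The sketch below is only an illustration of what those statements might look like, generated from Python; the helper names and the CONTOSO domain are made up, and you should review the output before running it against your ACS database.

```python
# Sketch: generate T-SQL roughly equivalent to the Management Studio steps above.
# "CONTOSO" is a placeholder domain; "OperationsManagerAC" and "acsauditor"
# come from this post. The helper names are hypothetical.


def bracket(name: str) -> str:
    """Quote a SQL Server identifier with square brackets."""
    return "[" + name.replace("]", "]]") + "]"


def build_acs_reader_sql(account: str, database: str = "OperationsManagerAC") -> str:
    """Return T-SQL that creates a Windows login and grants read access."""
    login = bracket(account)
    role_member = account.replace("'", "''")  # escape for a string literal
    statements = [
        f"CREATE LOGIN {login} FROM WINDOWS;",
        f"USE {bracket(database)};",
        f"CREATE USER {login} FOR LOGIN {login};",
        f"EXEC sp_addrolemember 'db_datareader', '{role_member}';",
    ]
    return "\n".join(statements)


if __name__ == "__main__":
    print(build_acs_reader_sql(r"CONTOSO\acsauditor"))
```

The report server data source still has to be pointed at the new account through the browser steps described above.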

How to: Enable Remote Errors (Reporting Services Configuration)

by Dave Murphy 10. August 2010 20:08

When working with SCOM 2007 Reporting, we had a number of Audit Reports that were not working. Part of the troubleshooting process involved turning on remote error reporting for SQL Server Reporting Services (SSRS). Below are the Microsoft article and the steps needed to enable remote error reporting on your SCOM reporting server.

http://msdn.microsoft.com/en-us/library/aa337165(SQL.100).aspx

Enable remote errors through SQL Server Management Studio


  1. Start Management Studio and connect to a report server instance.

  2. Right-click the report server node, and select Properties.

  3. Click Advanced to open the properties page.

  4. In EnableRemoteErrors, select True.

  5. Click OK.

Enable remote errors through script

  1. Create a text file and copy the following script into the file.

    Public Sub Main()
      Dim P As New [Property]()
      P.Name = "EnableRemoteErrors"
      P.Value = True
      Dim Properties(0) As [Property]
      Properties(0) = P
      Try
        rs.SetSystemProperties(Properties)
        Console.WriteLine("Remote errors enabled.")
      Catch SE As SoapException
        Console.WriteLine(SE.Detail.OuterXml)
      End Try
    End Sub
    
  2. Save the file as EnableRemoteErrors.rss.

  3. Click Start, point to Run, type cmd, and click OK to open a command prompt window.

  4. Navigate to the directory that contains the .rss file you just created.

  5. Type the following command line, replacing servername with the actual name of your server:

    rs -i EnableRemoteErrors.rss -s http://servername/ReportServer
    

System Center Operations Manager 2007 Cumulative Update 1

by Dave Murphy 4. August 2010 23:21

Passing along information about the latest SCOM Update, Post SP1 hotfixes.

http://support.microsoft.com/kb/2028594

The Microsoft System Center Operations Manager 2007 SP1 Cumulative Update 1 resolves many issues that are found in System Center Operations Manager 2007 SP1. A prerequisite for installation of the Cumulative Update 1 is installation of System Center Operations Manager SP1 Update (KB 971541, http://support.microsoft.com/kb/971541/). Cumulative Update 1 addresses the following issues:

  • The Active Alerts report from the data warehouse includes auto-resolved alerts that are no longer active in the Operations Manager Console.
  • The Generic performance report consumes lots of temporary database space and can fail in some instances.
  • The agents that are taken out of maintenance mode revisit old entries in the Application log in certain cases. When this occurs, incorrect alerts are generated.
  • When Agentless Exception Monitoring (AEM) is set up to use SharePoint, reports from Watson are blocked.
  • Reports for Operations Manager 2007 SP1 fail when you use a shared data warehouse after you upgrade a Management Group to System Center Operations Manager 2007 R2.
  • Because of a thread-locking issue, the SDK service stops responding in certain cases.
  • The color of the chart lines does not match the legend in Performance views when the Web Console is used.
  • ACS stops collecting data when the event log uses auto-backup.
  • The Alert View link in the notification emails shows the list of active alerts, instead of the detailed alert description.
  • The allow anonymous discovery property in the Internet Information Services (IIS) Management Pack is incorrect in the console.
  • Management Pack discoveries fail when a double-byte character exists in string data that is returned.

Estimating DPM Replication Time

by Dave Murphy 3. August 2010 18:18

Every now and then, I stumble across a nugget of information from the Microsoft website that is worth sharing. I just came across a link that tells how to estimate the amount of time it would take to replicate information between two DPM servers, presuming one is onsite and the other is at an alternate location. We have found that on a T1 line, the system will not run at a full 1.5 Mbps, as the line will become choked and the DPM server will eventually lose connection with the remote server. On low speed connections, throttling is a must, as Data Protection Manager will consume all available bandwidth. We typically use between 1200 and 1300 Kbps as the throttling threshold on a typical 1.5 Mbps link.

I’ve enclosed the table from the website found at the following link:

http://technet.microsoft.com/en-us/library/ff399619.aspx

Time Required to Transmit Data over a Network at Various Speeds

    Data size   1 Gbps         100 Mbps     32 Mbps     8 Mbps      2 Mbps      512 Kbps
    1 GB        < 1 minute     < 1 hour     < 1 hour    < 1 hour    1.5 hours   6 hours
    50 GB       < 10 minutes   1.5 hours    5 hours     18 hours    71 hours    284 hours
    200 GB      < 36 minutes   6 hours      18 hours    71 hours    284 hours   1137 hours
    500 GB      < 1.5 hours    15 hours     45 hours    178 hours   711 hours   2844 hours

 

Note

In the preceding table, Gbps = gigabits per second, Mbps = megabits per second, and Kbps = kilobits per second. The figures for a network speed of 1 Gbps assume that the disk speed on the DPM server and the protected computer are not a bottleneck. Typically, the time to complete initial replica (IR) creation can be calculated as follows:

IR: hours = ((data size in MB) / (.8 x network speed in MB/s)) / 3600

Note 1: Convert network speed from bits to bytes by dividing by 8.

Note 2: The network speed is multiplied by .8 because the maximum network efficiency is approximately 80%.
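
To try the formula against your own numbers, here is a small Python sketch of the calculation. The function name is my own, and it assumes 1 GB = 1024 MB and 512 Kbps = 0.5 Mbps, since those conversions appear to reproduce the figures in the table.

```python
# Sketch of the initial replica (IR) estimate from the note above.
# Assumptions (not stated in the source): 1 GB = 1024 MB, 512 Kbps = 0.5 Mbps.


def ir_hours(data_size_mb: float, network_speed_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to create the initial replica."""
    if network_speed_mbps <= 0 or efficiency <= 0:
        raise ValueError("speed and efficiency must be positive")
    # Note 1: convert megabits per second to megabytes per second.
    speed_mb_per_s = network_speed_mbps / 8
    # Note 2: only about 80% of the link is usable.
    seconds = data_size_mb / (efficiency * speed_mb_per_s)
    return seconds / 3600


if __name__ == "__main__":
    for size_gb, speed_mbps in [(1, 2), (200, 2), (500, 0.5)]:
        hours = ir_hours(size_gb * 1024, speed_mbps)
        print(f"{size_gb} GB at {speed_mbps} Mbps: about {hours:.1f} hours")
```

For example, 200 GB over a 2 Mbps link works out to roughly 284 hours, which agrees with the table.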

SCOM 2007 Reporting Series - SLA - Availability

by Dave Murphy 28. July 2010 23:07

SCOM 2007 Reporting Series: Availability report for multiple servers. Most system administrators or managed service companies have to account for downtime in their environment. Usually, this is in the form of a service level agreement (SLA), tied to some percentage of uptime or downtime and maintenance periods. This is often where you hear about five-nines availability (99.999% uptime). System Center Operations Manager has a quick report that can give a great overview of how long servers and services have been running, when planned maintenance cycles occurred, unplanned cycles, when monitoring wasn’t being performed, etc.

It is a great accountability report from a management perspective; it’s a great “state of the environment” report from a sys admin perspective. The report could also be used as one of many tools during performance reviews of system administrators to gauge how well systems are being maintained.
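
To put an uptime percentage into perspective, it helps to convert it into the downtime it allows. The following Python sketch does that for a 365-day year; the function name is my own and it ignores any maintenance windows an actual SLA might exclude.

```python
# Sketch: convert an availability percentage into allowed downtime.
# Assumes a 365-day year and no excluded maintenance windows.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime (in minutes) permitted over the period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * period_minutes


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.999% uptime, that is a little over five minutes of downtime in a whole year.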

I’ve borrowed a table from a Microsoft blog to illustrate the levels of reporting for system availability.

SCOM 2007 Reporting Series - performance - % Processor

by Dave Murphy 27. July 2010 23:29

SCOM 2007 Reporting Series: This entry focuses on % Processor utilization for multiple server targets. This post continues the series on System Center reporting. The goal, simply, is to help get you up and running as quickly as possible within System Center and get you familiar with navigating the system. % Processor utilization is a basic metric suggesting how busy the CPU on your system is. The default report for System Center only gives the total utilization, not the utilization of individual cores, but we’ll get more into that later. For now, this report will outline how to select a few servers and individually chart the processor utilization for each server within SCOM 2007.

 

In a virtual environment, sustained utilization will often hover between 60-80%, which is fairly normal since you’re likely sharing resources with other virtual guests on the same physical hardware. Microsoft offers some basic guidelines for what to look for here: http://technet.microsoft.com/en-us/library/cc768535(BTS.10).aspx 

 

Otherwise, if you’re still on a physical system, you’ll likely see 10% sustained usage. Long spikes above 90% on a physical piece of dedicated hardware could indicate a problem. However, the problem may not just be associated with the CPU. Excessive disk I/O or Ethernet traffic can increase the burden on the CPU, causing excessive CPU utilization. As with any troubleshooting, the system as a whole should be viewed to start the troubleshooting process of poor performance.

SCOM 2007 Reporting Series - performance - Avg Disk Sec/Write

by Dave Murphy 22. July 2010 01:10

SCOM 2007 Reporting Series: This entry focuses on reporting the Avg. Disk Sec/Write counter for a single server and single logical disk volume. Write latency on a disk volume is another measurement of the overall health of the disk subsystem. We’ve seen many cases where a system appears to have a healthy storage configuration, only to find that, even with a large number of spindles, the device is not configured correctly. For example, by running the latency report against volumes tied to a SAN, we were able to pinpoint unacceptable latency, and after speaking with the storage vendor, it was determined that write caching was not turned on.

Here is an interesting calculation to think about: a query that generates 1 million I/Os at 10ms latency takes about 2.8 hours to complete. Bump up the latency to 40ms, and the same query would take about 11 hours.
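
The arithmetic behind that calculation is simple enough to sketch in Python. Note the assumption baked in: the I/Os are issued one after another, with no overlap or queuing, so real-world numbers will differ. The function name is my own.

```python
# Sketch: total time for a run of I/Os if each one waits for the previous one.
# Assumes strictly serial I/O (no parallelism or caching).


def serial_io_hours(io_count: int, latency_ms: float) -> float:
    """Return the hours needed to complete io_count I/Os at the given latency."""
    total_seconds = io_count * (latency_ms / 1000)
    return total_seconds / 3600


if __name__ == "__main__":
    for latency in (10, 40):
        hours = serial_io_hours(1_000_000, latency)
        print(f"1 million I/Os at {latency} ms: about {hours:.1f} hours")
```

One million I/Os at 10 ms is 10,000 seconds of waiting; quadrupling the latency quadruples the total.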

SCOM 2007 Reporting Series - performance - Avg Disk Sec/Read

by Dave Murphy 22. July 2010 01:06

System Center Operations Manager is a fantastic tool for system management, providing a greater degree of depth into system management than is available through most platforms. Direct Technology utilizes the product as the foundation of our monitoring platform internally and for customers. This entry is part of an ongoing series on how to navigate around SCOM 2007.

This entry focuses on reporting the Avg. Disk Sec/Read counter for a single server which has multiple, logical drives. A brief video is also included for each to help visualize the navigation through the management console.

“The Avg. Disk sec/Read performance counter indicates the average time, in seconds, of a read of data from the disk. The average value of the Avg. Disk sec/Read performance counter should be under 10 milliseconds. The maximum value of the Avg. Disk sec/Read performance counter should not exceed 50 milliseconds.”
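
If you export the counter samples from a report, the two thresholds in the quote above are easy to check in a few lines. This is just a rough Python sketch; the function name and the sample values are made up, and the counter is assumed to be reported in seconds, as the quote says.

```python
# Sketch: check Avg. Disk sec/Read samples against the quoted guidance
# (average under 10 ms, maximum not above 50 ms). Samples are in seconds.


def check_read_latency(samples_seconds, avg_limit_ms: float = 10.0,
                       max_limit_ms: float = 50.0) -> dict:
    """Summarize the samples and flag whether each threshold is met."""
    if not samples_seconds:
        raise ValueError("at least one sample is required")
    avg_ms = sum(samples_seconds) / len(samples_seconds) * 1000
    max_ms = max(samples_seconds) * 1000
    return {
        "avg_ms": avg_ms,
        "max_ms": max_ms,
        "avg_ok": avg_ms < avg_limit_ms,
        "max_ok": max_ms <= max_limit_ms,
    }


if __name__ == "__main__":
    # Hypothetical samples: mostly healthy reads with one slow outlier.
    print(check_read_latency([0.004, 0.006, 0.008, 0.060]))
```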