I've seen folks all over the interweb, and even a client of mine, hit this issue, so I wanted to take a moment in this blog to document it for posterity.  It deals with the usage logging features of SharePoint and the all-too-familiar Microsoft bugs.

Preface

I want to set a baseline of knowledge before moving on to the issue itself, so if you already know how usage logging works, feel free to skip ahead.

SharePoint 2010 came with a Usage and Health Data Collection service application, with a logging database behind it, and this continued with SharePoint 2013.  When enabled, SharePoint periodically collects usage statistics and information and stores them for review in the Health Analyzer, in Health Reports, or directly from the database.  It gathers information from multiple sources (server event logs, ULS/trace logs, and usage log files).

This is enabled in Central Administration, in Monitoring –> Configure usage and health data collection:

[Screenshot: Usage and Health Data Collection settings, SharePoint 2010]


SharePoint first writes the usage data it collects to .usage files in the path specified above on the SharePoint server drive:

[Screenshot: SharePoint usage log files]

On its schedule, the Microsoft SharePoint Foundation Usage Data Import timer job takes the data in the .usage files and loads it into the logging database.  The data then sits in the logging database for a period of time and, once deemed expired, is removed by the Microsoft SharePoint Foundation Usage Data Processing timer job.  These are Microsoft's descriptions of these timer jobs:

Microsoft SharePoint Foundation Usage Data Import
Description: Imports usage log files into the logging database.
Schedule: 30 minutes

Microsoft SharePoint Foundation Usage Data Processing
Description: Checks for expired usage data at the farm level and deletes the data. Expired usage data consists of records in the central usage data collection database that are older than 30 days. If the Web Analytics Service application is also installed, this timer job aggregates and writes the data to a Web Analytics Reporting database. You can run this timer job manually to force a check on expired data, or to force a usage data import to a Web Analytics application database.
Schedule: Daily

You can use Set-SPUsageDefinition to dictate retention levels for individual event types.  So then here’s where we arrive at our issue.

The Issue

For some unknown reason, the .usage log files stopped getting pushed to the logging database by the timer job.  This caused a couple of different symptoms.  You might see these errors in the ULS logs:

  • The Execute method of job definition Microsoft.SharePoint.Administration.SPTimerRecycleJobDefinition (ID 20fea14e-75db-4e86-a1af-d780fd87eb01) threw an exception. More information is included below.

    The timer service was not recycled because the following jobs were still running: Microsoft SharePoint Foundation Usage Data Import
  • Deleting usage log file 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS\***-*******-20130502-1600.usage' after data import. 
    Failed to delete usage log file 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS\***-*******-20130502-1600.usage' after data import. Exception: System.IO.IOException: The process cannot access the file because it is being used by another process.
       

If your DBAs are diligent about their monitoring, they might see a large number of entries like this in the ULSTraceLog tables (my client had 5 million entries of this error):

Trace Management Service unable to delete file 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS\<servername>-20131102-0403.usage'.  Error 00000020

If you see the above, pay attention to the hex error code.  If it's error 5, that means access denied, and it's likely a permissions problem with the WSS_WPG and other groups on the server, as discussed here.  If it's the 00000020 code, you are likely facing the issue we're discussing here.  For fun I threw the hex in the ERR tool, and one result caught my eye:

[Screenshot: Microsoft ERR tool output]
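For anyone without the ERR tool handy, here is a minimal Python sketch of the same lookup. It assumes the code in the trace message is a plain Win32 error code; read that way, hex 00000020 is decimal 32, ERROR_SHARING_VIOLATION, whose text matches the IOException shown earlier. The function names, the regular expression, and the two-entry lookup table are mine for illustration and are not part of SharePoint or the ERR tool.

```python
import re
from typing import Optional, Tuple

# Two well-known Win32 error codes relevant to this post (decimal keys).
# This table is deliberately tiny; it is not a full error catalogue.
WIN32_ERRORS = {
    5: ("ERROR_ACCESS_DENIED", "Access is denied."),
    32: (
        "ERROR_SHARING_VIOLATION",
        "The process cannot access the file because it is being used "
        "by another process.",
    ),
}

# Assumed message shape, based only on the ULSTraceLog entry quoted above.
_MESSAGE_RE = re.compile(
    r"unable to delete file '(?P<path>[^']+)'\.\s+Error (?P<code>[0-9A-Fa-f]{8})"
)


def parse_trace_message(message: str) -> Optional[Tuple[str, int]]:
    """Return (file path, error code as int), or None if the text does not match."""
    match = _MESSAGE_RE.search(message)
    if match is None:
        return None
    return match.group("path"), int(match.group("code"), 16)


def describe_error(hex_code: str) -> str:
    """Turn a hex code such as '00000020' into a readable description."""
    code = int(hex_code, 16)
    name, text = WIN32_ERRORS.get(code, ("UNKNOWN", "not in this sketch's table"))
    return f"{code} ({name}): {text}"


# The trace message from this post, with the server name placeholder left as-is.
SAMPLE_MESSAGE = (
    r"Trace Management Service unable to delete file "
    r"'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions"
    r"\14\LOGS\<servername>-20131102-0403.usage'.  Error 00000020"
)
```

Calling `parse_trace_message(SAMPLE_MESSAGE)` gives back the file path and the integer 32, and `describe_error("00000020")` spells out the sharing violation, which is what separates this issue from the access denied (error 5) case.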

Resolution

Everything points to the file not being removable because something has it locked.  That something turned out to be the SharePoint timer service.  If you just restart the timer service, then run the data import timer job, you can watch the .usage logs magically disappear.  As I outline below, the cause of this turned out to be installing hotfix 2775511 or 2682011.

Microsoft has fixed this issue in the December 2013 cumulative update for SharePoint 2010.  In one of the Foundation KBs for the CU, 2849981, you see this:

After you install hotfix 2775511 or hotfix 2682011 on a SharePoint Server 2010 server, the Usage Provider (.usage) files are not deleted from the file system. Additionally, the .usage files keep growing.

Note By default, the .usage files are located in the following path:
C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS\

So, installing the December 2013 CU is the permanent fix.  But it was only just released, so I would be cautious about applying it.  Some users will be fine with manually restarting the timer service whenever the logs build up.  I don't know how long the problem takes to recur, but it might not happen often enough to justify the risk of installing a new CU.  Obviously, test the CU in a test environment first, and it may be fine.  Or you can uninstall the hotfix mentioned above.
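If you go the manual restart route, you need some way to notice that the logs are building up. Below is a rough Python sketch that lists .usage files that have not been modified for a while. The two-hour threshold is my own assumption (a few missed cycles of the 30-minute import job), and the function name is made up for this example, so tune both for your farm.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import List, Optional, Tuple


def find_stale_usage_files(
    log_dir: str,
    max_age: timedelta = timedelta(hours=2),  # assumed threshold, not a SharePoint setting
    now: Optional[datetime] = None,
) -> List[Tuple[str, int, datetime]]:
    """Return (name, size in bytes, last modified) for old .usage files, oldest first."""
    now = now or datetime.now()
    stale = []
    for path in Path(log_dir).glob("*.usage"):
        stat = path.stat()
        modified = datetime.fromtimestamp(stat.st_mtime)
        # A file the import job should already have consumed and deleted.
        if now - modified > max_age:
            stale.append((path.name, stat.st_size, modified))
    return sorted(stale, key=lambda item: item[2])
```

Point it at the LOGS folder on each SharePoint server. A non-empty result that keeps growing is a hint that it is time to restart the timer service and run the import job again.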

I will save how to truncate the logs for another blog, but I wanted to be sure to let everyone know this has been included in a CU now.

 

Reference Links


Configure usage and health data collection (SharePoint Foundation 2010)
Understanding the Logging Database (SharePoint Server 2010)
TechNet Forum issue
Extending the SharePoint 2010 Health & Usage by Todd Carter

 

