Identifying row in SQL view (or table) with a calculation error

Computed columns can be a great tool if you want to add a value to a table that’s dependent on other columns in the table, and you don’t want to maintain it on its own every time the source columns change. In a view, these can provide a single, consistent calculation to the end users – say, for a

-- Set up a view with the same columns and calculation errors in it
-- The first calculation will always work
-- The second gives a divide by zero error on every 10th row
CREATE VIEW SourceObject AS
SELECT object_id AS KeyColumn1,
       column_id as KeyColumn2,
       object_id - column_id as CalcColumn1,
       (object_id - (column_id % 10)) as CalcColumn2
  FROM msdb.sys.columns

Now that it’s set up, we can try selecting all the rows from the view we just created, and we’ll see about 100 rows output, and then the query will stop with a “Divide by zero” error:

SELECT * FROM SourceObject

The calculation in this query is pretty straightforward, and you can see which rows are causing a problem (where column_id is divisible by 10), but what if it was more complicated? The problem is that SQL doesn’t display the row that had a problem – it stops on the row before the problem, so finding the row with the error is bit more difficult. If there were multiple columns involved in the calculation, or different combinations of values that could result in an error? Tracking down the rows causing an error can be difficult – you have to find all the possible conditions that could cause an error, and then query for each of them.

This script will allow you to find all the rows in a view with a calculation error, all at once. It uses a cursor to pull the rows from the view one at a time, test the calculation, and then write any errors it finds to a table where you can see the rows that are causing problems. Using a CURSOR generally isn’t ideal, but in this case, it’s the only way to react to a bad calculation on a row-by-row basis and deal with it.

The script can use two key values from your view – they’re called KeyColumn1 and KeyColumn2 – and you can modify the script to name them whatever you want, or just a single a value if that makes more sense in your scenario. It also supports two computed columns – CalcColumn1 and 2 – though again, it could be changed to just check a single column.

 -- Set up variables
DECLARE @KeyColumn1 INT,
		@KeyColumn2 INT,
		@CalcColumn1 INT,
		@CalcColumn2 INT
		
DECLARE @CurrentRow BIGINT
	SET @CurrentRow = 1

-- Set up a place to hold key values for rows that work  
  SELECT TOP 0 KeyColumn1, KeyColumn2
    INTO #WorkingRows
    FROM SourceObject

-- Set up a place to hold errors for rows that don't work    
CREATE TABLE #ErrorRows (
	RowNumber BIGINT,
	KeyColumn1 INT,
	KeyColumn2 INT,
	[ERROR_NUMBER] INT,
	[ERROR_MESSAGE] nvarchar(4000)
)    

-- Begin loop to look through rows in the view 
  DECLARE cur INSENSITIVE CURSOR FOR
  SELECT KeyColumn1, KeyColumn2
  FROM SourceObject
  ORDER BY KeyColumn1, KeyColumn2
  
  OPEN cur
  
  FETCH NEXT FROM cur
  INTO @KeyColumn1, @KeyColumn2
	
  WHILE @@FETCH_STATUS = 0
  BEGIN
  
	BEGIN TRY

		-- Try to select the calculated columns	
		-- If there's an error, it will jump to the CATCH block
		SELECT @CalcColumn1 = CalcColumn1,
				@CalcColumn2 = CalcColumn2
		  FROM SourceObject
		 WHERE KeyColumn1 = @KeyColumn1
		   AND KeyColumn2 = @KeyColumn2
		   
		-- This lookup succeeded
		INSERT INTO #WorkingRows
		SELECT @KeyColumn1, @KeyColumn2
		
	END TRY
	BEGIN CATCH
		
		-- The lookup failed - save details
		INSERT INTO #ErrorRows
		SELECT @CurrentRow,
				@KeyColumn1,
				@KeyColumn2,
				ERROR_NUMBER(),
				ERROR_MESSAGE()			
		
	END CATCH
	
	SET @CurrentRow = @CurrentRow + 1
	    
	FETCH NEXT FROM cur
	INTO @KeyColumn1, @KeyColumn2
	    
  END
  
  -- Show the key columns of rows with errors
  SELECT * FROM #ErrorRows
  
  -- Show the key columns of working rows
  SELECT * FROM #WorkingRows
  
  -- Clean things up
  close cur
  deallocate cur
  drop table #ErrorRows
  drop table #workingrows

At the end, you’ll have two tables with results in them – #ErrorRows, which contains the key values for rows with errors in them, as well as details about the error message, and #WorkingRows, which contains the key values for all of the working rows from the view.

Note: I could just as easily set up a table with a computed column in it that causes the same problem You’d be unable to select the entire table without an error, and hunting down the row with an error is painful. The script to find the error is the same, but here’s an example of a table that has a computed column with this problem:

-- Set up table with a list of numbers in it
SELECT object_id AS KeyColumn1,
		RANK() OVER (PARTITION BY 0
					 ORDER BY NEWID()) as KeyColumn2
  INTO SourceObject
  FROM msdb.sys.columns
  
-- Add two calculations to the table
-- The first will always work
-- The second will give a "Divide by zero" every 100 rows
  ALTER TABLE SourceObject ADD CalcColumn1 as (KeyColumn1 - KeyColumn2)
  ALTER TABLE SourceObject ADD CalcColumn2 as (KeyColumn1 / (KeyColumn2 % 100))

-- Note that you can't add a persisted computed column to a table
-- if there's a calculation error in any of the rows, so this
-- command will fail with a "Divide by zero" error
  ALTER TABLE SourceObject ADD CalcColumn3 as (KeyColumn1 / (KeyColumn2 % 100)) PERSISTED

Identifying SQL Server 2005 Enterprise Edition features in use

I recently stumbled across a great system view, sys.dm_db_persisted_sku_features, which identifies any enterprise features in use in the current database, which would prevent you from moving this database to a Standard Edition instance. Unfortunately, it appears in SQL 2008, and I wanted to run this check on a SQL 2005 system.

There are a number of server-level features of SQL 2005 that require Enterprise Edition, but only two database-level features – partitioning and the VarDecimal storage format. Both are easy to check for, so I put together this quick query to do it:

select * from
   (-- VarDecimal Storage Format
    select case
             when max(objectproperty(object_id, N'TableHasVarDecimalStorageFormat')) = 0
               then ''
             else 'VarDecimal Storage'
           end as feature_name
      from sys.objects
    UNION ALL
    -- Partitioning
    select case
             when max(partition_number) > 1
               then 'Partitioning'
             else ''
           end
      from sys.partitions
) t
where feature_name <> ''

On a SQL 2005 server, this query will serve the same purpose that sys.dm_db_persisted_sku_features does on SQL 2008/2012 servers.

Refreshing changed .NET SQL CLR assemblies after patching/updates

After applying some Windows updates to one of my servers, I started getting the following error when I ran a customer .NET SQL-CLR stored proc:

Server: Msg 6522, Level 16, State 2, Line 1
A .NET Framework error occurred during execution of user defined routine or aggregate ‘somemethodyourecalling’:

System.IO.FileLoadException: Could not load file or assembly ‘System.Drawing, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a’ or one of its dependencies. Assembly in host store has a different signature than assembly in GAC. (Exception from HRESULT: 0x80131050)

I’d imported some additional assemblies into SQL Server for use in SQL CLR mapped stored procedures, and the Windows updates had included a service pack for .NET, which changed some of the assemblies. Now the version I’d linked to SQL Server no longer existed on disk, and SQL couldn’t load it – that made sense, and is documented pretty clearly in this MS Support article: http://support.microsoft.com/kb/949080

However, I had dozens of servers with SQL CLR components enabled, and hundreds of different assemblies loaded across them all, and not always the same in each server, so a standard update script wouldn’t work to refresh all the changed assemblies (the MS Support link provides a list of the standard ones that cause that error, but if you’ve got custom assemblies loaded, or you’ve loaded an assembly that’s not specifically cleared for SQL CLR, then it’s not on the list either). To deal with this, I wrote a script that fetches the assembly list for a database and attempts to refresh every one of them from their disk location. If they haven’t changed, the update attempt will fail with a specific error message about the MVID, and there’s no change for that assembly.

Also, I’ve commented out the line that restricts it to just framework assemblies (System.* and Microsoft.*), but you can uncomment that line if you’d like to restrict the refresh from attempting to reload your custom assemblies as well.

DECLARE @AssemblyName VARCHAR(255),
	    @AssemblyLocation VARCHAR(255),
	    @AlterAssemblyCommand NVARCHAR(1024),
	    @DotNetFolder VARCHAR(100)

   SET @DotNetFolder = 'C:\Windows\Microsoft.NET\Framework\v2.0.50727'

CREATE TABLE #Results (
		AssemblyName VARCHAR(255),
		AssemblyLocation VARCHAR(255),
		AlterAssemblyCommand NVARCHAR(1024),
		Results VARCHAR(1024)
)

select sa.name as AssemblyName,
		saf.name as Assemblylocation,
		case when charindex('', saf.name) = 0
			then 'ALTER ASSEMBLY [' + sa.name + '] FROM ''' + @DotNetFolder
			else 'ALTER ASSEMBLY [' + sa.name + '] FROM '''
		end + saf.name + (case right(saf.name, 4) when '.dll' then '' else '.dll' end) + ''''
		as AlterAssemblyCommand
INTO #Refresh
from sys.assemblies sa
join sys.assembly_files saf
  on sa.assembly_id = saf.assembly_id
where sa.name <> ('Microsoft.SqlServer.Types')
  --and (sa.name like 'System.%' or sa.name like 'microsoft.%')

DECLARE Commands CURSOR FAST_FORWARD FOR
SELECT AssemblyName,
	   AssemblyLocation,
	   AlterAssemblyCommand
  FROM #Refresh

OPEN Commands

FETCH NEXT FROM Commands
INTO @AssemblyName,
	   @AssemblyLocation,
	   @AlterAssemblyCommand

WHILE @@FETCH_STATUS = 0
BEGIN

	BEGIN TRY
		exec sp_executesql @AlterAssemblyCommand

		insert into #Results
		SELECT @AssemblyName,
				@AssemblyLocation,
				@AlterAssemblyCommand,
				'Assembly refreshed successfully'

	END TRY
	BEGIN CATCH

		insert into #Results
		SELECT @AssemblyName,
				@AssemblyLocation,
				@AlterAssemblyCommand,
				CASE ERROR_NUMBER()
					WHEN 6285 THEN 'No update necessary (MVID match)'
					WHEN 6501 THEN 'Physical assembly not found at specified location (SQL Error 6501)'
					ELSE ERROR_MESSAGE() + ' (SQL Error ' + convert(varchar(10), ERROR_NUMBER()) + ')'
				END

	END CATCH

	FETCH NEXT FROM Commands
	INTO @AssemblyName,
		   @AssemblyLocation,
		   @AlterAssemblyCommand

END

CLOSE Commands
DEALLOCATE Commands

SELECT * FROM #Results

drop table #refresh
drop table #Results

While troubleshooting the error, I came across this as well – I don’t have a SQL 2012 server handy to check with, it looks like this problem might be resolved with a reboot in SQL 2012:

http://msdn.microsoft.com/en-us/library/hh479773.aspx

I’m not sure if that’s the case, but it would make things easier. Also, it would be nice if there was some kind of “ALTER ASSEMBLY [SomeAssembly] REFRESH” command to address this problem, but maybe in a future version.

Additionally, this error can be caused if the signing of an assembly has changed, but not necessarily the signature, but just reloading it from disk won’t work because the method that SQL Server uses to determine if there’s been a change is the MVID, and that doesn’t change unless there’s been a code change of some kind (see the Connect bug here). In those cases, you’ll actually need to drop and recreate the assembly, including any supporting objects that rely on that assembly.

Calculating working hours between two dates

As a follow-up to an earlier post (Return a list of all dates between a start and end date), I need to find the number of working hours between two timestamps – in this case, it was to see how long a support ticket had been open before it was initially assigned, but the user didn’t want non-work hours to count against them.

To do this, I used the previous script to generate a list of dates and hours, and then marked the rows as work time or not (based on day of week and hour of day, evaluated together). The result was a table that would effectively let me do a SUM to find the value I was looking for. Once I had that table, I could join to it for rows between the two datetimes in question and SUM up rows that had “WorkTime” marked:

SELECT tt.TicketNumber,
       tt.TicketCreateTime,
       tt.TicketAssignTime,
       SUM(  CONVERT(INT, wh.WorkTime)) as WorkHoursBeforeAssigned
       COUNT(CONVERT(INT, wh.WorkTime)) as TotalHoursBeforeAssigned
  FROM TroubleTickets tt
  JOIN #WorkingHours wh
    ON wh.EvaluateTime BETWEEN tt.TicketCreateTime
                           AND tt.TicketAssignTime
GROUP BY tt.TicketNumber,
         tt.TicketCreateTime,
         tt.TicketAssignTime

In this case, tickets that were created and picked up after hours, without passing any worktime, would show as zero hours old (as they should, since they were interested in working time) – however, I’ve also included COUNT here to show total hours as well as work hours.

Also, this script only counts for raw day-of-week and hour-of-day working time – it ignores holidays and other special circumstances. I have a script that tracks holidays (American ones, at least), and I’ll put that up shortly as well – if you want to take holidays into account, you could incorporate that into your evaluation.

Here’s the script that builds the working time table (you can also download it here):

-- Set things up before we get started
--------------------------------------
DECLARE @WorkTimeStart		TINYINT,
		@WorkTimeEnd		TINYINT,
		@WorkDayOfWeekStart	TINYINT,
		@WorkDayOfWeekEnd	TINYINT

DECLARE @StartDate			DATETIME,
		@EndDate			DATETIME

CREATE TABLE #WorkingHours (
		EvaluateTime	DATETIME,
		IsWorktime		BIT DEFAULT(0)
)

--------------------------------------

	SET @WorkTimeStart = 7  --7AM
	SET @WorkTimeEnd   = 16 --4PM hour (4-5PM count as working)
	SET @WorkDayOfWeekStart = 2 --Monday
	SET @WorkDayOfWeekEnd   = 6 --Friday

	SET @StartDate	= '2000-01-01 00:00:00'
	SET @EndDate	= '2020-12-31 23:59:59'

--------------------------------------


-- Built the list of timestamps we're working with
;WITH numberlist(number)
   AS (SELECT RANK() over(order by c1.object_id,
                                   c1.column_id,
                                   c2.object_id,
                                   c2.column_id)
		 from sys.columns c1
        cross 
         join sys.columns c2)
INSERT INTO #WorkingHours (EvaluateTime)
SELECT DATEADD(hh, number-1, @StartDate)
  FROM numberlist
 WHERE DATEADD(hh, number-1, @StartDate) <= @EndDate


-- Set the times to worktime if they match criteria
UPDATE #WorkingHours
   SET IsWorktime = CASE WHEN (DATEPART(dw, EvaluateTime)
								BETWEEN @WorkDayOfWeekStart
								AND @WorkDayOfWeekEnd)
							  AND
							  (DATEPART(hh, EvaluateTime)
							   BETWEEN @WorkTimeStart
							   AND @WorkTimeEnd) THEN 1
						 ELSE 0
					END


-- Retun the results
 SELECT * FROM #WorkingHours
 ORDER BY EvaluateTime

 DROP TABLE #WorkingHours

Clean up vendor names and other data with unwanted numbers/characters

In creating an accounting report, the vendor names we get back from our credit card processor needed some major clean-up: “52334SOUTHWESTAIR1234”, “ABD2343-BLUE DINER 843”, and so on. I initially found a great function for this from Pinal Dave:

http://blog.sqlauthority.com/2007/05/13/sql-server-udf-function-to-parse-alphanumeric-characters-from-string/

But I wanted to make a few enhancements to it:

  1. He leaves numbers in the string and I’d like to remove them
  2. I’d like to combine multiple spaces in a row into a single space, but leave spaces intact

The first is pretty easy to do – just remove the [0-9] and add a space to the PATINDEX. The second one uses a trick from another post I did a few years ago.

Here’s the modified version:

CREATE FUNCTION dbo.UDF_ParseAlphaChars2
(
   @string VARCHAR(8000)
)
RETURNS VARCHAR(8000) WITH SCHEMABINDING
AS
BEGIN
   DECLARE @IncorrectCharLoc SMALLINT
   SET @IncorrectCharLoc = PATINDEX('%[^ A-Za-z]%', @string)

   WHILE @IncorrectCharLoc > 0
   BEGIN
      SET @string = STUFF(@string, @IncorrectCharLoc, 1, '')
      SET @IncorrectCharLoc = PATINDEX('%[^ A-Za-z]%', @string)
   END

   -- Trim groups of spaces into single space
   SET @string = LTRIM(RTRIM(REPLACE(REPLACE(REPLACE(@string,' ','<>'),'><',''),'<>',' ')))

   RETURN @string
END
GO

--Test
SELECT dbo.UDF_ParseAlphaChars2('ABC”_I+{D[]}4|:e;””5,<.F>/?6')
SELECT dbo.UDF_ParseAlphaChars2('52334SOUTHWESTAIR1234')
SELECT dbo.UDF_ParseAlphaChars2('ABD2343-BLUE DINER 843')
GO