Clean up vendor names and other data with unwanted numbers/characters

In creating an accounting report, the vendor names we get back from our credit card processor needed some major clean-up: “52334SOUTHWESTAIR1234”, “ABD2343-BLUE DINER 843”, and so on. I initially found a great function for this from Pinal Dave:

http://blog.sqlauthority.com/2007/05/13/sql-server-udf-function-to-parse-alphanumeric-characters-from-string/

But I wanted to make a few enhancements to it:

  1. He leaves numbers in the string and I’d like to remove them
  2. I’d like to combine multiple spaces in a row into a single space, but leave spaces intact

The first is pretty easy to do – just remove the [0-9] and add a space to the PATINDEX. The second one uses a trick from another post I did a few years ago.

Here’s the modified version:

CREATE FUNCTION dbo.UDF_ParseAlphaChars2
(
   @string VARCHAR(8000)
)
RETURNS VARCHAR(8000) WITH SCHEMABINDING
AS
BEGIN
   DECLARE @IncorrectCharLoc SMALLINT
   SET @IncorrectCharLoc = PATINDEX('%[^ A-Za-z]%', @string)

   WHILE @IncorrectCharLoc > 0
   BEGIN
      SET @string = STUFF(@string, @IncorrectCharLoc, 1, '')
      SET @IncorrectCharLoc = PATINDEX('%[^ A-Za-z]%', @string)
   END

   -- Trim groups of spaces into single space
   SET @string = LTRIM(RTRIM(REPLACE(REPLACE(REPLACE(@string,' ','<>'),'><',''),'<>',' ')))

   RETURN @string
END
GO

--Test
SELECT dbo.UDF_ParseAlphaChars2('ABC”_I+{D[]}4|:e;””5,<.F>/?6')
SELECT dbo.UDF_ParseAlphaChars2('52334SOUTHWESTAIR1234')
SELECT dbo.UDF_ParseAlphaChars2('ABD2343-BLUE DINER 843')
GO

Leave a Reply

Your email address will not be published. Required fields are marked *

Why ask?