In creating an accounting report, the vendor names we get back from our credit card processor needed some major clean-up: “52334SOUTHWESTAIR1234”, “ABD2343-BLUE DINER 843”, and so on. I initially found a great function for this from Pinal Dave:
But I wanted to make a few enhancements to it:
- He leaves numbers in the string and I’d like to remove them
- I’d like to combine multiple spaces in a row into a single space, but leave spaces intact
The first is pretty easy to do – just remove the [0-9] and add a space to the PATINDEX. The second one uses a trick from another post I did a few years ago.
Here’s the modified version:
CREATE FUNCTION dbo.UDF_ParseAlphaChars2 ( @string VARCHAR(8000) ) RETURNS VARCHAR(8000) WITH SCHEMABINDING AS BEGIN DECLARE @IncorrectCharLoc SMALLINT SET @IncorrectCharLoc = PATINDEX('%[^ A-Za-z]%', @string) WHILE @IncorrectCharLoc > 0 BEGIN SET @string = STUFF(@string, @IncorrectCharLoc, 1, '') SET @IncorrectCharLoc = PATINDEX('%[^ A-Za-z]%', @string) END -- Trim groups of spaces into single space SET @string = LTRIM(RTRIM(REPLACE(REPLACE(REPLACE(@string,' ','<>'),'><',''),'<>',' '))) RETURN @string END GO --Test SELECT dbo.UDF_ParseAlphaChars2('ABC”_I+{D[]}4|:e;””5,<.F>/?6') SELECT dbo.UDF_ParseAlphaChars2('52334SOUTHWESTAIR1234') SELECT dbo.UDF_ParseAlphaChars2('ABD2343-BLUE DINER 843') GO