Tuesday, October 19, 2010

How to strip HTML and special characters from a string

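I have created the following functions to strip (remove) HTML tags and special characters from a string.
Regular expression for special characters: ([^\w\+\-\\\/_ ])
This matches any character that is not a word character (letter, digit or underscore), +, -, \, / or a space.
Regular expression for stripping HTML: <(.|\n)+?>
This matches a "<", then as few characters as possible (including newlines), up to the next ">".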
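To see what each pattern removes on its own, here is a minimal sketch. It assumes the tag pattern is meant to read <(.|\n)+?> (any character or a newline between the angle brackets). The class and method names (PatternDemo, RemoveSpecialChars, RemoveTags) are made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, only meant to show what the two patterns match.
public static class PatternDemo
{
    // Anything that is NOT a word character, +, -, \, /, _ or a space.
    public const string SpecialCharPattern = @"([^\w\+\-\\\/_ ])";

    // "<", then one or more characters (or newlines), lazily, up to the next ">".
    public const string HtmlTagPattern = "<(.|\n)+?>";

    public static string RemoveSpecialChars(string input)
    {
        return Regex.Replace(input, SpecialCharPattern, "");
    }

    public static string RemoveTags(string input)
    {
        return Regex.Replace(input, HtmlTagPattern, "");
    }

    public static void Main()
    {
        // Punctuation is dropped; +, - and / are kept.
        Console.WriteLine(RemoveSpecialChars("Hello, World! (50% off) a+b-c/d")); // Hello World 50 off a+b-c/d

        // Each tag is matched separately because the quantifier is lazy.
        Console.WriteLine(RemoveTags("<p>Hello <b>World</b></p>")); // Hello World
    }
}
```

Note that \w is Unicode-aware in .NET by default, so accented letters and non-Latin scripts are kept as well.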

-------------Code--------------------
public static string StripHTMLAndSpecialChars(string sHTML)
{
    // Remove the HTML tags first
    string strOutput = StripHTML(sHTML);
    System.Text.RegularExpressions.Regex objRegExp = new System.Text.RegularExpressions.Regex(@"([^\w\+\-\\\/_ ])");
    // Replace all characters other than the ones specified above
    strOutput = objRegExp.Replace(strOutput, "");
    return strOutput;
}

----------------Function to strip HTML-----------------
public static string StripHTML(string sHTML)
{
    // "<", then any characters (including newlines), up to the next ">"
    System.Text.RegularExpressions.Regex objRegExp = new System.Text.RegularExpressions.Regex("<(.|\n)+?>", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    // Replace all HTML tag matches with the empty string
    string strOutput = objRegExp.Replace(sHTML, "");
    return strOutput;
}
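
----------------Usage example (sketch)-----------------
The following is a small, self-contained sketch of how the two functions behave on a few inputs. It is a variation rather than the exact code above: the class name HtmlCleaner is made up, the tag pattern is assumed to be <(.|\n)+?>, and the Regex objects are kept in static fields so they are built only once.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; the method names follow the post.
public static class HtmlCleaner
{
    // Built once and reused across calls instead of once per call.
    private static readonly Regex TagPattern = new Regex("<(.|\n)+?>");
    private static readonly Regex SpecialCharPattern = new Regex(@"([^\w\+\-\\\/_ ])");

    public static string StripHTML(string sHTML)
    {
        return TagPattern.Replace(sHTML, "");
    }

    public static string StripHTMLAndSpecialChars(string sHTML)
    {
        return SpecialCharPattern.Replace(StripHTML(sHTML), "");
    }

    public static void Main()
    {
        // Tags are removed first, then the punctuation.
        Console.WriteLine(StripHTMLAndSpecialChars("<p>Hello, <b>World</b>!</p>")); // Hello World

        // A tag that spans two lines is still matched because of the \n alternative.
        Console.WriteLine(StripHTMLAndSpecialChars("<a\nhref=\"#\">link</a>")); // link

        // Entities are not decoded: '&' and ';' are dropped and "amp" is left behind.
        Console.WriteLine(StripHTMLAndSpecialChars("Tom &amp; Jerry")); // Tom amp Jerry
    }
}
```

A regular expression like this is fine for cleaning simple markup for display or for building slugs, but it is not an HTML parser. It should not be relied on to sanitize untrusted input.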

-----------------------------
