Easy code: Parse HTML String to get InnerText

Today I had to take comments stored in a database and publish them in a Web Forms application.

These comments were formatted with HTML tags (and not well formed), so I needed to parse the data and keep only the inner text.

I wrote this piece of code, which is very simple but useful too.

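// Requires: using System.Text.RegularExpressions; and a reference to System.Web.
public static string GetInnerHtmltext(string data)
{
  // Decode HTML entities first, so encoded tags such as "&lt;p&gt;" become real tags and are removed too.
  string decode = System.Web.HttpUtility.HtmlDecode(data);
  // Match anything that looks like a tag, including tags that span several lines.
  Regex objRegExp = new Regex("<(.|\n)+?>");
  string replace = objRegExp.Replace(decode, "");
  // Trim leading and trailing tabs, line breaks and spaces.
  return replace.Trim("\t\r\n ".ToCharArray());
}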
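
A minimal usage sketch, to show what the method does with markup that is not well formed. The class name HtmlTextDemo and the sample string are made up for illustration; the method body is the one shown above, repeated only so the sample compiles on its own.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlTextDemo
{
    // Same logic as GetInnerHtmltext above.
    public static string GetInnerHtmltext(string data)
    {
        string decode = System.Web.HttpUtility.HtmlDecode(data);
        Regex objRegExp = new Regex("<(.|\n)+?>");
        string replace = objRegExp.Replace(decode, "");
        return replace.Trim("\t\r\n ".ToCharArray());
    }

    public static void Main()
    {
        // Hypothetical comment as it might come back from the database:
        // an unclosed <p>, a tag split over two lines and an encoded ampersand.
        string stored = "<p>Nice <b\nclass=\"x\">post</b> &amp; thanks!\r\n";

        Console.WriteLine(GetInnerHtmltext(stored)); // prints: Nice post & thanks!
    }
}
```

Keep in mind that the regular expression only removes the tags themselves. Text between <script> or <style> tags stays in the result, and because decoding happens first, text that was meant to be shown literally as "&lt;b&gt;" is stripped as well.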

Have fun!

Comments

  • Anonymous
    May 24, 2009
    Nice code, but how do I parse HTML images? My mail address is shilpakmlthn@yahoo.co.in

  • Anonymous
    March 29, 2010
    Thanks dude! The code rocks!!!

  • Anonymous
    May 15, 2010
    DUDE. This code sounds awesome... Any tips on how to get this code to parse my stuff? I have a 7 megabyte file full of HTML and I only want the visible text. I am a total newb when it comes to .NET. On the other hand, I have some experience with C++ and a lot with web languages.

  • Anonymous
    October 22, 2011
    Need more info about forms and blogs.