首页 | 互联网 | IT动态 | IT培训 | Cisco | Windows | Linux | Java | .Net | Oracle | 软件测试 | C/C++ | 嵌入式开发 | 存储世界 | 服务器
网络设备 | IDC | 安全 | 求职招聘 | 数字网校 | 网页设计 | 平面设计 | 技术专题 | 电子书下载 | 教学视频 | 源码下载 | 搜索 | 博客 | 论坛
中国IT实验室Dotnet频道
中国IT教育
Google
首页 ASP.NET  C#  XML/WebService ADO.NET VC.NET VB.NET .NET 资讯动态 专题 RSS订阅 讨论 下载
您现在的位置: 中国IT实验室 >> Dotnet >> 资讯动态 >> 正文

提取HTML代码中文字的C#函数

         方法2:

        public static string DelHTML(string Htmlstring)//将HTML去除
                 {
                     #region
                     //删除脚本
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"<script[^>]*?>.*?</script>","",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     //删除HTML
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"<(.[^>]*)>","",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"([\r\n])[\s]+","",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"-->","",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"<!--.*","",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     //Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"<A>.*</A>","");
                     //Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"<[a-zA-Z]*=\.[a-zA-Z]*\?[a-zA-Z]+=\d&\w=%[a-zA-Z]*|[A-Z0-9]","");
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(quot|#34);","\"",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(amp|#38);","&",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(lt|#60);","<",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(gt|#62);",">",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(nbsp|#160);"," ",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(iexcl|#161);","\xa1",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring = System.Text.RegularExpressions.Regex.Replace(Htmlstring,@"&(cent|#162);","\xa2",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(pound|#163);","\xa3",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring,@"&(copy|#169);","\xa9",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring =System.Text.RegularExpressions. Regex.Replace(Htmlstring, @"&#(\d+);","",System.Text.RegularExpressions.RegexOptions.IgnoreCase);
                     Htmlstring.Replace("<","");
                     Htmlstring.Replace(">","");
                     Htmlstring.Replace("\r\n","");
                     //Htmlstring=HttpContext.Current.Server.HtmlEncode(Htmlstring).Trim();
                     #endregion
                     return Htmlstring;
                 }

上一页  [1] [2] 

【责编:michael】

中国IT教育

相关产品和培训
文章评论
 友情推荐链接
 认证培训
 专题推荐

 ·WEB程序开发--ASP.NET和PHP、JSP究竟学哪个?
 ·五步带你入门XML
 ·关于Java框架技术专题
 ·XML全攻略技术专题
 ·JAVA开源技术介绍专题
 ·Java嵌入式开发之J2ME技术专题
 ·超前体验 Oracle 11g的5个新特性…
 ·揭密使用VB.NET的五个实用技巧
 ·Oracle和SQL Server常用函数对比专题…
 ·展现C#世界 C#程序设计专题…
 今日更新
 社区讨论
 博客论点
 频道精选
 Dotnet频道相关导航