dcaoyuan

HTML Entity Refs and xmerl


According to the [erlang-bugs] thread "xmerl and standard HTML entity refs", xmerl_scan currently recognizes only a very limited set of entity references. In brief, if you try to scan XML text with xmerl that includes standard HTML entity refs such as nbsp, iexcl, pound, or frac14, you'll encounter something like:

16> edoc:file("exprecs.erl").
2670- fatal: {unknown_entity_ref,nbsp}
2580- fatal: error_scanning_entity_ref
exprecs.erl, in module header: at line 28: error in XML parser: {fatal,
                         {error_scanning_entity_ref,
                             {file,file_name_unknown},
                             {line,86},
                             {col,18}}}.
** exited: error **

Ulf Wiger said:

... I realize that xmerl can be customized with a rules function which, for example, can handle entity references...

So I gave it a try: I wrote a piece of code (html_entity_refs.erl) that parses an HTML entity ref DTD file into ets rules, then:

xmerl_scan:string(XmlText, [{rules, html_entity_refs:get_xmerl_rules()}]).

Yes, this works. But for a 3 MB test file, parsing took about 30 seconds.

How about converting these entity refs to UTF-8 chars first, and then applying xmerl_scan to the result?
I wrote another piece of code, and now:

xmerl_scan:string(html_entity_refs:decode_for_xml(XmlText)).
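To illustrate the pre-decoding idea, here is a minimal sketch. It is not the code in html_entity_refs.erl: the module name entity_sketch and its four-entry table are hypothetical, and instead of emitting UTF-8 characters it rewrites each known named reference as a numeric character reference (for example, &nbsp; becomes &#160;), which an XML parser can resolve without a DTD. Unknown names, the predefined XML entities such as &amp;, and existing numeric references are copied through unchanged.

```erlang
-module(entity_sketch).
-export([decode_for_xml/1]).

%% Hypothetical, deliberately tiny table: entity name -> Unicode code point.
%% A real table would cover the full HTML entity sets.
code_point("nbsp")   -> 160;
code_point("iexcl")  -> 161;
code_point("pound")  -> 163;
code_point("frac14") -> 188;
code_point(_)        -> undefined.

%% Rewrite known named references such as "&nbsp;" as numeric character
%% references such as "&#160;". Everything else is copied unchanged.
decode_for_xml(Text) when is_list(Text) ->
    lists:reverse(decode(Text, [])).

%% Acc holds the output in reverse order.
decode([$& | Rest], Acc) ->
    case take_name(Rest, []) of
        {Name, After} ->
            case code_point(Name) of
                undefined ->
                    %% Not in the table: keep the "&" and continue.
                    decode(Rest, [$& | Acc]);
                Cp ->
                    Ref = "&#" ++ integer_to_list(Cp) ++ ";",
                    decode(After, lists:reverse(Ref, Acc))
            end;
        none ->
            decode(Rest, [$& | Acc])
    end;
decode([C | Rest], Acc) ->
    decode(Rest, [C | Acc]);
decode([], Acc) ->
    Acc.

%% Collect an alphanumeric name up to its terminating ";".
take_name([$; | Rest], Name) when Name =/= [] ->
    {lists:reverse(Name), Rest};
take_name([C | Rest], Name)
  when C >= $a, C =< $z; C >= $A, C =< $Z; C >= $0, C =< $9 ->
    take_name(Rest, [C | Name]);
take_name(_, _) ->
    none.
```

With this sketch, a call such as xmerl_scan:string(entity_sketch:decode_for_xml("<p>a&nbsp;b</p>")) hands xmerl text that contains only numeric references. Note that the sketch does not skip CDATA sections or comments, so references inside them would be rewritten as well.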

This time, decoding plus parsing takes about 5 seconds, which is 6 times faster than the ets rules solution.
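For anyone who wants to repeat this kind of comparison, a small helper built on timer:tc/1 is one way to measure wall-clock time. This is only a sketch: the module name timing_sketch is hypothetical, and it is not necessarily how the figures in this post were measured.

```erlang
-module(timing_sketch).
-export([seconds/1]).

%% Run a zero-arity fun once and return the elapsed wall-clock time in seconds.
%% timer:tc/1 returns {Microseconds, Result}; the result is discarded here.
seconds(Fun) when is_function(Fun, 0) ->
    {Micros, _Result} = timer:tc(Fun),
    Micros / 1000000.
```

For example, timing_sketch:seconds(fun() -> xmerl_scan:string(html_entity_refs:decode_for_xml(XmlText)) end) would time the decode-then-parse path, and the same wrapper around the rules-based call would time the other.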

html_entity_refs.erl is available from:
http://caoyuan.googlecode.com/svn/trunk/erlang/html_entity_refs.erl
