
XULRunner with Java: JavaXPCOM Tutorial 3


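6 W3C DOM access to the loaded page

6.1 The mozdom4java library
  Accessing the W3C DOM tree is preferable to accessing Mozilla's DOM tree, because the W3C DOM is the standard for dynamically accessing the DOM tree of HTML and XML documents. To do this we use a Java bridge from the Mozilla DOM to the W3C DOM, provided by a project called mozdom4java: http://mozdom4java.mozdev.org/index.html.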


    // When that button is pressed, we obtain the HTML document corresponding to
    // the URL loaded in the browser. Next, we extract all its nodes with the 'a'
    // tag name and print their content.
    final ToolItem anchorItem = new ToolItem(toolbar, SWT.PUSH);
    anchorItem.addSelectionListener(new SelectionAdapter() {
            public void widgetSelected(SelectionEvent event) {
                    // First, we obtain a Mozilla DOM Document representation
                    nsIDOMDocument doc = browser.getDocument();
                    // Get all anchors from the loaded HTML document
                    nsIDOMNodeList nodeList = doc.getElementsByTagName("a");
                    for (int i = 0; i < nodeList.getLength(); i++) {
                            // Get Mozilla DOM node
                            nsIDOMNode mozNode = nodeList.item(i);
                            // Get the appropriate interface
                            nsIDOMHTMLAnchorElement mozAnchor =
                                    (nsIDOMHTMLAnchorElement) mozNode.queryInterface(
                                            nsIDOMHTMLAnchorElement.NS_IDOMHTMLANCHORELEMENT_IID);
                            // Get the corresponding W3C DOM node
                            HTMLAnchorElement a = (HTMLAnchorElement)
                                    HTMLAnchorElementImpl.getDOMInstance(mozAnchor);
                            // Test the HTML element
                            System.out.println("Tag Name: " + a.getNodeName() + " -- Text: " + a.getTextContent()
                                            + " -- Href: " + a.getHref());
                    }
            }
    });
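
To see why the standard interfaces are attractive, the same anchor listing can be written purely against org.w3c.dom, with no Mozilla types at all. The sketch below is not part of the tutorial's browser: it assumes a small well-formed XHTML string parsed with the JDK's own XML parser, and the class name W3cAnchorDemo and its helper methods are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class W3cAnchorDemo {

    /** Collects "text -> href" for every <a> element of a W3C DOM document. */
    static List<String> describeAnchors(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            // getElementsByTagName only returns element nodes, so the cast is safe
            Element a = (Element) anchors.item(i);
            result.add(a.getTextContent() + " -> " + a.getAttribute("href"));
        }
        return result;
    }

    /** Parses a well-formed XHTML string with the JDK's built-in XML parser. */
    static Document parse(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical page content, used only for this illustration
        String page = "<html><body>"
                + "<a href=\"http://mozdom4java.mozdev.org/index.html\">mozdom4java</a>"
                + "<p>no link here</p>"
                + "<a href=\"/Docs/ReadMe.html\">docs</a>"
                + "</body></html>";
        for (String line : describeAnchors(parse(page))) {
            System.out.println(line);
        }
    }
}
```

Because describeAnchors only depends on org.w3c.dom.Document, it should work equally well on a document obtained through a bridge such as mozdom4java, which is the point of converting Mozilla nodes to W3C nodes.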

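6.2 Patching mozdom4java to convert the Mozilla DOM tree into a W3C DOM tree
If we always want to work with the W3C DOM tree, converting nodes one by one becomes tedious, so we suggest modifying mozdom4java. In our view these changes simplify the code, because we can forget about Mozilla DOM nodes altogether. Later, when we discuss XPath, the evaluator will return a list of nodes, and working with W3C elements is more convenient than working with Mozilla nodes. In other words, our goal is to build a usable web browser that can be driven through standard interfaces, without any knowledge of the Mozilla implementation.

First, we need the Java Language Binding for the DOM Level 2 specification. The more convenient route is to download the jar packages from the mozdom4java project:

  w3chtml.jar contains the W3C DOM HTML Level 2 interfaces, split into two packages, org.w3c.dom.html and org.w3c.dom.html2
  w3cextension.jar contains the KeyEvent class in the org.w3c.dom.events package.

  We are going to create a factory for HTML elements: a class that converts Mozilla DOM element nodes into the corresponding W3C DOM element nodes. The class below does exactly that and is heavily commented. It uses Java reflection to do the job, so you do not need to know anything about the individual Mozilla DOM node types.
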
    package es.ladyr.dom;

    import java.lang.reflect.Field;  
    import java.lang.reflect.Method;  
    import java.util.HashMap;  
    import java.util.Map;  
    import org.mozilla.interfaces.*;  
    import org.w3c.dom.html.HTMLElement;  
    public class HTMLElementFactory {  
            private static HTMLElementFactory instance;  
            private Map<String, String> corresp;  
            private HTMLElementFactory() {
                    // Fill the tag-name map once, when the singleton is created
                    initCorrespondence();
            }
            public static HTMLElementFactory getInstance(){
                    if(instance == null){
                            instance = new HTMLElementFactory();
                    }
                    return instance;
            }
            public static HTMLElement getHTMLElement(nsIDOMNode nsNode) {
                    return getInstance().getConcreteNode(nsNode);
            }
            private void initCorrespondence() {
                    corresp = new HashMap<String, String>();  
                    corresp.put("a", "Anchor");  
                    corresp.put("applet", "Applet");  
                    corresp.put("area", "Area");  
                    corresp.put("base", "Base");  
                    corresp.put("basefont", "BaseFont");  
                    corresp.put("body", "Body");  
                    corresp.put("br", "BR");  
                    corresp.put("button", "Button");  
                    corresp.put("dir", "Directory");  
                    corresp.put("div", "Div");  
                    corresp.put("dl", "DList");  
                    corresp.put("fieldset", "FieldSet");  
                    corresp.put("font", "Font");  
                    corresp.put("form", "Form");  
                    corresp.put("frame", "Frame");  
                    corresp.put("frameset", "FrameSet");  
                    corresp.put("head", "Head");  
                    corresp.put("h1", "Heading");  
                    corresp.put("h2", "Heading");  
                    corresp.put("h3", "Heading");  
                    corresp.put("h4", "Heading");  
                    corresp.put("h5", "Heading");  
                    corresp.put("h6", "Heading");  
                    corresp.put("hr", "HR");  
                    corresp.put("html", "Html");  
                    corresp.put("iframe", "IFrame");  
                    corresp.put("img", "Image");  
                    corresp.put("input", "Input");  
                    corresp.put("isindex", "IsIndex");  
                    corresp.put("label", "Label");  
                    corresp.put("legend", "Legend");  
                    corresp.put("li", "LI");  
                    corresp.put("link", "Link");  
                    corresp.put("map", "Map");  
                    corresp.put("menu", "Menu");  
                    corresp.put("meta", "Meta");  
                    corresp.put("ins", "Mod");  
                    corresp.put("del", "Mod");  
                    corresp.put("object", "Object");  
                    corresp.put("ol", "OList");  
                    corresp.put("optgroup", "OptGroup");  
                    corresp.put("option", "Option");  
                    corresp.put("p", "Paragraph");  
                    corresp.put("param", "Param");  
                    corresp.put("pre", "Pre");  
                    corresp.put("q", "Quote");  
                    corresp.put("script", "Script");  
                    corresp.put("select", "Select");  
                    corresp.put("style", "Style");  
                    corresp.put("caption", "TableCaption");  
                    corresp.put("td", "TableCell");  
                    corresp.put("col", "TableCol");  
                    corresp.put("table", "Table");  
                    corresp.put("tr", "TableRow");  
                    corresp.put("thead", "TableSection");  
                    corresp.put("tfoot", "TableSection");  
                    corresp.put("tbody", "TableSection");  
                    corresp.put("textarea", "TextArea");  
                    corresp.put("title", "Title");  
                    corresp.put("ul", "UList");  
            }
            /**
             * Try to convert a Mozilla DOM node into a W3C DOM element.
             * @param nsNode        node to convert into a W3C DOM element.
             * @return      W3C HTML element corresponding to the Mozilla DOM node,
             *              or null if the node cannot be converted.
             */
            public HTMLElement getConcreteNode(nsIDOMNode nsNode) {
                    // Only converts element nodes. If the mozilla node  
                    // isn't a Mozilla DOM element, we cannot convert into  
                    // an W3C DOM element  
                    if (nsNode.getNodeType() == nsIDOMNode.ELEMENT_NODE) {  
                            // We use a hashmap to obtain element names from node names
                            // (node names are lower-cased to match the map keys)
                            String htmlElementType = corresp.get(nsNode.getNodeName().toLowerCase());
                            // If we don't know the element type, we cannot transform
                            // that node into a W3C DOM element
                            if (htmlElementType == null) {
                                    return null;
                            }
                            // Compose the class name for the Mozilla DOM element.  
                            String nsClassName = "org.mozilla.interfaces.nsIDOMHTML"  
                                            + htmlElementType + "Element";  
                            // Compose the field name for the element IID  
                            String nsFieldInterfaceName = "NS_IDOMHTML"  
                                            + htmlElementType.toUpperCase() + "ELEMENT_IID";  
                            try {  
                                    // Once we have their names, obtain the class and the field  
                                    Class nsClass = Class.forName(nsClassName);  
                                    Field field = nsClass.getField(nsFieldInterfaceName);  
                                    // Get the field value (it is a static field, so the argument is ignored)
                                    String iid = (String) field.get(null);
                                    // Get the appropriate node interface
                                    Object nsElement = nsNode.queryInterface(iid);  
                                    // Build the W3C DOM Element implementation class name  
                                    // (the package org.mozilla.dom.html contains concrete implementations  
                                    // for the W3C HTML element interfaces)  
                                    String w3cClassName = "org.mozilla.dom.html.HTML"  
                                                    + htmlElementType + "ElementImpl";  
                                    // Obtain the class for the corresponding W3C DOM Element implementation  
                                    Class w3cClass = Class.forName(w3cClassName);  
                                    // Extract the method that must be invoked to transform the element  
                                    Method creationMethod = w3cClass.getMethod("getDOMInstance", nsClass);  
                                    // Invokes getDOMInstance method of corresponding W3C HTML element  
                                    //  which returns an instance of corresponding W3C HTML element  
                                    HTMLElement node = (HTMLElement) creationMethod.invoke(null, nsElement);  
                                    return node;  
                            } catch (Exception e) {
                                    throw new Error(e);
                            }
                    }
                    return null;
            }
    }
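
The reflection steps in the factory (compose a class name, read a static IID field, invoke a static getDOMInstance method) can be tried without any Mozilla jars on the classpath. The following is only a sketch under that assumption: FakeAnchorElement and FakeAnchorElementImpl are hypothetical stand-ins for a Mozilla interface and its mozdom4java wrapper, and the three name helpers simply repeat the string concatenation used by HTMLElementFactory.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class ReflectionLookupDemo {

    /** Hypothetical stand-in for a Mozilla interface such as nsIDOMHTMLAnchorElement. */
    public interface FakeAnchorElement {
        String FAKE_ANCHORELEMENT_IID = "{fake-anchor-iid}";
        String href();
    }

    /** Hypothetical stand-in for a wrapper class such as HTMLAnchorElementImpl. */
    public static class FakeAnchorElementImpl {
        private final FakeAnchorElement wrapped;
        private FakeAnchorElementImpl(FakeAnchorElement wrapped) { this.wrapped = wrapped; }
        public static FakeAnchorElementImpl getDOMInstance(FakeAnchorElement e) {
            return new FakeAnchorElementImpl(e);
        }
        public String getHref() { return wrapped.href(); }
    }

    // The same name composition the factory performs for a mapped element type
    static String interfaceName(String type) {
        return "org.mozilla.interfaces.nsIDOMHTML" + type + "Element";
    }
    static String iidFieldName(String type) {
        return "NS_IDOMHTML" + type.toUpperCase() + "ELEMENT_IID";
    }
    static String implName(String type) {
        return "org.mozilla.dom.html.HTML" + type + "ElementImpl";
    }

    /** Resolves the stand-in classes by name and wraps the element via reflection. */
    static Object wrap(String type, Object element) throws Exception {
        String prefix = ReflectionLookupDemo.class.getName() + "$Fake";
        Class<?> iface = Class.forName(prefix + type + "Element");
        Field iidField = iface.getField("FAKE_" + type.toUpperCase() + "ELEMENT_IID");
        String iid = (String) iidField.get(null);   // static field: the receiver is ignored
        System.out.println("would call queryInterface(" + iid + ")");
        Class<?> impl = Class.forName(prefix + type + "ElementImpl");
        Method factory = impl.getMethod("getDOMInstance", iface);
        return factory.invoke(null, element);       // static method: no receiver needed
    }

    public static void main(String[] args) throws Exception {
        System.out.println(interfaceName("Anchor"));
        System.out.println(iidFieldName("Anchor"));
        System.out.println(implName("Anchor"));
        FakeAnchorElement anchor = () -> "http://example.org/";
        FakeAnchorElementImpl wrapped = (FakeAnchorElementImpl) wrap("Anchor", anchor);
        System.out.println(wrapped.getHref());
    }
}
```

Note that toUpperCase() without a Locale argument depends on the default locale, so on a machine with a Turkish default locale a name containing "i" would not produce the expected field name; passing Locale.ROOT avoids that.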

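Using our HTMLElementFactory class, we now modify the NodeFactory class. After the change you can call org.w3c.dom.Node getNodeInstance(nsIDOMNode node); when the input node is of type nsIDOMNode.ELEMENT_NODE, the returned value is the corresponding W3C DOM element.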

    // Import our factory to create W3C HTML elements from Mozilla DOM elements
    import es.ladyr.dom.HTMLElementFactory;

    public static Node getNodeInstance( nsIDOMNode node )
    {
        if (node == null) {
            return null;
        }
        switch ( node.getNodeType() )
        {
            case nsIDOMNode.ELEMENT_NODE:
                // Use our factory to obtain a W3C HTML DOM element
                Node htmlElement = HTMLElementFactory.getHTMLElement(node);
                if (htmlElement != null) {
                    return htmlElement;
                } else {
                    // If the factory cannot convert the concrete node (for instance,
                    // the type is unknown to our factory implementation), then
                    // return a generic W3C DOM element
                    return ElementImpl.getDOMInstance((nsIDOMElement)
                            node.queryInterface(nsIDOMElement.NS_IDOMELEMENT_IID));
                }
            // ... the remaining cases are unchanged ...
        }
    }


Below is the complete code of the NodeFactory class:

    /* ***** BEGIN LICENSE BLOCK ***** 
     * Version: MPL 1.1/GPL 2.0/LGPL 2.1 
     * The contents of this file are subject to the Mozilla Public License Version 
     * 1.1 (the "License"); you may not use this file except in compliance with 
     * the License. You may obtain a copy of the License at 
     * http://www.mozilla.org/MPL/ 
     * Software distributed under the License is distributed on an "AS IS" basis, 
     * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License 
     * for the specific language governing rights and limitations under the 
     * License. 
     * The Original Code is mozdom4java 
     * The Initial Developer of the Original Code is 
     * Peter Szinek, Lixto Software GmbH, http://www.lixto.com. 
     * Portions created by the Initial Developer are Copyright (C) 2005-2006 
     * the Initial Developer. All Rights Reserved. 
     * Contributor(s): 
     *  Peter Szinek (peter@rubyrailways.com) 
     *  Michal Ceresna (michal.ceresna@gmail.com) 
     * Alternatively, the contents of this file may be used under the terms of 
     * either the GNU General Public License Version 2 or later (the "GPL"), or 
     * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"), 
     * in which case the provisions of the GPL or the LGPL are applicable instead 
     * of those above. If you wish to allow use of your version of this file only 
     * under the terms of either the GPL or the LGPL, and not to allow others to 
     * use your version of this file under the terms of the MPL, indicate your 
     * decision by deleting the provisions above and replace them with the notice 
     * and other provisions required by the GPL or the LGPL. If you do not delete 
     * the provisions above, a recipient may use your version of this file under 
     * the terms of any one of the MPL, the GPL or the LGPL. 
     * ***** END LICENSE BLOCK ***** */  
    package org.mozilla.dom;

    import org.w3c.dom.Node;
    import org.mozilla.dom.*;
    import org.mozilla.interfaces.*;
    // Import our factory to create W3C HTML elements from Mozilla DOM elements
    import es.ladyr.dom.HTMLElementFactory;

    public class NodeFactory
    {
        private NodeFactory()
        {}

        public static Node getNodeInstance( nsIDOMEventTarget eventTarget )
        {
            if (eventTarget == null) {
                return null;
            }
            nsIDOMNode node = (nsIDOMNode) eventTarget.queryInterface(nsIDOMNode.NS_IDOMNODE_IID);
            return getNodeInstance(node);
        }

        public static Node getNodeInstance( nsIDOMNode node )
        {
            if (node == null) {
                return null;
            }
            switch ( node.getNodeType() )
            {
                case nsIDOMNode.ELEMENT_NODE:
                    // Use our factory to obtain a W3C HTML DOM element
                    Node htmlElement = HTMLElementFactory.getHTMLElement(node);
                    if (htmlElement != null) {
                        return htmlElement;
                    } else {
                        // If the factory cannot convert the concrete node (for instance,
                        // the type is unknown to our factory implementation), then
                        // return a generic W3C DOM element
                        return ElementImpl.getDOMInstance((nsIDOMElement)
                                node.queryInterface(nsIDOMElement.NS_IDOMELEMENT_IID));
                    }
                case nsIDOMNode.ATTRIBUTE_NODE: return AttrImpl.getDOMInstance((nsIDOMAttr)
                        node.queryInterface(nsIDOMAttr.NS_IDOMATTR_IID));
                case nsIDOMNode.TEXT_NODE: return TextImpl.getDOMInstance((nsIDOMText)
                        node.queryInterface(nsIDOMText.NS_IDOMTEXT_IID));
                case nsIDOMNode.CDATA_SECTION_NODE: return CDATASectionImpl.getDOMInstance((nsIDOMCDATASection)
                        node.queryInterface(nsIDOMCDATASection.NS_IDOMCDATASECTION_IID));
                case nsIDOMNode.ENTITY_REFERENCE_NODE: return EntityReferenceImpl.getDOMInstance((nsIDOMEntityReference)
                        node.queryInterface(nsIDOMEntityReference.NS_IDOMENTITYREFERENCE_IID));
                case nsIDOMNode.ENTITY_NODE: return EntityImpl.getDOMInstance((nsIDOMEntity)
                        node.queryInterface(nsIDOMEntity.NS_IDOMENTITY_IID));
                case nsIDOMNode.PROCESSING_INSTRUCTION_NODE: return ProcessingInstructionImpl.getDOMInstance((nsIDOMProcessingInstruction)
                        node.queryInterface(nsIDOMProcessingInstruction.NS_IDOMPROCESSINGINSTRUCTION_IID));
                case nsIDOMNode.COMMENT_NODE: return CommentImpl.getDOMInstance((nsIDOMComment)
                        node.queryInterface(nsIDOMComment.NS_IDOMCOMMENT_IID));
                case nsIDOMNode.DOCUMENT_NODE: return DocumentImpl.getDOMInstance((nsIDOMDocument)
                        node.queryInterface(nsIDOMDocument.NS_IDOMDOCUMENT_IID));
                case nsIDOMNode.DOCUMENT_TYPE_NODE: return DocumentTypeImpl.getDOMInstance((nsIDOMDocumentType)
                        node.queryInterface(nsIDOMDocumentType.NS_IDOMDOCUMENTTYPE_IID));
                case nsIDOMNode.DOCUMENT_FRAGMENT_NODE: return DocumentFragmentImpl.getDOMInstance((nsIDOMDocumentFragment)
                        node.queryInterface(nsIDOMDocumentFragment.NS_IDOMDOCUMENTFRAGMENT_IID));
                case nsIDOMNode.NOTATION_NODE: return NotationImpl.getDOMInstance((nsIDOMNotation)
                        node.queryInterface(nsIDOMNotation.NS_IDOMNOTATION_IID));
                default: return NodeImpl.getDOMInstance(node);
            }
        }

        public static nsIDOMNode getnsIDOMNode( Node node )
        {
            if (node instanceof NodeImpl) {
                NodeImpl ni = (NodeImpl) node;
                return ni.getInstance();
            }
            else {
                return null;
            }
        }

        private static boolean toLower = true;
        public static boolean getConvertNodeNamesToLowerCase()
        {
            return toLower;
        }
        public static void setConvertNodeNamesToLowerCase(boolean convert)
        {
            toLower = convert;
        }

        private static boolean expandFrames = false;
        public static boolean getExpandFrames()
        {
            return expandFrames;
        }
        public static void setExpandFrames(boolean expand)
        {
            expandFrames = expand;
        }
    }


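Finally, we need to modify the ElementImpl class. It has two methods, public String getAttribute(String name) and public String getTagName(), which both end by calling toLowerCase to convert their result to lower case. This can cause problems, for instance with the attributes of an anchor, so the patched versions return the result unchanged:
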

        public String getAttribute(final String name)
        {
            //METHOD-BODY-START - autogenerated code
            Callable<String> c = new Callable<String>() { public String call() {
                String result = getInstanceAsnsIDOMElement().getAttribute(name);
                return result;
            }};
            return ThreadProxy.getSingleton().syncExec(c);
            //METHOD-BODY-END - autogenerated code
        }

        public String getTagName()
        {
            //METHOD-BODY-START - autogenerated code
            Callable<String> c = new Callable<String>() { public String call() {
                String result = getInstanceAsnsIDOMElement().getTagName();
                return result;
            }};
            return ThreadProxy.getSingleton().syncExec(c);
            //METHOD-BODY-END - autogenerated code
        }
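
A quick way to see why lower-casing attribute values is risky is to apply it to a URL. The sketch below assumes the concern is case-sensitive values such as URL paths and query strings; the class AttributeCaseDemo, its two helper methods, and the example.org address are hypothetical and only mimic the behaviour before and after the change.

```java
import java.util.Locale;

public class AttributeCaseDemo {

    /** Mimics an accessor that lower-cases every value before returning it. */
    static String lowerCased(String rawValue) {
        return rawValue.toLowerCase(Locale.ROOT);
    }

    /** Mimics the patched accessor: the value is returned untouched. */
    static String unchanged(String rawValue) {
        return rawValue;
    }

    public static void main(String[] args) {
        // Hypothetical href whose path and query are case-sensitive
        String href = "http://example.org/Docs/ReadMe.html?Id=A1";
        System.out.println(lowerCased(href));               // http://example.org/docs/readme.html?id=a1
        System.out.println(unchanged(href));                // http://example.org/Docs/ReadMe.html?Id=A1
        System.out.println(lowerCased(href).equals(href));  // false
    }
}
```

On a server with a case-sensitive file system the lower-cased address may point to a different resource or to none at all, which is one plausible reason for dropping the conversion.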


6.3 Installing our patch to convert the Mozilla DOM tree into a W3C DOM tree

     patch -p0 < moz4java_patch.diff

6.4 Testing the patched library

    import org.mozilla.dom.NodeFactory;
    import org.mozilla.interfaces.*;
    import org.w3c.dom.html.HTMLAnchorElement;
    import org.w3c.dom.html.HTMLElement;

            final ToolItem anchorItem = new ToolItem(toolbar, SWT.PUSH);
            anchorItem.addSelectionListener(new SelectionAdapter() {
                    public void widgetSelected(SelectionEvent event) {
                            // First, we obtain a Mozilla DOM Document representation
                            nsIWebBrowser webBrowser = (nsIWebBrowser) browser.getWebBrowser();
                            if (webBrowser == null) {
                                    System.out.println("Could not get the nsIWebBrowser from the Browser");
                                    return;
                            }
                            nsIDOMWindow window = webBrowser.getContentDOMWindow();
                            nsIDOMDocument doc = window.getDocument();
                            // Get all anchors from the loaded HTML document
                            nsIDOMNodeList nodeList = doc.getElementsByTagName("a");
                            analyzeAnchors(nodeList);
                    }
            });

            // Helper method of the enclosing class
            private void analyzeAnchors(nsIDOMNodeList nodeList) {
                    for (int i = 0; i < nodeList.getLength(); i++) {
                            // Get Mozilla DOM node
                            nsIDOMNode mozNode = nodeList.item(i);
                            // We assume that the node list contains only HTML elements,
                            // because we only call this method on HTML nodes
                            // (NodeFactory.getNodeInstance could return other kinds of
                            //  node, depending on the input Mozilla DOM node)
                            HTMLElement htmlElement = (HTMLElement) NodeFactory.getNodeInstance(mozNode);
                            // We are only interested in anchors
                            if (htmlElement instanceof HTMLAnchorElement) {
                                    HTMLAnchorElement a = (HTMLAnchorElement) htmlElement;
                                    // Test the HTML element
                                    System.out.println("Tag Name: " + a.getNodeName()
                                                    + " -- Text: " + a.getTextContent()
                                                    + " -- Href: " + a.getHref());
                            }
                    }
            }



