SAP provides a useful class, CL_DOCX_DOCUMENT, which supports read and write access to Word documents with the “.docx” file extension.
This document gives a brief introduction to its usage and can serve as a starting point for building your own application that needs to manipulate Word documents via ABAP.
Office OpenXML
Starting with Microsoft Office 2007, a new Word document is saved by default with the “.docx” file extension, which follows the Office Open XML format. You can find its detailed definition on Wikipedia.
For example, I created a very simple Word document that contains a header area, a body paragraph with three lines, and a picture.
Because an Office Open XML file is a ZIP package, once you change the file extension from “.docx” to “.zip” its icon changes to that of an archive file, and it can be opened with a tool such as WinRAR. All information about my sample document is spread across a series of XML files inside the archive (plus media files such as pictures, music and video, if the Word document contains any).
The most efficient way to study the format is to create a Word document yourself, change its extension to “.zip” and explore it.
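The same exploration can also be done from ABAP. The following is only a minimal sketch, not part of the original sample: the report name and the subroutine list_package_entries are hypothetical, and it assumes the document binary is already available as an xstring. It uses the standard class CL_ABAP_ZIP to list the entries of the package.

```abap
REPORT zdocx_list_parts.

" Hypothetical helper (not from the original sample): lists every entry
" of the ZIP package behind a .docx file.
" iv_content is assumed to hold the document binary as xstring.
FORM list_package_entries USING iv_content TYPE xstring.
  DATA(lo_zip) = NEW cl_abap_zip( ).

  " LOAD parses the binary as a ZIP archive.
  lo_zip->load(
    EXPORTING  zip             = iv_content
    EXCEPTIONS zip_parse_error = 1
               OTHERS          = 2 ).
  IF sy-subrc <> 0.
    WRITE / 'The binary content is not a valid ZIP package.'.
    RETURN.
  ENDIF.

  " FILES is a public attribute filled by LOAD; each row has a name and a size.
  LOOP AT lo_zip->files INTO DATA(ls_file).
    WRITE: / ls_file-name, ls_file-size.
  ENDLOOP.
ENDFORM.
```

For a typical Word document the list contains entries such as [Content_Types].xml, word/document.xml and docProps/core.xml, plus one entry per embedded media file.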
Using CL_DOCX_DOCUMENT to read a Word document
I use the following sample code to explain how to use this class. To avoid unnecessary local variable declarations, I use the “inline declaration” feature available as of release 7.40. If that release is not available to you, just replace the inline declarations with classic explicit declarations of the local variables.
DATA: lv_content  TYPE xstring,
      lo_document TYPE REF TO cl_docx_document.

" (1) Load the document binary and create the document instance.
PERFORM get_doc_binary USING 'C:\Users\i042416\Desktop\test.docx' CHANGING lv_content.
lo_document = cl_docx_document=>load_document( lv_content ).
CHECK lo_document IS BOUND.

" (2) Core properties part: author, creation date, last modification date.
DATA(lo_core_part) = lo_document->get_corepropertiespart( ).
DATA(lv_core_data) = lo_core_part->get_data( ).

" (3) Main document part: the body of the document.
DATA(lo_main_part) = lo_document->get_maindocumentpart( ).

" (4) Image parts embedded in the main part; get_part expects a 0-based index.
DATA(lo_image_parts) = lo_main_part->get_imageparts( ).
DATA(lv_image_count) = lo_image_parts->get_count( ).
DO lv_image_count TIMES.
  DATA(lo_image_part) = lo_image_parts->get_part( sy-index - 1 ).
  DATA(lv_image_data) = lo_image_part->get_data( ).
ENDDO.

" Header parts are read in exactly the same way.
DATA(lo_header_parts) = lo_main_part->get_headerparts( ).
DATA(lv_header_count) = lo_header_parts->get_count( ).
DO lv_header_count TIMES.
  DATA(lo_header_part) = lo_header_parts->get_part( sy-index - 1 ).
  DATA(lv_header_data) = lo_header_part->get_data( ).
ENDDO.
Comments
(1) You can get an instance of the Word document via the method cl_docx_document=>load_document. The document's binary data must be passed to this method as an xstring. I don't list the source code of the subroutine get_doc_binary because it is not relevant here; you can find it in the attachment.
(2) Administrative data such as the author, the creation date and the last modification date is stored in the so-called “core properties part”, which can be fetched from the document instance obtained in step 1. Once you have the core properties part instance, you can get its binary data via the method get_data().
The returned data is in XML format (as is the data of all the other kinds of parts in this document), so it can easily be parsed with a DOM or SAX parser.
(3) From the document instance we can get the main document part instance. Its binary data includes the text of all three body lines together with their font colors.
(4) The binary data of all pictures embedded in the Word document can be retrieved in two steps: first get the image part collection from the main part instance, then loop over the collection and read each image part instance. The get_part method accepts an index starting from 0. Header information is read in exactly the same way.
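Two pieces of the steps above are left open in the sample: how the file gets into an xstring, and how the XML returned by get_data() can be parsed. The sketch below fills both in under stated assumptions. The get_doc_binary shown here is a hypothetical stand-in, not the author's routine from the attachment; it assumes the file lives on the presentation server and is read with cl_gui_frontend_services=>gui_upload. The subroutine dump_part_xml and the report name are hypothetical as well; dump_part_xml walks the XML of a part with the iXML DOM parser.

```abap
REPORT zdocx_read_sketch.

" Hypothetical stand-in for get_doc_binary (the original is in the attachment):
" uploads a file from the presentation server and returns it as xstring.
FORM get_doc_binary USING iv_path TYPE csequence CHANGING cv_content TYPE xstring.
  TYPES ty_line TYPE x LENGTH 1024.
  DATA: lt_data   TYPE STANDARD TABLE OF ty_line WITH DEFAULT KEY,
        lv_length TYPE i.

  CLEAR cv_content.

  " Read the file in binary mode; lv_length receives the exact byte count.
  cl_gui_frontend_services=>gui_upload(
    EXPORTING  filename   = CONV string( iv_path )
               filetype   = 'BIN'
    IMPORTING  filelength = lv_length
    CHANGING   data_tab   = lt_data
    EXCEPTIONS OTHERS     = 1 ).
  IF sy-subrc <> 0.
    RETURN. " cv_content stays empty if the upload fails
  ENDIF.

  " Concatenate the table rows into a single xstring of lv_length bytes.
  CALL FUNCTION 'SCMS_BINARY_TO_XSTRING'
    EXPORTING  input_length = lv_length
    IMPORTING  buffer       = cv_content
    TABLES     binary_tab   = lt_data
    EXCEPTIONS OTHERS       = 1.
  IF sy-subrc <> 0.
    CLEAR cv_content.
  ENDIF.
ENDFORM.

" Hypothetical helper: parses the XML of a part with the iXML DOM parser
" and writes the name and text value of every element node.
FORM dump_part_xml USING iv_xml TYPE xstring.
  DATA(lo_ixml)     = cl_ixml=>create( ).
  DATA(lo_document) = lo_ixml->create_document( ).
  DATA(lo_factory)  = lo_ixml->create_stream_factory( ).
  DATA(lo_istream)  = lo_factory->create_istream_xstring( string = iv_xml ).
  DATA(lo_parser)   = lo_ixml->create_parser( stream_factory = lo_factory
                                              istream        = lo_istream
                                              document       = lo_document ).
  IF lo_parser->parse( ) <> 0.
    WRITE / 'The part could not be parsed as XML.'.
    RETURN.
  ENDIF.

  " Iterate over all nodes of the DOM tree and keep only element nodes.
  DATA(lo_iterator) = lo_document->create_iterator( ).
  DATA(lo_node)     = lo_iterator->get_next( ).
  WHILE lo_node IS BOUND.
    IF lo_node->get_type( ) = if_ixml_node=>co_node_element.
      DATA(lv_name)  = lo_node->get_name( ).
      DATA(lv_value) = lo_node->get_value( ).
      WRITE: / lv_name, lv_value.
    ENDIF.
    lo_node = lo_iterator->get_next( ).
  ENDWHILE.
ENDFORM.
```

With these two subroutines in place, the core properties read in step 2 could be inspected with PERFORM dump_part_xml USING lv_core_data. Note that for an element with child elements, get_value( ) returns the concatenated text of the whole subtree, so the root element repeats the values of its children.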
Using CL_DOCX_DOCUMENT to change a Word document
See the nice document “How to – Add Custom XML Parts to Microsoft Word using ABAP” by Leon Limson.
You could also achieve the same requirement with the respective class below.
Further reading
If you would like to know how a Word template is merged with data from an XML file (for example, a response file from a web service), you can find the technical details in my blog “Understand how the word template is merged with xml data stream”.