Starting with Office 2007, Microsoft switched to an XML-based format called Office Open XML (OOXML).
There has been some debate as to how "open" this format really is, given that the specs run to around 7,000 pages (!).
Be that as it may, it's a fact of life that a lot of people use Microsoft's Office suite, and that means we have to deal with this new format in a lot of situations.
The OOXML format is, as it turns out, not so difficult to deal with. The main concept is that an Office document, whether it is a Word document (.docx), an Excel spreadsheet (.xlsx) or a PowerPoint presentation (.pptx), is actually a compressed (.zip) file that contains a number of XML documents (as well as any image files the user has included in the document).
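To make the "zip file full of XML" idea concrete, here is a minimal sketch in Python rather than PL/SQL. It builds a tiny stand-in package in memory and reads it back. The two part names follow the usual Word layout, but the XML bodies are placeholders, not a valid document.

```python
import io
import zipfile

# Build a tiny stand-in for an OOXML package in memory.
# A real .docx contains more parts; the XML bodies here are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("word/document.xml", "<document/>")

# Reading it back is ordinary zip handling: list the parts, then pull one out.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as package:
    part_names = package.namelist()
    document_xml = package.read("word/document.xml").decode("utf-8")

print(part_names)
print(document_xml)
```

Pointing the second `ZipFile` call at a real .docx, .xlsx or .pptx file lists its actual parts in the same way.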
So to work with OOXML files, we need to be able to zip and unzip files, and to parse and generate XML. Oracle (and PL/SQL) has had good support for XML for a number of years, but (even though there is a UTL_COMPRESS package in the database) there is no built-in zip/unzip support. Of course you could load some Java classes into the database to do it, but dealing with the Java stuff is always a bit of a hassle. But some time ago the good gentleman Anton Scheffer published a PL/SQL implementation based on UTL_COMPRESS that supports zipping and unzipping.
Based on this I have written a package for working with OOXML documents. It's called OOXML_UTIL_PKG and you can download it as part of (you guessed it) the Alexandria utility library for PL/SQL.
Let's see what this package allows us to do.
Get document properties from a Word (docx) file
First we fire up Word and create a test document:
By the way, you can read and write the new OOXML formats using an older version of Office (as I do in the screenshot above), by downloading the Microsoft Office Compatibility Pack from Microsoft.
After saving the document, we can then extract the document properties using the GET_DOCX_PROPERTIES function, which returns a custom record type called T_DOCX_PROPERTIES.
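As a rough illustration of where such properties live (a Python sketch, not the package's PL/SQL code): the core properties of an OOXML file are stored in the `docProps/core.xml` part, using Dublin Core element names. The sample package and the helper names `make_sample_docx` and `read_core_properties` are made up for this example, and only two properties are read.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample content standing in for a document saved from Word.
CORE_XML = (
    '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}">'
    "<dc:title>Test document</dc:title>"
    "<dc:creator>Jane Doe</dc:creator>"
    "</cp:coreProperties>"
).format(**NS)


def make_sample_docx() -> bytes:
    """Create an in-memory package that contains only docProps/core.xml."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", CORE_XML)
    return buffer.getvalue()


def read_core_properties(docx_bytes: bytes) -> dict:
    """Unzip the package and pick two properties out of the core XML part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {
        "title": root.findtext("dc:title", default="", namespaces=NS),
        "creator": root.findtext("dc:creator", default="", namespaces=NS),
    }


properties = read_core_properties(make_sample_docx())
print(properties)
```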
Extract plain text from a Word (docx) file
Using our test document again, we can extract the plain text of the document using the GET_DOCX_PLAINTEXT function, which returns a CLOB.
This is of course very useful if you want to search and/or index (just) the text of a document, or otherwise work with the content.
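The idea behind plain-text extraction can be sketched in Python as follows. This is an assumption-laden illustration, not the package's implementation: it reads the `word/document.xml` part, joins the text runs (`w:t`) inside each paragraph (`w:p`), and separates paragraphs with newlines. The sample document and the function name `docx_plaintext` are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical two-paragraph document body; the first paragraph has two runs.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W}"><w:body>'
    "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def make_sample_docx() -> bytes:
    """Create an in-memory package that contains only word/document.xml."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()


def docx_plaintext(docx_bytes: bytes) -> str:
    """Return the text of every paragraph, one paragraph per line."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W}}}p"):
        runs = [node.text or "" for node in paragraph.iter(f"{{{W}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


text = docx_plaintext(make_sample_docx())
print(text)
```

Formatting, images and field codes are simply ignored here, which is what makes the result suitable for searching or indexing.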
Get document properties from an Excel (xlsx) file
Let's first create an Excel test file (again, using Excel 2003 but saving in Excel 2007 format):
(This has to be one of the lamest spreadsheets of all time, but it will do fine as an example. It has some text, some numbers, and a formula.)
Similar to the Word document, we can now use the GET_XLSX_PROPERTIES function, which returns a custom record type called T_XLSX_PROPERTIES.
You'll notice that Word documents and Excel spreadsheets have slightly different properties.
Extract a cell value from an Excel (xlsx) file
The GET_XLSX_CELL_VALUE function allows us to retrieve a single value from a named cell in a specific worksheet, like this:
Technical Detail: In the XLSX format, strings (as opposed to numbers) are not stored in the actual cell where they are entered, but rather in a "shared strings" section. The cell just contains a reference back to this "shared strings" section. The GET_XLSX_CELL_VALUE function handles this for you, so you don't have to worry about that.
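The shared-strings indirection can be sketched in Python like this (an illustration only, not the package's PL/SQL code). A cell whose `t` attribute is `s` holds an index into `xl/sharedStrings.xml`; other cells hold their value directly. The sample workbook and the name `get_cell_value` are made up, and the sketch assumes the worksheet lives at `xl/worksheets/sheet1.xml`, whereas a real lookup by worksheet name has to go through `xl/workbook.xml` and its relationships part.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML namespace shared by the worksheet and shared-strings parts.
S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S}

# Hypothetical sample parts: two strings, one number, one formula result.
SHARED_STRINGS_XML = (
    f'<sst xmlns="{S}"><si><t>Item</t></si><si><t>Total</t></si></sst>'
)
SHEET_XML = (
    f'<worksheet xmlns="{S}"><sheetData>'
    '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1"><v>42</v></c></row>'
    '<row r="2"><c r="A2" t="s"><v>1</v></c>'
    '<c r="B2"><f>B1*2</f><v>84</v></c></row>'
    "</sheetData></worksheet>"
)


def make_sample_xlsx() -> bytes:
    """Create an in-memory package with one worksheet and shared strings."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("xl/sharedStrings.xml", SHARED_STRINGS_XML)
        package.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buffer.getvalue()


def get_cell_value(xlsx_bytes, cell_name, sheet_part="xl/worksheets/sheet1.xml"):
    """Return the stored value of one cell, resolving shared strings."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        sheet = ET.fromstring(package.read(sheet_part))
        shared = ET.fromstring(package.read("xl/sharedStrings.xml"))
    strings = ["".join(si.itertext()) for si in shared.findall("s:si", NS)]
    for cell in sheet.iter(f"{{{S}}}c"):
        if cell.get("r") == cell_name:
            raw = cell.findtext("s:v", default="", namespaces=NS)
            # t="s" means the value is an index into the shared strings.
            return strings[int(raw)] if cell.get("t") == "s" else raw
    return None


workbook = make_sample_xlsx()
print(get_cell_value(workbook, "A1"))
print(get_cell_value(workbook, "B2"))
```

For a formula cell the sketch returns the cached result stored in `<v>`, not the formula text.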
Extract multiple cell values from an Excel (xlsx) file
Since the function that extracts a single value from a spreadsheet must open the file, unzip it, and parse the XML content every time you call that function, there is another function (GET_XLSX_CELL_VALUES, notice the plural) that allows you to retrieve multiple values in one call. In other words, the file is unzipped and the contents parsed as XML just once, which is obviously more efficient.
Simply specify the names of multiple cells using an array of strings:
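Why one call for many cells is cheaper can be shown with a small Python sketch (again an illustration under assumptions, not the package's code): the archive is opened and the XML parsed once, every cell is indexed by its reference, and the requested names are then simple dictionary lookups. The sample data, the fixed part name `xl/worksheets/sheet1.xml` and the function name `get_cell_values` are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S}


def make_sample_xlsx() -> bytes:
    """Create an in-memory package with one worksheet and shared strings."""
    shared = f'<sst xmlns="{S}"><si><t>Item</t></si></sst>'
    sheet = (
        f'<worksheet xmlns="{S}"><sheetData><row r="1">'
        '<c r="A1" t="s"><v>0</v></c><c r="B1"><v>42</v></c>'
        '<c r="C1"><v>84</v></c></row></sheetData></worksheet>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("xl/sharedStrings.xml", shared)
        package.writestr("xl/worksheets/sheet1.xml", sheet)
    return buffer.getvalue()


def get_cell_values(xlsx_bytes, cell_names, sheet_part="xl/worksheets/sheet1.xml"):
    """Unzip and parse once, then look up every requested cell."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        sheet = ET.fromstring(package.read(sheet_part))
        shared = ET.fromstring(package.read("xl/sharedStrings.xml"))
    strings = ["".join(si.itertext()) for si in shared.findall("s:si", NS)]

    # Index all cells by reference (A1, B1, ...) in a single pass.
    cells = {}
    for cell in sheet.iter(f"{{{S}}}c"):
        raw = cell.findtext("s:v", default="", namespaces=NS)
        cells[cell.get("r")] = strings[int(raw)] if cell.get("t") == "s" else raw

    # Missing cells come back as None, in the same order as requested.
    return [cells.get(name) for name in cell_names]


values = get_cell_values(make_sample_xlsx(), ["A1", "B1", "C1", "D1"])
print(values)
```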
Write contents into OOXML file using PL/SQL
Since the contents of OOXML files are XML files, you can manipulate the existing content, or generate new content, and then save it back to the zip file that contains your document.
The following demonstrates one approach; it uses a PowerPoint file, but this technique will also work with Word and Excel files.
We create a PowerPoint 2007 file (.pptx) and put in some tags that we want to replace via code. In other words, this becomes a template that we can fill with dynamic values from the database.
The GET_FILE_FROM_TEMPLATE function takes a template file as input, along with two string arrays: the tag names and the actual values to replace the tags with. It unzips the file, performs the substitutions, writes the modified file back to the zip archive, and returns the result, which you can then save back to disk (or, more likely, store in the database or send to a web browser).
The code is trivial:
Here is the result when opening the output file:
So the next time you do a presentation, you could actually update your PowerPoint slides with the latest sales figures (or whatever) from within SQL*Plus...
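The unzip, substitute, re-zip cycle described for GET_FILE_FROM_TEMPLATE can be sketched in Python as follows. This is a simplified stand-in, not the package's implementation: the `#TITLE#` style tags, the slide XML and the function name `fill_template` are all assumptions made for the example.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Hypothetical slide content with two tags to be replaced.
SLIDE_XML = "<sld><t>#TITLE#</t><t>Sales: #SALES#</t></sld>"


def make_sample_template() -> bytes:
    """Create an in-memory package with one slide part and one binary part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("ppt/slides/slide1.xml", SLIDE_XML)
        package.writestr("ppt/media/image1.png", b"\x89PNG")
    return buffer.getvalue()


def fill_template(template_bytes, names, values) -> bytes:
    """Copy every part to a new archive, replacing tags in the XML parts."""
    replacements = dict(zip(names, values))
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            data = source.read(info.filename)
            if info.filename.endswith(".xml"):
                text = data.decode("utf-8")
                for name, value in replacements.items():
                    # Escape &, < and > so the part stays well-formed XML.
                    text = text.replace(name, escape(value))
                data = text.encode("utf-8")
            target.writestr(info.filename, data)
    return output.getvalue()


filled = fill_template(
    make_sample_template(), ["#TITLE#", "#SALES#"], ["Q1 & Q2", "1200"]
)
with zipfile.ZipFile(io.BytesIO(filled)) as package:
    slide = package.read("ppt/slides/slide1.xml").decode("utf-8")
    image = package.read("ppt/media/image1.png")
print(slide)
```

One practical caveat with this kind of plain string replacement: Office applications sometimes split typed text across several XML runs, so a tag that looks contiguous on the slide may not be contiguous in the XML. Typing each tag in one go, without changing formatting midway, tends to avoid that.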
Conclusion
Working with Office 2007 (OOXML) files from PL/SQL is easy and opens up many possibilities, both for extracting information from documents and storing them in the database, and for generating or modifying OOXML files on the database server.
http://ora-00001.blogspot.com/2011/02/working-with-office-2007-ooxml-files.html