How do I access the XYZ file format in Java?
Specifications for many file formats can be found at Wotsit. A large database of file extensions can be found at www.file-extensions.org and dotwhat.net
And if you don't know what type a given file is, there are various ways to determine it programmatically: http://www.rgagnon.com/javadetails/java-0487.html
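One common way to determine a file's type programmatically is to look at its first few bytes, the so-called magic number. The following is a minimal sketch of that idea, not a complete detector: the class name FileTypeSniffer and its detect method are made up for illustration, and only a handful of well-known signatures are included. The main method also prints what the JDK's own Files.probeContentType reports, which depends on the platform and may be null.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper: guesses a file type from its leading "magic" bytes. */
final class FileTypeSniffer {

    /** Returns a short type name, or "unknown" if no signature matches. */
    static String detect(byte[] head) {
        if (startsWith(head, 0x25, 0x50, 0x44, 0x46)) return "PDF";            // "%PDF"
        if (startsWith(head, 0x89, 0x50, 0x4E, 0x47)) return "PNG";            // 0x89, then "PNG"
        if (startsWith(head, 0x47, 0x49, 0x46, 0x38)) return "GIF";            // "GIF8"
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) return "JPEG";
        if (startsWith(head, 0x50, 0x4B, 0x03, 0x04)) return "ZIP container";  // also ODF, DOCX, XLSX, JAR
        if (startsWith(head, 0xD0, 0xCF, 0x11, 0xE0)) return "OLE2 container"; // legacy DOC, XLS, PPT, MSG
        return "unknown";
    }

    /** Compares the first bytes of data (as unsigned values) with the signature. */
    private static boolean startsWith(byte[] data, int... signature) {
        if (data.length < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FileTypeSniffer <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        byte[] head = new byte[8];
        int n;
        try (InputStream in = Files.newInputStream(file)) {
            n = in.readNBytes(head, 0, head.length); // readNBytes needs Java 9 or later
        }
        System.out.println("signature:        " + detect(Arrays.copyOf(head, n)));
        System.out.println("probeContentType: " + Files.probeContentType(file));
    }
}
```

Note that a signature only identifies the container: a ZIP signature could be a plain archive, an ODF document or an Office Open XML file, so a second look inside the package is needed to tell them apart.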
An interesting article about Microsoft's binary file formats, especially DOC and XLS, is "Why are the Microsoft Office file formats so complicated? (And some workarounds)". It also mentions some alternatives to dealing with those formats directly.
Access
- JDBC/ODBC bridge - JDBC driver for ODBC databases; it shipped as part of the JDK up to Java 7 and was removed in Java 8. On Linux, you'll have to get ODBC up and running first: http://www.unixodbc.org/
- Jackcess - library to read and write MDB files
- HXTT Access - commercial pure Java JDBC driver for MS Access
CGM
- cgmva - an applet to display CGM files; comes with source code
CHM
- JChm - library to read CHM files
Excel
- Apache Commons CSV, Ostermiller Utils, CSVObjects, CSVBeans, opencsv, Java CSV, Super CSV - libraries to read and write CSV files. CSV is not as easy to read and write as it first looks - once all the special cases are considered, one might as well use a library.
- POI - library to read and write XLS and XLSX files
- JExcelAPI - library to read and write XLS (but not XLSX) files
- jXLS - library for writing XLS files based on templates
- Java2Excel - library for creating Excel files based on Collections
- It is possible to use JDBC to read Excel files
- Obba works with Excel spreadsheets on Windows
- OpenXLS - "OpenXLS is the open source version of ExtenXLS - a Java spreadsheet SDK that allows you to read, modify and create Java Excel spreadsheets from your Java applications."
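To illustrate the remark above that CSV is harder than it looks, here is a small sketch comparing a naive String.split with a quote-aware parser for a single line. CsvLineSketch and parseLine are hypothetical names, not part of any of the libraries listed. The sketch handles quoted fields, commas inside quotes and doubled quotes, but it deliberately ignores other special cases (line breaks inside quoted fields, other separators, character encodings), which is exactly why a library is usually the better choice.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: splits ONE line of comma-separated text into fields. */
final class CsvLineSketch {

    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // a doubled quote stands for one literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas inside quotes are ordinary characters
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // the last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // The raw line is: 1,"Smith, John","said ""hi"""
        String line = "1,\"Smith, John\",\"said \"\"hi\"\"\"";
        System.out.println("naive split: " + line.split(",").length + " pieces");
        List<String> fields = parseLine(line);
        System.out.println("quote-aware: " + fields.size() + " fields");
        for (String field : fields) {
            System.out.println("  [" + field + "]");
        }
    }
}
```

For the sample line, the naive split cuts the name in two and yields four pieces, while the quote-aware version returns the three intended fields.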
Gedcom
HDF (Hierarchical Data Format)
Image and movie files
- ImageJ - Java image processing application and library that has plugins for lots of image file formats
- JIMI - library to read and write BMP, CUR, GIF, ICO, JPEG, PICT, PNG, PSD, Sun Raster, TGA, TIFF, XBM and XPM. There's a plugin for using JIMI with ImageJ, which also includes a couple of JIMI patches.
- GIF write, TIFF, RAW, PNM and JPEG2000 read/write support for ImageIO: JAI Image I/O Tools
- Reading QuickTime files in Java. Apple's QT4J library is unfortunately no longer supported.
- MP4 parser
INI
- ini4j "is a simple Java API for handling configuration files in Windows .ini format."
Matlab
OpenDocument (ODF)
- basic Java code for reading ODF files is here
- ODFDOM is a Java library for accessing ODF files.
- jDocument.org has an open-source library for accessing all Open Document file types.
- Obba works with OpenOffice spreadsheets
- Office2FO converts ODF documents to XSL-FO documents, making possible further transformations (like conversion to PDF using FOP)
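For simple cases no extra library is needed: an ODF document is a ZIP package, and its main content lives in an entry named content.xml. The sketch below uses only JDK classes to pull the raw text out of that entry. The class name OdfTextSketch and the method extractText are made up for illustration, and the extraction is crude: it concatenates all text nodes without paragraph breaks and ignores styles, metadata and embedded objects. For anything beyond that, one of the libraries above is the better tool.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical sketch: extracts the raw text of an ODF document. */
final class OdfTextSketch {

    static String extractText(InputStream odf) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"content.xml".equals(entry.getName())) {
                    continue; // skip mimetype, styles.xml, meta.xml, pictures, ...
                }
                // Copy the entry into memory so the XML parser cannot close the ZIP stream.
                ByteArrayOutputStream xml = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    xml.write(buffer, 0, n);
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Refuse DOCTYPE declarations, since the file may come from an untrusted source.
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml.toByteArray()));
                // getTextContent() concatenates every text node below the root element.
                return doc.getDocumentElement().getTextContent();
            }
        }
        throw new IOException("no content.xml entry found; is this an ODF package?");
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java OdfTextSketch <file.odt>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(extractText(in));
        }
    }
}
```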
Office Open XML
- These are the new XML-based Microsoft Office formats.
- OpenXML4J
- docx4j - create and edit docx documents using a JAXB content model matching the WordML schema
- Apache POI implements these formats.
OpenOffice Java API
- OpenOffice can read a number of file formats, and makes them accessible through its API. A starting point might be this article, this article and of course the OO developer site.
- Some introductory information about the OO file format can be found here and here
- oooview is an OO Viewer written in Java.
- JODConverter is a Java library that uses the OO Java API to perform document conversions between any formats supported by OO
Outlook
- The Apache POI project developed some code that can read the textual contents of Outlook's MSG files. This page talks about that.
- Xena can convert multiple file formats (including MSG) to XML. Either the result of that conversion, or Xena's source code, may be helpful.
- JPST can read and extract PST files.
PDF
- PDF is a hard-to-read format. Often the best one can do is to extract the text contained in a PDF file.
- iText - library to create PDFs; see ItextExample for a code example. The older version iText 2 (which uses a more permissive license) is also available: jar file, javadocs
- FOP - library to create PDFs (and other formats) from XML by using XSL-FO transformations
- FlyingSaucer - library to convert CSS-styled XHTML to PDF
- PDFBox - library that can merge, split and print PDFs, extract text, create images from PDFs, encrypt/decrypt PDFs, fill in PDF forms and more
- PDF Clown - general-purpose library to read/create/modify PDF files. It features a rich multi-layered object model that allows access even to each single content stream instruction.
- JPedal - library for viewing and printing PDFs, can also extract text (how to print PDFs); commercial (the LGPL version provides PDF viewing only)
- PDFTextStream - commercial library to extract text from PDFs
- PDF Renderer is a more up-to-date PDF viewer that renders using Java2D. Download, Examples, Printing PDFs
- ICEPdf is another library that can render PDFs.
- Qoppa offers numerous libraries for PDF-related tasks
- Aspose.Pdf for Java is a commercial library for reading and writing PDFs
- jPod is a rich PDF manipulation and rendering framework
PowerPoint
- The Apache POI project developed some code that can open and (to a limited extent) edit PPT files. This page talks about it.
Project
- The MPXJ library can work with several Project file formats.
PST
- LibPST is a C library that could be used through JNI.
- Xena can convert multiple file formats (including PST) to XML. Either the result of that conversion, or Xena's source code, may be helpful.
- java-libpst is a pure Java library that can access 64bit PST files.
QIF (used by Microsoft Money and Quicken)
- Buddi and Eurobudget are Java applications that can import and export QIF files (and thus contain code you may be able to use in your application). Both are licensed under the GPL.
RTF
- jRTF can create RTFs
- iText 2 can create RTFs: jar file, javadocs
- JavaCC - a lexer/parser generator for which an RTF grammar is available. From that, an RTF reader can be constructed.
Visio
- The Apache POI project developed some code that can read Visio files. This page talks about that.
Word
- POI - library to read and write DOC and DOCX files. It can also be used for extracting the text of a document.
- WordApi.exe is a native Windows component with a Java interface, which lets you create Word documents and alter Word templates. Some impressions about it can be found here.
- Java2Word - library to create Word documents, especially reports, on the fly.
Something else?
If you encounter an obscure format for which no library is available, it may be feasible to create a reader for it if you have a file format description (which may be available on Wotsit, see link above). Several tools, so-called lexer and parser generators, are available that help in creating a reader, especially if the file format is text-based (ASCII) rather than binary. You will need knowledge of regular expressions, though. Some file formats that have been tackled using this approach include RTF, CSV, HPGL and PBM/PGM/PPM. Lexers are easier to start with, but parsers can do more of the work for you. All these have ready-to-use examples on their web sites.
- Lexers: JFlex (introductory article in the JavaRanch Journal)
- Parsers: Antlr, SableCC, JavaCC
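To give a feel for the approach without any of these tools, here is a hand-written sketch for the ASCII ("plain") variant of PGM, one of the formats mentioned above. A single regular expression plays the role of the lexer by splitting the text into tokens and comments, and a few lines of code play the role of the parser by interpreting the token sequence (magic number P2, width, height, maximum gray value, then the pixel values). The class and method names are made up for illustration; a generated lexer or parser would replace the regular expression and the loop once the format gets more complex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: a tiny regex-based reader for plain (ASCII, "P2") PGM images. */
final class PlainPgmSketch {

    // Either a comment running to the end of the line, or a run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("#[^\\r\\n]*|[^\\s#]+");

    /** "Lexer": returns all tokens in order, with comments dropped. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            String token = matcher.group();
            if (!token.startsWith("#")) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** "Parser": turns the token sequence into rows of gray values. */
    static int[][] parse(String text) {
        List<String> tokens = tokenize(text);
        if (tokens.size() < 4 || !tokens.get(0).equals("P2")) {
            throw new IllegalArgumentException("not a plain PGM (P2) file");
        }
        int width = Integer.parseInt(tokens.get(1));
        int height = Integer.parseInt(tokens.get(2));
        int maxValue = Integer.parseInt(tokens.get(3));
        if (width <= 0 || height <= 0 || tokens.size() < 4 + width * height) {
            throw new IllegalArgumentException("bad dimensions or too few pixel values");
        }
        int[][] pixels = new int[height][width];
        for (int i = 0; i < width * height; i++) {
            int value = Integer.parseInt(tokens.get(4 + i));
            if (value < 0 || value > maxValue) {
                throw new IllegalArgumentException("pixel value out of range: " + value);
            }
            pixels[i / width][i % width] = value; // values are stored row by row
        }
        return pixels;
    }

    public static void main(String[] args) {
        String sample = "P2\n# tiny 2x2 image\n2 2\n255\n0 64\n128 255\n";
        for (int[] row : parse(sample)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```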