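The Excel 2007 file format differs from earlier versions, which used Microsoft's own binary storage format. Excel 2007 stores its content as XML (an .xlsx file is a ZIP package of XML parts), so large .xlsx files are naturally read with the streaming XML approach, SAX.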
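To make the "XML inside" point concrete, here is a minimal sketch that lists the parts of a ZIP package using only the JDK. The class name XlsxPartLister and the in-memory sample archive are hypothetical; the sample only mimics typical part names and is not a valid workbook. Pointing listParts at a FileInputStream for a real .xlsx file should show the actual parts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of POI: lists the entries of a ZIP package.
final class XlsxPartLister {

    /** Returns the names of all entries in a ZIP stream, in archive order. */
    static List<String> listParts(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Builds a tiny stand-in archive in memory (illustrative names, dummy content). */
    static byte[] sampleArchive() throws IOException {
        String[] parts = {"xl/workbook.xml", "xl/sharedStrings.xml", "xl/worksheets/sheet1.xml"};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : parts) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(listParts(new ByteArrayInputStream(sampleArchive())));
    }
}
```

The sheet parts and the shared strings part are the pieces the event-model reader streams through.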
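As with the earlier versions, large files are read with the event model, eventusermodel. The usermodel mode has to load the whole file into memory at once; because the 2007 format is stored as XML, usermodel parses it DOM-style, which has the same cost. That mode is simple to use and easy to pick up, but it consumes a considerable amount of memory for large data sets, and out-of-memory errors were frequent when running under Eclipse.
What follows reads an Excel 2007 file with eventusermodel.
As in the previous post, I store the cell data of the current row in a List and abstract out an optRows method, which is called at the end of every row. Its parameters are the sheet index sheetIndex, the current row index curRow (an int), and the List holding the row's cell data. Subclasses only need to implement this row-level method.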
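The row-callback idea can be sketched without POI, using only the SAX parser bundled with the JDK. Everything named here (RowStreamSketch, RowCallback, readAll, the SAMPLE sheet fragment) is hypothetical, and the shared strings table is stood in for by a plain List<String>. It is a simplified illustration of the pattern, not the class from this post.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: streams <row>/<c>/<v> elements and reports one row at a time.
final class RowStreamSketch extends DefaultHandler {

    /** Row-level hook, playing the role that optRows plays in this post. */
    interface RowCallback {
        void onRow(int rowIndex, List<String> cells);
    }

    // Made-up sheet fragment: t="s" marks a cell whose <v> is a shared-string index.
    static final String SAMPLE =
        "<worksheet><sheetData>"
        + "<row r=\"1\"><c r=\"A1\" t=\"s\"><v>0</v></c><c r=\"B1\"><v>42</v></c></row>"
        + "<row r=\"2\"><c r=\"A2\" t=\"s\"><v>1</v></c><c r=\"B2\"><v>3.5</v></c></row>"
        + "</sheetData></worksheet>";

    private final List<String> sharedStrings;
    private final RowCallback callback;
    private final List<String> row = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean isSharedString;
    private int rowIndex;

    RowStreamSketch(List<String> sharedStrings, RowCallback callback) {
        this.sharedStrings = sharedStrings;
        this.callback = callback;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (qName.equals("c")) {
            isSharedString = "s".equals(attrs.getValue("t"));
        }
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("v")) {
            String raw = text.toString().trim();
            row.add(isSharedString ? sharedStrings.get(Integer.parseInt(raw)) : raw);
        } else if (qName.equals("row")) {
            callback.onRow(rowIndex++, new ArrayList<>(row)); // hand over a copy
            row.clear();
        }
    }

    /** Parses the sheet XML and collects every reported row. */
    static List<List<String>> readAll(String sheetXml, List<String> sharedStrings) throws Exception {
        List<List<String>> rows = new ArrayList<>();
        RowStreamSketch handler = new RowStreamSketch(sharedStrings, (i, cells) -> rows.add(cells));
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)), handler);
        return rows;
    }
}
```

Only one row is held by the handler at any time, which is why memory use stays flat regardless of how many rows the sheet contains (readAll collects them here purely for demonstration).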
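In testing, a 7 MB file with 120,000 rows was processed without problems, with no need to adjust the JVM memory settings.
The Excel reading API is POI 3.6. Download that package before use; if other dependencies turn out to be missing at run time, download them as well.
Abstract class: XxlsAbstract. Purpose: traverses the Excel file and provides the row-level method optRows.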
package com.gaosheng.util.xls;

import java.io.InputStream;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

/**
 * XSSF and SAX (Event API)
 */
public abstract class XxlsAbstract extends DefaultHandler {
    private SharedStringsTable sst;
    private String lastContents;
    private boolean nextIsString;
    private int sheetIndex = -1;
    private List<String> rowlist = new ArrayList<String>();
    private int curRow = 0;
    private int curCol = 0;

    // Earlier variant of the row-level hook, without the sheet index:
    // public abstract void optRows(int curRow, List<String> rowlist) throws SQLException ;

    // Row-level hook: receives the sheet index, the row index and the list of
    // cell values of one row of a sheet. All values are Strings.
    public abstract void optRows(int sheetIndex, int curRow, List<String> rowlist) throws SQLException;

    // Traverses a single sheet; sheetId is the index of the sheet to read,
    // starting from 1 (for example 1-3).
    public void processOneSheet(String filename, int sheetId) throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader(pkg);
        SharedStringsTable sst = r.getSharedStringsTable();
        XMLReader parser = fetchSheetParser(sst);
        // rId2 found by processing the Workbook
        // Look the sheet up by rId# or rSheet#
        InputStream sheet2 = r.getSheet("rId" + sheetId);
        sheetIndex++;
        InputSource sheetSource = new InputSource(sheet2);
        parser.parse(sheetSource);
        sheet2.close();
    }

    /**
     * Traverses every sheet of the Excel file.
     */
    public void process(String filename) throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader(pkg);
        SharedStringsTable sst = r.getSharedStringsTable();
        XMLReader parser = fetchSheetParser(sst);
        Iterator<InputStream> sheets = r.getSheetsData();
        while (sheets.hasNext()) {
            curRow = 0;
            sheetIndex++;
            InputStream sheet = sheets.next();
            InputSource sheetSource = new InputSource(sheet);
            parser.parse(sheetSource);
            sheet.close();
        }
    }

    public XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException {
        XMLReader parser = XMLReaderFactory
                .createXMLReader("org.apache.xerces.parsers.SAXParser");
        this.sst = sst;
        parser.setContentHandler(this);
        return parser;
    }

    public void startElement(String uri, String localName, String name,
            Attributes attributes) throws SAXException {
        // c => cell
        if (name.equals("c")) {
            // If the cell value is an index into the SST, set nextIsString to true
            String cellType = attributes.getValue("t");
            nextIsString = cellType != null && cellType.equals("s");
        }
        // Reset the collected text
        lastContents = "";
    }

    public void endElement(String uri, String localName, String name)
            throws SAXException {
        // v => the cell value; for a string cell, the content of the v tag is
        // the index of that string in the SST.
        if (name.equals("v")) {
            if (nextIsString) {
                // Resolve the SST index to the string the cell really holds.
                // characters() may have been called several times by now.
                try {
                    int idx = Integer.parseInt(lastContents);
                    lastContents = new XSSFRichTextString(sst.getEntryAt(idx))
                            .toString();
                } catch (Exception e) {
                    // Keep the raw text if the lookup fails
                }
                // Resolve only once per cell
                nextIsString = false;
            }
            // Trim surrounding whitespace, then add the cell content to rowlist
            String value = lastContents.trim();
            value = value.equals("") ? " " : value;
            rowlist.add(curCol, value);
            curCol++;
        } else if (name.equals("row")) {
            // A closing row tag means the end of the row: call optRows()
            try {
                optRows(sheetIndex, curRow, rowlist);
            } catch (SQLException e) {
                e.printStackTrace();
            }
            rowlist.clear();
            curRow++;
            curCol = 0;
        }
    }

    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Collect the text content of the current element
        lastContents += new String(ch, start, length);
    }
}
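One limitation worth noting: the class above appends a value only when it sees a v element, so a row with empty cells yields a shorter list and later values shift to the left. In SpreadsheetML, writers such as Excel normally put the cell reference (for example "C5") in the r attribute of each c element, so the column can be recovered from it. The following is a minimal sketch of that idea; CellRefs, columnIndex and put are hypothetical names, and wiring them into startElement is left as an assumption rather than shown as the author's method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps cell references to columns so gaps can be padded.
final class CellRefs {

    /** Converts the letters of a reference such as "C5" or "AA10" to a zero-based column index. */
    static int columnIndex(String cellRef) {
        int col = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char ch = cellRef.charAt(i);
            if (ch < 'A' || ch > 'Z') {
                break; // reached the row-number part
            }
            col = col * 26 + (ch - 'A' + 1);
        }
        if (col == 0) {
            throw new IllegalArgumentException("No column letters in: " + cellRef);
        }
        return col - 1;
    }

    /** Stores value at the column named by cellRef, padding skipped columns with "". */
    static void put(List<String> row, String cellRef, String value) {
        int col = columnIndex(cellRef);
        while (row.size() <= col) {
            row.add("");
        }
        row.set(col, value);
    }

    public static void main(String[] args) {
        List<String> row = new ArrayList<>();
        put(row, "A1", "id");
        put(row, "C1", "name"); // B1 was empty and is absent from the XML
        System.out.println(row);
    }
}
```

With this approach a row that has values only in A and C produces three entries, with an empty string in the middle, instead of two.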
Subclass: XxlsBig. Purpose: exports the data to a temporary database table.
package com.gaosheng.util.examples.xls;

import java.io.FileInputStream;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import java.util.Properties;
import com.gaosheng.util.xls.XxlsAbstract;

public class XxlsBig extends XxlsAbstract {

    public static void main(String[] args) throws Exception {
        XxlsBig howto = new XxlsBig("temp_table");
        // Reads sheet 1 only; its header row creates the table.
        howto.processOneSheet("F:/new.xlsx", 1);
        // Then walks every sheet. Note that the data rows of sheet 1
        // are inserted a second time by this call.
        howto.process("F:/new.xlsx");
        howto.close();
    }

    public XxlsBig(String tableName) throws SQLException {
        this.conn = getNew_Conn();
        this.statement = conn.createStatement();
        this.tableName = tableName;
    }

    private Connection conn = null;
    private Statement statement = null;
    private PreparedStatement newStatement = null;
    private String tableName = "temp_table";
    private boolean create = true;

    public void optRows(int sheetIndex, int curRow, List<String> rowlist) throws SQLException {
        if (sheetIndex == 0 && curRow == 0) {
            // Header row of the first sheet: build the CREATE TABLE and INSERT statements
            StringBuffer preSql = new StringBuffer("insert into " + tableName
                    + " values(");
            StringBuffer table = new StringBuffer("create table " + tableName
                    + "(");
            int c = rowlist.size();
            for (int i = 0; i < c; i++) {
                preSql.append("?,");
                table.append(rowlist.get(i));
                table.append(" varchar2(100) ,");
            }
            // Drop the trailing commas and close the parentheses
            table.deleteCharAt(table.length() - 1);
            preSql.deleteCharAt(preSql.length() - 1);
            table.append(")");
            preSql.append(")");
            if (create) {
                statement = conn.createStatement();
                try {
                    statement.execute("drop table " + tableName);
                    System.out.println("Table " + tableName + " dropped");
                } catch (Exception e) {
                    // The table probably did not exist yet; carry on and create it
                }
                // execute() returns false for a statement that yields no result set, such as DDL
                if (!statement.execute(table.toString())) {
                    System.out.println("Table " + tableName + " created");
                    // return;
                } else {
                    System.out.println("Failed to create table " + tableName);
                    return;
                }
            }
            conn.setAutoCommit(false);
            newStatement = conn.prepareStatement(preSql.toString());
        } else if (curRow > 0) {
            // Ordinary data row
            int col = rowlist.size();
            for (int i = 0; i < col; i++) {
                newStatement.setString(i + 1, rowlist.get(i).toString());
            }
            newStatement.addBatch();
            // Flush and commit every 1000 rows
            if (curRow % 1000 == 0) {
                newStatement.executeBatch();
                conn.commit();
            }
        }
    }

    private static Connection getNew_Conn() {
        Connection conn = null;
        Properties props = new Properties();
        FileInputStream fis = null;
        try {
            fis = new FileInputStream("D:/database.properties");
            props.load(fis);
            DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
            // String jdbcURLString =
            // "jdbc:oracle:thin:@192.168.0.28:1521:orcl";
            StringBuffer jdbcURLString = new StringBuffer();
            jdbcURLString.append("jdbc:oracle:thin:@");
            jdbcURLString.append(props.getProperty("host"));
            jdbcURLString.append(":");
            jdbcURLString.append(props.getProperty("port"));
            jdbcURLString.append(":");
            jdbcURLString.append(props.getProperty("database"));
            conn = DriverManager.getConnection(jdbcURLString.toString(), props
                    .getProperty("user"), props.getProperty("password"));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // fis stays null if the properties file could not be opened
                if (fis != null) {
                    fis.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return conn;
    }

    // Flushes the remaining batch and releases the JDBC resources.
    // Returns 1 on success, 0 on failure.
    public int close() {
        try {
            newStatement.executeBatch();
            conn.commit();
            System.out.println("All data written");
            this.newStatement.close();
            this.statement.close();
            this.conn.close();
            return 1;
        } catch (SQLException e) {
            e.printStackTrace();
            return 0;
        }
    }
}
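XxlsBig concatenates the header cells straight into the CREATE TABLE statement, so a header containing spaces, punctuation or SQL fragments would break the statement or worse. A possible hardening, sketched below with the JDK only, is to accept just plain identifiers before building the two statements. HeaderSql and its methods are hypothetical, and the 30-character limit is an assumption based on the identifier length limit of older Oracle releases.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: builds the same two statements as XxlsBig, with validation.
final class HeaderSql {

    // Assumption: a letter followed by letters, digits or underscores, at most 30 characters.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]{0,29}");

    /** Returns name unchanged if it is a plain identifier, otherwise throws. */
    static String requireIdentifier(String name) {
        if (name == null || !IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("Not a plain SQL identifier: " + name);
        }
        return name;
    }

    /** Builds "create table t (a varchar2(100), b varchar2(100))" from a header row. */
    static String createTable(String table, List<String> header) {
        if (header.isEmpty()) {
            throw new IllegalArgumentException("Header row is empty");
        }
        List<String> columns = new ArrayList<>();
        for (String name : header) {
            columns.add(requireIdentifier(name) + " varchar2(100)");
        }
        return "create table " + requireIdentifier(table) + " (" + String.join(", ", columns) + ")";
    }

    /** Builds "insert into t values (?, ?)" with one placeholder per column. */
    static String insert(String table, int columnCount) {
        if (columnCount < 1) {
            throw new IllegalArgumentException("Need at least one column");
        }
        String placeholders = String.join(", ", Collections.nCopies(columnCount, "?"));
        return "insert into " + requireIdentifier(table) + " values (" + placeholders + ")";
    }
}
```

The data values themselves are already bound through PreparedStatement placeholders in XxlsBig, so only the table and column names need this kind of check.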
Subclass: XxlsPrint. Purpose: prints the data to the console.
package com.gaosheng.util.examples.xls;

import java.sql.SQLException;
import java.util.List;
import com.gaosheng.util.xls.XxlsAbstract;

public class XxlsPrint extends XxlsAbstract {

    // Prints every cell of the row, quoted and comma-separated, on one line
    @Override
    public void optRows(int sheetIndex, int curRow, List<String> rowlist) throws SQLException {
        for (int i = 0; i < rowlist.size(); i++) {
            System.out.print("'" + rowlist.get(i) + "',");
        }
        System.out.println();
    }

    public static void main(String[] args) throws Exception {
        XxlsPrint howto = new XxlsPrint();
        howto.processOneSheet("F:/new.xlsx", 1);
        // To walk every sheet instead:
        // howto.process("F:/new.xlsx");
    }
}
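The source code is in the attachment, which also contains a readme, the database configuration file, and Xls2Do, a class that combines reading of .xls and .xlsx files.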
- src.rar (9.7 KB)