`
love~ruby+rails
  • 浏览: 849087 次
  • 性别: Icon_minigender_1
  • 来自: lanzhou
社区版块
存档分类
最新评论

Exporting data to CSV and Excel in your Rails apps

阅读更多

Sometimes users want to slice and dice data as they wish. In such scenarios, it’s usual to export the data in a tabular format so your users can use any spreadsheet editor and do whatever they want.

Usually, we do that using CSV, right? OpenOffice and other editors can open CSV files flawlessly. Just double-click an CSV file and voilà.

However, Excel has some gotchas. If you double-click a CSV file, Excel will open your data with everything in just one column. Of course you can go to Tools > Import, browse to the file, set up comma as separator and then after some clicks you get your data as you wish. This is not user-friendly at all and this is not something we want to explain to our clients. Our clients must be able to simply double-click the my_data.csv file and see the data well structured.

So, here’s what we do at Plataforma to deal with Excel formats.

It’s TSV, not CSV, dude!

First of all, Excel expects your data with tabulations as fields separator. So, what you need is actually a TSV (tab-separated values).

If you are using FasterCSV, you just need to do:

tsv_str = FasterCSV.generate(:col_sep => "\t") do |tsv|
  tsv << headers
  # append your beautiful data here
end

Keep this in mind. Excel demands tabulations, not commas! But there are worse things to come…

No newline in fields

Excel doesnt’ like when you put “\n” inside fields. Although fields are separated by tabulations, it appears Excel can’t cope with extra newlines.

So, if you have some text fields in your model, beware that it may contain newlines and you need to strip those before exporting your data to Excel.

Forget about UTF-8. Use UTF-16!

One of the most hidden specifications of Excel is that it expects our TSV files to be encoded using UTF-16 Little Endian. Did you know that? Well, we didn’t!

Some sources even say that this is the only Unicode format supported in Excel.

What’s the difference between UTF-8 and UTF-16? UTF-8 is a variable byte encoding in which characters may use up to 4 bytes, however for western languages it usually uses one or two bytes. UTF-16 characters uses always 2 bytes (remember that characters and bytes are two different things in Unicode “slang”). Generally speaking, UTF-16 uses more space (most of the time).

There’s also the Little Endian part. UTF-16 always uses a pair of bytes to represent a character, however we need to know what’s the correct order of those bytes. We won’t get into details here, but the order is indicated by Byte-order Mark (BOM). In practice, the BOM in UTF-16 will add two bytes in the beginning of a file (you can see that in a hexeditor).

One way to convert your TSV string (generated by FasterCSV) is with Iconv, a tool written for the GNU C Library. Luckly, Iconv is well wrapped in pure Ruby joy. It’s inside Ruby’s standard library and you just need to require it.

However, if you convert an string to UTF-16 Little Endian, Iconv will NOT put the BOM in the beginning. This is compatible with the Unicode FAQ. But since Excel is way out of the standards, you must manually insert the BOM.

You can use Iconv just before sending the file to the user in your controller. It will look like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
require 'iconv'
 
class ProjectsController < ActionController::Base
  BOM = "\377\376" #Byte Order Mark
 
  def index
    @projects = Project.all
    respond_to do |format|
      format.html
      format.csv { export_csv(@projects) }
    end
  end
 
protected
 
  def export_csv(projects)
    filename = I18n.l(Time.now, :format => :short) + "- Projects.csv"
    content = Project.to_csv(projects)
    content = BOM + Iconv.conv("utf-16le", "utf-8", content)
    send_data content, :filename => filename
  end
end

You can see that we are still using CSV as an extension to the file. This is because TSV files are not usually associated with Excel. This excerpt uses a timestamp as the filename, which is generally a good practice.

Wrapping up

So, these are the three rules for dealing with Excel-friendly-CSV:

  1. Use tabulations, not commas.
  2. Fields must NOT contain newlines.
  3. Use UTF-16 Little Endian to send the file to the user. And include a Little Endian BOM manually.

Keep this in mind and you’ll never have to explain to your clients how to open the data export in Excel.

There is one last issue: OpenOffice will NOT open files in the Excel “specification” easily. Google Analytics solves this by showing two links to the user: “Export to CSV” and “Export to Excel”. The first is a regular CSV file and the second is the specially crafted Excel-friendly TSV file.

And blame Microsoft for this odd behaviour.

UPDATE: @danielvlopes pointed us an already existing solution which encapsulates the process described in this post, called csv_builder. You just need to set the @output_encoding to use the “utf-16″ (beware of the BOM).

@jncoward also provided a link to the stylesheet gem, which handles directly Excel formats. TSV is a much simpler and faster format, but the spreadsheet gem might be useful to you in more complex cases.

Finally, Kieran pointed in the comments that we can have XML spreadsheets, while Niko and Chris told us that we can you can also work with HTML tables in Excel, which is simple as well and even allows formatting.

分享到:
评论

相关推荐

    Excel Importing & Exporting Text Data (Data Analysis With Excel) (2016)

    Excel Importing & Exporting Text Data ---Quickly Turn Raw Data Into Excel Tables, By Scott Hartshorn, epub格式版。

    LibXL.for.Windows.3.6.5.0.Incl.License.and.Cracked

    LibXL can help your applications in exporting and extracting data to/from Excel files with minimum effort. Also it can be used as report engine. Library can be used in C, C++, C#, Delphi, Fortran and ...

    Excel.2016.Up.To.Speed.B01

    In no time at all you're be importing and exporting data, creating formulas, modifying and formatting spreadsheets, setting up and formatting data in tables, creating beautiful charts, adding and ...

    Exporting Context Data into Excel Using the WebDynpro Binary Cache

    ### 使用WebDynpro二进制缓存将上下文数据导出到Excel的知识点解析 #### 一、引言 在企业级应用开发中,数据处理与报表生成是常见且重要的需求之一。SAP WebDynpro作为一种高效的企业级应用开发框架,提供了丰富的...

    Haskell Data Analysis Cookbook(PACKT,2014)

    The meat of data analysis occurs in the topics involving statistical techniques, parallelism, concurrency, and machine learning algorithms, along with various examples of visualizing and exporting ...

    Tableau.Creating.Interactive.Data.Visualizations

    Illustrate your data in a more interactive way by implementing data visualization principles and creating visual stories using Tableau About This Book Use data visualization principles to help you to...

    [内含注册机]LibXL.for.Windows.3.7.2.Incl.Keygen

    LibXL can help your applications in exporting and extracting data to/from Excel files with minimum effort. Also it can be used as report engine. Library can be used in C, C++, C#, Delphi, Fortran and ...

    易导出

    QueryRun Data Easy Export Run(DEE) is a powerful program to export your data quickly from databases to any of 14 available formats, including MS Access (Windows version only), MS Excel, MS Word (RTF),...

    Hands-On Data Visualization with Bokeh pdf

    Adding a layer of interactivity to your plots and converting these plots into applications hold immense value in the field of data science. The standard approach to adding interactivity would be to ...

    Data.Modeling.Made.Simple.with.Embarcadero.ERStudio

    By the end of this section, you will have created and saved your first data model in ER/Studio and be ready to start modeling in Section II! Section II explains all of the symbols and text on a data...

    ExcelExport 3.12最新源码版本--

    将数据集输出到Excel 97/2000/2002Export all records of a dataset to MS Excel (97, 2000 or XP).PropertiesName of worksheet and file Header texts Begin row of header, titles (=fieldnames) and data (field...

    西门子_Exporting archived Data from WinCC into a .csv-File.pdf

    导出完成后,可以通过Excel或其他支持CSV格式的软件打开和查看数据。 在导出操作完成后,还需要进行数据验证,确保数据的完整性和准确性。对于导出的数据,应定期进行检查和分析,以确保生产流程的稳定性和优化。 ...

    Business Analysis with Microsoft Excel, 5th Edition

    Business Analysis with Microsoft Excel (5th Edition) ...19 Using Excel 2016 and Power BI to Analyze QuickBooks Data 20 Analyzing Contributions and Margins 21 Pricing and Costing Index

    LibXL 3.6.2cr版

    LibXL can help your applications in exporting and extracting data to/from Excel files with minimum effort. Also it can be used as report engine. Library can be used in C, C++, C#, Delphi, Fortran and ...

    WinCC Professional V15.1 Excel Report Program

    WinCC Professional V15.1 Excel Report Program . WinCC Professional is an extremely powerful SCADA software from Siemens.... It supports exporting data to Excel (csv, txt) with many different forms.

    SAM分割模型onnx导出模型问题:Exporting the operator repeat-interleave to ON

    Exporting the operator repeat_interleave to ONNX opset version 11 is not supported TypeError: 'torch._C.Value' object is not iterable (Occurred when translating repeat_interleave). 问题解决: 1....

    Beginning.Neo4j.1484212

    Beginning Neo4j is your introduction in the world of graph databases, and the benefits they can bring to your applications. Neo4j is the most established graph database on the market, and it's always ...

Global site tag (gtag.js) - Google Analytics