hongtoushizi

Storing image in DB vs filesystem for user uploaded images in website


Reposted from: http://stackoverflow.com/questions/23263468/storing-image-in-db-vs-filesystem-for-user-uploaded-images-in-website

 

sigh why does everybody jump to GridFS?

Depending on the size of the images and the exact use case, I'd recommend storing the images directly in the DB (not via GridFS). Here's why:

File System

  • Storing the images in the file system is proven to work well, but it's not trivial
  • You will need a different backup system, failover, replication, etc. This can be tricky DevOps-wise
  • You will need to create a smart directory structure which is a leaky abstraction, because different file systems have very different characteristics. Some have no problem storing 16k files in one folder, others start to choke at a mere 1k files. A common approach is to use a convention like af/2c/af2c2ab3852df91.jpg, where the folders af and 2c are inferred from the file name (which itself might be a hash of the content for deduplication purposes).
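
A minimal sketch of that directory convention, assuming the file name is a SHA-1 hash of the content and that two levels of two hex characters are enough fan-out. The function names `sharded_path` and `store_image` are made up for illustration:

```python
import hashlib
from pathlib import Path


def sharded_path(root: Path, data: bytes, ext: str = ".jpg") -> Path:
    """Build a path like root/af/2c/af2c...jpg from the content hash."""
    digest = hashlib.sha1(data).hexdigest()
    # The first two pairs of hex characters become the folder names.
    return root / digest[:2] / digest[2:4] / (digest + ext)


def store_image(root: Path, data: bytes, ext: str = ".jpg") -> Path:
    """Write the image once; identical content maps to the same path."""
    path = sharded_path(root, data, ext)
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():  # same hash means same content, so skip the write
        path.write_bytes(data)
    return path
```

With two hex characters per level, each folder holds at most 256 subfolders, which keeps you away from the per-directory limits mentioned above until the leaf folders themselves fill up.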

GridFS

GridFS is made for storing large files, and for storing files in a very similar way to a file system. That comes with some disadvantages:

  • For every file, you will need one fs.files document and at least one fs.chunks document. Chunking is absolutely required for large files, but if your files are below 256k on average, there's no real chunking going on (default chunk size is 256k, or 255k in newer MongoDB versions). So when storing small files in GridFS, you get the overhead without the advantage. Bad deal. It also requires two queries instead of one.
  • It imposes a certain structure on your collection, for instance a 'file name' field. It depends on the use case, but I often choose to use a hash as the id and store that hash in the user document, for example. That deduplicates, is easy to implement, aligns beautifully with caching and doesn't require coming up with any convention. It's also very efficient because the indexed id is just a byte array.
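
To put rough numbers on the overhead, here is a back-of-the-envelope sketch. It assumes the 256k default chunk size and counts one metadata document plus one document per chunk; `gridfs_document_count` is a made-up helper, not part of any driver:

```python
import math

DEFAULT_CHUNK_SIZE = 256 * 1024  # assumed default, in bytes


def gridfs_document_count(file_size: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Documents needed for one file: 1 metadata document + N chunk documents."""
    chunks = math.ceil(file_size / chunk_size)
    return 1 + chunks


print(gridfs_document_count(40 * 1024))         # 40k thumbnail -> 2
print(gridfs_document_count(10 * 1024 * 1024))  # 10MB photo -> 41
```

For the thumbnail, the single chunk holds the whole file, so the second document buys you nothing.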

Things might look different if you're operating a site for photographers where they can upload their RAW files or large JPEGs at 10MB. In that case, GridFS is probably a good choice. For storing user images, thumbnails, etc., I'd simply throw the image in its own document flat.
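
A rough sketch of the "own document, flat" approach with a content hash as the id. The field names, the `MAX_INLINE_BYTES` cutoff and both function names are assumptions for illustration; `images` is expected to be a pymongo collection (or anything with a compatible `update_one`). Keep in mind that a single BSON document is capped at 16MB, so this only works for small images.

```python
import hashlib

# Hypothetical cutoff; pick one that fits your images. It must stay well
# below the 16MB BSON document limit.
MAX_INLINE_BYTES = 1 * 1024 * 1024


def make_image_document(data: bytes, content_type: str) -> dict:
    """Build a flat document whose _id is the SHA-256 of the image bytes."""
    if len(data) > MAX_INLINE_BYTES:
        raise ValueError("image too large to inline; GridFS may fit better")
    return {
        "_id": hashlib.sha256(data).digest(),  # raw bytes, stored as binary
        "contentType": content_type,
        "length": len(data),
        "data": data,
    }


def save_image(images, data: bytes, content_type: str) -> bytes:
    """Upsert by hash so that identical uploads are stored only once."""
    doc = make_image_document(data, content_type)
    image_id = doc.pop("_id")
    images.update_one({"_id": image_id}, {"$setOnInsert": doc}, upsert=True)
    return image_id  # store this in the user document
```

Reading the image back is then a single lookup by `_id`, and because the id is derived from the content it can double as a cache key.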
