Original article:
http://dukesoferl.blogspot.com/2008/08/scaling-mnesia-with-localcontent.html
so we've been spending the last two weeks trying to scale our mnesia application for a whole bunch of new traffic that's about to show up. previously we had to run a lot of ec2 instances because we were using ets (RAM-based) storage; once we moved tables to disk we started to wonder how many machines we could turn off.
initial indications were disappointing in terms of capacity: none of cpu, network, memory, or disk i/o seemed particularly taxed. for instance, even though raw sequential throughput on an ec2 instance can hit 100 MB/s, we were unable to push disk utilization above 1 MB/s. one clue: we did see about 6 MB/s of disk utilization when starting a node from scratch, where 100 parallel mnesia_loader processes grab table copies from remote nodes. so even under an unfavorable access pattern (writing 100 different tables at once), the machines were capable of more.
one suspect was the registered process mnesia_tm, since all table updates go through it. the mnesia docs do say mnesia is "primarily intended to be a memory-resident database". so one thought was that mnesia_tm was sitting around waiting for disk i/o to finish, introducing latency and lowering throughput; with ets tables, updates are cpu bound, so this design would not be so problematic. (we already have tcerl using the async thread pool, but that just lets the emulator do something else while i/o is in flight; the mnesia_tm process itself still waits.) so we added an option to tcerl to return from an operation without waiting for the linked-in driver's return value (and therefore without checking for errors). that didn't have much impact on i/o utilization.
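had we wanted to confirm the suspicion directly, sampling the mailbox of the registered mnesia_tm process is cheap. a minimal sketch; `QueueLen` is a hypothetical helper, not part of mnesia, and a steadily growing number would mean updates arrive faster than mnesia_tm can apply them:

```erlang
%% sample the message queue length of a registered process.
%% pointed at mnesia_tm on a loaded node, a growing value shows
%% updates piling up behind the single transaction manager.
QueueLen = fun(Name) ->
    case whereis(Name) of
        undefined ->
            undefined;  %% process not running (e.g. mnesia stopped)
        Pid ->
            {message_queue_len, N} =
                erlang:process_info(Pid, message_queue_len),
            N
    end
end,
io:format("mnesia_tm mailbox: ~p~n", [QueueLen(mnesia_tm)]).
```

polling this from a periodic process (or just from the shell under load) is enough to tell queue growth apart from plain slowness.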
we'd long ago purged transactions from the serving path, but we use sync_dirty a lot. we thought mnesia_tm might be stalling while it waited for acknowledgements from remote mnesia_tm processes, again adding latency and lowering throughput, so we tried async_dirty. that helped, except that under load the message queue of the mnesia_tm process began to grow without bound, and eventually we would have run out of memory.
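a minimal sketch of the trade-off, using a made-up table named `hit` on a single local node (mnesia started with its default ram schema):

```erlang
ok = mnesia:start(),
{atomic, ok} = mnesia:create_table(hit, [{attributes, [key, count]}]),

%% async_dirty: returns once the local copy is updated; remote
%% replicas are updated in the background, so under sustained load
%% mnesia_tm's mailbox can grow without bound.
ok = mnesia:async_dirty(fun() -> mnesia:write({hit, a, 1}) end),

%% sync_dirty: waits until all active replicas have performed the
%% update -- backpressure instead of queue growth, at the cost of a
%% round trip per write.
ok = mnesia:sync_dirty(fun() -> mnesia:write({hit, b, 2}) end),

%% transaction: full locking and atomic commit across nodes.
%% extremely expensive, which is why we keep it off the serving path.
{atomic, ok} = mnesia:transaction(fun() -> mnesia:write({hit, c, 3}) end).
```

on a single node all three succeed identically; the difference only shows up once there are remote replicas to acknowledge.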
then we discovered local_content, which gives a table the same definition on every node but different content on each; as a side effect, replication is short-circuited. with very minor code changes we tried it and saw a significant performance improvement. of course, we couldn't do this to every table we have, only to data we could afford to lose if the node went down. still, it's neat, because now there are several types of data that can be managed within mnesia, in order of expense:
transactional data. distributed transactions are extremely expensive, but sometimes necessary.
highly available data. when dirty operations are ok, but multiple copies of the data have to be kept, because the data should persist across node failures.
useful data. dirty operations are ok, and it's ok to lose some of the data if a node fails.
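as a sketch, the only schema change needed to opt a table into the cheapest tier is the local_content flag at creation time; the `counters` table and its attributes are made up for illustration:

```erlang
ok = mnesia:start(),

%% a local_content table: every node agrees on the definition, but
%% each node keeps its own rows and replicates nothing. writes never
%% leave the local node, so losing the node loses the data.
{atomic, ok} =
    mnesia:create_table(counters,
        [{local_content, true},
         {ram_copies, [node()]},  %% disc_copies would survive restarts
         {attributes, [key, count]}]).
```

reads and writes then go through exactly the same dirty/sync_dirty/transaction api as any other table, which is what made the code change so small for us.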
the erlang efficiency guide says "For non-persistent database storage, prefer Ets tables over Mnesia local_content tables.", i.e., bypass mnesia for the fastest results. we might do that eventually, but for now it's convenient to have these tables behave like all our other tables.
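the guide's suggestion would look roughly like this: a plain ets table, trading away the mnesia table api (table and key names are illustrative):

```erlang
%% plain ets: fastest non-persistent storage, at the cost of the
%% mnesia niceties (no dirty/sync_dirty/transaction api, no mnesia
%% backups, no shared schema management across nodes).
Tab = ets:new(counters, [set, public, named_table]),
true = ets:insert(Tab, {some_key, 1}),
[{some_key, 1}] = ets:lookup(counters, some_key).
```

data here is per-node and per-vm-lifetime only, which matches the "useful data" tier above.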
interestingly, i/o utilization didn't go up that much even though overall capacity improved a lot; we're writing about 1.5 MB/s now to local disks. instead we now appear cpu bound, and we don't know why yet.