scala VS Erlang

1. http://www.infoq.com/news/2008/01/languages-in-future-systems;jsessionid=8B2FAA2DE942EFAD2DCFFC305C348C3C
2. http://www.infoq.com/news/2008/06/scala-vs-erlang
3. http://www.coderanch.com/t/457859/Scala/Scala-vs-erlang
4. Original post: http://yarivsblog.blogspot.com/2008_05_01_archive.html
SUNDAY, MAY 18, 2008

Erlang vs. Scala
In my time-wasting activities on geeky social news sites, I've been seeing more and more articles about Scala. The main reasons I became interested in Scala are 1) Scala is an OO/FP hybrid, and I think that any attempt to introduce more FP concepts into the OO world is a good thing, and 2) Scala's Actors library is heavily influenced by Erlang, and Scala is sometimes mentioned in the same context as Erlang as a great language for building scalable concurrent applications.

A few times, I've seen the following take on the relative merits of Scala and Erlang: Erlang is great for concurrent programming and it has a great track record in its niche, but it's unlikely to become mainstream because it's foreign and it doesn't have as many libraries as Java. Scala, on the other hand, has the best of both worlds. It has functional semantics, its Actors library provides Erlang-style concurrency, and it runs on the JVM and has access to all the Java libraries. This combination makes Scala a better choice for building concurrent applications, especially for companies that are invested in Java.

I haven't coded in Scala, but I did a good amount of research on it and it looks like a great language. Some of the best programmers I know rave about it. I think that Scala can be a great replacement for Java. Function objects, type inference, mixins and pattern matching are all great language features that Scala has and that are sorely missing from Java.

Although I believe Scala is a great language that is clearly superior to Java, Scala doesn't supersede Erlang as my language of choice for building high-availability, low latency, massively concurrent applications. Scala's Actors library is a big improvement over what Java has to offer in terms of concurrency, but it doesn't provide all the benefits of Erlang-style concurrency that make Erlang such a great tool for the job. I did a good amount of research into the matter and these are the important differences I think one should consider when choosing between Scala and Erlang. (If I missed something or got something wrong, please let me know. I don't profess to be a Scala expert by any means.)

Concurrent programming


Scala's Actors library does a good job of emulating Erlang-style message passing. Similar to Erlang processes, Scala actors send and receive messages through mailboxes. Like Erlang, Scala has pattern matching semantics for receiving messages, which results in elegant, concise code (although I think Erlang's simpler type system makes pattern matching easier in Erlang).
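To give a feel for the mailbox-plus-pattern-matching style described above, here is a minimal sketch. It does not use the Scala Actors library itself; a plain `LinkedBlockingQueue` stands in for the mailbox, and the `Deposit`/`Withdraw`/`Stop` protocol and the `MailboxDemo` object are hypothetical names made up for illustration.

```scala
import java.util.concurrent.LinkedBlockingQueue

// Hypothetical message protocol; the names are illustrative only.
sealed trait Message
final case class Deposit(amount: Int) extends Message
final case class Withdraw(amount: Int) extends Message
case object Stop extends Message

object MailboxDemo {
  // Takes messages from the mailbox until Stop arrives, then returns the balance.
  @scala.annotation.tailrec
  def loop(mailbox: LinkedBlockingQueue[Message], balance: Int): Int =
    mailbox.take() match {
      case Deposit(n)                  => loop(mailbox, balance + n)
      case Withdraw(n) if n <= balance => loop(mailbox, balance - n)
      case Withdraw(_)                 => loop(mailbox, balance) // insufficient funds: ignored
      case Stop                        => balance
    }

  def run(): Int = {
    val mailbox = new LinkedBlockingQueue[Message]()
    var result = 0
    val receiver = new Thread(() => { result = loop(mailbox, 0) })
    receiver.start()
    // "Sending" is just enqueueing; the sender never touches the receiver's state.
    Seq[Message](Deposit(100), Withdraw(30), Withdraw(500), Stop).foreach(m => mailbox.put(m))
    receiver.join() // join() makes the receiver's write to `result` visible here
    result
  }
}
```

The receive loop reads much like an Erlang `receive` block: one clause per message shape, with guards where needed.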

Scala's Actors library goes pretty far, but it doesn't (well, it can't) provide an important feature that makes concurrent programming so easy in Erlang: immutability. In Erlang, multiple processes can share the same data within the same VM, and the language guarantees that race conditions won't happen because this data is immutable. In Scala, though, you can send between actors pointers to mutable objects. This is the classic recipe for race conditions, and it leaves you just where you started: having to ensure synchronized access to shared memory.

If you're careful, you may be able to avoid this problem by copying all messages or by treating all sent objects as immutable, but the Scala language doesn't guarantee safe access to shared objects. Erlang does.
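The hazard can be shown without any actors at all. The sketch below is a deliberately single-threaded illustration (so that it behaves the same on every run), again with a queue standing in for a mailbox and with hypothetical names. With real threads, the mutable case would be a data race rather than a predictable surprise.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ArrayBuffer

object SharedMessageDemo {
  // Sends a mutable buffer by reference, then keeps mutating it.
  def mutableSend(): Seq[Int] = {
    val mailbox = new LinkedBlockingQueue[ArrayBuffer[Int]]()
    val payload = ArrayBuffer(1, 2, 3)
    mailbox.put(payload)  // only the reference is enqueued
    payload += 99         // the sender writes to the same object after sending
    mailbox.take().toList // the receiver observes the later mutation
  }

  // Sends an immutable Vector; "changing" it creates a new value.
  def immutableSend(): Seq[Int] = {
    val mailbox = new LinkedBlockingQueue[Vector[Int]]()
    var payload = Vector(1, 2, 3)
    mailbox.put(payload)
    payload = payload :+ 99 // builds a new Vector; the sent one is untouched
    mailbox.take()
  }
}
```

Nothing in the compiler prevents the first version, which is the point made above: in Scala the discipline is a convention, not a guarantee.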

Hot code swapping


Hot code swapping is a killer feature. Not only does it (mostly) eliminate the downtime required to do code upgrades, it also makes a language much more productive because it allows for true interactive programming. With hot code swapping, you can immediately test the effects of code changes without stopping your server, recompiling your code, restarting your server (and losing the application's state), and going back to where you had been before the code change. Hot code swapping is one of the main reasons I like coding in Erlang.

The JVM has limited support for hot code swapping during development -- I believe it only lets you change a method's body at runtime (an improvement for this feature is in Sun's top 25 RFE's for Java). This capability is not as robust as Erlang's hot code swapping, which works for any code modification at any time.

A great aspect of Erlang's hot code swapping is that when you load new code, the VM keeps around the previous version of the code. This gives running processes an opportunity to receive a message to perform a code swap before the old version of the code is finally removed (which kills processes that didn't perform a code upgrade). This feature is unique to Erlang as far as I know.

Hot code swapping is even more important for real-time applications that enable synchronous communications between users. Restarting such servers would cause user sessions to disconnect, which would lead to poor user experience. Imagine playing World of Warcraft and, in the middle of a major battle, losing your connection because the developers wanted to add a log line somewhere in the code. It would be pretty upsetting.

Garbage collection


A common argument against GC'd languages is that they are unsuitable for low latency applications due to potential long GC sweeps that freeze the VM. Modern GC optimizations such as generational collection alleviate the problem somewhat, but not entirely. Occasionally, the old generation needs to be collected, which can trigger long sweeps.

Erlang was designed for building applications that have (soft) real-time performance, and Erlang's garbage collection is optimized for this end. In Erlang, processes have separate heaps that are GC'd separately, which minimizes the time a process could freeze for garbage collection. Erlang also has ets, an in-memory storage facility for storing large amounts of data without any garbage collection (you can find more information on Erlang GC at http://prog21.dadgum.com/16.html).

Erlang might not have a decisive advantage here. The JVM has a new concurrent garbage collector designed to minimize freeze times. This article and this whitepaper (PDF warning) have some information about how it works. This collector trades performance and memory overhead for shorter freezes. I haven't found any benchmarks that show how well it works in production apps, though, or whether it is as effective as Erlang's garbage collector for low-latency apps.

Scheduling


The Erlang VM schedules processes preemptively. Each process gets a certain number of reductions (roughly equivalent to function calls) before it's swapped out for another process. Erlang processes can't call blocking operations that freeze the scheduler for long periods. All file IO and communications with native libraries are done in separate OS threads (communications are done using ports). Similar to Erlang's per-process heaps, this design ensures that Erlang's lightweight processes can't block each other. The downside is some communications overhead due to data copying, but it's a worthwhile tradeoff.
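The idea of a per-turn reduction budget can be illustrated with a toy simulation. This is only a sketch of the scheduling policy described above, not of how the Erlang VM is implemented: "reductions" are modeled as abstract work steps, and `Proc`, `run`, and `budget` are hypothetical names.

```scala
import scala.collection.mutable

object ReductionSchedulerDemo {
  // A toy "process": a name plus the number of work steps it still needs.
  final case class Proc(name: String, remaining: Int)

  // Runs processes round-robin, giving each at most `budget` steps per turn.
  // Returns the process names in the order in which they finished.
  def run(procs: Seq[Proc], budget: Int): List[String] = {
    require(budget > 0, "budget must be positive")
    val queue = mutable.Queue.empty[Proc]
    queue ++= procs
    val finished = mutable.ListBuffer.empty[String]
    while (queue.nonEmpty) {
      val p = queue.dequeue()
      val left = p.remaining - math.min(budget, p.remaining)
      if (left == 0) finished += p.name
      else queue.enqueue(p.copy(remaining = left)) // budget used up: back of the queue
    }
    finished.toList
  }
}
```

With a budget of 3, a 10-step process cannot hold up a 2-step one: `run(Seq(Proc("long", 10), Proc("short", 2), Proc("mid", 4)), 3)` finishes the processes in the order short, mid, long.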

Scala has two types of Actors: thread-based and event-based. Thread-based actors execute in heavyweight OS threads. They never block each other, but they don't scale to more than a few thousand actors per VM. Event-based actors are simple objects. They are very lightweight, and, like Erlang processes, you can spawn millions of them on a modern machine. The difference with Erlang processes is that within each OS thread, event-based actors execute sequentially without preemptive scheduling. This makes it possible for an event-based actor to block its OS thread for a long period of time (perhaps indefinitely).

According to the Scala actors paper, the actors library also implements a unified model, by which event-based actors are executed in a thread pool, which the library automatically resizes if all threads are blocked due to long-running operations. This is pretty much the best you can do without runtime support, but it's not as robust as the Erlang implementation, which guarantees low latency and fair use of resources. In a degenerate case, all actors would call blocking operations, which would increase the native thread pool size to the point where it can't grow anymore beyond a few thousand threads.

This can't happen in Erlang. Erlang only allocates a fixed number of OS threads (typically, one per processor core). Idle processes don't impose any overhead on the scheduler. In addition, spawning Erlang processes is always a very cheap operation that happens very fast. I don't think the same applies to Scala when all existing threads are blocked, because this condition first needs to be detected, and then new OS threads need to be spawned to execute pending Actors. This can add significant latency (this is admittedly theoretical: only benchmarks can show the real impact).
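The starvation problem behind this discussion is easy to reproduce with a plain JVM thread pool. The sketch below assumes a fixed pool of two threads standing in for the actor scheduler's worker threads; it does not model the Scala Actors library's automatic resizing, and the names are hypothetical.

```scala
import java.util.concurrent.{Callable, CountDownLatch, Executors, TimeUnit, TimeoutException}

object BlockedPoolDemo {
  // Returns (quick task finished while the pool was blocked,
  //          quick task finished after the blockers were released).
  def demo(): (Boolean, Boolean) = {
    val pool = Executors.newFixedThreadPool(2)
    val release = new CountDownLatch(1)
    try {
      // Two tasks that block their worker threads, e.g. on slow IO.
      (1 to 2).foreach { _ =>
        pool.submit(new Runnable { def run(): Unit = release.await() })
      }
      // A third, trivial task now has no thread to run on.
      val quick = pool.submit(new Callable[String] { def call(): String = "done" })
      val ranWhileBlocked =
        try { quick.get(200, TimeUnit.MILLISECONDS); true }
        catch { case _: TimeoutException => false }
      release.countDown() // unblock the workers
      val ranAfterRelease = quick.get(5, TimeUnit.SECONDS) == "done"
      (ranWhileBlocked, ranAfterRelease)
    } finally pool.shutdownNow()
  }
}
```

The trivial task only runs once a blocked worker is released. A resizing pool avoids the stall by adding threads, which is where the detection delay and the upper bound on native threads mentioned above come in.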

Depending on what you're doing, the difference between process scheduling in Erlang and Scala may not impact performance much. However, I personally like knowing with certainty that the Erlang scheduler can gracefully handle pretty much anything I throw at it.

Distributed programming


One of Erlang's greatest strengths is that it unifies concurrent and distributed programming. Erlang lets you send a message to a process on the local VM or on a remote VM using exactly the same semantics (this is sometimes referred to as "location transparency"). Furthermore, Erlang's process spawning and linking/monitoring works seamlessly across nodes. This takes much of the pain out of building distributed, fault-tolerant applications.

The Scala Actors library has a RemoteActor type that apparently provides similar location transparency, but I haven't been able to find much information about it. According to this article, it's also possible to distribute Scala actors using Terracotta, which does distributed memory voodoo between nodes in a JVM cluster, but I'm not sure how well it works or how simple it is to set up. In Erlang, everything works out of the box, and it's so simple to get working that it's covered in the language's Getting Started manual.

Mnesia


Lightweight concurrency with no shared memory and pure message passing semantics is a fantastic toolset for building concurrent applications... until you realize you need shared (transactional) memory. Imagine building a WoW server, where characters can buy and sell items between each other. This would be very hard to build without a transactional DBMS of sorts. This is exactly what Mnesia provides -- with a number of extra benefits such as distributed storage, table fragmentation, no impedance mismatch, no GC overhead (due to ets), hot updates, live backups, and multiple disc/memory storage options (you can read the Mnesia docs for more info). I don't think Scala/Java has anything quite like Mnesia, so if you use Scala you have to find some alternative. You would probably have to use an external DBMS such as MySQL cluster, which may incur a higher overhead than a native solution that runs in the same VM.

Tail recursion


Functional programming and recursion go hand-in-hand. In fact, you could hardly write working Erlang programs without tail recursion because Erlang doesn't have loops -- it uses recursion for *everything* (which I believe is a good thing). Tail recursion serves for more than just style -- it also facilitates hot code swapping. Erlang gen_servers call their loop() function recursively between calls to 'receive'. When a gen_server receives a code_change message, it can make a remote call (e.g. Module:loop()) to re-enter its main loop with the new code. Without tail recursion, this style of programming would quickly result in stack overflows.

From my research, I learned that Scala has limited support for tail recursion due to bytecode restrictions in most JVMs. From http://www.scala-lang.org/docu/files/ScalaByExample.pdf:


In principle, tail calls can always re-use the stack frame of the calling function. However, some run-time environments (such as the Java VM) lack the primitives to make stack frame re-use for tail calls efficient. A production quality Scala implementation is therefore only required to re-use the stack frame of a directly tail-recursive function whose last action is a call to itself. Other tail calls might be optimized also, but one should not rely on this across implementations.


(If I understand the limitation correctly, tail call optimization in Scala only works within the same function, i.e. x() can make a tail recursive call to x(), but if x() calls y(), y() couldn't make a tail recursive call back to x().)

In Erlang, tail recursion Just Works.
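To make the Scala side of this concrete, here is a small sketch using only the standard library. `@tailrec` asks the compiler to verify that a directly self-recursive method is compiled into a loop. For mutual recursion, one common workaround is the trampolining helper in `scala.util.control.TailCalls`, which trades stack frames for heap-allocated steps. The object and method names below are made up for illustration.

```scala
import scala.annotation.tailrec
import scala.util.control.TailCalls.{TailRec, done, tailcall}

object TailCallDemo {
  // Directly self-recursive: scalac rewrites this into a loop.
  // @tailrec makes compilation fail if it cannot.
  @tailrec
  def sumTo(n: Long, acc: Long = 0L): Long =
    if (n == 0) acc else sumTo(n - 1, acc + n)

  // Mutually recursive tail calls are not turned into loops by scalac, so
  // each call is wrapped in a trampoline step that is evaluated iteratively
  // when `.result` is invoked.
  def isEven(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(isOdd(n - 1))

  def isOdd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(isEven(n - 1))
}
```

For example, `TailCallDemo.sumTo(1000000L)` and `TailCallDemo.isEven(1000000).result` both complete without exhausting the stack. Written as plain `Boolean`-returning mutual recursion, the second would be at risk of a StackOverflowError at that depth.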

Network IO


Erlang processes are tightly integrated with the Erlang VM's event-driven network IO core. Processes can "own" sockets and send and receive messages to/from sockets. This provides the elegance of concurrency-oriented programming plus the scalability of event-driven IO (the Erlang VM uses epoll/kqueue under the covers). From Googling around, I haven't found similar capabilities in Scala actors, although they may exist.

Remote shell


In Erlang, you can get a remote shell into any running VM. This allows you to analyze the state of the VM at runtime. For example, you can check how many processes are running, how much memory they consume, what data is stored in Mnesia, etc.

The remote shell is also a powerful tool for discovering bugs in your code. When the server is in a bad state, you don't always have to try to reproduce the bug offline somehow to devise a fix. You can log right into it and see what's wrong. If it's not obvious, you can make quick code changes to add more logging and then revert them when you've discovered the problem. I haven't found a similar feature in Scala/Java from some Googling. It probably wouldn't be too hard to implement a remote shell for Scala, but without hot code swapping it would be much less useful.

Simplicity


Scala runs on the JVM, it can easily call any Java library, and it is therefore closer than Erlang to many programmers' comfort zones. However, I think that Erlang is very easy to learn -- definitely easier than Scala, which contains a greater total number of concepts you need to know in order to use the language effectively (especially if you consider the Java foundations on which Scala is built). This is to a large degree due to Erlang's dynamic typing and lack of object orientation. I personally prefer Erlang's more minimalist style, but this is a subjective matter and I don't want to get into religious debates here.

Libraries


Java indeed has a lot of libraries -- many more than Erlang. However, this doesn't mean that Erlang has no batteries included. In fact, Erlang's libraries are quite sufficient for many applications (you'll have to decide for yourself if they are sufficient for you). If you really need to use a Java library that doesn't have an Erlang equivalent, you could call it using Jinterface. It may or may not be a suitable option for your application. This can indeed be a deal breaker for some people who are deciding between the two languages.

There's an important difference between Java/Scala and Erlang libraries besides their relative abundance: virtually all "big" Erlang libraries use Erlang's concurrency and fault tolerance features. In the Erlang ecosystem, you can get web servers, database connection pools, XMPP servers, and database servers, all of which use Erlang's lightweight concurrency, fault tolerance, etc. Most of Scala's libraries, on the other hand, are written in Java and they don't use Scala actors. It will take Scala some time to catch up to Erlang in the availability of libraries based on Actors.

Reliability and scalability


Erlang has been running massive systems for 20 years. Erlang-powered phone switches have been running with nine nines availability -- only 31ms downtime per year. Erlang also scales. From telecom apps to Facebook Chat, we have enough evidence that Erlang works as advertised. Scala, on the other hand, is a relatively new language, and as far as I know its actors implementation hasn't been tested in large-scale real-time systems.
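The "nine nines" figure above is simple arithmetic: a 365-day year has 31,536,000,000 ms, and an unavailability of 10^-9 of that is about 31.5 ms. A small sketch of the calculation, with a hypothetical helper name:

```scala
object AvailabilityDemo {
  // Expected downtime per 365-day year, in milliseconds, for an
  // availability of `nines` nines (e.g. nines = 3 means 99.9%).
  def downtimeMsPerYear(nines: Int): Double = {
    val msPerYear = 365L * 24 * 60 * 60 * 1000
    msPerYear * math.pow(10, -nines)
  }
}
```

For comparison, five nines works out to 315,360 ms, a little over five minutes per year.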

Conclusion


I hope I did justice to Scala and Erlang in this comparison (which, by the way, took me way too long to write!). Regardless of these differences, though, I think that Scala has a good chance of being the more popular language of the two. Steve Yegge explains it better than I can:


Scala might have a chance. There's a guy giving a talk right down the hall about it, the inventor of – one of the inventors of Scala. And I think it's a great language and I wish him all the success in the world. Because it would be nice to have, you know, it would be nice to have that as an alternative to Java.

But when you're out in the industry, you can't. You get lynched for trying to use a language that the other engineers don't know. Trust me. I've tried it. I don't know how many of you guys here have actually been out in the industry, but I was talking about this with my intern. I was, and I think you [(point to audience member)] said this in the beginning: this is 80% politics and 20% technology, right? You know.

And [my intern] is, like, "well I understand the argument" and I'm like "No, no, no! You've never been in a company where there's an engineer with a Computer Science degree and ten years of experience, an architect, who's in your face screaming at you, with spittle flying on you, because you suggested using, you know... D. Or Haskell. Or Lisp, or Erlang, or take your pick."


Well, at least I'm not trying too hard to promote LFE...