Fun Programming: Quicksort on a Functional Linked List

RednaxelaFX

Blog categories: Programming, Haskell, Google, C, C++

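(Restored from a backup made on 2009-08-28. Good thing I had one; otherwise the 8 hours of data lost during the data center migration would have been...)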

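The Problem

In his post "Fun Programming: Quicksort on a Functional Linked List" (趣味编程：函数式链表的快速排序), Lao Zhao posed a problem:

Jeffrey Zhao wrote:

A while ago a friend asked me whether the following Haskell quicksort code could be converted into an equivalent implementation using lambda expressions in C#: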

qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

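My answer at the time was that C# lacks some basic data structures, so no. Once those are filled in, there is no problem at all. Later I found the question rather interesting, moderately difficult, and a decent test of "basic programming" skills, so I wrote one myself. If you are interested, feel free to give it a try.

So let me join the fun and give it a go. Someone is bound to write something short and sharp, so I'll go the other way and write a long, verbose version ;-p
A side note: Lao Zhao's original post does say "an equivalent lambda expression in C#", but judging from the code template he provides later, he does not seem to want readers to literally solve it with a single lambda expression. I'll follow the code template as closely as I can.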

======================================================================

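Background

Lao Zhao explained the Haskell qsort function in his original post; please read that first. Here I'll add some of my own understanding.

First, the concept of a "list". In many functional languages the immutable list is one of the core data structures. It can be defined as follows (using Haskell notation):
(1) The empty list [] is a list.
(2) Prepending an element x to a list l yields x:l, which is also a list. Here x is called the head of the new list and l is called its tail.
(3) Whatever is obtained from (1) by applying (2) a finite number of times is a list.
This is a recursive definition, where (1) is called the basis step, (2) the recursive step, and (3) the closure step. The closure here refers to the Kleene closure.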

In Haskell notation, a list with several elements can be written like this:

[x, y, z]

It is actually shorthand for the list definition above:

x : y : z : []

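The colon denotes prepending, customarily also called cons.

As you can see, this list structure maps very naturally onto a linked list. The slightly annoying part is that the foundation of all lists, the empty list, cannot be split into a head and a tail, so the related operations have to watch out for the empty-list special case. More on this below.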
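
To make the definition concrete, here is a minimal Haskell sketch of a hand-rolled list type that mirrors rules (1) and (2) and shows the empty-list special case. The names List, Nil, Cons, toBuiltin, safeHead and safeTail are made up for illustration and appear nowhere in Lao Zhao's post; wrapping the result in Maybe is just one possible way to deal with the empty list.

```haskell
-- A hand-rolled list type mirroring rules (1) and (2).
-- All names here are illustrative assumptions, not from the original post.
data List a = Nil | Cons a (List a)

-- Convert to the built-in list so the result can be printed and compared.
toBuiltin :: List a -> [a]
toBuiltin Nil         = []
toBuiltin (Cons x xs) = x : toBuiltin xs

-- The empty list has neither head nor tail, so both operations are partial;
-- returning Maybe makes that special case explicit.
safeHead :: List a -> Maybe a
safeHead Nil        = Nothing
safeHead (Cons x _) = Just x

safeTail :: List a -> Maybe (List a)
safeTail Nil         = Nothing
safeTail (Cons _ xs) = Just xs

main :: IO ()
main = do
  let l = Cons 1 (Cons 2 (Cons 3 Nil)) :: List Int
  print (toBuiltin l)                               -- [1,2,3]
  print (([1, 2, 3] :: [Int]) == 1 : 2 : 3 : [])    -- True
  print (safeHead l)                                -- Just 1
  print (fmap toBuiltin (safeTail l))               -- Just [2,3]
  print (safeHead (Nil :: List Int))                -- Nothing
```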

Next, functions. A good habit when writing Haskell is to understand what a function means through its type. Filling in Lao Zhao's example gives:

qsort :: (Ord a) => [a] -> [a]
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

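A type signature for qsort has been added at the top. Note that the parameter type is [a], that is, a list whose elements have type a; the return type is also [a]. a is a type parameter, but a constrained one: it must be an instance of the Ord typeclass. Ord is defined as follows: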

class (Eq a) => Ord a where
  compare :: a -> a -> Ordering
  (<) :: a -> a -> Bool
  (>=) :: a -> a -> Bool
  (>) :: a -> a -> Bool
  (<=) :: a -> a -> Bool
  max :: a -> a -> a
  min :: a -> a -> a

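Ord extends the Eq class and defines the methods above. It is precisely because qsort's type parameter a carries the Ord "constraint" that the function can freely use the two functions < and >=; Ord guarantees they exist.

To map this onto C#, the concept does not carry over directly, but there is a language construct with a roughly similar effect: generic constraints. Something like this: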

public static ImmutableList<T> QuickSort<T>(
    this ImmutableList<T> src) where T : IComparable<T> {
    // ...
}

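The CompareTo() method on the IComparable<T> interface is enough to express the operations the Ord typeclass requires, just not as conveniently...
Lao Zhao's code template does not use a generic constraint; it decides ordering by passing in a compare function instead. That does the job too, but it drifts further from the original Haskell code than a generic constraint does.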
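
As a rough illustration of the two approaches, the sketch below puts an Ord-constrained qsort next to a variant that takes the comparison as an argument, which is closer in spirit to the compare-function template. qsortBy is a hypothetical name chosen here, not something from the original post.

```haskell
import Data.Ord (comparing)

-- Version constrained by the Ord typeclass, as in the post.
qsort :: Ord a => [a] -> [a]
qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

-- Version taking an explicit comparison function.
-- qsortBy is a hypothetical name used only for this sketch.
qsortBy :: (a -> a -> Ordering) -> [a] -> [a]
qsortBy _   []     = []
qsortBy cmp (x:xs) =
  qsortBy cmp [y | y <- xs, cmp y x == LT]
    ++ [x]
    ++ qsortBy cmp [y | y <- xs, cmp y x /= LT]

main :: IO ()
main = do
  print (qsort [3, 1, 2, 5, -1, 2, 0 :: Int])   -- [-1,0,1,2,2,3,5]
  print (qsort ["pear", "apple", "fig"])        -- ["apple","fig","pear"]
  -- No Ord constraint is involved here: the caller decides what "smaller" means.
  print (qsortBy (comparing length) ["pear", "apple", "fig"])  -- ["fig","pear","apple"]
  print (qsortBy (flip compare) [3, 1, 2 :: Int])              -- [3,2,1]
```

The second form also works when a non-default ordering is wanted, at the cost of one extra parameter.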

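One more point about functions: a Haskell function takes a single argument and returns a single value. One version of my code for this post used things like x => y => x.CompareTo(y) to mirror that trait of the original Haskell code. In the end I dropped it; writing it that way adds little in this example.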
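
For reference, a tiny Haskell sketch of what "one argument, one result" looks like in practice. compareTo3 and lessThan are illustrative names only.

```haskell
-- compare takes its arguments one at a time: a -> (a -> Ordering).
-- Supplying only the first argument yields another function.
compareTo3 :: Int -> Ordering
compareTo3 = compare 3

-- (< x) is an operator section: the function \y -> y < x.
lessThan :: Int -> (Int -> Bool)
lessThan x = (< x)

main :: IO ()
main = do
  print (compareTo3 5)                    -- LT, because compare 3 5 == LT
  print (filter (lessThan 3) [1 .. 5])    -- [1,2]
  print (map (compare 3) [1, 3, 5])       -- [GT,EQ,LT]
```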

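Haskell's lazy evaluation does not visibly come into play in this example either, so I won't go into it.

Come to think of it... the key to a good quicksort implementation is picking a good pivot. Lao Zhao's Haskell code uses the first element of the list as the pivot. That's not great, but I couldn't be bothered to make it more complicated, so I followed suit =u=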
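
To get a feel for why the first element is a poor pivot, the sketch below records the partition sizes that this pivot choice produces, following only the "greater or equal" side at each step. splits is a hypothetical helper written for this illustration.

```haskell
-- Sizes of the two partitions at each step when the first element is the
-- pivot. Only the "greater or equal" partition is followed further.
-- splits is a hypothetical helper, not part of the original post.
splits :: Ord a => [a] -> [(Int, Int)]
splits []     = []
splits (x:xs) = (length smaller, length rest) : splits rest
  where
    smaller = filter (< x) xs
    rest    = filter (>= x) xs

main :: IO ()
main = do
  print (splits [1 .. 5 :: Int])          -- [(0,4),(0,3),(0,2),(0,1),(0,0)]
  print (splits [3, 1, 5, 2, 4 :: Int])   -- [(2,2),(1,0)]
```

On already sorted input every split is completely lopsided, so the recursion goes as deep as the list is long.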

======================================================================

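About the Implementation

OK, so we need to implement an immutable singly linked list. Easy, right? In C, rolling your own linked list is about as ordinary as it gets: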

typedef struct tagNode {
    int value;
    struct tagNode* next;
} Node;

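How do you mark the end? Just stuff a NULL into next.
...Fine, but this problem isn't meant to be done in C.

Recall the empty-list special case mentioned earlier. Using null to represent the empty list does express nicely that it cannot be split into head and tail, but it also brings plenty of trouble: you have to check for null everywhere to avoid a NullReferenceException. One way to reduce the trouble null causes is the Null Object Pattern. The key to the pattern is to provide a normal interface and, besides the implementation for the normal case, implement that interface for the "null object" special case as well. "Interface" here means an abstraction in general, not specifically an interface in Java or C#.

Lao Zhao's original code template used null to represent the empty list. After I suggested the Null Object Pattern he made some changes, but they were not what I had expected. The code below follows my own understanding of the Null Object Pattern.

Actually, rather than being inspired by the Null Object Pattern, many of the habits in my code below were picked up from the DLR source. For example, not exposing public constructors but offering more controllable factory methods that return instances of different specialized subclasses as needed. Or keeping string literals in one static class for central management, and handling thrown exceptions the same way.
Another neat thing about factory methods is that they make the most of C#'s type inference, so you write fewer type arguments, which feels great. At first I only wrote Of() on ImmutableList<T>; then, when I needed it in Main(), I realized I still had to spell out what T is. Annoyed, I added Of<T>() to the static ImmutableList class, and that felt much better. The same trick is common in Java; just look at how densely it is used in the Google Collections Library...

Here is the code. It would be better split across several files, but since it was written for a blog post, never mind; it's all lumped together.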

using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

namespace TestImmutableDataStructure {
    // Represents an immutable list.
    //
    // A proper list is one that holds the first element in its head,
    // and the rest of the elements as a sublist in its tail.
    // A tail with an empty list denotes the end of list.
    // An empty list has no head or tail.
    // The tail of a non-empty list must not be null.
    public abstract class ImmutableList<T> : IEnumerable<T> {
        public static readonly ImmutableList<T> Empty =
            EmptyImmutableList<T>.Instance;

        #region Factory methods

        // create a list from an array of items
        public static ImmutableList<T> Of( params T[ ] items ) {
            Debug.Assert( null != items );

            var length = items.Length;
            // if ( 0 == items.Length ) return Empty;

            ImmutableList<T> result = Empty;
            for ( var i = length - 1; i >= 0; i-- ) {
                result = Cons( items[ i ], result );
            }

            return result;
        }

        // constructs a list by prepending the head onto the tail
        public static ImmutableList<T> Cons( T head, ImmutableList<T> tail ) {
            return new NonEmptyImmutableList<T>( head, tail );
        }

        #endregion

        #region Constructors

        protected ImmutableList( ) { }

        #endregion

        public abstract T Head { get; }
        public abstract ImmutableList<T> Tail { get; }
        public abstract bool IsEmpty { get; }

        #region IEnumerable<T> Members

        public abstract IEnumerator<T> GetEnumerator( );

        #endregion

        #region IEnumerable Members

        IEnumerator IEnumerable.GetEnumerator( ) {
            return GetEnumerator( );
        }

        #endregion
    }

    // Represents the special case of an empty list.
    internal class EmptyImmutableList<T> : ImmutableList<T> {
        public static readonly EmptyImmutableList<T> Instance =
            new EmptyImmutableList<T>( );

        public override T Head {
            get { throw Errors.ListIsEmpty; }
        }

        public override ImmutableList<T> Tail {
            get { throw Errors.ListIsEmpty; }
        }

        public override bool IsEmpty {
            get { return true; }
        }

        public override IEnumerator<T> GetEnumerator( ) {
            yield break;
        }
    }

    // Represents a non-empty list.
    internal class NonEmptyImmutableList<T> : ImmutableList<T> {
        private readonly T _head;
        private readonly ImmutableList<T> _tail;

        internal NonEmptyImmutableList( T head, ImmutableList<T> tail ) {
            Debug.Assert( null != tail );

            _head = head;
            _tail = tail;
        }

        public override T Head {
            get { return _head; }
        }

        public override ImmutableList<T> Tail {
            get { return _tail; }
        }

        public override bool IsEmpty {
            get { return false; }
        }

        public override IEnumerator<T> GetEnumerator( ) {
            ImmutableList<T> list = this;
            while ( !list.IsEmpty ) {
                yield return list.Head;
                list = list.Tail;
            }
        }
    }

    // ImmutableList<T> extensions and convenience methods
    public static class ImmutableList {
        // convenience method for creating a list with ease of type inference
        public static ImmutableList<T> Of<T>( params T[ ] items ) {
            return ImmutableList<T>.Of( items );
        }

        // convenience extension method for constructing a list
        // by prepending the head onto the tail
        public static ImmutableList<T> Cons<T>(
            this T head,
            ImmutableList<T> tail ) {
            if ( null == tail ) throw Errors.ArgumentIsNull( "tail" );

            return ImmutableList<T>.Cons( head, tail );
        }

        // concatenates two lists
        public static ImmutableList<T> Concat<T>(
            this ImmutableList<T> first,
            ImmutableList<T> second ) {
            if ( null == first ) throw Errors.ArgumentIsNull( "first" );
            if ( null == second ) throw Errors.ArgumentIsNull( "second" );

            if ( first.IsEmpty ) return second;
            return first.Head.Cons( first.Tail.Concat( second ) );
        }

        // filters a list with a predicate
        public static ImmutableList<T> Where<T>(
            this ImmutableList<T> src,
            Func<T, bool> pred ) {
            if ( null == src ) throw Errors.ArgumentIsNull( "src" );
            if ( null == pred ) throw Errors.ArgumentIsNull( "pred" );

            return src.WhereCore( pred );
        }

        private static ImmutableList<T> WhereCore<T>(
            this ImmutableList<T> src,
            Func<T, bool> pred ) {
            if ( src.IsEmpty ) return src;

            var head = src.Head;
            if ( pred( head ) ) {
                return head.Cons( src.Tail.WhereCore( pred ) );
            } else {
                return src.Tail.WhereCore( pred );
            }
        }

        // quicksorts a list
        public static ImmutableList<T> QuickSort<T>(
            this ImmutableList<T> src )
            where T : IComparable<T> {
            return src.QuickSort( ( x, y ) => x.CompareTo( y ) );
        }

        // quicksorts a list
        public static ImmutableList<T> QuickSort<T>(
            this ImmutableList<T> src,
            Func<T, T, int> compare ) {
            if ( null == src ) throw Errors.ArgumentIsNull( "src" );
            if ( null == compare ) throw Errors.ArgumentIsNull( "compare" );

            return src.QuickSortCore( compare );
        }

        private static ImmutableList<T> QuickSortCore<T>(
            this ImmutableList<T> src,
            Func<T, T, int> compare ) {
            if ( src.IsEmpty ) return src;

            var pivot = src.Head;
            var tail = src.Tail;
            return tail.Where( x => compare( x, pivot ) < 0 )
                       .QuickSortCore( compare )
                       .Concat(
                           pivot.Cons(
                               tail.Where( x => compare( x, pivot ) >= 0 )
                                   .QuickSortCore( compare ) ) );
        }
    }

    // string resources
    internal static class Strings {
        public static readonly string ListIsEmpty = "the list is empty";
    }

    // exception resources
    internal static class Errors {
        public static InvalidOperationException ListIsEmpty {
            get { return new InvalidOperationException( Strings.ListIsEmpty ); }
        }

        public static ArgumentNullException ArgumentIsNull( string paramName ) {
            return new ArgumentNullException( paramName );
        }
    }

    static class Program {
        static void Main( string[ ] args ) {
            var list = ImmutableList.Of( 3, 1, 2, 5, -1, 2, 0 );
            list = list.QuickSort( );
            foreach ( var i in list ) {
                Console.WriteLine( i );
            }
        }
    }
}

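(Edit: I just saw the reference answer Lao Zhao newly posted and realized I shouldn't have used NotImplementedException. What I originally typed out of habit was UnsupportedOperationException, but after typing it into VS2008 it didn't get highlighted, so I knew something was off; then my hand slipped while picking an exception type from the list T T It is now changed to InvalidOperationException.)

Oh, and I mentioned at the start using a generic constraint on QuickSort<T>, but I still wanted to stay close to the template Lao Zhao provided, so I added a version without the constraint as well, which is how it ended up like this.
One more thing, which careful readers will have spotted quickly: the way I concatenate lists in QuickSortCore<T>() differs from the original Haskell code. Translated back into Haskell, my version would look something like this: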

qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ (x : qsort (filter (>= x) xs))

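It just drops the two conses hidden in ++ [x] ++ in exchange for one explicit cons, nothing major; the result is the same.
(The first cons is the [x] in ++ [x] ++, which is shorthand for x : [] and so contains one cons;
the second is in [x] ++ ..., where x has to be taken apart and then concatenated with the rest, which involves another cons.)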
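
A quick way to convince yourself that the two forms agree is to run them side by side. In this sketch qsortA is the original definition and qsortB is the variant with the explicit cons; both names are made up here.

```haskell
-- Original form: the pivot is wrapped in a singleton list and appended.
qsortA :: Ord a => [a] -> [a]
qsortA []     = []
qsortA (x:xs) = qsortA (filter (< x) xs) ++ [x] ++ qsortA (filter (>= x) xs)

-- Variant: the pivot is consed directly onto the sorted right-hand part.
qsortB :: Ord a => [a] -> [a]
qsortB []     = []
qsortB (x:xs) = qsortB (filter (< x) xs) ++ (x : qsortB (filter (>= x) xs))

main :: IO ()
main = do
  let input = [3, 1, 2, 5, -1, 2, 0] :: [Int]
  print (qsortA input)                    -- [-1,0,1,2,2,3,5]
  print (qsortB input)                    -- [-1,0,1,2,2,3,5]
  print (qsortA input == qsortB input)    -- True
```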

I'm mentally prepared to get flamed, so bring it on =v=
In Lao Zhao's thread, 装配脑袋 has already posted an IEnumerable<T>-based answer, such as:

static Func<T, T> Fix<T>(Func<Func<T, T>, Func<T, T>> f) {
    return x => f(Fix(f))(x); 
}

var qsort = Fix<IEnumerable<int>>(f => l =>
    l.Any() ?
    f(l.Skip(1)
       .Where(e => e < l.First()))
    .Concat(Enumerable.Repeat(l.First(), 1))
    .Concat(f(l.Skip(1)
               .Where(e => e >= l.First()))) :
    Enumerable.Empty<int>());

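Heh, even the Y combinator made an appearance. When Lao Zhao said this was not the intent of the problem, I think he may have meant that he wanted to see more of what is characteristic of the list structure: a list has to be assembled from back to front.
Also, although my ImmutableList<T> implements IEnumerable<T> too and can be used with the code above, the result would no longer be the original list type but some other IEnumerable<T> implementation, which is a bit unsatisfying; after all, the return type of the original Haskell code is a list as well.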

2009-08-31 08:53

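#2 RednaxelaFX 2009-09-01

liujinmarshall wrote:

There's also a backup in Google Reader, so no need to worry

Senior T T
But I don't use Google Reader; if I wanted to retrieve the backup there myself, how would I do it?
Besides, posting on JavaEye is a hassle if you haven't kept the original BBCode... I don't get along well with the WYSIWYG editor

#1 liujinmarshall 2009-09-01

There's also a backup in Google Reader, so no need to worry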
