序列“乱序数”的求解（更优的算法求解）

CalvinMnakor

浏览: 52485 次
性别:
来自: 上海

最近访客更多访客>>

poterban

draculav

lirihong

粉墨登场A380

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

小程序/算法

算法 J#

题目论述：

引用

两个序列由相同的字母组成，只是次序不同
两个序列的“乱序”是指
任在一个序列中选择两个元素，看在另一个序列中是否顺序不同
问题是如何数出总共有多少“乱序”

比如有两个序列
序列1为：
1，2，4，3
序列2为
2，4，1，3

序列1中（1，2）在2中是（2，1），这就是一个乱序；（1，4）在2中式（4，1）还是乱序
；逐个检查过去，数一数，就是两个序列的“乱序数”

以下是我的一些看法，可能有些疏忽或差错，还望知道。

引用

个人看法：
本题和最大字段和有点相像
法一：每一对进行判断(还有先比对其是不是相同二元组)
        s1(i,j);
        s2(i,j);
        for(;i<N;)
           for(j=i+1;j<N;)
              for(;m<N;)
                 for(n=m+1;n<N;) test(s1(i,j),s2(m,n));
        ★时间复杂度O（n^4）

       上述算法，简单的修改一下，就可稍微减少复杂度。
        for(;i<N;)
           for(j=i+1;j<N;)
              for(;m<N;) s2[m]=s1[i];m;
              for(;n<N;) s2[n]=s1[j];n;
              test(i,j,m,n);
        ★时间复杂度O(n^3),Simple()

法二、要达到O（nlogn）,一般情况下采用分治算法，将规模每次缩小至原来的一半。

      分治法的相关定义：
      即T(1) = O(n)
        T(n) = 2T(n/2) + O(n)
      when n=2^i
        then T(n) = O(nlogn)

      So,here is more thinking
      把是s（0，n）分成s（0，n/2)和s（n/2,n)
      s1和 s2 same divided into two halves,and compare each half by itself
      for the middle one,we use this:
          question:s1(l,u) and s2(l,u) and middle = (l+u)/2
          此部分的二元组个数为：n^2 （n指当前分半后的规模），即（n/2）2


      由此看来，运算的规模还是没有有效地降低下来，我们应该想想其他办法。
      。。。。。
      。。。。。
      想想，应该是二元组比对的效率在影响，所以应该从这方面入手
            其二，大概有n^2个二元组，所以应该至少n^2次比较，而并非上述的（nlogn)

法三、以尾字符为标志
      for(i=1;N;)
          setA:[0...i-1]
          for(j;N;)
              s2[j] == s1[i] break;
          setB:[0...j-1]
          setA-setB;
      上述主要时间复杂度在setA-setB上，即是求出两集合（差）运算。可以先排序，
      使用一般的插入排序，耗时nlogn，然后以n时间集合运算。综上所述，
       ★时间复杂度O(n^2logn),OutofOrder()

法四、位置标志法
      此法是对法一的改进
      算法描述：
          tmp1[];辅助数组
          tmp2[];辅助数组
          for(;N;)
             for(;N;) test(tmp1[i],tmp1[j],tmp2[i],tmp2[j]);
       ★时间复杂度O(n^2),Best()

其中，带★的后三种我都实现了，下面逐一论述：
原始序列的产生不再赘述（为简化，简单的为0到n间的数，就这一点后面还有论述，关于排序部分）；
算法一：

int Simple(int *s1,int *s2,int n)
{
	//improved simple method
	int i,j,p,q,sum;
	sum = 0;
	for (i = 0;i < n-1;++ i)
	{
		for (j = i+1;j < n;++ j)
		{
			//find the first number in s2[]
			for (p = 0;p < n;++ p)
				if (s2[p] == s1[i]) break;
			//find the second number in s2[]
			for (q = 0;q < n;++ q)
				if (s2[q] == s1[j]) break;
            if (!test(i,j,p,q)) sum ++;
		}
	}
	return sum;
}

其中test（）函数是判断四个元素是否同一顺序,即升序，降序，以此决定两元组是否顺序相同。上述时间复杂度位O(n^3).

算法二：

int OutofOrder(int *setA,int *setB,int n)
{
	//sum up the out order number
	int i,j,k,sum,*tmp1,*tmp2;
	tmp1 = new int [n];
	tmp2 = new int [n];
	sum = 0;
	for (i = 1;i < n;++ i)
	{
		for (j = 0;j < n;++ j)
			if (setB[j] == setA[i]) break;
		//copy the temp array for summation
		for (k = 0;k < i;++ k) tmp1[k] = setA[k];
		for (k = 0;k < j;++ k) tmp2[k] = setB[k];
		InsertSort(tmp1,i);
		InsertSort(tmp2,j);
		sum += Margin(tmp1,0,i,tmp2,0,j);//i,j not included
	}
	delete [] tmp1;
	delete [] tmp2;
	return sum;
}

上述Margin（）函数就是下面的集合差集的运算：

int Margin(int *setA,int a1,int a2,int *setB,int b1,int b2)
{
	//compute the balance number
	if (a1 == a2) return 0;
	if (b1 == b2) return a2 - a1;
    if (setA[a1] == setB[b1]) return Margin(setA,a1+1,a2,setB,b1+1,b2);
	else if (setA[a1] < setB[b1]) return 1+Margin(setA,a1+1,a2,setB,b1,b2);
	else return Margin(setA,a1,a2,setB,b1+1,b2); 
}

此算法的时间复杂度最好保持在O（n^2logn），nlogn就是上面的集合做差集的时间复杂度。

算法三：

int Best(int *s1,int *s2,int n)
{
	//improved method using T(n^2)
    int i,j,*tmp1,*tmp2,sum;
	sum = 0;
	//suppose that all these number are in sequence from 1...n
	tmp1 = new int [n];
	tmp2 = new int [n];
	SignPlace_2(s1,tmp1,n);
	SignPlace_2(s2,tmp2,n);
	for (i = 0;i < n;++ i)
		for (j = i+1;j < n;++ j)
			if (!test(tmp1[i],tmp1[j],tmp2[i],tmp2[j])) sum ++;
	return sum;
}

上面的SignPlace_2（）是对无序数组求解其顺序排序后在原序列位置的函数，比如：
原序列： 1 4 5 2
位置序列：0 3 1 2
关于这个函数，我写了三个版本，一个版本是建立在原序列是不间断的一连串数字组成，所以实现起来很简单，另一版本是一个通用版本，可以忽略上述的限制条件，使用类似插入排序的算法实现。最后一个版本采用空间换取时间的方法，是对版本一的扩展。一下是相应代码:

void SignPlace_2(int *s,int *tmp,int n)
{
	//numbers are not in sequence
	//O(nlogn)
	int i,j,temp,*p;
	p = new int [n];
	//copy s[] to p[]
	for (i = 0;i < n;++ i) p[i] = s[i];
	tmp[0] = 0;
	for (i = 1;i < n;++ i)
	{
		//the current palce i
		temp = p[i];
		for (j = i;j > 0&&temp < p[j-1];-- j)
		{
			p[j] = p[j-1];
			tmp[j] = tmp[j-1];
		}
		p[j] = temp;
		tmp[j] = i;
	}
}

void SignPlace_3(int *s,int *tmp,int n)
{
	//use more place for less time
    //O(SIZE)
	int i,j,*p;
	p = new int [SIZE];
	j = 0;
    for (i = 0;i < SIZE;++ i) p[i] = -1;
	for (i = 0;i < n;++ i) p[s[i]] = i;
	for (i = 0;i < SIZE;++ i) 
		if (p[i] != -1) tmp[j++] = p[i];
	delete [] p;
}

此算法时间复杂度达到n^2（如果SIZE<n*n的话）。

         综述，实验的测试结果(n=1000)：
Simple：       2433
OutofOrder:     671
Best:                  15
单位ms.

         或许还有更优的算法来实现这一问题，我也在继续思考，不过，个人感觉既然是比对二元组，而二元组共有n^2级别个，时间复杂度总要在n^2以上吧？但是其他一些算法，比如最大字段和、字符串匹配等，不都越过了数量这一坎了嘛。

1
顶

0
踩

分享到：

UltraEdit-32 v16.00.1032 key注册码 | 几个小算法问题

2010-04-09 23:40
浏览 2697
评论(0)
分类:编程语言
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论