I came across this article through a recommendation in the Week 17, 2008 RoundUp on aigamedev.com. Out of my own interest in Chinese chess and computer game playing, I read it fairly carefully even though I know next to nothing about Go or Go AI, and I found that its content differs from what I previously knew about computer game playing. So I translated it, mainly to help myself understand the material better. My English is poor; if you spot translation mistakes, please let me know.
Also, the author appears to be Chinese. I hope he will someday write about his research in Chinese as well.
====================================
Monte Carlo Method in Game AIs
Friday, April 25th, 2008 12:28 pm
Written by: qqqchn
Translated by: 赖勇浩 (恋花蝶)
Original article: http://expertvoices.nsdl.org/cornell-cs322/2008/04/25/monte-carlo-method-in-game-ais/
As many of my classmates have posted, the Monte Carlo method isn't actually any single method, but rather an entire class of methods that take random samples to estimate a result. An interesting application my partner and I found for the Monte Carlo method was in one of the Go AIs we built for another of our projects. (Go is an ancient Chinese board game that is still very popular in East Asia today; the rules and details can be found here.)
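To make "taking random samples to find a result" concrete, here is a minimal, generic illustration of my own (not from the original article): the classic Monte Carlo estimate of pi, which averages over random points in the unit square.

```python
# Generic Monte Carlo illustration (not from the article): estimate pi by
# sampling random points in the unit square and counting how many fall
# inside the quarter circle of radius 1.
import random

def estimate_pi(samples=100_000):
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples

print(estimate_pi())  # approaches 3.14159... as the sample count grows
```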
One of the reasons we chose the Monte Carlo method was that the immense number of possible moves in Go makes the Minimax algorithm (one of the more common methods for finding the next "best" move in many game AIs such as chess, which alternately maximizes and minimizes the score for a player up to a certain depth; more details here) far too computationally intensive when looking more than 2 or 3 moves ahead (looking only 4 moves ahead on a mere 9×9 board takes about 81^4 > 4 million board evaluations). An interesting quote illustrating the computational intensity of Go on a full 19×19 board is that "the number of possible Go games far exceeds the number of atoms in the universe" (more details and derivation here). Interesting facts: the lower bound on the number of possible Go games on a 19×19 board is about 10^(10^48); the upper bound is 10^(10^171).
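For reference, here is a minimal depth-limited minimax sketch of my own (the article gives no code). The game is passed in through three callables, so the toy demo at the bottom is just an arbitrary counter game; for Go the branching factor of roughly 81 legal moves on an empty 9×9 board is what makes even depth 4 so expensive.

```python
# Minimal depth-limited minimax (illustrative sketch, not the article's code).
# The game is supplied through three callables so the search itself stays generic.
def minimax(state, depth, maximizing, legal_moves, apply_move, evaluate):
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)            # static evaluation at the leaves
    scores = (minimax(apply_move(state, m), depth - 1, not maximizing,
                      legal_moves, apply_move, evaluate) for m in moves)
    return max(scores) if maximizing else min(scores)

# Toy demo: players alternately add 1 or 2 to a counter; the score is the counter.
print(minimax(0, 4, True,
              lambda s: [1, 2],        # legal moves
              lambda s, m: s + m,      # apply a move
              lambda s: s))            # evaluation -> prints 6

# With a branching factor b and depth d, full-width search visits on the order
# of b**d leaves; on 9x9 Go, b is about 81, so depth 4 means 81**4 (~43 million),
# which is indeed far more than 4 million.
```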
So another way we used to evaluate how "good" a move is was the Monte Carlo method. What the Monte Carlo method does in this case, to estimate how good or bad a certain move is for a given board position, is to play "virtual games" showing what would happen if two random AIs (AIs playing completely at random) played those moves out. It starts from the given board position and, for each viable move, plays a fixed number of games in which all subsequent moves are completely random. After all of the "virtual games" are finished, we average the total scores of the games spawned by each original move and let that average represent the "goodness" of that move. Finally, by choosing the move with the highest average score, the Monte Carlo AI plays that move in the actual game, based on the assumption that moves which score better over a large number of random games are "better" moves in general.
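The move-evaluation loop described above might look roughly like the following sketch. This is my reconstruction, not the authors' code; legal_moves, apply_move and playout_score are hypothetical helpers standing in for a concrete Go implementation (playout_score plays the rest of the game with uniformly random moves and returns the final score for the given player).

```python
def monte_carlo_move(board, player, legal_moves, apply_move, playout_score,
                     games_per_move=500):
    """Pick the move whose random playouts score best on average.

    legal_moves(board, player)      -> list of candidate moves
    apply_move(board, move, player) -> new board position
    playout_score(board, player)    -> final score for `player` after the game
                                       is played out with uniformly random moves
    (Placeholder helpers; the original article gives no code.)
    """
    best_move, best_avg = None, float("-inf")
    for move in legal_moves(board, player):
        position = apply_move(board, move, player)
        # Average the scores of many random continuations from this position.
        total = sum(playout_score(position, player) for _ in range(games_per_move))
        avg = total / games_per_move
        if avg > best_avg:
            best_move, best_avg = move, avg
    return best_move
```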
For our project, we let our AI play about 500 virtual games for each move, which on slower computers can take a while, but it is still far faster than trying to use the Minimax algorithm to look ahead just 4 moves (just over 1 million evaluations compared to more than 4 million). In addition, the results of the Monte Carlo AI are pretty good: it can generally defeat most of our other AIs (a Minimax AI looking 2 or 3 moves ahead, and random AIs), and it even put up a decent fight against some beginner human players.
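One plausible reading of that comparison, as back-of-the-envelope arithmetic of my own rather than the authors' exact accounting: with roughly 81 candidate moves on a 9×9 board and 500 random games per candidate, the Monte Carlo AI plays about 40,500 playouts, and if each playout makes on the order of a few dozen random moves the total work stays around a million position updates, versus the tens of millions of leaf positions a depth-4 full-width minimax would visit.

```python
# Back-of-the-envelope comparison (my numbers; the playout length is assumed).
candidate_moves   = 81    # upper bound on legal moves on an empty 9x9 board
games_per_move    = 500
moves_per_playout = 25    # assumed average length of a random playout

monte_carlo_work = candidate_moves * games_per_move * moves_per_playout
minimax_work     = candidate_moves ** 4   # depth-4 full-width search

print(monte_carlo_work)   # 1,012,500  -> "just over 1 million"
print(minimax_work)       # 43,046,721 -> "> 4 million" (about 43 million)
```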
Worth noting is that one very important factor in how well the Monte Carlo method works in this case is the scoring function used to decide a player's score for a given board position. The one we used is very straightforward and relatively simple: it assigns an empty point to whoever has the closest stones to that point, with ties broken by the number of stones near it. This isn't the most accurate or effective scoring method, but it worked well enough for our purposes.
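A sketch of that simple area-scoring idea, again my own reconstruction rather than the authors' code (the board representation, the Manhattan distance, and the "near" radius are all assumptions of mine):

```python
# Each empty point goes to whichever colour has the nearest stone; ties on
# distance are broken by how many stones each colour has nearby.
def score(board, size):
    """board: dict mapping (row, col) -> 'B' or 'W' for occupied points."""
    totals = {"B": 0, "W": 0}
    for colour in board.values():
        totals[colour] += 1                       # stones count for their owner
    for r in range(size):
        for c in range(size):
            if (r, c) in board:
                continue
            dist = {"B": float("inf"), "W": float("inf")}
            near = {"B": 0, "W": 0}
            for (sr, sc), colour in board.items():
                d = abs(sr - r) + abs(sc - c)     # Manhattan distance to the stone
                dist[colour] = min(dist[colour], d)
                if d <= 2:                        # "near" radius is an assumption
                    near[colour] += 1
            if dist["B"] < dist["W"]:
                totals["B"] += 1
            elif dist["W"] < dist["B"]:
                totals["W"] += 1
            elif near["B"] != near["W"]:          # tie on distance
                totals["B" if near["B"] > near["W"] else "W"] += 1
            # exact ties award the point to neither side
    return totals
```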
The AI we developed using Monte Carlo methods was one of the better AIs we made, but it is still nowhere near the capabilities of a reasonably experienced amateur human player. In particular, the AI starts losing out near the end game, when tactics matter much more than the overall strategy that Monte Carlo and Minimax seem to do well at. And the fact that we use random moves to play each "virtual game" means we can get very different results each time we run it, especially near the end game, where the outcome of a move really depends on the quality of the subsequent moves, which in this case are completely random.
Go is considered by many to be the most complicated game we know of to date, and it is very unlikely that we will come even marginally close to solving the game anytime soon (want to even try writing out 10^(10^48)?). But it seems equally unlikely that people will give up trying anytime soon either, as human tenacity in the face of other "insurmountable" odds has shown in the past (landing on the Moon, for instance).
NOTE: when I say "random" in this post, I naturally mean the pseudorandom number generators computers use, which aren't truly random, but were more than close enough for our project.
CITATIONS:
GO AI Project, CS478 (Gordon Briggs, Qin Chen) - unfortunately not finished yet, so there are no statistics to cite.