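Judging from this article's analysis, Googlebot seems to first assess the structure and size of a site and only then plan its crawl; its behaviour is genuinely interesting. Yahoo's bot appears to refresh on a monthly cycle, fetching new pages and indexing them, apparently trying to win on volume without much further analysis of the pages. MSNbot seems to lag somewhat behind its two competitors overall.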
Introduction
In the previous edition, Binary Search Tree 2, a large-scale experiment on search engine behaviour was staged with more than two billion different web pages. The experiment lasted exactly one year, until April 13th. In this period the three major search engines requested more than one million pages of the tree, from more than a hundred thousand different URLs. The home page of drunkmenworkhere.org grew from 1.6 kB to over 4 MB due to the visit log and the comment spam displayed there.
This edition presents the results of the experiment.
Setup
2,147,483,647 web pages ('nodes') were numbered and arranged in a binary search tree. In such a tree, the branch to the left of each node contains only values less than the node's value, while the right branch contains only values higher than the node's value. So the leftmost node in this tree has value 1 and the rightmost node has value 2,147,483,647.
The depth of the tree is the number of nodes you have to traverse from the root to the most remote leaf. Since you can arrange $2^{n+1} - 1$ numbers in a tree of depth $n$, the resulting tree has a depth of 30 ($2^{31}$ = 2,147,483,648). The value at the root of the tree is 1,073,741,824 ($2^{30}$).
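As a quick check of these numbers, the following Python sketch computes the node count for a given depth and walks from the root down to a target value. It assumes the tree is perfectly balanced (consistent with the root value of $2^{30}$ quoted above); the function names `node_count` and `path_to` are illustrative and are not taken from the original site.

```python
def node_count(depth: int) -> int:
    """Number of values that fit in a perfectly balanced tree of the given depth."""
    return 2 ** (depth + 1) - 1


def path_to(value: int, depth: int = 30) -> list[int]:
    """Return the node values visited when searching for `value` from the root.

    Assumes a perfectly balanced binary search tree holding 1 .. 2**(depth+1) - 1,
    so the root is 2**depth and each step halves the distance to the children.
    """
    if not 1 <= value <= node_count(depth):
        raise ValueError("value is outside the tree")
    node = 2 ** depth
    step = node // 2
    path = [node]
    while node != value:
        # Go right for larger values, left for smaller ones.
        node += step if value > node else -step
        step //= 2
        path.append(node)
    return path


if __name__ == "__main__":
    print(node_count(30))        # total number of nodes in the experiment
    print(path_to(1)[0])         # value at the root
    print(len(path_to(1)) - 1)   # number of links followed to reach node 1
```

Reaching the leftmost or rightmost node takes one link per level, which is why a crawler that only follows links needs 30 hops to get from the root to node 1.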
For each page the traffic of the three major search bots (Yahoo! Slurp, Googlebot and msnbot) was monitored over a period of one year (between 2005-4-13 and 2006-4-13).
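The per-bot figures in this report boil down to two counts: total pageviews and distinct nodes requested. A minimal sketch of that bookkeeping follows; the `(bot, node)` record format and the sample data are assumptions made for illustration, since the original logging code is not shown.

```python
from collections import Counter, defaultdict


def summarise(requests):
    """Count pageviews and distinct nodes per bot.

    `requests` is an iterable of (bot_name, node_value) pairs; this record
    format is hypothetical and only serves the example.
    """
    pageviews = Counter()
    nodes = defaultdict(set)
    for bot, node in requests:
        pageviews[bot] += 1
        nodes[bot].add(node)
    return {bot: (pageviews[bot], len(nodes[bot])) for bot in pageviews}


if __name__ == "__main__":
    sample = [
        ("Yahoo! Slurp", 1073741824),
        ("Yahoo! Slurp", 536870912),
        ("Yahoo! Slurp", 1073741824),  # repeat visit: one more pageview, no new node
        ("Googlebot", 1073741824),
    ]
    for bot, (views, distinct) in summarise(sample).items():
        print(f"{bot}: {views} pageviews, {distinct} distinct nodes")
```

Keeping the two counts separate matters here because a bot that revisits the same pages raises its pageviews without crawling any new nodes.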
To make the content of each page more interesting for the search engines, the value of each node is written out in American English (short scale) and each page request from a search bot is displayed in reverse chronological order. To enrich the zero-content even more, a comment box was added to each page (it was removed on 2006-4-13). These measures were improvements over the initial Binary Search Tree, which uses inconvenient long URLs.
(Chinese translation source, to be credited when reposting: blog.csdn.net/uoyevoli www.farproc.com)
Every node shows an image of three trees. Each tree in the image visualises which nodes are crawled by each search engine. Each line in the image represents a node, the number of times a search bot visited the node determines the length of the line. The tree images below are modified large versions of the original image, without the very long root node and with disconnected (wild) branches.
Overall results
From the start Yahoo! Slurp was by far the most active search bot. In one year it requested more than one million pages and crawled more than a hundred thousand different nodes. Although this is a large number, it is still only 0.0049% of all nodes. The overall statistics of all bots are shown in the table below.
 | Yahoo! Slurp | Googlebot | msnbot |
Pageviews | 1,030,396 | 20,633 | 4,699 |
Nodes crawled | 105,971 | 7,556 | 1,390 |
Share of all nodes crawled | 0.0049% | 0.00035% | 0.000065% |
Pages reported in index | 120,000 | 554 | 1 |
Indexed / crawled | 113.23% | 7.33% | 0.07% |
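The derived rows of the table can be reproduced from the raw counts. The sketch below recomputes the share of the tree each bot crawled and the ratio of indexed pages to crawled nodes. The variable and function names are illustrative, and the reading of the rows as crawled nodes and indexed pages is inferred from the surrounding text.

```python
TOTAL_NODES = 2**31 - 1  # 2,147,483,647 nodes in the tree

# (nodes crawled, pages reported in the index) per bot, taken from the table
stats = {
    "Yahoo! Slurp": (105_971, 120_000),
    "Googlebot": (7_556, 554),
    "msnbot": (1_390, 1),
}


def crawled_share(crawled: int) -> float:
    """Percentage of all nodes that were crawled."""
    return 100 * crawled / TOTAL_NODES


def indexed_ratio(crawled: int, indexed: int) -> float:
    """Indexed pages as a percentage of crawled nodes."""
    return 100 * indexed / crawled


if __name__ == "__main__":
    for bot, (crawled, indexed) in stats.items():
        print(f"{bot}: {crawled_share(crawled):.6f}% crawled, "
              f"{indexed_ratio(crawled, indexed):.2f}% indexed/crawled")
```

The recomputed values may differ from the table in the last digit, for example because the reported index size of 120,000 is itself a rounded figure.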
The growth of the number of pageviews and the number of crawled nodes over the year the experiment lasted is shown in figures 1 and 2. The way the bots crawled the tree is visualised in detail with the animations for each bot in the sections below.
Fig. 1 - The cumulative number of pageviews by the search bots over time.
Fig. 2 - The cumulative number of nodes crawled by the search bots over time.
The graph below (fig. 3) shows how many nodes of each level of the tree were crawled by the bots (on a logarithmic scale). The root of the tree is at level 0, while the most remote nodes (e.g. node 1) are at level 30. Since there are $2^n$ nodes at level $n$ (there is only 1 root and there are $2^{30}$ nodes at level 30), crawling the entire tree would result in a straight line.
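Under the assumption that the tree is perfectly balanced, the level of a node can be read off its value: the root $2^{30}$ has 30 trailing zero bits and sits at level 0, and each level further down loses one trailing zero. The helper names below are illustrative.

```python
import math

DEPTH = 30  # level of the deepest nodes


def level_of(value: int) -> int:
    """Level of a node in a perfectly balanced tree over 1 .. 2**(DEPTH+1) - 1."""
    if not 1 <= value < 2 ** (DEPTH + 1):
        raise ValueError("value is outside the tree")
    # value & -value isolates the lowest set bit; its position is the
    # number of trailing zero bits.
    trailing_zeros = (value & -value).bit_length() - 1
    return DEPTH - trailing_zeros


def nodes_at_level(level: int) -> int:
    """Number of nodes at a given level."""
    return 2 ** level


if __name__ == "__main__":
    for value in (2 ** 30, 2, 1):
        print(value, level_of(value))
    # On a logarithmic axis a fully crawled tree is a straight line:
    # log2 of the node count grows by exactly 1 per level.
    print([math.log2(nodes_at_level(n)) for n in range(4)])
```

Grouping a crawl log by `level_of(node)` and counting distinct nodes per level would give the kind of per-level totals plotted in fig. 3.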
Fig. 3 - The number of nodes crawled after 1 year, grouped by node level.
Google closely follows this straight line until it breaks down after level 12. Most nodes at level 12 or less were crawled (5524 out of the $2^{13} - 1$ = 8191 nodes at levels 0 to 12), but only very few nodes at higher levels were crawled by Googlebot. MSN shows similar behaviour, but breaks down much earlier, at level 9 (656 out of 1023 nodes were crawled). Yahoo, however, does not break down; at high levels it only gradually fails to request all nodes.
The nodes at high levels that were crawled by Yahoo were requested quite often compared to the other bots: at levels 14 to 30 each page was requested 10 times on average (see fig. 4).
Fig. 4 - The average number of pageviews per node after 1 year, grouped by node level.
Yahoo! Slurp
- large version (4273x3090, 1.5MB)
- animated version over 1 year (2005-04-13 - 2006-04-13, 13MB)
- animated version of the first 2 hours (2006-04-14 00:40:00-02:40:00, 2.2MB)
Fig. 5 - The Yahoo! Slurp tree.
Yahoo! Slurp was the first search engine to discover Binary Search Tree 2. In the first hours after discovery it crawled the tree vigorously, at a speed of over 2.3 nodes per second (see the short animation). On the first day it crawled approximately 30,000 nodes.
In the following month Slurp's activity was low, but after exactly one month it requested all pages it had visited before, for the second time. In the animation you can see the size of the tree double on 2005-05-14. This phenomenon is repeated a month later: on 2005-06-13 the tree grows to three times its original size. The number of pageviews is then almost 90,000 while the number of crawled nodes is still 30,000. Figure 6 shows this stepwise increment in the number of pageviews during the first months.
Fig. 6 - The cumulative number of pageviews by Yahoo! Slurp over time.
After four months Slurp requested a large number of 'new' nodes, for the first time since the initial round. It simply requested all URLs it had. Since it had already indexed 30,000 pages, each of which links to two pages at a deeper level, it requested 60,000 pages at the end of August (the number of pageviews jumps from 100,000 to 160,000 in fig. 6) and it doubled the number of nodes it had crawled (see fig. 7).
After 5 months Yahoo! Slurp started requesting nodes more regularly. It still had periods of 'discovery' (e.g. after 10 months).
Fig. 7 - The cumulative number of nodes crawled by Yahoo! Slurp over time.
Yahoo reported 120,000 pages in its index (current value). This may seem impossible since it only visited 105,971 nodes, but every node is available on two different domain names: www.drunkmenworkhere.org and drunkmenworkhere.org.
Note: the query submitted to Google and MSN yielded 35,600 pages on Yahoo. Yahoo is the only search engine that returns results with the query used above.
Googlebot
- large version (4067x4815, 180kB)
- animated version (2005-04-13 - 2006-04-13, 1.2MB)
Fig. 8 - The Googlebot tree.
In comparison with Yahoo's tree, Google's tree looks more like a natural tree. This is because Google visited nodes at deeper levels less frequently than their parent nodes. Yahoo only visited the nodes at the first three levels more frequently, while Google did so for the first 12 levels (see fig. 4).
The form of the tree follows from Google's PageRank algorithm. PageRank is defined as follows:
"We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) "
Since most nodes in the tree are not linked to by other sites, the PageRank of a node can be calculated with this formula (ignoring links in the comments):
PR(node) = 0.15 + 0.85 (PR(parent) + PR(left child) + PR(right child))/3
The only unknown when applying this formula iteratively is the PageRank of the root node of the tree. Since this node was the homepage of drunkmenworkhere.org for a year, a high rank may be assumed. The calculated PageRank tree (fig. 9) shows proportions similar to Googlebot's real tree, so the frequency of visiting a page seems to be related to the PageRank of a page.
Fig. 9 - A binary tree of depth 17 visualising calculated PageRank as the length of each line, when the PageRank of the root node is set to 100.
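The calculation behind fig. 9 can be sketched as follows. Because the simplified formula treats every node the same way, all nodes on one level share the same value, so it is enough to track one PageRank per level. This is a sketch under assumptions that are not stated in the original: the root is pinned at 100, the tree is cut off at depth 17, the missing children of the deepest nodes contribute 0, and a fixed number of sweeps stands in for a convergence test. The function name `pagerank_by_level` is illustrative.

```python
def pagerank_by_level(depth: int = 17, root_rank: float = 100.0,
                      d: float = 0.85, sweeps: int = 200) -> list[float]:
    """Iterate PR(node) = (1-d) + d*(PR(parent) + PR(left) + PR(right))/3.

    Returns one value per level, index 0 being the root. By symmetry the two
    children of a node share the same value, so PR(left) + PR(right) is
    twice the value of the next level.
    """
    pr = [root_rank] + [1.0] * depth
    for _ in range(sweeps):
        new = pr[:]  # the root (index 0) stays pinned at root_rank
        for level in range(1, depth + 1):
            # Assumed cut-off: nodes at the last level have no children.
            child = pr[level + 1] if level < depth else 0.0
            new[level] = (1 - d) + d * (pr[level - 1] + 2 * child) / 3
        pr = new
    return pr


if __name__ == "__main__":
    for level, rank in enumerate(pagerank_by_level()):
        print(f"level {level:2d}: PR = {rank:.3f}")
```

With these settings the values drop steeply over the first few levels and change little after that, apart from the artificial cut-off at the deepest level.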
The animation of the Googlebot tree shows some interesting erratic behaviour that cannot be explained with PageRank.
A few hours later, Googlebot crawled node 2, which is linked as a parent node by node 1. These two nodes are displayed as a tiny dot in the animation on 2005-06-30, floating above the left branch. Then, a week later, on 2005-07-06 (two days after the attempt to find the rightmost node), between 06:39:39 and 06:39:59, Googlebot found the path to these disconnected nodes by visiting the 24 missing nodes in 20 seconds. It started at the root and found its way up to node 2 without selecting a right branch. In the large version of the Googlebot tree, this path is clearly visible. The nodes halfway along the path were not requested a second time and are represented by thin short line segments, hence the steep curve.
This subtree is the reason the number of nodes crawled by Googlebot, grouped by level, increases again from level 18 to level 30 in fig. 3.
Over the last six months Googlebot requested pages at a fixed rate (about 260 pages per month, fig. 10). Like Yahoo! Slurp it seems to alternate between periods of discovery (see fig. 11) and periods of refreshing its cache.
Fig. 10 - The cumulative number of pageviews by Googlebot over time.
Fig. 11 - The cumulative number of nodes crawled by Googlebot over time.
Google returned 554 results when searching for nodes. The first nodes reported by Google are nodes 1 and 2, which are very deep inside the tree at levels 30 and 29 respectively. Their higher rank is also reflected in the curve shown above (Searching node 1), which indicates a high number of pageviews. They probably appear first because of their short URLs. The other nodes on the first result page are all at level 4, probably because the first three levels are penalised because of comment spam. The current number of results can be checked here.