Posted: 2010-11-02
Last modified: 2010-11-02
Wikipedia provides an API that lets us work with its content. The API documentation is at: http://en.wikipedia.org/w/api.php
Some common uses:
1. Full-text search:
http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=fluoxetine
srsearch is the text to search for.
Result:
<?xml version="1.0"?>
<api>
  <query>
    <searchinfo totalhits="224" />
    <search>
      <p ns="0" title="Fluoxetine" snippet="&lt;span class='searchmatch'&gt;Fluoxetine&lt;/span&gt; (also known by the tradenames Prozac, Sarafem) is an antidepressant of the selective serotonin reuptake inhibitor (SSRI) class &lt;b&gt;...&lt;/b&gt; " size="53978" wordcount="7052" timestamp="2010-10-31T23:22:00Z" />
      <p ns="0" title="Olanzapine/fluoxetine" snippet="The drug combination olanzapine/&lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt; (trade name Symbyax, created by Eli Lilly and Company ) is a single capsule containing the &lt;b&gt;...&lt;/b&gt; " size="5703" wordcount="629" timestamp="2010-09-21T09:10:34Z" />
      <p ns="0" title="Sertraline" snippet="Evidence suggests that sertraline may work better than &lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt; (Prozac) for some subtypes of depression. Sertraline is highly &lt;b&gt;...&lt;/b&gt; " size="104510" wordcount="13933" timestamp="2010-10-28T22:13:04Z" />
      <p ns="0" title="Antidepressant" snippet="The first such compound to be patented was zimelidine in 1971, while the first released clinically was indalpine . &lt;span class='searchmatch'&gt;Fluoxetine&lt;/span&gt; was &lt;b&gt;...&lt;/b&gt; " size="128712" wordcount="17532" timestamp="2010-10-30T08:05:06Z" />
      <p ns="0" title="Selective serotonin reuptake inhibitor" snippet="four newer antidepressants (including the SSRIs paroxetine and &lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt; , and two non-SSRI antidepressants nefazodone and venlafaxine ). &lt;b&gt;...&lt;/b&gt; " size="78327" wordcount="10398" timestamp="2010-11-01T00:11:30Z" />
      <p ns="0" title="Paroxetine" snippet="Unlike two other popular SSRI antidepressants, &lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt; and sertraline , paroxetine is associated with clinically significant weight &lt;b&gt;...&lt;/b&gt; " size="48886" wordcount="6491" timestamp="2010-10-31T23:11:12Z" />
      <p ns="0" title="Venlafaxine" snippet="Its efficacy is similar to or better than sertraline (Zoloft) and &lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt; (Prozac), depending on the criteria and rating scales used &lt;b&gt;...&lt;/b&gt; " size="49655" wordcount="6574" timestamp="2010-11-01T00:38:00Z" />
      <p ns="0" title="Olanzapine" snippet="Olanzapine (trade names Zyprexa, Zalasta, Zolafren, Olzapin, Oferta, Zypadhera or in combination with &lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt; Symbyax ) is an atypical &lt;b&gt;...&lt;/b&gt; " size="34028" wordcount="4540" timestamp="2010-10-30T17:45:42Z" />
      <p ns="0" title="Prozac (disambiguation)" snippet="Prozac is a proprietary name for the antidepressant drug &lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt;. Prozac may also refer to: Prozac+ , an Italian punk band &lt;b&gt;...&lt;/b&gt; " size="581" wordcount="78" timestamp="2010-04-23T20:24:31Z" />
      <p ns="0" title="SSRI discontinuation syndrome" snippet="paroxetine having the highest number of withdrawal syndrome reports and &lt;span class='searchmatch'&gt;fluoxetine&lt;/span&gt; the highest number of drug dependence reports; the note &lt;b&gt;...&lt;/b&gt; " size="41099" wordcount="5444" timestamp="2010-09-23T06:19:55Z" />
    </search>
  </query>
  <query-continue>
    <search sroffset="10" />
  </query-continue>
</api>
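The search request and its XML result above can be handled from a script. The following is only a rough sketch using the Python standard library, not an official client: the function names are made up for illustration, `format=xml` is an assumption borrowed from request 4 (the search URL in this post does not pass it), and the sample response is a shortened copy of the result above with the snippet attributes left out. The live API may respond differently today than it did in 2010.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

API_URL = "http://en.wikipedia.org/w/api.php"  # endpoint quoted in this post


def build_search_url(term, offset=None):
    """Build the full-text search URL (hypothetical helper).

    format=xml is an assumption; the URL in the post omits it.
    """
    params = {"action": "query", "list": "search", "srsearch": term, "format": "xml"}
    if offset is not None:
        # sroffset comes from <query-continue> in the previous response
        params["sroffset"] = offset
    return API_URL + "?" + urlencode(params)


def parse_search_response(xml_text):
    """Return (total hits, list of titles, next offset or None)."""
    root = ET.fromstring(xml_text)  # root is the <api> element
    info = root.find("query/searchinfo")
    total = int(info.get("totalhits")) if info is not None else None
    titles = [p.get("title") for p in root.findall("query/search/p")]
    cont = root.find("query-continue/search")
    next_offset = int(cont.get("sroffset")) if cont is not None else None
    return total, titles, next_offset


# Shortened copy of the response shown in the post (snippets omitted).
SAMPLE_SEARCH_XML = """<api>
  <query>
    <searchinfo totalhits="224" />
    <search>
      <p ns="0" title="Fluoxetine" size="53978" wordcount="7052" timestamp="2010-10-31T23:22:00Z" />
      <p ns="0" title="Olanzapine/fluoxetine" size="5703" wordcount="629" timestamp="2010-09-21T09:10:34Z" />
    </search>
  </query>
  <query-continue>
    <search sroffset="10" />
  </query-continue>
</api>"""

if __name__ == "__main__":
    total, titles, next_offset = parse_search_response(SAMPLE_SEARCH_XML)
    print(total, titles, next_offset)
    if next_offset is not None:
        # URL that would request the next batch of results
        print(build_search_url("fluoxetine", offset=next_offset))
```

The sroffset value from query-continue is fed back into the next request, so the 224 hits can be walked in batches of 10.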
2. List Wikipedia categories:
http://en.wikipedia.org/w/api.php?action=query&list=allcategories&acprefix=drug&aclimit=10
Returns 10 categories whose names begin with "drug".
Result:
<?xml version="1.0"?>
<api>
  <query>
    <allcategories>
      <c xml:space="preserve">Drug-induced Suicide</c>
      <c xml:space="preserve">Drug-realted suicides</c>
      <c xml:space="preserve">Drug-related Films</c>
      <c xml:space="preserve">Drug-related Suicides</c>
      <c xml:space="preserve">Drug-related death in California</c>
      <c xml:space="preserve">Drug-related deaths</c>
      <c xml:space="preserve">Drug-related deaths by country</c>
      <c xml:space="preserve">Drug-related deaths in Alabama</c>
      <c xml:space="preserve">Drug-related deaths in Alaska</c>
      <c xml:space="preserve">Drug-related deaths in Arizona</c>
    </allcategories>
  </query>
  <query-continue>
    <allcategories acfrom="Drug-related deaths in Arkansas" />
  </query-continue>
</api>
The acfrom value in query-continue is the category at which the next batch starts; passing it as acfrom in a follow-up request continues the listing.
3. Return the timestamp|user|comment|content information for the page with a given title:
Result:
<?xml version="1.0"?>
<api>
  <query>
    <pages>
      <page pageid="27697087" ns="0" title="API">
        <revisions>
          <rev user="Graham87" timestamp="2010-06-13T08:41:17Z" comment="Protected API: restore protection ([edit=sysop] (indefinite) [move=sysop] (indefinite))" xml:space="preserve">#REDIRECT [[Application programming interface]]{{R from abbreviation}}</rev>
        </revisions>
      </page>
    </pages>
  </query>
</api>
4. Parse a page:
http://en.wikipedia.org/w/api.php?action=parse&format=xml&page=fluoxetine
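The [content] returned by the previous query (3) is in Wikipedia's wiki markup, whereas this call returns the page text as HTML.
The HTML content can be extracted with xpath="api/parse/text".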
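The difference between the two kinds of content can be seen by pulling each one out of its XML response. This is a minimal sketch with the Python standard library under a few assumptions: the helper names are hypothetical, the revision sample is copied from the result of request 3, and the parse sample is a made-up, heavily shortened response that only illustrates the api/parse/text shape (the post does not show a real one). Because ElementTree paths are relative to the root `<api>` element, api/parse/text is written as `parse/text`.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

API_URL = "http://en.wikipedia.org/w/api.php"  # endpoint quoted in this post


def build_parse_url(page):
    """URL for request 4 (action=parse); hypothetical helper."""
    return API_URL + "?" + urlencode({"action": "parse", "format": "xml", "page": page})


def revision_wikitext(xml_text):
    """Wiki markup stored in the <rev> element of a request-3 style response."""
    root = ET.fromstring(xml_text)  # root is the <api> element
    return root.findtext("query/pages/page/revisions/rev")


def parsed_html(xml_text):
    """HTML stored at api/parse/text (parse/text relative to the <api> root)."""
    root = ET.fromstring(xml_text)
    return root.findtext("parse/text")


# Copied from the result of request 3 in this post.
REVISION_SAMPLE = """<api>
  <query>
    <pages>
      <page pageid="27697087" ns="0" title="API">
        <revisions>
          <rev user="Graham87" timestamp="2010-06-13T08:41:17Z" comment="Protected API: restore protection ([edit=sysop] (indefinite) [move=sysop] (indefinite))" xml:space="preserve">#REDIRECT [[Application programming interface]]{{R from abbreviation}}</rev>
        </revisions>
      </page>
    </pages>
  </query>
</api>"""

# Made-up, shortened response: illustrates the structure only, not real API output.
PARSE_SAMPLE = """<api>
  <parse>
    <text xml:space="preserve">&lt;p&gt;&lt;b&gt;Fluoxetine&lt;/b&gt; is an antidepressant.&lt;/p&gt;</text>
  </parse>
</api>"""

if __name__ == "__main__":
    print(build_parse_url("fluoxetine"))
    print(revision_wikitext(REVISION_SAMPLE))  # wiki markup
    print(parsed_html(PARSE_SAMPLE))           # HTML
```

The HTML arrives entity-escaped inside the text element, and the XML parser unescapes it, so the returned string is ready to hand to an HTML parser.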