
Using Full-Text (FULLTEXT) Indexes in MySQL

 

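One very useful MySQL feature is the ability to search text with a full-text index (FULLTEXT index). Currently this works only on MyISAM tables (MyISAM is the default table type, so if you do not know which type your tables are, they are most likely MyISAM). A full-text index can be created on a TEXT, CHAR, or VARCHAR column, or on a combination of such columns. We will build a simple table to illustrate the various features.
The simple usage (the MATCH() function) is available from version 3.23.23 onward, and the more advanced usage (the IN BOOLEAN MODE modifier) from version 4 onward. The first part of this article concentrates on the simple usage; the second part covers the advanced usage.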
A simple table
We will use the following table throughout.
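CREATE TABLE fulltext_sample(copy TEXT,FULLTEXT(copy)) ENGINE=MyISAM;
You can omit ENGINE=MyISAM unless you have changed the default table type to something other than MyISAM. Older servers, including the 3.23 series, spell this option TYPE=MyISAM; newer releases no longer accept that spelling. Once the table exists, fill it with some data, for example: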
INSERT INTO fulltext_sample VALUES
('It appears good from here'),
('The here and the past'),
('Why are we hear'),
('An all-out alert'),
('All you need is love'),
('A good alert');
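The introduction notes that a full-text index can also span a combination of columns. A minimal sketch of that case follows; the table name articles and the columns title and body are hypothetical and used only for illustration. ENGINE=MyISAM is the newer spelling of the TYPE=MyISAM table option, so use whichever your server accepts.

```sql
-- Hypothetical two-column table; articles, title and body are illustrative names.
CREATE TABLE articles (
  title VARCHAR(200),
  body  TEXT,
  FULLTEXT (title, body)
) ENGINE=MyISAM;

-- MATCH() names the same column list that the FULLTEXT index was defined on.
SELECT * FROM articles WHERE MATCH(title, body) AGAINST('database');
```

Keeping the column list in MATCH() identical to the one in the index definition is what lets the search use that index.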

If the table already exists, you can add a full-text index with an ALTER TABLE statement (or, equivalently, a CREATE INDEX statement), for example:
ALTER TABLE fulltext_sample ADD FULLTEXT(copy);
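The CREATE INDEX form mentioned above might look like the sketch below. The table fulltext_plain and the index name ft_copy are hypothetical, chosen so that the example does not add a second index to the sample table.

```sql
-- Hypothetical table created without a full-text index.
CREATE TABLE fulltext_plain (copy TEXT) ENGINE=MyISAM;

-- Same effect as ALTER TABLE ... ADD FULLTEXT; ft_copy is an illustrative index name.
CREATE FULLTEXT INDEX ft_copy ON fulltext_plain (copy);

-- Lists the indexes on the table so you can confirm the new one exists.
SHOW INDEX FROM fulltext_plain;
```

On servers whose SHOW INDEX output includes an Index_type column, the new index should be reported there as FULLTEXT.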
Searching for text
The syntax for a full-text search is simple: you MATCH the column AGAINST the text you are looking for, for example:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('love');
+----------------------+
| copy                 |
+----------------------+
| All you need is love |
+----------------------+

Searches on a full-text index are case-insensitive, so the following statement works just as well:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('LOVE');
+----------------------+
| copy                 |
+----------------------+
| All you need is love |
+----------------------+

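Full-text indexes are usually used to search natural-language text such as newspaper articles or web page content, so MySQL adds several features tailored to this kind of search. MySQL does not index words of three characters or fewer, nor words that appear in 50% or more of the rows. This means that if your table contains two rows or fewer, a search on a full-text index will never return anything. MySQL may make this behavior more flexible in the future, but for now it suits most natural-language uses. If most of the records in your database contain "music", you probably do not want all of them returned anyway. You can get around the 50% threshold with the IN BOOLEAN MODE modifier; see part two of this article.
Results are returned in descending order of relevance.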
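The 50% threshold is easiest to see on a deliberately tiny table. The sketch below assumes a MyISAM table and a server that supports IN BOOLEAN MODE (MySQL 4.0.1 or later); the table name ft_tiny and its rows are made up for illustration.

```sql
-- Hypothetical two-row table in which "music" occurs in every row.
CREATE TABLE ft_tiny (copy TEXT, FULLTEXT(copy)) ENGINE=MyISAM;
INSERT INTO ft_tiny VALUES ('music theory'), ('music history');

-- Natural-language search: "music" is in 50% or more of the rows,
-- so it is treated as too common and no rows would be expected back.
SELECT * FROM ft_tiny WHERE MATCH(copy) AGAINST('music');

-- Boolean mode does not apply the 50% threshold,
-- so both rows would be expected back.
SELECT * FROM ft_tiny WHERE MATCH(copy) AGAINST('music' IN BOOLEAN MODE);
```

Boolean-mode results are not sorted by relevance automatically, which is one reason to treat it as a separate tool rather than a drop-in replacement.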
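Main features
The main features of a standard full-text search are:
1. Partial words are excluded (only whole words match)
2. Words shorter than 4 characters are excluded
3. Words that appear in half or more of the rows are excluded (so the table needs at least 3 rows)
4. Hyphenated words are treated as two words
5. Results are returned in descending order of relevance
6. Words on the stopword list are also excluded from search results. The stopword list is based on common English words, so if your data serves a different purpose you may want to change it. Unfortunately, doing so is not easy: you need to edit the file myisam/ft_static.c, recompile MySQL, and rebuild your indexes! The stopword list is given below. Note that it has changed between versions.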
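On MySQL 4.0 and later, some of these limits are exposed as server variables, so it can be worth checking what your server reports before reaching for the source code. The sketch below only inspects the settings and rebuilds the index; which variables appear depends on your version, so treat the names mentioned in the comments as assumptions to verify against your own server.

```sql
-- Lists the full-text settings the server knows about, for example
-- ft_min_word_len (minimum indexed word length) where it is supported.
SHOW VARIABLES LIKE 'ft%';

-- After changing a full-text setting and restarting the server,
-- existing FULLTEXT indexes have to be rebuilt before the change takes effect.
REPAIR TABLE fulltext_sample QUICK;
```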
Stopword list
"a", "a's", "able", "about", "above", "according", "accordingly", "across", "actually", "after", "afterwards", "again", "against", "ain't", "all", "allow", "allows", "almost", "alone", "along", "already", "also", "although", "always", "am", "among", "amongst", "an", "and", "another", "any", "anybody", "anyhow", "anyone", "anything", "anyway", "anyways", "anywhere", "apart", "appear", "appreciate", "appropriate", "are", "aren't", "around", "as", "aside", "ask", "asking", "associated", "at", "available", "away", "awfully", "b", "be", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "behind", "being", "believe", "below", "beside", "besides", "best", "better", "between", "beyond", "both", "brief", "but", "by", "c", "c'mon", "c's", "came", "can", "can't", "cannot", "cant", "cause", "causes", "certain", "certainly", "changes", "clearly", "co", "com", "come", "comes", "concerning", "consequently", "consider", "considering", "contain", "containing", "contains", "corresponding", "could", "couldn't", "course", "currently", "d", "definitely", "described", "despite", "did", "didn't", "different", "do", "does", "doesn't", "doing", "don't", "done", "down", "downwards", "during", "e", "each", "edu", "eg", "eight", "either", "else", "elsewhere", "enough", "entirely", "especially", "et", "etc", "even", "ever", "every", "everybody", "everyone", "everything", "everywhere", "ex", "exactly", "example", "except", "f", "far", "few", "fifth", "first", "five", "followed", "following", "follows", "for", "former", "formerly", "forth", "four", "from", "further", "furthermore", "g", "get", "gets", "getting", "given", "gives", "go", "goes", "going", "gone", "got", "gotten", "greetings", "h", "had", "hadn't", "happens", "hardly", "has", "hasn't", "have", "haven't", "having", "he", "he's", "hello", "help", "hence", "her", "here", "here's", "hereafter", "hereby", "herein", "hereupon", "hers", "herself", "hi", "him", "himself", "his", "hither", "hopefully", 
"how", "howbeit", "however", "i", "i'd", "i'll", "i'm", "i've", "ie", "if", "ignored", "immediate", "in", "inasmuch", "inc", "indeed", "indicate", "indicated", "indicates", "inner", "insofar", "instead", "into", "inward", "is", "isn't", "it", "it'd", "it'll", "it's", "its", "itself", "j", "just", "k", "keep", "keeps", "kept", "know", "knows", "known", "l", "last", "lately", "later", "latter", "latterly", "least", "less", "lest", "let", "let's", "like", "liked", "likely", "little", "look", "looking", "looks", "ltd", "m", "mainly", "many", "may", "maybe", "me", "mean", "meanwhile", "merely", "might", "more", "moreover", "most", "mostly", "much", "must", "my", "myself", "n", "name", "namely", "nd", "near", "nearly", "necessary", "need", "needs", "neither", "never", "nevertheless", "new", "next", "nine", "no", "nobody", "non", "none", "noone", "nor", "normally", "not", "nothing", "novel", "now", "nowhere", "o", "obviously", "of", "off", "often", "oh", "ok", "okay", "old", "on", "once", "one", "ones", "only", "onto", "or", "other", "others", "otherwise", "ought", "our", "ours", "ourselves", "out", "outside", "over", "overall", "own", "p", "particular", "particularly", "per", "perhaps", "placed", "please", "plus", "possible", "presumably", "probably", "provides", "q", "que", "quite", "qv", "r", "rather", "rd", "re", "really", "reasonably", "regarding", "regardless", "regards", "relatively", "respectively", "right", "s", "said", "same", "saw", "say", "saying", "says", "second", "secondly", "see", "seeing", "seem", "seemed", "seeming", "seems", "seen", "self", "selves", "sensible", "sent", "serious", "seriously", "seven", "several", "shall", "she", "should", "shouldn't", "since", "six", "so", "some", "somebody", "somehow", "someone", "something", "sometime", "sometimes", "somewhat", "somewhere", "soon", "sorry", "specified", "specify", "specifying", "still", "sub", "such", "sup", "sure", "t", "t's", "take", "taken", "tell", "tends", "th", "than", "thank", "thanks", 
"thanx", "that", "that's", "thats", "the", "their", "theirs", "them", "themselves", "then", "thence", "there", "there's", "thereafter", "thereby", "therefore", "therein", "theres", "thereupon", "these", "they", "they'd", "they'll", "they're", "they've", "think", "third", "this", "thorough", "thoroughly", "those", "though", "three", "through", "throughout", "thru", "thus", "to", "together", "too", "took", "toward", "towards", "tried", "tries", "truly", "try", "trying", "twice", "two", "u", "un", "under", "unfortunately", "unless", "unlikely", "until", "unto", "up", "upon", "us", "use", "used", "useful", "uses", "using", "usually", "v", "value", "various", "very", "via", "viz", "vs", "w", "want", "wants", "was", "wasn't", "way", "we", "we'd", "we'll", "we're", "we've", "welcome", "well", "went", "were", "weren't", "what", "what's", "whatever", "when", "whence", "whenever", "where", "where's", "whereafter", "whereas", "whereby", "wherein", "whereupon", "wherever", "whether", "which", "while", "whither", "who", "who's", "whoever", "whole", "whom", "whose", "why", "will", "willing", "wish", "with", "within", "without", "won't", "wonder", "would", "would", "wouldn't", "x", "y", "yes", "yet", "you", "you'd", "you'll", "you're", "you've", "your", "yours", "yourself", "yourselves", "z", "zero",
Let's see how some of these apply. Suppose you were feeling lazy and searched for "lov" in the hope of finding "love", like this:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('lov');
Empty set (0.00 sec)

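Nothing is returned, because a full-text index contains only whole words, not partial words. To get a result you must spell out the complete word, as in the first example.
As mentioned, hyphenated words are not indexed as such (their parts are indexed as separate words), so the following statement returns nothing: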
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('all-out');
Empty set (0.00 sec)

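Unfortunately, both parts are shorter than 4 characters, so they cannot be found by searching for them individually either, nor by an ordinary search. Part two of this article shows how a BOOLEAN MODE search can find partial or hyphenated words.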
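Until part two, here is a brief sketch of two ways around these limits. The first query assumes a server with boolean mode (MySQL 4.0.1 or later); the second is a plain substring match that works on any version but does not use the full-text index.

```sql
-- Boolean mode: the trailing * matches any indexed word that starts with "lov".
SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('lov*' IN BOOLEAN MODE);

-- Substring fallback for the hyphenated term; this scans the column
-- instead of using the FULLTEXT index, so it is slower on large tables.
SELECT * FROM fulltext_sample WHERE copy LIKE '%all-out%';
```

The LIKE form is reasonable for occasional lookups on small tables; for anything larger the boolean-mode search is the better fit.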
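You can also search for several words at once by separating them with commas inside the search string, as the relevance examples below do. Some words, however, never match. The following example searches for "here", which occurs in two of the rows:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('here');
Empty set (0.01 sec)

Surprisingly, this statement returns nothing. But look closely at the stopword list: "here" is on it, and was therefore left out of the index. The stopword list is probably the most common reason people conclude that MySQL full-text indexes are "not working". If your query does return a result, then the stopword list in your version of MySQL does not contain "here".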
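A stopword in the search string does not spoil the rest of the query; it is simply ignored. As a small sketch against the sample table, assuming "here" is on your server's stopword list and "love" is not:

```sql
-- "here" is ignored as a stopword, but "love" is indexed,
-- so the row 'All you need is love' would still be expected to match.
SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('here,love');
```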
Relevance
The following example shows the order in which rows are returned:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('good,alert');
+---------------------------+
| copy                      |
+---------------------------+
| A good alert              |
| An all-out alert          |
| It appears good from here |
+---------------------------+

The row "A good alert" comes first because it contains both of the search words. You don't have to take my word for it: you can look at the relevance MySQL assigns to each result. Simply repeat the MATCH() function in the field list, for example:
mysql> SELECT copy,MATCH(copy) AGAINST('good,alert') AS relevance
FROM fulltext_sample WHERE MATCH(copy) AGAINST('good,alert');
+---------------------------+------------------+
| copy                      | relevance        |
+---------------------------+------------------+
| A good alert              |  1.3551264824316 |
| An all-out alert          | 0.68526663197496 |
| It appears good from here | 0.67003110026735 |
+---------------------------+------------------+

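The relevance calculation is quite complex. It is based on the number of words in the index, the number of unique words in the row, the total number of words in both the index and the result, and the weight of the word. The figures may differ in your version of MySQL, as MySQL occasionally refines the calculation.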
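Because MATCH() is an ordinary expression that returns a number, it can be used in the field list without a WHERE clause, or in an explicit ORDER BY. The following sketch reuses the sample table; the exact scores will vary by version, so none are shown.

```sql
-- Without a WHERE clause every row comes back, in no particular order;
-- rows that match none of the search words get a relevance of 0.
SELECT copy, MATCH(copy) AGAINST('good,alert') AS relevance
FROM fulltext_sample;

-- Keep only the matching rows and state the ordering explicitly.
SELECT copy, MATCH(copy) AGAINST('good,alert') AS relevance
FROM fulltext_sample
WHERE MATCH(copy) AGAINST('good,alert')
ORDER BY relevance DESC;
```

Spelling out the ORDER BY is redundant for a simple query like this one, but it keeps the ordering predictable once other conditions or sort keys are added.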
For most applications, the standard full-text search is useful and sufficient, and MySQL 4 makes it more powerful still.

Original article: http://www.databasejournal.com/features/mysql/article.php/1578331
