
Sharing a book on text mining


I went to some trouble to track this down from an overseas source. Feel free to download it if you need it.

The information age has made it easy to store large amounts of data. The proliferation
of documents available on the Web, on corporate intranets, on news wires, and
elsewhere is overwhelming. However, although the amount of data available to us
is constantly increasing, our ability to absorb and process this information remains
constant. Search engines only exacerbate the problem by making more and more
documents available in a matter of a few key strokes.
Text mining is a new and exciting research area that tries to solve the information
overload problem by using techniques from data mining, machine learning, natural
language processing (NLP), information retrieval (IR), and knowledge management.
Text mining involves the preprocessing of document collections (text categorization,
information extraction, term extraction), the storage of the intermediate representations,
the techniques to analyze these intermediate representations (such as distribution
analysis, clustering, trend analysis, and association rules), and visualization of
the results.
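
As a rough illustration of the stages listed above (preprocessing, an intermediate representation, and an analysis step), here is a minimal Python sketch. It is not taken from the book: the function names, the tiny stopword list, and the sample documents are all hypothetical, and simple pair co-occurrence support stands in for full association-rule mining.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list for this sketch.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}


def preprocess(text):
    """Term extraction: lowercase, tokenize on letters, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def build_representation(docs):
    """Intermediate representation: one set of terms per document."""
    return [preprocess(d) for d in docs]


def cooccurrence_support(term_sets, min_support=0.5):
    """Analysis: fraction of documents in which each term pair co-occurs.

    This is only the 'frequent pair' part of association-rule mining;
    it does not compute confidence or generate rules.
    """
    n = len(term_sets)
    counts = Counter()
    for terms in term_sets:
        for pair in combinations(sorted(terms), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Made-up sample documents, for illustration only.
docs = [
    "Text mining uses machine learning",
    "Machine learning and data mining",
    "Information retrieval of text",
]

term_sets = build_representation(docs)
frequent = cooccurrence_support(term_sets, min_support=0.6)

for pair, support in sorted(frequent.items()):
    print(pair, round(support, 2))
```

A real system would replace the term sets with a persistent store and feed the results into a visualization layer; both are omitted here to keep the sketch short.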
This book presents a general theory of text mining along with the main techniques
behind it. We offer a generalized architecture for text mining and outline the
algorithms and data structures typically used by text mining systems.
The book is aimed at the advanced undergraduate students, graduate students,
academic researchers, and professional practitioners interested in complete coverage
of the text mining field. We have included all the topics critical to people
who plan to develop text mining systems or to use them. In particular, we have
covered preprocessing techniques such as text categorization, text clustering, and
information extraction and analysis techniques such as association rules and link
analysis.
The book tries to blend together theory and practice; we have attempted to
provide many real-life scenarios that show how the different techniques are used in
practice. When writing the book we tried to make it as self-contained as possible and
have compiled a comprehensive bibliography for each topic so that the reader can
expand his or her knowledge accordingly.