san_yun

Introduction to Topic Modeling learning

    Blog category:
  • nlp
 

Original: http://chentingpc.me/article/?id=616

 

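Topic modeling is a rather fascinating thing. I had heard of it before but did not appreciate its importance until a pointer from Mr. Tang (唐总) made me take a serious second look. It can be considered one of the foundations of text mining (a fairly advanced foundation?). The input is a collection of documents, the output is a set of topics in a low-dimensional space, and the algorithm is unsupervised. The rough line of development is LSI -> pLSI -> LDA -> various LDA variants. pLSI and LDA are both generative models, and LDA in particular offers a remarkable way of looking at text. The idea behind LDA is simple, but learning the probabilistic derivations based on EM, Gibbs sampling and the like is not so simple (at the time of writing I have not fully worked through that part; Mr. Tang says topic modeling is a problem you can learn in one month or in two or three months. Really? I don't know how high a bar he had in mind when he said that).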
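
As a rough illustration of the "documents in, topics out" setup described above, here is a minimal sketch of collapsed Gibbs sampling for LDA in plain Python. The toy corpus, the function name `lda_gibbs`, and the hyperparameter values (`alpha`, `beta`, `iters`) are assumptions made for illustration; they do not come from the original post, and this is not how Mallet is implemented.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids.
    Returns the document-topic and topic-word count tables.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n_dk
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n_kw
    topic_total = [0] * num_topics                               # n_k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # p(z = t | everything else) is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


# Hypothetical toy corpus: word ids index into vocab.
vocab = ["gene", "dna", "cell", "ball", "team", "game"]
docs = [
    [0, 1, 2, 0, 1],
    [1, 2, 0, 2],
    [3, 4, 5, 3],
    [4, 5, 3, 5, 4],
    [0, 1, 4, 5],
]

doc_topic, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))

# Show the three most frequent words assigned to each topic.
for k, counts in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda w: counts[w], reverse=True)[:3]
    print("topic", k, [vocab[w] for w in top])
```

The count tables are the whole state of the sampler: normalizing a row of `doc_topic` (after adding `alpha`) gives an estimate of that document's topic mixture, and normalizing a row of `topic_word` (after adding `beta`) gives an estimate of that topic's word distribution. On a corpus this small the result depends heavily on the random seed.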

 

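I have been reading about LDA closely for two or three days now, and tonight I also ran Mallet, so I have an intuitive feel for it. Below I collect the introductory articles (all of them can be downloaded publicly online, so attaching them here should not count as infringement...):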

 

 

Survey

Specific

Video Lecture

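  • A very good lecture by D. Blei. Because of my slow connection I could only view the slides, not the lecture video, but it is undoubtedly a good lecture (LDA was proposed by D. Blei and colleagues in 2003).
  • Another lecture by D. Blei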

Open Source

Derived (not recommended for newcomers)

  • dynamic LDA : dynamic_topic_models.pdf
  • The Author-Topic Model for Authors and Documents
  • Correlated Topic Models
  • Automatic Labeling of Multinomial Topic Models