Blei: the original LDA text
This is the doctoral dissertation in which David Blei, the original author of LDA, introduced the model: Probabilistic Models of Text and Images. Its front matter is excerpted below.

Copyright © 2004 David Meir Blei

Abstract

Probabilistic Models of Text and Images
David Meir Blei
Doctor of Philosophy in Computer Science
with a designated emphasis in Communication, Computation, and Statistics
University of California, Berkeley
Professor Michael I. Jordan, Chair

Managing large and growing collections of information is a central goal of modern computer science. Data repositories of texts, images, sounds, and genetic information have become widely accessible, thus necessitating good methods of retrieval, organization, and exploration. In this thesis, we describe a suite of probabilistic models of information collections for which the above problems can be cast as statistical queries. We use directed graphical models as a flexible, modular framework for describing appropriate modeling assumptions about the data. Fast approximate posterior inference algorithms based on variational methods free us from having to specify tractable models, and further allow us to take the Bayesian perspective, even in the face of large datasets.

With this framework in hand, we describe latent Dirichlet allocation (LDA), a graphical model particularly suited to analyzing text collections. LDA posits a finite index of hidden topics which describe the underlying documents. New documents are situated into the collection via approximate posterior inference of their associated index terms. Extensions to LDA can index a set of images, or multimedia collections of interrelated text and images.

Finally, we describe nonparametric Bayesian methods for relaxing the assumption of a fixed number of topics, and develop models based on the natural assumption that the size of the index can grow with the collection. This idea is extended to trees, and to models which represent the hidden structure and content of a topic hierarchy that underlies a collection.

Acknowledgements

This dissertation would not have been possible without the help and support of many friends and colleagues.

Foremost, I thank my advisor Michael I. Jordan. For the past five years, Mike has been an exemplary teacher, mentor, and collaborator. I also thank the dissertation committee, Peter Bickel and Stuart J. Russell, for their insightful comments and suggestions.

I am fortunate to have interacted with a number of superb colleagues, and I would like to recognize their contribution to this work: Francis Bach, Drew Bagnell, Kobus Barnard, Jaety Edwards, Barbara Engelhardt, David Forsyth, Thomas Griffiths, Marti Hearst, Leslie Kaelbling, John Lafferty, Gert Lanckriet, Jon McAuliffe, Andrew McCallum, Brian Milch, Pedro Moreno, Andrew Ng, Mark Paskin, Sam Roweis, Martin Wainwright, Alice Zheng, and Andrew Zimdars have all been influential to my research through collaboration, discussion, and constructive criticism. I particularly thank Andrew Ng, who effectively launched this line of work when he sketched the picture in Figure 3.2 on an envelope one afternoon.

My comfortable life in Berkeley would not have been possible without the generous financial support of a fellowship from the Microsoft Corporation. Other financial support came from the Berkeley Microelectronics Fellowship, and travel grants from the NIPS foundation, UAI society, and SIGIR foundation. I also thank Microsoft Research, WhizBang! Labs, and Compaq Research for hosting three excellent and enriching summer internships.

I thank all my friends and family for their support and distraction.
I especially thank my parents Ron and Judy Blei and sister Micaela Blei, who have given me a lifetime of love and care. Finally, I thank Toni Gantz. Her kindness, patience, humor, and friendship sustain me.

Dedicated to my parents, Ron and Judy Blei

Contents

1 Introduction
2 Graphical models and approximate posterior inference
  2.1 Latent variable graphical models
    2.1.1 Exponential families
    2.1.2 Conjugate exponential families
    2.1.3 Exponential family conditionals
  2.2 Approximate posterior inference
    2.2.1 Gibbs sampling
    2.2.2 Mean-field variational methods
  2.3 Discussion
3 Latent Dirichlet allocation
  3.1 Notation and terminology
  3.2 Latent Dirichlet allocation
    3.2.1 The Dirichlet distribution
    3.2.2 Joint distribution of a corpus
    3.2.3 LDA and exchangeability
    3.2.4 A continuous mixture of unigrams
  3.3 Other latent variable models for text
    3.3.1 Unigram model
    3.3.2 Mixture of unigrams
    3.3.3 Probabilistic latent semantic indexing
    3.3.4 A geometric interpretation
  3.4 Posterior inference
    3.4.1 Mean-field variational inference
    3.4.2 Empirical Bayes estimates
    3.4.3 Smoothing
  3.5 Example
  3.6 Applications and empirical results
    3.6.1 Document modeling
    3.6.2 Document classification
    3.6.3 Collaborative filtering
  3.7 Discussion
4 Modeling annotated data
  4.1 Hierarchical models of image/caption data
    4.1.1 Gaussian-multinomial mixture
    4.1.2 Gaussian-multinomial LDA
    4.1.3 Correspondence LDA
  4.2 Empirical results
    4.2.1 Test set likelihood
    4.2.2 Caption perplexity
    4.2.3 Annotation examples
    4.2.4 Text-based image retrieval
  4.3 Discussion
5 Nonparametric Bayesian inference
  5.1 The Dirichlet process
    5.1.1 Polya urns and the Chinese restaurant process
    5.1.2 Sethuraman's stick-breaking construction
  5.2 Dirichlet process mixture models
    5.2.1 The truncated Dirichlet process
    5.2.2 Exponential family mixtures
  5.3 MCMC for DP mixtures
    5.3.1 Collapsed Gibbs sampling
    5.3.2 Blocked Gibbs sampling
    5.3.3 Placing a prior on the scaling parameter
  5.4 Variational inference for the DP mixture
    5.4.1 Coordinate ascent algorithm
  5.5 Example and results
    5.5.1 Simulated mixture models
  5.6 Discussion
6 Hierarchical latent Dirichlet allocation
  6.1 The nested Chinese restaurant process
  6.2 Hierarchical latent Dirichlet allocation
    6.2.1 Approximate inference with Gibbs sampling
  6.3 Examples and empirical results
  6.4 Discussion
7 Conclusions
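The abstract above describes LDA as positing a finite set of hidden topics, with each document mixing those topics and each word drawn from the topic it is assigned to. As a quick illustration of that generative view, here is a minimal sketch in Python; the toy vocabulary, topic count, and Dirichlet hyperparameters are hypothetical choices for illustration, not values from the thesis, and fitting the model to real data would additionally require the approximate posterior inference (mean-field variational inference or Gibbs sampling) that the dissertation develops.

```python
# Minimal sketch of LDA's generative process, as summarized in the abstract.
# All settings below (vocabulary, number of topics, hyperparameters) are
# hypothetical illustration values, not numbers taken from the thesis.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["gene", "dna", "cell", "ball", "game", "team", "market", "stock", "trade"]
num_topics = 3   # K: the fixed, finite "index of hidden topics"
alpha = 0.5      # Dirichlet prior on per-document topic proportions
eta = 0.5        # Dirichlet prior on per-topic word distributions

# Each topic is a distribution over the vocabulary.
topics = rng.dirichlet(np.full(len(vocab), eta), size=num_topics)

def generate_document(length):
    """Draw one document from the LDA generative process."""
    theta = rng.dirichlet(np.full(num_topics, alpha))  # this document's topic proportions
    words = []
    for _ in range(length):
        z = rng.choice(num_topics, p=theta)      # pick a topic for this word position
        w = rng.choice(len(vocab), p=topics[z])  # pick a word from that topic
        words.append(vocab[w])
    return words

for d in range(3):
    print(f"doc {d}:", " ".join(generate_document(8)))
```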