Text Mining (English Edition)
P1:JZZ
0521836573peCB1028Feldman0521836573
October12,200618:37
THE TEXT MINING HANDBOOK
Advanced Approaches in Analyzing Unstructured Data
Ronen Feldman
Bar-Ilan University, Israel
James Sanger
ABS Ventures, Waltham, Massachusetts
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521836579
© Ronen Feldman and James Sanger 2007
This publication is in copyright. Subject to statutory exception and to the provision of
relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.
First published in print format 2006
ISBN-13 978-0-511-33507-5 eBook (NetLibrary)
ISBN-10 0-511-33507-5 eBook (NetLibrary)
ISBN-13 978-0-521-83657-9 hardback
ISBN-10 0-521-83657-3 hardback
Cambridge University Press has no responsibility for the persistence or accuracy of urls
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
In loving memory of my father, Issac Feldman
Contents
Preface
I. Introduction to Text Mining
I.1 Defining Text Mining
I.2 General Architecture of Text Mining Systems
II. Core Text Mining Operations
II.1 Core Text Mining Operations
II.2 Using Background Knowledge for Text Mining
II.3 Text Mining Query Languages
III. Text Mining Preprocessing Techniques
III.1 Task-Oriented Approaches
III.2 Further Reading
IV. Categorization
IV.1 Applications of Text Categorization
IV.2 Definition of the Problem
IV.3 Document Representation
IV.4 Knowledge Engineering Approach to TC
IV.5 Machine Learning Approach to TC
IV.6 Using Unlabeled Data to Improve Classification
IV.7 Evaluation of Text Classifiers
IV.8 Citations and Notes
V. Clustering
V.1 Clustering Tasks in Text Analysis
V.2 The General Clustering Problem
V.3 Clustering Algorithms
V.4 Clustering of Textual Data
V.5 Citations and Notes
VI. Information Extraction
VI.1 Introduction to Information Extraction
VI.2 Historical Evolution of IE: The Message Understanding Conferences and Tipster
VI.3 IE Examples
VI.4 Architecture of IE Systems
VI.5 Anaphora Resolution
VI.6 Inductive Algorithms for IE
VI.7 Structural IE
VI.8 Further Reading
VII. Probabilistic Models for Information Extraction
VII.1 Hidden Markov Models
VII.2 Stochastic Context-Free Grammars
VII.3 Maximal Entropy Modeling
VII.4 Maximal Entropy Markov Models
VII.5 Conditional Random Fields
VII.6 Further Reading
VIII. Preprocessing Applications Using Probabilistic and Hybrid Approaches
VIII.1 Applications of HMM to Textual Analysis
VIII.2 Using MEMM for Information Extraction
VIII.3 Applications of CRFs to Textual Analysis
VIII.4 TEG: Using SCFG Rules for Hybrid Statistical-Knowledge-Based IE
VIII.5 Bootstrapping
VIII.6 Further Reading
IX. Presentation-Layer Considerations for Browsing and Query Refinement
IX.1 Browsing
IX.2 Accessing Constraints and Simple Specification Filters at the Presentation Layer
IX.3 Accessing the Underlying Query Language
IX.4 Citations and Notes
X. Visualization Approaches
X.1 Introduction
X.2 Architectural Considerations
X.3 Common Visualization Approaches for Text Mining
X.4 Visualization Techniques in Link Analysis
X.5 Real-World Example: The Document Explorer System
XI. Link Analysis
XI.1 Preliminaries
XI.2 Automatic Layout of Networks
XI.3 Paths and Cycles in Graphs
XI.4 Centrality
XI.5 Partitioning of Networks
XI.6 Pattern Matching in Networks
XI.7 Software Packages for Link Analysis
XI.8 Citations and Notes
XII. Text Mining Applications
XII.1 General Considerations
XII.2 Corporate Finance: Mining Industry Literature for Business Intelligence
XII.3 A "Horizontal" Text Mining Application: Patent Analysis Solution Leveraging a Commercial Text Analytics Platform
XII.4 Life Sciences Research: Mining Biological Pathway Information with GeneWays
Appendix A: DIAL: A Dedicated Information Extraction Language for Text Mining
A.1 What Is the DIAL Language?
A.2 Information Extraction in the DIAL Environment
A.3 Text Tokenization
A.4 Concept and Rule Structure
A.5 Pattern Matching
A.6 Pattern Elements
A.7 Rule Constraints
A.8 Concept Guards
A.9 Complete DIAL Examples
Bibliography
Index
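User Reviews
I have only skimmed it so far, but this is a classic English-language introduction to content mining and well worth reading. A Chinese edition would be easier to follow.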