Machine learning, data mining, and the World Wide Web : design of special-purpose search engines

dc.contributor.advisor: Omlin, Christian W. [en_ZA]
dc.contributor.author: Kruger, Andries F. [en_ZA]
dc.contributor.other: Stellenbosch University. Faculty of Science. Department of Mathematical Sciences. [en_ZA]
dc.date.accessioned: 2012-08-27T11:35:30Z
dc.date.available: 2012-08-27T11:35:30Z
dc.date.issued: 2003-04 [en_ZA]
dc.description: Thesis (MSc)--Stellenbosch University, 2003. [en_ZA]
dc.description.abstract: ENGLISH ABSTRACT: We present DEADLINER, a special-purpose search engine that indexes conference and workshop announcements and extracts a range of academic information from the Web. Support vector machines (SVMs) provide an efficient and highly accurate mechanism for obtaining relevant web documents. DEADLINER currently extracts speakers, locations (e.g. countries), dates, paper submission (and other) deadlines, topics, program committees, abstracts, and affiliations. Complex and detailed searches are possible on these fields. The niche search engine was constructed by employing a methodology for the rapid implementation of specialised search engines. The methodology rests on Bayesian integration of simple extractors, which avoids complex hand-tuned text extraction methods. The simple extractors exploit loose formatting and keyword conventions. The Bayesian framework further produces a search engine in which each user can control each field's false alarm rate in an intuitive and rigorous fashion, thus providing easy-to-use metadata. [en_ZA]
dc.description.abstract: AFRIKAANS SUMMARY: We introduce DEADLINER, a search engine that catalogues conference and workshop announcements and that will ultimately monitor and extract a wide range of academic meeting material from the Web. DEADLINER currently recognises and extracts speakers, locations (e.g. country names), dates, including deadlines for the submission of academic papers, topics, programme committees, overviews or abstracts, and affiliations. Thorough searches are possible over and across these fields. The niche search engine was built using a methodology for the rapid construction of specialised search engines. The methodology avoids complex manual tuning of the text extraction by using Bayesian integration of simple extractors. The extractors exploit style and convention keywords. The Bayesian framework thereby produces a search engine that lets users control each field's chance of a wrong selection in an intuitive and rigorous way. [af_ZA]
dc.format.extent: 1 v. (various pagings) : illustrations [en_ZA]
dc.identifier.uri: http://hdl.handle.net/10019.1/53492
dc.language.iso: en_ZA [en_ZA]
dc.publisher: Stellenbosch : Stellenbosch University [en_ZA]
dc.rights.holder: Stellenbosch University [en_ZA]
dc.subject: Search engines [en_ZA]
dc.subject: Data mining [en_ZA]
dc.subject: Computer-assisted instruction [en_ZA]
dc.subject: Web sites -- Abstracting and indexing [en_ZA]
dc.subject: Machine learning [en_ZA]
dc.subject: DEADLINER [en_ZA]
dc.subject: Bayesian framework [en_ZA]
dc.title: Machine learning, data mining, and the World Wide Web : design of special-purpose search engines [en_ZA]
dc.type: Thesis [en_ZA]
Files
Original bundle
Name: kruger_machine_2003.pdf
Size: 19.18 MB
Format: Adobe Portable Document Format