N-gram representations for comment filtering
dc.contributor.author | Brand, Dirk | en_ZA |
dc.contributor.author | Kroon, Steve | en_ZA |
dc.contributor.author | Van der Merwe, Brink | en_ZA |
dc.contributor.author | Cleophas, Loek | en_ZA |
dc.date.accessioned | 2016-02-01T10:38:45Z | |
dc.date.available | 2016-02-01T10:38:45Z | |
dc.date.issued | 2015-09 | |
dc.description | CITATION: Brand, D., Kroon, S., Van der Merwe, B. & Cleophas, L. 2015. N-Gram Representations For Comment Filtering in Proceeding SAICSIT '15. Proceedings of the 2015 Annual Research Conference on South African Institute of Computer Scientists and Information Technologists, Article No. 6. STIAS, Wallenberg Centre, Stellenbosch, South Africa. 28-30 September 2015. doi:10.1145/2815782.2815789. | en_ZA |
dc.description | The original publication is available at http://dl.acm.org/authorize.cfm?key=N08849 | en_ZA |
dc.description | SAICSIT '15. Proceedings of the 2015 Annual Research Conference on South African Institute of Computer Scientists and Information Technologists, Article No. 6. September 2015. | en_ZA |
dc.description.abstract | Accurate classifiers for short texts are valuable assets in many applications. Especially in online communities, where users contribute to content in the form of posts and comments, an effective way of automatically categorising posts proves highly valuable. This paper investigates the use of N-grams as features for short text classification, and compares it to manual feature design techniques that have been popular in this domain. We find that the N-gram representations greatly outperform manual feature extraction techniques. | en_ZA |
dc.description.version | Publisher's version | en_ZA |
dc.identifier.citation | Brand, D., Kroon, S., Van der Merwe, B. & Cleophas, L. 2015. N-Gram Representations For Comment Filtering in Proceeding SAICSIT '15. Proceedings of the 2015 Annual Research Conference on South African Institute of Computer Scientists and Information Technologists, Article No. 6. STIAS, Wallenberg Centre, Stellenbosch, South Africa. 28-30 September 2015. doi:10.1145/2815782.2815789. | en_ZA |
dc.identifier.isbn | 978-1-4503-3683-3 | en_ZA |
dc.identifier.other | doi:10.1145/2815782.2815789 | en_ZA |
dc.identifier.uri | http://hdl.handle.net/10019.1/98228 | |
dc.language.iso | en_ZA | en_ZA |
dc.publisher | ACM, Inc. | en_ZA |
dc.rights.holder | Authors retain copyright | en_ZA |
dc.subject | N-gram models | en_ZA |
dc.subject | Computational linguistics | en_ZA |
dc.subject | Texts -- Electronic analysis | en_ZA |
dc.subject | Online texts -- Classification | en_ZA |
dc.subject | Information filtering systems | en_ZA |
dc.subject | Vector spaces | en_ZA |
dc.subject | Text mining | en_ZA |
dc.title | N-gram representations for comment filtering | en_ZA |
dc.type | Conference Paper | en_ZA |