Putra, Oddy Virgantara and Wasmanson, Fathin Muhammad and Harmini, Triana and Utama, Shoffin Nahwa (2020) Sundanese Twitter Dataset for Emotion Classification. In: International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM) 2020, 2020-11-17, Surabaya.
Abstract
The Sundanese are the second-largest ethnic group in Indonesia, and their language has many dialects. This has drawn the attention of many researchers to emotion analysis, especially on social media. However, with barely any Sundanese datasets available, understanding emotion in Sundanese text is a challenging task. In this research, we propose a dataset for emotion classification of Sundanese text. Preprocessing includes case folding, stopword removal, stemming, tokenization, and text representation. Prior to classification, we generate features using term frequency-inverse document frequency (TF-IDF). We evaluated our dataset using k-fold cross-validation. Our experiments with the proposed method show effective results for machine learning classification. Furthermore, as far as we know, this is the first publicly available Sundanese emotion dataset.
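The steps named in the abstract (case folding, stopword removal, stemming, tokenizing, TF-IDF weighting, and k-fold splitting) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the stopword list, the identity `stem` placeholder, the toy English sentences, and the function names are all assumptions made for illustration, and the TF-IDF variant shown (relative term frequency times unsmoothed log inverse document frequency) is one common formulation, not necessarily the one used in the paper.

```python
import math
from collections import Counter

# Hypothetical stopword list; the paper's Sundanese list is not given in this record.
STOPWORDS = {"the", "a", "is"}

def stem(token):
    # Placeholder: a real Sundanese stemmer would go here (assumption).
    return token

def preprocess(text):
    # Case folding, whitespace tokenizing, stopword removal, then stemming.
    tokens = text.lower().split()
    return [stem(t) for t in tokens if t not in STOPWORDS]

def tfidf(docs):
    # docs: list of token lists. Returns one {term: weight} dict per document,
    # with weight = (count / doc length) * log(N / document frequency).
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def k_fold_indices(n, k):
    # Yield (train, test) index lists for k contiguous folds, without shuffling.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

# Toy placeholder sentences, not drawn from the dataset.
texts = ["The movie is GREAT", "a great day", "the rain is sad", "sad sad news"]
docs = [preprocess(t) for t in texts]
vectors = tfidf(docs)
folds = list(k_fold_indices(len(docs), 2))
```

In each fold, a classifier (the record's keywords mention a support vector machine) would be trained on the `train` indices and scored on the `test` indices; that step is omitted here to keep the sketch dependency-free.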
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | emotion classification, dataset, sundanese, support vector machine, text mining |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
Divisions: | Fakultas Sains dan Teknologi UNIDA Gontor > Teknik Informatika |
Depositing User: | Oddy Virgantara Putra |
Date Deposited: | 08 Aug 2021 15:25 |
Last Modified: | 09 Aug 2021 11:10 |
URI: | http://repo.unida.gontor.ac.id/id/eprint/1129 |