Sundanese Twitter Dataset for Emotion Classification

Harmini, Triana (2020) Sundanese Twitter Dataset for Emotion Classification. Sundanese Twitter Dataset for Emotion Classification. ISSN 978-1-7281-8283-4

Files:
10. Sundanese Twitter Dataset.pdf (journal article, Published Version, 518kB). License: Creative Commons Attribution Non-commercial Share Alike.
10. sundanese twitter dataset.pdf (reviewer, 1MB)
10. Sundanese Twitter Dataset.pdf (plagiarism check, 1MB)

Abstract

The Sundanese are the second-largest ethnic group in Indonesia, and their language has many dialects. This has drawn the attention of many researchers to emotion analysis, especially on social media. However, because Sundanese datasets are barely available, understanding emotion in Sundanese text remains a challenging task. In this research, we propose a dataset for emotion classification of Sundanese text. The preprocessing includes case folding, stopword removal, stemming, tokenizing, and text representation. For feature generation prior to classification, we use term frequency-inverse document frequency (TF-IDF). We evaluated our dataset using k-fold cross-validation. Our experiments with the proposed method show effective results for machine learning classification. Furthermore, as far as we know, this is the first Sundanese emotion dataset available to the public.
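The steps named in the abstract (case folding, tokenizing, stopword removal, TF-IDF weighting, and k-fold splitting) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the stopword list, the function names, and the raw-count times log(N/df) weighting are assumptions chosen for illustration, and stemming is left out because the abstract does not say which Sundanese stemmer was used.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list for illustration; not the list used in the paper.
STOPWORDS = {"nu", "jeung", "di", "ka"}


def preprocess(text):
    """Case-fold, tokenize into runs of letters, and drop stopwords.

    Stemming is omitted in this sketch.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: raw term count times log(N / document frequency).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size
```

In use, each tweet would pass through `preprocess`, the resulting token lists through `tfidf`, and a classifier would be trained and scored once per pair yielded by `k_fold_indices`. Shuffling the samples before splitting is usually advisable when the data is ordered by label.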

Item Type: Article
Subjects: T Technology > T Technology (General)
Divisions: Fakultas Sains dan Teknologi UNIDA Gontor > Teknik Informatika
Depositing User: Tryan Arza SAINTEKK3
Date Deposited: 11 Sep 2021 06:19
Last Modified: 11 Sep 2021 06:19
URI: http://repo.unida.gontor.ac.id/id/eprint/1207
