Language Independent Analysis and Classification of Discussion Threads in Coursera MOOC Forums

Download PDF.

“Language Independent Analysis and Classification of Discussion Threads in Coursera MOOC Forums” by Lorenzo A. Rossi and Omprakash Gnawali. In Proceedings of the IEEE International Conference on Information Reuse and Integration (IRI 2014), Aug. 2014.

Abstract

In this work, we analyze the discussion threads from the forums of 60 Massive Open Online Courses (MOOCs) offered by Coursera and taught in 4 different languages. The types of interactions in such threads vary: there are discussions on closed-ended problems (e.g., solutions to assignments), open-ended topics, course logistics, or just small talk among fellow students. We first study how forum activity evolves over the normalized course duration. We then investigate several language-independent features to classify the discussion threads by the type of interaction among the users. We use the default Coursera subforum categories (Study Groups, Assignments, Lectures, ...) to define the classes of interest and hence the labels. We extract features related to the structure, popularity, and temporal dynamics of threads and to the diversity of user IDs. We avoid text-related features, word count aside, so that the methods can be applied across discussion threads written in different languages and with varied technical terminology. Experiments show classification performance with ROC AUC between 0.58 and 0.89, depending on the subforum class considered, with possibly noisy labels.
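
As an illustration of the kind of language-independent features the abstract describes, the following is a minimal Python sketch, not the authors' implementation. The `Post` fields, the feature names, and the specific features chosen are assumptions made for this example; the paper's actual feature set may differ. The `roc_auc` helper uses the standard pairwise-ranking definition of ROC AUC.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Post:
    # Hypothetical fields for illustration; not taken from the paper.
    user_id: str
    timestamp: float  # seconds since course start
    word_count: int
    depth: int        # 0 = top-level post, 1 = comment on a post


def thread_features(posts: List[Post], course_duration: float) -> Dict[str, float]:
    """Compute illustrative language-independent features for one thread."""
    if not posts:
        raise ValueError("thread must contain at least one post")
    times = sorted(p.timestamp for p in posts)
    n = len(posts)
    unique_users = len({p.user_id for p in posts})
    return {
        # Popularity: how many posts the thread attracted.
        "num_posts": float(n),
        # Structure: fraction of posts that are comments on other posts.
        "comment_ratio": sum(1 for p in posts if p.depth > 0) / n,
        # User diversity: distinct authors per post.
        "user_diversity": unique_users / n,
        # Temporal dynamics, normalized by course duration.
        "start_norm": times[0] / course_duration,
        "lifetime_norm": (times[-1] - times[0]) / course_duration,
        # The only text-derived quantity: average word count.
        "mean_word_count": sum(p.word_count for p in posts) / n,
    }


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In a one-vs-rest setup, each subforum category would in turn serve as the positive class, and `roc_auc` would be evaluated on the scores of whatever classifier is trained on such feature vectors.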

BibTeX entry:

@inproceedings{coursera-iri2014,
   author = {Lorenzo A. Rossi and Omprakash Gnawali},
   title = {{Language Independent Analysis and Classification of
	Discussion Threads in Coursera MOOC Forums}},
   booktitle = {Proceedings of the IEEE International Conference on
	Information Reuse and Integration (IRI 2014)},
   month = aug,
   year = {2014}
}