TY  - Type of reference
TI  - RFreeStem: A language and rule-free stemmer
AU  - Xavier Baril
AU  - Oihana Coustié
AU  - Josiane Mothe
AU  - Olivier Teste
AB  - With the large expansion of available textual data, text mining has become of special interest. Due to their unstructured nature, such data require important preprocessing steps. Among them, stemming algorithms conflate the variants of words into their stems. However, the most popular algorithms are rule-based, and therefore highly language-dependent. In contrast, corpus-based stemmers often exhibit significant algorithmic complexity, making them inefficient. They do not necessarily provide the extracted stems either, which are required for certain text mining tasks. We propose a new approach, RFreeStem, that is corpus-based and can therefore be applied on many languages. The implementation of our method is flexible and efficient, since it relies on a single running through the words’ n-grams. We also detail a method to extract the stems. Our experiments show that RFreeStem improves the results of text mining tasks, even more than the Porter reference, while providing a stemming solution on poorly endowed languages, which do not benefit from a version of Porter.
DO  - 10.21494/ISTE.OP.2021.0605
JF  - Open Journal in Information Systems Engineering
KW  - information systems, Text Mining, information retrieval, Sentiment Analysis, stemmer, NLP, Système d’information, fouille de texte, recherche d’information, analyse de sentiments, racinisation
L1  - https://openscience.fr/IMG/pdf/iste_roisi21v2n1_4.pdf
LA  - en
PB  - ISTE OpenScience
DA  - 2021/01/19
SN  - 2634-1468
TT  - RFreeStem : Une méthode de racinisation indépendante de la langue et sans règle
UR  - https://openscience.fr/RFreeStem-A-language-and-rule-free-stemmer
IS  - Issue 1
VL  - 2
ER  -