An Efficient Classification Model for Cyber Text
Abstract
A modified TF-IDF algorithm and the IRLBA dimensionality reduction technique offer more computationally efficient text analytics with a reduced carbon footprint compared to deep learning approaches.
The rise of deep learning in recent years has had a severe consequence: a growing carbon footprint driven by its insatiable demand for computational resources and power. Text analytics has likewise been transformed by this trend toward a single dominant methodology. In this paper, the original TF-IDF algorithm is modified, and Clement Term Frequency-Inverse Document Frequency (CTF-IDF) is proposed for data preprocessing. The paper primarily discusses the effectiveness of classical machine learning techniques in text analytics when combined with CTF-IDF and a faster IRLBA algorithm for dimensionality reduction. Introducing both techniques into the conventional text analytics pipeline yields a faster and less computationally intensive application, with a smaller carbon footprint than deep learning methodology and only a minor compromise in accuracy. The experimental results also show a manifold reduction in running time and an improvement in model accuracy for the classical machine learning methods discussed in the paper.
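For orientation, the classical TF-IDF weighting that CTF-IDF modifies can be sketched in a few lines. This is only the textbook baseline, not the paper's CTF-IDF, whose changes are not described in the abstract. The function name `tf_idf`, the whitespace tokenizer, and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document (classical TF-IDF).

    Assumption: lowercase whitespace tokenization, tf = count / doc length,
    idf = ln(N / df). This is the textbook form, not the paper's CTF-IDF.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus of short cyber-security texts.
docs = [
    "phishing email steals password",
    "malware steals password",
    "firewall blocks malware",
]
weights = tf_idf(docs)
```

Terms that occur in fewer documents, such as "phishing", receive a higher weight than terms shared across documents, such as "steals". In a full pipeline, the resulting sparse term-document matrix would then be passed to a truncated SVD routine such as IRLBA before a classical classifier is trained.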