Journal article

A comparative analysis of text data classification accuracy and speed using neural networks, Bloom filter and naive Bayes

Year:

2021

Published in:

Technology Center PC

Keywords:

text data classification, Bloom filter, naive Bayes, neural network, classification time and accuracy

The object of research is methods for the fast classification of text data. The study is motivated by the rapid growth of textual data in both digital and printed form. Such data must be processed by software, since human resources cannot process this volume of data in full. Many approaches to data classification have been developed. This research applies the following text classification methods to a set of text data in order to sort the texts into categories: the Bloom filter, the naive Bayes classifier, and neural networks. Each method has both advantages and disadvantages, and this paper illustrates the strengths and weaknesses of each on a specific example. The algorithms were compared with one another in terms of speed and efficiency, where efficiency means the accuracy with which a text is assigned to its class. Each method was run on the same data sets while varying the amount of training and test data and the number of classification groups. The dataset used contains the following classes: world, business, sports, and science and technology. In real-world classification of such data, the number of categories is much larger than that considered in this work, and categories may contain subcategories. In the course of the study, each method was analyzed with different parameter values to obtain the best result. Of the three methods, the neural network produced the best results for text data classification.
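As a rough illustration of two of the compared methods, the sketch below trains one Bloom filter per class and a multinomial naive Bayes classifier on a few toy sentences. It is not the authors' implementation: the tokenizer, the filter size, the number of hash functions, the add-one smoothing, the class names `BloomClassifier` and `NaiveBayesClassifier`, and the training sentences are all assumptions made for this example. The neural network is omitted for brevity.

```python
import hashlib
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class BloomFilter:
    """Fixed-size bit array; each item sets num_hashes bit positions."""

    def __init__(self, size_bits=4096, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive each position from SHA-256 of a seed-prefixed item.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return false positives, never false negatives.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class BloomClassifier:
    """One filter per class; predict the class matching the most tokens."""

    def fit(self, texts, labels):
        self.filters = defaultdict(BloomFilter)
        for text, label in zip(texts, labels):
            for token in tokenize(text):
                self.filters[label].add(token)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        scores = {label: sum(t in f for t in tokens)
                  for label, f in self.filters.items()}
        return max(scores, key=scores.get)


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for token in tokenize(text):
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (hypothetical, not the dataset used in the article).
train = [
    ("stocks fall as markets react to interest rates", "business"),
    ("company profits rise after merger deal", "business"),
    ("team wins championship final match", "sports"),
    ("striker scores twice in league match", "sports"),
]
texts, labels = zip(*train)
bloom = BloomClassifier().fit(texts, labels)
bayes = NaiveBayesClassifier().fit(texts, labels)

query = "markets rise after profits report"
print(bloom.predict(query), bayes.predict(query))
```

The Bloom filter variant only tests set membership, so prediction costs a few hash computations per token, while naive Bayes also weighs how often each word occurs in a class. This sketch does not measure the speed and accuracy trade-off that the article compares; doing so would require wrapping `fit` and `predict` in timing code and evaluating on a held-out test set.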

Other publications by

14 publications found

2025
Journal article

The development of an electronic circuit simulation system using variable tabular bases

Publisher: Technology Center PC

Authors: Vadym Yaremenko, Bogdan Bulakh, Yaroslav Kornachevskyy, Oleksandr Beznosyk, Kostyantyn Kharchenko

2018
Journal article

A review of existing multi-agent systems for data mining tasks

Publisher: Таврійський національний університет ім. В.І. Вернадського

Authors: Vadym Yaremenko

2021
Working paper

Neural Networks and Monte‑Carlo Method Usage in Multi‑Agent Systems for Sudoku Problem Solving

Publisher: SSRN

Authors: Vadym Yaremenko, Kateryna Poloziuk

2020
Journal article

Mobile Driving License System Deployment Model With Security Enhancement

Publisher: Theoretical and cryptographic problems of cybersecurity

Authors: Vadym Yaremenko, V. Blynkov

2020
Journal article

A Multi-Agent System Model for Semantic Text Analysis

Publisher: Міжвузівський збірник "НАУКОВІ НОТАТКИ".

Authors: Vadym Yaremenko, Serhii Khudiakov

2023
Journal article

Performance evaluation of LU matrix decomposition using the SYCL standard

Publisher: ZBW

Authors: Vadym Yaremenko, Dmytro Nasikan

2023
Journal article

The development of the method of optimizing costs for software testing in the Agile model

Publisher: Technology Audit and Production Reserves

Authors: Vadym Yaremenko, Kostyantyn Kharchenko, Oleksandr Beznosyk, Bogdan Bulakh, Ganna Ishchenko

2020
Working paper

Development of a Multi‑Agent System for Solving Domain Dictionary Construction Problem

Publisher: SSRN

Authors: Vadym Yaremenko, Oleksandr Syrotiuk

2019
Journal article

Comparative Analysis of Software Libraries for Text Data Classification Using Artificial Neural Networks

Publisher: Вчені записки ТНУ імені В.І. Вернадського

Authors: Vadym Yaremenko, Maksym Tarasenko

2024
Journal article

Forecasting software development costs in scrum iterations using ordinary least squares method

Publisher: Technology Center PC

Authors: Vadym Yaremenko, Kostyantyn Kharchenko, Oleksandr Beznosyk, Bogdan Bulakh, Bogdan Kyriusha