Most of web users use various search engines to get specific information. A key factor in the success of web search engines are their ability to rapidly find good quality results to the queries that are based on specific terms. This paper aims at retrieving more relevant documents from a huge corpus based on the required information. We propose a particle swarm optimization algorithm based on latent semantic indexing (PSO+LSI) for text clustering. PSO family of bio-inspired algorithms has recently successfully been applied to a number of real word clustering problems. We use an adaptive inertia weight (AIW) that do proper exploration and exploitation in search space. PSO can merge with LSI to achieve best clustering accuracy and efficiency. This framework provides more relevant documents to the user and reduces the irrelevant documents. It would be seen that for all numbers of dimensions, PSO+LSI are faster than PSO+Kmeans algorithms using vector space model (VSM). It takes 22.3 s for PSO+LSI method with 1000 terms to obtain its best performance on 150 dimensions.
Key words: Vector space model, particle swarm optimization (PSO) algorithm, latent semantic indexing, text clustering, adaptive inertia weight.
Copyright © 2023 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0