Scientific Research and Essays

  • Abbreviation: Sci. Res. Essays
  • Language: English
  • ISSN: 1992-2248
  • DOI: 10.5897/SRE
  • Start Year: 2006
  • Published Articles: 2763

Full Length Research Paper

An efficient hybrid distributed document clustering algorithm

J. E. Judith
  • Department of CSE, Noorul Islam Centre for Higher Education, Kumaracoil, India.
J. Jayakumari
  • Department of ECE, Noorul Islam Centre for Higher Education, Kumaracoil, India

  • Received: 03 September 2014
  • Accepted: 23 December 2014
  • Published: 15 January 2015

