Scientific Research and Essays

  • Abbreviation: Sci. Res. Essays
  • Language: English
  • ISSN: 1992-2248
  • DOI: 10.5897/SRE
  • Start Year: 2006
  • Published Articles: 2752

Full Length Research Paper

Learning morphosyntactic patterns for multiword term extraction

  José Luis Ochoa (1), Ángela Almela (2), Maria Luisa Hernández-Alcaraz (3) and Rafael Valencia-García (3)*
  (1) Departamento de Ingeniería Industrial, Universidad de Sonora, Blvd. Rosales y Transversal, Hermosillo, Sonora, Mexico, C.P. 83000.
  (2) English Department, Universidad de Murcia, Spain.
  (3) Faculty of Computer Science, Universidad de Murcia, 30071 Espinardo (Murcia), Spain.
Email: [email protected]

  •  Accepted: 04 October 2011
  •  Published: 09 November 2011

Abstract

The identification of valid terms in any domain is fundamental to its computerization. For this reason, we present a method for automatically obtaining morphosyntactic patterns that help researchers extract valid terms, in order to build quality ontologies for translation from one language to another, or to find relevant terms in short sentences, which can be used as parameters in question-answering systems. To this end, we apply statistical methods that produce candidate patterns in a pattern vector. A heuristic process then refines this pattern vector based on two main parameters: the statistical results previously obtained and the length of the pattern analyzed. The result is a collection of the best patterns for detecting real multiword terms.

Key words: Morphosyntactic patterns, multiword terms, incremental learning.
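
The two-stage idea described in the abstract (statistical scoring of candidate patterns, followed by a heuristic refinement that uses the scores and the pattern length) can be illustrated with a minimal sketch. The abstract does not specify the statistics or the heuristic, so everything below is an assumption made for illustration only: the toy tagged sentences, the reference term list `KNOWN_TERMS`, the use of precision as the statistic, and the hypothetical parameters `min_precision` and `length_bonus`. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy corpus: each sentence is a list of (token, POS tag) pairs.
TAGGED = [
    [("incremental", "ADJ"), ("learning", "NOUN"), ("improves", "VERB"),
     ("term", "NOUN"), ("extraction", "NOUN")],
    [("the", "DET"), ("statistical", "ADJ"), ("method", "NOUN"),
     ("ranks", "VERB"), ("pattern", "NOUN"), ("vector", "NOUN")],
]

# Hypothetical reference list of already validated multiword terms.
KNOWN_TERMS = {
    ("incremental", "learning"), ("term", "extraction"),
    ("statistical", "method"), ("pattern", "vector"),
}


def candidate_patterns(tagged_sentences, known_terms, max_len=3):
    """Build a pattern vector: POS n-gram -> (hits on known terms, total occurrences)."""
    total, hits = Counter(), Counter()
    for sentence in tagged_sentences:
        for n in range(2, max_len + 1):
            for i in range(len(sentence) - n + 1):
                window = sentence[i:i + n]
                pattern = tuple(tag for _, tag in window)
                words = tuple(token.lower() for token, _ in window)
                total[pattern] += 1
                if words in known_terms:
                    hits[pattern] += 1
    return {pattern: (hits[pattern], total[pattern]) for pattern in total}


def refine(pattern_vector, min_precision=0.5, length_bonus=0.1):
    """Assumed heuristic: drop low-precision patterns, then rank the rest
    by precision plus a small bonus per tag in the pattern."""
    kept = []
    for pattern, (hit, tot) in pattern_vector.items():
        precision = hit / tot
        if precision >= min_precision:
            kept.append((precision + length_bonus * len(pattern), pattern))
    return [pattern for _, pattern in sorted(kept, reverse=True)]


vector = candidate_patterns(TAGGED, KNOWN_TERMS)
best = refine(vector)
print(best)
```

On this toy input only the adjective-noun and noun-noun patterns survive refinement, since every other tag sequence never coincides with a reference term. A real system would replace the hand-tagged sentences with the output of a part-of-speech tagger and would need a far larger corpus for the statistics to be meaningful.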