
An Ontology-based Semantic Clustering Algorithm for Accounting Text

Yanhui Jiang, Mo Li, Kaohua Yao

Abstract


Feature selection and the computation of semantic similarity between texts are essential components of accounting text clustering. Several approaches to generic text feature selection and similarity computation have been proposed, exploiting different measures (vector space model, word frequency, thesauri, domain corpora, etc.). However, the accounting field differs from general domains: it has its own concepts and rules, so these generic methods are poorly suited to accounting text clustering. In this paper, a novel accounting ontology-based feature selection and similarity computation algorithm for accounting text is proposed. First, the accounting texts are characterized to obtain a term vector. Second, the term vector is mapped onto concepts of the accounting ontology and converted into a concept vector. The semantic similarity between texts is then computed based on the structure of the concepts. Finally, through an improved clustering method, the accounting texts are clustered effectively. The experimental results suggest that our proposal outperforms most of the previous measures and eliminates some of their limitations.
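The pipeline outlined above (terms, then ontology concepts, then structure-based similarity, then clustering) can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. The toy ontology `PARENT`, the lexicon `TERM_TO_CONCEPT`, and all function names are hypothetical. A Wu-Palmer-style depth measure stands in for the paper's concept similarity, and a simple threshold grouping stands in for its improved clustering method.

```python
from itertools import combinations

# Hypothetical toy ontology: child concept -> parent concept.
# "Accounting" is the root and has no parent entry.
PARENT = {
    "Asset": "Accounting",
    "Liability": "Accounting",
    "CurrentAsset": "Asset",
    "FixedAsset": "Asset",
    "Cash": "CurrentAsset",
    "Inventory": "CurrentAsset",
    "Loan": "Liability",
}

# Hypothetical lexicon mapping surface terms to ontology concepts.
TERM_TO_CONCEPT = {
    "cash": "Cash",
    "stock": "Inventory",
    "inventory": "Inventory",
    "equipment": "FixedAsset",
    "loan": "Loan",
    "debt": "Liability",
}


def path_to_root(concept):
    """Return the chain [concept, parent, ..., root]."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def concept_similarity(a, b):
    """Wu-Palmer-style score: 2 * depth(lcs) / (depth(a) + depth(b)).

    Depth counts nodes on the path to the root, so the root has depth 1.
    """
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # First ancestor of a that is also an ancestor of b: the deepest shared one.
    lcs = next(c for c in path_a if c in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


def to_concepts(text):
    """Map a text to the set of ontology concepts its terms refer to."""
    return {TERM_TO_CONCEPT[t] for t in text.lower().split() if t in TERM_TO_CONCEPT}


def text_similarity(c1, c2):
    """Symmetric average of each concept's best match in the other set."""
    if not c1 or not c2:
        return 0.0
    best1 = sum(max(concept_similarity(a, b) for b in c2) for a in c1)
    best2 = sum(max(concept_similarity(a, b) for a in c1) for b in c2)
    return (best1 + best2) / (len(c1) + len(c2))


def cluster(texts, threshold=0.7):
    """Merge texts whose similarity reaches the threshold (single-link grouping)."""
    concepts = [to_concepts(t) for t in texts]
    labels = list(range(len(texts)))
    for i, j in combinations(range(len(texts)), 2):
        if text_similarity(concepts[i], concepts[j]) >= threshold:
            old, new = labels[j], labels[i]
            labels = [new if lab == old else lab for lab in labels]
    return labels


docs = ["cash and inventory", "stock of cash", "loan debt"]
print(cluster(docs))
```

In this sketch the first two texts share a label because "stock" and "inventory" map to the same concept even though the words differ, which is the kind of match a purely term-based vector space model would miss.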


Keywords


text mining, similarity, clustering, semantics, accounting text.

