A Robust MCMC-based Method for Piecemeal Estimation of Distributional Features in Continuous Data

Raed A. T. Said

Abstract


In the big data era, data mining applications typically rely on learning algorithms to extract subtle knowledge from data attributes. The veracity of such algorithms is a function of the training, validation and test data, which are usually sampled from high-dimensional, multi-faceted data. Yet, despite the interest and the analytical tools available, attaining data modelling veracity remains a huge challenge, mainly because of the dynamics in data volumes and varieties. One commonly used method of estimation is Markov Chain Monte Carlo (MCMC) simulation, which involves drawing large random samples from a known probability distribution. Its main idea is that, as the number of samples grows, the estimate converges to the true expectation. The paper proposes an MCMC-based method for sampling from high-dimensional, multi-faceted data. The method discretises any given data vector at different points, irrespective of known class boundaries, thus yielding different density estimates, which are then compared for accuracy.
Implementation on 65436 data points (861 seismic signals of 76 observations each) and 3160 open-source monthly average sunspot readings exhibited unprecedented robustness.
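
The convergence idea behind MCMC can be illustrated with a minimal random-walk Metropolis-Hastings sketch. This is not the paper's method or data: the target is a toy Gaussian with an assumed mean of 3, and the names `metropolis_hastings` and `log_target`, the step size and the seed are hypothetical choices made only for illustration.

```python
import math
import random


def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings sampler for a 1-D target.

    log_target returns the log of the (unnormalised) target density.
    All parameter names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p_new / p); 1 - random() lies in
        # (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_target(x, mu=3.0, sigma=1.0):
    # Toy target (an assumption): unnormalised log-density of N(mu, sigma^2).
    return -0.5 * ((x - mu) / sigma) ** 2


samples = metropolis_hastings(log_target, 50_000)

# The running mean should approach the true expectation as n grows.
for n in (100, 1_000, 10_000, 50_000):
    print(n, sum(samples[:n]) / n)
```

With a larger number of draws the running mean should settle near the assumed target mean, which is the behaviour the abstract describes; early draws are still influenced by the starting point `x0`.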

Keywords


Big Data, Data Mining, Markov Chain Monte Carlo, Metropolis-Hastings Algorithm, Robustness, Seismic Signals, Sunspots.
