About the full prediction approximation by a lot of partial predictions in case of incomplete data

Authors

  • Nina P. Alexeyeva, St Petersburg State University, 7-9, Universitetskaya nab., St Petersburg, 199034, Russian Federation
  • Fatema S. Sh. Al-Juboori, University of Information Technology and Communications, Iraq, Baghdad, St Al-Nidal

DOI:

https://doi.org/10.21638/spbu01.2022.401

Abstract

This article considers the random subspace method for forecasting with incomplete data and the estimation of a full prediction from a set of partial predictions. Without loss of generality, the partial predictions are assumed to be centered. In the statistical model, the off-diagonal elements of the correlation matrix of partial predictions are treated as random variables with known expectation and variance. For this random matrix, analytical expressions are obtained for the expectations of the determinant and the minors. Based on these results, a class of more accurate estimates of the full prediction is constructed; these estimates differ from the mean partial prediction by multipliers that depend on the statistical parameters of the correlation matrix of partial predictions. Results of simulation and of practical forecasting on incomplete biogeographic data are presented.
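
The setting described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data, the sizes n, p, k, m, the 10% missingness rate, and the helper fit_partial are hypothetical and are not taken from the article. The sketch fits one least-squares model per random feature subset on the rows where that subset is observed, averages the resulting centered partial predictions, and computes the mean and variance of the off-diagonal elements of their correlation matrix. The multipliers derived in the article are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k, m = 400, 6, 3, 20  # sample size, features, subset size, number of subsets

# Synthetic zero-mean data, so no intercept is needed (centered setting).
beta = rng.normal(size=p)
X = rng.normal(size=(n, p))
y = X @ beta + 0.1 * rng.normal(size=n)

# Knock out about 10% of the cells to imitate incomplete data.
X_missing = X.copy()
X_missing[rng.random((n, p)) < 0.1] = np.nan

def fit_partial(X, y, cols):
    """Least-squares fit on one feature subset, using only the rows
    in which every feature of that subset is observed."""
    Xs = X[:, cols]
    rows = ~np.isnan(Xs).any(axis=1)
    coef, *_ = np.linalg.lstsq(Xs[rows], y[rows], rcond=None)
    return coef

# Random subspaces: m subsets of k features each (repeats are possible).
subsets = [rng.choice(p, size=k, replace=False) for _ in range(m)]
coefs = [fit_partial(X_missing, y, cols) for cols in subsets]

# Partial predictions for new, fully observed points, and their mean.
X_new = rng.normal(size=(200, p))
partial = np.column_stack([X_new[:, c] @ b for c, b in zip(subsets, coefs)])
mean_partial = partial.mean(axis=1)

# Correlation matrix of the partial predictions and the empirical
# mean and variance of its off-diagonal elements.
R = np.corrcoef(partial, rowvar=False)
off_diag = R[~np.eye(m, dtype=bool)]
off_mean, off_var = off_diag.mean(), off_diag.var()
print(off_mean, off_var)
```

In this sketch the mean partial prediction plays the role of the baseline that, according to the abstract, the proposed estimates rescale by a multiplier depending on statistics such as off_mean and off_var.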

Keywords:

the random subspace method, statistical model, matrix with random elements, partial predictions, multiple regression

References

1. Vink G., Frank L.E., Pannekoek J., Buuren S. Predictive mean matching imputation of semicontinuous variables. Statistica Neerlandica 68 (1), 61-90 (2014). https://doi.org/10.1111/stan.12023

2. Van Buuren S., Groothuis-Oudshoorn C. mice: Multivariate imputation by chained equations in R. Journal of Statistical Software 45 (3), 1-67 (2011).

3. Alexeyeva N. Dual balance correction in Repeated Measures ANOVA with missing data. Electronic Journal of Applied Statistical Analysis 10 (1), 146-159 (2017). https://doi.org/10.1285/i20705948v10n1p146

4. Bart A.G. Analysis of biomedical systems. Method of partially inverse functions. St Petersburg, St Petersburg University Press (2003). (In Russian)

5. Ho Tin Kam. The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (8), 832-844 (1998). https://doi.org/10.1109/34.709601

6. Cramer H. Mathematical Methods of Statistics. Asia Publishing House (1975) [Rus. ed.: Cramer H. Matematicheskie metody statistiki, Mir Publ. (1975)].

7. Ho Tin Kam. Random Decision Forests. Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, 14-16 August 1995, 278-282 (1995).

8. Alexeyeva N.P., Gorlova I.A., Bondarenko B.B. Possibilities of predicting arterial hypertension based on the method of projective classification. Arterial Hypertension 23 (5), 472-480 (2017). https://doi.org/10.18705/1607-419X-2017-23-5-472-480 (In Russian)

9. Esmedlyaeva D.S., Alexeyeva N.P., Novitskaya T.A., Dyakova M.E., Ariel B.M., Sokolovich E.G. Inflammatory process activity and markers of extracellular matrix destruction in pulmonary tuberculoma. Bulletin of Siberian Medicine 19 (2), 112-119 (2020). https://doi.org/10.20538/1682-0363-2020-2-112-119 (In Russian)

10. Afifi A.A., Azen S.P. Statistical Analysis. A Computer Oriented Approach. 2nd ed. New York; San Francisco; London, Academic Press (1979).

Published

2022-12-26

How to Cite

Alexeyeva, N. P., & Al-Juboori, F. S. S. (2022). About the full prediction approximation by a lot of partial predictions in case of incomplete data. Vestnik of Saint Petersburg University. Mathematics. Mechanics. Astronomy, 9(4), 575–589. https://doi.org/10.21638/spbu01.2022.401

Section

Mathematics