PP-Loss: An imbalanced regression loss based on plotting position for improved precipitation nowcasting

Theoretical and Applied Climatology (2024)

Lei Xu Xuechun Li Hongchu Yu Wenying Du Zeqiang Chen Nengcheng Chen

Keywords:

Nowcasting

Position (finance)

Topics:

Precipitation Measurement and Analysis

Meteorological Phenomena and Simulations

Soil Moisture and Remote Sensing

10.1007/s00704-024-04984-w

Analysis of the Influencing Factors and Predictions of Boston House Prices based on a Multiple Linear Regression Model

Highlights in Science Engineering and Technology (2024)

Liangyu Jiang

In this study, the classic Boston house price data set is selected for the analysis of house price correlation. Using the variables in the data set, a linear regression model of Boston housing prices is built in Python. The regression equation and regression coefficients were tested for significance, variables with p >= 0.5 were excluded, multiple linear regression was carried out, and a well-fitting regression equation was obtained. Because the multiple linear regression retained too many variables to analyze and predict easily, a correlation analysis of the variables was then carried out. The percentage of lower-status population, the pupil-teacher ratio, and the average number of rooms per dwelling were found to have a significant relationship with medv (the median quoted price of an owner-occupied home, in units of $1,000). Finally, a linear regression equation is established for the independent variables whose correlation coefficient is greater than 0.5, and the housing price is predicted.
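The final step described in this abstract (keep only predictors whose correlation with the target exceeds 0.5 in magnitude, then fit a linear regression on them) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's code: the arrays are random stand-ins rather than the Boston data set, the column names `rm`, `lstat` and `unrelated` are labels chosen for the example, and applying the 0.5 cutoff to the absolute value of the correlation is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for predictors (hypothetical, not the real data set).
predictors = {
    "rm": rng.normal(6.0, 0.7, n),          # rooms per dwelling
    "lstat": rng.normal(12.0, 6.0, n),      # lower-status population, percent
    "unrelated": rng.normal(0.0, 1.0, n),   # a column with no effect on the target
}
# Synthetic target built from two of the predictors plus noise.
medv = 5.0 * predictors["rm"] - 0.6 * predictors["lstat"] + rng.normal(0.0, 1.0, n)

# Pearson correlation of each predictor with the target.
corr = {name: float(np.corrcoef(x, medv)[0, 1]) for name, x in predictors.items()}

# Keep predictors whose absolute correlation exceeds the cutoff (assumed rule).
selected = [name for name, r in corr.items() if abs(r) > 0.5]

# Ordinary least squares on the selected predictors, with an intercept column.
X = np.column_stack([np.ones(n)] + [predictors[name] for name in selected])
coef, *_ = np.linalg.lstsq(X, medv, rcond=None)
predicted = X @ coef
```

Filtering on marginal correlation is simple but can discard predictors that only matter jointly with others, which is one reason to treat the cutoff as a heuristic.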

Variables

Multiple correlation

Standardized coefficient

10.54097/hkgqcv31

Citations (0)

Assessment of Biological Age by Multiple Regression Analysis

Journal of Gerontology (1975)

Toshiyuki Furukawa Michitoshi Inoue Fumihiko Kajiya Hiroshi Inada Seiichi Takasugi

Toshiyuki Furukawa, MD, PhD; Michitoshi Inoue, MD, PhD; Fumihiko Kajiya, MD; Hiroshi Inada, MD, PhD; Seiichi Takasugi, MD; Sugao Fukui, MD; Hiroshi Takeda, MD; Hiroshi Abe, MD, PhD. First Dept. of Medicine, Osaka Univ. Medical School, Osaka, Japan. Journal of Gerontology, Volume 30, Issue 4, July 1975, Pages 422–434, https://doi.org/10.1093/geronj/30.4.422. Published: 01 July 1975.

Value (mathematics)

10.1093/geronj/30.4.422

Citations (113)

Relationship between PM 10 and PM 2.5 levels in high-traffic area determined using path analysis and linear regression

Journal of Environmental Sciences (2017)

Narut Sahanavin Tassanee Prueksasit Kraichat Tantrakarnapa

Path coefficient

10.1016/j.jes.2017.01.017

Citations (37)

Study on Models for Monitoring of Grassland Biomass around Qinghai Lake Assisted by Remote Sensing

Acta Geographica Sinica (2003)

Niu Zhichun

Taking the region around Qinghai Lake as the study area and using Landsat Thematic Mapper data and measured grass yield data, monadic (single-variable) linear regression models and non-linear regression models were established to express the relations between grassland biomass and the vegetation indices. There are two types of sampling site: the larger is 30 m×30 m and the smaller is 1 m×1 m. Each larger sampling site includes one randomly selected smaller site. The major conclusions from this study are: 1) the fitting accuracies of the non-linear regression models are much higher than those of the monadic linear regression models, that is, the results from the non-linear models agree more closely with the measured grassland biomass data; 2) a comparison of different forms of non-linear regression between the vegetation indices and the measured grassland biomass data indicates that the cubic equation is the most suitable for the study area; 3) the non-linear regression analysis ranks the vegetation indices by fitting accuracy against grassland biomass in the order RVI, NDVI, SAVI, MSAVI and DVI; and 4) the non-linear model $Y = -18.626\,\mathrm{RVI}^3 + 220.317\,\mathrm{RVI}^2 - 648.271\,\mathrm{RVI} + 691.093$ is the best model for monitoring grassland biomass from vegetation indices in the region around Qinghai Lake.
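As a small worked illustration, the cubic model quoted in conclusion 4 can be evaluated directly. The sketch below assumes the flattened terms "RVI3" and "RVI2" in the listing are the cube and square of RVI, which matches the abstract's statement that the cubic equation fits best; the function name `biomass_from_rvi` is hypothetical, and the abstract does not state the units of Y or the valid range of RVI.

```python
import numpy as np

# Coefficients of the cubic model from the abstract, highest power first.
CUBIC_COEFFS = [-18.626, 220.317, -648.271, 691.093]

def biomass_from_rvi(rvi):
    """Evaluate Y = -18.626*RVI^3 + 220.317*RVI^2 - 648.271*RVI + 691.093.

    Accepts a scalar or an array of RVI values. Units of the result are
    not given in the abstract.
    """
    return np.polyval(CUBIC_COEFFS, np.asarray(rvi, dtype=float))

# Example: evaluate the model on a few RVI values (values chosen arbitrarily).
sample_rvi = np.array([1.0, 2.0, 3.0])
sample_biomass = biomass_from_rvi(sample_rvi)
```

Because a cubic can turn sharply outside the range it was fitted on, such a model should only be applied to RVI values comparable to those observed in the study area.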

Thematic Mapper

Thematic map

Citations (10)

On the use of regression analysis for the estimation of human biological age.

Biogerontology (2000)

Jens Krøll O. Saxtrup

Regression diagnostic

10.1023/a:1026594602252

Citations (44)

Regression Analysis and Estimating Regression Models

Springer eBooks (2022)

John B. Guerard Anureet Saxena Mustafa Gültekin

10.1007/978-3-030-87269-4_12

Citations (1)

Determination of the Physical Working Capacity in Children Using Three Different Regression Models

International Journal of Sports Medicine (1984)

Karin A. Pfeiffer Gerhard Ernst Steyer

Experimental findings of the working capacity at a heart rate of 170 beats/min (W170) were compared to predicted values. Statistical tests were applied to examine the suitability and the prediction error of three different regression models: a linear regression line, a polynomial regression model, and a "break point" regression model, each of which was compared to the time course of the heart rate during a linearly increasing work load from 0 to 100 W over 10 min. For this study the results of 28 children, 15 and 16 years old, and students of physical education were investigated. When a linear regression line was fitted to these data, systematic deviations between the measured data and the values estimated by this model were found. When the W170 was predicted with this model from the data collected during the first 10 min of an exercise procedure for the determination of the heart rate index, the physical working capacity was overestimated. The polynomial regression model and the "break point" regression model agreed with the time course of the heart rate without systematic error and allowed an unbiased prediction of the W170 from the first 10 min of the exercise test.
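The three model families compared in this abstract can be sketched on synthetic data as below. This is an illustration under stated assumptions, not the authors' procedure: the heart-rate curve is simulated with an arbitrary break at 40 W, the polynomial is taken to be quadratic, and the "break point" model is implemented as a continuous piecewise-linear fit whose break is found by grid search. The abstract specifies none of these details.

```python
import numpy as np

def fit_sse(X, y):
    """Least-squares fit; returns coefficients and the sum of squared errors."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

rng = np.random.default_rng(1)
w = np.linspace(0.0, 100.0, 51)        # work load in W
true_break = 40.0                      # assumed break point for the simulation
# Simulated heart rate: the slope increases after the break, plus noise.
hr = 90.0 + 0.3 * w + 0.5 * np.maximum(0.0, w - true_break) + rng.normal(0.0, 1.0, w.size)

ones = np.ones_like(w)

# 1) Straight line.
_, sse_linear = fit_sse(np.column_stack([ones, w]), hr)

# 2) Quadratic polynomial.
_, sse_poly = fit_sse(np.column_stack([ones, w, w**2]), hr)

# 3) Break-point model: a hinge term max(0, w - b), with b chosen by grid search.
candidates = np.arange(10.0, 91.0, 1.0)
sse_by_break = [
    fit_sse(np.column_stack([ones, w, np.maximum(0.0, w - b)]), hr)[1]
    for b in candidates
]
best_break = float(candidates[int(np.argmin(sse_by_break))])
sse_break = float(min(sse_by_break))
```

Comparing `sse_linear` with `sse_poly` and `sse_break` shows the kind of systematic misfit a straight line leaves when the underlying curve bends.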

Regression diagnostic

10.1055/s-2008-1025885

Citations (4)

Prediction of Coke Strength Based on Multiple Linear Regression Analysis

Shandong Metallurgy (2009)

Wu Xianxi

Coke strength was predicted from selected quality indexes of the blended coal. The indexes were used as independent variables in a multiple linear regression analysis to obtain regression models. The regression coefficients were then subjected to t-tests, the non-significant indexes were eliminated, and the linear regression models were refitted. The research indicated that the model can effectively predict coke strength and that the relative error of the forecasts was within 5%.
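The fit, test, eliminate and refit loop described in this abstract can be sketched as follows. The data are synthetic, the predictor names `x1`, `x2` and `x3` are placeholders rather than real coal-quality indexes, and the cutoff |t| >= 2 is a rough stand-in for a proper critical value from the t distribution; all of these are assumptions made for illustration.

```python
import numpy as np

def ols_t_stats(X, y):
    """OLS coefficients and their t statistics (coefficient / standard error)."""
    n, p = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = float(resid @ resid) / (n - p)       # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)         # coefficient covariance matrix
    return coef, coef / np.sqrt(np.diag(cov))

def backward_eliminate(columns, y, t_threshold=2.0):
    """Repeatedly drop the predictor with the smallest |t| until all pass."""
    kept = dict(columns)
    while kept:
        X = np.column_stack([np.ones(len(y))] + list(kept.values()))
        coef, t = ols_t_stats(X, y)
        abs_t = np.abs(t[1:])                     # skip the intercept
        worst = int(np.argmin(abs_t))
        if abs_t[worst] >= t_threshold:
            return list(kept), coef, t
        del kept[list(kept)[worst]]               # eliminate and refit
    coef, t = ols_t_stats(np.ones((len(y), 1)), y)
    return [], coef, t

rng = np.random.default_rng(7)
n = 100
columns = {name: rng.normal(0.0, 1.0, n) for name in ("x1", "x2", "x3")}
# Synthetic response: depends on x1 and x2 only; x3 has no true effect.
y = 80.0 + 2.0 * columns["x1"] - 1.5 * columns["x2"] + rng.normal(0.0, 1.0, n)

kept, coef, t = backward_eliminate(columns, y)
```

Dropping one variable at a time and refitting matters because t statistics change whenever a correlated predictor is removed.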

Citations (0)

Cycling performance prediction based on cadence analysis by using multiple regression

Journal of Physics Conference Series (2021)

Sukhairi Sudin Azizi Naim Abdul Aziz Fathinul Syahir Ahmad Saad Nurul Syahirah Khalid I Ibrahim

This project examined the influence of cadence, speed, heart rate and power on cycling performance using a Garmin Edge 1000. Any change in cadence affects the speed, heart rate and power of a novice cyclist, and the pattern of change was observed through mobile devices running the Garmin Connect application. All results were recorded for the next task, which analysed the collected data using regression analysis, a machine learning algorithm. Regression analysis is a statistical method for modelling the relationship between one or more independent variables and a dependent (target) variable, and it is used for this type of prediction problem in machine learning. Regression is a supervised learning technique that helps uncover correlations between variables and allows a continuous output variable to be predicted from one or more predictor variables. The dataset captured a total of forty days’ worth of events. Cadence acts as the dependent variable (y), while speed, heart rate and power act as independent variables (x) in predicting cycling performance. Simple linear regression is linear regression with only one input variable (x); with several input variables it is referred to as multiple linear regression. The research uses a linear regression technique to predict cycling performance based on cadence analysis. The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name, and so it describes how the value of the dependent variable changes as the value of an independent variable changes. The analysis uses the Mean Squared Error (MSE) cost function for linear regression, which is the average of the squared errors between predicted and actual values. The R-squared value was also recorded. A low R-squared value means that the independent variables, even if meaningful, account for little of the variance in the dependent variable. Using multiple regression, the R-squared value in this project is considered acceptable because it exceeds 0.7, whereas studies involving human behaviour rarely achieve an R-squared above 0.3.
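The two metrics named in this abstract, MSE and R-squared, can be computed for a multiple linear regression as sketched below. The data are synthetic and the coefficients used to simulate cadence are arbitrary; nothing here reproduces the study's Garmin recordings or its reported values.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of squared prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(42)
n = 40                                  # e.g. forty recorded sessions (synthetic)
speed = rng.normal(25.0, 3.0, n)
heart_rate = rng.normal(140.0, 10.0, n)
power = rng.normal(180.0, 25.0, n)
# Synthetic cadence as the dependent variable, following the abstract's setup.
cadence = 20.0 + 1.5 * speed + 0.1 * heart_rate + 0.1 * power + rng.normal(0.0, 2.0, n)

# Multiple linear regression with an intercept, fitted by least squares.
X = np.column_stack([np.ones(n), speed, heart_rate, power])
coef, *_ = np.linalg.lstsq(X, cadence, rcond=None)
pred = X @ coef

fit_mse = mse(cadence, pred)
fit_r2 = r_squared(cadence, pred)
```

Predicting the mean of the target for every observation gives an R-squared of exactly 0, which is the baseline that any fitted model is compared against.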

Cadence

Variables

Linear predictor function

Regression diagnostic

10.1088/1742-6596/2107/1/012058

Citations (2)

Applied Statistical Analysis of the Linear Regression Models

Journal of Selçuk University Natural and Applied Science (2015)

Ioan Miloșan

The aim of the study is a statistical analysis of linear regression models fitted to data from specific industrial processes. The work strategy relies on regression analysis, one of the most widely used statistical tools for understanding which independent variables are related to the dependent variable and for exploring the forms of these relationships. The application focused on fitting and checking linear regression models, using small and large data sets, with pocket calculators or computers. The performance of regression analysis methods in practice depends on the form of the data-generating process and how it relates to the regression approach being used. Several statistical criteria were used: the Cochran, Student and Fisher criteria. The statistical analysis of the linear regression model was first carried out by the classical method with a pocket computer. The same data were then processed with C++ software, which gave more accurate results and reduced the computation time from several hours to 2-3 minutes.

Regression diagnostic

Variables

Statistical Analysis

Statistical software

Citations (0)