To develop a good regression model, feature selection of input variables was performed using a correlation analysis and a recursive feature elimination algorithm. Thus, in this study, we determined three sets of variables as the optimal combination for regression models: proximate analysis variables (...
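The correlation-analysis half of the feature selection described above can be illustrated with a minimal sketch. It covers only correlation screening, not recursive feature elimination, and it is not the authors' procedure: the function names, the 0.9 redundancy threshold, and the toy feature names `a`, `b`, `c` are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else 0.0


def rank_by_target_correlation(features, target):
    """Sort feature names by |r| with the target, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


def screen_features(features, target, redundancy_threshold=0.9):
    """Keep features in rank order, skipping any that is nearly collinear
    with a feature already kept (hypothetical threshold, not from the source)."""
    kept = []
    for name in rank_by_target_correlation(features, target):
        if all(abs(pearson(features[name], features[k])) < redundancy_threshold
               for k in kept):
            kept.append(name)
    return kept


# Toy data, invented for illustration only.
target = [1, 2, 3, 4, 5]
features = {
    "a": [2, 4, 6, 8, 10],            # perfectly correlated with the target
    "b": [2.1, 3.9, 6.2, 8.1, 9.8],   # nearly a copy of "a"
    "c": [5, 3, 4, 1, 2],             # negatively correlated, not redundant
}
selected = screen_features(features, target)
```

In this toy case `b` is dropped because it is almost collinear with `a`, while `c` is retained. A wrapper method such as recursive feature elimination would instead refit a model repeatedly and discard the weakest variable at each step.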
(5) Data Preprocessing for Regression: Prepare the data for regression analysis models. (6) Training and prediction using 7 different regression analysis models. (7) Computation of error metrics for the regression analyses. (8) Error Metrics Visualization. Data Description The data being read from...
Lecturers were incredibly skilful at describing new ideas by using a range of engaging techniques: citing the literature (e.g., how the natural phenomenon of regression to the mean reinforces incorrect use of negative feedback); asking questions rather than telling the answer; using videos to stimulate...
The remaining 30% of the data were used as a testing set for evaluation. Three metrics, the correlation coefficient (R), root mean squared error (RMSE), and mean absolute error (MAE), were used to measure the prediction performance of the regression model. Finally, the results compare models ...
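The hold-out evaluation described above can be sketched as follows. This is a minimal standard-library illustration, assuming a random 70/30 split and the usual textbook definitions of R, RMSE, and MAE; the function names, the seed, and the example values are hypothetical and do not come from the study.

```python
import math
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and hold out `test_fraction` of them for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def correlation_coefficient(y_true, y_pred):
    """Pearson correlation R between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented example: ten records split 70/30, then metrics on toy predictions.
train, test = train_test_split(list(range(10)))
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
r_value = correlation_coefficient(observed, predicted)
rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
```

RMSE penalises large errors more heavily than MAE, so a single poor prediction, as in the toy data here, raises RMSE (0.5) above MAE (0.25).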
The concentration of FC in the effluent of the MSL-TS system was estimated by three machine learning algorithms: artificial neural network (ANN), Cubist, and multiple linear regression (MLR). The accuracy of the models was measured by comparing the real and predicted values. Significant (p < ...
The models include various traffic and non-traffic (e.g. demographic and socio-economic) variables. Overall, the Cubist model shows better performance compared to support vector regression and random forests. Additionally, the Cubist approach provides rule-based equations for different subsets of data...
To do this, a complementary training dataset of independent LAI was derived from a regularized model inversion of RapidEye surface reflectances and subsequently used to guide the development of LAI regression models via Cubist and random forests (RF) decision tree methods. The application of the ...
Models; Vegetation indices; Regression-kriging. The soil organic matter (SOM) content is strongly related to soil fertility and greenhouse gas emissions. Knowledge of the SOM content is therefore necessary for efficient and sustainable management practices. In this study, we compare the performance of ...
The results showed that kriging achieved the best performance, followed by regression-kriging. The models using only Cubist and random forest had the poorest performance. The results, therefore, demonstrate that kriging can predict SOM contents without the need for auxiliary variables for fields with ...
Therefore, this work presents a novel ML framework based on data engineering approaches and the Cubist regression method to predict the UDSS of cohesive soil. A dataset including six different features and one target variable was used for building prediction models. The performance of ML models ...