In: Proc. of the ACM SIGKDD international conference on knowledge discovery & data mining. New York: ACM, pp 204–213 Ntoutsi I, Kalousis A, Theodoridis Y (2008) A general framework for estimating similarity of
2023, Data Science, Analytics and Machine Learning with R, Luiz Paulo Fávero, ... Rafael de Freitas Souza. Chapter: Correspondence Analysis and Dual Scaling. Multiple-Choice Data: An Example of Incidence Data. CA is a name for quantification of contingency tables. When CA is applied to multiple-choice...
IF-MCA: Importance Factor-Based Multiple Correspondence Analysis for Multimedia Data Analytics. doi:10.1109/TMM.2017.2760623. Keywords: Feature extraction; Decision trees; Algorithm design and analysis; Training; Multimedia communication; Data mining; Testing. Multimedia concept detection is a challenging topic due to the well-known class ...
60]. The idea here is to extract important patterns hidden in the EEG data. This is a complex task because, even after cleaning, EEG is known to have a high noise-to-signal ratio. Therefore, to get meaningful information from EEG, advanced signal processing...
Unlike boosting, the random forest does not use various prediction techniques but combines multiple decision trees into a “forest”, each constructed using bootstrap samples of the training data and random feature selection. In this ensemble, each tree relies on the values of a random vector samp...
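The two sources of randomness named in this excerpt, bootstrap sampling of the training rows and random selection of features, can be sketched in a few lines of standard-library Python. This is a minimal illustration only, not the cited authors' implementation: the function names are hypothetical, no actual tree is fitted, and the feature subset is drawn once here, whereas many random forest implementations redraw it at every split.

```python
import random
from collections import Counter


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices, sampled without replacement."""
    return sorted(rng.sample(range(n_features), k))


def majority_vote(predictions):
    """Combine the class labels predicted by the individual trees."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical sizes: 10 training rows, 6 features, 3 features per tree.
rng = random.Random(0)
rows = bootstrap_indices(10, rng)
features = random_feature_subset(6, 3, rng)
vote = majority_vote(["a", "b", "a"])
```

Each tree in the forest would be trained on its own `rows` and `features`, and the ensemble prediction would come from `majority_vote` over the per-tree outputs.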
In the second step, we collected data from Google and examined them to analyze the flood events. Specifically, to understand the popularity (based on data from Google) of selected flood management keywords that reflect the theme and characteristics of the subject...
Peer-to-peer accommodation has gained prominence in the sharing economy and e-commerce sectors, with big data playing a crucial role in understanding customer preferences and evaluating homestay satisfaction. This study proposes a novel methodology that
The parameters used in the RF were 501 decision trees (ntree), three predictors at each split (mtry), and 100 iterations. We validated these parameter choices through a series of cross-validation experiments, in which we systematically varied the parameters and evaluated the performance of ...
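A cross-validated search over ntree and mtry of the kind described here might be organized as follows. This is a sketch under stated assumptions: the excerpt does not give the number of folds, the candidate values, or the performance measure, so `k_fold_indices`, `grid_search`, and `toy_score` are hypothetical names, and `toy_score` is a stand-in with no connection to any real model or to the study's results.

```python
from itertools import product


def k_fold_indices(n_rows, k):
    """Yield k (train, test) index pairs that partition range(n_rows)."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        yield train, test


def grid_search(n_rows, k, ntree_values, mtry_values, score_fn):
    """Return ((ntree, mtry), mean_score) with the highest mean CV score."""
    best = None
    for ntree, mtry in product(ntree_values, mtry_values):
        scores = [
            score_fn(train, test, ntree, mtry)
            for train, test in k_fold_indices(n_rows, k)
        ]
        mean = sum(scores) / len(scores)
        if best is None or mean > best[1]:
            best = ((ntree, mtry), mean)
    return best


def toy_score(train, test, ntree, mtry):
    """Placeholder scorer (assumption): a real one would fit and evaluate a model."""
    return -abs(mtry - 3) - abs(ntree - 501) / 1000


# Hypothetical grid and fold count, chosen only to exercise the code.
best_params, best_mean = grid_search(10, 5, [101, 501, 1001], [2, 3, 4], toy_score)
splits = list(k_fold_indices(10, 5))
```

In practice `score_fn` would train a random forest on the `train` rows with the given parameters and return its accuracy (or another metric) on the `test` rows.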
With the rapid development of cloud computing, more and more data storage and computation are being moved from local machines to the cloud, especially for machine learning and data analytics applications. However, cloud servers are run by a third party and cannot be fully trusted by users. As a...
Methods, devices, and computer programs are presented for creating a unified data stream from multiple data streams acquired from multiple devices. One method includes an operation