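1 Introduction. The Titanic is the giant liner that sank on its maiden voyage. Many people remember the love story of Rose and Jack, but the passenger data the Titanic left behind is also a valuable resource for data analysis, and many newcomers practice on this dataset. I had planned to do a Titanic data analysis of my own for practice, but a quick search on Kaggle turned up a pile of Titanic data-analysis papers, so I decided to take a shortcut...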
for dataset in combine:
    dataset.loc[dataset['Age'] <= 16, 'Age'] = 0
    dataset.loc[(dataset['Age'] > 16) & (dataset['Age'] <= 32), 'Age'] = 1
    dataset.loc[(dataset['Age'] > 32) & (dataset['Age'] <= 48), 'Age'] = 2
    dataset.loc[(dataset['Age'] > 48) & (data...
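The same age banding can be written more compactly with `pandas.cut`. This is a minimal sketch, not the original author's code: the snippet above is cut off after the 48 boundary, so the sketch assumes a single open-ended band above 48, and the helper name `bin_age` is hypothetical.

```python
import numpy as np
import pandas as pd


def bin_age(age: pd.Series) -> pd.Series:
    """Map ages to ordinal band codes using right-closed intervals.

    Assumption: everything above 48 falls into one band, because the
    source snippet is truncated after that boundary.
    """
    edges = [-np.inf, 16, 32, 48, np.inf]
    # labels=False returns the integer index of the band; NaN stays NaN.
    return pd.cut(age, bins=edges, labels=False, right=True)


ages = pd.Series([4.0, 16.0, 17.0, 32.0, 40.0, 70.0, np.nan])
age_band = bin_age(ages)
```

Unlike the chained `.loc` assignments, this does not overwrite `Age` in place, so the raw ages remain available for later steps.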
Knowing from a training set of samples listing passengers who survived or did not survive the Titanic disaster, can our model determine, based on a given test dataset not containing the survival information, if these passengers in the test dataset survived or not? Analyze the training-set data and build a model to predict ...
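The main features selected are passenger class, sex, siblings/spouses aboard, parents/children aboard, fare, and age. Age has missing values; I handled them in the simplest way, by filling with the mean (although I found that leaving age out gave a higher final Kaggle score). The final Kaggle score is about 76. The detailed code follows; it uses the GPU to accelerate training:
import torch
import pandas as pd
from torch.utils.data import Dataset, DataLoader
import...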
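dataset['FamilySize'] = dataset['SibSp'] + dataset['Parch'] + 1 constructs a family-size attribute. An IsAlone attribute is added to flag whether the passenger boarded alone. Title is extracted from the Name attribute, because we found that almost every name contains a title, such as Mr or Miss. We therefore extract the text between the comma (,) and the period (.) to form the Title attribute, which carries hidden information about sex and social status.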
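The three derived attributes described above can be sketched in pandas as follows. The small frame `df` and the regular expression are illustrative assumptions; the source only states that the title sits between the comma and the period in `Name`.

```python
import pandas as pd

# A few sample rows in the Titanic "Name" format, for illustration only.
df = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Heikkinen, Miss. Laina",
        "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    ],
    "SibSp": [1, 0, 1],
    "Parch": [0, 0, 0],
})

# Family size counts the passenger plus siblings/spouses and parents/children.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# A passenger is alone when the family size is exactly 1.
df["IsAlone"] = (df["FamilySize"] == 1).astype(int)

# Capture the text between the first comma and the following period.
df["Title"] = df["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
```

Rare titles are often merged into a single category before modeling, but that step is outside what the text above describes.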
Topics: machine-learning, deep-learning, titanic-kaggle, titanic-survival-prediction, titanic-dataset. Updated Jul 4, 2021. Python.
Predicting the survival of passengers on RMS Titanic using information about the passengers. Topics: r, titanic-kaggle, titanic-survival-prediction, titanic-dataset ...
A Bagging classifier is an ensemble meta-estimator that fits base classifiers each on random subsets of the original dataset and then aggregate their individual predictions (either by voting or by averaging) to form a final prediction. In short: ...
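The quoted definition can be tried out with scikit-learn's `BaggingClassifier`. This is a minimal sketch on synthetic data from `make_classification`, not on the Titanic set; the hyperparameter values are arbitrary assumptions, and the base classifier is left at the library default (a decision tree).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for a real feature table.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Each of the 25 base classifiers is fit on a random 80% subset of the
# training rows; their predictions are then aggregated by the ensemble.
bagging = BaggingClassifier(n_estimators=25, max_samples=0.8, random_state=0)
bagging.fit(X_train, y_train)

predictions = bagging.predict(X_test)
accuracy = bagging.score(X_test, y_test)
```

Because each base classifier sees a different subset, the aggregated prediction usually has lower variance than any single tree.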
dataset = titanic[['Age', 'Sex', 'Pclass']]
guess_ages = np.zeros((2, 3))
l = [1, 4]
for i in range(len(l)):
    for j in range(0, 3):
        guess_df = dataset[(dataset['Sex'] == l[i]) & (dataset['Pclass'] == j + 1)]['Age'].dropna()
        #age_mean = guess_df.mean()
        #age_...
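The nested loop above collects the known ages for each (Sex, Pclass) combination. A possible vectorized way to fill missing ages from such groups uses `groupby` with `transform`. This is a sketch under assumptions: the source is truncated before it shows which statistic is used, so the median is chosen here purely for illustration, and the frame `toy` is hypothetical data that reuses the sex codes 1 and 4 from the snippet.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two (Sex, Pclass) groups, each with one missing age.
toy = pd.DataFrame({
    "Sex": [1, 1, 1, 4, 4, 4],
    "Pclass": [1, 1, 1, 3, 3, 3],
    "Age": [30.0, np.nan, 50.0, 20.0, 22.0, np.nan],
})

# Median age of each (Sex, Pclass) group, aligned row by row with the frame.
# Assumption: the median stands in for the statistic elided in the source.
group_median = toy.groupby(["Sex", "Pclass"])["Age"].transform("median")

# Fill only the missing ages; known ages are left unchanged.
toy["Age"] = toy["Age"].fillna(group_median)
```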
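Kaggle is a competition platform for applied data analysis and modeling. As its home page puts it: "Your Home for Data Science". About Titanic: Machine Learning from Disaster. This project can be found right on the Kaggle home page. Recommended by countless predecessors as an introductory Kaggle exercise, it is one of the most followed competitions on Kaggle. After working through the approaches of the various experts, perhaps it is not only the film's story that adds to its appeal; the data Kaggle provides...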
()
for dataset in data_cleaner_dummy:
    dataset['AgeBin_Code'] = le.fit_transform(dataset['Age_Bin'])
    dataset['FareBin_Code'] = le.fit_transform(dataset['Fare_Bin'])
drop_col = ['Sex', 'Embarked', 'Title', 'Fare_Bin', 'Age_Bin']
data_train_dummy = data_cleaner_dummy[0].drop(drop_col, axis=1)
data_...
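The `le.fit_transform` calls above appear to use a label encoder to turn bin labels into integer codes. A minimal sketch of that step with scikit-learn's `LabelEncoder` follows; the column values and the frame `frame` are hypothetical, and the snippet does not show how `le` was actually created.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical bin labels; the real Age_Bin values are not shown in the source.
frame = pd.DataFrame({"Age_Bin": ["child", "adult", "senior", "adult"]})

le = LabelEncoder()
# Classes are sorted before codes are assigned, so the codes follow
# alphabetical order here, not age order.
frame["AgeBin_Code"] = le.fit_transform(frame["Age_Bin"])
```

Because the codes follow sorted label order, they carry no guarantee of matching the natural order of the bins; when the order matters, an explicit mapping is a safer choice.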