This repository contains notes and projects from the Dataquest Data Scientist track coursework. git commandline data machine-learning statistics sql numpy machine-learning-algorithms probability pandas kaggle datascience statistical-analysis machinelearning deeplearning dataanalysis datastructures-algorithms data-...
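This problem comes from Kaggle's Titanic: Machine Learning from Disaster, one of the most followed competitions on Kaggle, not only because of its romantic backstory but also because some of the dataset's features reward careful thought. I recently struggled my way to a model and am sharing it here; some of the algorithms posted on Kaggle also gave me ideas. The analysis is done in R, the inspiration for the feature engineering came mainly from plotting and statistical tests, and the final model uses...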
python-project data-analysis-python diwali-sales-analysis Updated Feb 22, 2024 Jupyter Notebook A collection of data analysis and visualization projects designed to uncover insights from diverse datasets. These projects include analyses on COVID-19 trends, stock trading patterns, housing market prices, IoT...
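A summary of common EDA (Exploratory Data Analysis) methods, using Kaggle's Titanic disaster as an example. 1. Import the packages first; EDA generally needs the following packages: 2. Read in the data, usually with data = pd.read_csv('filepath/file.csv'). 3. Start the EDA proper. 1. Look at the format of the data: data.head() 2. See how many records are null in each field of the data: data.isnull().sum...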
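The steps listed above (read the data, inspect the first rows, count nulls per field) can be sketched as follows. This is a minimal illustration, not the original author's code: the inline CSV text is made up and stands in for 'filepath/file.csv', and the helper names `load_data` and `null_counts` are hypothetical. Only `pd.read_csv`, `DataFrame.head`, and `DataFrame.isnull().sum()` are real pandas calls.

```python
import io

import pandas as pd

# Made-up sample rows standing in for 'filepath/file.csv' (assumption:
# the real file has Titanic-like columns; these values are illustrative).
CSV_TEXT = """PassengerId,Survived,Pclass,Sex,Age,Cabin
1,0,3,male,22,
2,1,1,female,38,C85
3,1,3,female,,
4,0,2,male,35,
"""


def load_data(source):
    """Read a CSV from a file path or file-like object into a DataFrame."""
    return pd.read_csv(source)


def null_counts(data):
    """Return the number of null records in each column."""
    return data.isnull().sum()


data = load_data(io.StringIO(CSV_TEXT))

# Look at the format of the data (first rows and column names).
print(data.head())

# Count how many records are null in each field.
missing = null_counts(data)
print(missing)
```

With a real file, `load_data('filepath/file.csv')` would replace the `io.StringIO` call; the rest stays the same.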
To conduct our study, we focus on the Credit Card Fraud Detection Dataset, a set of anonymized financial transactions available on Kaggle [12]. This is the only large publicly available dataset for credit card fraud analysis, so the scope of the study is limited to one dataset. With...
Data science projects mostly start with a business question or problem. A problem triggers an initiation phase, in which a set of possible solutions is defined, and initial feasibility is assessed. Initial data collection or an exploratory data analysis of available data is done to see what is...
python data-science pandas data-visualization data-analysis microsoft-for-beginners Updated Oct 15, 2024 Jupyter Notebook donnemartin/data-science-ipython-notebooks Star 27.7k Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduc...
common data analysis and machine learning tasks using Python python data-science data-scientists python-tutorial Updated Apr 3, 2024 Python PizzaDeDados/datascience-pizza Star 2.4k 🍕 Repository for gathering information about study materials in data analysis and areas...
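This is also one of the differences between real-world work and Kaggle competitions or course projects. Although designing the table structure is not difficult, we need to take care, based on the data content...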
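5. Time Series Analysis Dataset Time series analysis is a commonly used technique in data science with wide industrial applications: weather forecasting, sales forecasting, trend forecasting, and so on. This dataset is designed specifically for time series forecasting. Problem: predict future traffic conditions. Data: https://datahack.analyticsvidhya.com/contest/practice-problem-time-series-2/ ...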