This project is a full exploratory and statistical analysis of the Ames Housing dataset, a well-known dataset in data science and machine learning. It demonstrates end-to-end data wrangling, visualization, and statistical modeling.
Project Workflow
Data Cleaning
- Handling missing values and outliers.
- Encoding categorical variables for analysis.
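
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy values are made up, the column names (`LotFrontage`, `GrLivArea`, `Neighborhood`, `SalePrice`) are assumed to follow the common Ames naming, and the median imputation, the `area_cap` cutoff, and the `clean` helper are hypothetical choices.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Ames data; all values are made up for illustration.
df = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 80.0, 60.0, np.nan, 70.0],
    "GrLivArea": [1710, 1262, 1786, 1717, 5642, 1362],
    "Neighborhood": ["CollgCr", "Veenker", "CollgCr", "Crawfor", "Edwards", "Veenker"],
    "SalePrice": [208500, 181500, 223500, 140000, 160000, 143000],
})

def clean(frame, area_cap=4000):
    """Hypothetical cleaning helper: impute, drop outliers, encode."""
    out = frame.copy()
    # Fill missing lot frontage with the column median (an assumed strategy).
    out["LotFrontage"] = out["LotFrontage"].fillna(out["LotFrontage"].median())
    # Drop rows whose living area exceeds the assumed outlier cutoff.
    out = out[out["GrLivArea"] <= area_cap].reset_index(drop=True)
    # One-hot encode the neighborhood so it can enter a numeric model.
    out = pd.get_dummies(out, columns=["Neighborhood"], prefix="Nbhd", dtype=int)
    return out

cleaned = clean(df)
print(cleaned)
```

Median imputation and a fixed cutoff are only one reasonable option; the right thresholds depend on the variable and should be checked against the data.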
Exploratory Data Analysis (EDA)
- Univariate, bivariate, and multivariate plots.
- Distribution of sale prices, neighborhood comparisons, and quality ratings.
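
The numeric summaries that typically sit behind such plots might look like the sketch below. The data are made up, the neighborhood labels `A`, `B`, `C` are placeholders, and the column names (`OverallQual`, `GrLivArea`, `SalePrice`) are assumed rather than taken from the project.

```python
import pandas as pd

# Made-up rows standing in for the Ames data.
df = pd.DataFrame({
    "Neighborhood": ["A", "A", "B", "B", "C", "C"],
    "OverallQual": [5, 6, 7, 8, 4, 5],
    "GrLivArea": [1200, 1400, 1800, 2100, 900, 1100],
    "SalePrice": [150000, 170000, 240000, 300000, 110000, 130000],
})

# Univariate: summary statistics of the sale price distribution.
price_summary = df["SalePrice"].describe()

# Bivariate: median sale price per neighborhood, highest first.
median_by_nbhd = (
    df.groupby("Neighborhood")["SalePrice"].median().sort_values(ascending=False)
)

# Multivariate: correlation of each numeric feature with sale price.
corr_with_price = (
    df[["OverallQual", "GrLivArea", "SalePrice"]].corr()["SalePrice"].drop("SalePrice")
)

print(price_summary)
print(median_by_nbhd)
print(corr_with_price)
```

Each of these tables maps onto a plot: a histogram for the price summary, a box plot or bar chart per neighborhood, and a heatmap for the correlations.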
Feature Engineering
- Transformations (e.g., log of sale price).
- Derived features to improve model performance.
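
As a rough sketch of these transformations, assuming the usual Ames column names: the log of sale price is named in the list above, while `TotalSF` and `HouseAge` are hypothetical examples of derived features, not necessarily the ones this project uses. The values are made up.

```python
import numpy as np
import pandas as pd

# Made-up rows; column names are assumed to follow the Ames convention.
df = pd.DataFrame({
    "1stFlrSF": [856, 1262, 920],
    "2ndFlrSF": [854, 0, 866],
    "TotalBsmtSF": [856, 1262, 920],
    "YearBuilt": [2003, 1976, 2001],
    "YrSold": [2008, 2007, 2008],
    "SalePrice": [208500, 181500, 223500],
})

features = df.copy()
# Log-transform the target to reduce right skew in sale prices.
features["LogSalePrice"] = np.log(features["SalePrice"])
# Hypothetical derived feature: total floor area across all levels.
features["TotalSF"] = (
    features["1stFlrSF"] + features["2ndFlrSF"] + features["TotalBsmtSF"]
)
# Hypothetical derived feature: age of the house at the time of sale.
features["HouseAge"] = features["YrSold"] - features["YearBuilt"]

print(features[["LogSalePrice", "TotalSF", "HouseAge"]])
```

Predictions made on the log scale need `np.exp` applied to return them to dollars.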
Statistical Testing
- Hypothesis tests confirming neighborhood effects on prices.
- Regression analysis to identify key predictors.
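
The two steps above could be sketched with NumPy alone, as below. The README does not say which tests were used, so this swaps in a permutation version of a one-way ANOVA for the neighborhood effect and an ordinary least squares fit for the regression. All numbers are made up, and the regression target is constructed from known coefficients so the fit can be checked.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up log sale prices for three placeholder neighborhoods.
groups = {
    "A": np.array([11.9, 12.0, 12.1, 12.0]),
    "B": np.array([12.4, 12.5, 12.6, 12.5]),
    "C": np.array([11.5, 11.6, 11.7, 11.6]),
}

def f_statistic(values, labels):
    """One-way ANOVA F: between-group over within-group mean square."""
    grand_mean = values.mean()
    uniq = np.unique(labels)
    ss_between = sum(
        (labels == g).sum() * (values[labels == g].mean() - grand_mean) ** 2
        for g in uniq
    )
    ss_within = sum(
        ((values[labels == g] - values[labels == g].mean()) ** 2).sum()
        for g in uniq
    )
    df_between = len(uniq) - 1
    df_within = len(values) - len(uniq)
    return (ss_between / df_between) / (ss_within / df_within)

values = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])

# Observed F, then its null distribution under shuffled neighborhood labels.
f_obs = f_statistic(values, labels)
n_perm = 2000
f_perm = np.array(
    [f_statistic(values, rng.permutation(labels)) for _ in range(n_perm)]
)
# Permutation p-value with the usual +1 correction.
p_value = (np.sum(f_perm >= f_obs) + 1) / (n_perm + 1)

# OLS: log price on quality and living area (in thousands of square feet).
qual = np.array([5, 6, 7, 8, 4, 5], dtype=float)
area = np.array([1.2, 1.4, 1.8, 2.1, 0.9, 1.1])
y = 10.0 + 0.1 * qual + 0.5 * area  # noise-free target with known coefficients
X = np.column_stack([np.ones_like(qual), qual, area])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print("F:", f_obs, "p:", p_value)
print("intercept, quality, area:", coef)
```

On real data a library routine that also reports standard errors and p-values for each coefficient would be more informative than a bare least squares fit.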