Ames Iowa analysis

Pre-processing, exploratory data analysis (EDA), and statistical analysis of the Ames, Iowa Housing Dataset using Python, pandas, NumPy, and matplotlib.

This project is a full exploratory and statistical analysis of the Ames Housing dataset, which is widely used in data science and machine learning. It demonstrates end-to-end data wrangling, visualization, and statistical modeling.

Project Workflow

Data Cleaning

  • Handling missing values and outliers.
  • Encoding categorical variables for analysis.
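
The cleaning steps above could look roughly like the sketch below. It is a minimal illustration, not the project's actual code: the tiny frame is synthetic, the column names (`LotFrontage`, `GrLivArea`, `ExterQual`, `Neighborhood`) are assumed to follow the commonly distributed Ames CSV, and median/mode imputation with a 1.5 × IQR outlier rule is just one reasonable choice.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Ames data; values and column names are assumptions.
df = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 80.0, 60.0, np.nan, 70.0],
    "GrLivArea": [1710, 1262, 1786, 1717, 5642, 1362],
    "Neighborhood": ["CollgCr", "Veenker", "CollgCr", "Crawfor", "Edwards", "Veenker"],
    "ExterQual": ["Gd", "TA", "Gd", "Gd", "Ex", None],
})

# Missing numeric values: fill with the column median.
df["LotFrontage"] = df["LotFrontage"].fillna(df["LotFrontage"].median())

# Missing categorical values: fill with the most frequent level.
df["ExterQual"] = df["ExterQual"].fillna(df["ExterQual"].mode()[0])

# Outliers: keep rows whose living area lies within 1.5 * IQR of the quartiles.
q1, q3 = df["GrLivArea"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["GrLivArea"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = df[in_range].copy()

# Ordinal encoding for a quality rating (the level order is an assumption).
quality_order = {"Po": 1, "Fa": 2, "TA": 3, "Gd": 4, "Ex": 5}
cleaned["ExterQualCode"] = cleaned["ExterQual"].map(quality_order)

# One-hot encoding for a nominal variable such as the neighborhood.
encoded = pd.get_dummies(cleaned, columns=["Neighborhood"], prefix="Nbhd")
```

Whether to drop or cap outliers depends on the analysis; dropping is shown here only because it is the shortest to read.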

Exploratory Data Analysis (EDA)

  • Univariate, bivariate, and multivariate plots.
  • Distribution of sale prices, neighborhood comparisons, and quality ratings.
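
As a rough sketch of the plots listed above, the snippet below draws one univariate, one bivariate, and one multivariate view with matplotlib. The data is randomly generated, the neighborhood labels are placeholders, and the column names (`SalePrice`, `Neighborhood`, `OverallQual`) are assumed rather than taken from the project.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data standing in for the real dataset (all values are made up).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Neighborhood": rng.choice(["NAmes", "CollgCr", "OldTown"], size=n),
    "OverallQual": rng.integers(1, 11, size=n),
})
df["SalePrice"] = np.exp(11.0 + 0.1 * df["OverallQual"] + rng.normal(0, 0.2, size=n))

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Univariate: distribution of sale prices.
axes[0].hist(df["SalePrice"], bins=30)
axes[0].set_title("Sale price distribution")
axes[0].set_xlabel("SalePrice")

# Bivariate: sale price by neighborhood as side-by-side box plots.
names, samples = [], []
for name, prices in df.groupby("Neighborhood")["SalePrice"]:
    names.append(name)
    samples.append(prices.to_numpy())
axes[1].boxplot(samples)
axes[1].set_xticks(range(1, len(names) + 1))
axes[1].set_xticklabels(names)
axes[1].set_title("Sale price by neighborhood")

# Multivariate: price against quality rating, colored by neighborhood.
for name, group in df.groupby("Neighborhood"):
    axes[2].scatter(group["OverallQual"], group["SalePrice"], s=10, label=name)
axes[2].set_xlabel("OverallQual")
axes[2].set_ylabel("SalePrice")
axes[2].set_title("Price vs. quality")
axes[2].legend()

fig.tight_layout()
```

In a notebook, the `matplotlib.use("Agg")` line can be dropped and the figure displayed with `plt.show()`.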

Feature Engineering

  • Transformations (e.g., log of sale price).
  • Derived features to improve model performance.
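
A minimal sketch of the two ideas above: a log transform of the sale price and a couple of derived features. The rows are invented, and the derived features (`TotalSF`, `HouseAge`) are illustrative assumptions; the project may define different ones.

```python
import numpy as np
import pandas as pd

# Invented rows; column names assumed to match the common Ames CSV.
df = pd.DataFrame({
    "SalePrice": [200000, 150000, 320000, 120000],
    "TotalBsmtSF": [800, 1000, 1200, 600],
    "1stFlrSF": [900, 1000, 1300, 700],
    "2ndFlrSF": [800, 0, 900, 500],
    "YearBuilt": [2000, 1975, 2005, 1920],
    "YrSold": [2008, 2007, 2009, 2006],
})

# Log transform: compresses the long right tail typical of price data.
df["LogSalePrice"] = np.log(df["SalePrice"])

# Derived feature (hypothetical): total square footage across basement and floors.
df["TotalSF"] = df["TotalBsmtSF"] + df["1stFlrSF"] + df["2ndFlrSF"]

# Derived feature (hypothetical): age of the house at the time of sale.
df["HouseAge"] = df["YrSold"] - df["YearBuilt"]
```

Predictions made on the log scale can be mapped back to prices with `np.exp`.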

Statistical Testing

  • Hypothesis tests confirming neighborhood effects on prices.
  • Regression analysis to identify key predictors.
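
The two analyses above might be sketched as follows, using only NumPy and pandas. This is an illustration under stated assumptions, not the project's method: the data is simulated with a built-in neighborhood effect, the hypothesis test is a one-way ANOVA F statistic with a permutation p-value (swapped in to avoid extra dependencies), and the regression is ordinary least squares via `np.linalg.lstsq`.

```python
import numpy as np
import pandas as pd

# Simulated data with a known neighborhood effect (all values are made up).
rng = np.random.default_rng(42)
n = 240
df = pd.DataFrame({
    "Neighborhood": np.repeat(["A", "B", "C"], n // 3),
    "GrLivArea": rng.normal(1500, 400, size=n),
    "OverallQual": rng.integers(3, 10, size=n).astype(float),
})
offset = df["Neighborhood"].map({"A": 0.0, "B": 0.15, "C": 0.30})
df["LogSalePrice"] = (
    10.5 + offset + 0.0003 * df["GrLivArea"] + 0.08 * df["OverallQual"]
    + rng.normal(0, 0.1, size=n)
)

def anova_f(values, codes):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    grand_mean = values.mean()
    groups = [values[codes == c] for c in np.unique(codes)]
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

y = df["LogSalePrice"].to_numpy()
codes = pd.factorize(df["Neighborhood"])[0]  # A -> 0, B -> 1, C -> 2

# Hypothesis test: do mean log prices differ across neighborhoods?
f_obs = anova_f(y, codes)
n_perm = 500
f_perm = np.array([anova_f(y, rng.permutation(codes)) for _ in range(n_perm)])
p_value = (1 + np.sum(f_perm >= f_obs)) / (1 + n_perm)

# Regression: OLS with an intercept, two numeric predictors, and
# neighborhood dummies (A is the reference level).
X = np.column_stack([
    np.ones(n), df["GrLivArea"], df["OverallQual"], codes == 1, codes == 2,
]).astype(float)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta
r_squared = 1 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()

coef = pd.Series(
    beta, index=["Intercept", "GrLivArea", "OverallQual", "Nbhd_B", "Nbhd_C"]
)
```

A small `p_value` indicates that the neighborhood grouping explains more price variation than random relabeling would, and `coef` gives each predictor's estimated effect on the log price. Libraries such as SciPy or statsmodels offer the same tests with analytic p-values and standard errors.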