Home Value Trends Analysis

This is a project provided by Codecademy. The objective is to help a company make more informed decisions on real estate investments. First I started analyzing the data on median estimated values of single family homes by zip codes from the past two decades.

1. How many distinct zip codes are in this dataset?
SELECT COUNT(DISTINCT zip_code) FROM project_1; --15452--

2. How many zip codes are from each state?
SELECT state, COUNT(DISTINCT zip_code) FROM project_1 GROUP BY state;

3. What range of years are represented in the data?
SELECT DISTINCT SUBSTR(date,1,4)  FROM project_1;

4. Using the most recent month of data available, what is the range of estimated home values across the nation?
SELECT date, value FROM project_1 WHERE date = (SELECT MAX(date) FROM project_1);

5. Using the most recent month of data available, which states have the highest average home values? How about the lowest?
SELECT date, state, AVG(value) FROM project_1 WHERE date = (SELECT max(date) FROM project_1) GROUP BY 2 ORDER BY 3 DESC;

6. Which states have the highest/lowest average home values for the year of 2017? What about for the year of 2007? 1997?
SELECT date, state, AVG(value) FROM project_1 WHERE substr(date,1,4) = "2017" GROUP BY 2 ORDER BY 3 DESC; --2017
SELECT date, state, AVG(value) FROM project_1 WHERE substr(date,1,4) = "2007" GROUP BY 2 ORDER BY 3 DESC; --2007
SELECT date, state, AVG(value) FROM project_1 WHERE substr(date,1,4) = "1997" GROUP BY 2 ORDER BY 3 DESC; --1997