PROJECT DEMO: Top 5 Zip Codes

Top Five Zip Codes is a housing market prediction model that uses seasonal ARIMA time-series analysis and GridsearchCV to recommend the top 5 zip codes for purchasing a single-family home in Westchester, New York.

Interactive Dashboard:

To see the dashboard in action, go to RealtyRabbit.

Business Case

Recommend top 5 zipcodes for client interested in buying a single-family home in Westchester County, NY within the next two years.

Required Parameters (from Client)

  1. Timeframe: Purchase of new home would take place within the next two years
  2. Budget: Cost to buy not to exceed 800,000 USD maximum
  3. Location: Zip Code search radius restricted to Westchester County, New York

Ideal Parameters (from Client)

Success Criteria for Model

  1. Maximum ROI (highest forecasted home value increase for lowest upfront cost)
  2. Confidence Intervals
  3. Risk mitigation: financial stability of homeowners and homes based on historic data

Objective

Make forecast predictions for zip codes and their mean home values using Seasonal ARIMA time-series analysis.

Commute Times

Since commute time to Grand Central Station is part of the client’s required criteria, I first had to look up which towns/zip codes were on which train lines. Grand Central has 3 main lines on Metro North Railroad: Hudson, Harlem, and New Haven. The first question I was interested in answering was if the average home prices for the zip codes that fall under these geographic sections display any trends.

Mean Values by Train Line

Mean Values by Train Line Area Plot

New Haven line

New Haven Line Zip Code Timeseries

NOTE

Note that this does not include zip codes in Connecticut (which the New Haven line covers) since the client is only interested in towns in New York state.

Harlem Line

Harlem Line Zip Code Timeseries

Hudson Line

Hudson Line Zip Code Timeseries

TOP FIVE ZIP CODES

The top five I selected based on the above criteria were 10549, 10573, 10604, 10605, 10706:

TOP FIVE RECOMMENDATIONS

The top five results that fit the required criteria were 10549, 10573, 10604, 10605, 10706:

10549

timeseries 10549

10573

timeseries 10573

10604

timeseries 10604

10605

timeseries 10605

10706

timeseries 10706

Top Five Zip Codes in Westchester County

top five zipcodes timeseries

FUTURE WORK

My client was keen on accounting for public school districts, which upon initial inspection would have required a great deal of manual plug and play. However, if there is an API or some other way to scrape this data from the web, I would definitely incorporate school districts as an exogenous factor for the sake of making recommendations for a client. Someone might actually not prefer schools with a rating of 10 as these tend to be predominantly all-white. My client in particular was looking for decent school districts below the 10-mark because she wants her child to grow up in a more ethnically-diverse community. Being able to account for such preferences would be part of the future work of this project.

Contact

You can reach me at rukeine@gmail.com.

License

This project uses the following license: MIT License.

                       
           /\    _       _                           _                      *  
/\_/\_____/  \__| |_____| |_________________________| |___________________*___
[===]    / /\ \ | |  _  |  _  | _  \/ __/ -__|  \| \_  _/ _  \ \_/ | * _/| | |
 \./    /_/  \_\|_|  ___|_| |_|__/\_\ \ \____|_|\__| \__/__/\_\___/|_|\_\|_|_|
                  | /             |___/        
                  |/   
Your Browser Doesn't Support Canvas, Please Download Chrome or compatible browser.