We sourced 2020 Census data from Census.gov and filtered it to show the number of households, families, and non-family households in each zip code. From this dataset we display the median income, mean income, and percent of income allocated to housing costs for each type of household.
For our analysis we built an unsupervised K-Means machine learning model. The dataset fed into the model includes the median income, mean income, and percent of income allocated to housing costs for households, families, and non-family households. Additionally, we counted the amenities found in each zip code, including hospitals, schools, markets, parks, and public transportation.
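As a rough illustration of the clustering step, the sketch below standardizes the features and runs a plain-Python version of Lloyd's algorithm, the usual K-Means procedure. It is not the project's code: the function names, the choice of `k=2`, and the two-feature input (median income and housing-cost share for six zip codes taken from the tables on this page) are assumptions made to keep the example small. The actual model used more features, and a library implementation would be the more typical choice in practice.

```python
import math
import random

def standardize(rows):
    """Scale each column to zero mean and unit variance so that income
    (in dollars) does not dominate percentages in distance computations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    move each centroid to the mean of its points, until labels stop changing."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# (median income in dollars, percent of income allocated to housing cost)
zip_codes = ["91910", "91911", "92105", "92008", "92108", "92120"]
features = [
    [70283, 46.8],
    [65074, 50.9],
    [48072, 47.2],
    [86046, 25.8],
    [80572, 26.8],
    [102597, 31.4],
]

scaled = standardize(features)
labels, centroids = kmeans(scaled, k=2)
for zip_code, label in zip(zip_codes, labels):
    print(zip_code, "-> cluster", label)
```

The cluster numbers produced this way are arbitrary identifiers, so they will not line up with the Class 0 and Class 4 labels used on this page.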
The zip codes that the K-Means cluster model assigned to Class 4 should be the first priority when considering where to place low-income affordable housing units. These areas are highlighted in the map on the right, and information on them can be found in the table below.
Zip Code | Median Income (dollars) | Percent of Income allocated to Housing Cost | Class |
---|---|---|---|
91910 | 70283.0 | 46.8 | 4 |
91911 | 65074.0 | 50.9 | 4 |
91950 | 48359.0 | 44.8 | 4 |
91977 | 71396.0 | 44.8 | 4 |
92020 | 61830.0 | 43.0 | 4 |
92021 | 60510.0 | 43.7 | 4 |
92105 | 48072.0 | 47.2 | 4 |
92113 | 43958.0 | 41.9 | 4 |
92114 | 73090.0 | 51.2 | 4 |
92115 | 56137.0 | 36.2 | 4 |
92126 | 99376.0 | 33.0 | 4 |
92154 | 70846.0 | 45.2 | 4 |
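
One simple way to read the Class 4 table above is to sort it by the share of income that goes to housing. The sketch below does that with the table's own rows. The ordering criterion and the `rank_by_burden` helper are illustrative assumptions, not part of the original analysis.

```python
# Rows copied from the Class 4 table above:
# (zip code, median income in dollars, percent of income allocated to housing cost)
class_4 = [
    ("91910", 70283, 46.8),
    ("91911", 65074, 50.9),
    ("91950", 48359, 44.8),
    ("91977", 71396, 44.8),
    ("92020", 61830, 43.0),
    ("92021", 60510, 43.7),
    ("92105", 48072, 47.2),
    ("92113", 43958, 41.9),
    ("92114", 73090, 51.2),
    ("92115", 56137, 36.2),
    ("92126", 99376, 33.0),
    ("92154", 70846, 45.2),
]

def rank_by_burden(rows):
    """Sort rows so the highest housing-cost share comes first."""
    return sorted(rows, key=lambda r: r[2], reverse=True)

for zip_code, income, pct in rank_by_burden(class_4)[:3]:
    print(f"{zip_code}: {pct}% of income to housing, median income ${income:,}")
```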
The zip codes that the K-Means cluster model assigned to Class 0 would be the second priority when considering where to place low-income affordable housing units. These areas are highlighted in the map on the left, and information on them can be found in the table below.
Zip Code | Median Income (dollars) | Percent of Income allocated to Housing Cost | Class |
---|---|---|---|
91932 | 59795.0 | 39.6 | 0 |
91941 | 94111.0 | 39.4 | 0 |
91942 | 66551.0 | 34.1 | 0 |
91945 | 67236.0 | 44.1 | 0 |
92008 | 86046.0 | 25.8 | 0 |
92019 | 85067.0 | 40.1 | 0 |
92025 | 56866.0 | 40.4 | 0 |
92026 | 76534.0 | 36.3 | 0 |
92027 | 70009.0 | 43.5 | 0 |
92028 | 80775.0 | 37.1 | 0 |
92040 | 83692.0 | 40.3 | 0 |
92054 | 63355.0 | 31.2 | 0 |
92056 | 85047.0 | 34.3 | 0 |
92057 | 81339.0 | 33.8 | 0 |
92058 | 57213.0 | 31.7 | 0 |
92065 | 100645.0 | 34.1 | 0 |
92069 | 77618.0 | 35.7 | 0 |
92071 | 85751.0 | 36.2 | 0 |
92078 | 91564.0 | 33.6 | 0 |
92081 | 80584.0 | 30.3 | 0 |
92083 | 68551.0 | 32.1 | 0 |
92084 | 77970.0 | 39.8 | 0 |
92102 | 54862.0 | 35.3 | 0 |
92107 | 84190.0 | 35.0 | 0 |
92108 | 80572.0 | 26.8 | 0 |
92110 | 77579.0 | 33.7 | 0 |
92111 | 77249.0 | 40.8 | 0 |
92117 | 91347.0 | 34.9 | 0 |
92120 | 102597.0 | 31.4 | 0 |
92123 | 90602.0 | 35.6 | 0 |
92124 | 94485.0 | 30.3 | 0 |
92139 | 75576.0 | 52.7 | 0 |
92173 | 48967.0 | 44.1 | 0 |
92672 | 89029.0 | 35.7 | 0 |
Through the K-Means machine learning model, we identified zip codes that need more affordable housing.
Next, we asked: “In which of these areas would the people living in that housing benefit the most?”
To answer this question, we investigated the quantity and quality of amenities in these areas. Using Google Places API searches, we counted the operational supermarkets and public parks in each zip code. We gathered public transportation data from the San Diego MTS website and found the number of public hospitals in each zip code using ushospitalfinder.com. Lastly, using data from greatschools.com, we identified which of these zip codes have the best-ranked schools.
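Counting "operational" places can be sketched as a filter over parsed search results. The example below assumes each result is a dictionary carrying the `business_status` field that Google Places search results include, and it runs on a made-up sample list. The place names, the `count_operational` helper, and the omission of the actual request (which needs an API key and paging through results) are all assumptions for illustration.

```python
def count_operational(places):
    """Count results whose business_status marks them as currently operating."""
    return sum(1 for p in places if p.get("business_status") == "OPERATIONAL")

# Hypothetical, hand-written sample shaped like parsed search results.
sample_results = [
    {"name": "Example Market A", "business_status": "OPERATIONAL"},
    {"name": "Example Market B", "business_status": "CLOSED_PERMANENTLY"},
    {"name": "Example Market C", "business_status": "OPERATIONAL"},
    {"name": "Example Market D"},  # status missing, so not counted
]

print(count_operational(sample_results))
```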
Finally, we compared the accessibility data with the K-Means model results to identify three areas where affordable housing is needed and where it would bring the greatest benefit to the people who would live there.
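The comparison can be sketched as a two-step ranking: keep only the zip codes in the priority class, then order them by an amenity score. The scoring rule below (min-max normalizing each amenity type and summing) is an assumption chosen for simplicity, not the method used in the analysis. The zip labels and amenity counts are made-up placeholders, not project data.

```python
def amenity_scores(amenities):
    """Min-max normalize each amenity type across zip codes, then sum."""
    kinds = sorted({kind for counts in amenities.values() for kind in counts})
    scores = {zip_code: 0.0 for zip_code in amenities}
    for kind in kinds:
        values = [counts.get(kind, 0) for counts in amenities.values()]
        low, high = min(values), max(values)
        span = (high - low) or 1  # avoid dividing by zero when all are equal
        for zip_code, counts in amenities.items():
            scores[zip_code] += (counts.get(kind, 0) - low) / span
    return scores

def top_areas(classes, amenities, priority_class=4, n=3):
    """Return the n best-served zip codes within the priority class."""
    candidates = {
        z: counts for z, counts in amenities.items()
        if classes.get(z) == priority_class
    }
    scores = amenity_scores(candidates)
    return sorted(scores, key=lambda z: scores[z], reverse=True)[:n]

# Placeholder inputs: cluster class per zip code and amenity counts.
classes = {"ZIP_A": 4, "ZIP_B": 4, "ZIP_C": 4, "ZIP_D": 4, "ZIP_E": 0}
amenities = {
    "ZIP_A": {"hospitals": 1, "schools": 5, "markets": 8, "parks": 3, "transit_stops": 20},
    "ZIP_B": {"hospitals": 0, "schools": 2, "markets": 3, "parks": 1, "transit_stops": 5},
    "ZIP_C": {"hospitals": 2, "schools": 6, "markets": 10, "parks": 4, "transit_stops": 30},
    "ZIP_D": {"hospitals": 1, "schools": 4, "markets": 6, "parks": 2, "transit_stops": 15},
    "ZIP_E": {"hospitals": 3, "schools": 9, "markets": 12, "parks": 6, "transit_stops": 40},
}

print(top_areas(classes, amenities))
```

Normalizing each amenity type separately keeps a large count, such as transit stops, from outweighing a small one, such as hospitals. A weighted sum would be a natural variation if some amenities matter more than others.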