1. Identifying Key Factors in Gentrification
Machine learning was used to identify the factors most strongly associated with gentrification in neighborhoods, measured through changes in housing prices from 2000 to 2022. A Random Forest classifier was used to determine which factors, such as crime rates, education levels, and poverty, are most closely associated with those price changes. Neighborhoods where housing prices increased by 50% or more were considered to have experienced gentrification.
The model focused on identifying the relationships between demographic and socioeconomic factors and housing price changes. The results indicated that crime rate was the most important predictor of gentrification. This is consistent with reductions in crime playing a role in neighborhood transformation, although feature importance alone shows neither the direction of the relationship nor causation.
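The labeling rule and the classification step described above can be sketched as follows. This is a minimal illustration, not the analysis's actual code: the column names, the synthetic data, and the hyperparameters are all assumptions made for the example, and only the 50% threshold and the choice of a Random Forest classifier come from the text.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def label_gentrified(price_2000, price_2022, threshold=0.50):
    """Return True where prices rose by at least `threshold` (50% by default)."""
    change = (price_2022 - price_2000) / price_2000
    return change >= threshold


# Synthetic stand-in data; every column name here is hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "crime_rate": rng.uniform(0, 100, n),
    "pct_no_hs_diploma": rng.uniform(0, 40, n),
    "poverty_rate": rng.uniform(0, 50, n),
    "population_density": rng.uniform(100, 20000, n),
    "price_2000": rng.uniform(80_000, 400_000, n),
})
df["price_2022"] = df["price_2000"] * rng.uniform(1.0, 2.2, n)
df["gentrified"] = label_gentrified(df["price_2000"], df["price_2022"])

# Prices are left out of the features because the label is derived from them.
features = ["crime_rate", "pct_no_hs_diploma", "poverty_rate", "population_density"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["gentrified"],
    test_size=0.25, random_state=0, stratify=df["gentrified"],
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # share of held-out neighborhoods classified correctly
print(f"Held-out accuracy: {accuracy:.2f}")
```

Because the synthetic features are unrelated to the synthetic prices, the accuracy printed here is not meaningful; the sketch only shows the shape of the workflow.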
2. Feature Importance
Understanding the most important features influencing gentrification was a key part of the analysis. The Random Forest model generated feature importance scores, which ranked factors based on their impact on predicting gentrification.
Among the various factors, crime rate emerged as the most influential, highlighting its central role in neighborhood transformation.
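As a rough sketch of how such a ranking can be produced, scikit-learn's `RandomForestClassifier` exposes impurity-based scores through its `feature_importances_` attribute. The data below is synthetic, and its label is deliberately tied to a hypothetical `crime_rate` column so that the ranking has something to find; it says nothing about the real dataset. The helper `rank_features` is an illustrative name, not part of any library.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def rank_features(model, feature_names):
    """Pair each feature with its importance score, highest first."""
    scores = pd.Series(model.feature_importances_, index=feature_names)
    return scores.sort_values(ascending=False)


# Synthetic data with hypothetical column names.
rng = np.random.default_rng(42)
n = 400
X = pd.DataFrame({
    "crime_rate": rng.uniform(0, 100, n),
    "poverty_rate": rng.uniform(0, 50, n),
    "pct_no_hs_diploma": rng.uniform(0, 40, n),
    "population_density": rng.uniform(100, 20000, n),
})
# Assumption for illustration only: the label depends on crime_rate alone.
y = X["crime_rate"] < 40

model = RandomForestClassifier(n_estimators=200, random_state=42).fit(X, y)
ranking = rank_features(model, X.columns)
print(ranking)
```

Impurity-based scores sum to 1 and indicate how much each feature contributed to the model's splits, not whether higher or lower values favor gentrification. Where correlated features are a concern, `sklearn.inspection.permutation_importance` is a commonly used cross-check.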

The importance of different factors in predicting gentrification.
3. Exploring Relationships with a Heatmap

Visual representation of the relationships between various factors.
A heatmap was created to visually represent the correlations between various factors, helping to identify key relationships and potential multicollinearity. This tool provided insights into how different factors, such as housing appreciation and poverty rate, are interconnected.
Notably, the strongest correlation was found between housing appreciation (change in housing prices) and gentrification. This is expected, because gentrification was defined directly from housing appreciation (an increase of 50% or more). Additionally, a high correlation was observed between poverty rates and education levels (percentage of the population without a high school diploma), suggesting that these two factors often overlap in neighborhoods.
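A correlation heatmap of this kind can be sketched with pandas and matplotlib as shown below. The data is synthetic, with the hypothetical `pct_no_hs_diploma` column constructed to track `poverty_rate` so that one strongly correlated pair appears. The helper names and the 0.8 cutoff for flagging possible multicollinearity are assumptions for illustration, not values taken from the analysis.

```python
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def high_corr_pairs(corr, threshold=0.8):
    """List feature pairs whose absolute correlation is at least `threshold`."""
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


def plot_heatmap(corr, path=None):
    """Draw an annotated heatmap of a correlation matrix; save it if `path` is given."""
    fig, ax = plt.subplots(figsize=(6, 5))
    im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=45, ha="right")
    ax.set_yticks(range(len(corr.index)))
    ax.set_yticklabels(corr.index)
    for i in range(corr.shape[0]):
        for j in range(corr.shape[1]):
            ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")
    fig.colorbar(im, ax=ax, label="Pearson correlation")
    fig.tight_layout()
    if path is not None:
        fig.savefig(path)
    return fig


# Synthetic data with hypothetical column names.
rng = np.random.default_rng(1)
n = 500
poverty = rng.uniform(0, 50, n)
df = pd.DataFrame({
    "poverty_rate": poverty,
    "pct_no_hs_diploma": 0.6 * poverty + rng.normal(0, 4, n),
    "crime_rate": rng.uniform(0, 100, n),
    "population_density": rng.uniform(100, 20000, n),
})

corr = df.corr()  # pairwise Pearson correlations
flagged = high_corr_pairs(corr)
fig = plot_heatmap(corr)  # pass path="correlation_heatmap.png" to write a file
print(flagged)
```

Flagged pairs are candidates for closer inspection: when two features carry nearly the same information, a Random Forest may split importance between them, which can make each look less influential than it is.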
4. Boxplots: Comparing Gentrified vs. Non-Gentrified Neighborhoods
Boxplots were used to compare the distribution of key factors, such as education levels, poverty rates, and population density, between gentrified and non-gentrified neighborhoods. They show clear differences for some factors and little difference for others.
For example, neighborhoods that experienced gentrification typically had a smaller share of residents aged 25+ without a high school diploma (that is, higher educational attainment), while non-gentrified neighborhoods showed higher values of this measure.
Population density, however, showed little difference between the two groups, suggesting that it is not a significant differentiator for gentrification.

Boxplot 0 represents gentrified neighborhoods.
Boxplot 1 represents non-gentrified neighborhoods.
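A group comparison like this can be sketched with pandas and matplotlib. The small table below is made-up illustrative data, not values from the analysis, and the groups are labeled by name rather than by 0 and 1 to keep the example readable. The helper names `group_summary` and `plot_boxplots` are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up illustrative values; column names are hypothetical.
df = pd.DataFrame({
    "group": ["gentrified"] * 4 + ["non-gentrified"] * 4,
    "pct_no_hs_diploma": [10, 12, 14, 16, 20, 22, 24, 26],
    "population_density": [5000, 5200, 4800, 5100, 5050, 4900, 5150, 4950],
})


def group_summary(data, column, group_col="group"):
    """Median and quartiles of `column` per group: the numbers a boxplot draws."""
    grouped = data.groupby(group_col)[column]
    return pd.DataFrame({
        "q1": grouped.quantile(0.25),
        "median": grouped.median(),
        "q3": grouped.quantile(0.75),
    })


def plot_boxplots(data, columns, group_col="group"):
    """Draw one panel per column, with one box per group in each panel."""
    groups = sorted(data[group_col].unique())
    fig, axes = plt.subplots(1, len(columns), figsize=(4 * len(columns), 4), squeeze=False)
    for ax, column in zip(axes[0], columns):
        samples = [data.loc[data[group_col] == g, column].to_numpy() for g in groups]
        ax.boxplot(samples)
        ax.set_xticklabels(groups)
        ax.set_title(column)
    fig.tight_layout()
    return fig


summary = group_summary(df, "pct_no_hs_diploma")
fig = plot_boxplots(df, ["pct_no_hs_diploma", "population_density"])
print(summary)
```

Reading the quartile table alongside the plot makes it easier to tell a real shift between groups from overlapping boxes that differ only slightly.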
Key Findings
The analysis revealed that crime rate was the most important factor in predicting gentrification. Feature importance measures predictive strength rather than direction, so this result alone does not show whether higher or lower crime rates make gentrification more likely. Additionally, poverty rate and education levels (e.g., the percentage of the population without a high school diploma) were strongly linked to gentrification trends and often occurred together in the same neighborhoods.
Interestingly, population density showed little correlation with gentrification, meaning that simply having more people in an area does not by itself predict whether it will gentrify.
Conclusion
Overall, the analysis highlights crime and socioeconomic factors like poverty and education levels as the factors most closely associated with gentrification, defined here by rising housing prices. Understanding these patterns can help urban planners, policymakers, and community leaders address the challenges of gentrification and work toward creating more equitable and sustainable neighborhoods.