Effects of demolishing abandoned buildings on firearm violence: a moderation analysis using aerial imagery and deep learning
Jonathan Jay1, Jorrit de Jong2, Marcia P Jimenez3, Quynh Nguyen4, Jason Goldstick5

1 Department of Community Health Sciences, Boston University School of Public Health, Boston, Massachusetts, USA
2 Harvard University John F Kennedy School of Government, Cambridge, Massachusetts, USA
3 Boston University School of Public Health, Boston, Massachusetts, USA
4 Department of Epidemiology and Biostatistics, University of Maryland at College Park, College Park, Maryland, USA
5 Department of Emergency Medicine, University of Michigan, Ann Arbor, Michigan, USA

Correspondence to Jonathan Jay, Boston University School of Public Health, Boston, MA 02118, USA; jonjay{at}


Purpose Demolishing abandoned buildings has been found to reduce nearby firearm violence. However, these effects might vary within cities and across time scales. We aimed to identify potential moderators of the effects of demolitions on firearm violence using a novel approach that combined machine learning and aerial imagery.

Methods Outcomes were annual counts of fatal and non-fatal shootings in Rochester, New York, from 2000 to 2020. Treatment was demolitions conducted from 2009 to 2019. Units of analysis were 152×152 m grid squares. We used a difference-in-differences approach to test effects: (A) the year after each demolition and (B) as demolitions accumulated over time. As moderators, we used a built environment typology generated by extracting information from aerial imagery using convolutional neural networks, a deep learning approach, combined with k-means clustering. We stratified our main models by built environment cluster to test for moderation.

Results One demolition was associated with a 14% shootings reduction (incident rate ratio (IRR)=0.86, 95% CI 0.83 to 0.90, p<0.001) the following year. Demolitions were also associated with a long-term, 2% reduction in shootings per year for each cumulative demolition (IRR=0.98, 95% CI 0.95 to 1.00, p=0.02). In the stratified models, densely built areas with higher street connectivity displayed following-year effects, but not long-term effects. Areas with lower density and larger parcels displayed long-term effects but not following-year effects.

Conclusions The built environment might influence the magnitude and duration of the effects of demolitions on firearm violence. Policymakers may consider complementary programmes to help sustain these effects in high-density areas.

  • violence
  • environmental modification
  • statistical issues
  • urban
  • firearm

Data availability statement

Data are available in a public, open access repository. Rochester, NY, shootings data:, NY, demolitions data: State orthoimagery portal:


Improving the physical environment in resource-deprived neighbourhoods is an evidence-based approach for addressing firearm violence.1–4 This effect is thought to operate by promoting community ownership over shared spaces, which in turn reduces opportunities for firearm violence.2 Leveraging this effect may be a critical target for alleviating racial disparities in firearm injury,5 since residential segregation in US cities exposes communities of colour to higher levels of property abandonment.6 7 Fully leveraging this effect requires understanding when and where it matters most, which is the purpose of this work.

Several effective strategies have focused on remediating vacant and abandoned spaces. For example, converting unkempt vacant lots into green spaces reduced firearm violence in multiple prospective trials.1 2 Additionally, studies have found violence reductions from programmes that target abandoned buildings, which can conceal unsafe activities from the view of passersby.8 One such strategy uses city ordinances to require property owners to improve the appearance of vacant buildings and secure their doors and windows, an approach that has reduced firearm assaults in Philadelphia, Pennsylvania.4 9

For the most physically deteriorated buildings, demolition may be the only feasible remediation strategy. Historically, large-scale demolitions were used to clear paths for highways and other developments, often targeting marginalised communities.10 However, demolitions targeting dilapidated, abandoned buildings can be necessary for health and safety. Recent studies have found that demolitions can also reduce violence, including firearm violence. In Buffalo, New York, Wheeler and colleagues11 found that demolitions were associated with violence reductions up to several blocks away, and reported indications that cumulative demolitions reduced firearm violence at the census tract level. In Detroit, Michigan, Jay and colleagues3 found that cumulative demolitions within census block groups were associated with reduced firearm assaults.

Researchers have begun to examine how other factors may moderate treatment effects, that is, influence the effectiveness of remediating a particular parcel. For example, an analysis of vacant lot remediation from Philadelphia found that treatment was less effective near train stations and alcohol outlets.12 The authors proposed that these factors increased foot traffic near the newly remediated sites, hindering neighbours’ ability to exert collective ownership over the public spaces. Similar mechanisms (ie, collective efficacy, civic engagement and signalling that a space is cared for) may influence demolition effects as well. However, the mechanisms that moderate demolition effects may be even more complex, since the impact of demolishing an abandoned building may depend on: (A) how the building influences nearby firearm violence and (B) how a vacant lot, newly created by demolishing the building, will influence future firearm violence. For these reasons, moderators of short-term effects could differ from moderators of long-term effects.

In this study of Rochester, New York, we used a novel approach to identify built environment influences that might moderate demolition effects. We used deep learning to extract information from high-resolution aerial imagery, then cluster analysis to classify spaces according to built environment types. This approach incorporates dimensions of urban design that are visible in the imagery, including building density, street connectivity and land use. This data-mining approach overcomes a limitation of prior work, which required prespecifying the spatial features of interest,13 and using historical aerial imagery enabled us to observe the built environment immediately preintervention. We could then estimate the extent to which demolition effects varied across built environment types and over differing time scales.

Data sources

Rochester is a city of approximately 200 000 residents in upstate New York. The city exhibits high rates of poverty and racial segregation, each of which contributes to firearm violence14 and disinvestment in the physical environment. The city government has advanced efforts to improve public health and racial justice through environmental remediation, including a proactive inspection programme for rental units15 and removing highways that have disconnected predominantly black neighbourhoods from the downtown core.16 The city’s police department maintains the most extensive public database on firearm violence incidents of any US city.17

We used city administrative data on demolitions occurring from 2009 to 2019 (n=1792) and Rochester Police Department data on fatal and non-fatal shooting incidents from 2000 to 2020 (n=3728). The availability of shooting outcomes that pre-dated the demolitions allowed us to establish baseline levels. To generate spatial units, we overlaid a 152×152 m (ie, 500×500 ft) square grid over the Rochester city boundary. This grid-based approach was consistent with prior work.18 The grid size corresponds with the finding from a similar city that demolition effects were strongest within a 152 m buffer of each demolition.11
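The grid construction described above can be illustrated with a minimal sketch. It assumes that event locations are already in a projected coordinate system measured in metres and that the grid origin sits at (0, 0); the function names, the origin and the sample coordinates are hypothetical and are not taken from the study's code.

```python
import math

CELL_SIZE_M = 152.4  # 500 ft expressed in metres (reported as 152 m)


def grid_cell(x_m, y_m, origin_x_m=0.0, origin_y_m=0.0, cell_size_m=CELL_SIZE_M):
    """Return the (column, row) index of the grid square containing a point."""
    col = math.floor((x_m - origin_x_m) / cell_size_m)
    row = math.floor((y_m - origin_y_m) / cell_size_m)
    return col, row


def count_by_cell_year(events):
    """Count events per (grid square, year).

    `events` is an iterable of (x_m, y_m, year) tuples (hypothetical layout).
    """
    counts = {}
    for x_m, y_m, year in events:
        key = (grid_cell(x_m, y_m), year)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up shooting locations: two fall in the same square in 2010.
shootings = [(10, 10, 2010), (20, 30, 2010), (160, 10, 2010), (10, 10, 2011)]
annual_counts = count_by_cell_year(shootings)
```

The same counting step would apply to demolitions, which gives both the outcome and the treatment on a common grid-square-by-year panel.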

To adjust our models for time-varying demographic patterns, we obtained 2000 decennial census data and estimates from the 5-year American Community Survey ending each year from 2009 to 2019, for population, poverty rate and housing occupancy rate by census tract. For those years, we used inverse distance weighting to assign values from census tract centroids to grid squares. Comparable census data were not available for 2001–2008 or 2020. For 2001–2008, we interpolated the missing data using spline regression. Because we expected the COVID-19 pandemic might nullify trends, we did not extrapolate 2020 values but rather assigned the same values as 2019.
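As an illustration of the inverse distance weighting step, the following sketch assigns a tract-level value to a grid square centre. The exponent of 2 is an assumption (the text does not report the power used), and the coordinates and poverty rates are invented for the example.

```python
import math


def idw(target, centroids, values, power=2.0):
    """Inverse-distance-weighted average of `values` observed at `centroids`.

    `target` and each centroid are (x, y) pairs in the same projected units.
    `power` is an assumed exponent; larger values favour nearer centroids.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (cx, cy), value in zip(centroids, values):
        distance = math.hypot(target[0] - cx, target[1] - cy)
        if distance == 0.0:
            # The target coincides with a centroid: use its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical tract centroids (metres) and poverty rates (%).
tract_centroids = [(1000.0, 0.0), (2000.0, 0.0)]
tract_poverty = [10.0, 20.0]
grid_square_poverty = idw((0.0, 0.0), tract_centroids, tract_poverty)
```

With these made-up inputs the nearer tract receives four times the weight of the farther one, so the interpolated value lies closer to 10 than to 20.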

We downloaded a high-resolution aerial image, provided by the New York State Digital Orthoimagery Program,19 corresponding to the boundaries of each grid square. This programme collects imagery at 12-inch resolution across New York State, at approximately 4-year intervals. We chose imagery collected in April 2009 because this collection date, which preceded 98% of the demolitions analysed here, best captured preintervention environmental conditions.

Built environment clusters

To extract data from images, we passed each image through VGG-16, a pretrained convolutional neural network (CNN) designed to classify images into one of 1000 categories of physical objects.20 VGG-16 was trained on ImageNet, a database of over 1 million labelled images. Although ImageNet depicts objects from the horizontal (not aerial/overhead) perspective, VGG-16 and other CNNs trained on ImageNet have proven capable of extracting features applicable to a wide range of computer vision tasks,21 including aerial imagery.13 22 We extracted the image data after the third convolutional block out of five, which yielded 256 ‘mid-level’ features per image. These features do not map onto specific environmental features such as trees, buildings or roads. Instead, they could be considered akin to latent variables that an algorithm, originally designed for other image analytic tasks,20 has learnt are generically relevant to image interpretation tasks. Mid-level features are more abstract than the object parts detected by higher-level features (eg, ears and noses) but more sophisticated than the ‘blobs’ encoded by low-level features.23

Our procedure, including analytic code, is further detailed elsewhere.13 Here, we refined our extraction approach by rotating each image by a random multiple of 90° prior to extracting features, such that inconsequential differences in the orientation of the road network would not bias results.
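The rotation step can be sketched as follows. This is an illustration only: it represents an image as a nested list of pixel values rather than a real raster, the function names are hypothetical, and the feature extraction through VGG-16 is not shown.

```python
import random


def rotate90(image, k):
    """Return a copy of a 2D list rotated clockwise by k * 90 degrees."""
    rotated = [list(row) for row in image]
    for _ in range(k % 4):
        # Reversing the rows and transposing gives one clockwise quarter turn.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated


def random_rotation(image, rng):
    """Rotate by a random multiple of 90 degrees; also return the multiple."""
    k = rng.randrange(4)  # 0, 1, 2 or 3 quarter turns
    return rotate90(image, k), k


rng = random.Random(0)  # seeded only so the example is reproducible
tile = [[1, 2],
        [3, 4]]
rotated_tile, quarter_turns = random_rotation(tile, rng)
```

Because quarter turns only permute pixels, they leave the content of the tile unchanged while randomising the orientation of the street grid.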

Although this deep learning approach sacrifices interpretability, Maharana and Nsoesie22 found that features extracted from aerial imagery predicted obesity better than the density of 96 specified classes of spatial features (eg, fast food restaurants). Whereas the spatial features indicated how physical spaces were used (eg, for selling fast food), the imagery data captured other attributes of those spaces (eg, size, layout and surrounding green space) that supported a better-fitting model. These findings suggest that an imagery-based approach incorporates important information about the built environment that traditional approaches do not.

After extracting features for each grid square, we performed k-means clustering to identify clusters of similar-looking locations. We used silhouette tests and manual inspection to determine the appropriate number of clusters, then visually assessed these clusters to generate descriptions of the environmental features that they represented. We assessed for characteristics such as building size and density; residential, commercial or industrial land use; and street connectivity, which describes the density of intersections and of short, direct links between places on the road network.24
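The clustering and silhouette steps can be sketched in plain Python as follows. This is a simplified stand-in under stated assumptions: it uses basic Lloyd's k-means with random initialisation and two-dimensional toy points, whereas the study clustered 256-dimensional image features with software that is not specified here. All names and data are hypothetical.

```python
import math
import random


def kmeans(points, k, rng, n_iter=50):
    """Basic Lloyd's algorithm; returns (labels, centres)."""
    centres = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def mean_silhouette(points, labels):
    """Average silhouette width; higher values mean better-separated clusters."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        a = sum(own) / len(own)                              # within-cluster
        b = min(sum(d) / len(d) for d in dists.values())     # nearest other
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy "feature vectors": two visibly separate groups.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centres = kmeans(features, k=2, rng=random.Random(0))
score = mean_silhouette(features, labels)
```

In practice one would compute the silhouette score for several candidate values of k and combine it with manual inspection, as described above, rather than rely on the score alone.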

Modelling demolition effects

To model treatment effects, we used a comparative interrupted time series design that leveraged differences in the quantity and timing of demolitions across different areas. Out of 6075 total grid squares covering the city boundary, we included only the 647 grid squares that received at least one demolition during the intervention period, consistent with a staggered difference-in-differences design (see table 1). Outcomes were calculated at 1-year time intervals from 2000 to 2020, as the yearly counts of shooting incidents involving at least one fatal or non-fatal injury.

Table 1

Sample characteristics for study of demolitions (n=1792) on firearm violence in Rochester, New York, 2009–2019

We calculated two exposures of interest: (1) the count of demolitions carried out in the previous year and (2) the cumulative count of demolitions completed from the beginning of the intervention period through the end of the previous year. Each of these variables represented a different intended time scale for demolition effects. Previous year’s demolitions were considered medium-term effects. This timeframe is consistent with the follow-up period used in prior experimental1 and observational3 4 work on blight remediation and violence. By contrast, cumulative demolitions were considered long-term effects. To avoid inadvertently encoding time trends through this cumulative count variable, we included year fixed effects (FEs) in our model (discussed below).
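The two exposure variables can be derived from annual demolition counts as in the sketch below. It assumes a single grid square, consecutive ascending years and a dictionary of demolitions per year; the function name and the sample counts are hypothetical.

```python
def build_exposures(demolitions_by_year, years):
    """Return (year, previous-year count, cumulative count) rows for one square.

    `years` must be consecutive and ascending. The cumulative count covers all
    demolitions completed through the end of the previous year.
    """
    rows = []
    cumulative = 0
    for year in years:
        previous_year = demolitions_by_year.get(year - 1, 0)
        cumulative += previous_year
        rows.append((year, previous_year, cumulative))
    return rows


# Made-up grid square: two demolitions in 2009 and one in 2011.
exposures = build_exposures({2009: 2, 2011: 1}, range(2008, 2013))
```

In this example the demolitions in 2009 first appear as exposure in the 2010 row, and the cumulative count never decreases, which is why year fixed effects are needed to separate it from a generic time trend.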

Our main model was a negative binomial regression with two-way FEs for each grid square and year. This setup was designed to control for differences in firearm violence base rates between grid squares (via spatial unit FEs) and citywide changes in firearm violence over time (via year FEs). This approach automatically controls for time-invariant confounders and for global secular trends. To address time-varying confounders, we added population, poverty rate and housing occupancy rate as covariates. We clustered standard errors by grid square and year, as well as by neighbourhood quadrant, using Rochester’s neighbourhood service boundaries, to address potential spatial dependence.
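One way to write the model described above is sketched below. The notation is ours rather than the authors': Y_it is the shooting count in grid square i and year t, D and C are the previous-year and cumulative demolition counts, and x_it holds the three demographic covariates. Both exposures are shown in a single equation for compactness; the text refers to "main models", so they may have been fitted separately.

```latex
\log \mathbb{E}\left[ Y_{it} \right]
  = \alpha_i + \gamma_t
  + \beta_1 D_{i,t-1} + \beta_2 C_{i,t-1}
  + \mathbf{x}_{it}^{\top} \boldsymbol{\delta},
\qquad
Y_{it} \sim \text{Negative Binomial},
\qquad
\mathrm{IRR}_k = \exp(\beta_k)
```

Here \alpha_i is the grid square fixed effect and \gamma_t the year fixed effect. Under the log link, an IRR below 1 corresponds to a proportional reduction in expected shootings per additional demolition.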

To assess effect modification, we then stratified our main model by built environment cluster. We omitted the clusters (D and E) where zero demolitions occurred and/or zero total shootings occurred (ie, where unit fixed effects would absorb all variance). To test the model residuals for spatial autocorrelation, we calculated Moran’s I at each time step, based on queen’s contiguity of the grid squares in the analysis. Consistent with prior work, we also conducted secondary analyses to test for displacement effects, that is, whether demolitions caused firearm violence to move from the treated unit to adjacent areas. In these analyses, a spatially lagged outcome term was interacted with the exposure variable for each of our main models.
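The residual check can be illustrated with a small Moran's I calculation under queen's contiguity. The sketch assumes binary (unstandardised) neighbour weights and grid squares indexed by (column, row); the study's weighting scheme and software are not reported here, and the residual values are invented.

```python
def queen_neighbours(cells):
    """Map each (col, row) cell to the cells sharing an edge or a corner."""
    cell_set = set(cells)
    return {
        (c, r): [(c + dc, r + dr)
                 for dc in (-1, 0, 1) for dr in (-1, 0, 1)
                 if (dc, dr) != (0, 0) and (c + dc, r + dr) in cell_set]
        for (c, r) in cell_set
    }


def morans_i(values):
    """Moran's I for a dict mapping (col, row) -> value, binary queen weights."""
    n = len(values)
    mean = sum(values.values()) / n
    dev = {cell: v - mean for cell, v in values.items()}
    cross_products = 0.0
    weight_total = 0
    for cell, neighbours in queen_neighbours(values).items():
        for neighbour in neighbours:
            cross_products += dev[cell] * dev[neighbour]
            weight_total += 1
    variance_term = sum(d * d for d in dev.values())
    return (n / weight_total) * (cross_products / variance_term)


# Made-up residuals on a 4 x 4 grid: high values on the left, low on the right.
clustered = {(c, r): (1.0 if c < 2 else 0.0) for c in range(4) for r in range(4)}
# Alternating residuals in a checkerboard pattern.
checkerboard = {(c, r): (1.0 if (c + r) % 2 == 0 else -1.0)
                for c in range(4) for r in range(4)}

i_clustered = morans_i(clustered)
i_checkerboard = morans_i(checkerboard)
```

The clustered pattern yields a positive statistic and the checkerboard a negative one. A significance test, such as a permutation test, would be applied on top of this statistic at each time step.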

Patient and Public Involvement (PPI) statement: This study did not engage members of the public before publication.


Grid squares averaged 0.20 shootings per year throughout the study period (2000–2020). During the intervention period (2009–2019), the mean count of total demolitions in each grid square was 2.77 (table 1). Demolitions and shootings tended to occur in the same areas of the city, outside of downtown and disproportionately in the Northeast, Northwest and Southwest quadrants (figure 1).

Figure 1

Locations of demolitions (years 2009–2019) and firearm violence (years 2000–2020) in Rochester, New York.

Test statistics and silhouette plots best supported the use of either three or five built environment clusters. On manual inspection, using three clusters only differentiated between water, open land and developed land, which would not have been useful for this analysis, so we used five clusters instead. The numbers of preintervention shootings and of demolitions during the intervention period were comparable across clusters A, B and C (table 1). Preintervention trends in shooting outcomes were similar across clusters A, B and C (figure 2), supporting the parallel trends assumption of our difference-in-differences approach.

Figure 2

Preintervention firearm violence trends.

Figure 3 displays the visual characteristics of these clusters. Figure 4 displays their spatial distribution. Cluster A, where a majority of demolitions occurred, included tightly gridded street networks (ie, high connectivity) with comparatively small buildings and parcels. Cluster B, which disproportionately appeared in the Southeast quadrant, included residential areas with comparatively large buildings and parcels, and street blocks set off from larger avenues that did not run parallel (ie, lower connectivity). Cluster C was typically developed land with relatively few buildings, often situated on the outskirts of neighbourhoods, or else contained parks or other open spaces. Clusters D and E were within the city boundary but typically spanned undeveloped land or river.

Figure 3

Sample images from built environment clusters.

Figure 4

Spatial distribution of built environment clusters.

In our models, each demolition was associated with a 14% shootings reduction (IRR=0.86, 95% CI 0.83 to 0.90, p<0.001) the following year. Demolitions were also associated with a long-term, 2% reduction in shootings per year for each cumulative demolition (IRR=0.98, 95% CI 0.95 to 1.00, p=0.02) (table 2). In other words, each completed demolition was associated with a drop in shootings risk the following year and a smaller annual reduction in shootings over the remaining intervention period. In the moderation analysis, only cluster A (ie, high density/connectivity) displayed following-year effects (IRR=0.85, 95% CI 0.79 to 0.90). Cluster A did not display long-term effects. Cluster B (moderate density/connectivity) displayed long-term effects (IRR=0.91, 95% CI 0.85 to 0.98, p=0.02) but not following-year effects. Cluster C (low-density, neighbourhood margins) also displayed long-term effects (IRR=0.90, 95% CI 0.84 to 0.97, p=0.01) and did not show following-year effects at p<0.05 (table 2).
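For readers less familiar with incident rate ratios, the short sketch below converts the reported IRRs into percentage changes. The compounding function assumes the log-linear form of a count regression, in which each additional cumulative demolition multiplies the expected rate by the same IRR; the function names are ours.

```python
def percent_change(irr):
    """Percentage change in the expected rate implied by an IRR."""
    return (irr - 1.0) * 100.0


def cumulative_irr(irr_per_demolition, n_demolitions):
    """Implied IRR after several demolitions, assuming a log-linear model."""
    return irr_per_demolition ** n_demolitions


following_year_change = percent_change(0.86)   # reported following-year IRR
per_demolition_change = percent_change(0.98)   # reported cumulative IRR
three_demolitions = cumulative_irr(0.98, 3)    # illustrative count of 3
```

Under that assumption, a grid square with three cumulative demolitions (close to the reported mean of 2.77) would have an implied IRR of about 0.94, that is, roughly a 6% lower annual shooting rate.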

Table 2

Estimated effects of demolitions on firearm violence with and without moderation

For each of the main models, Moran’s I tests returned results at p<0.05 for 1 out of the 21 time periods, as expected by chance, which indicates that model residuals were not spatially autocorrelated. When spatially lagged shootings were added to the main models, the interaction of lagged shootings and treatment was not statistically significant, suggesting no displacement of shootings due to demolitions.


We found that abandoned building demolitions were associated with reductions in nearby firearm violence in Rochester, and those reductions varied across built environment types. Citywide, these reductions were substantial in the year after demolitions occurred and evident over a long intervention period (12 years) during which demolitions accumulated. Our moderation analysis, using aerial imagery and machine learning, found that the effects of demolitions and their duration varied across built environment clusters. In the medium term, demolitions produced sizeable effects in dense, mixed-use areas (cluster A), whereas in the long term, their accumulation produced sizeable effects in medium-density, residential areas (cluster B) and sparsely developed neighbourhood fringes (cluster C). These findings suggest that attributes of the built environment, such as land use, building density and street connectivity, moderate the effects of an abandoned building demolition on nearby firearm violence.

One possible explanation for our findings is that abandoned buildings, and the vacant lots created by their demolition, exert differing influences on community life according to the nearby environment. A likely difference in community life across our built environment clusters is the amount of foot traffic in each cluster, which is likely highest in cluster A, followed by clusters B and C. An abandoned building in a high-traffic area (cluster A) might exert stronger upward pressure on firearm violence than in a moderate-traffic or low-traffic area (clusters B and C), such that the medium-term impact of a demolition is greater. However, as Macdonald and colleagues12 proposed, newly created vacant lots may be more difficult to observe, maintain and control in areas with higher foot traffic. Thus, we might expect a greater long-term effect in lower-traffic areas (clusters B and C), where benefits can more readily accumulate over time.

Methodologically, we demonstrated how historical aerial imagery—an abundant, often freely available, ‘big data’ source—can enable retrospective study designs focused on a range of built environment influences. Alternative approaches to analysing the preintervention built environment would have posed the challenge of identifying accurate spatial features data from 2009 when we conducted this analysis in 2021. Moreover, imagery allowed us to examine aspects of urban form that appear important to the problem at hand, but which traditional methods would likely have omitted. One such method, called risk terrain modelling, is typically used to mine possible predictors from lists of location types, for example, barber shops, pawn shops, etc, that are considered crime generators.25 Using deep features from aerial images incorporated a range of additional factors; future work should examine combining these approaches. Moreover, our cluster-based approach favours understanding systemic influences and interactions over reducing the problem to just a few, prespecified spatial features, giving a more robust understanding of the moderators.

For violence prevention practitioners and urban policymakers, our study adds to the evidence that demolishing abandoned buildings reduces the incidence of firearm violence, without displacing it to nearby areas. Local governments, therefore, should consider firearm violence prevention when deciding which abandoned buildings to demolish. Our findings indicate that these allocation decisions can be tailored by neighbourhood context and intended effects. Over time, demolitions alone may be most useful for preventing firearm violence when they are deployed in moderate-density or low-density urban areas. However, demolitions appear effective in the medium term for preventing firearm violence in high-traffic areas and might therefore be used to curb firearm violence in the highest risk locations even if they are highly trafficked. Moreover, it is possible that long-term effects in high-traffic areas could be improved by implementing plans to maintain community control over the resulting vacant lot: for example, in Flint, Michigan, community groups perform routine maintenance on vacant lots, with financial support from the county land bank.2 This consideration may be particularly important to preserve racial equity in demolitions programmes, since residents of colour are often segregated in high-density areas with smaller housing units.


In this observational design, we were not able to rule out all potential confounders. Our two-way fixed effect setup accounted for citywide time trends in firearm violence and time-invariant characteristics of spatial units, and we controlled for several potential time-varying confounders. However, the assignment of demolitions could have been based, in part, on recent firearm violence or unmeasured, time-varying characteristics of certain areas. Nonetheless, we did not find evidence of differing pretrends.

Additionally, our analysis of built environment clusters is presented here to demonstrate how imagery and machine learning can be used for exploratory analysis of spatial influences, particularly as they pertain to urban form. Because this analysis was exploratory and multifactorial, we did not attempt to draw conclusions about the influence, in isolation, of any of the factors discussed here (eg, street network attributes). A weakness of our CNN-based approach is loss of interpretability, such that none of our CNN-derived features maps directly onto those isolated factors. However, building on our exploratory analysis, future work could test hypotheses using traditional datasets or could employ more interpretable image analysis techniques, such as land cover classification.

Finally, we do not expect that results from Rochester will generalise to cities with relatively low rates of property abandonment. Future work should leverage the scalability and replicability of our approach to conduct similar analyses across a range of geographies.


This study introduced a novel approach to identify optimal targets for place-based interventions. Combining big data and causal inference methods can help prioritise buildings eligible for demolition based on the likelihood of firearm violence reduction in their respective locations. Moderation analysis can add nuance to our understanding of why, how and under what conditions this intervention may produce the desired effect. Importantly, the approach enhances policy evaluation efforts even when such evaluation was not built into the original intervention plan, because aerial imagery can be obtained retroactively.

Our results contribute to the evidence that numerous non-policing interventions can improve safety.26 Amidst the economic fallout of the COVID-19 pandemic, with firearm violence rising27 and a renewed commitment to investments in physical and human infrastructure by federal policymakers, there is an opportunity to further integrate and fine-tune firearm violence prevention and community development policies. City agencies should coordinate and use data to inform their strategies, with equity as a central consideration.

Key messages

What is already known on the subject

  • Demolishing abandoned buildings is associated with reductions in nearby violence, including firearm violence.

What this study adds

  • We found that demolitions produced differing effects on firearm violence depending on characteristics of the physical environment in different parts of Rochester, New York. These effects differed in both magnitude and duration.

  • We showed that high-resolution aerial imagery can be used to identify types of environments where demolition effects may differ.

Ethics statements

Patient consent for publication

Ethics approval

Since all study data were publicly available, this study was deemed non-human subjects research by the Boston University Institutional Review Board.


The authors would like to thank Elijah de la Campa, Eleanor Dickens, Quinton Mayne and Andres Sevtsuk for helpful comments on an earlier draft. Manish Patel contributed to aerial imagery collection. Several members of the New York State GIS Listserv provided helpful insights on aerial imagery data sources.



  • Contributors JJ and JdJ conceived the study. JJ and JG designed the study. JJ conducted the analysis and drafted the manuscript. JdJ, MPJ, QN and JG provided critical feedback and revisions.

  • Funding This study was supported by the Bloomberg Harvard City Leadership Initiative, funded by a gift to Harvard University by Bloomberg Philanthropies. MPJ received support from the National Institutes of Health (NIA 5K99AG066949-02). JG received support from the Firearm-safety Among Children and Teens (FACTS) Consortium (NICHD 1R24HD087149).

  • Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.
