Travel times from Output Area population-weighted centroids to GP surgeries in Walthamstow.
Transport modality is by foot and by private car.
Surgeries are geolocated from postcodes using open street map’s Nominatim API. This would not scale well so other location data sources should be explored.
Scaling out to other areas hinges on the ability to provide location data for the services of interest.
Including bus & rail modes is possible and subject to the peculiarities of GTFS data within the area of interest. Modelling bus & rail for other areas of UK would be trivial.
Click to show code
from datetime import datetime, timedeltaimport globimport osfrom pathlib import Pathimport subprocessimport tempfilefrom time import sleepimport tomlimport contextily as ctximport foliumimport geopandas as gpdfrom geopy.geocoders import Nominatimfrom geopy.exc import GeocoderUnavailablefrom geopy import pointimport matplotlib.pyplot as pltimport pandas as pdfrom pyprojroot import hereimport r5pyimport requestsfrom shapely import geometryfrom transport_performance.osm.osm_utils import filter_osmfrom travel_time_experiments.ingest_ons_geo import ( get_ons_geo_data, get_ons_geo_paginated )from travel_time_experiments.munge_gpd import add_linestring_colfrom travel_time_experiments.viz_gpd import viz_revised_coordinates# get private user agentUSER_AGENT = toml.load(here(".secrets.toml"))["nominatim"]["USER_AGENT"]
Ingest Data
Ingest London Open Street Map.
London OSM doesn’t include Loughton & Cheshunt. To avoid edge effects, ingest England OSM latest & filter to custom bounding box.
Click to show code
osm_pth = here("data/external/osm/england-latest-osm.pbf")ifnot os.path.exists(osm_pth): subprocess.run( ["curl","https://download.geofabrik.de/europe/united-kingdom/england-latest.osm.pbf","-o", osm_pth ])# Clip London OSM to custom BBOXfiltered_osm_pth = here("data/external/osm/walthamstow-aoi-osm.pbf")BBOX = [-0.368022,51.432659,0.296356,51.765395]ifnot os.path.exists(filtered_osm_pth): filter_osm( pbf_pth=osm_pth, out_pth=filtered_osm_pth, bbox=BBOX, tag_filter=False)
Ingest GP Addresses.
From NHS Service Search , surgeries within 50 miles serving Hoe Street, Wolthamstow. Postcodes provided and will need to be geocoded. Investigate whether coordinates are available in NHS developer portal, registration required.
Make this Notebook Trusted to load map: File -> Trust Notebook
Note that the accuracy of surgery location varies dependent upon whether the surgeries were geolocated from the organisation name or the postcode. Most are from postcode. Accuracy of these points will depend on size of postcode.
Click to show code
code_stats = geocd_surgeries["geocode_type"].describe()print(f"{round((code_stats.freq /len(geocd_surgeries)) *100, 1)} % of "f"{len(geocd_surgeries)} surgeries were geocoded by postcode.")
57.4 % of 535 surgeries were geocoded by postcode.
Ingest Population-Weighted Centroids.
Click to show code
ENDPOINT ="https://services1.arcgis.com/ESMARspQHYMw9BZ9/arcgis/rest/services/Output_Areas_2021_PWC_V3/FeatureServer/0/query"PARAMS = {"where": "OA21CD like 'E%'","f": "geoJSON", "outFields": "*","outSR": 4326,"returnCountOnly": True,}resp = requests.get(ENDPOINT, PARAMS)cont = resp.json()n_centroids = cont['properties']['count']print(f"There are {n_centroids:,} OA Centroids with the specified OA21CD pattern" )
There are 178,605 OA Centroids with the specified OA21CD pattern
Use LAD21 to match centroid release, though 23 is available. LADCD for Wolthamstow is E09000031.
Click to show code
ENDPOINT ="https://services1.arcgis.com/ESMARspQHYMw9BZ9/arcgis/rest/services/LAD_Dec_2021_GB_BFC_2022/FeatureServer/0/query"PARAMS["where"] ="LAD21CD = 'E09000031'"del PARAMS["resultOffset"]PARAMS["outSR"] =27700# get in BNG as will need to buffer_, boundary = get_ons_geo_data(ENDPOINT, PARAMS)boundary.explore()
Make this Notebook Trusted to load map: File -> Trust Notebook
Engineering the Data
Buffer the LAD boundary to avoid edge effects - people may prefer surgeries in the adjacent local authority. Buffer by 5km, pretty arbitrary rule of thumb that can be adjusted to suit.
Make this Notebook Trusted to load map: File -> Trust Notebook
These are the journey origins (navy) and destinations (green).
Adjust Point Locations
As many of the points will not be situated in routable locations, it can be beneficial to use the transport network to ‘snap’ the locations to the nearest feature of the network. This will often be the nearest node, but could also be configured as the nearest junction or bus stop.
Make this Notebook Trusted to load map: File -> Trust Notebook
Note that there are some issues with the data. There are entries that share the same postcode that appear to be duplicate entries, these are overplotted on the maps but can be revealed with a little random noise.
It is also apparent that the quality of geocoding is variable. In the worst case, Nightingale House Surgery has been geocoded from postcode at a local school on Nightingale Road, some 600 metres away. Visual inspection of point samples confirm that this is the exception rather than the rule, but a cut of the geolytix surgery location data or equivalent would be best.
count 771.000000
mean 16.688795
std 11.664236
min 0.017786
25% 7.375896
50% 15.289275
75% 23.811737
max 73.991204
Name: snap_dist_m, dtype: float64
Most of the centroids have been adjusted by less than 25 metres. Though there is at least one case snapped to over 70 metres.
Click to show code
walth_centroids["snap_dist_m"].plot.hist( bins=20, title="Distribution of Centroid Snap Distance in Metres")
It is likely that these snapped centroids now fall outside of the output area polygon they were intended to represent. Inspect the 10 largest revisions, check that they are within the bounds of their output area polygon.
Make this Notebook Trusted to load map: File -> Trust Notebook
As can be seen from the map, some points now do fall outside of the polygon they are intended to represent. We can account for this adjustment by adding an average walking time penalty to computed travel times.
Calculate Travel Times
Compute travel times from centroids to proximal surgeries. Add a walking time penalty to account for the snap distance. Use walking speed of 4.8km/h, consistent with DfT travel time work. Calculate median travel times across surgeries.
Make this Notebook Trusted to load map: File -> Trust Notebook
This is an interesting situation - the high density of destinations in this locality could form the basis for a different treatment of proximal surgeries. The median travel time will be from one output area centroid to every GP on the map.
It may not be reasonable to consider all these options for each centroid. You could potentially draw a smaller buffer around each output area instead, and then calculate travel times to a smaller subset of proximal GPs.
Option 2: Nearest 10 Surgeries
This option is used to model greater patient discrimination in surgery selection, due to the wealth of surgeries in the vicinity. In this scenario, patients select from the 10 surgeries with the smallest travel time only. Each centroid gets a median travel time across these 10 shortest journies only.
Make this Notebook Trusted to load map: File -> Trust Notebook
Option 3: Output Area 1km buffer
Buffer each output area by 1km and consider travel times to surgeries falling within those extents only. Find surgeries for each buffered output area. Check on surgeries within one output area.
Click to show code
walth_polys["buffered_oas"] = walth_polys.to_crs(27700).buffer(1000)walth_polys.set_geometry("buffered_oas", inplace=True)walth_polys["buffered_area_m"] = walth_polys.areawalth_polys.to_crs(4326, inplace=True)found_surgeries = gpd.GeoDataFrame()for i, r in walth_polys.iterrows(): buff_gdf = gpd.GeoDataFrame( {"OA21CD": r["OA21CD"], "geometry": r["buffered_oas"]}, index=[0], crs=4326) out_df = buff_gdf.sjoin(geocd_surgeries, how="right").dropna() found_surgeries = gpd.GeoDataFrame( pd.concat( [found_surgeries, out_df], ignore_index=True))which_oa ="E00021834"imap = found_surgeries.query("OA21CD == @which_oa").explore( marker_type="marker", zoom_start=11.5, marker_kwds={"icon": folium.map.Icon( color="green", icon="briefcase-medical", prefix="fa"), })# show the buffered polygonimap = walth_polys.query("OA21CD == @which_oa").explore( m=imap, style_kwds={"color":"black", "fillOpacity":0.1})# show the original polygonimap = walth_polys.query("OA21CD == @which_oa").set_geometry("geometry_l").explore( m=imap, style_kwds={"color": "purple", "fillOpacity":0.5})# get all surgeries that are not within the buffered OAimap = geocd_surgeries.loc[~geocd_surgeries["geometry"].isin( found_surgeries.query("OA21CD == @which_oa")["geometry"]), :].explore( m=imap, marker_type="marker", marker_kwds={"icon": folium.map.Icon( color="red", icon="briefcase-medical", prefix="fa") })imap
Make this Notebook Trusted to load map: File -> Trust Notebook
Calculate Number of surgeries and travel times for each output area. With this option, the output areas are routed to unequal numbers of surgeries. Though every output area has at least one destination.
Click to show code
n_surgeries = found_surgeries.groupby("OA21CD")["Organisation Name"].count().to_frame()walth_centroids = walth_centroids.set_index("OA21CD").join(n_surgeries).reset_index()walth_centroids.rename( columns={"Organisation Name": "n_surgeries"}, inplace=True)opt_3_pth = here("data/interim/walthamstow-oa-1km-buffer-surgery-tts.parquet")ifnot os.path.exists(opt_3_pth): tt_matrix_2 = pd.DataFrame()for i, row in walth_centroids.iterrows(): row_gdf = gpd.GeoDataFrame( {"id": row["id"], "geometry": row["snapped_geometry"]}, index=[0], crs=4326) tts = r5py.TravelTimeMatrixComputer( tn, origins=row_gdf, destinations=found_surgeries.query("OA21CD == @row['OA21CD']"), transport_modes=[r5py.TransportMode.CAR, r5py.TransportMode.WALK], departure=dept_time, departure_time_window=timedelta(minutes=10), # Up to 08:10 snap_to_network=False, max_time=timedelta(minutes=30), # 30 minute journey max ).compute_travel_times() tt_matrix_2 = pd.concat([tt_matrix_2, tts]) tt_matrix_2.to_parquet(opt_3_pth)else: tt_matrix_2 = pd.read_parquet(opt_3_pth)med_tts_3 = tt_matrix_2.drop("to_id", axis=1).groupby("from_id").median()med_tts_3 = med_tts_3.join(walth_centroids.set_index("id"))# add the walk time penaltymed_tts_3["travel_time"] = med_tts_3["travel_time"] +\ med_tts_3["walk_penalty_mins"]med_tts_3["n_surgeries"].describe()
count 771.000000
mean 5.041505
std 2.460776
min 1.000000
25% 3.000000
50% 5.000000
75% 7.000000
max 12.000000
Name: n_surgeries, dtype: float64
As can be seen below, travel time does trend by the number of surgeries in scope of the buffered output areas. This scenario is distinct to options 1 and 2 in that respect.
Click to show code
# normalise n_surgeries, get areas from the polygonsmed_tts_3 = walth_polys.loc[:, ["OA21CD", "buffered_area_m"]].set_index("OA21CD").join(med_tts_3.set_index("OA21CD"))med_tts_3["surgeries_per_km2"] = med_tts_3["n_surgeries"] / ( med_tts_3["buffered_area_m"]/1_000_000)med_tts_3.plot.scatter( x="surgeries_per_km2", y="travel_time", title="Median Travel Time by Surgeries per KM2")plt.show()