
Shotcharts Revisited – From NBA Stats to Feature Service in Less Than 20 Lines of Code



A guest post by Gregory Brunner

About two years ago, I wrote about creating shotcharts in ArcGIS using Python and arcpy. In that post, I demonstrated how to scrape the data from stats.nba.com and create a shots feature class from the data, and I shared the resulting shotchart as several web maps in ArcGIS Online. What I was unable to do at the time was automate the creation of the feature service and web maps that I shared in that post. Recent enhancements to the ArcGIS API for Python allow me to automate the process of sharing the shot data as a hosted feature service in ArcGIS Online and to design web maps using the shot data. What I really like is that I can do this with less code than I wrote for my original blog post!

In my previous post, I had to scrape the shot data, create a feature class, add fields to the feature class, and then add the shot data to the feature class. At that point, I would manually create a hosted feature service from the shot feature class. With recent enhancements to the ArcGIS API for Python, I can do all of this in Python, and I can do it in less than 20 lines of code! In this post, I will demonstrate how to scrape data from the NBA stats site, put it into a spatial dataframe, share the spatial dataframe as a hosted feature service, and design web maps that use the hosted feature service. I will try to do this in as little code as possible!

Python Packages

In this post, I will use arcgis, pandas, and requests. After importing these packages, I will log into my ArcGIS Online account in order to save the shot data to a hosted feature service.

import arcgis
from arcgis.gis import GIS
from arcgis import SpatialDataFrame

import pandas as pd
import requests

from IPython.display import display

gis = GIS("https://www.arcgis.com", "gregb")

Getting the Shot Data

In my original post, I looked at shots taken by Russell Westbrook during the 2014–2015 regular season. Here I will look at Russell Westbrook's shots taken during the 2016–2017 regular season. I originally showed how to do this with urllib; here, I will use requests, similar to how Savvas Tjortjoglou does in How to Create NBA Shot Charts in Python.

player_id = '201566'           # Russell Westbrook
season = '2016-17'             # MVP season of 2016-17
seasontype = 'Regular+Season'  # or 'Playoffs' for shots taken in the playoffs

I will form the NBA stats request URL using Russell Westbrook’s NBA.com player ID and use requests to get the data.

nba_call_url = 'http://stats.nba.com/stats/shotchartdetail?AheadBehind=&CFID=&CFPARAMS=&ClutchTime=&Conference=&ContextFilter=&ContextMeasure=FGM&DateFrom=&DateTo=&Division=&EndPeriod=10&EndRange=28800&GameEventID=&GameID=&GameSegment=&GroupID=&GroupQuantity=5&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&PORound=0&Period=0&PlayerID=%s&PlayerPosition=&PointDiff=&Position=&RangeType=0&RookieYear=&Season=%s&SeasonSegment=&SeasonType=%s&ShotClockRange=&StartPeriod=1&StartRange=0&StarterBench=&TeamID=0&VsConference=&VsDivision=' % (player_id, season, seasontype)
response = requests.get(nba_call_url, headers={'User-Agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.82 Safari/537.36"})

I will push the response JSON into a pandas dataframe.

data = response.json()
shots = data['resultSets'][0]['rowSet']
headers = data['resultSets'][0]['headers']
shot_df = pd.DataFrame(shots, columns=headers)

From DataFrame to SpatialDataFrame

In order to publish the shot data to ArcGIS Online as a shotchart, I will convert the pandas dataframe to an arcgis spatial dataframe. In addition to passing the shot_df to SpatialDataFrame, I will pass the geometries associated with each row in the dataframe.

# Columns 17 and 18 are LOC_X and LOC_Y, the shot coordinates.
shot_coords = shot_df.iloc[:, 17:19].values.tolist()
# Negate x to mirror the shot locations so they line up with the court layers.
sdf = SpatialDataFrame(shot_df, geometry=[arcgis.geometry.Geometry({'x': -r[0], 'y': r[1],
                       'spatialReference': {'wkid': 3857}}) for r in shot_coords])

From SpatialDataFrame to Feature Service

Now that the shots are in a spatial dataframe, I will publish the spatial dataframe to ArcGIS Online.

shot_chart = gis.content.import_data(sdf)

Voila! I have the shots as a hosted feature service! It took fewer than 20 lines of code (including import statements), and going from the NBA stats response to a hosted feature service took only seven lines!

display(shot_chart)

From Feature Service to Web Maps

Even though I have the shots as a feature service, I am not done. I can use the ArcGIS API for Python to visualize the shots. I will create three maps: one that shows the shots without any renderer, one that shows the shots symbolized by whether each was made or missed, and one that shows all the shots taken as a heat map.

For the basketball court, I will use the court feature class that I have in ArcGIS Online.

court_items = gis.content.search("Basketball Court", outside_org=False, item_type="Feature")
court_layers = court_items[1].layers

The court item contains two layers: one for the court outline and one for the court lines. I need to add them to the web map separately.

Shot Locations

Now that I have the basketball court layers and the shots as a feature service, I can display them on my map. I will set the shot chart renderer to “None” so that the default point symbology is applied to the shot_chart layer.

chart1 = gis.map((0.002,0), zoomlevel=17)
display(chart1)
chart1.add_layer(court_layers[1])
chart1.add_layer(court_layers[0])
chart1.add_layer(shot_chart,{"renderer":"None"})
chart1.basemap='dark-gray'


View Web Map

Shots Made and Missed

I want to be able to differentiate between shots made and shots missed. I can do that by changing the renderer when I add the shot_chart layer to the map. I will use the “ClassedColorRenderer” and apply the class colors based on the values in the “SHOT_MADE_FLAG” field.

chart2 = gis.map((0.002,0), zoomlevel=17)
display(chart2)
chart2.add_layer(court_layers[1])
chart2.add_layer(court_layers[0])
chart2.add_layer(shot_chart, {"renderer":"ClassedColorRenderer",
               "field_name": "SHOT_MADE_FLAG"})
chart2.basemap='dark-gray'


View Web Map

Shots missed are in dark gray. Shots made are in white. I assume that this is the default “ClassedColorRenderer”. I need to do some more investigating to figure out how to apply my desired color map.
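One avenue I want to investigate is pushing a standard ArcGIS unique-value renderer into the hosted layer's drawingInfo. The sketch below is my assumption of how that could look (I have not verified it end-to-end; the field values, colors, and the update_definition call are my own guesses, not part of the original workflow):

# A sketch, not verified: apply a made/missed color scheme by updating the
# hosted feature layer's drawingInfo with standard ArcGIS renderer JSON.
# Field values and colors here are assumptions.
made_missed_renderer = {
    "type": "uniqueValue",
    "field1": "SHOT_MADE_FLAG",
    "uniqueValueInfos": [
        {"value": "1", "label": "Made",
         "symbol": {"type": "esriSMS", "style": "esriSMSCircle",
                    "color": [0, 197, 255, 255], "size": 5}},
        {"value": "0", "label": "Missed",
         "symbol": {"type": "esriSMS", "style": "esriSMSCircle",
                    "color": [178, 178, 178, 255], "size": 5}}
    ]
}
shot_chart.layers[0].manager.update_definition(
    {"drawingInfo": {"renderer": made_missed_renderer}})

If this pans out, the same drawingInfo approach should also extend to the heat map color ramp below.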

Shots Taken as a Heat Map

I also want to see the shots taken as a heat map. I will use the “HeatmapRenderer” to do that.

chart3 = gis.map((0.002,0), zoomlevel=17)
display(chart3)
chart3.add_layer(court_layers[1])
chart3.add_layer(court_layers[0])
chart3.add_layer(shot_chart, {"renderer":"HeatmapRenderer",
               "opacity": 0.75})
chart3.basemap='dark-gray'


View Web Map

Again, I assume the orange-to-red color ramp is the default heat map renderer. I need to learn a little bit more about the API if I want to change the heat map color ramp.

Conclusions

I thought it would be fun to revisit my post on mapping shotcharts with ArcGIS and Python and see how the code changes with the introduction of the ArcGIS API for Python. The trick of converting the pandas dataframe to an ArcGIS spatial dataframe makes it very easy to go from NBA stats as JSON to a hosted feature service, and it reduces the amount of code I need to write. I no longer need to create a feature class with arcpy, add fields to it, and then load the data into it; I can now go from the stats JSON to a hosted feature service in only a few lines of code! I am still learning some of the ins and outs of the API (like applying renderers to the MapView), but if you have any questions, don't hesitate to ask!

Analyzing Landsat Image Metadata with the Spatial DataFrame

A guest post by Gregory Brunner

A few weeks ago, Esri released an update to the ArcGIS API for Python that includes several new features.

The new functionality that I am most excited about is the spatial dataframe! The spatial dataframe extends the pandas dataframe by adding geometry, spatial reference, and other spatial components to the dataframe. With its addition to the API, ArcGIS users can now read feature classes, feature services, and image services directly into a dataframe. Once in a spatial dataframe, users can perform fast statistical and spatial analysis on the data, update existing feature services, and convert the dataframe to a feature class or shapefile. These are just a few examples of how you can use the spatial dataframe.
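As a quick illustration of that round trip, a few lines take a hosted feature layer into a spatial dataframe and back out to a feature class. This is just a sketch; the service URL and output paths are placeholders, not a real service:

from arcgis.features import FeatureLayer

# Placeholder URL; substitute any feature service layer you can access.
lyr = FeatureLayer('https://services.arcgis.com/example/arcgis/rest/services/Parcels/FeatureServer/0')
sdf = lyr.query().df                                         # feature service -> spatial dataframe
sdf.to_featureclass(r'C:\temp\scratch.gdb', 'parcels_copy')  # spatial dataframe -> feature class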

What really interests me is how this can be used with an ArcGIS image service. Can I use the spatial dataframe to extract image footprints from an image service? Can I use it to perform statistical analysis on image footprints over a specific part of the world?

The answer to both of these questions is Yes! In this post, I’ll walk through how to use the API for Python to extract image service footprints from the Landsat 8 Views image service, show how to use a spatial filter to extract only footprints over New Jersey, determine the mean cloud cover and most recent acquisition date of the images, and share those image footprints as a feature service. If you have ever been interested in doing any of these, keep reading!

Getting Started

When using the ArcGIS API for Python, we first need to import the GIS.

import arcgis
from arcgis.gis import GIS
from IPython.display import display

# create a Web GIS object
gis = GIS("https://www.arcgis.com", "gregb")

We are going to look at Landsat footprints over the United States, so using gis.map we can open up our map over the USA.

#Looking over the USA
map1 = gis.map("USA", zoomlevel = 4)
display(map1)


Using search I can find the Landsat 8 Views service in ArcGIS.com and add that layer to the map.

landsat_search = gis.content.search('"Landsat 8 Views"', outside_org=True)
map1.add_layer(landsat_search[0])


I also need to grab the URL for this service and assign it to a variable.

url = landsat_search[0].url 
print(url)

https://landsat2.arcgis.com/arcgis/rest/services/Landsat8_Views/ImageServer

I am going to read the Landsat 8 Views service into a spatial dataframe. In order to do so, I will import SpatialDataFrame and ImageryLayer from arcgis.

from arcgis import SpatialDataFrame
from arcgis.raster import ImageryLayer

Using ImageryLayer I will create the image service object.

imserv = ImageryLayer(url=url)

Within the image service object is a spatial dataframe. I can access it as follows.

image_service_sdf = imserv.query().df

I did not submit any query parameters here. I will do that shortly to show how you can specify query parameters such as a spatial filter.

The dataframe is essentially the attribute table.

image_service_sdf.head()

Let’s see how many footprints are in the dataframe.

print("There are " + str(len(image_service_sdf)) + " Landsat image service footprints in this dataframe.")

There are 1000 Landsat image service footprints in this dataframe.

There are 1000 Landsat image footprints in this dataframe. In reality, there are hundreds of thousands, if not millions, of images in this service. Only 1000 are returned because the service's Max Record Count is set to 1000.
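If I actually needed every footprint, one workaround would be to page through the service by object ID. This is only a sketch: it assumes the footprint table has an OBJECTID field and that each page returns the lowest remaining IDs, which is typical but not guaranteed:

import pandas as pd

# Sketch: page past the 1000-record limit by walking the OBJECTID range.
frames = []
last_oid = 0
while True:
    page = imserv.query(where='OBJECTID > %d' % last_oid).df
    if len(page) == 0:
        break
    frames.append(page)
    last_oid = int(page['OBJECTID'].max())
all_footprints = pd.concat(frames, ignore_index=True)
print(len(all_footprints))

For this post, though, the spatial filter applied in the next section keeps the result set well under the limit anyway.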

Applying Query Parameters

Now that I know how to get the image service table as a spatial dataframe, I will apply a spatial filter so that I only retrieve image service footprints over New Jersey. I will also specify a where clause so that I only retrieve Primary footprints, meaning that I will exclude any Overview footprints from the dataframe.

In order to specify my extent, I am going to use arcpy and the Describe method to read a feature class that holds the geometry for New Jersey.

import arcpy

fc = r'C:\PROJECTS\STATE_OF_THE_DATA\DATA\usa_states\usa.gdb\nj'

grid_desc = arcpy.Describe(fc)
grid_sr = grid_desc.spatialReference
grid_extent = grid_desc.extent

In order to use this extent as a spatial filter on my image service query, I need to import filters from the ArcGIS API for Python.

from arcgis.geometry import filters

To create my filter, I need to pass the filter the geometry, the spatial reference, and the spatial relationship that I want the extent and the image service to have. In this case, I want to return footprints that intersect with New Jersey.

geometry = grid_extent
sr = grid_sr
sp_rel = "esriSpatialRelIntersects"
sp_filter = filters._filter(geometry, sr, sp_rel)

I also only want to return Primary image service footprints, meaning that I want to exclude the image service Overviews. I can do this by querying the Category field in the image service attribute table for footprints that have a value of 1.

search_field = 'Category'
search_val = 1
where_clause = search_field + '=' + str(search_val)

Now, when I query the image service and return the dataframe, I will also submit the where_clause and sp_filter. This should return only Primary image service footprints over New Jersey.

image_service_sdf = imserv.query(where=where_clause,geometry_filter=sp_filter,return_geometry=True).df
image_service_sdf.head()

In order to verify this, I can check and see how many footprints were returned.

print("There are " + str(len(image_service_sdf)) + " Landsat image service footprints over New Jersey.")

There are 535 Landsat image service footprints over New Jersey.

Viewing the Footprints in ArcGIS Online

Before I can convert the image service footprints to a feature class, I need to convert the AcquisitionDate from Unix time to a pandas datetime object; otherwise, the API for Python will throw an error. This can be done very easily with the pandas to_datetime method.

import pandas as pd
# AcquisitionDate is Unix time in milliseconds.
image_service_sdf['AcquisitionDate'] = pd.to_datetime(image_service_sdf['AcquisitionDate'] / 1000, unit='s')

Now, I can verify that the spatial filter worked by viewing the footprints in ArcGIS Online or in ArcMap. I will add them to ArcGIS Online using import_data.

item = gis.content.import_data(df=image_service_sdf) 

I will verify that the item now exists in portal as a Feature Layer.

item

I will also load the footprints onto the map to visualize them.

map2 = gis.map("New Jersey", zoomlevel = 6)
display(map2)
map2.add_layer(item, {"renderer":"None"})


Exporting Footprints to a Feature Class

The spatial dataframe can very easily be converted into a feature class using the to_featureclass method. I will save the image service footprints to a feature class named nj_landsat_footprints.

import os
footprint_fc = r'C:\PROJECTS\STATE_OF_THE_DATA\DATA\usa_states\usa.gdb\nj_landsat_footprints'
image_service_sdf.to_featureclass(os.path.dirname(footprint_fc),
                                  os.path.basename(footprint_fc),
                                  overwrite=True)

Performing Statistical Calculations on the Landsat Metadata

My motivation for doing this analysis is actually to demonstrate how easily I can use the spatial dataframe to perform statistical summaries of the imagery metadata. Having created a dataframe of only imagery metadata over New Jersey, I can now do statistical calculations that give us additional insight into our imagery data.

Collection Date

I am interested in how recently imagery over New Jersey was collected. I can use max to find the most recent image date in my dataframe.

most_recent_image_date = image_service_sdf['AcquisitionDate'].max()
print(most_recent_image_date)

2017-07-14 15:40:05.235000

Similarly, I can use min to find the oldest image date in the dataframe.

oldest_image_date = image_service_sdf['AcquisitionDate'].min()
print(oldest_image_date)

2013-08-20 15:41:30.238000

Cloud Cover

I can also analyze cloud cover in the Landsat scenes. What is the mean cloud cover of the scenes over New Jersey? How many scenes are there with zero cloud cover? These are questions that we can quickly answer using the spatial dataframe.

mean_cloud_cover = image_service_sdf['CloudCover'].mean()
print(mean_cloud_cover)

0.440778691589

The mean cloud cover is about 0.44; in other words, the average scene over New Jersey is roughly 44% cloud covered.

I can also use the dataframe to find out how many scenes have fewer than 10% Cloud Cover.

pd.value_counts(image_service_sdf['CloudCover'].values<0.1, sort=False)

There are 131 scenes of the 535 (about 25% of scenes) that have less than 10% Cloud Cover.
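An equivalent, more direct way to get that count is a boolean sum:

# Count scenes with less than 10% cloud cover directly.
low_cloud_count = (image_service_sdf['CloudCover'] < 0.1).sum()
print(low_cloud_count)  # 131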

Hopefully you can see the usefulness of using a spatial dataframe with image services. I’m just scratching the surface here. If you have any questions, let me know!

Published Web App

Last week Missouri got a lot of rain, which caused flooding in the Meramec River valley near St. Louis. I received a link to an ArcGIS Image Service, created and provided by Surdex, showing the flooding as of Tuesday, May 2. They flew this imagery as a service to the government and first responders, hoping to help those affected more quickly and efficiently. Here's a sample:

I was really interested in this imagery, but it lacked some context – I couldn’t tell how the flooding depicted in the imagery compared to normal conditions. I knew ArcGIS Web AppBuilder had a swipe widget though, so I quickly:

  1. Added the Image Service (provided by Surdex) to a web map, and chose the ArcGIS Imagery basemap and an initial extent
  2. Created a Web App from that web map using Web AppBuilder.
  3. In Web AppBuilder, set the Swipe widget to be enabled by default.
  4. Shared the web map and app publicly.

This app provided a really compelling, interesting view of the data:


(full app)

The app went viral on Twitter and Facebook, gaining 20,000 views over a three-day period, and was even featured by the St. Louis Post-Dispatch.

So I'm an official, published map author now, right? 🙂


St. Louis Lambert Airport Passenger Data

Our city’s airport posts data on how many passengers travel through the airport each month and year. It’s posted as a PDF but the data is hard to use in that format. I grabbed the data and put it into a better format:

Why not convert to a graph too?
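If you want to turn the data into a graph yourself, a few lines of pandas will do it. Note that the file and column names here are my guesses; check the CSV on GitHub for the real ones:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file and column names; adjust to match the actual CSV.
df = pd.read_csv('lambert-passengers.csv')
df.plot(x='Year', y='Passengers', kind='line',
        title='Lambert Airport Annual Passengers')
plt.show()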

I've posted this data on GitHub. It tells an interesting story: the early 2000s were tough years for Lambert, but in recent years demand has ticked up. What other insights can we glean from better open data?

Open Data in University City

I live in University City, MO, which has no crime. Don’t believe me? This map says so:

[Screenshot: a crime reports map showing no data over University City]

Open Government

The benefits of open government data (crime data being just one aspect of that) are well documented. It allows you and me to take the data, process it, innovate on it, be informed by it, and keep our government officials accountable.

It is easier now than ever to run a basic open data program at any level of government. Your government can use a product like ArcGIS Open Data or Socrata, which are both fancy and have a lot of nice open government features. Or, for cheaper, a government representative can create a free GitHub account and upload the data. Or, even cheaper AND easier, simply post CSV files on an existing website. The key is that the data be machine readable and in an open format.

University City was not doing that: until a few months ago, crime data was being released in non-machine-readable PDFs. That means it couldn't be placed on crime maps, analyzed by experts, or any of that. And so we ended up with the hole in the crime map over our town.

“Let’s fix this!” I said. Little did I know it’d be a years-long adventure.

The initial contact

I originally reached out to my city council representative, mentioning the benefits of open data and why our city should be doing more to provide open data.

10/27/2013
To: City Council Rep
From: Gavin

I was just wondering if you know of any current or future plans of U City to open our city data? I recently wrote a post on open data – http://www.gavinr.com/2013/10/16/share-data-github/ – and would love to help U City in getting our public data more available in open formats.

I got a quick response saying my message was received, but after a few back-and-forth emails, nothing ever happened.

Go to the Top!

A couple years later, I emailed the police chief directly:

3/26/2015
To: Colonel Adams (U City Police Chief)
CC: City Council Rep
From: Gavin

I am writing to inquire why the crime stats posted on http://www.ucitymo.org/index.aspx?nid=482 are posted in a hard-to-process PDF format, instead of a more open CSV (Excel) format. I think the city at large would benefit greatly from having that data more easily process-able. Is there any way your team could provide this?

Response from this: nothing. Crickets. I took this to mean neither my city council rep nor the police chief had any interest in helping me get this done.

Finally

In 2016, University City adopted "MyGov" as their website CMS. One portion of this solution allows citizens to make a request digitally to the city. One of those options is "Police Issues – Crime Statistics." Interesting. So I filled out the form, using the same request as above, and waited. No response.

[Screenshot: the new XLSX files, unceremoniously relegated to the bottom of the University City crime stats page]

After three months of waiting, I forwarded the email to my city council rep, who forwarded it to the city manager, who forwarded it to the police department, who finally acknowledged my request and posted the crime data in XLSX format (for the current year only). I had to ask again to get previous years posted.

WHEW. Finally, after almost three years of discussing this, University City has made a small step toward open data. I learned through this process that when making requests to government officials, you sometimes have to be extremely specific, because general ideas or initiatives will probably get dropped. You also need to persevere: doing the right thing is not always the easiest thing (although it's quite easy now; uploading an XLSX file? c'mon), so you'll probably have to ask multiple times, and in multiple ways, to get them to do what's right.

What’s Next?

The good news: my city is posting the data in a decent format. Bad news: it’s not the best layout of the data. In a later post I’ll show how I wrote some scripts to process that and get it into databases and maps.

