diff --git a/.codespellrc b/.codespellrc new file mode 100644 index 0000000..c7b13f1 --- /dev/null +++ b/.codespellrc @@ -0,0 +1,3 @@ +[codespell] +# Ignore stuff that looks like base64 blobs in Jupyter Notebooks +ignore-regex = [A-Za-z0-9+/]{100,} diff --git a/modules/06-geojupyter/.gitignore b/modules/06-geojupyter/.gitignore new file mode 100644 index 0000000..7a1c9d6 --- /dev/null +++ b/modules/06-geojupyter/.gitignore @@ -0,0 +1,2 @@ +data/* +*.jGIS diff --git a/modules/06-geojupyter/demo.ipynb b/modules/06-geojupyter/demo.ipynb new file mode 100644 index 0000000..0f7b71e --- /dev/null +++ b/modules/06-geojupyter/demo.ipynb @@ -0,0 +1,561 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "189c7230-a3b3-4c90-b053-68bb5857d5a6", + "metadata": {}, + "source": [ + "# GeoJupyter demo" + ] + }, + { + "cell_type": "markdown", + "id": "77660f5a-8fe5-436b-bcc2-13cc2dd39be4", + "metadata": {}, + "source": [ + "## Today: [🔗JupyterGIS](https://jupytergis.readthedocs.io/)" + ] + }, + { + "cell_type": "markdown", + "id": "e34edad4-220d-4acb-a214-048c5bba3b32", + "metadata": {}, + "source": [ + "JupyterGIS is a **real-time collaborative** Geographical Information System (GIS) environment in JupyterLab.\n", + "\n", + "You can [🔗try it right now in JupyterLite](https://jupytergis.readthedocs.io/en/latest/lite/lab/)!\n", + "\n", + "Let's explore some functionality together (based on [🔗Carl Boettiger](https://ourenvironment.berkeley.edu/people/carl-boettiger)'s [🔗ESPM-288 course](https://espm-288.carlboettiger.info/)). We'll explore the question: are neighborhoods that were highly-rated (A) under the disciminatory 1930s practice of [🔗redlining](https://en.wikipedia.org/wiki/Redlining) are greener today than neighborhoods graded D?\n", + "\n", + ":::{warning}\n", + "Please be aware that JupyterGIS is a young project and it's likely you'll run in to bugs.\n", + "We encourage you to try to break it and [report issues you find on GitHub](https://github.com/geojupyter/jupytergis/issues)!\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "id": "a7ada6c0-e94b-400b-bc3f-d6d97fa6ab3a", + "metadata": {}, + "source": [ + "### Constants\n", + "\n", + "Some variables that will be used throughout the Notebook." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5ec4b0db-7b9c-4779-9bef-01ae8bb14041", + "metadata": {}, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "\n", + "DATA_DIR = Path().cwd() / \"data\"\n", + "INEQUALITY_GEOJSON_FILE = DATA_DIR / \"redlining_newhaven_ct.geojson\"\n", + "NDVI_GEOTIFF_FILE = DATA_DIR / \"ndvi.tif\"\n", + "INEQUALITY_NDVI_GEOJSON_FILE = DATA_DIR / \"redlining_ndvi_newhaven_ct.geojson\"" + ] + }, + { + "cell_type": "markdown", + "id": "caaed0ec-07be-4675-bf2f-c626d500d443", + "metadata": {}, + "source": [ + "### Set up a JupyterGIS project\n", + "\n", + "We can build a JupyterGIS project from scratch in Python. Let's start by adding an OpenStreetMap basemap and displaying the widget in a side panel." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4eb60a7f-dd77-4043-8812-32cac1bdd800", + "metadata": {}, + "outputs": [], + "source": [ + "from jupytergis import GISDocument\n", + "\n", + "jgis_project = GISDocument(\"new_haven_redlining_analysis.jGIS\")\n", + "jgis_project.add_raster_layer(\n", + " url=\"https://a.basemaps.cartocdn.com/dark_all/{z}/{x}/{y}@2x.png\",\n", + " name=\"Basemap\",\n", + ")\n", + "\n", + "# Open this in a new panel on the right\n", + "jgis_project.sidecar()" + ] + }, + { + "cell_type": "markdown", + "id": "40c7da62-6405-4bb7-8c81-ff950db59ff3", + "metadata": {}, + "source": [ + "You should see a map on the right with a simple dark basemap.\n", + "\n", + "Try closing or shrinking the file browser on the left to make more room!" + ] + }, + { + "cell_type": "markdown", + "id": "b2133145-ef92-4d5e-84b8-a53546ef1fe1", + "metadata": {}, + "source": [ + "### Get historical redlining data\n", + "\n", + "We're using [🔗DuckDB](https://duckdb.org/) to connect to a [🔗geopackage](https://www.geopackage.org/) dataset containing data about redlining, and filter that data to select residential neighborhoods in New Haven, Connecticut, USA." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b3a59b77-e25f-4353-ac66-2269fcb90107", + "metadata": {}, + "outputs": [], + "source": [ + "import ibis\n", + "from ibis import _\n", + "\n", + "\n", + "con = ibis.duckdb.connect(extensions=[\"spatial\"])\n", + "\n", + "redlines = (\n", + " con\n", + " .read_geo(\"/vsicurl/http://dsl.richmond.edu/panorama/redlining/static/mappinginequality.gpkg\")\n", + " .filter(_.city == \"New Haven\", _.residential)\n", + " # Add a numeric grade to provide a scalar value for JupyterGIS symbology\n", + " # \"A\" is ascii value 65, so we subtract 64 to start with 1.\n", + " # TODO: Do this in Pandas not DuckDB! (.astype('category').cat.codes)\n", + " # Or: https://pandas.pydata.org/docs/reference/api/pandas.Series.map.html#pandas.Series.map\n", + " .mutate(numeric_grade=_.grade.ascii_str() - 64)\n", + ")\n", + "\n", + "new_haven_redlining = redlines.execute().set_crs(\"EPSG:4326\")\n", + "new_haven_bbox = new_haven_redlining.total_bounds\n", + "\n", + "new_haven_redlining.to_file(INEQUALITY_GEOJSON_FILE, engine=\"fiona\")\n", + "new_haven_redlining.head()" + ] + }, + { + "cell_type": "markdown", + "id": "d5b8285d-b3c3-4dc8-b6dd-6c2c654ea806", + "metadata": {}, + "source": [ + "You should notice that 5 rows of data are shown above. They're all residential neighborhoods with `grade` \"A\", meaning the neighborhoods were mostly occupied by white citizens. This grade was used to discriminate against non-white people seeking home loans.\n", + "\n", + "There is a `fill` column containing a hexadecimal color code, and a `geom` column containing polygon shapes.\n", + "\n", + "There's an additional `numeric_grade` column which contains a number-encoded version of the `grade` column." + ] + }, + { + "cell_type": "markdown", + "id": "63f92fa9-49cd-4e9e-a19d-7d6a60747b65", + "metadata": {}, + "source": [ + "#### Explore the data\n", + "\n", + "Let's explore the data a little bit in JupyterGIS. First, add the data to the map:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2c86c5ab-e38d-4503-a45b-8cd5c588515d", + "metadata": {}, + "outputs": [], + "source": [ + "jgis_project.add_geojson_layer(\n", + " path=INEQUALITY_GEOJSON_FILE,\n", + " name=\"New Haven neighborhood redlining\",\n", + ");" + ] + }, + { + "cell_type": "markdown", + "id": "d10bc86d-5c57-4b81-b52b-731bcd153059", + "metadata": {}, + "source": [ + "You should notice that the layers panel on the left of the JupyterGIS UI contains a new layer: \"New Haven neighborhood redlining\"." + ] + }, + { + "cell_type": "markdown", + "id": "7663ab54-2d69-479b-b8ce-cbdc0bc13143", + "metadata": {}, + "source": [ + ":::{important} In the JupyterGIS UI...\n", + "After running the cell above, **right-click the \"New Haven neighborhood redlining\" layer** in the JupyterGIS interface, and **select \"Zoom to layer\"**. \n", + "\n", + "Now, with the \"New Haven neighborhood redlining\" layer selected, **click the `i` (identify) icon in the toolbar** at the top of the JupyterGIS interface.\n", + "\n", + "Select some neighborhoods and view their \"Grade\" and \"Category\" attributes.\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "id": "37e363b6-8fbc-4cf0-aa77-1d099872fca7", + "metadata": {}, + "source": [ + "#### Configure symbology\n", + "\n", + "This dataset has a \"fill\" attribute which determines which color each polygon should be rendered with based on its grade.\n", + "\n", + "::::::{important} In the JupyterGIS UI...\n", + "\n", + ":::{warning} Beware of bugs!\n", + "The symbology menu is a bit fragile. Stick to the instructions and you'll be OK! If anything goes wrong, try removing the layer (right-click) and re-adding it.\n", + ":::\n", + "\n", + "Right-click the layer \"New Haven neighborhood redlining\" and select **Edit Symbology**. This symbology mode renders each feature based on a data attribute containing a color.\n", + "\n", + "**Select \"Canonical\"** from the \"Render Type\" menu. The \"Value\" field should contain the \"fill\" attribute already, so **click OK**.\n", + "\n", + "You should notice that the neighborhoods ranked \"A\" are green, \"B\" are blue, \"C\" are yellow, and \"D\" are red.\n", + "::::::" + ] + }, + { + "cell_type": "markdown", + "id": "61803898-9832-44a9-a04f-23bdb1a4ac3c", + "metadata": {}, + "source": [ + "### Calculate NDVI\n", + "\n", + "We're going to calculate NDVI from Sentinel-2 data." + ] + }, + { + "cell_type": "markdown", + "id": "e6dd8c69-3254-45a1-b8f9-6a5004bb5e7b", + "metadata": {}, + "source": [ + "#### Open Sentinel-2 data\n", + "\n", + "We are using a [🔗STAC catalog](https://stacspec.org/en) to locate the data files we're interested in (covering New Haven during Summer 2024, with <20% cloud cover) and opening them as an Xarray DataSet." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ad4f51ef-822c-4f31-b9e4-ec28864d9f2b", + "metadata": {}, + "outputs": [], + "source": [ + "import odc.stac\n", + "from pystac_client import Client\n", + "\n", + "items = Client.open(\n", + " \"https://earth-search.aws.element84.com/v1\"\n", + ").search(\n", + " collections = ['sentinel-2-l2a'],\n", + " bbox=new_haven_bbox,\n", + " datetime = \"2024-06-01/2024-09-01\",\n", + " query={\"eo:cloud_cover\": {\"lt\": 20}}\n", + ").item_collection()\n", + "\n", + "data = odc.stac.load(\n", + " items,\n", + " bands=[\"nir08\", \"red\"],\n", + " bbox=new_haven_bbox,\n", + " resolution=10,\n", + " groupby=\"solar_day\",\n", + " chunks = {},\n", + ")\n", + "data" + ] + }, + { + "cell_type": "markdown", + "id": "bb7cfddd-8a32-447c-97f2-823a86b5531f", + "metadata": {}, + "source": [ + "You should notice that the two requested data variables (near-infrared and red) are present in the xarray DataSet." + ] + }, + { + "cell_type": "markdown", + "id": "81a2b33a-117d-411b-8f73-4dd873252738", + "metadata": {}, + "source": [ + "#### Do the NDVI calculation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1a1225c8-80f6-4207-b065-7a302524eff8", + "metadata": {}, + "outputs": [], + "source": [ + "ndvi = (\n", + " (data.nir08 - data.red) / (data.red + data.nir08)\n", + ").median(\n", + " \"time\",\n", + " keep_attrs=True,\n", + ").where(\n", + " lambda ndvi: ndvi < 1\n", + ").compute()\n", + "\n", + "ndvi.plot.imshow()" + ] + }, + { + "cell_type": "markdown", + "id": "809c8abf-2d96-4e22-8a41-2604ccc8e419", + "metadata": {}, + "source": [ + "#### Save the NDVI raster to file" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1155a480-3e54-4e9e-adc2-c191e4ff55fe", + "metadata": {}, + "outputs": [], + "source": [ + "import rioxarray\n", + "\n", + "# TODO: Use jupytergis_tiler\n", + "ndvi.rio.reproject(\n", + " \"EPSG:4326\",\n", + ").rio.to_raster(\n", + " raster_path=NDVI_GEOTIFF_FILE, \n", + " driver=\"COG\", # Cloud-Optimized GeoTIFF\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "14a184a5-00ab-4347-b576-bc384b715c92", + "metadata": {}, + "source": [ + "#### Explore the data\n", + "\n", + "Let's explore the data in JupyterGIS again.\n", + "This time, we'll add the NDVI data as a layer with the GUI.\n", + "\n", + "::::::{important} In the JupyterGIS UI...\n", + "\n", + ":::{warning} Beware of bugs!\n", + "The symbology menu is a bit fragile. You may see some things that look a little weird, but stick to the instructions and you'll be OK! If anything goes wrong, try removing the layer (right-click) and re-adding it.\n", + ":::\n", + "\n", + "If the \"identify\" tool is still active, click the `i` icon in the toolbar again to disable it.\n", + "\n", + "Now, **click the `+` icon in the toolbar** to open the new layer interface.\n", + "**Select \"Add Raster Layer\", then \"New GeoTiff Layer\"**.\n", + "\n", + "**Select \"Browse Server Files\"** and then **navigate to `ndvi.tif` in the module 6 data directory** (`workshop-open-source-geospatial/modules/06-geojupyter/data/ndvi.tif`).\n", + "\n", + "Click **Select**.\n", + "\n", + "**Set the \"Min\" field to `0` and \"Max\" to `1`**.\n", + "\n", + "**Uncheck \"Normalize\"**.\n", + "\n", + "Scroll down to **input the layer name as \"NDVI\"**.\n", + "\n", + "**Click \"OK\"** to add the layer to the map.\n", + "\n", + "You should notice that the data doesn't look right -- it's entirely black-and-white.\n", + "\n", + "We need to set up the symbology now.\n", + "\n", + "Finally, **right-click the \"NDVI\" layer** and **select \"Edit Symbology\"**. The symbology menu may take a moment to load. Be patient! **Select \"Classify\" then click \"OK\".**\n", + "\n", + "The brighter areas have a higher NDVI value, and the darker areas have a lower one.\n", + "\n", + "We can use the identify tool (`i` icon in the toolbar) to explore the raw values.\n", + "\n", + "You might notice that the colored polygons show through in areas where the NDVI is `NaN`. You can hide that layer with the 👁️ icon in the JupyterGIS layers panel.\n", + "::::::" + ] + }, + { + "cell_type": "markdown", + "id": "41a3f550-04da-4170-9e69-b4131d428fdd", + "metadata": {}, + "source": [ + "### Zonal Statistics: Calculating mean NDVI for each New Haven neighborhood\n", + "\n", + "To find out whether neighborhoods graded \"A\" are greener than neighborhoods graded \"D\", we'll calculate the mean NDVI for each neighborhood using [🔗exactextract](https://isciences.github.io/exactextract/background.html), which is known for its capability to include fractional grid cells in its calculation (as opposed to other tools, where a cell is binary, either inside or outside the polygon)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "40d9b9d9-2cd1-4f14-8758-ea6c4bcd0c97", + "metadata": {}, + "outputs": [], + "source": [ + "from exactextract import exact_extract\n", + "\n", + "new_haven_redlining_and_ndvi = exact_extract(\n", + " NDVI_GEOTIFF_FILE,\n", + " new_haven_redlining,\n", + " \"mean_ndvi=mean\", # Give the column a human-readable name!\n", + " include_geom = True,\n", + " include_cols=[\"label\", \"grade\", \"city\", \"fill\"],\n", + " output=\"pandas\",\n", + ")\n", + "\n", + "new_haven_redlining_and_ndvi.set_crs(\n", + " \"EPSG:4326\"\n", + ").to_file(INEQUALITY_NDVI_GEOJSON_FILE, engine=\"fiona\")\n", + "\n", + "new_haven_redlining_and_ndvi.head()" + ] + }, + { + "cell_type": "markdown", + "id": "2d9542f4-131a-4fb4-9328-7b7a2981510b", + "metadata": {}, + "source": [ + "You should notice that the dataset now includes a `mean_ndvi` column with the results of our zonal statistics calculation! 🎉" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8244a541-59d2-4372-a8b6-9f7f8b6483de", + "metadata": {}, + "outputs": [], + "source": [ + "jgis_project.add_geojson_layer(\n", + " path=INEQUALITY_NDVI_GEOJSON_FILE,\n", + " name=\"New Haven neighborhood redlining w/ NDVI\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "7a1d4e12-9363-4625-9001-bc6aef655a7f", + "metadata": {}, + "source": [ + "Notice that there are now two copies of the New Haven neighborhoods polygon layer. The new one is titled the same as the old one with \"w/ NDVI\" appended.\n", + "\n", + "::::::{important} In the JupyterGIS UI...\n", + "\n", + ":::{warning} Beware of bugs!\n", + "If the new redlining w/ NDVI layer disappears, remove it from the map by right-clicking it in the JupyterGIS layers panel and selecting \"Remove Layer\". Then re-run the cell above.\n", + ":::\n", + "\n", + "Temporarily hide the \"NDVI\" raster layer with the 👁️ icon in the JupyterGIS layers panel so you can more clearly see the new layer.\n", + "\n", + "Hide the \"old\" copy of the redlining layer (\"New Haven neighborhood redlining\") with the 👁️ icon in the JupyterGIS layers panel.\n", + ":::::::\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "0e66de07-3861-414e-b2e0-d3d320d4673c", + "metadata": {}, + "source": [ + "### Visualizing correlation\n", + "\n", + "Now that we have neighborhood grades and mean NDVI for each neighborhood, how do we visualize any potential correlation?\n", + "\n", + "TODO: Bivariate choropleth?\n", + "\n", + "TODO: ?" + ] + }, + { + "cell_type": "markdown", + "id": "455001f2-e26b-4b23-97a5-610151fbc659", + "metadata": {}, + "source": [ + "## Future" + ] + }, + { + "cell_type": "markdown", + "id": "f5f96150-f589-4fdd-b21f-2a0b2d933426", + "metadata": {}, + "source": [ + "### Story maps / \"scrolly telling\"\n", + "\n", + "Story map support for JupyterGIS is in progress.\n", + "\n", + "We anticipate working with the [🔗MyST](https://mystmd.org/) and [🔗Closeread](https://closeread.dev/) developers to develop interactive scrollytelling experiences in MyST Markdown documents." + ] + }, + { + "cell_type": "markdown", + "id": "ec8b9c06-a35f-49c4-a061-0009518c57a7", + "metadata": {}, + "source": [ + "### \"microgis\" (placeholder name)\n", + "\n", + "We're working on a [🔗project](https://github.com/geojupyter/jupyter-microgis) to provide an instant layered visual environment for any number of Python datasets (starting with rioxarray DataArrays and GeoPandas GeoDataFrames) in a widget.\n", + "The goal is to minimize time-to-visualization.\n", + "\n", + "It would provide sensible default symbology choices, and customization would be available with as-needed complexity.\n", + "In other words, you shouldn't need to learn a complex symbology expression language when your needs are simple, but complex expression is available if you need it.\n", + "\n", + "```python\n", + "from microgis import explore\n", + "\n", + "\n", + "explore(\n", + " da1, da2, gdf1,\n", + " {\n", + " \"data\": gdf2,\n", + " \"symbology\": {\n", + " \"choropleth\": {\n", + " \"steps\": 11,\n", + " \"classification\": \"natural\",\n", + " },\n", + " },\n", + " },\n", + ")\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "e5152113-a7bc-45dd-84d1-0d19e22196d2", + "metadata": {}, + "source": [ + "### More!\n", + "\n", + ":::{image} https://geojupyter.org/assets/images/community-diagram.svg\n", + ":width: 400px\n", + ":align: center\n", + ":::\n", + "\n", + "GeoJupyter's priorities are broad, and are based on our community's needs. We can only know what those needs are if you join us!\n", + "\n", + "Please join the [🔗Jupyter Zulip](https://jupyter.zulipchat.com) today and find us in the `#geojupyter` channel!\n", + "\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.0" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}