docs: update broken source URLs in dataset metadata #728
Updates several broken or outdated source URLs in datapackage_additions.toml:
- airports.csv: Update to aviation-facilities1 (old dataset ID removed)
- londonBoroughs.json: Update to current data.london.gov.uk URL
- population_engineers_hurricanes.csv: Remove outdated FactFinder source (deprecated)
- us-10m.json: Fix LICENSE URL (remove incorrect .md extension)
- world-110m.json: Fix LICENSE URL (remove incorrect .md extension)
- penguins.json: Update LTER URL to palmerpenguins R package site

Also fixes documentation URLs in CONTRIBUTING.md:
- Update datapackage.org/standard URL

All URLs verified accessible. Broken URLs discovered during link checking with lychee. Regenerated datapackage.json and datapackage.md via npm build.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Update LICENSE reference to point to GitHub README.md#license section instead of local ./LICENSE file which doesn't exist in the repository.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Update Palmer Station source URL to the correct LTER site at Rutgers (https://pallter.marine.rutgers.edu/) instead of the palmerpenguins R package site.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
For removing URLs. Can we instead link to the internet archive?
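As a rough illustration of the Internet Archive suggestion, the sketch below builds a Wayback Machine link for a dead source URL. The helper name `wayback_url` is hypothetical and not part of this repository. It only formats the address using the `https://web.archive.org/web/<timestamp>/<url>` pattern; whether a snapshot actually exists still has to be checked by hand.

```python
from urllib.parse import urlsplit

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url: str, timestamp: str = "") -> str:
    """Format a Wayback Machine address for `url` (hypothetical helper).

    `timestamp` is a string of digits in YYYYMMDDhhmmss form; it may be
    truncated (for example "2019"), in which case the archive resolves to a
    nearby capture. An empty timestamp asks for the most recent capture.
    This function does not contact the archive or confirm a snapshot exists.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if timestamp and not (timestamp.isascii() and timestamp.isdigit()):
        raise ValueError(f"timestamp must contain only digits: {timestamp!r}")
    if timestamp:
        return f"{WAYBACK_PREFIX}/{timestamp}/{url}"
    return f"{WAYBACK_PREFIX}/{url}"


if __name__ == "__main__":
    # The old Palmer Station URL from this PR, used purely as an example input.
    print(wayback_url("https://pal.lternet.edu/", "2019"))
```

Keeping the original URL intact inside the archive link preserves provenance while still giving readers something that resolves.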
…n_engineers_hurricanes.csv Updated resource descriptions and sources in the datapackage_additions TOML file for data/population_engineers_hurricanes.csv. Corrects year and ACS dataset for population and employment data and confirms ratio denominator. Identifies likely source, a NOAA FAQ, of state-level hurricane count aggregation that specifies the methodology used (e.g. 'direct hits') as well as link to disaggregated data table.
should be good now. as it turns out the prior source was inaccurate anyway. replaced with correct sources. thanks @domoritz
For all the updated URLs, are they for exactly the same data as before? Otherwise, I'd prefer keeping a non working but correct link.
Valid point. In this case, everything is confirmed identical (or were old links that had always been wrong) except for the
Given the inevitability of link rot, maybe we can best address the tradeoff between provenance and accessibility by hammering out a policy for the
Policy for Updating URLs
After a closer look. I should be able to replicate (or nearly replicate) |
@domoritz Happy to answer any further concerns. That said, I am comfortable these changes improve accuracy and do not mask the provenance or sources of any of the datasets.
URLs should never change but of course do. Yet, it's not our job to keep them up to date. If you want to update URLs, that's fine but I don't want a policy as it implies that we continue to update URLs. |
Summary
Fixes broken and outdated source URLs in dataset metadata and documentation. These issues were identified during link checking as part of #724.
Changes
Dataset Metadata URLs (_data/datapackage_additions.toml)
- airports.csv: Updated Data.gov dataset ID
  - https://catalog.data.gov/dataset/airports-5e97a
  - source of this file is unknown, this data is consistent with files provided on a monthly frequency by the FAA's National Airspace System Resource."
- londonBoroughs.json: Updated London Datastore URL
  - Old: https://data.london.gov.uk/dataset/statistical-gis-boundary-files-london
  - New: https://data.london.gov.uk/dataset/statistical-gis-boundary-files-for-london-20od9/
- population_engineers_hurricanes.csv: Removed outdated FactFinder source
  - Removed: https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_07_3YR_S1901&prodType=table
- us-10m.json: Fixed LICENSE URL
  - Old: https://github.com/topojson/us-atlas/blob/master/LICENSE.md
  - New: https://github.com/topojson/us-atlas/blob/master/LICENSE
- world-110m.json: Fixed LICENSE URL
  - Old: https://github.com/topojson/world-atlas/blob/master/LICENSE.md
  - New: https://github.com/topojson/world-atlas/blob/master/LICENSE
- penguins.json: Updated Palmer Station LTER URL
  - Old: https://pal.lternet.edu/
  - New: https://pallter.marine.rutgers.edu/
Documentation URLs (CONTRIBUTING.md)
- Data Package Standard reference
  - Old: https://datapackage.org/standard/
  - New: https://datapackage.org/
  - /standard/ endpoint returns 404
- LICENSE link
  - Old: ./LICENSE
  - New: https://github.com/vega/vega-datasets/blob/main/README.md#license
Verification
- Regenerated datapackage.json and datapackage.md via npm run build
- uvx taplo fmt --check --diff (TOML formatting)
- uvx ruff check (Python linting)
- uvx ruff format --check (Python formatting)
Related
Links checked as part of #724
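The PR relied on lychee for the actual link checking. As a rough idea of the first step such a check performs, the sketch below pulls every http(s) URL out of a metadata file's text so each one can be verified. The function name `extract_urls` and the sample TOML layout are assumptions made for illustration; they are not taken from the repository's real `datapackage_additions.toml` schema, and the regex is deliberately simple rather than a full URL parser.

```python
import re

# Match http(s) URLs up to whitespace, quotes, or closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)\]]+")


def extract_urls(text: str) -> list[str]:
    """Return the unique URLs found in `text`, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in URL_PATTERN.finditer(text):
        # Drop sentence punctuation that may trail a URL in prose fields.
        url = match.group(0).rstrip(".,;")
        seen.setdefault(url, None)
    return list(seen)


# Hypothetical fragment: the table and key names here are illustrative only.
SAMPLE_TOML = '''
[[resources]]
path = "penguins.json"
[[resources.sources]]
title = "Palmer Station LTER"
path = "https://pallter.marine.rutgers.edu/"

[[resources]]
path = "us-10m.json"
[[resources.licenses]]
path = "https://github.com/topojson/us-atlas/blob/master/LICENSE"
'''

if __name__ == "__main__":
    for url in extract_urls(SAMPLE_TOML):
        print(url)
```

Scanning the raw text instead of parsing the TOML keeps the sketch independent of the file's schema, at the cost of also picking up URLs that appear inside free-text descriptions.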