Commit 2ab9740

Simplify the project
1 parent 694987b commit 2ab9740

20 files changed: 24 additions & 2091 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -95,3 +95,4 @@ _rust.h
 uv.lock
 tests/temp_models/
 *.cast
+*.proptest-regressions

Makefile

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@ rust-test: rust-format ## Run tests
 .PHONY: rust-coverage
 rust-coverage: ## Generate code coverage report for Gaggle crate
 @echo "Generating coverage report..."
-@cargo tarpaulin --manifest-path gaggle/Cargo.toml --all-targets --out Xml
+@cargo tarpaulin --manifest-path gaggle/Cargo.toml --all-targets --out Xml --features "expose_internal"
 
 .PHONY: rust-lint
 rust-lint: rust-format ## Run linter checks on Rust files

ROADMAP.md

Lines changed: 8 additions & 12 deletions
@@ -10,15 +10,14 @@ It outlines features to be implemented and their current status.
 
 * **Authentication**
 * [x] Set Kaggle API credentials programmatically.
-* [x] Support environment variables (using `KAGGLE_USERNAME` and `KAGGLE_KEY`).
-* [x] Support `~/.kaggle/kaggle.json file`.
+* [x] Support environment variables for authentication (`KAGGLE_USERNAME` and `KAGGLE_KEY`).
+* [x] Support reading credentials from `~/.kaggle/kaggle.json file`.
 * **Dataset Operations**
-* [x] Search for datasets.
+* [x] Search for datasets on Kaggle.
 * [x] Download datasets from Kaggle.
 * [x] List files in a dataset.
 * [x] Get dataset metadata.
-* [ ] Upload datasets to Kaggle.
-* [ ] Delete datasets from Kaggle.
+* [ ] Upload DuckDB tables to Kaggle.
 
 ### 2. Caching and Storage
 
@@ -28,7 +27,6 @@ It outlines features to be implemented and their current status.
 * [x] Get cache information (size and storage location).
 * [ ] Set cache size limit.
 * [ ] Cache expiration policies.
-* [ ] Support for partial file downloads and resumes.
 * **Storage**
 * [x] Store datasets in configurable directory.
 * [ ] Support for cloud storage backends (S3, GCS, and Azure).
@@ -37,13 +35,11 @@ It outlines features to be implemented and their current status.
 
 * **File Format Support**
 * [x] CSV and TSV file reading.
-* [x] JSON file reading.
 * [x] Parquet file reading.
+* [x] JSON file reading.
 * [ ] Excel and XLSX file reading.
-* **Direct Query Integration**
+* **Querying Datasets**
 * [x] Replacement scan for `kaggle:` URLs.
-* [ ] Direct SQL queries on remote datasets without full download (true streaming).
-* [ ] Streaming data from Kaggle without caching.
 * [ ] Virtual table support for lazy loading.
 
 ### 4. Performance and Concurrency
@@ -54,7 +50,7 @@ It outlines features to be implemented and their current status.
 * [ ] Concurrent dataset downloads.
 * **Network Optimization**
 * [x] Configurable HTTP timeouts.
-* [ ] Retry logic with backoff (configurable attempts/delay; planned).
+* [ ] Retry logic with backoff for failed requests.
 * **Caching Strategy**
 * [ ] Incremental cache updates.
 * [ ] Background cache synchronization.
@@ -67,7 +63,7 @@ It outlines features to be implemented and their current status.
 * [x] Clear error messages for `NULL` inputs.
 * [ ] Detailed error codes for programmatic error handling.
 * **Resilience**
-* [ ] Automatic retry on network failures (planned with backoff settings).
+* [ ] Automatic retry on network failures.
 * [ ] Graceful degradation when Kaggle API is unavailable.
 * [ ] Local-only mode for cached datasets.
 
