-
-
Notifications
You must be signed in to change notification settings - Fork 4
feat: implement S3 compatible API with authentication support #205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| mimalloc = { version = "0.1.48", features = ["v3", "override"] } | ||
| rand = "0.9.1" | ||
| reqwest = { version = "0.12.22" } | ||
| rust-s3 = "0.37.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is the "official" aws-sdk-s3, but its bloatware just like all other AWS packages.
| /// | ||
| /// - `OS__HIGH_VOLUME_STORAGE__REQUEST_TIMEOUT_SECS=30` | ||
| /// - `OS__LONG_TERM_STORAGE__REQUEST_TIMEOUT_SECS=30` | ||
| request_timeout_secs: Option<u64>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should probably be a Duration, with a humantime annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL humantime has an annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
humantime_serde to be precise, like here:
| #[allow(dead_code)] | ||
| pub extra_headers: HeaderMap, | ||
| pub request_timeout: Option<Duration>, | ||
| pub path_style: Option<bool>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Option<bool> is kind of an antipattern, so lets just make this a bool.
| _path: &ObjectPath, | ||
| response: ResponseData, | ||
| ) -> Result<Option<(Metadata, BackendStream)>> { | ||
| let metadata = Metadata::from_hashmap(response.headers(), "").unwrap_or_default(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should propagate any error that happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how we should propagate it, as I don't want it to return any error if there are any failure during parsing Metadata. I also have no idea on how error handling works on objectstore since we're using anyhow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say we want the correct metadata, or an error. Silently getting empty metadata would be wrong here.
And you can just propagate with ? like any other error.
| let bytes = Bytes::from(response.to_vec()); | ||
| let stream: BackendStream = Box::pin(stream::iter(std::iter::once(Ok(bytes)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S3 will most likely be used for large payloads, and we do not want to buffer those in memory. There should be some kind of way to get a stream from the ResponseData.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is actually: https://docs.rs/rust-s3/latest/s3/bucket/struct.Bucket.html#method.get_object_stream
But it's only compatible with async File IO type. I'm struggling to convert it into reqwest's streaming format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tokio_util::io::ReaderStream is the struct that converts from an AsyncRead to a stream, like here:
Though I know all these conversions can be a mess, so maybe this takes a bit of fiddling around :-(
| /// Extracts metadata from the given [`HashMap`]. | ||
| /// | ||
| /// A prefix can be also be provided which is being stripped from custom non-standard headers. | ||
| pub fn from_hashmap(hash_map: HashMap<String, String>, prefix: &str) -> Result<Self, Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is almost identical to the method above, let's try to avoid the duplication
Closes #127