-
Notifications
You must be signed in to change notification settings - Fork 227
Switch VLMs Download to Streaming #1752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| with open(temp_file_path, 'wb') as f: | ||
| for chunk in response.iter_content(chunk_size=chunk_size): | ||
| if chunk: | ||
| f.write(chunk) |
Check failure
Code scanning / CodeQL
Clear-text storage of sensitive information High
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (password)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (password)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (password)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (password)
This expression stores
sensitive data (secret)
This expression stores
sensitive data (password)
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 1 day ago
To fix this issue, we need to ensure that responses resulting from API calls where secrets (API keys) are sent via query parameters are not stored as cleartext files on disk, especially if these responses could include confidential information (such as the API key itself, credentials, or sensitive configuration). There are two options:
- Option 1: Avoid sending API keys as query parameters. Instead, use HTTP headers for authentication. This will remove the risk of API keys being echoed in the response and stop them from appearing in URLs written to logs or disk.
- Option 2: If the URLs must include API keys (for backwards compatibility or external requirements), then prior to writing the response content to disk, we must either:
- Parse and redact any API keys or secret information from the response content before writing,
- Or, if the content itself is non-sensitive binary (e.g., model weights), ensure by contract that no secrets are returned. At minimum, do not log the URL with queries, and do not persist such responses if they could contain secrets.
Recommended fix: Refactor all relevant HTTP request code (specifically _get_from_url() and _stream_url_to_cache()) so that API keys are never sent via query parameters, but rather through HTTP headers such as Authorization: Bearer <token> or custom headers. Update construction of API URLs in get_roboflow_model_data() so "api_key" is removed from query parameters, and instead set proper headers for requests. This way, no secret will ever be present in the request URL, nor in the persisted content.
Code changes:
- Update all locations where the
"api_key"is appended toparamsin API URL construction to remove it. - Update
build_roboflow_api_headers()(presumed utility) to include the API key as a header for authentication if not already. - Ensure that
_get_from_url(), and thus_stream_url_to_cache(), are calling requests with the API key in headers only. - Ensure that logging of URLs in these flows does not print secrets.
All changes are to be made in blocks shown for inference/core/roboflow_api.py; adjust only the code you have been shown related to query parameter construction and network request authorization. No additional dependencies are needed.
-
Copy modified lines R368-R369 -
Copy modified line R885
| @@ -365,8 +365,8 @@ | ||
| ("device", device_id), | ||
| ("dynamic", "true"), | ||
| ] | ||
| if api_key is not None: | ||
| params.append(("api_key", api_key)) | ||
| # Remove API key from query params – authentication will use headers instead. | ||
| # params.append(("api_key", api_key)) | ||
|
|
||
| if ( | ||
| INTERNAL_WEIGHTS_URL_SUFFIX == "serverless" | ||
| @@ -882,7 +882,7 @@ | ||
| try: | ||
| response = requests.get( | ||
| wrap_url(url), | ||
| headers=build_roboflow_api_headers(), | ||
| headers=build_roboflow_api_headers(api_key=api_key if 'api_key' in locals() else None), | ||
| timeout=ROBOFLOW_API_REQUEST_TIMEOUT, | ||
| verify=ROBOFLOW_API_VERIFY_SSL, | ||
| stream=stream, |
| if chunk: | ||
| f.write(chunk) | ||
| if computed_md5 is not None: | ||
| computed_md5.update(chunk) |
Check failure
Code scanning / CodeQL
Use of a broken or weak cryptographic hashing algorithm on sensitive data High
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (password)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Sensitive data (secret)
Copilot Autofix
AI 1 day ago
Copilot could not generate an autofix suggestion
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
| f"Downloading {filename}: {downloaded_mb:.1f}MB" | ||
| ) | ||
|
|
||
| logger.info(progress_msg) |
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 1 day ago
To fix this vulnerability, the code must ensure that sensitive information (e.g., API keys, secrets, tokens) is never logged. Specifically, the filename variable—used in the progress messages—must be sanitized before logging/printing. The sanitization should guarantee that sensitive query parameters like api_key, token, or any similar fields are removed from the string prior to logging.
In practical terms, this could be accomplished by stripping any query string from filenames extracted from URLs or by redacting any sensitive fields before logging. The safest approach is to avoid logging filenames that are derived directly from URLs containing sensitive query parameters. If filenames must be logged (for traceability or progress reporting), they should be passed through a sanitizer that removes query parameters or replaces known-sensitive values with [REDACTED].
Implementation steps:
- Before using
filenamein any log or print statement, ensure any sensitive query parameters (such asapi_key,token,signature, etc.) are removed or replaced with a placeholder (e.g.,[REDACTED]). - Implement a helper function, e.g.,
sanitize_filename_for_logging, that parsesfilenameas a URL (if applicable), strips query parameters, or replaces the value of sensitive fields with[REDACTED]. - Use this sanitized version of
filenamein all logging and printing within the affected progress reporting code ininference/core/roboflow_api.pyin the_stream_url_to_cachefunction.
-
Copy modified lines R965-R979 -
Copy modified lines R1005-R1006 -
Copy modified line R1024 -
Copy modified line R1029
| @@ -962,6 +962,21 @@ | ||
| ) | ||
| from inference.core.utils.file_system import dump_bytes_atomic, dump_bytes | ||
|
|
||
| def sanitize_filename_for_logging(fname: str) -> str: | ||
| try: | ||
| # Only keep the base path, drop the query string if present | ||
| # If this is not a URL, just return as is | ||
| if "?" in fname: | ||
| fname = fname.split("?")[0] | ||
| # Redact any common sensitive keywords that may appear | ||
| # as a belt-and-suspenders catch. This is basic but helpful. | ||
| sensitive_keys = ["api_key", "token", "signature", "key"] | ||
| for keyword in sensitive_keys: | ||
| fname = fname.replace(keyword, "[REDACTED]") | ||
| return fname | ||
| except Exception: | ||
| return "[UNSAFE_FILENAME]" | ||
|
|
||
| initialise_cache(model_id=model_id) | ||
| cache_file_path = get_cache_file_path(file=filename, model_id=model_id) | ||
|
|
||
| @@ -987,6 +1002,8 @@ | ||
| last_logged_mb = 0 | ||
| log_interval_mb = 10 | ||
|
|
||
| safe_filename = sanitize_filename_for_logging(filename) | ||
|
|
||
| with open(temp_file_path, "wb") as f: | ||
| for chunk in response.iter_content(chunk_size=chunk_size): | ||
| if chunk: | ||
| @@ -1004,12 +1021,12 @@ | ||
| total_mb = total_size_int / (1024 * 1024) | ||
| percent = (downloaded_bytes / total_size_int) * 100 | ||
| progress_msg = ( | ||
| f"Downloading {filename}: {downloaded_mb:.1f}MB / " | ||
| f"Downloading {safe_filename}: {downloaded_mb:.1f}MB / " | ||
| f"{total_mb:.1f}MB ({percent:.1f}%)" | ||
| ) | ||
| else: | ||
| progress_msg = ( | ||
| f"Downloading {filename}: {downloaded_mb:.1f}MB" | ||
| f"Downloading {safe_filename}: {downloaded_mb:.1f}MB" | ||
| ) | ||
|
|
||
| logger.info(progress_msg) |
| ) | ||
|
|
||
| logger.info(progress_msg) | ||
| print(f"\r{progress_msg}", end="", flush=True) |
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (password)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (password)
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 1 day ago
The best way to fix this is to prevent sensitive information such as API keys from ever being printed to stdout or logs. Specifically, any variable derived from URLs or filenames that may contain sensitive query parameters must be sanitized before logging or printing. In this context, it is necessary to ensure that the API key (or any other sensitive data) is not included in progress_msg on line 1016. The sanitization should be applied to the filename variable, removing any query strings or sensitive components, so that the printed output is guaranteed to be devoid of sensitive data.
Steps to fix:
- Sanitize the
filenameso that it contains only the base filename (no query params, no API key). - Replace logging/printing code (line 1015 and 1016) to use the sanitized version.
- Make sure sanitization is enforced even if future changes modify how the filename is derived.
No additional imports are needed; use Python’s standard string manipulation methods.
Edit only the affected file and region:
- In
inference/core/roboflow_api.py, sanitize thefilenamebefore constructingprogress_msgand use the sanitized version for logging/printing in the_stream_url_to_cachefunction.
-
Copy modified lines R990-R991 -
Copy modified line R1009 -
Copy modified line R1014
| @@ -987,6 +987,8 @@ | ||
| last_logged_mb = 0 | ||
| log_interval_mb = 10 | ||
|
|
||
| # Sanitize filename to avoid leaking sensitive data | ||
| sanitized_filename = filename.split("?")[0].split("/")[-1] if filename else "" | ||
| with open(temp_file_path, "wb") as f: | ||
| for chunk in response.iter_content(chunk_size=chunk_size): | ||
| if chunk: | ||
| @@ -1004,12 +1006,12 @@ | ||
| total_mb = total_size_int / (1024 * 1024) | ||
| percent = (downloaded_bytes / total_size_int) * 100 | ||
| progress_msg = ( | ||
| f"Downloading {filename}: {downloaded_mb:.1f}MB / " | ||
| f"Downloading {sanitized_filename}: {downloaded_mb:.1f}MB / " | ||
| f"{total_mb:.1f}MB ({percent:.1f}%)" | ||
| ) | ||
| else: | ||
| progress_msg = ( | ||
| f"Downloading {filename}: {downloaded_mb:.1f}MB" | ||
| f"Downloading {sanitized_filename}: {downloaded_mb:.1f}MB" | ||
| ) | ||
|
|
||
| logger.info(progress_msg) |
Description
Currently all roboflow_api downloads save the whole object into RAM. For large VLMs, the weight files size exceed the RAM capacity in many cases, making VLMs unusable. This PR fixes the issue by adding chunked download and switching transformers models to use it.
Type of change
Please delete options that are not relevant.
How has this change been tested, please provide a testcase or example of how you tested the change?
Below are testing screenshots. Prior to a fix RAM usage kept increasing alongside file download progress, eventually killing the whole process (1st screenshot).
After the fix RAM usage stays fixed at chunk size ~1MB , in this case the inference works (2nd screenshot).
Tested it by running Qwen-2.5-VL inference on L4 machine with 16GB RAM. Prior to the fix the process was getting killed and Qwen-2.5-VL, after the fixed inference successfully ran on my machine.
Any specific deployment considerations
No
Docs