-
Notifications
You must be signed in to change notification settings - Fork 278
Refactor database access to use context managers for connection handling #607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Updated all database interaction functions to utilize `get_db_connection()` for managing SQLite connections. - Removed manual connection handling and commit/rollback logic from individual functions. - Improved error handling and logging for database operations. - Ensured that all database operations are performed within a single transaction where applicable. - Enhanced code readability and maintainability by reducing boilerplate connection code.
WalkthroughRefactors multiple DB modules to use a centralized get_db_connection() context manager, replaces manual sqlite3 connect/commit/close patterns, propagates optional sqlite3.Connection parameters across functions, and tightens album-image validation and error logging. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant ConnMgr as get_db_connection()
participant DB as SQLite
Caller->>ConnMgr: with get_db_connection() as conn
ConnMgr->>DB: open connection + PRAGMA (WAL, busy_timeout)
Caller->>ConnMgr: conn.cursor() -> execute SQL
alt success
ConnMgr->>DB: COMMIT
ConnMgr-->>Caller: results
else error
ConnMgr->>DB: ROLLBACK
ConnMgr-->>Caller: exception (logged)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (1)📓 Common learnings🧬 Code graph analysis (1)backend/app/utils/face_clusters.py (4)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/database/faces.py (1)
88-132: Consider transaction atomicity for multi-face insertion.When inserting multiple faces for one image (lines 113-126), each call to
db_insert_face_embeddings()creates a separate transaction. If insertion fails partway through, earlier faces are already committed, potentially leaving partial data.Consider refactoring to use a single transaction for all faces:
def db_insert_face_embeddings_by_image_id( image_id: ImageId, embeddings: Union[FaceEmbedding, List[FaceEmbedding]], confidence: Optional[Union[float, List[float]]] = None, bbox: Optional[Union[BoundingBox, List[BoundingBox]]] = None, cluster_id: Optional[Union[ClusterId, List[ClusterId]]] = None, ) -> Union[FaceId, List[FaceId]]: # Handle multiple faces in one image if ( isinstance(embeddings, list) and len(embeddings) > 0 and isinstance(embeddings[0], np.ndarray) ): - face_ids = [] - for i, emb in enumerate(embeddings): - conf = ( - confidence[i] - if isinstance(confidence, list) and i < len(confidence) - else confidence - ) - bb = bbox[i] if isinstance(bbox, list) and i < len(bbox) else bbox - cid = ( - cluster_id[i] - if isinstance(cluster_id, list) and i < len(cluster_id) - else cluster_id - ) - face_id = db_insert_face_embeddings(image_id, emb, conf, bb, cid) - face_ids.append(face_id) - return face_ids + with get_db_connection() as conn: + cursor = conn.cursor() + face_ids = [] + for i, emb in enumerate(embeddings): + conf = ( + confidence[i] + if isinstance(confidence, list) and i < len(confidence) + else confidence + ) + bb = bbox[i] if isinstance(bbox, list) and i < len(bbox) else bbox + cid = ( + cluster_id[i] + if isinstance(cluster_id, list) and i < len(cluster_id) + else cluster_id + ) + embeddings_json = json.dumps([emb.tolist() for emb in emb]) + bbox_json = json.dumps(bb) if bb is not None else None + cursor.execute( + "INSERT INTO faces (image_id, cluster_id, embeddings, confidence, bbox) VALUES (?, ?, ?, ?, ?)", + (image_id, cid, embeddings_json, conf, bbox_json), + ) + face_ids.append(cursor.lastrowid) + return face_ids else: # Single face return db_insert_face_embeddings( image_id, embeddings, confidence, bbox, cluster_id )backend/app/utils/face_clusters.py (1)
378-435: Inconsistent error handling between connection paths.Similar to
_update_cluster_face_image, this function has asymmetric error handling:
- Lines 422-427: Raises when using external connection
- Lines 430-435: Returns
Nonewhen using managed connectionFor consistency and proper transaction rollback, both paths should raise exceptions.
Apply this diff:
# Case 2: Use managed connection try: with get_db_connection() as conn: return perform_query(conn) except Exception as e: logger.error(f"Error getting face data for cluster {cluster_uuid}: {e}") - return None + raise
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
backend/app/database/albums.py(7 hunks)backend/app/database/connection.py(2 hunks)backend/app/database/face_clusters.py(10 hunks)backend/app/database/faces.py(10 hunks)backend/app/database/folders.py(15 hunks)backend/app/database/images.py(8 hunks)backend/app/database/metadata.py(3 hunks)backend/app/utils/face_clusters.py(6 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Hemil36
Repo: AOSSIE-Org/PictoPy PR: 570
File: backend/app/database/connection.py:16-24
Timestamp: 2025-10-31T17:00:50.116Z
Learning: In PictoPy backend, the user prefers not to use database connection retry logic or extended busy timeouts in the centralized get_db_connection() context manager, even though the app has concurrent access patterns via ProcessPoolExecutor and FastAPI.
📚 Learning: 2025-10-31T17:00:50.116Z
Learnt from: Hemil36
Repo: AOSSIE-Org/PictoPy PR: 570
File: backend/app/database/connection.py:16-24
Timestamp: 2025-10-31T17:00:50.116Z
Learning: In PictoPy backend, the user prefers not to use database connection retry logic or extended busy timeouts in the centralized get_db_connection() context manager, even though the app has concurrent access patterns via ProcessPoolExecutor and FastAPI.
Applied to files:
backend/app/database/images.pybackend/app/database/faces.pybackend/app/database/connection.py
🧬 Code graph analysis (7)
backend/app/database/albums.py (1)
backend/app/database/connection.py (1)
get_db_connection(11-37)
backend/app/database/images.py (2)
backend/app/database/connection.py (1)
get_db_connection(11-37)backend/app/utils/images.py (1)
image_util_parse_metadata(496-513)
backend/app/database/faces.py (2)
backend/app/database/connection.py (1)
get_db_connection(11-37)backend/app/database/metadata.py (1)
_update(63-67)
backend/app/database/metadata.py (2)
backend/app/database/connection.py (1)
get_db_connection(11-37)backend/app/database/faces.py (1)
_update(292-307)
backend/app/database/folders.py (1)
backend/app/database/connection.py (1)
get_db_connection(11-37)
backend/app/utils/face_clusters.py (4)
backend/app/database/face_clusters.py (2)
db_delete_all_clusters(36-57)db_insert_clusters_batch(60-104)backend/app/database/faces.py (1)
db_update_face_cluster_ids_batch(270-313)backend/app/database/metadata.py (1)
db_update_metadata(49-75)backend/app/database/connection.py (1)
get_db_connection(11-37)
backend/app/database/face_clusters.py (1)
backend/app/database/connection.py (1)
get_db_connection(11-37)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests
🔇 Additional comments (11)
backend/app/database/faces.py (5)
5-5: LGTM: Centralized connection import added.The import of
get_db_connectionaligns with the PR objective to standardize connection handling across the codebase.
29-48: LGTM: Table creation now uses context manager.The refactoring correctly delegates transaction management to
get_db_connection(), eliminating manual commit/close calls.
50-86: LGTM: Insertion now uses context manager.The function correctly uses the context manager and returns the face_id before the automatic commit. This is safe because the context manager commits on successful exit.
135-204: LGTM: Read-only operations correctly refactored.All query functions properly use the context manager for connection handling. The automatic commit on read-only operations is harmless and simplifies the pattern.
Also applies to: 207-228, 231-267, 316-363
270-313: LGTM: Batch update supports optional connection reuse.The signature change from cursor to optional connection, combined with the dual-path pattern, correctly enables both standalone use and transaction reuse. This matches the pattern used in
backend/app/database/face_clusters.pyfor consistency.backend/app/utils/face_clusters.py (2)
124-150: Excellent: Atomic transaction for full reclustering.All database operations for the full reclustering path—deleting old clusters, inserting new ones, updating face assignments, generating face images, and updating metadata—are executed within a single transaction. This ensures atomicity: either all changes succeed together or all are rolled back on failure.
13-13: LGTM: Supporting changes for transaction pattern.The
Tupleimport supports the updated type hints (line 380), and makingconna required parameter for_generate_cluster_face_imageis appropriate since this function is only called within an active transaction (line 136).Also applies to: 555-594
backend/app/database/face_clusters.py (4)
3-3: LGTM: Centralized connection import.Consistent with the refactoring in
backend/app/database/faces.py.
21-33: LGTM: Table creation refactored correctly.The context manager pattern is correctly applied for DDL operations.
36-57: LGTM: Consistent dual-path pattern for connection reuse.These functions correctly implement the same dual-path pattern seen in
backend/app/database/faces.py::db_update_face_cluster_ids_batch, enabling both standalone use and connection reuse within transactions. The pattern is consistent across all batch/update operations in the refactored codebase.Also applies to: 60-104, 161-205
107-131: LGTM: Read-only operations refactored correctly.All query operations properly use the context manager. The refactoring eliminates manual connection management while maintaining correct behavior.
Also applies to: 134-158, 208-247, 250-316
This pull request refactors the database access layer to use a context-managed connection helper (
get_db_connection) across the albums, face clusters, and faces modules. This change centralizes connection handling, improves code readability, and ensures more robust transaction management and error logging. Additionally, it updates database transaction settings for better concurrency and reliability.Database connection management and transaction handling:
Replaced direct usage of
sqlite3.connectand manual connection/transaction handling inalbums.py,face_clusters.py, andfaces.pywith the centralizedget_db_connectioncontext manager, significantly reducing boilerplate code and potential connection leaks.Refactored batch and update operations in
face_clusters.pyto accept an optional connection, further supporting reuse and atomic transactions. [Database reliability and logging improvements:
get_db_connectionto set SQLite PRAGMAs for Write-Ahead Logging (journal_mode=WAL), increased busy timeout, and added exception logging on transaction failure for better reliability and debugging.These changes make the database access layer more robust, maintainable, and easier to debug.
Summary by CodeRabbit
Improvements
Bug Fixes
Refactor