Skip to content

Commit a221c03

Browse files
Serj-Normsbee
authored andcommitted
feat: [FC-0092] Optimize Course Info Blocks API (#37122)
The Course Info Blocks API endpoint has been known to be rather slow to return the response. Previous investigation showed that the major time sink was the get_course_blocks function, which is called three times in a single request. This commit aims to improve the response times by reducing the number of times that this function is called. Solution Summary The first time the function get_course_blocks is called, the result (transformed course blocks) is stored in the current WSGI request object. Later in the same request, before the second get_course_blocks call is triggered, the already transformed course blocks are taken from the request object, and if they are available, get_course_blocks is not called (if not, it is called as a fallback). Later in the request, the function is called again as before (see Optimization Strategy and Difficulties). Optimization Strategy and Difficulties The original idea was to fetch and transform the course blocks once and reuse them in all three cases, which would reduce get_course_blocks call count to 1. However, this did not turn out to be a viable solution because of the arguments passed to get_course_blocks. Notably, the allow_start_dates_in_future boolean flag affects the behavior of StartDateTransformer, which is a filtering transformer modifying the block structure returned. The first two times allow_start_dates_in_future is False, the third time it is True. Setting it to True in all three cases would mean that some blocks would be incorrectly included in the response. This left us with one option - optimize the first two calls. The difference between the first two calls is the non-filtering transformers, however the second call applies a subset of transformers from the first call, so it was safe to apply the superset of transformers in both cases. This allowed to reduce the number of function calls to 2. However, the cached structure may be further mutated by filters downstream, which means we need to cache a copy of the course structure (not the structure itself). The copy method itself is quite heavy (it calls deepcopy three times), making the benefits of this solution much less tangible. In fact, another potential optimization that was considered was to reuse the collected block structure (pre-transformation), but since calling copy on a collected structure proved to be more time-consuming than calling get_collected, this change was discarded, considering that the goal is to improve performance. Revised Solution To achieve a more tangible performance improvement, it was decided to modify the previous strategy as follows: * Pass a for_blocks_view parameter to the get_blocks function to make sure the new caching logic only affects the blocks view. * Collect and cache course blocks with future dates included. * Include start key in requested fields. * Reuse the cached blocks in the third call, which is in get_course_assignments * Before returning the response, filter out any blocks with a future start date, and also remove the start key if it was not in requested fields
1 parent ec1d959 commit a221c03

File tree

6 files changed

+168
-5
lines changed

6 files changed

+168
-5
lines changed

lms/djangoapps/course_api/blocks/api.py

Lines changed: 22 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
"""
22
API function for retrieving course blocks data
33
"""
4-
4+
from edx_django_utils.cache import RequestCache
55

66
import lms.djangoapps.course_blocks.api as course_blocks_api
77
from lms.djangoapps.course_blocks.transformers.access_denied_filter import AccessDeniedMessageFilterTransformer
@@ -14,6 +14,7 @@
1414
from .toggles import HIDE_ACCESS_DENIALS_FLAG
1515
from .transformers.blocks_api import BlocksAPITransformer
1616
from .transformers.milestones import MilestonesAndSpecialExamsTransformer
17+
from .utils import COURSE_API_REQUEST_CACHE_NAMESPACE, REUSABLE_BLOCKS_CACHE_KEY
1718

1819

1920
def get_blocks(
@@ -29,6 +30,7 @@ def get_blocks(
2930
block_types_filter=None,
3031
hide_access_denials=False,
3132
allow_start_dates_in_future=False,
33+
cache_with_future_dates=False,
3234
):
3335
"""
3436
Return a serialized representation of the course blocks.
@@ -61,6 +63,7 @@ def get_blocks(
6163
allow_start_dates_in_future (bool): When True, will allow blocks to be
6264
returned that can bypass the StartDateTransformer's filter to show
6365
blocks with start dates in the future.
66+
cache_with_future_dates (bool): When True, will use the block caching logic using RequestCache
6467
"""
6568

6669
if HIDE_ACCESS_DENIALS_FLAG.is_enabled():
@@ -118,6 +121,10 @@ def get_blocks(
118121
),
119122
]
120123

124+
if cache_with_future_dates:
125+
# Include future dates such that get_course_assignments can reuse the block structure from RequestCache
126+
allow_start_dates_in_future = True
127+
121128
# transform
122129
blocks = course_blocks_api.get_course_blocks(
123130
user,
@@ -128,6 +135,19 @@ def get_blocks(
128135
include_has_scheduled_content=include_has_scheduled_content
129136
)
130137

138+
if cache_with_future_dates:
139+
# Store a copy of the transformed, but still unfiltered, course blocks in RequestCache to be reused
140+
# wherever possible for optimization. Copying is required to make sure the cached structure is not mutated
141+
# by the filtering below.
142+
request_cache = RequestCache(COURSE_API_REQUEST_CACHE_NAMESPACE)
143+
request_cache.set(REUSABLE_BLOCKS_CACHE_KEY, blocks.copy())
144+
145+
# Since we included blocks with future start dates in our block structure,
146+
# we need to include the 'start' field to filter out such blocks before returning the response.
147+
# If 'start' field is not requested, it will be removed from the response.
148+
requested_fields = set(requested_fields)
149+
requested_fields.add('start')
150+
131151
# filter blocks by types
132152
if block_types_filter:
133153
block_keys_to_remove = []
@@ -142,7 +162,7 @@ def get_blocks(
142162
serializer_context = {
143163
'request': request,
144164
'block_structure': blocks,
145-
'requested_fields': requested_fields or [],
165+
'requested_fields': requested_fields,
146166
}
147167

148168
if return_type == 'dict':

lms/djangoapps/course_api/blocks/utils.py

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
"""
22
Utils for Blocks
33
"""
4+
from edx_django_utils.cache import RequestCache
45
from rest_framework.utils.serializer_helpers import ReturnList
56

67
from openedx.core.djangoapps.discussions.models import (
@@ -9,6 +10,10 @@
910
)
1011

1112

13+
COURSE_API_REQUEST_CACHE_NAMESPACE = "course_api"
14+
REUSABLE_BLOCKS_CACHE_KEY = "reusable_transformed_blocks"
15+
16+
1217
def filter_discussion_xblocks_from_response(response, course_key):
1318
"""
1419
Removes discussion xblocks if discussion provider is openedx.
@@ -63,3 +68,18 @@ def filter_discussion_xblocks_from_response(response, course_key):
6368
response.data['blocks'] = filtered_blocks
6469

6570
return response
71+
72+
73+
def get_cached_transformed_blocks():
74+
"""
75+
Helper function to get an unfiltered course structure from RequestCache,
76+
including blocks with start dates in the future.
77+
78+
Caution: For performance reasons, the function returns the structure object itself, not its copy.
79+
This means the retrieved structure is supposed to be read-only and should not be mutated by consumers.
80+
"""
81+
request_cache = RequestCache(COURSE_API_REQUEST_CACHE_NAMESPACE)
82+
cached_response = request_cache.get_cached_response(REUSABLE_BLOCKS_CACHE_KEY)
83+
reusable_transformed_blocks = cached_response.value if cached_response.is_found else None
84+
85+
return reusable_transformed_blocks

lms/djangoapps/course_api/blocks/views.py

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@
22
CourseBlocks API views
33
"""
44

5+
from datetime import datetime, timezone
56

67
from django.core.exceptions import ValidationError
78
from django.db import transaction
@@ -237,6 +238,7 @@ def list(self, request, usage_key_string, hide_access_denials=False): # pylint:
237238
params.cleaned_data['return_type'],
238239
params.cleaned_data.get('block_types_filter', None),
239240
hide_access_denials=hide_access_denials,
241+
cache_with_future_dates=True
240242
)
241243
)
242244
# If the username is an empty string, and not None, then we are requesting
@@ -339,9 +341,50 @@ def list(self, request, hide_access_denials=False): # pylint: disable=arguments
339341
if not root:
340342
raise ValidationError(f"Unable to find course block in '{course_key_string}'")
341343

344+
# Earlier we included blocks with future start dates in the collected/cached block structure.
345+
# Now we need to emulate allow_start_dates_in_future=False by removing any such blocks.
346+
include_start = "start" in request.query_params['requested_fields']
347+
self.remove_future_blocks(course_blocks, include_start)
348+
342349
recurse_mark_complete(root, course_blocks)
343350
return response
344351

352+
@staticmethod
353+
def remove_future_blocks(course_blocks, include_start: bool):
354+
"""
355+
Mutates course_blocks in place:
356+
- removes blocks whose 'start' is in the future
357+
- also removes references to them from parents' 'children' lists
358+
- removes 'start' key from all blocks if it wasn't requested
359+
"""
360+
if not course_blocks:
361+
return course_blocks
362+
363+
now = datetime.now(timezone.utc)
364+
365+
# 1. Collect IDs of blocks to remove
366+
to_remove = set()
367+
for block_id, block in course_blocks.items():
368+
get_field = block.get if include_start else block.pop
369+
start = get_field("start")
370+
if start and start > now:
371+
to_remove.add(block_id)
372+
373+
if not to_remove:
374+
return course_blocks
375+
376+
# 2. Remove the blocks themselves
377+
for block_id in to_remove:
378+
course_blocks.pop(block_id, None)
379+
380+
# 3. Clean up children lists
381+
for block in course_blocks.values():
382+
children = block.get("children")
383+
if children:
384+
block["children"] = [cid for cid in children if cid not in to_remove]
385+
386+
return course_blocks
387+
345388

346389
@method_decorator(transaction.non_atomic_requests, name='dispatch')
347390
@view_auth_classes(is_authenticated=False)

lms/djangoapps/courseware/courses.py

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@
2626
from common.djangoapps.static_replace import replace_static_urls
2727
from common.djangoapps.util.date_utils import strftime_localized
2828
from lms.djangoapps import branding
29+
from lms.djangoapps.course_api.blocks.utils import get_cached_transformed_blocks
2930
from lms.djangoapps.course_blocks.api import get_course_blocks
3031
from lms.djangoapps.courseware.access import has_access
3132
from lms.djangoapps.courseware.access_response import (
@@ -636,7 +637,10 @@ def get_course_assignments(course_key, user, include_access=False, include_witho
636637

637638
store = modulestore()
638639
course_usage_key = store.make_course_usage_key(course_key)
639-
block_data = get_course_blocks(user, course_usage_key, allow_start_dates_in_future=True, include_completion=True)
640+
641+
block_data = get_cached_transformed_blocks() or get_course_blocks(
642+
user, course_usage_key, allow_start_dates_in_future=True, include_completion=True
643+
)
640644

641645
now = datetime.now(pytz.UTC)
642646
assignments = []

lms/djangoapps/grades/course_data.py

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,12 +2,12 @@
22
Code used to get and cache the requested course-data
33
"""
44

5-
65
from lms.djangoapps.course_blocks.api import get_course_blocks
76
from openedx.core.djangoapps.content.block_structure.api import get_block_structure_manager
87
from xmodule.modulestore.django import modulestore # lint-amnesty, pylint: disable=wrong-import-order
98

109
from .transformer import GradesTransformer
10+
from ..course_api.blocks.utils import get_cached_transformed_blocks
1111

1212

1313
class CourseData:
@@ -56,7 +56,11 @@ def location(self): # lint-amnesty, pylint: disable=missing-function-docstring
5656
@property
5757
def structure(self): # lint-amnesty, pylint: disable=missing-function-docstring
5858
if self._structure is None:
59-
self._structure = get_course_blocks(
59+
# The get_course_blocks function proved to be a major time sink during a request at "blocks/".
60+
# This caching logic helps improve the response time by getting a copy of the already transformed, but still
61+
# unfiltered, course blocks from RequestCache and thus reducing the number of times that
62+
# the get_course_blocks function is called.
63+
self._structure = get_cached_transformed_blocks() or get_course_blocks(
6064
self.user,
6165
self.location,
6266
collected_block_structure=self._collected_block_structure,

lms/djangoapps/mobile_api/tests/test_course_info_views.py

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -432,6 +432,78 @@ def test_extend_sequential_info_with_assignment_progress_for_other_types(self, b
432432
for block_info in response.data['blocks'].values():
433433
self.assertNotEqual('assignment_progress', block_info)
434434

435+
def test_response_keys(self):
436+
response = self.verify_response(url=self.url)
437+
data = response.data
438+
439+
expected_top_level_keys = {
440+
'blocks',
441+
'certificate',
442+
'course_about',
443+
'course_access_details',
444+
'course_handouts',
445+
'course_modes',
446+
'course_progress',
447+
'course_sharing_utm_parameters',
448+
'course_updates',
449+
'deprecate_youtube',
450+
'discussion_url',
451+
'end',
452+
'enrollment_details',
453+
'id',
454+
'is_self_paced',
455+
'media',
456+
'name',
457+
'number',
458+
'org',
459+
'org_logo',
460+
'root',
461+
'start',
462+
'start_display',
463+
'start_type'
464+
}
465+
expected_course_access_keys = {
466+
"has_unmet_prerequisites",
467+
"is_too_early",
468+
"is_staff",
469+
"audit_access_expires",
470+
"courseware_access"
471+
}
472+
expected_courseware_access_keys = {
473+
"has_access",
474+
"error_code",
475+
"developer_message",
476+
"user_message",
477+
"additional_context_user_message",
478+
"user_fragment"
479+
}
480+
expected_enrollment_details_keys = {"created", "mode", "is_active", "upgrade_deadline"}
481+
expected_media_keys = {"image"}
482+
expected_image_keys = {"raw", "small", "large"}
483+
expected_course_sharing_keys = {"facebook", "twitter"}
484+
expected_course_modes_keys = {"slug", "sku", "android_sku", "ios_sku", "min_price"}
485+
expected_course_progress_keys = {"total_assignments_count", "assignments_completed"}
486+
487+
self.assertSetEqual(set(data), expected_top_level_keys)
488+
self.assertSetEqual(set(data["course_access_details"]), expected_course_access_keys)
489+
self.assertSetEqual(set(data["course_access_details"]["courseware_access"]), expected_courseware_access_keys)
490+
self.assertSetEqual(set(data["enrollment_details"]), expected_enrollment_details_keys)
491+
self.assertSetEqual(set(data["media"]), expected_media_keys)
492+
self.assertSetEqual(set(data["media"]["image"]), expected_image_keys)
493+
self.assertSetEqual(set(data["course_sharing_utm_parameters"]), expected_course_sharing_keys)
494+
self.assertSetEqual(set(data["course_modes"][0]), expected_course_modes_keys)
495+
self.assertSetEqual(set(data["course_progress"]), expected_course_progress_keys)
496+
497+
def test_block_count_depends_on_depth_in_request_params(self):
498+
response_depth_zero = self.verify_response(url=self.url, params={'depth': 0})
499+
response_depth_one = self.verify_response(url=self.url, params={'depth': 1})
500+
blocks_depth_zero = [block for block in self.store.get_items(self.course_key) if block.category == "course"]
501+
blocks_depth_one = [
502+
block for block in self.store.get_items(self.course_key) if block.category in ("course", "chapter")
503+
]
504+
self.assertEqual(len(response_depth_zero.data["blocks"]), len(blocks_depth_zero))
505+
self.assertEqual(len(response_depth_one.data["blocks"]), len(blocks_depth_one))
506+
435507

436508
class TestCourseEnrollmentDetailsView(MobileAPITestCase, MilestonesTestCaseMixin): # lint-amnesty, pylint: disable=test-inherits-tests
437509
"""

0 commit comments

Comments
 (0)