
Commit c788f17

[Docs]: Readme Fix (#617)
Signed-off-by: Abukhoyer Shaik <[email protected]>
1 parent ed965fd commit c788f17

File tree: 2 files changed, 4 insertions(+), 2 deletions(-)


README.md

Lines changed: 2 additions & 2 deletions

@@ -108,8 +108,8 @@ For more details about using ``QEfficient`` via Cloud AI 100 Apps SDK, visit [Li

 ## Documentation

-* [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html#)
-* [Python API](https://quic.github.io/efficient-transformers/source/hl_api.html)
+* [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html)
+* [QEFF API](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html)
 * [Validated Models](https://quic.github.io/efficient-transformers/source/validate.html)
 * [Models coming soon](https://quic.github.io/efficient-transformers/source/validate.html#models-coming-soon)

docs/source/supported_features.rst

Lines changed: 2 additions & 0 deletions

@@ -30,6 +30,8 @@ Supported Features
     - Enables execution with FP8 precision, significantly improving performance and reducing memory usage for computational tasks.
   * - Prefill caching
     - Enhances inference speed by caching key-value pairs for shared prefixes, reducing redundant computations and improving efficiency.
+  * - On Device Sampling
+    - Enables sampling operations to be executed directly on the QAIC device rather than the host CPU for QEffForCausalLM models. This significantly reduces host-device communication overhead and improves inference throughput and scalability. Refer to the `sample script <https://github.com/quic/efficient-transformers/blob/main/examples/on_device_sampling.py>`_ for more details.
   * - Prompt-Lookup Decoding
     - Speeds up text generation by using overlapping parts of the input prompt and the generated text, making the process faster without losing quality. Refer to the `sample script <https://github.com/quic/efficient-transformers/blob/main/examples/pld_spd_inference.py>`_ for more details.
   * - :ref:`PEFT LoRA support <QEffAutoPeftModelForCausalLM>`
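The On Device Sampling entry added above moves the filter/softmax/draw steps of token selection onto the QAIC device, so only a sampled token id (not the full logits tensor) crosses the host-device boundary. As a rough host-side illustration of the sampling step being moved, here is a plain-Python top-k sampler; the function name and shape are illustrative only and are not part of the QEfficient API.

```python
import math
import random


def sample_top_k(logits, k=5, temperature=1.0, rng=None):
    """Draw one token id from the k highest-scoring logits.

    Host-side illustration only: on-device sampling performs the
    equivalent filter -> softmax -> draw inside the compiled graph,
    so the full logits never have to be copied back to the host.
    """
    rng = rng or random.Random(0)
    # Keep only the k most likely token ids.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature-scaled, numerically stable softmax weights.
    scaled = [logits[i] / max(temperature, 1e-6) for i in top]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    # Inverse-CDF draw over the surviving tokens.
    r = rng.random() * sum(weights)
    acc = 0.0
    for token_id, w in zip(top, weights):
        acc += w
        if r <= acc:
            return token_id
    return top[-1]
```

With k=1 this degenerates to greedy decoding, which is a handy sanity check; the throughput win described in the diff comes from running this loop per decode step on the device rather than per step on the host.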
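The Prompt-Lookup Decoding row above relies on a simple idea: when the text being generated repeats spans of the prompt (common in summarization or code editing), the tokens that followed a matching n-gram earlier in the sequence make cheap draft candidates for speculative verification. A minimal sketch of that candidate-lookup step, assuming nothing about the repo's actual implementation (function and parameter names here are hypothetical):

```python
def prompt_lookup_candidates(tokens, ngram_size=3, num_draft=5):
    """Propose draft tokens by matching the trailing n-gram of `tokens`
    against an earlier occurrence in the same sequence.

    Illustrative sketch only: returns up to `num_draft` tokens that
    followed the most recent earlier match, or [] if no match exists.
    """
    if len(tokens) < ngram_size:
        return []
    tail = tokens[-ngram_size:]
    # Scan backwards so the most recent earlier occurrence wins;
    # the trailing n-gram itself (at len(tokens) - ngram_size) is excluded.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            follow = tokens[start + ngram_size:start + ngram_size + num_draft]
            if follow:
                return follow
    return []
```

The drafts are then verified in one forward pass, which is why the diff can claim a speedup "without losing quality": mispredicted drafts are simply discarded, so the output distribution is unchanged.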
