
Conversation

@staru09 (Contributor) commented Jul 12, 2025

This PR is a WIP for issue #130.
It adds summarizer and correct flags to improve and summarize transcripts generated by Deepgram/Whisper, using the OpenAI API.

@staru09 (Contributor, Author) commented Jul 12, 2025

@kouloumos please ignore the previous one and review this.
The last one was more complex, and we may not need it anyway.

@kouloumos (Member) left a comment

Thank you for your contribution, Aru! This is a really solid foundation.

I've added a few small notes inline. I also have a question about the llm service: should we split its responsibilities from the start?

I'm thinking ahead to our plans for a more advanced correction system. Do you think the llm service might get too complex if it has to manage corrections, summaries, and potentially more in the future?

Perhaps we could introduce a CorrectionService and a SummarizerService? They might just be simple LLM prompts for now, but it would give us a dedicated place to build out that more complex logic later.

An added benefit might be simplifying the Transcription class. If it initializes a list of services from the start, we might be able to get rid of the internal flags that track what needs to be done.

What are your thoughts on that trade-off? Happy to discuss it further.
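
A minimal sketch of what that split might look like (class names, the LLMClient protocol, and prompt wording here are assumptions for illustration, not existing project code):

from typing import Protocol


class LLMClient(Protocol):
    # hypothetical interface: any client that can complete a prompt
    def complete(self, prompt: str, model: str) -> str: ...


class CorrectionService:
    """Corrects a raw transcript; just a single LLM prompt for now."""

    def __init__(self, llm: LLMClient, model: str):
        self.llm = llm
        self.model = model

    def run(self, text: str) -> str:
        prompt = f"Correct the following transcript without changing its meaning:\n\n{text}"
        return self.llm.complete(prompt, self.model)


class SummarizerService:
    """Summarizes a transcript; a dedicated place for richer logic later."""

    def __init__(self, llm: LLMClient, model: str):
        self.llm = llm
        self.model = model

    def run(self, text: str) -> str:
        prompt = f"Summarize the following transcript:\n\n{text}"
        return self.llm.complete(prompt, self.model)

The Transcription class could then hold a list of such services and run them in order, instead of tracking per-feature flags.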

Additional git productivity tip:

  • Instead of merging the main branch into your feature branch, prefer to rebase your branch on top of main. This creates a clean git history, where code changes are only made in non-merge commits. The basic rebasing workflow is shown below.
git checkout your_branch_name
git fetch origin
git rebase origin/main
# If conflicts arise, resolve them, stage the resolved files, and continue with:
# git rebase --continue
# Repeat until rebase is complete.
git push -f # (force push to GitHub)

        pass

    @abstractmethod
    def summarize_text(self, text: str, model: str) -> str:
@kouloumos (Member):

Why not call this summarize_transcript, following the same pattern as correct_transcript?
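
i.e. something like:

    @abstractmethod
    def summarize_transcript(self, text: str, model: str) -> str:
        ...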

Comment on lines 505 to 567
text_exporter = self.exporters.get("text")

# --- Step 1: Save the original raw transcript if it exists ---
if transcript.outputs["raw"] and text_exporter:
    raw_output_dir = text_exporter.get_output_path(transcript)
    raw_file_path = text_exporter.construct_file_path(
        directory=raw_output_dir,
        filename=f"{transcript.title}_raw",
        file_type="txt",
        include_timestamp=False
    )
    text_exporter.write_to_file(transcript.outputs["raw"], raw_file_path)
    self.logger.info(f"Raw transcript saved to: {raw_file_path}")

# --- Step 2: Perform LLM Correction ---
if self.llm_service and self.correct and transcript.outputs["raw"]:
    self.logger.info(f"Correcting transcript with {self.llm_provider}...")
    corrected_transcript_text = self.llm_service.correct_transcript(
        transcript.outputs["raw"], self.llm_correction_model
    )

    if text_exporter:
        corrected_output_dir = text_exporter.get_output_path(transcript)
        corrected_file_path = text_exporter.construct_file_path(
            directory=corrected_output_dir,
            filename=f"{transcript.title}_corrected",
            file_type="txt",
            include_timestamp=False
        )
        text_exporter.write_to_file(corrected_transcript_text, corrected_file_path)
        self.logger.info(f"Corrected transcript saved to: {corrected_file_path}")

    # Overwrite the main "raw" output so subsequent steps use the corrected version
    transcript.outputs["raw"] = corrected_transcript_text

# --- Step 3: Perform LLM Summarization ---
if self.llm_service and self.summarize_llm and transcript.outputs["raw"]:
    self.logger.info(f"Summarizing transcript with {self.llm_provider}...")
    summary_text = self.llm_service.summarize_text(
        transcript.outputs["raw"], self.llm_summary_model
    )
    transcript.summary = summary_text

    if text_exporter:
        summary_output_dir = text_exporter.get_output_path(transcript)
        summary_file_path = text_exporter.construct_file_path(
            directory=summary_output_dir,
            filename=f"{transcript.title}_summary",
            file_type="txt",
            include_timestamp=False
        )
        text_exporter.write_to_file(summary_text, summary_file_path)
        self.logger.info(f"Summary saved to: {summary_file_path}")

# --- Step 4: Run existing exporters for final outputs (Markdown, JSON, etc.) ---
if self.markdown or self.github_handler:
    transcript.outputs["markdown"] = self.write_to_markdown_file(
        transcript,
    )

if "text" in self.exporters:
    try:
        transcript.outputs["text"] = self.exporters["text"].export(
            transcript, add_timestamp=False
        )
    except Exception as e:
        self.logger.warning(f"Text exporter failed: {e}")

if "json" in self.exporters:
    transcript.outputs["json"] = self.exporters["json"].export(
        transcript
@kouloumos (Member):

Interesting approach here with the raw and corrected text exports. I see how that could help with verifying the correction algorithm.

I'm thinking about the long-term workflow, and I wonder if we can make the verification process even smoother by leveraging git. What do you think about this potential flow?

  • When the --github flag is used, we could create two separate commits: a first one with the initial transcript, and a second with the corrections applied. This would allow us to see the exact impact of the algorithm in the PR's diff.
    • for this to work you would need to store the corrected transcript to a different field and use it in the start() method.
  • When the flag isn't used, we'd simply save the final, corrected text based on our export settings.

This would also help us enforce that the export step always happens at the very end of the postprocessing chain. I am now thinking that export() should probably be a method by itself.

Let me know if this makes sense or if you foresee any issues with this approach.
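
A rough sketch of the flow I mean (the "corrected" output key and the github_handler.commit calls are hypothetical, not the current API):

# keep raw and corrected outputs separate so the GitHub flow can
# commit them in two steps (field/method names are assumptions)
transcript.outputs["corrected"] = self.llm_service.correct_transcript(
    transcript.outputs["raw"], self.llm_correction_model
)

if self.github_handler:
    # first commit: the unmodified transcript, so the second commit's
    # diff shows exactly what the correction algorithm changed
    self.github_handler.commit(transcript.outputs["raw"], message="Add raw transcript")
    self.github_handler.commit(transcript.outputs["corrected"], message="Apply LLM corrections")
else:
    # without --github, just export the final corrected text per export settings
    transcript.outputs["raw"] = transcript.outputs["corrected"]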

Comment on lines 527 to 535
corrected_output_dir = text_exporter.get_output_path(transcript)
corrected_file_path = text_exporter.construct_file_path(
    directory=corrected_output_dir,
    filename=f"{transcript.title}_corrected",
    file_type="txt",
    include_timestamp=False
)
text_exporter.write_to_file(corrected_transcript_text, corrected_file_path)
self.logger.info(f"Corrected transcript saved to: {corrected_file_path}")
@kouloumos (Member):

This seems redundant; we already have export methods that you can use here.
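
For instance (a sketch reusing the export() call that already appears in Step 4 of this method):

# instead of building paths by hand, set the corrected text on the
# transcript and reuse the existing text exporter
transcript.outputs["raw"] = corrected_transcript_text
transcript.outputs["text"] = self.exporters["text"].export(transcript, add_timestamp=False)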

transcriber.py Outdated
correct_transcript = click.option(
    "--correct",
    is_flag=True,
    default=False,
@kouloumos (Member):

You can use default=settings.config.get(...) here (and on the rest of the new flags) so that users can set their defaults via the config.ini file.
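
For example, a sketch assuming settings.config is dict-like (the key name and help text are placeholders):

correct_transcript = click.option(
    "--correct",
    is_flag=True,
    # "correct" as a key name is an assumption; use whatever matches config.ini
    default=settings.config.get("correct", False),
    help="Correct the transcript using the configured LLM provider.",
)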

transcriber.py Outdated
Comment on lines 219 to 224
summarize_llm = click.option(
    "--summarize-llm",
    is_flag=True,
    default=False,
    help="Generate a new summary using the configured LLM provider.",
)
@kouloumos (Member):

Try to use the existing --summarize flag. It is supposed to work with Deepgram, but we never actually use it, so feel free to repurpose it.

@staru09 (Contributor, Author):

Do you think we should use the LLM for correction and do summarization with Deepgram only?

@kouloumos (Member):

No, that's not what I meant. There is already a --summarize flag; I'm suggesting you use that instead of adding a new one.

@staru09 (Contributor, Author):

Okay, got it. Sorry for the misunderstanding.

Comment on lines 15 to 18
[llm]
llm_provider = openai
llm_correction_model = gpt-4o
llm_summary_model = gpt-3.5
@kouloumos (Member) commented Jul 13, 2025:

Add those under the [DEFAULT] section instead of here.
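
i.e.:

[DEFAULT]
llm_provider = openai
llm_correction_model = gpt-4o
llm_summary_model = gpt-3.5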

Comment on lines 548 to 558
if text_exporter:
    summary_output_dir = text_exporter.get_output_path(transcript)
    summary_file_path = text_exporter.construct_file_path(
        directory=summary_output_dir,
        filename=f"{transcript.title}_summary",
        file_type="txt",
        include_timestamp=False
    )
    text_exporter.write_to_file(summary_text, summary_file_path)
    self.logger.info(f"Summary saved to: {summary_file_path}")

@kouloumos (Member):

We shouldn't need all this code for the paths.

Also, logging should happen inside the services.
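
For example (hypothetical names; a sketch, not the project's actual interfaces), the caller could shrink to:

# the service logs its own progress internally, and the existing
# exporter handles path construction
transcript.summary = self.summarizer_service.run(transcript.outputs["raw"])
transcript.outputs["text"] = self.exporters["text"].export(transcript, add_timestamp=False)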

@staru09 (Contributor, Author) commented Jul 13, 2025

> I'm thinking ahead to our plans for a more advanced correction system. Do you think the llm service might get too complex if it has to manage corrections, summaries, and potentially more in the future?
>
> Perhaps we could introduce a CorrectionService and a SummarizerService? They might just be simple LLM prompts for now, but it would give us a dedicated place to build out that more complex logic later.
>
> An added benefit might be simplifying the Transcription class. If it initializes a list of services from the start, we might be able to get rid of the internal flags that track what needs to be done.
>
> What are your thoughts on that trade-off? Happy to discuss it further.

I think we should split the llm_service into a correction service and a summarization service, so we at least have placeholders to build on for further improvements.

@staru09 force-pushed the llm_feats branch 4 times, most recently from d1c6a8d to bc4597e on August 17, 2025 at 14:25