MURAL - Maynooth University Research Archive Library



    Fine Grained Spoken Document Summarization Through Text Segmentation


    Kotey, Samantha, Dahyot, Rozenn and Harte, Naomi (2023) Fine Grained Spoken Document Summarization Through Text Segmentation. 2022 IEEE Spoken Language Technology Workshop (SLT). pp. 647-654.

    Available under License Creative Commons Attribution Non-commercial Share Alike.


    Abstract

    Podcast transcripts are long spoken documents of conversational dialogue. Challenging to summarize, podcasts cover a diverse range of topics, vary in length, and have uniquely different linguistic styles. Previous studies in podcast summarization have generated short, concise dialogue summaries. In contrast, we propose a method to generate long fine-grained summaries, which describe details of sub-topic narratives. Leveraging a readability formula, we curate a data subset to train a long sequence transformer for abstractive summarization. Through text segmentation, we filter the evaluation data and exclude specific segments of text. We apply the model to segmented data, producing different types of fine-grained summaries. We show that appropriate filtering creates comparable results on ROUGE and serves as an alternative method to truncation. Experiments show our model outperforms previous studies on the Spotify podcast dataset when tasked with generating longer sequences of text.
    Item Type: Article
    Keywords: spoken document summarization; text segmentation; long sequence transformers; readability formulas; podcast summarization;
    Academic Unit: Faculty of Science and Engineering > Computer Science
    Faculty of Science and Engineering > Research Institutes > Hamilton Institute
    Item ID: 20545
    Identification Number: 10.1109/slt54892.2023.10022829
    Depositing User: Rozenn Dahyot
    Date Deposited: 09 Sep 2025 10:18
    Journal or Publication Title: 2022 IEEE Spoken Language Technology Workshop (SLT)
    Publisher: IEEE
    Refereed: Yes
    URI: https://mural.maynoothuniversity.ie/id/eprint/20545
    Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).
