A Python Flask Audio Search Application

Republished By Plato

Summary

This code pattern explains how to create an application that you can use to search for a topic within video and audio files.

Description

While listening to a podcast or to video or audio files of courses, you often want to jump directly to the topic rather than listening to extraneous information. However, finding the topics and keywords in the entire recording can be challenging.

In this code pattern, you create an application that searches within video or audio files. The app not only runs the search but also highlights the text where the search string or topic occurs in the file. It performs a natural language query search on the audio and returns the results with the time range in which your search topic is discussed. This example uses an IBM® Watson™ Machine Learning introduction video to illustrate the process.

After completing this code pattern, you will understand how to:

Prepare audio and video data and break it into smaller chunks that are easier to process
Work with the Watson Speech to Text service through API calls to convert audio or video to text
Work with the Watson Discovery service through API calls to perform a search on text chunks
Create a Python Flask application and deploy it on IBM Cloud
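
As an illustration of the chunking step, the following minimal sketch computes the time windows that a recording could be split into. It is not taken from the code pattern itself: the function name chunk_boundaries and the 60-second chunk length are assumptions chosen for the example. With pydub, each window could then be cut from a loaded AudioSegment by slicing in milliseconds, for example audio[start_ms:end_ms].

```python
from typing import List, Tuple


def chunk_boundaries(duration_ms: int, chunk_ms: int) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) windows that cover a recording.

    Hypothetical helper for illustration; the last window is shortened
    so that it never runs past the end of the recording.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [
        (start, min(start + chunk_ms, duration_ms))
        for start in range(0, duration_ms, chunk_ms)
    ]


# Example: a 150-second recording split into 60-second chunks (assumed sizes).
windows = chunk_boundaries(150_000, 60_000)
print(windows)  # [(0, 60000), (60000, 120000), (120000, 150000)]
```

Keeping the start and end of every chunk is what later allows a search hit to be reported with the time range in which it occurs.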

Flow

Python Flask audio podcast flow diagram

The user uploads the video or audio file through the UI.
The video or audio file is processed with the moviepy and pydub Python libraries and split into smaller chunks.
The user interacts with the Watson Speech to Text service through the provided application UI. The audio chunks are converted into text chunks with Watson Speech to Text.
The text chunks are uploaded to Watson Discovery by calling the Watson Discovery APIs through the Python SDK.
The user performs a search query using Watson Discovery.
The results are shown on the UI.
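
The last two steps of the flow, searching the text chunks and returning the matching time ranges, can be sketched with a small stand-in. The example below does not call Watson Discovery: it uses plain case-insensitive term matching in place of a natural language query, and the TextChunk class, the search_chunks and format_timestamp functions, and the sample transcript text are all hypothetical, invented for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextChunk:
    """Transcribed text for one audio chunk, with its time range in seconds."""
    start_s: int
    end_s: int
    text: str


def format_timestamp(seconds: int) -> str:
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def search_chunks(chunks: List[TextChunk], query: str) -> List[str]:
    """Return the time ranges of chunks that contain every query term.

    Stand-in for the Watson Discovery query: simple substring matching,
    not a natural language search.
    """
    terms = query.lower().split()
    if not terms:
        return []
    hits = []
    for chunk in chunks:
        lowered = chunk.text.lower()
        if all(term in lowered for term in terms):
            hits.append(
                f"{format_timestamp(chunk.start_s)}-{format_timestamp(chunk.end_s)}"
            )
    return hits


# Made-up sample transcript, one entry per audio chunk.
chunks = [
    TextChunk(0, 60, "Welcome to this introduction to machine learning"),
    TextChunk(60, 120, "First we deploy a model"),
    TextChunk(120, 150, "Machine learning models need training data"),
]
print(search_chunks(chunks, "machine learning"))  # ['00:00-01:00', '02:00-02:30']
```

In the actual application, the matching would be delegated to Watson Discovery, and the Flask UI would display the returned time ranges alongside the highlighted text.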