Using Node.js to Transcribe Local Audio Files

May 10, 2023

Transcribing audio files is a common task in many applications, from generating subtitles for videos to creating transcripts for podcasts and interviews. With Node.js, we can transcribe audio files stored on the local machine using various libraries and APIs. In this tutorial, we will read a local audio file in Node.js and send it to the Google Cloud Speech-to-Text API for recognition, so the file lives locally while the speech recognition itself runs in Google's cloud.


Prerequisites

Before we begin, ensure you have the following set up:

  1. Node.js: Make sure you have Node.js installed on your machine. You can download it from the official Node.js website.

  2. npm: The Node.js package manager. It ships with Node.js, so no separate installation is needed.

Step 1: Installing Dependencies

To transcribe audio files, we will use the @google-cloud/speech library, which is an official Node.js client for the Google Cloud Speech-to-Text API. This library allows us to utilize Google's powerful speech recognition capabilities.

Open your terminal and navigate to your project directory. Then, install the required library using npm:

```bash
npm install @google-cloud/speech
```

Step 2: Setting Up Authentication

To use the Google Cloud Speech-to-Text API, you need to set up authentication and obtain a service account key. Here's how to do it:

  1. Go to the Google Cloud Console, create a new project (or use an existing one), and enable the Cloud Speech-to-Text API for that project.

  2. In the Cloud Console, navigate to IAM & Admin > Service Accounts.

  3. Click on Create Service Account.

  4. Enter a name for the service account and select the role "Project > Editor" (for simplicity in this tutorial).

  5. Once the service account exists, open its Keys tab and select Add Key > Create new key. Choose JSON as the key type and click on Create. This will download a JSON file containing the service account key.

  6. Save the JSON file in your project directory.
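Before wiring the key into a script, it can help to confirm that the downloaded file is the kind of key the client expects. The sketch below is only a sanity check, not part of the Google library: the helper names `checkServiceAccountKey` and `checkServiceAccountKeyFile` are made up for this example, and the list of required fields is an assumption based on what a typical service account key file contains.

```javascript
const fs = require('fs');

// Fields a downloaded service account key file normally contains.
// (Assumption: this is a subset chosen for illustration, not an official schema.)
const REQUIRED_FIELDS = ['type', 'project_id', 'private_key', 'client_email'];

// Returns a list of problems found in a parsed key object (empty if it looks fine).
function checkServiceAccountKey(key) {
  if (key === null || typeof key !== 'object') {
    return ['key file does not contain a JSON object'];
  }
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (typeof key[field] !== 'string' || key[field].length === 0) {
      problems.push(`missing or empty field: ${field}`);
    }
  }
  // A user-credentials file or an API key export would have a different type.
  if (typeof key.type === 'string' && key.type.length > 0 && key.type !== 'service_account') {
    problems.push(`unexpected type: ${key.type}`);
  }
  return problems;
}

// Hypothetical helper: loads the file and reports problems without printing any secrets.
function checkServiceAccountKeyFile(keyPath) {
  let parsed;
  try {
    parsed = JSON.parse(fs.readFileSync(keyPath, 'utf8'));
  } catch (err) {
    return [`could not read or parse ${keyPath}: ${err.message}`];
  }
  return checkServiceAccountKey(parsed);
}
```

Calling `checkServiceAccountKeyFile('path/to/serviceAccountKey.json')` returns an empty array when the file looks usable. Since the file contains a private key, it is worth keeping it out of version control.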

Step 3: Transcribing Audio Files

Now that we have the necessary dependencies installed and the service account key ready, we can start transcribing audio files using Node.js. Create a new JavaScript file (e.g., transcribe.js) in your project directory.

```javascript
// transcribe.js
const fs = require('fs');
const speech = require('@google-cloud/speech');

// Replace 'path/to/serviceAccountKey.json' with the actual path to your service account key
const serviceAccountKeyPath = 'path/to/serviceAccountKey.json';

// Creates a client for the Google Cloud Speech-to-Text API
const client = new speech.SpeechClient({
  keyFilename: serviceAccountKeyPath,
});

// Function to transcribe an audio file
async function transcribeAudio(audioFilePath) {
  try {
    // Read the audio file into memory
    const audioBytes = fs.readFileSync(audioFilePath);

    // Configuration for the audio file
    const audioConfig = {
      encoding: 'LINEAR16',
      sampleRateHertz: 16000,
      languageCode: 'en-US', // Change the language code as needed
    };

    // Create a request object
    const request = {
      audio: {
        content: audioBytes,
      },
      config: audioConfig,
    };

    // Perform the transcription
    const [response] = await client.recognize(request);
    const transcription = response.results
      .map((result) => result.alternatives[0].transcript)
      .join('\n');

    console.log('Transcription:');
    console.log(transcription);
  } catch (err) {
    console.error('Error transcribing audio:', err.message);
  }
}

// Replace 'path/to/audioFile.wav' with the actual path to your audio file
const audioFilePath = 'path/to/audioFile.wav';
transcribeAudio(audioFilePath);
```
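The script hardcodes `encoding: 'LINEAR16'` and `sampleRateHertz: 16000`, so it only works as written if the WAV file really is 16-bit PCM at 16 kHz. One way to check is to read the file's own header before building the request. The following is a minimal sketch under that assumption: `readWavFormat` is a hypothetical helper written for this tutorial (not part of @google-cloud/speech), and it only understands plain RIFF/WAVE files.

```javascript
// Reads the 'fmt ' and 'data' chunks of a RIFF/WAVE buffer and returns the
// values needed to fill in (or double-check) the recognition config.
function readWavFormat(buffer) {
  if (
    buffer.length < 12 ||
    buffer.toString('ascii', 0, 4) !== 'RIFF' ||
    buffer.toString('ascii', 8, 12) !== 'WAVE'
  ) {
    throw new Error('Not a RIFF/WAVE file');
  }

  let offset = 12;
  let format = null;
  let dataBytes = null;

  // Walk the chunk list: 4-byte id, 4-byte little-endian size, then the payload.
  while (offset + 8 <= buffer.length) {
    const id = buffer.toString('ascii', offset, offset + 4);
    const size = buffer.readUInt32LE(offset + 4);
    const payload = offset + 8;

    if (id === 'fmt ') {
      if (payload + 16 > buffer.length) {
        throw new Error('Truncated fmt chunk');
      }
      format = {
        audioFormat: buffer.readUInt16LE(payload), // 1 means uncompressed PCM
        channels: buffer.readUInt16LE(payload + 2),
        sampleRateHertz: buffer.readUInt32LE(payload + 4),
        bitsPerSample: buffer.readUInt16LE(payload + 14),
      };
    } else if (id === 'data') {
      // Some writers record a size larger than the bytes actually present.
      dataBytes = Math.min(size, buffer.length - payload);
    }

    offset = payload + size + (size % 2); // chunks are padded to an even length
  }

  if (!format || dataBytes === null) {
    throw new Error('Missing fmt or data chunk');
  }

  const bytesPerSecond =
    format.sampleRateHertz * format.channels * (format.bitsPerSample / 8);
  return { ...format, durationSeconds: dataBytes / bytesPerSecond };
}
```

Inside `transcribeAudio`, calling `readWavFormat(audioBytes)` right after reading the file would let us pass the detected `sampleRateHertz` into the config instead of a hardcoded value, and warn when `audioFormat` is not 1 or `bitsPerSample` is not 16. The duration matters too: Google documents synchronous `recognize` requests as intended for short audio (about one minute at the time of writing), so longer files generally need the long-running variant of the API instead.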

Running the Transcription

To transcribe an audio file, replace 'path/to/audioFile.wav' in the transcribe.js file with the actual path to your audio file. Then, run the following command in your terminal:

```bash
node transcribe.js
```

Node.js will execute the script, transcribe the audio file using the Google Cloud Speech-to-Text API, and print the transcription to the console.
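Printing to the console is fine for a first run, but a transcript is usually more useful in a file. The sketch below shows one way to do that with the same `response.results` shape the script reads. The helper names `extractTranscript` and `saveTranscript` and the mock response are invented for illustration; the sketch also assumes that a response for silent or unrecognized audio may come back with no results or with empty alternatives, which the original one-liner would not tolerate.

```javascript
const fs = require('fs');

// Collects the top alternative of each result, skipping results that
// carry no alternatives or only whitespace.
function extractTranscript(response) {
  const results = (response && response.results) || [];
  return results
    .filter((result) => Array.isArray(result.alternatives) && result.alternatives.length > 0)
    .map((result) => (result.alternatives[0].transcript || '').trim())
    .filter((text) => text.length > 0)
    .join('\n');
}

// Writes the transcript to outputPath (hypothetical helper) and returns the text.
function saveTranscript(response, outputPath) {
  const text = extractTranscript(response);
  fs.writeFileSync(outputPath, text + '\n', 'utf8');
  return text;
}

// A hand-written mock in the shape the script reads; not real API output.
const mockResponse = {
  results: [
    { alternatives: [{ transcript: 'hello world', confidence: 0.92 }] },
    { alternatives: [] },
  ],
};
console.log(extractTranscript(mockResponse));
```

In `transcribe.js`, the `response` returned by `client.recognize(request)` could be passed to `saveTranscript(response, 'transcript.txt')` in place of the two `console.log` calls.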


Conclusion

Transcribing local audio files with Node.js is straightforward with the @google-cloud/speech library. Once authentication with the Google Cloud Speech-to-Text API is set up, a short script is enough to read an audio file, send it for recognition, and print the transcript, which you can then use for subtitles, podcast transcripts, or other features that improve accessibility and user experience in your projects.