
OpenMAX Audio on the Raspberry Pi

This chapter looks at audio processing on the Raspberry Pi. The support from OpenMAX is weaker than for video: there is no satisfactory decode component, so we have to resort to FFmpeg to decode and render audio.

Resources

Files

Files used are here

Audio components

The Raspberry Pi has a number of OpenMAX components specifically for audio processing, such as audio_decode and audio_render.

Audio formats

OpenMAX has a number of data structures used to get and set information about components. We have seen some of these before: they were discussed in the Components chapter.

One structure we have not yet looked at is OMX_AUDIO_PORTDEFINITIONTYPE, which forms part of the port definition information. It contains the following relevant fields:

typedef struct OMX_AUDIO_PORTDEFINITIONTYPE {
    OMX_STRING cMIMEType;
    OMX_NATIVE_DEVICETYPE pNativeRender;
    OMX_BOOL bFlagErrorConcealment;
    OMX_AUDIO_CODINGTYPE eEncoding;
} OMX_AUDIO_PORTDEFINITIONTYPE;

The last two fields hold the current values set for the port; the one of major interest here is the audio encoding, eEncoding. The possible values it can take are obtained from the next structure, OMX_AUDIO_PARAM_PORTFORMATTYPE, so we discuss them there.
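These fields are not queried on their own: they are embedded in the general port definition structure, under its format.audio union member. A minimal sketch of reading the current encoding (assuming an already-initialized component handle, and using port 120 of audio_decode as seen later in this chapter):

```c
OMX_PARAM_PORTDEFINITIONTYPE sPortDef;

memset(&sPortDef, 0, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
sPortDef.nSize = sizeof(OMX_PARAM_PORTDEFINITIONTYPE);
sPortDef.nVersion.nVersion = OMX_VERSION;
sPortDef.nPortIndex = 120;   // audio_decode input port

if (OMX_GetParameter(handle, OMX_IndexParamPortDefinition,
                     &sPortDef) == OMX_ErrorNone) {
    // the audio fields sit in the format.audio union member
    printf("Encoding is %d\n", sPortDef.format.audio.eEncoding);
}
```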

The OMX_AUDIO_PARAM_PORTFORMATTYPE structure is defined (Section 4.1.16 in the 1.1.2 specification) as

typedef struct OMX_AUDIO_PARAM_PORTFORMATTYPE {
    OMX_U32 nSize;
    OMX_VERSIONTYPE nVersion;
    OMX_U32 nPortIndex;
    OMX_U32 nIndex;
    OMX_AUDIO_CODINGTYPE eEncoding;
} OMX_AUDIO_PARAM_PORTFORMATTYPE;

The first two fields are common to all OpenMAX structures. The nPortIndex field is the port we are looking at. The nIndex field distinguishes between the different format types supported by the port: successive calls with nIndex set to 0, 1, 2, ... enumerate the supported formats until the component returns OMX_ErrorNoMore. The eEncoding field gives the encoding of each format.
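This enumeration loop can be sketched as follows (a sketch only, assuming an already-initialized component handle; it is essentially the pattern behind the "Supported encoding is ..." output shown below):

```c
void list_audio_formats(OMX_HANDLETYPE handle, int port) {
    OMX_AUDIO_PARAM_PORTFORMATTYPE fmt;
    OMX_ERRORTYPE err;
    OMX_U32 i = 0;

    while (1) {
        memset(&fmt, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
        fmt.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
        fmt.nVersion.nVersion = OMX_VERSION;
        fmt.nPortIndex = port;
        fmt.nIndex = i++;          // step through the formats one by one

        err = OMX_GetParameter(handle, OMX_IndexParamAudioPortFormat, &fmt);
        if (err == OMX_ErrorNoMore) {
            printf("No more formats supported\n");
            break;
        }
        if (err != OMX_ErrorNone) {
            fprintf(stderr, "Error querying format %d\n", i - 1);
            break;
        }
        printf("Supported encoding is %d\n", fmt.eEncoding);
    }
}
```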

The values for OMX_AUDIO_CODINGTYPE are given in Table 4-66 of the 1.1.2 specification, and on the RPi are defined in the file /opt/vc/include/IL/OMX_Audio.h as

typedef enum OMX_AUDIO_CODINGTYPE {
    OMX_AUDIO_CodingUnused = 0,  /** Placeholder value when coding is N/A  */
    OMX_AUDIO_CodingAutoDetect,  /** auto detection of audio format */
    OMX_AUDIO_CodingPCM,         /** Any variant of PCM coding */
    OMX_AUDIO_CodingADPCM,       /** Any variant of ADPCM encoded data */
    OMX_AUDIO_CodingAMR,         /** Any variant of AMR encoded data */
    OMX_AUDIO_CodingGSMFR,       /** Any variant of GSM fullrate (i.e. GSM610) */
    OMX_AUDIO_CodingGSMEFR,      /** Any variant of GSM Enhanced Fullrate encoded data*/
    OMX_AUDIO_CodingGSMHR,       /** Any variant of GSM Halfrate encoded data */
    OMX_AUDIO_CodingPDCFR,       /** Any variant of PDC Fullrate encoded data */
    OMX_AUDIO_CodingPDCEFR,      /** Any variant of PDC Enhanced Fullrate encoded data */
    OMX_AUDIO_CodingPDCHR,       /** Any variant of PDC Halfrate encoded data */
    OMX_AUDIO_CodingTDMAFR,      /** Any variant of TDMA Fullrate encoded data (TIA/EIA-136-420) */
    OMX_AUDIO_CodingTDMAEFR,     /** Any variant of TDMA Enhanced Fullrate encoded data (TIA/EIA-136-410) */
    OMX_AUDIO_CodingQCELP8,      /** Any variant of QCELP 8kbps encoded data */
    OMX_AUDIO_CodingQCELP13,     /** Any variant of QCELP 13kbps encoded data */
    OMX_AUDIO_CodingEVRC,        /** Any variant of EVRC encoded data */
    OMX_AUDIO_CodingSMV,         /** Any variant of SMV encoded data */
    OMX_AUDIO_CodingG711,        /** Any variant of G.711 encoded data */
    OMX_AUDIO_CodingG723,        /** Any variant of G.723 dot 1 encoded data */
    OMX_AUDIO_CodingG726,        /** Any variant of G.726 encoded data */
    OMX_AUDIO_CodingG729,        /** Any variant of G.729 encoded data */
    OMX_AUDIO_CodingAAC,         /** Any variant of AAC encoded data */
    OMX_AUDIO_CodingMP3,         /** Any variant of MP3 encoded data */
    OMX_AUDIO_CodingSBC,         /** Any variant of SBC encoded data */
    OMX_AUDIO_CodingVORBIS,      /** Any variant of VORBIS encoded data */
    OMX_AUDIO_CodingWMA,         /** Any variant of WMA encoded data */
    OMX_AUDIO_CodingRA,          /** Any variant of RA encoded data */
    OMX_AUDIO_CodingMIDI,        /** Any variant of MIDI encoded data */
    OMX_AUDIO_CodingKhronosExtensions = 0x6F000000, /** Reserved region for introducing Khronos Standard Extensions */ 
    OMX_AUDIO_CodingVendorStartUnused = 0x7F000000, /** Reserved region for introducing Vendor Extensions */

    OMX_AUDIO_CodingFLAC,        /** Any variant of FLAC */
    OMX_AUDIO_CodingDDP,         /** Any variant of Dolby Digital Plus */
    OMX_AUDIO_CodingDTS,         /** Any variant of DTS */
    OMX_AUDIO_CodingWMAPRO,      /** Any variant of WMA Professional */
    OMX_AUDIO_CodingATRAC3,      /** Sony ATRAC-3 variants */
    OMX_AUDIO_CodingATRACX,      /** Sony ATRAC-X variants */
    OMX_AUDIO_CodingATRACAAL,    /** Sony ATRAC advanced-lossless variants  */

    OMX_AUDIO_CodingMax = 0x7FFFFFFF
} OMX_AUDIO_CODINGTYPE;

Running the info program from the Components chapter shows the following for the audio_decode component:

Audio ports:
  Ports start on 120
  There are 2 open ports
  Port 120 has 128 buffers of size 16384
  Direction is input
    Port 120 requires 4 buffers
    Port 120 has min buffer size 16384 bytes
    Port 120 is an input port
    Port 120 is an audio port
    Port mimetype (null)
    Port encoding is MP3
      Supported audio formats are:
      Supported encoding is MP3
          MP3 default sampling rate 0
          MP3 default bits per sample 0
          MP3 default number of channels 0
      Supported encoding is PCM
          PCM default sampling rate 0
          PCM default bits per sample 0
          PCM default number of channels 0
      Supported encoding is AAC
      Supported encoding is WMA
      Supported encoding is Ogg Vorbis
      Supported encoding is RA
      Supported encoding is AMR
      Supported encoding is EVRC
      Supported encoding is G726
      Supported encoding is FLAC
      Supported encoding is DDP
      Supported encoding is DTS
      Supported encoding is WMAPRO
      Supported encoding is ATRAC3
      Supported encoding is ATRACX
      Supported encoding is ATRACAAL
      Supported encoding is MIDI
      No more formats supported
  Port 121 has 1 buffers of size 32768
  Direction is output
    Port 121 requires 1 buffers
    Port 121 has min buffer size 32768 bytes
    Port 121 is an output port
    Port 121 is an audio port
    Port mimetype (null)
    Port encoding is PCM
      Supported audio formats are:
      Supported encoding is PCM
          PCM default sampling rate 44100
          PCM default bits per sample 16
          PCM default number of channels 2
      Supported encoding is DDP
      Supported encoding is DTS
      No more formats supported

Regrettably, none of these are actually supported except for PCM. According to jamesh in "OMX_AllocateBuffer fails for audio decoder component":

The way it works is that the component passes back success for all the codecs it can potentially support (i.e. all the codecs we've ever had going). That is then constrained by what codecs are actually installed. It would be better to run-time detect which codecs are present, but that code has never been written since it's never been required. It's also unlikely ever to be done, as Broadcom no longer support audio codecs in this way - they have moved off the Videocore to the host CPU, since they are now powerful enough to handle any audio decoding task.

That's kind of sad, really.

Okay, so how do we find out which encodings are really supported? A partial answer can be obtained by trying to allocate buffers for a particular encoding. So for each port we loop through the possible encodings, setting the encoding and then trying to allocate the buffers. We can use ilclient_enable_port_buffers for this, as it will return -1 if the allocation fails.

The program to do this is il_test_audio_encodings.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

#include <OMX_Core.h>
#include <OMX_Component.h>

#include <bcm_host.h>
#include <ilclient.h>

#define AUDIO  "enigma.s16"

#define OUT "out"
FILE *outfp;

void printState(OMX_HANDLETYPE handle) {
    OMX_STATETYPE state;
    OMX_ERRORTYPE err;

    err = OMX_GetState(handle, &state);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on getting state\n");
        exit(1);
    }
    switch (state) {
    case OMX_StateLoaded:           printf("StateLoaded\n"); break;
    case OMX_StateIdle:             printf("StateIdle\n"); break;
    case OMX_StateExecuting:        printf("StateExecuting\n"); break;
    case OMX_StatePause:            printf("StatePause\n"); break;
    case OMX_StateWaitForResources: printf("StateWait\n"); break;
    case OMX_StateInvalid:          printf("StateInvalid\n"); break;
    default:                        printf("State unknown\n"); break;
    }
}

char *err2str(int err) {
    switch (err) {
    case OMX_ErrorInsufficientResources: return "OMX_ErrorInsufficientResources";
    case OMX_ErrorUndefined: return "OMX_ErrorUndefined";
    case OMX_ErrorInvalidComponentName: return "OMX_ErrorInvalidComponentName";
    case OMX_ErrorComponentNotFound: return "OMX_ErrorComponentNotFound";
    case OMX_ErrorInvalidComponent: return "OMX_ErrorInvalidComponent";
    case OMX_ErrorBadParameter: return "OMX_ErrorBadParameter";
    case OMX_ErrorNotImplemented: return "OMX_ErrorNotImplemented";
    case OMX_ErrorUnderflow: return "OMX_ErrorUnderflow";
    case OMX_ErrorOverflow: return "OMX_ErrorOverflow";
    case OMX_ErrorHardware: return "OMX_ErrorHardware";
    case OMX_ErrorInvalidState: return "OMX_ErrorInvalidState";
    case OMX_ErrorStreamCorrupt: return "OMX_ErrorStreamCorrupt";
    case OMX_ErrorPortsNotCompatible: return "OMX_ErrorPortsNotCompatible";
    case OMX_ErrorResourcesLost: return "OMX_ErrorResourcesLost";
    case OMX_ErrorNoMore: return "OMX_ErrorNoMore";
    case OMX_ErrorVersionMismatch: return "OMX_ErrorVersionMismatch";
    case OMX_ErrorNotReady: return "OMX_ErrorNotReady";
    case OMX_ErrorTimeout: return "OMX_ErrorTimeout";
    case OMX_ErrorSameState: return "OMX_ErrorSameState";
    case OMX_ErrorResourcesPreempted: return "OMX_ErrorResourcesPreempted";
    case OMX_ErrorPortUnresponsiveDuringAllocation: return "OMX_ErrorPortUnresponsiveDuringAllocation";
    case OMX_ErrorPortUnresponsiveDuringDeallocation: return "OMX_ErrorPortUnresponsiveDuringDeallocation";
    case OMX_ErrorPortUnresponsiveDuringStop: return "OMX_ErrorPortUnresponsiveDuringStop";
    case OMX_ErrorIncorrectStateTransition: return "OMX_ErrorIncorrectStateTransition";
    case OMX_ErrorIncorrectStateOperation: return "OMX_ErrorIncorrectStateOperation";
    case OMX_ErrorUnsupportedSetting: return "OMX_ErrorUnsupportedSetting";
    case OMX_ErrorUnsupportedIndex: return "OMX_ErrorUnsupportedIndex";
    case OMX_ErrorBadPortIndex: return "OMX_ErrorBadPortIndex";
    case OMX_ErrorPortUnpopulated: return "OMX_ErrorPortUnpopulated";
    case OMX_ErrorComponentSuspended: return "OMX_ErrorComponentSuspended";
    case OMX_ErrorDynamicResourcesUnavailable: return "OMX_ErrorDynamicResourcesUnavailable";
    case OMX_ErrorMbErrorsInFrame: return "OMX_ErrorMbErrorsInFrame";
    case OMX_ErrorFormatNotDetected: return "OMX_ErrorFormatNotDetected";
    case OMX_ErrorContentPipeOpenFailed: return "OMX_ErrorContentPipeOpenFailed";
    case OMX_ErrorContentPipeCreationFailed: return "OMX_ErrorContentPipeCreationFailed";
    case OMX_ErrorSeperateTablesUsed: return "OMX_ErrorSeperateTablesUsed";
    case OMX_ErrorTunnelingUnsupported: return "OMX_ErrorTunnelingUnsupported";
    default: return "unknown error";
    }
}

void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}

void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    //fprintf(stderr, "OMX error %s\n", err2str(data));
}

int get_file_size(char *fname) {
    struct stat st;

    if (stat(fname, &st) == -1) {
	perror("Stat'ing img file");
	return -1;
    }
    return(st.st_size);
}

static void set_audio_decoder_input_format(COMPONENT_T *component, 
					   int port, int format) {
    // set input audio format
    //printf("Setting audio decoder format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;

    audioPortFormat.nPortIndex = port;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    audioPortFormat.eEncoding = format;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);
    //printf("Format set ok to %d\n", format);
}

char *format2str(OMX_AUDIO_CODINGTYPE format) {
    switch(format) {
    case OMX_AUDIO_CodingUnused: return "OMX_AUDIO_CodingUnused";
    case OMX_AUDIO_CodingAutoDetect: return "OMX_AUDIO_CodingAutoDetect";
    case OMX_AUDIO_CodingPCM: return "OMX_AUDIO_CodingPCM";
    case OMX_AUDIO_CodingADPCM: return "OMX_AUDIO_CodingADPCM";
    case OMX_AUDIO_CodingAMR: return "OMX_AUDIO_CodingAMR";
    case OMX_AUDIO_CodingGSMFR: return "OMX_AUDIO_CodingGSMFR";
    case OMX_AUDIO_CodingGSMEFR: return "OMX_AUDIO_CodingGSMEFR";
    case OMX_AUDIO_CodingGSMHR: return "OMX_AUDIO_CodingGSMHR";
    case OMX_AUDIO_CodingPDCFR: return "OMX_AUDIO_CodingPDCFR";
    case OMX_AUDIO_CodingPDCEFR: return "OMX_AUDIO_CodingPDCEFR";
    case OMX_AUDIO_CodingPDCHR: return "OMX_AUDIO_CodingPDCHR";
    case OMX_AUDIO_CodingTDMAFR: return "OMX_AUDIO_CodingTDMAFR";
    case OMX_AUDIO_CodingTDMAEFR: return "OMX_AUDIO_CodingTDMAEFR";
    case OMX_AUDIO_CodingQCELP8: return "OMX_AUDIO_CodingQCELP8";
    case OMX_AUDIO_CodingQCELP13: return "OMX_AUDIO_CodingQCELP13";
    case OMX_AUDIO_CodingEVRC: return "OMX_AUDIO_CodingEVRC";
    case OMX_AUDIO_CodingSMV: return "OMX_AUDIO_CodingSMV";
    case OMX_AUDIO_CodingG711: return "OMX_AUDIO_CodingG711";
    case OMX_AUDIO_CodingG723: return "OMX_AUDIO_CodingG723";
    case OMX_AUDIO_CodingG726: return "OMX_AUDIO_CodingG726";
    case OMX_AUDIO_CodingG729: return "OMX_AUDIO_CodingG729";
    case OMX_AUDIO_CodingAAC: return "OMX_AUDIO_CodingAAC";
    case OMX_AUDIO_CodingMP3: return "OMX_AUDIO_CodingMP3";
    case OMX_AUDIO_CodingSBC: return "OMX_AUDIO_CodingSBC";
    case OMX_AUDIO_CodingVORBIS: return "OMX_AUDIO_CodingVORBIS";
    case OMX_AUDIO_CodingWMA: return "OMX_AUDIO_CodingWMA";
    case OMX_AUDIO_CodingRA: return "OMX_AUDIO_CodingRA";
    case OMX_AUDIO_CodingMIDI: return "OMX_AUDIO_CodingMIDI";
    case OMX_AUDIO_CodingFLAC: return "OMX_AUDIO_CodingFLAC";
    case OMX_AUDIO_CodingDDP: return "OMX_AUDIO_CodingDDP";
    case OMX_AUDIO_CodingDTS: return "OMX_AUDIO_CodingDTS";
    case OMX_AUDIO_CodingWMAPRO: return "OMX_AUDIO_CodingWMAPRO";
    case OMX_AUDIO_CodingATRAC3: return "OMX_AUDIO_CodingATRAC3";
    case OMX_AUDIO_CodingATRACX: return "OMX_AUDIO_CodingATRACX";
    case OMX_AUDIO_CodingATRACAAL: return "OMX_AUDIO_CodingATRACAAL";
    default: return "Unknown format";
    }
}

void test_audio_port_formats(COMPONENT_T *component, int port) {
    int n = 2;   // OMX_AUDIO_CodingPCM, the first real encoding value
    while (n <= OMX_AUDIO_CodingMIDI) {
	set_audio_decoder_input_format(component, port, n);
	
	
	// input port
	if (ilclient_enable_port_buffers(component, port, 
					 NULL, NULL, NULL) < 0) {
	    printf("    Unsupported encoding is %s\n", 
		   format2str(n));
	} else {
	    printf("    Supported encoding is %s\n",
		  format2str(n));
	    ilclient_disable_port_buffers(component, port, 
					  NULL, NULL, NULL);
	}
	n++;
    }
    n = OMX_AUDIO_CodingFLAC;   // the Broadcom vendor-extension encodings
    while (n <= OMX_AUDIO_CodingATRACAAL) {
	set_audio_decoder_input_format(component, port, n);
	
	
	// input port
	if (ilclient_enable_port_buffers(component, port, 
					 NULL, NULL, NULL) < 0) {
	    printf("    Unsupported encoding is %s\n", 
		   format2str(n));
	} else {
	    printf("    Supported encoding is %s\n", 
		   format2str(n));
	    ilclient_disable_port_buffers(component, port, 
					  NULL, NULL, NULL);
	}
	n++;
    }
}

void test_all_audio_ports(COMPONENT_T *component) {
    OMX_PORT_PARAM_TYPE param;
    OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
    OMX_ERRORTYPE err;
    OMX_HANDLETYPE handle = ilclient_get_handle(component);

    int startPortNumber;
    int nPorts;
    int n;

    //setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
    memset(&param, 0, sizeof(OMX_PORT_PARAM_TYPE));
    param.nSize = sizeof(OMX_PORT_PARAM_TYPE);
    param.nVersion.nVersion = OMX_VERSION;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
    if(err != OMX_ErrorNone){
	fprintf(stderr, "Error in getting audio OMX_PORT_PARAM_TYPE parameter\n");
	return;
    }
    printf("Audio ports:\n");

    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts == 0) {
	printf("No ports of this type\n");
	return;
    }

    printf("Ports start on %d\n", startPortNumber);
    printf("There are %d open ports\n", nPorts);


    for (n = 0; n < nPorts; n++) {
	//setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
	memset(&sPortDef, 0, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
	sPortDef.nSize = sizeof(OMX_PARAM_PORTDEFINITIONTYPE);
	sPortDef.nVersion.nVersion = OMX_VERSION;
	

	sPortDef.nPortIndex = startPortNumber + n;
	err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
	if(err != OMX_ErrorNone){
	    fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n");
	    exit(1);
	}
	printf("Port %d has %d buffers of size %d\n",
	       sPortDef.nPortIndex,
	       sPortDef.nBufferCountActual,
	       sPortDef.nBufferSize);
	printf("Direction is %s\n", 
	       (sPortDef.eDir == OMX_DirInput ? "input" : "output"));
	test_audio_port_formats(component, sPortDef.nPortIndex);
    }
}

int main(int argc, char** argv) {
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;


    componentName = "audio_decode";
    if (argc == 2) {
	componentName = argv[1];
    }


    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
	fprintf(stderr, "IL client init failed\n");
	exit(1);
    }

    if (OMX_Init() != OMX_ErrorNone) {
	ilclient_destroy(handle);
	fprintf(stderr, "OMX init failed\n");
	exit(1);
    }

    ilclient_set_error_callback(handle,
				error_callback,
				NULL);
    ilclient_set_eos_callback(handle,
			      eos_callback,
			      NULL);


    err = ilclient_create_component(handle,
				    &component,
				    componentName,
				    ILCLIENT_DISABLE_ALL_PORTS
				    |
				    ILCLIENT_ENABLE_INPUT_BUFFERS
				    |
				    ILCLIENT_ENABLE_OUTPUT_BUFFERS
				    );
    if (err == -1) {
	fprintf(stderr, "Component create failed\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));

    err = ilclient_change_component_state(component,
					  OMX_StateIdle);
    if (err < 0) {
	fprintf(stderr, "Couldn't change state to Idle\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));

    test_all_audio_ports(component);

    exit(0);
}

      

The program appears to be only partially successful: for the audio_decode component it shows that only two input formats, PCM and ADPCM, can be decoded, but it also claims that they can be decoded to AMR, GSM and so on on the output port, which seems most unlikely:

Audio ports:
Ports start on 120
There are 2 open ports
Port 120 has 128 buffers of size 16384
Direction is input
    Supported encoding is OMX_AUDIO_CodingPCM
    Supported encoding is OMX_AUDIO_CodingADPCM
    Unsupported encoding is OMX_AUDIO_CodingAMR
    Unsupported encoding is OMX_AUDIO_CodingGSMFR
    Unsupported encoding is OMX_AUDIO_CodingGSMEFR
    Unsupported encoding is OMX_AUDIO_CodingGSMHR
    Unsupported encoding is OMX_AUDIO_CodingPDCFR
    Unsupported encoding is OMX_AUDIO_CodingPDCEFR
    Unsupported encoding is OMX_AUDIO_CodingPDCHR
    Unsupported encoding is OMX_AUDIO_CodingTDMAFR
    Unsupported encoding is OMX_AUDIO_CodingTDMAEFR
    Unsupported encoding is OMX_AUDIO_CodingQCELP8
    Unsupported encoding is OMX_AUDIO_CodingQCELP13
    ...
Port 121 has 1 buffers of size 32768
Direction is output
    Supported encoding is OMX_AUDIO_CodingPCM
    Supported encoding is OMX_AUDIO_CodingADPCM
    Supported encoding is OMX_AUDIO_CodingAMR
    Supported encoding is OMX_AUDIO_CodingGSMFR
    Supported encoding is OMX_AUDIO_CodingGSMEFR
    Supported encoding is OMX_AUDIO_CodingGSMHR
    Supported encoding is OMX_AUDIO_CodingPDCFR
    Supported encoding is OMX_AUDIO_CodingPDCEFR
    Supported encoding is OMX_AUDIO_CodingPDCHR
    Supported encoding is OMX_AUDIO_CodingTDMAFR
    Supported encoding is OMX_AUDIO_CodingTDMAEFR
    Supported encoding is OMX_AUDIO_CodingQCELP8
    Supported encoding is OMX_AUDIO_CodingQCELP13
    Supported encoding is OMX_AUDIO_CodingEVRC
    Supported encoding is OMX_AUDIO_CodingSMV
    Supported encoding is OMX_AUDIO_CodingG711
    Supported encoding is OMX_AUDIO_CodingG723
    Supported encoding is OMX_AUDIO_CodingG726
    Supported encoding is OMX_AUDIO_CodingG729
    ...

Decoding an audio file using audio_decode

The Broadcom audio_decode component will only decode PCM format data. And it decodes it to ... PCM format data. PCM (pulse code modulation) is the binary format commonly used to represent uncompressed audio data. In other words, unless Broadcom add support for some of the audio codecs, this component is pretty useless.

Rendering PCM data

PCM data

From Wikipedia

Pulse-code modulation (PCM) is a method used to digitally represent sampled analog signals. It is the standard form for digital audio in computers and various Blu-ray, DVD and Compact Disc formats, as well as other uses such as digital telephone systems. A PCM stream is a digital representation of an analog signal, in which the magnitude of the analog signal is sampled regularly at uniform intervals, with each sample being quantized to the nearest value within a range of digital steps.

PCM streams have two basic properties that determine their fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that each sample can take.

PCM data can be stored in files as "raw" data. In this case there is no header information to say what the sampling rate and bit depth are. Many tools such as sox use the file extension to determine these properties. From man soxformat:

f32 and f64 indicate files encoded as 32 and 64-bit (IEEE single and double precision) floating point PCM respectively; s8, s16, s24, and s32 indicate 8, 16, 24, and 32-bit signed integer PCM respectively; u8, u16, u24, and u32 indicate 8, 16, 24, and 32-bit unsigned integer PCM respectively

But it should be noted that the file extension is only an aid to understanding some of the PCM codec parameters and how the data is stored in the file; nothing in the file itself records them.

Files can be converted into PCM by tools such as avconv (ffmpeg accepts the same options for this conversion). For example, to convert a WAV file to 16-bit signed little-endian PCM:

avconv -i enigma.wav -f s16le enigma.s16

The output reports information that is not saved in the raw file, and which you will need to supply to any processing program later:

Input #0, wav, from 'enigma.wav':
  Duration: 00:06:26.38, bitrate: 1411 kb/s
    Stream #0.0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
Output #0, s16le, to 'enigma.s16':
  Metadata:
    encoder         : Lavf54.20.4
    Stream #0.0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s

From this we can see that the format is two channels, 44,100 Hz, 16 bit little-endian. (The file I used was from a group called Enigma who released an album as Open Content.)

To check that the conversion worked, you can play the raw file with aplay, supplying the parameters that the file itself cannot record:

aplay -r 44100 -c 2 -f S16_LE enigma.s16

Choosing an output device

OpenMAX has a standard audio render component. But what device does it render to? The inbuilt sound card? A USB sound card? That is not a part of OpenMAX IL - there isn't even a way to list the audio devices - only the audio components.

OpenMAX has an extension mechanism which can be used by an OpenMAX implementor to answer questions like this. The Broadcom core implementation has the extension types OMX_CONFIG_BRCMAUDIODESTINATIONTYPE (and OMX_CONFIG_BRCMAUDIOSOURCETYPE) which can be used to set the audio destination (source) device. Code to do this is:

void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
    OMX_ERRORTYPE err;
    OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;

    if (name && strlen(name) < sizeof(arDest.sName)) {
	memset(&arDest, 0, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
	arDest.nSize = sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE);
	arDest.nVersion.nVersion = OMX_VERSION;

	strcpy((char *)arDest.sName, name);

	err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
	if (err != OMX_ErrorNone) {
	    fprintf(stderr, "Error on setting audio destination\n");
	    exit(1);
	}
    }
}

Here is where Broadcom becomes a bit obscure again: the header file IL/OMX_Broadcom.h states that the default value of sName is "local" but doesn't give any other values. The Raspberry Pi forums say that this refers to the 3.5mm analog audio output, and that HDMI is chosen by using the value "hdmi". No other values are documented, and it seems that the Broadcom OpenMAX IL does not support any other audio devices: in particular, USB audio devices are not supported by the current Broadcom OpenMAX IL components for either input or output. So you can't use OpenMAX IL for, say, audio capture on the Raspberry Pi, since there is no Broadcom-supported audio input.
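Using the two-argument form of setOutputDevice from the full listing later in this chapter, routing playback to HDMI is then a one-line call (a sketch, assuming component is the created audio_render component):

```c
// "hdmi" routes audio to the HDMI output; "local" selects the 3.5mm jack
setOutputDevice(ilclient_get_handle(component), "hdmi");
```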

Setting PCM format

We use two functions to set the PCM format. The first contains nothing unusual:

void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;

    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;

    audioPortFormat.nPortIndex = 100;

    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    setPCMMode(ilclient_get_handle(component), 100);
}

The second gets the current PCM parameters and then sets the required PCM parameters (which we know independently):

void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;

    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;

    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
	   sPCMMode.nSamplingRate,
	   sPCMMode.nChannels);

    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2;

    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if (err != OMX_ErrorNone) {
	fprintf(stderr, "PCM mode unsupported\n");
	return;
    } else {
	fprintf(stderr, "PCM mode supported\n");
	fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
	fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    }
}

The program is il_render_audio.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

#include <OMX_Core.h>
#include <OMX_Component.h>

#include <bcm_host.h>
#include <ilclient.h>

#define AUDIO  "enigma.s16"

/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
    OMX_ERRORTYPE err;
    OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;

    if (name && strlen(name) < sizeof(arDest.sName)) {
	memset(&arDest, 0, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
	arDest.nSize = sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE);
	arDest.nVersion.nVersion = OMX_VERSION;

	strcpy((char *)arDest.sName, name);
       
	err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
	if (err != OMX_ErrorNone) {
	    fprintf(stderr, "Error on setting audio destination\n");
	    exit(1);
	}
    }
}

void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;
 
    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;

    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
	   sPCMMode.nSamplingRate, 
	   sPCMMode.nChannels);

    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2;

    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
	fprintf(stderr, "PCM mode unsupported\n");
	return;
    } else {
	fprintf(stderr, "PCM mode supported\n");
	fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
	fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    } 
}

void printState(OMX_HANDLETYPE handle) {
    OMX_STATETYPE state;
    OMX_ERRORTYPE err;

    err = OMX_GetState(handle, &state);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on getting state\n");
        exit(1);
    }
    switch (state) {
    case OMX_StateLoaded:           printf("StateLoaded\n"); break;
    case OMX_StateIdle:             printf("StateIdle\n"); break;
    case OMX_StateExecuting:        printf("StateExecuting\n"); break;
    case OMX_StatePause:            printf("StatePause\n"); break;
    case OMX_StateWaitForResources: printf("StateWait\n"); break;
    case OMX_StateInvalid:          printf("StateInvalid\n"); break;
    default:                        printf("State unknown\n"); break;
    }
}

char *err2str(int err) {
    switch (err) {
    case OMX_ErrorInsufficientResources: return "OMX_ErrorInsufficientResources";
    case OMX_ErrorUndefined: return "OMX_ErrorUndefined";
    case OMX_ErrorInvalidComponentName: return "OMX_ErrorInvalidComponentName";
    case OMX_ErrorComponentNotFound: return "OMX_ErrorComponentNotFound";
    case OMX_ErrorInvalidComponent: return "OMX_ErrorInvalidComponent";
    case OMX_ErrorBadParameter: return "OMX_ErrorBadParameter";
    case OMX_ErrorNotImplemented: return "OMX_ErrorNotImplemented";
    case OMX_ErrorUnderflow: return "OMX_ErrorUnderflow";
    case OMX_ErrorOverflow: return "OMX_ErrorOverflow";
    case OMX_ErrorHardware: return "OMX_ErrorHardware";
    case OMX_ErrorInvalidState: return "OMX_ErrorInvalidState";
    case OMX_ErrorStreamCorrupt: return "OMX_ErrorStreamCorrupt";
    case OMX_ErrorPortsNotCompatible: return "OMX_ErrorPortsNotCompatible";
    case OMX_ErrorResourcesLost: return "OMX_ErrorResourcesLost";
    case OMX_ErrorNoMore: return "OMX_ErrorNoMore";
    case OMX_ErrorVersionMismatch: return "OMX_ErrorVersionMismatch";
    case OMX_ErrorNotReady: return "OMX_ErrorNotReady";
    case OMX_ErrorTimeout: return "OMX_ErrorTimeout";
    case OMX_ErrorSameState: return "OMX_ErrorSameState";
    case OMX_ErrorResourcesPreempted: return "OMX_ErrorResourcesPreempted";
    case OMX_ErrorPortUnresponsiveDuringAllocation: return "OMX_ErrorPortUnresponsiveDuringAllocation";
    case OMX_ErrorPortUnresponsiveDuringDeallocation: return "OMX_ErrorPortUnresponsiveDuringDeallocation";
    case OMX_ErrorPortUnresponsiveDuringStop: return "OMX_ErrorPortUnresponsiveDuringStop";
    case OMX_ErrorIncorrectStateTransition: return "OMX_ErrorIncorrectStateTransition";
    case OMX_ErrorIncorrectStateOperation: return "OMX_ErrorIncorrectStateOperation";
    case OMX_ErrorUnsupportedSetting: return "OMX_ErrorUnsupportedSetting";
    case OMX_ErrorUnsupportedIndex: return "OMX_ErrorUnsupportedIndex";
    case OMX_ErrorBadPortIndex: return "OMX_ErrorBadPortIndex";
    case OMX_ErrorPortUnpopulated: return "OMX_ErrorPortUnpopulated";
    case OMX_ErrorComponentSuspended: return "OMX_ErrorComponentSuspended";
    case OMX_ErrorDynamicResourcesUnavailable: return "OMX_ErrorDynamicResourcesUnavailable";
    case OMX_ErrorMbErrorsInFrame: return "OMX_ErrorMbErrorsInFrame";
    case OMX_ErrorFormatNotDetected: return "OMX_ErrorFormatNotDetected";
    case OMX_ErrorContentPipeOpenFailed: return "OMX_ErrorContentPipeOpenFailed";
    case OMX_ErrorContentPipeCreationFailed: return "OMX_ErrorContentPipeCreationFailed";
    case OMX_ErrorSeperateTablesUsed: return "OMX_ErrorSeperateTablesUsed";
    case OMX_ErrorTunnelingUnsupported: return "OMX_ErrorTunnelingUnsupported";
    default: return "unknown error";
    }
}

void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}

void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "OMX error %s\n", err2str(data));
}

int get_file_size(char *fname) {
    struct stat st;

    if (stat(fname, &st) == -1) {
	perror("Stat'ing audio file");
	return -1;
    }
    return(st.st_size);
}

static void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;

    audioPortFormat.nPortIndex = 100;


    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingMP3;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    setPCMMode(ilclient_get_handle(component), 100);

}

OMX_ERRORTYPE read_into_buffer_and_empty(FILE *fp,
					 COMPONENT_T *component,
					 OMX_BUFFERHEADERTYPE *buff_header,
					 int *toread) {
    OMX_ERRORTYPE r;

    int buff_size = buff_header->nAllocLen;
    int nread = fread(buff_header->pBuffer, 1, buff_size, fp);

    printf("Read %d\n", nread);

    buff_header->nFilledLen = nread;
    *toread -= nread;
    if (*toread <= 0) {
	printf("Setting EOS on input\n");
	buff_header->nFlags |= OMX_BUFFERFLAG_EOS;
    }
    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
			    buff_header);
    if (r != OMX_ErrorNone) {
	fprintf(stderr, "Empty buffer error %s\n",
		err2str(r));
    }
    return r;
}


int main(int argc, char** argv) {

    int i;
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;

    char *audio_file = AUDIO;
    if (argc == 2) {
	audio_file = argv[1];
    }

    FILE *fp = fopen(audio_file, "r");
    int toread = get_file_size(audio_file);

    OMX_BUFFERHEADERTYPE *buff_header;

    componentName = "audio_render";


    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
	fprintf(stderr, "IL client init failed\n");
	exit(1);
    }

    if (OMX_Init() != OMX_ErrorNone) {
        ilclient_destroy(handle);
        fprintf(stderr, "OMX init failed\n");
	exit(1);
    }

    ilclient_set_error_callback(handle,
				error_callback,
				NULL);
    ilclient_set_eos_callback(handle,
			      eos_callback,
			      NULL);


    err = ilclient_create_component(handle,
				    &component,
				    componentName,
				    ILCLIENT_DISABLE_ALL_PORTS
				    |
				    ILCLIENT_ENABLE_INPUT_BUFFERS
				    );
    if (err == -1) {
	fprintf(stderr, "Component create failed\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));

    err = ilclient_change_component_state(component,
					  OMX_StateIdle);
    if (err < 0) {
	fprintf(stderr, "Couldn't change state to Idle\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));

    // must be before we enable buffers
    set_audio_render_input_format(component);

    setOutputDevice(ilclient_get_handle(component), "local");

    // input port
    ilclient_enable_port_buffers(component, 100, 
				 NULL, NULL, NULL);
    ilclient_enable_port(component, 100);




    err = ilclient_change_component_state(component,
					  OMX_StateExecuting);
    if (err < 0) {
	fprintf(stderr, "Couldn't change state to Executing\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));

#if 0
    // Read the first block so that the component can get
    // the dimensions of the audio and call port settings
    // changed on the output port to configure it
    buff_header = 
	ilclient_get_input_buffer(component,
				  120,
				  1 /* block */);
    if (buff_header != NULL) {
	read_into_buffer_and_empty(fp,
				   component,
				   buff_header,
				   &toread);

	// If all the file has been read in, then
	// we have to re-read this first block.
	// Broadcom bug?
	if (toread <= 0) {
	    printf("Rewinding\n");
	    // wind back to start and repeat
	    fp = freopen(AUDIO, "r", fp);
	    toread = get_file_size(AUDIO);
	}
    }

    if (toread > 0 && ilclient_remove_event(component, 
					    OMX_EventPortSettingsChanged, 
					    121, 0, 0, 1) == 0) {
	printf("Removed port settings event\n");
	break;
    } else {
	printf("No port settings seen yet\n");
    }
    // wait for first input block to set params for output port
    if (toread == 0) {
	err = ilclient_wait_for_event(component, 
				      OMX_EventPortSettingsChanged, 
				      121, 0, 0, 1,
				      ILCLIENT_EVENT_ERROR | ILCLIENT_PARAMETER_CHANGED, 
				      10000);
	if (err < 0) {
	    printf("wait for port settings timed out\n");
	} else {
	    printf("got port settings event\n");
	    break;
	}
    }


    // now enable output port since port params have been set
    ilclient_enable_port_buffers(component, 100, 
				 NULL, NULL, NULL);
    ilclient_enable_port(component, 100);
#endif

    // now work through the file
    while (toread > 0) {
	OMX_ERRORTYPE r;

	// do we have an input buffer we can fill and empty?
	buff_header = 
	    ilclient_get_input_buffer(component,
				      100,
				      1 /* block */);
	if (buff_header != NULL) {
	    read_into_buffer_and_empty(fp,
				       component,
				       buff_header,
				       &toread);
	}
    }

#if 0
	// do we have an output buffer that has been filled?
	buff_header = 
	    ilclient_get_output_buffer(component,
				       121,
				       0 /* no block */);
	if (buff_header != NULL) {
	    save_info_from_filled_buffer(component,
					 buff_header);
	}
    }

    while (1) {
	printf("Getting last output buffers\n");
	buff_header = 
	    ilclient_get_output_buffer(component,
				       121,
				       1 /* block */);
	if (buff_header != NULL) {
	    save_info_from_filled_buffer(component,
					 buff_header);
	}
    }
#endif

    sleep(100);
    exit(0);
}

      

Decoding an MP3 file using FFmpeg/Avconv (old style)

If you want to play a compressed file such as MP3 or Ogg, it has to be decoded, and as noted before the audio_decode component doesn't do this. So we have to turn to another system. The prominent audio decoding systems are FFmpeg (and its fork LibAV) and GStreamer. I will use LibAV as that is the default install on the RPi.

FFmpeg was started in 2000; LibAV forked from it in 2011. Over time, both libraries and the file formats have evolved, so there are code examples on the Web which are no longer appropriate. FFmpeg and LibAV follow largely the same API and are generally interchangeable at the user-code level - but not always.

The current FFmpeg source distribution includes a program doc/examples/decoding_encoding.c which almost works on the MP3 files I tried.

The reason decoding_encoding.c fails on many newer MP3 files is a change of format. Whereas audio samples were previously mainly interleaved, there is now a set of MP3s that are planar. This is easily illustrated with stereo: interleaved means LRLRLR..., while planar gives a block of consecutive L's followed by a block of consecutive R's as LLLLLL...RRRRRR... Interleaved is just the degenerate case of planar with a run length of one.
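To make the two layouts concrete, here is a small stand-alone sketch of interleaving planar channels by plain byte copying. It uses no FFmpeg structures, and the function and parameter names are my own:

```c
#include <stddef.h>

/* Interleave nch planar channels into out.  planes[m] holds nsamples
 * samples for channel m, each sample_size bytes wide.  A plain
 * byte-level sketch, independent of the FFmpeg structures used later. */
static void interleave(unsigned char *out,
                       const unsigned char *const planes[],
                       int nch, int nsamples, size_t sample_size)
{
    size_t k = 0, b;
    int n, m;

    for (n = 0; n < nsamples; n++)        /* for each sample position */
        for (m = 0; m < nch; m++)         /* take one sample per channel */
            for (b = 0; b < sample_size; b++)
                out[k++] = planes[m][n * sample_size + b];
}
```

With two one-byte channels LLL and RRR this produces LRLRLR, which is exactly the layout tools such as aplay expect.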

A frame of video/audio that is decoded by FFmpeg/LibAV is built in a struct AVFrame. This includes the fields

	
typedef struct AVFrame {
    uint8_t *data[AV_NUM_DATA_POINTERS];
    int linesize[AV_NUM_DATA_POINTERS];
    int nb_samples;
} AVFrame;
	
      

In the interleaved case, all of the samples are in data[0]. In the planar case, they are in data[0], data[1], .... There does not seem to be an explicit indicator of how many planar streams there are, but if a data element is non-NULL it seems to contain a stream. So by walking the data array until we find NULL we can find the number of streams.

Many tools such as aplay will only accept interleaved samples. So given multiple planar streams we have to interleave them ourselves. This isn't hard, once we know the sample size in bytes, the length of each stream and the number of streams (a more robust way is given in a later section):

	
            int data_size = av_samples_get_buffer_size(NULL, c->channels,
                                                       decoded_frame->nb_samples,
                                                       c->sample_fmt, 1);
	    // first time: count the number of  planar streams
	    if (num_streams == 0) {
		while (num_streams < AV_NUM_DATA_POINTERS &&
		       decoded_frame->data[num_streams] != NULL) 
		    num_streams++; 
	    }

	    // first time: set sample_size from 0 to e.g. 2 for 16-bit data
	    if (sample_size == 0) {
		sample_size = 
		    data_size / (num_streams * decoded_frame->nb_samples);
	    }

	    int m, n;
	    for (n = 0; n < decoded_frame->nb_samples; n++) {
		// interleave the samples from the planar streams
		for (m = 0; m < num_streams; m++) {
		    fwrite(&decoded_frame->data[m][n*sample_size], 
			   1, sample_size, outfile);
		}
	    }
	
      

The revised program which reads from an MP3 file and writes decoded data to /tmp/test.sw is api-example.c

/*
 * copyright (c) 2001 Fabrice Bellard
 *
 * This file is part of Libav.
 *
 * Libav is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * Libav is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with Libav; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

// From http://code.haskell.org/~thielema/audiovideo-example/cbits/

/**
 * @file
 * libavcodec API use example.
 *
 * @example libavcodec/api-example.c
 * Note that this library only handles codecs (mpeg, mpeg4, etc...),
 * not file formats (avi, vob, etc...). See library 'libavformat' for the
 * format handling
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#ifdef HAVE_AV_CONFIG_H
#undef HAVE_AV_CONFIG_H
#endif

#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>
#include "libavutil/mathematics.h"
#include "libavutil/samplefmt.h"

#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096

/*
 * Audio decoding.
 */
static void audio_decode_example(const char *outfilename, const char *filename)
{
    AVCodec *codec;
    AVCodecContext *c = NULL;
    int len;
    FILE *f, *outfile;
    uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];
    AVPacket avpkt;
    AVFrame *decoded_frame = NULL;
    int num_streams = 0;
    int sample_size = 0;

    av_init_packet(&avpkt);

    printf("Audio decoding\n");

    /* find the mpeg audio decoder */
    codec = avcodec_find_decoder(CODEC_ID_MP3);
    if (!codec) {
        fprintf(stderr, "codec not found\n");
        exit(1);
    }

    c = avcodec_alloc_context3(codec);

    /* open it */
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "could not open codec\n");
        exit(1);
    }

    f = fopen(filename, "rb");
    if (!f) {
        fprintf(stderr, "could not open %s\n", filename);
        exit(1);
    }
    outfile = fopen(outfilename, "wb");
    if (!outfile) {
        av_free(c);
        exit(1);
    }

    /* decode until eof */
    avpkt.data = inbuf;
    avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);

    while (avpkt.size > 0) {
        int got_frame = 0;

        if (!decoded_frame) {
            if (!(decoded_frame = avcodec_alloc_frame())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
        } else
            avcodec_get_frame_defaults(decoded_frame);

        len = avcodec_decode_audio4(c, decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
	    printf("Decoded frame nb_samples %d, format %d\n", 
		   decoded_frame->nb_samples,
		   decoded_frame->format);
	    if (decoded_frame->data[1] != NULL)
	        printf("Data[1] not null\n");
	    else
		printf("Data[1] is null\n");
            /* if a frame has been decoded, output it */
            int data_size = av_samples_get_buffer_size(NULL, c->channels,
                                                       decoded_frame->nb_samples,
                                                       c->sample_fmt, 1);
	    // first time: count the number of  planar streams
	    if (num_streams == 0) {
		while (num_streams < AV_NUM_DATA_POINTERS &&
		       decoded_frame->data[num_streams] != NULL) 
		    num_streams++; 
	    }

	    // first time: set sample_size from 0 to e.g. 2 for 16-bit data
	    if (sample_size == 0) {
		sample_size = 
		    data_size / (num_streams * decoded_frame->nb_samples);
	    }

	    int m, n;
	    for (n = 0; n < decoded_frame->nb_samples; n++) {
		// interleave the samples from the planar streams
		for (m = 0; m < num_streams; m++) {
		    fwrite(&decoded_frame->data[m][n*sample_size], 
			   1, sample_size, outfile);
		}
	    }
        }
        avpkt.size -= len;
        avpkt.data += len;
        if (avpkt.size < AUDIO_REFILL_THRESH) {
            /* Refill the input buffer, to avoid trying to decode
             * incomplete frames. Instead of this, one could also use
             * a parser, or use a proper container format through
             * libavformat. */
            memmove(inbuf, avpkt.data, avpkt.size);
            avpkt.data = inbuf;
            len = fread(avpkt.data + avpkt.size, 1,
                        AUDIO_INBUF_SIZE - avpkt.size, f);
            if (len > 0)
                avpkt.size += len;
        }
    }

    fclose(outfile);
    fclose(f);

    avcodec_close(c);
    av_free(c);
    av_free(decoded_frame);
}

int main(int argc, char **argv)
{
    const char *filename = "BST.mp3";
    AVFormatContext *pFormatCtx = NULL;

    if (argc == 2) {
        filename = argv[1];
    }

    // Register all formats and codecs
    av_register_all();
    if(avformat_open_input(&pFormatCtx, filename, NULL, NULL)!=0) {
	fprintf(stderr, "Can't get format\n");
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0)
	return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, filename, 0);
    printf("Num streams %d\n", pFormatCtx->nb_streams);
    printf("Bit rate %d\n", pFormatCtx->bit_rate);
    audio_decode_example("/tmp/test.sw", filename);

    return 0;
}

      

The compile command is

	
cc -c -g api-example.c
cc api-example.o -lavutil -lavcodec -lavformat -o api-example -lm
	
      

The result may be tested by (you may need to change parameters)

	
aplay -r 44100 -c 2 -f S16_LE /tmp/test.sw
	
      
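Raw PCM has no header, so aplay has to be told the rate, channel count and sample format. A quick way to sanity-check the chosen parameters is that the playback time should come out right: duration is bytes divided by (rate x channels x bytes per sample). A sketch of the arithmetic (the function is mine):

```c
#include <stddef.h>

/* Expected playback time in seconds of a raw PCM file:
 * bytes / (rate * channels * bytes_per_sample).
 * For the S16_LE format used above, bytes_per_sample is 2. */
static double pcm_duration(size_t nbytes, int rate, int channels,
                           int bytes_per_sample)
{
    return (double)nbytes / ((double)rate * channels * bytes_per_sample);
}
```

If the file's size divided this way doesn't match the length of the original MP3, one of the aplay parameters is wrong.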

Rendering MP3 using FFmpeg/LibAV and OpenMAX

Since the Broadcom audio_decode is apparently of little use, if we actually want to play MP3, Ogg or other encoded formats, we have to use FFmpeg/LibAV (or GStreamer) to decode the audio to PCM and then pass it to the Broadcom audio_render component.

Essentially this means taking the last two programs and mashing them together. It isn't hard, just a bit messy. The only tricky point is that the buffers returned from FFmpeg and the buffers used by audio_render are different sizes, and we don't really know which will be bigger. If the audio_render input buffers are bigger, then we just copy (and interleave) the FFmpeg data across; if smaller, then we have to keep fetching new buffers as each one is filled.
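The copy-and-refill logic can be sketched independently of OpenMAX. In the sketch below the fixed-size destination stands in for an audio_render input buffer (fetched in the real program with ilclient_get_input_buffer and emptied with OMX_EmptyThisBuffer); the helper name is mine:

```c
#include <string.h>
#include <stddef.h>

/* Copy src (src_len bytes) into dst in chunks of at most cap bytes,
 * recording each chunk length in lens[].  Returns the number of chunks
 * (i.e. OpenMAX buffers) used: all full buffers, plus one final
 * partial buffer if the data doesn't divide evenly. */
static int chunk_into_buffers(const unsigned char *src, size_t src_len,
                              unsigned char *dst, size_t cap,
                              size_t lens[])
{
    int nbuf = 0;
    size_t done = 0;

    while (done < src_len) {
        size_t n = src_len - done;
        if (n > cap)
            n = cap;                 /* fill one buffer to capacity */
        memcpy(dst + done, src + done, n);
        lens[nbuf++] = n;            /* would set nFilledLen and empty it */
        done += n;
    }
    return nbuf;
}
```

The real program below does the same thing, except that interleaving happens during the copy, so the chunk boundary is tracked per sample rather than with memcpy.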

The resultant program is il_ffmpeg_render_audio.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

#include <OMX_Core.h>
#include <OMX_Component.h>

#include <bcm_host.h>
#include <ilclient.h>


#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>
#include "libavutil/mathematics.h"
#include "libavutil/samplefmt.h"

#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096

#define AUDIO  "BST.mp3"

/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
    OMX_ERRORTYPE err;
    OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;

    if (name && strlen(name) < sizeof(arDest.sName)) {
	memset(&arDest, 0, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
	arDest.nSize = sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE);
	arDest.nVersion.nVersion = OMX_VERSION;

	strcpy((char *)arDest.sName, name);
       
	err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
	if (err != OMX_ErrorNone) {
	    fprintf(stderr, "Error on setting audio destination\n");
	    exit(1);
	}
    }
}

void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;
 
    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;

    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
	   sPCMMode.nSamplingRate, 
	   sPCMMode.nChannels);

    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2; // assumed for now - should be checked

    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
	fprintf(stderr, "PCM mode unsupported\n");
	return;
    } else {
	fprintf(stderr, "PCM mode supported\n");
	fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
	fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    } 
}

void printState(OMX_HANDLETYPE handle) {
    OMX_STATETYPE state;
    OMX_ERRORTYPE err;

    err = OMX_GetState(handle, &state);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on getting state\n");
        exit(1);
    }
    switch (state) {
    case OMX_StateLoaded:           printf("StateLoaded\n"); break;
    case OMX_StateIdle:             printf("StateIdle\n"); break;
    case OMX_StateExecuting:        printf("StateExecuting\n"); break;
    case OMX_StatePause:            printf("StatePause\n"); break;
    case OMX_StateWaitForResources: printf("StateWait\n"); break;
    case OMX_StateInvalid:          printf("StateInvalid\n"); break;
    default:                        printf("State unknown\n"); break;
    }
}

char *err2str(int err) {
    switch (err) {
    case OMX_ErrorInsufficientResources: return "OMX_ErrorInsufficientResources";
    case OMX_ErrorUndefined: return "OMX_ErrorUndefined";
    case OMX_ErrorInvalidComponentName: return "OMX_ErrorInvalidComponentName";
    case OMX_ErrorComponentNotFound: return "OMX_ErrorComponentNotFound";
    case OMX_ErrorInvalidComponent: return "OMX_ErrorInvalidComponent";
    case OMX_ErrorBadParameter: return "OMX_ErrorBadParameter";
    case OMX_ErrorNotImplemented: return "OMX_ErrorNotImplemented";
    case OMX_ErrorUnderflow: return "OMX_ErrorUnderflow";
    case OMX_ErrorOverflow: return "OMX_ErrorOverflow";
    case OMX_ErrorHardware: return "OMX_ErrorHardware";
    case OMX_ErrorInvalidState: return "OMX_ErrorInvalidState";
    case OMX_ErrorStreamCorrupt: return "OMX_ErrorStreamCorrupt";
    case OMX_ErrorPortsNotCompatible: return "OMX_ErrorPortsNotCompatible";
    case OMX_ErrorResourcesLost: return "OMX_ErrorResourcesLost";
    case OMX_ErrorNoMore: return "OMX_ErrorNoMore";
    case OMX_ErrorVersionMismatch: return "OMX_ErrorVersionMismatch";
    case OMX_ErrorNotReady: return "OMX_ErrorNotReady";
    case OMX_ErrorTimeout: return "OMX_ErrorTimeout";
    case OMX_ErrorSameState: return "OMX_ErrorSameState";
    case OMX_ErrorResourcesPreempted: return "OMX_ErrorResourcesPreempted";
    case OMX_ErrorPortUnresponsiveDuringAllocation: return "OMX_ErrorPortUnresponsiveDuringAllocation";
    case OMX_ErrorPortUnresponsiveDuringDeallocation: return "OMX_ErrorPortUnresponsiveDuringDeallocation";
    case OMX_ErrorPortUnresponsiveDuringStop: return "OMX_ErrorPortUnresponsiveDuringStop";
    case OMX_ErrorIncorrectStateTransition: return "OMX_ErrorIncorrectStateTransition";
    case OMX_ErrorIncorrectStateOperation: return "OMX_ErrorIncorrectStateOperation";
    case OMX_ErrorUnsupportedSetting: return "OMX_ErrorUnsupportedSetting";
    case OMX_ErrorUnsupportedIndex: return "OMX_ErrorUnsupportedIndex";
    case OMX_ErrorBadPortIndex: return "OMX_ErrorBadPortIndex";
    case OMX_ErrorPortUnpopulated: return "OMX_ErrorPortUnpopulated";
    case OMX_ErrorComponentSuspended: return "OMX_ErrorComponentSuspended";
    case OMX_ErrorDynamicResourcesUnavailable: return "OMX_ErrorDynamicResourcesUnavailable";
    case OMX_ErrorMbErrorsInFrame: return "OMX_ErrorMbErrorsInFrame";
    case OMX_ErrorFormatNotDetected: return "OMX_ErrorFormatNotDetected";
    case OMX_ErrorContentPipeOpenFailed: return "OMX_ErrorContentPipeOpenFailed";
    case OMX_ErrorContentPipeCreationFailed: return "OMX_ErrorContentPipeCreationFailed";
    case OMX_ErrorSeperateTablesUsed: return "OMX_ErrorSeperateTablesUsed";
    case OMX_ErrorTunnelingUnsupported: return "OMX_ErrorTunnelingUnsupported";
    default: return "unknown error";
    }
}

void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}

void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "OMX error %s\n", err2str(data));
}

int get_file_size(char *fname) {
    struct stat st;

    if (stat(fname, &st) == -1) {
	perror("Stat'ing audio file");
	return -1;
    }
    return(st.st_size);
}

AVPacket avpkt;
AVCodecContext *c = NULL;

/*
 * Audio decoding.
 */
static void audio_decode_example(const char *filename)
{
    AVCodec *codec;


    av_init_packet(&avpkt);

    printf("Audio decoding\n");

    /* find the mpeg audio decoder */
    codec = avcodec_find_decoder(CODEC_ID_MP3);
    if (!codec) {
        fprintf(stderr, "codec not found\n");
        exit(1);
    }

    c = avcodec_alloc_context3(codec);

    /* open it */
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "could not open codec\n");
        exit(1);
    }
}

static void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;

    audioPortFormat.nPortIndex = 100;


    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingMP3;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    setPCMMode(ilclient_get_handle(component), 100);

}

int num_streams = 0;
int sample_size = 0;

OMX_ERRORTYPE read_into_buffer_and_empty(AVFrame *decoded_frame,
					 COMPONENT_T *component,
					 // OMX_BUFFERHEADERTYPE *buff_header,
					 int total_len) {
    OMX_ERRORTYPE r;
    OMX_BUFFERHEADERTYPE *buff_header = NULL;
    int k, m, n;

    if (total_len <= 4096) { // assume an OpenMAX buffer (nAllocLen) holds at least 4096 bytes
	// all decoded frame fits into one OpenMAX buffer
	buff_header = 
	    ilclient_get_input_buffer(component,
				      100,
				      1 /* block */);
	for (k = 0, n = 0; n < decoded_frame->nb_samples; n++) {
	    for (m = 0; m < num_streams; m++) {
		memcpy(&buff_header->pBuffer[k], 
		       &decoded_frame->data[m][n*sample_size], 
		       sample_size);
		k += sample_size;
	    }
	}

	buff_header->nFilledLen = k;
	r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
				buff_header);
	if (r != OMX_ErrorNone) {
	    fprintf(stderr, "Empty buffer error %s\n",
		    err2str(r));
	}
	return r;
    }

    // more than one OpenMAX buffer required
    for (k = 0, n = 0; n < decoded_frame->nb_samples; n++) {

	if (k == 0) {
	     buff_header = 
		ilclient_get_input_buffer(component,
					  100,
					  1 /* block */);
	}

	// interleave the samples from the planar streams
	for (m = 0; m < num_streams; m++) {
	    memcpy(&buff_header->pBuffer[k], 
		   &decoded_frame->data[m][n*sample_size], 
		   sample_size);
	    k += sample_size;
	}

	if (k >= buff_header->nAllocLen) {
	    // this buffer is full
	    buff_header->nFilledLen = k;
	    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
				    buff_header);
	    if (r != OMX_ErrorNone) {
		fprintf(stderr, "Empty buffer error %s\n",
			err2str(r));
	    }
	    k = 0;
	    buff_header = NULL;
	}
    }
    if (buff_header != NULL) {
	    buff_header->nFilledLen = k;
	    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
				    buff_header);
	    if (r != OMX_ErrorNone) {
		fprintf(stderr, "Empty buffer error %s\n",
			err2str(r));
	    }
    }
    return r;
}


int main(int argc, char** argv) {

    int i;
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;

    AVFormatContext *pFormatCtx = NULL;

    char *audio_file = AUDIO;
    if (argc == 2) {
	audio_file = argv[1];
    }

    FILE *fp = fopen(audio_file, "r");
    int toread = get_file_size(audio_file);

    OMX_BUFFERHEADERTYPE *buff_header;

    componentName = "audio_render";

    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
	fprintf(stderr, "IL client init failed\n");
	exit(1);
    }

    if (OMX_Init() != OMX_ErrorNone) {
        ilclient_destroy(handle);
        fprintf(stderr, "OMX init failed\n");
	exit(1);
    }

    ilclient_set_error_callback(handle,
				error_callback,
				NULL);
    ilclient_set_eos_callback(handle,
			      eos_callback,
			      NULL);


    err = ilclient_create_component(handle,
				    &component,
				    componentName,
				    ILCLIENT_DISABLE_ALL_PORTS
				    |
				    ILCLIENT_ENABLE_INPUT_BUFFERS
				    );
    if (err == -1) {
	fprintf(stderr, "Component create failed\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));

    err = ilclient_change_component_state(component,
					  OMX_StateIdle);
    if (err < 0) {
	fprintf(stderr, "Couldn't change state to Idle\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));


    // FFmpeg init
    av_register_all();
    if(avformat_open_input(&pFormatCtx, audio_file, NULL, NULL)!=0) {
	fprintf(stderr, "Can't get format\n");
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0)
	return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, audio_file, 0);
    
    audio_decode_example(audio_file);


    // must be before we enable buffers
    set_audio_render_input_format(component);

    setOutputDevice(ilclient_get_handle(component), "local");

    // input port
    ilclient_enable_port_buffers(component, 100, 
				 NULL, NULL, NULL);
    ilclient_enable_port(component, 100);




    err = ilclient_change_component_state(component,
					  OMX_StateExecuting);
    if (err < 0) {
	fprintf(stderr, "Couldn't change state to Executing\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));


    // now work through the file

    int len;
    FILE *f;
    uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];

    AVFrame *decoded_frame = NULL;


    f = fopen(audio_file, "rb");
    if (!f) {
        fprintf(stderr, "could not open %s\n", audio_file);
        exit(1);
    }

    /* decode until eof */
    avpkt.data = inbuf;
    avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);

    while (avpkt.size > 0) {
        int got_frame = 0;

        if (!decoded_frame) {
            if (!(decoded_frame = avcodec_alloc_frame())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
        } else
            avcodec_get_frame_defaults(decoded_frame);

        len = avcodec_decode_audio4(c, decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
            /* if a frame has been decoded, we want to send it to OpenMAX */
            int data_size = av_samples_get_buffer_size(NULL, c->channels,
                                                       decoded_frame->nb_samples,
                                                       c->sample_fmt, 1);
	    // first time: count the number of  planar streams
	    if (num_streams == 0) {
		while (num_streams < AV_NUM_DATA_POINTERS &&
		       decoded_frame->data[num_streams] != NULL) 
		    num_streams++; 
	    }

	    // first time: set sample_size from 0 to e.g. 2 for 16-bit data
	    if (sample_size == 0) {
		sample_size = 
		    data_size / (num_streams * decoded_frame->nb_samples);
	    }

	    // Empty into render_audio input buffers
	    read_into_buffer_and_empty(decoded_frame,
				       component,
				       data_size
				       );
	}

	avpkt.size -= len;
        avpkt.data += len;
        if (avpkt.size < AUDIO_REFILL_THRESH) {
            /* Refill the input buffer, to avoid trying to decode
             * incomplete frames. Instead of this, one could also use
             * a parser, or use a proper container format through
             * libavformat. */
            memmove(inbuf, avpkt.data, avpkt.size);
            avpkt.data = inbuf;
            len = fread(avpkt.data + avpkt.size, 1,
                        AUDIO_INBUF_SIZE - avpkt.size, f);
            if (len > 0)
                avpkt.size += len;
        }
    }

    printf("Finished decoding MP3\n");
    // clean up last empty buffer with EOS
    buff_header = 
	ilclient_get_input_buffer(component,
				  100,
				  1 /* block */);
    buff_header->nFilledLen = 0;
    int r;
    buff_header->nFlags |= OMX_BUFFERFLAG_EOS;
    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
			    buff_header);
    if (r != OMX_ErrorNone) {
	fprintf(stderr, "Empty buffer error %s\n",
		err2str(r));
    } else {
	printf("EOS sent\n");
    }

    fclose(f);

    avcodec_close(c);
    av_free(c);
    av_free(decoded_frame);

    sleep(10);
    exit(0);
}

      

Rendering MP3 with ID3 extensions using FFmpeg/LibAV and OpenMAX

MP3 was originally designed as a stereo format without metadata (artist, date of recording, and so on). Now we have 5.1 and 6.1 formats, with probably more coming; the "MP3 Surround" extension looks after these. Metadata is typically added as an ID3 extension. (I only discovered this after the above program broke badly on some newer MP3 files I have, failing with "No MP3 header".)
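That "No MP3 header" error is most likely what you get when a raw MP3 decoder is handed the ID3v2 tag bytes at the start of the file. An ID3v2 tag begins with the magic "ID3", two version bytes, a flags byte and a four-byte "syncsafe" size (7 significant bits per byte). The following sketch computes how many bytes to skip; the helper name is my own, and it ignores the optional ID3v2.4 footer:

```c
#include <stddef.h>

/* An ID3v2 tag starts with "ID3", two version bytes, one flags byte and
 * a 4-byte "syncsafe" size (7 bits per byte, top bit clear). Returns
 * the total tag size in bytes (10-byte header included), or 0 if no
 * tag is present. A raw MP3 decoder fed these bytes will choke. */
long id3v2_tag_size(const unsigned char *buf, size_t len)
{
    if (len < 10 || buf[0] != 'I' || buf[1] != 'D' || buf[2] != '3')
        return 0;
    long size = ((long)(buf[6] & 0x7f) << 21) |
                ((long)(buf[7] & 0x7f) << 14) |
                ((long)(buf[8] & 0x7f) << 7)  |
                 (long)(buf[9] & 0x7f);
    return 10 + size;
}
```

Using libavformat, as below, avoids the problem entirely: the demuxer skips the tag for us.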

The file command will identify such files:

	
$ file Beethoven.mp3 
Beethoven.mp3: Audio file with ID3 version 2.3.0
	
      

An MP3 file will usually have two channels for stereo. An MP3+ID3 file containing, say, an image will have two streams, one for the audio and one for the image. A call to av_dump_format will show all of the ID3 metadata, plus the image stream:

	
Input #0, mp3, from 'Beethoven.mp3':
  Metadata:
    copyright       : 2013 Naxos Digital Services Ltd.
    album           : String Quartets - BEETHOVEN L van HAYDN FJ MOZART WA SCHUBERT F JANACEK L (Petersen Quar~1
    TSRC            : US2TL0937001
    title           : String Quartet No 6 in B-Flat Major Op 18 No 6 I Allegro con brio
    TIT1            : String Quartets - BEETHOVEN L van HAYDN FJ MOZART WA SCHUBERT F JANACEK L (Petersen Quar~1
    disc            : 1
    TLEN            : 522200
    track           : 1
    publisher       : Capriccio-(C51147)
    encoder         : LAME 32bits version 3.98.2 (http://www.mp3dev.org/)
    album_artist    : Petersen Quartet
    artist          : Petersen Quartet
    TEXT            : Beethoven, Ludwig van
    TOFN            : 729325.mp3
    genre           : Classical Music
    composer        : Beethoven, Ludwig van
    date            : 2009
  Duration: 00:08:42.24, start: 0.000000, bitrate: 323 kb/s
    Stream #0.0: Audio: mp3, 44100 Hz, 2 channels, s16p, 320 kb/s
    Stream #0.1: Video: mjpeg, yuvj444p, 500x509 [PAR 300:300 DAR 500:509], 90k tbn
    Metadata:
      title           : 
      comment         : Cover (front)
	
      

Stream 0 is the audio stream while an image is in stream 1.

We want to pass the MP3 stream to the FFmpeg/LibAV audio decoder but not the image stream. Each AVPacket carries a field stream_index which can be used to distinguish between them: if a packet belongs to the audio stream, we pass it to the OpenMAX audio renderer; otherwise we skip it. We shall see similar behaviour when we look at rendering audio and video MPEG files. We find the audio stream index by asking av_find_best_stream for the AVMEDIA_TYPE_AUDIO stream.

In the previous sections, we read chunks in from the audio file and then relied on the decoder to break them into frames. That approach is now apparently on the way out: instead we should use the function av_read_frame, which reads only one frame's worth of data (a packet) at a time.

Unfortunately, on the RPi distro I was using, LibAV was at version 52, and that has a broken implementation of av_read_frame (it got the stream_index wrong). You will need to do a package upgrade of LibAV to version 53 in order for the following code to work.
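If you want to guard against this at run time, avformat_version() returns the libavformat version packed as (major << 16) | (minor << 8) | micro, the same encoding as the AV_VERSION_INT macro. A small decoding helper, independent of the libraries (the function names are mine):

```c
/* LibAV packs a library version as (major << 16) | (minor << 8) | micro,
 * the AV_VERSION_INT encoding; avformat_version() returns this value. */
unsigned version_major(unsigned packed) { return packed >> 16; }
unsigned version_minor(unsigned packed) { return (packed >> 8) & 0xff; }
unsigned version_micro(unsigned packed) { return packed & 0xff; }

/* Non-zero if the running library is at least the given major version */
int at_least_major(unsigned packed, unsigned major)
{
    return version_major(packed) >= major;
}
```

In the program below one could then bail out early with at_least_major(avformat_version(), 53) rather than failing obscurely later.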

So by this stage we read packets using av_read_frame and use each packet's stream_index to distinguish audio from everything else. We still get frames with the audio samples in the wrong format, such as AV_SAMPLE_FMT_S16P instead of AV_SAMPLE_FMT_S16. In the previous sections we re-formatted the stream by hand, but of course the formats might change again, and then our code would break.
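For the common case of 16-bit stereo, the by-hand re-formatting is just interleaving the two planes. A self-contained sketch (interleave_s16 is my own name; the layouts are those of AV_SAMPLE_FMT_S16P and AV_SAMPLE_FMT_S16):

```c
#include <stdint.h>
#include <stddef.h>

/* Interleave two planar 16-bit channels (AV_SAMPLE_FMT_S16P layout:
 * all left samples, then all right samples in a separate plane) into
 * the packed AV_SAMPLE_FMT_S16 layout L R L R ... that audio_render
 * expects. nb_samples is the sample count per channel. */
void interleave_s16(const int16_t *left, const int16_t *right,
                    int16_t *out, size_t nb_samples)
{
    size_t i;
    for (i = 0; i < nb_samples; i++) {
        out[2 * i]     = left[i];
        out[2 * i + 1] = right[i];
    }
}
```

This works, but a new sample format means new code, which is exactly why the resampler below is preferable.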

The Audio Resample package gives a general-purpose way of managing this. It requires more setup, but once that is done it will be more stable...

...except for a little glitch: this is one of the few areas in which FFmpeg and LibAV have different APIs. How to do it with FFmpeg is shown in "How to convert sample rate from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16?". We are using LibAV, which is basically the same but with differently named types. The following sets up the conversion parameters

	
    AVAudioResampleContext *swr = avresample_alloc_context();
    av_opt_set_int(swr, "in_channel_layout",  audio_dec_ctx->channel_layout, 0);
    av_opt_set_int(swr, "out_channel_layout", audio_dec_ctx->channel_layout,  0);
    av_opt_set_int(swr, "in_sample_rate",     audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "out_sample_rate",    audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "in_sample_fmt",  audio_dec_ctx->sample_fmt, 0);
    av_opt_set_int(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);
    avresample_open(swr);
	
      

and this performs the conversion:

	
    uint8_t *buffer;
    int out_linesize;
    av_samples_alloc(&buffer, &out_linesize, 2, decoded_frame->nb_samples,
		     AV_SAMPLE_FMT_S16, 0);
    avresample_convert(swr, &buffer, 
		       out_linesize, 
		       decoded_frame->nb_samples, 
		       decoded_frame->data, 
		       decoded_frame->linesize[0], 
		       decoded_frame->nb_samples);

	
      

The revised program is il_ffmpeg_render_resample_audio.c
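One detail of the program is easy to get wrong: read_into_buffer_and_empty has to feed the converted samples to audio_render in input-buffer-sized (here 4096-byte) pieces, with a short final chunk. The splitting arithmetic in isolation (split_into_buffers and emit are my names, not OpenMAX calls; emit may be NULL to just count chunks):

```c
#include <stddef.h>

#define OMX_AUDIO_BUF_SIZE 4096  /* size used for the input buffers here */

/* Split `len` bytes at `src` into chunks of at most OMX_AUDIO_BUF_SIZE
 * and hand each one to emit(). Returns the number of chunks produced.
 * If emit is NULL only the chunk count is computed. */
int split_into_buffers(const unsigned char *src, int len,
                       void (*emit)(const unsigned char *chunk, int chunk_len))
{
    int nchunks = 0;
    while (len > 0) {
        int chunk = len > OMX_AUDIO_BUF_SIZE ? OMX_AUDIO_BUF_SIZE : len;
        if (emit)
            emit(src, chunk);
        if (src)
            src += chunk;
        len -= chunk;
        nchunks++;
    }
    return nchunks;
}
```

In the real code, emit corresponds to the ilclient_get_input_buffer / memcpy / OMX_EmptyThisBuffer sequence.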

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>

#include <OMX_Core.h>
#include <OMX_Component.h>

#include <bcm_host.h>
#include <ilclient.h>


#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>
#include "libavutil/mathematics.h"
#include "libavutil/samplefmt.h"
#include "libavutil/opt.h"
#include "libavresample/avresample.h"

#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096

#define AUDIO  "BST.mp3"

AVCodecContext *audio_dec_ctx;
int audio_stream_idx;

/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
    OMX_ERRORTYPE err;
    OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;

    if (name && strlen(name) < sizeof(arDest.sName)) {
	memset(&arDest, 0, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
	arDest.nSize = sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE);
	arDest.nVersion.nVersion = OMX_VERSION;

	strcpy((char *)arDest.sName, name);
       
	err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
	if (err != OMX_ErrorNone) {
	    fprintf(stderr, "Error on setting audio destination\n");
	    exit(1);
	}
    }
}

void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;
 
    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;

    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
	   sPCMMode.nSamplingRate, 
	   sPCMMode.nChannels);

    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2; // assumed for now - should be checked

    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
	fprintf(stderr, "PCM mode unsupported\n");
	return;
    } else {
	fprintf(stderr, "PCM mode supported\n");
	fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
	fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    } 
}

void printState(OMX_HANDLETYPE handle) {
    OMX_STATETYPE state;
    OMX_ERRORTYPE err;

    err = OMX_GetState(handle, &state);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on getting state\n");
        exit(1);
    }
    switch (state) {
    case OMX_StateLoaded:           printf("StateLoaded\n"); break;
    case OMX_StateIdle:             printf("StateIdle\n"); break;
    case OMX_StateExecuting:        printf("StateExecuting\n"); break;
    case OMX_StatePause:            printf("StatePause\n"); break;
    case OMX_StateWaitForResources: printf("StateWait\n"); break;
    case OMX_StateInvalid:          printf("StateInvalid\n"); break;
    default:                        printf("State unknown\n"); break;
    }
}

char *err2str(int err) {
    switch (err) {
    case OMX_ErrorInsufficientResources: return "OMX_ErrorInsufficientResources";
    case OMX_ErrorUndefined: return "OMX_ErrorUndefined";
    case OMX_ErrorInvalidComponentName: return "OMX_ErrorInvalidComponentName";
    case OMX_ErrorComponentNotFound: return "OMX_ErrorComponentNotFound";
    case OMX_ErrorInvalidComponent: return "OMX_ErrorInvalidComponent";
    case OMX_ErrorBadParameter: return "OMX_ErrorBadParameter";
    case OMX_ErrorNotImplemented: return "OMX_ErrorNotImplemented";
    case OMX_ErrorUnderflow: return "OMX_ErrorUnderflow";
    case OMX_ErrorOverflow: return "OMX_ErrorOverflow";
    case OMX_ErrorHardware: return "OMX_ErrorHardware";
    case OMX_ErrorInvalidState: return "OMX_ErrorInvalidState";
    case OMX_ErrorStreamCorrupt: return "OMX_ErrorStreamCorrupt";
    case OMX_ErrorPortsNotCompatible: return "OMX_ErrorPortsNotCompatible";
    case OMX_ErrorResourcesLost: return "OMX_ErrorResourcesLost";
    case OMX_ErrorNoMore: return "OMX_ErrorNoMore";
    case OMX_ErrorVersionMismatch: return "OMX_ErrorVersionMismatch";
    case OMX_ErrorNotReady: return "OMX_ErrorNotReady";
    case OMX_ErrorTimeout: return "OMX_ErrorTimeout";
    case OMX_ErrorSameState: return "OMX_ErrorSameState";
    case OMX_ErrorResourcesPreempted: return "OMX_ErrorResourcesPreempted";
    case OMX_ErrorPortUnresponsiveDuringAllocation: return "OMX_ErrorPortUnresponsiveDuringAllocation";
    case OMX_ErrorPortUnresponsiveDuringDeallocation: return "OMX_ErrorPortUnresponsiveDuringDeallocation";
    case OMX_ErrorPortUnresponsiveDuringStop: return "OMX_ErrorPortUnresponsiveDuringStop";
    case OMX_ErrorIncorrectStateTransition: return "OMX_ErrorIncorrectStateTransition";
    case OMX_ErrorIncorrectStateOperation: return "OMX_ErrorIncorrectStateOperation";
    case OMX_ErrorUnsupportedSetting: return "OMX_ErrorUnsupportedSetting";
    case OMX_ErrorUnsupportedIndex: return "OMX_ErrorUnsupportedIndex";
    case OMX_ErrorBadPortIndex: return "OMX_ErrorBadPortIndex";
    case OMX_ErrorPortUnpopulated: return "OMX_ErrorPortUnpopulated";
    case OMX_ErrorComponentSuspended: return "OMX_ErrorComponentSuspended";
    case OMX_ErrorDynamicResourcesUnavailable: return "OMX_ErrorDynamicResourcesUnavailable";
    case OMX_ErrorMbErrorsInFrame: return "OMX_ErrorMbErrorsInFrame";
    case OMX_ErrorFormatNotDetected: return "OMX_ErrorFormatNotDetected";
    case OMX_ErrorContentPipeOpenFailed: return "OMX_ErrorContentPipeOpenFailed";
    case OMX_ErrorContentPipeCreationFailed: return "OMX_ErrorContentPipeCreationFailed";
    case OMX_ErrorSeperateTablesUsed: return "OMX_ErrorSeperateTablesUsed";
    case OMX_ErrorTunnelingUnsupported: return "OMX_ErrorTunnelingUnsupported";
    default: return "unknown error";
    }
}

void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}

void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "OMX error %s\n", err2str(data));
}

int get_file_size(char *fname) {
    struct stat st;

    if (stat(fname, &st) == -1) {
	perror("Stat'ing img file");
	return -1;
    }
    return(st.st_size);
}

AVPacket avpkt;

static void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;

    audioPortFormat.nPortIndex = 100;


    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingMP3;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);

    setPCMMode(ilclient_get_handle(component), 100);

}

int num_streams = 0;
int sample_size = 0;

OMX_ERRORTYPE read_into_buffer_and_empty(AVFrame *decoded_frame,
					 COMPONENT_T *component,
					 int total_len) {
    OMX_ERRORTYPE r = OMX_ErrorNone;
    OMX_BUFFERHEADERTYPE *buff_header = NULL;

    // for efficiency this context should really be set up once only,
    // not rebuilt on every frame
    AVAudioResampleContext *swr = avresample_alloc_context();
    av_opt_set_int(swr, "in_channel_layout",  audio_dec_ctx->channel_layout, 0);
    av_opt_set_int(swr, "out_channel_layout", audio_dec_ctx->channel_layout,  0);
    av_opt_set_int(swr, "in_sample_rate",     audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "out_sample_rate",    audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "in_sample_fmt",  audio_dec_ctx->sample_fmt, 0);
    av_opt_set_int(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);
    avresample_open(swr);

    int out_linesize;
    int required_decoded_size = 
	av_samples_get_buffer_size(&out_linesize, 2, 
				   decoded_frame->nb_samples,
				   AV_SAMPLE_FMT_S16, 0);
    uint8_t *buffer;
    av_samples_alloc(&buffer, &out_linesize, 2, decoded_frame->nb_samples,
		     AV_SAMPLE_FMT_S16, 0);
    avresample_convert(swr, &buffer, 
		       out_linesize, 
		       decoded_frame->nb_samples, 
		       decoded_frame->data, 
		       decoded_frame->linesize[0], 
		       decoded_frame->nb_samples);

    // copy the converted samples into the renderer's input buffers,
    // at most 4096 bytes at a time
    uint8_t *p = buffer;
    while (required_decoded_size > 0) {
	int chunk = required_decoded_size > 4096 ? 4096 : required_decoded_size;

	buff_header = 
	    ilclient_get_input_buffer(component,
				      100,
				      1 /* block */);
	memcpy(buff_header->pBuffer, p, chunk);
	buff_header->nFilledLen = chunk;
	p += chunk;
	required_decoded_size -= chunk;

	r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
				buff_header);
	if (r != OMX_ErrorNone) {
	    fprintf(stderr, "Empty buffer error %s\n",
		    err2str(r));
	    break;
	}
    }

    av_freep(&buffer);
    avresample_close(swr);
    avresample_free(&swr);
    return r;
}

FILE *favpkt = NULL;

int main(int argc, char** argv) {

    int i;
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;

    AVFormatContext *pFormatCtx = NULL;

    char *audio_file = AUDIO;
    if (argc == 2) {
	audio_file = argv[1];
    }

    OMX_BUFFERHEADERTYPE *buff_header;

    componentName = "audio_render";

    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
	fprintf(stderr, "IL client init failed\n");
	exit(1);
    }

    if (OMX_Init() != OMX_ErrorNone) {
        ilclient_destroy(handle);
        fprintf(stderr, "OMX init failed\n");
	exit(1);
    }

    ilclient_set_error_callback(handle,
				error_callback,
				NULL);
    ilclient_set_eos_callback(handle,
			      eos_callback,
			      NULL);


    err = ilclient_create_component(handle,
				    &component,
				    componentName,
				    ILCLIENT_DISABLE_ALL_PORTS
				    |
				    ILCLIENT_ENABLE_INPUT_BUFFERS
				    );
    if (err == -1) {
	fprintf(stderr, "Component create failed\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));

    err = ilclient_change_component_state(component,
					  OMX_StateIdle);
    if (err < 0) {
	fprintf(stderr, "Couldn't change state to Idle\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));


    // FFmpeg init
    av_register_all();
    if(avformat_open_input(&pFormatCtx, audio_file, NULL, NULL)!=0) {
	fprintf(stderr, "Can't get format\n");
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0)
	return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, audio_file, 0);

    int ret;
    if ((ret = av_find_best_stream(pFormatCtx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0)) >= 0) {
	//AVCodecContext* codec_context;
	AVStream *audio_stream;
	int sample_rate;

	audio_stream_idx = ret;
	fprintf(stderr, "Audio stream index is %d\n", ret);

	audio_stream = pFormatCtx->streams[audio_stream_idx];
	audio_dec_ctx = audio_stream->codec;
	//c = audio_dec_ctx;

#if 0
	if (audio_stream->codec->extradata_size > 0) {
	    fprintf(stderr, "Non zero extra data!!\n");
	} else {
	     fprintf(stderr, "Zero extra data!!\n");
	}
#endif

	sample_rate = audio_dec_ctx->sample_rate;
	printf("Sample rate is %d\n", sample_rate);
	printf("Sample format is %d\n", audio_dec_ctx->sample_fmt);
	printf("Num channels %d\n", audio_dec_ctx->channels);

	if (audio_dec_ctx->channel_layout == 0) {
	    audio_dec_ctx->channel_layout = 
		av_get_default_channel_layout(audio_dec_ctx->channels);
	}

	AVCodec *codec = avcodec_find_decoder(audio_stream->codec->codec_id);
	if (avcodec_open2(audio_dec_ctx, codec, NULL) < 0) {
	    fprintf(stderr, "could not open codec\n");
	    exit(1);
	}

	if (codec) {
	    printf("Codec name %s\n", codec->name);
	}
    }

    av_init_packet(&avpkt);

    // must be before we enable buffers
    set_audio_render_input_format(component);

    setOutputDevice(ilclient_get_handle(component), "local");

    // input port
    ilclient_enable_port_buffers(component, 100, 
				 NULL, NULL, NULL);
    ilclient_enable_port(component, 100);

    err = ilclient_change_component_state(component,
					  OMX_StateExecuting);
    if (err < 0) {
	fprintf(stderr, "Couldn't change state to Executing\n");
	exit(1);
    }
    printState(ilclient_get_handle(component));


    // now work through the file

    int len;
    AVFrame *decoded_frame = NULL;

    /* decode until end of file: av_read_frame returns negative on EOF or error */
    while (av_read_frame(pFormatCtx, &avpkt) >= 0) {
	printf("Packet size %d\n", avpkt.size);
	printf("Stream idx is %d\n", avpkt.stream_index);
	printf("Codec type %d\n",
	       pFormatCtx->streams[avpkt.stream_index]->codec->codec_type);

	if (avpkt.stream_index != audio_stream_idx) {
	    // it's an image, subtitle, etc - skip it
	    av_free_packet(&avpkt);
	    continue;
	}

        int got_frame = 0;

	if (favpkt == NULL) {
	    favpkt = fopen("tmp.mp3", "wb");
	}
	fwrite(avpkt.data, 1, avpkt.size, favpkt);

        if (!decoded_frame) {
            if (!(decoded_frame = avcodec_alloc_frame())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
	}

        len = avcodec_decode_audio4(audio_dec_ctx, 
				    decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
            /* a frame has been decoded; send it to OpenMAX */
            int data_size =
		av_samples_get_buffer_size(NULL, audio_dec_ctx->channels,
					   decoded_frame->nb_samples,
					   audio_dec_ctx->sample_fmt, 1);

	    // Empty into audio_render input buffers
	    read_into_buffer_and_empty(decoded_frame,
				       component,
				       data_size
				       );
	}
	av_free_packet(&avpkt);
    }

    printf("Finished decoding MP3\n");
    // clean up last empty buffer with EOS
    buff_header = 
	ilclient_get_input_buffer(component,
				  100,
				  1 /* block */);
    buff_header->nFilledLen = 0;
    int r;
    buff_header->nFlags |= OMX_BUFFERFLAG_EOS;
    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
			    buff_header);
    if (r != OMX_ErrorNone) {
	fprintf(stderr, "Empty buffer error %s\n",
		err2str(r));
    } else {
	printf("EOS sent\n");
    }

    avcodec_close(audio_dec_ctx);
    // audio_dec_ctx is owned by pFormatCtx, so it is not freed separately
    av_free(decoded_frame);

    sleep(10);
    exit(0);
}

      

Conclusion

Audio support is not so good on the RPi: there is no satisfactory OpenMAX decode component, so we have to use tools such as FFmpeg/LibAV or GStreamer for decoding.


      

Copyright © Jan Newmarch, jan@newmarch.name
" Programming AudioVideo on the Raspberry Pi GPU " by Jan Newmarch is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License .
Based on a work at https://jan.newmarch.name/RPi/ .
