Discussion Board

  1. #1
    Registered User
    Join Date
    Jul 2008
    Posts
    4

    Detecting spoken words within an AMR compressed audio file

    Hi,

    I have an application which records audio from the phone's mic and uploads it to a server. However, to reduce data charges, I only want to upload audio if it contains speech (basically any audio which contains more than plain background noise).

    Is there an easy way to do this?

    I presume you could analyse the AMR file to determine the "frequency" of recorded segments and so detect greater-than-normal noise... However, if this is a solution, I don't have a clue how I would go about coding it.

    Any help is greatly appreciated!

  2. #2
    Registered User
    Join Date
    Jul 2008
    Posts
    4

    Re: Detecting spoken words within an AMR compressed audio file

    I have looked into the structure of an AMR file here: http://wiki.forum.nokia.com/index.php/AMR_format

    I understand that each frame of audio is 20 ms in length. Is there any way to determine the frequency (loudness) of each frame?
    Basically, if I can determine that some frames include audio louder than normal background noise, then there may be speech present.

    Can anyone explain how I may do this?
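
    For anyone reading along, a minimal sketch of walking the 20 ms frames might look like the code below. It assumes the single-channel AMR-NB storage format ("#!AMR\n" magic, then frames that each start with a one-byte header whose bits 3..6 hold the frame type). The class and method names are made up for illustration, and it uses Java SE collections, so it would need adapting for CLDC. Note that the frame header does not carry a loudness value; getting one would mean decoding the frame to PCM first.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch (hypothetical helper): walks the frames of a single-channel AMR-NB file. */
final class AmrFrameWalker {
    // File magic for single-channel AMR-NB: "#!AMR\n"
    private static final byte[] MAGIC = {'#', '!', 'A', 'M', 'R', '\n'};

    // Payload bytes per frame type 0..8, excluding the 1-byte frame header.
    // Types 0..7 are the speech modes (4.75 to 12.2 kbit/s), type 8 is SID.
    private static final int[] PAYLOAD_BYTES = {12, 13, 15, 17, 19, 20, 26, 31, 5};

    static final int FRAME_MILLIS = 20;

    /** Returns the payload size for a frame type, or -1 if the type is not expected. */
    static int payloadBytes(int frameType) {
        if (frameType >= 0 && frameType < PAYLOAD_BYTES.length) {
            return PAYLOAD_BYTES[frameType];
        }
        if (frameType == 15) {
            return 0; // NO_DATA: header byte only
        }
        return -1;
    }

    /** Returns the frame type (0..15) of every complete frame, in file order. */
    static int[] frameTypes(byte[] file) {
        if (file.length < MAGIC.length) {
            throw new IllegalArgumentException("too short for an AMR-NB file");
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (file[i] != MAGIC[i]) {
                throw new IllegalArgumentException("missing #!AMR magic");
            }
        }
        List<Integer> types = new ArrayList<>();
        int pos = MAGIC.length;
        while (pos < file.length) {
            int frameType = (file[pos] >> 3) & 0x0F; // bits 3..6 of the header byte
            int payload = payloadBytes(frameType);
            if (payload < 0 || pos + 1 + payload > file.length) {
                break; // unknown type or truncated tail: stop walking
            }
            types.add(frameType);
            pos += 1 + payload;
        }
        int[] result = new int[types.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = types.get(i);
        }
        return result;
    }
}
```

    The clip duration in milliseconds is then the number of frames times FRAME_MILLIS.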

  3. #3
    Registered User
    Join Date
    Jul 2008
    Posts
    4

    Re: Detecting spoken words within an AMR compressed audio file

    Okay, still no replies - but I have found out something which may help me achieve what I want.

    Apparently...
    Discontinuous transmission (DTX) is applied to turn off the transmission in AMR during periods of silence.

    Is this something you can achieve in J2ME or is it a feature reserved for Symbian devices?
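
    If the recorder's encoder does run with DTX enabled (an assumption that would have to be checked per device; nothing in this thread confirms it), silent stretches should show up in the file as SID (type 8) or NO_DATA (type 15) frames rather than speech frames (types 0..7). A rough sketch of using that, taking a list of frame types already extracted from the file, with hypothetical names and a made-up upload policy:

```java
/** Sketch (hypothetical helper): how much of a clip the encoder itself marked as speech. */
final class DtxSilenceCheck {
    static final int FT_SID = 8;       // silence descriptor (comfort noise) frame
    static final int FT_NO_DATA = 15;  // nothing coded for this 20 ms slot

    /** Percentage (0..100) of frames that carry coded speech (frame types 0..7). */
    static int speechPercent(int[] frameTypes) {
        if (frameTypes.length == 0) {
            return 0;
        }
        int speech = 0;
        for (int ft : frameTypes) {
            if (ft >= 0 && ft < FT_SID) {
                speech++;
            }
        }
        return (speech * 100) / frameTypes.length;
    }

    /** Illustrative policy only: upload when enough frames are speech frames. */
    static boolean worthUploading(int[] frameTypes, int minSpeechPercent) {
        return speechPercent(frameTypes) >= minSpeechPercent;
    }
}
```

    If the encoder never emits SID or NO_DATA frames, every clip will report 100% and this check tells you nothing.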

    * Hoping for a reply *

  4. #4
    Super Contributor
    Join Date
    Mar 2003
    Location
    Finland
    Posts
    9,553

    Re: Detecting spoken words within an AMR compressed audio file

    I suppose that'd be done when you use a voice codec and a voice call. I don't know how useful that'd be for an AMR file you've recorded yourself, and are trying to upload over a packet data connection.

  5. #5
    Registered User
    Join Date
    Jul 2008
    Posts
    4

    Re: Detecting spoken words within an AMR compressed audio file

    Hi, thanks for your reply!

    Yes, I am trying to implement a VoIP application using J2ME. It is proving problematic, as I guess J2ME is not the ideal platform for VoIP, but I'm giving it my best attempt!

    I originally used PCM encoding to record from the mic and then compressed the captured audio to determine whether speech was present. (1 sec of background noise was < 1 KB, whereas audio with speech was always > 1 KB.) However, AMR encoding always creates a smaller file than PCM, so I wanted to use AMR.
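
    The compressed-size trick described above can be sketched roughly as follows. The post does not say which compressor was used, so DEFLATE via java.util.zip is an assumption here (and that package is Java SE, not part of CLDC/MIDP). The class name is hypothetical, and the threshold is a parameter; the 1 KB figure is just the measurement quoted above.

```java
import java.util.zip.Deflater;

/** Sketch (hypothetical helper): uses compressed size as a crude "is there content" probe. */
final class CompressedSizeProbe {
    /** Returns the DEFLATE-compressed size of the buffer, in bytes. */
    static int compressedSize(byte[] pcm) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(pcm);
        deflater.finish();
        byte[] chunk = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(chunk); // only the byte count matters here
        }
        deflater.end();
        return total;
    }

    /** Near-silent PCM is repetitive and compresses well; busier audio does not. */
    static boolean looksLikeSpeech(byte[] pcm, int thresholdBytes) {
        return compressedSize(pcm) > thresholdBytes;
    }
}
```

    This only separates "repetitive" from "not repetitive" data, so loud non-speech noise would pass as well.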

    My problem, however, is that 1 sec of AMR audio is about 3 KB, and I cannot tell whether that second is background noise or contains speech. I need a way to determine if the audio contains speech...

    Any ideas?

    Oh, and I am using a simple HttpConnection to send the audio, held in a byte array, between the two mobiles, not RTP etc. (I want to keep the application compatible with as many devices as possible.)

  6. #6
    Registered User
    Join Date
    Sep 2007
    Posts
    20

    Re: Detecting spoken words within an AMR compressed audio file

    Well, you can try to implement an audio gate. Whenever the audio level is lower than a specified threshold, the gate turns the level down to 0. The gate uses an envelope to filter the incoming signal and then checks it against a threshold in dB.
    You would need to access the raw audio data to do that. I don't think J2ME allows you to do that and if you really want to do it you would have to parse the file yourself after you get it as a stream.
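
    To make the gate idea concrete, here is a minimal sketch that works on raw 16-bit PCM samples (so it assumes you already have decoded audio, not AMR bytes). The names, the instant-attack envelope, and the release factor are illustrative choices, not a standard design. It also uses Math.pow, which plain Java has but CLDC may not, so on a handset the linear threshold would need to be precomputed.

```java
/** Sketch (hypothetical helper): a very simple noise gate over 16-bit PCM samples. */
final class NoiseGate {
    /**
     * Zeroes samples in place while the smoothed level is below thresholdDb
     * (dB relative to full scale, e.g. -40). The release factor (0..1) controls
     * how slowly the envelope decays after a peak.
     * Returns the number of samples that passed the gate.
     */
    static int apply(short[] samples, double thresholdDb, double release) {
        // Convert the dB threshold to a linear amplitude on the 16-bit scale.
        double threshold = 32768.0 * Math.pow(10.0, thresholdDb / 20.0);
        double envelope = 0.0;
        int open = 0;
        for (int i = 0; i < samples.length; i++) {
            double level = Math.abs((double) samples[i]);
            // Envelope follower: rise immediately, fall gradually.
            envelope = (level > envelope) ? level : envelope * release;
            if (envelope < threshold) {
                samples[i] = 0; // gate closed
            } else {
                open++;         // gate open, sample left untouched
            }
        }
        return open;
    }
}
```

    For the upload decision, the returned count matters more than the gated audio: if almost no samples pass, the clip is probably just background noise.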

    As far as frequency is concerned, figuring out the frequency of the signal would not give you the loudness... It would give you the power of the signal at a given frequency. To get the frequency content of any given signal you would need to calculate its Fourier transform. If you need to do that for small windows of 20 ms, as you said, that would require even more calculations and the short-time Fourier transform.
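
    As a rough illustration of "power at a given frequency" per window, the sketch below evaluates a single DFT bin directly over consecutive windows of PCM samples. It assumes decoded 16-bit PCM; at AMR-NB's 8 kHz sample rate a 20 ms window is 160 samples. The class and method names are made up, and a full short-time Fourier transform would compute many bins per window, usually with an FFT and a window function.

```java
/** Sketch (hypothetical helper): signal power at one frequency, window by window. */
final class TonePower {
    /** Power of samples[start .. start+length) at freqHz, from one directly computed DFT bin. */
    static double powerAt(short[] samples, int start, int length,
                          double freqHz, double sampleRate) {
        double re = 0.0;
        double im = 0.0;
        for (int n = 0; n < length; n++) {
            double angle = 2.0 * Math.PI * freqHz * n / sampleRate;
            re += samples[start + n] * Math.cos(angle);
            im -= samples[start + n] * Math.sin(angle);
        }
        // Squared magnitude, normalised by the window length.
        return (re * re + im * im) / ((double) length * length);
    }

    /** Short-time version: one power value per consecutive, non-overlapping window. */
    static double[] powerPerWindow(short[] samples, int windowLength,
                                   double freqHz, double sampleRate) {
        int windows = samples.length / windowLength;
        double[] out = new double[windows];
        for (int w = 0; w < windows; w++) {
            out[w] = powerAt(samples, w * windowLength, windowLength, freqHz, sampleRate);
        }
        return out;
    }
}
```

    Each bin costs one sine and one cosine per sample, which hints at why doing this for many frequencies on every 20 ms window is heavy for a phone.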

    In my opinion, that's too much for J2ME but do a google search and see what you can find.

    Hope that answers some of your questions

  7. #7
    Registered User
    Join Date
    May 2009
    Posts
    2

    Re: Detecting spoken words within an AMR compressed audio file

    I'm actually trying to do the exact same thing, at least the part about detecting voice in an AMR-encoded recording from within the phone. I was wondering if you were able to do this and, if so, how.

    Thanks!
