Maverick's diary: 09/01/2017

Monday, September 25, 2017

New answer on Quora: What is the need for speech signal processing?

A signal carries information, and speech is an especially rich one.

A spoken sentence tells you much more than you can infer from reading the same words as text.

Attributes of speech are:

Core information: What the speaker intends to convey (message).
Gender: Female voices typically have a higher pitch (fundamental frequency) than male voices.
Age: The voice deepens and becomes crackly with age.
Timbre: Everyone has a unique voice. It is almost as unique as a fingerprint.
Emotion: Angst, laughter, weeping, crying, etc.
Intensity: Whisper, talk, shout, scream.

The attributes above can now be extracted and exploited by a computing device using signal processing methods. A few interesting applications are:

Speech recognition (speech to text): To identify what the speaker has said by essentially converting it into text for further processing/storage. Examples are Siri and Google Assistant.
Speaker recognition (voice biometric): To establish the identity of the speaker, and maybe use it to unlock a phone or start a car. This is different from speech recognition, and the two can be combined to unlock a device by saying a ‘pass phrase’.
Speech coding: To efficiently store and transmit speech in digital form over a channel (internet calls, mobile network, telephone cables, satellite link) using the least bandwidth and with as few errors as possible.
Speech synthesis (text to speech): To artificially produce speech using systems which mimic the entire mechanism, from human vocal cord vibration and air flow out of the trachea to the filtering effects caused by the oral and nasal cavities. Examples are Google Assistant and Microsoft Sam.
Voice analysis: To medically diagnose the human vocal system from voice samples of a patient.
Speech enhancement: To improve the quality of speech affected by noise in applications like teleconferencing, VoIP, mobile calls and hearing aids.
Voice morphing: To impersonate another individual’s voice using words spoken by you. Voice mimicry is one form of this done by humans. Now we are training computers to do the same. We may one day reach the perfection of regenerating Michael Jackson’s voice and songs while the lyrics are written and sung by someone else. An example from fiction is the voice ‘sticker’ used by Tom Cruise in the Mission Impossible movie series.
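
To make the idea of "extracting" speech attributes a little more concrete, here is a minimal sketch in Python of how two of them, pitch and intensity, might be estimated from one short frame of audio. It is only an illustration under stated assumptions: the function names (estimate_pitch, rms_intensity), the 8 kHz sample rate, the 60 to 400 Hz search range and the synthetic sine wave standing in for a voiced sound are all my own choices, not something taken from a real speech system. Pitch is found by picking the peak of the short-time autocorrelation, which is one classic textbook method among many.

```python
import math


def rms_intensity(frame):
    """Root-mean-square amplitude of a frame: a simple proxy for loudness."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of a frame.

    Searches lags corresponding to fmin..fmax (assumed search range) and
    returns the frequency whose lag maximises the autocorrelation.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic stand-in for a voiced sound: a 100 Hz sine, 50 ms long (assumption).
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(400)]

pitch_hz = estimate_pitch(frame, SAMPLE_RATE)
level = rms_intensity(frame)
print(f"estimated pitch: {pitch_hz:.1f} Hz, RMS level: {level:.3f}")
```

Real speech is far messier than a clean sine wave, so practical systems add windowing, voiced/unvoiced detection and smoothing across frames on top of something like this.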

Signal processing of speech has come a long way, from the invention of the telephone to voice calls on WhatsApp (which are surprisingly clearer than calls over the mobile network).

Tuesday, September 19, 2017

GATE meme #1

Hi!

As a part of my preparation, I have decided to post memes that are tuned to the minds of GATE aspirants. Folks taking similar competitive exams (largely in India) may relate to them too. If you want to share these memes, you can share the blog's link.

Apply for the exam (GATE 2018) on or before 5th October here.

Here is the first meme.

Goutham

Thursday, September 7, 2017

The Breakup

I got a call today in the afternoon.

An unfamiliar female voice asked "Am I speaking to Goutham Sharma?".

I replied "Yes".

She said "Hello Goutham, this is HR department of Cognizant".

It took me a second to comprehend what was going on. Then my reply was "Oh!... Hi!".

"Hi Mr. Goutham! So... are you declining your job offer?" I was asked.

"Yes" was my reply without any second thought (She must be heartbroken hearing that).

She continued "Are you sure?".

"Yes. I am sure."

"Thank you Goutham. What can I write as the reason for you declining?"

"Opted for higher studies."

"Thank you very much", she said in a dispassionate tone.

"Thank you."

THE END
