Audio file to Word document software?

Dahu371

Free Member
Apr 21, 2009
144
6
Are there any software products good enough yet to transcribe an audio file to text usefully?

We've been doing a lot of interviews of late, recording them and then transcribing them to Word. It would save us a hell of a lot of effort if there's something out there that could help us. I appreciate it's not going to be 100% accurate given the range of mumblings and accents we have, but if we could get even 50-60% accuracy this might be a start.

Or would something like a PPH contractor be useful? Any thoughts?
 

Paul_Rosser

Free Member
Jul 5, 2012
4,567
1,107
London and Essex
I don't know of any software that will be even 50% accurate, given the variety of people and accents talking in an average meeting.

Even personal dictation software like Dragon needs to learn how you speak, and even then it's not 100%.

There are lots of agencies you can send your audio file to; they transcribe it into Word and send it back to you. Normally these aren't that expensive and may be cheaper than using internal resource.
 
Upvote 0

Dahu371

Free Member
Apr 21, 2009
144
6
Thanks. Looks like the machines aren't there yet, then, and the human touch is needed. The links in Tim's response above all cost around $1 per minute of audio, which works out at around £40 per hour. That still seems on the high side to me, so I think we'll keep this in-house for the time being.
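
For anyone wanting to check that arithmetic, here is a rough sketch. The exchange rate of 1.5 dollars to the pound is purely an assumption chosen because it reproduces the £40 figure; the function name is made up for illustration.

```python
def cost_per_audio_hour_gbp(usd_per_minute: float, usd_per_gbp: float) -> float:
    """Convert a per-minute price in dollars into a per-hour price in pounds."""
    usd_per_hour = usd_per_minute * 60  # 60 minutes of audio in an hour
    return usd_per_hour / usd_per_gbp


# Assumed exchange rate (not from the thread): 1.5 USD per GBP.
print(cost_per_audio_hour_gbp(1.0, 1.5))
```

Swap in the current exchange rate and the agency's actual per-minute price to get a figure for your own case.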
 
Upvote 0
We have been looking at this issue for ages, but the last time we did (about a year ago) the results for clear dictation were still laughable and painfully slow. I type at 60wpm and at least what I type is what was said. As Paul said, get a decent typist for a few quid an hour, give him/her a good system with a footswitch, and the thing is done quickly and relatively cheaply.

The problem is the same one as with translation SW. Digital systems just do not understand human thought and therefore do not see much wrong with a sentence like "It is a not raining here also."

The famous sentence "Eats, shoots and leaves." and/or "Eats shoots and leaves." (Gangster/Panda) highlights a difference that no computer SW will understand any time soon!
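
A small sketch of why that is hard for a transcription system: once the punctuation is gone (and nobody speaks a comma), the two sentences are the same sequence of words. The function name `spoken_form` is hypothetical, and stripping punctuation is only a crude stand-in for what audio actually carries.

```python
import string


def spoken_form(text: str) -> list[str]:
    """Approximate what is heard: lower-case words with all punctuation removed."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return no_punct.lower().split()


gangster = "Eats, shoots and leaves."
panda = "Eats shoots and leaves."

# Both sentences collapse to the same word sequence, so the difference
# has to be recovered from context rather than from the words themselves.
print(spoken_form(gangster) == spoken_form(panda))
```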
 
Upvote 0
I've been privy to some very clever people trying to solve just this problem using software and even some specialist hardware. It turns out it's very tricky!

There are lots of examples like the 'panda' one above, which is the title of a book for those interested in such things. The problem comes in the range of people's voices, the ways in which things are said/meant and some grammatical stuff I didn't even understand, let alone know how to help them solve (not that this was my intended role, I'll add here).

I think, for now, the transcription software like Dragon is about as good as you get, but I've never gotten on with it and the software won't work at all on my iPad with nobody able to work out why...

What you'd ideally find is a transcriber you get along with, who does good work and charges an hourly rate. Build up the relationship by recommending them to others so that they can boost their business and, all being well, they'll look after you and keep the prices sharp until the software gets there.
 
Upvote 0
until the software gets there.

Until digital systems are completely redesigned (and probably will have to cease to be purely digital) this will never happen.

I have been banging my head against the translation problem, and the only way that can work properly is if you enter into the system huge amounts of ready-made phrases and then the SW can run through millions of if-then routines. The moment the SW confronts a new subject (e.g. a sexual relationship, instead of quality control management!) you have to have a whole series of new phrases and complete sentences.
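
To make the phrase-table idea concrete, here is a deliberately tiny sketch. The table contents and the `translate` function are invented for illustration and are not taken from any real translation product.

```python
from typing import Optional

# Hypothetical ready-made phrases (English to German), entered by hand.
PHRASE_TABLE = {
    "it is raining here": "es regnet hier",
    "good morning": "guten Morgen",
}


def translate(sentence: str) -> Optional[str]:
    """Return the stored translation, or None if the phrase was never entered."""
    key = sentence.lower().strip(" .!?")
    return PHRASE_TABLE.get(key)


print(translate("It is raining here."))  # a phrase that is in the table
print(translate("It is snowing here."))  # a new subject: nothing to look up
```

The second call returns nothing because the lookup has no notion of meaning; every new subject needs its own phrases added by hand.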

It is a fascinating subject, but the core problem is that we understand what the speaker is trying to say and we understand what they are thinking. No digital system is self-aware, so none can imagine what it is like to be the speaker and, for that reason, none will ever be able to truly understand what he/she is trying to say.
 
Upvote 0
I'd be wary of saying never, since technology, especially in the last 10 years, has come on in such leaps and bounds that it wouldn't entirely surprise me if someone told me this had already been solved and we just didn't know about it yet.

That said, yes, it's a corker of a problem for the reasons you say and many, many more on top. In my eyes it isn't important enough a problem for the focus I'd imagine it'd take to solve, but I am sure others would disagree.
 
Upvote 0

Dahu371

Free Member
Apr 21, 2009
144
6
That said, yes, it's a corker of a problem for the reasons you say and many, many more on top. In my eyes it isn't important enough a problem for the focus I'd imagine it'd take to solve, but I am sure others would disagree.

I'd disagree! As a wise man once said -

"The best minds of my generation are thinking about how to make people click ads. That sucks"
 
Upvote 0
That was Jeff Hammerbacher.

The task we have is to get machines (in the broadest sense of the word) to understand content and, for example, realise how to write "I have read Red the red book."

No amount of IF-THEN commands will get us to that level of understanding. Only an actual awareness of the meanings of long combinations of words will achieve that, or the translation of that into German or French.
 
Upvote 0
I'd disagree! As a wise man once said -

"The best minds of my generation are thinking about how to make people click ads. That sucks"

So you can't think of anything more important that people could be working on? Are you working on the solution? Your quote is regarding advertising, an industry worth trillions. The aim of advertising is for people to make more money, and people like money, which makes it important. Rightly or wrongly, it is important; we have to have it within our system whether we like that system or not.

Your example is a poorly chosen one to support your thoughts because of that necessity for money, which grows into ambition for more or, in some cases, greed for more money. As a result it is obvious that something that is central to making money in such a big way is going to be important to enough people to make it important, and so people will spend time doing it.

What I questioned was the importance of being able to transcribe audio into text, which nobody in their right mind could argue was as big an industry or as important to a capitalist world as effective advertising.

I could've provided you a better example of your side of the fence, and I'm not the one that's meant to in this discussion! Even stating 'what is important to one person might not be important to another' would've been a better response, to be fair!
 
Upvote 0
Tried to put quote in about IF THEN but couldn't make it work sorry...

What about if you were to program a contextual understanding into a form of artificial intelligence?

If you could teach it the rules then surely you could eradicate some of the problems (don't get me wrong, I am not stating I have the answer; it is a genuine question, and so I remain unconvinced at this point that this would be impossible).
 
Upvote 0
It absolutely and definitely is not impossible - it's just that we don't know how to do it!

AI is still a long way off and taking words and turning them into contextual understanding is what it would take to 'translate' speech into either written text or another language, let alone what we mammals do, turn information into understanding and understanding into action. Even my dogs see food and then act upon that by getting out their racing spoons and tucking in - as indeed do I.

We can simulate AI by even more IF THEN commands - even to the point where Deep Blue had so many variations available to it, in the game of chess, that it beat the reigning world champion. But that ain't understanding.

There are several things we have to do BEFORE we can achieve AI -

1. We need to understand how we think. We still do not really understand the thought and awareness processes and, until we do, we will never be able to truly simulate them.

2. Because thought and understanding are almost certainly not a one-dimensional, one-question, one-answer series of linear commands, we need to redesign the computing process. We need to introduce totally new ways of looking at information such as images, sound and text and find a way for the 'machine' to translate these into concepts. This almost certainly involves getting away from purely digital systems and using such off-the-wall things as analogue driven fractals.

3. The 'machine' needs totally new ways to exchange information. The number and sheer subtlety of the many ways a dog or a cat interacts with its surroundings is beyond anything we could simulate.
 
Upvote 0

Dahu371

Free Member
Apr 21, 2009
144
6
So you can't think of anything more important that people could be working on? Are you working on the solution? Your quote is regarding advertising, an industry worth trillions. The aim of advertising is for people to make more money, and people like money, which makes it important. Rightly or wrongly, it is important; we have to have it within our system whether we like that system or not.

Your example is a poorly chosen one to support your thoughts because of that necessity for money, which grows into ambition for more or, in some cases, greed for more money. As a result it is obvious that something that is central to making money in such a big way is going to be important to enough people to make it important, and so people will spend time doing it.

What I questioned was the importance of being able to transcribe audio into text, which nobody in their right mind could argue was as big an industry or as important to a capitalist world as effective advertising.

I could've provided you a better example of your side of the fence, and I'm not the one that's meant to in this discussion! Even stating 'what is important to one person might not be important to another' would've been a better response, to be fair!

I just thought it was an interesting and thought-provoking quote. I don't doubt that advertising is important in a capitalist society, but not everyone thinks that the pursuit of making money is the most important thing. This is probably the wrong forum to espouse that view though ;)
 
Upvote 0
If you don't think making money is the most important thing, @Dahu371, I agree with you wholeheartedly; it isn't to me either. The thing about money is that when that's the goal, you'll always want to make a little more!

I think I just took your comment too literally rather than accepting the reasoning behind it.
 
Upvote 0

antropy

Business Member
  • Business Listing
    Aug 2, 2010
    5,322
    1,104
    West Sussex, UK
    www.antropy.co.uk
    I couldn't disagree more with most of the above!

    I'm a big fan of Ray Kurzweil, who created the first OCR, digital synthesizer etc. He is a massively successful inventor, computer scientist and entrepreneur who now works at Google.

    Anyone interested in this thread should read this:
    http://www.singularity.com/

    And this:
    http://www.howtocreateamind.com/

    Both are available from Amazon.

    But anyway, back to the OP. Have you not tried Google's voice recognition lately? Get your Android phone, enable voice commands, say: "Ok, Google" and ask it anything. I find it to be more accurate than a person would be, especially with obscure words.

    I believe that Dragon, mentioned above, was originally one of Kurzweil's projects and, although I haven't used it, I would imagine it is similar to Google's.
     
    Upvote 0
    Kurzweil left the relatively simple world of digital sampling and is now in the guru business. Unfortunately, all this work towards AI has totally failed so far. When top neurologists tell me that they do not know how we think, then Kurzweil definitely does not know.

    I tried reading both books, but they are so full of hokum and self-promotion, that I gave up.
     
    Upvote 0

    antropy

    Business Member
  • Business Listing
    Aug 2, 2010
    5,322
    1,104
    West Sussex, UK
    www.antropy.co.uk
    Kurzweil left the relatively simple world of digital sampling and is now in the guru business.
    What do you mean by "the guru business"?

    Unfortunately, all this work towards AI has totally failed so far.
    As far as I'm aware he's not really even working on "AI", so what do you mean? What particular projects of his would you say are a failure?

    When top neurologists tell me that they do not know how we think, then Kurzweil definitely does not know.
    No one knows every detail of every part of the brain, or even close, but a neurologist is not a neuroscientist: one is a clinician, the other is a researcher. You wouldn't necessarily expect a neurologist to know as much as a neuroscientist about how the brain works; their job is to diagnose and treat patients.

    I tried reading both books, but they are so full of hokum and self-promotion, that I gave up.
    I've read both, wouldn't have recommended otherwise ...
    Hokum - such as?
    Self-promotion - he doesn't promote anything, but he does of course brag a little about his past achievements.

    I'm not trying to defend Kurzweil for the sake of it, but if you're going to make statements like the above, please give me some hard, specific facts.
     
    Upvote 0
