Dictation and Transcription Tips
How can you ensure the best transcription for your business? Finding a good transcriptionist is one answer; however, the effort to transcribe a tape, the overall quality of the transcript, and the cost of producing the transcript all depend on the quality of the audio file or tape that is generated.
| Things to do | Things not to do |
| Speak clearly. Speak at a good volume level. | Speak in a rushed or hurried voice or mumble. Speak quietly. |
| Have people speak one at a time. | Have people talk at once and interrupt each other frequently. |
| For digital or tape recorders, record on the fast speed or high-quality setting. This makes a clearer recording but uses more memory or tape. | Record on slow speed or low quality, which uses less memory or tape but makes a "muddier" sounding recording that takes longer to transcribe and may result in transcription errors. |
| Record in a quiet environment. Be aware of background noise from others, air conditioning, fans, music, and other sources. | Record in an environment with lots of background noise like a restaurant, subway, near fans and vents, or place where others are talking or making noise. |
| In groups of two or more, make sure each person can be heard equally well. Use a recording system with multiple microphones in large groups to ensure you can hear each individual. | In groups of two or more, allow some people to be heard well while others are barely audible or not audible at all. |
| Use a microphone near the speaker. If the speaker will move around, use a wired or wireless lapel microphone. | Use a stationary microphone and let the speaker move around, creating hard-to-hear sections in the dictation. |
| Only have one microphone or recorder? If possible, have all persons speaking the same distance from the recorder. If that is not possible, place it nearest to the most important part of the conversation or point it in that direction. | Place the microphone or recorder near the interviewer so that the recording barely captures the most important part - the interviewee. |
| During question/answer sessions, have people come to a house microphone or bring a wireless microphone to them before they ask their question. Alternatively, have the person answering questions repeat the question so it is captured on the recorded audio. | During question/answer sessions, take no measures to record the question. You'll only obtain transcribed answers but be uncertain about the question asked. |
| Use good quality equipment made for the number of people you are recording. Alternatively, if good equipment is not available, use multiple digital or tape recording devices around the room (we will have to listen to each to fill in gaps from the others). | Use poorly maintained, low-quality equipment. Use equipment that was designed for recording one person to record a group of people. |
| Keep the recorder going (turned on and recording) well before people talk. | Use the "auto-vox" feature that chops off the beginning of people's sentences. |
| In large groups, have each person state their name before talking if they need to be accurately identified. Alternatively, have a note taker make notes each time a person talks, including their name and the first few words that they say. Provide an agenda. | Provide no records of a complex recording environment, making it difficult to separate out speakers or threads of conversation. |
| Provide lists of speakers, agendas for meetings, and other references as available to us so that we can create better annotated, ordered transcripts from your audio. | Provide nothing but the audio so that you have to edit the speaker identification and order of your transcripts. |
| If saving files as MP3 files on your handheld recorder or computer, use the Constant Bit Rate (CBR) format rather than the Variable Bit Rate (VBR) format. | Saving MP3 files with Variable Bit Rate (VBR) will not allow them to be transcribed directly because foot pedal backspacing does not work for transcriptionists. We will need to convert VBR MP3 files to CBR format. |
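
If you would like to re-save a VBR recording as CBR yourself before sending it, a command-line audio converter can do it. The sketch below is one possible approach and not a description of how Type-thing Services converts files: it assumes the free ffmpeg tool (built with the libmp3lame encoder) is installed, and the function names, file names, and 128 kbps bit rate are illustrative choices only.

```python
import subprocess


def build_cbr_command(src, dst, bitrate_kbps=128):
    """Build an ffmpeg command that re-encodes src as a constant-bit-rate MP3.

    With the libmp3lame encoder, passing -b:a (and no -q:a quality option)
    requests a constant bit rate.
    """
    if not isinstance(bitrate_kbps, int) or bitrate_kbps <= 0:
        raise ValueError("bitrate_kbps must be a positive integer")
    return [
        "ffmpeg",
        "-i", src,                   # input file (may be VBR)
        "-codec:a", "libmp3lame",    # MP3 encoder
        "-b:a", f"{bitrate_kbps}k",  # fixed audio bit rate
        dst,
    ]


def convert_to_cbr(src, dst, bitrate_kbps=128):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_cbr_command(src, dst, bitrate_kbps), check=True)
```

For example, `convert_to_cbr("interview_vbr.mp3", "interview_cbr.mp3")` would write a new file and leave the original untouched. Re-encoding an MP3 loses a little quality each time, so it is better to record in CBR from the start when your recorder allows it.
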
In some cases you may wish to transcribe poor-quality dictation because the content is essential. Type-thing Services will review each case individually to let you know what we can do to provide the best quality transcription possible. Poor-quality dictation includes recordings that are noisy or muffled, that contain simultaneous overlapping conversations, or that have two or more speakers recorded at greatly different volumes. In some cases we are able to digitize and enhance the audio to remove noise or clarify the speakers. See information on this page about digital audio files.
In such cases we will work to understand how many "indiscernible" sections are permissible. This work is billed at an hourly rate depending on the services needed. Rush service for poor-quality tapes, if available, is billed at rates higher than normal rush because of the large effort and likely transcriber fatigue involved.
In large jobs where we encounter a poor-quality tape, we will often choose to not transcribe the particular tape until the customer is contacted for guidance. In rare instances we may refuse to transcribe very poor audio because of the likely low quality of the resulting transcription or fatigue on the transcriptionist.
| Indiscernibles: | Type-thing Services normally marks parts of the transcription which cannot be heard or are uncertain as "[indiscernible]." We will typically go back three times to try to understand such conversation, after which we mark the words we could not hear as "[indiscernible]." Type-thing Services does this as a compromise in order to reduce your transcription costs. If transcription of these hard-to-hear sections is of importance to you, we can spend more time with the section or review the tape a second time. Note that we do not use the term "Inaudible," which means you can't hear anything. We can hear; we just can't make out the words; hence, indiscernible is the correct usage. Under extreme cases where hard-to-hear dictation must be recovered, we can digitize and filter the audio to obtain the best possible material for transcription. For example, we can subtract out the local background noise in the recording so voices can be heard. However, this requires an additional charge for recovering the speech audio. |
| Verbatim or edited: | Whether you choose verbatim transcription or transcription edited to written English depends on your use of the transcript. See "Grammar" below. You'll find many definitions of "verbatim" and "edited" transcripts, so we'll define what we mean here. |
| Guessing: | Type-thing Services will not guess the words that may have been said in a hard-to-hear section of your audio. We will use context of the conversation to help understand these sections, but we will not guess at what has been said. |
| Grammar: | If requested, we can correct grammar as we transcribe. This is an extension of "Verbatim or Edited" noted above. Quite often spoken English does not work well in written form or the speaker may have certain grammatical errors in addition to "ums" and "ahs" which are often used in speech. Let us know if you wish to have transcription that is verbatim or corrected for grammar. This choice is often dependent upon the final use of the transcription. |
| Format: | Please let us know if the layout format of your transcription is important to you. If you're not sure about this, we can suggest several formats based upon the number of speakers and purpose of the final transcription. Options include paper printout margins and how multiple speakers are identified. Most options will not affect the cost of transcription. We will identify extra costs upon your request for special formatting. How we provide transcription to you is also an option. We can provide printouts, supply electronic files on disk, email your files, or post them to our web site so that you can download them from the Internet (in a way others cannot access). When providing electronic files, there are numerous file formats that we can provide. Type-thing Services uses the Microsoft Office suite of products; however, we can provide formats such as Open Office, WordPerfect, Macintosh files, text files, and many others. In addition, today's word processor programs are usually able to read in many different formats, so file format is usually not a problem. |
Determining your
likely cost for transcription can be confusing.
You may be faced with costs based upon amount of
material ($/word, $/page, $/line, etc.), time to
transcribe ($/transcriptionist-hour, $/minute-audio),
piece rate ($/tape, $/CD), or bid by the job as
requested by the customer. Different vendors
may not provide the same costing method. Technical
content, amount of editing, and quality of recording are
also factors for a transcriptionist. Type-thing
Services will consider all of these methods
and help you understand what will work best for you;
however, we may only bid a job in a particular method
depending on the material to be transcribed and the
consistency of the material on the tapes or audio files.
See our rates page for Type-thing
Services' specific rate structure.
Here is some information about cost to transcribe that
you may wish to consider.
How long does it take to transcribe a tape or file?
Typically it can take from two to six times the length of the audio to transcribe. This large range depends on the type of material, how fast people talk, clarity of the audio, number of speakers, and clarity of the speakers. Most of the work that Type-thing Services has performed has taken 1.5 to three times the length of the audio. Single-speaker or interview transcription with clear audio takes the least amount of time.
This information may be useful if you choose to pay your transcriptionist per hour of labor. If you choose to do this, realize that a seasoned transcriptionist who is a fast typist will produce more per hour of labor.
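
To see how these ratios translate into labor, the arithmetic can be sketched as follows. This is only an illustration: the function names are made up for this example, the $30 hourly rate is a placeholder and not a Type-thing Services rate, and the default two-to-six-times range is simply the typical range quoted above.

```python
def transcription_hours(audio_minutes, ratio):
    """Estimated labor hours = audio length x (transcription time / audio time)."""
    if audio_minutes < 0 or ratio <= 0:
        raise ValueError("audio_minutes must be >= 0 and ratio must be > 0")
    return audio_minutes * ratio / 60.0


def labor_cost_range(audio_minutes, hourly_rate, low_ratio=2.0, high_ratio=6.0):
    """Return (lowest, highest) estimated labor cost for an hourly-paid job."""
    low = transcription_hours(audio_minutes, low_ratio) * hourly_rate
    high = transcription_hours(audio_minutes, high_ratio) * hourly_rate
    return low, high


# A one-hour recording at a hypothetical $30 per hour of labor.
low, high = labor_cost_range(audio_minutes=60, hourly_rate=30.0)
print(f"Estimated labor cost: ${low:.2f} to ${high:.2f}")
```

The width of the result shows why audio quality matters so much when you pay by the hour: the same recording can cost several times more to transcribe if it falls at the slow end of the range.
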
The cost to transcribe is also dependent on the amount
of technical knowledge or editing required for
transcription. Medical transcription typically costs a
bit more because of the additional skill, tools, and
references needed to ensure an accurate and usable
transcript or medical note. Why is this? Because a
knowledgeable transcriber will produce a higher-quality
transcript that requires less of your valuable time to
edit or correct.
In a business transcript, the cost will be higher if
the customer requires extensive grammatical
corrections. Costs may also be related to the
amount of time required to service your staff with
inquiries, special requests, and "stat" or rapid
turn-around requests.
While legal transcripts of interviews are often transcribed like business transcription ($/min), court proceedings tend to have a wide variability in transcribed content. Therefore we prefer to charge in $/page for this because it better reflects the actual work and amount of transcription involved.
Often times this is a hidden cost or savings!
Many types of documents (reports, letters) are edited as
Type-thing
Services transcribes, so you're not receiving a
verbatim transcript. You're obtaining a finished product
or a document that requires less editing. This saves you
money because you are not charged for the words, lines,
or pages that are edited out of your transcript.
You are also receiving editing in the cost of the
transcript. Of course, you may want verbatim
transcripts for some applications (interviews, legal
proceedings, classes, podcasts, etc.), and Type-thing
Services can provide these.
Note that some transcription companies only provide you
a verbatim transcript. In this case you're paying
a lot more for what you're getting. Why?
You're paying for content in your transcript that will
be eventually edited out. You also have to pay an
editor or edit yourself, which is an additional
cost. Since Type-thing Services
transcriptionists have secretarial skills, they save you
these costs.
See Options for Transcription
section above for more about this topic.

Sometimes we are asked why a person considering transcription should not simply use one of the new and improving programs/systems for computers that type while you talk or convert dictation to transcripts. These programs recognize your speech as you talk into a microphone and type what you say into a document. The promise is that one would save much money in transcription costs. In fact, a number of companies have sprung up and have marketed specific systems for the medical and legal communities, particularly around Electronic Medical Records (EMR). In addition, you will find that some EMR providers put digital dictation through a speech recognition system before it is given to a transcriptionist working in a role called a "speech recognition editor."
The quick answer:
Most individual professionals should not yet use speech-to-text. Not only are these programs not accurate enough, but you must also spend more time on three things: (1) You usually must talk slower and spend time training the system. (2) You have to correct the errors the program makes. (3) You have to be your own secretary - correcting the errors you make and knowing proper grammar, spelling, formatting, etc. Most times spoken English is not as it will be written. Transcriptionists take care of this for you.
Reality creeps in where large organizations, hoping to lower costs, are requiring staff to utilize speech recognition systems. Often this is tied to a move towards electronic medical records where the EMR company bundles speech recognition as an "extra added value." Type-thing can provide Speech Recognition Editing to assist in this process.
The longer answer is that choices should be a matter of cost and convenience. If the total cost to the dictator or organization is less using such speech-to-text systems, then they should use them. Our experience is that these systems are not yet sophisticated enough to pay for themselves, and may actually cost professionals and hospitals more due to their ongoing time investment. There is no doubt that in the future these systems will be excellent, but for practical dictation, these systems have a long way to go. In a number of cases the hasty move to an immature technology lines the technology company's pockets at the expense of the medical community. We have seen and learned of examples where professionals take more time with this technology and see fewer patients as well as have lower quality notes. Here's what we think:
Type-thing Services does not use speech recognition for its transcription work. Why? Because even if the accuracy of the process were fairly high (and it never is - we explore it periodically), we would have to listen to the whole audio to verify the transcript was correct. On top of that, we would have to edit your spoken word to something fitting for your needs. We often do that on the fly while listening to dictation. All of this editing of a recognition transcript can take longer than just doing it from scratch.
You might also be interested in a well-referenced article, "Rest in Peas: The Unrecognized Death of Speech Recognition," by Robert Fortner, that argues there are multiple barriers for good-accuracy speech-to-text systems. The article shows that accuracy has not improved in speech recognition software since about the year 1999. For limited uses like dialing a cell phone it might work, but for transcripts there are problems.
Studies have shown that speech recognition plus human editors is less efficient than a traditional transcriptionist (e.g., "Speech Recognition as a Transcription Aid: A Randomized Comparison With Standard Transcription," J Am Med Inform Assoc. 2003 Jan-Feb; 10(1): 85–93). That document concludes "speech recognition did not improve the productivity of secretaries or transcriptionists." In fact, this study said there was a loss, in that speech recognition reduced the efficiency of medical transcription to 87 percent of that of a transcriptionist alone! On the other hand, technology is likely to improve.
When making that decision consider the following points:
Are you a secretary?
While 24-hour turnaround, or 24-hour
transcription, is possible and can be reliable, be careful
about what you're obtaining, especially when starting a
new relationship with a transcription company.
It's
not a secret that, as with many industries, off-shore
competition has moved in to challenge providers of
transcription services in the United States. Is it
worth it to you? While this answer is something
you'll have to decide for your situation, make sure
you're looking at savings in the bottom line cost of
your transcription solution. The following
information may be of use in considering this option.
First of all, do you even know when your transcription
is being sent off-shore? Many of these companies
have purchased domestic companies, ".com" web sites, or
established offices in the U.S., but still send the work
abroad. You may interact with a U.S. citizen and
call a U.S. phone number, yet your dictation is sent
outside of the United States. Make sure to ask
where your work will be done, and in general by whom.
Total cost for your transcription is likely related to
these four items. Some items may be more important
to you than others depending on your business needs.
The basic fact that your
work is sent off-shore may be an issue you've not
considered. One positive issue is that if you
require quick turn-around, those working on the
opposite side of the globe can transcribe while you
sleep, so your work may be ready the next morning, in
less than 24 hours. There are a number of
potential negative issues
With the establishment of multimedia computers (audio, video, etc.) as the norm, more material is being generated in the form of digital computer files. Digital handheld dictation devices are now available that record to a memory card and can generate audio files you can place on disk or send over the Internet. Type-thing Services has the ability to convert and transcribe such files that come in a variety of formats.
We can also generate these files for use on your web site from your audio or video tape. We'll work with you to understand what you need for your application. Part of our service is understanding these formats and knowing which work well on the web and Internet. We use multiple methods to make the smallest possible audio file for your purpose so that the file can be downloaded or transmitted most efficiently. See our Web and Internet Services page for more detail.
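
To get a feel for why format and bit rate matter for download and transmission, the size of an audio file can be estimated from its length. The sketch below is a rough illustration only: the function names are invented for this example, the figures ignore file headers and tags, and the constant-bit-rate formula does not apply to variable-bit-rate files.

```python
def cbr_size_bytes(duration_seconds, bitrate_kbps):
    """Approximate size of a constant-bit-rate file (1 kbps = 1000 bits/second)."""
    return duration_seconds * bitrate_kbps * 1000 // 8


def pcm_size_bytes(duration_seconds, sample_rate_hz, bits_per_sample, channels):
    """Approximate size of uncompressed PCM audio, such as a plain WAV file."""
    return duration_seconds * sample_rate_hz * bits_per_sample * channels // 8


one_hour = 60 * 60  # seconds

compressed = cbr_size_bytes(one_hour, bitrate_kbps=64)
uncompressed = pcm_size_bytes(one_hour, sample_rate_hz=44100,
                              bits_per_sample=16, channels=2)

print(f"64 kbps CBR file: about {compressed / 1_000_000:.1f} MB")
print(f"CD-quality stereo PCM: about {uncompressed / 1_000_000:.1f} MB")
```

Under these assumptions, an hour of speech is roughly 29 MB at 64 kbps but about 635 MB as uncompressed CD-quality stereo, which is why compressed formats are usually the practical choice for sending dictation over the Internet.
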
These are some of the existing common open formats for digital audio files:
| Windows WMV; Windows PCM (WAV); Microsoft ADPCM (WAV); MPEG3 FhG (MP3)*; MP4, M4A; CCITT mu-Law and A-Law (WAV) | IMA/DVI ADPCM (WAV); MPEG audio (layers I and II); Microsoft ADPCM (WAV); CD and DVD Audio Disks; Video formats (AVI, MOV, WMV etc.) |
| Sony Memory Stick Voice (MSV); Sony Digital Voice File (DVF) | Sony IC Recorder Sound (ICS); Olympus (DSS, DS2) |
| 8-bit signed raw format (SAM); ACM waveform (WAV); CCITT mu-Law and A-Law (WAV); Dialogic ADPCM (VOX); IMA/DVI ADPCM (WAV); Real Audio (RA, RAM, RMM, RM, etc.) | MPEG audio (layers I and II); Next/Sun CCITT mu-Law, A-Law and PCM (AU); Apple Quicktime; Raw PCM Data; SampleVision format (SMP); Sound Blaster voice file (VOC); TrueSpeech (WAV); DiamondWare Digitized (DWD); Apple AIFF (PCM encoded data only) (AIF) |
Which files are the best to use? It depends on your situation and use of the digital audio file. If your equipment uses a particular audio file format, you have limited options.
Which types work on the Web and Internet? The use of audio on the web and Internet is evolving. For transcription, the current influences are MP3 players, Apple's iPod, and digital dictation machines. MP3 and WMA file types seem to be popular at this time.
Original sound files included the Next/Sun (AU extension) files and also, due to Windows' popularity, the WAV files. Later formats like Quicktime and Real Audio showed promise in reducing the file sizes and added the ability to stream the audio. Streaming means the audio is played over your computer's speakers pretty much as it arrives. Before that, the entire audio file had to be downloaded before it was played, which was inconvenient for large files or those that were transmitted in real time. Now MPEG3 files are popular for music files and are very good at compressing audio, as are WAV type TrueSpeech files. The answer to the question really depends on what you are trying to do and what resources you have to provide the audio files to the user. Some issues include:
What can be done with Audio files to edit the recording? Digital audio files can be easily edited to produce a good quality finished product. For this discussion, editing is the simple rearrangement of audio segments that is analogous to cutting and splicing audio tape. Some examples are:
What can be done with Audio files to enhance the recording? Digital audio files can be enhanced either to improve poor-quality sound or by adding various special effects.
| VIDEO TAPE (NTSC): VHS | DIGITAL VIDEO: WMV |
These approaches to dictation and transcription have become the norm in the industry. "Tapeless" and "Digital Tapeless" are becoming archaic terms; like "Digital," they refer to dictation without audio tape. This could be a handheld recorder that stores your dictation in memory modules, or it could be a phone-in dictation system. These types of devices have essentially replaced handheld tape recorders.
First-generation digital dictation units (popular types
by Sony, Olympus, etc.) typically produce audio in
proprietary formats that are difficult to convert
without their own proprietary software. Newer
devices coming out after 2009 started to create files in
standard file formats such as MP3, MP4, and WMA.
Type-thing Services prefers you consider
phone-in dictation
because of the numerous advantages it offers. See the "Phone-in Dictation" page on
this Web site.

Shown above are the regular cassette (top), executive (left), and micro (right) with approximate sizes for each tape. Micro and Executive cannot be used in the other's machines. Executive tape dictation systems are more expensive but provide superior clarity of dictation.
Formats
Most popular recorders use a single track of mono or stereo audio. Some of them have two speeds at which you can record your audio. Recording on the fastest speed produces higher quality dictation, but provides less recording time on the tape.
Multiple-track recorders are typically used in settings
that require very accurate transcriptions and have
multiple persons that might speak simultaneously. For
instance, courtroom transcripts are often taken by a
four-track recorder with each person wearing a separate
microphone and recording on a different track of the
tape: judge, two lawyers, witness. Multiple-track
recorders are rare outside of the courtroom setting.
However, they provide superior transcripts because the transcribing machine allows one to listen to each track individually or all tracks at once. Again, digital dictation systems have primarily replaced tape-based recording.
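
With digital recordings, the same idea applies when each microphone is stored as a separate channel of one audio file. The sketch below shows one way to pull the channels apart so each can be listened to on its own. It assumes an uncompressed 16-bit PCM WAV file with interleaved channels, which is an assumption for illustration (your recorder may store tracks differently), and the function names are hypothetical.

```python
import wave
from array import array


def split_channels(frames, channels):
    """De-interleave 16-bit PCM frames into one bytes object per channel.

    Interleaved audio stores one sample per channel in turn, e.g. for a
    four-track recording: judge, lawyer, lawyer, witness, judge, ...
    """
    samples = array("h")        # "h" = signed 16-bit integers on common platforms
    samples.frombytes(frames)
    if channels < 1 or len(samples) % channels != 0:
        raise ValueError("frame data does not match the channel count")
    # Every channels-th sample, starting at offset i, belongs to channel i.
    return [samples[i::channels].tobytes() for i in range(channels)]


def split_wav_tracks(path):
    """Read a 16-bit PCM WAV file and return its channels as separate tracks."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit PCM only")
        frames = wav.readframes(wav.getnframes())
        return split_channels(frames, wav.getnchannels())
```

Each returned track could then be written to its own mono WAV file with the same sample rate, so a transcriptionist can replay one speaker at a time when people talk over each other.
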
email:
michele@type-thing.com
web: http://www.type-thing.com/