Dictation and Transcription
Better knowledge - better
This page collects what Type-thing has learned about
dictation, transcription, and related Internet, Web, and
technology topics. We want to share what we know
about creating better recordings, along with other information
that will help you produce the transcripts you need
for your efforts.
These are tips primarily for those dictating or
recording audio. We also have Tips for Correct Transcription.
Information on this page is the opinion of Type-thing
and is not certified in any way to
be accurate, free from error, or applicable for your
particular use. If you have other questions or
suggestions for other material, please let us know.
How can you ensure the best transcription for your
business? Finding a good transcriptionist is one answer;
however, the effort to transcribe a tape, the overall quality
of the transcript, and the cost of producing the
transcript all depend on the quality of the audio file
or tape that is generated.
By producing a good recording, you may be able to reduce
the cost of your transcription, increase accuracy of
transcription and reduce the number of "indiscernible"
sections on your transcript. At Type-thing Services,
we've compiled a list of transcription DOs and DON'Ts
that may be of help.
DO: Speak clearly. Speak at a good volume.
DON'T: Speak in a rushed or hurried voice or mumble. Speak quietly.

DO: Have people speak one at a time.
DON'T: Have people talk at once and interrupt each other frequently.

DO: For digital or tape recorders, record on fast speed or high
quality setting. This makes a clearer recording but uses more memory
or tape.
DON'T: Record on slow speed or low quality, which uses less memory or
tape but makes a "muddier" sounding recording which takes longer to
transcribe, possibly resulting in transcription errors.

DO: Record in a quiet environment. Be aware of background noise from
others, air conditioning, fans, and music.
DON'T: Record in an environment with lots of background noise like a
restaurant, subway, near fans and vents, or place where others are
talking or making noise.

DO: In groups of two or more, make sure each person can be heard
equally well. Use a recording system with multiple microphones in
large groups to ensure you can hear each person.
DON'T: In groups of two or more, allow some people to be heard well
while others are barely audible or not audible at all.

DO: Use a microphone near the speaker. If the speaker will move
around, use a wired or wireless lapel microphone.
DON'T: Use a stationary microphone and let the speaker move around,
creating hard-to-hear sections on the dictation.

DO: Only have one microphone or recorder? If possible, have all
persons speaking the same distance from the recorder. If that is not
possible, place it nearest to the most important part of the
conversation or point it in that direction.
DON'T: Place the microphone or recorder near the interviewer so that
the recording barely captures the most important part - the
interviewee.

DO: For question-and-answer sessions, have people come to a house
microphone or bring a wireless microphone to them before they ask
their question. Alternatively, have the person answering questions
repeat the question so it is captured on the recorded audio.
DON'T: For question-and-answer sessions, take no measures to record
the question. You'll only obtain transcribed answers but be uncertain
about the question asked.

DO: Use good quality equipment made for the number of people you are
recording. Alternatively, if good equipment is not available, use
multiple digital or tape recording devices around the room (we will
have to listen to each to fill in gaps from the others).
DON'T: Use poorly maintained, low-quality equipment. Use equipment
that was designed for recording one person to record a group of
people.

DO: Keep the recorder going (turned on and recording) well before
people talk.
DON'T: Use the "auto-vox" feature that chops off the beginning of
what people say.

DO: In large groups, have each person state their name before talking
if they need to be accurately identified. Alternatively, have a note
taker make notes each time a person talks, including their name and
the first few words that they say. Provide an agenda.
DON'T: Provide no records of a complex recording environment, making
it difficult to separate out speakers or threads of conversation.

DO: Provide lists of speakers, agendas for meetings, and other
references as available to us so that we can create better annotated,
ordered transcripts from your audio.
DON'T: Provide nothing but the audio so that you have to edit the
speaker identification and order of your transcripts.

DO: If saving files to MP3 files on your hand held recorder or
computer, use the Constant Bit Rate (CBR) format rather than the
Variable Bit Rate (VBR) format.
DON'T: Save MP3 files with Variable Bit Rate (VBR). VBR files cannot
be transcribed directly because foot pedal backspacing does not work
for transcriptionists. We will need to convert VBR MP3 files to CBR
format.
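If you already have VBR recordings, one possible way to re-encode them yourself is the free ffmpeg tool. The sketch below only builds the command line. It assumes ffmpeg with the libmp3lame encoder is installed; the function name, the file names, and the 128k bit rate are placeholders chosen for illustration, not a Type-thing requirement.

```python
def cbr_command(source, destination, bitrate_kbps=128):
    """Return an ffmpeg argument list that re-encodes `source` as a CBR MP3."""
    return [
        "ffmpeg",
        "-i", source,                # input file (for example a VBR MP3)
        "-codec:a", "libmp3lame",    # MP3 encoder
        "-b:a", f"{bitrate_kbps}k",  # fixed audio bit rate, which gives CBR output
        destination,
    ]


# Placeholder file names; pass the list to subprocess.run() to actually convert.
print(" ".join(cbr_command("dictation_vbr.mp3", "dictation_cbr.mp3")))
```

Re-encoding an MP3 costs a little audio quality, so recording in CBR from the start is still the better choice.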
Not all of these hints apply to every situation. A
single-person transcription rarely has any of these
possible problems. Sometimes you cannot avoid background
noise or conversations where people interrupt and talk
over one another. A good transcriptionist can help in some of
these situations; however, they cannot perform miracles.
When you are recording important information, especially
for group discussions, it pays to invest in a good
conference microphone set and recording system.
Type-thing Services can work with you or the facility in
which you will record your audio to make sure it is the
best it can be for transcription. Contact us for more help!
essential, but poor-quality dictation
In some cases you may wish to
transcribe poor-quality dictation because the content is
essential. Type-thing Services will
review each case individually to let you know what we
can do to provide the best quality transcription
possible. Poor-quality dictation includes audio
that is noisy or muffled, has simultaneous overlapping
conversations, or has two or more speakers recorded at
greatly different volumes. In some cases we are
able to digitize and enhance the audio to remove noise
or clarify the speakers. See information on this
page about digital audio files.
In such cases we will work to understand how many
"indiscernible" sections are permissible. This
work is billed at an hourly rate depending on the
services needed. Rush service for poor-quality
tapes, if available, is billed at rates higher than
normal rush because of the large effort and likely
transcriber fatigue involved.
In large jobs where we encounter a poor-quality tape, we
will often choose to not transcribe the particular tape
until the customer is contacted for guidance. In
rare instances we may refuse to transcribe very poor audio
because of the likely low quality of the resulting
transcription or fatigue on the transcriptionist.
There are several considerations for customizing your
transcripts. Here are a few common options you can discuss
with Type-thing Services.
Type-thing Services normally marks parts of the transcription
that cannot be heard or are uncertain as
"[indiscernible]." We will typically go back three
times to try to understand such conversation, after
which we mark the words we could not make out as
"[indiscernible]." Type-thing Services does
this as a compromise in order to reduce your
transcription costs. If transcription of these
hard-to-hear sections is of importance to you, we
can spend more time with the section or
review the tape a second time.
Note that we do not use the term "inaudible,"
which means you can't hear anything. We can
hear; we just can't make out the words; hence,
"indiscernible" is the correct usage.
Under extreme cases where hard-to-hear
dictation must be recovered, we can digitize and
filter the audio to obtain the best possible
material for transcription. For example, we can
subtract out the local background noise in the
recording so voices can be heard. However, this
recovery work requires an additional charge.
Verbatim or edited:
Your choice of verbatim
transcription or transcription edited to written English
depends on your use of the
transcript. See "Grammar" below.
You'll find many definitions of "verbatim" and
"edited" transcripts, so we'll define what we mean.
Let us know what you need so your transcript is
useful for your purposes.
Verbatim:
- Includes false starts, repeated
words, and stutters.
- Does not include "ums," "ahs."
- Does not correct grammar.
Edited:
- Conversion of spoken English to
written English.
- Correction of grammar.
- Does not include false starts,
repeated words, or stutters.
- Does not include "ums," "ahs."
Type-thing Services will not guess the words that
may have been said in a hard-to-hear section of
your audio. We will use context of the
conversation to help understand these sections,
but we will not guess at what has been said.
When requested, we can correct grammar as we
transcribe. This is an extension of
"Verbatim or Edited" noted above. Quite often
spoken English does not work well in written form
or the speaker may have certain grammatical errors
in addition to "ums" and "ahs" which are often
used in speech. Let us know if you wish to have
transcription that is verbatim or corrected for
grammar. This choice is often dependent upon the
final use of the transcription.
Let us know if the layout format of your transcription
is important to you. If you're not sure about
this, we can suggest several formats based upon
the number of speakers and purpose of the final
transcription. Options include paper printout
margins and how multiple speakers are identified.
Most options will not affect the cost of
transcription. We will identify extra costs upon
your request for special formatting.
How we provide transcription to you is also an
option. We can provide printouts, provide electronic
files on disk, email your files, or post them to
our web site so that you can download them from
the Internet (in a way others cannot access).
When providing electronic files, there are
numerous file formats that we can provide. Type-thing
Services uses the Microsoft Office
suite of products; however, we can provide
formats such as OpenOffice, WordPerfect,
Macintosh files, text files, and many others. In
addition, today's word processor programs are
usually able to read many different formats,
so file format is usually not a problem.
About cost to transcribe
The likely cost for transcription can be confusing.
You may be faced with costs based upon amount of
material ($/word, $/page, $/line, etc.), time to
transcribe ($/transcriptionist-hour, $/minute-audio),
piece rate ($/tape, $/CD), or bid by the job as
requested by the customer. Different vendors
may not provide the same costing method. Technical
content, amount of editing, and quality of recording are
also factors for a transcriptionist. Type-thing
Services will consider all of these methods
and help you understand what will work best for you;
however, we may only bid a job in a particular method
depending on the material to be transcribed and the
consistency of the material on the tapes or audio files.
See our rates page for Type-thing
Services' specific rate structure.
Here is some information about cost to transcribe that
you may wish to consider.
Audio length & time to transcribe
How long does it take to transcribe a tape or file?
Typically it can take from two to six times the length
of the audio to transcribe. This large range depends on
the type of material, how fast people talk, clarity of
the audio, number of speakers, and clarity of the speakers.
Most of the work that Type-thing Services
has performed has taken 1.5 to three times the length of
the audio. Single-speaker or interview
transcription with clear audio takes the least amount of
time.
This information may be useful if you choose to pay
your transcriptionist per hour of labor. If you
choose to do this, realize that a seasoned
transcriptionist who is a fast typist will produce more
per hour of labor.
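To turn these multipliers into hours, here is a minimal sketch. The function name and defaults are ours for illustration; the 1.5 and 3 factors are the range most of our work has fallen into, and a factor of 6 covers the difficult-audio end of the range.

```python
def labor_hours(audio_minutes, low_factor=1.5, high_factor=3.0):
    """Return (low, high) estimated work hours for a recording.

    The factors are work hours per audio hour; the defaults reflect
    the typical 1.5x to 3x range quoted in the text.
    """
    audio_hours = audio_minutes / 60
    return audio_hours * low_factor, audio_hours * high_factor


low, high = labor_hours(90)
print(f"90 minutes of audio: about {low:.2f} to {high:.2f} work hours")

# Difficult audio: use the wider two-to-six-times range instead.
print(labor_hours(90, low_factor=2, high_factor=6))
```

If you pay per hour of labor, multiplying these hours by the quoted hourly rate gives a rough cost range before you commit.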
Audio length & transcript pages
A rough conversion between pages and time is one standard
page per minute of single-speaker audio. A standard
page for most transcription companies is 22 lines of 65
mono-spaced characters across. Type-thing currently
provides 25 lines per page, which saves you about 12
percent on your transcription bill. Single-speaker
presentations or interviews may often be less than one
page per minute. Group or fast-paced
dialogue is usually more than one page per minute of
audio.
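The 12 percent figure follows from the page sizes: the same text fills fewer 25-line pages than 22-line pages. A small sketch of that arithmetic, assuming a bill charged per page (the function names and the 2,200-line example are ours):

```python
def page_count(total_lines, lines_per_page):
    """Pages needed for a transcript with the given number of lines."""
    return total_lines / lines_per_page


def page_savings(standard_lines=22, actual_lines=25):
    """Fraction of pages saved by printing more lines on each page."""
    return 1 - standard_lines / actual_lines


print(page_count(2200, 22))     # 100.0 pages at 22 lines per page
print(page_count(2200, 25))     # 88.0 pages at 25 lines per page
print(f"{page_savings():.0%}")  # 12%
```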
The cost to transcribe is also dependent on the amount
of technical knowledge or editing required for
transcription. Medical transcription typically costs a
bit more because of the additional skill, tools, and
references needed to ensure an accurate and usable
transcript or medical note. Why is this? Because a
knowledgeable transcriber will produce a higher-quality
transcript that requires less of your valuable time to
edit or correct.
In a business transcript, the cost will be higher if
the customer requires extensive grammatical
corrections. Costs may also be related to the
amount of time required to service your staff with
inquiries, special requests, and "stat" or rapid
turnaround.
While legal transcripts of interviews are often
transcribed like business transcription ($/min), court
proceedings tend to have a wide variability in
transcribed content. Therefore we prefer to charge
in $/page for this because it better reflects the actual
work and amount of transcription involved.
Verbatim transcript & edited transcript
Oftentimes this is a hidden cost or savings!
Many types of documents (reports, letters) are edited as
Type-thing Services transcribes, so you're not receiving a
verbatim transcript. You're obtaining a finished product
or a document that requires less editing. This saves you
money because you are not charged for the words, lines,
or pages that are edited out of your transcript.
You are also receiving editing in the cost of the
transcript. Of course, you may want verbatim
transcripts for some applications (interviews, legal
proceedings, classes, podcasts, etc.), and Type-thing
Services can provide these.
Note that some transcription companies only provide you
a verbatim transcript. In this case you're paying
a lot more for what you're getting. Why?
You're paying for content in your transcript that will
be eventually edited out. You also have to pay an
editor or edit yourself, which is an additional
cost. Since Type-thing Services
transcriptionists have secretarial skills, they save you
that additional cost.
See the Options for Transcription
section above for more about this topic.
Cost per line or word
When considering $/word versus $/line, make sure you know
the definition of a line. Usually a line has nothing to do
with how many lines you have in your format or printout.
It is often defined as a certain number of characters
(usually 65) across (which assumes a mono-spaced
font). Type-thing Services typically
provides a $/line cost for some medical transcription and
$/page cost for interviews because this has been standard
in the industry; however, we can convert our estimates to
other measures if requested.
When is cost per line or word a good deal? Note that
some formatted documents with many short lines could cost
significantly more on a $/page, $/line, or $/minute rate
rather than a $/word rate. Be careful paying a line
rate for such documents because you'll be paying a lot for
empty space. For these types of documents we suggest a
$/word rate. Also note that Type-thing
Services often edits documents as they are typed
for business and medical transcription. This process significantly
reduces the actual content (lines and words) so that
you're not paying for a verbatim transcript that you
have to edit. This is a hidden savings at Type-thing Services
and a hidden cost in verbatim transcripts you might find
elsewhere.
A cost per line or word is easy to verify. Most word
processors can count words or lines, which lets you audit
your bill.
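If you prefer to check the counts yourself, a few lines of code will do it. This sketch assumes a billing line is defined as 65 characters including spaces, as described above, and that a partial line is rounded up; vendors may count differently, so confirm the definition before comparing numbers. The function names are ours.

```python
import math


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def count_billing_lines(text, chars_per_line=65):
    """Count billing lines as total characters (spaces included) per 65.

    Assumption: a partial final line is rounded up to a full line.
    """
    return math.ceil(len(text) / chars_per_line)


sample = "The patient was seen today for a follow-up visit. " * 10
print(count_words(sample), "words")
print(count_billing_lines(sample), "billing lines")
```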
Cost per minute
It is becoming more popular on discount transcription
company websites to provide a $/minute rate for
transcription. Cost per minute is attractive because
you can quickly determine your cost for transcription
based upon the audio you have in hand - even before you
send it to the transcriptionist. Be careful of this
convenience because it may actually cost you a lot more in
the end. Why is this? For instance, if your
audio has a slow speaker, a flat rate per minute will
likely cost you more than a reasonable $/page rate.
Cost per minute may make more sense for verbatim
transcripts than it does for work that you want edited
on the fly by the transcriptionist. Consider why you
would want to pay for minutes of audio that are going to
be edited out later.
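To see why a slow speaker makes a per-minute rate expensive, compare the two pricing methods for the same audio. The rates below ($1.50 per minute and $1.50 per page) are hypothetical numbers picked only to make the arithmetic easy; they are not Type-thing rates, and the function names are ours.

```python
def cost_by_minute(audio_minutes, rate_per_minute):
    """Total cost when billed per minute of audio."""
    return audio_minutes * rate_per_minute


def cost_by_page(audio_minutes, pages_per_minute, rate_per_page):
    """Total cost when billed per transcribed page."""
    return audio_minutes * pages_per_minute * rate_per_page


def break_even_pages_per_minute(rate_per_minute, rate_per_page):
    """Pages per audio minute at which both methods cost the same."""
    return rate_per_minute / rate_per_page


# Hypothetical rates, for illustration only.
minutes = 60
print(cost_by_minute(minutes, 1.50))             # 90.0
print(cost_by_page(minutes, 0.5, 1.50))          # 45.0 for a slow speaker
print(break_even_pages_per_minute(1.50, 1.50))   # 1.0
```

Below the break-even density, the per-page bill is lower; above it, the per-minute rate is the better deal.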
For convenience, here are a few conversion factors you
might be able to use to help make sense of cost to
transcribe your work. These numbers are averages, and
exact numbers may be different for your documents.
Pages (standard page*) & audio length
- 0.5 to 0.75 pages / audio minute for slow speakers,
some single-speaker presentations
- 1 page / audio minute for single speaker
- 2 to 4 pages / audio minute for multiple speakers or
fast-paced dialogue
Lines per page (standard page*)
- 22 lines / page (normal)
- 25 lines / page (Type-thing - 12 percent savings)
Hours labor to transcribe audio
- Usually 1.5 work hours per audio hour to 3 work
hours per audio hour
- Up to 6 work hours per audio hour for difficult
audio (that requires listening multiple times)
Words per line (standard line of 65
characters including spaces)
- About 8 to 9 average words per line depending on
number of blank lines in document
Characters per word
- About 5.5 to 6 characters/word on average,
depending on the words used
* Standard page
- 65 characters (including spaces) per line
(usually 12-point type)
- 22 lines per page (Type-thing uses 25 lines per
page - 12 percent savings)
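These factors can be chained to get a rough size estimate from the audio length alone. The sketch below uses the averages listed above (22 lines per standard page, 8 to 9 words per line); the function name, the dictionary keys, and the labels "slow", "single", and "multiple" are ours, and real documents will vary.

```python
LINES_PER_STANDARD_PAGE = 22
WORDS_PER_LINE = (8, 9)  # low and high averages

# Pages per audio minute as (low, high), keyed by kind of recording.
PAGES_PER_MINUTE = {
    "slow": (0.5, 0.75),
    "single": (1.0, 1.0),
    "multiple": (2.0, 4.0),
}


def estimate_size(audio_minutes, speaker_type="single"):
    """Return low/high estimates of pages, lines, and words."""
    low_rate, high_rate = PAGES_PER_MINUTE[speaker_type]
    pages = (audio_minutes * low_rate, audio_minutes * high_rate)
    lines = tuple(p * LINES_PER_STANDARD_PAGE for p in pages)
    words = (lines[0] * WORDS_PER_LINE[0], lines[1] * WORDS_PER_LINE[1])
    return {"pages": pages, "lines": lines, "words": words}


print(estimate_size(30, "single"))
print(estimate_size(10, "slow"))
```

An estimate like this is useful mainly for comparing quotes that use different units, such as $/page against $/word.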
Rush or expedited transcription
Another factor in cost to transcribe is turn-around on a
rush basis. While Type-thing Services can
offer standard rates for some rush or 24- to 48-hour turn
around work, that is reserved for limited volume from
regular or high-volume customers. When you have an
urgent unscheduled effort that is a rush, the cost for
transcription increases over regular fees. Type-thing
Services offers rush service in hours to days depending on
your need and our capacity. See our rates page for standard costs.
Be careful of transcription companies that offer unlimited
rush job work because you're taking a gamble. Your work
has a strong possibility of being sent overseas and
transcript quality may be inferior. We've seen this
happen to many customers that come to us after having
disappointing results. If you want to try this
approach, send the companies something when you're really
not in a rush just to test out their quality.
Finding the best cost to transcribe
What can you do to make up your mind? Contact Type-thing Services,
and we'll let you know the best way to cost your project
given the type of audio and your objectives.
If you'd like to compare across vendors, you can always
provide a sample of audio and possibly finished transcript
to each and ask each vendor how much that item would cost
to transcribe given a particular volume of work.
Why not use computer dictation, speech-to-text programs?
Often we are asked why a person considering transcription
should not simply use one of the new and improving
programs/systems for computers that type while you talk
or convert dictation to transcripts. These programs
recognize your speech as you talk into a microphone and
type what you say into a document. The promise is that
one would save much money in transcription costs. In
fact, a number of companies have sprung up and have
marketed specific systems for the medical and legal
communities, particularly around Electronic
Medical Records (EMR). In addition, you will find that
some EMR providers put digital dictation through a
speech recognition system before it is given to a
transcriptionist working in a role called a "speech
recognition editor."
The short answer is that
professionals should not yet use speech-to-text.
Not only are these programs not accurate enough, you
must spend more time on three things: (1) You
usually must talk slower and spend time training the
system. (2) You have to correct the errors the program
makes. (3) You have to be your own secretary:
correcting the errors you make and knowing proper grammar,
spelling, formatting, etc. Most of the time, spoken
English is not phrased the way it would be written.
Transcriptionists take care of this for you.
Reality creeps in where large organizations, hoping
to lower costs, require staff to
use speech recognition systems. Often this
is tied to a move towards electronic medical records
where the EMR company bundles speech recognition as an
"extra added value." Type-thing can provide Speech Recognition
Editing to assist in this process.
The longer answer is that choices should be a matter of
cost and convenience. If the total cost to the
dictator or organization is less using such
speech-to-text systems, then they should use them. Our
experience is that these systems are not yet
sophisticated enough to pay for themselves, and may
actually cost professionals and hospitals more due to
their ongoing time investment. There is no doubt
that in the future these systems will be excellent, but
for practical dictation, these systems have a long way
to go. In a number of cases the hasty move to an
immature technology lines the technology company's pockets
at the expense of the medical community. We have
seen and learned of examples where professionals take
more time with this technology and see fewer patients as
well as have lower quality notes. Here's what we
do: Type-thing Services
does not use speech recognition for its transcription
work. Why? Because even if the accuracy of
the process were fairly high (and it never is - we
explore it periodically), we would have to listen to the
whole audio to verify the transcript was correct.
On top of that, we would have to edit your spoken word
into something fitting for your needs. We often do
that on the fly while listening to dictation. All
of this editing of a recognition transcript can take longer
than just transcribing from scratch.
You might also be interested in a well-referenced
article "Rest in Peas: The
Unrecognized Death of Speech Recognition,"
by Robert Fortner, that argues there are multiple
barriers for good-accuracy speech-to-text systems.
The article reports that accuracy has not improved in speech
recognition software since about the year 1999.
For limited uses like dialing a cell phone it might
work, but for transcripts there are problems.
Studies have shown that speech recognition plus human
editors are less efficient than a traditional
transcriptionist. (e.g., "Speech Recognition as a
Transcription Aid: A Randomized Comparison With Standard
Transcription," J Am Med Inform Assoc. 2003 Jan-Feb;
10(1): 85–93.) That
document concludes "speech recognition did not
improve the productivity of secretaries or
transcriptionists." In fact, this study said there
was a loss in that speech recognition reduced
efficiencies of medical transcription to 87 percent of a
transcriptionist alone! On the other hand,
technology is likely to improve.
When making that decision, consider the following:
- Costs more overall
- Why should a highly-paid professional spend time
sitting in front of a computer editing their text,
continually retraining the program for new words and
names? Time is money and, at least today,
speech-to-text programs seem to take time away from
the professional. Some systems allow you to talk into
a dictation machine; however, you must still worry
about the points below.
One option being used is
to employ a "speech recognition editor" or "voice
recognition editor" to listen to the audio while editing
the text from the speech recognition system.
Type-thing can provide a speech recognition editor
as needed; however, the value of
the recognition engine is dubious. If the engine
is not excellent and the dictator cooperative, it can
take as much or more time to edit the text as
to transcribe the original audio.
One day computer technology will allow natural speech
transcription, editing, grammar checking, etc., at least
for an individual speaker, and at least for practical
dictation. That day does not appear to be here. However,
if you like new technology, go out and purchase the
relatively inexpensive off-the-shelf software ($100 to
$400) and try it. If you are considering one of the
medical or legal systems being offered by new companies,
try before you buy. We've known several practices that
have tried these systems, only to come back to a
transcriptionist.
- Talk clearly
- You must talk clearly and enunciate each word. The
programs are getting better, but you cannot slur your
speech, talk extremely fast, etc. The surroundings must
be quiet, not a noisy room, lobby, or car. Multiple
people cannot be talking around you or in the
background.
- Do you speak like you write? Quality
is more than speech recognition!
- Transcription involves punctuation, grammar, and
formatting at the least. Spoken English is
vastly different from written English. You may
be surprised at how unstructured spoken English
appears when typed. When requested, we regularly
turn our clients' spoken English into professional-looking
written transcripts. With
speech-to-text programs, you will have to train
yourself to speak in written English form. Some
EMR systems require the medical professional to call
out punctuation! This is not the only
problem. Many of our clients do not always speak
in an ordered linear format from beginning to
end. Part way through a dictation, they will
remember something that should be inserted elsewhere
in the final product. With a human
transcriptionist, you only have to give direction and
the content for this to be accommodated. With a
speech-to-text program, you end up spending more time.
- Are you a secretary?
- Once you have your transcript in the computer, do
you know all the rules of grammar, spelling,
formatting, etc.? If you do, great. Now waste
your valuable time performing such an administrative
function. Is your staff going to edit the transcript?
If so, great. Do they have good secretarial skills
that will produce letters, reports, and documents that
present the professionalism you need?
- Train, train, train...
- Although these programs are getting better all the
time, they are not yet like the science fiction
portrayals of computers recognizing your speech. The
programs today cannot recognize speech from every
person; they must be "trained." Even after they are
"trained," they will make occasional errors, and will
almost never understand uncommon words, new
words, or new names. You must train them at least once
each time such words arise.
- Not for groups or poor quality recordings
- Even if the computer can understand one person
speaking clearly, it cannot yet even attempt to
untangle a transcript of multiple speakers, sometimes
talking at once, often in noisy conditions, some
talking quietly, some talking loudly.
How do we know? Not only do we talk to a lot of people
requiring transcription and considering speech-to-text
programs, we have tried this software ourselves as a way
to increase our productivity. We have seen multiple
medical professionals and hospitals be duped into a
technology that is not yet ready.
Electronic Medical Records (EMR)
Electronic Medical Records are electronic, computer-based
records for tracking care from medical providers.
They are a good thing--offering the promise of each
person's medical record being available instantly to the
providers of their choice when they need care, as opposed
to paper files that take days or weeks to transfer.
There are many benefits, but other issues exist in this
transformation, such as privacy concerns and a number of
problems around medical transcription.
Should we eliminate transcription with the
advent of EMR?
The quick answer is that, judging from current press,
manual data entry (from lists or by typing) by care
providers can take more time than transcription and reduce
the number of patients that can be seen. There is
no need to eliminate transcription. Transcribers can
easily provide electronic forms of transcription in any
needed format. The question is whether or not EMR
manufacturers will allow that to happen--integrate
independent transcription with their EMR systems.
Some providers have chosen to take a giant leap and move
completely away from transcription. This is likely
due to the EMR system developers trying to cash in on as
many "extras" as they can in the EMR process, and promise
increased efficiency along the way. Some benefits
exist, but from our observation, many of these EMR
providers may have little incentive to easily accept
transcription from external transcribers. One method
used is a voice recognition system.
This is unlikely to work well alone, but may be better
with Speech Recognition
Editing. Another is a system where medical
staff have to type some information and use check boxes to
fill out the rest. It may work in some disciplines;
however, we have seen this approach cause an extreme decrease
in the efficiency and morale of medical practitioners forced
to use it. In such cases medical
practitioners need to demand ways to incorporate external
transcription into the EMR system. It's not that
How can Type-thing Services transcription be
incorporated into EMR?
Type-thing Services will work with you as needed to submit
your transcription in electronic format to your EMR
system. Because EMR is new, it has been a wild west
frontier environment where there are many different
systems. Contact us and we'll
work with you to find out how we can customize and
integrate with your EMR process or we'll tell you if it's
not possible. We are able to provide HL7 CDA (Clinical
Document Architecture) format information. We can
also arrange to have secure remote access to your EMR and
insert transcription directly. We can provide Speech Recognition Editing
to refine your voice recognition files.
Is 24-Hour Turnaround Possible or Reliable?
While 24-hour turnaround, or 24-hour
transcription, is possible and can be reliable, be careful
about what you're obtaining, especially when starting a
new relationship with a transcription company.
Every quality transcription company has limited staff to
transcribe your materials. Type-thing Services
offers 24-hour to 48-hour turn around to regular customers
for which we can schedule or set aside transcription time
for their regular work flow. In addition, we offer
such service at rush rates (hours to days with volume) to
others only if we know we can meet that customer's
needs.
From our experience, claims of unlimited
24-hour service often indicate that your transcription is
being sent overseas, which has a number of
implications. See "Is off-shore
transcription worth it?" section for other opinions
on that topic.
When you obtain a commitment for 24-hour turn around,
remember that the agreement should be dependent upon your regular
work volume. If you send in a week's worth of your
dictation all at once, that surge may cause delays beyond
the agreed upon delivery time-frame. Call Type-thing Services
to understand what turn-around is possible on your work.
Is off-shore transcription worth it?
It's not a secret that, as with many industries, off-shore
competition has moved in to challenge providers of
transcription services in the United States. Is it
worth it to you? While this answer is something
you'll have to decide for your situation, make sure
you're looking at savings in the bottom line cost of
your transcription solution. The following
information may be of use in considering this option.
First of all, do you even know when your transcription
is being sent off-shore? Many of these companies
have purchased domestic companies, ".com" web sites, or
established offices in the U.S., but still send the work
abroad. You may interact with a U.S. citizen and
call a U.S. phone number, yet your dictation is sent
outside of the United States. Make sure to ask
where your work will be done, and in general by whom.
Total cost for your transcription is likely related to
these four items. Some items may be more important
to you than others depending on your business needs.
Here is a bit more information on each of these
items. Note that with the exception of the fourth
item, the act of sending work abroad IS NOT the inherent
problem--it is the quality of service you receive.
You may be able to find a quality off-shore provider that
lowers your total cost; however, our experience has shown
that this is not often the case.
- Raw cost to transcribe ($/line, $/page, etc.)
- Risk - Privacy of information sent outside the United States
- Spoken versus edited text
- Quality of the transcript or product (ability to use
- Customer service, responsiveness and flexibility to
- Specific off-shore issues (turn-around, privacy,
security, export-control regulations)
- 1. Raw cost to transcribe
- Because raw cost to transcribe is the initial
attractive feature of off-shore services, your initial
raw cost should be less. You should understand
that raw cost is not your total cost. Consider
the total cost in your decision. Total cost may
be affected by the following three items.
- 2. Risk - Privacy
of information sent outside the United States
- The basic fact that your
work is sent off-shore may be an issue you've not
considered. One positive issue is that if you
require quick turn-around, those working on the
opposite side of the globe can transcribe while you
sleep, so your work may be ready the next morning, in
less than 24 hours. There are a number of
potential negative issues:
- When you send your audio and resulting transcripts
outside of the United States (with or without your
knowledge), you are sending it to locations not
covered by United States law. If your
information is private or covered by a number of laws
to which you are held accountable, can you be sure
you've performed due diligence in protecting that
information? If that information is disclosed,
can you obtain damages from a company in a remote
country, one you may not even be able to identify?
- Is the process to send the work abroad such that it
meets your security and privacy needs? Company
proprietary information or health information (HIPAA)
could be compromised. It is not just the
transmission of your audio that should be secure, but
there should also be assurance that the companies and
individuals abroad can maintain privacy and
security. Their networks, computers, and
facilities should be as secure as domestic
providers. A number of instances in the press
have shown that security abroad is an issue.
Even if they have excellent computer and information
security, the people working there are under foreign
government influence and different rules. If
something does go wrong, how are you going to take
action against an off-shore company?
- A critical problem to consider is export-control
regulation. This appears to apply
mostly to technical data, not necessarily personal
medical information. Export Administration
Regulations ("EAR") and International Traffic In Arms
Regulations ("ITAR") control the export of
commodities, software, technical data and other
information to foreign countries. If you send
information abroad in audio files which is covered by
these regulations without the proper export licenses,
you can be fined and go to jail. If
non-U.S.-citizens within the U.S. access this
information, it is also considered an export.
Check with your company or institution to see if your
transcription contains export-controlled information.
- 3. Spoken versus edited text
- In many off-shore transcription services, you are
charged for every word that you speak because your
transcript is a literal, and often inaccurate, copy of
what you say. You or your staff must then spend time
editing that transcript from spoken to written
English. With Type-thing Services,
you are not billed for your redundant words and
comments to our transcribers. We usually reduce
the size of the transcription product you receive
because we edit it as we transcribe! So, you pay less
because there are fewer
words and lines, and you have a higher quality edited
product. This is double the value!
- 4. Quality of the transcript or product
- The most common complaint we've heard from clients
that have tried off-shore services is that the innate
language barriers cause inaccurate transcripts,
grammar is poor, and there are spelling
problems. If the pool of transcriptionists is
large and transitory, your quality may be
variable. This is worsened by U.S. clients who
tend to talk fast, mumble, or have a strong local
U.S. accent. If you don't mind a poor-quality
product, this may not influence your decision.
Just remember that a poor quality product may
influence your total costs now because you have to fix
the product yourself, or it may influence your future
costs should you call upon the transcription in the
future and find it useless. If the transcript is
a form of insurance or mandated record, you may be
found negligent for accepting a poor quality
transcript. If a faulty transcript is used in
the future, it may cause erroneous actions that will
increase your costs. Note that you can get poor
quality from domestic sources too, so this is not just
an off-shore issue. Off-shore sources may be
able to produce high-quality product if they have the
right staff; however, they are having an increasingly
difficult time finding qualified staff.
In addition to the above four items, you might also
consider the following:
- 5. Customer service, responsiveness, flexibility
- If the off-shore services and their domestic front
offices cannot provide you with the customized and
responsive services that make your work efficient,
then that adds to your total costs. If this
doesn't matter to you because you need little customer
service, then off-shore services may be more
attractive. Common complaints we've heard from
clients include problems redressing quality issues,
following up with updates, and corrections.
Because many off-shore services save money by having
large-scale operations, they may also have some
trouble at customizing their process to fit your
- How does Type-thing Services know?
- We receive clients who have not been satisfied by
their experience with off-shore transcription services
for many of the reasons noted above. We have
been contacted by numerous off-shore companies that
have wanted Type-thing
Services to front their services to U.S.
customers. We have seen transcripts produced by
off-shore transcription companies when clients were
not happy with the results. We have called to
understand the utility of using such services
- Does Type-thing Services use off-shore
- Is off-shore labor plentiful?
- Not necessarily. Plentiful qualified labor is
the entire premise for off-shore transcription
companies' ability to maintain low rates and
quality. Recent news articles show that as the
global economy evolves, off-shore markets are
experiencing difficulty in obtaining enough qualified
labor for many technical tasks and service tasks that
require training. Their qualified staff must be
paid more or they move to higher-paying jobs. To
maintain lower rates, they must use less-qualified
labor. The grass is not always greener on the
other side of the ocean.
About digital audio files
With establishment of multimedia computers (audio,
video, etc.) as the norm, more material is being
generated in the form of digital computer files. Digital
hand held dictation devices are now available that
record to a memory card and can generate audio files you
can place on disk or send over the Internet. Type-thing
Services has the ability to convert and
transcribe such files that come in a variety of formats.
We can also generate these files for use on your web
site from your audio or video tape. We'll work
with you to understand what you need for your
application. Part of our service is understanding
these formats and knowing which work well on the web and
Internet. We use multiple methods to make the
smallest possible audio file for your purpose so that
the file can be downloaded or transmitted most
efficiently. See our Web and
Internet Services page for more detail.
These are some of the existing common open formats for
digital audio files:
Windows PCM (WAV)
Microsoft ADPCM (WAV)
MPEG3 FhG (MP3)*
CCITT mu-Law and A-Law (WAV)
MPEG audio (layers I and II)
CD and DVD Audio Disks
Video formats (AVI, MOV, WMV etc.)
* Note that when creating MP3 files for transcription, you
should use the Constant Bit Rate (CBR) method of storing
sound in the file. Variable Bit Rate
(VBR) files cause fits for transcriptionists because the
relationship between time and position in the file is
unpredictable, which makes the
foot pedal backspace feature jump randomly in the file
while transcribing. If you don't know what this
means, don't worry: Type-thing Services can convert VBR
files to CBR for you.
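To illustrate why a constant bit rate makes rewinding predictable, here is a minimal sketch. It assumes that playback software estimates a position in the file by simple proportion, which is a simplification, and it ignores file headers and frame overhead. The function names and the example bit rates are hypothetical and chosen only for illustration.

```python
def cbr_byte_offset(seconds: float, kbps: int) -> int:
    """Byte position of a time point in a constant-bit-rate stream."""
    return int(seconds * kbps * 1000 / 8)


def vbr_time_at_offset(offset: float, pieces: list) -> float:
    """Time actually reached at a byte offset in a variable-bit-rate stream.

    `pieces` is a list of (duration_seconds, kbps) tuples in playback order.
    """
    elapsed = 0.0
    for duration, kbps in pieces:
        size = duration * kbps * 1000 / 8  # bytes used by this piece
        if offset <= size:
            return elapsed + duration * offset / size
        offset -= size
        elapsed += duration
    return elapsed  # the offset is past the end of the stream


# A 20-second VBR recording that averages 64 kbps overall:
# 10 quiet seconds at 32 kbps, then 10 busy seconds at 96 kbps.
pieces = [(10, 32), (10, 96)]
guess = cbr_byte_offset(10, 64)            # where "10 seconds" would be if CBR
landed = vbr_time_at_offset(guess, pieces)  # where that offset really is
```

In this made-up example, a jump aimed at the 10-second mark lands a little past 13 seconds. With a true CBR file the proportional guess and the real position agree, which is why the foot pedal behaves.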
These are some file formats that are proprietary,
particularly used for hand held digital recorders:
Memory Stick Voice (MSV)
Sony Digital Voice File (DVF)
Recorder Sound (ICS)
Olympus (DSS, DS2)
These are multi-track proprietary file formats. They
are typically for courtroom or law-enforcement use, but
have other applications for multi-channel recording as
- FTR Gold by For The Record (ftrgold.com)
- Liberty Court Recorder/Player by High
Criteria, Inc. (highcriteria.com) (DCR)
These are single track or stereo files, but usually more
obscure file formats. What you consider "obscure"
probably has to do with what applications you work with,
so some may think these are common.
raw format (SAM)
ACM waveform (WAV)
CCITT mu-Law and A-Law (WAV)
Dialogic ADPCM (VOX)
IMA/DVI ADPCM (WAV)
Real Audio (RA, RAM, RMM, RM, etc.)
MPEG audio (layers I and II)
Next/Sun CCITT mu-Law, A-Law and PCM (AU)
Raw PCM Data
SampleVision format (SMP)
Sound Blaster voice file (VOC)
DiamondWare Digitized (DWD)
Apple AIFF (PCM encoded data only) (AIF)
We are also able to transcribe audio from any source on
the Internet or World Wide Web, provided that we can access it
with a standard browser or program. See our Web and Internet Services page for
more detail.
Each audio file can have various options that may be
important to dictation and transcription. Typical
options are as follows:
New formats are coming out all the time!
- Tracks: Mono, Stereo,
- The more tracks you have, the more file size is
required. Stereo or multi-track is not
typically useful for transcription unless each track
represents a separate microphone in a different
location. In that case, all the tracks can be
combined for transcription or transcribed
separately. Courtroom recordings often have
four separate tracks (judge, witness box,
- Sample rate
- Sample rates tell you how many times each second
the audio is sampled. Higher rates give
better quality but produce larger files. Lower
rates give lower quality but produce smaller files.
- Sample rates are measured in samples per
second and are typically 6000, 8000, 11025, 22050,
32000, 44100, 48000, 64000, 88200, 96000, and
176400. CD-quality audio is 44100
samples per second.
- The highest audio frequency you can reproduce in a
digital file is at most half the sample rate.
So, at 44100 samples per second, CD audio can
reproduce frequencies up to about 22 kilohertz.
- We recommend that for voice transcription you use
a sample rate of at least 22050 samples per
second. We can transcribe lower sample rates,
but the audio quality decreases with lower sample
rates.
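The half-the-sample-rate rule described in this section is simple arithmetic. The sketch below is only an illustration; the function names are hypothetical.

```python
def max_frequency_hz(sample_rate: int) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate / 2


def min_sample_rate(highest_frequency_hz: float) -> float:
    """Lowest sample rate that can represent a given frequency (twice it)."""
    return 2 * highest_frequency_hz


cd_limit = max_frequency_hz(44100)      # CD-quality audio
voice_limit = max_frequency_hz(22050)   # the rate suggested for transcription
```

Under this rule, the suggested 22050 samples per second can carry frequencies up to 11025 hertz, while 8000 samples per second tops out at 4000 hertz.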
- Some formats of audio permit various degrees of
compression, which makes the file smaller at the
expense of audio quality. Most of the time
audio quality is not impaired, but at extreme
compression it may be affected. These file
formats are known as "lossy" in that they can lose
audio quality. An example format like this is MP3.
- Compression is a trade-off between file size and audio
quality. For dictation, select a level that does
not significantly impair audio quality. MP3
files can compress more with the Variable Bit Rate
(VBR) format, but don't use that because
transcriptionists cannot use that directly.
Instead, use the Constant Bit Rate (CBR) format.
- Sample size (bits)
- Each sample taken typically has a fixed size,
measured in bits. The larger this size,
the more accurately the audio can be reproduced and
the larger the resulting file. The smaller
this size, the less accurate the audio, but smaller
the resulting file size.
- Typical sample sizes are 8-bit, 16-bit, and
32-bit. The most popular size, and the one we suggest
for transcription, is 16-bit.
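Tracks, sample rate, and sample size together determine how large an uncompressed recording will be. The following sketch shows the arithmetic for plain PCM audio (such as an uncompressed WAV file), ignoring the small file header. The function name and its defaults are hypothetical; the defaults simply mirror the mono, 22050 samples per second, 16-bit settings suggested above.

```python
def pcm_size_bytes(seconds: float, sample_rate: int = 22050,
                   bits: int = 16, channels: int = 1) -> int:
    """Size of uncompressed PCM audio data, not counting any file header."""
    return int(seconds * sample_rate * bits * channels / 8)


one_hour = 60 * 60
dictation = pcm_size_bytes(one_hour)                       # mono, 22050, 16-bit
cd_quality = pcm_size_bytes(one_hour, 44100, 16, 2)        # stereo, 44100, 16-bit
low_quality = pcm_size_bytes(one_hour, 8000, 8, 1)         # mono, 8000, 8-bit
```

For one hour of audio this gives roughly 159 million bytes at the suggested dictation settings, about 635 million bytes at CD quality, and about 29 million bytes at the lowest setting shown. Compressed formats will be smaller than these figures.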
Which digital audio files
should I use?
Which files are the best
to use? It depends on
your situation and use of the digital audio file. If
your equipment uses a particular audio file format, you
have limited options.
Which type work on the
Web and Internet?
The use of audio on the web and Internet is evolving. For
transcription, current trends are driven by MP3
players, Apple's iPod, and digital dictation
machines. MP3 and WMA file types seem to be
popular at this time.
Original sound files included the Next/Sun (AU
extension) files and, due to Windows'
popularity, the WAV files. Later formats
like QuickTime and RealAudio showed promise in reducing
the file sizes and added the ability to stream the
audio. Streaming means the audio is played over
your computer's speakers pretty much as it arrives.
Before that, the entire audio file had to be downloaded
before it was played, which was inconvenient for large
files or those that were transmitted in real time.
Now MP3 files are popular for music files and are very
good at compressing audio, as are WAV type TrueSpeech
files. The answer to the question really depends
on what you are trying to do and what resources you have
to provide the audio files to the user. Some
questions to consider:
- How are you going to provide audio files to the
- Will the users be able to work with the audio files
- What bandwidth Internet connection do the users
- Are the files going to be downloaded or streamed?
How do you make the smallest audio files?
This is a fairly technical issue that trades off sound
quality with file size.
- Newer audio file technologies typically make
- Some file formats (or options within a format) can
reduce size. This is compression.
- As the number of samples per second is decreased,
so is the file size (usually).
- As the number of bits of resolution (dynamic range)
per sample decreases, so does the file size (usually).
The process of decreasing the file size can be fairly
complicated, and if not done properly can result in
distorted or noisy audio files.
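The questions about connection bandwidth and about downloading versus streaming can be roughed out with arithmetic. This is a sketch under simplifying assumptions: it treats 1 kbps as 1000 bits per second, ignores network overhead, and uses hypothetical function names and example numbers.

```python
def download_seconds(size_bytes: int, connection_kbps: float) -> float:
    """Approximate time to download a file over a connection of a given speed."""
    return size_bytes * 8 / (connection_kbps * 1000)


def can_stream(audio_kbps: float, connection_kbps: float) -> bool:
    """Streaming only keeps up if the audio needs no more bits per second
    than the connection can deliver."""
    return audio_kbps <= connection_kbps


# A 1,000,000-byte audio file over a 56 kbps dial-up connection:
wait = download_seconds(1_000_000, 56)
```

In this example the download takes a little under two and a half minutes. On the same connection, audio encoded at 32 kbps could be streamed, while audio encoded at 128 kbps could not keep up.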
What should be done to generate good audio files?
The most important thing is to start with good quality
audio -- either digitally recorded or recorded on magnetic
audio or video tape. Just like the guidance provided
above about transcription, good quality recordings are
essential to reducing cost and increasing the quality of
your audio file. Fortunately digital audio files can
be edited and enhanced more easily to produce a better
What can be done with
audio files to edit the recording?
Digital audio files can be easily edited to produce a
good quality finished product. For this discussion,
editing is the simple rearrangement of audio segments
that is analogous to cutting and splicing audio
tape. Some examples are:
- Audio can be easily deleted.
- Audio can be easily moved, copied, or spliced.
- Silence can be added or removed.
- Audio from other sources can be spliced into the
- Multiple tracks can be converted into one track.
At Type-thing Services we clean up the
beginning and end of audio for customers in our
standard fee for generating audio files. Additional
editing is charged on an hourly basis.
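As one illustration of these edits, converting multiple tracks into one track can be done by averaging the samples recorded at the same instant. The sketch below assumes the audio has already been decoded into a list of integer samples stored in interleaved order (left, right, left, right, and so on). The function name is hypothetical, and this is not necessarily how any particular audio editor does it.

```python
def mix_to_mono(samples: list, channels: int = 2) -> list:
    """Average each group of interleaved samples into one mono sample."""
    if channels < 1 or len(samples) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    return [
        sum(samples[i:i + channels]) // channels  # integer average per instant
        for i in range(0, len(samples), channels)
    ]


stereo = [100, 300, -200, 0]   # two instants: (100, 300) and (-200, 0)
mono = mix_to_mono(stereo)
```

Averaging, rather than adding, keeps the mixed signal within the original sample range, so the result cannot clip.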
What can be done with
Audio files to enhance the recording?
Digital audio files can be enhanced either to improve
poor-quality sound or by adding various special effects.
Such services are typically charged at an hourly
rate. Contact Type-thing
Services if you have questions!
- Uneven speaker volumes can be adjusted so low
volume speakers can be heard.
- One speaker can be increased or decreased in volume
to generate a sense of distance or depth.
- Many constant background noises (hum, buzz, noise,
etc.) can be eliminated without distorting the speech.
- A large number of recording studio special effects
can be added to all or parts of the recording.
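The simplest form of volume adjustment is to scale every sample so that the loudest one reaches a chosen level. The sketch below assumes 16-bit samples already decoded into a list of integers. The function name and the default target (about 90% of full scale) are hypothetical. Evening out individual speakers within one recording would require applying a step like this to each speaker's segments separately, which is not shown.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def normalize(samples: list, target_peak: int = 29490) -> list:
    """Scale samples so the loudest one reaches target_peak (16-bit range)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # silence or empty input: nothing to scale
    gain = target_peak / peak
    # Clamp so that an over-large target cannot exceed the 16-bit limits.
    return [max(INT16_MIN, min(INT16_MAX, round(s * gain))) for s in samples]


quiet = [1000, -2000, 500]
louder = normalize(quiet, target_peak=20000)
```

Here the quiet passage is raised by a factor of ten while the relative loudness of its samples is preserved.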
About video files
Video may refer to video tape or electronic video files.
Digital audio can usually be extracted from digital video
files and transcribed as noted above. Video tape
transcription requires making an intermediate audio tape
that can be more easily transcribed. Type-thing
Services has the ability to transcribe the
following formats. Other formats and standards (such as
PAL) can be converted with a slightly longer lead time.
SVHS (Super VHS)
Digital Video Cassette
MPG, MPEG, MP4
Quicktime files (MOV)
AVI (Audio-Video Interleaved) files
DVD (Digital Video Disk)
Other Digital Files: Just about any
About "tapeless," "digital," and "phone-in" dictation
These approaches to dictation and transcription have
become the norm in the industry. "Tapeless" and
"Digital Tapeless" are becoming archaic terms for
digital dictation; both refer to dictation without audio tape. This
could be a hand-held recorder that stores your dictation
in memory modules, or it could be a phone-in dictation
system. These types of devices have essentially
replaced hand held tape recorders.
First-generation digital dictation units (popular types
by Sony, Olympus, etc.) typically produce audio in
proprietary formats that are difficult to convert
without their own proprietary software. Newer
devices coming out after 2009 started to create files in
standard file formats such as MP3, MP4, and WMA.
Type-thing Services prefers you consider phone-in dictation
because of the numerous advantages it offers. See the "Phone-in Dictation" page on
this Web site.
We have the capability to download audio files for
transcription and have also transcribed from voicemail and
other digital transcription services and devices.
About audio tape
sizes and formats
With the advent of digital dictation devices, audio tape
is not used as much for dictation and transcription, yet
tapes continue to be used in various forums and
applications. There are three primary sizes of tapes, all
of which Type-thing Services can
transcribe. In approximate order of popularity they are:
- Micro cassette,
- Regular cassette, and
- Executive cassette.
These can be directly transcribed because transcription
machines are available in these sizes. Other size tapes,
including videotape (VHS, BETA, etc.), can also be
transcribed by Type-thing Services. We
first make copies to one of the three above types. Note
that micro and executive are very close in size but do not
fit in each other's machines. When using regular cassette
tapes for transcription, avoid any longer than T-60 (30
minutes on a side). Longer tapes tend to jam more easily
in the transcription machines, which often start and stop
the tape. Micro and Executive tapes are designed for
transcription and therefore rarely jam.
Shown above are the regular cassette (top), executive
(left), and micro (right) with approximate sizes for
each tape. Micro and Executive cannot be used in the
other's machines. Executive tape dictation systems are
more expensive but provide superior clarity of
Most popular recorders use a single track of mono or
stereo audio. Some of them have two speeds at which you can
record your audio. Recording on the fastest speed
produces higher quality dictation, but provides less
recording time on the tape.
Multiple-track recorders are typically used in settings
that require very accurate transcriptions and have
multiple persons that might speak simultaneously. For
instance, courtroom transcripts are often taken by a
four-track recorder with each person wearing a separate
microphone and recording on a different track of the
tape: judge, two lawyers, witness. Multiple-track
recorders are rare outside of the courtroom setting.
However, they provide superior transcripts because the
transcriber allows one to listen to each track
individually or all tracks at once. Again,
digital dictation systems have primarily replaced
About quantity of
dictation per tape
How much content can fit on a tape? With use of
digital files, a good question is also how much dictation
fits in a minute or hour of audio. See the "About Cost to Transcribe" section above
for more detail.
For tapes, it depends on how fast the person or group
talks, how much quiet time is on the tape, and the tape
capacity (length). We have seen 3000-12000 words per tape,
or 5-50 pages per tape (various length tapes).
Another way to think about this is to consider a
rough average of one page per minute for single-speaker
dictation. A 60-minute tape might have 60 standard
pages. Multiple speakers or fast speakers will
increase this page count. Again, see the "About Cost to Transcribe"
section above for more detail.
Michele Duran Skroch
505-922-1000 Albuquerque NM voice
703-679-TYPE (8973) NoVA Voice
Serving customers across the United States including
Washington D.C., Northern Virginia, to California, and
of course, New Mexico
on domestic and international business.
7 Sep 2013
information on this site.