Type-thing Services
                        - Medical, Business, Legal Transcription

Dictation and Transcription Tips

Better knowledge - better transcripts

This page presents information Type-thing Services has compiled about dictation, transcription, and related Internet, Web, and technology topics. We want to share what we know about creating better recordings, along with other information that will help you produce the transcripts you need.
These are tips primarily for those dictating or recording audio.  We also have Tips for Correct Transcription.

Information on this page is the opinion of Type-thing Services and is not certified in any way to be accurate, free from error, or applicable for your particular use.  If you have other questions or suggestions for other material, please let us know.



About recording quality

How can you ensure the best transcription for your business? Finding a good transcriptionist is one answer; however, the effort to transcribe a tape, the overall quality of the transcript, and the cost of producing the transcript all depend on the quality of the audio file or tape that is generated.

By producing a good recording, you may be able to reduce the cost of your transcription, increase the accuracy of transcription, and reduce the number of "indiscernible" sections on your transcript. At Type-thing Services, we've compiled a list of transcription DOs and DON'Ts that may be of help.

Things to do and things not to do

  • Do: Speak clearly. Speak at a good volume level.
    Don't: Speak in a rushed or hurried voice or mumble. Speak quietly.
  • Do: Have people speak one at a time.
    Don't: Have people talk at once and interrupt each other frequently.
  • Do: For digital or tape recorders, record on fast speed or high quality setting. This makes a clearer recording but uses more memory or tape.
    Don't: Record on slow speed or low quality, which uses less memory or tape but makes a "muddier" sounding recording that takes longer to transcribe and may result in transcription errors.
  • Do: Record in a quiet environment. Be aware of background noise from others, air conditioning, fans, music, and other sources.
    Don't: Record in an environment with lots of background noise like a restaurant, subway, near fans and vents, or a place where others are talking or making noise.
  • Do: In groups of two or more, make sure each person can be heard equally well. Use a recording system with multiple microphones in large groups to ensure you can hear each individual.
    Don't: In groups of two or more, allow some people to be heard well while others are barely audible or not audible at all.
  • Do: Use a microphone near the speaker. If the speaker will move around, use a wired or wireless lapel microphone.
    Don't: Use a stationary microphone and let the speaker move around, creating hard-to-hear sections on the dictation.
  • Do: Only have one microphone or recorder? If possible, have all persons speaking the same distance from the recorder. If that is not possible, place it nearest to the most important part of the conversation or point it in that direction.
    Don't: Place the microphone or recorder near the interviewer so that the recording barely captures the most important part - the interviewee.
  • Do: During question/answer sessions, have people come to a house microphone or bring a wireless microphone to them before they ask their question. Alternatively, have the person answering questions repeat the question so it is captured on the recorded audio.
    Don't: During question/answer sessions, take no measures to record the question. You'll only obtain transcribed answers but be uncertain about the question asked.
  • Do: Use good quality equipment made for the number of people you are recording. Alternatively, if good equipment is not available, use multiple digital or tape recording devices around the room (we will have to listen to each to fill in gaps from the others).
    Don't: Use poorly maintained, low-quality equipment. Use equipment that was designed for recording one person to record a group of people.
  • Do: Keep the recorder going (turned on and recording) well before people talk.
    Don't: Use the "auto-vox" feature that chops off the beginning of people's sentences.
  • Do: In large groups, have each person state their name before talking if they need to be accurately identified. Alternatively, have a note taker make notes each time a person talks, including their name and the first few words that they say. Provide an agenda.
    Don't: Provide no records of a complex recording environment, making it difficult to separate out speakers or threads of conversation.
  • Do: Provide lists of speakers, agendas for meetings, and other references as available to us so that we can create better annotated, ordered transcripts from your audio.
    Don't: Provide nothing but the audio so that you have to edit the speaker identification and order of your transcripts.
  • Do: If saving files to MP3 files on your hand held recorder or computer, use the Constant Bit Rate (CBR) format rather than the Variable Bit Rate (VBR) format.
    Don't: Save MP3 files with Variable Bit Rate (VBR). These cannot be transcribed directly because foot pedal backspacing does not work for transcriptionists, so we will need to convert VBR MP3 files to CBR format.
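
If you would like to check an MP3 before sending it, the sketch below shows one rough way to guess whether a file was saved as VBR. It is a heuristic only, and it is not a tool Type-thing Services provides. It assumes the encoder wrote a "Xing" or "VBRI" marker near the first audio frame, which many encoders do for VBR files (the LAME encoder, for example, writes "Info" instead for CBR files). The function names and the 4096-byte scan window are illustrative choices.

```python
def id3v2_tag_length(data: bytes) -> int:
    """Return the length in bytes of a leading ID3v2 tag, or 0 if there is none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    # Bytes 6-9 hold the tag size as a "syncsafe" integer (7 useful bits per byte).
    size = 0
    for byte in data[6:10]:
        size = (size << 7) | (byte & 0x7F)
    return size + 10  # add the 10-byte tag header itself


def looks_like_vbr(data: bytes, scan_bytes: int = 4096) -> bool:
    """Heuristic: True if a VBR header marker appears near the first audio frame."""
    start = id3v2_tag_length(data)
    window = data[start:start + scan_bytes]
    return b"Xing" in window or b"VBRI" in window
```

To try it, read the file in binary mode and pass the bytes to looks_like_vbr. A False result does not guarantee a CBR file; when in doubt, re-export the recording with a constant bit rate setting.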

Not all of these hints apply to every situation. A single-person transcription rarely has any of these possible problems. Sometimes you cannot avoid background noise or conversations where people interrupt and talk over one another. A good transcriptionist can help some of these situations; however, they cannot perform miracles. When you are recording important information, especially for group discussions, it pays to invest in a good conference microphone set and recording system.

Type-thing Services can work with you or the facility in which you will record your audio to make sure it is the best it can be for transcription.  Contact us for more help!

Transcribing essential, but poor-quality dictation

In some cases you may wish to transcribe poor-quality dictation because the content is essential. Type-thing Services will review each case individually to let you know what we can do to provide the best quality transcription possible. Poor-quality dictation includes audio that is noisy or muffled, has simultaneous overlapping conversations, or has two or more speakers recorded at greatly different volumes. In some cases we are able to digitize and enhance the audio to remove noise or clarify the speakers. See information on this page about digital audio files.

In such cases we will work to understand how many "indiscernible" sections are permissible.  This work is billed at an hourly rate depending on the services needed.  Rush service for poor-quality tapes, if available, is billed at rates higher than normal rush because of the large effort and likely transcriber fatigue involved.

In large jobs where we encounter a poor-quality tape, we will often choose to not transcribe the particular tape until the customer is contacted for guidance.  In rare instances we may refuse to transcribe very poor audio because of the likely low quality of the resulting transcription or fatigue on the transcriptionist.


Options for transcription

There are several considerations for customizing your transcripts. Here are a few common options you can discuss with Type-thing Services.

Indiscernibles: Type-thing Services normally marks parts of the transcription which cannot be heard or are uncertain as "[indiscernible]." We will typically go back three times to try to understand such conversation, after which we mark the words we could not hear as "[indiscernible]." Type-thing Services does this as a compromise in order to reduce your transcription costs. If transcription of these hard-to-hear sections is of importance to you, we can spend more time with the section or review the tape a second time.

Note that we do not use the term "Inaudible," which means you can't hear anything. We can hear; we just can't make out the words; hence, indiscernible is the correct usage.

Under extreme cases where hard-to-hear dictation must be recovered, we can digitize and filter the audio to obtain the best possible material for transcription. For example, we can subtract out the local background noise in the recording so voices can be heard. However, this requires an additional charge for recovering the speech audio.

Verbatim or edited:
Whether you choose a verbatim transcript or one edited to written English depends on your use of the transcript. See "Grammar" below. You'll find many definitions of "verbatim" and "edited" transcripts, so we'll define what we mean here.
Verbatim transcript:
  • Includes false starts, repeated words, stutters.
  • Does not include "ums," "ahs."
  • Does not correct grammar.
Edited transcript:
  • Conversion of spoken English to written English.
  • Correction of grammar.
  • Does not include false starts, repeated words, or stutters.
  • Does not include "ums," "ahs."
Let us know what you need so your transcript is useful for your purposes.
Guessing: Type-thing Services will not guess the words that may have been said in a hard-to-hear section of your audio. We will use context of the conversation to help understand these sections, but we will not guess at what has been said.
Grammar: If requested, we can correct grammar as we transcribe.  This is an extension of "Verbatim or Edited" noted above. Quite often spoken English does not work well in written form or the speaker may have certain grammatical errors in addition to "ums" and "ahs" which are often used in speech. Let us know if you wish to have transcription that is verbatim or corrected for grammar. This choice is often dependent upon the final use of the transcription. 
Format: Please let us know if the layout format of your transcription is important to you. If you're not sure about this, we can suggest several formats based upon the number of speakers and purpose of the final transcription. Options include paper printout margins and how multiple speakers are identified. Most options will not affect the cost of transcription. We will identify extra costs upon your request for special formatting. 

How we provide transcription to you is also an option. We can provide printouts, provide electronic files on disk, email your files, or post them to our web site so that you can download them from the Internet (in a way others cannot access). When providing electronic files, there are numerous file formats that we can provide. Type-thing Services uses the Microsoft Office suite of products; however, we can provide formats such as OpenOffice, WordPerfect, Macintosh files, text files, and many others. In addition, today's word processor programs are usually able to read many different formats, so file format is usually not a problem.

About cost to transcribe

Determining your likely cost for transcription can be confusing. You may be faced with costs based upon amount of material ($/word, $/page, $/line, etc.), time to transcribe ($/transcriptionist-hour, $/minute-audio), piece rate ($/tape, $/CD), or bid by the job as requested by the customer. Different vendors may not provide the same costing method. Technical content, amount of editing, and quality of recording are also factors for a transcriptionist. Type-thing Services will consider all of these methods and help you understand what will work best for you; however, we may only bid a job in a particular method depending on the material to be transcribed and the consistency of the material on the tapes or audio files.

See our rates page for Type-thing Services' specific rate structure.

Here is some information about cost to transcribe that you may wish to consider.

Audio length & time to transcribe

How long does it take to transcribe a tape or file? Typically it can take from two to six times the length of the audio to transcribe. This large range depends on the type of material, how fast people talk, clarity of the audio, number of speakers, and clarity of the speakers. Most of the work that Type-thing Services has performed has taken 1.5 to three times the length of the audio. Single-speaker or interview transcription with clear audio takes the least amount of time.

This information may be useful if you choose to pay your transcriptionist per hour of labor. If you choose to do this, realize that a seasoned transcriptionist who is a fast typist will produce more per hour of labor.

Audio length & transcript pages

A rough conversion between pages and time is one standard page per minute of single-speaker audio. A standard page for most transcription companies is 22 lines of 65 mono-spaced characters across. Type-thing currently provides 25 lines per page, which saves you about 12 percent on your transcription bill. Single-speaker presentations or interviews may often be less than one page per minute. Group dialogue or fast-paced dialogue is usually more than one page per minute of audio.
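
To see where the 12 percent figure comes from, here is a minimal sketch of the arithmetic. It assumes the same per-page rate applies to both page sizes and that pages are billed fractionally; the function names and the 550-line transcript are illustrative only.

```python
STANDARD_LINES_PER_PAGE = 22    # common industry page, per the text above
TYPE_THING_LINES_PER_PAGE = 25  # Type-thing page, per the text above


def pages_billed(total_lines: float, lines_per_page: int) -> float:
    """Number of pages a transcript occupies at a given page length."""
    return total_lines / lines_per_page


def page_savings(total_lines: float) -> float:
    """Fraction of pages saved by the 25-line page versus the 22-line page."""
    standard = pages_billed(total_lines, STANDARD_LINES_PER_PAGE)
    longer = pages_billed(total_lines, TYPE_THING_LINES_PER_PAGE)
    return 1 - longer / standard


lines = 550  # illustrative transcript length
print(pages_billed(lines, STANDARD_LINES_PER_PAGE))    # 25.0
print(pages_billed(lines, TYPE_THING_LINES_PER_PAGE))  # 22.0
print(round(page_savings(lines) * 100))                # 12
```

The saving is the same for any transcript length because it depends only on the ratio 22/25.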

Technical content

The cost to transcribe is also dependent on the amount of technical knowledge or editing required for transcription. Medical transcription typically costs a bit more because of the additional skill, tools, and references needed to ensure an accurate and usable transcript or medical note. Why is this?  Because a knowledgeable transcriber will produce a higher-quality transcript that requires less of your valuable time to edit or correct.

In a business transcript, the cost will be higher if the customer requires extensive grammatical corrections.  Costs may also be related to the amount of time required to service your staff with inquiries, special requests, and "stat" or rapid turn-around requests.

While legal transcripts of interviews are often transcribed like business transcription ($/min), court proceedings tend to have a wide variability in transcribed content.  Therefore we prefer to charge in $/page for this because it better reflects the actual work and amount of transcription involved.

Verbatim transcript & edited transcript

Often times this is a hidden cost or savings!  Many types of documents (reports, letters) are edited as Type-thing Services transcribes, so you're not receiving a verbatim transcript. You're obtaining a finished product or a document that requires less editing. This saves you money because you are not charged for the words, lines, or pages that are edited out of your transcript.  You are also receiving editing in the cost of the transcript.  Of course, you may want verbatim transcripts for some applications (interviews, legal proceedings, classes, podcasts, etc.), and Type-thing Services can provide these.

Note that some transcription companies only provide you a verbatim transcript.  In this case you're paying a lot more for what you're getting.  Why?  You're paying for content in your transcript that will be eventually edited out.  You also have to pay an editor or edit yourself, which is an additional cost.  Since Type-thing Services transcriptionists have secretarial skills, they save you these costs.

See Options for Transcription section above for more about this topic.

Cost per line or word

When considering $/word versus $/line, make sure you know the definition of a line. Usually a line has nothing to do with how many lines you have in your format or printout. It is often defined as a certain number of characters (usually 65) across (which assumes a mono-spaced font). Type-thing Services typically provides a $/line cost for some medical transcription and $/page cost for interviews because this has been standard in the industry; however, we can convert our estimates to other measures if requested.

When is cost per line or word a good deal? Note that some formatted documents with many short lines could cost significantly more on a $/page, $/line, or $/minute rate rather than a $/word rate. Be careful paying a line rate for such documents because you'll be paying a lot for empty space. In these types of documents we suggest a $/word rate. Also note that Type-thing Services often edits documents as they are typed for business and medical transcription. This process significantly reduces the actual content (lines and words) so that you're not paying for a verbatim transcript that you have to edit. This is a hidden savings at Type-thing Services and a hidden cost in verbatim transcripts you might find elsewhere.

A cost per line or word is easy to verify.  Most word processors can count words or lines, which lets you audit your billing.
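
If you would rather audit a transcript outside a word processor, the sketch below shows one way to do it. It assumes a billable line is 65 characters including spaces, as described above, and that line-break characters are not billed; confirm your vendor's exact definition before relying on the numbers. The function names and any rates you pass in are illustrative.

```python
CHARS_PER_STANDARD_LINE = 65  # includes spaces, per the definition above


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_standard_lines(text: str, chars_per_line: int = CHARS_PER_STANDARD_LINE) -> float:
    """Count billable lines as total characters divided by the line width.

    Assumption: spaces are billed, line-break characters are not.
    """
    total_chars = sum(len(line) for line in text.splitlines())
    return total_chars / chars_per_line


def expected_charge(text: str, rate: float, unit: str = "line") -> float:
    """Return the charge implied by a per-line or per-word rate."""
    if unit == "line":
        quantity = count_standard_lines(text)
    elif unit == "word":
        quantity = count_words(text)
    else:
        raise ValueError("unit must be 'line' or 'word'")
    return quantity * rate
```

Comparing expected_charge for your transcript text against the invoice total gives a quick sanity check; small differences usually come down to how headers, blank lines, or partial lines are counted.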

Cost per minute

It is becoming more popular on discount transcription company websites to provide a $/minute rate for transcription.  Cost per minute is attractive because you can quickly determine your cost for transcription based upon the audio you have in hand - even before you send it to the transcriptionist.  Be careful of this convenience because it may actually cost you a lot more in the end.  Why is this?  For instance, if your audio has a slow speaker, a flat rate per minute will likely cost you more than a reasonable $/page rate.
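
As a rough illustration of the slow-speaker case, the sketch below compares the two billing methods for the same audio. Both rates are hypothetical numbers chosen for the example and are not Type-thing Services rates; the 0.5 pages per audio minute figure is the low end this page gives for slow speakers.

```python
def cost_by_minute(audio_minutes: float, rate_per_minute: float) -> float:
    """Charge when billing a flat rate per minute of audio."""
    return audio_minutes * rate_per_minute


def cost_by_page(audio_minutes: float, pages_per_minute: float, rate_per_page: float) -> float:
    """Charge when billing per transcribed page."""
    return audio_minutes * pages_per_minute * rate_per_page


def break_even_pages_per_minute(rate_per_minute: float, rate_per_page: float) -> float:
    """Speaking pace at which both billing methods cost the same."""
    return rate_per_minute / rate_per_page


AUDIO_MINUTES = 60
RATE_PER_MINUTE = 2.00          # hypothetical rate, for illustration only
RATE_PER_PAGE = 3.00            # hypothetical rate, for illustration only
SLOW_PAGES_PER_MINUTE = 0.5     # slow speaker, per this page's conversion figures

print(cost_by_minute(AUDIO_MINUTES, RATE_PER_MINUTE))                      # 120.0
print(cost_by_page(AUDIO_MINUTES, SLOW_PAGES_PER_MINUTE, RATE_PER_PAGE))   # 90.0
```

With these assumed rates, any speaker producing fewer pages per minute than the break-even pace pays less on a per-page basis, and any faster speaker pays less per minute.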

Cost per minute may make more sense for verbatim transcripts than it does for work that you wanted edited on the fly by the transcriptionist.  Consider why you would want to pay for minutes of audio that are going to be edited out later.

Conversion summary

For convenience, here are a few conversion factors you might be able to use to help make sense of cost to transcribe your work.  These numbers are average and exact numbers may be different for your documents.

Pages (standard page*) & audio length
  • 0.5 to 0.75 pages / audio minute for slow speakers, some single-speaker presentations
  • 1 page / audio minute for  single speaker regular dictation
  • 2 to 4 pages / audio minute for multiple speakers or fast-paced dialogue
Lines per page (standard page*)
  • 22 lines / page  (normal)
  • 25 lines / page (Type-thing - 12 percent savings)
Hours labor to transcribe audio
  • Usually 1.5 work hours per audio hour to 3 work hours per audio hour
  • Up to 6 work hours per audio hour for difficult audio (that requires listening multiple times)
Words per line (standard line of 65 characters including spaces)
  • About 8 to 9 average words per line depending on number of blank lines in document
Characters per word
  • About 5.5 to 6 characters/word on average, depends on word complexity
*Standard Page
  • 65 characters (including spaces) per line (usually 12-point mono-spaced font)
  • 22 lines per page (Type-thing uses 25 lines per page - 12 percent savings)
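
The conversion factors above can be combined into a quick estimator. The sketch below is only a convenience built from the averages listed in this summary; the function name, the pace labels ("slow", "regular", "group"), and the dictionary layout are illustrative, and real documents will vary.

```python
# Average conversion factors copied from the summary above.
PAGES_PER_AUDIO_MINUTE = {
    "slow": (0.5, 0.75),    # slow speakers, some single-speaker presentations
    "regular": (1.0, 1.0),  # single-speaker regular dictation
    "group": (2.0, 4.0),    # multiple speakers or fast-paced dialogue
}
LINES_PER_STANDARD_PAGE = 22
WORDS_PER_LINE = (8, 9)
LABOR_HOURS_PER_AUDIO_HOUR = (1.5, 3.0)  # up to 6 for difficult audio


def estimate_transcript(audio_minutes: float, pace: str = "regular") -> dict:
    """Return (low, high) estimates of pages, lines, words, and labor hours."""
    low_ppm, high_ppm = PAGES_PER_AUDIO_MINUTE[pace]
    pages = (audio_minutes * low_ppm, audio_minutes * high_ppm)
    lines = (pages[0] * LINES_PER_STANDARD_PAGE, pages[1] * LINES_PER_STANDARD_PAGE)
    words = (lines[0] * WORDS_PER_LINE[0], lines[1] * WORDS_PER_LINE[1])
    audio_hours = audio_minutes / 60
    labor_hours = (
        audio_hours * LABOR_HOURS_PER_AUDIO_HOUR[0],
        audio_hours * LABOR_HOURS_PER_AUDIO_HOUR[1],
    )
    return {"pages": pages, "lines": lines, "words": words, "labor_hours": labor_hours}


print(estimate_transcript(60, "regular"))
```

For one hour of regular single-speaker dictation this gives about 60 standard pages, 1,320 lines, roughly 10,500 to 11,900 words, and 1.5 to 3 hours of labor.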

Rush or expedited transcription

Another factor in cost to transcribe is turn-around on a rush basis. While Type-thing Services can offer standard rates for some rush or 24- to 48-hour turn-around work, that is reserved for limited volume from regular or high-volume customers. When you have an urgent unscheduled effort that is a rush, the cost for transcription increases over regular fees. Type-thing Services offers rush service in hours to days depending on your need and our capacity. See our rates page for standard costs.

Be careful of transcription companies that offer unlimited rush job work because you're taking a gamble. Your work has a strong possibility of being sent overseas and transcript quality may be inferior.  We've seen this happen to many customers that come to us after having disappointing results.  If you want to try this approach, send the companies something when you're really not in a rush just to test out their quality.

Finding the best cost to transcribe

What can you do to make up your mind?  Contact Type-thing Services, and we'll let you know the best way to cost your project given the type of audio and your objectives.   If you'd like to compare across vendors, you can always provide a sample of audio and possibly finished transcript to each and ask each vendor how much that item would cost to transcribe given a particular volume of work.

Why not use computer dictation, speech-to-text programs?

Sometimes we are asked why a person considering transcription should not simply use one of the new and improving programs/systems for computers that type while you talk or convert dictation to transcripts. These programs recognize your speech as you talk into a microphone and type what you say into a document. The promise is that one would save much money in transcription costs. In fact, a number of companies have sprung up and have marketed specific systems for the medical and legal communities, particularly around Electronic Medical Records (EMR). In addition, you will find that some EMR providers put digital dictation through a speech recognition system before it is given to a transcriptionist working in a role called a "speech recognition editor." You may also find that your phone voicemail can be emailed to you with a textual transcription generated by a computer.

The quick answer:

Most individual professionals should not yet use speech-to-text. Not only are these programs not yet accurate enough, you must spend more time on three things: (1) You usually must talk slower and spend time training the system. (2) You have to correct the errors the program makes. (3) You have to be your own secretary - correcting the errors you make and knowing proper grammar, spelling, formatting, etc. Most times spoken English is not as it will be written. Transcriptionists take care of this for you.

Reality creeps in where large organizations are requiring staff, due to hope of lowering costs, to utilize speech recognition systems.  Often this is tied to a move towards electronic medical records where the EMR company bundles speech recognition as an "extra added value."  This is a ticking timebomb where errors in the transcription may cause problems in medical care.  Type-thing can provide Speech Recognition Editing to assist in this process.

The longer answer is that choices should be a matter of cost and convenience. If the total cost to the dictator or organization is less using such speech-to-text systems, then they should use them. Our experience is that these systems are not yet sophisticated enough to pay for themselves, and may actually cost professionals and hospitals more due to their ongoing time investment. There is no doubt that in the future these systems will be excellent, but for practical dictation, these systems have a long way to go. In a number of cases the hasty move to an immature technology lines the technology company's pockets at the expense of the medical community. We have seen and learned of examples where professionals take more time with this technology and see fewer patients as well as have lower quality notes. Here's what we think:

Type-thing Services does not use speech recognition for its transcription work. Why? Because even if the accuracy of the process were fairly high (and it never is - we explore it periodically), we would have to listen to the whole audio to verify the transcript was correct. On top of that, we would have to edit your spoken word to something fitting for your needs. We often do that on the fly while listening to dictation. All of this editing of a recognition transcript can take longer than just doing it from scratch.

You might also be interested in a well-referenced article, "Rest in Peas: The Unrecognized Death of Speech Recognition," by Robert Fortner, that argues there are multiple barriers for good-accuracy speech-to-text systems. It reports that accuracy has not improved in speech recognition software since about the year 1999. For limited uses like dialing a cell phone it might work, but for transcripts there are problems.

Studies have shown that speech recognition plus a human editor is less efficient than a traditional transcriptionist (e.g., "Speech Recognition as a Transcription Aid: A Randomized Comparison With Standard Transcription," J Am Med Inform Assoc. 2003 Jan-Feb; 10(1): 85-93). That document concludes "speech recognition did not improve the productivity of secretaries or transcriptionists." In fact, this study said there was a loss in that speech recognition reduced efficiencies of medical transcription to 87 percent of a transcriptionist alone! On the other hand, technology is likely to improve. But when?

When making a decision consider the following points:

Costs more overall
Why should a highly-paid professional spend time sitting in front of a computer editing their text, continually retraining the program for new words and names? Time is money and, at least today, speech-to-text programs seem to take time away from the professional. Some systems allow you to talk into a dictation machine; however, you must still worry about the points below.
One option being used is to employ a "speech recognition editor" or "voice recognition editor" to listen to the audio while editing the text from the speech recognition system.  Type-thing can provide a speech recognition editing service as needed; however, the value of the recognition engine is dubious.  If the engine is not excellent and the dictator cooperative, it can take as much or more time to edit the text as transcribing the original audio.
Talk clearly
You must talk clearly and enunciate each word. The programs are getting better, but you cannot slur your speech, talk extremely fast, etc. The surrounding must be quiet, not a noisy room, lobby, or car. Multiple people cannot be talking around you or in the dictation.

Do you speak like you write?  Quality transcription is more than speech recognition!
Transcription involves punctuation, grammar, and formatting at the least. Spoken English is vastly different from written English. You may be surprised at how unstructured spoken English appears when typed. When requested, we regularly correct our clients' spoken English into professional-looking written transcripts. With speech-to-text programs, you will have to train yourself to speak in written English form. Some EMR systems require the medical professional to call out punctuation! This is not the only problem. Many of our clients do not always speak in an ordered linear format from beginning to end. Part way through a dictation, they will remember something that should be inserted elsewhere in the final product. With a human transcriptionist, you only have to give direction and the content for this to be accommodated. With a speech-to-text program, you end up spending more time.

Are you a secretary?
Once you have your transcript in the computer, do you know all the rules of grammar, spelling, formatting, etc.?  If you do, great. Now waste your valuable time performing such an administrative function. Is your staff going to edit the transcript? If so, great. Do they have good secretarial skills that will produce letters, reports, and documents that present the professionalism you need?

Train, train, train...
Although these programs are getting better all the time, they are not yet like the science fiction portrayals of computers recognizing you talk. The programs today cannot recognize speech from every person; they must be "trained." Even after they are "trained," they will make occasional errors, and will almost always not understand uncommon words, new words, or new names. You must train them at least once each time such words arise.

Not for groups or poor quality recordings
Even if the computer can understand one person speaking clearly, it cannot yet even attempt to untangle a transcript of multiple speakers, sometimes talking at once, often in noisy conditions, some talking quietly, some talking loudly.
One day computer technology will allow natural speech transcription, editing, grammar checking, etc., at least for an individual speaker, and at least for practical dictation. That day does not appear to be here. However, if you like new technology, go out and purchase the relatively inexpensive off-the-shelf software ($100 to $400) and try it.  If you are considering one of the medical or legal systems being offered by new companies, try before you buy. We've known several practices that have tried these systems, only to come back to a professional transcriptionist.

How do we know? Not only do we talk to a lot of people requiring transcription and considering speech-to-text programs, we have tried this software ourselves as a way to increase our productivity.  We have seen multiple medical professionals and hospitals be duped into a technology that is not yet ready.

Transcription and Electronic Medical Records (EMR)

Electronic Medical Records are electronic, computer-based records for tracking care from medical providers. They are a good thing--offering the promise of each person's medical record being available instantly to the providers of their choice when they need care, as opposed to paper files that take days or weeks to transfer. There are many benefits, but other issues exist in this transformation, such as privacy concerns and a number of problems around medical transcription.

Should we eliminate transcription with the advent of EMR?

The quick answer is that it seems from current press that manual data entry (from lists or by typing) of care providers can take more time than transcription and reduce the quantity of patients that can be seen.  There is no need to eliminate transcription.  Transcribers can easily provide electronic forms of transcription in any needed format.  The question is whether or not EMR manufacturers will allow that to happen--integrate independent transcription with their EMR systems.

Some providers have chosen to take a giant leap and move completely away from transcription. This is likely due to the EMR system developers trying to cash in on as many "extras" as they can in the EMR process, and promise increased efficiency along the way. Some benefits exist, but from our observation, many of these EMR providers may have little incentive to easily accept transcription from external transcribers. One method used is a voice recognition system. This is unlikely to work well alone, but may be better with Speech Recognition Editing. Another is a system where medical staff have to type some information and use check boxes to fill out the rest. It may work in some disciplines; however, we have seen this approach cause an extreme decrease in the efficiency and morale of medical practitioners forced to use it. In such cases medical practitioners need to demand ways to incorporate external transcription into the EMR system. It's not that hard.

How can Type-thing Services transcription be incorporated into EMR?

Type-thing Services will work with you as needed to submit your transcription in electronic format to your EMR system. Because EMR is new, it has been a wild west frontier environment where there are many different systems. Contact us and we'll work with you to find out how we can customize and integrate with your EMR process, or we'll tell you if it's not possible. We are able to provide HL7 CDA (Clinical Document Architecture) format information. We can also arrange to have secure remote access to your EMR and insert transcription directly. We can provide Speech Recognition Editing to refine your voice recognition files.

Is 24-hour Turnaround Possible or Reliable?

While 24-hour turnaround, or 24-hour transcription, is possible and can be reliable, be careful about what you're obtaining, especially when starting a new relationship with a transcription company.

Every quality transcription company has limited staff to transcribe your materials.  Type-thing Services offers 24-hour to 48-hour turn around to regular customers for which we can schedule or set aside transcription time for their regular work flow.  In addition, we offer such service at rush rates (hours to days with volume) to others only if we know we can meet that customer's deadline.

From our experience, claims of unlimited 24-hour service often indicate that your transcription is being sent overseas, which has a number of implications.  See the "Is off-shore transcription worth it?" section for other opinions on that topic.

When you obtain a commitment for 24-hour turn-around, remember that the agreement should be dependent upon your regular work volume.  If you send in a week's worth of your dictation all at once, that surge may cause delays beyond the agreed-upon delivery time-frame.  Call Type-thing Services to understand what turn-around is possible on your work.


Is off-shore transcription worth it?

It's not a secret that, as with many industries, off-shore competition has moved in to challenge providers of transcription services in the United States.  Is it worth it to you?  While this answer is something you'll have to decide for your situation, make sure you're looking at savings in the bottom line cost of your transcription solution.  The following information may be of use in considering this option.

First of all, do you even know when your transcription is being sent off-shore?  Many of these companies have purchased domestic companies, ".com" web sites, or established offices in the U.S., but still send the work abroad.  You may interact with a U.S. citizen and call a U.S. phone number, yet your dictation is sent outside of the United States.  Make sure to ask where your work will be done, and in general by whom.

Total cost for your transcription is likely related to these six items.  Some items may be more important to you than others depending on your business needs.

  1. Raw cost to transcribe ($/line, $/page, etc.)
  2. Risk - Privacy of information sent outside the United States
  3. Spoken versus edited text
  4. Quality of the transcript or product (ability to use the product)
  5. Customer service, responsiveness and flexibility to your needs
  6. Specific off-shore issues (turn-around, privacy, security, export-control regulations)
Here is a bit more information on each of these items.  Note that with the exception of the off-shore-specific issues (items 2 and 6), the act of sending work abroad IS NOT the inherent problem--it is the quality of service you receive.  You may be able to find a quality off-shore provider that lowers your total cost; however, our experience has shown that this is not often the case.
1. Raw cost to transcribe
Because raw cost to transcribe is the initial attractive feature of off-shore services, your initial raw cost should be less.  You should understand that raw cost is not your total cost.  Consider the total cost in your decision.  Total cost may also be affected by the items that follow.
2. Risk - Privacy of information sent outside the United States
The basic fact that your work is sent off-shore may be an issue you've not considered.  One positive issue is that if you require quick turn-around, those working on the opposite side of the globe can transcribe while you sleep, so your work may be ready the next morning, in less than 24 hours.  There are a number of potential negative issues:
  • When you send your audio and resulting transcripts outside of the United States (with or without your knowledge), you are sending it to locations not covered by United States law.   If your information is private or covered by a number of laws to which you are held accountable, can you be sure you've performed due diligence in protecting that information?  If that information is disclosed, can you obtain damages from a company in a remote country, one you may not even be able to identify?
  • Is the process to send the work abroad such that it meets your security and privacy needs?  Company proprietary information or health information (HIPAA) could be compromised.  It is not just the transmission of your audio that should be secure, but there should also be assurance that the companies and individuals abroad can maintain privacy and security.  Their networks, computers, and facilities should be as secure as domestic providers.  A number of instances in the press have shown that security abroad is an issue.  Even if they have excellent computer and information security, the people working there are under foreign government influence and different rules.  If something does go wrong, how are you going to take action against an off-shore company?
  • A critical problem to consider is export-control regulation.  This appears to apply mostly to technical data, not necessarily personal medical information.  Export Administration Regulations ("EAR") and International Traffic In Arms Regulations ("ITAR") control the export of commodities, software, technical data and other information to foreign countries.  If you send information abroad in audio files which is covered by these regulations without the proper export licenses, you can be fined and go to jail.  If non-U.S.-citizens within the U.S. access this information, it is also considered an export.  Check with your company or institution to see if your transcription contains export-controlled information.
3. Spoken versus edited text
In many off-shore transcription services, you are charged for every word that you speak because your transcript is a literal, and often inaccurate, copy of what you say.  You or your staff must then spend time editing that transcript from spoken to written English.  With Type-thing Services, you are not billed for your redundant words and comments to our transcribers.  We usually reduce the size of the transcription product you receive because we edit it as we transcribe!  So, you pay less because there are fewer words and lines, and you have a higher quality edited product.  This is double the value!

4. Quality of the transcript or product
The most common complaint we've heard from clients that have tried off-shore services is that the innate language barriers cause inaccurate transcripts, grammar is poor, and there are spelling problems.  If the pool of transcriptionists is large and transitory, your quality may be variable.  This is worsened by U.S. clients that tend to talk fast, mumble, or have a strong local U.S. accent.  If you don't mind a poor-quality product, this may not influence your decision.  Just remember that a poor-quality product may influence your total costs now because you have to fix the product yourself, or it may influence your future costs should you call upon the transcription in the future and find it useless.  If the transcript is a form of insurance or mandated record, you may be found negligent for accepting a poor-quality transcript.  If a faulty transcript is used in the future, it may cause erroneous actions that will increase your costs.  Note that you can get poor quality from domestic sources too, so this is not just an off-shore issue.  Off-shore sources may be able to produce a high-quality product if they have the right staff; however, they are having an increasingly difficult time finding qualified staff.
5. Customer service, responsiveness, flexibility
If the off-shore services and their domestic front offices cannot provide you with the customized and responsive services that make your work efficient, then that adds to your total costs.  If this doesn't matter to you because you need little customer service, then off-shore services may be more attractive.  Common complaints we've heard from clients include problems redressing quality issues and following up with updates and corrections.  Because many off-shore services save money by having large-scale operations, they may also have some trouble customizing their process to fit your business needs.
In addition to the above items, you might also consider the following:
How does Type-thing Services know?
We receive clients who have not been satisfied by their experience with off-shore transcription services for many of the reasons noted above.  We have been contacted by numerous off-shore companies that have wanted Type-thing Services to front their services to U.S. customers. We have seen transcripts produced by off-shore transcription companies when clients were not happy with the results.  We have called to understand the utility of using such services ourselves.
Does Type-thing Services use off-shore transcription services?
No.   All our work is performed in the U.S.A. by U.S. citizens.  Almost all of our work is performed near our location so that we know and can interact personally with our transcriptionists.  Quality is an essential element of the product Type-thing Services provides.
Is off-shore labor plentiful?
Not necessarily.  Plentiful qualified labor is the entire premise for off-shore transcription companies' ability to maintain low rates and quality.  Recent news articles show that as the global economy evolves, off-shore markets are experiencing difficulty in obtaining enough qualified labor for many technical tasks and service tasks that require training.  Their qualified staff must be paid more or they move to higher-paying jobs.  To maintain lower rates, they must use less-qualified labor.  The grass is not always greener on the other side of the ocean.

About digital audio files

With establishment of multimedia computers (audio, video, etc.) as the norm, more material is being generated in the form of digital computer files. Digital hand held dictation devices are now available that record to a memory card and can generate audio files you can place on disk or send over the Internet. Type-thing Services has the ability to convert and transcribe such files that come in a variety of formats.

We can also generate these files for use on your web site from your audio or video tape.  We'll work with you to understand what you need for your application.  Part of our service is understanding these formats and knowing which work well on the web and Internet.  We use multiple methods to make the smallest possible audio file for your purpose so that the file can be downloaded or transmitted most efficiently.  See our Web and Internet Services page for more detail.

These are some of the existing common open formats for digital audio files: 
Windows WMA
Windows PCM (WAV)
Microsoft ADPCM (WAV)
MPEG3 FhG (MP3)*
MP4, M4A
CCITT mu-Law and A-Law (WAV)
MPEG audio (layers I and II)
CD and DVD Audio Disks
Video formats (AVI, MOV, WMV etc.)

* Note that when creating MP3 files for transcription, you should use the Constant Bit Rate (CBR) method of storing sound in the file.  Use of the Variable Bit Rate (VBR) method will cause fits for transcriptionists because time is compressed in unpredictable ways that will cause their foot pedal backspace feature to jump randomly in the file while transcribing.  If you don't know what this means, don't worry, Type-thing Services can convert VBR files to CBR files for you.
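
To illustrate why a constant bit rate suits pedal-driven playback: in a CBR file every second of audio takes the same number of bytes, so a player can find a time position by simple proportion.  The sketch below shows that arithmetic.  The function name, the 64 kbps rate, and the 3-second backspace are illustrative assumptions, not the settings of any particular transcription player, and the calculation ignores file headers and frame boundaries.

```python
def cbr_byte_offset(seconds: float, bitrate_kbps: int) -> int:
    """Approximate byte position of a time offset in a constant-bit-rate stream.

    Assumes every second of audio occupies the same number of bytes and
    ignores any header or tag data at the start of the file.
    """
    bytes_per_second = bitrate_kbps * 1000 // 8
    return int(seconds * bytes_per_second)


# Hypothetical example: a 64 kbps CBR recording, currently at the 2-minute
# mark, with the foot pedal set to back up 3 seconds.
position = cbr_byte_offset(120, 64)
backspaced = cbr_byte_offset(120 - 3, 64)
print(position, backspaced, position - backspaced)
```

With VBR the number of bytes per second changes with the content, so no fixed proportion like this exists and a player has to estimate the position.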

These are some file formats that are proprietary, particularly used for hand held digital recorders:
Sony Memory Stick Voice  (MSV)
Sony Digital Voice File (DVF)
Sony IC Recorder Sound (ICS)
Olympus (DSS, DS2)

These are multi-track proprietary file formats.  They are typically for courtroom or law-enforcement use, but have other applications for multi-channel recording as well.
  • FTR Gold by For The Record (ftrgold.com) (FTR)
  • Liberty Court Recorder/Player by High Criteria, Inc. (highcriteria.com) (DCR)

These are single track or stereo files, but usually more obscure file formats.  What you consider "obscure" probably has to do with what applications you work with, so some may think these are common.
8-bit signed raw format (SAM) 
ACM waveform (WAV) 
CCITT mu-Law and A-Law (WAV) 
Dialogic ADPCM (VOX) 
Real Audio (RA, RAM, RMM, RM, etc.) 
MPEG audio (layers I and II) 
Next/Sun CCITT mu-Law, A-Law and PCM (AU) 
Apple Quicktime
Raw PCM Data
SampleVision format (SMP) 
Sound Blaster voice file (VOC) 
TrueSpeech (WAV)
DiamondWare Digitized (DWD)
Apple AIFF (PCM encoded data only) (AIF)

We are also able to transcribe audio from any source on the Internet or World Wide Web given that we can access it with a standard browser or program.  See our Web and Internet Services page for more information.

Each audio file can have various options that may be important to dictation and transcription.  Typical options are as follows:
  • Tracks: Mono, Stereo, Multi-track
    • The more tracks you have, the more file size is required.  Stereo or multi-track is not typically useful for transcription unless each track represents a separate microphone in a different location.  In that case, all the tracks can be combined for transcription or transcribed separately.  Courtroom recordings often have four separate tracks (judge, witness box, defense/defendant, prosecutor/plaintiff).
  • Sample rate
    • Sample rates tell you how many times each second the audio is recorded.  Faster rates have better quality but take more file size.  Slower rates have less quality but produce smaller files.
    • Sample rates are measured in samples per second and are typically 6000, 8000, 11025, 22050, 32000, 44100, 48000, 64000, 88200, 96000, and 176400.   CD-quality audio is 44100 samples per second.
    • The frequency of audio you can reproduce in a digital file is at most half the sample rate.  So, at 44100 samples per second, CD audio can reproduce frequencies up to about 22 kilohertz.
    • We recommend that for voice transcription you have a sample rate at least 22050 samples per second.  We can transcribe lower sample rates, but the audio quality decreases with lower sample rates.
  • Compression
    • Some formats of audio permit various degrees of compression, which makes the file smaller at the expense of audio quality.  Most of the time audio quality is not noticeably impaired, but at extreme compression it may be affected.  These file formats are known as "lossy" in that they can lose audio quality.  An example format like this is MP3.
    • Compression is a trade off of file size to audio quality.  For dictation, select one that does not significantly impair audio quality.  MP3 files can compress more with the Variable Bit Rate (VBR) format, but don't use that because transcriptionists cannot use that directly.  Instead, use the Constant Bit Rate (CBR) format.
  • Sample size (bits)
    • Each sample taken typically has a fixed size, measured in bits.   The larger this size, the more accurately the audio can be reproduced and the larger the resulting file.  The smaller this size, the less accurate the audio, but smaller the resulting file size.
    • Typical sample sizes are 8-bit, 16-bit, and 32-bit.  The most popular size, and the one we suggest for transcription, is 16-bit.
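
As a rough illustration of how the options above combine, the sketch below estimates the size of uncompressed audio from its sample rate, sample size, and number of tracks, and gives the highest frequency a sample rate can represent.  The function names are our own for this example, and the figures exclude file headers and any compression.

```python
def uncompressed_bytes(seconds: float, sample_rate: int, bits: int, tracks: int) -> int:
    """Size of raw (uncompressed PCM) audio data, excluding any file header."""
    return int(seconds * sample_rate * (bits // 8) * tracks)


def max_frequency_hz(sample_rate: int) -> float:
    """Highest frequency a sample rate can represent (half the sample rate)."""
    return sample_rate / 2


# One minute of CD-quality stereo versus one minute of 16-bit mono voice
# at the 22050 samples-per-second rate suggested above.
one_minute_cd = uncompressed_bytes(60, 44100, 16, 2)
one_minute_voice = uncompressed_bytes(60, 22050, 16, 1)

print(one_minute_cd)            # 10584000 bytes, roughly 10 MB per minute
print(one_minute_voice)         # 2646000 bytes, roughly 2.6 MB per minute
print(max_frequency_hz(22050))  # 11025.0
```

In this example, halving the sample rate and dropping from stereo to mono cuts the data to one quarter of its original size.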
New formats are coming out all the time!

Which digital audio files should I use?

Which files are the best to use?  It depends on your situation and use of the digital audio file.  If your equipment uses a particular audio file format, you have limited options.

Which types work on the Web and Internet?  The web and Internet use of audio is evolving.  For transcription, current influence is created by MP3 players, Apple's iPod, and digital dictation machines.  MP3 and WMA file types seem to be popular at this time.

Original sound files included the Next/Sun (AU extension) files and also, due to Windows' popularity, the WAV files.   Later formats like Quicktime and Real Audio showed promise in reducing the file sizes and added the ability to stream the audio.  Streaming means the audio is played over your computer's speakers pretty much as it arrives.  Before that, the entire audio file had to be downloaded before it was played, which was inconvenient for large files or those that were transmitted in real time.  Now MPEG3 files are popular for music files and are very good at compressing audio, as are WAV type TrueSpeech files.  The answer to the question really depends on what you are trying to do and what resources you have to provide the audio files to the user.  Some issues include:

  • How are you going to provide audio files to the users?
  • Will the users be able to work with the audio files you provide?
  • What bandwidth Internet connection do the users have?
  • Are the files going to be downloaded or streamed?
How do you make the smallest audio files?  This is a fairly technical issue that trades off sound quality with file size.
  • Newer audio file technologies typically make smaller files.
  • Some file formats (or options within a format) can reduce size.  This is compression.
  • As the number of samples per second is decreased, so is the file size (usually).
  • As the number of bits of resolution (dynamic range) per sample decreases, so does the file size (usually).
The process of decreasing the file size can be fairly complicated, and if not done properly can result in distorted or noisy audio files.

What things should be done to generate good audio files?  The most important thing is to start with good quality audio -- either digitally recorded or recorded on magnetic audio or video tape.  Just like the guidance provided above about transcription, good quality recordings are essential at reducing cost and increasing the quality of your audio file.  Fortunately digital audio files can be edited and enhanced more easily to produce a better recording from 

What can be done with audio files to edit the recording?  Digital audio files can be easily edited to produce a good quality finished product. For this discussion, editing is the simple rearrangement of audio segments that is analogous to cutting and splicing audio tape.  Some examples are:

  • Audio can be easily deleted.
  • Audio can be easily moved, copied, or spliced.
  • Silence can be added or removed.
  • Audio from other sources can be spliced into the recording.
  • Multiple tracks can be converted into one track.
At Type-thing Services we clean up the beginning and end of audio for customers in our standard fee for generating audio files.  Additional editing is charged on an hourly basis.
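
To make the tape-splicing analogy concrete, the sketch below treats a recording as a plain list of sample values and performs three of the edits listed above.  It is a simplified illustration with made-up sample values and our own function names, not a description of any particular audio editing tool.

```python
def delete_segment(samples, start, end):
    """Remove samples[start:end], like cutting a piece out of a tape."""
    return samples[:start] + samples[end:]


def insert_silence(samples, position, count):
    """Splice `count` zero-valued (silent) samples in at `position`."""
    return samples[:position] + [0] * count + samples[position:]


def mix_to_mono(left, right):
    """Combine two equal-length tracks into one by averaging each sample pair."""
    return [(l + r) // 2 for l, r in zip(left, right)]


recording = [10, 20, 30, 40, 50, 60]
trimmed = delete_segment(recording, 1, 3)               # [10, 40, 50, 60]
padded = insert_silence(trimmed, 2, 2)                  # [10, 40, 0, 0, 50, 60]
mono = mix_to_mono([100, -200, 300], [300, 200, -100])  # [200, 0, 100]
print(trimmed, padded, mono)
```

Because each edit returns a new list, the original recording is left untouched, much like working from a copy of a tape.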

What can be done with audio files to enhance the recording?  Digital audio files can be enhanced either to improve poor-quality sound or by adding various special effects.

  • Uneven speaker volumes can be adjusted so low volume speakers can be heard.
  • One speaker can be increased or decreased in volume to generate a sense of distance or depth.
  • Many constant background noises (hum, buzz, noise, etc.) can be eliminated without distorting the speech.
  • A large number of recording studio special effects can be added to all or parts of the recording.
Such services are typically charged at an hourly rate.  Contact Type-thing Services if you have questions!
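
As a simple illustration of adjusting an uneven speaker volume, the sketch below multiplies one stretch of a recording by a gain factor and keeps the result within the 16-bit sample range.  The function name, the gain value, and the sample values are assumptions made for this example; real enhancement tools are considerably more sophisticated.

```python
def apply_gain(samples, start, end, gain):
    """Scale samples[start:end] by `gain`, clipping to the 16-bit range."""
    lowest, highest = -32768, 32767
    boosted = [max(lowest, min(highest, round(s * gain))) for s in samples[start:end]]
    return samples[:start] + boosted + samples[end:]


# Hypothetical example: double the volume of a quiet speaker who occupies
# samples 1 through 3; the last value would overflow, so it is clipped.
louder = apply_gain([1000, 2000, -3000, 20000], 1, 4, 2.0)
print(louder)  # [1000, 4000, -6000, 32767]
```

The clipped final value shows why gain has to be applied with care: boosting too far flattens the loudest peaks and distorts the sound.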

About audio files from video

Video may refer to video tape or electronic video files. Digital audio can usually be extracted from digital video files and transcribed as noted above. Video tape transcription requires making an intermediate audio tape that can be more easily transcribed. Type-thing Services has the ability to transcribe the following formats. Other formats and standards (such as PAL) can be converted with a slightly longer lead time. 


SVHS (Super VHS)
Digital Video Cassette
8mm (normal)
Hi8 (8mm)
Digital 8


Quicktime files
AVI (Audio-Video Interleaved) files (Microsoft)
DVD (Digital Video Disk)

Other Digital Files: Just about any Internet source

About "tapeless," "digital," and "phone-in" dictation

These approaches to dictation and transcription have become the norm in the industry.  "Tapeless" and "Digital Tapeless" are becoming archaic terms; like "Digital," they refer to dictation without audio tape. This could be a hand-held recorder that stores your dictation in memory modules, or it could be a phone-in dictation system.  These types of devices have essentially replaced hand-held tape recorders.

First-generation digital dictation units (popular types by Sony, Olympus, etc.) typically produce audio in proprietary formats that are difficult to convert without their own proprietary software.  Newer devices coming out after 2009 started to create files in standard file formats such as MP3, MP4, and WMA.

Type-thing Services prefers you consider phone-in dictation because of the numerous advantages it offers. See the "Phone-in Dictation" page on this Web site.

We have the capability to download audio files for transcription and have also transcribed from voicemail and other digital transcription services and devices.

About audio tape sizes and formats

With the advent of digital dictation devices, audio tape is now rarely used for dictation and transcription, yet tapes continue to be used in various forums and applications. There are three primary sizes of tapes, all of which Type-thing Services can transcribe. In approximate order of popularity they appear to be:
  1. Micro cassette,
  2. Regular cassette, and
  3. Executive cassette.
These can be directly transcribed because transcription machines are available in these sizes. Other size tapes, including videotape (VHS, BETA, etc.), can also be transcribed by Type-thing Services. We first make copies to one of the three above types. Note that micro and executive are very close in size but do not fit in each other's machines. When using regular cassette tapes for transcription, avoid any longer than T-60 (30 minutes on a side). Longer tapes tend to jam more easily in the transcription machines, which often start and stop the tape. Micro and Executive tapes are designed for transcription and therefore rarely jam.

Shown above are the regular cassette (top), executive (left), and micro (right) with approximate sizes for each tape. Micro and Executive cannot be used in the other's machines. Executive tape dictation systems are more expensive but provide superior clarity of dictation.


Most popular recorders use a single track of mono or stereo audio. Some of them have two speeds at which you can record your audio. Recording on the fastest speed produces higher quality dictation, but provides less recording time on the tape.

Multiple-track recorders are typically used in settings that require very accurate transcriptions and have multiple persons that might speak simultaneously. For instance, courtroom transcripts are often taken by a four-track recorder with each person wearing a separate microphone and recording on a different track of the tape: judge, two lawyers, witness. Multiple-track recorders are rare outside of the courtroom setting. However, they provide superior transcripts because the transcription machine allows the transcriber to listen to each track individually or all tracks at once.   Again, digital dictation systems have primarily replaced tape-based recording.


About quantity of dictation per tape

How much content can fit on a tape?  With use of digital files, a good question is also how much dictation fits in a minute or hour of dictation.  See the "About Cost to Transcribe" section above for more detail.

For tapes, it depends on how fast the person or group talks, how much quiet time is on the tape, and the tape capacity (length). We have seen 3000-12000 words per tape and 5-50 pages per tape (various length tapes).

Another way to think about this is to consider a rough average of one page per minute for single-speaker dictation.  A 60-minute tape might have 60 standard pages.  Multiple speakers or fast speakers will increase this page count. Again, see the "About Cost to Transcribe" section above for more detail.
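
The rule of thumb above can be written as a one-line estimate.  The sketch below is only that rough average turned into arithmetic; the function name and the 1.5 pages-per-minute figure for a fast speaker are illustrative assumptions, not measured rates.

```python
def estimate_pages(minutes_of_dictation: float, pages_per_minute: float = 1.0) -> float:
    """Rough page count, using about one page per minute for a single speaker."""
    return minutes_of_dictation * pages_per_minute


print(estimate_pages(60))        # 60.0 pages for a 60-minute tape
print(estimate_pages(60, 1.5))   # 90.0 pages, assuming a faster hypothetical rate
```

Quiet time on the recording lowers the real count, so treat the result as a planning estimate rather than a quote.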

Michele Duran Skroch (skraw)
505-922-1000 Albuquerque NM voice
703-679-TYPE (8973) NoVA Voice
1-877-217-0005 voice
Serving customers across the United States including
Washington D.C., Northern Virginia, to California, and of course, New Mexico
on domestic and international business.

 email: michele@type-thing.com
web: http://www.type-thing.com/

Updated 24 Jul 2014
Text and graphic content Copyright 2000-2013 Type-thing Services, LLC except where noted . All rights reserved. 
Disclaimer about information on this site.