A Brief Presentation to Intersteno Participants, Prague, July 2007 Keith Vincent – www.KVincent.com

- 1 -

Thank you for inviting me to give this brief presentation on court reporting, realtime transcription, and captioning/subtitling. For the past 15 years, I've been a court reporter in Houston, Texas. I've offered a lot of software seminars in the United States and I've also been the realtime reporter for quite a few depositions and arbitrations overseas, but it is a privilege to be with you today in the glorious city of Prague, the Heart of Europe.

In preparing for today, I couldn't help but recall that in 1992, when I was finishing my court reporting studies at Alvin Community College, near Houston, I entered a contest to win a brand-new steno machine. We were simply told to write an essay about court reporting. I chose to focus on the reporter as listener.

After all these years, I still think that listening is the very essence of our profession. Machines can hear and record, but one of the best compliments we can give to a human being is to say that he or she listens. In other words, hearing is passive, but real listening is active. Indeed, one of the most important things that we do while listening is to ask someone to speak up or to repeat something to ensure that we understand. It is a simple, human courtesy to say, "Please, say that again," but it is the very essence of preserving another person's right to have his or her words speak for themselves.

Whether you use a pen, a steno machine, or speech recognition software, what we do is this: listen, recognize what is said, then transcribe. We may differ in the tools we use to capture what we hear, but we are united in our service as good listeners and faithful transcribers.

In my experience since 1992, the most exciting developments in computer-assisted transcription have demonstrated that the best CAT product deals with transcription problems so intelligently that the user can simply focus on listening.

Despite the ever-increasing demand for realtime transcription and for subtitling or captioning of live broadcasts, most people have very little notion of what a CAT system does. You've probably seen movies where lawyers are arguing at a passionate 250 words a minute yet the court reporter's fingers are barely moving, on a very old steno machine. That makes my blood boil.

Many people might imagine that in today's high-tech environment, one could just set out a few microphones and a computer would instantly transcribe every word automatically, even when multiple people are speaking at the same time, when speakers whisper or mumble, or when voices are obscured by coughing or the coincidental rustling of paper. Even very well informed people might think that all CAT systems are alike. This is simply not true. Yes, they all help you transcribe. However, one could use two systems, with one user, identical input, and identical dictionaries, and the results would be very different.


- 2 -

The best CAT software does not just match steno with text. In my opinion, a CAT system that just consists of a matching table and a word processor doesn't really do much. In contrast, we need software that actively identifies patterns within language, understands homonyms, knows how to correctly format numbers, etc. Such software resolves countless problems that a passive system would simply leave for the human user to correct. For example, take the problem of homonyms. Intelligent conflict resolution is a powerful tool for reporting and captioning/subtitling. Its usefulness is not limited to English language settings.
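To make the difference between a passive matching table and conflict resolution concrete, here is a minimal sketch in Python. It is only an illustration: the steno outlines, the word lists, and the single context rule are all invented for this example and are not taken from any CAT product. It shows one outline that maps to two homonyms, with the choice made by looking at the word that follows.

```python
# Toy illustration only: the outlines, word lists, and the context rule are
# invented for this sketch and do not describe any real CAT product.

# A plain matching table: each steno outline maps to one or more candidates.
DICTIONARY = {
    "THR": ["there", "their"],  # one outline, two homonyms: a conflict
    "KAR": ["car"],
    "S": ["is"],
}

# Hypothetical list of words that tend to follow the possessive "their".
NOUNS = {"car", "house", "lawyer"}


def resolve(candidates, next_word):
    """Pick one candidate, using the following word as context."""
    if set(candidates) == {"there", "their"}:
        return "their" if next_word in NOUNS else "there"
    return candidates[0]  # no rule known: fall back to the first candidate


def translate(strokes):
    """Translate a list of outlines to text, resolving conflicts in context."""
    words = []
    for i, stroke in enumerate(strokes):
        # Outlines missing from the table pass through as raw steno.
        candidates = DICTIONARY.get(stroke, [stroke])
        next_word = ""
        if i + 1 < len(strokes):
            next_word = DICTIONARY.get(strokes[i + 1], [""])[0]
        words.append(resolve(candidates, next_word))
    return " ".join(words)


print(translate(["THR", "KAR"]))
print(translate(["THR", "S"]))
```

A passive system would stop at the table and leave the choice between "there" and "their" to the editor. Real conflict resolution would need far richer rules than this single lookahead.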


- 3 -

The best CAT software is now international in scope. It is used every day for parliamentary reporting in Italy, Germany, Argentina, and Australia, and for bilingual reporting in Canada. In London's theater district, it provides supertitles to help European and American visitors understand the British. Here are just a few examples of a program that is international in scope.


- 4 -

The best CAT software supports captioners/subtitlers and reporters throughout the world. It supports a long list of steno machines and it considers speech as one of many possible input sources.

No longer is a stenomask merely a method for dictating to a typist. The SpeechGate module from AudioScribe takes the output from major speech engine vendors (currently, IBM ViaVoice and Dragon NaturallySpeaking), formats it, and feeds it to the CAT software in a form that closely resembles steno input, meaning that a voice writer has access to all the powerful features that machine reporters enjoy. The best CAT software doesn't just display the text that comes from the speech recognition software. Instead, it refines it and formats it. In addition, it offers an easy method for sending corrections to the speech engine to instantly improve word recognition.


- 5 -


- 6 -

It is equally true that lessons learned in working with voice writers have resulted in breakthrough features for machine writers. Keyboard or speech, it's just an input method. What the CAT software does with the output of these devices, that's the real magic. "Global Magic" and "Translation Magic," two of the best new features for machine writers in the software that I myself use, actually sprang from lessons learned in the course of work with speech recognition software.

For voice writers using speech recognition software, little words can be problematic. Little words lend themselves to misrecognition, while big words are more distinctive. For machine writers, it's just the opposite because big words take multiple steno strokes, making it all too easy to hit a wrong key along the way.

Here's what I mean by "Global Magic" and "Translation Magic." Computer-Assisted Transcription programs use global replacements to correct errors that occur when steno is translated to text. These corrections greatly reduce future editing because they build up dictionaries that are used to match steno with text. Typically, this involves quite a bit of typing. However, with its "Global Magic," the CAT system is able to understand what your steno means, even when keys have been dragged in or dropped out during high-speed writing.
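The idea of recovering from a dragged or dropped key can be sketched in a few lines of Python. This is not how "Global Magic" is implemented, which the talk does not describe. It is a simplified assumption that treats an outline as a string of key letters and suggests dictionary entries that differ from it by exactly one key. The sample outlines and the helper names are hypothetical.

```python
# Simplified sketch, not the actual "Global Magic" algorithm. An outline is
# treated as a plain string of key letters; all entries below are invented.

SAMPLE_DICTIONARY = {
    "KAT": "cat",
    "KAUT": "caught",
    "TKOG": "dog",
}


def one_key_apart(written, defined):
    """True if one string equals the other with exactly one key removed."""
    if len(written) > len(defined):
        longer, shorter = written, defined
    else:
        longer, shorter = defined, written
    if len(longer) - len(shorter) != 1:
        return False
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def suggest(written, dictionary):
    """Return the exact translation, or near matches one key away."""
    if written in dictionary:
        return [dictionary[written]]
    return sorted(text for outline, text in dictionary.items()
                  if one_key_apart(written, outline))


def add_global(written, chosen, dictionary):
    """Store the chosen correction so the same misstroke translates next time."""
    dictionary[written] = chosen


entries = dict(SAMPLE_DICTIONARY)
print(suggest("KATS", entries))   # an extra key was dragged in
print(suggest("KOG", entries))    # a key was dropped
add_global("KATS", "cat", entries)
print(suggest("KATS", entries))   # now an exact dictionary match
```

The last step mirrors the point made above: once a correction is stored, it becomes part of the dictionary and reduces future editing.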


- 7 -

Sometimes two successive steno strokes can register simultaneously. "Global Magic" does an excellent job of understanding such "stacking" problems. Here's the text after it was corrected by glancing at the list of suggestions and picking the appropriate choice. When the suffix "s" was added to "city," the spelling was automatically adjusted to produce "cities" rather than "citys."
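The spelling adjustment for "city" plus "s" follows ordinary English rules, which can be sketched as below. This is a minimal Python illustration that assumes only a few common rules. The function name is hypothetical, and a real system would also need to handle irregular words.

```python
# Minimal sketch of suffix attachment with common English spelling rules.
# The function name and rule set are assumptions made for illustration.

VOWELS = set("aeiou")


def add_s(word):
    """Attach the suffix "s", adjusting the spelling where needed."""
    # Consonant + y: drop the y and add "ies" (city -> cities).
    if word.endswith("y") and len(word) > 1 and word[-2] not in VOWELS:
        return word[:-1] + "ies"
    # Sibilant endings take "es" (witness -> witnesses).
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    # Everything else simply takes "s" (day -> days).
    return word + "s"


for w in ["city", "day", "witness", "court"]:
    print(w, "->", add_s(w))
```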


- 8 -

"Global Magic" greatly simplifies editing. In other words, after translation has occurred, one can make corrections by simply picking from a short list of choices. "Translation Magic" is more fundamental. It reduces editing time by reducing translation errors. It understands what your steno means even when you're writing complicated words for the very first time. It also understands when you drag or drop keys while writing steno that is already defined in your dictionary. In these examples, each item displayed in red is something that was understood by "Translation Magic." In addition, "Auto-Brief" has automatically created a steno shortcut to make it possible to write "pharmacology" in one steno stroke instead of five. That is what is meant by the "PHA = pharmacology," which appears in the upper-left corner of the text window.


- 9 -

Of course, I know that just buying a pen does not make you a pen writer. Neither do a few weeks of training with a steno machine or a stenomask make you qualified for the challenges of the real world. Even the best technology needs trained, skillful users, not just magic. This is especially true when it comes to on-line subtitling and captioning of live broadcasts, where there is simply very little time for making corrections along the way. Of course, one expects no errors in off-line captioning or subtitles for movies and other shows where the producers had the luxury of time to get the text right.

When the average television viewer sees a live broadcast that is subtitled, he or she probably does not think someone is using a steno machine or speech recognition software with the aid of CAT software for realtime transcription. Not only is an instant transcript being created and stored; in addition, text is being sent to an encoder so that several lines of text, once decoded, may appear on your TV screen.

Some material lends itself to the use of scripts. For example, one may have received the text of a speech that will be read. Perhaps a song will be sung and the words can be obtained in advance. This text can be prepared ahead of time and sent out as entire lines to appear on the TV screen. When impromptu remarks occur, these can be written on the steno machine or dictated for the speech recognition software until the point when scripted material resumes.

During live broadcasts, subtitling/captioning also requires a gift for summary. Sometimes verbatim transcription is simply not possible or even not desirable. At times it is simply better to read "the hall" rather than "Ba Da Ling Four Season Exhibition Hall of Ice Lantern and Artistic Ice-Carving Works." That's a real company name, by the way. Text must flow at a readable pace, without lagging or suddenly rushing. The hearing impaired may also be reading impaired. So subtitles for live broadcasts must show a real awareness of the audience that they are intended to serve.

In captioning and subtitling, the text must not only be correct, it must be properly formatted and precisely positioned on the screen. It must also not linger on screen during extended periods of silence. The best CAT software easily addresses all of these concerns.
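Two of these formatting concerns, fitting text into caption lines and clearing the screen during silence, can be illustrated with a short Python sketch. The 32-character line width and the 4-second silence limit are assumptions chosen for the example, and the function names are hypothetical. This is not a description of how any particular CAT product or encoder behaves.

```python
import textwrap

# Assumed values for illustration only.
MAX_CHARS = 32      # assumed maximum characters per caption line
MAX_SILENCE = 4.0   # assumed seconds of silence before clearing the screen


def caption_lines(text, width=MAX_CHARS):
    """Break text at word boundaries into lines that fit the caption width."""
    return textwrap.wrap(text, width=width)


def should_clear(seconds_since_last_text, limit=MAX_SILENCE):
    """True when captions have lingered through an extended silence."""
    return seconds_since_last_text >= limit


sample = "Text must flow at a readable pace, without lagging or suddenly rushing."
for line in caption_lines(sample):
    print(line)
print(should_clear(5.0))
```

Positioning captions on the screen and pacing their release would need additional logic that is outside this sketch.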


- 10 -

During live broadcasts of sporting events, subtitles may be expected to move around the screen to avoid covering up time clocks or other graphic elements. It's just one more way in which realtime transcription software is expected to do so much more than just match steno with text or match speech with text.

With ordinary programs, each minute of listening requires several minutes of transcribing. It's not unusual to hear of folks who used to edit 30-35 pages an hour on their old system and who now edit 40-50 pages an hour on state-of-the-art software. The best CAT software allows you to worry a lot less about transcribing and, instead, just focus on listening.

I would be remiss if I did not make reference to other exciting work that is taking place throughout the world, and here I am grateful for information that Dave Rogala and Gian Paolo Trivulzio have graciously provided. The European Community has supported several speech recognition research projects, and soon commercial products could result from a combination of efforts made in several countries. It is also noteworthy that the Japanese Continuous Speech Recognition Consortium's Open-Source Julius Large Vocabulary Continuous Speech Recognition Engine has been ported to Microsoft Windows.


- 11 -

In the European Parliament, Dragon NaturallySpeaking is used for preparing session reports and by interpreters who translate text in several languages. Dragon accommodates Dutch, Italian, French, German and English.

Intersteno Prague Conference sponsor Newton Information Technology works with the Technical University of Liberec on Czech speech recognition. Newton electronically distributes a wide range of industry- and sector-related newscasts, similar to the realtime Bloomberg service in the U.S.

Bella Italia. The Italian Senate and Chamber of Deputies began using speech recognition in 1995 and have sponsored their own education and training. The national TV network RAI uses speech recognition for off-line subtitling. CART services are provided at the University of Bologna.

England foresees 100% of new television broadcasts being subtitled by 2010 and has reached 60% capacity. Today about 50 staff members are using speech recognition for subtitling. Additionally, between 3000 and 4000 hours of subtitling per year are done via e-working from Australia using stenotype technology.

The Liberated Learning Consortium and Saint Mary's University in Canada have collaborated with IBM to create ViaScribe. Based on IBM ViaVoice, it synchronizes speech, text, and visual media such as PowerPoint presentations into one-click packages retrievable with RealOne or Windows Media Player. Liberated Learning technology supports French, Italian, UK and US English language models, and work is ongoing to include Chinese-Mandarin and Japanese.

In the USA, Wade Price at Oklahoma State University has developed the first automated software to capture captions for distance learning videos posted on the institute's website for students who are hearing impaired. The automated software is about 80 percent accurate in translating audible words to text. Professors then edit and refine the final text. Mr. Price plans to let the university use it at no cost to provide closed-captioned video to all students on campus with disabilities.

Of course, great technology doesn't do you much good without accessible training tools. For that reason I've devoted the past five years to developing eight in-depth video training tutorials for the system that I myself use, Total Eclipse by Advantage Software. These complement the more than 200 short Visualizer movies that are built into Eclipse. Additional information is available at my website (www.KVincent.com).


- 12 -

I am also delighted that Bettye Keyes is here in Prague for Intersteno 2007. Bettye has developed a superb book on speech recognition for reporting and subtitling.

Bettye's website: www.EclipseVoxRocks.com.

It's quite an understatement to say Bettye has written a book. What she offers are outstanding pedagogical materials for using speech recognition in realtime reporting and subtitling. You'll probably see me hovering near her, soaking up as much information as I can from this lovely, talented lady. You'll probably also see me in the company of Dan Glassman. Dan has served the international reporting and subtitling community for about 25 years. About ten years ago, he founded Word Technologies to serve as your Advantage Software representative.

Dan's website: www.WordTechnologies.com.

Almost two decades have passed since I began my training as a stenotypist. I am amazed at the developments that have occurred, and I look forward to more. For pen writers, I look forward to advances in tablet technology that will open new doors for realtime transcription. As a keyboard writer, I look forward to ever more accurate steno machines such as the "Passport," which Advantage Software will soon be bringing to the market. Of course, voice writing will continue to become an ever more impressive method for capturing the spoken word.

However, it is also my hope that each of us will continue to cultivate the skill that unites us: listening. Listening with attentiveness, with respect, with a keen desire to understand. The world will never lose its need for good listeners.
