Speech Synthesizers as Aids to Revision

6(3), August 1989, pages 81-92

Eleanor Berry

Over the past five years of teaching writing with word processors and studying the effects of word-processor use on student writing, we have come to recognize that using word processors does not of itself make student writers revise more effectively or produce better work. It has become apparent that instruction in the goals and techniques of revising is as essential for students writing with word processors as for students writing with pen and paper. Besides, word processors are sufficiently different from traditional writing tools that different strategies are needed to use them to best advantage. Thus, in addition to general instruction in revision, students need training, as Cynthia Selfe (1985) has suggested, in specific strategies for using the new writing technologies for revising. Further, students may benefit from computer tools that provide cognitive aid, not just physical ease, in revising.

This paper will argue that low-cost speech synthesizers, used to read drafts back to students, are potentially valuable as aids to revision. It will show that there is reason to expect that listening to speech synthesizers read their drafts can help at least some student writers to revise their work better. This paper will urge that classroom trials of speech synthesizers should be undertaken and that, to this end, the modest cost for adequate speech synthesizers should be included in the budgets for updating existing computer facilities and for equipping new ones.

Speech synthesizers represent an existing technology that has not yet, to my knowledge, been exploited in the computer-assisted writing classroom. They are, however, widely used in other applications. A quick survey of different kinds of speech synthesizers and their use in these various applications will help clarify what features are needed in speech synthesizers to be used as revision aids and will help establish the feasibility of incorporating them into the writing classroom.

Speech Synthesizers and Their Applications

At present, speech synthesis is being widely applied to enable blind and other handicapped persons to make use of computer technology; Williams (1984) lists twenty colleges where "talking computers, terminals, typewriters, and calculators" (p. 29) are in use by handicapped students. In such applications, speech synthesis is often associated with voice recognition capability, i.e., the computer's ability to respond to a user's oral commands (rather than only to keyboard and mouse input). Obviously, the possibility of issuing spoken commands to a computer, then getting feedback by listening to the computer speak, is immensely valuable for persons with visual and/or motor handicaps. Writers without such handicaps, however, can enter their texts and get the benefit of seeing copy emerge on the video monitor as it is composed. Such non-handicapped users have no particular need for voice recognition capability in their computer tools, because they can use their hands to enter their texts, issue commands to the computer, and see on the screen what they have thus "said." For these users, speech synthesis can be used simply to vocalize their texts, allowing them to hear what they have written.

Now, synthesized speech is also used in a wide range of commercial applications--to tell us the phone numbers we request when we dial Information, to announce station stops, to vocalize instrument readings, and so on. More and more, "the machine-like speech that aficionados of science-fiction films know and love" (Gallagher, 1985, p. 19) is heard around us.

The speech synthesizers used to give instrument readings and to make brief informational announcements vocalize pre-stored speech. Speech synthesizers of this kind are useful in situations that require high-quality speech, but they can produce only about 30 seconds of high-quality speech at a time, and they possess only a limited vocabulary. In the composition classroom, however, we want a speech synthesizer able to vocalize any text a student might write. For this purpose, a different kind of speech synthesizer is required: a text-to-speech conversion device. A speech synthesizer of this kind converts ASCII text into speech in real time, by stringing together phonemes according to an algorithm that encodes rules for the relation of English spelling to English pronunciation.

Such algorithms can be extremely sophisticated. For example, the firmware in DECtalk (from Digital Equipment Corporation) first parses sentences and words, then checks for any matches in a stored dictionary; if no match is found, the firmware invokes rules for converting letters to phonemes, then applies prosodic rules for speaking in phrases, and finally applies rules for converting phonemes to speech (Bursky, 1985). Similarly, the Prose2000 board from Speech Plus chooses a given allophone at a particular point in a particular sentence on the basis of the sentence's syntactic structure, the speech rate selected by the user, and the mode of intonation (word or clause) the user has chosen (Teja, 1983). DECtalk is a stand-alone unit, like a printer, that plugs into the RS-232 serial port of a computer; the Prose2000 takes the form of a circuit board that plugs into an expansion slot inside a computer. Costing a few thousand dollars, DECtalk and Prose2000 are high-end examples of the phoneme-stringing kind of speech synthesizer.
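The sequence of stages Bursky describes for DECtalk (parse, dictionary lookup, letter-to-phoneme rules, prosodic marking) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the dictionary entries, the rule table, the phoneme labels, and the `<rise>`/`<fall>` markers are hypothetical and are not taken from DECtalk or any other product. The final stage, converting phonemes to sound, is omitted.

```python
import re

# Hypothetical exceptions dictionary: words the spelling rules would get wrong.
EXCEPTIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
}

# Hypothetical letter-to-sound rules, tried in order (digraphs before single letters).
LETTER_RULES = [
    ("sh", "SH"), ("ch", "CH"), ("th", "TH"), ("ee", "IY"), ("oo", "UW"),
    ("a", "AE"), ("b", "B"), ("c", "K"), ("d", "D"), ("e", "EH"),
    ("f", "F"), ("g", "G"), ("h", "HH"), ("i", "IH"), ("j", "JH"),
    ("k", "K"), ("l", "L"), ("m", "M"), ("n", "N"), ("o", "AA"),
    ("p", "P"), ("q", "K"), ("r", "R"), ("s", "S"), ("t", "T"),
    ("u", "AH"), ("v", "V"), ("w", "W"), ("x", "K S"), ("y", "Y"),
    ("z", "Z"),
]


def letters_to_phonemes(word):
    """Apply the spelling rules left to right to a lowercase word."""
    phonemes = []
    i = 0
    while i < len(word):
        for pattern, sounds in LETTER_RULES:
            if word.startswith(pattern, i):
                phonemes.extend(sounds.split())
                i += len(pattern)
                break
        else:
            i += 1  # skip characters no rule covers, such as apostrophes
    return phonemes


def word_to_phonemes(word):
    """Look the word up in the dictionary first; fall back to the rules."""
    word = word.lower()
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    return letters_to_phonemes(word)


def text_to_phonemes(text):
    """Parse sentences and words, convert each word, then mark intonation."""
    result = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[A-Za-z']+", sentence)
        if not words:
            continue
        for word in words:
            result.extend(word_to_phonemes(word))
        # Crude stand-in for prosodic rules: rising contour for questions.
        result.append("<rise>" if sentence.endswith("?") else "<fall>")
    return result


if __name__ == "__main__":
    print(text_to_phonemes("One ship. Two?"))
```

Even this toy shows why correct spelling can yield a mispronunciation: any word that is not in the dictionary and does not fit the rules is rendered however the rules happen to read it.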

The Prose2000 reportedly "speaks clearly enough to have won it a spot as a talk-show host on a Pittsburgh . . . radio station" (Teja, 1983, p. 65). Fortunately, such sophisticated algorithms, with the concomitant high cost of units, are probably not needed to convert text to speech of sufficiently high quality for vocalizing student writers' drafts. Students' drafts may not have to be rendered in "attractive, natural-sounding speech" (Gallagher, 1985, p. 19) for students to benefit from listening to them vocalized by a speech synthesizer. Indeed, a minimally inflected rendition may be more helpful than an expressive-sounding performance, for such a rendition obliges listening writers to construct their texts, much as outside readers must. Thus, it puts writers in a position close to that of their readers.

Although natural sound and expressive inflection may not be necessary for synthesized speech used to vocalize students' drafts for them to listen to, direct and ready access to the units is, I think, essential. Students should be able to command speech synthesizers to read them successive versions and partial revisions of a paper as they are produced. The price of high-end speech synthesizers would make it prohibitively expensive to give students the access they would need to use these synthesizers effectively. With the most expensive machines, students would probably have to be limited to listening to tapes, made by their teacher, of a single draft (as Elaine Lees' [1985] basic writers at the University of Pittsburgh listened to tapes she had made of their drafts read by a Kurzweil Reader).

The speech sounds produced by the Personal Speech System, a stand-alone unit from Votrax, Inc., are "very machinelike" (Bursky, 1985, p. 119), but this might even constitute a positive appeal for "aficionados of science-fiction films," and the unit costs a few hundred dollars rather than a few thousand. Other low-end speech synthesizers that can vocalize text files in real time include the Microvox from Micromint, Inc., and the Echo GP (general purpose) speech synthesizer from Street Electronics. The Macintosh computer includes a built-in speech synthesizer of this kind and caliber.

Surveying existing kinds of speech synthesizers and applications of speech synthesis, we can differentiate the particular needs of student writers using speech synthesizers as aids to revision from the needs of other kinds of users. Non-handicapped student writers have no particular need of voice recognition capability in speech synthesizers, because they do not have to give input to the computer orally. Non-handicapped writers do need text-to-speech conversion units rather than the kind that vocalize short bits of prestored speech, because these writers will want the computer to read back to them whatever they write. And non-handicapped student writers need these devices, whether built-in or external to the computer, to be directly accessible from the word processor at all times, so that these writers can hear the effects of changes as they are revising. This need for accessibility outweighs, in my judgment, the desirability of highly realistic speech. Because the cost of equipping every computer in a computer classroom or laboratory with a high-end text-to-speech conversion device, with highly realistic speech, would be prohibitive for most institutions, I would recommend equipping facilities with low-end units, with machinelike speech (or using the built-in speech synthesizers that certain kinds of computers will already have).

Using Speech Synthesizers in the Writing Classroom

Speech synthesizers can readily be incorporated as revision aids in writing courses where students write and revise with word processors and are guided in developing their ideas through several drafts. Just as the students might submit their drafts to peer editing or run them through a spell checker and/or a style checker, they can listen to their drafts vocalized by a speech synthesizer. As in the case of peer review, teachers can establish appropriate guidelines for this activity. For example, students might be told to listen to their raw writing, asking themselves, "What is this trying to say?" Here, the speech synthesizer could help in the generation of ideas.

Students might then be instructed to listen to a complete rough draft of their papers, asking themselves questions like, "If I were hearing this without having written it, is there anything in it that I might not understand?" Here, the speech synthesizer would help the students in revising to suit their purposes vis-à-vis an external audience.

Sometimes, especially in the early stages of work on a piece, it may be helpful for writers simply to listen to the speech synthesizer, without looking at their texts on the screen or on printouts. More often, probably, writers can get the most out of listening by following along on printouts, with pen or pencil in hand. Teachers can instruct their students how to mark their printouts while the students listen to the synthesizer read their texts. It will be impractical for students to attempt detailed notes; it will be better for them just to put X's, question marks, or other simple notations in the margins. Teachers may want to suggest particular questions for students to keep in mind as they listen. After listening, students can be invited to go back over their printouts, making brief notes for changes. Then, back to the word processor.

Later, students might be asked to listen to a near-final draft, marking on a printout any possible errors. Here, the speech synthesizer would help students in copy editing. The speech synthesizer could help them detect the mismatches between altered or interpolated text and context. These mismatches occur--and are easily missed--when, with a word processor, students can make changes without recopying and the re-reading that recopying entails.

More than expert guidance in the use of speech synthesizers is required, however, if student writers are to realize the potential of these devices as revision aids. As Cynthia Selfe (1985) has emphasized with respect to word processors, students must have convenient access to equipment if their use of this equipment is to be more than limited. In the English department computer classroom at the University of Wisconsin-Milwaukee (UWM), where I attempted to incorporate use of speech synthesizers into the revision cycle for writing assignments, only five of twenty student computers had speech synthesizers. And these speech synthesizers could not be driven, like the printers, from within the word-processing program. Getting an oral reading of one's draft should be as easy as getting a printout. But in the UWM lab, each time the students wanted to listen to a draft, they had to run a BASIC program to get the Votrax Personal Speech System to read a text file created by WORDPERFECT. With so few speech synthesizers, it was impossible for all of the students in a class to listen to their drafts during a single class period. This situation was exacerbated by the necessity of switching from the word-processing disk to the BASIC disk and back again, so that even a designated group of five students did not always have the time to make the changes to their papers. Even students who reported finding the speech synthesizers helpful when they used them in class shied away from using them on their own during open lab hours--quite justifiably, they did not want to take that much time from their actual writing or worry about a somewhat complicated procedure. Convenient access to speech synthesizers is a prerequisite for any evaluation of their impact on students' revision practices.
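To suggest how little is needed to make an oral reading as easy as a printout, here is a minimal sketch of a routine that sends a draft to a speech device one sentence at a time. It is not the BASIC program used in the UWM lab. The function name `speak_text`, the line terminator, and the assumption that the device begins speaking when it receives a terminated line are all hypothetical; a real unit's manual would govern the actual protocol. An in-memory buffer stands in for the device.

```python
import io
import re


def speak_text(text, device, terminator="\n"):
    """Write text to a device one sentence per line; return the sentence count.

    `device` is any object with a write() method, for example an open
    serial-port file. The terminator is an assumption, not a documented
    requirement of any particular synthesizer.
    """
    # Collapse line breaks and runs of spaces left by the word processor.
    normalized = " ".join(text.split())
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", normalized) if s]
    for sentence in sentences:
        device.write(sentence + terminator)
    return len(sentences)


if __name__ == "__main__":
    draft = "First draft.  It has\ntwo sentences."
    fake_device = io.StringIO()  # stands in for the synthesizer
    count = speak_text(draft, fake_device)
    print(count)
    print(fake_device.getvalue())
```

Sending one sentence at a time would also let a student stop the reading at the point where something sounds wrong, rather than waiting for the whole draft to finish.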

Speech Synthesizers as Aids to Revision: What Kinds of Help Do They Give?

To make a preliminary assessment of the likely impact of speech synthesizers on student writers' revision practices, we shall want to ask what kinds of assistance synthesizers offer writers. More specifically, we can ask the following three questions:

1. To what extent do speech synthesizers detect, diagnose, and/or recommend solutions to problems for writers, or, on the other hand, leave such judgments and decisions up to the writers?
2. To what extent are synthesizers text-oriented or, on the other hand, oriented toward the rhetorical situation?
3. Do synthesizers give the writer feedback in public or in private?

In answering these questions, I have considered speech synthesizers used as described in the preceding section, not as they might be used in other possible classroom applications.

In answer to question 1, we can say that speech synthesizers, in contrast to style checkers, are essentially self-help devices. Synthesizers leave it to the students to detect and diagnose problems themselves and to formulate their own plans for revision. They assist students in problem-detection by presenting their texts to them through a new channel (aural as opposed to visual) and in actual revision by letting students hear the effects of changes as they try them out. The potential value of a revision aid that emphasizes self-help is suggested by a remark of Flower, Hayes, Carey, Schriver, and Stratman (1986) apropos of teacher comments that are essentially "plans for revision": "Students may learn to follow such plans where indicated yet never learn to detect or diagnose the underlying problems themselves" (p. 52). A good deal of energy has been focused on developing software to assess drafts and diagnose their problems; speech synthesizers represent an alternative approach that is worth pursuing.

Speech synthesizers, used to read writers' drafts, are revision aids oriented toward the rhetorical situation--synthesizers put writers in the role of audience for the texts they have written. With a speech synthesizer, switching to an audience's perspective no longer "requires a relatively high level of role-taking ability" (Beach & Eaton, 1984, p. 165). In the studies they made of revision practices, Faigley and Witte (1981) noted tremendous diversity among the expert adult writers they observed, and they recognized that one cannot infer from such diversity particular kinds of changes that unskilled writers should be taught to make at particular stages. Rather, what kinds of changes will work when depends on how one composes and on a variety of "situational variables":

Successful revision results . . . from the degree to which revision changes bring a text closer to fitting the demands of the situation. Revisions of inexperienced writers often do not improve their texts. Such writers tend to revise locally, ignoring the situation constraints. . . . Somehow we must teach our students to distance themselves from what they have written, to get them to see it again, then revise. . . . (p. 411)

Speech synthesizers offer a means of realizing this goal, helping students, through hearing their writing, to see it again.

As tools for simulating the rhetorical situation, for helping writers to play the role of their own readers, speech synthesizers contrast with computer tools such as style checkers and error detectors, which are text-oriented and make recommendations for changes on the basis of formal rules. Sommers (1980) observed that students make changes "in compliance with abstract rules about the product, rules that quite often do not apply to the specific problems" (p. 383) in their drafts. Because these are precisely the sort of changes that style checkers typically suggest, they may tend to reinforce and perpetuate inadequate notions and practices of revising. Listening to their drafts read by a speech synthesizer, on the other hand, can help students think about their text as a whole and take the perspective of an audience.

When used with earphones (as I would recommend), speech synthesizers give writers feedback privately. Privacy may be especially conducive to the kind of revision through which writers discover their meaning from what they have written. One use of telecommunications (with outside readers) for feedback, in the Penn State-Farrell Schools Creative Writing Project, has been justified as serving "to eliminate the student's inhibitions by creating a situation when s/he is being assisted by an anonymous, non-grading, non-threatening editor, who is there to assist and encourage" (DeBonis et al., 1986). Speech synthesizers are non-grading, non-threatening, and more than anonymous; being more private than telecommunications, synthesizers might be expected to reduce students' inhibitions even further.

Speech Synthesizers as Aids to Revision: Concerns for Teaching and Research

As tools that foster student self-help in revising, help students distance themselves from their writing and look at it (by hearing it) from an audience's perspective, and offer students feedback in private, speech synthesizers have promise as aids to revision. Because synthesizers have, apparently, the potential to answer real needs of student writers, classroom trials of their use are, I think, overdue. Teachers and researchers undertaking to experiment with synthesizers in their courses and projects will want to bear in mind specific potential advantages and problems of speech synthesizers as revision aids for student writers.

Potential advantages:

Speech synthesizers do not correct what they "read." This may make them especially beneficial to basic writers. Although research has shown that basic writers, like more advanced student writers, read over their work as they write, they also typically "correct" the errors in their writing as they read it, unlike more advanced writers (Perl, 1979; Bartholomae, 1980). Writers who cannot see the errors in their texts may be able to hear them.
The neutral voice of a speech synthesizer puts writers close to the position of outside readers of their writing--writers must construct meaning from a minimally differentiated flow of sound even as outside readers must construct meaning from essentially uniform pages of print. Indeed, listening to the "non-performance" of their drafts by low-cost speech synthesizers may be more helpful to student writers than listening to their classmates read their drafts. Besides, what classmate would be always at the student writer's side as s/he works, ready to let the student hear the changes s/he just made?

Potential problems:

Depending on the algorithms speech synthesizers have for translating English spelling and punctuation into speech, they--not unlike basic writers--have some idiosyncratic notions of the conventions governing the print code. Hence, correct spelling or punctuation may yield a mispronunciation or an inappropriate intonation, perhaps leading writers to think--mistakenly--that there is something wrong with their texts. This situation may induce writers to dismiss all mispronunciations and inappropriate intonations as peculiarities of the speech synthesizer, without considering the possibility that the problem lies with their texts. Writing teachers who want their students to get the benefits of feedback from a speech synthesizer should instruct them to take every disparity between what the speech synthesizer says and what they expected their text to say as an occasion for looking at the text again, to see if it is the locus of the unexpected element.
At least initially, the robot-like character of the synthesized speech of a low-cost unit may be distracting. But with repeated use, the speech will, I expect, become neutral for most people. Some people, however, may actually not be able to understand the speech synthesizer's "accent" at all.

I can suggest a final caveat: We should no more expect all writers to use speech synthesizers in the same ways or to the same extent than we should expect all students to use word processors in the same ways or to the same extent. We should no more insist on the use of speech synthesizers to the exclusion of other revision aids than we should (except for particular occasions and purposes) insist on paperless writing. If we have an understanding of speech synthesizers' attributes as revision aids, however, we have a basis for suggesting or requiring their use where their attributes are well-suited to the needs of our students and to the demands of the writing project in question. Such an understanding also gives us a basis for offering guidance in the use of speech synthesizers. Well-conceived and well-conducted research on synthesizers' actual effects is needed to give us a more adequate basis. The first prerequisite, both for research and for experimental implementation in teaching, is to provide direct and convenient access to a low-cost speech synthesizer for every student involved.

Eleanor Berry lives in Cedar Grove, Wisconsin.

References

Bartholomae, D. (1980). The study of error. College Composition and Communication, 31, 253-269.

Beach, R., & Eaton, S. (1984). Factors influencing self-assessing and revising by college freshmen. In R. Beach & L. S. Bridwell (Eds.), New directions in composition research (pp. 149-170). New York: Guilford Press.

Bursky, D. (1985, May 16). New algorithms, chips bestow human qualities on synthesized speech. Electronic Design, pp. 112-129.

DeBonis, D., et al. (1986). And they all wrote computerly ever after: Creative writing through telecommunications. Paper presented at the Conference on Computers and Writing, Pittsburgh, PA.

Faigley, L., & Witte, S. (1981). Analyzing revision. College Composition and Communication, 32, 400-414.

Flower, L., Hayes, J. R., Carey, L., Schriver, K., & Stratman, J. (1986). Detection, diagnosis, and the strategies of revision. College Composition and Communication, 37, 16-55.

Gallagher, R. T. (1985, January 7). Expert system uses AI for natural sound. Electronics Week, p. 19.

Lees, E. O. (1985). Text-to-voice synthesis: What we can learn by asking writers to proofread with their ears. Paper presented at the Conference on Computers and Composition, Los Angeles, CA.

Perl, S. (1979). The composing processes of unskilled college writers. Research in the Teaching of English, 13, 317-336.

Selfe, C. L. (1985). The electronic pen: Computers and the composing process. In J. L. Collins & E. A. Sommers (Eds.), Writing on-line: Using computers in the teaching of writing (pp. 55-66). Upper Montclair, NJ: Boynton/Cook.

Sommers, N. (1980). Revision strategies of student writers and experienced adult writers. College Composition and Communication, 31, 378-388.

Teja, E. R. (1983, July 21). Voice-output products see expanding use. EDN, pp. 54+.

Williams, J. M. (1984, June 30). Technology and the handicapped. American Education, pp. 27-30.