Human Translation

I sit alone in a booth without windows. In front of me, a computer. Next to the computer, headphones. When I put on the headphones, I hear voices: sometimes one, sometimes many voices. A rough transcript of what they’re saying, the product of speech-recognition software, is already on the screen in front of me. I listen and scan the script, making corrections as necessary. When it’s right, I translate; I put it into English so the bosses can understand.

That’s it. That’s my job.

I hear these voices not because they’re interesting, but because the computer thinks they are. It decides which voices I hear and whether they’re of low, regular, or high priority. And its choices show certain preferences. There are names that come up again and again. Places, too: cities and landmarks and regions and countries. The computer prefers men to women. It prefers young to old.

There are other patterns. Certain dates, for example. Book titles. Webpages. But those run in trends, and I can’t keep up. I don’t really pay that much attention.

The day lasts from nine to five thirty, with a half-hour break at noon and fifteen-minute breaks at ten thirty and three. Every day I translate between two and three thousand words. That means, on a busy day, I’m translating 400 words an hour.

Me, my computer, and the voices: we’re like three parts of the same machine.

Meanwhile the bosses keep talking about a computer program that will make so-called “human translation” obsolete. We have translation programs already, of course — they just don’t work. They know all the rules of the game but not how to play. That’s why this new program will be based not on rules, but on experience. Specifically, my experience.

Soon I’ll be working with the new program, but not in the sense of “employing” or “utilizing” it. I will collaborate with it like a colleague. And it will learn. My experience will become its experience. What I know, it will know. At some point, the program will have absorbed me. Then I’ll be fired.

At the moment, the program is still in the pre-alpha phase, but the bosses seem anxious to begin testing. Something about the backs of their heads, when I pass them in the halls, tells me that they’re smirking.

What they don’t realize is that I have little to teach the computer. I know languages, but I don’t know what language is, and I have no idea how translation works. The more I do it, the more I’m convinced it should be impossible.

If the bosses asked, I’d be willing to share my doubts. But they won’t ask me: I don’t factor in until later. The development team, during this stage, is all software engineers.

If the bosses asked, I’d tell them that, rather than a translation program, the engineers should give us better speech-recognition software. If the computer’s transcripts were reliable, I wouldn’t have to listen to the voices. That would be nice. I could work faster. And I wouldn’t have to listen to the voices.

SPEECH RECOGNITION

The bosses are proud of our speech-recognition software. They say it’s years ahead of anything on the market. It’s better with some languages than with others, but on average its accuracy is 96 percent. I admit that does sound impressive. It means only every twenty-fifth word is nonsense.

Since the transcripts are all in foreign languages, it’s hard to give examples. But if you have any experience with such software, you know the mistakes I’m talking about. A transcript of a voicemail might read: “Hi, we moved, so I’m calling to give you our new number and a dress.” That sort of thing.

Those are the easy mistakes, though — the obvious ones. It’s plausible mistakes that cause problems.

A few months ago I was given something labeled “High Priority.” It was a big moment. Until then I had handled only the low- and regular-priority stuff. It wasn’t a raise or a promotion, but you might call it a moral victory.

I listened closely to the recording. There were two voices — one female, one male. I’d guess they were both somewhere between thirty and fifty. This, in English translation, is what they said:

Female: Are you still inside?
Male: No, I just finished.
Female: Ah. Oh well. Too late.
Male: I can go back in.
Female: No, no. Doesn’t matter.
Male: All right.
Female: How about the bananas? You didn’t forget the bananas, did you?
Male: No, they’re here. I’m looking at them now.

Now compare that to what the computer heard. Here’s a faithful translation of the computer’s original, uncorrected transcript:

Female: Are you still inside?
Male: No, I just finished.
Female: Ah. Oh well. Too late.
Male: I can go back in.
Female: No, no. Doesn’t matter.
Male: All right.
Female: How about the rifles? You didn’t forget the rifles, did you?
Male: No, they’re here. I’m looking at them now.

The female voice had mixed languages: it had said the word “bananas” in English, although the rest of its sentence was in another language. By coincidence, the English word “bananas” sounds a little like the word for rifles in that other language. So “rifles” is what the computer thought it heard.

ANALYSIS

When I was finished, I turned in the correct translation — the one where the female voice says “bananas” — to an analyst. Normally I would just save it on the network and send the analysts an email alert. But since this was a matter of high priority, I made a printout and took it to an analyst in person. That’s protocol.

The analyst read the printout. He read it again. Then he looked at me.

“This was high priority?”

I thought of explaining how, due to mixed languages and phonetic coincidences, a banana had been mistaken for a gun. Then I thought better of it, shrugged, and backed out of the room.

I remember, when I first started this job, trying to explain a point of grammar to one of the analysts. He cut me off.

“Your job is simple: you write what it says.”

“I would need more context to be certain.”

“You’re not authorized to access more of this file. You have no clearance.”

“I’ll write both possibilities, then. You can decide which is more likely.”

“I only want to read it once, thank you. Just give me good, plain English.”

In the end, I just guessed.

HUMAN ERROR

Not all mistakes are the fault of the speech-recognition software. Sometimes it’s the process that’s to blame, working conditions, office politics. Other times it’s just human error.

It happens that I think I know what a word means and I don’t bother to look it up. A week later, giving in to a nagging doubt, I’ll make inquiries and discover that either I am confused, or the dictionary is. Often, at such moments, it’s the English dictionary I’m holding.

Small mistakes worried me a lot more when I was new here. I kept a list of the mistakes I had made so as to avoid making them again. It started as a short list and became a long one; then, a very long one.

I compiled mistakes for almost a year before realizing, one day, that no one had confronted me about them. Reading over the list, I remembered making the mistakes, but I didn’t remember ever apologizing. I could only assume that they hadn’t been noticed, that my mistakes had been analyzed and processed and filed along with everything else. It wasn’t that truths were being mixed with falsehoods. It was that those distinctions weren’t even being made.

So I threw away the list. It was a liberating moment. It was as if I had never made those mistakes in the first place. And in a sense, I have made no mistakes since. If you give up on the notion that language must be precise, that communication must be perfect, or that we can ever really understand each other — if you give up on all that, then there can be no mistakes.

VOICES

I used to think you couldn’t translate something without understanding it first, but these days when I translate, my mind is a blank. The tricky thing is that then the voices come back to me later. Someone says something, I get déjà vu, and suddenly I hear myself responding in someone else’s words, with someone else’s voice. I find myself trapped in someone else’s conversation.

Other times, when I’m alone, I hear whole conversations as if I were still wearing the headphones. Then the voices mix and become nonsensical. I hear two people talking, but not to each other. Or I hear three, four, five voices, all talking at once. The voices overlap and blend into linguistic music. My head is a small space, but its capacity for words is endless.

I’ve had dreams where I translate voices into other, better voices, where I stretch and modulate them until they become arias or snippets of Shakespeare. I’ve also had dreams where the voices outrun me, and I can only snatch at them while they slither through my hands like eels. Then they turn on me, grow three heads, and chase me down long corridors.

One of my recurring dreams is that I’ve been given a special assignment. I put on the headphones and hear two voices, but one is scrambled. The other one sounds like a boy, perhaps ten years old. The boy is begging, please no please no please. Then he’s crying. Then screaming. Meanwhile the scrambled voice is demanding something, some information, but the boy doesn’t know. He would tell if he did, but he doesn’t. He’s sorry. He’s very sorry. Please.

I wake up.

I can’t say how often I’ve had this dream. Too often, in any case. But I always knew, from the first time I had it, that the boy was me. It’s only lately I’ve begun to suspect that the scrambled voice is me, too.

SUSPICIONS

When I got this job, just out of college, I was happy. It was a good job: secure, decent pay, excellent benefits. There was this big government seal on the floor in the lobby, and they’d wax and buff it every night. I’d arrive in the morning and see the seal shining up at me, and I’d feel proud of something. Of myself, perhaps.

The seal is still there, I’m sure, and probably still shiny. These days, though, I come in by the side entrance. I try to nod my way past security, but it doesn’t always work — sometimes they’re talkative. They say it’s a good morning, and I’m forced to agree. They say I should have a good day, and I can’t imagine what they mean. They say I’m looking sharp today, and — I can’t help it — I worry that they’re mocking me. Before turning on my computer, I’ll puzzle over my reflection in the screen.

Nothing has changed since I started this job, but everything is different somehow. Now the new translation program is coming, and again, nothing will change, nothing will be the same. I’ll just go on doing my job, but my real job will be to teach the computer to do my job. Somewhere, just beyond my peripheral vision, the computer will be watching everything I do.

In effect, the program will make a digital copy of me. It will break me down into nanobytes and nibbles and reassemble them into an electronic effigy. Then I’ll exist on both sides of the screen. But only on the outside will there still be compromises, rationalizations, and forgetfulness. Inside the computer will be a true mirror-me, with no room for distortion or embellishment. When I make mistakes, they won’t just disappear: they’ll be archived. And I won’t be able to delete anything. I’ll be denied access to myself.

The bosses say a good employee has nothing to fear. They say if you are worried about the new program, it can only mean one of four things: one, you’re not doing what you should; two, you are doing what you shouldn’t; three, both one and two; four, you’re paranoid.

I consider myself a good employee, but I am a little worried about the new program. Specifically, I worry that the bosses won’t tell us before they begin testing. Maybe they’ve discovered that, when employees know the program is watching, even good ones unconsciously change their behavior in a way that makes it harder for the program to learn. Or maybe they just want to use the program to find out what sort of employees we are.

I consider myself a good employee, but I do have moments of doubt. I’ll be typing away and think, what if my computer has been watching me all along, is watching me right now? What if it’s following every keystroke? Maybe the whole story about a new program in development is just to keep us from suspecting that the program has been running for years.

I consider myself a good employee, but I do have moments of panic. I’ll be typing away and think, what if there is no program and there never will be? Maybe the bosses are just trying to scare me. Maybe they think I’ll work harder and ask fewer questions. Maybe the only program they’re developing is me.

At such moments, I’d really like to talk to my colleagues, the other language people. They might have similar suspicions. But there’s never any opportunity to talk. We emerge from a two- or three-hour session in our booths, the hot glow of the computer screens still on our faces, the cacophony of voices still in our ears, and we’re just blown. We hardly have the strength to hold our coffee cups, to raise our forks and knives, to chew. There are no watercooler chats. We’ve heard too much talk already. We just listen for our own thoughts, and try to avoid eye contact.

+++

Philip K. Zimmerman is a writer and translator from Upstate New York. He lives outside Munich, Germany.
