September 28, 2009
September 29, 2009, Added endnote

Technological Deception

All of us have witnessed the wonder of movie special effects. Some may even have considered that our system is now capable of deceiving the masses if it so desires. Other technologies are just as persuasive.

The convergence of digital cameras and computer graphics programs has made photography an area of special concern for investigators. In the past, analog photography was difficult to tamper with. Now, suspicions that digital images can be convincingly altered have forced investigators to look for some unalterable means of gathering photographic evidence. To this end, they have reverted to using Polaroid cameras because the entire process is done at once by one mechanism.¹ Images taken by film cameras that require separate outside processing are also thought to be easily altered by digital software. All of this is done to protect the truth and to prevent the dismissal of cases on grounds of untrusted evidence.

Some may have come to suspect that audio can also be altered. When I worked at Nortel, I tested a product called DRAM (Digital Recorded Announcement Machine). It is my understanding that its messages were spliced together from words and phrases. The following citation from Nortel technical documentation confirms this belief (italics are mine):

DRAM MECHANICS

The basic unit of DRAM speech data is the phrase. A phrase corresponds to an English phrase, sentence, or group of sentences. A given phrase may be a single word, an entire sentence or even a group of sentences stitched together at the operating company’s discretion.

An announcement trunk is made up of a set of members. A call that is routed to a specific announcement has one of the trunk members chosen by the CC for its terminator. Depending on operating company requirements, up to 255 subscribers may be simultaneously connected to a single channel.

The DRAM is instructed by the CC to play a list of phrases for that trunk, in sequential order, on the corresponding channel. Once the phrase list is exhausted, the MTM switches the subscriber to the next track in the tracklist (if any)

The DRAM, itself, has no knowledge of announcements, only phrases. The CC keeps track of the phrases making up each announcement.²

An earlier version of a similar document says:

Phrases are connected together by the DRA to form different messages.³
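
The division of labor described in these excerpts, where the CC keeps track of which phrases make up an announcement and the DRAM only plays phrases in order, can be pictured as a simple lookup. The following is a loose sketch and not Nortel's implementation: the phrase names, the announcement name, and the function names are all hypothetical, and short text strings stand in for the stored audio.

```python
# Hypothetical sketch; every name and phrase here is invented for illustration.

# What the DRAM stores: phrases only (text stands in for recorded audio).
PHRASES = {
    "PH_SORRY": "We're sorry.",
    "PH_NOT_IN_SERVICE": "The number you have dialed is not in service.",
    "PH_CHECK_NUMBER": "Please check the number and dial again.",
}

# What the CC tracks: which phrases, in what order, make up each announcement.
ANNOUNCEMENTS = {
    "ANN_NOT_IN_SERVICE": ["PH_SORRY", "PH_NOT_IN_SERVICE", "PH_CHECK_NUMBER"],
}

def play_phrase_list(phrase_ids):
    """DRAM side: play the given phrases in sequential order."""
    return " ".join(PHRASES[pid] for pid in phrase_ids)

def play_announcement(name):
    """CC side: look up the announcement's phrase list and hand it to the DRAM."""
    return play_phrase_list(ANNOUNCEMENTS[name])

print(play_announcement("ANN_NOT_IN_SERVICE"))
```

In this arrangement the same stored phrase can be reused in any number of announcements, which is the memory saving the excerpts imply.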

Although memory technology can now hold messages in their entirety, this has not always been so. Nortel introduced its line of fully digital telephone switches in 1976, a time when memory capacities would not support long audio recordings. The first personal computers had only about 16KB of RAM, in time increasing to 64KB. The documentation that I had readily at hand, dated 1988, lists circuit packs with 128KB and 256KB of memory, giving a theoretical capacity of about 16 and 32 seconds of audio, respectively. I remember testing older circuit packs with less memory than this, perhaps as low as 8KB or 16KB. A 16KB memory module would hold only about 2 seconds of audio (8-bit telephone audio is sampled at 8000 samples per second, so each second takes 8000 bytes). The need to conserve memory by splicing messages together from words, syllables, or short phrases is evident.
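
The capacity figures above follow from simple arithmetic. Here is a minimal sketch, assuming uncompressed 8-bit samples at 8000 samples per second (as stated above) and 1KB = 1024 bytes; the function name is illustrative only.

```python
# Illustrative arithmetic only; the function name is hypothetical.
SAMPLE_RATE_HZ = 8000   # telephone audio: 8000 samples per second
BYTES_PER_SAMPLE = 1    # 8-bit samples

def seconds_of_audio(memory_kb):
    """Return how many seconds of uncompressed audio fit in memory_kb kilobytes."""
    total_bytes = memory_kb * 1024  # assumption: 1KB = 1024 bytes
    return total_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

for kb in (8, 16, 128, 256):
    print(f"{kb}KB holds about {seconds_of_audio(kb):.1f} seconds")
```

Under these assumptions the results are roughly 1.0, 2.0, 16.4, and 32.8 seconds, in line with the rounded figures in the text.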

I remember hearing operator recordings during this period; they were stilted and would not pass for real speech. Much has happened since then. In the summer of 1994, when I was a new scapegoat, a coworker quizzed me about words, with body language that betrayed some discomfort, as of deceit. When they left the work station, because I knew of the technology to splice words, I spoke to whoever might listen, “Are you trying to gather words to put in my mouth?” Soon afterward, my attempt to call my supervisor was interrupted by a recorded message. The message was spliced together from words spoken by different persons. It was done so skillfully that, had the words come from one person, it would have passed for unaltered speech. Surely this was an affirmative answer to my previous query, even if only pertaining to their technical capability.

You will ask, how did their skills improve? Along several lines. First, they did advanced research in voice technology, as evidenced by a Unix-based voice recognition product that I worked on, called Network Application Vehicle. The kinds of skills and knowledge gained in this research (and that of others) would be applied to the use of highly capable visual audio editors in the task of effectively splicing the audio. These tools (even Audacity or Nero Wave Editor) allow one to see whether the joined clips align properly. The software's filters could be used, according to the user's specialized knowledge, to smooth out any anomalies. A store of syllables and words from different sentence contexts would ensure that the flow and accenting of the resulting audio would be convincing. Certainly, their skills improved for very concrete reasons, and the results are perhaps not ones that novices would be expected to achieve.

Covert psychic technology could be added to this list, but is beyond the scope of this article.⁴

Beware!

¹Alan Axelrod and Guy Antinozzi, 2003, “The Complete Idiot's Guide To Criminal Investigation”, page 261
²Northern Telecom Training and Development Center, Raleigh NC, October 1988, “DMS-100/200 Maintenance Course 133”
³Northern Telecom, “Introduction to DMS 100”
⁴See source text “Bioeffects of Selected Non-lethal Weapons”

Document History
September 28, 2009 created
September 29, 2009 added endnote