Lies, Damned Lies and Statistics


Lies, Damned Lies and Statistics is a well-known saying attributed to Benjamin Disraeli. It was popularized in the U.S. by the author Samuel Clemens, aka Mark Twain: “There are three kinds of lies: lies, damned lies, and statistics.” This statement refers to the persuasive power of numbers, and describes how accurate statistics can be used to bolster inaccurate arguments.

The state is the most infamous of liars. Even though myriad penalties exist for punishing individuals who lie to the state (perjury) in all its myriad forms, there is little to no enforcement, or even statutes, for the converse. The 21st century should be called the Digital Millennium because of the ubiquity of computers in virtually all forms of human interaction. Digital computers provide the perfect conduit for deceit, for institutionalizing prevarication (lying, fraud, perjury) and making it completely unaccountable. When the history of the 21st century is written, I assert, it will be noted that digital data, in all its forms, will be inadmissible in court for any reason. It is always invalid beyond any reasonable doubt. Here’s why.

The argument is simple: digital data has no ownership criteria, it cannot be attributed unambiguously, there can be no chain of authorship, the creation costs (including forgery costs) are close to zero, and there can be no chain of evidence. Because of these characteristics, digital data can only be trusted based upon the word of an individual who attests to its integrity. A single individual’s attestation to fact is the legal equivalent of hearsay and cannot be admitted into a court of law. Let me present these bold assertions in detail.

The well-known Moore’s Law phenomenon is that the number of transistors that can be placed on an integrated circuit increases exponentially, doubling approximately every two years. The observation was first made by Intel co-founder Gordon E. Moore in a 1965 paper. This trend has continued for more than four decades and is not expected to stop in the foreseeable future. This means that increasing amounts of computational power can be had for less money as time goes on. This is perhaps the finest example of human ingenuity lowering prices through productivity increases, the result of Human Action in conjunction with private capital and private ownership of property.

The new Itanium chip will have over two billion transistors on a single chip. Long-term storage prices are on the decrease as well. A terabyte (million megabytes) disk drive is now under $200, terabyte flash drives are expected soon, and petabyte and higher storage densities are in the lab thanks to breakthroughs in solid state physics. Look for them at a consumer outlet near you soon.

I have been a professional computer systems architect and developer for more than 20 years. Everything I have ever worked on, including CAD models, simulations, programs, development tools, documentation, sample images, pictures of my family, all my favorite eBooks, email archives, and much of the LRC and Mises web sites, occupies less than 200 GB. I will probably not live long enough to fill my 400 GB disks with meaningful data.

For data to be usable it must have a producer and a consumer, even if they are the same individual or computer system. Some physical device like a camera, keyboard, or another computer system interface will source (generate) some digital data. This transducer converts a physical act like snapping a picture into a digital representation of the act. Many cities are now adding digital cameras to the traffic light system. For a typical city street intersection this means at least four cameras are required to digitize traffic. A high-volume intersection might want eight, so that there is a camera dedicated to each traffic direction. Big city roads with service drives that run alongside the thoroughfare could easily double this number.

Many cities are planning or implementing a ticketing system where the traffic light state (red, green, yellow) and the traffic captured are used to generate automatic moving violations, which are then mailed to an offender based upon an optical character recognition (OCR) system that reads the license plate of the offending vehicle. The license database then provides the owner information and the ticket is sent. If the ticket is not paid, the legal machinery of summons, judgment in absentia, and warrant for arrest can swing into motion. This is obviously a potential revenue gold mine for the “owner” (city, county, state). Since there is no officer to be sworn for testimony, these tickets, while monetary penalties, are not given as “points” against the driver or sent to insurance companies to affect insurability or insurance rates.

This example raises the first problematic issue. The sequence of events is known as a graph. Graphs are simple mathematical entities that consist of vertices (points) and edges (lines). Graph Theory is a deep and powerful field of mathematics whose basics are easily understood by almost anyone. This type of graph is directed; the direction is actually time, which properly sequences the events. The edges connect the events in the proper temporal (time) order. The graph is (a) Light Turns Red -> (b) Camera Catches Car in Intersection -> (c) License is Read -> (d) Owner is Identified -> (e) Ticket Is Printed and Mailed -> (f) Fine is Paid: A->B->C->D->E->F. This graph forms a crucial piece of legal doctrine called the Chain of Evidence. It is also known as the Chain of Custody. The Chain of Evidence provides the ordering of the events; the arrows (edges) sequence events in time.
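The sequence of events described here can be written down as a small directed graph. What follows is only a minimal Python sketch, not taken from any real ticketing system: the event labels come from the text, while the names `EVENTS`, `EDGES`, and `chain_order` are hypothetical and chosen purely for illustration. It assumes the edges form a single unbranched chain and recovers the temporal order by following the arrows.

```python
# Events (vertices) from the traffic-light example in the text.
EVENTS = {
    "A": "Light turns red",
    "B": "Camera catches car in intersection",
    "C": "License is read",
    "D": "Owner is identified",
    "E": "Ticket is printed and mailed",
    "F": "Fine is paid",
}

# Directed edges: each pair (earlier, later) orders two events in time.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]


def chain_order(edges):
    """Return the events of a simple chain in temporal order.

    Assumes the edges form one unbranched path; raises ValueError otherwise.
    """
    successor = dict(edges)
    if len(successor) != len(edges):
        raise ValueError("an event has more than one successor")
    targets = {later for _, later in edges}
    starts = [earlier for earlier, _ in edges if earlier not in targets]
    if len(starts) != 1:
        raise ValueError("the chain must have exactly one starting event")
    order = [starts[0]]
    while order[-1] in successor:
        nxt = successor[order[-1]]
        if nxt in order:
            raise ValueError("the events loop back on themselves")
        order.append(nxt)
    if len(order) != len(edges) + 1:
        raise ValueError("the edges do not form a single chain")
    return order


if __name__ == "__main__":
    for label in chain_order(EDGES):
        print(label, EVENTS[label])
```

The order of the pairs in `EDGES` does not matter; only the arrows do. If a single link is missing, the function refuses to produce an ordering, which is the graph-theoretic way of saying the chain is broken.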

The Chain of Custody is actually a misnomer in common usage. It is the graph of ownership or attribution of the Chain of Evidence. The two are not the same, so conflating them is an error of fact. This will be important later on.

A police officer testifying in court about events personally witnessed is attributing the chain of evidence. A great deal of legal precedent has been established as to the quality of this testimony. If you are a convicted criminal testifying that the sworn officer’s testimony against you (his version of the Chain of Evidence compared to yours) is false and yours is correct, the weight of jurisprudence falls squarely on the side of the state. One-against-one testimony between private persons is considered equal, and no verdict can be rendered unless third-party evidence proves one or both are perjuring themselves (lying under oath). Logic actually dictates that an officer and a convicted criminal testifying about their versions of the same event are equivalent, but the “law” mandates that the officials prevail. This will be a recurring pattern.

If we consider the traffic light system, there are likely to be multiple independent computer systems that provide event data. Each system has one or more individual system administrators who certify the integrity of the system behavior with respect to data captured, stored and retrieved. Very large systems can have entire systems that just monitor and administer other systems. Nonetheless, attribution ultimately always comes down to one or more individuals whose sworn testimony provides the Chain of Custody verification required.

The key here is that the systems and their administrators provide the attribution/attestation as to the integrity of the data. The National Institute of Standards and Technology provides traceable standards for many physical quantities of interest to computer systems, such as time. This attestation process is the logical equivalent of two testimonies between individuals. In a debate these are equivalent, but in a court of law legal precedent again comes down on the side of the state. This precedent has been appealed to the highest court, and the state always wins on principle.

The source and destination of the events in a graph (Chain of Evidence) and the attribution of the data (Chain of Custody) are only part of the problem. In the case of our traffic system, the camera captures the reflected light from the moving vehicle and stores it as an image (picture). A pictorial image is a field: a set whose members are positions, with a value at each position (color, or intensity for black-and-white images) that forms the data. A picture can thus be considered a fact devoid of context. Author Neil Postman has written eloquently on pictures de-contextualizing information.

If we take a picture of a group of people, then those in the picture know the context: a party, a celebration, or some attributable “fact” about the circumstances of the picture. The photographer can be completely devoid of context yet successfully capture an image (think wedding photographer). It is the context of the image that provides the meaning. For our traffic monitoring system the image is the fact of a car reflecting light in a particular intersection. External attestations are required as part of the chain of evidence in order to provide meaning to the context.

For our example, the traffic light state (red/yellow/green) may not be visible in the image, so the image has to be “tagged” with the light state externally. How closely in time can the image acquisition and event state be synchronized? At 30 miles per hour the car is moving at 44 feet per second (roughly two car lengths). If the synchronization error is 100 milliseconds, the positional uncertainty is 4.4 feet, and this may be enough to determine whether the vehicle is in or out of the intersection at the moment the offense is deemed to occur (when the picture is snapped). If the light is poor or if the camera is slow, the image can blur and the error margin becomes greater.
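The arithmetic behind these figures is easy to check. The short Python sketch below reproduces it; the function names `speed_feet_per_second` and `positional_uncertainty_feet` are hypothetical, chosen only for this illustration, and the calculation assumes the car holds a constant speed for the duration of the synchronization error.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def speed_feet_per_second(mph):
    """Convert a speed in miles per hour to feet per second."""
    return mph * FEET_PER_MILE / SECONDS_PER_HOUR


def positional_uncertainty_feet(mph, sync_error_ms):
    """Distance the car covers during the synchronization error.

    Assumes constant speed over the whole interval.
    """
    return speed_feet_per_second(mph) * sync_error_ms / 1000.0


if __name__ == "__main__":
    print(speed_feet_per_second(30))             # speed at 30 mph, in ft/s
    print(positional_uncertainty_feet(30, 100))  # uncertainty for 100 ms, in ft
```

The uncertainty scales linearly with both speed and synchronization error, so the same 100 milliseconds at 60 miles per hour doubles the positional uncertainty.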

The image must show the light to be in only one state; if it shows multiple states, the legal model itself becomes devoid of meaning. If the light is red, an offense may still not have occurred if the operator entered the intersection on a yellow. If the light state information is wrong, then the context and chain of evidence are different and no offense can be detected.

Even if the system is perfect (idealized) and my car is running a red light, so that a traffic offense is occurring, the system does not know who is driving my car. It could be my wife or a family member or a car thief. It could be the same model car with a fake license plate bearing my number, driven by a group of criminals in a getaway from a crime scene while I am actually in Tahiti on vacation and not capable of driving the car. Or my car could be in the junkyard because I scrapped it the day before the offense occurred. These all illustrate how difficult it is to make a meaningful attestation to a physical event when the context cannot be established either logically (could be established in principle) or actually (was established).

We have not even gotten to the optical character recognition system and what its limitations and accuracies are. Nor have we discussed communication network bandwidth and latency for synchronizing cameras and time measurement. Being both a physical device and a computer system, the OCR system will have similar limitations on the Chain of Evidence and Chain of Custody. A record showing the license plate being read before the image was snapped would certainly make an offense of little value in terms of proof, yet this could in fact occur in the case of a legitimate offense.

In addition to the physical limitations of the system, consider how it could be diverted from its designed purpose. It would be an elementary effort to take an image of an empty intersection, take an image of a car legally in the intersection, and subtract the background, leaving only the car. This image could then be added to an image of the empty intersection, giving the impression of the car in the intersection at a particular moment in time. If the car is translated slightly within the image, it can be moved from a position of legality to one of illegality. The same methodology could be applied to any particular license plate for owner identification, as well as to the driver.
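To show how little is involved, here is a toy Python sketch of the subtract-and-paste idea on tiny grayscale images represented as nested lists of intensities. It is an illustration under stated assumptions, not a description of any real tool: the function names `extract_foreground` and `composite` and the `threshold` value are hypothetical, and real image editors work on full-color images with lighting and shadows that this sketch ignores.

```python
def extract_foreground(with_car, empty, threshold=10):
    """Mark the pixels that differ from the empty-intersection image.

    Returns a grid of booleans: True where the car is assumed to be.
    """
    return [
        [abs(p - q) > threshold for p, q in zip(row_car, row_empty)]
        for row_car, row_empty in zip(with_car, empty)
    ]


def composite(background, source, mask, shift=0):
    """Paste the masked source pixels onto a copy of the background.

    `shift` moves the pasted pixels right by that many columns;
    pixels that would land outside the image are dropped.
    """
    result = [row[:] for row in background]
    width = len(background[0])
    for y, mask_row in enumerate(mask):
        for x, is_car in enumerate(mask_row):
            if is_car and 0 <= x + shift < width:
                result[y][x + shift] = source[y][x]
    return result


if __name__ == "__main__":
    empty = [[50] * 5 for _ in range(3)]   # empty intersection
    with_car = [row[:] for row in empty]
    with_car[1][0] = with_car[1][1] = 200  # a "car" two pixels wide
    mask = extract_foreground(with_car, empty)
    for row in composite(empty, with_car, mask, shift=2):
        print(row)
```

In the usage example the two bright "car" pixels are lifted from the left edge of one image and pasted two columns to the right in the other, which is the translation from legality to illegality described above, reduced to its bare mechanics.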

The diligent reader might argue that lighting and shading models would prevent this from occurring; that is, that an “expert” witness could tell the real from the fake. I disagree, for the following reason: many types of image processing software provide pixel-by-pixel image editing (Photoshop, GIMP, ImageMagick), so a convincing edit is possible if someone is willing to spend enough time on it. Given the massive computational power now available cheaply, it is certain that this is not only possible but probable.

This example illustrates all of the problems with digital surveillance: the data model provides the evidence, the data system has systemic errors, and the Chain of Evidence and Chain of Custody cannot be attributed beyond a reasonable doubt. Furthermore, the attribution essential to linking the vertices in the graph always traces back to an individual attestation based upon a definition of what is true. When it is one third party testifying against another third party, the legal conclusion cannot select between one and the other, and hence the testimony is inadmissible. In the case of a trusted member of the state (police officer, agent, serviceman, etc.), the legal benefit of truth will accrue where it is logically not valid.

Systems of these types are in widespread usage in the United States and many other countries. Many cities install them not as revenue sources but as data collection systems for traffic flow information. There are many more governmental entities with envious eyes looking at the cost/benefit ratio and waiting for the legal issues to be resolved before implementation. It is only a matter of time before these systems are ubiquitous and become another weapon in the state’s arsenal for pilfering from individuals.

Whether it is satellite photos, hidden cameras recording criminal activity, airport cameras performing facial recognition, or automated gas chromatographs detecting drugs in passersby or tailpipe emission violations (the list is literally endless), the conclusion is the same. If the data acquisition, transmission, storage and retrieval are digital, then the data has no validity in a court of law because of the inadmissibility of evidence in either the Chain of Custody, the Chain of Evidence, or both. The low cost of fraud and the fundamental inability to tell real from fake make digital technology inadmissible from a legal standpoint for public purposes.

Just as the Constitution provided limited powers to the Federal Government with all other rights being reserved for the individual, usage of digital technology for private affairs is purely up to the user’s discretion for whatever purpose they desire, as long as it is not offered as evidence of a crime.

It is not my assertion that this is at all reflected in law, since just the opposite is true. The preponderance of fact is that the state, in all its forms, is surveilling everything possible, on a continual basis, for whatever legitimate and illegitimate reasons. The damage to a free society from this perversion of technology is unlimited, as it usurps the fundamental constitutional principle of innocence until proven guilty. Guilt is assumed before the act of surveillance, which then provides a foregone conclusion regardless of the actual surveillance system’s fidelity and integrity with respect to purpose and design constraints.

In closing I would add the following, and I think it equally applicable to your family as it is to mine. I have repeatedly admonished my children that they should expect to be under video and audio surveillance at all times when in public or private (outside of their own home), as this is the state of the world in which we live, and to act accordingly. Additionally, complete digital surveillance of the Internet should be assumed at all times. There is no anonymity on the Internet.

Having been a professional technologist for over 25 years, I am completely agnostic as to the merits of digital technology. It is irrelevant. What matters is human nature, which remains the same across cultures, faiths and creeds, and over the millennia. Specifically, what matters is that perversion of human nature that accrues with political power and the desire to hold sway over others. This technology, because of its low cost and fungible nature, will eventually be deemed valueless by a free society. The road to that point may be rough.

Ironically, it is the aging analog technologies, like film cameras, video tape recorders, fingerprints, and typewritten and handwritten documents, that are exactly the opposite. For them forgery is difficult to execute and expensive to reproduce. The fraudulent is always detectable! Just as the ballistic signature from a firearm is unique, so too do analog technologies provide authentic signatures beyond a reasonable doubt. The old is better than the new.

The real question is what is surveillance in a free society for? If we lived in a neutral country that had free trade with all and encumbrances with none there would be no need for internal or external surveillance as we would have no enemies. A vigilant national defense using digital technology is reasonable because there would never be a criminal prosecution of an individual so the flaws are irrelevant.

As always the Founding Fathers provide guidance:

“Those who would trade liberty for security will soon have neither.”

~ Attributed to Benjamin Franklin

I know there are many who will disagree with this discussion of the lack of integrity in digital data. Feel free to email your objections to me for analysis and discussion.