Email, Duplication, and Stupidity
We've all been trying to hold onto our sanity during this election through one non-substantive controversy after another. The latest, of course, is the cache of Hillary emails found on Anthony Weiner's laptop, which were left over from when his estranged wife, a top Clinton aide, used the laptop to screen emails for the former Secretary of State. Today I saw this tweet:
Checking these emails went so fast because they were all duplicates. Let me explain.
In perhaps the most ironic story on earth today, I stumbled upon the duplicate file finder in the CCleaner program. I figured my laptop could use a little maintenance, so I clicked it just as a co-worker asked if I'd heard the latest about this FBI debacle. By the time I turned around (10 to 20 seconds later), the program had finished and had found between 180 and 200 duplicate files. That's roughly 10 to 20 duplicates discovered per second.
Now, I don't have 650,000 different files on my laptop to compare, but the program did compare files in scattered locations all around the laptop. (I have 231,624 files on my computer, for what it’s worth, and you can see how I learned that on the link below this article. Unfortunately, any further analysis gets into details, such as how program files compare against document files, that I don’t know enough to discuss. The important thing is that you don’t need to be a technological wizard to come up with a basic analysis of this issue.)
In contrast, the FBI could have easily set the two sets of emails into only 2 directories and let the program go to work.
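For the curious, here is a minimal sketch in Python of what a two-directory duplicate check might look like. I don't know how CCleaner or the FBI's tools actually compare files; this version assumes each email is saved as its own file and treats two files as duplicates only when their contents are byte-for-byte identical, which it checks by comparing SHA-256 hashes. The function names and the directory layout are my own inventions for illustration.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_directory(root):
    """Map each content digest to the list of file paths under root."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index.setdefault(file_digest(path), []).append(path)
    return index


def split_new_and_duplicate(reviewed_dir, new_dir):
    """Sort files in new_dir into (duplicates, unseen) relative to reviewed_dir.

    Assumption: "duplicate" means identical bytes. Real emails that differ
    only in headers would need a smarter comparison than this.
    """
    reviewed = index_directory(reviewed_dir)
    duplicates, unseen = [], []
    for digest, paths in index_directory(new_dir).items():
        (duplicates if digest in reviewed else unseen).extend(paths)
    return sorted(duplicates), sorted(unseen)
```

Calling `split_new_and_duplicate("already_reviewed", "laptop_cache")` (both directory names are hypothetical) would return the files that match something already reviewed and the ones that would still need a human to read them.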
In further contrast, I was using a 5-year-old laptop. The FBI has mainframe computers that are orders of magnitude faster than this old laptop. They also have technological specialists who could set several of these uber-computers onto this stack of data from the beginning. Before we get too far off track, though, let’s go back to the math posed above.
Even my old laptop, which is slowly failing as it limps toward its eventual oblivion, discovered about 10 duplicate files per second.
First, the main premise is that there is no way the FBI could check an email per second. So, to even things out, let’s deduct 650,000 seconds (one second per email) from those 8 days (691,200 seconds).
691,200 - 650,000 = 41,200 seconds
Now, divide that by 3,600 to convert those seconds to hours.
41,200 / 3,600 = 11.444 hours.
I’m gonna ballpark that at about 11 ½ hours, which is nearly half a day.
So, even at one duplicate per second, a run-of-the-mill computer should have been able to do the job in only 7 ½ days, leaving 11 ½ hours of extra time to write that very difficult paragraph to Congress saying that the emails were all duplicates. However, even this simple math isn’t done yet, since my old laptop could find at least 10 duplicates per second.
If the FBI’s multiple mainframe computers could do 10 duplicates per second, we divide 650,000 by 10, which takes us down to 65,000 seconds to do the job.
650,000 / 10 = 65,000 seconds
65,000 seconds / 3,600 seconds/hr = 18.06 hours.
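If you'd rather check the arithmetic than take my word for it, here is a small Python sketch of both calculations. The 650,000 emails, the 8 days, and the two rates come from the discussion above; the function and variable names are just my own labels, and the rates are ballpark assumptions rather than measurements of anything the FBI runs.

```python
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

EMAILS = 650_000
WINDOW_SECONDS = 8 * SECONDS_PER_DAY  # 8 days = 691,200 seconds


def hours_to_check(emails, per_second):
    """Hours needed to check `emails` at `per_second` emails per second."""
    return emails / per_second / SECONDS_PER_HOUR


def slack_hours(emails, per_second, window_seconds):
    """Hours left over in the window after the check finishes."""
    return window_seconds / SECONDS_PER_HOUR - hours_to_check(emails, per_second)


# One email per second: about 7.5 days of work, 11.444 hours to spare.
print(round(hours_to_check(EMAILS, 1) / 24, 2))            # 7.52
print(round(slack_hours(EMAILS, 1, WINDOW_SECONDS), 3))    # 11.444

# Ten emails per second, my old laptop's pace: 18.06 hours.
print(round(hours_to_check(EMAILS, 10), 2))                # 18.06
```

Changing the second argument to `hours_to_check` lets you try any other rate you think is more realistic.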
Now, I’m tech savvy, but not a tech specialist. Here’s what one well known tech specialist had to say about the time it would take:
So, it seems that my estimate is on the high side.
That is not even a day. Let that sink in. The FBI let this drag on for 8 days when even a slow machine should have been able to accomplish this task in less than a day. But wait - there’s more!
Comey’s reason to even make the announcement was because he didn’t want to possibly make it too close to the election, and he wasn’t sure that this investigation would finish by then. Wouldn’t doing a duplicate check first have been the smart thing to do, especially if it could be done in less than a day? Is he telling the American people that the Director of the FBI doesn’t have a way to ask an IT specialist how long it would take to figure out if this new cache of emails was new or just copies of emails that had already been vetted by the FBI?
Weigh the prejudicial effect of his announcement versus the harm that would have been committed by waiting for less than a day to perform proper due diligence. If you look at 538.com’s horse-race/Rorschach graph, Clinton has gone from 81% chance of winning to 65%. (see link below). To paraphrase a recent piece by Nate Silver of 538.com, you can’t tease out all the variables that caused such a steep decline in these 8 days, but it is very similar to when Comey released the finding of the FBI probe back in July. (another link below.) What I took from his article is that we may not know exactly which breed of duck we have, but this story definitely looks and walks like a duck.
So, why couldn’t Comey give proper due diligence to this issue? Why did he have to put his thumb on the scale of the US Presidential election? I’ve seen reports that there are some deep divisions within the FBI. Well, if the divisions are so deep that it causes this level of improper action by the FBI director, then it’s time to clean house. The credibility of American democracy depends on it. Perhaps that cleaning should start with Comey.
But wait - there's even more! Republicans in general, but especially Congressional leaders, should have shown a little decency. Comey's original statement was vague, but it certainly did not convey that there was any reason to think that there was solid evidence of criminality in this new cache of emails. As vague as it was, it should have been left alone. Instead, Republican leaders went on and on, saying that the investigation was reopened (it wasn't), and that indictments were surely forthcoming (which, of course, didn't happen). Given the information that they had, there was no justification for these fabrications.
As a father, I believe in basic values of fairness, honesty, and common decency. Since I try to teach these values to my children, one could even say that they are family values. So, why on Earth did the leaders of the "party of family values" distort facts and contort the truth into unrecognizable falsity over this issue? You Republicans are losing this election because America has seen that you don't really believe in these things. You don't represent the family values that I hold dear, and you certainly don't deserve to represent the American people.