This article was originally posted on LinkedIn
Much has been written on the impropriety of FBI Director James Comey’s letter to Congress on Friday, October 28, 2016. As an attorney who specializes in the discovery and analysis of electronic evidence, I have repeatedly been asked my opinion on this unfolding scandal, particularly regarding what the FBI likely knows at this point and how long it will take them to provide real answers.
Let's start with the two key facts asserted in Director Comey’s letter to Congress: First, the FBI discovered emails in an unrelated case “that appear to be pertinent to the investigation (of former Secretary Clinton’s personal email server).” Second, “the FBI cannot yet assess whether or not this material may be significant.” (The letter can be read in full in the New York Times or other major newspapers.)
It has been reported through numerous unauthorized leaks that the unrelated case was the investigation into Anthony Weiner’s alleged relationship with a teenage girl, and that approximately 650,000 emails were located on a computer shared by the family. It has also been reported that the FBI obtained a search warrant to examine these emails on Sunday, October 30, 2016, which would allow the FBI to “assess whether or not this material may be significant.”
With only one week before the election, the 24-hour news cycle is filling the airwaves and print media with non-stop supposition, innuendo, and pure pants-on-fire fabrications.
As a professional with expertise in electronic evidence, I think it is important that we consider what the FBI should be able to know now, about 72 hours after obtaining the search warrant for these emails. The following assumes for analysis purposes that there are, in fact, approximately 650,000 emails in total on the device. (NOTE: while not theoretically inconceivable, in my experience this is an extraordinarily high number of emails, even assuming that it is from two hyper-power users.) I also assume that the FBI has access to the same technology that I have readily available to me and my colleagues who work in the world of civil electronic discovery.
INFORMATION AVAILABLE BEFORE THE OCTOBER 30, 2016 WARRANT
Emails are not examined by looking through them in Outlook or Apple Mail. In the course of the Weiner investigation, the FBI would have created a forensic image of the computer containing the emails, copying every zero and one from the computer storage (i.e., a hard drive or flash memory). They would have then extracted and processed all the data on the drive using a computer software program designed expressly for the purpose of examining electronic evidence. Once that is done, a forensic analyst begins to examine the data for evidence. It is at this point, I expect, that an FBI analyst identified emails containing the domain identifier “@clintonemail.com” and notified superiors that there was electronically stored information (ESI) that may be “pertinent to the investigation” of the Clinton email server. Given that the FBI sought a separate search warrant on October 30 for these emails, it is safe to presume that these emails were not within the scope of the original warrant for Weiner’s emails, and had not been examined or otherwise substantively assessed for actual relevance to the investigation of the Clinton email server prior to Comey sending his letter to Congress on October 28.
INFORMATION AVAILABLE WITHIN HOURS OF OBTAINING THE OCTOBER 30, 2016 WARRANT
From my perspective, what we want to know immediately is whether there are any unique emails that have a reasonable probability of containing information related to the operations of the Department of State or the United States government and could contain classified information. If there is no such evidence on the device, then Director Comey’s letter is materially misleading and he has a duty to promptly clarify his remarks.
Within hours of receiving the search warrant on October 30, the FBI should have been able to perform an initial analysis of the emails and determine, amongst other things, the following:
- The total number of emails containing the domain “@clintonemail.com”.
- A list of all full email addresses that use the domain “@clintonemail.com” (e.g., hrc@).
- The count of emails from each full email address using the domain “@clintonemail.com”.
- A breakdown showing the count of emails that were sent from, sent to, copied or blind copied to an individual using the domain “@clintonemail.com” (e.g., 10 FROM hrc@, 50 TO hrc@, 100 CC to hrc@, 2 BCC to hrc@).
- The total number of emails which were forwarded to Mr. Weiner from an individual using the domain “@clintonemail.com” (e.g., Huma Abedin forwarded to her husband a copy of an email that HRC was copied on).
- The total number of emails sent from, sent to, copied or blind copied to an individual using the domain “@clintonemail.com” that were also addressed to someone, other than Mr. Weiner, at a .gov domain extension.
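To make the list above concrete, here is a minimal Python sketch of that kind of metadata tally. It is an illustration only, not the FBI's or any vendor's actual tooling: the sample records are invented, and the field names and the `tally` function are hypothetical. It assumes the header fields have already been extracted from the forensic image by a processing tool.

```python
from collections import Counter
from email.utils import getaddresses

DOMAIN = "@clintonemail.com"

# Invented sample records for illustration only; real input would come
# from the processing tool's export of each message's header fields.
messages = [
    {"from": "huma@clintonemail.com", "to": "someone@example.com", "cc": "", "bcc": ""},
    {"from": "a@state.gov", "to": "hrc@clintonemail.com", "cc": "huma@clintonemail.com", "bcc": ""},
    {"from": "b@example.org", "to": "c@example.org", "cc": "", "bcc": ""},
]

def tally(messages, domain=DOMAIN):
    """Return (number of messages touching the domain,
    Counter keyed by (address, header field))."""
    total = 0
    by_address_field = Counter()
    for msg in messages:
        touched = False
        for field in ("from", "to", "cc", "bcc"):
            # getaddresses splits a header value into (display name, address) pairs
            for _name, addr in getaddresses([msg.get(field, "")]):
                addr = addr.lower()
                if addr.endswith(domain):
                    by_address_field[(addr, field)] += 1
                    touched = True
        if touched:
            total += 1
    return total, by_address_field

total, breakdown = tally(messages)
for (addr, field), n in sorted(breakdown.items()):
    print(f"{n} {field.upper()} {addr}")
```

Keying the counter on the (address, field) pair gives the FROM / TO / CC / BCC breakdown and the per-address counts from a single pass over the data.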
There have been media reports that the FBI found no emails that were sent by or to Secretary Clinton in this data set. This is quite plausible. We know that Ms. Abedin had an account on the @clintonemail.com domain using “huma@”. (See, e.g., http://www.judicialwatch.org/wp-content/uploads/2015/09/JW-v.-State-Abedin-email-006841.pdf). An analyst’s identification of metadata showing even one email from the @clintonemail.com domain while investigating the potential case against Mr. Weiner could serve as a basis for a warrant to determine if Ms. Abedin had failed to submit all subpoenaed emails to the FBI during its investigation of the Clinton email server. Thus, the facts set forth in James Comey’s letter could have been triggered by Huma using her @clintonemail.com account to send one single email to her now estranged husband.
In trying to identify potentially relevant emails, one option immediately available to the FBI is the use of date restrictions. Hillary Clinton served as the 67th Secretary of State from January 21, 2009 to February 1, 2013. Emails sent using the @clintonemail.com domain before January 21, 2009 or after February 1, 2013 are not relevant to the investigation of her use of a private email server while in office. The FBI could have immediately used available technology to filter out any emails that were sent or received outside that date range, which would likely substantially reduce the volume of potentially relevant emails for further examination.
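A date-range restriction of this kind is simple to express in code. The following Python sketch uses the two tenure dates given above; the sample records and the `in_tenure` helper are hypothetical, and treating both boundary dates as inclusive is an assumption made for the example.

```python
from datetime import date

# Tenure dates as stated in the article; boundaries treated as inclusive (assumption).
TENURE_START = date(2009, 1, 21)
TENURE_END = date(2013, 2, 1)

def in_tenure(sent_on, start=TENURE_START, end=TENURE_END):
    """True if the message date falls within the date range of interest."""
    return start <= sent_on <= end

# Invented sample records for illustration only.
emails = [
    {"id": 1, "sent": date(2008, 12, 31)},
    {"id": 2, "sent": date(2009, 1, 21)},
    {"id": 3, "sent": date(2012, 6, 15)},
    {"id": 4, "sent": date(2013, 2, 2)},
]

kept = [e["id"] for e in emails if in_tenure(e["sent"])]
print(kept)
```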
In addition to metadata analysis and applying date range restrictions, the FBI should have had the immediate ability to perform text-based searches, including for the words or symbols used to flag emails as confidential or classified with a specific designation. It has been previously reported that the inclusion of the letter C in a subject line of an email indicated that it was intended to be treated as confidential (not “classified” as repeatedly misstated by several politicians). This information could be found with a simple search that would take no more than a few seconds.
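As a rough sketch of such a text search, the Python below scans subject lines and message bodies for a short list of marking terms. The specific terms in `MARKERS` are assumptions chosen for illustration, not a statement of what markings the FBI actually searched for, and the sample emails are invented.

```python
import re

# Assumed marking terms for illustration; matching is case-sensitive so that
# ordinary words such as "secretary" or "(c)" are not flagged.
MARKERS = re.compile(r"\(C\)|\bCONFIDENTIAL\b|\bSECRET\b")

def flag(emails):
    """Return ids of emails whose subject or body contains a marking term."""
    return [
        e["id"]
        for e in emails
        if MARKERS.search(e["subject"]) or MARKERS.search(e["body"])
    ]

# Invented sample records for illustration only.
emails = [
    {"id": 1, "subject": "Lunch", "body": "See you at noon."},
    {"id": 2, "subject": "(C) Call sheet", "body": "Attached."},
    {"id": 3, "subject": "Notes", "body": "This is CONFIDENTIAL."},
    {"id": 4, "subject": "Staffing", "body": "Ask the secretary (c) 2012."},
]

print(flag(emails))
```

A hit from a search like this only marks an email for human review; it says nothing by itself about whether the content is actually classified.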
Additionally, assuming that the FBI still retained the emails and related information obtained from the Clinton server, FBI analysts should have been able to execute a software function that would identify exact duplicate copies of emails that were in both data sets. While this process doesn’t always work perfectly when examining ESI from two different systems, there are other ways to identify duplicates that take a little more time, which I will address below.
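Exact-duplicate identification is commonly done by comparing hash values computed over selected fields of each email. The Python sketch below shows the idea under stated assumptions: the choice of fields, the normalization steps, and the `fingerprint` and `shared_ids` helpers are all hypothetical, and real processing tools make their own choices about which fields to hash.

```python
import hashlib

def fingerprint(email):
    """SHA-256 over a few normalized fields (field choice is an assumption)."""
    parts = [
        email["from"].strip().lower(),   # addresses compared case-insensitively
        email["subject"].strip(),
        " ".join(email["body"].split()), # collapse whitespace differences
    ]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()

def shared_ids(set_a, set_b):
    """Ids of emails in set_b whose fingerprint also appears in set_a."""
    seen = {fingerprint(e) for e in set_a}
    return [e["id"] for e in set_b if fingerprint(e) in seen]

# Invented sample records for illustration only.
server_set = [
    {"id": "S1", "from": "A@example.com", "subject": "Schedule", "body": "See  attached.\n"},
]
device_set = [
    {"id": "D1", "from": "a@example.com ", "subject": "Schedule", "body": "See attached."},
    {"id": "D2", "from": "a@example.com", "subject": "Schedule", "body": "See attached, revised."},
]

print(shared_ids(server_set, device_set))
```

The normalization step hints at why hashing across two different systems can miss matches: any difference that survives normalization, however trivial, produces a different hash.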
There are several other searches that could have been performed by a trained FBI analyst immediately after receiving the court order, including identifying all domains that appear on the emails sent from or to the @clintonemail.com domain. As noted above, the FBI would have been able to promptly identify any emails with the @state.gov domain or any .gov domain extension that would be presumptively related to the operation of the government, and would need to be set aside for review of the content of the email message.
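One way to picture the domain search described above is as a count of every other domain that appears on a message alongside the domain of interest. This Python sketch is illustrative only; the sample address lists are invented and `counterpart_domains` is a hypothetical helper.

```python
from collections import Counter

def counterpart_domains(messages, domain="clintonemail.com"):
    """Count, per message, the other domains appearing alongside `domain`."""
    counts = Counter()
    for addresses in messages:  # each message: every address on it, any field
        doms = {a.rsplit("@", 1)[-1].lower() for a in addresses}
        if domain in doms:
            counts.update(d for d in doms if d != domain)
    return counts

# Invented sample data for illustration only.
messages = [
    ["huma@clintonemail.com", "x@state.gov"],
    ["huma@clintonemail.com", "y@example.com", "z@state.gov"],
    ["p@example.org", "q@example.com"],
]

counts = counterpart_domains(messages)
gov_domains = sorted(d for d in counts if d.endswith(".gov"))
print(counts)
print(gov_domains)
```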
Another standard operation with emails that would have been available immediately is known as social networking analysis (SNA). SNA allows an analyst to interact with a graphical representation of all individual email addresses and how they interact, including who is communicating with whom and the volume of flow of emails. This would provide an FBI analyst with an easy visual tool to examine who was communicating with whom using the @clintonemail.com domain. While I don’t usually cite to Wikipedia, the page on SNA provides a reasonable explanation of this process.
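The data behind such a graph is essentially a weighted list of who sent mail to whom. The Python sketch below builds that edge list and a per-address volume count; it does not draw anything, the sample messages are invented, and `build_edges` and `volume` are hypothetical names.

```python
from collections import Counter

def build_edges(messages):
    """Weighted directed edges: (sender, recipient) -> number of emails."""
    edges = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            edges[(sender, recipient)] += 1
    return edges

def volume(edges):
    """Total emails sent plus received per address."""
    totals = Counter()
    for (sender, recipient), n in edges.items():
        totals[sender] += n
        totals[recipient] += n
    return totals

# Invented sample data for illustration only.
messages = [
    ("huma@clintonemail.com", ["x@example.com"]),
    ("huma@clintonemail.com", ["x@example.com", "y@state.gov"]),
    ("y@state.gov", ["huma@clintonemail.com"]),
]

edges = build_edges(messages)
print(edges.most_common(1))
print(volume(edges))
```

A visualization tool would render each address as a node and each edge as a line whose thickness reflects the count.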
INFORMATION AVAILABLE WITHIN 72 HOURS OF OBTAINING THE OCTOBER 30, 2016 WARRANT
There is no publicly available evidence at this time that there are ANY emails from or to Secretary Clinton using the @clintonemail.com domain on the machine. If we were to assume, for the sake of argument, that some do exist, the first thing I’d consider before determining my next steps would be how many potentially relevant emails exist related to the Clinton server investigation, i.e., emails that had at one point touched the Clinton server and are, in fact, related to the operations of the Department of State or any other executive department or governmental branch and may contain classified information. As noted above, the count of emails that touched the Clinton server was immediately available to the FBI. If there are fewer than 1,000 potentially relevant emails, I would proceed quickly to conduct a preliminary manual (eyes on every document) first-pass review to determine if there is any indication on the face of the document that the emails contain information that is relevant.
If there is a larger number of emails that are potentially relevant, the FBI could run them through a software program that will automatically group related emails together (known as clustering), and identify emails that are exact duplicates or are part of the same email string based on the textual similarities in the emails rather than the metadata. This process generally takes only a few hours, and rarely more than one day. Once the software completes its processing of the data, an FBI analyst could perform advanced analytics to quickly classify emails into defined subject matters, group exact copies and similar versions of emails together, and then identify and prioritize emails that should receive immediate review as likely to be relevant. This work could easily have been performed during the first 48 hours, and certainly within 72 hours, following the issuance of the warrant on October 30, 2016.
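To give a feel for grouping by textual similarity, here is a deliberately simple Python sketch that compares overlapping three-word sequences ("shingles") using Jaccard similarity and greedily groups texts that exceed a threshold. Commercial review platforms use more sophisticated methods; the technique, the 0.5 threshold, the sample texts, and the function names here are all assumptions for illustration.

```python
def shingles(text, k=3):
    """Set of overlapping k-word sequences from the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Shared shingles divided by total distinct shingles."""
    return len(a & b) / len(a | b)

def cluster(docs, threshold=0.5):
    """Greedy grouping: join the first cluster whose first member is similar enough."""
    clusters = []  # list of (shingles of first member, [doc ids])
    for doc_id, text in docs:
        s = shingles(text)
        for rep, ids in clusters:
            if jaccard(rep, s) >= threshold:
                ids.append(doc_id)
                break
        else:
            clusters.append((s, [doc_id]))
    return [ids for _rep, ids in clusters]

# Invented sample texts for illustration only.
docs = [
    (1, "please print the attached schedule for tomorrow morning"),
    (2, "please print the attached schedule for tomorrow morning thanks"),
    (3, "the car will arrive at nine"),
]

print(cluster(docs))
```

Because the comparison uses only the text, two copies of the same message can be grouped even when their metadata differs between systems.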
During this same time frame, the FBI should have been able to determine if any emails harvested from Mr. Weiner’s computer are a missing part of an email thread first identified on the Clinton server. For example, let’s say hypothetically that there was an email found on the Clinton server from a third party to both Secretary Clinton and to Ms. Abedin, but there was no reply found on the Clinton server. Now let’s say that a reply from Secretary Clinton to this third party is located on Mr. Weiner’s computer. We have software available that will automatically associate the emails, and a quick search will show all emails where a new part of the same email string has been found.
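The hypothetical above can be sketched in a few lines of Python. Note that this example swaps in a simpler technique than the text-based threading described earlier: it links messages through the standard `Message-ID` and `In-Reply-To` email headers, which only works when those headers survive intact. The sample records and the `new_thread_parts` helper are hypothetical.

```python
def new_thread_parts(server_set, device_set):
    """Message-IDs on the device that reply to a message known from the
    server but were not themselves found on the server."""
    known_ids = {e["message_id"] for e in server_set}
    return [
        e["message_id"]
        for e in device_set
        if e["message_id"] not in known_ids
        and e.get("in_reply_to") in known_ids
    ]

# Invented sample records for illustration only.
server_set = [
    {"message_id": "<m1@example.com>", "in_reply_to": None},
]
device_set = [
    {"message_id": "<m1@example.com>", "in_reply_to": None},                # already known
    {"message_id": "<m2@example.com>", "in_reply_to": "<m1@example.com>"},  # new reply
    {"message_id": "<m3@example.com>", "in_reply_to": "<zzz@example.com>"}, # unrelated
]

print(new_thread_parts(server_set, device_set))
```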
Within 72 hours, the only information remaining to be gathered would come from a manual, document-by-document review of every email that has any potential to contain classified information.
WHAT WON’T BE AVAILABLE BEFORE THE ELECTION
If contra-leaks are true, and there are hundreds, if not thousands, of emails that have a probability of containing classified information, that review will take weeks, if not months, to complete. This process is outside the scope of a normal civil discovery process, as it requires multiple rounds of interagency review and analysis.
At this point, Mr. Comey and the FBI can provide definitive answers to numerous questions raised by his cryptic letter to Congress of October 28, 2016, including how many new, unique emails from Secretary Clinton exist on Mr. Weiner’s computer, and how many, if any, will need to be substantively reviewed for potential classification. It seems to me that whether Mr. Comey answers the calls to, at a minimum, clarify and extend his remarks is subject not to technical limitations, but rather to Mr. Comey’s political decision-making.
About the Author:
Eric P. Mandel is the Managing Member of Indicium Law PLC, a boutique firm focused on navigating clients through the legal, technology, and process issues related to eDiscovery, cyber risk, data privacy, and information governance. Eric serves on multiple policy and standards setting bodies, including the Board of Directors of the Legal Technology Professionals Institute (LTPI) and the Steering Committee of The Sedona Conference Working Group 1 (WG1). Eric is licensed to practice law in California and Minnesota, and is an IAPP Certified Information Privacy Professional (CIPP/US).