Establishing an Adequate Search & Why “Custodians [Cannot] be Trusted to Run Effective Searches of Their Own Files”

Aug 03 2012

Nat’l Day Laborer Org. Network v. United States Immigration & Customs Enforcement Agency, — F. Supp. 2d —, 2012 WL 2878130 (S.D.N.Y. July 13, 2012)

In this FOIA case, Judge Shira Scheindlin addressed the adequacy of the government’s search for information responsive to plaintiffs’ substantial FOIA request.

This case addresses plaintiffs’ request for information pursuant to the federal Freedom of Information Act (FOIA) and their assertions that defendants’ searches for such information were inadequate. In its analysis, the court expressly acknowledged that “the search obligations under FOIA are not identical to those under the Federal Rules of Civil Procedure,” but nonetheless reasoned that “much of the logic behind the increasingly well-developed caselaw on e-discovery searches is instructive in the FOIA context . . . .”

Following its analysis of the defendants’ custodian selection, which resulted in a mix of findings regarding the adequacy of their efforts, the court turned to the searches themselves, beginning with a discussion of what, if any, instructions or keywords were provided to custodians to guide their search efforts. The court determined that each agency proceeded differently, ranging from providing custodians with mandatory search terms which they were later confirmed to have used (a search practice that notably was not challenged by the plaintiffs) to providing no search terms, mandatory or otherwise. In the latter case, the FBI produced nothing to indicate that its internal search memorandum provided custodians with search terms or any instruction or guidance regarding how the search should be conducted; the FOIA officer instead explained that “each employee conducted a manual review of his or her records.” It was further explained that “‘[t]he FBI largely did not rely on search terms, but instead relied on the knowledge of its custodians’ who ‘sift[ed] through and review[ed] tens of thousands of pages of records.’” The court noted the likelihood that custodians “only reviewed certain categories of documents … or narrowed the number of emails that they examined by first using search terms” and indicated that more specificity about “what this manual search entailed” should have been provided.

Turning to its analysis of defendants’ search efforts, the court first emphasized that to establish the adequacy of a search, in particular one that relies on search terms, more information must be provided than just the terms themselves: “[I]n order to determine adequacy, it is not enough to know the search terms. The method in which they are combined and deployed is central to the inquiry.” Similarly, discussing the standard to win summary judgment, the court indicated early in its opinion that:

In their affidavits, agencies must “‘identify the searched files and describe at least generally the structure of the agency’s file system’ which renders any further search unlikely to disclose additional relevant information.” They must establish that they searched all custodians who were reasonably likely to possess responsive documents. And they must “set[ ] forth the search terms and the type of search performed.”

(Footnotes omitted.)
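To make concrete what it might mean to record how search terms are "combined and deployed," here is a minimal Python sketch. It is an illustration only: the `SearchSpec` class, the `run_search` function, and the sample records are all invented for this example and do not come from the opinion or from any agency's system. The sketch assumes a search record would capture the terms, the combining logic, the fields searched, and the time frame.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SearchSpec:
    """Hypothetical record of how one search was run (not from the opinion)."""
    terms: tuple      # keywords given to the custodian
    operator: str     # "AND" or "OR": how the terms are combined
    fields: tuple     # which parts of each record were searched
    start: date       # start of the time frame (inclusive)
    end: date         # end of the time frame (inclusive)

    def describe(self) -> str:
        """One-line description that could be quoted in a declaration."""
        joined = f" {self.operator} ".join(f'"{t}"' for t in self.terms)
        return (f"{joined} in fields {', '.join(self.fields)} "
                f"from {self.start} to {self.end}")


def run_search(docs, spec):
    """Return ids of records matching the spec; records are plain dicts."""
    if spec.operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    combine = all if spec.operator == "AND" else any
    hits = []
    for doc in docs:
        # Skip records outside the stated time frame.
        if not (spec.start <= doc["date"] <= spec.end):
            continue
        # Search only the listed fields, case-insensitively.
        text = " ".join(doc.get(f, "") for f in spec.fields).lower()
        if combine(term.lower() in text for term in spec.terms):
            hits.append(doc["id"])
    return hits


# Invented sample records, for illustration only.
docs = [
    {"id": 1, "date": date(2010, 3, 1), "subject": "Policy memo",
     "body": "Draft memo on the new policy."},
    {"id": 2, "date": date(2010, 5, 9), "subject": "Lunch",
     "body": "Friday works for me."},
    {"id": 3, "date": date(2008, 1, 15), "subject": "Policy question",
     "body": "Older thread, outside the time frame."},
]

spec = SearchSpec(terms=("policy", "memo"), operator="AND",
                  fields=("subject", "body"),
                  start=date(2009, 1, 1), end=date(2011, 12, 31))
hits = run_search(docs, spec)
print(spec.describe())
print(hits)
```

The point of the design is that the description and the search are generated from the same object, so what is reported is what was actually run.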

The court then turned to defendants’ argument that their searches should be deemed adequate, despite the court’s lack of information regarding search terms, “let alone Boolean operators, search fields, and time frames” and also addressed the argument that “[i]t [was] also unclear why custodians could not be trusted to run effective searches of their own files, a skill that most office workers employ on a daily basis.” The court first reasoned that “custodians cannot ‘be trusted to run effective searches,’ without providing a detailed description of those searches, because FOIA places a burden on defendants to establish that they have conducted adequate searches” which must be done by providing more than “merely conclusory statements,” including what terms were used, how they were combined, and whether they were run against the full text of documents. The court continued (footnotes omitted in deference to length):

The second answer to defendants’ question has emerged from scholarship and caselaw only in recent years: most custodians cannot be “trusted” to run effective searches because designing legally sufficient electronic searches in the discovery or FOIA contexts is not part of their daily responsibilities. Searching for an answer on Google (or Westlaw or Lexis) is very different from searching for all responsive documents in the FOIA or e-discovery context. Simple keyword searching is often not enough: “Even in the simplest case requiring a search of on-line e-mail, there is no guarantee that using keywords will always prove sufficient.” There is increasingly strong evidence that “[k]eyword search[ing] is not nearly as effective at identifying relevant information as many lawyers would like to believe.” As Judge Andrew Peck—one of this Court’s experts in e-discovery—recently put it: “In too many cases, however, the way lawyers choose keywords is the equivalent of the child’s game of ‘Go Fish’ … keyword searches usually are not very effective.”

There are emerging best practices for dealing with these shortcomings and they are explained in detail elsewhere. There is a “need for careful thought, quality control, testing, and cooperation with opposing counsel in designing search terms or ‘keywords’ to be used to produce emails or other electronically stored information.” And beyond the use of keyword search, parties can (and frequently should) rely on latent semantic indexing, statistical probability models, and machine learning tools to find responsive documents. Through iterative learning, these methods (known as “computer-assisted” or “predictive” coding) allow humans to teach computers what documents are and are not responsive to a particular FOIA or discovery request and they can significantly increase the effectiveness and efficiency of searches. In short, a review of the literature makes it abundantly clear that a court cannot simply trust the defendant agencies’ unsupported assertions that their lay custodians have designed and conducted a reasonable search.
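The iterative "teaching" the court describes can be illustrated with a toy sketch. The Python below uses a small naive Bayes word-count model as a stand-in for a statistical probability model. This is an assumption made for illustration, not the method used by the court, the parties, or any commercial review tool, and the function names and sample documents are invented.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """labeled: list of (text, is_responsive) pairs coded by a human reviewer."""
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for text, label in labeled:
        counts[label].update(tokenize(text))
        n_docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, n_docs, vocab


def score(model, text):
    """Log-odds that a document is responsive (above 0 leans responsive)."""
    counts, n_docs, vocab = model
    total = {c: sum(counts[c].values()) for c in (True, False)}
    # Smoothed prior from the number of documents coded each way.
    log_odds = math.log((n_docs[True] + 1) / (n_docs[False] + 1))
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Laplace-smoothed word likelihoods for each class.
        p_resp = (counts[True][word] + 1) / (total[True] + len(vocab))
        p_not = (counts[False][word] + 1) / (total[False] + len(vocab))
        log_odds += math.log(p_resp / p_not)
    return log_odds


# Round 1: a reviewer codes two invented seed documents.
labeled = [("detainer policy memo", True), ("cafeteria lunch menu", False)]
model = train(labeled)

unreviewed = ["revised detainer memo", "lunch schedule friday"]
ranked = sorted(unreviewed, key=lambda t: score(model, t), reverse=True)

# Round 2: the reviewer codes another document and the model is retrained.
labeled.append(("lunch schedule friday", False))
model = train(labeled)
```

Each round, the reviewer's new coding decisions feed back into the model, which then re-ranks the documents that have not yet been reviewed.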

Turning finally to its evaluation of the adequacy of defendants’ efforts in cases where search terms were tracked, the court accepted the conclusions of plaintiffs’ expert—enlisted to analyze defendants’ searches—that “many” of the searches “were not perfect.” Specifically, plaintiffs’ e-discovery expert indicated that “there [was] no indication that [the agencies] undertook any analysis to determine whether there were other words that should have been included in their search[es], including, for example, a review of a sample set of the documents that did not contain the … search terms” and noted an “absence of any evidence of a thoughtful process in selecting and testing search terms.” The court clarified, however, that it was not a question of perfection, but rather “whether the shortcomings on the part of the agencies made their searches ‘inadequate.’” As to that question, the court concluded that assessment of the adequacy of the searches was “impossible,” citing its awareness of “the limitations of keyword searching” and the “absence of evidence showing the efficacy of the terms used.”
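The expert's suggestion of reviewing "a sample set of the documents that did not contain the … search terms" can be sketched in a few lines of Python. The function names, the sample corpus, and the stand-in "reviewer" below are all hypothetical. In practice a person would review the sample, and the sample size would be chosen with statistical guidance.

```python
import random


def null_set(docs, terms):
    """Documents in which none of the search terms appear."""
    lowered = [t.lower() for t in terms]
    return [d for d in docs
            if not any(t in d["text"].lower() for t in lowered)]


def draw_sample(candidates, size, seed=0):
    """Seeded simple random sample, so the draw can be reproduced later."""
    rng = random.Random(seed)
    return rng.sample(candidates, min(size, len(candidates)))


def missed_rate(sample, is_responsive):
    """Share of sampled non-hit documents that the reviewer marks responsive."""
    if not sample:
        return 0.0
    return sum(1 for d in sample if is_responsive(d)) / len(sample)


# Invented corpus: document 2 is responsive but uses a synonym the term misses.
docs = [
    {"id": 1, "text": "Detainer form attached"},
    {"id": 2, "text": "Hold request for the county jail"},
    {"id": 3, "text": "Cafeteria menu for next week"},
    {"id": 4, "text": "Parking garage notice"},
]

non_hits = null_set(docs, terms=("detainer",))
sample = draw_sample(non_hits, size=3)


def reviewer(doc):
    # Stand-in for a human reviewer's judgment (an assumption of this sketch).
    return "hold request" in doc["text"].lower()


rate = missed_rate(sample, reviewer)
print(f"{rate:.0%} of sampled non-hit documents were responsive")
```

A non-trivial rate in the sample suggests the terms are missing responsive material, and the missed documents themselves (here, the "hold request" synonym) point to terms worth adding.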

In the end, the court acknowledged that “repeating vast swaths of the search in order to ensure adequacy is a waste of resources,” but, recognizing plaintiffs’ right to the information, ordered that the parties “work cooperatively” to design additional targeted searches, that some of the already-undertaken searches be repeated to help evaluate their adequacy, and that new searches of certain specified custodians be undertaken using search terms and methodologies agreed to by the parties.