Exploring a New Paradigm for Cleaning Efficacy


Just how efficacious are the cleaning and disinfection interventions performed in healthcare institutions? And what standard are hospitals using to evaluate cleaning efforts? While it has been suggested that the food industry cleanliness standard (surface bioburden level of <2.5 cfu/cm²) be adopted in healthcare as an indication of relative cleanliness, there is still a lack of conclusive evidence that these levels of contamination relate to the prevention of healthcare-associated infections (HAIs).

By Kelly M. Pyrek

A historical review to place this topic in perspective is in order here. As Dancer (2004) observed, “There may be a link between dirty hospitals and the rising numbers of hospital-acquired infections but there is little evidence to be able to substantiate this at present ... Unfortunately, the mechanisms for evaluating the quality of hospital cleaning regimens are limited.”

Dancer (2004) also outlined the challenges associated with trying to measure cleaning efficacy: “The difficulties in measuring cleaning efficacy are compounded by the lack of standardized methodologies and are rarely quantitative. Environmental screening usually takes place on an ad hoc basis after an outbreak, but it is patently impossible to screen the entire surface of a ward and finding the outbreak strain is not guaranteed. Furthermore, organisms still have to be transmitted to patients. As this is thought to occur via staff hands, strategies for controlling HAI are more likely to favor improvements in hand hygiene than comprehensive screening programs. Cost-benefit and lack of standardized methodologies might also explain the perceived reluctance of private cleaning companies to participate in screening. Certainly, most microbiologists would be cautious about taking environmental samples from hospital wards on a routine basis.” She adds, “No one set of standards exists for general hospital wards, however, and there is considerable variation in sampling methodologies and quantitative reporting. There are further differences in whether sampling is carried out routinely or in response to an infection incident. This makes it difficult to compare fluctuating situations in a ward, between wards and between different hospitals, let alone investigate specific levels of contamination in relation to infection risk.”

In a time when commercial ATP systems were either in their infancy or unavailable, Dancer (2004) called for bacteriological standards with which to assess clinical surface hygiene in hospitals, based on those used by the food industry. The first standard concerns any finding of a specific ‘indicator’ organism, the presence of which suggests a requirement for increased cleaning. Indicators would include Staphylococcus aureus, including methicillin-resistant S. aureus, Clostridium difficile, vancomycin-resistant enterococci and various Gram-negative bacilli. The second standard concerns a quantitative aerobic colony count of <5 cfu/cm2 on frequent hand-touch surfaces in hospitals. As Dancer (2004) noted, “As cleaning could be a cost-effective method of controlling HAI, it should be investigated as a scientific process with measurable outcome. To achieve this, it is necessary to adopt an integrated and risk-based approach. This would include preliminary visual assessment, rapid sensitive tests for organic deposits, and specific microbiological investigations.” She adds, “Both indicator organisms and those gathered within numerical counts can be identified, quantified, documented and audited. The methods required are simple, cheap and reproducible and could be adopted by any healthcare institution with access to a clinical microbiological laboratory. Furthermore, as evidence becomes available, these standards can be modified to reflect the overall risk of infection, and adapted to high-risk patients, high-risk units and emergency or outbreak situations.”

Lewis, et al. (2008) acknowledge that “Calls have been made for a more objective approach to assessing surface cleanliness. To improve the management of hospital cleaning the use of adenosine triphosphate (ATP) in combination with microbiological analysis has been proposed, with a general ATP benchmark value of 500 relative light units (RLU) for one combination of test and equipment.” In their study, Lewis, et al. (2008) used this same test combination to assess cleaning effectiveness in a 1,300-bed teaching hospital after routine and modified cleaning protocols. Based on the ATP results, the researchers proposed a revised, stricter pass/fail benchmark of 250 RLU for the range of surfaces used in the study. This benchmark was routinely achieved using modified best-practice cleaning procedures, which also reduced surface counts: for example, aerobic colony counts fell from >100 to <2.5 cfu/cm², and counts of Staphylococcus aureus fell from up to 2.5 to <1 cfu/cm² (95 percent of the time). The researchers say that benchmarking is linked to incremental quality improvements, that both the original suggestion of 500 RLU and the revised figure of 250 RLU can be used by hospitals as part of this process, and that both can also be used in the assessment of novel cleaning methods.

Al-Hamad and Maxwell (2008) concur that “Although microbiological standards have been proposed for surface hygiene in hospitals, standard methods for environmental sampling have not been discussed.” In their study, Al-Hamad and Maxwell (2008) sought to assess the effectiveness of cleaning and disinfection in critical-care units using the wipe-rinse method to detect an indicator organism and dip slides to quantitatively determine the microbial load. Frequent hand-touch surfaces from clinical and non-clinical areas were microbiologically surveyed, targeting both methicillin-susceptible (MSSA) and methicillin-resistant (MRSA) Staphylococcus aureus. A subset of the surfaces targeted was sampled quantitatively to determine the total aerobic count. MRSA was isolated from nine (6.9 percent) and MSSA was isolated from 15 (11.5 percent) of the 130 samples collected. Seven of 81 (8.6 percent) samples collected from non-clinical areas grew MRSA, compared with two (4.1 percent) from 49 samples collected from clinical areas. Of 116 sites screened for the total aerobic count, nine (7.7 percent) showed >5 cfu/cm2 microbial growth. Bed frames, telephones and computer keyboards were among the surfaces that yielded a high total viable count. There was no direct correlation between the findings of total aerobic count and MRSA isolation; however, Al-Hamad and Maxwell (2008) suggest that combining both standards will give a more effective method of assessing the efficacy of cleaning/disinfection strategy. They add that further work is required to evaluate and refine these standards in order to assess the frequency of cleaning required for a particular area, or for changing the protocol or materials used.

Without definitive standards for defining cleanliness, infection preventionists and environmental services directors must evaluate the literature as well as the current chemistries and technologies in the marketplace to determine a plan of action for their institutions. It can be a confusing process, as Bartlett (2014) observes, “Surfaces near patients are increasingly being recognized as important links in transmission of HAIs. It seems obvious that clean surfaces pose a lower risk for transmission than contaminated surfaces, but the relative contributions of different cleaning products, application devices, and new technologies are not clear. U.S. Environmental Protection Agency-approved disinfectants must demonstrate efficacy against viruses, bacteria, and spores, but there is no requirement to assess the clinical effectiveness of a product.”

Enter Philip Carling, MD, of Carney Hospital and Boston University School of Medicine, and colleagues, who hoped to learn, upon completion of their two-phase evaluation, about the clinical effectiveness of two surface disinfectants in a general acute-care hospital. What came out of this study is a new way of thinking about quantifying bioburden reduction while monitoring the possible impact of differences in cleaning thoroughness. Essentially, Carling and colleagues developed a system for the simultaneous assessment of both the cleaning process and cleaning products.

The products compared, both used as part of terminal room cleaning, were a traditional quaternary ammonium compound (QAC) and a novel peracetic acid/hydrogen peroxide disinfectant (ND). As a result of QAC cleaning, 93 (40 percent) of 237 cleaned surfaces confirmed by fluorescent marker removal were found to have complete removal of aerobic bioburden. During the ND phase of the study, bioburden was removed from 211 (77 percent) of 274 cleaned surfaces. Because there was no difference in the thoroughness of cleaning with either disinfectant (65.3 percent and 66.4 percent), the researchers say that the significant difference in bioburden reduction can be attributed to better cleaning efficacy with the ND.

In essence, the researchers concluded that in the context of the study design, the ND was 1.93 times more effective in removing bacterial burden than the QAC. Furthermore, the researchers say that the study design represents a new research paradigm in which two interventions can be compared by concomitantly and objectively analyzing both the product and process variables in a manner that can be used to define the relative effectiveness of all cleaning and disinfection interventions.
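The 1.93 figure is consistent with dividing the two reported removal rates. The short Python sketch below works through that arithmetic. It assumes the ratio was taken from the rounded percentages, since the article does not state how it was calculated, and the function name is illustrative only.

```python
def removal_rate(cleared, cleaned):
    """Fraction of cleaned surfaces with complete bioburden removal."""
    return cleared / cleaned


# Counts as reported for each phase of the study.
qac_rate = removal_rate(93, 237)   # quaternary ammonium compound phase
nd_rate = removal_rate(211, 274)   # peracetic acid/hydrogen peroxide phase

# Ratio of the rounded percentages quoted in the article (77 and 40).
ratio_from_percentages = 77 / 40

# Ratio of the unrounded rates, shown for comparison.
ratio_from_counts = nd_rate / qac_rate

print(f"QAC removal rate: {qac_rate:.1%}")
print(f"ND removal rate:  {nd_rate:.1%}")
print(f"Ratio (rounded percentages): {ratio_from_percentages:.3f}")
print(f"Ratio (raw counts):          {ratio_from_counts:.3f}")
```

The rounded percentages give 1.925, which matches the reported 1.93. The unrounded counts give a ratio of roughly 1.96, so the published figure appears to rest on the rounded percentages or on an adjustment not described here.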

Deshpande, et al. (2014) also compared the efficacy of a peracetic acid/hydrogen peroxide (PA/H2O2) sporicidal disinfectant with a 1:10 dilution of bleach against vancomycin-resistant enterococcus (VRE), MRSA and Clostridium difficile spores in a laboratory setting. The researchers reported that PA/H2O2 effectiveness was not affected by the presence of organic material, whereas bleach was significantly impaired by the presence of organic material. When used in a clinical setting, both bleach and PA/H2O2 eliminated C. difficile, MRSA, and/or VRE contamination on bed rails and bedside tables. On floors, where it was compared with a quaternary ammonium disinfectant, only PA/H2O2 significantly reduced C. difficile, MRSA, and/or VRE contamination.

Bartlett (2014) asserts that a significant limitation of the Carling and Deshpande studies is “the universal lack of data defining the relative risk for transmission of healthcare-associated pathogens based on specific levels of microbial contamination of surfaces (bioburden).” She adds, “Given that there is no evidence-based standard of ‘how clean is clean,’ interpretation of the reduction in bacterial burden is unclear. Some have suggested using the same threshold as in food preparation surfaces (<2.5 colony-forming units [CFU]/cm²), but whether this level of contamination is associated with a lower risk for transmission of healthcare-associated pathogens is unknown. In this study, only 1.7 percent of cleaned surfaces would have been defined as a failure (>2.5 CFU/cm²), and 85 percent of surfaces would have been counted as ‘clean’ prior to actual cleaning. Stated another way, using only post-cleaning colony counts, 98 percent of surfaces would be declared ‘clean,’ whereas using the fluorescent marker showed that only 66 percent of surfaces were ‘cleaned.’ Whether the increased effectiveness of PA/H2O2 compared with the quaternary ammonium disinfectant or 1:10 dilution of bleach will result in clinical reductions in transmission of environmental pathogens and improved patient outcomes requires further evaluation. However, the technique described by Carling and colleagues, which pairs evaluation of thoroughness of cleaning (using fluorescent marking) with effectiveness of the cleaning product itself (using colony counts from dip slides), is novel. The study authors commented on the potential for using this paradigm to study the relative clinical efficacy of other cleaning and disinfection products, materials, and technologies. Are microfiber cloths better than paper towels? Are disposable disinfectant wipes better than towels soaked in disinfectant? This brings us back to the important question: Does it matter for patients? We don’t have the answers yet, but the techniques described in these articles, combined with studies evaluating the impact of ‘better cleaning’ on patient outcomes, will be instrumental to advancing the science of preventing HAIs.”

One stumbling block could be the methodologies embraced by the Environmental Protection Agency (EPA) when evaluating and approving disinfectants for the disinfection of surfaces that harbor and transmit pathogens. As Carling, et al. (2014) explain, “Although the EPA process for evaluating the intrinsic efficacy of disinfectants in terms of their bactericidal, virucidal and sporicidal efficacy provides the basis for EPA certification labeling, actual assessment of clinical effectiveness is not used as part of the approval process.”

“It’s frustrating that we have no way of clinically evaluating this world of environmental hygiene,” Carling says. “The reality is that almost all studies out there on technologies such as HPV or ultraviolet light have been outbreak situations, and you can’t rely on outbreak situations to dictate practices in non-outbreak situations - it’s not good science. Now that we have some of these new disinfectants, it’s time to ask, are they really better and are they worth it in terms of expense over traditional cleaners and disinfectants. If these new disinfectants are as good as bleach, efficacy-wise, then why not consider them? But first you have to find a way of saying they are equivalent. The EPA approval system is limited. They are providing a level of basic laboratory efficacy of a chemistry. Not to say that’s not important, but there’s no reason not to take the next step in evaluating cleaning efficacy.”

The process that Carling and colleagues used is easily replicated by hospitals, he says. Before cleaning each room, 12 surfaces were concurrently marked with an invisible fluorescent marker and cultured for aerobic bacteria on an adjacent surface to the left of the fluorescent marker using an agar dip slide (10 cm2 surface area) with neutralizer. The slides were incubated for 24 hours, and the total aerobic colony count was determined by the test site microbiology laboratory. Following room cleaning, a black light was used to determine whether the fluorescent marker had been completely removed (the surface was cleaned) as previously described, and a second dip slide culture of the surface adjacent and to the right of the fluorescent marker site was obtained and processed. Culture results were recorded as the absolute number of aerobic colonies per slide after incubation (colony-forming units [CFU]/10 cm2). Carling says that surfaces that had no detectable aerobic bacteria before cleaning were excluded from additional analysis because the effectiveness of the disinfectant on such a surface could not be evaluated. Surfaces that had been overlooked during cleaning as evidenced by the persistence of the fluorescent marker were tabulated to determine the thoroughness of cleaning in each phase of the study. Only surfaces with no detectable bacterial burden (0 CFU) after documented cleaning were defined as effectively cleaned for the purpose of the study.
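The way these rules separate process from product can be sketched in a few lines of Python. This is a minimal illustration, not the study's own analysis: the class, the function and the sample readings are hypothetical, and it assumes that thoroughness is calculated over all marked surfaces, which the description above does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    pre_cfu: int          # colonies on the dip slide taken before cleaning
    post_cfu: int         # colonies on the dip slide taken after cleaning
    marker_removed: bool  # True if the black light shows the marker is gone


def summarize(surfaces):
    """Return (thoroughness, efficacy) for a list of marked surfaces."""
    if not surfaces:
        raise ValueError("at least one surface is required")

    # Process measure: share of marked surfaces that were actually wiped.
    cleaned = [s for s in surfaces if s.marker_removed]
    thoroughness = len(cleaned) / len(surfaces)

    # Product measure: only cleaned surfaces that carried detectable
    # bioburden beforehand can say anything about the disinfectant.
    evaluable = [s for s in cleaned if s.pre_cfu > 0]
    effective = [s for s in evaluable if s.post_cfu == 0]
    efficacy = len(effective) / len(evaluable) if evaluable else None

    return thoroughness, efficacy


# Hypothetical readings for four surfaces in one room.
room = [
    Surface(pre_cfu=12, post_cfu=0, marker_removed=True),   # cleaned, cleared
    Surface(pre_cfu=5, post_cfu=3, marker_removed=True),    # cleaned, not cleared
    Surface(pre_cfu=0, post_cfu=0, marker_removed=True),    # excluded from efficacy
    Surface(pre_cfu=8, post_cfu=8, marker_removed=False),   # overlooked
]

thoroughness, efficacy = summarize(room)
print(f"Thoroughness: {thoroughness:.0%}, efficacy: {efficacy:.0%}")
```

Keeping the two measures separate is the point of the design: an overlooked surface lowers thoroughness without being held against the disinfectant, and a surface that was wiped but still grows colonies counts against the product rather than the cleaner.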

As Carling, et al. (2014) explain, “Over the past decade, dip slide systems have been used extensively to study the epidemiology of bacterial healthcare environmental surface contamination but have not been used to compare the clinical efficacy of surface disinfectants. Culture swab samples have been used in the evaluation of disinfection cleaning of specific pathogens, such as VRE, MRSA, Clostridium difficile and Acinetobacter, but the high level of non-quantitative sensitivity (standard culture area is 100 cm2) of swab samples or sponge culture systems precludes their use in general bioburden evaluation. Furthermore, it is likely that the performance variability when quantitatively swabbing an estimated or template 100-cm2 surface is prone to both user variability and possible intervention bias. Although the dip slide system, because of its small culture surface, would be expected to have low sensitivity for frequently identifying by culture anaerobic healthcare-associated pathogens that are typically found at very low densities on or near patient surfaces, it is reasonable to assume that bioburden removal represents a good surrogate for healthcare-associated pathogen removal, because many studies have confirmed the absence of disinfectant resistance in pathogens such as MRSA, VRE, and drug-resistant Gram-negative organisms.”

The researchers add, “The modeling used in this study fulfills all elements of what Kuhn first described as a paradigm shift in 1962. Kuhn specifically noted that such a shift leads to a ‘reconstruction of investigation’ that allows for ‘the emergence of highly directed, or paradigm-based research.’ As a result of the described modeling, two interventions can be compared by concomitantly and objectively monitoring the ‘process and product’ elements of disinfection cleaning. This has potential for effectively defining the relative clinical efficacy of cleaning and disinfecting materials such as microfiber cloth, disposable disinfectant wipes, detergents without disinfectant activity, all forms of nontouch disinfection technologies, and self-disinfecting surfaces. Such studies may then begin to objectively clarify best practices for decreasing the risk of pathogen transmission from contaminated surfaces to patients through the use of various cleaning modalities and chemistries while providing guidance for more in-depth clinical studies of cost-benefit issues and healthcare-associated pathogen transmission prevention.”

“Dip slides work, and they are simple,” Carling says. “These dip slides are a few dollars apiece and they are easy to process and you get the data very quickly. It’s very low cost so it can be done easily in any facility. I assert that this kind of model for determining bioburden levels can be used to compare everything - there’s no reason why we can’t compare any type of intervention using this model - any technology, chemistry and surface.” Carling continues, “The biggest frustration in thinking about this model is I don’t know if anyone will actually bother to do this.”

The Centers for Disease Control and Prevention (CDC) guidance (Guh and Carling, 2010) describes the potential use of bioburden measurement both before and after cleaning to objectively evaluate the thoroughness of cleaning practice by culture swab, ATP monitoring systems, or dip slides as alternatives to fluorescent gel system monitoring.

As Carling, et al. (2014) note, “Although it is recommended that all U.S. hospitals use EPA-registered disinfectants for cleaning of surfaces that harbor and transmit microbial pathogens, the recent development of systems to directly measure the thoroughness of cleaning practice has led to objectively grounded educational and process feedback programs that have sustainably improved thoroughness of disinfection cleaning practice. Such systems can now be used to monitor cleaning thoroughness when comparing the clinical effectiveness of all disinfection interventions to improve near-patient surface cleanliness. By concomitantly evaluating the clinical bioburden impact of a disinfectant and, simultaneously, the objectively measured thoroughness of cleaning practice, it becomes possible to independently analyze both components of disinfection cleaning (practice and product) during actual clinical use. Given the propensity for cleaning to be more carefully performed when a new chemistry is being evaluated in an unblinded clinical study (Hawthorne effect), the ability to objectively evaluate such an influence substantially improves the ability to interpret possible differences in effectiveness when comparing with surface disinfectants.”

Despite efforts over the past 15 years, it has been extremely challenging to pinpoint a system that definitively defines the clinical relevance of the healthcare surface. As Carling says, “There had been some hope for 2.5 CFU/cm² as being a standard, but as of yet there is no scientific evidence of the relevance of such a value. The problem is that bioburden on either cleaned or uncleaned surfaces is very low and it has been difficult to develop a reproducible single value system with enough sensitivity and specificity to measure such a value accurately. Furthermore, reproducibly evaluating small and irregular objects adds to the challenge. The other point besides the fact that bioburden is low, is that for the pathogens we worry about, every one of them has a low infective dose. So just because a surface has a low level of bioburden, and you don’t find MRSA on that one little slide you just took off the surface doesn’t mean it’s not right next to it. The same goes for other pathogens such as norovirus and C difficile.”

Only one thing is certain right now: the lively dialogue over defining a standard of cleanliness, and over whether to use the 2.5 CFU/cm² standard, continues. As noted in the commentary by Carling and Huang, “Improving Healthcare Environmental Cleaning and Disinfection: Current and Evolving Issues,” in Infection Control and Hospital Epidemiology (2013), “Since the realistic goal of environmental cleaning and disinfection of patient care areas is not to produce a continuously sterile surface environment but rather to effectively decrease pathogen transmission, multi-center studies evaluating both environmental contamination as well as acquisition also have the potential for identifying a threshold of environmental contamination below which transmission and therefore disease risk is minimized. Identifying such a threshold for key healthcare pathogens could then facilitate additional studies using such a threshold as an acceptable ‘gold standard’ for minimizing disease risk.”

As Carling, et al. (2014) emphasize, “Our findings shed further light on the challenge of defining when an apparently clean healthcare surface might reasonably be considered bacterially contaminated enough to provide evidence of poor cleaning practice or be defined as dirty. Several years ago, it was suggested that the industrial hygiene threshold for defining food preparation surfaces as clean (aerobic colony count of less than 2.5 CFU/cm²) could be used to evaluate the cleanliness of near-patient surfaces in healthcare and that surfaces containing heavier bacterial bioburdens be defined as cleaning failures. Although a plausible concept, logistical limitations as well as the fact that the standard has yet to be correlated with the relative risk of transmission of healthcare-associated pathogens have been noted by several authors. If our data were used to compare the two chemistries by applying the proposed standard of greater than 2.5 CFU/cm² to define a cleaning failure, only seven of 425 tested surfaces would have failed to meet the standard following cleaning. As a consequence, no difference in the efficacy of the two chemistries would have been detected. If such a standard were used to assure analysis of only dirty surfaces (pre-cleaned surfaces with more than 2.5 CFU/cm²), none of the surfaces on which the ND was used had more than 2.5 CFU/cm² following cleaning. Moreover, despite the fact that 11.8 percent of surfaces cleaned with the QAC had bioburdens of more than 2.5 CFU/cm² following cleaning, the difference would not have been significant due to the small number of surfaces with more than 2.5 CFU/cm² bioburden before cleaning. All reported studies analyzing a broad range of near-patient surfaces have found, as we did, that only a small proportion (average of 16 percent) of healthcare surfaces have bioburdens of more than 2.5 CFU/cm² before cleaning. Although it is likely that the highly significant difference in the efficacy of the two chemistries seen when comparing total bioburden removal could be realized using the 2.5 CFU/cm² breakpoint definition with dip slides (or calibrated adenosine triphosphate [ATP] methods, which broadly correlate with aerobic colony counts) by measuring bioburden both before and after cleaning, such an approach would require a much larger study because of the need to exclude from analysis the approximately 85 percent of surfaces with bioburdens of less than 2.5 CFU/cm² before cleaning.”
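The contrast between the two definitions of a cleaning failure can be made concrete with a short Python sketch. It is an illustration only: it assumes the 10 cm² dip slide area given in the description of the study method, and the function names and sample counts are hypothetical.

```python
SLIDE_AREA_CM2 = 10.0          # dip slide culture area from the study method
BENCHMARK_CFU_PER_CM2 = 2.5    # proposed food-industry threshold


def cfu_per_cm2(colonies_per_slide):
    """Convert a per-slide colony count to a surface density."""
    return colonies_per_slide / SLIDE_AREA_CM2


def fails_benchmark(colonies_per_slide):
    """Failure under the proposed standard: more than 2.5 CFU/cm2."""
    return cfu_per_cm2(colonies_per_slide) > BENCHMARK_CFU_PER_CM2


def fails_study_definition(colonies_per_slide):
    """Failure under the study definition: any detectable colonies."""
    return colonies_per_slide > 0


# Hypothetical post-cleaning counts, in colonies per slide.
for count in (0, 10, 25, 26):
    print(
        f"{count:>2} colonies/slide = {cfu_per_cm2(count):.1f} CFU/cm2 | "
        f"benchmark failure: {fails_benchmark(count)} | "
        f"study failure: {fails_study_definition(count)}"
    )
```

On a 10 cm² slide the benchmark is only exceeded above 25 colonies, so a slide with 10 colonies passes the proposed standard while still counting as not effectively cleaned under the study's zero-colony definition. That gap is why applying the benchmark alone would have hidden the difference between the two products.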

Carling is quick to point out that his study’s results and comparative methodology should not be contemplated in a vacuum; he underscores the importance of coupling environmental hygiene with hand hygiene as part of a multi-modal approach to thinking about how to address surface bioburden and potential hand carriage of pathogens.

“We can’t talk about hand hygiene in isolation,” Carling says. “We must discuss hand hygiene and environmental hygiene as two sides of the same coin. You can’t do one without the other. And if you are not doing well with both types of interventions, then you are not doing right by your patients.”

This is echoed in the sentiments expressed by Dancer (2010), who emphasizes that “Infection control requires a multimodal focus encompassing a wide range of strategies. Each of these interventions plays an integral role in an overall hygiene program, which implies that all deserve consideration and resourcing. To single out hand hygiene while the wards remain grubby, antibiotic prescribing continues unabated, or hospitals overflow with patients will confound any possible benefits from clean hand programs.”


Al-Hamad A and Maxwell S. How clean is clean? Proposed methods for hospital cleaning assessment. J Hosp Infect. 2008;70(4):328-34. doi: 10.1016/j.jhin.2008.08.006.

Bartlett AH. How Clean Is Clean Enough -- And How Do We Get There? Medscape, Dec. 29, 2014. Accessible at: http://www.medscape.com/viewarticle/837282

Carling PC, Perkins J, Ferguson J, and Thomasser A. Evaluating a New Paradigm for Comparing Surface Disinfection in Clinical Practice. Infect Control Hosp Epidemiol. 2014;35(11).

Dancer SJ. How do we assess hospital cleaning? A proposal for microbiological standards for surface hygiene in hospitals. J Hosp Infect. 2004;56(1):10-5.

Dancer SJ. Control of transmission of infection in hospitals requires more than clean hands. Infect Control Hosp Epidemiol. 2010;31(9):958-960.

Deshpande A, Mana TS, Cadnum JL, et al. Evaluation of a Sporicidal Peracetic Acid/Hydrogen Peroxide-Based Daily Disinfectant Cleaner. Infect Control Hosp Epidemiol. 2014;35:1414-1416.

Guh A and Carling PC. The Environmental Evaluation Workgroup. Options for evaluating environmental cleaning. Atlanta: Centers for Disease Control and Prevention, 2010. Available at: http://www.cdc.gov/HAI/toolkits/Evaluating-Environmental-Cleaning.html

Lewis T, Griffith C, Gallo M, Weinbren M. A modified ATP benchmark for evaluating the cleaning of some hospital environmental surfaces. J Hosp Infect. 2008;69(2):156-63.

