Review
The hCOMET project: International database comparison of results with the comet assay in human biomonitoring. Baseline frequency of DNA damage and effect of main confounders
Mirta Milić a, Marcello Ceppi b, Marco Bruzzone b, Amaya Azqueta c,d, Gunnar Brunborg e, Roger Godschalk f, Gudrun Koppen g, Sabine Langie f, Peter Møller h, João Paulo Teixeira i,bh, Avdulla Alija j, Diana Anderson k, Vanessa Andrade l, Cristina Andreoli m, Fisnik Asllani j, Ezgi Eyluel Bangkoglu n, Magdalena Barančoková o, Nursen Basaran p, Elisa Boutet-Robinet q, Annamaria Buschini r, Delia Cavallo s, Cristiana Costa Pereira i,bh, Carla Costa i,bh, Solange Costa i,bh, Juliana Da Silva t, Cristian Del Bo' u, Vesna Dimitrijević Srećković v, Ninoslav Djelić w, Malgorzata Dobrzyńska x, Zdenka Ďuračková y, Monika Dvořáková y, Goran Gajski a, Serena Galati z, Omar García Lima aa, Lisa Giovannelli ab, Irina A. Goroshinskaya ac, Annemarie Grindel ad, Kristine B. Gutzkow e, Alba Hernández ae,af, Carlos Hernández ag, Kirsten B. Holven ah, Idoia Ibero-Baraibar ai, Inger Ottestad ah, Ela Kadioglu aj, Alena Kažimirová o, Elena Kuznetsova ak, Carina Ladeira al,am, Blanca Laffon an, Palma Lamonaca ao, Pierre Lebailly ap, Henriqueta Louro aq,ar, Tania Mandina Cardoso aa, Francesca Marcon m, Ricard Marcos ae,af, Massimo Moretti as, Silvia Moretti at, Mojgan Najafzadeh k, Zsuzsanna Nemeth au, Monica Neri ao, Bozena Novotna av, Irene Orlow aw, Zuzana Paduchova y, Susana Pastor ae,af, Hervé Perdry ax, Biljana Spremo-Potparević ay, Dwi Ramadhani az, Patrizia Riso u, Paula Rohr l, Emilio Rojas ba, Pavel Rossner av, Anna Safar au, Semra Sardas bb, Maria João Silva aq,ar, Nikolay Sirota ak, Bozena Smolkova bc, Marta Staruchova o, Rudolf Stetina bd, Helga Stopper n, Ekaterina I. Surikova ac, Stine M. Ulven ah, Cinzia Lucia Ursini s, Vanessa Valdiglesias be, Mahara Valverde ba, Pavel Vodicka bf, Katarina Volkovova o, Karl-Heinz Wagner ad, Lada Živković ay, Maria Dušinská bg, Andrew R. Collins ah, Stefano Bonassi ao,*
a Mutagenesis Unit, Institute for Medical Research and Occupational Health, Ksaverska cesta 2, 10000, Zagreb, Croatia
b Biostatistics Unit, San Martino Policlinic Hospital, Genoa, Italy
c Department of Pharmacology and Toxicology, University of Navarra, C/ Irunlarrea 1, 31008, Pamplona, Spain
d IdiSNA, Navarra Institute for Health Research, C/ Irunlarrea 3, 31008, Pamplona, Spain
e Department of Environmental Health, Section of Molecular Toxicology, Norwegian Institute of Public Health (NIPH), Lovisenberggt 6, 0456, Oslo, Norway
f School of Nutrition and Translational Research in Metabolism, Department of Pharmacology and Toxicology, University of Maastricht, Universiteitssingel 50, 6200 MD, Maastricht, the Netherlands
g Flemish Institute of Technological Research, Environmental Risk and Health unit VITO-BIOMo, Belgium
h Department of Public Health, Section of Environmental Health, University of Copenhagen, Oster Farimagsgade 5A, DK-1014, Copenhagen, Denmark
i Environmental Health Department, National Institute of Health Dr. Ricardo Jorge, Rua Alexandre Herculano, 321, 4000-055, Porto, Portugal
j Department of Biology, University of Prishtina, George Bush, N.N., 10000, Prishtina, Kosovo
k Biomedical Sciences Department, University of Bradford, Richmond Road Bradford, Bradford, West Yorkshire, BD7 1DP, UK
l Laboratory of Translational Biomedicine, University of Southern Santa Catarina, UNESC, Criciúma, SC, Brazil
m Department of Environment and Health, Istituto Superiore di Sanità, Viale Regina Elena 299, Rome, Italy
n Institute of Pharmacology and Toxicology, University of Wuerzburg, Versbacher Strasse 9, 97078, Wuerzburg, Germany
o Institute of Biology, Medical Faculty, Slovak Medical University, Limbova 12, 83303, Bratislava, Slovakia
p Department of Pharmaceutical Toxicology, Faculty of Pharmacy, Hacettepe University, Ankara, Turkey
q Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
r Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 11A, 43124, Parma, Italy
s Department of Occupational and Environmental Medicine, Epidemiology and Hygiene (DiMEILA), Italian Workers' Compensation Authority (INAIL), Via Fontana Candida 1, 00078, Monte Porzio Catone (Rome), Italy
t Laboratory of Genetic Toxicology, Lutheran University of Brazil (ULBRA), Av. Farroupilha 8001, Prédio 22/Sala 22, 92425-900, Canoas, RS, Brazil
u Department of Food, Environmental and Nutritional Sciences (DeFENS), University of Milan, Via Celoria 2, 20133, Milan, Italy
v Faculty of Medicine, Clinic for Endocrinology, Diabetes and Metabolic Disease, University of Belgrade, Dr Subotica 13, Belgrade, Serbia
w Department of Biology, Faculty of Veterinary Medicine, University of Belgrade, Oslobodjenja Blvd 18, 11000, Belgrade, Serbia
x Department of Radiation Hygiene and Radiobiology, National Institute of Public Health - National Institute of Hygiene, 24 Chocimska Street, 00-791, Warsaw, Poland
y Institute for Medical Chemistry, Biochemistry and Clinical Biochemistry, Faculty of Medicine, Comenius University, Sasinkova 2, Bratislava, Slovakia
z Centre for Molecular and Translational Oncology, University of Parma, Parco Area delle Scienze 11A, 43124, Parma, Italy
aa Center for Radiation Protection and Hygiene, Calle 20, No 4113, e/ 41 y 47. Playa. C.P. 11300, La Habana, A.P. 6195, C.P. 10600, Habana, Cuba
ab Department NEUROFARBA, University of Florence, Viale G. Pieraccini 6, 50139, Florence, Italy
ac Laboratory for the Study of the Pathogenesis of Malignant Tumors, National Medical Research Center for Oncology, 14 line 63, 344037, Rostov-on-Don, Russia
ad Department of Nutritional Sciences, University of Vienna, Althanstrasse 14, 1090, Vienna, Austria
ae Department of Genetics and Microbiology, Faculty of Biosciences, Universitat Autònoma de Barcelona, 08193, Cerdanyola del Vallès (Barcelona), Spain
af Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Carlos III Institute of Health, 28029, Madrid, Spain
ag Department of Biochemistry, Instituto de Ciencias Básicas y Preclínicas “Victoria de Giron”, 146 St. and 31 Ave, No 3102, Playa, Habana, Cuba
ah Department of Nutrition, Institute of Basic Medical Sciences, University of Oslo, Sognsvannsveien 9, 0372, Oslo, Norway
ai Department of Nutrition, Food Science and Physiology, Centre for Nutrition Research, University of Navarra, Irunlarrea 1, 31008, Pamplona, Navarra, Spain
aj Toxicology Department, Faculty of Pharmacy, Gazi University, Ankara, Turkey
ak Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, 142290, Institutskaya 3, Pushchino, Moscow Region, Russia
al H&TRC - Health & Technology Research Center, ESTeSL - Escola Superior de Tecnologia da Saúde, Instituto Politécnico de Lisboa, 1990-096, Lisbon, Portugal
am NOVA National School of Public Health, Public Health Research Centre, Universidade NOVA de Lisboa, Lisbon, Portugal
an Grupo DICOMOSA, Centro de Investigaciones Científicas Avanzadas (CICA), Departamento de Psicología, Facultad de Ciencias de la Educación, Universidade da Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
ao IRCCS San Raffaele Pisana, Unit of Clinical and Molecular Epidemiology, Department of Human Sciences and Quality of Life Promotion, San Raffaele University, Via di Val Cannuta, 247., 00161, Rome, Italy
ap ANTICIPE Unit, INSERM & University of Caen-Normandie Centre François Baclesse, Avenue du Général Harris 14076, Caen Cedex 05, France
aq Human Genetics Department, National Institute of Health Doutor Ricardo Jorge, Av. Padre Cruz, 1649-016, Lisbon, Portugal
ar ToxOmics, NMS, NOVA University of Lisbon, Lisbon, Portugal
as Department of Pharmaceutical Sciences (Unit of Public Health), University of Perugia, Via del Giochetto, 06122, Perugia, Italy
at Department of Health Sciences, University of Florence, Division of Dermatology, Palagi Hospital, Viale Michelangelo 41, Florence, Italy
au Department of Non-ionizing Radiation, National Public Health Center, Anna Street 5, 1221, Budapest, Hungary
av Department of Nanotoxicology and Molecular Epidemiology, Institute of Experimental Medicine of the Czech Academy of Sciences, Videnska 1083, Prague, Czech Republic
aw Memorial Sloan Kettering Cancer Center, Epidemiology and Biostatistics, New York, New York, 10065, USA
ax Univ Paris-Saclay, CESP, Villejuif, France
ay Center of Biological Research, Faculty of Pharmacy, University of Belgrade, Vojvode Stepe, 450, Belgrade, Serbia
az Center for Radiation Safety Technology and Metrology, National Nuclear Energy Agency of Indonesia, Jl. Lebak Bulus Raya No. 49, Kotak Pos 7043 JKSKL Jakarta Selatan, 12440, Jakarta, Indonesia
ba Genomic Medicine and Environmental Toxicology, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, CU, Mexico
bb Toxicology Department, Faculty of Pharmacy, Istinye University, Istanbul, Turkey
bc Cancer Research Institute, Biomedical Research Center of the Slovak Academy of Sciences, Dubravska cesta 9, 84505, Bratislava, Slovakia
bd Department of Toxicology and Military Pharmacy, Faculty of Military Health Sciences, University of Defence, Trebesska 1575, 50001, Hradec Kralove, Czech Republic
be Grupo DICOMOSA, Centro de Investigaciones Científicas Avanzadas (CICA), Departamento de Biología, Facultad de Ciencias, Universidade da Coruña, Campus A Zapateira s/n, 15071, A Coruña, Spain
bf Experimental Medicine, Molecular Biology of Cancer, IEM AVCR, Videnska 1083, Prague 4, Prague, Czech Republic
bg NILU, Health Effects Laboratory, Kjeller, Norway
bh EPIUnit - Instituto de Saúde Pública, Universidade do Porto, Rua das Taipas, no 135, 4050-600, Porto, Portugal

* Corresponding author at: Department of Human Sciences and Quality of Life Promotion, San Raffaele University, Head, Unit of Clinical and Molecular Epidemiology, IRCCS San Raffaele Pisana, Via di Val Cannuta, 247, 00166, Rome, Italy.
E-mail address: [email protected] (S. Bonassi).

Mutation Research/Reviews in Mutation Research
Journal homepage: www.elsevier.com/locate/reviewsmr
Community address: www.elsevier.com/locate/mutres
http://dx.doi.org/10.1016/j.mrrev.2021.108371
1383-5742/© 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
ARTICLE INFO
Article history:
Received 9 May 2020
Received in revised form 25 January 2021
Accepted 27 January 2021
Available online 6 February 2021
Keywords: Comet assay; DNA damage; Pooled analysis; Human biomonitoring; Biomarkers
ABSTRACT
The alkaline comet assay, or single cell gel electrophoresis, is one of the most popular methods for assessing DNA damage in human populations. One of the open issues concerning this assay is the identification of those factors that can explain the large inter-individual and inter-laboratory variation. International collaborative initiatives such as the hCOMET project, a COST Action launched in 2016, represent a valuable tool to meet this challenge. The aims of hCOMET were to establish reference values for the level of DNA damage in humans, to investigate the effect of host factors, lifestyle and exposure to genotoxic agents, and to compare different sources of assay variability. A database of 19,320 subjects was generated, pooling data from 105 studies run by 44 laboratories in 26 countries between 1999 and 2019. A mixed random effect log-linear model, in parallel with a classic meta-analysis, was applied to take into account the extensive heterogeneity of data, due to descriptor, specimen and protocol variability. As a result of this analysis, interquartile intervals of DNA strand breaks (which include alkali-labile sites) were reported for tail intensity, tail length, and tail moment (comet assay descriptors). A small variation by age was reported in some datasets, suggesting higher DNA damage in the oldest age-classes, while no effect could be shown for sex or smoking habit, although the lack of data on heavy smokers has still to be considered. Finally, highly significant differences in DNA damage were found for most exposures investigated in specific studies. In conclusion, these data, which confirm that DNA damage measured by the comet assay is an excellent biomarker of exposure in several conditions, may contribute to improving the quality of study design and to the standardization of results of the comet assay in human populations.
Contents
1. Introduction
2. Materials and methods
2.1. The hCOMET database
2.2. Laboratory methods
2.3. Statistical analysis
3. Results
3.1. Background frequency of DNA damage
3.2. Distribution of DNA damage by age and sex
3.3. Other major confounding factors
4. Discussion
5. Conclusions
Funding
Author contribution
Acknowledgements
References
1. Introduction

The alkaline comet assay, or single cell gel electrophoresis, is one of the most popular methods for the assessment of DNA damage in human populations [1–5]. The test is quick, simple, and sensitive, qualities that make the comet assay very well suited to study the effect of exposure to occupational or environmental mutagens, to quantify the influence of lifestyle, diet and dietary supplementation on genetic stability, and to check the levels of DNA damage in relation to various diseases [6], in fresh or frozen samples of human cells. The presence of DNA damage, or its inadequate repair, is a well-known factor in the etiology of cancer and other non-communicable diseases, and therefore the comet assay can in theory be used to detect early genetic damage that indicates the risk of future disease [7]. One of the open issues concerning this assay is the identification of those factors that can explain the large inter-individual and inter-laboratory variation in results. Several potential sources of variability of the assay have been identified in human biomonitoring studies, including technical heterogeneity, lifestyle, and individual characteristics. Differences in protocols and in the interpretation of the outcomes compound the effect of biological variability, making results of multi-centre biomonitoring studies with the comet assay difficult to interpret, and greatly limiting the potential of this approach. To warrant widespread acceptance and credibility in human population studies, the comet assay requires standardization of the protocol and a better knowledge of critical features affecting the assay outcomes, and of the role of host factors. These challenges can be met by international collaborative initiatives involving a large network of laboratories [8]. Here we are interested specifically in levels of background DNA damage, i.e., DNA strand breaks (and alkali-labile sites) present in the cells (whether from control or exposed individuals), measured with the comet assay, without any treatment of cells ex vivo. The large pooled database resulting from such efforts represents a valuable tool to evaluate the assay's performance and to study the causes and consequences of DNA damage. Initiatives have been proposed by international bodies and individual research groups, especially within the European Comet Assay Validation Group (ECVAG) [9–16]. An international collaborative study on the comet assay, originally known by the name of ComNet, was launched during the International Comet Assay Workshop meeting held in Kusadasi, Turkey (September 13–16, 2011) with the aim of investigating whether the comet assay is a reliable and validated biomarker assay for use in human biomonitoring [17]. Building on these efforts, the hCOMET COST (European Cooperation in Science and Technology) Action (CA15132) has assembled a pooled dataset of nearly 20,000 individuals, collecting data from existing studies, and protocol details from most laboratories in the field.

The aims of this first paper analysing the large hCOMET dataset are: 1) to establish reference values for background levels of DNA damage, 2) to investigate the effect of host factors, lifestyle and exposure to genotoxic agents on the level of DNA damage measured by the comet assay, and 3) to model and compare different sources of the assay's variability.
2. Materials and methods

2.1. The hCOMET database

More detail about the creation of the hCOMET database can be found in the previous publications of the ComNet consortium, which indicated the strategy later followed by hCOMET [6,17]. In brief, 44 laboratories contributed data referring to 19,320 subjects within the framework of the hCOMET EU COST Action CA15132. Data included in the database were generated by 105 studies published between 1999 and 2019. Studies using the comet assay in whole blood (i.e., white blood cells), peripheral blood mononuclear cells (commonly referred to as lymphocytes), or in other tissues in human populations, with an epidemiological design, the presence of a control group, and an adequate description of the protocol(s) used, were included in the analysis. Whenever available in the original set of data, detailed information was collected on those parameters that according to published literature were potentially relevant for the technical, biological or epidemiological features of the comet assay, namely demographic parameters, lifestyle, occupational exposure, smoking habit, diet-related factors, genetic profile, and diagnoses of chronic disease. The large heterogeneity among the laboratories contributing data in terms of quality and quantity of information collected did not allow fine-tuning in the analysis of parameters such as smoking and occupational exposures to genotoxic agents. For the latter, subjects were broadly classified as exposed individuals or unexposed controls according to the criteria used to define exposure in the original studies. The list of types of exposure is heterogeneous and includes occupational, environmental, and lifestyle exposures to genotoxic agents (e.g., antineoplastic drugs, anesthetics, dust, ceramic material, styrene, air pollution, formaldehyde, metal and metalloids, polycyclic aromatic hydrocarbons, dyes, pesticides, tobacco farming and manufacturing, ionizing and non-ionizing radiation, diet-related factors, alcohol, and smoking habit). In some studies, the presence of diseases or disease-associated conditions was considered (e.g., cancer, cardiovascular diseases, diabetes, obesity, kidney failure, hemodialysis, renal transplant, inflammatory bowel disease). This approach made it possible to identify a large group of presumably unexposed subjects representing a robust reference group for several analyses.

The 133 papers (referring to 105 studies) from which data were included in the hCOMET dataset, as communicated by individual contributors, are listed in Supplementary Table 1. Data gathering was coordinated by the IRCCS San Raffaele Pisana, Rome, Italy, and the Institute for Medical Research and Occupational Health, Zagreb, Croatia. The pooled analysis of data was approved by the ethics committee of the IRCCS San Raffaele Pisana, Rome, Italy (12 December 2015, Prot. N. 10/15), i.e., the centre coordinating data collection and running the statistical analysis of data. Each study had already received ethical approval from local ethical committees for the collection and analysis of individual coded data in accordance with the Declaration of Helsinki, and all the measures of the General Data Protection Regulation (EU) 2016/679 (GDPR) were respected. The list of laboratories contributing data used for this paper, with the corresponding codes used for statistical analysis, is given in Supplementary Table 2, while their geographical distribution is mapped in Fig. 1. The identification of technical issues that most affect the assay descriptors will be evaluated in another publication, while results involving the enzyme-modified comet assay (testing for damage such as oxidised bases) were too sparse for a proper statistical analysis. Therefore, this paper will present only data referring to the frequency of DNA strand breaks (including alkali-labile sites) as evaluated by the classic alkaline version of the assay.
2.2. Laboratory methods

An extensive questionnaire collecting technical details of the protocols used was filled in by participating laboratories, but to characterize the dataset and describe critical sources of technical variability, only the main differences between laboratories were evaluated, namely 1) use of fresh (57.5% of subjects) or frozen specimens (41.6%), 2) use of standard [1] (46.0%) or modified protocols, e.g., with the addition of 10% dimethyl sulfoxide (DMSO) in the lysis step (44.0%), 3) use of different staining methods, of which the most common were DAPI (4′,6-diamidino-2-phenylindole) (10.1%), SYBR® Gold Nucleic Acid Gel Stain (34%), and ethidium bromide (30.1%). Minor differences were found for the concentration of the (low melting) agarose layer with the cells (from 0.5 to 0.8%), duration of the electrophoresis step (21.8% lasted more than 20 min), and voltage used during the electrophoresis (from 20 to 30 V using electrophoresis tanks of different lengths, resulting in a wide range of potential gradients (V/cm)). The number of cells scored was variable, although a minimum of 100 cells were scored for 92.8% of specimens. Finally, in 23.1% of subjects the analyzed specimen was whole blood rather than isolated lymphocytes. Given the fragmented distribution of these parameters among laboratories, only the major discrepancies (in terms of numbers and biological significance) were included in the multivariate modelling, i.e., cell type (isolated lymphocytes or whole blood) and sample processing (fresh/frozen sample).
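The potential gradient mentioned above is the applied voltage divided by the distance between the electrodes, so the same voltage produces different field strengths in tanks of different lengths. A minimal Python sketch of this arithmetic follows; the function name and the tank lengths are hypothetical illustration values, not figures reported by the participating laboratories.

```python
def potential_gradient(voltage_v: float, electrode_distance_cm: float) -> float:
    """Return the electrophoresis potential gradient in V/cm."""
    if electrode_distance_cm <= 0:
        raise ValueError("electrode distance must be positive")
    return voltage_v / electrode_distance_cm


# Voltages span the 20-30 V range quoted in the text;
# the tank lengths (cm) are hypothetical.
for volts, length_cm in [(20, 20.0), (25, 25.0), (30, 40.0)]:
    gradient = potential_gradient(volts, length_cm)
    print(f"{volts} V over {length_cm} cm -> {gradient:.2f} V/cm")
```

Under these assumed lengths, a tank run at the highest voltage can still have the lowest gradient, which is why V/cm is a more comparable quantity than voltage alone.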
Fig. 1. Geographical distribution of the 26 countries contributing data to the hCOMET dataset (dark grey).

Table 1
Spontaneous frequency of DNA damage in the individual datasets contributed by participating laboratories. 1a) tail length; 1b) % tail DNA; 1c) tail moment; and 1d) visual scoring (arbitrary units).

1a) Tail Length (µm)
Laboratory      N       Mean (SD)       Median   25-75 Percentile
AS4             77      49.91 (11.7)    48.9     41.1–59.9
CSA5            138     39.9 (22.7)     42.5     17.6–54.1
CSA6            129     27.3 (6.6)      26.6     22.9–30.1
EU12            42      10.1 (3.3)      9.6      7.9–13.2
EU15            101     28.3 (18.1)     21.3     15.0–37.3
EU17            18      48.2 (4.2)      47.5     45.9–51.5
EU25            76      3.1 (1.1)       3.1      2.4–3.6
EU26            973     36.2 (19.0)     28.2     21.7–47.1
EU30            92      1.5 (1.0)       1.5      0.7–2.0
EU32            562     41.8 (14.0)     46.7     41.4–50.6
EU33            97      30.2 (14.7)     32.5     16.5–41.9
EU36            302     45.1 (20.2)     38.9     29.5–54.1
EU4             171     13.9 (1.2)      13.9     13.1–14.6
EU41            187     6.2 (3.7)       5.40     3.5–8.5
EU50            37      21.8 (5.9)      21.4     18.1–23.7
TOTAL           3,002   32.3 (20.4)     28.6     16.6–47.1
Controls only   1,228   28.6 (17.5)     26.6     14.2–43.6

1b) Tail Intensity (%)
Laboratory      N       Mean (SD)       Median   25-75 Percentile
AS4             77      4.5 (1.4)       4.2      3.6–5.0
EU12            42      0.01 (0.01)     0.02     0.01–0.02
EU15            101     24.0 (9.1)      23.6     18.8–31.1
EU17            115     7.4 (4.6)       5.4      4.6–9.1
EU18            57      6.7 (2.1)       6.0      5.5–7.0
EU22            752     8.1 (7.3)       7.2      1.1–14.6
EU23            86      1.2 (1.2)       0.8      0.4–1.7
EU24            92      13.8 (9.0)      13.3     5.2–21.1
EU25            76      3.1 (1.0)       3.0      2.4–3.3
EU26            1,076   9.5 (7.3)       7.2      4.3–12.2
EU27            17      2.9 (0.7)       3.2      2.5–3.4
EU28            32      6.5 (2.8)       6.5      4.0–8.7
EU3             1,637   2.7 (2.3)       2.0      1.1–3.6
EU30            92      0.9 (0.9)       0.6      0.3–1.2
EU31            142     30.0 (16.0)     29.0     15.5–39.0
EU32            494     0.7 (1.0)       0.2      0.1–0.8
EU33            1,061   16.5 (12.2)     12.9     7.0–21.3
EU34            170     6.4 (3.1)       6.4      3.8–8.2
EU36            441     6.7 (2.5)       6.6      5.0–8.3
EU37            287     10.7 (2.4)      10.4     9.2–12.6
EU4             111     1.1 (0.6)       1.1      0.7–1.6
EU41            187     7.0 (3.6)       6.4      4.3–9.2
EU43            162     2.7 (2.1)       2.1      1.2–3.6
EU45            93      1.4 (4.3)       0.4      0.1–1.0
EU46            329     3.2 (1.9)       2.8      1.8–4.0
EU50            37      6.4 (1.5)       6.2      5.8–7.1
EU6             325     1.1 (1.0)       0.9      0.3–1.6
OCNA1           202     9.5 (5.6)       7.7      6.2–11.0
TOTAL           8,293   7.4 (8.8)       4.5      1.6–9.9
Controls only   4,625   4.8 (5.5)       2.9      1.1–6.7

1c) Tail Moment (arbitrary units)
Laboratory      N       Mean (SD)       Median   25-75 Percentile
AS4             77      1.6 (0.4)       1.5      1.3–1.8
CSA5            35      2.0 (1.3)       1.4      1.2–2.4
CSA6            129     1.3 (1.0)       1.1      0.5–1.7
EU12            221     0.4 (0.4)       0.4      0.1–0.7
EU15            173     27.3 (15.7)     25.2     16.0–34.8
EU21            151     1.8 (1.7)       1.4      0.7–2.4
EU23            86      0.6 (0.9)       0.3      0.1–0.9
EU25            76      0.3 (0.2)       0.2      0.1–0.3
EU26            824     2.1 (2.7)       1.2      0.6–2.6
EU30            92      0.5 (1.3)       0.01     0.01–0.2
EU33            273     10.2 (13.5)     3.0      1.3–18.4
EU36            302     1.4 (0.7)       1.4      0.9–1.8
EU4             111     0.1 (0.1)       0.1      0.1–0.2
EU8             41      0.3 (0.2)       0.3      0.2–0.4
OCNA1           202     1.5 (1.9)       0.9      0.6–1.6
TOTAL           2,793   3.96 (8.9)      1.1      0.5–2.4
Controls only   1,181   2.8 (6.9)       0.9      0.4–1.8

2.3. Statistical analysis

The most commonly used descriptors of the comet assay were considered for statistical analysis, i.e., tail length (TL), % tail DNA (%T), tail moment (TM), and visual scoring (VS). Descriptive statistics referring to the background level of DNA damage report both mean and median as measures of central tendency, and the standard deviation and interquartile range as measures of variability. The frequency of these descriptors is reported 1) for each laboratory included in the study, 2) for the total of the hCOMET dataset, and 3) for the subgroup of subjects classified as controls in the original studies. As regards VS, most laboratories used 5 classes of cell damage accounting for a maximum possible value of 400, while others used 3 or fewer classes. This difference in scoring contributed to the great deal of heterogeneity observed for this descriptor, and therefore no reference values or pooled estimates of effect by age or sex have been reported for VS. Studies using tissues other than whole blood or isolated lymphocytes, mostly exfoliated epithelial cells, were not included in the statistical analyses. The dataset was searched for subjects with repeated values (such as in before-and-after studies or in extended surveillance programs), and only baseline measures were left in the dataset. These restrictions brought the number of subjects to 15,421. A further screening identified outliers, technical errors, subjects without valid measures, and a group of newborns, whose values were hardly comparable with adult groups. After removing these cases, the final valid number of subjects included in the statistical analyses was 13,553.

Univariate parametric tests were used on the original or log-transformed data for all analyses according to Lovell and Omori [18]. Since data from each individual study can be considered as a cluster of correlated observations, all regression models considered the within- and between-studies variance components. A mixed random effect log-linear model (REM) including the fixed effects of age, sex, smoking habit, cell populations (isolated lymphocytes or whole blood), and cell processing (fresh or frozen samples), and the random effects of laboratory, study and subjects, was fitted to the whole dataset, providing a suitable range of expected values [19,20]. The use of REM allowed estimation of the adjusted mean ratio (MR), i.e., the ratio between the average values of DNA damage in the categories to be compared. REM also allowed us to estimate the Variance Partition Coefficient (VPC), which is a summary measure expressing the relative contribution of each component to the total variance of data. To better explore the relationship between the descriptor of the comet assay and age, the restricted cubic spline technique was applied from 0 to 70 years of age [21]. To allow a more straightforward and reliable interpretation of results, estimates provided by REM are accompanied in each dataset, when suitable, by a classic meta-analysis. The forest plot, together with meta-estimates, has been reported for data on sex (females vs. males) and genotoxic exposure (subjects reported as exposed in original studies vs. non-exposed subjects). This approach was not used to study the effect of age (because of the non-linearity of the relationship with DNA damage), or the effect of smoking habit (given the inter-study heterogeneity in the quality of data). Age was included in multivariate models as a continuous variable, while smoking could be evaluated only by comparing subjects reporting to have ever smoked (ever-smokers) with those that declared to have never smoked (non-smokers).
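The two quantities derived from the mixed model can be illustrated with a short Python sketch. It assumes, as is usual for a log-linear model, that a fixed-effect coefficient estimated on the log scale is back-transformed with the exponential to give the MR, and that the VPC of a level is its variance component divided by the sum of all components. The function names and all numeric inputs are hypothetical, chosen only for illustration; they are not hCOMET estimates and this is not the authors' SPSS or STATA code.

```python
import math


def mean_ratio(beta: float, se: float, z: float = 1.96):
    """Back-transform a log-scale coefficient into (MR, lower, upper) 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def variance_partition(components: dict) -> dict:
    """Share of total variance attributable to each level (VPC)."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: var / total for level, var in components.items()}


# Hypothetical inputs for illustration only.
mr, lower, upper = mean_ratio(beta=0.30, se=0.08)
vpc = variance_partition({"laboratory": 0.60, "study": 0.15, "subject": 0.25})

print(f"MR = {mr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
for level, share in vpc.items():
    print(f"VPC[{level}] = {share:.0%}")
```

An MR above 1 indicates a higher expected level of DNA damage in the first category of the comparison, and the VPC values sum to 1 by construction.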
The MR for the variables included in the meta-analyses was computed for each study, adjusting for the other confounding factors. Subsequently, a pooled value was estimated as suggested by DerSimonian and Laird [22]. The main difference between these approaches is that a meta-analysis takes into account only the between-study heterogeneity, while multilevel modelling quantifies variance components at each level. SPSS 26 and STATA 14.2 statistical software were used for all analyses.
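The pooling step can be sketched in Python as follows. This is a generic implementation of the DerSimonian and Laird random-effects estimator applied to study-level log mean ratios, written under the assumption that each study supplies a log MR and its standard error; it is not the authors' code, and the input values are hypothetical.

```python
import math


def dersimonian_laird(log_mr, se, z=1.96):
    """Pool study-level log mean ratios with the DerSimonian-Laird estimator."""
    if len(log_mr) != len(se) or len(log_mr) < 2:
        raise ValueError("need at least two studies with matching inputs")
    w = [1.0 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_mr)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_mr))  # Cochran's Q
    df = len(log_mr) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    w_re = [1.0 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_mr)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "mr": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical study-level mean ratios and log-scale standard errors.
studies_mr = [1.10, 1.45, 1.30, 0.95]
studies_se = [0.10, 0.12, 0.08, 0.15]
result = dersimonian_laird([math.log(m) for m in studies_mr], studies_se)
print(result)
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect weights, so homogeneous studies are pooled exactly as in a fixed-effect analysis.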
3. Results

3.1. Background frequency of DNA damage

To identify the range of values that is likely to represent the background frequency of DNA damage, we investigated a dataset of 13,553 subjects with valid results from at least one of the descriptors of the comet assay. Descriptive statistics concerning the descriptors considered are reported in Table 1 (a-d). The most popular descriptor was %T, with a total of 8293 subjects screened for this parameter, measured by 28 out of 40 laboratories contributing valid data to hCOMET (70.0%). The other three descriptors (TL, TM, and VS) were evaluated in 3002, 2793, and 4513 subjects, respectively. The numbers of laboratories using these descriptors were 15 for TL, 14 for TM, and 13 for VS, corresponding to 37.5, 35.0, and 32.5% of the total, respectively. A graphical summary of the values of each descriptor per laboratory using box-and-whiskers plots is reported in Fig. 2 (statistics refer to unexposed cases only). Model estimates of the coefficient of variation (CV) based on the expected values, i.e., taking into account the effect of laboratory, study, and repeated measures, showed CVs of 42.9%, 92.6%, and 305% for TL, %T, and TM, respectively (data not shown). As expected, a positively skewed distribution was generally observed for all descriptors, particularly severe for TM, due to the frequent presence of extreme values. In most cases, due to departure from normality, the median value is the most reliable index of central tendency, and the interquartile interval of each descriptor provides the most reliable range of frequent results. However, to provide a complete view of descriptive statistics, the mean, the standard deviation (SD), and the corresponding confidence interval are also reported. According to these results, a typical biomonitoring study using %T as the descriptor of DNA damage should estimate in most cases an overall median value falling between 1.6 and 9.9%. This interval does not change markedly when restricted to the group of subjects designated as controls (4625 subjects), and in this case the interquartile interval falls between 1.1 and 6.7%. A summary of reference values for different descriptors of the comet assay is given in Table 2, together with the expected values from multivariate modelling. Measures of central tendency and variability intervals are reported for the overall dataset and for the subgroup of subjects classified as controls in the original studies. The expected distributions of TL, %T, and TM estimated by the mixed model did not differ from univariate estimates, but should be preferred since the main confounders are taken into account and the influence of extreme values is smoothed.
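The descriptive statistics discussed above (mean, SD, median, interquartile interval, and CV) can be computed as in the following Python sketch. The %T values are hypothetical and deliberately right-skewed to mimic the pattern described in the text. The "inclusive" quartile rule is one of several conventions, so packages such as SPSS or STATA may return slightly different percentiles for the same data.

```python
import statistics


def describe(values):
    """Summarise a comet assay descriptor: centre, spread, and CV."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return {
        "n": len(values),
        "mean": mean,
        "sd": sd,
        "median": median,
        "iqr": (q1, q3),
        "cv_percent": 100.0 * sd / mean,
    }


# Hypothetical %T values for nine subjects (not hCOMET data).
tail_intensity = [0.8, 1.2, 1.9, 2.5, 3.1, 4.4, 6.0, 9.5, 21.0]
summary = describe(tail_intensity)
print(summary)
```

In this made-up sample a single extreme value pulls the mean well above the median, which is the reason the median and interquartile interval are preferred for skewed descriptors.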
1d) Visual scoring (arbitrary units)
Laboratory      N       Mean (SD)       Median   25-75 Percentile
CSA2            71      114.2 (31.0)    118.0    89.0–138.0
CSA3            56      8.8 (5.3)       7.8      5.1–10.5
CSA5            55      204.8 (53.6)    217.0    159.0–254.0
CSA6            76      74.1 (7.9)      73.5     70.0–78.5
CSA7            1,206   21.6 (21.7)     15.0     6.0–30.0
EU14            103     198.1 (29.3)    202.5    190.0–207
EU19            89      288.7 (36.2)    284.0    264.5–315.8
EU30            41      28.5 (20.5)     24.0     12.5–42.5
EU31            1,468   78.9 (59.0)     59.1     41.0–95.3
EU36            55      116.5 (7.6)     118.0    110.0–122.0
EU37            238     17.9 (16.1)     13.0     7.0–24.0
EU42            49      23.5 (14.0)     21.8     12.4–32.3
EU7             1,006   26.0 (16.0)     23.0     18.0–31.0
TOTAL           4,513   –               –        –

3.2. Distribution of DNA damage by age and sex

A summary view of the level of DNA damage by age-class is shown in Table 3, where absolute numbers, means and SDs, and medians are shown according to age-class and sex. The univariate analysis of data shows an increasing trend for mean values of TL and especially for %T (p < 0.01), while, unexpectedly, TM showed the opposite trend. The multivariate analysis of the pooled dataset, based on a mixed random effect log-linear model and taking into account sex, smoking habit, exposure to genotoxic agents, cell population and sample processing, did not show any significant association between DNA damage and age, with a progressive non-significant increase of 4, 2, and 6% of TL in the age-classes 19–40 y, 41–60 y, and 61+ years, respectively, when compared to the youngest class. A cubic spline analysis applied to these data suggested a potential increase of DNA damage in the older age-classes. Tail intensity rises steeply up to 20 years; after that it seems to stabilize, with modest fluctuations, but starts to rise again at about 60 years of age (Supplementary Fig. 1). Due to the small numbers and the large variability of the oldest age-classes, this analysis has been truncated to 70 years of age. The other descriptors evaluated with this approach did not show any significant variations by age class (data not shown).
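For readers unfamiliar with the spline technique used for age, the following Python sketch builds a restricted cubic spline basis in the common truncated-power form, in which the fitted curve is cubic between knots and constrained to be linear beyond the outermost knots. The knot positions are hypothetical; the source does not report the knots used, and this is not the authors' implementation.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x (truncated-power form).

    Returns the linear term plus len(knots) - 2 non-linear terms.
    """
    if len(knots) < 3 or list(knots) != sorted(set(knots)):
        raise ValueError("need at least three distinct, increasing knots")
    t_last, t_pen = knots[-1], knots[-2]
    span = t_last - t_pen

    def pos3(u):
        # Truncated cubic: u**3 when u > 0, otherwise 0.
        return max(u, 0.0) ** 3

    terms = [float(x)]
    for t_j in knots[:-2]:
        terms.append(
            pos3(x - t_j)
            - pos3(x - t_pen) * (t_last - t_j) / span
            + pos3(x - t_last) * (t_pen - t_j) / span
        )
    return terms


# Hypothetical knots (years of age) within the 0-70 range analysed.
age_knots = [10, 20, 40, 60, 70]
for age in (5, 30, 65):
    print(age, [round(v, 1) for v in rcs_basis(age, age_knots)])
```

The basis columns would then enter a regression model as covariates in place of a single linear age term; below the first knot only the linear term is non-zero.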
The level of DNA damage according to sex was initially evaluated with univariate analysis as shown in Table 3. Males showed higher mean rates of DNA damage, significant for %T and TM, while median values did not differ between sexes. However, the observed differences were not homogeneous, and therefore the effect of this parameter was further estimated in each laboratory through meta-analysis, and described with a forest plot. All descriptors yielded similar results and therefore only the forest plot of %T is shown (Fig. 3). The ratio of the mean %T in males to that in females is reported for each study, together with the 95% confidence interval and the proportional weight of each study. The level of heterogeneity (I² < 5%) was very low, and only one study found a borderline statistically significant difference between sexes, i.e., 6% higher %T in males. The overall estimate of the meta-MR did not show any effect of sex, i.e., 1.01, 95% CI 0.98–1.03.

The estimates from random effect modelling confirmed the results of the meta-analysis and failed to find significant differences between males and females in the level of DNA damage measured with the comet assay (MR females vs males for %T = 1.00, 95% CI = 0.91–1.11).
3.3.Othermajorconfoundingfactors
As discussed in the methods section, results concerning smoking habit suffer from a great deal of heterogeneity in the quantity and quality of information retrieved, limiting the comparability of results among laboratories. The evaluation of MRs for ever-smokers vs non-smokers in each study allowed a standardized evaluation of this parameter at study level. Results did not show any increase in ever-smokers: for %T, higher levels of DNA damage in the group of smokers were found in only 15 studies out of 36 (41.6%), and never significantly. The results of REM did not find significant differences in the level of DNA damage between ever-smokers and non-smokers (MR = 0.98; 95% CI 0.93–1.02). All descriptors showed results similar to %T.

Fig. 2. Frequency (median and interquartile distance) of DNA damage measured in the whole dataset for all endpoints. 1a) Tail length; 1b) Tail intensity; 1c) Tail moment; 1d) Visual Scoring. Reference lines correspond to the overall median values for the specific endpoint. Overall median and interquartile interval are reported with solid and dotted lines, respectively, in Figure 1a, 1b, and 1c. All statistics shown in this figure refer to unexposed cases only.

Table 2
Reference values for selected descriptors of the comet assay in the whole dataset and in the subset of control individuals.

                                       Observed data                             Modelled data
Descriptor          Group     N        Mean (SD)    Median  25–75 percentile    Expected Mean* (SD)  Expected Median* (25–75)
Tail length (µm)    Overall   3,002    32.3 (20.4)  28.6    16.6–47.1           22.4 (9.6)           23.3 (14.8–26.9)
Tail length (µm)    Controls  1,228    28.6 (17.5)  26.6    14.2–43.6           21.1 (8.6)           22.4 (14.3–25.6)
Tail Intensity (%)  Overall   8,293    7.4 (8.8)    4.5     1.6–9.9             5.4 (5.0)            3.9 (1.8–7.7)
Tail Intensity (%)  Controls  4,625    4.8 (5.5)    2.9     1.1–6.7             3.9 (3.5)            2.8 (1.3–5.4)
Tail Moment (au)    Overall   2,793    3.96 (8.9)   1.1     0.5–2.4             2.0 (6.1)            1.1 (0.5–2.1)
Tail Moment (au)    Controls  1,181    2.8 (6.9)    0.9     0.4–1.8             2.0 (9.0)            0.9 (0.5–1.6)

* Predicted values from a mixed random effect log-linear model including the fixed effects of age, sex, smoking habit, exposure to genotoxic agents, cell type (isolated lymphocytes or whole blood), sample processing (fresh/frozen) and the random effects of laboratory, study and subject; 5-95 correspond to the 25th-75th percentiles.
Standard deviation (SD), arbitrary units (au).
The only covariate which consistently showed a significant effect, reported by most studies in the meta-analysis, was the exposure studied in each specific dataset (a list of the most common exposures is reported in the methods section). As shown by the forest plot of Fig. 4 referring to %T, the large majority of studies, independently of the exposure investigated, showed significant differences in DNA damage when compared with unexposed controls. The meta-estimate confirms the excess of DNA damage in the group of exposed subjects (MR = 1.32; 95% CI 1.20–1.45), in agreement with the mixed random effect model, which estimated a significant increase of DNA damage in exposed subjects for all descriptors, i.e., MR = 1.46 (95% CI = 1.19–1.79) for %T, 1.16 (1.13–1.19) for TL, 1.55 (1.42–1.69) for TM.
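Because mean ratios are estimated from log-linear models, their confidence intervals are expected to be approximately symmetric on the log scale. The sketch below assumes a normal approximation with z = 1.96 (an assumption, not stated in the text) and uses two hypothetical helper functions to recover the implied standard error of log(MR) from a reported interval and to check that the point estimate lies close to the geometric midpoint of that interval. The only values used are the meta-estimate for exposure quoted in the text.

```python
import math


def implied_log_se(ci_low, ci_high, z=1.96):
    """Standard error of log(MR) implied by a CI, under a normal approximation."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)


def geometric_midpoint(ci_low, ci_high):
    """Midpoint of the interval on the log scale, back-transformed to a ratio."""
    return math.sqrt(ci_low * ci_high)


# Meta-estimate for exposure as reported in the text: MR = 1.32 (95% CI 1.20-1.45)
reported_mr, ci_low, ci_high = 1.32, 1.20, 1.45

se_log_mr = implied_log_se(ci_low, ci_high)
midpoint = geometric_midpoint(ci_low, ci_high)

print(f"implied SE of log(MR): {se_log_mr:.3f}")
print(f"geometric midpoint of CI: {midpoint:.3f} (reported MR: {reported_mr})")
```

A large gap between the reported ratio and the geometric midpoint would suggest that the interval was not computed on the log scale, or that a rounding or transcription error occurred.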
Among protocol features that were used as potential confounders in these models, but that will require a dedicated analysis, assays using whole blood showed for all descriptors a much lower frequency of DNA damage than assays using isolated lymphocytes, e.g., for %T the MR was 0.35 (95% CI 0.18–0.71).
Figures reported in Supplementary Table 3 show that while most variability of TL can be explained with the heterogeneity of estimates between laboratories, studies and subjects, i.e., 91.0%, for the other descriptors higher proportions of residual, unexplained variability were found, i.e., 42.6% for %T, 23.4% for TM, and 55.0% for VS. For all descriptors the most important component of variance is the laboratory, even after removing the heterogeneity between single studies.
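The partition of variability quoted here follows the usual definition of a variance partition coefficient: the variance component of each level divided by the total variance. A minimal sketch is given below. The variance components are hypothetical placeholders, since the text reports only the resulting percentages, and the function name is likewise an assumption.

```python
def variance_partition(components):
    """Return each variance component as a percentage of the total variance.

    components: mapping from level name to its estimated variance
    (hypothetical values; a fitted multilevel model would supply these).
    """
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: 100.0 * var / total for level, var in components.items()}


# Hypothetical variance components on the log scale, for illustration only
hypothetical_components = {
    "laboratory": 0.50,
    "study": 0.10,
    "subject": 0.05,
    "residual": 0.35,
}

vpc = variance_partition(hypothetical_components)
for level, share in vpc.items():
    print(f"{level}: {share:.1f}% of total variance")
```

In this reading, the "unexplained" share is the residual component, and the share attributed to the laboratory level is the one the text identifies as the largest for all descriptors.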
The correlation between descriptors was evaluated in those datasets where more than one descriptor was simultaneously evaluated. Correlation coefficients were highly significant for the pairs %T-TL (r = 0.25; p < 0.001), %T-TM (r = 0.41; p < 0.001), and
TL-TM (r = 0.25; p < 0.001). Positive correlations were found for %T and TL with VS, but the small numbers prevented any reliable evaluation. A detailed table with correlation coefficients for all descriptors and all laboratories is reported in Supplementary Table 4. An example of the strength of correlation between laboratories scoring %T and TL is shown in Supplementary Fig. 2.
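As an illustration of how a pairwise correlation between two descriptors scored on the same subjects might be computed, the sketch below implements the Pearson coefficient directly. The text does not say whether Pearson or a rank-based coefficient was used, so the choice is an assumption. The paired values are invented, and no p-value is computed, since that would need a statistics library beyond the standard one.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Invented paired measurements on the same subjects, for illustration only
tail_intensity = [2.1, 4.5, 3.3, 7.9, 5.2, 9.8, 1.4, 6.0]          # %T
tail_length = [18.0, 30.5, 22.1, 41.0, 25.3, 39.2, 20.4, 28.7]     # TL

r = pearson_r(tail_intensity, tail_length)
print(f"r(%T, TL) = {r:.2f}")
```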
4. Discussion
Only validated biomarkers can be efficiently used in molecular epidemiological studies. Validation includes a proper understanding of technical and host factors that contribute to the large extent of variability which characterizes these studies. To achieve validation, biomarkers of genotoxicity, such as the comet assay, need joint efforts to reach sufficient numbers of individuals/observations to do reliable analyses. Indeed, critical improvements in the quality and reliability of other popular biomarkers of DNA damage and genetic instability have been achieved as a consequence of large international collaborative initiatives [23–26]. The availability of the pooled database assembled within the hCOMET international collaborative project offered a unique opportunity to obtain comprehensive information about the comet assay. This information includes the background level of DNA damage in the general population, the role of host factors such as sex and age on the level of DNA damage, the role of occupational and environmental exposure to DNA-damaging agents, and the comparison of the four different descriptors most commonly used over the 20-year span during which these data were obtained.
When the ComNet project was launched in 2011 [17] and taken to the next level by the hCOMET COST Action in 2016, all these points were planned with the purpose of improving our knowledge of the comet assay and making DNA damage measures more comparable between different laboratories. The levels of DNA damage estimated by the comet assay are rather heterogeneous, mostly because of the large number of technical protocols available, which leads to large inter-laboratory variation [27]. A detailed analysis of different protocols, and their influence on the final estimates of DNA damage, is beyond the scope of this paper.
Nevertheless, the role of technical variability was evaluated as a whole, considering the variability between laboratories, studies, and subjects. The importance of this variability has been confirmed by the present study, which gathered together a large number of laboratories and researchers, revealing differences in technical protocols and contributing altogether individual data of thousands of subjects. On the other hand, this variability reflects the sensitivity of the assay, which is well suited to detect differences, even small, in populations exposed to DNA-damaging agents, as shown by the analysis of individual studies. The high level of heterogeneity concerning background level of DNA damage in the individual datasets was consistent for all four descriptors evaluated. The source of this heterogeneity has been evaluated within multilevel modelling by the Variance Partition Coefficient, which showed a high proportion of unexplained variance for %T and VS (42.6 and 55.0%, respectively), while TL variation is mostly explained by the variance associated with the heterogeneity between laboratories (76.2%) and between single studies (14.8%).
Table 3
Distribution of selected descriptors of the comet assay by age class and sex (numbers in each cell represent: absolute frequency; mean (SD), and median). All values refer to total hCOMET population.

Age-Class      Statistic   Tail Length (µm)   Tail Intensity (%)*    Tail Moment (au)
0–18 years     N           138                1,226                  166
0–18 years     Mean (SD)   16.6 (19.4)        3.2 (3.1)              7.0 (12.8)
0–18 years     Median      12.9               2.1                    0.2
19–40 years    N           1,437              2,575                  1,217
19–40 years    Mean (SD)   33.9 (18.1)        7.2 (7.5)              4.0 (8.9)
19–40 years    Median      32.9               5.3                    1.2
41–60 years    N           1,098              2,657                  903
41–60 years    Mean (SD)   31.2 (20.4)        7.6 (9.9)              4.2 (9.9)
41–60 years    Median      26.4               4.0                    1.1
61+ years      N           312                1,329                  261
61+ years      Mean (SD)   35.1 (26.7)        10.5 (11.2)            3.0 (3.8)
61+ years      Median      30.7               6.2                    1.6

Sex            Statistic   Tail Length (µm)   Tail Intensity (%)**   Tail Moment (au)**
Females        N           1,330              3,598                  1,217
Females        Mean (SD)   31.7 (20.3)        7.1 (8.0)              3.6 (7.9)
Females        Median      27.5               4.4                    1.2
Males          N           1,671              4,489                  1,332
Males          Mean (SD)   32.8 (20.4)        7.6 (9.5)              4.7 (10.3)
Males          Median      29.6               4.3                    1.2

* Test for linear trend p < 0.01.
** Student's t-test p < 0.01; Arbitrary units (au).

An additional explanation for the large inter-laboratory variability of descriptors based on metric measures such as TL (76.2%) is the lack of calibration. The larger number of studies reporting %T as the favourite descriptor reflects the orientation of the literature, and the recommendation of international collaborative consortia [5,6], while the large number of subjects screened using the VS reflects the lower cost of this approach, and therefore its greater accessibility. It should be noted that the high level of correlation between the descriptors evaluated shows clearly that they are all