Neural networks for parameter estimation in microstructural MRI: application to a diffusion-relaxation model of white matter


NeuroImage


João P. de Almeida Martins^a,b,1,∗^, Markus Nilsson^a,1^, Björn Lampinen^c^, Marco Palombo^d^, Peter T. While^b,e^, Carl-Fredrik Westin^f,g^, Filip Szczepankiewicz^a,f,g^

^a^ Department of Clinical Sciences, Radiology, Lund University, Lund, Sweden

^b^ Department of Radiology and Nuclear Medicine, St. Olav’s University Hospital, Trondheim, Norway

^c^ Department of Clinical Sciences, Medical Radiation Physics, Lund University, Lund, Sweden

^d^ Centre for Medical Image Computing and Department of Computer Science, University College London, London, United Kingdom

^e^ Department of Circulation and Medical Imaging, NTNU-Norwegian University of Science and Technology, Trondheim, Norway

^f^ Radiology, Brigham and Women’s Hospital, Boston, MA, United States

^g^ Harvard Medical School, Boston, MA, United States

Abstract

Specific features of white matter microstructure can be investigated by using biophysical models to interpret relaxation-diffusion MRI brain data. Although more intricate models have the potential to reveal more details of the tissue, they also incur time-consuming parameter estimation that may converge to inaccurate solutions due to a prevalence of local minima in a degenerate fitting landscape. Machine-learning fitting algorithms have been proposed to accelerate the parameter estimation and increase the robustness of the attained estimates. So far, learning-based fitting approaches have been restricted to microstructural models with a reduced number of independent model parameters where dense sets of training data are easy to generate. Moreover, the degree to which machine learning can alleviate the degeneracy problem is poorly understood. For conventional least-squares solvers, it has been shown that degeneracy can be avoided by acquisition with optimized relaxation-diffusion-correlation protocols that include tensor-valued diffusion encoding. Whether machine-learning techniques can offset these acquisition requirements remains to be tested. In this work, we employ artificial neural networks to vastly accelerate the parameter estimation for a recently introduced relaxation-diffusion model of white matter microstructure. We also develop strategies for assessing the accuracy and sensitivity of function fitting networks and use those strategies to explore the impact of the acquisition protocol. The developed learning-based fitting pipelines were tested on relaxation-diffusion data acquired with optimal and sub-optimal acquisition protocols. Networks trained with an optimized protocol were observed to provide accurate parameter estimates within short computational times. Comparing neural networks and least-squares solvers, we found the performance of the former to be less affected by sub-optimal protocols; however, model fitting networks were still susceptible to degeneracy issues and their use could not fully replace a careful design of the acquisition protocol.

1. Introduction

Microstructure imaging uses compartment modelling of diffusion MRI (dMRI) data with the aim to map specific tissue quantities (Alexander et al., 2019; Nilsson et al., 2013; Novikov et al., 2019). A central goal in microstructure imaging has been to estimate the volume fractions of different microstructural components such as axons (Lampinen et al., 2020, 2019; Veraart et al., 2018). Estimating volume fractions rather than signal fractions is challenging, however, because it requires the simultaneous estimation of both diffusion and relaxation properties of the different model compartments. This kind of inverse problem is sensitive to degeneracy issues (Jelescu et al., 2016; Lampinen et al., 2019), in which multiple sets of model parameters can describe the acquired data equally well. Parameter estimation can also be computationally slow, preventing real-time mapping. A potential solution is to employ machine learning to accelerate the parameter estimation process (Golkov et al., 2016). However, the current literature lacks systematic assessments of the advantages and drawbacks of this approach, which is surprising considering the exponential increase in interest for such methods.

∗ Corresponding author at: Department of Clinical Sciences, Radiology, Lund University, Lund, Sweden.

E-mail address: [email protected] (J.P. de Almeida Martins).

¹ These authors contributed equally to this work.

https://doi.org/10.1016/j.neuroimage.2021.118601.

Received 30 March 2021; Received in revised form 26 August 2021; Accepted 18 September 2021; Available online 22 September 2021.

Artificial neural networks (ANNs) and other machine learning approaches have been applied previously to accelerate the estimation of microstructure parameters from dMRI data (Barbieri et al., 2020; Bertleff et al., 2017; Golkov et al., 2016; Grussu et al., 2020; Gyori et al., 2019; Hill et al., 2021; Kaandorp et al., 2021; Nedjati-Gilani et al., 2017; Palombo et al., 2020; Reisert et al., 2017). For example, a random forest regressor has been used to fit a compartment model for white matter (WM) microstructure in the presence of water exchange (Nedjati-Gilani et al., 2017) and to fit the SANDI model for grey matter properties (Palombo et al., 2020). Reisert et al. (2017) applied machine learning to a Bayesian estimation approach which dramatically accelerated the fitting of two- and three-compartment models. Barbieri et al. (2020) applied ANNs to the intra-voxel incoherent motion model. An important open question, however, is what impact the training strategy has on the fitting performance. This is particularly relevant when applied to non-linear multi-compartment models with many independent model parameters, which we here refer to loosely as ‘high-dimensional models’. The generation of training data scales poorly with the number of model parameters, as sampling each combination of p model parameters in m steps requires m^p samples. As p increases, it is unavoidable that a finite set of samples becomes sparse in the p-dimensional model parameter space, risking selection bias. Here, we investigate the impact of different sampling patterns within this space on the performance of the neural network.
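As a rough illustration of the m^p scaling discussed above, the following Python sketch counts the samples needed for a dense grid over the 11 SMR model parameters and, conversely, how many grid steps per dimension a budget of 5⋅10⁵ training vectors could cover. The function names and the choice of m = 10 steps are illustrative assumptions, not values taken from this study.

```python
def grid_samples(m: int, p: int) -> int:
    """Samples needed to visit every combination of p parameters in m steps."""
    return m ** p


def steps_per_dimension(n_samples: float, p: int) -> float:
    """Grid steps per parameter that a budget of n_samples covers in p dimensions."""
    return n_samples ** (1.0 / p)


# Dense grid: 10 steps (assumed) in each of 11 parameters.
print(grid_samples(10, 11))                     # 100000000000
# A budget of 5e5 vectors corresponds to only about 3.3 steps per dimension.
print(round(steps_per_dimension(5e5, 11), 1))   # 3.3
```

Even a training set of several hundred thousand vectors is therefore far from a dense sampling of an 11-dimensional space, which is why the sampling pattern matters.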

Apart from accelerating model fitting, neural networks may in principle also reduce the requirements on the imaging protocol by learning priors from training examples (Golkov et al., 2016). For example, neural networks have been used to learn a mapping between fully-sampled and sub-sampled datasets, which can in turn be used to stabilise model fitting performance against substantial degrees of data down-sampling (Alexander et al., 2017; Tian et al., 2020). However, we do not expect machine learning approaches to completely alleviate degeneracy issues. Indeed, for cases where the acquisition protocol does not provide sufficient information to resolve between different parameter values, the learning-based estimates will simply equal the mean of the model parameter distribution used for training (Reisert et al., 2017).
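The collapse towards the training mean follows from the squared-error loss: if the input carries no information about a parameter, the best an estimator can do is output a constant, and the constant that minimizes the mean squared error is the mean of the training targets. A minimal numerical sketch, assuming a toy uniform training distribution (the bounds and sample size below are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training targets for a single parameter, drawn uniformly (assumed bounds).
theta = rng.uniform(0.2, 4.0, size=100_000)

# With uninformative inputs the estimator reduces to a constant c.
candidates = np.linspace(0.2, 4.0, 381)                  # grid step of 0.01
mse = np.array([np.mean((theta - c) ** 2) for c in candidates])
best_c = candidates[np.argmin(mse)]

# The MSE-optimal constant coincides with the training mean (to grid resolution).
print(best_c, theta.mean())
```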

The aims of this study were to compare training strategies, to propose tools to evaluate the performance of model fitting neural networks, and to test to what degree neural networks can solve problems with degeneracy. As a testbed, we use a high-dimensional relaxation-diffusion microstructure model of WM (Lampinen et al., 2020, 2019; Veraart et al., 2018). For this model, parameter estimation is enabled by state-of-the-art imaging protocols featuring so-called b-tensor encoding (Topgaard, 2017; Westin et al., 2016) combined with diffusion-relaxation correlations (de Almeida Martins et al., 2020; de Almeida Martins and Topgaard, 2018; Lampinen et al., 2019). We investigated the ability of neural networks to speed up model fitting, and explored the extent to which they can offset the requirements on the acquisition protocol.

2. Theory

White matter microstructure can be modelled by multiple compartments with different microstructural properties but a common orientation distribution (Alexander et al., 2019; Novikov et al., 2019). In this description, the measured signal is the convolution between an orientation distribution function (ODF) P(𝒏̂) and a microstructural kernel K(𝒖̂⋅𝒏̂)

$$S(\hat{\mathbf{u}}) = \int_{|\hat{\mathbf{n}}|=1} P(\hat{\mathbf{n}})\, K(\hat{\mathbf{u}} \cdot \hat{\mathbf{n}})\, \mathrm{d}\hat{\mathbf{n}}, \tag{1}$$

where 𝒏̂ and 𝒖̂ are unit vectors defining the symmetry axes of the ODF and of the diffusion encoding process, respectively. Note that the microstructural kernel depends on the relative angle between 𝒏̂ and 𝒖̂, cos𝛽 = 𝒖̂⋅𝒏̂. In this work, we assign an effective transverse relaxation time T₂ and an apparent microscopic diffusion tensor D to each component, and use exponentially decaying functions to model the effect of these microstructural properties on the relaxation-diffusion-weighted signal (Veraart et al., 2018). Under these assumptions, the microstructure kernel is written as a weighted sum of exponentials

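$$K(\hat{\mathbf{u}} \cdot \hat{\mathbf{n}}) = S_0 \sum_{j=1}^{J} f_j \exp\!\left(-\mathbf{B}(\hat{\mathbf{u}}) : \mathbf{D}_j(\hat{\mathbf{n}})\right) \exp\!\left(-\frac{\tau_\mathrm{E}}{T_{2;j}}\right), \tag{2}$$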

corresponding to a mixture of J components with signal fraction f_j, transverse relaxation time T_2;j, and diffusion tensor D_j. The colon “:” denotes the Frobenius inner product, B:D = ∑_i ∑_j B_ij D_ij. Information about T_2;j and D_j is encoded into the signal by the echo time 𝜏_E and diffusion encoding tensor B(𝒖̂), respectively, both of which are experimental variables. To simplify the model, we only consider axisymmetric B(𝒖̂) and additionally assume that the component-wise D_j are axisymmetric.

The convolution expressed in Eq. (1) can be simplified by factorizing both P(𝒏̂) and K(𝒖̂⋅𝒏̂) into their spherical harmonic coefficients p_lm and k_lm, respectively:

$$P(\hat{\mathbf{n}}) = \sum_{l} \sum_{m} p_{lm} Y_{lm}(\hat{\mathbf{n}}), \tag{3}$$

and

$$K(\hat{\mathbf{u}} \cdot \hat{\mathbf{n}}) = \sum_{l'} k_{l'0}\, Y_{l'0}(\hat{\mathbf{u}} \cdot \hat{\mathbf{n}}), \tag{4}$$

where Y_lm are the spherical harmonics basis functions

$$Y_{lm}(\Theta, \Phi) = \sqrt{\frac{2l+1}{4\pi}\, \frac{(l-m)!}{(l+m)!}}\; L_l^m(\cos\Theta) \exp(i m \Phi), \tag{5}$$

with the L_l^m(x) term denoting the associated Legendre polynomials. The summations in Eq. (3) are carried out for order l = 0, 1, 2, …, and degree m = −l, −l+1, …, l. In Eq. (4), we have taken the axial symmetry of the microstructural kernel K(𝒖̂⋅𝒏̂) into account (Lampinen et al., 2020; Novikov et al., 2018). Symmetry around the polar axis implies k_l'm' = 0 for either m' ≠ 0 or odd l'. Taken together, this means that the k_l'm' coefficients are reduced to their 0th degree terms k_l'0 (typically written as k_l') and only even-ordered spherical harmonic terms (l' = 0, 2, …) provide non-trivial contributions. Using the spherical harmonics addition theorem, Eq. (4) can be rewritten as

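$$K(\hat{\mathbf{u}} \cdot \hat{\mathbf{n}}) = \sum_{l'} k_{l'0} \sum_{m'=-l'}^{l'} Y_{l'm'}(\hat{\mathbf{u}})\, \bar{Y}_{l'm'}(\hat{\mathbf{n}}) \sqrt{\frac{4\pi}{2l'+1}}. \tag{6}$$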

Inserting Eqs. (3) and (6) into Eq. (1) and making use of the orthonormality of the spherical harmonics basis finally yields (Driscoll and Healy, 1994; Healy et al., 1998)

$$S(\hat{\mathbf{u}}) = \sum_{l} \sum_{m} k_{l0}\, p_{lm}\, Y_{lm}(\hat{\mathbf{u}}) \sqrt{\frac{4\pi}{2l+1}}, \tag{7}$$

where𝒖̂canbeparameterizedbythepolarandazimuthalangles,𝜃and 𝜙,describingtheorientationofB,𝒖̂ ≡(sin𝜃cos𝜙,sin𝜃sin𝜙,cos𝜃).

Thesphericalharmoniccoeﬃcientsofthemicrostructurekernel(k_l₀) andtheODF(p_lm)areestimatedastheinnerproductsbetweenagiven sphericalharmonicsbasisfunctionY_lmandeitherK(𝒖̂⋅𝒏̂)orP(𝒏̂).Due totheorthonormalityofthesphericalharmonicsbasis,theinnerprod- uctsaregivenbymultiplicationwiththecomplexconjugatesoftheY_lm, followedbyintegrationsovertheunitsphere.Forthemicrostructural kernel,thisprocedureresultsin(Lampinenetal.,2020)

$$k_{l0} \equiv k_l = S_0 \sum_{j=1}^{J} f_j \sqrt{4\pi(2l+1)}\; I_{lj} \exp\!\left(-b D_{\mathrm{I};j}\left(1 - b_\Delta D_{\Delta;j}\right)\right) \exp\!\left(-\frac{\tau_\mathrm{E}}{T_{2;j}}\right), \tag{8}$$

where b is the conventional b-value and b_Δ denotes the normalized anisotropy of the diffusion encoding tensor B (Eriksson et al., 2015). The isotropic diffusivity and the normalized diffusion anisotropy (D_I and D_Δ) are related to the axial and radial diffusivities (D_|| and D_⊥) of the diffusion tensor according to D_I = (D_|| + 2D_⊥)/3 and D_Δ = (D_|| − D_⊥)/(3D_I) (Conturo et al., 1996); in its principal axis system, a given D can thus be represented by a diagonal matrix parametrized as diag(D_I(1 − D_Δ), D_I(1 − D_Δ), D_I(1 + 2D_Δ)). The I_lj factors are a function of the regular Legendre polynomials, L_l, and defined as

$$I_{lj} = \int_{0}^{1} \exp\!\left(-\alpha_j x^2\right) L_l(x)\, \mathrm{d}x, \tag{9}$$

with 𝛼_j = 3bD_I;j b_Δ D_Δ;j.
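To make Eqs. (8) and (9) more concrete, the sketch below builds axisymmetric B and D tensors, checks numerically that their Frobenius product equals bD_I(1 − b_Δ D_Δ) + 𝛼 cos²𝛽, and evaluates I_lj by Gauss-Legendre quadrature. The axisymmetric parametrization of B (mirroring the diagonal form given above for D), the function names, and the example values are assumptions made for illustration only.

```python
import math

import numpy as np
from numpy.polynomial import legendre as L


def axisym_tensor(iso, delta, axis):
    """Axisymmetric tensor with mean eigenvalue `iso` and normalized anisotropy
    `delta`: eigenvalues iso*(1-delta) (twice) and iso*(1+2*delta) along `axis`."""
    axis = axis / np.linalg.norm(axis)
    return iso * ((1.0 - delta) * np.eye(3) + 3.0 * delta * np.outer(axis, axis))


def I_l(l, alpha, n_nodes=64):
    """Eq. (9): integral over [0, 1] of exp(-alpha x^2) L_l(x) dx."""
    x, w = L.leggauss(n_nodes)               # nodes and weights on [-1, 1]
    x01, w01 = 0.5 * (x + 1.0), 0.5 * w      # mapped to [0, 1]
    return float(np.sum(w01 * np.exp(-alpha * x01**2) * L.Legendre.basis(l)(x01)))


# Example values (assumed): b in ms/um^2, D_I in um^2/ms, beta = 0.7 rad.
b, b_delta, D_I, D_delta, beta = 2.0, 1.0, 0.9, 0.8, 0.7
u = np.array([0.0, 0.0, 1.0])                          # encoding symmetry axis
n = np.array([np.sin(beta), 0.0, np.cos(beta)])        # compartment symmetry axis

B = axisym_tensor(b / 3.0, b_delta, u)                 # trace(B) = b
D = axisym_tensor(D_I, D_delta, n)

alpha = 3.0 * b * D_I * b_delta * D_delta
frobenius = float(np.sum(B * D))                       # B:D
closed_form = b * D_I * (1.0 - b_delta * D_delta) + alpha * np.cos(beta) ** 2

# For l = 0 the integral has a closed form in terms of the error function.
analytic_I0 = math.sqrt(math.pi) * math.erf(math.sqrt(alpha)) / (2.0 * math.sqrt(alpha))
print(frobenius, closed_form)
print(I_l(0, alpha), analytic_I0, I_l(2, alpha))
```

The angle-independent factor of B:D is the first exponential of Eq. (8), while the cos²𝛽 term is what the integral in Eq. (9) projects onto the Legendre polynomials.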

Different diffusion MRI models feature different numbers of components and impose different constraints on the component properties. Here we consider a two-compartment model (J = 2) comprising a “stick” component (S) with D_Δ;S = 1 and a “zeppelin” (Z) component with D_Δ;Z ≠ 1. Truncating the spherical harmonics summation at the second order (l_max = 2) then yields the signal according to

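$$\begin{aligned} S(\mathbf{e}, \mathbf{m}) = S_0 \Bigg[ & f_\mathrm{S} \exp\!\left(-b D_{\mathrm{I;S}} \left(1 - b_\Delta\right)\right) \left( I_{0;\mathrm{S}} + 4\pi I_{2;\mathrm{S}} \sum_{m} p_{2m} Y_{2m}(\theta, \phi) \right) \exp\!\left(-\frac{\tau_\mathrm{E}}{T_{2;\mathrm{S}}}\right) \\ & + \left(1 - f_\mathrm{S}\right) \exp\!\left(-b D_{\mathrm{I;Z}} \left(1 - b_\Delta D_{\Delta;\mathrm{Z}}\right)\right) \left( I_{0;\mathrm{Z}} + 4\pi I_{2;\mathrm{Z}} \sum_{m} p_{2m} Y_{2m}(\theta, \phi) \right) \exp\!\left(-\frac{\tau_\mathrm{E}}{T_{2;\mathrm{Z}}}\right) \Bigg], \end{aligned} \tag{10}$$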

where m ∈ {−2, −1, 0, 1, 2}. The derivation of Eq. (10) uses the p_00 = Y_00 = 1/√(4𝜋) ODF normalization (Lampinen et al., 2020; Novikov et al., 2018). The vectors 𝒆 and 𝒎 capture the experiment-related parameters, 𝒆 = (𝜏_E, b, b_Δ, 𝜃, 𝜙), and scalar model parameters, 𝒎 = (f_S, D_I;S, D_I;Z, D_Δ;Z, T_2;S, T_2;Z, p₂₀, Re(p₂₁), Im(p₂₁), Re(p₂₂), Im(p₂₂)), where Re(p_lm) = (p_lm + (−1)^m p_l,−m)/2 and Im(p_lm) = (p_lm − (−1)^m p_l,−m)/(2i) denote the real and imaginary parts of the p_lm coefficients, respectively. We refer to the model expressed by Eq. (10) as the Standard Model with Relaxation (SMR). This name is chosen to mark its descendance from the “standard model” of WM microstructure (Novikov et al., 2019) and to emphasize the fact that it accounts for compartment-specific T₂ times.
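A compact numerical sketch of Eq. (10) is given below. It is not the implementation used in this work; the function names, the explicit l = 2 spherical harmonics (Condon-Shortley phase convention, assumed), and the assumption of a real-valued ODF with p_2,−m = (−1)^m conj(p_2m) are ours.

```python
import numpy as np
from numpy.polynomial import legendre as L


def I_l(l, alpha, n_nodes=64):
    """Eq. (9) evaluated by Gauss-Legendre quadrature on [0, 1]."""
    x, w = L.leggauss(n_nodes)
    x, w = 0.5 * (x + 1.0), 0.5 * w
    return float(np.sum(w * np.exp(-alpha * x**2) * L.Legendre.basis(l)(x)))


def Y2(m, theta, phi):
    """Complex spherical harmonics of order l = 2 (Condon-Shortley phase)."""
    s, c = np.sin(theta), np.cos(theta)
    if m == 0:
        return np.sqrt(5 / (16 * np.pi)) * (3 * c**2 - 1) + 0j
    if abs(m) == 1:
        y = -np.sqrt(15 / (8 * np.pi)) * s * c * np.exp(1j * phi)
    else:
        y = np.sqrt(15 / (32 * np.pi)) * s**2 * np.exp(2j * phi)
    # Negative degrees follow from Y_{l,-m} = (-1)^m conj(Y_{lm}).
    return y if m > 0 else (-1) ** m * np.conj(y)


def smr_signal(e, m, S0=1.0):
    """Eq. (10). e = (tau_E, b, b_delta, theta, phi);
    m = (f_S, D_IS, D_IZ, D_dZ, T2S, T2Z, p20, Re p21, Im p21, Re p22, Im p22)."""
    tau, b, bd, theta, phi = e
    fS, DIS, DIZ, DdZ, T2S, T2Z, p20, r21, i21, r22, i22 = m
    p = {0: p20 + 0j, 1: r21 + 1j * i21, 2: r22 + 1j * i22}
    p[-1], p[-2] = -np.conj(p[1]), np.conj(p[2])       # real-valued ODF (assumed)
    odf2 = sum(p[k] * Y2(k, theta, phi) for k in range(-2, 3)).real
    signal = 0.0
    # Stick (D_delta = 1) and zeppelin compartments share the same ODF.
    for f, DI, Dd, T2 in ((fS, DIS, 1.0, T2S), (1 - fS, DIZ, DdZ, T2Z)):
        alpha = 3 * b * DI * bd * Dd
        angular = I_l(0, alpha) + 4 * np.pi * I_l(2, alpha) * odf2
        signal += f * np.exp(-b * DI * (1 - bd * Dd)) * angular * np.exp(-tau / T2)
    return S0 * signal


# Illustrative parameter values with an isotropic ODF (all p_2m = 0).
m_example = (0.45, 0.58, 1.36, 0.44, 69.0, 60.0, 0, 0, 0, 0, 0)
print(smr_signal((63.0, 0.1, 1.0, 0.0, 0.0), m_example))
```

Two sanity checks follow directly from the equation: at b = 0 the signal reduces to the T₂-weighted sum of the two signal fractions, and for an isotropic ODF it no longer depends on (𝜃, 𝜙).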

The SMR model parameters can be determined by fitting Eq. (10) directly to the acquired signals (Lampinen et al., 2020). An alternative strategy is to fit to some representation of the signal, such as the spherical harmonics coefficients. Veraart et al. (2018) used a model fitting framework that effectively reduces the dimensionality of the parameter space by means of performing a rotationally invariant factorization of the voxel-wise ODFs (Novikov et al., 2018; Reisert et al., 2017). The initial step of such a framework consists in projecting the measured signal onto a spherical harmonic basis

$$S(\hat{\mathbf{u}}) = \sum_{l} \sum_{m} S_{lm} Y_{lm}(\hat{\mathbf{u}}). \tag{11}$$

The S_lm coefficients are subsequently converted to rotational invariants S_l, and fitted to the corresponding rotationally invariant terms of the P(𝒏̂) ⊗ K(𝒖̂⋅𝒏̂) convolution

$$S_l = p_l k_l, \tag{12}$$

where k_l is the 0th degree term of the microstructural kernel as defined by Eq. (8). The rotationally invariant coefficients, S_l and p_l, are computed from (Novikov et al., 2018)

$$x_l = \sqrt{\frac{4\pi \sum_{m} \left|x_{lm}\right|^2}{2l+1}}, \tag{13}$$

where x_lm are the spherical harmonics coefficients, and x_l ≡ S_l or x_l ≡ p_l. At sufficiently low b-values, signal projections with l > 2 have small contributions to the measured signal (Jespersen et al., 2007) and the sum in Eq. (11) is typically truncated at the second order term (l = 2). The fitting framework summarized by Eqs. (11)–(13) is commonly referred to as the “RotInv” approach due to its use of rotational invariants. The l = 2 RotInv approach condenses the five p_2m, m ∈ {−2, −1, 0, 1, 2} parameters of the SMR model onto a single p₂ invariant capturing the orientation coherence of the sub-voxel diffusion domains, thus reducing the dimensionality of the fitting problem by four parameters.
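Eq. (13) can be evaluated directly from the five real-valued orientation parameters of the SMR parameter vector. The sketch below assumes a real-valued ODF, for which |p_2,−m| = |p_2m| and hence ∑_m |p_2m|² = p₂₀² + 2(|p₂₁|² + |p₂₂|²); the function name is ours.

```python
import numpy as np


def p2_invariant(p20, re21, im21, re22, im22):
    """Rotational invariant p_2 of Eq. (13) for l = 2, assuming a real-valued ODF."""
    power = p20**2 + 2.0 * (re21**2 + im21**2 + re22**2 + im22**2)
    return np.sqrt(4.0 * np.pi * power / 5.0)


# Fully coherent fibres along z: only p_20 = conj(Y_20(z)) = sqrt(5/(4 pi)) is non-zero.
print(p2_invariant(np.sqrt(5.0 / (4.0 * np.pi)), 0, 0, 0, 0))   # 1 up to rounding
# Isotropic ODF: all p_2m vanish.
print(p2_invariant(0, 0, 0, 0, 0))                               # 0.0
```

With the p_00 = 1/√(4𝜋) normalization used here, p₂ thus ranges from 0 for an isotropic ODF to 1 for perfectly aligned domains.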

3. Methods

3.1. Neural network architecture and training

In this work, we constructed feedforward neural networks in MATLAB R2020b (The MathWorks, Inc.), and used them to fit vectors of scalar parameters, 𝒎 = (f_S, D_I;S, D_I;Z, D_Δ;Z, T_2;S, T_2;Z, p₂₀, Re(p₂₁), Im(p₂₁), Re(p₂₂), Im(p₂₂)), to sets of measurements S(𝜏_E, B). We explored various network designs with different numbers of hidden nodes and/or layers before deciding on two final network architectures: an artificial neural network featuring 3 fully connected hidden layers with a decreasing number of nodes (180, 80, and 55) and a deeper/wider neural network featuring 4 fully connected hidden layers with 250 nodes each. All hidden layers were activated by hyperbolic tangent (tanh) functions and the deeper/wider network also featured batch normalization layers between the fully connected inner layers and their respective tanh activations. To distinguish the networks, we refer to them as the shallower neural network (SNN) and deeper neural network (DNN), respectively. Both SNN and DNN comprise an output layer with 11 nodes corresponding to the parameters in 𝒎. The input comprised a given number (E) of signal samples acquired with a pre-defined relaxation-diffusion encoding protocol. We considered three different acquisition protocols, with E = 164, E = 242, and E = 270 samples (𝜏_E, B). Independent networks were trained for each protocol, meaning that 3 SNNs and 3 DNNs were evaluated. To remove the influence of S₀ from the fitting problem, we normalized the input vector to the median signal acquired at the lowest b-value and shortest echo time.
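For readers who prefer code to prose, the SNN layout described above can be sketched as a plain NumPy forward pass. This is not the MATLAB implementation used in the study: the weight initialization, the linear output activation, the function names, and the way the reference samples for input normalization are selected (`ref_mask`) are assumptions for illustration, and no training is performed.

```python
import numpy as np

E = 270                                 # number of samples in one protocol
LAYERS = [E, 180, 80, 55, 11]           # input, three hidden layers, 11 outputs


def init_params(layer_sizes, seed=0):
    """Random weights and zero biases (initialization scheme assumed)."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]


def normalize_input(signals, ref_mask):
    """Divide each voxel's signals by the median of its reference samples,
    i.e. those acquired at the lowest b-value and shortest echo time."""
    ref = np.median(signals[:, ref_mask], axis=1, keepdims=True)
    return signals / ref


def forward(params, x):
    """tanh on the hidden layers, linear output layer (output activation assumed)."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b


params = init_params(LAYERS)
signals = np.random.default_rng(1).uniform(0.1, 1.0, size=(4, E))   # 4 dummy voxels
ref_mask = np.zeros(E, dtype=bool)
ref_mask[:6] = True                     # hypothetical positions of the reference samples
estimates = forward(params, normalize_input(signals, ref_mask))
print(estimates.shape)                  # (4, 11)
```

The DNN variant would use four hidden layers of 250 nodes with a batch normalization step before each tanh; it is omitted here to keep the sketch short.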

Supervised network training was performed using a mean squared error loss

$$\mathrm{MSE} = \left\| \mathbf{m}_\mathrm{targ} - \mathbf{m}_\mathrm{net} \right\|_2^2, \tag{14}$$

where m_targ is the ground-truth target vector, m_net is the corresponding network output vector, and ||⋅||₂ denotes the Euclidean norm. The m_targ parameters were rescaled between 0 and 1 using a min-max normalization strategy before being supplied to the networks. The networks were trained with sets of voxels with randomly generated model parameters and noisy signal samples S(𝜏_E, B), as detailed in Section 3.2. The SNNs were trained with a batch size of 0.5⋅10⁶ and a scaled conjugate gradient optimiser. The DNNs were trained in a mini-batch fashion using a total of 5⋅10⁶ training sets, a mini-batch size of 50⋅10³, and an Adam optimiser. Throughout, training data was divided such that 75% of the original data was used to update the weights and biases and 25% was used for cross-validation. Overfitting was addressed by an early stopping method that terminated training following an increase of the MSE of the validation data for 5 (SNN) or 20 (DNN) consecutive epochs.
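The target rescaling, the loss of Eq. (14), and the early-stopping rule can be sketched as follows. Several points are assumptions: the Table 1 bounds are used as the minimum and maximum of the rescaling (only the six non-orientational parameters are covered, since no bounds are listed for the p_2m terms), the loss is averaged over the batch, and the stopping rule is read as "the validation MSE has failed to improve on its best value for `patience` consecutive epochs". All helper names are ours.

```python
import numpy as np

# Table 1 bounds: f_S, D_I;S, D_I;Z, D_Delta;Z, T_2;S, T_2;Z
LOWER = np.array([0.0, 0.07, 0.2, -0.46, 30.0, 30.0])
UPPER = np.array([1.0, 1.33, 4.0, 0.86, 300.0, 1000.0])


def minmax_scale(m):
    """Rescale parameter vectors to [0, 1] using the Table 1 bounds (assumed)."""
    return (m - LOWER) / (UPPER - LOWER)


def minmax_unscale(m_scaled):
    """Map network outputs back to physical units."""
    return m_scaled * (UPPER - LOWER) + LOWER


def mse(m_targ, m_net):
    """Eq. (14): squared Euclidean norm of the residual, averaged over the batch."""
    return float(np.mean(np.sum((m_targ - m_net) ** 2, axis=-1)))


def stopping_epoch(val_losses, patience):
    """First epoch at which the validation loss has failed to improve on its
    best value for `patience` consecutive epochs (last epoch if never)."""
    best, fails = np.inf, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, fails = loss, 0
        else:
            fails += 1
            if fails >= patience:
                return epoch
    return len(val_losses) - 1


m = np.array([[0.45, 0.58, 1.36, 0.44, 69.0, 60.0]])
print(minmax_scale(m).round(3))
print(stopping_epoch([1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.6], patience=5))  # 7
```

Rescaling matters here because the raw parameters span very different numerical ranges (fractions of order 1, relaxation times of order 100 ms), which would otherwise let the T₂ terms dominate Eq. (14).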

Network GPU training took approximately 83 min for the SNNs and 74 min for the DNNs on two parallel NVIDIA GeForce RTX 2080 SUPER, each with 8 GB of memory. Both graphics cards were installed on a high-end consumer-grade desktop computer with 32 GB memory and an 8-core Intel i9-9900K 3.6 GHz CPU with 2 threads per core.

3.2. Generating training data

We studied the impact of training data generation strategies on the network performance, including training based on uniformly sampled and real brain data. Training parameter vectors were created by two strategies:

- m_unif was synthetically constructed by random sampling of uncorrelated uniform distributions within the bounds described in Table 1;
- m_brain was constructed from in vivo brain data by randomly sampling parameter vectors estimated from an NLLS fit of Eq. (10). This dataset contains parameter correlations found in a typical brain dataset from a healthy adult.

The m_brain vectors comprise the solutions of a nonlinear least-squares (NLLS) fit of Eq. (10) to in vivo signal data, referred to as m_fit, together with an additional parameter set m_mut, consisting of random mutations of the fitted solutions, given by

$$\mathbf{m}_\mathrm{mut} = \mathbf{X} \circ \mathbf{m}_\mathrm{fit}, \tag{15}$$

where ‘◦’ denotes the element-wise (Hadamard) product, and X is an 11-dimensional vector of normally distributed numbers. Each element of X is an independent and identically distributed random variable sampled from a normal distribution with mean 1 and standard deviation 0.3. The standard deviation of X was chosen following brief in silico experiments which revealed that virtually indistinguishable training/test results are obtained for standard deviations within the [0.2, 0.5] interval, provided all other training/network parameters are kept constant. The number of m_fit vectors was kept constant (n_fit ≈ 1.5⋅10⁵), and the total number of mutated vectors was defined as n_mut = n_brain − n_fit. The introduction of mutated parameters is a data augmentation technique, designed to simultaneously compensate for the relatively low number of m_fit vectors and expand the (f_S, D_I;S, D_I;Z, D_Δ;Z, T_2;S, T_2;Z, p₂₀, Re(p₂₁), Im(p₂₁), Re(p₂₂), Im(p₂₂)) domain of the m_brain parameter targets.

Table 1
SMR parameter bounds. The diffusivity bounds were enforced by limiting D_||;S, D_||;Z and D_⊥;Z to the [0.2, 4.0] 𝜇m²/ms interval. For T_2;S and T_2;Z, the lower bound removes the influence of the assumedly fully-attenuated myelin water, and the large upper bound of T_2;Z enables it to capture effects of increased values in white matter lesions (Lampinen et al., 2019) as well as possible contamination with cerebrospinal fluid which is expected to have a larger influence on the more isotropic zeppelin compartment (Lampinen et al., 2020).

| Bounds | f_S | D_I;S [μm²/ms] | D_I;Z [μm²/ms] | D_Δ;Z | T_2;S [ms] | T_2;Z [ms] |
|---|---|---|---|---|---|---|
| Minimum | 0 | 0.07 | 0.2 | −0.46 | 30 | 30 |
| Maximum | 1 | 1.33 | 4.0 | 0.86 | 300 | 1000 |

The training vectors, m_train, were combinations of m_brain and m_unif parameter vectors. Using a given total number of vectors (n_tot) and varying number of m_brain parameters (n_brain), we modulated the fraction of in vivo brain data, f_brain = n_brain/n_tot, between 0 and 1 in steps of 0.05. The SNN training sets contained a total of n_tot = 5⋅10⁵ parameter vectors, while the DNN training sets contained n_tot = 5⋅10⁶. Fig. S1 in the Supporting Information shows the distribution of m_fit, m_mut, and m_unif parameters that compose a typical n_tot = 5⋅10⁵ SNN training dataset.
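The construction of a training set from m_unif, m_fit, and m_mut vectors can be sketched as below. Only the six parameters with bounds in Table 1 are sampled; the `m_fit` array is a random stand-in for real NLLS solutions, the set sizes are scaled down, and both the handling of cases with n_brain < n_fit (plain subsampling) and the absence of any clipping of mutated vectors to the bounds are assumptions.

```python
import numpy as np

# Table 1 bounds: f_S, D_I;S, D_I;Z, D_Delta;Z, T_2;S, T_2;Z
LOWER = np.array([0.0, 0.07, 0.2, -0.46, 30.0, 30.0])
UPPER = np.array([1.0, 1.33, 4.0, 0.86, 300.0, 1000.0])


def sample_unif(n, rng):
    """m_unif: uncorrelated uniform samples within the Table 1 bounds."""
    return rng.uniform(LOWER, UPPER, size=(n, LOWER.size))


def mutate(m_fit, n_mut, rng, sd=0.3):
    """Eq. (15): m_mut = X o m_fit with X ~ N(1, sd^2), element-wise."""
    parents = m_fit[rng.integers(0, len(m_fit), size=n_mut)]
    return rng.normal(1.0, sd, size=parents.shape) * parents


def build_training_set(m_fit, n_tot, f_brain, rng):
    """Mix brain-derived (fitted + mutated) and uniform vectors at fraction f_brain."""
    n_brain = int(round(f_brain * n_tot))
    n_fit = min(len(m_fit), n_brain)
    fit_part = m_fit[rng.choice(len(m_fit), size=n_fit, replace=False)]
    mut_part = mutate(m_fit, n_brain - n_fit, rng)     # n_mut = n_brain - n_fit
    unif_part = sample_unif(n_tot - n_brain, rng)
    return np.vstack([fit_part, mut_part, unif_part])


rng = np.random.default_rng(0)
m_fit = sample_unif(1500, rng)          # stand-in for NLLS solutions (assumption)
m_train = build_training_set(m_fit, n_tot=5000, f_brain=0.5, rng=rng)
print(m_train.shape)                    # (5000, 6)
```

In this scaled-down example, f_brain = 0.5 yields 1500 fitted, 1000 mutated, and 2500 uniform vectors, mirroring the proportions implied by n_fit ≈ 1.5⋅10⁵ and n_tot = 5⋅10⁵.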

Signal data were generated from m_train using Eq. (10) and one of three different (𝜏_E, B) acquisition protocols:

- The optimized protocol comprises tensor-valued encoding with full relaxation-diffusion-correlation optimized for minimal SMR parameter variance (Lampinen et al., 2020).

- The unoptimized protocol comprises tensor-valued encoding with relaxation-diffusion-correlations restricted to low b-values (Lampinen et al., 2019). This protocol was an early attempt to design a diffusion-relaxation protocol with b-tensor encoding. It preceded the optimized protocol and was configured to fit into an available timeslot by following heuristics without a formal performance optimization, and was later found to yield degenerate results in white matter (Lampinen et al., 2020).

- The LTE-only protocol comprises diffusion-relaxation optimized for minimal SMR parameter variance but includes only linear b-tensor encoding (b_Δ = 1) (Lampinen et al., 2020). Just as the unoptimized protocol, it has been found to yield degenerate solutions in white matter.

Additional details on the various protocols can be found in their respective references and in Table S1 of the Supporting Information. We emphasize that all training data used in this study was generated using the SMR forward model, Eq. (10), rather than using raw in vivo brain data.

Noise was sampled from the Rice distribution and added to the ground-truth synthetic signals. Because relaxation-diffusion MRI data comprises voxels with different signal-to-noise ratio (SNR), the amplitude of the SNR at S₀ = S(𝜏_E = 0, B = 0) was uniformly varied in the interval SNR ∈ [80, 160]. Considering the relaxation-diffusion properties of typical healthy WM (T₂ ≈ 70 ms, D_I ≈ 0.9 𝜇m²/ms), this choice results in SNR ∈ [30, 60] at the point of maximal signal of the optimized protocol (𝜏_E = 63 ms, b = 0.1 ms/𝜇m²), SNR amplitudes that are consistent with tensor-valued dMRI measurements of the in vivo brain (Szczepankiewicz et al., 2019a). Finally, networks were trained using m_train vectors as targets and their corresponding in silico noisy signals as inputs.
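A common way to obtain Rice-distributed data, and the one assumed in this sketch, is to add independent Gaussian noise to the real and imaginary channels of the noise-free signal and take the magnitude. The function name and the convention sigma = S₀/SNR are assumptions; the last lines reproduce the SNR arithmetic quoted above for typical WM, approximating the diffusion attenuation by exp(−bD_I).

```python
import numpy as np


def add_rice_noise(signal, snr_s0, rng, s0=1.0):
    """Magnitude of the signal after adding complex Gaussian noise with
    standard deviation sigma = s0 / snr_s0 per channel (convention assumed)."""
    sigma = s0 / snr_s0
    real = signal + rng.normal(0.0, sigma, size=np.shape(signal))
    imag = rng.normal(0.0, sigma, size=np.shape(signal))
    return np.sqrt(real**2 + imag**2)


rng = np.random.default_rng(0)
clean = np.full(270, 0.4)                   # dummy noise-free signals with S0 = 1
snr = rng.uniform(80.0, 160.0)              # one SNR level drawn per voxel
noisy = add_rice_noise(clean, snr, rng)

# SNR at the point of maximal signal of the optimized protocol for typical WM:
# tau_E = 63 ms, T2 = 70 ms, b = 0.1 ms/um^2, D_I = 0.9 um^2/ms.
attenuation = np.exp(-63.0 / 70.0) * np.exp(-0.1 * 0.9)
print(round(80 * attenuation), round(160 * attenuation))   # 30 59, i.e. roughly [30, 60]
```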

3.3. Network evaluation

To find the optimal fraction of m_unif and m_brain parameters (adjusted by the f_brain parameter), we trained SNNs with varying values of f_brain, deployed them on in silico data generated from an unseen subject, and compared the various networks in terms of accuracy of the resulting parameter estimates. Network accuracy was assessed via normalized root-mean-squared errors (NRMSE) and linear correlations with ground-truth values in terms of the Pearson correlation coefficient (𝜌). The NRMSE captures the absolute agreement between the target ground-truth parameters and their corresponding network estimates, whereas 𝜌 captures the linear target-to-estimate correlation strength. The f_brain optimization process is discussed in more detail in section S3 of the Supporting Information. Briefly, the f_brain hyper-parameter controls a trade-off between accuracy to WM-relevant parameters and network generalizability, and we found f_brain = 0.5 to provide an optimal balance between accuracy and generalizability. From this point onward, we concentrate on networks trained with f_brain = 0.5 datasets and evaluate them in further detail using correlation plots.
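The two accuracy metrics can be computed as in the following sketch. The normalization used for the NRMSE is not specified in this section; dividing the RMSE by the range of the ground-truth values is one common convention and is assumed here. The toy data illustrate why both metrics are reported: estimates that are compressed towards a central value remain perfectly correlated with the targets while their absolute agreement is poor.

```python
import numpy as np


def nrmse(target, estimate):
    """RMSE normalized by the range of the ground-truth values (assumed convention)."""
    rmse = np.sqrt(np.mean((estimate - target) ** 2))
    return rmse / (np.max(target) - np.min(target))


def pearson(target, estimate):
    """Pearson correlation coefficient between targets and estimates."""
    return np.corrcoef(target, estimate)[0, 1]


target = np.linspace(0.0, 1.0, 101)
compressed = 0.5 * target + 0.25        # perfectly correlated but biased estimates
print(pearson(target, compressed))      # close to 1: linear correlation is intact
print(nrmse(target, compressed))        # about 0.15: absolute agreement is poor
```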

The accuracy of an f_brain = 0.5 SNN, an f_brain = 0.5 DNN, and a standard NLLS solver was compared on the basis of NRMSEs and Pearson correlation coefficients. The comparison was performed using two distinct in silico datasets: one based on m_fit vectors from WM and deep GM data (m_fit; WM-like), and another based on m_unif vectors. Each dataset comprised a total of 10⋅10³ parameter vectors and their respective in silico signals. The ground-truth synthetic signals were corrupted with Rician distributed noise and the SNR at the S₀ point was sampled uniformly from the [80, 160] range.

The effects of different acquisition protocols on network performance were evaluated in terms of NRMSE and sensitivity to parameter changes. The latter was gauged by modulating the non-orientational parameters of an SMR solution (f_S, D_I;S, D_I;Z, D_Δ;Z, T_2;S, T_2;Z) one at a time by 10% and measuring the response in all parameters. The original parameter set was based on in vivo data from the corona radiata where f_S = 0.45, D_I;S = 0.58 𝜇m²/ms, D_I;Z = 1.36 𝜇m²/ms, D_Δ;Z = 0.44, T_2;S = 69 ms, T_2;Z = 60 ms (Lampinen et al., 2020). Subsequently, in silico datasets were generated for each of the 6 modulated parameter sets, Rice noise was added with SNR = 160 at S₀, and parameter estimation was performed with protocol-specific networks.
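The parameter modulation underlying the sensitivity analysis can be sketched as follows. The baseline is the corona radiata parameter set quoted above; the variable names are ours, and an increase of 10% is assumed for every parameter.

```python
import numpy as np

NAMES = ["f_S", "D_I;S", "D_I;Z", "D_Delta;Z", "T_2;S", "T_2;Z"]
baseline = np.array([0.45, 0.58, 1.36, 0.44, 69.0, 60.0])

# Row k equals the baseline with parameter k scaled by 1.10, all others unchanged.
modulated = baseline * (1.0 + 0.10 * np.eye(len(baseline)))


def relative_response(estimates, reference):
    """Relative change (in %) of each estimated parameter with respect to a reference."""
    return 100.0 * (estimates - reference) / reference


# A perfect estimator would give 10% on the diagonal and 0% elsewhere.
print(relative_response(modulated, baseline).round(1))
```

In the actual analysis, the rows of `modulated` would be passed through the forward model and the trained network, and the response matrix computed from the network estimates; off-diagonal entries then reveal cross-talk between parameters.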

To investigate if the reduced parameter space of RotInv fitting impacts the performance of ANN-based fitting, we trained an SNN using rotationally invariant in silico datasets and the same optimal f_brain value found for the SMR networks. RotInv training vectors, m_train;RI, were generated from the m_train vectors (Section 3.2), using Eq. (13) to convert the full SMR parameters to RotInv parameters (f_S, D_I;S, D_I;Z, D_Δ;Z, T_2;S, T_2;Z, p₂). The RotInv in silico signal data was generated in four steps: (1) signals were calculated using m_train and Eq. (10); (2) noise was added to the in silico signal data with an SNR ∈ [80, 160] at S₀; (3) S_lm components were estimated by projecting the noisy S(𝜏_E, B) signals onto a spherical harmonics basis; and (4) S_l, l ∈ {0, 2}, signals were calculated from S_lm using Eq. (13). As with the full SMR model, training was performed using m_train;RI as targets and their corresponding synthetic noisy signals as ANN inputs.

Trained SMR (RotInv) networks were tested on previously unseen m_unif (m_unif;RI) and m_brain (m_brain;RI) synthetic datasets at an SNR ∈ [80, 160] at S₀. Performance was compared in terms of their respective target-estimate correlations. All networks were trained/tested in a leave-one-out fashion where the training and testing m_brain (m_brain;RI) datasets were generated using in vivo data from different subjects (Section 3.5).


3.4. In vivo data acquisition

We analysed data from three adult volunteers previously reported in Lampinen et al. (2020). The study was approved by the regional ethical review board in Lund and written informed consent was obtained from all volunteers prior to scanning. Measurements were performed on a MAGNETOM Prisma 3T system (Siemens Healthcare, Erlangen, Germany) using a prototype spin-echo EPI sequence that facilitates user-defined gradient waveforms for diffusion encoding (Szczepankiewicz et al., 2019a). Data were collected using echo times between 63 and 130 ms, repetition time of 3.4 s, voxel size of 2.5 mm³, 40 slices, matrix-size of 88×88, in-plane and through-plane acceleration factor of 2×2 (GRAPPA), partial-Fourier of 3/4, bandwidth = 1775 Hz/pixel, and “strong” fat saturation. Diffusion encoding was performed with gradient waveforms optimized to maximize the encoding strength per unit time and to suppress concomitant field effects (Sjölund et al., 2015; Szczepankiewicz et al., 2019b). A total of 270 combinations of 𝜏_E and B were used, according to the optimized protocol in Table S1 of the Supporting Information. Total acquisition time was 15 min.

3.5. In vivo data processing and parameter estimation

Prior to analysis, all in vivo data were corrected for eddy-currents and subject motion using ElastiX (Klein et al., 2009) with extrapolated target volumes (Nilsson et al., 2015). Susceptibility-induced geometric distortions were corrected using the TOPUP tool in the FMRIB Software Library (FSL) (Smith et al., 2004). Gibbs ringing artefact correction was performed according to the method described in Kellner et al. (2016). To suppress the influence of noise, we filtered data with a 3D Gaussian kernel with a standard deviation of 0.45 times the voxel dimensions (Lampinen et al., 2020).

The SMR model parameters were estimated from a voxel-by-voxel NLLS fit of Eq. (10) to the post-processed data. The fitting process was performed with the multidimensional dMRI toolbox (https://github.com/markus-nilsson/md-dmri) (Nilsson et al., 2018), with MATLAB’s built-in lsqcurvefit function. To remove outliers, model fitting was performed twice in each voxel and the result with lowest residual was retained (Lampinen et al., 2020). The initial guesses were sampled uniformly from the parameter bounds in Table 1. The resulting estimates were stored and used to compute in silico signal data following the procedure detailed in Section 3.2. NLLS fitting of a single in vivo brain dataset took approximately 8 h (approximately 5.5 s per voxel) on the CPU described in Section 3.1. The computations were carried out using parallel computing and multi-threading.

Finally, previously trained networks were used to estimate the parameters of Eq. (10) from in vivo data, which took approximately 2 and 20 s for the whole brain using the SNN and DNN, respectively. Training was performed on in silico m_train data with an optimal f_brain fraction. The training process followed a leave-one-out scheme, where the networks were trained on synthetic data generated from two subjects before being deployed/tested on a third, previously unseen, subject. Neural network fitting provided voxel-wise parameter maps that were compared to the ones obtained from a conventional NLLS fitting approach.

4. Results

4.1. Neural network parameter estimates

SNN-based parameter estimation was approximately 10⁴ times faster than NLLS fitting on the same computer, and yielded parameters in good agreement with the ground-truth targets and preserved contrast between regions characterized by distinct (T₂, D) properties (Fig. 1). For example, the estimated f_S and p₂ are high in WM regions generally and highest in orientationally coherent WM regions such as the corpus callosum, similar to the in silico ground-truth. However, a reduced contrast was observed in the T_2;Z maps, where the distinction between WM (darker) and cortical GM (brighter) regions is more prominent in the ground-truth map. The T_2;Z estimates are also characterized by considerable differences between ground-truth and estimated parameters in the long T₂ regions such as the lateral ventricles. The largest overall discrepancy between estimated and ground-truth parameters was found for D_Δ;Z, likely because the signal is insensitive to it when |D_Δ;Z| < 0.5 (Eriksson et al., 2015). Using an ANN trained on synthetic data to directly fit in vivo experimental data resulted in noisier maps. Nevertheless, it preserved an anatomically plausible contrast. Given the strong correlations between in silico ground-truth maps and network estimates, the noisier appearance of the in vivo parameter maps is likely because the SMR model cannot accurately represent the underlying in vivo data. In vivo SNN parameter estimates from WM regions of interest are displayed in Table S2 of the Supporting Information, where they are additionally compared to NLLS estimates.

Fig. 2 shows that SNN-based estimates correlated well with the ground-truth parameter targets, with most parameters yielding linear correlation coefficients close to or above 0.9. The referenced figure focuses on the performance of a network trained with in silico S(𝜏_E, B) data generated with the optimized protocol and an even mix of random and WM-like samples (f_brain = 0.5), and distinguishes between performance on parameters obtained by uniform random sampling (light blue points) and parameters derived from in vivo non-cortical brain data (dark blue points). Red points correspond to parameter vectors derived from low component-specific signal fractions, as described in the figure caption. Poor performance is observed for low D_Δ;Z values, where the network yields D_Δ;Z ≈ 0.3 regardless of the underlying ground-truth. This can be attributed to an intrinsic difficulty in distinguishing between the diffusion-weighted signals of components with |D_Δ;Z| < 0.5 (Eriksson et al., 2015). Moreover, a poor target-to-estimate correspondence was seen for T_2;Z times longer than the maximal echo time.

The parameter maps estimated from the deeper network are in good agreement with their respective ground-truth targets (Figs. S3 and S4 of the Supporting Information correspond to Figs. 1 and 2). In Fig. S4, we observed that DNN-based fitting resulted in slightly stronger correlations between network estimates and ground-truth parameter targets. Although in vivo maps from DNN and SNN are similar, differences can be found in D_Δ;Z and T_2;Z; the DNN produces a noisier D_Δ;Z and the T_2;Z map has a higher contrast between WM and cortical GM. Both of these features are likely artefactual, and suggest that the DNN is more susceptible to differences between the SMR signal predictions and the measured in vivo data.

The errors and prediction-target correlations of the ANN-based estimates are compiled in Table 2, where they are also compared to a conventional NLLS solver. The NLLS, SNN, and DNN approaches all have a comparable accuracy for in silico datasets designed to capture WM (T₂, D) properties. By contrast, the function-fitting networks are observed to be more accurate than the NLLS approach for synthetic m_unif parameter vectors.

4.2. Effect of acquisition protocol on network accuracy and sensitivity

Inthissection,wefocusontherelationshipbetweenacquisitionpro- tocoldesignandnetworkperformance.Fig.3showsthatANN-based fittingcouldpartlybutnotcompletelyeliminatetheknownfitdegener- acyintheunoptimizedandLTE-onlyprotocols:ANNsbasedontheopti- mizedprotocolprovidelowerestimationerrors(NRMSE)thantheANNs basedontheothertwoprotocols.Comparingthesetwoprotocols,we notethattheunoptimizedprotocolyieldsrelativelymoreaccurateesti- matesofD_I;S,andT_2;Z,whiletheLTE-onlyprotocolyieldsmoreaccu- rateestimatesoff_S,D_I;Z,D_Δ;Z,T_2;S,andp₂.Fig.3alsoshowsthatthe performanceofbothSNNandDNNislessaffectedbysub-optimalac- quisitionprotocolsthanthetraditionalNLLSapproach(Fig.3).Forthe

(6)

Fig.1.Deployingtrainednetworksonpreviouslyunseeninsilicoandinvivodataprovidesanatomicallyplausibleparametermapsinunder10s(includingdata managementtimes).Theﬁrstandsecondcolumnscomparetheground-truthtargetsandnetworkpredictions,respectively,oftheinsilicodataset.Diﬀerencemaps areshowninthethirdcolumn.Parametermapsobtainedfromapplyingatrainednetworkoninvivobraindataaredisplayedinthefourthcolumn.

NLLS approach, the use of the unoptimized or LTE-only protocols leads to a considerable increase of the estimation errors, while only a slight increase of NRMSE is observed for the DNN or SNN approaches. This suggests that ANNs may partly alleviate parameter estimation difficulties caused by a protocol that is inadequate in relation to the microstructure model.

Fig. 4 shows the sensitivity of the various protocols to parameter changes. Networks trained on data generated with the optimized protocol are sensitive to all parameters, but slightly underestimate the magnitude of the change, particularly in D_Δ;Z. The parameter-specific modulations did not have a major effect on the estimation of the remaining unmodulated parameters. An exception was found when the underlying T_2;Z was increased by 10%, which resulted in a 3% overestimation of the unchanged D_I;S. Compared to the optimized protocol, the unoptimized and LTE-only protocols exhibit a lower sensitivity to the small parameter modulations and appear to be unresponsive to changes in D_Δ;Z (both protocols) and D_I;S (LTE-only). In addition to lower sensitivity, the unoptimized protocol also resulted in less accurate estimations of the unmodulated parameters, with a 10% modulation of f_S leading to an erroneous 7% increase in T_2;S.
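A sensitivity matrix of the kind analysed here can be assembled as in the sketch below. It is an illustration under stated assumptions: `forward` and the two estimators are hypothetical stand-ins for the signal model and a trained network, the toy problem has two parameters rather than the full SMR set, and responses are expressed relative to the estimate at the unmodulated reference.

```python
def sensitivity_matrix(estimate, forward, m_ref, modulation=0.10):
    """Percentage change in each estimated parameter (columns) when one
    true parameter (rows) is scaled by (1 + modulation)."""
    base = estimate(forward(m_ref))
    matrix = []
    for i in range(len(m_ref)):
        m_mod = list(m_ref)
        m_mod[i] *= 1.0 + modulation
        est = estimate(forward(m_mod))
        matrix.append([100.0 * (e - b) / b for e, b in zip(est, base)])
    return matrix

# Hypothetical toy problem (not the SMR model): the "signal" is simply the
# parameter vector, and two stand-in estimators are compared.
def forward(m):
    return list(m)

def ideal(signal):
    return list(signal)          # recovers every parameter

def insensitive(signal):
    return [signal[0], 1.0]      # ignores the second parameter

m_ref = [0.5, 2.0]
ideal_matrix = sensitivity_matrix(ideal, forward, m_ref)
insensitive_matrix = sensitivity_matrix(insensitive, forward, m_ref)
print(ideal_matrix)        # about 10 on the diagonal, 0 elsewhere
print(insensitive_matrix)  # second row is all zeros
```

Rows index the modulated parameter and columns the observed response, so off-diagonal entries flag cross-talk between parameters and a row of zeros flags a parameter to which the estimator is unresponsive.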


Fig. 2. Scatter plots of ground-truth parameters vs. neural network predictions. Light blue points show results when the network is deployed on uniformly distributed random model parameters. The dark blue points correspond to an in silico dataset derived from a nonlinear least-squares fit to measured brain data where voxels within CSF and cortical GM were excluded by masking out regions where microscopic anisotropy (Lasič et al., 2014), 𝜇FA, is lower than 0.6. The red points correspond to regions where poor accuracy is expected, i.e., where the signal fraction of the relevant component (“stick” or “zeppelin” depending on the parameter) accounts for less than 15% of the total signal or, for the p₂ map, parameter vectors where the “zeppelin” component accounts for more than 85% of the total signal fraction and |D_Δ;Z| < 0.4. The inner legends show the Pearson correlation coefficients (𝜌) of the blue points. For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.

Table 2

Accuracy performance of DNN-, SNN- and NLLS-based fitting approaches. Performance is evaluated on synthetic data simulated from two different sets: uniformly sampled random parameters (m_unif), and parameters derived from least-squares model fitting to in vivo WM and deep GM data (m_fit;WM-like).

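| Metric | Dataset | Fitting method | Fitting time [s] | f_S | D_I;S | D_I;Z | D_Δ;Z | T_2;S | T_2;Z |
|---|---|---|---|---|---|---|---|---|---|
| NRMSE | m_fit;WM-like | NLLS | 967 | 0.07 | 0.07 | 0.04 | 0.1 | 0.08 | 0.01 |
| NRMSE | m_fit;WM-like | SNN | 0.2 | 0.07 | 0.08 | 0.03 | 0.1 | 0.07 | 0.02 |
| NRMSE | m_fit;WM-like | DNN | 0.3 | 0.07 | 0.08 | 0.03 | 0.09 | 0.07 | 0.02 |
| NRMSE | m_unif | NLLS | 2201 | 0.08 | 0.21 | 0.18 | 0.21 | 0.16 | 0.21 |
| NRMSE | m_unif | SNN | 0.1 | 0.05 | 0.11 | 0.10 | 0.15 | 0.12 | 0.14 |
| NRMSE | m_unif | DNN | 0.5 | 0.05 | 0.11 | 0.10 | 0.14 | 0.11 | 0.14 |
| 𝜌 | m_fit;WM-like | NLLS | 967 | 0.90 | 0.87 | 0.91 | 0.78 | 0.78 | 0.98 |
| 𝜌 | m_fit;WM-like | SNN | 0.2 | 0.90 | 0.84 | 0.93 | 0.70 | 0.76 | 0.96 |
| 𝜌 | m_fit;WM-like | DNN | 0.3 | 0.91 | 0.85 | 0.92 | 0.73 | 0.77 | 0.95 |
| 𝜌 | m_unif | NLLS | 2201 | 0.97 | 0.79 | 0.78 | 0.71 | 0.85 | 0.77 |
| 𝜌 | m_unif | SNN | 0.1 | 0.98 | 0.93 | 0.94 | 0.86 | 0.92 | 0.87 |
| 𝜌 | m_unif | DNN | 0.5 | 0.99 | 0.93 | 0.94 | 0.88 | 0.92 | 0.88 |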

4.3. Neural network fitting of rotationally invariant microstructural features

Fig. 5A shows that training a SNN with rotational invariants results in slightly stronger correlation between target and estimated parameters (compare with the scatter plots of Fig. 2). We note a considerable improvement in accuracy at low D_Δ;Z values, where the constant D_Δ;Z ≈ 0.3 behaviour observed for the full SMR model (see Fig. 2) is no longer present. Applying the RotInv network to an unseen in vivo S_{l=0,2} dataset results in parameter maps with anatomically plausible contrast (see Fig. 5B). Consistent with the better D_Δ;Z accuracy performance of the RotInv approach, we note that the RotInv D_Δ;Z in vivo map has a smoother appearance and better demarcates cortical/non-cortical parenchyma than its non-rotationally invariant SMR counterpart (compare the fourth column of Fig. 1 with Fig. 5B).

Interestingly, in vivo maps smoother than the ones displayed in Fig. 5B can be attained from an ANN that was trained on unreasonably noisy in silico data. Fig. 6 displays the in vivo parameter maps obtained from a RotInv network trained with SNR ∈ [20, 40] at S₀, which is 4 times lower than that used in Fig. 5. The resulting maps have a smooth appearance and exhibit anatomically plausible contrast. For example, regions with high f_S correspond to WM regions, the lateral ventricles are characterized by low f_S and high D_I;Z values, and darker/brighter D_Δ;Z regions demarcate cortical/non-cortical parenchyma. While it is tempting to favour the seductively ‘robust’ maps of Fig. 6 over the noisier maps of Fig. 5B, we note that the low-SNR RotInv network results in weak correlations between target and estimated parameters (compare the scatter plots of Fig. 6B with those of Fig. 5A). For example, SNN-based estimates of D_Δ;Z may yield a smooth map that appears robust, but a closer inspection reveals that the D_Δ;Z estimates in WM and deep GM regions are equal to the mean of the target D_Δ;Z distribution and constitute an exceedingly inaccurate estimate of the underlying ground-truth. The tendency for networks to return the mean of the training parameter distribution has been reported in studies of the RotInv model (Reisert et al., 2017) and the behaviour was explained in detail by Coelho et al. (2021).
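The pull towards the mean at low SNR can be reproduced in a toy setting. The sketch below is not the RotInv model and not the analysis of the cited works: it assumes a single parameter m drawn from a standard normal distribution and a signal s = m + noise, for which the estimator that minimises the mean squared error is linear in s with slope 1 / (1 + noise_sd²). All names are hypothetical.

```python
import random

def fitted_slope(noise_sd, n=20000, seed=0):
    """Least-squares slope of the parameter m regressed on the noisy signal s.

    Toy assumption: m is standard normal (mean 0, variance 1) and
    s = m + Gaussian noise, so the theoretical slope is 1 / (1 + noise_sd**2).
    """
    rng = random.Random(seed)
    m = [rng.gauss(0.0, 1.0) for _ in range(n)]
    s = [mi + rng.gauss(0.0, noise_sd) for mi in m]
    mean_m = sum(m) / n
    mean_s = sum(s) / n
    cov = sum((si - mean_s) * (mi - mean_m) for si, mi in zip(s, m))
    var_s = sum((si - mean_s) ** 2 for si in s)
    return cov / var_s

# Compare the fitted slope with the theoretical value for increasing noise.
for noise_sd in (0.1, 1.0, 5.0):
    print(noise_sd, fitted_slope(noise_sd), 1.0 / (1.0 + noise_sd ** 2))
```

As the noise grows, the slope shrinks towards zero and the estimate collapses onto the mean of the training distribution (0 in this toy case), whatever the measured signal. Maps built from such estimates look smooth precisely because they carry little information from the data.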


Fig. 3. Optimized acquisition protocols result in ANN- and NLLS-based parameter estimates with smaller errors. The bar plots indicate the normalized root-mean-squared errors (NRMSE) between ground-truth and predicted parameters, for learning-based (DNN and SNN) and NLLS fitting approaches, and for in silico datasets generated with different acquisition protocols. The leftmost plots correspond to a tensor-valued (𝜏_E, B) protocol optimized for minimal parameter variance, the optimized protocol (Lampinen et al., 2020); the middle plots correspond to a sub-optimal tensor-valued (𝜏_E, B) protocol where relaxation-diffusion correlations are exclusively established at low b-values, the unoptimized protocol (Lampinen et al., 2019); the rightmost plots show the results for a (𝜏_E, B) protocol optimized for minimal parameter variance when limited to linear diffusion encoding (b_Δ = 1), the LTE-only protocol (Lampinen et al., 2020). Panel A shows network performance on parameters sampled from a uniform distribution, and panel B shows the performance on in silico data based on least-squares fitting results to in vivo non-cortical brain tissue data.

Fig. 4. Sensitivity of acquisition protocols to 10% parameter modulations. The matrices display the relation between an induced parameter change and the observed response. When a single parameter on the y-axis is modulated by 10%, the response can be read in all other parameters along the x-axis. An ideal network would report a diagonal matrix with the value 10% on the diagonal, and zero otherwise. The optimized protocol appears sensitive in all parameters, whereas the unoptimized protocol lacks sensitivity to D_Δ;Z and the LTE-only protocol lacks sensitivity to both D_Δ;Z and D_I;S.

5. Discussion and conclusions

Replacing traditional NLLS solvers with function-fitting neural networks enables vastly faster parameter estimation when using high-dimensional microstructural models. On a consumer-grade desktop computer, the fitting time was reduced from hours (NLLS) to seconds (ANN). Naturally, the NLLS fitting times, based on the relatively slow trust-region-reflective algorithm, can be improved by linearizing the fitting problem (Daducci et al., 2015) or by using GPU-based solvers (Harms et al., 2017). However, while such procedures have enabled whole brain fitting of non-linear models within minutes (Daducci et al., 2015; Harms et al., 2017), we still expect the seconds-long forward pass of an ANN to provide a competitive choice in terms of computation time.

The ANN-based estimates were observed to be in good agreement with synthetic data that mimicked healthy WM as well as data that spanned the entire space of allowed model parameters. When deployed on unseen in vivo brain data, neural networks provide maps that are consistent with known brain anatomy and preserve contrast between regions with different relaxation-diffusion properties. Our findings are encouraging and in line with recent advanced dMRI modelling studies that use machine learning techniques for parameter estimation (Barbieri et al., 2020; Bertleff et al., 2017; Golkov et al., 2016; Grussu et al., 2020; Gyori et al., 2019; Hill et al., 2021;


Fig. 5. Neural network fitting of a rotationally invariant (RotInv) model results in strong target-estimate correlations and plausible maps. (A) Correlations between network-based parameter estimates and ground-truth parameter targets. The estimates were obtained from a RotInv network trained using a fraction of f_brain = 0.5 between rotationally invariant m_brain and m_unif training parameter vectors. The colour-coding and legends follow the same convention as Fig. 2. (B) Maps of microstructural diffusion parameters (f_S, D_I;S, D_I;Z and D_Δ;Z) obtained from fitting a RotInv network to rotationally invariant in vivo brain data.

Kaandorp et al., 2021; Nedjati-Gilani et al., 2017; Palombo et al., 2020; Reisert et al., 2017). A combination of error metrics, correlation analysis, and sensitivity matrices was found to provide a useful set of tools for quantitatively assessing parameter-specific accuracy/sensitivity and for identifying the limitations of learning-based approaches. These tools facilitate a survey of the performance across all dimensions of the SMR model, for example, revealing that D_Δ;Z was consistently less accurate than other parameters, as expected from previous studies that have emphasized that it is difficult to estimate (Eriksson et al., 2015; Lampinen et al., 2020, 2019). By contrast, visual inspection of ANN-based parameter maps was found to provide limited insight on the general performance of the networks. Indeed, smooth and anatomically plausible maps can be achieved even with poor network performance and data with low SNR. This is a common and deceptive pitfall that has strong implications for the evaluation of performance in machine learning approaches (Reisert et al., 2017).

We found no evidence that voxel-wise ANN-based parameter estimation can fully alleviate the degenerate fitting landscape typically present when working with biophysical models in dMRI (Jelescu et al., 2016) or replace an exhaustive sampling of all relevant experimental dimensions (Coelho et al., 2019; Lampinen et al., 2020). Fig. 3 shows worse performance in terms of parameter estimation errors (NRMSE) for the two protocols with known degeneracy problems (Lampinen et al., 2020). Similarly, Fig. 4 shows that only the optimized protocol can faithfully recover parameter-specific changes. These are both signs of unresolved degeneracies. Indeed, we cannot expect good performance for LTE-only and unoptimized protocols because these protocols can yield virtually identical signal vectors for different model parameters; the inverse problem has many solutions (Lampinen et al., 2020, 2019). Nevertheless, ANN-based fitting showed an advantage compared with the traditional NLLS approach, as it yielded lower estimation errors in the degenerate cases (unoptimized and LTE-only protocols; Fig. 3). Our interpretation is that the NLLS approach returns one out of the many solutions, whereas the ANN-based estimate tends to an average across the many solutions.
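One way to read this interpretation is through a deliberately simple degenerate problem. The sketch below is a toy illustration, not the SMR model or either fitting pipeline: it assumes a signal s = m² that is identical for two parameter values, and compares the expected parameter error of picking one solution against that of the value minimising the mean squared error.

```python
def mse(estimate, solutions):
    """Mean squared error of one estimate against equally likely solutions."""
    return sum((estimate - m) ** 2 for m in solutions) / len(solutions)

# Toy degeneracy (not the SMR model): the signal s = m**2 is identical
# for m = -0.5 and m = +0.5, so s = 0.25 has two valid solutions.
solutions = [-0.5, 0.5]

# A least-squares fit to the signal can land on either solution exactly.
nlls_like = solutions[1]

# An estimator trained to minimise the parameter MSE is pushed towards the
# value with the lowest expected error, found here by a simple grid search.
grid = [i / 100 for i in range(-100, 101)]
mse_optimal = min(grid, key=lambda c: mse(c, solutions))

print(nlls_like, mse(nlls_like, solutions))      # one branch, higher expected error
print(mse_optimal, mse(mse_optimal, solutions))  # average of both branches
```

The averaged estimate has the lower expected error even though it matches neither solution, which is consistent with a lower NRMSE in degenerate cases without the degeneracy itself being resolved.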

The 11-dimensional parameter space of the SMR model is difficult to sample densely and thus presents a challenge when designing training datasets that are representative of the vast fitting landscape. In this work, we addressed this challenge by constructing training data based on in vivo healthy brain data (m_brain) and more naïve parameter vectors randomly sampled from the entire model parameter space (m_unif).

Networks trained exclusively with m_brain vectors displayed the best accuracy in terms of expected WM properties, but their domain of validity is restricted to the relatively small space spanned by m_brain solutions. This raises questions about their generalizability, i.e., their performance in cases where atypical microscopic tissue structures are present (Alexander et al., 2019). To find a good trade-off between accuracy and generalizability we optimized the fraction of in vivo-based training data (f_brain). However, we expect that more work is needed to define a truly optimal strategy for network training.
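The mixing of the two training sources can be sketched as below. This is a minimal illustration under assumptions: the function name, the two-parameter vectors, the parameter bounds, and drawing the m_brain vectors with replacement are all hypothetical choices, not details taken from this work.

```python
import random

def build_training_set(m_brain, sample_uniform, n_total, f_brain, seed=0):
    """Mix brain-derived and uniformly sampled parameter vectors.

    round(f_brain * n_total) vectors are drawn (with replacement) from
    m_brain; the remainder come from sample_uniform(rng).
    """
    rng = random.Random(seed)
    n_brain = round(f_brain * n_total)
    brain_part = [rng.choice(m_brain) for _ in range(n_brain)]
    unif_part = [sample_uniform(rng) for _ in range(n_total - n_brain)]
    training = brain_part + unif_part
    rng.shuffle(training)
    return training

# Hypothetical two-parameter vectors (f_S, D_I;S) with made-up bounds.
m_brain = [(0.45, 2.1), (0.50, 2.0), (0.55, 2.2)]

def sample_uniform(rng):
    return (rng.uniform(0.0, 1.0), rng.uniform(0.0, 4.0))

training = build_training_set(m_brain, sample_uniform, n_total=1000, f_brain=0.5)
print(len(training))
```

Setting f_brain = 1 reproduces training on m_brain alone, and f_brain = 0 gives purely uniform sampling, so the fraction acts as a single dial between accuracy on expected tissue and coverage of the full parameter space.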


Fig. 6. Training a neural network with an insufficient dataset may result in plausible maps but poor target-estimate correlations. (A) Experimental parameter maps obtained from fitting a RotInv SNN that was trained on unreasonably noisy data (SNR at S₀ in the [20, 40] range). The maps were obtained by deploying the network to rotationally invariant in vivo brain data. (B) Correlations between network-based parameter estimates and ground-truth parameter targets. The colour-coding and legends follow the same convention as Fig. 2.

In this study, we focused on fully connected networks that follow the design of multilayer perceptrons (MLPs), a traditional ANN class that is well-suited for regression problems (Cybenko, 1989; Hornik et al., 1989). Alternatives or complements to the fully connected ANN architecture should also be explored in future works. Promising avenues include the use of dropout (Gal and Ghahramani, 2016; Tanno et al., 2021) or deep ensemble strategies (Lakshminarayanan et al., 2016; Qin et al., 2021) as a means to derive uncertainty metrics, the use of rolled-out network structures inspired by non-learning-based iterative fitting frameworks (Ye, 2017), the use of auto-encoders (Zucchelli et al., 2021), or the use of denoising networks (Fadnavis et al., 2020; Wang et al., 2019) to minimize the amount of noise present in the data that is supplied to the function-fitting ANN. While fundamentally different network architectures may considerably boost the performance of the ANN-based fitting approach, the modest differences found between the SNN and DNN designs suggest that simply increasing the width and/or depth of the fully connected ANN architecture is not a promising avenue. Despite the potential for improvement, we note that the plots in Fig. 2 constitute an improvement over similar target-estimate correlation plots reported by Reisert et al. (2017), where supervised learning based on polynomial regressors was used to fit a three-compartment diffusion model, and are equivalent to the correlation plots reported in more recent works on learning-based fitting of diffusion (Gyori et al., 2019; Palombo et al., 2020) and diffusion-relaxation MRI models (Grussu et al., 2020).

The fully connected ANNs we considered here are not invariant to sample rotations. The input to the ANNs is a vector of E signal samples measured at a pre-defined set of both directional (𝜃, 𝜙) and rotationally invariant (𝜏_E, b, b_Δ) experimental points. The ordering in which the E measurements are provided is kept fixed, and samples with similar microstructural kernels K(𝒖̂⋅𝒏̂) but different orientations will result in distinct network input vectors. This places a burden on the training data generation, which has to span a sufficient set of possible tissue orientations. Using the RotInv formulation renders the network invariant to sample rotations, which considerably reduces the dimensionality of the parameter space that has to be represented in training data.

The higher training efficiency likely explains the slightly higher accuracy of the RotInv network relative to the full SMR networks. Alternatives to the RotInv formulation presented in this study include a framework based on a different set of rotationally invariant features of the dMRI signal (Zucchelli et al., 2021) or fitting the full SMR model with equivariant network architectures (Cohen et al., 2018; Thomas et al., 2018).
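The effect of orientation on the network input can be illustrated with the simplest rotational invariant, the average of the signal over encoding directions. The sketch below is a toy example under assumptions: it uses a single "stick" with hypothetical b-value and diffusivity, a Fibonacci lattice as the direction set, and only the zeroth-order average rather than the full set of invariants used for the RotInv network.

```python
import math

def fibonacci_directions(n):
    """Approximately uniform unit vectors on the sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs

def stick_signal(fibre, directions, b=1.0, d=2.0):
    """Signal of a single 'stick' along `fibre` for each encoding direction.

    b and d are hypothetical values (b-value and axial diffusivity).
    """
    return [math.exp(-b * d * sum(u * n for u, n in zip(fibre, direction)) ** 2)
            for direction in directions]

directions = fibonacci_directions(2000)
signal_z = stick_signal((0.0, 0.0, 1.0), directions)  # fibre along z
signal_x = stick_signal((1.0, 0.0, 0.0), directions)  # same fibre rotated to x

# Raw signal vectors differ strongly, but their direction averages nearly coincide.
max_diff = max(abs(a - c) for a, c in zip(signal_z, signal_x))
mean_z = sum(signal_z) / len(signal_z)
mean_x = sum(signal_x) / len(signal_x)
print(max_diff, mean_z, mean_x)
```

With a finite direction set the average is only approximately invariant, but the contrast is clear: a network fed the raw vector must learn every orientation separately, whereas one fed the invariant sees the same input for both fibres.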

A potential limitation of the present study is the focus on a single multi-compartment model of tissue microstructure whose range of application is mostly limited to WM and deep GM tissues. Applications for cortical GM should therefore consider models tailored to the appropriate microstructure (Palombo et al., 2020). Our decision to focus on a single model follows from previous dMRI literature which has presented the “Standard Model” of tissue microstructure, from which our SMR model descends, as an overarching signal model that encompasses several other WM models as particular cases (Novikov et al., 2019). Furthermore, the “Standard Model” has been used to reveal general degeneracy problems in microstructure parameter estimation (Novikov et al., 2018). Given the generality of our model and the prevalence of degeneracies in advanced dMRI modelling, we expect the degradation of performance with less optimal protocols to also be found in alternative multi-compartment models or when using different learning-based fitting algorithms (e.g., polynomial (Reisert et al., 2017) or random forest (Nedjati-Gilani et al., 2017; Palombo et al., 2020) regressors). However, future work is needed to fully characterize the general relationship between machine learning approaches and degenerate fitting landscapes.

In conclusion, function-fitting neural networks can be used to vastly accelerate parameter estimation with high-dimensional microstructural MRI models. The accuracy of ANN-based estimates was observed to degrade less with sub-optimal protocols than traditional NLLS fitting.

However, the performance of function-fitting networks was still observed to primarily depend on the amount of information sampled by the underlying measurements, and we found no evidence that ANN-based approaches can offset the need for a rich set of data. Therefore, machine learning methodology in MRI microstructure modelling should be matched with comprehensive data acquisition. This work presents a