Sunday, September 27, 2020

Formulas Revisited

When I studied Andrew Ng's course, I got used to notations like $dZ^{[\ell]}$, $dA^{[\ell]}$, etc. Recently I revisited the topic and am now used to the notation $u^{[\ell]}_i = W^{[\ell]}_{i:}\, y^{[\ell-1]} + b^{[\ell]}$ and $y^{[\ell]} = \Phi^{[\ell]}(u^{[\ell]})$, so I want to record the corresponding formulas for computation. Since the notation $dW$ doesn't look any cleaner than $\frac{\partial L}{\partial W}$, in the sequel I write everything explicitly.

$$\frac{\partial L}{\partial W^{[\ell]}} = \frac{1}{m}\,\frac{\partial L}{\partial U^{[\ell]}}\, Y^{[\ell-1]T}$$

$$\frac{\partial L}{\partial U^{[\ell]}} = \left(W^{[\ell+1]T}\,\frac{\partial L}{\partial U^{[\ell+1]}}\right) \odot \Phi^{[\ell]\prime}(U^{[\ell]})$$

where $\odot$ denotes the entrywise product of matrices.

$$\frac{\partial L}{\partial Y^{[\ell-1]}} = W^{[\ell]T}\,\frac{\partial L}{\partial U^{[\ell]}}$$

The last two yield the following, for $\ell < L$ a hidden layer and $\Phi^{[\ell]}:\mathbb{R}\to\mathbb{R}$ the activation function at the $\ell$-th layer:

$$\frac{\partial L}{\partial U^{[\ell]}} = \frac{\partial L}{\partial Y^{[\ell]}} \odot \Phi^{[\ell]\prime}(U^{[\ell]})$$

and finally

$$\frac{\partial L}{\partial b^{[\ell]}} = \frac{1}{m}\sum_{i=1}^{m} \frac{\partial L}{\partial u^{[\ell](i)}} = \frac{1}{m}\,\texttt{np.sum}\!\left(\frac{\partial L}{\partial U^{[\ell]}},\ \texttt{axis=1}\right)$$

For the derivation of these formulas, see my other post: https://checkerlee.blogspot.com/2019/11/important-formulas-in-backward.html#more
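As a minimal sketch, the formulas above can be translated into NumPy for a single hidden layer. The function name `backward_layer` and the choice of sigmoid as $\Phi^{[\ell]}$ are my own assumptions for illustration; columns of each matrix correspond to the $m$ training examples, matching the $\frac{1}{m}$ averaging and `axis=1` sum in the formulas.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def sigmoid_prime(u):
    # Φ'(u) for the sigmoid activation: σ(u)(1 - σ(u))
    s = sigmoid(u)
    return s * (1.0 - s)

def backward_layer(dL_dY, U, Y_prev, W, m):
    """Backward step for one hidden layer (illustrative sketch).

    dL_dY  : ∂L/∂Y^{[ℓ]},   shape (n_ℓ, m)
    U      : U^{[ℓ]},        shape (n_ℓ, m)
    Y_prev : Y^{[ℓ-1]},      shape (n_{ℓ-1}, m)
    W      : W^{[ℓ]},        shape (n_ℓ, n_{ℓ-1})
    """
    # ∂L/∂U^{[ℓ]} = ∂L/∂Y^{[ℓ]} ⊙ Φ'(U^{[ℓ]})
    dL_dU = dL_dY * sigmoid_prime(U)
    # ∂L/∂W^{[ℓ]} = (1/m) ∂L/∂U^{[ℓ]} Y^{[ℓ-1]T}
    dL_dW = (1.0 / m) * dL_dU @ Y_prev.T
    # ∂L/∂b^{[ℓ]} = (1/m) np.sum(∂L/∂U^{[ℓ]}, axis=1)
    dL_db = (1.0 / m) * np.sum(dL_dU, axis=1, keepdims=True)
    # ∂L/∂Y^{[ℓ-1]} = W^{[ℓ]T} ∂L/∂U^{[ℓ]}, passed to the previous layer
    dL_dY_prev = W.T @ dL_dU
    return dL_dW, dL_db, dL_dY_prev
```

For example, with $U = 0$ the sigmoid derivative is $0.25$ everywhere, so feeding in all-ones matrices gives gradients one can check by hand.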
