Why do we test our model?
To find DIFFERENCES between the crystallographic ‘model’ and the real structure of the protein. The model is only a REPRESENTATION of the real structure.
To detect INCONSISTENCIES and mistakes. There are many sources of ERROR during the structure solution process.

R = (Fo- Fc ) (Fo) :

5 factors that test our protein model are…
B factor and occupancy
R factor
R free
R.M.S.D
Ramachandran plot
$R = \frac{\sum \left| |F_o| - |F_c| \right|}{\sum |F_o|}$


FACTOR #1: B factor and occupancy
How certain are we of the atoms' positions?
Is there potential flexibility in the structure?
Do atoms have any static or dynamic disorder?
High B factors are found in flexible regions: termini, flexible loops, solvent-exposed regions, long regions.


Therefore…
Low B factor = GOOD!
High B factor = uncertainty in the atom's position
Low overall B factor usually goes with high resolution
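To make the "high B factor = flexible region" idea concrete, here is a minimal sketch that averages B factors per residue and flags the residues that stand out. The `(residue_id, b_factor)` input layout, the function names, and the 1.5x cutoff are all assumptions made for illustration; the numbers are made up and do not come from a real structure.

```python
from collections import defaultdict
from statistics import mean


def mean_b_per_residue(atoms):
    """Average the B factors of all atoms belonging to each residue.

    atoms: list of (residue_id, b_factor) pairs (hypothetical layout).
    """
    by_residue = defaultdict(list)
    for residue_id, b_factor in atoms:
        by_residue[residue_id].append(b_factor)
    return {res: mean(values) for res, values in by_residue.items()}


def flag_flexible(atoms, ratio=1.5):
    """Return residues whose mean B exceeds ratio * overall mean B.

    The ratio is an arbitrary illustrative cutoff, not a standard value.
    """
    per_residue = mean_b_per_residue(atoms)
    overall = mean(b for _, b in atoms)
    return [res for res, b in per_residue.items() if b > ratio * overall]


# Made-up data: residue 1 behaves like a floppy terminus.
atoms = [(1, 60.0), (1, 70.0), (2, 20.0), (2, 22.0), (3, 18.0), (3, 20.0)]
flexible = flag_flexible(atoms)
print(flexible)
```

In practice the flagged residues would be expected to cluster at termini and in loops, as listed on the slide above.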


FACTOR #2: R factor
Is refinement valid?
How close is our model to the data?
How close is our model to reality?
Does our protein model have expected chemical characteristics?

R = (Fo- Fc ) where Fo = F observed (Fo) Fc = F calculated :

Calculating R factor…
$R = \frac{\sum \left| |F_o| - |F_c| \right|}{\sum |F_o|}$
where $F_o$ = F observed and $F_c$ = F calculated
OR in words…
R = (sum of differences between observed and calculated structure factors) / (sum of observed structure factors)
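The R factor is simple enough to compute by hand. The sketch below assumes the observed and calculated structure factor amplitudes are already matched up reflection by reflection; the function name and the four example amplitudes are invented for illustration.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over matched reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Made-up amplitudes: differences are 10 + 5 + 5 + 5 = 25, sum of Fo = 200.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 20.0]
r = r_factor(f_obs, f_calc)
print(f"R = {r:.3f}")  # prints "R = 0.125", i.e. 12.5 %
```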


Therefore…
The R factor indicates the QUALITY of the model
Shows the model's DEVIATION from reality, as recorded in the observed data
Effectively MEASURES the model's ERRORS
Low R factor = GOOD!
High R factor = BAD!


FACTOR #3: R free
Has the noise been wrongly interpreted?
Has the model been over-fitted to the data?
A randomly selected subset of ~1000 reflections
These reflections are excluded from refinement
So they are statistically independent
Therefore it's a cross-validation parameter
And it is similar to the R factor


Therefore…
R free checks the quality of the model: BEST INDICATION!
Same as the R factor, but calculated for a small percentage of reflections.
Unbiased measure of how similar our model is to the data.
A high value may indicate over-fitting or a serious model defect
Low R free = GOOD!
High R free = BAD!
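The cross-validation idea can be sketched in a few lines: set aside a random subset of reflections, then compute the same R statistic separately on the working set and on the free set. The 5 % fraction, the seed, the function names, and the synthetic amplitudes are assumptions for illustration only. The code covers only the bookkeeping; R free is meaningful only when the free set was chosen before refinement and never used in it.

```python
import random


def r_factor(reflections):
    """R over a list of (f_obs, f_calc) amplitude pairs."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in reflections)
    denominator = sum(abs(fo) for fo, _ in reflections)
    return numerator / denominator


def split_free_set(reflections, test_fraction=0.05, seed=0):
    """Randomly split reflections into (working set, free set).

    test_fraction and seed are illustrative choices, not standard values.
    """
    rng = random.Random(seed)
    shuffled = list(reflections)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Synthetic (f_obs, f_calc) pairs standing in for real data.
reflections = [(10.0 * (i + 1), 10.0 * (i + 1) + 1.0) for i in range(200)]
work_set, free_set = split_free_set(reflections)

r_work = r_factor(work_set)  # would be the ordinary R factor
r_free = r_factor(free_set)  # computed only on the held-out reflections
print(f"R work = {r_work:.4f}, R free = {r_free:.4f}")
```

A large gap between the two values would be the warning sign for over-fitting described above.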


FACTOR #4: RMSD
Is our structure valid chemistry?
How much do our model's bond angles and lengths differ from typical parameters?
Root-mean-square deviation of atomic positions: measures the average distance between the atoms of superimposed proteins


Therefore…
R.m.s.d. values indicate how close the model's bond angles and lengths are to those expected for small molecules
Dominated by the amplitude of errors: affected by flexible and poorly defined regions
Doesn't reflect the model's accuracy
RMSD bond length < 0.02 Å = GOOD!
RMSD bond length > 0.02 Å = BAD!
RMSD bond angle < 2° = GOOD!
RMSD bond angle > 2° = BAD!
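As a rough illustration of the geometry check, the sketch below computes the root-mean-square deviation between a model's bond lengths (or angles) and their target values, then applies the 0.02 Å and 2° thresholds from the slide. The function name and all numbers are made up for the example; the "ideal" values here are placeholders, not entries from a real restraint library.

```python
import math


def rmsd(observed, ideal):
    """Root-mean-square deviation between paired observed and ideal values."""
    if len(observed) != len(ideal) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - i) ** 2 for o, i in zip(observed, ideal)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder values: every bond is off by 0.01 Å, every angle by 1 degree.
bond_lengths = [1.53, 1.32, 1.47]
ideal_lengths = [1.52, 1.33, 1.46]
bond_angles = [110.0, 121.0, 117.0]
ideal_angles = [111.0, 120.0, 116.0]

bond_rmsd = rmsd(bond_lengths, ideal_lengths)
angle_rmsd = rmsd(bond_angles, ideal_angles)

bonds_ok = bond_rmsd < 0.02   # threshold from the slide, in Å
angles_ok = angle_rmsd < 2.0  # threshold from the slide, in degrees
print(f"bond RMSD = {bond_rmsd:.3f} Å, angle RMSD = {angle_rmsd:.1f}°")
```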


FACTOR #5: Ramachandran plot
Plot of phi (Φ) and psi (Ψ) angles for each residue
Are the pairs of phi/psi angles of the polypeptide backbone mapped as expected?
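Each point on the plot comes from two backbone dihedral angles: phi is the dihedral over C(i-1), N, CA, C and psi is the dihedral over N, CA, C, N(i+1). The sketch below computes a dihedral from four 3D points using plain tuples. The helper names are invented, and the example coordinates are simple geometric test points rather than real protein atoms.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Geometric test points: the outer bonds are at right angles to each other.
angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                 (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(f"{angle:.1f}")
```

For a real residue, `phi = dihedral(C_prev, N, CA, C)` and `psi = dihedral(N, CA, C, N_next)` would give the pair that is plotted.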


Therefore…
> 90% of the angles are found in the expected areas of the plot = GOOD!
Many residues not in the expected areas of the plot = BAD!
Useful because Ramachandran plot values are not restrained in the refinement process.
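The 90 % rule can be checked with a small counting function once there is some way to decide whether a (phi, psi) pair lies in an expected area. In the sketch below that decision is passed in as a function. The `toy_region` predicate is a deliberately crude stand-in (it only checks that phi is negative) and is NOT a real Ramachandran region definition; validation software uses empirically derived contours instead. All names and angle values are invented for illustration.

```python
def fraction_expected(angles, in_expected_region):
    """Fraction of (phi, psi) pairs accepted by the supplied region test."""
    if not angles:
        raise ValueError("no residues to evaluate")
    hits = sum(1 for phi, psi in angles if in_expected_region(phi, psi))
    return hits / len(angles)


def toy_region(phi, psi):
    """Crude placeholder: treats any negative phi as 'expected'."""
    return phi < 0.0


# Made-up angles in degrees; the last pair falls outside the toy region.
angles = [(-60.0, -45.0), (-120.0, 130.0), (-65.0, -40.0), (60.0, 45.0)]
fraction = fraction_expected(angles, toy_region)  # 3 of 4 residues
passes = fraction > 0.90  # threshold from the slide
print(f"{fraction:.0%} in expected regions, passes: {passes}")
```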


Conclusion
Ramachandran plot: EXPECTED REGIONS GOOD!!
B factor: LOW IS GOOD!!
R factor (%): LOW IS GOOD!!
R free (%): LOW IS GOOD!!
RMSD: LOW IS GOOD!!
Which is the best indicator of model quality? R free!!


