Chapter 3: Indexing, Integration and Scaling the Data 

Step 15: Output Plots and Error Scale Factor 
When the initial round of scaling is finished, examine the output plots and adjust the error scale factor to bring the overall χ2 down to around 1. Delete the reject file by clicking on the delete reject file button in the lower Controls panel, and then click scale sets again (Figure 48). Repeat these steps until the overall χ2 is acceptable, then proceed to subsequent rounds of scaling by activating Use Rejections on Next Run and scaling again. Repeat the cycle of scaling and rejecting until the number of new rejections has declined to just a few. At this point the scaling is complete.
Figure 48. The scaling Controls panel
After integration, some frames may show strong outliers in the integration information plots, such as an unacceptably high χ2 – much above the average value. You can exclude those frames from scaling by clicking on the exclude frames button. A table listing the frames will appear (Figure 49). Mark the frames that should be excluded and close the table.
Figure 49. Frames selected to be excluded from scaling are marked red
The error scale factor is a single multiplicative factor applied to the input σ_{I}. It should be adjusted so that the normal χ2 (goodness of fit) value printed in the final table of the output comes close to 1. The default error scale factor is 1.3. It applies to the data read after this keyword, so you can apply a different error scale factor to subsequent batches by repeating this input with different values. The value should be adjusted upwards if the overall χ2 is greater than 1.
Reasonable values are between 1 and 2; the default is 1.3. If you need to use a value much greater than this, say 5, there is likely a problem.
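To see why adjusting the error scale factor works, here is a minimal sketch (not Scalepack's actual code; the function name and toy data are invented) of how a goodness-of-fit χ2 responds to scaling the input σ values:

```python
import math

def chi2(groups, scale=1.0):
    """Goodness of fit for merged symmetry-equivalent intensities.
    groups: list of lists of (I, sigma) observations of the same unique hkl.
    scale: error scale factor multiplying every input sigma."""
    num, dof = 0.0, 0
    for obs in groups:
        if len(obs) < 2:
            continue                      # single observations carry no misfit
        mean_i = sum(i for i, _ in obs) / len(obs)
        num += sum((i - mean_i) ** 2 / (scale * s) ** 2 for i, s in obs)
        dof += len(obs) - 1               # one parameter (<I>) fitted per hkl
    return num / dof

# Toy data: two unique reflections whose sigmas are underestimated
groups = [[(100.0, 5.0), (113.0, 5.0)], [(50.0, 4.0), (39.6, 4.0)]]
factor = math.sqrt(chi2(groups))          # chi2 scales as 1/scale**2
print(round(chi2(groups, factor), 6))     # -> 1.0
```

Because χ2 scales as 1/(error scale factor)², multiplying the previous factor by √χ2 from the last run is a reasonable first guess for the next round.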
The first chart to examine is the one that shows the ratio of the intensity to the error of the intensity, i.e. I/σ as a function of resolution (Figure 50). From this you can find the 2 σ cutoff and adjust your resolution limits accordingly.
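The 2 σ cutoff can be read off the plot, or estimated from the reflection list directly. A hedged sketch (hypothetical function; toy shell logic, whereas real programs bin by equal reciprocal-space volume):

```python
def resolution_cutoff(reflections, threshold=2.0, nshells=10):
    """Suggest a high-resolution cutoff: the d-spacing of the finest shell
    whose mean I/sigma still meets the threshold.
    reflections: list of (d_spacing, I, sigma) tuples."""
    refl = sorted(reflections, key=lambda r: -r[0])   # low to high resolution
    size = max(1, len(refl) // nshells)
    cutoff = refl[0][0]
    for k in range(0, len(refl), size):
        shell = refl[k:k + size]
        mean_ios = sum(i / s for _, i, s in shell) / len(shell)
        if mean_ios < threshold:
            break
        cutoff = shell[-1][0]             # finest d-spacing accepted so far
    return cutoff

refl = [(4.0, 100.0, 5.0), (3.0, 60.0, 5.0), (2.5, 20.0, 5.0),
        (2.0, 8.0, 5.0), (1.8, 4.0, 5.0)]
print(resolution_cutoff(refl, nshells=5))  # -> 2.5
```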
The second chart describes the completeness of the data for different resolution limits. The blue line represents all reflections; the purple, reflections with I/σ greater than 3; the red, reflections with I/σ greater than 5; the orange, reflections with I/σ greater than 10; the yellow, reflections with I/σ greater than 20 (Figure 51).
The next chart is scale and B versus frame (Figure 52).
Figure 50. The output plot: I/Sigma vs. Resolution
Figure 51. The output plot: Completeness vs. Resolution
Figure 52. The output plot: Scale and B vs. Frame
Ideally you would not reject any reflections. However, in practice even excellent data sets have a few outliers that need to be rejected. Generally if fewer than 0.5% of the reflections are rejected this is considered normal. Rejections significantly greater than 1% usually indicate some problem with the indexing, integration, space group assignment, or the data itself (e.g. spindle or shutter problems).
The table that appears at the top of the page after each round of scaling has two categories: Reflections Marked for Rejection and Reflections Rejected (Figure 53). The Reflections Marked for Rejection are those that have been written to the reject file. The Reflections Rejected are those that have been omitted from the data in the latest round of scaling. Rejection of reflections is an iterative process that should converge after one or two rounds of rejection and refinement. For example, on the first round of scaling, 100 reflections may be flagged for rejection and written to the reject file. If Use Rejections on Next Run is checked, then the reflections in the reject file will be removed from the set of reflection data on the next round of scaling. Having done this, other reflections may now be flagged as outliers and written to the end of the reject file. Perhaps there are 20 more this time. Repeating this procedure should lead to convergence, where no more reflections are being rejected. Then you are done.
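The converging scale/reject cycle described above can be sketched as a loop (a toy model, not HKL2000's implementation; the 4σ threshold and the data layout are assumptions):

```python
def scaling_rounds(groups, zmax=4.0):
    """Iterate 'scale -> write rejects -> Use Rejections on Next Run' until
    no new outliers appear.  groups: {hkl: [(I, sigma), ...]}."""
    rejected = set()                       # plays the role of the reject file
    rounds = 0
    while True:
        rounds += 1
        new = set()
        for hkl, obs in groups.items():
            kept = [(j, o) for j, o in enumerate(obs)
                    if (hkl, j) not in rejected]
            if len(kept) < 2:
                continue
            mean_i = sum(o[0] for _, o in kept) / len(kept)
            for j, (i, s) in kept:
                if abs(i - mean_i) > zmax * s:
                    new.add((hkl, j))      # append to the reject file
        if not new:
            break                          # converged: no new rejections
        rejected |= new
    return rejected, rounds

groups = {(1, 0, 0): [(100.0, 5.0), (102.0, 5.0), (140.0, 5.0)]}
print(scaling_rounds(groups))  # -> ({((1, 0, 0), 2)}, 2)
```

Note how the third observation is only recognized as an outlier relative to the mean of the full group on round one, and round two confirms that nothing else needs rejecting.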
Figure 53. Info table after first round of scaling
Detecting an anomalous signal is very easy. In the first step the data are scaled and postrefined normally, with one exception: the anomalous flag has been set. This tells the program to output a .sca file where the I+ and I− reflections are kept separate. In this case Scalepack will treat the I+ data and the I− data as two separate measurements within a data set, and the statistics that result from merging the two will reflect the differences between the I+ and I− reflections. Notice that in HKL2000 there is no need to go through a lot of jiffies which separate the I+ and I− data, reformat them, etc. Obviously, for a centric reflection there is no I−, so the merging statistics will only reflect the non-centric reflections. You can tell what percentage of your data is being used to calculate the merging statistics by examining the redundancy table near the end of the log file. Under the column of redundancy > 2 you will find what percentage of the data is being compared. Since you only have I+ and I−, you will never have a redundancy of more than 2.
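Separating Friedel mates is conceptually simple. A sketch for space group P1 only (real programs map indices to the asymmetric unit with the full space-group symmetry before pairing; the function name is invented):

```python
def friedel_split(observations):
    """Sort observations into I+ and I- groups the way anomalous scaling
    keeps them separate.  P1 only: here the pairing uses just the Friedel
    relation (h,k,l) <-> (-h,-k,-l).
    observations: list of ((h, k, l), I) pairs."""
    plus, minus = {}, {}
    for (h, k, l), i in observations:
        if (h, k, l) >= (-h, -k, -l):      # pick one Friedel mate as I+
            plus.setdefault((h, k, l), []).append(i)
        else:
            minus.setdefault((-h, -k, -l), []).append(i)
    return plus, minus

obs = [((1, 2, 3), 10.0), ((-1, -2, -3), 12.0),
       ((0, 0, 2), 5.0), ((0, 0, -2), 6.0)]
plus, minus = friedel_split(obs)
```

Merging `plus` and `minus` separately, then comparing the two, is what produces the anomalous-difference statistics discussed below.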
The presence of an anomalous signal is detected by examining the χ2 values in Figure 54. Assuming that the error scale factors were reasonable and there is no useful anomalous signal in your data, the curves showing the resolution dependence of χ2 should be flat and ≈ 1 for scaling with both merged and unmerged Friedel pairs. On the other hand, if χ2 > 1 and you see a clear resolution dependence of the χ2 for scaling with merged Friedel pairs, there is a strong indication of the presence of an anomalous signal. The resolution dependence allows you to choose a resolution cutoff for the purposes of calculating an anomalous difference Patterson map. This whole analysis assumes that the error model is reasonable and gives you a χ2 close to 1 when the anomalous signal is not present.
Figure 54. Anomalous signal detection
You can only use scale anomalous when you have enough redundancy to treat the F+ and F− reflections completely independently.
Scalepack can be used to determine the space group of your crystal. What follows is a description of how you would continue from the lattice type given by Denzo to determine your space group. This whole analysis, of course, only applies to enantiomorphic compounds, like proteins. It does not necessarily apply to small molecules, which may crystallize in centrosymmetric space groups. If you expect a centrosymmetric space group, use any space group that is a subgroup of the Laue class to which your crystal belongs. You also need enough data for this analysis to work, so that you can see systematic absences.
To determine your space group, follow these steps:
1. Determine the lattice type in Denzo.
2. Scale by the primary space group in Scalepack. The primary space groups are the first space groups in each Bravais lattice type in the table that follows this discussion. In the absence of lattice pseudosymmetries (e.g. monoclinic with β ≈ 90°) the primary space group will not incorrectly relate symmetry-related reflections. Note the χ2 statistics. Now try a higher-symmetry space group (next down the list) and repeat the scaling, keeping everything else the same. If the χ2 is about the same, then this choice is OK and you can continue. If the χ2 values are much worse, then this is the wrong space group, and the previous choice was your space group. The exception is primitive hexagonal, where you should try P6_{1} after failing P3_{1}21 and P3_{1}12.
3. Examine the bottom of the log file or simulated reciprocal lattice picture for the systematic absences. If this was the correct space group, all of these reflections should be absent and their values very small. Compare this list with the listing of reflection conditions by each of the candidate space groups. The set of absences seen in your data which corresponds to the absences characteristic of the listed space groups identifies your space group or pair of space groups. Note that you cannot do any better than this (i.e. get the handedness of screw axes) without phase information.
4. If it turns out that your space group is orthorhombic and contains one or two screw axes, you may need to reindex to align the screw axes with the standard definition. If you have one screw axis, your space group is P222_{1}, with the screw axis along c. If you have two screw axes, then your space group is P2_{1}2_{1}2, with the screw axes along a and b. If the Denzo indexing is not the same as these, then you should reindex using the reindex button.
5. So far, this is the way to index according to the conventions of the International Tables. If you prefer to use a private convention, you may have to work out your own transformations. One such transformation has been provided in the case of space groups P2 and P2_{1}.
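The absence check in step 3 can be mimicked on the axial reflections. A sketch, assuming a simple I > 3σ criterion for "present" (the real analysis examines the table at the bottom of the log file; function name invented):

```python
def screw_axis_consistent(axial, period):
    """Check whether the observed 00l reflections obey the reflection
    condition l = period * n, i.e. all other axial reflections are
    systematically absent.  axial: list of (l, I, sigma); a reflection is
    taken as 'present' when I > 3 sigma."""
    present = [l for l, i, s in axial if i > 3 * s]
    return all(l % period == 0 for l in present)

# Toy 00l column consistent with a 4_1 (or 4_3) screw axis:
axial = [(2, 1.0, 5.0), (4, 500.0, 5.0), (6, 2.0, 5.0), (8, 400.0, 5.0)]
print(screw_axis_consistent(axial, 4))                       # -> True
print(screw_axis_consistent(axial + [(2, 300.0, 5.0)], 4))   # -> False
```

As the text notes, a pattern consistent with l = 4n cannot distinguish 4_{1} from 4_{3} without phase information.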
Bravais Lattice           Primary Space Group   Candidates           Reflection Conditions along Screw Axes
Primitive Cubic           P2_{1}3               195 P23
                                                198 P2_{1}3          (2n,0,0)
                          P4_{1}32              207 P432
                                                208 P4_{2}32         (2n,0,0)
                                                212 P4_{3}32         (4n,0,0)*
                                                213 P4_{1}32         (4n,0,0)*
I Centered Cubic          I2_{1}3               197 I23              *
                                                199 I2_{1}3          *
                          I4_{1}32              211 I432
                                                214 I4_{1}32         (4n,0,0)
F Centered Cubic          F23                   196 F23
                          F4_{1}32              209 F432
                                                210 F4_{1}32         (2n,0,0)
Primitive Rhombohedral    R3                    146 R3
                          R32                   155 R32
Primitive Hexagonal       P3_{1}                143 P3
                                                144 P3_{1}           (0,0,3n)*
                                                145 P3_{2}           (0,0,3n)*
                          P3_{1}12              149 P312
                                                151 P3_{1}12         (0,0,3n)*
                                                153 P3_{2}12         (0,0,3n)*
                          P3_{1}21              150 P321
                                                152 P3_{1}21         (0,0,3n)*
                                                154 P3_{2}21         (0,0,3n)*
                          P6_{1}                168 P6
                                                169 P6_{1}           (0,0,6n)*
                                                170 P6_{5}           (0,0,6n)*
                                                171 P6_{2}           (0,0,3n)**
                                                172 P6_{4}           (0,0,3n)**
                                                173 P6_{3}           (0,0,2n)
                          P6_{1}22              177 P622
                                                178 P6_{1}22         (0,0,6n)*
                                                179 P6_{5}22         (0,0,6n)*
                                                180 P6_{2}22         (0,0,3n)**
                                                181 P6_{4}22         (0,0,3n)**
                                                182 P6_{3}22         (0,0,2n)
Primitive Tetragonal      P4_{1}                75 P4
                                                76 P4_{1}            (0,0,4n)*
                                                77 P4_{2}            (0,0,2n)
                                                78 P4_{3}            (0,0,4n)*
                          P4_{1}2_{1}2          89 P422
                                                90 P42_{1}2          (0,2n,0)
                                                91 P4_{1}22          (0,0,4n)*
                                                95 P4_{3}22          (0,0,4n)*
                                                93 P4_{2}22          (0,0,2n)
                                                94 P4_{2}2_{1}2      (0,0,2n),(0,2n,0)
                                                92 P4_{1}2_{1}2      (0,0,4n),(0,2n,0)**
                                                96 P4_{3}2_{1}2      (0,0,4n),(0,2n,0)**
I Centered Tetragonal     I4_{1}                79 I4
                                                80 I4_{1}            (0,0,4n)
                          I4_{1}22              97 I422
                                                98 I4_{1}22          (0,0,4n)
Primitive Orthorhombic    P2_{1}2_{1}2_{1}      16 P222
                                                17 P222_{1}          (0,0,2n)
                                                18 P2_{1}2_{1}2      (2n,0,0),(0,2n,0)
                                                19 P2_{1}2_{1}2_{1}  (2n,0,0),(0,2n,0),(0,0,2n)
C Centered Orthorhombic   C222_{1}              20 C222_{1}          (0,0,2n)
                                                21 C222
I Centered Orthorhombic   I2_{1}2_{1}2_{1}      23 I222              *
                                                24 I2_{1}2_{1}2_{1}  *
F Centered Orthorhombic   F222                  22 F222
Primitive Monoclinic      P2_{1}                3 P2
                                                4 P2_{1}             (0,2n,0)
C Centered Monoclinic     C2                    5 C2
Primitive Triclinic       P1                    1 P1
Note that for the pairs of similar candidate space groups followed by the * (or **) symbol, scaling and merging of diffraction intensities cannot resolve which member of the possible pair of space groups your crystal form belongs to.
Scaling previously integrated .x files is easy. In the main HKL2000 window, set the Output Data Dir to the directory where your .x files are located. It is not necessary to set the Raw Data Dir. Next, click the Scale Sets Only button, followed by load data sets. You will see a list of .x files in the little dialog box. Select the set you want to scale, followed by OK. If you want to scale additional sets of .x files, find these too and add them to the list. Make sure the scale button is selected for each set you want to scale together. For example, if you want to scale two sets of .x files together (say, from two different crystals, from high and low resolution passes of data collection, or from a native and a derivative, etc.), make sure that scale is clicked for each one. If you don’t want them scaled together, make sure that scale is not clicked. You don’t have to worry about the Image Display or the Experiment Geometry, since this was done when you first generated the .x files.
Now go to the Scaling page. You should see the set or sets of .x files listed in the Pending Sets list. These are the frames that will all be scaled together. Delete the reject file by clicking on the delete reject file button. Then scale away!
Reindexing involves reassigning indices from one unit cell axis to another. This becomes an important issue when comparing two or more data sets that were collected and processed independently. Denzo, when confronted with a choice of more than one possible indexing convention, makes a random choice. This is no problem, except that if it makes a different choice for a second data set, the two will not be comparable without a reindexing procedure. One cannot distinguish the nonequivalent alternatives without scaling the data, which is why this is not done in Denzo. You can tell that you need to reindex a data set if the χ2 values upon merging the two are very high (e.g. 50). This makes sense when you consider that scaling two or more data sets involves comparing reflections with the same hkl index. If the two indexing schemes are equivalent but not identical, chaos will result.
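Conceptually, reindexing is just a linear transformation of the indices. A sketch (the matrix shown is one possible alternative indexing for a tetragonal lattice; consult the HKL manual for the matrices appropriate to your case):

```python
def reindex(hkls, m):
    """Apply a 3x3 reindexing matrix to a list of (h, k, l) indices;
    row r of m gives the new index r as a combination of the old h, k, l."""
    return [tuple(m[r][0] * h + m[r][1] * k + m[r][2] * l for r in range(3))
            for h, k, l in hkls]

# Example: exchange a and b and negate c.  The determinant is +1, so the
# hand of the lattice (and the sign of any anomalous signal) is preserved.
m = [[0, 1, 0], [1, 0, 0], [0, 0, -1]]
print(reindex([(1, 2, 3)], m))  # -> [(2, 1, -3)]
```

This particular matrix is its own inverse, so applying it twice recovers the original indexing.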
No reindexing, no new autoindexing, and nothing except changing the sign of y scale in Denzo can change the sign of the anomalous signal. At the moment reindexing does not work in HKL2000. To reindex you’ll have to follow the scenarios listed in the HKL manual.
The quality of X-ray data is initially assessed by statistics. In small molecule crystallography there is almost always a large excess of strong data, which allows the crystallographer to discard a substantial amount of suspect data and still accurately determine a structure. Compared to small molecules, however, proteins diffract poorly. Moreover, important phase information comes from weak differences, and we must be sure these differences do not arise from the noise caused by the limitations of the X-ray exposure and detection apparatus. As a result, we cannot simply throw away or statistically down-weight marginal data without first making a sophisticated judgment about which data is good and which is bad.
To accurately describe the structure of a protein molecule, quite often we need higher resolution data than the crystal provides. That is life. One of the main judgments the crystallographer makes in assessing the quality of his data is thus the resolution to which his crystal diffracts. In making this judgment, we wish to use the statistical criteria which are most discriminatory and which are the least subjective. In practice, there are two ways of assessing the high resolution limit of diffraction. The first is the ratio of the intensity to the error of the intensity, i.e. I/σ. The second way, which is traditional, but inferior, is the agreement between symmetry related reflections, i.e. R_{merge}.
From a statistical point of view, I/σ is the superior criterion, for two reasons. First, it defines a resolution “limit”, since by definition I/σ is the signal-to-noise ratio of your measurements. In contrast, R_{merge} is an unweighted statistic that does not take the measurement error into account.
Second, the σ assigned to each intensity derives its validity from the χ2 values, which represent the weighted ratio of the squared difference between the observed value I and the average value <I>, divided by the square of the error model, the whole thing times a factor correcting for the correlation between I and <I>. Since it depends on an explicit declaration of the expected error in the measurement, the user of the program is part of the Bayesian reasoning process behind the error estimation.
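Written out, for a unique reflection hkl with N_{hkl} observations I_i and estimated errors σ_i, the definition above takes roughly the following form (a sketch of the usual normalization; Scalepack's exact bookkeeping may differ slightly):

```latex
\chi^2 \;=\; \frac{1}{N_{\mathrm{obs}}}
\sum_{hkl}\sum_{i=1}^{N_{hkl}}
\frac{\bigl(I_i(hkl)-\langle I(hkl)\rangle\bigr)^2}{\sigma_i^2(hkl)}
\cdot\frac{N_{hkl}}{N_{hkl}-1}
```

The factor N_{hkl}/(N_{hkl}−1) is the correlation correction: each observation contributes to its own average <I>, which would otherwise bias the deviations downward.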
The essence of Bayesian reasoning in Scalepack is that you bring χ2 (or technically speaking, the goodness-of-fit, which is related to the total χ2 by a constant) close to 1.0 by manipulating the parameters of the error model. R_{merge}, on the other hand, is an unweighted statistic which is independent of the error model. It is sensitive to both intentional and unintentional manipulation of the data used to calculate it, and may not correlate with the quality of the data. An example of this is seen when collecting more and more data from the same crystal. As the redundancy goes up, the final averaged data quality definitely improves, yet the R_{merge} also goes up. As a result, R_{merge} is only really useful when comparing data which has been accumulated and treated the same way. This will be discussed again later.
In short, I/σ is the preferred way of assessing the quality of diffraction data because it derives its validity from the χ2 (likelihood) analysis. Unless all of the explicit and implicit assumptions which have been made in the calculation of an R_{merge} are known, this criterion is less meaningful. This is particularly true when searching for a “number” which can be used by others to critically evaluate your work.
There are two modes of analysis of data using χ2s. The first mode keeps the χ2 (or more precisely, the goodness-of-fit) constant and compares the error models. Basically, this means that you adjust your estimates of the errors associated with the measurements until the deviations among observations agree with the expectation based on the error model.
The second mode keeps the error model constant and compares χ2s. This mode is computationally much faster and is used in refinement procedures. Of the two modes, the first is more informative, because it forces you to consider changes in the error model. Which mode you use generally depends on what you are comparing. When assessing the general quality of your detector, the first mode is used. When comparing a derivative to a native, the second mode is used due to an incomplete error model which does not take into account important factors like non-isomorphism. Thus, the χ2 of scaling between a native and a derivative provides a measure of non-isomorphism, assuming the detector error is accurately modeled for both samples.
R_{merge} was historically used as an estimate of the non-isomorphism of data collected on film using several different crystals, and for this purpose it still has some validity, because we do not account for non-isomorphism in our error model. It is less important now that complete X-ray data sets are collected from single, frozen crystals.
One of the drawbacks of using R_{merge} as a measure of the quality of a data set is that it can be intentionally and unintentionally manipulated. Unintentional factors which can artificially lower R_{merge} generally have the effect of reducing the redundancy of the data or eliminating weaker observations. In crystallography, the greater the redundancy of the data, the worse the R_{merge}, because the correlation between I and <I>, which artificially lowers R_{merge}, decreases as the redundancy increases. For two measurements with the same σ, the correlation is 50%, so R_{merge} is underestimated by a factor of √2 compared to the case of no correlation. Known unintentional factors which lower R_{merge} include the following:
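The redundancy effect is easy to reproduce numerically. With toy intensities drawn from the same spread, doubling the redundancy of a measurement raises R_{merge} even though the merged intensity is better determined:

```python
def r_merge(groups):
    """R_merge = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.
    groups: list of lists of intensities for each unique hkl."""
    num = den = 0.0
    for obs in groups:
        mean_i = sum(obs) / len(obs)
        num += sum(abs(i - mean_i) for i in obs)
        den += sum(obs)
    return num / den

low  = r_merge([[99.0, 101.0]])                # redundancy 2
high = r_merge([[98.0, 99.0, 101.0, 102.0]])   # redundancy 4, same spread
print(low, high)  # -> 0.01 0.015
```

With only two observations, each deviation from the pairwise mean is half the difference between the measurements, which is the correlation effect the text describes.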
Data collected so that lower resolution shells, where the data is strong, have a higher redundancy than the higher resolution shells, where the data is generally weaker. This can be accomplished by collecting data on detectors where 2θ ≠ 0, or including data from the corners of rectangular or square image plates. There is nothing wrong with using this data; it will just artificially lower the R_{merge}.
Inclusion of single measurements in the calculation of R_{merge} in one widely used program, which is why a table using this erroneous calculation used to be presented in the Scalepack output. Although the bug in the widely used program was unintentional, it nonetheless reduced the R_{merge} and this may have accounted for its longevity. A second, more subtle bug that reduced R_{merge} prompted the introduction of the keyword background correction. Fortunately, both bugs have now been fixed, but the point is that errors of this type can persist.
Omission of negative or weak reflections from the calculation of R_{merge}. This is often undocumented behavior of crystallographic data scaling/merging software. Examples include:
a) elimination from the R_{merge} calculation of reflections that have negative intensities.
b) conversion of I < 0 to I = 0 before the calculation of R_{merge} and inclusion of this reflection in the data set (the statistics of such a type are included in the Scalepack output for reasons of comparison. This is the first R_{merge} table in the log file, not the final one).
c) omission of reflections with <I> < 0 from the calculation of R_{merge}, while including these reflections in the output data set.
Default σ cutoffs set to unreasonable values, like 1. This is in fact the default of the software commonly used to process image plate data.
Use of unreasonable default/recommended rejection criteria in other programs. These eliminate individual I’s which should contribute to R_{merge} and yet are still statistically sound measurements.
Use of the eigenvalue filter to determine the overall B factor of a data set collected on a non-frozen, decaying crystal. In this case, the eigenvalue filter will calculate an overall B factor which is appropriate for the middle of the data set, yet apply it to all data. As a result, the high resolution data will be down-weighted compared to data processed with the first, least decayed frame as the reference. The high resolution data is generally weaker than the low resolution data, and as a result is more likely to result in a higher R_{merge}. By down-weighting the high resolution data, the R_{merge} is artificially lowered. Any program which does not allow the option of setting the reference frame will have this problem. Of course, there is no problem with non-decaying crystals.
There are also intentional ways of lowering your R_{merge}. Like those ways listed above, they generally result from the statistically invalid elimination of weak reflections, reduction of the redundancy of the data, or deemphasis of weak data. The difference between these methods and those listed above is that they are generally under the control of the user.
Use of an unreasonable sigma cutoff (e.g. ≥ 0). The rejection of weak data will always improve R_{merge}. There is a further discussion of the sigma cutoff in the Scalepack Keywords Descriptions section.
Use of a resolution limit cutoff. Again, the omission of weak data will improve R_{merge}. A reasonable resolution cutoff is the zone where I/σ < 2.
Combining into a single zone for the purposes of calculations those resolution shells where R_{merge} is rapidly changing. In this case, the shell will be dominated by the strong data at the low resolution end of the zone and give the impression that the high resolution limit of the zone has better statistics than it really does. For example, if you combined all your data into a single zone, the R_{merge} in the final shell would be pretty good (=R_{merge} overall), when in fact it was substantially worse. It is more sensible to divide your zones into equal volumes and have enough of them so that you can accurately monitor the decay with resolution.
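Equal-volume shells are straightforward to compute, since the reciprocal-space volume out to resolution d grows as 1/d³. A sketch (function name invented):

```python
def equal_volume_shells(d_min, nshells):
    """Shell boundaries in d-spacing, from low to high resolution, chosen so
    that each shell spans the same reciprocal-space volume (~ 1/d**3).
    The last boundary is d_min itself."""
    s3_max = (1.0 / d_min) ** 3            # s = 1/d; volume goes as s**3
    return [round((nshells / (s3_max * k)) ** (1.0 / 3.0), 3)
            for k in range(1, nshells + 1)]

print(equal_volume_shells(2.0, 2))  # -> [2.52, 2.0]
```

Note how tightly the boundaries crowd toward the high-resolution end: the outermost shells are thin in d-spacing precisely because the statistics there change so quickly.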
Omitting partially recorded reflections. This has the effect of a) reducing the redundancy, and b) eliminating poorer reflections. Partially recorded reflections will always have a higher σ associated with them because they have a higher total background, due to the inclusion of background from more than one frame in the reflection.
Scaling I+ and I− reflections separately in the absence of a legitimate anomalous signal (scale anomalous). This has the effect of reducing the redundancy.
Ignoring overloaded reflections using ignore overloads in Scalepack. The intensity of overloaded, or saturated, reflections cannot be directly measured, because some of the pixels are obviously saturated. Profile fitting only measures these reflections indirectly, by fitting a calculated profile to the spot using the information contained in the wings or tail of the spot. Ignoring the inaccuracies inherent in this method by ignoring overloads may have a dramatic effect on R_{merge}. Note that in the case of molecular replacement the include overloads option should be used.
ignore overloads is often a useful tool, however. For example, when calculating anomalous differences you do not want to use overloaded reflections because you are looking for very, very small differences and want to use only the most accurate data. Another time you might ignore overloads is when you collect multipass data. In this case, a crystal is exposed twice, once for a short time, the other for a longer time. The longer exposure is to sufficiently darken the high resolution reflections, but will result in saturated low resolution reflections. Since the low resolution reflections can be obtained from the short exposures, the overloaded ones can be ignored in the long exposures.
See: Press, W. H., Teukolsky, S. A., Vetterling, W. T., Flannery, B. P., Numerical Recipes in C: The Art of Scientific Computing, 2nd ed., Cambridge University Press, 1992.
A statistically sensible high-resolution cutoff for a diffraction data set is the shell where the average I/σ is 2 after correctly integrating and scaling the data. The resolution of the data is a distinct criterion from its completeness. The best data will be 100% complete. Completeness may suffer due to anisotropic diffraction, overlapped reflections, or geometrical constraints (data from the corners of the detector was used, beam stop or cooling nozzle shadows, etc.). In order to properly estimate the resolution of the data, you have to take into account I/σ, R_{sym}, and data completeness.
The Scalepack log file, scale.log, can be examined to see if the data scaling went well. It is long, since it contains the list of all .x files read in and a record of every cycle of scaling. The log file is divided into several sections. In order these are:
1. the list of .x files read in, and the list of reflections from each of these files that will be rejected
2. the output file name
3. the total number of raw reflections used before combining partials
4. the initial scale and B factors for each frame, goniostat parameters, and the space group
5. the initial scaling procedure.
The log file should be examined after each iteration. In particular, the errors, χ2 and both R factors should be checked.