AUTOMATION OF MACROMOLECULAR DATA COLLECTION -
INTEGRATION OF DATA COLLECTION AND DATA PROCESSING

Harold R. Powell (1), Graeme Winter (1), Andrew G.W. Leslie (1), Colin Nave (2), Elizabeth Duke (2), Stephen H. Kinder (2), Dave Love (2), Sean McSweeney (3), Olof Svensson (3), Darren Spruce (3), Solange Delageniere (3)


(1) MRC-LMB, Hills Road, Cambridge, UK
(2) Daresbury Laboratory, Daresbury, Warrington, UK
(3) ESRF, BP 220, F-38043, Grenoble Cedex, France

In view of the limited availability of beamtime at synchrotron sources and the large number of projects requiring this resource, automation of both data collection and data processing has become increasingly important. Improvements in area detector technology (e.g. the introduction of fast readout devices such as CCDs) also emphasize that human intervention, both at this stage and during subsequent data processing, limits the throughput attainable. With this in mind we have made considerable progress in integrating data collection and processing and in automating each of these two components.


Implementation


The project is divided into five phases and is intended to provide useful added
functionality at all stages. Phase I and II functionality will be available at the
ESRF (beamlines ID14 EH2, EH1) and SRS (beamline 14.2) in the next few
months.


Phase I. In this phase, the Expert System will simply provide a communication
pathway between the data processing software and the beamline control
software.


Phase II. The parameters of the data collection will be presented to the user in
a GUI where they can be edited.


Phase III. An additional button will allow the user to integrate the images as
they are collected. Information about the results of the integration (e.g.
<I/σ(I)> as a function of resolution or image) will be fed back to the GUI from
the data processing. Results of merging the data will also be displayed.


Phase IV. Implementation of fully automated data collection and processing. A
single button will activate initial characterisation of the crystal, and providing
that user-defined criteria regarding resolution, mosaicity etc. are met, the data
will be collected and integrated without any user intervention.
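
The Phase IV gate between characterisation and unattended collection can be pictured as a simple check of the characterisation results against the user's limits. The sketch below is illustrative only: the class names, field names and units are assumptions and are not taken from the DNA software.

```python
from dataclasses import dataclass


@dataclass
class Criteria:
    """User-defined acceptance limits (hypothetical names and units)."""
    required_resolution: float  # Angstrom; a smaller value means higher resolution
    max_mosaicity: float        # degrees


@dataclass
class CrystalReport:
    """Results of the initial characterisation (hypothetical structure)."""
    indexed: bool
    effective_resolution: float  # Angstrom
    mosaicity: float             # degrees


def should_collect(report, criteria):
    """Return (decision, reasons) so that a refusal can be explained to the user."""
    reasons = []
    if not report.indexed:
        reasons.append("autoindexing failed")
    else:
        # The crystal must diffract at least as far as the user requires.
        if report.effective_resolution > criteria.required_resolution:
            reasons.append("resolution limit not reached")
        if report.mosaicity > criteria.max_mosaicity:
            reasons.append("mosaicity too high")
    return (not reasons, reasons)
```

Returning the reasons along with the decision is a design choice for the sketch; it lets a GUI report why a crystal was rejected instead of silently skipping it.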


Phase V. Automated sample loading (including crystal centring in the beam)
and a project management system will be integrated with the Expert System.
This will allow rank ordering of multiple crystals based on their diffraction
properties, and fully automated beamline operation.
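
Rank ordering of crystals could, for example, be a sort on the quantities obtained during characterisation. The poster does not say which properties are used or how they are weighted, so the choice below (effective resolution first, mosaicity as tie-breaker) and all names are assumptions made for illustration.

```python
from typing import NamedTuple


class Diffraction(NamedTuple):
    """Per-crystal characterisation summary (hypothetical fields)."""
    effective_resolution: float  # Angstrom; smaller is better
    mosaicity: float             # degrees; smaller is better


def rank_crystals(reports):
    """Return crystal identifiers ordered best first.

    reports maps a crystal identifier to a Diffraction record. Crystals are
    ordered by effective resolution, with mosaicity used to break ties.
    """
    return sorted(
        reports,
        key=lambda cid: (reports[cid].effective_resolution, reports[cid].mosaicity),
    )
```

For instance, given three crystals with limits of 2.4, 1.9 and 1.9 Angstrom, the two 1.9 Angstrom crystals are placed first and ordered between themselves by mosaicity.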

[Flowchart. Key to division of labour: Mosflm, Expert system, CCP4. Label: Phase I/II. Steps: start; collect two images at 0º and 90º; autoindex; estimate mosaicity; integrate single image; determine effective resolution limit; determine data collection strategy; collect images for postrefinement; postrefine cell parameters; collect next image in dataset; integrate image; merge/scale data; select next highest symmetry point group; determine strategy for new point group; finish (okay); finish (error). Decisions (y/n): crystal still okay?; data collection finished?; point group consistent?; point group changed?]

Procedures involved in fully automated data collection and processing

The Expert System then instructs the beamline software to collect the required
images, and the data processing software to process the images as they are
collected.



Automation of the data processing steps is possible because of improvements
to Mosflm itself, which allow the appropriate sequence of operations to be
carried out in a flexible and robust manner. Commands which were previously
only available from the GUI are now accessible on the command line. All the
features listed in the flowchart exist in Mosflm version 6.2.0.

This work has been funded by CCP4 and the EU via the Autostruct initiative and the Max-Inf network.

DNA stands for DNA’s Not Autostruct. For further information, visit http://www.dna.ac.uk

The primary objective of the DNA project is the provision of software that will
allow fully automated collection and processing of diffraction data, including
rapid crystal screening. Ultimately the project will be extended to include
automated sample loading and a project management system, which will enable
a large number of crystals from a number of different projects to be handled
without any manual intervention.


Three modules (data processing, beamline control and sample control) are
linked together to provide a complete system for controlling data collection and
processing. Communication between the three modules is handled by an expert
system; this makes the crucial decisions about the data collection based on
information provided by the data processing module and some basic
parameters relating to the project supplied by the user. Direct communication
between the Expert system and the different modules is through a server
program which uses TCP/IP sockets and a command language conforming to
XML standards. The server program has been developed to provide an
extendable interface to a “next-generation” GUI for Mosflm.
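
As an illustration of this kind of exchange, the sketch below builds an XML-formatted command, sends it over a TCP socket and parses it again on receipt. The actual DNA command language is not reproduced on this poster: the element and attribute names ("command", "param", "name"), the newline framing of messages and the function names are all assumptions.

```python
import socket
import xml.etree.ElementTree as ET


def build_command(name, **params):
    """Serialise a command and its parameters as one line of XML (bytes).

    Assumes parameter values contain no newlines, since a newline is used
    here to mark the end of a message.
    """
    root = ET.Element("command", {"name": name})
    for key, value in params.items():
        ET.SubElement(root, "param", {"name": key}).text = str(value)
    return ET.tostring(root, encoding="unicode").encode("utf-8") + b"\n"


def parse_command(raw):
    """Inverse of build_command: return (name, {parameter: text})."""
    root = ET.fromstring(raw)
    params = {p.get("name"): p.text for p in root.findall("param")}
    return root.get("name"), params


def send_command(host, port, name, **params):
    """Send one command to a server and return its one-line XML reply, parsed."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(build_command(name, **params))
        reply = sock.makefile("rb").readline()
    return ET.fromstring(reply)
```

A client might then call send_command("localhost", 9000, "characterize_crystal", phi_start=0, phi_end=90), where the host, port and command name are again placeholders. Using a text format over plain sockets is what allows each module to be written in a different language and run on a different machine.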


The modular nature of this system simplifies installation on different
beamlines, while the open source communications protocol will allow the
straightforward integration of other software into the system.

At present the system has been implemented as an additional button on the data
collection GUIs (PXGEN++ at SRS, ProDC at ESRF) which gives a
"characterize crystal" command.


The "characterize crystal" command issues instructions to:

- collect two images from a crystal (at 0º and 90º in phi)
- autoindex each image individually and also both together
- estimate the effective mosaicity
- integrate the first image to determine the effective resolution limit
- calculate a suitable data collection strategy to give maximum completeness
  for both unique and anomalous data.
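
The sequence above can be sketched as a single function that drives a beamline object and a data processing object. The two interfaces below are hypothetical stand-ins for the beamline control and data processing modules; none of the method names come from Mosflm, PXGEN++ or ProDC.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class Beamline(Protocol):
    """Hypothetical beamline control interface."""
    def collect_image(self, phi: float) -> str: ...


class Processor(Protocol):
    """Hypothetical data processing interface."""
    def autoindex(self, images: Sequence[str]) -> dict: ...
    def estimate_mosaicity(self, images: Sequence[str]) -> float: ...
    def integrate(self, image: str) -> float: ...  # returns resolution limit
    def strategy(self, indexing: dict, mosaicity: float) -> dict: ...


@dataclass
class Characterisation:
    indexing: list      # one result per image, then one for both together
    mosaicity: float
    resolution: float
    strategy: dict


def characterize_crystal(beamline, processor, phis=(0.0, 90.0)):
    """Run the five characterisation steps in the order listed."""
    # 1. Collect two images 90 degrees apart in phi.
    images = [beamline.collect_image(phi) for phi in phis]
    # 2. Autoindex each image individually and also both together.
    indexing = [processor.autoindex([image]) for image in images]
    indexing.append(processor.autoindex(images))
    # 3. Estimate the effective mosaicity.
    mosaicity = processor.estimate_mosaicity(images)
    # 4. Integrate the first image to find the effective resolution limit.
    resolution = processor.integrate(images[0])
    # 5. Calculate a strategy from the combined indexing solution.
    strategy = processor.strategy(indexing[-1], mosaicity)
    return Characterisation(indexing, mosaicity, resolution, strategy)
```

Passing the two modules in as arguments mirrors the modular design of the system: the same sequence can be run against a different beamline by supplying a different beamline object.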

The success or failure of the autoindexing of one or more test images is used as
the initial indicator of crystal quality. This will be judged by the rms error in
predicted spot positions and the fraction of spots that are rejected from
indexing or refinement. If the autoindexing is successful, the crystal mosaicity
will be estimated and the test images will be integrated to obtain an indication
of data quality and the effective resolution, deduced from the <I/σ(I)> values as
a function of resolution. A data collection strategy based on the deduced Laue
group (with the lowest possible symmetry) will also be calculated.
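
The two quality tests described here can be sketched as follows. The numerical thresholds (rms error, rejected fraction and the <I/σ(I)> cut-off) are placeholder values chosen for the example, not the values used by Mosflm or the Expert System, and the function names are hypothetical.

```python
def indexing_acceptable(rms_error, n_rejected, n_spots,
                        max_rms=0.2, max_rejected_fraction=0.3):
    """Judge autoindexing by rms positional error and fraction of rejected spots.

    rms_error is assumed to be in mm; both limits are illustrative defaults.
    """
    if n_spots == 0:
        return False
    rejected_fraction = n_rejected / n_spots
    return rms_error <= max_rms and rejected_fraction <= max_rejected_fraction


def effective_resolution(shells, min_i_over_sigma=2.0):
    """Estimate the effective resolution limit from <I/sigma(I)> per shell.

    shells is an iterable of (d_min in Angstrom, mean I/sigma(I)) pairs in any
    order. Working from low to high resolution, the limit is the d_min of the
    last shell before the signal first drops below the cut-off. Returns None
    if even the lowest-resolution shell is below the cut-off.
    """
    limit = None
    for d_min, i_over_sigma in sorted(shells, reverse=True):
        if i_over_sigma < min_i_over_sigma:
            break
        limit = d_min
    return limit
```

With shells of 4.0, 3.0, 2.5, 2.2 and 2.0 Angstrom carrying mean <I/σ(I)> of 25, 12, 5.1, 2.3 and 1.1, this sketch reports an effective limit of 2.2 Angstrom under the assumed cut-off of 2.0.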


The Expert System uses this information to determine if the crystal is really
suitable for data collection (for example, if the required resolution can be
achieved) and to determine data collection parameters.

[Diagram: Expert System, Data processing module, Beamline control module, Sample control module, Disk storage, database]

[Screenshots: PXGEN++ (left) and ProDC (above) GUIs before and after characterizing crystal]