Non-target screening of veterinary drugs using tandem mass spectrometry on SmartMass


Nov 7, 2013 (3 years and 10 months ago)



Supporting Information for

Non-target screening of veterinary drugs using tandem mass spectrometry on SmartMass

Bing Xia¹, Xin Liu², Yu-Cheng Gu³, Zhao-Hui Zhang², Hai-Yan Wang¹, Li-Sheng Ding¹, Yan Zhou¹*

¹ Key Laboratory of Mountain Ecological Restoration and Bioresource Utilization, Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, P.R. China

² Beijing Entry-Exit Inspection and Quarantine Bureau, Beijing 100026, P.R. China

³ Syngenta, Jealott's Hill International Research Centre, Bracknell, Berkshire RG42 6EY, UK


* Corresponding Author:

Prof. Yan Zhou
Chengdu Institute of Biology, Chinese Academy of Sciences
Chengdu 610041, People’s Republic of China
Tel. +86-28-82890800
Fax +86-28-85222753
zhouyan@cib.ac.cn




S1 Selection of libraries and components for SmartMass implementation

SmartMass was developed on top of many open-source or free software programs. Figure S1 gives an overview of SmartMass’s architecture and workflow. Tandem mass spectrometric data in plain text format (in which the m/z values and intensities are separated by a tab or space) are pre-processed, that is, denoised, sorted and converted into an m/z-intensity list. This list, along with the precursor m/z and the exact mass of the adduct ion, is then input into SmartMass for analysis.
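The pre-processing described above can be illustrated with a minimal Python sketch. It assumes one peak per line, with m/z and intensity separated by a tab or spaces. The function name parse_peak_list and the relative-intensity cutoff used for denoising are hypothetical; the text does not specify how SmartMass denoises spectra.

```python
def parse_peak_list(text, min_rel_intensity=0.01):
    """Parse 'm/z<tab or space>intensity' lines into a sorted, denoised list.

    min_rel_intensity is an assumed noise cutoff, expressed as a fraction
    of the base peak; it is not SmartMass's documented criterion.
    """
    peaks = []
    for line in text.splitlines():
        fields = line.split()  # splits on tabs and on runs of spaces
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        try:
            mz, intensity = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # skip header or comment lines
        peaks.append((mz, intensity))
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return []
    kept = [(mz, i) for mz, i in peaks if i / base >= min_rel_intensity]
    return sorted(kept)  # ascending m/z


# Illustrative input mixing tab- and space-separated lines.
raw = "156.0114\t1000\n92.0495 350\n108.0444\t420\n200.5 2\n"
peaks = parse_peak_list(raw)
```

Here the peak at m/z 200.5 falls below the assumed 1 % cutoff and is dropped, and the remaining peaks are returned in ascending m/z order.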

The analysis process consists of six steps:

1. Searching the fragmentation pattern database to determine the analysis rules whose characteristic patterns are matched by the MS/MS spectrum of the unknown compound.

2. Applying the analysis rules to the MS/MS data to obtain the unknown peripheral groups, followed by grafting these groups onto the structural skeleton, or directly returning the structural skeleton as a substructure to be used as a filter condition.

3. Scanning the spectra to obtain the peripheral groups and/or substructures the compound may have when no rules are found.

4. Searching for and downloading candidate structures from a chemical database such as ChemSpider or PubChem, using the results from step 2 and/or 3.

5. Screening the downloaded candidate structures by structural type (substructure) and other available information.

6. Further user-defined post-processing of the above results.

Python [1], with its clear syntax and extensive standard and third-party libraries, was used as the main programming language. With rich third-party science-related libraries such as Pybel [2] (cheminformatics and bioinformatics related, generally provided by Linux distributions under the package name python-openbabel), Indigo [3] (cheminformatics related) and RDKit [4] (cheminformatics related), in addition to scientific computing libraries such as SciPy [5], Python helps scientific users to develop applications for their research rapidly and efficiently.


In SmartMass, we used Indigo to provide cheminformatics-related functionalities such as MDL/MOL format parsing, molecular weight and formula computations, automatic layout of 2D coordinates and structure rendering. The cheminformatics library RDKit was used to perform substructure matching: the RDKit-based matching function parses each candidate structure, looks for a given substructure and returns True if that substructure is found. Substructure matching is important for eliminating false candidates on the basis of characteristic substructures.

The fragmentation pattern and chemical data handling step was implemented using the lightweight and powerful open-source relational database engine SQLite [6], and we used SQLObject [7] as an object relational manager (ORM) to provide an object interface in Python to our database. Because SmartMass is an application with a GUI (graphical user interface) that aims to run on the major operating systems, and Python itself is capable of running on nearly every platform, the UI framework was considered carefully. In our case, PySide [8] was chosen for all of the GUI programming. PySide provides LGPL-licensed Python bindings for the Qt [9] cross-platform application and UI framework, which allows for both open-source and proprietary software development. PySide ultimately aims to support all platforms that Qt itself supports. Another reason for choosing PySide was the strong support from its parent company (Nokia Corporation) and the related open-source community. Additionally, the user-friendly development environment enabled us to create PySide/Qt-based applications easily and swiftly. The confirmation modules for the results were built with PycURL [10], a Python interface to libcurl, the highly portable client-side URL transfer library that supports many protocols, including HTTP and HTTPS, and that is used by many large companies and in numerous applications. We used PycURL to support the login, session management, query-posting and result-retrieval tasks for some of the web-based chemical database services. We also used the built-in lightweight Python modules urllib and/or urllib2 in circumstances where support for multiple protocols was not necessary.
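As a rough illustration of the database layer, the sketch below stores and retrieves a piece of rule text with Python's built-in sqlite3 module. It deliberately uses plain sqlite3 rather than SQLObject. Section S2 names a table called pattern, but the column layout (name, code), the helper functions and the rule text shown here are assumptions made for the example, not SmartMass's actual schema.

```python
import sqlite3

# In-memory database for the example; SmartMass uses a database file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pattern (id INTEGER PRIMARY KEY, name TEXT UNIQUE, code TEXT)"
)


def save_rule(conn, name, code):
    """Insert or overwrite a rule's source text (hypothetical helper)."""
    conn.execute(
        "INSERT OR REPLACE INTO pattern (name, code) VALUES (?, ?)",
        (name, code),
    )
    conn.commit()


def load_rule(conn, name):
    """Return the stored source text for a rule, or None if absent."""
    row = conn.execute(
        "SELECT code FROM pattern WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


# Illustrative rule text only; the m/z values are placeholders.
save_rule(conn, "sulfonamide", "MS.exists(156.0114, 108.0444)")
stored = load_rule(conn, "sulfonamide")
```

An ORM such as SQLObject would wrap the same table in a Python class, so that rules are read and written as object attributes instead of through SQL strings.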

References

1. Python Software Foundation. Python Programming Language - Official Website, version 2.7.2. http://www.python.org/. 2012.

2. O'Boyle, N. M.; Morley, C.; Hutchison, G. R. Pybel: a Python wrapper for the OpenBabel cheminformatics toolkit. Chem. Cent. J. 2, 5-11 (2008).

3. GGA Software Services LLC. Indigo - GGA Software Services, version 1.0.0. http://ggasoftware.com/opensource/indigo. 2012.

4. RDKit. www.rdkit.org. 2012.

5. SciPy: Open source scientific tools for Python. http://www.scipy.org/. 2012.

6. SQLite, version 3.7.9. http://www.sqlite.org/. 2012.

7. SQLObject, version 1.2.1. http://sqlobject.org. 2012.

8. PySide project. PySide, version 1.0.9. http://www.pyside.org. 2011.

9. Nokia Corporation. Qt, version 4.7.4. http://qt.nokia.com. 2011.

10. PycURL, version 7.19.0. http://pycurl.sourceforge.net/. 2008.


S2 Derivation of analysis rules

Executing analysis rules on the input MS/MS data is the core process of our system. An analysis rule is a piece of specialized Python code that is stored in a table named pattern in the SQLite database file. An analysis rule is used to analyze the MS/MS spectra of chemicals belonging to a class that shares the same CSS, with or without several marked sites that may differ between members of the class. [These sites are named alphabetically and automatically, Ja to Jz, by the rule editor of SmartMass when uncertain atoms (i.e., *-marked atoms) are present in an input structure.] SmartMass already contains certain predefined classes and methods that allow users to construct their own analysis rules, such as the classes MS and Generator and the function Exact_mass(fm, chg). Class MS (which represents the current mass spectrum) provides several methods, such as exists(*peaks), which returns True if all of the given peaks are present in the spectrum; inten(peak), which returns the relative intensity of a given peak; and exist_nls(*loss), which returns True if the neutral losses are present in the spectrum. Class Generator provides functions for the generation of controllable groups or substructures, which allows us to generate various peripheral structures within the analysis rule. For example, Generator.chain(n, element) returns an n-membered chain made of the element skeleton. Function Exact_mass(formula, charge) accepts a given formula and charge and returns its exact mass. We also use some predefined variables; for example, MA (M plus adduct) represents the m/z of the precursor ion (adduct ion mass is included), and ME is the mass of the electron, whereas AI (adduct ion) represents the m/z of the adduct ion. All of the predefined classes, functions and variables are freely usable anywhere in the analysis rules, allowing users to write their own analysis rules conveniently.
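To make the behaviour of these helpers concrete, the following Python sketch mimics exists, inten, exist_nls and Exact_mass. It is not SmartMass's code. The class name Spectrum, the absolute m/z tolerance, the 0-100 intensity scale, the reading of neutral losses as losses from the precursor ion, and the convention that the charge is applied by subtracting electron masses are all assumptions. The peak list is illustrative.

```python
import re

# Monoisotopic masses (u), limited to the elements needed here.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
    "Cl": 34.9688527,
}
ME = 0.00054858  # mass of the electron (u)


def exact_mass(formula, charge=0):
    """Monoisotopic mass of `formula`, minus `charge` electron masses.

    Stand-in for Exact_mass(formula, charge); the real conventions of
    that function are not given in the text.
    """
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total - charge * ME


class Spectrum:
    """Toy counterpart of the MS class; tolerance handling is assumed."""

    def __init__(self, peaks, precursor_mz, tol=0.005):
        self.peaks = sorted(peaks)  # list of (m/z, intensity) pairs
        self.precursor_mz = precursor_mz
        self.tol = tol  # absolute m/z tolerance (assumed)
        self.base = max(i for _, i in self.peaks)

    def _find(self, mz):
        hits = [p for p in self.peaks if abs(p[0] - mz) <= self.tol]
        return max(hits, key=lambda p: p[1]) if hits else None

    def exists(self, *peaks):
        """True if every given m/z is present in the spectrum."""
        return all(self._find(mz) is not None for mz in peaks)

    def inten(self, peak):
        """Relative intensity (0-100) of a peak, or 0.0 if absent."""
        hit = self._find(peak)
        return 100.0 * hit[1] / self.base if hit else 0.0

    def exist_nls(self, *losses):
        """True if every neutral loss from the precursor is observed."""
        return all(
            self._find(self.precursor_mz - nl) is not None for nl in losses
        )


ms = Spectrum(
    [(92.0495, 35.0), (108.0444, 42.0), (156.0114, 100.0), (251.0597, 60.0)],
    precursor_mz=251.0597,
)
mh = exact_mass("C10H11N4O2S", 1)  # protonated sulfadiazine
```

Under these assumptions, exact_mass("C10H11N4O2S", 1) evaluates to about 251.0597, in agreement with the accurate mass listed for sulfadiazine ([M+H]+) in Table S1.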

Uncertain variables p1 to pn (or P1 to Pn), where n is a positive integer and the variables are named sequentially, are used to hold the m/z values of peaks in the MS/MS spectrum. During the analysis, the values of p1 to pn are assigned in turn from each n-element subset of the peaks of the MS/MS spectrum, which amounts to an iteration over the spectrum.
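The iteration over n-element subsets can be sketched with itertools.combinations. This is only one possible reading: the function name iter_assignments is hypothetical, and the sketch assumes that peaks within a subset are assigned to p1..pn in ascending m/z order. If the order of assignment matters in SmartMass, permutations would be needed instead.

```python
from itertools import combinations


def iter_assignments(peak_mzs, n):
    """Yield dicts mapping p1..pn to each n-element subset of the peaks."""
    for subset in combinations(sorted(peak_mzs), n):
        yield {"p%d" % (k + 1): mz for k, mz in enumerate(subset)}


# Illustrative peak list; with n = 2 there are three possible subsets.
assignments = list(iter_assignments([92.0495, 108.0444, 156.0114], 2))
```

A rule that uses p1 and p2 would then be evaluated once per assignment, so a spectrum with m peaks gives rise to C(m, n) evaluations.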

An analysis rule consists of three parts: predefines, characteristic fragmentation pattern, and analysis and returning code. In predefines, users can write any legitimate Python code that can be used in any part of the analysis rule. The characteristic fragmentation pattern is, in essence, a boolean expression, and the analysis rule is not applied to the mass spectrum unless the characteristic pattern returns True. The analysis and returning code calculates the exact mass of the peripheral groups and/or dynamically generates structures of some of the groups and then returns them. SmartMass processes the returned results and queries the periph_group table of the SQLite database file by exact mass to determine possible peripheral groups. Then, the groups are linked to the corresponding sites of the skeleton to represent the candidate compounds. The output structures from the linking step are cleaned up using Indigo’s functions to provide a better appearance, and they are finally displayed in a table view.
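The three-part structure and the subsequent database lookup can be sketched as follows. This is a simplified illustration, not SmartMass's rule engine: the dictionary layout of the rule, the helper has_peak, the functions run_rule and lookup_groups, the columns of periph_group, the mass tolerance and all numeric values are assumptions chosen for the example. Only the names MA, Ja and periph_group come from the text.

```python
import sqlite3

# A hypothetical rule split into the three parts described in the text.
rule = {
    "predefines": "CORE = 156.0114  # placeholder m/z of a shared fragment",
    "pattern": "has_peak(CORE) and has_peak(108.0444)",
    "analysis": "result = {'Ja': round(MA - CORE, 4)}",
}


def run_rule(rule, peak_mzs, precursor_mz, tol=0.005):
    """Return the rule's result dict, or None if the pattern is not matched."""
    def has_peak(mz):
        return any(abs(p - mz) <= tol for p in peak_mzs)

    env = {"has_peak": has_peak, "MA": precursor_mz}
    exec(rule["predefines"], env)       # part 1: shared definitions
    if not eval(rule["pattern"], env):  # part 2: boolean gate
        return None
    exec(rule["analysis"], env)         # part 3: compute the result
    return env["result"]


def lookup_groups(conn, mass, tol=0.005):
    """Names of peripheral groups whose exact mass is within tol of mass."""
    rows = conn.execute(
        "SELECT name FROM periph_group "
        "WHERE exact_mass BETWEEN ? AND ? ORDER BY name",
        (mass - tol, mass + tol),
    )
    return [name for (name,) in rows]


# Assumed two-column layout with placeholder entries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE periph_group (name TEXT, exact_mass REAL)")
conn.executemany(
    "INSERT INTO periph_group VALUES (?, ?)",
    [("group_a", 95.0483), ("group_b", 109.0640)],
)

result = run_rule(rule, [92.0495, 108.0444, 156.0114], precursor_mz=251.0597)
groups = lookup_groups(conn, result["Ja"])
```

Because exec and eval run arbitrary Python, a design along these lines is only appropriate when the rules stored in the database come from trusted users.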



Table S1. Chemical information of 17 sulfonamides

(The chemical structure drawings in the original table are not reproduced.)

Peak No.  Compound                       CAS No.     Molecular formula  Accurate Mass ([M+H]+)
1         Sulfaguanidine (SGN)           57-67-0     C7H10N4O2S         215.0597
2         Sulfanilylacetamide (SAM)      144-80-9    C8H10N2O3S         215.0485
3         Sulfadiazine (SDZ)             68-35-9     C10H10N4O2S        251.0597
4         Sulfapyridine (SP)             144-83-2    C11H11N3O2S        250.0645
5         Sulfamerazine (SMR)            127-79-7    C11H12N4O2S        265.0754
6         Sulfamoxole (SML)              729-99-7    C11H13N3O3S        268.0750
7         Sulfamethazine (SMZ)           57-68-1     C12H14N4O2S        279.0910
8         Sulfamethoxypyridazine (SMP)   80-35-3     C11H12N4O3S        281.0703
9         Sulfamethizole (SMT)           144-82-1    C9H10N4O2S2        271.0318
10        Sulfachloropyridazine (SCP)    80-32-0     C10H9ClN4O2S       285.0208
11        Sulfadoxine (SD)               2447-57-6   C12H14N4O4S        311.0808
12        Sulfamonomethoxine (SMM)       1220-83-3   C11H12N4O3S        281.0703
13        Sulfisoxazole (SSX)            127-69-5    C11H13N3O3S        268.0750
14        Sulfabenzamine (SBZ)           127-71-9    C13H12N2O3S        277.0641
15        Sulfaphenazole (SPZ)           526-08-9    C15H14N4O2S        315.0910
16        Sulfadimethoxine (DSM)         122-11-2    C12H14N4O4S        311.0808
17        Sulfaquinoxaline (SQX)         59-40-5     C14H12N4O2S        301.0754





Figure S1. Overview of SmartMass’s architecture and workflow.





Figure S2. XIC of a mixture of 17 sulfonamide standards: 1. SGN, 2. SAM, 3. SDZ, 4. SP, 5. SMR, 6. SML, 7. SMZ, 8. SMP, 9. SMT, 10. SCP, 11. SD, 12. SMM, 13. SSX, 14. SBZ, 15. SPZ, 16. DSM, and 17. SQX