Quantum Chemical Structure Activity Studies of Anticancer Activity of Seconucleoside Nitrosourea Analogs

Abstract
Nucleosides can be considered as specific carriers for the nitrosourea moiety, with altered pharmacological properties such as improved water solubility, lower myelotoxicity, and enhanced cell membrane transport. Several seconucleoside nitrosourea compounds have been reported to have anticancer activity against solid tumors such as murine adenocarcinomas of the colon 13 (MAC 13), MAC 15A, colon 38, and mammary carcinoma. To better understand the role of structural features in the anticancer activity of these seconucleosides, we have constructed a quantitative structure activity relationship (QSAR) involving 26 seconucleoside nitrosourea compounds with anticancer activity against murine adenocarcinoma of the colon 15A (MAC 15A). Traditional QSAR studies require knowledge of several physicochemical properties to obtain a good QSAR. To circumvent this problem, we used quantum chemical methods to calculate several electronic and molecular properties of these compounds and used these properties to obtain the best QSAR through statistical procedures. The semi-empirical quantum chemical RM1 method was used to optimize the molecular structures. From the optimized structures, hundreds of molecular and electronic properties were calculated and correlated with the anticancer activities of the compounds. The best correlation was obtained using heuristic and multi-linear regression methods, giving a QSAR with correlation coefficient $R^2 = 0.8993$. We find that the average structural information content of the second order, the average bond order of an oxygen atom, and the maximum coulombic interaction for the H-N bond play significant roles in influencing the anticancer activity of the selected compounds. The QSAR obtained in our study can be used to predict the anticancer activity of new seconucleoside nitrosourea analogs before resorting to experimental studies.

Introduction
Diverse names have been proclaimed in the design of efficient nitrosourea compounds capable of anticancer activity. Nucleosides can be considered as specific carriers for the nitrosourea moiety, with altered pharmacological properties such as increased water solubility, lower myelotoxicity, and increased cell membrane transport. In addition, these nucleoside analogs can act as prodrugs of the nucleoside that permit the slow release of the active drugs in cancer cells. They can also be considered as irreversible inhibitors of nucleotide-metabolizing enzymes.
Quantitative structure activity relationship (QSAR) analysis, first developed by Hansch [1] 40 years ago, is one of the methods commonly used to correlate molecular structure with in-vitro or in-vivo biological properties. Such studies have become important for understanding the structure-activity relationships of bioactive compounds, guiding the discovery of new drugs and the optimization of drug activity [2]. QSAR has two goals. The first is to recognize the relation between structure and activity and to identify which chemical properties significantly influence the biological activity of a compound. The second is to forecast biological activity for novel and sometimes not yet available compounds. For these reasons, QSAR has become an indispensable tool in the pharmaceutical industry [3]. New drugs require strict approval to assure their efficacy and safety. Properties such as low toxicity and an appropriate absorption, distribution, metabolism, and excretion (ADME) profile need to be fulfilled. Many new drugs fail in the final phase of the development process because of poor ADME properties. Selecting drugs that fulfill all the required characteristics is an expensive and time-consuming procedure. QSAR provides an avenue to discover those properties even in the early phase of drug development. On the basis of the developed QSAR model, drugs that are deficient in the required chemical and physical properties can be removed from further development, while the most suitable compounds satisfying all the required criteria can be carried forward for further evaluation [4]. In addition to saving time and cost in chemical syntheses and speeding up the decision-making process, QSAR models can drastically diminish the use of animals for experimental purposes and help fulfill the requirements of international associations to avoid animal testing [2].

Chemical informatics ISSN 2470-6973. This article is available in: www.cheminformatics.imedpub.com/archive.php

Materials and Methods
A total of 26 chloroethyl nitrosourea (CENU) analogs of seconucleosides with known anticancer activity against the mouse adenocarcinoma of the colon 15A (MAC 15A) were selected for our analysis.
The biological activities of the CENU analogs of seconucleosides are expressed as the percent increase in life span (%ILS), which is calculated from the median survival times T and C, where T is that of the treated animals and C is that of the control (untreated) animals. The higher the T/C or %ILS value, the better the drug inhibits the growth of cancer cells. A %ILS value of 125% is the minimum value necessary for statistically significant antitumor activity [8].
A set of experimental data in numerical form and a set of corresponding structures were prepared in a computer-readable format. A specialized molecular editor, ACD/ChemSketch version 10.0 [9], was used to convert each drawing into the corresponding connectivity table.

3D geometry optimization
Molecular shape and conformation are the most significant factors in describing and predicting biological activities and molecular properties. The CAChe WorkSystem Pro program [10] was used to convert the 2-D connectivity table into a 3-D image of each molecule with simultaneous geometry optimization. This program combines a molecular editor with a geometry optimization procedure that is usually based on molecular mechanics. The optimized 3-D structures were then subjected to self-consistent field (SCF) calculations in MOPAC 6.0 [11] using the RM1 semi-empirical quantum chemical method [12]. The optimized geometries were verified against the available structural information.

Calculation of molecular descriptor
The Comprehensive Descriptors for Structural and Statistical Analysis (CODESSA) program [13][14][15] was used to calculate the molecular descriptors. Four types of descriptors were calculated: constitutional, topological, electrostatic, and quantum chemical. The set of molecular descriptors was then treated together with the experimental data by best multilinear regression analysis. The square of the correlation coefficient, $R^2$, the square of the leave-one-out cross-validated correlation coefficient, $R^2_{cv}$, and the F-test were used to determine the best correlation model. The square of the correlation coefficient $R^2$ is calculated using Eq. 2.
In the above equation, the symbols have the following meaning:
n = total number of samples in the data set
$y_{ei}$ = experimental property of sample i
$y_{ci}$ = calculated property of sample i
$y_a$ = average experimental property of all the samples
Due to the limited number of samples available, the data set was not divided into a training set and a test set. Instead, the "leave one out" method was used to measure the prediction accuracy of the model through $R^2_{cv}$, where $y_{pi}$ represents the predicted property of sample i after this sample is left out of the model. Agreement between $R^2$ and $R^2_{cv}$ is taken as a measure of the predictive ability of the model. Several strategies are available in the framework of CODESSA to develop QSAR equations with the maximum predictive power. The search for the multi-parameter regression with the maximum predictive ability is performed using the following strategy.

1. Among all the possible descriptors, only the orthogonal pairs of descriptors i and j (with $R^2_{ij} < R^2_{min} = 0.1$) are used for further analysis.
2. Next, a two-parameter regression is carried out with each pair of descriptors. All pairs with a regression correlation coefficient greater than $R^2_{nc} = 0.65$ are chosen for the higher-order regression treatments.
3. For each of the above descriptor pairs (i, j), a third descriptor k (with $R^2_{ik} < R^2_{nc}$ and $R^2_{kj} < R^2_{nc}$) is added, and a new three-parameter regression treatment is performed. If the Fisher criterion F at a given probability level (P<0.05) is higher than that for the two-parameter correlation, the new three-parameter regression is carried over to the next step. The descriptor triples with the highest regression correlation coefficients are chosen for the next step.
4. For each descriptor set from the previous step, an additional non-collinear descriptor is added by repeating the above step, until the maximum value of the Fisher criterion and the highest cross-validated correlation coefficient are obtained.
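The four steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the CODESSA implementation: the function names, the `keep` and `max_size` parameters, and the use of NumPy least squares are hypothetical choices, the P<0.05 significance check is omitted, and growth stops on the Fisher criterion alone rather than also on the cross-validated coefficient.

```python
import itertools

import numpy as np


def fit_r2(X, y):
    """Least-squares fit with intercept; returns R^2 = 1 - SS_res / SS_tot."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def f_statistic(r2, n, p):
    """Fisher criterion of a p-descriptor regression on n samples."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


def stepwise_search(D, y, r2_min=0.1, r2_nc=0.65, keep=10, max_size=4):
    """D: (n samples x m descriptors) array, y: activities.

    Returns (R^2, descriptor indices) of the best set found, or None.
    `keep` and `max_size` are illustrative limits, not values from the text.
    """
    n, m = D.shape
    inter = np.corrcoef(D, rowvar=False) ** 2  # descriptor intercorrelations

    # Step 1: keep only orthogonal descriptor pairs.
    pairs = [(i, j) for i, j in itertools.combinations(range(m), 2)
             if inter[i, j] < r2_min]

    # Step 2: two-parameter regressions; keep pairs above the threshold.
    current = []
    for s in pairs:
        r2 = fit_r2(D[:, list(s)], y)
        if r2 > r2_nc:
            current.append((r2, s))
    best = max(current, default=None)

    # Steps 3-4: add one non-collinear descriptor at a time while the
    # Fisher criterion keeps increasing.
    size = 2
    while current and size < max_size:
        grown = {}
        for r2_old, s in current:
            f_old = f_statistic(r2_old, n, len(s))
            for k in range(m):
                if k in s or any(inter[k, i] >= r2_nc for i in s):
                    continue
                t = tuple(sorted(s + (k,)))
                r2_new = fit_r2(D[:, list(t)], y)
                if f_statistic(r2_new, n, len(t)) > f_old:
                    grown[t] = r2_new
        current = sorted(((r2, t) for t, r2 in grown.items()),
                         reverse=True)[:keep]
        if current:
            best = max(best, current[0])
        size += 1
    return best
```

Calling `stepwise_search(D, y)` with a descriptor matrix and the measured activities returns the selected descriptor indices together with the fitted $R^2$. A production search would also track $R^2_{cv}$ for each candidate set.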
In using this approach, it is crucial to decide when to stop adding descriptors to the QSAR equation. We have adopted the 'breaking point' technique [16], which helps to control the model expansion and in turn improves the statistical quality of the model. Adding any independent variable to the QSAR model can improve the $R^2$ value of the resulting regression. However, if the new descriptor does not significantly improve the $R^2$ value, it does not contribute any new information to the model. If the increase in $R^2$ on adding the new descriptor is less than 0.02, the QSAR model without this descriptor is considered the best model.
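The fit statistics and the stopping rule can be illustrated with a short Python sketch. It assumes the common definitions $R^2 = 1 - \sum_i (y_{ei} - y_{ci})^2 / \sum_i (y_{ei} - y_a)^2$ and $R^2_{cv} = 1 - \sum_i (y_{ei} - y_{pi})^2 / \sum_i (y_{ei} - y_a)^2$, which may differ in form from the equations used in the original analysis. The function names and the `min_gain` parameter are hypothetical.

```python
import numpy as np


def r_squared(y_exp, y_calc):
    """1 - SS_res / SS_tot; gives R^2 for fitted values, R^2_cv for LOO predictions."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    ss_res = np.sum((y_exp - y_calc) ** 2)
    ss_tot = np.sum((y_exp - y_exp.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def loo_predictions(X, y):
    """Predict each sample from a linear model fitted without that sample."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # intercept + descriptors
    preds = np.empty(len(y))
    for i in range(len(y)):
        mask = np.arange(len(y)) != i          # leave sample i out
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        preds[i] = A[i] @ coef
    return preds


def breaking_point(r2_by_size, min_gain=0.02):
    """Pick the model size after which adding a descriptor gains < min_gain in R^2.

    r2_by_size maps the number of descriptors to the best R^2 at that size.
    """
    sizes = sorted(r2_by_size)
    chosen = sizes[0]
    for prev, nxt in zip(sizes, sizes[1:]):
        if r2_by_size[nxt] - r2_by_size[prev] < min_gain:
            break
        chosen = nxt
    return chosen
```

Here `r_squared(y, loo_predictions(X, y))` gives the cross-validated coefficient for a descriptor matrix `X`, and `breaking_point` takes the best $R^2$ found at each model size and returns the size at which expansion should stop.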

Results and Discussion
The best QSAR for seconucleoside nitrosoureas and the fitted parameters with their statistical errors are given in Table 1. The average information content of order k, $^{k}IC$ (Bonchev, 1983), is defined as:
$$^{k}IC = -\sum_{i} \frac{n_i}{n} \log_2 \frac{n_i}{n}$$
where $n_i$ is the number of atoms in the i-th class and $n$ is the total number of atoms in the molecule. The average structural information content of second order ($^{2}SIC$) is defined as:
$$^{2}SIC = \frac{^{2}IC}{\log_2 n}$$
The empirical partial charge characteristics in the molecule are calculated using the method proposed in refs. [17][18].
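As a small worked illustration, the two information-content indices can be computed from a list of atom class labels. The sketch assumes the standard Shannon-entropy form of $^{k}IC$ and its normalization by $\log_2 n$ for $^{k}SIC$. The function names are hypothetical, and the assignment of atoms to k-th order equivalence classes (done by the descriptor software) is taken as given input.

```python
import math
from collections import Counter


def average_information_content(class_labels):
    """IC = -sum_i (n_i / n) * log2(n_i / n) over atom equivalence classes."""
    n = len(class_labels)
    counts = Counter(class_labels).values()
    return -sum((c / n) * math.log2(c / n) for c in counts)


def structural_information_content(class_labels):
    """SIC = IC / log2(n); requires more than one atom so that log2(n) > 0."""
    n = len(class_labels)
    return average_information_content(class_labels) / math.log2(n)
```

For four atoms split evenly into two classes, for example, the sketch gives an information content of 1 bit and a structural information content of 0.5. A molecule whose atoms all fall into distinct classes reaches the maximum structural information content of 1.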
The data set for anticancer activity contained structures belonging to nucleoside compounds. Figure 2 illustrates the results obtained from the QSAR calculation for the 26 anticancer seconucleoside compounds, which are analogs of 2-chloroethyl nitrosourea (CENU).
The regression presented in Table 1, with four parameters, has a good squared correlation coefficient, $R^2 = 0.8796$, and squared cross-validated correlation coefficient, $R^2_{cv} = 0.8521$. The agreement between $R^2$ and $R^2_{cv}$ indicates the predictive accuracy of the developed QSAR equation. One of the most important descriptors in the correlation equation in Table 1 is the average structural information content of the second order, which reflects the number of different structural fragments in the compounds and may therefore be related to the biological activities. The minimum net atomic charge for an N atom describes the availability of a nitrogen lone electron pair for intermolecular hydrogen bonding. The remaining two descriptors, the minimum coulombic interaction for the N-H bond and the average bond order of an oxygen atom, can be related to intramolecular interactions that influence the biological activity through conformational stability. The application of QSAR in combination with the calculation of quantum chemical descriptors clearly shows the role of these molecular interactions in influencing the biological activity of seconucleosides against the mouse adenocarcinoma of the colon 15A (MAC 15A). Large values of $^{2}SIC$ and $q_N$ will lead to a decrease in the anticancer activity, whereas larger values of $q_{NH}$ and $b_{OX}$ will lead to an increase in the activity of these molecules, due to their negative contributions. Hence, in designing more active molecules, care should be taken to increase the interactions involving -NH and -O groups. Any factor that increases the number of fragments or the negative charge on the nitrogen atom should be avoided, as these changes would decrease the anticancer activity of the compounds.
Hence, in addition to contributing to the understanding of the role of these interactions in influencing the biological activity, this approach can be used to predict the anticancer activity of new seconucleoside nitrosourea analogs, which will greatly reduce the experimental effort required to synthesize drug candidates.
Figure 2: Comparison of calculated versus experimental activity (%ILS) using Equation (4).