## QSAR and Drug Design Introduction

Medicinal chemists have tried to quantify relationships between chemical structure and biological activity since before the turn of the twentieth century. However, it was not until the early 1960s, through the combined effect of Corwin Hansch and his computer, that a workable methodology emerged, known as the Quantitative Structure-Activity Relationship (QSAR).


In 1868 Crum-Brown and Fraser published an equation which is considered to be the first general formulation of a QSAR. In their investigation of different alkaloids they recognized that alkylation of the basic nitrogen atom produced quaternary ammonium compounds whose biological effects differed from those of the parent basic amines. They therefore assumed that biological activity (BA) must be a function of chemical structure (C):

$$\mathrm{BA} = f(C) \qquad (1)$$

Richet discovered that the toxicity of organic compounds is inversely related to their water solubility. Such a relationship shows that a change in biological activity (ΔBA) corresponds to a change in chemical and physicochemical properties (ΔC):

$$\Delta \mathrm{BA} = f(\Delta C) \qquad (2)$$

All QSAR equations correspond to equation 2, because only differences in biological activity are quantitatively correlated with changes in lipophilicity and/or other physicochemical properties of the compounds under investigation.

Studies by Meyer and Overton suggested a linear relationship between lipophilicity and narcotic activity. Fühner realized that within a homologous series narcotic activity increases in a geometric progression, which gave the first evidence that group contributions to biological activity values are additive. Ferguson gave a thermodynamic interpretation of non-linear structure-activity relationships, which also explains the “cut-off” of biological activity beyond a certain range of lipophilicity.

Little additional development of QSAR occurred until the work of Louis Hammett (1894-1987), who correlated the electronic properties of organic acids and bases with their equilibrium constants and reactivity. QSARs are now developed using a variety of parameters as descriptors of the structural properties of molecules. Hammett sigma values are often used as electronic parameters, but quantum-mechanically derived electronic parameters may also be used.

Other descriptors that account for shape, size, lipophilicity, polarizability and other structural properties have also been devised. Nearly 6000 biological and chemical data sets are available for QSAR studies.

QSAR involves the derivation of a mathematical formula that relates the biological activities of a group of compounds to their measurable physicochemical parameters. These parameters have a major influence on the drug’s activity. QSAR-derived equations take the general form:

Biological activity = function {parameters}

Here activity is expressed as log (1/C), where C is the minimum concentration required to cause a defined biological response. QSAR based on Hammett’s relationship uses electronic properties as the descriptors of structure. Difficulties were encountered when investigators attempted to apply Hammett-type relationships to biological systems, indicating that other structural descriptors were necessary.

Hansch recognized the importance of lipophilicity, expressed as the octanol-water partition coefficient, for biological activity. This parameter provides a measure of the bioavailability of a compound, which determines, in part, the amount of compound that reaches the target site.

All of this reveals that the biological activity of a drug is a function of the chemical features (i.e. lipophilic, electronic and steric) of the substituents and skeleton of the molecule. For example, lipophilicity is the main factor governing the transport, distribution and metabolism of a drug in a biological system.

Similarly, electronic and steric features influence the metabolism and pharmacodynamic processes of the drug. A major problem in QSAR studies arises because hydrophobic, electronic and steric effects overlap and cannot be neatly separated.

## QSAR and Drug Design Parameters

The QSAR approach uses parameters which have been assigned to the various chemical groups that can be used to modify the structure of a drug. Each parameter is a measure of the potential contribution of its group to a particular property of the parent drug. The various parameters used in QSAR studies are:

- Lipophilic parameters: Partition coefficient, Chromatographic parameters and π substituent constant.
- Polarizability parameters: Molar refractivity, Molar volume, Parachor.
- Electronic parameters: Hammett constant, Field and resonance Parameters derived from spectroscopic data, Charge-transfer constant, Dipole moment, Quantum-chemical parameters.
- Steric parameters: Taft’s steric constant, Van der Waals radii.
- Miscellaneous parameters: Molecular weight, Geometric parameters, Conformational entropies, Connectivity indices, other topological parameters.

**Lipophilic Parameters**

Lipophilicity is defined by the partitioning of a compound between an aqueous and a non-aqueous phase. Two parameters are commonly used to represent lipophilicity, namely the partition coefficient (P) and the lipophilic substituent constant (π). The former refers to the whole molecule, while the latter relates to substituent groups.

**Partition Coefficient**

A drug has to pass through a number of biological membranes in order to reach its site of action. Consequently, the organic/aqueous partition coefficient is the obvious parameter to use as a measure of the movement of a drug through these membranes. The partition coefficient is generally given as:

$$P = \frac{[C]_{org}}{[C]_{aq}} \qquad (3)$$

It is the ratio of the concentrations of a substance in the organic and aqueous phases of a two-compartment system under equilibrium conditions. For readily ionizable drugs, a correction must be made as follows:

$$P = \frac{[C]_{org}}{[C]_{aq}\,(1-\alpha)} \qquad (4)$$

where α = degree of ionization
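The ionization correction of equation 4 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the helper that estimates α for a monoprotic acid from pH and pKa (via the Henderson-Hasselbalch relationship) is an added assumption that does not come from the text above.

```python
def degree_of_ionization_acid(ph: float, pka: float) -> float:
    """Fraction ionized (alpha) of a monoprotic acid.

    Assumption: ideal Henderson-Hasselbalch behaviour.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def corrected_partition_coefficient(c_org: float, c_aq: float, alpha: float) -> float:
    """Equation 4: P = [C]org / ([C]aq * (1 - alpha))."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    return c_org / (c_aq * (1.0 - alpha))


# Hypothetical weak acid measured in buffer at pH 7.4 (values are made up).
alpha = degree_of_ionization_acid(ph=7.4, pka=4.5)
print(corrected_partition_coefficient(c_org=2.0, c_aq=1.0, alpha=alpha))
```

For a largely ionized compound the corrected P is much larger than the apparent concentration ratio, because only the unionized fraction is assumed to partition into the organic phase.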

The accuracy of the correlation of drug activity with partition coefficient will depend on the solvent system used as a model for the membrane. Both pure water and buffered solution are used for the aqueous medium. The n-octanol/water system is frequently chosen, because it appears to be a good mimic of lipid polarity; however more accurate results may be obtained if the organic phase is matched to the area of biological activity being studied.

For example, n-octanol usually gives more consistent results for drugs absorbed in the GI tract, while less polar solvents such as olive oil give more consistent correlations for drugs crossing the blood-brain barrier. More polar solvents such as chloroform give more consistent values for buccal absorption. The n-octanol/water system has several advantages over other systems:

- It is a suitable model of the lipid constituents of biological membrane due to its long alkyl chain and the polar hydroxyl group.
- n-octanol has a low vapor pressure, allowing reproducible measurements
- n-octanol is UV transparent over a large range, making the quantitative determination of a compound relatively easy.

The nature of the relationship between P and drug activity depends on the range of P values obtained in the compounds used. If this range is small the result may be expressed as a straight-line equation having the general form:

$$\log\frac{1}{C} = K_1 \log P + K_2 \qquad (5)$$

where K1 and K2 are constants.

This equation indicates a linear relationship between the activity of the drug and its partition coefficient. Over large ranges of P values the graph of log 1/C against log P often has a parabolic form with a maximum value (log P°). The existence of this maximum value implies that there is an optimum balance between aqueous and lipid solubility for maximum biological activity.

Below P° the drug will be reluctant to enter the membrane, while above P° the drug will be reluctant to leave it. This means that analogues with partition coefficients near this optimum value are likely to be the most active and worth further investigation. The parabolic relationship can be represented by an equation of the form:

$$\log\frac{1}{C} = -K_1 (\log P)^2 + K_2 \log P + K_3 \qquad (6)$$

where K1, K2 and K3 are constants that are normally determined by regression analysis.
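The optimum of the parabolic model follows from setting the derivative of equation 6 with respect to log P to zero, which gives log P° = K2 / (2 K1). The sketch below evaluates the parabola and its optimum; the function names and the constants in the example are hypothetical and chosen only for illustration.

```python
def parabolic_activity(log_p: float, k1: float, k2: float, k3: float) -> float:
    """Equation 6: log(1/C) = -K1*(log P)^2 + K2*log P + K3."""
    return -k1 * log_p ** 2 + k2 * log_p + k3


def optimum_log_p(k1: float, k2: float) -> float:
    """log P at which equation 6 reaches its maximum (requires K1 > 0)."""
    if k1 <= 0:
        raise ValueError("K1 must be positive for the parabola to have a maximum")
    return k2 / (2.0 * k1)


# Made-up regression constants, used only to show the shape of the curve.
K1, K2, K3 = 0.5, 2.0, 1.0
best = optimum_log_p(K1, K2)
print(best, parabolic_activity(best, K1, K2, K3))
```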

**Experimental Determination of Partition Coefficient**

**Shake Flask Method**

A weighed amount of the substance is shaken in a flask containing measured amounts of n-octanol and either water or buffer at pH 7.4. The amount of the substance in one or both phases is determined by an appropriate analytical technique and the partition coefficient is calculated. The method is tedious, time-consuming, messy and smelly, but it is the only method that can be used in the case of very low log P values.

**Chromatographic Method**

Compounds with known log P values are injected onto a C18 reverse-phase HPLC column to create a calibration curve. Unknown compounds are then injected to predict log P. The chromatographic method suffers the disadvantage that the retention time is linearly related to the partition coefficient itself rather than to its logarithm, i.e. for each unit increase in log P there is roughly a tenfold increase in retention. This often requires columns of different lengths to be used: short ones for high log P values and long ones for low values.

**Chromatographic Parameters**

When the solubility of a solute is considerably greater in one phase than in the other, the partition coefficient becomes difficult to determine experimentally. Chromatographic parameters obtained from reversed-phase thin-layer chromatography are occasionally used as substitutes for the partition coefficient. A silica gel plate coated with a hydrophobic phase is eluted with aqueous/organic solvent systems of increasing water content. The Rf values are converted into Rm values, which are the true measure of lipophilicity, using the following equation:

$$R_m = \log\left(\frac{1}{R_f} - 1\right) \qquad (7)$$
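As a quick illustration of equation 7, the conversion from Rf to Rm can be written as a small Python helper. The function name is hypothetical; the only assumption is that Rf lies strictly between 0 and 1, since the logarithm is undefined at the limits.

```python
import math


def rm_from_rf(rf: float) -> float:
    """Equation 7: Rm = log10(1/Rf - 1)."""
    if not 0.0 < rf < 1.0:
        raise ValueError("Rf must lie strictly between 0 and 1")
    return math.log10(1.0 / rf - 1.0)


# A spot that travels a short distance (small Rf) is more strongly retained
# by the hydrophobic phase and therefore has a larger Rm.
for rf in (0.1, 0.5, 0.9):
    print(rf, round(rm_from_rf(rf), 3))
```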

Rm values have been used as a substitute for the partition coefficient in QSAR investigations. The determination of Rm values offers many important advantages compared with the measurement of log P values:

- Compounds need not be pure.
- Only traces of material are needed.
- A wide range of hydrophilic and lipophilic congeners can be investigated.
- The measurement of practically insoluble analogues poses no problem.
- No quantitative method for concentration determination is needed.
- Several compounds can be estimated simultaneously.

The main disadvantages are:

- Lack of precision and reproducibility
- Use of different organic solvent systems renders the derivation of π and π -related scales impossible

**π- Substituent Constant / Lipophilic Substituent Constant**

Lipophilic substituent constants are also known as hydrophobic substituent constants (π). The π substituent constant was defined by Hansch and co-workers by the following equation:

$$\pi_X = \log P_{RX} - \log P_{RH} \qquad (8)$$

where $P_{RH}$ and $P_{RX}$ are the partition coefficients of the standard compound and its monosubstituted derivative, respectively. For example, the log P value of m-chlorotoluene can be estimated by adding the π values of the methyl and chloro substituents to the log P of benzene.

The π values of aromatic substituents are tabulated.

[Table: π values of aromatic substituents]

When several substituents are present, the value of π for the compound is the sum of the π values of the separate substituents. The π value for a specific substituent will vary with the structural environment of the substituent. The value of π also depends on the solvent system used to determine the partition coefficients used in its calculation.

A positive π value indicates that the substituent has a higher lipophilicity than hydrogen and implies that the drug favours the organic phase. A negative π value indicates that the substituent has a lower lipophilicity than hydrogen and increases the concentration of the compound in the aqueous media of biological systems.
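The additivity of π values can be sketched as follows. The numerical values used here (log P of benzene ≈ 2.13, π for CH3 ≈ 0.56, π for Cl ≈ 0.71) are commonly quoted octanol/water literature values and are supplied as assumptions for illustration; they are not taken from the text above, and the function name is hypothetical.

```python
def estimate_log_p(parent_log_p: float, substituent_pi_values: list[float]) -> float:
    """Estimate log P as the parent log P plus the sum of substituent pi values."""
    return parent_log_p + sum(substituent_pi_values)


# Assumed, commonly quoted values (octanol/water, aromatic substituents).
LOG_P_BENZENE = 2.13
PI_METHYL = 0.56
PI_CHLORO = 0.71

# m-Chlorotoluene treated as benzene carrying one CH3 and one Cl substituent.
print(round(estimate_log_p(LOG_P_BENZENE, [PI_METHYL, PI_CHLORO]), 2))
```

Because π depends on the structural environment and on the solvent system, an estimate of this kind should be treated as a first approximation only.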

The application of π values to the lipophilicity calculation of aliphatic compounds led to significant deviations between observed and calculated values. For example, from the definition of π, $\pi_H$ must be zero, so there should be no difference between $\pi_{CH_3}$ and $\pi_{CH_2}$. But the lipophilic contribution of hydrogen atoms is not zero.

Hence, Rekker suggested a new system, known as the hydrophobic fragmental constant, which is a measure of the absolute lipophilicity contribution of the corresponding substituent or group and, unlike π values, is no longer based on the exchange of H for X:

$$\log P = \sum_i a_i f_i \qquad (9)$$

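where $a_i$ is the number of occurrences of the fragment with the lipophilicity contribution $f_i$. The hydrophobic fragmental constants of hydrogen and of the aliphatic carbon fragments, with the difference Δ between successive fragments, are (10):

- $f_H$ = 0.175
- $f_{CH_3}$ = 0.702
- $f_{CH_2}$ = 0.530 (Δ relative to $f_{CH_3}$ = 0.172)
- $f_{CH}$ = 0.235 (Δ relative to $f_{CH_2}$ = 0.295)
- $f_C$ = 0.15 (Δ relative to $f_{CH}$ = 0.085)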

Since there are different values for each type of carbon, no branching correction is required when hydrophobic constant values are used.

For example, the log P value for C_{6}H_{5}(CH_{2})_{3}Cl is calculated as:

$$\begin{aligned}\log P &= f_{C_6H_5} + 3 f_{CH_2} + f_{Cl(\mathrm{aliphatic})} \\ &= 1.886 + 3(0.530) + 0.061 \\ &= 3.537 \quad (\text{experimental value } 3.55)\end{aligned}$$
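The fragment summation of equation 9 is easy to reproduce in code. The sketch below uses only the fragment values quoted in this section; the dictionary keys and function name are hypothetical labels introduced for illustration.

```python
# Fragmental constants quoted in this section.
FRAGMENT_CONSTANTS = {
    "C6H5": 1.886,
    "CH3": 0.702,
    "CH2": 0.530,
    "CH": 0.235,
    "C": 0.15,
    "H": 0.175,
    "Cl_aliphatic": 0.061,
}


def log_p_from_fragments(fragment_counts: dict[str, int]) -> float:
    """Equation 9: log P = sum over fragments of (count * fragmental constant)."""
    return sum(FRAGMENT_CONSTANTS[name] * count for name, count in fragment_counts.items())


# C6H5(CH2)3Cl: one phenyl, three methylene groups, one aliphatic chlorine.
print(round(log_p_from_fragments({"C6H5": 1, "CH2": 3, "Cl_aliphatic": 1}), 3))
```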

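Similarly, in the related fragment system of Hansch and Leo, log P for n-pentane (CH_{3}-CH_{2}-CH_{2}-CH_{2}-CH_{3}) is calculated as:

$$\begin{aligned}\log P &= 2 f_{CH_3} + 3 f_{CH_2} + 3 f_b \\ &= 2(0.89) + 3(0.66) + 3(-0.12) \\ &= 3.40 \quad (\text{experimental value } 3.39)\end{aligned}$$

where $f_b$ = −0.12 is the bond factor, applied to each single bond between the carbon fragments of the chain after the first.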

Lipophilic constants are frequently used when dealing with a series of analogues in which only the substituents differ. Many attempts have been made to explain lipophilicity by related properties, but none of these alternative approaches has so far led to a reliable log P prediction system.

**Polarizability Parameters**

**Molar Refractivity (MR)**

Molar refractivity is a measure of both the volume of a compound and how easily it is polarized. It is expressed as:

$$MR = \frac{n^2 - 1}{n^2 + 2} \cdot \frac{M}{d} \qquad (11)$$

where n is the refractive index, M is the molecular weight and d is the density.

The term M/d defines a volume, while the term (n² − 1)/(n² + 2) provides a correction factor by defining how easily the substituent can be polarized. This is particularly significant if the substituent has π electrons or lone pairs of electrons.
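A minimal sketch of the molar refractivity calculation is given below. The function name is hypothetical, and the benzene values in the example (refractive index about 1.501, molecular weight about 78.11, density about 0.877 g/cm³) are approximate figures supplied as assumptions for illustration.

```python
def molar_refractivity(refractive_index: float, molecular_weight: float, density: float) -> float:
    """MR = ((n^2 - 1) / (n^2 + 2)) * (M / d)."""
    if density <= 0:
        raise ValueError("density must be positive")
    n_squared = refractive_index ** 2
    polarizability_term = (n_squared - 1.0) / (n_squared + 2.0)
    molar_volume = molecular_weight / density
    return polarizability_term * molar_volume


# Approximate, assumed values for benzene; the result is in cm^3/mol.
print(round(molar_refractivity(1.501, 78.11, 0.877), 1))
```

The two factors are kept as separate variables to mirror the interpretation in the text: a volume term corrected by a polarizability term.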

The significance of molar refractivity terms in the QSAR equations of some ligand-enzyme interactions can be interpreted with the help of 3D structures. Such investigations show that substituents modeled by MR bind in polar areas, while substituents modeled by π bind in hydrophobic space.

A positive sign of the MR term in the QSAR equation indicates that the substituent binds to a polar surface, while a negative sign or a nonlinear relationship indicates steric hindrance at the binding site.

**Parachor**

The parachor [P] is the molar volume V corrected for forces of intermolecular attraction by multiplying it by the fourth root of the surface tension γ. It is expressed mathematically as:

$$[P] = V\,\gamma^{1/4} = \frac{M\,\gamma^{1/4}}{d}$$

**Electronic Parameters**

The distribution of electrons in a drug molecule has a considerable influence on the distribution and activity of the drug. In general, non-polar drugs and polar drugs in their unionized form are more readily transported through membranes than polar drugs and drugs in their ionized form.

Once the drug reaches the target site, its electron distribution will control the type of bond that it forms with the target, which in turn affects its biological activity. The first attempt to quantify the electronic effects of groups on the physicochemical properties of compounds was made by Hammett.

**The Hammett Constant (σ)**

The distribution of electrons within a molecule depends on the nature of the electron-withdrawing and electron-donating groups found in the structure. Hammett used this concept to calculate what are now known as Hammett constants (σ) for a variety of monosubstituted benzoic acids. He used these constants to calculate the equilibrium and rate constants of chemical reactions.

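They are now also used as electronic parameters in QSAR relationships. Hammett constants are defined as:

$$\sigma_X = \log\frac{K_X}{K_H} = \mathrm{p}K_H - \mathrm{p}K_X$$

where $K_H$ and $K_X$ are the dissociation constants of benzoic acid and of its X-substituted derivative, respectively.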

The Hammett substituent constant (σ) is a measure of the electron-withdrawing or electron-donating ability of a substituent and has been determined by comparing the dissociation of a series of substituted acids with that of the parent, unsubstituted acid. A negative value of $\sigma_X$ indicates that the substituent is acting as an electron donor, and a positive value indicates that it is acting as an electron-withdrawing group.
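Since σ is obtained by comparing the dissociation of a substituted benzoic acid with that of benzoic acid itself, it can be computed as a difference of pKa values, σ = pKa(parent) − pKa(substituted). The sketch below assumes this standard form; the function names are hypothetical and the pKa figures in the example are approximate values supplied as assumptions for illustration.

```python
def hammett_sigma(pka_parent: float, pka_substituted: float) -> float:
    """sigma_X = log(K_X / K_H) = pKa(parent acid) - pKa(substituted acid)."""
    return pka_parent - pka_substituted


def electronic_character(sigma: float) -> str:
    """Classify a substituent from the sign of its sigma value."""
    if sigma > 0:
        return "electron-withdrawing"
    if sigma < 0:
        return "electron-donating"
    return "neutral (same as hydrogen)"


# Approximate, assumed pKa values: benzoic acid ~4.20, p-nitrobenzoic acid ~3.44.
sigma_p_nitro = hammett_sigma(4.20, 3.44)
print(round(sigma_p_nitro, 2), electronic_character(sigma_p_nitro))
```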

The Hammett constant takes into account both resonance and inductive effects. Therefore the value of σ for a particular substituent will depend on whether the substituent is meta or para. The meta and para σ values are commonly used and are indicated by a subscript m or p after the symbol σ. Ortho σ values are often unreliable because of steric hindrance and other effects such as intramolecular hydrogen bonding.

The Hammett constant suffers from the disadvantage that it applies only to substituents directly attached to a benzene ring. Most QSAR studies start off by considering σ, and if there is more than one substituent the σ values are summed (Σσ). Hammett constants alone have often been unsuccessful in correlating biological activity, since electron distribution is not the only factor involved.

**Inductive Substituent Constant**

The Hammett constant is a measure of both inductive and mesomeric effects. The para substituent constant ($\sigma_p$) has a greater resonance component than the equivalent meta constant ($\sigma_m$), and the inductive contribution can be described by the inductive substituent constant ($\sigma_I$).

It is used for aliphatic compounds in which the influencing and influenced groups do not form part of a conjugated system.

**Taft’s Substituent Constant**

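Taft’s substituent constants (σ*) are a measure of the polar effects of substituents in aliphatic compounds when the group in question does not form part of a conjugated system. They are based on the hydrolysis of esters and are calculated from the following equation:

$$\sigma^* = \frac{1}{2.48}\left[\log\left(\frac{k}{k_0}\right)_B - \log\left(\frac{k}{k_0}\right)_A\right]$$

where k and $k_0$ are the hydrolysis rate constants of the substituted ester and of the reference ester, respectively.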

The bracketed terms with subscripts B and A represent basic and acid hydrolysis, respectively. The factor 2.48 brings the constants onto the same scale as the Hammett constants. Only the basic term is influenced by polar effects, so subtracting the acid term from the basic term leaves only the polar effect.
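The calculation described above (the logarithmic rate ratio for basic hydrolysis minus that for acid hydrolysis, scaled by 2.48) can be sketched as follows. The function and argument names are hypothetical, and the rate constants in the example are made-up numbers used only to exercise the formula.

```python
import math


def taft_sigma_star(k_base: float, k0_base: float, k_acid: float, k0_acid: float) -> float:
    """sigma* = (1 / 2.48) * [log10(k/k0)_B - log10(k/k0)_A].

    k  : hydrolysis rate constant of the substituted ester
    k0 : hydrolysis rate constant of the reference ester
    """
    basic_term = math.log10(k_base / k0_base)
    acid_term = math.log10(k_acid / k0_acid)
    return (basic_term - acid_term) / 2.48


# Made-up rate constants: the substituent speeds up basic hydrolysis 50-fold
# while acid hydrolysis is only doubled, giving a positive sigma*.
print(round(taft_sigma_star(k_base=50.0, k0_base=1.0, k_acid=2.0, k0_acid=1.0), 3))
```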

In Taft’s scale the methyl group is the standard, for which the constant is zero. However, the values can be compared with other constants by writing the methyl group in the form CH_{2}–H and identifying it as the group for H. Taft’s and inductive substituent constants are related as:

$$\sigma_I(X) = 0.45\,\sigma^*(XCH_2)$$

**Steric Substitution Constant**

For a drug to interact with an enzyme or a receptor, it has to approach the binding site. The bulk, size and shape of the drug may influence this process. A steric substituent constant is a measure of the bulkiness of the group it represents and of its effect on the closeness of contact between the drug and the receptor site.

**Verloop Steric Parameter**

The Verloop steric parameter, also called the Sterimol parameter, involves a computer program that calculates steric substituent values from standard bond angles, van der Waals radii, bond lengths and possible conformations of the substituent. It can be used to characterize any substituent.

For example, the Verloop steric parameters for a carboxylic acid group comprise L, the length of the substituent, and B1 – B4, the radii of the group.

**Charton’s Steric Constants**

The principal problem with van der Waals radii and Taft’s Es values is the limited number of groups to which these constants have been allocated. Charton introduced a corrected van der Waals radius υ, in which the minimum van der Waals radius of the substituent group ($r_{v(min)}$) is corrected for the corresponding radius of hydrogen ($r_{vH}$), as defined by the equation:

$$\upsilon = r_{v(min)} - r_{vH}$$

These values were shown to be a good measure of steric effect by correlation with Es values.

**Minimal Steric Difference (MSD)**

This parameter assesses the difference between molecules in terms of the parts that do not overlap when one chemical formula is placed on top of the other. For example, when piperidine is compared with pyrrolidine, one methylene group of piperidine determines the MSD, since this is the only portion that does not overlap. The rules of the calculation are as follows:

- Hydrogen atoms are ignored.
- Elements in the second period of the periodic table have a weighting of 1.
- Elements in the third period have a weighting of 1.5 and
- Elements in higher periods have a weighting of 2.

Thus the MSD between piperidine and pyrrolidine is 1 and that between pyrrolidine and indole is 4.
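The weighting rules above translate directly into a short function. In this sketch the caller supplies the list of atoms that do not overlap (identifying those atoms by superimposing the two structures is not automated here); the element-to-period table and the function names are hypothetical helpers introduced for illustration.

```python
# Period of a few common elements (helper table assumed for this sketch).
PERIOD = {"C": 2, "N": 2, "O": 2, "F": 2, "P": 3, "S": 3, "Cl": 3, "Br": 4, "I": 5}


def atom_weight(element: str) -> float:
    """Weighting of one non-overlapping atom according to the MSD rules."""
    if element == "H":
        return 0.0  # hydrogen atoms are ignored
    period = PERIOD[element]
    if period == 2:
        return 1.0
    if period == 3:
        return 1.5
    return 2.0  # higher periods


def minimal_steric_difference(non_overlapping_atoms: list[str]) -> float:
    """Sum the weightings of the atoms that do not overlap."""
    return sum(atom_weight(element) for element in non_overlapping_atoms)


# Piperidine vs pyrrolidine: one extra ring carbon does not overlap.
print(minimal_steric_difference(["C"]))
# Pyrrolidine vs indole: four extra carbons do not overlap.
print(minimal_steric_difference(["C", "C", "C", "C"]))
```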

**Molecular Connectivity**

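Molecular connectivity indices, designated ${}^m\chi$, can be employed as steric parameters. The superscript m denotes the order of the parameter. Zero-order connectivity (${}^0\chi$) is the simplest and is defined by the equation:

$${}^0\chi = \sum_i \delta_i^{-1/2}$$

where $\delta_i$ is the number of non-hydrogen atoms bonded to atom i.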

The first-order connectivity (${}^1\chi$) is derived for each bond by calculating the product of the numbers associated with the two atoms of the bond. The reciprocal of the square root of this number is the bond value. Bond values are summed to give the first-order connectivity for the molecule, so that the value for n-butane is:

$${}^1\chi = \frac{1}{\sqrt{1 \times 2}} + \frac{1}{\sqrt{2 \times 2}} + \frac{1}{\sqrt{2 \times 1}} = 0.707 + 0.500 + 0.707 = 1.914$$
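The bond-by-bond procedure just described can be sketched in Python. The sketch assumes the usual convention that the number associated with each atom is its count of non-hydrogen neighbours, and that the zero-order index is the sum of the reciprocal square roots of those counts; the function name and the atom numbering are hypothetical.

```python
def connectivity_indices(bonds: list[tuple[int, int]]) -> tuple[float, float]:
    """Return (zero-order, first-order) connectivity for a hydrogen-suppressed graph.

    bonds: pairs of atom labels, one pair per bond between non-hydrogen atoms.
    """
    degree: dict[int, int] = {}
    for a, b in bonds:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Zero order: one term per atom.
    chi0 = sum(d ** -0.5 for d in degree.values())
    # First order: one term per bond, from the product of the two atom values.
    chi1 = sum((degree[a] * degree[b]) ** -0.5 for a, b in bonds)
    return chi0, chi1


# n-Butane as a chain of four carbons: C1-C2-C3-C4.
N_BUTANE = [(1, 2), (2, 3), (3, 4)]
chi0, chi1 = connectivity_indices(N_BUTANE)
print(round(chi0, 3), round(chi1, 3))
```

Branched isomers give different values because branching changes the neighbour counts, which is what lets these indices act as simple shape descriptors.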

**Other Parameters**

Molecular weight was used by Lien to improve the fit of the parabolic Hansch equation. A more appropriate use of MW was demonstrated in a QSAR study of the multidrug resistance of tumor cells, where the MW term stands for the dependence of biological activity on the diffusion rate constant. The relationship between MW and volume implies that ∛MW, which corresponds to a linear dimension of size, should be a better parameter than log MW.

Indicator variables, sometimes known as dummy variables or de novo constants, are used in linear multiple regression analysis to account for certain features that cannot be described by continuous variables.

They are used to account for other structural features such as intramolecular hydrogen bonding, hydrogen donor and acceptor properties, ortho effects, cis/trans isomerism, different parent skeletons, different test models, etc.

## QSAR and Drug Design Quantitative Models

The previous section described the physicochemical parameters commonly used in QSAR studies. It is simple to derive a QSAR model from a single such parameter, but the biological activity of most drugs is related to a combination of physicochemical properties. Various methods are used to derive QSAR models. Some of them are:

**Hansch Analysis** (The Extrathermodynamic Approach)

This is the most popular mathematical approach to QSAR and was introduced by Corwin Hansch. It is based on the idea that drug action can be divided into two stages:

- transport of the drug to its site of action.
- the binding of drug to the target site.

Each of these stages depends on the chemical and physical properties of the drug and its target site. In Hansch analysis, these properties are described by the parameters that correlate with biological activity.

The physicochemical parameters most commonly used in Hansch analysis are log P, π, σ and steric parameters. Because practically all the parameters used in Hansch analysis are linear free energy related (i.e. derived from equilibrium constants), the method is also known as the “linear free energy approach” or “extrathermodynamic approach”.

If the hydrophobicity values are limited to a small range then the equation will be linear, as follows:

$$\log\frac{1}{C} = k_1 \log P + k_2 \sigma + k_3 E_s + k_4$$

A molecule that is too hydrophilic or too lipophilic will not be able to cross the lipophilic or hydrophilic barriers, respectively. Therefore, if the log P values are spread over a large range, the equation will be parabolic and is given as:

$$\log\frac{1}{C} = -k_1 (\log P)^2 + k_2 \log P + k_3 \sigma + k_4 E_s + k_5$$

The constants k1 – k5 are obtained by the least squares method. Not all the parameters are necessarily significant in a QSAR model for biological activity. Hansch formulated the following rules for deriving an extrathermodynamic equation:

- Selection of independent variables. A wide range of different parameters such as log P, π, σ, MR and steric parameters should be tried. The parameters selected for the “best equation” should be essentially independent, i.e. the inter-correlation coefficient should not be larger than 0.6 – 0.7.
- All the reasonable parameters must be validated by an appropriate statistical procedure, i.e. either stepwise regression analysis or cross-validation. The “best equation” is normally the one with the lower standard deviation and the higher F value.
- If all the equations are statistically equivalent, one should accept the simplest one.
- The number of terms or variables should be limited so that there are at least 5 or 6 data points per variable, to avoid chance correlations.
- It is important to have a model that is consistent with the known physical organic and biomedicinal chemistry of the process under consideration.
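A minimal sketch of how a two-parameter Hansch equation, log(1/C) = k1·π + k2·σ + k3, might be fitted by least squares is shown below, together with the inter-correlation check from the first rule. Everything here is an assumption made for illustration: the data are synthetic (generated from made-up coefficients, not measurements), the function names are hypothetical, and a small normal-equation solver is used in place of a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: descriptors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_hansch(pi_values, sigma_values, activities):
    """Least-squares estimates (k1, k2, k3) for log(1/C) = k1*pi + k2*sigma + k3."""
    if len(activities) < 5 * 2:
        raise ValueError("need at least 5 data points per variable")
    rows = [[p, s, 1.0] for p, s in zip(pi_values, sigma_values)]
    # Normal equations: (X^T X) k = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, activities)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def pearson_r(x, y):
    """Correlation coefficient, used here to check descriptor inter-correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Synthetic, noise-free data generated from made-up coefficients.
PI_VALUES = [-0.5, 0.0, 0.5, 1.0, 1.5, 0.2, 0.8, 1.2, -0.2, 0.6]
SIGMA_VALUES = [0.3, -0.2, 0.1, -0.3, 0.4, 0.0, -0.1, 0.2, 0.5, -0.4]
ACTIVITIES = [1.78 * p - 0.12 * s + 1.674 for p, s in zip(PI_VALUES, SIGMA_VALUES)]

if abs(pearson_r(PI_VALUES, SIGMA_VALUES)) > 0.7:
    print("warning: pi and sigma are inter-correlated")
print([round(k, 3) for k in fit_hansch(PI_VALUES, SIGMA_VALUES, ACTIVITIES)])
```

With real data the fit would also need the statistical validation described in the rules (standard deviation, F value, cross-validation), which this sketch deliberately leaves out.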

**Application of Hansch Analysis**

A Hansch equation may be used to predict the activity of an as yet unsynthesized analogue. This enables the medicinal chemist to decide which analogues are worth synthesizing. However, these predictions should only be regarded as valid if they are made within the range of parameter values used to establish the Hansch equation. Hansch analysis may also be used to give an indication of the relative importance of the parameters for the mechanism by which a drug acts.

**Example:**

The adrenergic blocking activity of a series of β-haloarylamine analogues was studied. The following Hansch equation showed that the activity was related only to the π and σ values and not to steric factors:

$$\log\frac{1}{C} = 1.78\,\pi - 0.12\,\sigma + 1.674 \qquad (13)$$

The small coefficient of σ relative to that of π in the above equation shows that electronic effects do not play an important role in the action of the drug.

The accuracy of Hansch equation depends on:

- The number of analogues (n) used. The greater the number, the higher the probability of obtaining an accurate Hansch equation.
- The accuracy of biological data used in the derivation of the equation.
- The choice of parameters (See Craig Plot below)

**Craig Plot**

A Craig plot is a simple graphical plot of π versus σ, or of any two such parameters, used to guide the selection of the next substituent. In other words, Craig plots are two-dimensional plots of one parameter against another. The plot is divided into four sections corresponding to the positive and negative values of the parameters.

They are used in conjunction with an already established Hansch equation for a series of related aromatic compounds to select the aromatic substituents that are likely to produce highly active analogues. A typical example is a Craig plot of the σ and π factors of para-aromatic substituents.

For example, suppose a Hansch analysis carried out on a series of aromatic compounds yielded a Hansch equation in which the coefficient of π is positive and the coefficient of σ is negative.

To obtain a high value for the activity (1/C), it is necessary to pick a substituent with a positive π and a negative σ value; the substituent should therefore be taken from the lower right-hand quadrant of the plot.
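The quadrant selection described above amounts to filtering substituents by the signs of their π and σ values. The sketch below shows the idea with placeholder substituents "A" to "D" whose values are invented for illustration; they are not literature constants, and the function name is hypothetical.

```python
# Placeholder substituents with made-up (pi, sigma) pairs, one per quadrant.
SUBSTITUENTS = {
    "A": (0.9, 0.5),    # +pi, +sigma
    "B": (0.6, -0.2),   # +pi, -sigma
    "C": (-0.3, 0.7),   # -pi, +sigma
    "D": (-1.0, -0.6),  # -pi, -sigma
}


def select_quadrant(table, pi_positive: bool, sigma_positive: bool) -> list[str]:
    """Return substituents whose pi and sigma have the requested signs.

    Values lying exactly on an axis (zero) are excluded.
    """
    selected = []
    for name, (pi, sigma) in table.items():
        pi_ok = pi > 0 if pi_positive else pi < 0
        sigma_ok = sigma > 0 if sigma_positive else sigma < 0
        if pi_ok and sigma_ok:
            selected.append(name)
    return selected


# Positive pi and negative sigma: the quadrant favoured in the example above.
print(select_quadrant(SUBSTITUENTS, pi_positive=True, sigma_positive=False))
```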

**Advantages of Craig Plot:**

- The plot shows clearly that there is no overall relationship between π and σ.
- It is possible to tell at a glance which substituents have positive π and σ parameters and which have negative π and σ parameters.
- It is easy to see which substituents have similar π values, e.g. the ethyl, bromo and trifluoromethyl groups all lie approximately on the same vertical line of the plot.
- It is useful in planning which substituents should be used to derive the most accurate equation involving σ and π.

However, it is emphasized that the use of the Craig plot does not guarantee that the resultant analogs will be more active than the lead because the parameter used may not be relevant to the mechanism by which the analogs act.

**Free Wilson Analysis**

The Free-Wilson analysis is a true structure-activity relationship model. “Mathematical model”, “additivity model” and “de novo approach” are synonyms for the Free-Wilson method. It is an alternative to the Hansch model in which substituent constants based on biological activities are used rather than physical properties.

The method is based upon an additive mathematical model in which a particular substituent in a specific position is assumed to make an additive and constant contribution to biological activity of a molecule in a series of chemically related molecules.

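The method assumes that the introduction of a particular substituent at a particular molecular position always leads to a quantitatively similar effect on the biological potency of the whole molecule, and is expressed by the equation:

$$\mathrm{BA} = \sum_i a_i x_i + \mu$$

where $a_i$ is the activity contribution of substituent i, $x_i$ is 1 if the substituent is present and 0 otherwise, and μ is the overall average activity.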

For example, the equation for acetylenic carbonate having antitumor activity is

**Applications**

- The Free-Wilson approach is easy to apply, especially in the early phases of structure-activity analysis. It is a simple method for deriving substituent contributions and for taking a first look at their possible dependence on different physicochemical properties.
- Substituents that do not fulfill the principle of additivity can be recognized.
- Substituent constants such as π and σ are not required, so the method is effective when substituent constants are not available.

However, the Free-Wilson method has some disadvantages:

- Structural variation is necessary in at least two different positions of substitution; otherwise meaningless group contributions would result.
- A large number of parameters is needed to describe relatively few compounds.
- Only a common activity contribution can be derived for substituent which always occur together in different positions of the molecule
- Only a small number of new analogues can be predicted.

**The Fujita-Ban Modification**

Fujita and Ban reformulated the Free-Wilson equation so that the constant term μ is defined as the calculated biological activity value of the unsubstituted parent compound of the series. In comparison with the Free-Wilson method, the Fujita-Ban modification offers some important advantages:

- The table for regression analysis can be easily generated
- Addition and elimination of a compound is simple and does not significantly change the values of the other regression coefficients.
- Any compound may be chosen as the reference compound, and singularity problems are avoided.
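The additive idea behind the Fujita-Ban form can be sketched as follows. In practice the group contributions are obtained by regression over the whole series; this simplified sketch instead takes the idealized case in which each contribution is read off as the difference between a singly substituted analogue and the unsubstituted parent. All activity values, position labels and function names are hypothetical.

```python
def derive_contributions(parent_activity: float, single_substituted: dict) -> dict:
    """Group contribution = activity of the singly substituted analogue - parent."""
    return {group: activity - parent_activity
            for group, activity in single_substituted.items()}


def predict_activity(parent_activity: float, contributions: dict, groups: list) -> float:
    """Additive model: parent activity plus the sum of the group contributions."""
    return parent_activity + sum(contributions[g] for g in groups)


# Made-up log(1/C) values for a parent compound and two monosubstituted analogues.
PARENT = 5.0
SINGLE = {("R1", "Cl"): 5.5, ("R2", "CH3"): 4.7}

contributions = derive_contributions(PARENT, SINGLE)
# Predict the analogue carrying both substituents, assuming strict additivity.
print(round(predict_activity(PARENT, contributions, [("R1", "Cl"), ("R2", "CH3")]), 2))
```

A measured activity that departs clearly from such a prediction would flag a substituent combination that does not obey the additivity principle.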

**Mixed Approach** (The Relationship between Hansch and Free-Wilson Analysis)

Hansch analysis and the Free-Wilson model differ in their application, but they are closely related. A mixed approach using indicator variables offers the advantages of both Hansch and Free-Wilson analysis and widens their applicability.

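The mixed approach can be written as:

$$\log\frac{1}{C} = \sum_i a_i x_i + \sum_j k_j \phi_j + c$$

where the $a_i x_i$ terms are Free-Wilson-type group contributions and the $\phi_j$ are physicochemical parameters such as π, σ and Es.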

Today the mixed approach is the most powerful tool for the quantitative description of large and structurally diverse data sets.

**The Topliss Decision Tree**

This approach is completely non-mathematical and non-statistical and does not need computerization of the data. In certain situations, it might not be feasible to make a large range of structures required for the Hansch equation.

For example, the synthetic route involved might be difficult and only few structures can be made in a limited time. In these circumstances, it would be useful to test the biological activity as they are synthesized and to use these results to determine the next analogs.

A Topliss scheme is a “flow diagram” that in a series of steps directs the medicinal chemist to produce a series of analogs, some of which have greater activity than the lead used to start the tree.

However, its use is limited because it requires the lead compound to have an unfused aromatic ring system, and it only produces analogues that bear substituents on the aromatic system.

There are two Topliss schemes, one for aromatic substituents and the other for aliphatic side-chain substituents; both are used in a similar manner.

The Topliss scheme for aromatic substituents assumes that the lead compound has been tested for biological activity and contains a monosubstituted aromatic ring. The first analogue in the scheme is the 4-chloro derivative, since this derivative is usually easy to synthesize. The chloro substituent is more hydrophobic and electron-withdrawing than hydrogen and therefore π and σ are positive.

Once the chloro analog has been synthesized, the biological activity is measured. There are three possibilities. The analog will have less activity (L), equal activity (E), or more activity (M). The type of activity observed will determine which branch of the Topliss scheme is followed next.

If the biological activity increases, then the (M) branch is followed and the next analog to be synthesized is the 3,4-dichloro analogue. If, on the other hand, the activity stays the same, then the (E) branch is followed and the 4-methyl analogue is synthesized. Finally, if activity drops, the (L) branch is followed and the next analogue is the 4-methoxy analog.

Biological results from the second analogue now determine the next branch to be followed in the scheme.
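The first branching step described above can be sketched as a simple lookup. This is a minimal illustration covering only the branches named in the text, not the complete Topliss tree; the names `NEXT_AFTER_4_CHLORO` and `next_aromatic_analog` are hypothetical.

```python
# Sketch of the first decision in the aromatic Topliss scheme.
# Outcome codes follow the text: "M" = more active, "E" = equally active,
# "L" = less active than the lead compound.
# All names here are hypothetical and chosen for illustration only.

FIRST_ANALOG = "4-Cl"

# Next analog to synthesize once the 4-chloro derivative has been tested.
NEXT_AFTER_4_CHLORO = {
    "M": "3,4-Cl2",  # activity rose: push pi and sigma further
    "E": "4-CH3",    # activity unchanged: keep pi positive, make sigma negative
    "L": "4-OCH3",   # activity fell: try a substituent with negative sigma
}


def next_aromatic_analog(outcome: str) -> str:
    """Return the analog suggested after testing the 4-chloro derivative."""
    key = outcome.strip().upper()
    if key not in NEXT_AFTER_4_CHLORO:
        raise ValueError(f"outcome must be 'M', 'E' or 'L', got {outcome!r}")
    return NEXT_AFTER_4_CHLORO[key]


if __name__ == "__main__":
    for code in ("M", "E", "L"):
        print(code, "->", next_aromatic_analog(code))
```

A dictionary is used rather than nested conditionals so that deeper levels of the tree could be added as further lookup tables.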

Let us consider the situation where the 4-chloro derivative increases in biological activity. Since the chloro substituent has positive π and σ values, this implies that one or both of these properties are important to biological activity. If both are important, then adding a second chloro group should increase biological activity yet further.

If it does, substituents are varied to increase the π and σ values even further. If it does not, then an unfavorable steric interaction or excessive hydrophobicity is indicated. Further modifications then test the relative importance of π and steric factors.

We shall now consider the situation in which the 4-chloro analog drops in activity. This suggests either that negative π and/or σ values are important to activity or that a para substituent is sterically unfavorable. It is assumed that an unfavorable σ effect is the most likely reason for the reduced activity, and so the next substituent is one with a negative σ value (i.e. 4-methoxy).

If activity improves, further changes are suggested to test the relative importance of the σ and π factors. If, on the other hand, the 4-methoxy group does not improve activity, it is assumed that an unfavorable steric factor is at work and the next substituent is a 3-chloro group.

The last scenario is that in which the activity of the 4-chloro analog is little changed from the lead compound. This could arise from the drug requiring a positive π value and a negative σ value. Since both values for the chloro group are positive, the beneficial effect of the positive π value may be cancelled out by the detrimental effect of the positive σ value.

The next substituent to try in that case is the 4-methyl group; this has the necessary positive π value and negative σ value. If this still has no beneficial effect, then it is assumed that there is an unfavorable steric interaction at the para position, and the 3-chloro substituent is chosen next. Further changes continue to vary the relative values of the π and σ factors.
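The sign arguments used in these three scenarios can be checked against tabulated substituent constants. The sketch below uses approximate, commonly quoted aromatic π and para σ values; they are assumptions supplied for illustration and should be checked against a primary table before being relied on. The function names are hypothetical.

```python
# Approximate, commonly quoted substituent constants as (pi, sigma_para).
# These numbers are illustrative assumptions, not taken from the text.
SUBSTITUENT_CONSTANTS = {
    "H": (0.00, 0.00),
    "4-Cl": (0.71, 0.23),
    "4-CH3": (0.56, -0.17),
    "4-OCH3": (-0.02, -0.27),
}


def sign_label(value: float) -> str:
    """Return '+', '-' or '0' for a substituent constant."""
    if value > 0:
        return "+"
    if value < 0:
        return "-"
    return "0"


def sign_profile(substituent: str) -> tuple:
    """Return the signs of (pi, sigma) for a substituent in the table."""
    pi, sigma = SUBSTITUENT_CONSTANTS[substituent]
    return sign_label(pi), sign_label(sigma)


if __name__ == "__main__":
    for name in SUBSTITUENT_CONSTANTS:
        print(name, sign_profile(name))
```

With these values, 4-chloro has both constants positive, 4-methyl has a positive π and a negative σ, and 4-methoxy has a negative σ with a π close to zero, which matches the reasoning behind the (E) and (L) branches.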

The Topliss scheme for aliphatic side chains was set up similarly to the aromatic scheme and is used in the same way for side groups attached to a carbonyl, amino, amide or similar functional group. The scheme only attempts to differentiate between the hydrophobic and electronic effects of substituents and not their steric properties. Thus, the substituents involved have been chosen to minimize any steric differences. It is assumed that the lead compound has a methyl group.

The first analogue suggested is the isopropyl analog. This has an increased π value and in most cases would be expected to increase activity, since it has been found from experience that the hydrophobicity of most lead compounds is less than the optimum hydrophobicity required for activity.

Let us concentrate first on the situation in which activity rises. Following this branch, a cyclopentyl group is now used. A cyclic structure is used since it has a larger π value but keeps any increase in steric factor to a minimum. If activity rises again, more hydrophobic substituents are tried.

If activity does not rise, then there could be two explanations. Either the optimum hydrophobicity has been passed or there is an electronic effect (σi) at work. Further substituents are then used to determine which explanation is correct.

Let us now look at the situation where the activity of the isopropyl analog stays much the same. The most likely explanation is that the methyl and isopropyl groups are on either side of the hydrophobic optimum. Therefore, an ethyl group is used next, since it has an intermediate π value.

If this does not lead to an improvement, there may be an unfavorable electronic effect. The groups used have been electron-donating and so electron-withdrawing groups with similar π values are now suggested.

Finally, we shall look at the case where activity drops for the isopropyl analog. In this case, hydrophobic and/or electron-donating groups could be bad for activity, and the groups suggested in this branch of the scheme are suitable choices for further development.
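The aliphatic scheme can be sketched in the same way. The lookup below covers only the steps named in the text: a methyl lead, the isopropyl analog as the first step, then cyclopentyl or ethyl. The groups for the branch where activity drops are not named here, so the sketch leaves that entry empty rather than guessing. All identifiers are hypothetical.

```python
from typing import Optional

# Sketch of the first decision in the aliphatic Topliss scheme.
# Outcome codes: "M" = more active, "E" = equally active, "L" = less active.
LEAD_GROUP = "methyl"
FIRST_ALIPHATIC_ANALOG = "isopropyl"

# Next group to try once the isopropyl analog has been tested.
NEXT_AFTER_ISOPROPYL = {
    "M": "cyclopentyl",  # larger pi with minimal extra steric bulk
    "E": "ethyl",        # intermediate pi between methyl and isopropyl
    "L": None,           # groups for this branch are not named in the text
}


def next_aliphatic_group(outcome: str) -> Optional[str]:
    """Return the group suggested after testing the isopropyl analog."""
    key = outcome.strip().upper()
    if key not in NEXT_AFTER_ISOPROPYL:
        raise ValueError(f"outcome must be 'M', 'E' or 'L', got {outcome!r}")
    return NEXT_AFTER_ISOPROPYL[key]


if __name__ == "__main__":
    for code in ("M", "E", "L"):
        print(code, "->", next_aliphatic_group(code))
```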
