iHumans.com <http://ihumans.com>


Home | Site Map | Contact Us | About iHumans | Alliance and R&D Collaboration | Seach Executives and Scientists | Job Openings and Opportunities | Health Care | News and Commentaries | ニホン語のページ (Japanese Pages) |


26 July, 1999

Snippets Come Of Age: Common Disease-Common Variant (CD-CV) Hypothesis
In molecular medicine, a major goal is to understand how common genetic variants are associated with, and confer susceptibility or resistance to, common diseases and pharmacogenetic traits. This will require characterizing the nature of gene variation in human populations, which is largely confined to single-nucleotide polymorphisms (SNPs), assembling an extensive SNP catalogue in candidate genes, and performing association studies for particular diseases. Most heterozygosity (two variant allelic sequences of the same gene) is attributable to common alleles that are present in the general population at a frequency of more than 1%. Correspondingly, there seem to be common variants that contribute to genetic risk for common diseases, which gives the straightforward common disease-common variant (CD-CV) hypothesis. These variants may be specific to ethnic groups and individuals rather than common across the general human population, and are likely small in number. It is thus important to build a comprehensive SNP catalogue of these common gene variants and test them directly for association with clinical phenotypes.

In the human genome of 3 billion bases, SNPs are estimated to occur at a frequency of about 1 per 1,000 bases. As the coding region comprises only about 5% of the genome, most SNPs do not affect protein structure, and because many coding SNPs are synonymous (the alternate alleles encode the same amino acid sequence), most common SNPs might not be very informative. Amino acid-altering, non-synonymous coding-region SNPs would be rarer and harder to find because of the expected selection against them in human evolution. Once successfully collected, these variants may be the first step and a shortcut toward finding the genes underlying major human diseases. It would be crucial to select candidate genes carefully and assess them in as many patients as possible, perhaps thousands, to link a disease to a very rare gene variant. Of course, this does not make it less important to have a comprehensive catalogue of all SNPs in the genome (gSNPs), assembled without any prior assumption of association.
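As a rough illustration of the arithmetic in this paragraph, the following minimal Python sketch turns the quoted figures (3 billion bases, 1 SNP per 1,000 bases, 5% coding) into SNP counts. It assumes SNPs are spread uniformly across coding and non-coding sequence, which is a simplification made only for this example, and the variable names are hypothetical.

```python
# Back-of-envelope SNP counts from the figures quoted in the text.
# Assumption (not from the article): SNP density is uniform across the genome.

GENOME_BASES = 3_000_000_000   # "3 billion bases"
BASES_PER_SNP = 1_000          # "a frequency of 1/1000"
CODING_PERCENT = 5             # "only about 5% of the genome"

# Integer arithmetic keeps the results exact.
total_snps = GENOME_BASES // BASES_PER_SNP
coding_snps = total_snps * CODING_PERCENT // 100
noncoding_snps = total_snps - coding_snps

print(f"Expected SNPs genome-wide:    {total_snps:,}")
print(f"Expected in coding regions:   {coding_snps:,}")
print(f"Expected outside coding DNA:  {noncoding_snps:,}")
```

Under the uniform-density assumption, roughly 19 of every 20 SNPs fall outside coding sequence, which is the point the paragraph makes about most SNPs not affecting protein structure.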

In February, 1998, Francis Collins of NHGRI (National Human Genome Research Institute) announced a plan to use 450 human DNA samples from 4 ethnic groups of Americans (Caucasian, Native, African, Asian) to obtain 100,000 SNPs in 3 years. In April, 1999, the SNP Consortium (TSC) of 10 major drug companies and several academic labs was formed to obtain 300,000 SNPs and map 150,000 of them to chromosomes in 2 years. Celera Genomics, led by JC Venter, aims at accelerating the sequencing of the whole human genome. More recently (July, 1999), Japan announced a nationally supported project to obtain 100,000-150,000 coding-region SNPs (cSNPs rather than gSNPs) in 2-3 years, first from the normal and diseased Japanese population and later extending to other Asian populations. Initially, 50 Japanese DNA samples will be used, and the main disease targets are announced to be cancer, hypertension, atherosclerosis, diabetes, allergy, and neurodegenerative diseases. In a rather rare arrangement, something like a Ministry-Agency Consortium was formed among 4 ministries (Agriculture and Fishery; Education, Science and Culture; Health and Welfare; International Trade and Industry) and an Agency (Science and Technology) in order to push this project. As communication and collaboration among these Government organizations have not been so good in the past, it remains to be seen whether this ambitious consortium-like arrangement will prove efficient and competitive in achieving the goal. Incidentally, this is part of an enhanced effort by the Japanese Government to promote biomedical research and industry. For the next 5 years, an additional $3.3 billion per year is proposed on top of the current $5 billion for biomedicine. The proposed budget has to be approved by the Diet (Japanese Parliament) at the end of this year and could be implemented in the middle of 2000 at the earliest.
As genome science is progressing at tremendous speed, there is a chance that some other organization, such as Venter's Celera, will work on Japanese SNPs and come up with the basic information before the massive Japanese effort bears fruit.

Recent publications report technical advances and initial findings in the survey of SNPs related to human disease phenotypes. Wang DG et al (1998: uid=9582121) developed high-density variation-detection DNA chips for large-scale identification, mapping, and genotyping of SNPs in the human genome. They used this method to identify a total of 3241 candidate SNPs, 2227 of which were placed on a genetic map. A systematic survey has been made of SNPs in the coding regions of 106 human genes (Cargill M et al, 1999: uid=10391209), the products of which have roles in cardiovascular disease, endocrinology and neuropsychiatry. Samples were obtained from 51 cell lines (20 European, 14 Asian, 10 African American, 7 African Pygmies) and 10 European blood samples. An average of 114 independent alleles (chromosomes) were screened for each gene. SNPs were confirmed by DNA sequencing. In all, 560 SNPs were found in a total cumulative length of 196.2 kb, including 168 non-coding region SNPs in 60.4 kb and 392 coding-region SNPs (cSNPs) in 135.8 kb, the latter divided roughly equally between synonymous (207) and non-synonymous (185) polymorphisms. When the number of variant sites was normalized for the sample size (sequence length), coding and non-coding regions showed similar frequencies of occurrence, 5.30 x 10^-4 and 5.43 x 10^-4 per base, respectively. In the coding region, however, the rate of non-synonymous polymorphisms (changing amino acid sequences) was only 38% (3.66 x 10^-4 per base) of that of synonymous ones (9.73 x 10^-4 per base), indicating selection acting against deleterious alleles during human evolution.
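The tallies quoted from Cargill et al can be cross-checked with a few lines of Python. This is only a sketch of a consistency check on the numbers as reported in this article; the dictionary keys are hypothetical labels chosen for the example.

```python
# Consistency check of the SNP tallies quoted from Cargill et al (1999).
# All values are copied from the text; the key names are illustrative only.

cargill = {
    "total_snps": 560,
    "noncoding_snps": 168,
    "coding_snps": 392,
    "synonymous": 207,
    "nonsynonymous": 185,
    "total_kb": 196.2,
    "noncoding_kb": 60.4,
    "coding_kb": 135.8,
}

# Non-coding plus coding SNPs should give the overall total.
snp_total_ok = (
    cargill["noncoding_snps"] + cargill["coding_snps"] == cargill["total_snps"]
)

# Synonymous plus non-synonymous should give the coding-region total.
csnp_split_ok = (
    cargill["synonymous"] + cargill["nonsynonymous"] == cargill["coding_snps"]
)

# Surveyed lengths should add up (compared with a tolerance for floats).
length_ok = abs(
    cargill["noncoding_kb"] + cargill["coding_kb"] - cargill["total_kb"]
) < 1e-6

print("SNP totals consistent:   ", snp_total_ok)
print("cSNP split consistent:   ", csnp_split_ok)
print("Surveyed kb consistent:  ", length_ok)
```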

Another systematic and comprehensive survey has assessed the nature, pattern and frequency of SNPs in 75 candidate human genes, the products of which have roles in blood-pressure homeostasis and hypertension (Halushka MK et al, 1999: uid=10391210). In all, 190 kb in 148 alleles were surveyed, comprising the 5' and 3' untranslated regions (UTRs, 77 kb), introns (25 kb) and coding sequence (87 kb) of these 75 genes. DNA samples were from 40 Zimbabwe Africans, 32 European Americans, and 3 individuals of Northern European descent. These individuals were chosen, with informed consent, from 800 individuals examined for blood pressure and related measurements, and belonged to the top and bottom 2.5% of a normalized blood-pressure distribution. High-density variant-detection arrays (VDAs, for example 300,000 different 25-mer oligonucleotides per VDA) were used for the SNP survey. In total, 874 candidate human SNPs were identified, 387 of which were within the coding sequences (cSNPs). Of all cSNPs, 54% (209 cSNPs) lead to a predicted change in the protein sequence, implying a high level of human protein diversity. The rate of these protein-altering SNPs (non-synonymous, 5.7 x 10^-4 per base) is 38% of that of synonymous ones (15.1 x 10^-4 per base), the same proportion as in the finding discussed above (Cargill M et al, 1999: uid=10391209). This differential nucleotide diversity of cSNPs likely demonstrates directly the effects of natural selection on human genes. There were on average 12 SNPs per gene, but when corrected for sequence length, nucleotide diversity varied 15-fold across genes. This variation might be correlated with the effects of functional conservation on gene function, which might well be population- and individual-specific.
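The percentages quoted for the two surveys follow directly from the reported counts and per-base rates. The short Python sketch below recomputes them; the helper name `ratio_percent` is hypothetical and the inputs are simply the figures given in this article.

```python
# Recompute the percentages quoted for the two cSNP surveys.
# Rates are in units of 10^-4 per base, as given in the text.

def ratio_percent(nonsyn_rate: float, syn_rate: float) -> float:
    """Non-synonymous rate expressed as a percentage of the synonymous rate."""
    return 100.0 * nonsyn_rate / syn_rate

cargill_ratio = ratio_percent(3.66, 9.73)    # Cargill et al, 106 genes
halushka_ratio = ratio_percent(5.7, 15.1)    # Halushka et al, 75 genes

# Share of Halushka cSNPs predicted to change the protein sequence.
protein_altering_share = 100.0 * 209 / 387

print(f"Cargill  non-syn/syn rate: {cargill_ratio:.1f}%")
print(f"Halushka non-syn/syn rate: {halushka_ratio:.1f}%")
print(f"Halushka protein-altering cSNPs: {protein_altering_share:.1f}%")
```

Both ratios round to 38%, even though the two studies surveyed different genes and used different sample sets.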

Cargill et al (1999: uid=10391209) provide a fundamental description of sequence variation in the coding regions of human genes, namely: 1) a gene may contain ca 4 cSNPs; 2) there exist ca 240,000-400,000 common cSNPs across the human genome; 3) two copies of a gene may differ by 1 base in 2 kb, or about 1 heterozygous base in the coding region of the gene; 4) only ca 40% of these coding-region differences are non-synonymous, altering the encoded amino acid, so that a person is heterozygous for ca 24,000-40,000 non-synonymous substitutions. A large number of patients would be needed to link a SNP to a disease, and it remains to be seen how the CD-CV hypothesis holds up against massive genomic information.
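The genome-wide ranges in this summary can be reproduced with simple arithmetic if one assumes a total of 60,000 to 100,000 human genes. That gene-count range is not stated in this article; it is an assumption introduced here only because it yields the quoted ranges, and the names below are hypothetical. A minimal Python sketch:

```python
# Reproduce the genome-wide ranges quoted from Cargill et al (1999).
# ASSUMPTION (not stated in the article): 60,000-100,000 human genes.

ASSUMED_GENE_COUNTS = (60_000, 100_000)  # hypothetical low and high estimates
CSNPS_PER_GENE = 4                       # "a gene may contain ca 4 cSNPs"
HET_CODING_BASES_PER_GENE = 1            # "1 heterozygous base in coding region"
NONSYN_PERCENT = 40                      # "ca 40%" alter the encoded amino acid

# Common cSNPs across the genome: genes x cSNPs per gene.
common_csnps = tuple(genes * CSNPS_PER_GENE for genes in ASSUMED_GENE_COUNTS)

# Heterozygous non-synonymous sites per person:
# genes x heterozygous coding bases per gene x non-synonymous fraction.
het_nonsyn = tuple(
    genes * HET_CODING_BASES_PER_GENE * NONSYN_PERCENT // 100
    for genes in ASSUMED_GENE_COUNTS
)

print(f"Common cSNPs in the genome: {common_csnps[0]:,} to {common_csnps[1]:,}")
print(f"Heterozygous non-synonymous sites per person: "
      f"{het_nonsyn[0]:,} to {het_nonsyn[1]:,}")
```

A different gene count would scale both ranges proportionally, so these figures are only as reliable as that assumption.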

Related: Please also read Pharmacogenomics: Single Nucleotide Polymorphisms (SNPs) and Personalized Medicines (9 June, 1998)

Please send your comments and ideas through the e-mail form on this website.
