About EXPOLDB HelpDesk
EXPOLDB is a resource for investigating the natural variations in gene expression in humans. It aims to provide insights into gene regulation by linking gene expression data from microarray experiments with the distribution of cis modulators of transcription. It is the first systematic effort to collect gene expression data from microarrays and link them with the distribution of (TG/CA)n repeats. In this release, the database contains gene expression data of blood leukocytes from 13 normal human individuals (five pairs of monozygotic twins and three unrelated individuals) measured using HG-U95A oligonucleotide microarrays containing probes for ~10,000 genes.
EXPOLDB provides information for the 2,888 genes that were differentially expressed
(signal log ratio > 1.585) in unrelated individuals and for 212 such genes in twins. Information
on mean expression and variability (coefficient of variation, CV) can be accessed for the
more than 5,000 genes that were expressed in blood and had a present (P) call. The database
also provides information on the expression status of 542 known housekeeping
genes. It links expression profiles with the distribution of (TG/CA)n
repeats and can serve as a resource for examining the role of these repeats.
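The two variability measures mentioned above can be illustrated with a minimal sketch. It assumes that CV is the sample standard deviation divided by the mean, that the signal log ratio can be approximated as the base-2 logarithm of the ratio of two signals, and that the cut-off applies to the magnitude of the ratio. The function names are hypothetical, and the array analysis software derives its signal log ratio from probe-level data, which this sketch does not attempt.

```python
import math
import statistics

# log2(3) is about 1.585, so the cut-off corresponds to a roughly 3-fold change.
SLR_CUTOFF = 1.585


def coefficient_of_variation(signals):
    """CV = sample standard deviation / mean (assumed definition)."""
    mean = statistics.mean(signals)
    if mean == 0:
        raise ValueError("CV is undefined when the mean signal is zero")
    return statistics.stdev(signals) / mean


def signal_log_ratio(signal_a, signal_b):
    """Simplified signal log ratio: log2 of the ratio of two positive signals."""
    return math.log2(signal_a / signal_b)


def is_differentially_expressed(signal_a, signal_b, cutoff=SLR_CUTOFF):
    """Assumption: the cut-off is applied to the magnitude of the log ratio."""
    return abs(signal_log_ratio(signal_a, signal_b)) > cutoff
```

Under these assumptions, a 4-fold difference (log ratio 2) passes the cut-off, while a 2-fold difference (log ratio 1) does not.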
To make the results more comprehensive, information on annotation, chromosomal location, cellular localization, Gene Ontology, biochemical roles of the gene products and tissue-specific expression is provided, along with hyperlinks to other public databases.
Because the database incorporates both genotype and phenotype data, it can, with the contribution of additional data sets from other investigators and from our own studies, serve as a unique resource for those who study the effects of repetitive sequences on gene expression.
Salient features of EXPOLDB
Query Options
The information embedded in EXPOLDB can be retrieved through the 'Query EXPOLDB' page. This page provides information in two categories:
I) Expression in Blood
II) Differentially Expressed Genes
The user can select either of these two query categories. Both query pages provide the following search options:
A. Chromosome:
This query option searches for the expression and variability of genes present on "All" chromosomes or on a selected chromosome.
The default option lists genes from "All" chromosomes; selecting a chromosome gives the chromosome-specific list of genes. The user can also select multiple chromosomes from the list box by holding the Shift key.
B. Gene Symbol:
This query option allows the user to search for a given gene in the database. The user should enter the HGNC-approved gene symbol in the input box, e.g. STAT6.
A wildcard search is also provided. For example, the gene STAT6 can be found by entering the wildcard pattern "STAT*".
The wildcard search option also allows the user to find genes belonging to a closely related family; for example, RUNX* will find RUNX1, RUNX2 and RUNX3.
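The behaviour of a trailing-asterisk search can be sketched as follows. This is only an illustration, not the database's own search code: the function name and the symbol list are hypothetical, and case-insensitive matching is an assumption.

```python
from fnmatch import fnmatchcase


def search_symbols(pattern, symbols):
    """Return the symbols matching a pattern in which '*' stands for any run of characters.

    Note: fnmatchcase also gives '?' and '[...]' a special meaning.
    Matching is made case-insensitive here by upper-casing both sides (an assumption).
    """
    pattern = pattern.upper()
    return [s for s in symbols if fnmatchcase(s.upper(), pattern)]


# Hypothetical list of gene symbols used only for this illustration.
symbols = ["STAT1", "STAT6", "RUNX1", "RUNX2", "RUNX3", "LDHA"]

print(search_symbols("RUNX*", symbols))  # family search
print(search_symbols("STAT6", symbols))  # exact symbol
```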
C. Gene Definition
This query option allows the user to search for a gene on the basis of a keyword in its annotation. For example, a search for the keyword "protease" gives the list of genes whose annotation contains the word "protease". This query can help in searching for genes on the basis of their function.
D. UniGene ID
The genes in the database can be searched on the basis of their UniGene Cluster ID. The user should enter the complete UniGene ID in the search box. No wildcard search option is available for this query.
E. Functional pathways
Information on 134 biochemical pathways from the KEGG and GenMAPP databases is available for the genes present in this database. The user can retrieve information by choosing from the list of available pathways in the list box or by entering the name of the functional pathway; for example, entering 'glycolysis' into the field gives the list of genes available in EXPOLDB that belong to this pathway. Since the pathway information is compiled from two different resources, there is a small overlap: the Glycolysis and Gluconeogenesis pathways are available as 'Glycolysis_and_Gluconeogenesis' from GenMAPP and as 'Glycolysis/Gluconeogenesis' from KEGG. Similarly, the Pentose Phosphate Pathway is available as 'Pentose_phosphate_pathway' from GenMAPP and as 'Pentose phosphate pathway' from KEGG. By convention, KEGG pathway names use "/" to split words, whereas GenMAPP names use "_". The reference list of the pathways incorporated in EXPOLDB is available in the 'Biochemical Pathway' section.
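When comparing pathway names across the two sources, the differing separators can be ignored with a small normalisation step. The sketch below is one possible approach, not a feature of EXPOLDB; the function names are hypothetical, and dropping the connective "and" and ignoring word order are assumptions made for this illustration.

```python
import re


def pathway_key(name):
    """Reduce a pathway name to a comparable key.

    Splits on '/', '_', spaces and any other non-alphanumeric character,
    lower-cases the words and drops the connective 'and'.
    Word order is ignored because the key is a set.
    """
    tokens = re.split(r"[^A-Za-z0-9]+", name.lower())
    return frozenset(t for t in tokens if t and t != "and")


def same_pathway(name_a, name_b):
    """True when two names reduce to the same set of words."""
    return pathway_key(name_a) == pathway_key(name_b)


# GenMAPP-style and KEGG-style names from the text above.
print(same_pathway("Glycolysis_and_Gluconeogenesis", "Glycolysis/Gluconeogenesis"))
print(same_pathway("Pentose_phosphate_pathway", "Pentose phosphate pathway"))
```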
The number of genes returned for a pathway, however, depends on their present (P) call in the array experiments or on the presence of the probe sets on the HG U95Av2 array. An example for the Glycolysis pathway is also available in the 'Biochemical Pathway' section.
A known polymorphic (TG/CA)n repeat can be searched by giving its marker ID, and the database can be searched for the gene that contains this polymorphic marker. Either the D number (for example, "D12S1644") or the GenBank accession number (for example, "Z53110") can be entered in this box. No wildcard search option is available for this query box.
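The two kinds of marker identifier can be told apart by their shape. The following is a rough sketch under stated assumptions: the patterns (a D number as "D", a chromosome, "S" and a serial number; an accession as one letter plus five digits or two letters plus six digits) are simplifications chosen for illustration, and the function name is hypothetical. It does not describe how EXPOLDB itself validates input.

```python
import re

# Assumed shapes, for illustration only.
D_NUMBER = re.compile(r"^D(\d{1,2}|X|Y)S\d+$")           # e.g. D12S1644
ACCESSION = re.compile(r"^([A-Z]\d{5}|[A-Z]{2}\d{6})$")  # e.g. Z53110


def classify_marker_id(text):
    """Classify a query string as a D number, a GenBank accession, or unknown."""
    value = text.strip().upper()
    if D_NUMBER.match(value):
        return "d_number"
    if ACCESSION.match(value):
        return "genbank_accession"
    # Wildcards such as 'D12S*' fall through here, in line with the
    # statement that no wildcard search is available for this box.
    return "unknown"


print(classify_marker_id("D12S1644"))
print(classify_marker_id("Z53110"))
```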
If the user wants to find out whether a gene of interest contains a polymorphic repeat, the HGNC-approved gene symbol can be submitted in the Gene Symbol input box, or a keyword can be entered in the Gene Definition input box. The 'EXPOL profile' page of the given gene will contain information about the presence of any known polymorphic (TG/CA)n repeats, as obtained from the CEPH database.
G. Variability
The variability of a gene is defined in the database by the "Coefficient of Variation (CV)" or the "Signal Log Ratio".
Differentially expressed genes can be searched in the query form entitled "Differential Expression" on the basis of the range of expression variability, measured as signal log ratio. The user can:
1) Enter a range for Signal Log Ratio (minimum available is 1.6)
2) Keep the default option, "All", which is used for all other queries
3) Select a 'Signal Log Ratio' range from the list of specified ranges
The expression status and the variability in expression (CV) of a gene can be queried through the form "Expression in Blood". The user can:
1) Keep the default option, "All", which is used for all other queries
2) Select a CV range from the list of specified ranges
3) Enter a range for CV (minimum is 0, maximum is 0.5)
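The range options above amount to checking a user-entered minimum and maximum against the stated limits and then keeping the genes whose value falls inside the range. A minimal sketch, assuming inclusive bounds and using hypothetical function names and record fields:

```python
def parse_range(minimum, maximum, lower_bound, upper_bound=None):
    """Validate a user-entered numeric range against the allowed limits."""
    low, high = float(minimum), float(maximum)
    if low > high:
        raise ValueError("minimum must not exceed maximum")
    if low < lower_bound:
        raise ValueError("minimum is below the allowed limit of %s" % lower_bound)
    if upper_bound is not None and high > upper_bound:
        raise ValueError("maximum is above the allowed limit of %s" % upper_bound)
    return low, high


def filter_by_range(records, key, value_range=None):
    """Keep records whose `key` value lies in the range; None stands for 'All'."""
    if value_range is None:
        return list(records)
    low, high = value_range
    return [r for r in records if low <= r[key] <= high]


# Limits taken from the text: signal log ratio >= 1.6, CV between 0 and 0.5.
slr_range = parse_range("1.6", "3", lower_bound=1.6)
cv_range = parse_range(0, 0.2, lower_bound=0.0, upper_bound=0.5)

# Hypothetical records, not real EXPOLDB values.
records = [{"symbol": "GENE_A", "cv": 0.1}, {"symbol": "GENE_B", "cv": 0.4}]
print(filter_by_range(records, "cv", cv_range))
```

Passing `None` as the range mirrors the default "All" option, so a query that restricts only the chromosome or pathway is not narrowed by variability.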
I. Submitting the Query
The query can be submitted by clicking the Submit button at the end of the form. Submitting the query returns the 'Result page', which displays the list of genes matching the query. For example, submitting the keyword "glycolysis" in the "Biochemical Pathway" query option with chromosome 11 selected returns the glycolysis genes located on chromosome 11.
By clicking on the Gene Symbol, the user can obtain a detailed description of the gene, including functional information, tissue expression, expression values, variability and the distribution of (TG/CA)n and Alu repeats. Hyperlinks on the EXPOL profile page lead to the source databases (available on the web) to make the information more comprehensive.
SimRep, an online tool developed in Perl, identifies dinucleotide
and other repeats in a given nucleotide sequence. The nucleotide sequence has to be submitted
in raw format (plain sequence containing only A, T, G or C nucleotides, without any spaces).
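To give an idea of what detecting a dinucleotide repeat in a raw sequence involves, here is a short Python sketch. It is not SimRep's implementation (SimRep is written in Perl and its algorithm is not described here); the function names and the default minimum of six repeat units are assumptions made for this illustration.

```python
import re


def find_dinucleotide_repeats(sequence, unit="TG", min_units=6):
    """Return (start, end, n_units) for each run of `unit` repeated at least min_units times."""
    sequence = sequence.upper()
    # Raw format only: A, T, G or C, with no spaces or other characters.
    if not re.fullmatch(r"[ACGT]+", sequence):
        raise ValueError("sequence must contain only A, T, G or C, without spaces")
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(unit), min_units))
    return [(m.start(), m.end(), (m.end() - m.start()) // len(unit))
            for m in pattern.finditer(sequence)]


def find_tg_ca_repeats(sequence, min_units=6):
    """(TG)n on one strand reads as (CA)n on the other, so both units are reported."""
    hits = []
    for unit in ("TG", "CA"):
        hits += [(unit,) + h for h in find_dinucleotide_repeats(sequence, unit, min_units)]
    return sorted(hits, key=lambda h: h[1])  # order by start position


seq = "AAA" + "TG" * 8 + "CCC" + "CA" * 3 + "GGG"
print(find_tg_ca_repeats(seq))
```

Positions are 0-based with an exclusive end, following Python's slicing convention.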
The list of HGNC gene symbols of human genes (version Feb 2006) available in EXPOLDB can be accessed from the Gene List.
©2003 IGIB