GENES AND GENOMES This document is licensed under the

1. CHEMICAL COMPOSITION OF LIVING CELLS Cells are composed of water, inorganic ions, and organic (carbon-containing) molecules. Water is by far the most abundant molecule in cells, accounting for about 70% of total cell mass. The interactions between water and the other cell constituents are of fundamental importance in cell biochemistry. The critical property of water is that it is a polar molecule, in which the hydrogen atoms carry a slight positive charge and the oxygen carries a slight negative charge. Because of this polar nature, water molecules can form hydrogen bonds with one another and with other polar molecules, and can also interact with positively or negatively charged ions. As a result of these interactions, ions and polar molecules are readily soluble in water. In contrast, nonpolar molecules, which cannot interact with water, are poorly soluble in an aqueous environment; nonpolar molecules therefore tend to minimize their contact with water by associating closely with one another. The interactions of polar and nonpolar molecules with water and with each other play a fundamental role in the formation of biological structures such as cell membranes. The principal inorganic ions of the cell, sodium (Na⁺), potassium (K⁺), magnesium (Mg²⁺), calcium (Ca²⁺), phosphate (HPO₄²⁻), chloride (Cl⁻), and bicarbonate (HCO₃⁻), make up 1% or less of the cell mass. These ions are involved in a number of aspects of cell metabolism and thus play a critical role in cell function. It is the organic molecules, however, that are the most distinctive constituents of living cells. Most of these organic compounds belong to one of four main classes of molecules: carbohydrates, lipids, proteins, and nucleic acids.
Proteins, nucleic acids, and many carbohydrates (the polysaccharides) are macromolecules formed by the joining (polymerization) of hundreds or thousands of low-molecular-weight precursors: amino acids, nucleotides, and simple sugars. Such macromolecules generally make up 80 to 90% of the dry weight of cells, both prokaryotic and eukaryotic. Lipids are another fundamental constituent of cells. The remainder of the cell mass consists of small organic molecules, the precursors of the macromolecules. In short, the essential biochemistry of cells can be defined by the structure and function of four main classes of organic molecules. This document is licensed under the Attribution-NonCommercial-ShareAlike 2.5 Italy license, available at

1. What is a gene? Definition: A gene is a discrete unit of DNA (or RNA in some viruses) that encodes a nucleic acid or protein product that contributes to or influences the phenotype of the cell or the organism. Genes are the functional units of chromosomal DNA. Each gene not only encodes the structure of some cellular product, but also bears control elements (short sequences) that determine when, where, and how much of that product is synthesized. Most genes encode protein products; special classes of genes encode RNA molecules. The way genes encode proteins is indirect and involves several steps. The first step is to copy (transcribe) the information encoded in the DNA of the gene into a related but single-stranded molecule called messenger RNA. Subsequently, the information in the messenger RNA is translated (decoded) into a string of amino acids called a polypeptide. The polypeptides, on their own or by aggregating with other polypeptides and cell constituents, form the functional proteins of the cell.
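The two steps described in this slide, transcription and translation, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the input is taken to be the coding (sense) strand of the gene, the standard genetic code is used, and translation runs from the first AUG to the first in-frame stop codon. The function names `transcribe` and `translate` and the example sequence are made up for this sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA codons
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = transcribe("ATGGCCAAATGA")  # hypothetical toy gene
print(mrna)             # AUGGCCAAAUGA
print(translate(mrna))  # MAK
```

The peptide is printed in one-letter amino acid codes (M = methionine, A = alanine, K = lysine).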

2. Introns and exons Trying to pinpoint precisely what genes are is complicated by the fact that many eukaryotic genes contain segments of DNA, called introns, interspersed in the transcribed region of the gene. Introns do not contain information for a functional gene product such as a protein. They are transcribed together with the coding regions (called exons) but are then excised from the initial transcript. Since the correct sequence in the introns (as well as in the regulatory region) is necessary to generate a properly sized transcript at the right time and place, introns (along with coding and regulatory regions) should be considered part of the overall functional unit, in other words, part of the gene.
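The excision of introns from the initial transcript can be pictured as keeping only the exon intervals and joining them. The sketch below assumes the exon coordinates are already known (0-based, end-exclusive); the function name `splice` and the toy transcript are hypothetical, and the real cellular machinery recognizes splice sites rather than being handed coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments (0-based, end-exclusive), dropping the introns between them."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Hypothetical toy transcript: exons in upper case, one intron in lower case.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "AAAUGA"
exons = [(0, 6), (18, 24)]

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCAAAUGA
```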

4. Schematic gene structure
Generalized gene structure in prokaryotes and eukaryotes. The coding region (dark green) is the region that contains the information for the structure of the gene product (usually a protein). The adjacent regulatory regions (lime green) contain sequences that are recognized and bound by proteins that make the gene's RNA and by proteins that influence the amount of RNA made.

3. The average length of coding regions
Estimates of the average length of polypeptide chains encoded by genes of various organisms; these values have to be multiplied by 3 to obtain the length of the corresponding coding DNA. Typical values are 1,000 to 1,500 bp.
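The conversion between polypeptide length and coding DNA length is simple arithmetic, since each amino acid is specified by three nucleotides. A minimal sketch follows; the function names are made up for illustration, and the stop codon is deliberately left out of the count, which is an assumption of this sketch.

```python
CODON_LENGTH = 3  # nucleotides per amino acid


def coding_length_bp(polypeptide_length_aa: int) -> int:
    """Coding DNA length in bp for a polypeptide (stop codon not counted)."""
    return polypeptide_length_aa * CODON_LENGTH


def polypeptide_length_aa(coding_bp: int) -> int:
    """Number of whole codons that fit in a coding region of the given length."""
    return coding_bp // CODON_LENGTH


print(coding_length_bp(400))                                     # 1200
print(polypeptide_length_aa(1000), polypeptide_length_aa(1500))  # 333 500
```

Read the other way round, the typical coding lengths of 1,000 to 1,500 bp correspond to polypeptides of roughly 330 to 500 amino acids.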

5. Number of introns-exons per gene
Many eukaryotic genes contain segments of DNA, called introns, interspersed in the transcribed region of the gene. Introns do not contain information for a functional gene product such as a protein. Distribution of the number of exons among genes of three organisms.

6. Genomes and genes The number of genes increases with genome size, but the trend is complicated by repetitive DNA and introns. Counting genes is difficult, even in completely sequenced genomes. The figure of 100,000 for humans is substantially inflated.

7. How many genes in the human genome?
Prior to the human genome sequence, the most commonly cited estimate for the number of protein-coding genes in the human genome was 100,000, even though the basis of this figure was somewhat dubious to begin with. In 2001, when the draft sequences of the human genome were announced, the estimates were lowered to somewhere between 30,000 and 35,000. The completed sequence, published in 2004, provided an even lower estimate of 20,000 to 25,000 genes. A more recent estimate, based on comparing all the human genes with those cataloged for dog and mouse, has lowered the number of genes to below 20,000.

8. How to count genes Gene-prediction programs rely heavily on identifying open reading frames. However, sequences that have a biological function but do not produce a protein have been found in large numbers, and several thousand "genes" that do not code for proteins have been reported. Thus an open reading frame "is not enough" to identify a gene, and an integrated catalog of protein-coding genes should be based more on comparative evidence from different genomes.
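To make the notion of an open reading frame concrete, here is a minimal sketch of an ORF scan. It is not how any particular gene-prediction program works: it only looks at the three forward reading frames, treats an ORF as an ATG followed in frame by the first stop codon, and uses a hypothetical `min_codons` threshold. Real tools also scan the reverse strand and, as the slide notes, need further evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) spans of ATG...stop ORFs in the three forward frames.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    `min_codons` counts the codons before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


seq = "CCATGGCCAAATGAGG"  # hypothetical toy sequence
print(find_orfs(seq))     # [(2, 14)]
```

The span `(2, 14)` corresponds to `ATGGCCAAATGA`. A scan like this finds candidate protein-coding regions only; it says nothing about functional sequences that produce no protein.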

9. Average gene length Intron/exon statistics for various organisms

10. Plasmid genomes Bacterial cells isolated from nature often contain small DNA elements that are not essential for the basic operation of the bacterial cell. These elements are called plasmids. Plasmids are symbiotic molecules that cannot survive at all outside of cells. Even though plasmids are not part of the basic operational system of their host cells, some are quite complex, carrying many genes, so it is quite appropriate to refer to their distinctive DNA as a "plasmid genome." Bacterial plasmids often contain genes that are extremely useful to the bacterial host, for example, by promoting bacterial cell fusion, conferring antibiotic resistance, or producing toxins. Plasmids are also occasionally found in fungal and plant cells. Most are found inside mitochondria and chloroplasts, but some are found in nuclei or in the cytosol. Unlike the bacterial plasmids mentioned above, these eukaryotic plasmids seem to provide no benefits for their hosts; they seem to exist selfishly, only for the purpose of their own propagation. For their replication and maintenance, plasmids depend on the general cellular machinery encoded by the host genome. Bacterial plasmids are most often circular, but there are linear types too. In fungi and plants, linear plasmids are most common, but circular types are known in fungi.

11. Organellar genomes Mitochondrial and chloroplast chromosomes consist of double-stranded DNA molecules. Individual mitochondria and chloroplasts contain identical multiple copies of their chromosomes, and each eukaryotic cell contains several to many of these organelles. The organelle chromosomes contain genes specific to the functions of the organelle concerned. Nevertheless, most of the biological functions that occur inside these organelles are specified by genes in the nuclear genome. There is no overlap with the nuclear genome in gene content. Mitochondria and chloroplasts probably were originally prokaryotic cells that entered and took up a symbiotic relationship inside another cell. Throughout evolution most of the original prokaryotic genes were transferred to the nuclear genome or lost. Mitochondrial genomes can be eliminated in some organisms such as yeasts, but most organisms cannot survive without them, so there is still mutual interdependence between nuclear and organelle subdivisions of the genome. Chloroplasts can be eliminated only in photosynthetic organisms that can survive by taking in preformed nutrients from the environment (that is, that can act as heterotrophs).

12. Most eukaryotic DNA does not include genes
Between genes there is DNA, mostly of unknown function. The size and nature of this DNA vary with the genome. In bacteria and fungi there is little, but in mammals the intergenic regions can be huge. Sequences of DNA that exist quite distant from a given gene can affect the regulation of that gene. They could thus be considered part of the functional gene unit, even though separated by long segments of DNA having nothing to do with the gene in question. In many eukaryotes some of the DNA between genes is repetitive, consisting of several different types of units repeated throughout the genome. Some of the repetitive DNA is dispersed; some is found in contiguous "tandem" arrays. Repetitive DNA is also found in some introns. The extent of this DNA is different in different species, and indeed there is variation of repeat number within species.
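Variation in repeat number at a tandem array can be illustrated by counting back-to-back copies of a repeat unit. The sketch below uses Python's standard `re` module; the function name `longest_tandem_run` and both example alleles are invented for illustration.

```python
import re


def longest_tandem_run(dna: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` found in `dna`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles that differ only in the copy number of a CA repeat.
allele_a = "GGT" + "CA" * 5 + "TTG"
allele_b = "GGT" + "CA" * 8 + "TTG"

print(longest_tandem_run(allele_a, "CA"))  # 5
print(longest_tandem_run(allele_b, "CA"))  # 8
```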

13. Comparing gene densities
Schematic diagram of gene topography in four organisms. Light green = introns; dark green = exons; white = intergenic regions

14. A small fraction of total eukaryotic DNA is coding
In mammals, only a few percent of the DNA is actually coding.

15. Different components of the human genome
Although most prokaryotic chromosomes consist almost entirely of protein-coding genes, such elements make up a small fraction of most eukaryotic genomes. As a prime example, the human genome might contain as few as 20,000 genes, comprising less than 1.5% of the total genome sequence.

16. Junk DNA? Introns account for more than a quarter of the human genome. Pseudogenes are non-functional copies of coding genes. They include 'classical pseudogenes' (direct DNA to DNA duplicates), 'processed pseudogenes' (copies that are reverse transcribed back into the genome from RNA and therefore lack introns) and 'Numts' (nuclear pseudogenes of mitochondrial origin). The human genome is estimated to contain about 19,000 pseudogenes. Transposable elements are divided into Class I elements, which transpose through an RNA intermediate (long interspersed nuclear elements, LINEs; endogenous retroviruses; short interspersed nuclear elements, SINEs; and long terminal repeat, LTR, retrotransposons), and Class II elements, which transpose directly from DNA to DNA (DNA transposons and miniature inverted-repeat transposable elements, MITEs).

17. Coding sequences are needles in the haystack
It is apparent that the coding sequences are only a small part of the genome in most eukaryotes, particularly in humans. Finding these regions is like finding a needle in a haystack. In addition, the genes are not uniformly distributed. There are regions in the genome where the genes are packed together, and regions where they are sparse, where finding genes is like finding water in a desert.
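The uneven distribution of genes can be made visible by counting genes in fixed-size windows along a chromosome. The following is a minimal sketch; the gene start coordinates are invented for illustration, and the 100 kb window size and the name `genes_per_window` are choices made for this example only.

```python
from collections import Counter

WINDOW_BP = 100_000  # window size chosen for this example


def genes_per_window(gene_starts: list[int], window_bp: int = WINDOW_BP) -> Counter:
    """Count gene start positions falling in each consecutive window."""
    return Counter(start // window_bp for start in gene_starts)


# Hypothetical gene start coordinates (bp) on a 500 kb stretch.
gene_starts = [5_000, 22_000, 47_000, 61_000, 90_000, 130_000, 480_000]
counts = genes_per_window(gene_starts)

for window in range(5):
    print(f"{window * 100}-{(window + 1) * 100} kb: {counts[window]} genes")
```

In this made-up example the first window is gene rich (5 genes), while the windows from 200 to 400 kb are empty, the pattern the slide calls a desert.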

18. Categorizing the genes in eukaryotic genomes
Classification schemes based on gene function suggest that all eukaryotes possess the same basic set of genes, but that more complex species have a greater number of genes in each category. For example, humans have the greatest number of genes in all but one of the categories used in the figure, the exception being 'metabolism', where Arabidopsis comes out on top as a result of its photosynthetic capability, which requires a large set of genes not present in the other four genomes included in this comparison. This functional classification reveals other interesting features, notably that C. elegans has a relatively high number of genes whose functions are involved in cell-cell signaling, which is surprising given that this organism has just 959 cells. Humans, who have 10¹³ cells, have only 250 more genes for cell-cell signaling.

19. Overview of the human genome
Genome size is approximately 3,200 Mb.
Gene number is approximately 20,000.
Average gene density is 1 per 100 kb (5% of DNA encodes proteins); some areas are gene rich, others are gene deserts (0 to 64 genes per 100 kb).
Average gene size (including introns) is 27 kb; gene regions account for about 25% of the genome.
Average coding size per polypeptide is 1.3 kb.
Fraction of the genome with coding functions is about 1.5%.
At least 50% of the genome is made of transposable elements (e.g. LINEs and Alus).
Intron number ranges from 0 (in histones) to 234 (titin, a muscle protein).
Hundreds of genes appear to have been transferred directly from bacteria to vertebrate genomes; the mechanism is unknown.
Functions have been assigned to 60% of genes.
The largest human gene is dystrophin (mutated in muscular dystrophy): 2.5 Mb (larger than some bacterial genomes).
There are 1,077 blocks of duplicated regions in the human genome (containing 10,000 genes), which suggests that genome rearrangements are common in evolution.
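A back-of-the-envelope check of how these figures relate to one another can be written as a few lines of arithmetic. The sketch uses only the genome size, gene number, average gene size and 1.3 kb coding size listed above; the variable names are made up. The overview does not say how its own density and percentage figures were derived, so the results should be read as order-of-magnitude estimates and are not expected to match them exactly.

```python
GENOME_BP = 3_200_000_000   # approximately 3,200 Mb
GENE_COUNT = 20_000         # approximate gene number
MEAN_GENE_BP = 27_000       # average gene size, introns included
MEAN_CODING_BP = 1_300      # average coding size per polypeptide

spacing_kb = GENOME_BP / GENE_COUNT / 1_000
gene_fraction = GENE_COUNT * MEAN_GENE_BP / GENOME_BP
coding_fraction = GENE_COUNT * MEAN_CODING_BP / GENOME_BP

print(f"one gene per {spacing_kb:.0f} kb on average")
print(f"gene regions: {gene_fraction:.1%} of the genome")
print(f"coding sequence: {coding_fraction:.1%} of the genome")
```

With these inputs the average spacing comes out at one gene per 160 kb, gene regions at roughly 17% of the genome, and coding sequence at a little under 1%.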
