1 Sep 2010 16:38
Re: revisiting genomic coordinates to gene
Thank you very much for your suggestions!

Thanks,
Andrew

On Wed, Sep 1, 2010 at 3:53 AM, Vincent Carey <stvjc@...> wrote:

> There are many possible approaches and possible pitfalls.  Surely the
> following is relevant:
>
> > get("CTNNB1", revmap(org.Hs.egSYMBOL))
> [1] "1499"
>
> > get("1499", org.Hs.egCHRLOC)
>        3
> 41240941
> > get("1499", org.Hs.egCHRLOCEND)
>        3
> 41281939
>
> Your location lies within these limits.  You could do this more
> systematically by defining a collection of Entrez Gene IDs and
> building an IRanges or GRanges instance that stores all the
> "gene boundary" information for these IDs.  You will have to attend
> to signs and multiplicities, and to build versions.
>
> The GenomicFeatures makeTranscriptDb* facilities are potentially
> useful when one is interested in transcribed or exonic regions
> specifically.  In the following, tx.3 is an extract from the result
> of makeTranscriptDbFromUCSC("hg18"):
>
> > get("1499", org.Hs.egUCSCKG)
> [1] "uc003ckp.2" "uc003ckq.2" "uc003ckr.2" "uc003cks.2" "uc003ckt.1"
> [6] "uc010hia.1" "uc011azf.1" "uc011azg.1"
> > tx.3[ elementMetadata(tx.3)$tx_name %in% .Last.value, ]
> GRanges with 6 ranges and 2 elementMetadata values
>     seqnames               ranges strand |     tx_id     tx_name
>        <Rle>            <IRanges>  <Rle> | <integer> <character>
> [1]     chr3 [41211405, 41255849]      + |     11545  uc010hia.1
> [2]     chr3 [41215946, 41256943]      + |     11546  uc003ckp.2
> [3]     chr3 [41215946, 41256943]      + |     11547  uc003ckq.2
> [4]     chr3 [41215946, 41256943]      + |     11548  uc003ckr.2
> [5]     chr3 [41249904, 41253941]      + |     11550  uc003cks.2
> [6]     chr3 [41252167, 41253962]      + |     11551  uc003ckt.1
>
> seqlengths
>         chr1 chr1_random       chr10 ... chrX_random        chrY
>    247249719     1663265   135374737 ...     1719168    57772954
>
> and there are undoubtedly ways to use biomaRt to address your concern.
>
> Perhaps the following is also of interest:
>
> > findOverlaps(IRanges(start=41266083,width=1), ranges(tx.3))
> An object of class "RangesMatching"
> Slot "matchMatrix":
>      query subject
> [1,]     1    2080
> [2,]     1    2081
>
> Slot "DIM":
> [1]    1 3528
>
>
> > tx.3[2080:2081,]
> GRanges with 2 ranges and 2 elementMetadata values
>     seqnames               ranges strand |     tx_id     tx_name
>        <Rle>            <IRanges>  <Rle> | <integer> <character>
> [1]     chr3 [41263094, 41294629]      - |     11552  uc003cku.2
> [2]     chr3 [41263094, 41978664]      - |     11553  uc003ckv.2
>
> seqlengths
>         chr1 chr1_random       chr10 ... chrX_random        chrY
>    247249719     1663265   135374737 ...     1719168    57772954
>
> So it seems your location is in a region that is said to be
> transcribed.  I could not find an Entrez Gene ID associated with the
> "known gene" tx_name values just above.
>
> > sessionInfo()
> R version 2.12.0 Under development (unstable) (2010-06-30 r52417)
> Platform: x86_64-apple-darwin10.3.0/x86_64 (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  tools     utils     methods
> [8] base
>
> other attached packages:
>  [1] org.Hs.eg.db_2.4.1     RSQLite_0.9-1          DBI_0.2-5
>  [4] AnnotationDbi_1.11.1   Biobase_2.9.0          GenomicFeatures_1.1.11
>  [7] GenomicRanges_1.1.15   IRanges_1.7.32         weaver_1.15.0
> [10] codetools_0.2-2        digest_0.4.2
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.17.5    Biostrings_2.17.26 RCurl_1.4-2        XML_3.1-0
> [5] biomaRt_2.5.1      rtracklayer_1.9.3
>
>
> On Tue, Aug 31, 2010 at 11:43 PM, Andrew Yee <yee@...> wrote:
> > I'm interested in converting genomic coordinates to gene names, with
> > potential use of the org.Hs.eg.db library, e.g. converting
> > chr3:41,266,083 to CTNNB1.
> >
> > I know that this topic has been addressed before, see e.g.:
> >
> > https://stat.ethz.ch/pipermail/bioconductor/2009-January/025906.html
> > (discusses use of overlap in IRanges)
> > https://stat.ethz.ch/pipermail/bioconductor/2009-October/030140.html
> >
> > I was wondering if there have been any new solutions or new packages that
> > address this problem since these threads.
> >
> > Thanks,
> > Andrew
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor@...
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
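
As a rough illustration of the "gene boundary" idea in the quoted reply, the sketch below uses base R only and hard-codes the coordinates printed in this thread, so it needs no annotation packages. The data frames and the helper `overlapping` are hypothetical names introduced here; they are not part of IRanges or org.Hs.eg.db, and real code would more likely build an IRanges or GRanges object as suggested in the reply. The abs() step assumes the org.Hs.eg.db convention of marking minus-strand locations with a negative sign.

```r
# Gene boundaries copied from the org.Hs.egCHRLOC / org.Hs.egCHRLOCEND
# output quoted in this thread (Entrez Gene ID 1499, chromosome 3).
genes <- data.frame(
  entrez = "1499",
  symbol = "CTNNB1",
  chrom  = "3",
  start  = 41240941,
  end    = 41281939,
  stringsAsFactors = FALSE
)

# Transcript ranges copied from the two tx.3 printouts in this thread (hg18).
transcripts <- data.frame(
  tx_name = c("uc010hia.1", "uc003ckp.2", "uc003ckq.2", "uc003ckr.2",
              "uc003cks.2", "uc003ckt.1", "uc003cku.2", "uc003ckv.2"),
  chrom   = "3",
  start   = c(41211405, 41215946, 41215946, 41215946,
              41249904, 41252167, 41263094, 41263094),
  end     = c(41255849, 41256943, 41256943, 41256943,
              41253941, 41253962, 41294629, 41978664),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the rows whose closed interval contains pos.
# Signs are dropped with abs() so that negative (minus-strand) locations
# are compared on the same scale as positive ones.
overlapping <- function(tbl, chrom, pos) {
  lo <- pmin(abs(tbl$start), abs(tbl$end))
  hi <- pmax(abs(tbl$start), abs(tbl$end))
  tbl[tbl$chrom == chrom & lo <= pos & pos <= hi, , drop = FALSE]
}

# The position from the original question: chr3:41,266,083.
gene_hits <- overlapping(genes, "3", 41266083)
tx_hits   <- overlapping(transcripts, "3", 41266083)

print(gene_hits$symbol)
print(tx_hits$tx_name)
```

With the values quoted in this thread, the position falls inside the CHRLOC/CHRLOCEND boundaries for CTNNB1 but, among the hg18 transcripts shown, only inside uc003cku.2 and uc003ckv.2. That may be an instance of the build-version pitfall mentioned in the reply, though the thread does not say which build the query coordinate came from.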