11 Dec 2012 22:20
Re: Use ChIPpeakAnno to find two-sided nearest genes to a peak
Holly,

I believe that the annotations you obtained from the different resources are different versions, e.g., mm10 from Ensembl. I am travelling today. Jianhong will be happy to help you. If you could keep the thread on the Bioconductor list for others to contribute/benefit, that would be very much appreciated. Thanks!

Best regards,
Julie

On 12/11/12 2:49 PM, "Holly" <xyang2@...> wrote:

Julie,

One more question, about how to annotate intronic peaks. I would appreciate it if you could test the following example and help figure out how to annotate it correctly using ChIPpeakAnno.

For example, I ran the following code based on the updated Bioconductor packages:

    data(TSS.mouse.NCBIM37)
    rd <- RangedData(IRanges(start = 37377492, end = 37378857), space = "chr18")
    annotatePeakInBatch(rd, AnnotationData = TSS.mouse.NCBIM37)

Then I got the following result:

    RangedData with 1 row and 9 value columns across 1 space
                            space               ranges |        peak      strand
                         <factor>            <IRanges> | <character> <character>
    1 ENSMUSG00000073593       18 [37377492, 37378857] |           1           -
                                    feature start_position end_position
                                <character>      <numeric>    <numeric>
    1 ENSMUSG00000073593 ENSMUSG00000073593       37319509     37338176
                         insideFeature distancetoFeature shortestDistance
                           <character>         <numeric>        <numeric>
    1 ENSMUSG00000073593      upstream            -39316            39316
                         fromOverlappingOrNearest
                                      <character>
    1 ENSMUSG00000073593             NearestStart

However, on the UCSC Genome Browser, http://genome.ucsc.edu/cgi-bin/hgTracks (NCBI37/mm9), it is an intron region of gene Pcdha4-9.

Whereas if I try:

    mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
    Annotation <- getAnnotation(mart, featureType = "TSS")
    annotatePeakInBatch(rd, AnnotationData = Annotation)

it gives a totally different result, ENSMUSG00000051242, which is also not what I expected.
    sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: x86_64-pc-linux-gnu (64-bit)

    attached base packages:
    [1] grid      stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] org.Mm.eg.db_2.8.0                  ChIPpeakAnno_2.6.0
     [3] limma_3.14.3                        org.Hs.eg.db_2.8.0
     [5] GO.db_2.8.0                         RSQLite_0.11.2
     [7] DBI_0.2-5                           BSgenome.Ecoli.NCBI.20080805_1.3.17
     [9] BSgenome_1.26.1                     Biostrings_2.26.2
    [11] multtest_2.14.0                     biomaRt_2.14.0
    [13] VennDiagram_1.5.1                   BayesPeak_1.10.0
    [15] rtracklayer_1.18.1                  GenomicFeatures_1.10.1
    [17] AnnotationDbi_1.20.3                Biobase_2.18.0
    [19] GenomicRanges_1.10.5                IRanges_1.16.4
    [21] BiocGenerics_0.4.0                  BiocInstaller_1.8.3

    loaded via a namespace (and not attached):
     [1] bitops_1.0-5     MASS_7.3-22      parallel_2.15.2  RCurl_1.95-3
     [5] Rsamtools_1.10.2 splines_2.15.2   stats4_2.15.2    survival_2.37-2
     [9] tools_2.15.2     XML_3.95-0.1     zlibbioc_1.4.0

Thanks again,
Holly

On 12/10/2012 01:10 PM, Zhu, Lihua (Julie) wrote:

Holly,

Thanks for the link! The BDPs in ChIPpeakAnno are defined purely according to the coordinates of known genes.

Best regards,
Julie

On 12/10/12 1:30 PM, "Holly" <xyang2@...> wrote:

Julie,

A basic question to verify your definition of the bi-directional promoters: did you define them purely according to the coordinates of known genes, or have you referred to experimental data, e.g. the EST experiments described in http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1853124/ ?

I learned a lot from the discussion with you.

Thanks again,
Holly

On 12/10/2012 10:46 AM, Zhu, Lihua (Julie) wrote:

Dear Holly,

I believe that you are interested in finding the peaks that reside in bi-directional promoters. If so, you can use the following functions in ChIPpeakAnno:

    BDP = peaksNearBDP(peaks, AnnotationData = TSS, MaxDistance = 5000)
    c(BDP$percentPeaksWithBDP, BDP$n.peaksWithBDP, BDP$n.peaks)
    all.genes = union(annotated.peaks$feature, BDP$peaksWithBDP$feature)

where annotated.peaks is generated from annotatePeakInBatch using TSS.
To learn more about peaksNearBDP, please type ?peaksNearBDP in R.

If you just want to find genes on both sides of the peaks within a certain distance of the peaks, you can use the following command:

    Annotated.peaks = annotatePeakInBatch(peaks, AnnotationData = TSS, output = "both", select = "all", maxgap = 1000000)

where maxgap can be adjusted according to your needs. Please let me know if this suits your needs. Thanks!

Best regards,
Julie

On 12/10/12 11:19 AM, "Holly" <xyang2@...> wrote:

Dear Lihua,

I am trying to annotate peaks not only with the gene that has the nearest TSS but also with the gene on the other side of the peak. Do you think I can use ChIPpeakAnno to get the genes on both sides of a peak region? If so, what do you suggest?

Thanks a lot,
Holly

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
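
The distancetoFeature of -39316 reported in the thread can be reproduced by hand from the coordinates in the quoted output. The following is a minimal base-R sketch, not ChIPpeakAnno's own code. It assumes that the TSS of a minus-strand feature is its end_position and that the distance is measured from the peak start along the direction of transcription, with negative values meaning upstream. Those assumptions are consistent with the numbers Holly posted, but the variable names are illustrative only.

```r
# Values copied from the annotatePeakInBatch output quoted in the thread.
peak_start     <- 37377492
peak_end       <- 37378857
feature_start  <- 37319509   # start_position of ENSMUSG00000073593
feature_end    <- 37338176   # end_position of ENSMUSG00000073593
feature_strand <- "-"

# Assumption: the TSS is the feature start on "+" and the feature end on "-".
tss <- if (feature_strand == "+") feature_start else feature_end

# Assumption: signed distance from the TSS to the peak start, oriented along
# the direction of transcription, so that negative means upstream.
distance_to_feature <- if (feature_strand == "+") {
  peak_start - tss
} else {
  tss - peak_start
}

# Smallest absolute distance between any peak boundary and any feature boundary.
shortest_distance <- min(
  abs(c(peak_start, peak_end) - feature_start),
  abs(c(peak_start, peak_end) - feature_end)
)

# Two closed intervals overlap when each one starts before the other ends.
overlaps <- peak_start <= feature_end && peak_end >= feature_start

print(distance_to_feature)  # [1] -39316
print(shortest_distance)    # [1] 39316
print(overlaps)             # [1] FALSE
```

Under these assumptions the peak lies entirely outside the listed coordinates of ENSMUSG00000073593, which matches the reported insideFeature value of "upstream". Whether the peak falls in an intron of another gene model depends on the annotation being used.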