Pj Dias | 25 Apr 20:25 2012
Re: Motif search -- access to JASPAR, MotIV package, more TF-PWM relationships?

Regarding S. cerevisiae, there is a great resource, the YEASTRACT database
(http://www.yeastract.com/).

It compiles the motifs already identified, documented regulation based on
wet-lab research, and potential regulation based on comparing motifs
against the promoter regions of genes.

It is reviewed and updated regularly, with new motifs and newly
documented regulation being added.
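
To make the motif-against-promoter comparison mentioned above a bit more
concrete, here is a minimal sketch in base R. It is only an illustration and
not YEASTRACT's actual procedure: the promoter string, the TGASTCA consensus
and the helper names to_regex and find_sites are all made up for the example.
It scans the forward strand only and expects uppercase input.

```r
# Hypothetical data: a made-up promoter and an IUPAC consensus (S = C or G).
promoter  <- "TTGACGTGACTCATTTACGCGTGAGTCAAA"
consensus <- "TGASTCA"

# Map each IUPAC nucleotide code to a regular-expression character class.
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
           K = "[GT]", M = "[AC]", N = "[ACGT]")

# Turn a consensus string into a regular expression, one class per position.
to_regex <- function(motif) {
  paste(iupac[strsplit(motif, "")[[1]]], collapse = "")
}

# Slide a window of the motif's width along the sequence and return the
# 1-based start positions where the window matches the consensus.
# Overlapping matches are reported because every window is tested separately.
find_sites <- function(motif, dna) {
  w       <- nchar(motif)
  starts  <- seq_len(max(nchar(dna) - w + 1, 0))
  windows <- substring(dna, starts, starts + w - 1)
  starts[grepl(paste0("^", to_regex(motif), "$"), windows)]
}

hits <- find_sites(consensus, promoter)
print(hits)
```

A real analysis would also scan the reverse complement and would normally
score a PWM rather than match a consensus exactly.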

pj

On 25 April 2012 15:12, Steve Lianoglou <mailinglist.honeypot@...> wrote:

> Hi,
>
> To carry on the MEME stuff, a biostar post just pointed me to an
> updated scoring metric in tomtom which is made available in the latest
> MEME software suite:
>
> http://bioinformatics.oxfordjournals.org/content/27/12/1603.full
>
> Perhaps wrapping parts of the MEME suite into an R library would be
> useful, no?
>
> You might find the FIRE (and FIRE-pro) suite of tools also useful for
> motif discovery, as well:
>
> http://physiology.med.cornell.edu/faculty/elemento/lab/software.shtml
>
> Related to that, S. Tavazoie gave a talk at the recent CSHL/sysbio
> meeting and presented TEISER, which seems pretty cool if you're
> looking for structural motifs:
>
> https://tavazoielab.c2b2.columbia.edu/TEISER/
>
> -steve
>
> On Wed, Apr 25, 2012 at 9:44 AM, Zhu, Lihua (Julie)
> <Julie.Zhu@...> wrote:
> > Paul,
> >
> > Thanks for the positive feedback on FlyFactorSurvey! The motifs in this
> > database are generated using the bacterial one-hybrid method (B1H and
> > B1H-seq). All the public motifs can be downloaded freely. It would be
> > useful to have a Bioc data package, containing curated and current motifs
> > from all organisms if available, that interfaces with MotIV.
> >
> > MEME works very well in finding motifs from B1H-seq data (Christensen et
> > al., Nucleic Acids Research 2011, Vol. 39, No. 12, e83), although only
> > limited motif discovery tools were compared in the paper. Currently, we
> > are working on whether motif discovery can be improved with B1H-seq data.
> >
> > As I understand, MEME is for de novo motif discovery, TOMTOM and STAMP
> > are for testing whether the motif returned by a motif finder is
> > significantly similar to a known motif, and clover is for searching known
> > motifs in a given set of sequences. We are thinking of adding clover to
> > our website.
> >
> > I am looking forward to your collated survey results.
> >
> > Best regards,
> >
> > Julie
> >
> >
> > On 4/24/12 11:02 PM, "Paul Shannon" <pshannon@...> wrote:
> >
> >> Hi Julie,
> >>
> >> FlyFactorSurvey looks great.  Would that we had such a resource
> >> (curated, current, and growing) for all organisms!
> >>
> >> A few questions, if I may:
> >>
> >>   1) What role with respect to FlyFactorSurvey do you picture us taking
> >>      here at BioC?  How can we help?
> >>
> >>   2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends meme and
> >>      TOMTOM for motif comparison.  Do you use them yourself?  If so, can
> >>      you tell us about their strengths and weaknesses?  How do they
> >>      compare to clover?  (http://zlab.bu.edu/clover/)
> >>
> >> In that same spirit -- trying to find out more about this topic -- here
> >> are some more questions:
> >>
> >>   3) The JASPAR database seems to be mostly unchanged since 2009
> >>      (http://jaspar.genereg.net/html/DOWNLOAD).  Does anyone know their
> >>      update policy?
> >>
> >>   4) Is TRANSFAC only for license holders?
> >>
> >>   5) Are there any other organism-specific gems like FlyFactorSurvey to
> >>      be discovered out on the web?
> >>
> >> Thanks!
> >>
> >>  - Paul
> >>
> >> On Apr 24, 2012, at 3:16 PM, Zhu, Lihua (Julie) wrote:
> >>
> >>> Paul,
> >>>
> >>> Thanks so much for the comprehensive summary of existing capability of
> >>> Bioc and other resources for motif discovery and matching!
> >>>
> >>> Here is my response to your great initiative to collect use cases and
> >>> open data resources.
> >>>
> >>> Here is an open data source for Drosophila which we developed:
> >>> http://pgfe.umassmed.edu/TFDBS/
> >>> http://nar.oxfordjournals.org/content/early/2010/11/19/nar.gkq858.full
> >>>
> >>> As you pointed out, there are several excellent Bioconductor packages
> >>> available for the two common cases of motif problems, i.e., de novo
> >>> motif discovery and motif matching to known motifs. It would be useful
> >>> to have more motif databases available for motif comparison programs
> >>> such as MotIV. In addition, we use clover to search for known motifs in
> >>> a given set of sequences.
> >>>
> >>> Many thanks for sharing your insights!
> >>>
> >>> Best regards,
> >>>
> >>> Julie
> >>>
> >>>
> >>> On 4/24/12 3:02 PM, "Paul Shannon" <pshannon@...> wrote:
> >>>
> >>>> The recent flurry of interest in sequence motifs here on the bioc list
> >>>> suggests to us that maybe we at Bioconductor could strengthen our
> >>>> infrastructure for this kind of work.  If this work interests you --
> >>>> either as a package creator, or as a package user -- please suggest
> >>>> ideas or use cases.  What do you need?  I will collect and collate the
> >>>> responses.  We hope to identify places where Bioc can help out.
> >>>>
> >>>> For background:  we already have a number of packages (rGADEM, MotIV,
> >>>> cosmo, BCRANK, motifRG) which address, with different strengths, what
> >>>> I believe to be the two aspects of the motif problem:
> >>>>
> >>>>  1) Detecting enriched motifs in DNA sequence, or in ChIP-seq data
> >>>>     (rGADEM, cosmo, motifRG, BCRANK)
> >>>>  2) Predicting the sequence motifs which bind to these enriched
> >>>>     motifs, and what binding molecules they belong to (MotIV)
> >>>>
> >>>> In the past, a lot of sequence motif/binding work has addressed the
> >>>> search for transcription factor binding sites and their cognate
> >>>> transcription factors.  miRNAs, phosphorylation and methylation all
> >>>> pose related problems.  Is there support which we can practically
> >>>> offer here as well?
> >>>>
> >>>> In addition to Bioc packages, there are of course many worthwhile
> >>>> websites and external tools:  JASPAR, meme, STAMP (and TRANSFAC, for
> >>>> those with a license).  Nooshin mentioned the Arabidopsis-specific
> >>>> 'AthaMap' (http://www.athamap.de).  Are there other open-source data
> >>>> repositories like this for other organisms?  C. elegans, as Julie
> >>>> requested?
> >>>>
> >>>> Questions, suggestions, use cases and data sources are all welcome.
> >>>>
> >>>> Thanks!
> >>>>
> >>>> - Paul
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Apr 24, 2012, at 10:47 AM, Zhu, Lihua (Julie) wrote:
> >>>>
> >>>>> Eloi,
> >>>>>
> >>>>> I would like to use MotIV for a C. elegans dataset. What data source
> >>>>> would you recommend for matchMotif? Many thanks for your help!
> >>>>>
> >>>>> Best regards,
> >>>>>
> >>>>> Julie
> >>>>>
> >>>>>
> >>>>> On 4/24/12 1:28 PM, "Mercier Eloi" <emercier@...> wrote:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> I am one of the developers of MotIV. I will be happy to help you if
> >>>>>> you have any questions regarding the package.
> >>>>>>
> >>>>>> First, I want to mention that in the PLoS ONE paper, we used PICS,
> >>>>>> rGADEM and MotIV as a pipeline, but MotIV can be used stand-alone.
> >>>>>> Some of the advanced functions won't be available though.
> >>>>>>
> >>>>>> Since the PWMs in MotIV correspond to human TFs, you may have to use
> >>>>>> your own list of PWMs. What MotIV needs is a simple list of matrices
> >>>>>> (head(jaspar) to view the format).
> >>>>>> JASPAR's PWMs can be easily downloaded but it seems it only contains
> >>>>>> ~20 motifs. On the other hand, AthaMap has more motifs but I did not
> >>>>>> manage to find an easy way to get them. Another place to look at is
> >>>>>> the AGRIS website
> >>>>>> (http://arabidopsis.med.ohio-state.edu/downloads.html).
> >>>>>>
> >>>>>> If you're only interested in the identification of the motifs and do
> >>>>>> not want to do further analysis with R, I recommend you look at
> >>>>>> http://www.benoslab.pitt.edu/stamp for the identification of your
> >>>>>> motifs.
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Eloi Mercier
> >>>>>>
> >>>>>>
> >>>>>> On 12-04-24 07:36 AM, nooshin wrote:
> >>>>>>> Thanks a lot for your suggestion. I will for sure have a look and
> >>>>>>> inform you.
> >>>>>>> Bests,
> >>>>>>> Nooshin
> >>>>>>>
> >>>>>>>
> >>>>>>> On 04/24/2012 04:15 PM, Tim Triche, Jr. wrote:
> >>>>>>>> Ah, I see.  GSL is a useful library to have installed regardless.
> >>>>>>>> Hope things work out.  I found your exchanges with Paul to be
> >>>>>>>> useful reading, but obviously I was not reading closely enough,
> >>>>>>>> since Paul started off his code sample with biocLite('MotIV').
> >>>>>>>> Oops :-o
> >>>>>>>>
> >>>>>>>> Here is a paper that I found interesting, which does go into some
> >>>>>>>> detail towards a "bulk" approach, from Gottardo's group:
> >>>>>>>>
> >>>>>>>> http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016432
> >>>>>>>>
> >>>>>>>> Perhaps it will be useful to you as well, would be curious to hear
> >>>>>>>> if so.
> >>>>>>>>
> >>>>>>>> --t
> >>>>>>>>
> >>>>>>>> On Tue, Apr 24, 2012 at 7:00 AM, nooshin <n_omranian@...> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>    Thanks, it's already been solved; it needs the GSL package, which
> >>>>>>>>    is a bit problematic, but I solved it already.
> >>>>>>>>
> >>>>>>>>    But it includes only 5 matrices for Arabidopsis, both on the
> >>>>>>>>    webpage and in the package!
> >>>>>>>>    I'm downloading manually from AthaMap!
> >>>>>>>>
> >>>>>>>>    Thanks again; I'll keep waiting for the 'bulk' approach.
> >>>>>>>>
> >>>>>>>>    Bests,
> >>>>>>>>    Nooshin
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>    On 04/24/2012 03:16 PM, Tim Triche, Jr. wrote:
> >>>>>>>>>    source("http://bioconductor.org/biocLite.R")
> >>>>>>>>>    biocLite("MotIV")
> >>>>>>>>>
> >>>>>>>>>    ought to do the trick for you
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>    On Tue, Apr 24, 2012 at 1:01 AM, nooshin <n_omranian@...> wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>        Hi Paul,
> >>>>>>>>>
> >>>>>>>>>        Thanks a lot.
> >>>>>>>>>        I forgot to include bioc, since I only replied to you (not
> >>>>>>>>>        to all).
> >>>>>>>>>
> >>>>>>>>>        I can't install the MotIV package to check. I searched on
> >>>>>>>>>        Google but I couldn't find any solution! Do you have any
> >>>>>>>>>        suggestions for installing this package?
> >>>>>>>>>
> >>>>>>>>>        Bests,
> >>>>>>>>>        Nooshin
> >>>>>>>>>
> >>>>>>>>>        On 04/23/2012 06:35 PM, Paul Shannon wrote:
> >>>>>>>>>> (redirecting this back to the Bioc list...)
> >>>>>>>>>>
> >>>>>>>>>> Hi Nooshin,
> >>>>>>>>>>
> >>>>>>>>>> The 'bulk' approach is not quite so ready as I predicted.
> >>>>>>>>>> I might have something by the end of the week.
> >>>>>>>>>>
> >>>>>>>>>> As for mapping between PWMs and TFs, I have most often done
> >>>>>>>>>> this with 'tom-tom' from the meme website.
> >>>>>>>>>>
> >>>>>>>>>> But I just discovered what looks like a good -- maybe
> >>>>>>>>>> better -- approach:  the Bioconductor MotIV package, which
> >>>>>>>>>> includes a 2010 version of JASPAR.
> >>>>>>>>>> Try this:
> >>>>>>>>>>
> >>>>>>>>>>    source("http://bioconductor.org/biocLite.R")
> >>>>>>>>>>    biocLite('MotIV')
> >>>>>>>>>>    library(MotIV)
> >>>>>>>>>>    browseVignettes('MotIV')
> >>>>>>>>>>
> >>>>>>>>>> The jaspar data in this package has 130 TF-PWM mappings,
> >>>>>>>>>> which appear to be human.  More must be known, and publicly
> >>>>>>>>>> available.  The JASPAR website has a 'JASPAR CORE Plantae'
> >>>>>>>>>> data set that
> >>>>>>>>>>    - is probably what you are interested in
> >>>>>>>>>>    - might be downloadable, and convertible to the form
> >>>>>>>>>>      MotIV wants.
> >>>>>>>>>>
> >>>>>>>>>> Perhaps other readers of the list have other suggestions.
> >>>>>>>>>>
> >>>>>>>>>> If you have any questions on this, please include 'BioC' in
> >>>>>>>>>> your reply, so that we can all get better at this!
> >>>>>>>>>>
> >>>>>>>>>>  - Paul
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Apr 23, 2012, at 6:53 AM, nooshin wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Paul,
> >>>>>>>>>>>
> >>>>>>>>>>> Many thanks for your comprehensive information and code!
> >>>>>>>>>>> I have a question regarding the extraction of PWMs. How and
> >>>>>>>>>>> where can I download these matrices for all TFs for which a
> >>>>>>>>>>> PWM is available? I need them only for Arabidopsis thaliana.
> >>>>>>>>>>> Is there any package in R to which I can give the TF and
> >>>>>>>>>>> receive its PWM? Or any online database from which I can
> >>>>>>>>>>> download them? I have had a big problem since Friday finding
> >>>>>>>>>>> these matrices for different TFs of A.th. It would be
> >>>>>>>>>>> so great if you could help me get these matrices.
> >>>>>>>>>>>
> >>>>>>>>>>>> If you want to do this in bulk, Herve' has some lovely
> >>>>>>>>>>>> code to make that efficient.
> >>>>>>>>>>> Also can I have this? :)
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks a lot in advance.
> >>>>>>>>>>> Best regards,
> >>>>>>>>>>> Nooshin
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>        _______________________________________________
> >>>>>>>>>        Bioconductor mailing list
> >>>>>>>>>        Bioconductor@...
> >>>>>>>>>        https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>>>>>>>        Search the archives:
> >>>>>>>>>        http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>    --
> >>>>>>>>>    /A model is a lie that helps you see the truth./
> >>>>>>>>>    /
> >>>>>>>>>    /
> >>>>>>>>>    Howard Skipper
> >>>>>>>>>    <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> /A model is a lie that helps you see the truth./
> >>>>>>>> /
> >>>>>>>> /
> >>>>>>>> Howard Skipper
> >>>>>>>> <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
>

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
