From: Aurelien Mazurie <ajmazurie-CULiByY8CLCsTnJN9+BGXg <at> public.gmane.org>
Subject: Best strategy to query linked documents?
Newsgroups: gmane.comp.db.mongodb.user
Date: Thursday 1st July 2010 17:46:18 UTC
Greetings,
I am trying to figure out the best way to store (and query) relationships
between documents in a MongoDB database.

I have three collections in my database: say, one called 'Bag', another
called 'Item', and a third called 'Relationship'. Documents of type
'Relationship' allow me to describe how a given 'Item' is related to a
given 'Bag'. Each 'Relationship' document also contains DBRefs to both a
'Bag' and an 'Item' document. Several 'Relationship' documents can exist
if there are many ways the same 'Item' is linked to the same 'Bag'.

Basically, I am storing a multi-edge directed graph with nodes being 'Bag'
or 'Item' documents, and edges being 'Relationship' documents.

All of this works beautifully to model my data, but so far I have had
trouble getting fast answers to the query 'give me all Items that are
related to this Bag' when I want to filter both the Items and the
Relationships. E.g., "give me all Items with key/value { a: 1 } that are
linked to my Bag through a relationship with key/value
{ type: 'part-of' }".

What would be your opinion? I know I could try other technologies (e.g.,
RDF or graph databases), but for various technical reasons I really want
to keep using MongoDB. Also, storing the list of Items a Bag is connected
to in the Bag object itself looks impractical, as I also need to store
information about this connection. Plus, there can be hundreds of
thousands of Items in one given Bag.

I am considering using server-side functions, and maybe mapreduce, but I am
not sure if this tool (which I have never used) could be of help here.

My current strategy is
(1) to ask, client side, for the list of Relationship documents that match
    a filter (e.g., { type: 'part-of' }), then
(2) to extract the $id of the Item objects these Relationship documents
    point to, then
(3) to query all Item objects that match a filter (e.g., { a: 1 }), which I
    supplement with an { $id: { $in: [long list of Item identifiers I got
    from querying the Relationships] }}.

This is slow, and I even have to split the list of identifiers into chunks
of a few thousand to avoid the 4 MB limit.
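
For illustration, the three steps above might look roughly like this in
Python. This is only a sketch under several assumptions that are not stated
in the message: `relationships` and `items` are pymongo-style collection
handles exposing `find()`, each Relationship stores its DBRefs in fields
named `bag` and `item` (hypothetical names), the driver decodes a DBRef into
an object with an `.id` attribute, and Items are matched on `_id`. The
helper names `chunked` and `related_items` are invented for the example.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` elements."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def related_items(relationships, items, bag_id, rel_filter, item_filter,
                  chunk_size=1000):
    """Return Items matching `item_filter` that are linked to the Bag
    `bag_id` through a Relationship matching `rel_filter`.

    `relationships` and `items` are assumed to be pymongo-style
    collections; the field names `bag` and `item` are hypothetical.
    """
    # Step 1: Relationship documents for this Bag that match the filter.
    # "bag.$id" addresses the $id sub-field of the stored DBRef.
    rel_query = dict(rel_filter)
    rel_query["bag.$id"] = bag_id

    # Step 2: collect the distinct Item ids those Relationships point to.
    # Assumption: the driver returns DBRef-like objects with an `.id`.
    item_ids = {rel["item"].id for rel in relationships.find(rel_query)}

    # Step 3: query Items in chunks so no single $in query grows too large.
    results = []
    for chunk in chunked(item_ids, chunk_size):
        query = dict(item_filter)
        query["_id"] = {"$in": chunk}
        results.extend(items.find(query))
    return results
```

A call would then be something like
related_items(db.Relationship, db.Item, bag_id, {"type": "part-of"},
{"a": 1}), where `db` is an assumed database handle. Collecting the ids
into a set removes duplicates when several Relationships link the same Item
to the same Bag, which shortens the $in lists a little but does not change
the overall two-round-trip shape of the approach.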

Best,
Aurelien Mazurie
