One-to-Many Relationship Example

In MongoDB, a one-to-many relationship can be modeled in two main ways:
  • Embedding: Embed many related documents in an array within the parent document.
  • Referencing: Store related documents in a separate collection and reference them via an identifier.

The examples below use both approaches in the MongoDB shell.

Scenario: Modeling Authors and Books

We will model an author who writes multiple books, which is a classic one-to-many relationship.

1. Embedding Approach (One-to-Many)
  • In this approach, the books are embedded directly inside the author document as an array.
  • Step 1: Inserting Data (Embedding)

  // Switch to (or create) the database
  use libraryDB

  db.authors.insertOne({
    _id: 1,
    name: "George Orwell",
    age: 46,
    books: [
      {
        title: "1984",
        genre: "Dystopian",
        published_year: 1949
      },
      {
        title: "Animal Farm",
        genre: "Political Satire",
        published_year: 1945
      }
    ]
  })

  • In this case, the books field is an array, and each book is stored as an embedded document within the author document.
  • Step 2: Querying Data (Embedding)
  • To retrieve an author along with their books, you can simply query the authors collection:

  db.authors.findOne({ _id: 1 })

  // Output:
  {
    "_id": 1,
    "name": "George Orwell",
    "age": 46,
    "books": [
      {
        "title": "1984",
        "genre": "Dystopian",
        "published_year": 1949
      },
      {
        "title": "Animal Farm",
        "genre": "Political Satire",
        "published_year": 1945
      }
    ]
  }
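
Embedded fields can also be used in a filter through dot notation. The following is a minimal sketch, assuming mongosh and the authors collection created above. The helper name findAuthorsByBookTitle is hypothetical, and db is passed in as an argument only so that the function is self-contained.

```javascript
// Hypothetical helper: find authors that have an embedded book with this title.
// "books.title" uses dot notation to match a field inside the books array.
function findAuthorsByBookTitle(db, title) {
  return db.authors
    .find({ "books.title": title }, { name: 1 }) // projection: _id and name only
    .toArray();
}

// In mongosh, after the insert above:
//   findAuthorsByBookTitle(db, "1984")
```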


Explanation of Embedding:

Advantages:
  • Simpler data retrieval: Since the books are stored directly within the author document, you don’t need to perform additional queries.
  • Single query for updates: The author and their books live in one document, so one write can update both, and MongoDB applies a write to a single document atomically.
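
Updating an author and their embedded books in a single write can be sketched as follows. This assumes mongosh and the embedded authors document from the example above. The helper name addBook is hypothetical, and the extra book is only illustrative data.

```javascript
// Hypothetical helper: append one book to an author's embedded books array.
// $push adds the element, and the write touches only the author document.
function addBook(db, authorId, book) {
  return db.authors.updateOne(
    { _id: authorId },
    { $push: { books: book } }
  );
}

// In mongosh:
//   addBook(db, 1, { title: "Homage to Catalonia", genre: "Memoir", published_year: 1938 })
```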
Disadvantages:
  • Document size limit: MongoDB limits a single document to 16 MB. An author document whose books array is very large or keeps growing can approach that limit.
  • Data duplication: If the same book must also appear under other entities (e.g., publishers), its data has to be copied into each of them.
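
To get a feel for the size limit, the sketch below estimates how many more books of a given shape would fit in an author document. It is an approximation under a stated assumption: it measures the UTF-8 length of the JSON text instead of the real BSON size, and the two are not the same. The function names are hypothetical.

```javascript
// Rough check of how close a document is to MongoDB's 16 MB limit.
// Assumption: JSON text length stands in for BSON size, so this is an
// estimate only and not what the server actually measures.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

function approxDocBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// Estimate how many more books shaped like sampleBook would still fit.
function approxRemainingBooks(authorDoc, sampleBook) {
  const used = approxDocBytes(authorDoc);
  const perBook = approxDocBytes(sampleBook) + 1; // +1 for the separating comma
  return Math.max(0, Math.floor((MAX_DOC_BYTES - used) / perBook));
}
```

With small book documents like the ones in this example, the estimate comes out very large, which is why embedding is usually safe for a bounded list such as one author's books.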
2. Referencing Approach (One-to-Many)
  • In this approach, the books are stored in a separate collection, and the author document holds their _id values in a book_ids array. This avoids embedding large amounts of data inside a single document.
  • Step 1: Inserting Data (Referencing)
  • Insert data into the books collection:

    db.books.insertMany([
      {
        _id: 101,
        title: "1984",
        genre: "Dystopian",
        published_year: 1949
      },
      {
        _id: 102,
        title: "Animal Farm",
        genre: "Political Satire",
        published_year: 1945
      }
    ])

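  • Insert data into the authors collection with references to book_ids. If you already ran the embedding example in the same database, remove that document first (for example with db.authors.deleteOne({ _id: 1 })), because _id values must be unique within a collection: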

    db.authors.insertOne({
      _id: 1,
      name: "George Orwell",
      age: 46,
      book_ids: [101, 102]  // Array of references to the books
    })

  • Step 2: Querying Data (Referencing)
  • To get an author and their books, you need to:
    1. Query the authors collection to get the book_ids.
    2. Use the book_ids to query the books collection.
  • Step 2.1: Find the author:

    var author = db.authors.findOne({ _id: 1 })

  • Step 2.2: Find the books using the book_ids:

    db.books.find({ _id: { $in: author.book_ids } })

    // Output (from the books collection):
    [
      {
        "_id": 101,
        "title": "1984",
        "genre": "Dystopian",
        "published_year": 1949
      },
      {
        "_id": 102,
        "title": "Animal Farm",
        "genre": "Political Satire",
        "published_year": 1945
      }
    ]
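
The two steps can be wrapped in one function. The following is a minimal sketch, assuming mongosh (where these calls read synchronously) and the authors and books collections from this example. The helper name findAuthorWithBooks is hypothetical.

```javascript
// Hypothetical helper: the two-step lookup wrapped in one function.
function findAuthorWithBooks(db, authorId) {
  // Step 1: fetch the author to get the book_ids.
  const author = db.authors.findOne({ _id: authorId });
  if (!author) {
    return null; // no such author
  }
  // Step 2: fetch every book whose _id is in the author's book_ids.
  const bookIds = author.book_ids || []; // tolerate a missing array
  const books = db.books.find({ _id: { $in: bookIds } }).toArray();
  return { ...author, books };
}

// In mongosh:
//   findAuthorWithBooks(db, 1)
```

This still costs two round trips to the server, but the calling code gets the author and the books back as one object.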

  • Step 3: Using $lookup to Join Collections
  • Alternatively, you can use the $lookup aggregation stage to join the authors and books collections in a single query:

    db.authors.aggregate([
      {
        $lookup: {
          from: "books",         // Collection to join with
          localField: "book_ids", // Field in the authors collection
          foreignField: "_id",    // Field in the books collection
          as: "books"             // Output array field name
        }
      }
    ])

    // Output:
    [
      {
        "_id": 1,
        "name": "George Orwell",
        "age": 46,
        "book_ids": [101, 102],
        "books": [
          {
            "_id": 101,
            "title": "1984",
            "genre": "Dystopian",
            "published_year": 1949
          },
          {
            "_id": 102,
            "title": "Animal Farm",
            "genre": "Political Satire",
            "published_year": 1945
          }
        ]
      }
    ]
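
The pipeline above joins every author. A variation that restricts the join to one author and drops the book_ids array from the result might look like the sketch below. It assumes the same collections and field names, and the helper name authorWithBooksPipeline is hypothetical.

```javascript
// Hypothetical helper: build a pipeline that joins one author with their books.
function authorWithBooksPipeline(authorId) {
  return [
    { $match: { _id: authorId } }, // keep only the requested author
    {
      $lookup: {
        from: "books",          // collection to join with
        localField: "book_ids", // array of ids in the author document
        foreignField: "_id",    // matching field in the books collection
        as: "books"             // output array field name
      }
    },
    { $project: { book_ids: 0 } } // drop the id array; the joined books remain
  ];
}

// In mongosh:
//   db.authors.aggregate(authorWithBooksPipeline(1))
```

Placing $match first means the join only runs for the author that was requested.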


Explanation of Referencing:

Advantages:
  • Flexible and scalable: The books can grow in number without causing the author document to become too large.
  • No duplication: Since the books are stored in a separate collection, other entities (like publishers or libraries) can reference them without duplicating data.
Disadvantages:
  • More complex queries: You need to perform multiple queries or use $lookup to retrieve related data.
  • Data consistency: MongoDB does not enforce references between collections, so an author can reference a book that doesn't exist. Integrity has to be checked at the application level.
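
An application-level check for dangling references could look like the sketch below. It assumes mongosh, the collections from this example, and primitive _id values such as the numbers used here. The helper name findDanglingBookIds is hypothetical.

```javascript
// Hypothetical helper: list the book_ids of an author that point at no book.
// Assumption: ids are primitives (numbers or strings), so includes() can
// compare them directly. ObjectId values would need a different comparison.
function findDanglingBookIds(db, authorId) {
  const author = db.authors.findOne({ _id: authorId });
  if (!author) {
    return []; // unknown author: nothing to check
  }
  const bookIds = author.book_ids || [];
  // Fetch only the _id of each referenced book that actually exists.
  const found = db.books.find({ _id: { $in: bookIds } }, { _id: 1 }).toArray();
  const existingIds = found.map((book) => book._id);
  return bookIds.filter((id) => !existingIds.includes(id));
}

// In mongosh:
//   findDanglingBookIds(db, 1)
```

An empty array means every reference resolves. Any ids that are returned can be removed from book_ids or investigated.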
Conclusion: When to Use Embedding vs. Referencing in One-to-Many Relationships
  • Use Embedding when:
    • The related data (e.g., books) is always accessed with the parent document (e.g., author).
    • The size of the embedded data is small and won’t grow indefinitely.
    • You want simplicity in your data model with fewer collections to manage.
  • Use Referencing when:
    • The related data (e.g., books) might be accessed independently of the parent document (e.g., author).
    • The size of the related data is large or could grow over time.
    • You want to share related data between different entities (e.g., a book is written by an author but also published by a publisher).
    • You need to avoid hitting the document size limit (16 MB in MongoDB).

Either approach can model a one-to-many relationship effectively. Choose based on how your application reads the data and how much the related data is expected to grow.
