Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.2k views
in Technique[技术] by (71.8m points)

mongodb - Rails & Mongoid unique results

Consider the following example of mongo collection:

{"_id" : ObjectId("4f304818884672067f000001"), "hash" : {"call_id" : "1234"}, "something" : "AAA"}
{"_id" : ObjectId("4f304818884672067f000002"), "hash" : {"call_id" : "1234"}, "something" : "BBB"}
{"_id" : ObjectId("4f304818884672067f000003"), "hash" : {"call_id" : "1234"}, "something" : "CCC"}
{"_id" : ObjectId("4f304818884672067f000004"), "hash" : {"call_id" : "5555"}, "something" : "DDD"}
{"_id" : ObjectId("4f304818884672067f000005"), "hash" : {"call_id" : "5555"}, "something" : "CCC"}

I would like to query this collection and get only the first entry for each "call_id", in other words i'm trying to get unique results based on "call_id". I tried to use .distinct method:

@result = Myobject.all.distinct('hash.call_id')

but the resulting array will contain only the unique call_id fields:

["1234", "5555"]

and I need all the other fields too. Is it possible to make a query like this one?:

@result = Myobject.where('hash.call_id' => Myobject.all.distinct('hash.call_id'))

Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You cannot simply return the document(or subset) by using the distinct. As per the documentation it only returns the distinct array of values based on the given key. But you can achieve this by using map-reduce

var _map = function () {
    emit(this.hash.call_id, {doc:this});
}

var _reduce = function (key, values) {
    var ret = {doc:[]};
    var doc = {};
    values.forEach(function (value) {
    if (!doc[value.doc.hash.call_id]) {
           ret.doc.push(value.doc);
           doc[value.doc.hash.call_id] = true; //make the doc seen, so it will be picked only once
       }
    });
    return ret;
}

The above code is self explanatory, on map function i am grouping it by key hash.call_id and returning the whole doc so it can be processed by reduce funcition.

On reduce function, just loop through the grouped result set and pick only one item from the grouped set (among the multiple duplicate key values - distinct simulation).

Finally create some test data

> db.disTest.insert({hash:{call_id:"1234"},something:"AAA"})
> db.disTest.insert({hash:{call_id:"1234"},something:"BBB"})
> db.disTest.insert({hash:{call_id:"1234"},something:"CCC"})
> db.disTest.insert({hash:{call_id:"5555"},something:"DDD"})
> db.disTest.insert({hash:{call_id:"5555"},something:"EEE"})
> db.disTest.find()
{ "_id" : ObjectId("4f30a27c4d203c27d8f4c584"), "hash" : { "call_id" : "1234" }, "something" : "AAA" }
{ "_id" : ObjectId("4f30a2844d203c27d8f4c585"), "hash" : { "call_id" : "1234" }, "something" : "BBB" }
{ "_id" : ObjectId("4f30a2894d203c27d8f4c586"), "hash" : { "call_id" : "1234" }, "something" : "CCC" }
{ "_id" : ObjectId("4f30a2944d203c27d8f4c587"), "hash" : { "call_id" : "5555" }, "something" : "DDD" }
{ "_id" : ObjectId("4f30a2994d203c27d8f4c588"), "hash" : { "call_id" : "5555" }, "something" : "EEE" }

and running this map reduce

> db.disTest.mapReduce(_map,_reduce, {out: { inline : 1}})
{
    "results" : [
        {
            "_id" : "1234",
            "value" : {
                "doc" : [
                    {
                        "_id" : ObjectId("4f30a27c4d203c27d8f4c584"),
                        "hash" : {
                            "call_id" : "1234"
                        },
                        "something" : "AAA"
                    }
                ]
            }
        },
        {
            "_id" : "5555",
            "value" : {
                "doc" : [
                    {
                        "_id" : ObjectId("4f30a2944d203c27d8f4c587"),
                        "hash" : {
                            "call_id" : "5555"
                        },
                        "something" : "DDD"
                    }
                ]
            }
        }
    ],
    "timeMillis" : 2,
    "counts" : {
        "input" : 5,
        "emit" : 5,
        "reduce" : 2,
        "output" : 2
    },
    "ok" : 1,
}

You get the first document of the distinct set. You can do the same in mongoid by first stringify the map/reduce functions and call mapreduce like this

  MyObject.collection.mapreduce(_map,_reduce,{:out => {:inline => 1},:raw=>true })

Hope it helps


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...