I'm running into the "aggregation result exceeds maximum document size (16MB)"
error with a MongoDB aggregation using pymongo.
At first I was able to work around it using the limit()
option. However, at some point I got the
"Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." error.
OK, I'll use the {'allowDiskUse':True}
option. This option works when I use it on the command line, but when I try to use it in my Python code:
result = work1.aggregate(pipe, 'allowDiskUse:true')
I get a "TypeError: aggregate() takes exactly 2 arguments (3 given)"
error (in spite of the definition given at http://api.mongodb.org/python/current/api/pymongo/collection.html#pymongo.collection.Collection.aggregate: aggregate(pipeline, **kwargs)).
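For illustration, here is a minimal sketch of why a signature like aggregate(pipeline, **kwargs) rejects a second positional argument. FakeCollection is a hypothetical stand-in that only mimics the documented signature; it is not pymongo and does not talk to a server.

```python
class FakeCollection:
    """Hypothetical stand-in that mimics only the documented signature."""

    def aggregate(self, pipeline, **kwargs):
        # Extra options are collected from keyword arguments into kwargs.
        return {"pipeline": pipeline, "options": kwargs}


work1 = FakeCollection()
pipe = [{"$limit": 10}]

# A positional string does not fit (pipeline, **kwargs), so Python
# raises TypeError before the method body ever runs.
try:
    work1.aggregate(pipe, "allowDiskUse:true")
    positional_ok = True
except TypeError:
    positional_ok = False

# The same option passed as a keyword argument is accepted.
result = work1.aggregate(pipe, allowDiskUse=True)
```

If the real Collection.aggregate follows the documented signature, the analogous call would presumably be work1.aggregate(pipe, allowDiskUse=True).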
I tried to use runCommand, or rather its pymongo equivalent:
db.command('aggregate','work1',pipe, {'allowDiskUse':True})
but now I'm back to the "aggregation result exceeds maximum document size (16MB)" error.
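As a hedged sketch of what the command form might look like: db.command takes the command either as a name plus keyword arguments or as a single ordered document, so passing the pipeline and the options dict positionally probably does not place them in the command at all. The helper name build_aggregate_command is hypothetical, and the 'cursor' field is an assumption that the server version supports returning aggregation results through a cursor instead of one inline document.

```python
from collections import OrderedDict

pipe = [
    {'$project': {'_id': 0, 'summary.trigrams': 1}},
    {'$unwind': '$summary'},
    {'$unwind': '$summary.trigrams'},
    {'$group': {'count': {'$sum': 1}, '_id': '$summary.trigrams'}},
    {'$sort': {'count': -1}},
    {'$limit': 10000},
]


def build_aggregate_command(collection_name, pipeline):
    """Hypothetical helper: build the command document in key order."""
    return OrderedDict([
        ('aggregate', collection_name),  # the command name must come first
        ('pipeline', pipeline),
        ('allowDiskUse', True),          # allow $group to spill to disk
        ('cursor', {}),                  # assumption: ask for a cursor reply
    ])


cmd = build_aggregate_command('work1', pipe)
# With a live connection this could be sent as: db.command(cmd)
```

Building the document separately makes it easy to print and inspect exactly what would be sent before running it against the database.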
In case you need to know, here is the pipeline:
pipe = [{'$project': {'_id': 0, 'summary.trigrams': 1}}, {'$unwind': '$summary'}, {'$unwind': '$summary.trigrams'}, {'$group': {'count': {'$sum': 1}, '_id': '$summary.trigrams'}}, {'$sort': {'count': -1}}, {'$limit': 10000}]
Thank you