Aggregations

The aggregation framework in MongoDB allows you to define a series (called a pipeline) of operations (called stages) against the data in a collection. These pipelines can be used for analytics, or they can be used to convert your data from one form to another. This guide does not go into the details of how aggregation works; the official MongoDB documentation has extensive tutorials on that. Rather, this guide focuses on the Morphia API. The examples shown here are taken from the tests in Morphia itself. You can find the full list of supported operators in the Supported Operators section.

Writing an aggregation pipeline starts just like writing a standard query. As with querying, we start with the Datastore:

datastore.aggregate(Book.class, Author.class).pipeline(
    group(id("author"))
        .field("books", push("$title")),
    sort()
        .ascending("name"))
    .iterator();

aggregate() takes a Class literal. This lets Morphia know which collection to perform this aggregation against. Because of the transformational operations available in the aggregation pipeline, Morphia cannot validate as much as it can with querying, so take care to ensure that document fields actually exist when you reference them in your pipeline.

The Pipeline

Aggregation pipelines are composed of a series of stages. Our example here starts with the group() stage. This method is the Morphia equivalent of the $group operator. This stage, as the name suggests, groups documents together based on various criteria. In this example, we define the group ID as the author field, which collects all the books by each author together.

The next step defines a new field, books, composed of the titles of the books found in each document. (For reference, this example is the Morphia equivalent of an example found in the aggregation tutorials.) This results in a series of documents that look like this:

{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }
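To make the shape of that result concrete, here is a minimal plain-Java sketch of what a $group stage with a $push accumulator computes. It uses neither Morphia nor MongoDB: the class name GroupPushSketch, the method groupTitlesByAuthor, and the use of Map as a stand-in for a document are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupPushSketch {
    // Hypothetical stand-in for $group + $push: key each document by its
    // "author" value and append its "title" to that author's list of books.
    static Map<String, List<String>> groupTitlesByAuthor(List<Map<String, String>> books) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Map<String, String> book : books) {
            grouped.computeIfAbsent(book.get("author"), key -> new ArrayList<>())
                   .add(book.get("title"));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map<String, String>> books = List.of(
            Map.of("author", "Homer", "title", "The Odyssey"),
            Map.of("author", "Dante", "title", "The Banquet"),
            Map.of("author", "Homer", "title", "Iliad"));

        // Each map entry corresponds to one output document:
        // the key plays the role of _id, the value the role of books.
        groupTitlesByAuthor(books).forEach((author, titles) ->
            System.out.println(author + " -> " + titles));
    }
}
```

The real stage runs on the server; the sketch only mirrors the grouping logic so the output documents above are easier to read.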

Executing the Pipeline

Once your pipeline is complete, you can execute it via the execute() method. This method optionally takes a Class reference for the target type of your aggregation. Given this type, Morphia will map each document in the results and return it. You can also pass options to execute(): the various options on the AggregationOptions class configure how the pipeline executes.
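As a rough illustration of what mapping each result document to a target type means, the following plain-Java sketch converts grouped documents into typed objects by hand. It is not how Morphia implements mapping: the Author class, the toAuthor and mapAll methods, and the choice to treat the _id value as the author's name are all assumptions for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ResultMappingSketch {
    // Hypothetical target type; a real Morphia entity would carry mapping annotations.
    static final class Author {
        final String name;
        final List<String> books;

        Author(String name, List<String> books) {
            this.name = name;
            this.books = books;
        }
    }

    // Assumption for illustration: the group key stored in "_id" is the author's name.
    @SuppressWarnings("unchecked")
    static Author toAuthor(Map<String, Object> document) {
        return new Author((String) document.get("_id"), (List<String>) document.get("books"));
    }

    // Convert every result document into the target type, preserving order.
    static List<Author> mapAll(List<Map<String, Object>> documents) {
        return documents.stream()
                        .map(ResultMappingSketch::toAuthor)
                        .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> document = Map.of("_id", "Homer", "books", List.of("The Odyssey", "Iliad"));
        for (Author author : mapAll(List.of(document))) {
            System.out.println(author.name + " wrote " + author.books);
        }
    }
}
```

With Morphia, supplying the target type removes the need for this kind of hand-written conversion.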

$out

Depending on your use case, you might not want to return the results of your aggregation but simply output them to another collection. That’s where $out comes in. $out is an operator that allows the results of a pipeline to be stored in a named collection. This collection cannot be sharded or capped, however. If the collection does not exist, it will be created upon execution of the pipeline.

Any existing data in the collection will be replaced by the output of the aggregation.
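The create-or-replace behavior described above can be modeled in a few lines of plain Java. This is only a sketch of the semantics, with no MongoDB involved: the OutSketch class, its out method, and the use of a Map of lists as a stand-in for a database are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OutSketch {
    // Hypothetical model of $out: the named collection is created if absent,
    // and any documents it already held are replaced by the pipeline results.
    static void out(Map<String, List<String>> database, String collection, List<String> results) {
        database.put(collection, new ArrayList<>(results));
    }

    public static void main(String[] args) {
        Map<String, List<String>> database = new HashMap<>();
        database.put("authors", new ArrayList<>(List.of("stale document")));

        out(database, "authors", List.of("Homer", "Dante")); // replaces the existing contents
        out(database, "archive", List.of("Homer"));          // creates a new collection

        System.out.println(database.get("authors"));
        System.out.println(database.get("archive"));
    }
}
```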

An example aggregation using the $out stage looks like this:

datastore.aggregate(Book.class, Author.class)
         .pipeline(
             group(id("$author"))
                 .field("books", push().single("$title")),
             out(Author.class))
         .iterator();

You’ll note that out() is the final stage. $out and $merge must be the final stage in a pipeline. We pass a type to out() that reflects the collection we want to write our output to. Morphia will use the type-to-collection mapping you’ve defined when mapping your entities to determine that collection name. You may also pass a String with the collection name if the target collection does not correspond to a mapped entity.

$merge

$merge is a very similar option with some major differences. The biggest difference is that $merge can write to existing collections without destroying the existing documents. $out overwrites any existing documents and replaces them with the results of the pipeline. $merge, however, can deposit the new results alongside existing data and update existing data.

Using $merge might look something like this:

var aggregation = datastore.aggregate(new AggregationOptions().collection("some collection"));
aggregation.pipeline(
    group(id()
        .field("fiscal_year", "$fiscal_year")
        .field("dept", "$dept"))
        .field("salaries", sum("$salary")),
    merge("reporting", "budgets")
        .on("_id")
        .whenMatched(REPLACE)
        .whenNotMatched(INSERT))
    .iterator();

Much like out() above, for merge() we pass in collection information, but here we also pass in which database to find or create the collection in. A merge is slightly more complex and so has more options to consider. In this example, we’re merging into the budgets collection in the reporting database and matching any existing documents on the _id field, as denoted by the on() method. Because there may be existing data in the collection, we need to instruct the operation how to handle those cases. In this example, when documents match we’re choosing to replace them, and when they don’t we’re instructing the operation to insert the new documents into the collection. Other options are defined on the com.mongodb.client.model.MergeOptions type provided by the Java driver.
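The replace-when-matched, insert-when-not-matched behavior can be pictured with a small plain-Java model. This sketch does not touch MongoDB or Morphia: the MergeSketch class, its merge method, and the maps keyed by a string standing in for _id are hypothetical, and the salary figures are made-up sample values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MergeSketch {
    // Hypothetical model of $merge with on("_id"), whenMatched(REPLACE) and
    // whenNotMatched(INSERT). Both maps are keyed by the document's _id.
    static Map<String, Integer> merge(Map<String, Integer> target, Map<String, Integer> results) {
        // Unlike $out, documents already in the target are kept.
        Map<String, Integer> merged = new LinkedHashMap<>(target);
        // putAll replaces the value for an _id that already exists
        // and inserts an entry for an _id that does not.
        merged.putAll(results);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> budgets = new LinkedHashMap<>();
        budgets.put("2019-Sales", 100);
        budgets.put("2019-HR", 50);

        Map<String, Integer> results = new LinkedHashMap<>();
        results.put("2019-HR", 75);     // matches an existing _id, so it is replaced
        results.put("2020-Sales", 120); // no match, so it is inserted

        System.out.println(merge(budgets, results));
    }
}
```

The untouched 2019-Sales entry is the point of the contrast with $out, which would have discarded it.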

Supported Operators

Every effort is made to provide 100% coverage of all the operators offered by MongoDB. A select handful of operators have been excluded because they simply don’t make sense in Morphia. All currently supported operators are listed below. To see an example of an operator in action, click through to see the test cases for that operator.

If an operator is missing and you think it should be included, please file an issue for that operator.

Table 1. Stages
Operator	Docs	Test Examples
$addFields	AddFields#addFields()	TestAddFields
$bucket	Bucket#bucket()	TestBucket
$bucketAuto	AutoBucket#autoBucket()	TestBucketAuto
$changeStream	ChangeStream#changeStream()	TestChangeStream
$collStats	CollectionStats#collStats()	TestCollStats
$count	Count#count(String)	TestCount
$currentOp	CurrentOp#currentOp()	TestCurrentOp
$densify	Densify#densify(String,Range)	TestDensify
$documents	Documents#documents(DocumentExpression…)	TestDocuments
$facet	Facet#facet()	TestFacet
$fill	Fill#fill()	TestFill
$geoNear	GeoNear#geoNear(String) GeoNear#geoNear(Point) GeoNear#geoNear(double)	TestGeoNear
$graphLookup	GraphLookup#graphLookup(String) GraphLookup#graphLookup(Class)	TestGraphLookup
$group	Group#group(GroupId) Group#group()	TestGroup
$indexStats	IndexStats#indexStats()	TestIndexStats
$limit	Limit#limit(long)	TestLimit
$lookup	Lookup#lookup(Class) Lookup#lookup(String) Lookup#lookup()	TestLookup
$match	Match#match(Filter…)	TestMatch
$planCacheStats	PlanCacheStats#planCacheStats()	TestPlanCacheStats
$project	Projection#project()	TestProject
$redact	Redact#redact(Expression)	TestRedact
$replaceRoot	ReplaceRoot#replaceRoot()	TestReplaceRoot
$replaceWith	ReplaceWith#replaceWith()	TestReplaceWith
$sample	Sample#sample(long)	TestSample
$set	Set#set()	TestSet
$setWindowFields	SetWindowFields#setWindowFields()	TestSetWindowFields
$skip	Skip#skip(long)	TestSkip
$sort	Sort#sort()	TestSort
$sortByCount	SortByCount#sortByCount(Object)	TestSortByCount
$unionWith	UnionWith#unionWith(Stage…) UnionWith#unionWith(Class,Stage…) UnionWith#unionWith(String,Stage…)	TestUnionWith
$unset	Unset#unset(String,String…)	TestUnset
$unwind	Unwind#unwind(String)	TestUnwind

Table 2. Expressions
Operator	Docs	Test Examples
$abs	MathExpressions#abs(Object)	TestAbs
$accumulator	AccumulatorExpressions#accumulator(String,String,List,String)	TestAccumulator
$acos	TrigonometryExpressions#acos(Object)	TestAcos
$acosh	TrigonometryExpressions#acosh(Object)	TestAcosh
$add	MathExpressions#add(Object,Object…)	TestAdd
$addToSet	AccumulatorExpressions#addToSet(Object)	TestAddToSet
$allElementsTrue	SetExpressions#allElementsTrue(Object,Object…)	TestAllElementsTrue
$and	BooleanExpressions#and(Object,Object…) BooleanExpressions#and()	TestAnd
$anyElementTrue	SetExpressions#anyElementTrue(Object,Object…)	TestAnyElementTrue
$arrayElemAt	ArrayExpressions#elementAt(Object,Object)	TestArrayElemAt
$arrayToObject	ArrayExpressions#arrayToObject(Object)	TestArrayToObject
$asin	TrigonometryExpressions#asin(Object)	TestAsin
$asinh	TrigonometryExpressions#asinh(Object)	TestAsinh
$atan	TrigonometryExpressions#atan(Object)	TestAtan
$atan2	TrigonometryExpressions#atan2(Object,Object)	TestAtan2
$atanh	TrigonometryExpressions#atanh(Object)	TestAtanh
$avg	AccumulatorExpressions#avg(Object,Object…)	TestAvg
$binarySize	DataSizeExpressions#binarySize(Object)	TestBinarySize
$bitAnd	MathExpressions#bitAnd(Object,Object)	TestBitAnd
$bitNot	MathExpressions#bitNot(Object)	TestBitNot
$bitOr	MathExpressions#bitOr(Object,Object)	TestBitOr
$bitXor	MathExpressions#bitXor(Object,Object)	TestBitXor
$bottom	AccumulatorExpressions#bottom(Object,Sort…)	TestBottom
$bottomN	AccumulatorExpressions#bottomN(Object,Object,Sort…)	TestBottomN
$bsonSize	DataSizeExpressions#bsonSize(Object)	TestBsonSize
$ceil	MathExpressions#ceil(Object)	TestCeil
$cmp	ComparisonExpressions#cmp(Object,Object)	TestCmp
$concat	StringExpressions#concat(Object,Object…)	TestConcat
$concatArrays	ArrayExpressions#concatArrays(Object,Object…)	TestConcatArrays
$cond	ConditionalExpressions#condition(Object,Object,Object)	TestCond
$convert	TypeExpressions#convert(Object,ConvertType)	TestConvert
$cos	TrigonometryExpressions#cos(Object)	TestCos
$cosh	TrigonometryExpressions#cosh(Object)	TestCosh
$count	AccumulatorExpressions#count()	TestCount
$covariancePop	WindowExpressions#covariancePop(Object,Object)	TestCovariancePop
$covarianceSamp	WindowExpressions#covarianceSamp(Object,Object)	TestCovarianceSamp
$dateAdd	DateExpressions#dateAdd(Object,long,TimeUnit)	TestDateAdd
$dateDiff	DateExpressions#dateDiff(Object,Object,TimeUnit)	TestDateDiff
$dateFromParts	DateExpressions#dateFromParts()	TestDateFromParts
$dateFromString	DateExpressions#dateFromString()	TestDateFromString
$dateSubtract	DateExpressions#dateSubtract(Object,long,TimeUnit)	TestDateSubtract
$dateToParts	DateExpressions#dateToParts(Object)	TestDateToParts
$dateToString	DateExpressions#dateToString()	TestDateToString
$dateTrunc	DateExpressions#dateTrunc(Object,TimeUnit)	TestDateTrunc
$dayOfMonth	DateExpressions#dayOfMonth(Object)	TestDayOfMonth
$dayOfWeek	DateExpressions#dayOfWeek(Object)	TestDayOfWeek
$dayOfYear	DateExpressions#dayOfYear(Object)	TestDayOfYear
$degreesToRadians	TrigonometryExpressions#degreesToRadians(Object)	TestDegreesToRadians
$denseRank	WindowExpressions#denseRank()	TestDenseRank
$derivative	WindowExpressions#derivative(Object)	TestDerivative
$divide	MathExpressions#divide(Object,Object)	TestDivide
$documentNumber	WindowExpressions#documentNumber()	TestDocumentNumber
$eq	ComparisonExpressions#eq(Object,Object)	TestEq
$exp	MathExpressions#exp(Object)	TestExp
$expMovingAvg	WindowExpressions#expMovingAvg(Object,int) WindowExpressions#expMovingAvg(Object,double)	TestExpMovingAvg
$filter	ArrayExpressions#filter(Expression,Expression) Expressions#filter(Object,Object)	TestFilter
$first	AccumulatorExpressions#first(Object)	TestFirst
$firstN	AccumulatorExpressions#firstN(Object,Object)	TestFirstN
$floor	MathExpressions#floor(Object)	TestFloor
$function	AccumulatorExpressions#function(String,Object…)	TestFunction
$getField	Miscellaneous#getField(String) Miscellaneous#getField(Object)	TestGetField
$gt	ComparisonExpressions#gt(Object,Object)	TestGt
$gte	ComparisonExpressions#gte(Object,Object)	TestGte
$hour	DateExpressions#hour(Object)	TestHour
$ifNull	ConditionalExpressions#ifNull()	TestIfNull
$in	ArrayExpressions#in(Object,Object)	TestIn
$indexOfArray	ArrayExpressions#indexOfArray(Object,Object)	TestIndexOfArray
$indexOfBytes	StringExpressions#indexOfBytes(Object,Object)	TestIndexOfBytes
$indexOfCP	StringExpressions#indexOfCP(Object,Object)	TestIndexOfCP
$integral	WindowExpressions#integral(Object)	TestIntegral
$isArray	ArrayExpressions#isArray(Object)	TestIsArray
$isNumber	TypeExpressions#isNumber(Object)	TestIsNumber
$isoDayOfWeek	DateExpressions#isoDayOfWeek(Object)	TestIsoDayOfWeek
$isoWeek	DateExpressions#isoWeek(Object)	TestIsoWeek
$isoWeekYear	DateExpressions#isoWeekYear(Object)	TestIsoWeekYear
$last	AccumulatorExpressions#last(Object)	TestLast
$lastN	AccumulatorExpressions#lastN(Object,Object)	TestLastN
$let	VariableExpressions#let(Expression)	TestLet
$linearFill	WindowExpressions#linearFill(Object)	TestLinearFill
$literal	Expressions#literal(Object)	TestLiteral
$ln	MathExpressions#ln(Object)	TestLn
$locf	WindowExpressions#locf(Object)	TestLocf
$log	MathExpressions#log(Object,Object)	TestLog
$log10	MathExpressions#log10(Object)	TestLog10
$lt	ComparisonExpressions#lt(Object,Object)	TestLt
$lte	ComparisonExpressions#lte(Object,Object)	TestLte
$ltrim	StringExpressions#ltrim(Object)	TestLtrim
$map	ArrayExpressions#map(Object,Object)	TestMap
$max	AccumulatorExpressions#max(Object,Object…)	TestMax
$maxN	AccumulatorExpressions#maxN(Object,Object)	TestMaxN
$median	MathExpressions#median(Object)	TestMedian
$mergeObjects	ObjectExpressions#mergeObjects()	TestMergeObjects
$meta	Expressions#meta() Expressions#meta(MetadataKeyword) Meta#indexKey(String) Meta#searchHighlights(String) Meta#searchScore(String) Meta#textScore(String)	TestMeta
$millisecond	DateExpressions#milliseconds(Object)	TestMillisecond
$min	AccumulatorExpressions#min(Object,Object…)	TestMin
$minN	AccumulatorExpressions#minN(Object,Object)	TestMinN
$minute	DateExpressions#minute(Object)	TestMinute
$mod	MathExpressions#mod(Object,Object)	TestMod
$month	DateExpressions#month(Object)	TestMonth
$multiply	MathExpressions#multiply(Object,Object…)	TestMultiply
$ne	ComparisonExpressions#ne(Object,Object)	TestNe
$not	BooleanExpressions#not(Object)	TestNot
$objectToArray	ArrayExpressions#objectToArray(Object)	TestObjectToArray
$or	BooleanExpressions#or(Object,Object…) BooleanExpressions#or()	TestOr
$percentile	MathExpressions#percentile(Object,List) MathExpressions#percentile(List,List)	TestPercentile
$pow	MathExpressions#pow(Object,Object)	TestPow
$push	AccumulatorExpressions#push(Object) AccumulatorExpressions#push()	TestPush
$radiansToDegrees	TrigonometryExpressions#radiansToDegrees(Object)	TestRadiansToDegrees
$rand	Miscellaneous#rand()	TestRand
$range	ArrayExpressions#range(int,int) ArrayExpressions#range(Object,Object)	TestRange
$rank	WindowExpressions#rank()	TestRank
$reduce	ArrayExpressions#reduce(Object,Object,Object)	TestReduce
$regexFind	StringExpressions#regexFind(Object)	TestRegexFind
$regexFindAll	StringExpressions#regexFindAll(Object)	TestRegexFindAll
$regexMatch	StringExpressions#regexMatch(Object)	TestRegexMatch
$replaceAll	StringExpressions#replaceAll(Object,Object,Object)	TestReplaceAll
$replaceOne	StringExpressions#replaceOne(Object,Object,Object)	TestReplaceOne
$reverseArray	ArrayExpressions#reverseArray(Object)	TestReverseArray
$round	MathExpressions#round(Object,Object)	TestRound
$rtrim	StringExpressions#rtrim(Object)	TestRtrim
$sampleRate	Miscellaneous#sampleRate(double)	TestSampleRate
$second	DateExpressions#second(Object)	TestSecond
$setDifference	SetExpressions#setDifference(Object,Object)	TestSetDifference
$setEquals	SetExpressions#setEquals(Object,Object…)	TestSetEquals
$setField	Miscellaneous#setField(Object,Object,Object)	TestSetField
$setIntersection	SetExpressions#setIntersection(Object,Object…)	TestSetIntersection
$setIsSubset	SetExpressions#setIsSubset(Object,Object)	TestSetIsSubset
$setUnion	SetExpressions#setUnion(Object,Object…)	TestSetUnion
$shift	WindowExpressions#shift(Object,long) WindowExpressions#shift(Object,long,Object)	TestShift
$sin	TrigonometryExpressions#sin(Object)	TestSin
$sinh	TrigonometryExpressions#sinh(Object)	TestSinh
$size	ArrayExpressions#size(Object)	TestSize
$slice	ArrayExpressions#slice(Object,int)	TestSlice
$sortArray	ArrayExpressions#sortArray(Object,Sort…)	TestSortArray
$split	StringExpressions#split(Object,Object)	TestSplit
$sqrt	MathExpressions#sqrt(Object)	TestSqrt
$stdDevPop	WindowExpressions#stdDevPop(Object,Object…)	TestStdDevPop
$stdDevSamp	WindowExpressions#stdDevSamp(Object,Object…)	TestStdDevSamp
$strLenBytes	StringExpressions#strLenBytes(Object)	TestStrLenBytes
$strLenCP	StringExpressions#strLenCP(Object)	TestStrLenCP
$strcasecmp	StringExpressions#strcasecmp(Object,Object)	TestStrcasecmp
$substrBytes	StringExpressions#substrBytes(Object,int,int) StringExpressions#substrBytes(Object,Object,Object)	TestSubstrBytes
$substrCP	StringExpressions#substrCP(Object,Object,Object)	TestSubstrCP
$subtract	MathExpressions#subtract(Object,Object)	TestSubtract
$sum	AccumulatorExpressions#sum(Object,Object…)	TestSum
$switch	ConditionalExpressions#switchExpression()	TestSwitch
$tan	TrigonometryExpressions#tan(Object)	TestTan
$tanh	TrigonometryExpressions#tanh(Object)	TestTanh
$toBool	TypeExpressions#toBool(Object)	TestToBool
$toDate	DateExpressions#toDate(Object) TypeExpressions#toDate(Object)	TestToDate
$toDecimal	TypeExpressions#toDecimal(Object)	TestToDecimal
$toDouble	TypeExpressions#toDouble(Object)	TestToDouble
$toInt	TypeExpressions#toInt(Object)	TestToInt
$toLong	TypeExpressions#toLong(Object)	TestToLong
$toLower	StringExpressions#toLower(Object)	TestToLower
$toObjectId	TypeExpressions#toObjectId(Object)	TestToObjectId
$toString	StringExpressions#toString(Object) TypeExpressions#toString(Object)	TestToString
$toUpper	StringExpressions#toUpper(Object)	TestToUpper
$top	AccumulatorExpressions#top(Object,Sort…)	TestTop
$topN	AccumulatorExpressions#topN(Object,Object,Sort…)	TestTopN
$trim	StringExpressions#trim(Object)	TestTrim
$trunc	MathExpressions#trunc(Object) MathExpressions#trunc(Object,Object)	TestTrunc
$tsIncrement	DateExpressions#tsIncrement(Object)	TestTsIncrement
$tsSecond	DateExpressions#tsSecond(Object)	TestTsSecond
$type	TypeExpressions#type(Object)	TestType
$unsetField	Miscellaneous#unsetField(Object,Object)	TestUnsetField
$week	DateExpressions#week(Object)	TestWeek
$year	DateExpressions#year(Object)	TestYear
$zip	ArrayExpressions#zip(Object…)	TestZip