# Cloud database aggregation operation

Sometimes we need to perform analysis operations on data, such as some statistical operations, join table queries, etc. At this time, simple query operations cannot handle these requirements, so we need to use aggregation operations to complete them.

Get the aggregate operation instance of the database collection

db.collection('scores').aggregate()

Notice:

  • Aggregate operation instances are only used for querying, and cannot perform addition, deletion, and modification operations. Only the aggregation operation method can be used on the aggregation operation instance, and basic methods such as where/orderBy cannot be used. Where should be changed to match, and orderBy should be implemented using sort. For details, please read the aggregation operation document below.
  • The performance of aggregation operations is not as good as simple queries under large data volumes. Please choose the appropriate usage according to your own business.

Do not reuse aggregate instances when used in cloud functions, which may cause bugs.

Both of the following are examples of errors:

const db = uniCloud.database()
const collection = db.collection('test')
const aggregate = collection.aggregate() // 云函数实例复用时,此聚合实例也会复用,导致Bug
exports.main = async function(){
  const res = await aggregate.match({a:1}).end()
  return {res}
}
const db = uniCloud.database()
const collection = db.collection('test')
exports.main = async function(){
  const aggregate = collection.aggregate() // 此聚合实例分别在两个请求内使用,导致Bug
  const res1 = await aggregate.match({a:1}).end()
  const res2 = await aggregate.match({a:2}).end()
  return {res1, res2}
}

Correct the above two usages to the correct usage as follows:

const db = uniCloud.database()
const collection = db.collection('test')
exports.main = async function(){
  const aggregate = collection.aggregate() // 每次执行云函数会有独立的聚合实例
  const res = await aggregate.match({a:1}).end()
  return {res}
}
const db = uniCloud.database()
const collection = db.collection('test')
exports.main = async function(){
  const res1 = await collection.aggregate().match({a:1}).end() // 两个请求分别调用aggregate方法产生聚合实例
  const res2 = await collection.aggregate().match({a:2}).end()
  return {res1, res2}
}

# Aggregate expression

Expressions can be field paths, constants, or database operations. Expressions can nest expressions.

Field Path

Expressions use field path notation to specify fields in the record. Field paths are represented by a $ sign followed by the field name or nested field name. Nested field names use dots to connect fields at all levels of nesting. For example, $profile represents the field path of profile, and $profile.name represents the field path of profile.name (the name field nested in the profile field).

For example: the following data is available

[{
  "profile": {
    "name": "Chloe"
  },
  "status": 0
}]
// do the following
let res = await db.collection('scores').aggregate()
  .addFields({
    name: '$profile.name'
  })
  .end()

// return value
{
  "data": [{
    "profile": {
      "name": "Chloe"
    },
    "status": 0,
    "name": "Chloe"
  }]
}

constant

Constants can be of any type. By default, strings starting with $ are treated as field paths. To avoid this behavior, use db.command.aggregate.literal to declare it as a constant.

Database operation method

Refer to Database Operation Method

# addFields

Aggregation stage. Add new fields to the output records. After the addFields aggregation stage, all output records will have the fields specified by addFields in addition to the fields they were input.

API description

addFields is equivalent to the project phase specifying all existing fields and new fields at the same time.

addFields has the following form:

addFields({
  <new field>: <expression>
})

addFields can specify multiple new fields, the value of each new field is determined by the expression used.

If the specified new field has the same name as the original field, the value of the new field will overwrite the value of the original field. Note that addFields cannot be used to add elements to array fields.

Example 1: addFields twice in a row

Suppose the set scores has the following records:

{
  _id: 1,
  student: "Maya",
  homework: [ 10, 5, 10 ],
  quiz: [ 10, 8 ],
  extraCredit: 0
}
{
  _id: 2,
  student: "Ryan",
  homework: [ 5, 6, 5 ],
  quiz: [ 8, 8 ],
  extraCredit: 8
}

Apply addFields twice, the first time adding two fields are the sum of homework and quiz, the second time adding a field and then calculating the sum based on the previous two sums.

const $ = db.command.aggregate
let res = await db.collection('scores').aggregate()
  .addFields({
    totalHomework: $.sum('$homework'),
    totalQuiz: $.sum('$quiz')
  })
  .addFields({
    totalScore: $.add(['$totalHomework', '$totalQuiz', '$extraCredit'])
  })
  .end()

The returned results are as follows:

{
  "_id" : 1,
  "student" : "Maya",
  "homework" : [ 10, 5, 10 ],
  "quiz" : [ 10, 8 ],
  "extraCredit" : 0,
  "totalHomework" : 25,
  "totalQuiz" : 18,
  "totalScore" : 43
}
{
  "_id" : 2,
  "student" : "Ryan",
  "homework" : [ 5, 6, 5 ],
  "quiz" : [ 8, 8 ],
  "extraCredit" : 8,
  "totalHomework" : 16,
  "totalQuiz" : 16,
  "totalScore" : 40
}

Example 2: Adding fields to nested records

Fields can be added to nested records using dot notation. Suppose the vehicles collection contains the following records:

{ _id: 1, type: "car", specs: { doors: 4, wheels: 4 } }
{ _id: 2, type: "motorcycle", specs: { doors: 0, wheels: 2 } }
{ _id: 3, type: "jet ski" }

A new field fuel_type can be added under the specs field by the following operations, and the value is set to the fixed string unleaded:

let res = await db.collection('vehicles').aggregate()
  .addFields({
    'specs.fuel_type': 'unleaded'
  })
  .end()

The returned results are as follows:

{ _id: 1, type: "car",
   specs: { doors: 4, wheels: 4, fuel_type: "unleaded" } }
{ _id: 2, type: "motorcycle",
   specs: { doors: 0, wheels: 2, fuel_type: "unleaded" } }
{ _id: 3, type: "jet ski",
   specs: { fuel_type: "unleaded" } }

Example 3: Set field value to another field

You can set the value of a field to the value of another field by using an expression consisting of $ followed by the field name as the value.

Also using the previous collection example, you can add a field vehicle_type and set its value to the value of the type field as follows:

let res = await db.collection('vehicles').aggregate()
  .addFields({
    vehicle_type: '$type'
  })
  .end()

The returned results are as follows:

{ _id: 1, type: "car", vehicle_type: "car",
   specs: { doors: 4, wheels: 4, fuel_type: "unleaded" } }
{ _id: 2, type: "motorcycle", vehicle_type: "motorcycle",
   specs: { doors: 0, wheels: 2, fuel_type: "unleaded" } }
{ _id: 3, type: "jet ski", vehicle_type: "jet ski",
   specs: { fuel_type: "unleaded" } }

# bucket

Aggregation stage. Divide the input records into different groups according to the given conditions and boundaries, each group is a bucket.

API description

Each group is output as a record, containing a _id field with a lower bound and a count field with the number of records in the group. count is output by default when output is not specified.

bucket will only output if there is at least one record in the group.

bucket has the following form:

bucket({
  groupBy: <expression>,
  boundaries: [<lowerbound1>, <lowerbound2>, ...],
  default: <literal>,
  output: {
    <output1>: <accumulator expr>,
    ...
    <outputN>: <accumulator expr>
  }
})

groupBy is an expression used to determine the grouping applied to each input record. Expressions can be prefixed with $ followed by the path of the field to be used for grouping. Unless a default value is specified with default, each record needs to contain the specified field, and the field value must be within the range specified by boundaries.

boundaries is an array where each element is the lower bound of each group. At least two boundary values must be specified. Array values must be incremented values of the same type.

default is optional. After specifying, records that do not enter any group will enter a default group. The _id of this group record is determined by default. The value of default must be less than the minimum value or greater than or equal to the maximum value in boundaries. The value of default can be different from the value type of the boundaries element.

output is optional and is used to determine which fields the output record contains in addition to _id. The value of each field must be specified with an accumulator expression. When output is specified, the default count will not be output by default and must be specified manually:

output: {
  count: $.sum(1),
  ...
  <outputN>: <accumulator expr>
}

Using buckets requires at least one of the following conditions, otherwise an error will be thrown:

The value obtained by applying the groupBy expression to each input record must be a value within boundaries

Specify a default value that is outside boundaries, or a different type than the value of the boundaries element.

Example

Suppose the collection items has the following records:

{
  _id: "1",
  price: 10
}
{
  _id: "2",
  price: 50
}
{
  _id: "3",
  price: 20
}
{
  _id: "4",
  price: 80
}
{
  _id: "5",
  price: 200
}

To group the above records, group [0, 50) into one group, [50, 100) into one group, and the others into one group:

const $ = db.command.aggregate
let res = await db.collection('items').aggregate()
  .bucket({
    groupBy: '$price',
    boundaries: [0, 50, 100],
    default: 'other',
    output: {
      count: $.sum(1),
      ids: $.push('$_id')
    }
  })
  .end()

The returned results are as follows:

[
  {
    "_id": 0,
    "count": 2,
    "ids": [
      "1",
      "3"
    ]
  },
  {
    "_id": 50,
    "count": 2,
    "ids": [
      "2",
      "4"
    ]
  },
  {
    "_id": "other",
    "count": 1,
    "ids": [
      "5"
    ]
  }
]

# bucketAuto

Aggregation stage. Divide the input records into different groups according to the given conditions, each group is a bucket. One of the differences from bucket is that there is no need to specify boundaries, bucketAuto will automatically try to spread the records as evenly as possible into each group.

API description Each group is output as a record, containing an _id field whose value is an object containing the two fields of the maximum and minimum values in the group, and a count field whose value is the number of records in the group. count is output by default when output is not specified.

bucketAuto has the following form:

bucketAuto({
  groupBy: <expression>,
  buckets: <number>,
  granularity: <string>,
  output: {
    <output1>: <accumulator expr>,
    ...
    <outputN>: <accumulator expr>
  }
})

groupBy is an expression used to determine the grouping applied to each input record. Expressions can be prefixed with $ followed by the field path to group by. Unless a default value is specified with default, each record needs to contain the specified field, and the field value must be within the range specified by boundaries.

buckets is a positive integer specifying the number of buckets to split.

granularity is an optional enumeration value string used to ensure that the automatically computed bounds conform to the given rules. This field can only be used if all groupBy values are numbers and there are no NaNs. Enumerated values include: R5, R10, R20, R40, R80, 1-2-5, E6, E12, E24, E48, E96, E192, POWERSOF2.

output is optional and is used to determine which fields the output record contains in addition to _id. The value of each field must be specified with an accumulator expression. When output is specified, the default count will not be output by default and must be specified manually:

output: {
  count: $.sum(1),
  ...
  <outputN>: <accumulator expr>
}

The output grouping may be smaller than the given number of groups in the following cases:

The number of input records is less than the number of groups

  • groupBy counts fewer unique values than groups
  • granularity has less spacing than the number of packets
  • granularity is not granular enough to be evenly distributed among groups

granularity details

granularity is used to guarantee that boundary values belong to a given sequence of numbers.

Renard Sequence

The Renard sequence is a sequence of numbers between 1.0 and 10.0 (10.3 in the case of R80) derived by the 5 / 10 / 20 / 40 / 80 root of 10.

Setting the granularity to R5 / R10 / R20 / R40 / R80 limits the boundary values to within the sequence. If the value of groupBy is not within 1.0 to 10.0 (or 10.3 in case of R80), the serial number is automatically multiplied by 10.

E sequence

The E-sequence is a sequence of numbers between 1.0 and 10.0 with a certain error derived from 10 to the power of 6 / 12 / 24 / 48 / 96 / 192.

1-2-5 sequence

The 1-2-5 sequence behaves the same as the three-valued Renard sequence.

Sequence of powers of 2

A sequence of numbers made up of powers of 2.

Example

Suppose the collection items has the following records:

{
  _id: "1",
  price: 10.5
}
{
  _id: "2",
  price: 50.3
}
{
  _id: "3",
  price: 20.8
}
{
  _id: "4",
  price: 80.2
}
{
  _id: "5",
  price: 200.3
}

The above records are automatically grouped into three groups:

const $ = db.command.aggregate
let res = await db.collection('items').aggregate()
  .bucketAuto({
    groupBy: '$price',
    buckets: 3,
  })
  .end()

The returned results are as follows:

{
  "_id": {
    "min": 10.5,
    "max": 50.3
  },
  "count": 2
}
{
  "_id": {
    "min": 50.3,
    "max": 200.3
  },
  "count": 2
}
{
  "_id": {
    "min": 200.3,
    "max": 200.3
  },
  "count": 1
}

# count

Aggregation stage. Calculates the number of records input to this stage from the previous aggregation stage, and outputs a record where the value of the specified field is the number of records.

API description

count has the following form:

count(<string>)

<string> is the name of the field to output the number of records, cannot be an empty string, cannot start with $, cannot contain . characters.

The count stage is equivalent to the operation of group + project:

const $ = db.command.aggregate
let res = await db.collection('items').aggregate()
  .group({
    _id: null,
    count: $.sum(1),
  })
  .project({
    _id: 0,
  })
  .end()

The above operation will output a record with the count field.

Example

Suppose the collection items has the following records:

{
  _id: "1",
  price: 10.5
}
{
  _id: "2",
  price: 50.3
}
{
  _id: "3",
  price: 20.8
}
{
  _id: "4",
  price: 80.2
}
{
  _id: "5",
  price: 200.3
}

Find the number of records with a price greater than 50:

const $ = db.command.aggregate
let res = await db.collection('items').aggregate()
  .match({
    price: $.gt(50)
  })
  .count('expensiveCount')
  .end()

The returned results are as follows:

{
  "expensiveCount": 3
}

# geoNear

Aggregation stage. Output the records from near to far from the given point.

property type default required description
near GeoPoint Yes GeoJSON Point, the point used to determine distance
spherical true Yes Required, value is true
maxDistance number No Maximum Distance
minDistance number No Minimum Distance
query Object No requires that records must meet this condition at the same time (same syntax as where)
distanceMultiplier number No Multiply the distance by this number on return
distanceField string Yes The name of the output field to store the distance, a nested field can be represented by dot notation
includeLocs string No Lists the fields to be used for distance calculations, useful if multiple fields in the record are geographic locations
key string No Select the geographic index to use. If the collection is indexed by multiple geographic locations, one must be specified, by specifying the corresponding field

API description

  • geoNear must be the first aggregation stage
  • Must have geo-indexed. If there are more than one, the index to use must be specified with the key parameter.

Example

Suppose the collection attractions has the following records:

{
  "_id": "geoNear.0",
  "city": "Guangzhou",
  "docType": "geoNear",
  "location": {
    "type": "Point",
    "coordinates": [
      113.30593,
      23.1361155
    ]
  },
  "name": "Canton Tower"
},
{
  "_id": "geoNear.1",
  "city": "Guangzhou",
  "docType": "geoNear",
  "location": {
    "type": "Point",
    "coordinates": [
      113.306789,
      23.1564721
    ]
  },
  "name": "Baiyun Mountain"
},
{
  "_id": "geoNear.2",
  "city": "Beijing",
  "docType": "geoNear",
  "location": {
    "type": "Point",
    "coordinates": [
      116.3949659,
      39.9163447
    ]
  },
  "name": "The Palace Museum"
},
{
  "_id": "geoNear.3",
  "city": "Beijing",
  "docType": "geoNear",
  "location": {
    "type": "Point",
    "coordinates": [
      116.2328567,
      40.242373
    ]
  },
  "name": "Great Wall"
}
const $ = db.command.aggregate
let res = await db.collection('attractions').aggregate()
  .geoNear({
    distanceField: 'distance', // 输出的每个记录中 distance 即是与给定点的距离
    spherical: true,
    near: new db.Geo.Point(113.3089506, 23.0968251),
    query: {
      docType: 'geoNear',
    },
    key: 'location', // 若只有 location 一个地理位置索引的字段,则不需填
    key: 'location', // If there is only one field of location index, it is not required to fill in
    includeLocs: 'location', // 若只有 location 一个是地理位置,则不需填
    includeLocs: 'location', // If only location is a geographic location, you don't need to fill in
  })
  .end()

The returned results are as follows:

{
  "_id": "geoNear.0",
  "location": {
    "type": "Point",
    "coordinates": [
      113.30593,
      23.1361155
    ]
  },
  "docType": "geoNear",
  "name": "Canton Tower",
  "city": "Guangzhou",
  "distance": 4384.68131486958
},
{
  "_id": "geoNear.1",
  "city": "Guangzhou",
  "location": {
    "type": "Point",
    "coordinates": [
      113.306789,
      23.1564721
    ]
  },
  "docType": "geoNear",
  "name": "Baiyun Mountain",
  "distance": 6643.521654040738
},
{
  "_id": "geoNear.2",
  "docType": "geoNear",
  "name": "The Palace Museum",
  "city": "Beijing",
  "location": {
    "coordinates": [
      116.3949659,
      39.9163447
    ],
    "type": "Point"
  },
  "distance": 1894750.4414538583
},
{
  "_id": "geoNear.3",
  "docType": "geoNear",
  "name": "Great Wall",
  "city": "Beijing",
  "location": {
    "type": "Point",
    "coordinates": [
      116.2328567,
      40.242373
    ]
  },
  "distance": 1928300.3308822548
}

# group

Aggregation stage. Group the input records according to the given expression, each record represents a group when outputting, and the _id of each record is the key to distinguish different groups. Output records can also include cumulative values, and setting the output field to cumulative values will calculate cumulative values from that grouping.

Using group can easily implement distinct functions similar to SQL

API description

group has the following form:

group({
  _id: <expression>,
  <field1>: <accumulator1>,
  ...
  <fieldN>: <accumulatorN>
})

The _id parameter is required, if you fill in a constant, there is only one set. The other fields are optional and are accumulated values, using accumulators such as $.sum (const $ = db.command.aggregate), but other expressions can also be used.

The accumulator must be one of the following operators:

For details, see Accumulator Operator

operator description
addToSet Adds a value to the array, does nothing if the value already exists in the array
avg Returns the average value of the data corresponding to the specified field in a set of collections
sum Computes and returns the sum of all values in a set of fields
first Returns the value corresponding to the first record of the specified field in a set. This operation only makes sense if the set of collections is sorted by some definition ( sort ).
last Returns the value corresponding to the last record of the specified field in a set. This operation only makes sense if the set of collections is sorted by some definition ( sort ).
max returns the maximum value of a set of values
min Returns the minimum value of a set of values
push In the group stage, returns an array consisting of the column specified by the expression and the corresponding value
stdDevPop Returns the standard deviation of the corresponding values of a set of fields
stdDevSamp Calculates the sample standard deviation of the input values. If the input values represent the population of data, or do not generalize more data, use db.command.aggregate.stdDevPop instead
mergeObjects Merge multiple documents into a single document

Memory Limit

There is a 100M memory usage limit at this stage.

Example 1: Group by Field Value

Suppose the collection avatar has the following records:

{
  _id: "1",
  alias: "john",
  region: "asia",
  scores: [40, 20, 80],
  coins: 100
}
{
  _id: "2",
  alias: "arthur",
  region: "europe",
  scores: [60, 90],
  coins: 20
}
{
  _id: "3",
  alias: "george",
  region: "europe",
  scores: [50, 70, 90],
  coins: 50
}
{
  _id: "4",
  alias: "john",
  region: "asia",
  scores: [30, 60, 100, 90],
  coins: 40
}
{
  _id: "5",
  alias: "george",
  region: "europe",
  scores: [20],
  coins: 60
}
{
  _id: "6",
  alias: "john",
  region: "asia",
  scores: [40, 80, 70],
  coins: 120
}
const $ = db.command.aggregate
let res = await db.collection('avatar').aggregate()
  .group({
    _id: '$alias',
    num: $.sum(1)
  })
  .end()

The returned results are as follows:

{
  "_id": "john",
  "num": 3
}
{
  "_id": "authur",
  "num": 1
}
{
  "_id": "george",
  "num": 2
}

Example 2: Group by Multiple Values

Records can be grouped by multiple values in the way _id is passed in. Or follow the example data above, group by regions with the same highest score (score), and find the total amount of coins in each group:

const $ = db.command.aggregate
let res = await db.collection('avatar').aggregate()
  .group({
    _id: {
      region: '$region',
      maxScore: $.max('$scores')
    },
    totalCoins: $.sum('$coins')
  })
  .end()

The returned results are as follows:

{
  "_id": {
    "region": "asia",
    "maxScore": 80
  },
  "totalCoins": 220
}
{
  "_id": {
    "region": "asia",
    "maxScore": 100
  },
  "totalCoins": 40
}
{
  "_id": {
    "region": "europe",
    "maxScore": 90
  },
  "totalCoins": 70
}
{
  "_id": {
    "region": "europe",
    "maxScore": 20
  },
  "totalCoins": 60
}

# limit

Aggregation stage. Limit the number of records output to the next stage.

Example

Suppose the collection items has the following records:

{
  _id: "1",
  price: 10
}
{
  _id: "2",
  price: 50
}
{
  _id: "3",
  price: 20
}
{
  _id: "4",
  price: 80
}
{
  _id: "5",
  price: 200
}

Return the smallest two records of records with a price greater than 20:

const $ = db.command.aggregate
let res = await db.collection('items').aggregate()
  .match({
    price: $.gt(20)
  })
  .sort({
    price: 1,
  })
  .limit(2)
  .end()

The returned results are as follows:

{
  "_id": "2",
  "price": 50
}
{
  "_id": "4",
  "price": 80
}

# lookup

Aggregation stage. Joint query. Do left outer join with a specified collection in the same database. For each input record in this stage, lookup will add an array field to the record, and the array is a list of records in the linked table that meet the matching conditions. lookup will output the concatenated result to the next stage.

API description

lookup can be used in two ways

# 相等匹配

When matching a field of an input record for equality with a field of the joined collection, the following definitions are used:

lookup({
  from: <要连接的集合名>,
  from: <collection name to connect to>,
  localField: <输入记录的要进行相等匹配的字段>,
  localField: <field of the input record to be matched for equality>,
  foreignField: <被连接集合的要进行相等匹配的字段>,
  foreignField: <the field of the connected collection to be matched for equality>,
  as: <输出的数组字段名>
  as: <output array field name>
})

Detailed description of parameters

Parameter Fields Description
from The name of another collection to join
localField The field name of the input record of the current pipeline, which will be used for equality matching with the foreignField of the collection specified by from. If the field is not present in the input record, the value of the field will be treated as null
foreignField The field name of the concatenated collection that will be used for equality matching with localField. If the field is not present in the records of the joined collection, the value of the field will be treated as null when matching
as Specifies the field name to be stored in the list of records matched by the connection. This array contains the matched records from the from collection. If the field already exists in the input record, the field will be overwritten

This operation is equivalent to the following pseudo-SQL operation:

SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT *
                               FROM <collection to join>
                               WHERE <foreignField>= <collection.localField>);

example:

  • Specify an equality match condition
  • apply equality matching to array fields
  • Combine mergeObjects to apply equality matching

# 自定义连接条件、拼接子查询

If you need to specify join conditions other than equality matching, or specify multiple equality matching conditions, or need to concatenate the subquery results of the joined sets, you can use the following definitions:

lookup({
  from: <要连接的集合名>,
  from: <collection name to connect to>,
  let: { <变量1>: <表达式1>, ..., <变量n>: <表达式n> },
  let: { <variable1>: <expression1>, ..., <variablen>: <expressionn> },
  pipeline: [ <在要连接的集合上进行的流水线操作> ],
  pipeline: [ <pipeline operation on the collection to join> ],
  as: <输出的数组字段名>
  as: <output array field name>
})

Detailed description of parameters

Parameter Fields Description
from The name of another collection to join
let Optional. Specify a variable that can be used in the pipeline. The value of the variable can refer to the field of the input record. For example, let: { userName: '$name' } means that the name field of the input record is used as the value of the variable userName. The fields of the input record cannot be directly accessed in the pipeline, and must be accessed after the let definition. The way to access is to use the $$ variable name in the expr operator to access, such as $$userName.
pipeline Specifies the aggregation operation to run on the joined collection. If the entire collection is to be returned, this field takes the empty array []. The fields of the input record cannot be directly accessed in the pipeline, and must be accessed after the let definition. The way to access is to use the $$ variable name in the expr operator to access, such as $$userName.
as Specifies the field name to be stored in the list of records matched by the connection. This array contains the matched records from the from collection. If the field already exists in the input record, the field will be overwritten

This operation is equivalent to the following pseudo-SQL statement:

SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
                               FROM <collection to join>
                               WHERE <pipeline> );

example

  • Specify multiple join conditions
  • Subqueries that concatenate the joined collections

Example

Specify an equality match condition

Suppose the orders collection has the following records:

[
  {"_id":4,"book":"novel 1","price":30,"quantity":2},
  {"_id":5,"book":"science 1","price":20,"quantity":1},
  {"_id":6}
]

The books collection has the following records:

[
  {"_id":"book1","author":"author 1","category":"novel","stock":10,"time":1564456048486,"title":"novel 1"},
  {"_id":"book3","author":"author 3","category":"science","stock":30,"title":"science 1"},
  {"_id":"book4","author":"author 3","category":"science","stock":40,"title":"science 2"},
  {"_id":"book2","author":"author 2","category":"novel","stock":20,"title":"novel 2"},
  {"_id":"book5","author":"author 4","category":"science","stock":50,"title":null},
  {"_id":"book6","author":"author 5","category":"novel","stock":"60"}
]

The following aggregation operations can join the orders and books collections with an equality match condition on the book field of the orders collection and the title field of the books collection:

const db = uniCloud.database()
let res = await db.collection('orders').aggregate()
  .lookup({
    from: 'books',
    localField: 'book',
    foreignField: 'title',
    as: 'bookList',
  })
  .end()

result:

[
  {
    "_id": 4,
    "book": "novel 1",
    "price": 30,
    "quantity": 2,
    "bookList": [
      {
        "_id": "book1",
        "title": "novel 1",
        "author": "author 1",
        "category": "novel",
        "stock": 10
      }
    ]
  },
  {
    "_id": 5,
    "book": "science 1",
    "price": 20,
    "quantity": 1,
    "bookList": [
      {
        "_id": "book3",
        "category": "science",
        "title": "science 1",
        "author": "author 3",
        "stock": 30
      }
    ]
  },
  {
    "_id": 6,
    "bookList": [
      {
        "_id": "book5",
        "category": "science",
        "author": "author 4",
        "stock": 50,
        "title": null
      },
      {
        "_id": "book6",
        "author": "author 5",
        "stock": "60",
        "category": "novel"
      }
    ]
  }
]

Apply equality matching to array fields Suppose the authors collection has the following records:

[
  {"_id": 1, "name": "author 1", "intro": "Two-time best-selling sci-fiction novelist"},
  {"_id": 3, "name": "author 3", "intro": "UCB assistant professor"},
  {"_id": 4, "name": "author 4", "intro": "major in CS"}
]

The books collection has the following records:

[
  {"_id":"book1","authors":["author 1"],"category":"novel","stock":10,"time":1564456048486,"title":"novel 1"},
  {"_id":"book3","authors":["author 3", "author 4"],"category":"science","stock":30,"title":"science 1"},
  {"_id":"book4","authors":["author 3"],"category":"science","stock":40,"title":"science 2"}
]

The following operation retrieves author information and their respective published books, using the lookup operation to match the name field of the authors collection with the authors array field of the books collection:

const db = cloud.database()
let res = await db.collection('authors').aggregate()
  .lookup({
    from: 'books',
    localField: 'name',
    foreignField: 'authors',
    as: 'publishedBooks',
  })
  .end()

result

[
  {
    "_id": 1,
    "intro": "Two-time best-selling sci-fiction novelist",
    "name": "author 1",
    "publishedBooks": [
      {
        "_id": "book1",
        "title": "novel 1",
        "category": "novel",
        "stock": 10,
        "authors": [
          "author 1"
        ]
      }
    ]
  },
  {
    "_id": 3,
    "name": "author 3",
    "intro": "UCB assistant professor",
    "publishedBooks": [
      {
        "_id": "book3",
        "category": "science",
        "title": "science 1",
        "stock": 30,
        "authors": [
          "author 3",
          "author 4"
        ]
      },
      {
        "_id": "book4",
        "title": "science 2",
        "category": "science",
        "stock": 40,
        "authors": [
          "author 3"
        ]
      }
    ]
  },
  {
    "_id": 4,
    "intro": "major in CS",
    "name": "author 4",
    "publishedBooks": [
      {
        "_id": "book3",
        "category": "science",
        "title": "science 1",
        "stock": 30,
        "authors": [
          "author 3",
          "author 4"
        ]
      }
    ]
  }
]

Combining mergeObjects applies equality matching

Suppose the orders collection has the following records:

[
  {"_id":4,"book":"novel 1","price":30,"quantity":2},
  {"_id":5,"book":"science 1","price":20,"quantity":1},
  {"_id":6}
]

The books collection has the following records:

[
  {"_id":"book1","author":"author 1","category":"novel","stock":10,"time":1564456048486,"title":"novel 1"},
  {"_id":"book3","author":"author 3","category":"science","stock":30,"title":"science 1"},
  {"_id":"book4","author":"author 3","category":"science","stock":40,"title":"science 2"},
  {"_id":"book2","author":"author 2","category":"novel","stock":20,"title":"novel 2"},
  {"_id":"book5","author":"author 4","category":"science","stock":50,"title":null},
  {"_id":"book6","author":"author 5","category":"novel","stock":"60"}
]

The following operation matches the book field of orders and the title field of books, and merges the books matching result directly into the orders record.

var db = cloud.database()
var $ = db.command.aggregate
let res = await db.collection('orders').aggregate()
  .lookup({
    from: "books",
    localField: "book",
    foreignField: "title",
    as: "bookList"
  })
  .replaceRoot({
    newRoot: $.mergeObjects([ $.arrayElemAt(['$bookList', 0]), '$$ROOT' ])
  })
  .project({
    bookList: 0
  })
  .end()

result

[
  {
    "_id": 4,
    "title": "novel 1",
    "author": "author 1",
    "time": 1564456048486,
    "category": "novel",
    "stock": 10,
    "book": "novel 1",
    "price": 30,
    "quantity": 2
  },
  {
    "_id": 5,
    "category": "science",
    "title": "science 1",
    "author": "author 3",
    "stock": 30,
    "book": "science 1",
    "price": 20,
    "quantity": 1
  },
  {
    "_id": 6,
    "category": "science",
    "author": "author 4",
    "stock": 50,
    "title": null
  }
]

Specify multiple join conditions

Suppose the orders collection has the following records:

[
  {"_id":4,"book":"novel 1","price":300,"quantity":20},
  {"_id":5,"book":"science 1","price":20,"quantity":1}
]

The books collection has the following records:

[
  {"_id":"book1","author":"author 1","category":"novel","stock":10,"time":1564456048486,"title":"novel 1"},
  {"_id":"book3","author":"author 3","category":"science","stock":30,"title":"science 1"}
]

The following operation joins the orders and books collections, requiring two conditions:

  • The book field of orders is equal to the title field of books
  • The stock field of books is greater than or equal to the quantity field of orders
const db = cloud.database()
const dbCmd = db.command
const $ = dbCmd.aggregate
let res = await db.collection('orders').aggregate()
  .lookup({
    from: 'books',
    let: {
      order_book: '$book',
      order_quantity: '$quantity'
    },
    pipeline: $.pipeline()
      .match(dbCmd.expr($.and([
        $.eq(['$title', '$$order_book']),
        $.gte(['$stock', '$$order_quantity'])
      ])))
      .project({
        _id: 0,
        title: 1,
        author: 1,
        stock: 1
      })
      .done(),
    as: 'bookList',
  })
  .end()

result:

[
  {
    "_id": 4,
    "book": "novel 1",
    "price": 300,
    "quantity": 20,
    "bookList": []
  },
  {
    "_id": 5,
    "book": "science 1",
    "price": 20,
    "quantity": 1,
    "bookList": [
      {
        "title": "science 1",
        "author": "author 3",
        "stock": 30
      }
    ]
  }
]

Concatenate subqueries of joined collections

Suppose the orders collection has the following records:

[
  {"_id":4,"book":"novel 1","price":30,"quantity":2},
  {"_id":5,"book":"science 1","price":20,"quantity":1}
]

The books collection has the following records:

[
  {"_id":"book1","author":"author 1","category":"novel","stock":10,"time":1564456048486,"title":"novel 1"},
  {"_id":"book3","author":"author 3","category":"science","stock":30,"title":"science 1"},
  {"_id":"book4","author":"author 3","category":"science","stock":40,"title":"science 2"}
]

Add an array field to each output record whose value is the result of a query on the books collection:

const db = cloud.database()
const $ = db.command.aggregate
let res = await db.collection('orders').aggregate()
  .lookup({
    from: 'books',
    let: {
      order_book: '$book',
      order_quantity: '$quantity'
    },
    pipeline: $.pipeline()
      .match({
        author: 'author 3'
      })
      .project({
        _id: 0,
        title: 1,
        author: 1,
        stock: 1
      })
      .done(),
    as: 'bookList',
  })
  .end()

result

[
  {
    "_id": 4,
    "book": "novel 1",
    "price": 30,
    "quantity": 20,
    "bookList": [
      {
        "title": "science 1",
        "author": "author 3",
        "stock": 30
      },
      {
        "title": "science 2",
        "author": "author 3",
        "stock": 40
      }
    ]
  },
  {
    "_id": 5,
    "book": "science 1",
    "price": 20,
    "quantity": 1,
    "bookList": [
      {
        "title": "science 1",
        "author": "author 3",
        "stock": 30
      },
      {
        "title": "science 2",
        "author": "author 3",
        "stock": 40
      }
    ]
  }
]

Multiple table join table query

Suppose the orders collection has the following records:

[
  {"_id":4,"book":"book1","price":30,"quantity":2},
  {"_id":5,"book":"book2","price":20,"quantity":1}
]

The books collection has the following records:

[
  {"_id":"book1","author":"author1","title":"novel 1"},
  {"_id":"book2","author":"author2","title":"science 1"},
]

The authors collection has the following records:

[
  {"_id":"author1","name":"A1"},
  {"_id":"author2","author":"A2"}
]

If orders are associated with books, and books are associated with authors query, you can use lookup in the pipeline.

const db = cloud.database()
const dbCmd = db.command
const $ = db.command.aggregate
let res = await db.collection('orders').aggregate()
  .lookup({
    from: 'books',
    let: {
      book_id: '$book'
    },
    pipeline: $.pipeline()
      .match(
        dbCmd.expr($.eq(['$_id', '$$book_id']))
      )
      .lookup({
        from: 'authors',
        let: {
          author_id: '$author'
        },
        pipeline: $.pipeline()
          .match(
            dbCmd.expr($.eq(['$_id', '$$author_id']))
          )
          .done(),
        as: 'authorList'
      })
      .done(),
    as: 'bookList',
  })
  .end()

The result is as follows

[
  {
		"_id":4,
		"book":"book1",
		"price":30,
		"quantity":2,
		"bookList": [{
			"_id":"book1",
			"author":"author1",
			"title":"novel 1",
			authorList: [{
				"_id":"author1",
				"name":"A1"
			}]
		}]
	},
  {
		"_id":5,
		"book":"book2",
		"price":20,
		"quantity":1,
		"bookList": [{
			"_id":"book2",
			"author":"author2",
			"title":"science 1",
			authorList: [{
				"_id":"author2",
				"name":"A2"
			}]
		}]
	}
]

# match

Aggregation stage. Filter documents based on a condition and pass those documents that match the condition to the next pipeline stage.

已知问题

支付宝小程序云中,使用match管道过滤时,如果传入的参数是一个对象,将按照字段的值进行相等匹配,包含字段顺序。

例如:过滤内存是8g的计算机商品

// 错误示例
let res = await db.collection('goods')
  .aggregate()
  .match({
    category: 'computer',
    type: {
      memory: 8,
    }
  })
  .end()
  
// 正确示例
let res = await db.collection('goods')
  .aggregate()
  .match({
    category: 'computer',
    "type.memory": 8
  })
  .end()

API description

match has the following form:

match(<查询条件>)
match(<query condition>)

The query conditions are the same as those of ordinary queries, and ordinary query operators can be used. Note that the match stage is different from other aggregation stages. Database operations cannot be used, and only query operators can be used.

// use string directly
match({
  name: 'Tony Stark'
})
// use operator
const dbCmd = db.command
match({
  age: dbCmd.gt(18)
})

Example

Suppose the collection articles has the following records:

{ "_id" : "1", "author" : "stark",  "score" : 80 }
{ "_id" : "2", "author" : "stark",  "score" : 85 }
{ "_id" : "3", "author" : "bob",    "score" : 60 }
{ "_id" : "4", "author" : "li",     "score" : 55 }
{ "_id" : "5", "author" : "jimmy",  "score" : 60 }
{ "_id" : "6", "author" : "li",     "score" : 94 }
{ "_id" : "7", "author" : "justan", "score" : 95 }

match

Here is an example of a direct match:

let res = await db.collection('articles')
  .aggregate()
  .match({
    author: 'stark'
  })
  .end()

The code here tries to find all articles where the author field is stark , then the match is as follows:

{ "_id" : "1", "author" : "stark", "score" : 80 }
{ "_id" : "2", "author" : "stark", "score" : 85 }

count

After match filters out documents, it can also be used in conjunction with other pipeline stages.

For example, in the following example, we use group to match and calculate the number of documents whose score field is greater than 80:

const dbCmd = db.command
const $ = dbCmd.aggregate
let res = await db.collection('articles')
  .aggregate()
  .match({
    score: dbCmd.gt(80)
  })
  .group({
      _id: null,
      count: $.sum(1)
  })
  .end()

The return value is as follows:

{ "_id" : null, "count" : 3 }

# project

Aggregation stage. Pass the specified field to the next pipeline. The specified field can be an existing field or a new calculated field.

API description

project has the following form:

project({
  <expression>
})

Expressions can have the following formats:

Format Description
: <1 or true> Specifies to include an existing field
_id: <0 or false> Discard _id field
: Add a new field, or reset an existing field
: <0 or false> Discard a field (if you specify to discard a non-_id field, you cannot use other expressions in this project)

Specify Include Fields

The _id field is included in the output by default, and any other fields must be specified in the project if they want to be reflected in the output; if the specified field contains a field that does not yet exist, the project will ignore this field, will not be added to the output document;

Specify exclusion fields

If you specify to exclude a field in the project, other fields will be reflected in the output; if the specified exclusion is a non-_id field, then in this project, no other expressions can be used;

Add a new field or reset an existing field

You can use some special expressions to add new fields, or reset an existing field.

Multi-level nested fields

Sometimes some fields are at the bottom of multiple levels of nesting, we can use dot notation:

"contact.phone.number": <1 or 0 or 表达式>
"contact.phone.number": <1 or 0 or expression>

Nested formats can also be used directly:

contact: { phone: { number: <1 or 0 or 表达式> } }
contact: { phone: { number: <1 or 0 or expression> } }

Example

Suppose we have an articles collection with the following documents:

{
    "_id": 666,
    "title": "This is title",
    "author": "Nobody",
    "isbn": "123456789",
    "introduction": "......"
}

Specify to include certain fields

The following code uses project to have the output only contain the _id, title and author fields:

let res = await db.collection('articles')
  .aggregate()
  .project({
    title: 1,
    author: 1
  })
  .end()

The output is as follows:

{ "_id" : 666, "title" : "This is title", "author" : "Nobody" }

Remove _id field from output

_id is included in the output by default, if you don't want it, you can specify to remove it:

let res = await db.collection('articles')
  .aggregate()
  .project({
    _id: 0,  // 指定去除 _id 字段
    title: 1,
    author: 1
  })
  .end()

The output is as follows:

{ "title" : "This is title", "author" : "Nobody" }

Remove a non-_id field

We can also specify to remove a non-_id field from the output, so that all other fields will be output:

let res = await db.collection('articles')
  .aggregate()
  .project({
    isbn: 0,  // 指定去除 isbn 字段
    isbn: 0, // Specifies to remove the isbn field
  })
  .end()

The output is as follows, without the isbn field compared to the input:

{
    "_id" : 666,
    "title" : "This is title",
    "author" : "Nobody",
    "introduction": "......"
}

Add calculated new field

Suppose we have a students collection with the following documents:

{
    "_id": 1,
    "name": "小明",
    "scores": {
        "chinese": 80,
        "math": 90,
        "english": 70
    }
}

In the code below, we use project and add a new field totalScore to the output:

const { sum } = db.command.aggregate
let res = await db.collection('students')
  .aggregate()
  .project({
    _id: 0,
    name: 1,
    totalScore: sum([
        "$scores.chinese",
        "$scores.math",
        "$scores.english"
    ])
  })
  .end()

The output is:

{ "name": "小明", "totalScore": 240 }
{ "name": "Xiao Ming", "totalScore": 240 }

Add new array field

Suppose we have a points collection containing the following documents:

{ "_id": 1, "x": 1, "y": 1 }
{ "_id": 2, "x": 2, "y": 2 }
{ "_id": 3, "x": 3, "y": 3 }

In the code below, we use project to put the x and y fields into a new array field coordinate :

let res = await db.collection('points')
  .aggregate()
  .project({
    coordinate: ["$x", "$y"]
  })
  .end()

The output is as follows:

{ "_id": 1, "coordinate": [1, 1] }
{ "_id": 2, "coordinate": [2, 2] }
{ "_id": 3, "coordinate": [3, 3] }

# replaceRoot

Aggregation stage. Specify an existing field as the root node of the output, or you can specify a new calculated field as the root node.

API description

replaceRoot is used in the following form:

replaceRoot({
    newRoot: <表达式>
    newRoot: <expression>
})

The expression format is as follows:

Format Description
Specifies an existing field as the output root node (if the field does not exist, an error is reported)
Computes a new field and uses this new field as the root node

Example

Use existing field as root node

Suppose we have a schools collection with the following content:

{
  "_id": 1,
  "name": "SFLS",
  "teachers": {
    "chinese": 22,
    "math": 18,
    "english": 21,
    "other": 123
  }
}

The following code uses replaceRoot to output the teachers field as the root node:

let res = await db.collection('schools')
  .aggregate()
  .replaceRoot({
    newRoot: '$teachers'
  })
  .end()

The output is as follows:

{
  "chinese": 22,
  "math": 18,
  "english": 21,
  "other": 123
}

Use calculated new field as root node

Suppose we have a roles collection with the following content:

{ "_id": 1, "first_name": "四郎", "last_name": "黄" }
{ "_id": 1, "first_name": "Shiro", "last_name": "Yellow" }
{ "_id": 2, "first_name": "邦德", "last_name": "马" }
{ "_id": 2, "first_name": "Bond", "last_name": "Horse" }
{ "_id": 3, "first_name": "牧之", "last_name": "张" }
{ "_id": 3, "first_name": "Muzhi", "last_name": "Zhang" }

The following code uses replaceRoot to concatenate first_name and last_name:

const { concat } = db.command.aggregate
let res = await db.collection('roles')
  .aggregate()
  .replaceRoot({
    newRoot: {
      full_name: concat(['$last_name', '$first_name'])
    }
  })
  .end()

The output is as follows:

{ "full_name": "黄四郎" }
{ "full_name": "Huang Shilang" }
{ "full_name": "马邦德" }
{ "full_name": "Mabonde" }
{ "full_name": "张牧之" }
{ "full_name": "Zhang Muzhi" }

# sample

Notice

  • This method may cause slow response when called frequently with large collections
  • There is a default limit of 20 for Tencent Cloud database operations. If you need to use sample to return more than 20 pieces of data, please specify an additional limit, for example: sample({size:40}).limit(40)

Aggregation stage. Randomly picks the specified number of records from the document.

API description

sample has the following form:

sample({
    size: <正整数>
    size: <positive integer>
})

Note: size is a positive integer, otherwise an error occurs.

Example

Suppose the document users has the following records:

{ "name": "a" }
{ "name": "b" }

choose randomly

If there is a lottery now, one lucky user needs to be selected. Then sample is called as follows:

let res = await db.collection('users')
  .aggregate()
  .sample({
    size: 1
  })
  .end()

The record corresponding to a randomly selected user is returned, and the result is as follows:

{ "_id": "696529e4-7e82-4e7f-812e-5144714edff6", "name": "b" }

# skip

Aggregation stage. Specify a positive integer to skip the corresponding number of documents and output the remaining documents.

Example

let res = await db.collection('users')
  .aggregate()
  .skip(5)
  .end()

This code will skip the first 5 documents found and output the remaining documents.

# sort

Aggregation stage. Sorts the entered documents according to the specified field.

API description

The form is as follows:

sort({
    <fieldname1>: <collation>,
    <fieldname2>: <collation>,
})

<collation /> can be the following values:

  • 1 represents ascending order (smallest to largest);
  • -1 represents descending order (large to small);

Example

Ascending/descending order

Suppose we have the collection articles, which contains the following data:

{ "_id": "1", "author": "stark",  "score": 80, "age": 18 }
{ "_id": "2", "author": "bob",    "score": 60, "age": 18 }
{ "_id": "3", "author": "li",     "score": 55, "age": 19 }
{ "_id": "4", "author": "jimmy",  "score": 60, "age": 22 }
{ "_id": "5", "author": "justan", "score": 95, "age": 33 }
let res = await db.collection('articles')
  .aggregate()
  .sort({
      age: -1,
      score: -1
  })
  .end()

The above code performs an aggregate search in the students collection and sorts the results, first according to the age field in descending order, and then according to the score field in descending order.

The output is as follows:

{ "_id": "5", "author": "justan", "score": 95, "age": 33 }
{ "_id": "4", "author": "jimmy",  "score": 60, "age": 22 }
{ "_id": "3", "author": "li",     "score": 55, "age": 19 }
{ "_id": "1", "author": "stark",  "score": 80, "age": 18 }
{ "_id": "2", "author": "bob",    "score": 60, "age": 18 }

# sortByCount

Aggregation stage. Groups the incoming collection according to the incoming expression. Then count the number of distinct groups, and sort the groups by their number, returning the sorted result.

API description

sortByCount is called as follows:

sortByCount(<表达式>)
sortByCount(<expression>)

The form of the expression is: $ + specifies the field. Note: don't leave out the $ sign.

Example

Statistical base type

Suppose the set passages are recorded as follows:

{ "category": "Web" }
{ "category": "Web" }
{ "category": "Life" }

The following code can count the classification information of the article and calculate the number of each classification. That is, perform the sortByCount aggregation operation on the category field.

let res = await db.collection('passages')
  .aggregate()
  .sortByCount('$category')
  .end()

The returned results are as follows: 2 articles in the Web category and 1 article in the Life category.

{ "_id": "Web", "count": 2 }
{ "_id": "Life", "count": 1 }

destructuring array types

Suppose the records of the collection passages are as follows: the value corresponding to the tags field is of type array.

{ "tags": [ "JavaScript", "C#" ] }
{ "tags": [ "Go", "C#" ] }
{ "tags": [ "Go", "Python", "JavaScript" ] }

How to count the tag information of the article and calculate the number of each tag? Because of the array corresponding to the tags field, you need to use the unwind operation to deconstruct the tags field, and then call sortByCount.

The following code implements this functionality:

let res = await db.collection('passages')
  .aggregate()
  .unwind(`$tags`)
  .sortByCount(`$tags`)
  .end()

The result returned is as follows:

{ "_id": "Go", "count": 2 }
{ "_id": "C#", "count": 2 }
{ "_id": "JavaScript", "count": 2 }
{ "_id": "Python", "count": 1 }

# unwind

Aggregation stage. Splits the document using each element in the specified array field. After splitting, the document changes from one to one or more, one for each element of the array.

API description

Splits the document using each element in the specified array field. After splitting, the document changes from one to one or more, one for each element of the array.

unwind can be used in two forms:

The parameter is a field name

unwind(<字段名>)
unwind(<field name>)

The parameter is an object

unwind({
    path: <字段名>,
    includeArrayIndex: <string>,
    preserveNullAndEmptyArrays: <boolean>
})
Field Type Description
path string The field name of the array you want to split, needs to start with $.
includeArrayIndex string Optional, pass in a new field name on which the array index will be stored. New field names cannot start with $.
preserveNullAndEmptyArrays boolean If true, the document will still be output when the field corresponding to path is null, an empty array, or the field does not exist; if false, unwind will not output these documents. Defaults to false.

Example

split array

Suppose we have a products collection with the following data:

{ "_id": "1", "product": "tshirt", "size": ["S", "M", "L"] }
{ "_id": "2", "product": "pants", "size": [] }
{ "_id": "3", "product": "socks", "size": null }
{ "_id": "4", "product": "trousers", "size": ["S"] }
{ "_id": "5", "product": "sweater", "size": ["M", "L"] }

We split these documents based on the size field

db.collection('products')
  .aggregate()
  .unwind('$size')
  .end()

The output is as follows:

{ "_id": "1", "product": "tshirt", "size": "S" }
{ "_id": "1", "product": "tshirt", "size": "M" }
{ "_id": "1", "product": "tshirt", "size": "L" }
{ "_id": "4", "product": "trousers", "size": "S" }
{ "_id": "5", "product": "sweater", "size": "M" }
{ "_id": "5", "product": "sweater", "size": "L" }

After splitting, keep the index of the original array

After we split the document according to the size field, we want to keep the original array index in the new index field.

let res = await db.collection('products')
  .aggregate()
  .unwind({
      path: '$size',
      includeArrayIndex: 'index'
  })
  .end()

The output is as follows:

{ "_id": "1", "product": "tshirt", "size": "S", "index": 0 }
{ "_id": "1", "product": "tshirt", "size": "M", "index": 1 }
{ "_id": "1", "product": "tshirt", "size": "L", "index": 2 }
{ "_id": "4", "product": "trousers", "size": "S", "index": 0 }
{ "_id": "5", "product": "sweater", "size": "M", "index": 0 }
{ "_id": "5", "product": "sweater", "size": "L", "index": 1 }

Keep documents with empty fields

Notice that we have two special rows of null data in our collection:

...
{ "_id": "2", "product": "pants", "size": [] }
{ "_id": "3", "product": "socks", "size": null }
...

If you want to keep documents whose size is an empty array, null, or the size field does not exist in the output, you can use the preserveNullAndEmptyArrays parameter

let res = await db.collection('products')
  .aggregate()
  .unwind({
      path: '$size',
      preserveNullAndEmptyArrays: true
  })
  .end()

The output is as follows:

{ "_id": "1", "product": "tshirt", "size": "S" }
{ "_id": "1", "product": "tshirt", "size": "M" }
{ "_id": "1", "product": "tshirt", "size": "L" }
{ "_id": "2", "product": "pants", "size": null }
{ "_id": "3", "product": "socks", "size": null }
{ "_id": "4", "product": "trousers", "size": "S" }
{ "_id": "5", "product": "sweater", "size": "M" }
{ "_id": "5", "product": "sweater", "size": "L" }

# end

Indicates that the definition of the aggregation operation is completed, and initiates the actual aggregation operation

return value

Promise.<Object>

property type description
list Array.<any> List of aggregated results

Sample code

const $ = db.command.aggregate
let res = await db.collection('books').aggregate()
  .group({
    // group by category field
    _id: '$category',
    // Let each group of records output have an avgSales field whose value is the average of the sales fields of all records in the group
    avgSales: $.avg('$sales')
  })
  .end()