聚合函数¶
本文介绍 NebulaGraph 支持的聚合函数。
avg()¶
avg() 返回参数的平均值。
语法:avg(<expression>)
- 返回类型:double。
示例:
# 返回 Tag player 的 age 属性的平均值。
nebula> MATCH (v:player) RETURN avg(v.player.age);
+--------------------+
| avg(v.player.age) |
+--------------------+
| 33.294117647058826 |
+--------------------+
count()¶
count() 返回参数的数量。
- (原生 nGQL)用户可以同时使用
count()
和GROUP BY
对传参进行分组和计数,再使用YIELD
返回结果。
- (openCypher 方式)用户可以使用
count()
对指定的值进行计数,再使用RETURN
返回结果。不需要使用GROUP BY
。
语法:count({<expression> | *})
- count(*) 返回总行数(包括 NULL)。
- 返回类型:int。
示例:
# 将列表拆分,并返回拆分后行的数量,列表元素的数量,拆分后将行去重后的数量。
nebula> WITH [NULL, 1, 1, 2, 2] As a UNWIND a AS b \
RETURN count(b), count(*), count(DISTINCT b);
+----------+----------+-------------------+
| count(b) | count(*) | count(distinct b) |
+----------+----------+-------------------+
| 4 | 5 | 2 |
+----------+----------+-------------------+
# 返回 player101 follow 的人,以及 follow player101 的人,即双向查询。
# 使用`count()`和`GROUP BY`进行分组和计数。
nebula> GO FROM "player101" OVER follow BIDIRECT \
YIELD properties($$).name AS Name \
| GROUP BY $-.Name YIELD $-.Name, count(*);
+---------------------+----------+
| $-.Name | count(*) |
+---------------------+----------+
| "LaMarcus Aldridge" | 2 |
| "Tim Duncan" | 2 |
| "Marco Belinelli" | 1 |
| "Manu Ginobili" | 1 |
| "Boris Diaw" | 1 |
| "Dejounte Murray" | 1 |
+---------------------+----------+
# 使用`count()`进行计数。
nebula> MATCH (v1:player)-[:follow]-(v2:player) \
WHERE id(v1)== "player101" \
RETURN v2.player.name AS Name, count(*) as cnt ORDER BY cnt DESC;
+---------------------+-----+
| Name | cnt |
+---------------------+-----+
| "LaMarcus Aldridge" | 2 |
| "Tim Duncan" | 2 |
| "Boris Diaw" | 1 |
| "Manu Ginobili" | 1 |
| "Dejounte Murray" | 1 |
| "Marco Belinelli" | 1 |
+---------------------+-----+
上述示例的返回结果有两列:
$-.Name
:查询结果包含的姓名。
count(*)
:姓名出现的次数。
因为测试数据集basketballplayer
中没有重复的姓名,count(*)
列中数字2
表示该行的人和player101
是互相follow
的关系。
# 方法一:统计数据库中的年龄分布情况。
nebula> LOOKUP ON player \
YIELD player.age As playerage \
| GROUP BY $-.playerage \
YIELD $-.playerage as age, count(*) AS number \
| ORDER BY $-.number DESC, $-.age DESC;
+-----+--------+
| age | number |
+-----+--------+
| 34 | 4 |
| 33 | 4 |
| 30 | 4 |
| 29 | 4 |
| 38 | 3 |
+-----+--------+
...
# 方法二:统计数据库中的年龄分布情况。
nebula> MATCH (n:player) \
RETURN n.player.age as age, count(*) as number \
ORDER BY number DESC, age DESC;
+-----+--------+
| age | number |
+-----+--------+
| 34 | 4 |
| 33 | 4 |
| 30 | 4 |
| 29 | 4 |
| 38 | 3 |
+-----+--------+
...
# 统计 Tim Duncan 关联的边数。
nebula> MATCH (v:player{name:"Tim Duncan"}) -[e]- (v2) \
RETURN count(e);
+----------+
| count(e) |
+----------+
| 13 |
+----------+
# 多跳查询,统计 Tim Duncan 关联的边数,返回两列(不去重和去重)。
nebula> MATCH (n:player {name : "Tim Duncan"})-[]->(friend:player)-[]->(fof:player) \
RETURN count(fof), count(DISTINCT fof);
+------------+---------------------+
| count(fof) | count(distinct fof) |
+------------+---------------------+
| 4 | 3 |
+------------+---------------------+
max()¶
max() 返回参数的最大值。
语法:max(<expression>)
- 返回类型:与原参数相同。
示例:
# 返回 Tag player 的 age 属性的最大值。
nebula> MATCH (v:player) RETURN max(v.player.age);
+-------------------+
| max(v.player.age) |
+-------------------+
| 47 |
+-------------------+
min()¶
min() 返回参数的最小值。
语法:min(<expression>)
- 返回类型:与原参数相同。
示例:
# 返回 Tag player 的 age 属性的最小值。
nebula> MATCH (v:player) RETURN min(v.player.age);
+-------------------+
| min(v.player.age) |
+-------------------+
| 20 |
+-------------------+
collect()¶
collect() 返回一个符合表达式返回结果的列表。该函数可以将多条记录或值合并进一个列表,实现数据聚合。
语法:collect(<expression>)
- 返回类型:list。
示例:
# 将列表拆分为单独的行。
nebula> UNWIND [1, 2, 1] AS a \
RETURN a;
+---+
| a |
+---+
| 1 |
| 2 |
| 1 |
+---+
# 将列表拆分为单独的行记录后再合并为一个列表返回。
nebula> UNWIND [1, 2, 1] AS a \
RETURN collect(a);
+------------+
| collect(a) |
+------------+
| [1, 2, 1] |
+------------+
# 统计并返回列表中相同元素的数量。
nebula> UNWIND [1, 2, 1] AS a \
RETURN a, collect(a), size(collect(a));
+---+------------+------------------+
| a | collect(a) | size(collect(a)) |
+---+------------+------------------+
| 2 | [2] | 1 |
| 1 | [1, 1] | 2 |
+---+------------+------------------+
# 降序排列,限制输出行数为 3,然后将结果输出到列表中。
nebula> UNWIND ["c", "b", "a", "d" ] AS p \
WITH p AS q \
ORDER BY q DESC LIMIT 3 \
RETURN collect(q);
+-----------------+
| collect(q) |
+-----------------+
| ["d", "c", "b"] |
+-----------------+
# 将列表拆分后去重,再合并为新的列表返回。
nebula> WITH [1, 1, 2, 2] AS coll \
UNWIND coll AS x \
WITH DISTINCT x \
RETURN collect(x) AS ss;
+--------+
| ss |
+--------+
| [1, 2] |
+--------+
# 将 Tag player 的 age 属性值合并为列表返回。
nebula> MATCH (n:player) \
RETURN collect(n.player.age);
+---------------------------------------------------------------+
| collect(n.player.age) |
+---------------------------------------------------------------+
| [32, 32, 34, 29, 41, 40, 33, 25, 40, 37, ...
...
# 基于年龄聚合姓名。
nebula> MATCH (n:player) \
RETURN n.player.age AS age, collect(n.player.name);
+-----+--------------------------------------------------------------------------+
| age | collect(n.player.name) |
+-----+--------------------------------------------------------------------------+
| 24 | ["Giannis Antetokounmpo"] |
| 20 | ["Luka Doncic"] |
| 25 | ["Joel Embiid", "Kyle Anderson"] |
+-----+--------------------------------------------------------------------------+
...
# 将 player100 的目的点 name 属性值聚合,结果合并为列表返回。
nebula> GO FROM "player100" OVER serve \
YIELD properties($$).name AS name \
| GROUP BY $-.name \
YIELD collect($-.name) AS name;
+-----------+
| name |
+-----------+
| ["Spurs"] |
+-----------+
# 将 Tag player 的 age 属性值聚合,结果合并为列表返回。
nebula> LOOKUP ON player \
YIELD player.age As playerage \
| GROUP BY $-.playerage \
YIELD collect($-.playerage) AS playerage;
+------------------+
| playerage |
+------------------+
| [22] |
| [47] |
| [43] |
| [25, 25] |
+------------------+
...
std()¶
std() 返回参数的总体标准差。
语法:std(<expression>)
- 返回类型:double。
示例:
# 返回 Tag player 的 age 属性值的标准差。
nebula> MATCH (v:player) RETURN std(v.player.age);
+-------------------+
| std(v.player.age) |
+-------------------+
| 6.423895701687502 |
+-------------------+
sum()¶
sum() 返回参数的和。
语法:sum(<expression>)
- 返回类型:与原参数相同。
示例:
# 返回 Tag player 的 age 属性值的总和。
nebula> MATCH (v:player) RETURN sum(v.player.age);
+-------------------+
| sum(v.player.age) |
+-------------------+
| 1698 |
+-------------------+
聚合示例¶
nebula> GO FROM "player100" OVER follow YIELD dst(edge) AS dst, properties($$).age AS age \
| GROUP BY $-.dst \
YIELD \
$-.dst AS dst, \
toInteger((sum($-.age)/count($-.age)))+avg(distinct $-.age+1)+1 AS statistics;
+-------------+------------+
| dst | statistics |
+-------------+------------+
| "player125" | 84.0 |
| "player101" | 74.0 |
+-------------+------------+
最后更新:
May 9, 2024