全文搜索¶
全文搜索是基于全文索引对值为字符串类型的属性进行前缀搜索、通配符搜索、正则表达式搜索和模糊搜索。
在LOOKUP
语句中,使用WHERE
子句指定字符串的搜索条件。
前提条件¶
请确保已经部署全文索引。详情请参见部署全文索引和部署 listener。
注意事项¶
使用全文索引前,请确认已经了解全文索引的使用限制。
全文本查询¶
全文本查询使您能够搜索经过分析的文本字段,使用具有严格语法的解析器,根据提供的查询字符串返回内容。详情参见Query string query。
语法¶
创建全文索引¶
CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} (<prop_name> [,<prop_name>]...) [ANALYZER="<analyzer_name>"];
- 创建全文索引时支持多属性的复合索引。
<analyzer_name>
为分词器名称。默认为standard
。使用其他分词器(例如 IK Analysis)需要确保已经提前在 Elasticsearch 中安装对应分词器。
显示全文索引¶
SHOW FULLTEXT INDEXES;
重建全文索引¶
REBUILD FULLTEXT INDEX;
Caution
数据量大时,重建全文索引速度较慢,可以修改 Storage 服务的配置文件(nebula-storaged.conf
)中snapshot_send_files=false
。
删除全文索引¶
DROP FULLTEXT INDEX <index_name>;
使用查询选项¶
LOOKUP ON {<tag> | <edge_type>} WHERE ES_QUERY(<index_name>, "<text>") YIELD <return_list> [| LIMIT [<offset>,] <number_rows>];
<return_list>
<prop_name> [AS <prop_alias>] [, <prop_name> [AS <prop_alias>] ...] [, id(vertex) [AS <prop_alias>]] [, score() AS <score_alias>]
index_name
:索引名称。
text
:搜索条件。WHERE 后只能跟一个 ES_QUERY,所有判断条件必须写在 text 里。详细语法请参见Query string syntax。
score()
:对符合条件的点做 N 度扩展计算出的分数。默认值为1.0
。分数越高,匹配程度越高。返回值默认按照分数从高到低排序。详情参见Search and Scoring in Lucene。
示例¶
//创建图空间。
nebula> CREATE SPACE IF NOT EXISTS basketballplayer (partition_num=3,replica_factor=1, vid_type=fixed_string(30));
//登录文本搜索客户端。
nebula> SIGN IN TEXT SERVICE (192.168.8.100:9200, HTTP);
//检查是否成功登录。
nebula> SHOW TEXT SEARCH CLIENTS;
+-----------------+-----------------+------+
| Type | Host | Port |
+-----------------+-----------------+------+
| "ELASTICSEARCH" | "192.168.8.100" | 9200 |
+-----------------+-----------------+------+
//切换图空间。
nebula> USE basketballplayer;
//添加 listener 到 NebulaGraph 集群。
nebula> ADD LISTENER ELASTICSEARCH 192.168.8.100:9789;
//检查是否成功添加 listener,当状态为 Online 时表示成功添加。
nebula> SHOW LISTENER;
+--------+-----------------+------------------------+-------------+
| PartId | Type | Host | Host Status |
+--------+-----------------+------------------------+-------------+
| 1 | "ELASTICSEARCH" | ""192.168.8.100":9789" | "ONLINE" |
| 2 | "ELASTICSEARCH" | ""192.168.8.100":9789" | "ONLINE" |
| 3 | "ELASTICSEARCH" | ""192.168.8.100":9789" | "ONLINE" |
+--------+-----------------+------------------------+-------------+
//创建 Tag。
nebula> CREATE TAG IF NOT EXISTS player(name string, city string);
//创建单属性全文索引。
nebula> CREATE FULLTEXT TAG INDEX fulltext_index_1 ON player(name) ANALYZER="standard";
//创建多属性全文索引。
nebula> CREATE FULLTEXT TAG INDEX fulltext_index_2 ON player(name,city) ANALYZER="standard";
//重建全文索引。
nebula> REBUILD FULLTEXT INDEX;
//查看全文索引。
nebula> SHOW FULLTEXT INDEXES;
+--------------------+-------------+-------------+--------------+------------+
| Name | Schema Type | Schema Name | Fields | Analyzer |
+--------------------+-------------+-------------+--------------+------------+
| "fulltext_index_1" | "Tag" | "player" | "name" | "standard" |
| "fulltext_index_2" | "Tag" | "player" | "name, city" | "standard" |
+--------------------+-------------+-------------+--------------+------------+
//插入测试数据。
nebula> INSERT VERTEX player(name, city) VALUES \
"Russell Westbrook": ("Russell Westbrook", "Los Angeles"), \
"Chris Paul": ("Chris Paul", "Houston"),\
"Boris Diaw": ("Boris Diaw", "Houston"),\
"David West": ("David West", "Philadelphia"),\
"Danny Green": ("Danny Green", "Philadelphia"),\
"Tim Duncan": ("Tim Duncan", "New York"),\
"James Harden": ("James Harden", "New York"),\
"Tony Parker": ("Tony Parker", "Chicago"),\
"Aron Baynes": ("Aron Baynes", "Chicago"),\
"Ben Simmons": ("Ben Simmons", "Phoenix"),\
"Blake Griffin": ("Blake Griffin", "Phoenix");
//测试查询
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"Chris") YIELD id(vertex);
+--------------+
| id(VERTEX) |
+--------------+
| "Chris Paul" |
+--------------+
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"Harden") YIELD properties(vertex);
+----------------------------------------------------------------+
| properties(VERTEX) |
+----------------------------------------------------------------+
| {_vid: "James Harden", city: "New York", name: "James Harden"} |
+----------------------------------------------------------------+
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"Da*") YIELD properties(vertex);
+------------------------------------------------------------------+
| properties(VERTEX) |
+------------------------------------------------------------------+
| {_vid: "David West", city: "Philadelphia", name: "David West"} |
| {_vid: "Danny Green", city: "Philadelphia", name: "Danny Green"} |
+------------------------------------------------------------------+
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex);
+---------------------+
| id(VERTEX) |
+---------------------+
| "Russell Westbrook" |
| "Boris Diaw" |
| "Aron Baynes" |
| "Ben Simmons" |
| "Blake Griffin" |
+---------------------+
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex) | LIMIT 2,3;
+-----------------+
| id(VERTEX) |
+-----------------+
| "Aron Baynes" |
| "Ben Simmons" |
| "Blake Griffin" |
+-----------------+
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex) | YIELD count(*);
+----------+
| count(*) |
+----------+
| 5 |
+----------+
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex), score() AS score;
+---------------------+-------+
| id(VERTEX) | score |
+---------------------+-------+
| "Russell Westbrook" | 1.0 |
| "Boris Diaw" | 1.0 |
| "Aron Baynes" | 1.0 |
| "Ben Simmons" | 1.0 |
| "Blake Griffin" | 1.0 |
+---------------------+-------+
//对于包含 b 的词的文档,它的得分将乘以加权因子 4,而对于包含 c 的词的文档,使用默认加权因子 1。
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*^4 OR *c*") YIELD id(vertex), score() AS score;
+---------------------+-------+
| id(VERTEX) | score |
+---------------------+-------+
| "Russell Westbrook" | 4.0 |
| "Boris Diaw" | 4.0 |
| "Aron Baynes" | 4.0 |
| "Ben Simmons" | 4.0 |
| "Blake Griffin" | 4.0 |
| "Chris Paul" | 1.0 |
| "Tim Duncan" | 1.0 |
+---------------------+-------+
//使用多属性全文索引查询,此时条件会在索引的所有属性内进行匹配。
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,"*h*") YIELD properties(vertex);
+------------------------------------------------------------------+
| properties(VERTEX) |
+------------------------------------------------------------------+
| {_vid: "Chris Paul", city: "Houston", name: "Chris Paul"} |
| {_vid: "Boris Diaw", city: "Houston", name: "Boris Diaw"} |
| {_vid: "David West", city: "Philadelphia", name: "David West"} |
| {_vid: "James Harden", city: "New York", name: "James Harden"} |
| {_vid: "Tony Parker", city: "Chicago", name: "Tony Parker"} |
| {_vid: "Aron Baynes", city: "Chicago", name: "Aron Baynes"} |
| {_vid: "Ben Simmons", city: "Phoenix", name: "Ben Simmons"} |
| {_vid: "Blake Griffin", city: "Phoenix", name: "Blake Griffin"} |
| {_vid: "Danny Green", city: "Philadelphia", name: "Danny Green"} |
+------------------------------------------------------------------+
//使用多属性全文索引查询时,可以对不同的属性指定不同的文本进行查询。
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,"name:*b* AND city:Houston") YIELD properties(vertex);
+-----------------------------------------------------------+
| properties(VERTEX) |
+-----------------------------------------------------------+
| {_vid: "Boris Diaw", city: "Houston", name: "Boris Diaw"} |
+-----------------------------------------------------------+
//删除单属性全文索引。
nebula> DROP FULLTEXT INDEX fulltext_index_1;
最后更新:
September 6, 2024