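Recognize entities
Named entity recognition is a feature of Azure AI Language that identifies and categorizes entities in unstructured text. It supports many entity categories, including people, places, events, products, and organizations.
There are several ways to call the named entity recognition API. Here, you use the azure_ai extension to recognize entities in text from within SQL queries.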
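Prerequisites
You need an Azure Database for PostgreSQL flexible server with the azure_ai extension enabled and configured. You also need to authorize the extension with Azure Cognitive Services by setting the key and endpoint of your Language resource.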
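The setup described above might look like the following minimal sketch. It assumes the extension is already allow-listed on the server and that the settings are named azure_cognitive.endpoint and azure_cognitive.subscription_key. Verify both names against the extension documentation for your version. The endpoint and key values are placeholders.

```sql
-- Enable the extension in the current database
-- (assumes azure_ai is already allow-listed on the server).
CREATE EXTENSION IF NOT EXISTS azure_ai;

-- Point the extension at your Language resource.
-- Both values below are placeholders, not real credentials.
SELECT azure_ai.set_setting('azure_cognitive.endpoint', 'https://<your-resource>.cognitiveservices.azure.com');
SELECT azure_ai.set_setting('azure_cognitive.subscription_key', '<your-key>');
```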
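Scenarios
Entity recognition is useful in many domains, for example:
- Search and indexing: automatically generate knowledge graphs and tagged catalogs from recognized entities.
- Process automation: automatically identify products and locations in unstructured text, and use them to route customer support requests.
- Market analysis: measure the most common entities and entity clusters in social media, customer reviews, support tickets, and similar sources to identify relevant topics and predict trends.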
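Use named entity recognition with Azure Cognitive Services (SQL)
The azure_ai extension for Azure Database for PostgreSQL flexible server provides user-defined functions (UDFs) that give direct access to AI features from within SQL. The named entity recognition API is accessed through the azure_cognitive.recognize_entities function provided by azure_ai: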
azure_cognitive.recognize_entities(
text text,
language text,
timeout_ms integer DEFAULT 3600000,
throw_on_error boolean DEFAULT true,
disable_service_logs boolean DEFAULT false
)
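The required parameters are text, the input, and language, the language that text is written in. For example, en-us is US English and fr is French. For the full list of available languages, see Language support.
By default, entity recognition stops if it has not completed within 3,600,000 milliseconds (1 hour). You can customize this timeout by changing timeout_ms.
If an error occurs, the default behavior is to raise an exception, which causes the transaction to roll back. You can disable this behavior by setting throw_on_error to false.
For full parameter documentation, see the Azure Cognitive Services extension documentation.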
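As a sketch of overriding the optional parameters, the call below uses PostgreSQL named notation with the parameter names from the signature above. The 5-second timeout is an arbitrary illustrative value, not a recommendation.

```sql
SELECT azure_cognitive.recognize_entities(
    'For more information, see Cognitive Services Compliance and Privacy notes.',
    'en-us',
    timeout_ms     => 5000,   -- stop after 5 seconds instead of the 1-hour default
    throw_on_error => false   -- do not raise an exception if the call fails
);
```

Named notation lets you skip optional parameters you do not want to change, such as disable_service_logs here.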
For example, run the following query:
SELECT azure_cognitive.recognize_entities('For more information, see Cognitive Services Compliance and Privacy notes.', 'en-us');
The result is:
{"(\"Cognitive Services\",Skill,\"\",0.94)"}
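This indicates that the entity text is "Cognitive Services" and that it was classified as a Skill with a confidence score of 0.94. The empty third field is the subcategory, which this entity does not have.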
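Because the result is an array of composite values, it can be easier to read when expanded into one row per entity. The following is a minimal sketch using the standard PostgreSQL unnest function. The column aliases (entity_text, category, subcategory, score) are labels chosen here for illustration and are applied by position. They are not necessarily the field names the extension defines.

```sql
SELECT e.*
FROM unnest(
    azure_cognitive.recognize_entities(
        'For more information, see Cognitive Services Compliance and Privacy notes.',
        'en-us'
    )
) AS e(entity_text, category, subcategory, score);  -- hypothetical positional aliases
```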
You can also use a table column as the input text:
SELECT description, azure_cognitive.recognize_entities(description, 'en-us')
FROM listings LIMIT 1;
This returns:
{"(house,Location,\"\",0.77)","(2013.,DateTime,DateRange,1)","(\"rooftop deck\",Location,\"\",0.88)","(\"lounge area\",Location,Structural,0.97)","(tub,Product,\"\",0.52)","(5,Quantity,Number,0.8)","(bedrooms,Location,\"\",0.92)","(\"gourmet kitchen\",Location,\"\",0.87)","(2-3,Quantity,NumberRange,0.87)","(downtown,Location,Structural,0.8)","(\"Queen Anne neighborhood\",Location,\"\",0.74)","(house,Location,\"\",0.96)","(barnwood,Product,\"\",0.61)","(steel,Product,\"\",0.73)","(concrete,Product,\"\",0.7)","(living,Location,Structural,0.53)","(\"gourmet kitchen\",Location,\"\",0.7)","(kitchen,Location,\"\",0.77)","(reading,Skill,\"\",0.54)","(half,Quantity,Number,0.8)","(\"tv room\",Location,\"\",0.89)","(kitchen,Location,\"\",0.64)","(Fireplace,Product,\"\",0.91)","(sofa,Product,\"\",0.98)","(\"sitting area\",Location,\"\",0.93)","(\"Basement room\",Location,\"\",0.98)","(kids,PersonType,\"\",0.73)","(room,Location,Structural,0.78)","(patio,Location,Structural,0.75)","(basketball,Product,\"\",0.57)","(bedroom,Location,\"\",0.8)","(basement,Location,\"\",0.94)","(\"concrete heated floors\",Product,\"\",0.95)","(\"queen sleeper sofa\",Product,\"\",0.86)","(tv,Location,\"\",0.54)","(basement,Location,\"\",0.92)","(room,Location,Structural,0.9)","(\"a second\",DateTime,Duration,0.85)","(family,PersonType,\"\",0.71)","(kids,PersonType,\"\",0.65)","(\"2nd floor\",Location,Structural,0.56)","(4,Quantity,Number,0.8)","(bedrooms,Location,\"\",0.66)","(one,Quantity,Number,0.8)","(one,Quantity,Number,0.8)","(bedroom,Location,\"\",0.54)","(\"twin bunk beds\",Product,\"\",0.67)"}
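A result this long is usually more useful once filtered. The sketch below expands the array with a lateral unnest and keeps only high-confidence Product entities. The column aliases are hypothetical labels applied by position, the 0.8 threshold is arbitrary, and the cast to text is a precaution in case the category field is not a plain text column.

```sql
SELECT e.entity_text, e.category, e.score
FROM (SELECT description FROM listings LIMIT 1) AS l
CROSS JOIN LATERAL unnest(
    azure_cognitive.recognize_entities(l.description, 'en-us')
) AS e(entity_text, category, subcategory, score)  -- hypothetical positional aliases
WHERE e.category::text = 'Product'
  AND e.score >= 0.8                               -- illustrative confidence threshold
ORDER BY e.score DESC;
```

Applied to the sample output above, this filter would keep sofa, "concrete heated floors", Fireplace, and "queen sleeper sofa". The inner LIMIT 1 keeps the sketch to a single service call. Removing it sends every row's description to the Language resource.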
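Summary
Named entity recognition identifies and categorizes the entities in input text. The Azure Cognitive Services language models do the heavy lifting of natural language processing. The azure_ai extension for Azure Database for PostgreSQL provides the azure_cognitive.recognize_entities API, which gives access to named entity recognition directly from within SQL queries.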