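[Keywords]
[Abstract]
Objective To construct a TCM pattern normalization model based on a bidirectional representation neural network that normalizes the patterns in clinical records into the corresponding ICD-11 patterns and offers normalization prompts to clinicians as they record patterns. Methods Baseline models were first built with an RNN-based Encoder-Decoder structure and with a Multilabel structure combining an RNN and the Sigmoid function. Models were then built with a Multilabel structure combining BERT and the Sigmoid function and with the BERT-based candidate selection structure proposed in this study. Each model was evaluated by accuracy, precision, recall, and F1-score. The proposed multi-model decision-making error self-checking method was used to observe the models' ability to detect their own errors. Results The TCM pattern normalization model built on the BERT-based candidate selection structure performed best on every metric, exceeding 85% on all of them. The multi-model decision-making error self-checking method detected 87.23% of the erroneous results, showing that the model can check its own errors. Conclusion The TCM pattern normalization model can normalize the patterns written by physicians into the corresponding patterns in the ICD-11 and 《中医病证分类与代码》 (Classification and Codes of Diseases and Patterns of Traditional Chinese Medicine), and it has an error self-checking function. Applied to the normalization of TCM patterns, the model can provide strong tool support for standardized TCM diagnosis and treatment and for mining the principles of pattern differentiation and treatment in the era of big data.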
[Keywords]
[Abstract]
Objective To construct a TCM pattern normalization model based on BERT, to normalize the patterns written by TCM practitioners into the corresponding patterns defined by the ICD-11 and CTTCM, and to provide corresponding standardized alternatives when TCM practitioners describe patterns. Methods First, the Encoder-Decoder structure and the Multilabel structure, both based on a recurrent neural network, were used separately to construct TCM pattern normalization models as baselines. Then, the Multilabel structure based on bidirectional encoder representations from transformers (BERT) and the BERT-based candidate selection structure proposed herein were used separately to construct TCM pattern normalization models. Each model was evaluated in terms of accuracy, precision, recall, and F1-score. The ability of the model to detect incorrect outputs was observed through the proposed multi-model decision-making error self-checking method. Results The TCM pattern normalization model based on the BERT candidate selection structure had the best performance on all indicators, reaching over 89%. The multi-model decision-making error self-checking method detected 74.79% of the erroneous results, indicating that the model can detect incorrect outputs. Conclusion The TCM pattern normalization model can normalize the patterns written by TCM practitioners into the corresponding patterns defined by the ICD-11 and CTTCM and is equipped with an error self-checking function. Applying the model to the normalization of TCM patterns can provide a tool for the standardized diagnosis and treatment of TCM as well as for identifying principles of pattern differentiation in the context of big data.
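The evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the metrics are averaged or how the multi-model decision-making error self-checking method combines model outputs, so the following are assumptions made only for illustration: accuracy is exact-match over label sets, precision, recall, and F1-score are micro-averaged, and an output is flagged as suspect when two models disagree. All function names and the toy labels are hypothetical.

```python
from typing import Dict, List, Set


def multilabel_scores(gold: List[Set[str]], pred: List[Set[str]]) -> Dict[str, float]:
    """Exact-match accuracy plus micro-averaged precision, recall and F1 (assumed averaging)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # correctly predicted labels
    n_pred = sum(len(p) for p in pred)                # all predicted labels
    n_gold = sum(len(g) for g in gold)                # all reference labels
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold) if gold else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def flag_disagreements(primary: List[Set[str]], secondary: List[Set[str]]) -> List[bool]:
    """Assumed self-check rule: flag a record when the two models' outputs differ."""
    return [p != s for p, s in zip(primary, secondary)]


def error_detection_rate(gold: List[Set[str]], primary: List[Set[str]],
                         flags: List[bool]) -> float:
    """Share of the primary model's wrong outputs that were flagged."""
    errors = [i for i, (g, p) in enumerate(zip(gold, primary)) if g != p]
    if not errors:
        return 0.0
    return sum(flags[i] for i in errors) / len(errors)


# Toy data with placeholder labels (not real pattern names or study data).
gold = [{"a"}, {"b", "c"}, {"d"}, {"e"}]
primary = [{"a"}, {"b"}, {"x"}, {"e"}]         # hypothetical main model
secondary = [{"a"}, {"b", "c"}, {"x"}, {"e"}]  # hypothetical second model

scores = multilabel_scores(gold, primary)
flags = flag_disagreements(primary, secondary)
rate = error_detection_rate(gold, primary, flags)
print(scores)
print(flags, rate)
```

In this toy run the third record shows the limit of a disagreement rule: both models make the same mistake, so the error is not flagged, which is consistent with a detection rate below 100%.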
[Chinese Library Classification number]
R-058
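[Funding]
National Key Research and Development Program of China, Ministry of Science and Technology (2017YFC1700303): clinical decision support system for pattern differentiation and treatment based on the academic experience of renowned veteran TCM physicians. Principal investigator: Li Yuhang (李宇航).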