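Abstract
As society becomes increasingly information-driven, traditional relational databases can no longer satisfy certain applications. Adding new capabilities on top of relational databases has become a mainstream research direction. Full-text search is one capability that database systems urgently need. Once a database supports full-text search, users can issue keyword queries through SQL statements and can also carry out complex queries involving aggregation, joins, and similar operations, which ordinary search engines cannot do.
Meanwhile, as XML gradually becomes the standard for data exchange, querying XML documents is also a current research focus. Most work so far concentrates on structured queries over XML documents, while keyword search over XML is still at an early stage. Why does a structured XML document still need keyword search? Keyword search has its own strengths: users need to know neither the structure of the XML nor a complex XML query language. Ordinary users prefer this kind of simple keyword search, so XML keyword search has very broad application prospects.
This thesis builds on CoDB, a relational database developed by the Database Group at Peking University, and designs and implements XML full-text search on top of it. Our system has the following features:
It supports search over XML documents with controllable query granularity: results can be returned at the level of XML elements or at the level of whole documents.
Full-text search in CoDB is tightly integrated with the database query engine, so users can run fairly complex keyword-based queries.
A new self-indexing inverted structure is designed that is well suited to XML full-text search.
It supports ranking by the importance of XML documents and by the importance of XML elements.
Experiments show that queries using our full-text search run somewhat faster than in SQL Server, and that its full-text search functionality is also slightly richer than SQL Server's.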
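The abstract does not describe the self-indexing inverted structure itself, so the following is only a generic sketch of how an inverted index keyed by Dewey IDs can answer keyword queries at either element or document granularity. The function names (`build_index`, `lowest_common_ancestor`, `search`), the Dewey numbering scheme, and the "common ancestor" result semantics are all assumptions made for illustration, not CoDB's design.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_index(docs):
    """Map each keyword to the Dewey IDs of the elements whose text contains it.

    Assumed numbering: a Dewey ID is a tuple (doc_id, child_pos, ...), so the
    root element of document 0 is (0,) and its second child is (0, 1).
    Only text before an element's first child is indexed in this sketch.
    """
    index = defaultdict(set)

    def visit(elem, dewey):
        for word in (elem.text or "").lower().split():
            index[word].add(dewey)
        for pos, child in enumerate(elem):
            visit(child, dewey + (pos,))

    for doc_id, xml_text in enumerate(docs):
        visit(ET.fromstring(xml_text), (doc_id,))
    return index


def lowest_common_ancestor(a, b):
    """Longest common prefix of two Dewey IDs; () if they are in different documents."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def search(index, keywords, level="element"):
    """Return Dewey IDs of elements (or documents) that cover every keyword."""
    postings = [index.get(k.lower(), set()) for k in keywords]
    if not postings or not all(postings):
        return set()
    if level == "document":
        # Document granularity: intersect on the first Dewey component only.
        doc_sets = [{d[0] for d in p} for p in postings]
        return {(d,) for d in set.intersection(*doc_sets)}
    # Element granularity: common ancestors of one match per keyword.
    results = set(postings[0])
    for p in postings[1:]:
        results = {lowest_common_ancestor(a, b) for a in results for b in p}
        results.discard(())  # matches from different documents share no ancestor
    return results


docs = [
    "<book><title>XML search</title><author>Li</author></book>",
    "<book><title>SQL tuning</title><note>XML export</note></book>",
]
index = build_index(docs)
print(sorted(search(index, ["xml", "search"])))          # [(0, 0)]
print(sorted(search(index, ["xml", "sql"])))             # [(1,)]
print(sorted(search(index, ["xml"], level="document")))  # [(0,), (1,)]
```

Because a Dewey ID encodes the path from the document root, the ancestor shared by two matches is just their common prefix, which is why the same posting lists can serve both granularities without a separate document-level index.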
Keywords
CoDB, XML, full-text search, query engine, self-indexing inverted index, Dewey ID
Abstract
With the development of information technology, traditional database systems can no longer meet the needs of some applications. Adding new functions to the DBMS has become a mainstream research direction in databases. Full-text search is one such function that should be implemented in a DBMS. Once it is, users can run keyword searches with SQL, including complex queries such as aggregation and join operations, which traditional search engines cannot perform.
On the other hand, XML is becoming the standard for information exchange, and querying XML documents is another active research area. Current work focuses mainly on structured XML query languages, whereas keyword-based XML search is still at an early stage. Why do we need keyword search over XML documents? Its advantage is that users need neither remember the structure of an XML document nor be familiar with XML query languages. Because most users prefer keyword search, keyword search over XML documents has broad application prospects.
This paper describes how we designed and implemented XML full-text search in CoDB, a DBMS developed by the Database Group of Peking University. Our main contributions are as follows:
We support keyword search of XML documents. S