cs.purdue.edu

Harvesting Relational Tables from Lists on the Web

Authors: Elmeleegy, Hazem; Madhavan, Jayant; Halevy, Alon
Year: 2009

A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more semantically meaningful tasks. However, harvesting relational tables from such lists can be a challenging task. The lists are manually generated and hence need not have well-defined templates – they have inconsistent delimiters (if any) and often have missing information.
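
To make the difficulty concrete, the following minimal sketch shows what can go wrong with a naive approach. It is not the authors' method: the list items, the delimiter set, and the helper names `naive_split` and `to_rows` are all assumptions invented for illustration. It splits each item on a few common delimiters and shows that an item with missing information yields a row with fewer fields, so its values land in the wrong columns.

```python
import re

# Hypothetical list items, invented for illustration (not taken from the paper).
ITEMS = [
    "Casablanca, 1942, Michael Curtiz",
    "Vertigo - 1958 - Alfred Hitchcock",   # different delimiter
    "Psycho; Alfred Hitchcock",            # the year is missing
]


def naive_split(line):
    """Split one list item on a comma, a semicolon, or a spaced hyphen."""
    fields = re.split(r"\s*[,;]\s*|\s+-\s+", line)
    return [f.strip() for f in fields if f.strip()]


def to_rows(lines):
    """Apply the naive splitter to every list item."""
    return [naive_split(line) for line in lines]


rows = to_rows(ITEMS)
widths = [len(row) for row in rows]

for row in rows:
    print(row)
print("fields per row:", widths)
```

In this sketch the third row has only two fields, and the director's name falls into the position that holds the year in the other rows. Fixed delimiter rules alone cannot tell which column is missing, which is the kind of inconsistency the abstract describes.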
