Web Page Information Extraction Using Deep Learning

Authors

  • D. Baswaraj, Sasi Bhanu J, Surya Kumar, K. Venkata Raju, D.B.K. Kamesh

Abstract

In many companies, business units that sell online need all kinds of reference data about the
market. Websites that sell similar products can be used to collect this data, which may include
prices, content, surveys, and so on, in a predefined format. The methods used in the data collection
process generally fall into three main groups: (1) manual, (2) semi-manual, and (3) automatic.
Statically coded data collectors (types 1 and 2) cannot collect reliable data in the long term and
require continuous development and maintenance effort, because web pages are dynamic and their
designs change frequently. In this study, a data scraping application (type 3) that is not affected
by structural changes in web pages was developed. The study aims to obtain data from images of web
pages using deep convolutional neural networks (CNNs).
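
The maintenance problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' application: the parser class, the function name, the class names "price" and "amount-v2", and both sample pages are hypothetical, chosen only to show how a statically coded collector stops returning data after a redesign even though the price shown to the user has not changed.

```python
from html.parser import HTMLParser


class StaticPriceParser(HTMLParser):
    """Statically coded collector: only reads <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # The tag and class name are hard-coded; this is the brittle part.
        if tag == "span" and dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


def extract_prices(html):
    """Return every price the hard-coded rule can find in the markup."""
    parser = StaticPriceParser()
    parser.feed(html)
    return parser.prices


# Hypothetical markup before and after a redesign (assumed for illustration).
old_page = '<div><span class="price">19.99</span></div>'
new_page = '<div><p class="amount-v2">19.99</p></div>'

print(extract_prices(old_page))  # ['19.99']
print(extract_prices(new_page))  # [] (the rule no longer matches)
```

Both pages would render the same visible price, which is why an approach that reads the rendered page image, as the study proposes, does not depend on markup details of this kind.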

Published

2020-11-01

Section

Articles