There are huge volumes of useful health and safety information openly available on the Internet. This includes information pertaining to industrial accidents that have happened around the world, information on new and emerging industry risks, new approaches to mitigating such risks, new regulatory requirements, industry standards, industry guidance and industry good practice. All such information has huge learning value for the Discovering Safety Programme.
Aims and objectives
The aim of this feasibility study was to explore the potential of using the website monitoring and web scraping capabilities of the technology company Polecat, along with its RepVault platform, to identify and harvest health and safety information from websites of value to the Discovering Safety Programme, integrate it within the RepVault platform, and develop enhanced functionality enabling it to be expediently interrogated using the platform. The vision for the deliverable was an interactive map of the world made up of a number of information layers, similar to a geographic information system, in which each layer holds a different category of health and safety information that can be explored using the map as an intelligent interface.
Specific objectives were five-fold:
- Agree health and safety topic areas to provide the focus of the map layers
- Build web search terms, and harvest information content and associated metadata (e.g. information source, date of publication, company name)
- Geotag and time-stamp information records, and flag other pertinent contextual data relating to them
- Integrate the records within the RepVault platform and create search functionality enabling intelligent interrogation by a user
- Test the prototype platform
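
The objectives above can be read as a simple data pipeline: each harvested item carries metadata, a geotag, a time stamp and a topic area, and the topic area decides which map layer it belongs to. The Python sketch below is only an illustration of that idea under stated assumptions. It does not describe how RepVault stores or searches records. The `SafetyRecord` class, its field names, the `search` function and the three sample records are all hypothetical and made up for the example, and the geotag is simplified to a country name.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SafetyRecord:
    """One harvested item. All field names are hypothetical."""
    title: str
    source: str       # information source, e.g. the publishing website
    published: date   # time stamp (date of publication)
    topic: str        # health and safety topic area, i.e. the map layer
    country: str      # geotag, simplified here to a country name
    company: Optional[str] = None  # company name, when one is known


def search(records, topic=None, country=None, since=None):
    """Return the records that match every filter that was supplied."""
    return [
        r for r in records
        if (topic is None or r.topic == topic)
        and (country is None or r.country == country)
        and (since is None or r.published >= since)
    ]


# Made-up records, for illustration only.
records = [
    SafetyRecord("Example incident report", "example.org",
                 date(2020, 3, 1), "industrial accidents", "France"),
    SafetyRecord("Example guidance note", "example.com",
                 date(2019, 6, 15), "industry guidance", "France"),
    SafetyRecord("Example regulation update", "example.net",
                 date(2021, 1, 10), "regulatory requirements", "Japan"),
]

# Records geotagged to France and published since the start of 2020.
recent_french = search(records, country="France", since=date(2020, 1, 1))
```

Filtering by `topic` corresponds to selecting one map layer, while `country` and `since` narrow the selection by place and time.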