Peter Murray-Rust - Thursday 14 April 2016

OpenForum receives the financial support of the Fédération Wallonie-Bruxelles and FNRS-FRS

Biography of Peter Murray-Rust

Professor Peter Murray-Rust is now Reader Emeritus in Molecular Informatics at the University of Cambridge and Senior Research Fellow Emeritus of Churchill College, Cambridge.

His research interests have included the automated analysis of data in scientific publications, the creation of virtual communities (e.g. the Virtual School of Natural Sciences in the Globewide Network Academy), and the Semantic Web.

He campaigns for open data, particularly in science, and is on the advisory board of the Open Knowledge Foundation and a co-author of the Panton Principles for Open scientific data.

In 2011 he and Henry Rzepa were joint recipients of the Herman Skolnik Award of the American Chemical Society.

In 2014, Murray-Rust was granted a Fellowship by the Shuttleworth Foundation for the ContentMine project, which uses machines to liberate 100,000,000 facts from the scientific literature.

Murray-Rust is also known for his work on making scientific knowledge from the literature freely available, and in doing so has taken a stance against publishers that are not fully compliant with the Berlin Declaration on Open Access. In 2014 he actively raised awareness of glitches in Elsevier's publishing system, where Elsevier imposed restrictions on the reuse of papers even after the authors had paid Elsevier to make them freely available.

A longer biography can be found on the Wikipedia page on Peter Murray-Rust, and much more on Peter Murray-Rust's blog.

Title of the talk

Open Data & Open Access : getting more from scientific papers with content mining, Thursday 14 April 2016, 18h30, University Foundation (on a map), Brussels

Short abstract

There are several thousand scientific papers published each day, and nobody can keep up with them. If they are Open Access, they can be aggregated in a single place, such as the repositories CORE (UK), HAL (FR), and Europe PubMed Central (for biomedical papers).

It's then possible to use machines to help us filter them on scientific grounds and select exactly those sections of each paper that the reader wants to read. It's also possible to extract chunks of scientific knowledge such as molecular structures or evolutionary trees and compute completely new knowledge.
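The filtering step described above can be sketched in a few lines. This is a hypothetical mini-example, not ContentMine's actual code: the paper records, field names, and the `filter_papers` helper are all invented for illustration.

```python
# Hypothetical sketch of machine-assisted filtering: given a small corpus of
# (title, abstract) records, keep only the papers whose abstract mentions a
# query term. The papers below are invented examples, not real publications.

papers = [
    {"title": "Zika vector competence", "abstract": "Aedes aegypti and Zika virus ..."},
    {"title": "Malaria prophylaxis", "abstract": "Chloroquine resistance ..."},
]

def filter_papers(papers, term):
    """Return only the papers whose abstract mentions `term` (case-insensitive)."""
    return [p for p in papers if term.lower() in p["abstract"].lower()]

hits = filter_papers(papers, "zika")
print([p["title"] for p in hits])  # titles of the matching papers
```

A real pipeline would fetch abstracts from a repository such as Europe PubMed Central and apply much richer filters, but the principle, select on content rather than reading everything, is the same.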

I shall demo this system using at least two examples:

  • The "Zika epidemic". What do we actually know about Zika from the peer-reviewed literature? How does it link to other Open Scientific Knowledge?
  • Clinical trials. Europe and other regions have collected 400,000 clinical trials. Can we search them? What procedures were used? How many patients? And, very importantly, has the trial been reported in the recent literature?

This presentation will be accessible to anyone: school students, scientists, policy makers, data journalists, etc.

All content and tools are free and open, and can be used by anyone.

A short video, shot in real time (5 mins), demonstrates that any citizen can access knowledge on that timescale.

Hackday on Friday 15 April 2016, 9h at the École supérieure d'informatique

located at Rue Royale 67, 1000 Brussels

The hackday will explore the automatic extraction of facts from documents, especially (but not exclusively) in science and medicine. By default we can extract:

  • species
  • DNA
  • places
  • genes
  • word frequencies
  • drugs
  • organizations

Participants can also create their own word lists and regular expressions.
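A custom word list plus a regular expression is enough to get started. The sketch below is illustrative only, in the spirit of ContentMine-style extraction but not its actual code: the sample text, the drug list, and the species regex are all assumptions made up for this example.

```python
import re

# Hypothetical mini-example: extract candidate facts from a passage using a
# user-supplied word list ("dictionary") and a regular expression.
# The text, drug list, and pattern below are illustrative, not real data.

text = "In 2015, chloroquine was tested against Aedes aegypti populations."

# A custom word list of drugs the participant cares about.
drug_list = {"chloroquine", "ivermectin"}
drugs_found = sorted(w for w in drug_list if w in text.lower())

# A crude regular expression for binomial species names, e.g. "Aedes aegypti":
# a capitalised genus followed by a lowercase epithet.
species_pattern = re.compile(r"\b[A-Z][a-z]+ [a-z]{3,}\b")
species_found = species_pattern.findall(text)

print(drugs_found)
print(species_found)
```

Real extraction needs curated dictionaries and better patterns (the regex above would also match ordinary capitalised phrases), but even this crude version shows how participants' own word lists and regular expressions plug into the process.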

By default we'll use the Open Access scientific literature, but we can also look at any easily retrieved public documents (e.g. from governments or NGOs).