Access to and reuse of EU legal information

Report of a one-day conference organised by the Publications Office of the European Union, Brussels, 21 March 2016. Paul Magrath, of ICLR, was there.

“Information is the currency of democracy”, says Tibor Navracsics, European Commissioner for Education, Culture, Youth and Sport. He’s quoting Thomas Jefferson, according to Twitter, where the hashtag for this morning’s plenary session is #EULawData4Reuse.

Today we’re not just talking about any old information. We’re talking about legal information. Specifically, about EU legal data – which embraces legislation, case law, treaties and commentary – and about its availability for access and for reuse by citizens, organisations and other publishers. Such reuse will, says Tibor, boost the digital economy. Moreover, it will make the EU decision-making process more transparent.

EurLex holds over 9m documents and receives over 70m visits per year. Its legislation must be available in 24 official languages. The EU Publications Office would like to be proactive in making it all available for reuse.

EU legal information is part of a larger ecosystem that also comprises EU member states’ law. And in the UK, the legal sector is worth a massive £27bn, as we are reminded by the next speaker, Carol Tullo, Director of Information Policy and Services at the National Archives.

#eulawdata4reuse Presentation by Carol Tullo, National Archives UK. I like it. pic.twitter.com/zE8AHF3kGU

— Monique Dejeans (@moniquedejeans) March 21, 2016

She uses the story of how Legislation.gov.uk put the statute book online as an example of the UK approach to open legal data. The story may be familiar to readers of this blog, but for the audience in Brussels it’s a pioneering tale of obstacles overcome in the creation of the world’s first open data legislation service. “Information,” says Carol, “is at the heart of our strategy.”

The legislation.gov.uk platform now has 6.5m web pages and 160,000 documents and is constantly growing. It serves four different jurisdictions, within which any particular enactment may come into force on different dates and with different geographical extents. It includes research tools enabling users to weigh, measure and analyse the contents of the statute book. Though it is one of the most capable platforms of its type in the world, most of its users are non-technical. Over 60% of visits to the site are from a web search.

Yet the one thing professional users are most anxious about is how up to date its contents are, given the susceptibility of legislation to amendment and repeal. Carol reassures us: it is all now fully annotated (ie all the effects have been captured and are visible), and 77.2% of the primary legislation content has been fully consolidated, which means the amendments have all been incorporated into the text, often with a timeline of changes.

Since new provisions and effects are being added daily, it will always be a work in progress, but Carol recognises that “people clamour for certainty” and that authenticity, accuracy and status are “critical for the credibility of the site”.

It’s something ICLR has been all too aware of since we incorporated search and retrieval of UK legislation into our platform last year, using the legislation.gov.uk API (application programming interface). That makes ICLR a classic example of a reuser of open legal data, and our platform is also an example of legislation being presented alongside case law, though it is early days and there is a lot more we plan to do in that regard, using the fruits of our participation in the Big Data for Law project with the National Archives.
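For readers curious what reuse through such an API looks like in practice, here is a minimal Python sketch that builds the address of an enactment's XML on legislation.gov.uk. It assumes the site's URI pattern of type code, year and number with /data.xml appended. The function name and the example Act are purely illustrative, and this is not a description of ICLR's own code.

```python
BASE_URL = "https://www.legislation.gov.uk"


def legislation_xml_url(doc_type: str, year: int, number: int) -> str:
    """Build the URL of the XML version of an enactment.

    doc_type is the short type code used in the site's URLs, for example
    "ukpga" for a UK Public General Act. The overall pattern is an
    assumption made for this sketch.
    """
    return f"{BASE_URL}/{doc_type}/{year}/{number}/data.xml"


# Example: the Equality Act 2010 is chapter 15 of the 2010 Acts.
print(legislation_xml_url("ukpga", 2010, 15))
```

A reuser would then fetch that address and parse the XML. The sketch stops short of the network call so that it runs anywhere.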

Impression from plenary session of #EUlawdata4reuse conference in Brussels pic.twitter.com/F90F3Xkoez

— Willem van Gemert (@wvangemert) March 21, 2016

Back to the conference: the next presentation is from Reijo Kemppinen, Director General of Communications and Document Management at the European Council and Council of the European Union. He begins by noting that the audience today is one composed of experts: lawyers, computer scientists, librarians, etc. Their common interest is information in a digital format.

Europe is, Reijo notes, now working towards a principle of reuse of ALL public information. Transparency means something more than just providing access to documents. Technology cannot and should not be the driver for change. People should drive change: we have to learn from each other. He concludes by expressing the hope that those on the panel can learn from those in the audience today.

Dr Radboud Winkels, Dean of PPLE College and Associate Professor in Computer Science and Law at the Leibniz Centre for Law at the University of Amsterdam, introduces a BOLD theme: Big Open Legal Data.

He reminds us that Leibniz was not just a philosopher but applied maths to law. In a sense he was the founding father of computer science for law.

There are many sources of law online, but not all are available to all, nor are they interlinked between documents and across jurisdictions. Initiatives like the European Case Law Identifier (ECLI – a system of neutral citations across all European jurisdictions) and the European Legislation Identifier (ELI) aim to facilitate that.
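To make the identifier idea concrete, here is a minimal Python sketch of how a reuser might split an ECLI into its parts. It assumes the five-part, colon-separated form: the prefix ECLI, a country code, a court code, a year and an ordinal number. The function name, the simplified pattern and the sample identifier are my own illustrative assumptions, not anything shown at the conference or taken from an official tool.

```python
import re
from typing import NamedTuple, Optional


class Ecli(NamedTuple):
    country: str  # e.g. "EU" for the Court of Justice of the EU
    court: str    # court code within that country
    year: int     # year of the decision
    ordinal: str  # sequence number, which may contain dots


# Simplified pattern (an assumption for this sketch):
# ECLI:<country>:<court>:<year>:<ordinal>
_ECLI_RE = re.compile(
    r"^ECLI:([A-Z]{2}):([A-Z][A-Z0-9]{0,6}):(\d{4}):([A-Z0-9.]{1,25})$",
    re.IGNORECASE,
)


def parse_ecli(text: str) -> Optional[Ecli]:
    """Return the parts of an ECLI, or None if the text does not fit."""
    match = _ECLI_RE.match(text.strip())
    if match is None:
        return None
    country, court, year, ordinal = match.groups()
    return Ecli(country.upper(), court.upper(), int(year), ordinal.upper())


# A made-up sample identifier, used only to show the shape.
print(parse_ecli("ECLI:EU:C:2016:123"))
```

Because every part sits in a fixed position, the same few lines work whichever jurisdiction issued the decision, which is the point of a neutral citation.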

Collections of data can be enhanced both by human enrichment (crowdsourcing info, collecting, annotating, translating etc) and by automatic enrichments, using both natural language processing and network analysis. Going further, one could develop software offering legal predictions (eg as to the likelihood of a particular outcome, based on big data analytics) or finding the most authoritative cases on a particular point.

In terms of multi-jurisdictional platforms, he gives the example of the EU Openlaws project (which I’ve described in an earlier blog post here): basically an interface on top of existing platforms, offering an integrated / federated search of, access to and storage of data from a number of underlying platforms. It would also offer user tools, such as storage of saved searches (with the resulting collection of documents automatically updated) and the social enrichment of content through the integration of personal legal data and commentary.

Finally, Denis Berthault, Director of Online Content Development at LexisNexis, based in France, explains how a big commercial legal information publisher can reuse EU legal data and incorporate it into the content on its platform.

However, he has some statistics that might dampen the enthusiasm of the reuse champions. Lexis has been using EurLex content including Treaties, Legislation, EU Materials and Cases for 25 years. It amounts to 10% of the content on Lexis, some 430,000 documents and 30,000 cases. It is indexed and interlinked together with other legal content by reference to a common taxonomy used in all the countries where Lexis operates. Yet usage is very low in every country, despite the size of the content. Typically, usage is less than 0.1%. In the UK, it is only 0.07% for legislation and only 0.06% for “jurisprudence” (which I take to include case law).

The impression he gets is that Lexis customers, who are expert lawyers, prefer reading professional analysis to consulting the original provisions. An EU Tracker product which Lexis developed about seven years ago had to be dropped about five years later because nobody was using it. (They are uncooperative in other ways, too, he notes, such as preferring PDF documents – because lawyers love paper, and PDF is just like paper.)

All that said, the EU Publications Office has been doing tremendous work, and as a private (commercial) publisher, Lexis loves it.

#reuse workshop @ #EUlawdata4reuse takes off with a full house pic.twitter.com/0VR2wh45NQ

— EU Open Data (@EU_opendata) March 21, 2016

After a break for lunch (during which I chat with Carol Tullo, Judith Riley and Catherine Tabone of the legislation.gov team), the conference divides into three workshop sessions:

EUR-Lex: access to EU legal information
Reuse of EU legal data
European Legislation Identifier (ELI)

I join the session on the Reuse of EU legal data (see image: excellent view of the back of my head). My plan is simple: to find out how ICLR can do with EU legislation and treaties sourced via EurLex what we have already done and still plan to do with UK legislation sourced via the National Archives.

Paul Magrath is Head of Product Development and Online Content at ICLR.

Engagement welcome via Twitter: @maggotlaw @TheICLR