23 November 2010

Open data is the electricity of the 21st century « Web of Data

Lonclass and RDF

Lonclass and RDF: "Lonclass and RDF
By DANBRI | Published: 2010-11-18
Lonclass is one of the BBC’s in-house classification systems – the “London classification”. I’ve had the privilege of investigating lonclass within the NoTube project. It’s not currently public, but much of what I say here is also applicable to the Universal Decimal Classification (UDC) system upon which it was based. UDC is also not fully public yet; I’ve made a case elsewhere that it should be, and I hope we’ll see that within my lifetime. UDC and Lonclass have a fascinating history and are rich cultural heritage artifacts in their own right, but I’m concerned here only with their role as the keys to many of our digital and real-world archives.

Why would we want to map Lonclass or UDC subject classification codes into RDF?"

Universal Decimal Classification: Announcement: Classification & Ontology: International UDC Seminar 2011

culturegraph

culturegraph: "culturegraph.org is a Linked Open Data service that aims to establish shared identifiers (Uniform Resource Identifiers) for cultural works (books and other text, paintings, sculptures, pieces of music etc.) to ensure these resources can be reliably and persistently referenced. The service is currently being developed cooperatively by the German National Library (DNB) and the North Rhine-Westphalian Library Service Center with support from the German-speaking Working Group of Library Networks."

The Nature of Connectedness on the Web » AI3:::Adaptive Information

The Nature of Connectedness on the Web » AI3:::Adaptive Information: "What does it mean to interoperate information on the Web? With linked data and other structured data now in abundance, why don’t we see more information effectively combined? Why express your information as linked data if no one is going to use it?

Interoperability comes down to the nature of things and how we describe those things or quite similar things from different sources. This was the major thrust of my recent keynote presentation to the Dublin Core annual conference. In that talk I described two aspects of the semantic “gap”:

One aspect is the need for vetted reference sources that provide the entities and concepts for aligning disparate content sources on the Web, and
A second aspect is the need for accurate mapping predicates that can represent the often approximate matches and overlaps of this heterogeneous content.
I’ll discuss the first “gap” in a later post. What we’ll discuss here is the fact that most relationships between putatively same things on the Web are rarely exact, and are most often approximate in nature."

Importing lots of nodes from DBpedia -OR- Import under request | groups.drupal.org

LISTSERV 15.5 - NGC4LIB Archives

LISTSERV 15.5 - NGC4LIB Archives: "The Variations/FRBR Project at Indiana University (http://vfrbr.info) has released version 1.1 of a set of XML Schemas designed for the representation of FRBR (http://www.ifla.org/en/publications/functional-requirements-for-bibliographic-records) data in XML. The 1.1 Schema release represents some significant improvements over our earlier 1.0 release, particularly in the handling of FRBR relationships. As before, the Variations/FRBR XML Schemas are defined at three 'levels': frbr, which embodies faithfully only those features defined by the FRBR and FRAD reports; efrbr, which adds additional features we hope will make the data format more 'useful'; and vfrbr, which both contracts and extends the FRBR and FRAD models to create a data representation optimized for the description of musical materials and we hope provides a model for other domain-specific applications of FRBR."

Querying the British National Bibliography

Following up on the earlier announcement that the British Library has made the British National Bibliography available under a public domain dedication, the JISC Open Bibliography project has worked to make this data more useable.

The data has been loaded into a Virtuoso store that is queryable through the SPARQL endpoint, and the URIs that we have assigned to each record use the ORDF software to make them dereferenceable, supporting content auto-negotiation as well as embedding RDFa in the HTML representation.
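A minimal sketch of what content negotiation against a dereferenceable record URI could look like from the client side. The record URI is the sample resource this post cites; the media types in the table and the helper name `negotiated_request` are assumptions for illustration, since the post does not list which serializations the service offers. The request is only built here, not sent.

```python
from urllib.request import Request

# Sample record URI cited in the post.
SAMPLE_URI = "http://bnb.bibliographica.org/entry/GB8102507"

# Assumed media types; the post does not say which ones the service supports.
ACCEPT_BY_FORMAT = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "html": "text/html",
}


def negotiated_request(uri, fmt):
    """Build (but do not send) a GET request asking for one representation.

    Hypothetical helper: the Accept header tells the server which
    serialization of the same resource the client prefers.
    """
    return Request(uri, headers={"Accept": ACCEPT_BY_FORMAT[fmt]})


req = negotiated_request(SAMPLE_URI, "rdfxml")
print(req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch; whether the service still answers at that address is not something this sketch assumes.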

The data contains some 3 million individual records and some 173 million triples. Indexing the data was a very CPU intensive process taking approximately three days. Transforming and loading the source data took about five hours.

To get an idea of the shape of the data, let us consider a sample resource, http://bnb.bibliographica.org/entry/GB8102507 . Apart from linkage between the various representations, the description of the entity itself is as follows

JISC Beginner’s Guide to Digital Preservation

JISC Beginner’s Guide to Digital Preservation: "JISC Beginner’s Guide to Digital Preservation
Welcome to the JISC Beginner’s Guide to Digital Preservation

The Guide has been written for those working on JISC projects who would like help with preserving their outputs.

It is aimed at those who are new to digital preservation but can also serve as a resource for those who have specific requirements or wish to find further resources in certain areas."

16 November 2010

New directions in web architecture. Again. - O'Reilly Radar

Open Bibliographic Data Guide

Open Bibliographic Data Guide: "The advice is both general and specific. The guide seeks to clarify in general terms and also in the context of 17 specific Use Cases:

How to license the data
Legal issues to be considered
Potential costs and savings
Practical implications in terms of processes, effort and skills
Data formats and other technical options
These Use Cases cover things you might already do or plan to do as you develop your library service. The Guide provides the rationale and the potential ripple effects of doing those things based on Open Data."

Where Cinema and Biology Meet | KurzweilAI

New supercomputer rating system proposed | KurzweilAI

Youngest-ever nearby black hole discovered | KurzweilAI

inkdroid › iogdc ramblings

inkdroid › iogdc ramblings: "There was a question about how to make Linked Data relevant to folks whose focus is Enterprise data. In my opinion Linked Data advocates overemphasize the importance of using RDF and SPARQL (standards), and converting all the data over without completely understanding how invasive these solutions are. Not enough is done to show enterprise data folks, who typically think in terms of relational databases, what they can do to put their lovingly crafted and hugged data on the web. Consider a primary key in a database: what does it identify, what relations does that thing have with other things? Why not use that key in constructing a URL for that thing, and link things together using the URLs? Then other people could use your URLs as well in their own data. I think the drumbeat to use SPARQL and triple stores often misses explaining this fundamental baby step that data owners could take. As Derek Willis said (on the 2nd day, when I’m writing this), people want to use your data, but not your database…people want to browse your data using their web browser. Assigning URLs to the important stuff in your databases is the first important step to make with Linked Data."
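The "baby step" described in this excerpt can be sketched in a few lines: take a row's primary key, build a URL from it, and express foreign keys as links to other URLs. Everything below is hypothetical and for illustration only: the base address `http://example.org`, the `book` and `author` tables, their rows, and the helper names are assumptions, not anything the quoted post specifies.

```python
from urllib.parse import quote

# Hypothetical base address for the published data.
BASE = "http://example.org"

# Hypothetical rows, keyed by primary key, standing in for database tables.
books = {42: {"title": "Sample title", "author_id": 7}}
authors = {7: {"name": "Sample author"}}


def url_for(table, primary_key):
    """Mint a URL for a thing from its table name and primary key."""
    return f"{BASE}/{quote(table)}/{quote(str(primary_key), safe='')}"


def describe_book(book_id):
    """Describe a book, replacing the foreign key with a link to the author."""
    row = books[book_id]
    return {
        "url": url_for("book", book_id),
        "title": row["title"],
        "author": url_for("author", row["author_id"]),
    }


print(describe_book(42))
```

The point of the design is that the key never leaves the database as a bare number: anyone who wants to refer to the same book or author can reuse the URL in their own data.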

Open government and "next generation democracy" - O'Reilly Radar

Where the semantic web stumbled, linked data will succeed - O'Reilly Radar

Gapminder: Unveiling the beauty of statistics for a fact based world view. - Gapminder.org

PDF/A: A Viable Addition to the Preservation Toolkit

PDF/A: A Viable Addition to the Preservation Toolkit: "PDF/A, the archival version of the PDF file format, is an International Organization for Standardization (ISO) vetted, open source tool that can be added to the librarian's and archivist's preservation toolkit. This article describes the format itself, the lessons learned as the authors investigated the tools readily available for creating PDF/A files and the design of the pilot to test implementation of the use of the format in The Ohio State University's repository, the Knowledge Bank. Further, we identify issues in conversion of diverse original formats; strategies for time-saving batch conversion; and considerations in deciding whether to attempt full or partial compliance with the standard."

Trends in Large-Scale Subject Repositories

Trends in Large-Scale Subject Repositories: "Conclusion

This study illustrates that there are a number of trends among the ten largest subject repositories:

the most populated subject repositories were established before 2000, with the exception of PMC
most of the top ten repositories are inter- and multidisciplinary
the sciences and social sciences are predominant
the use of local software was more common for subject repositories until the launch of open source repository software in 1997
'articles,' or pre- or post-prints, is the only common content type
deposits are moderated
repositories discourage withdrawal of materials
submitters are responsible for copyright policies
most repositories are hosted by university libraries or departments"

The Strongest Link: Libraries and Linked Data

The Strongest Link: Libraries and Linked Data: "Abstract

Since 1999 the W3C has been working on a set of Semantic Web standards that have the potential to revolutionize web search. Also known as Linked Data, the Machine-Readable Web, the Web of Data, or Web 3.0, the Semantic Web relies on highly structured metadata that allow computers to understand the relationships between objects. Semantic web standards are complex, and difficult to conceptualize, but they offer solutions to many of the issues that plague libraries, including precise web search, authority control, classification, data portability, and disambiguation. This article will outline some of the benefits that linked data could have for libraries, will discuss some of the non-technical obstacles that we face in moving forward, and will finally offer suggestions for practical ways in which libraries can participate in the development of the semantic web."

15 November 2010

Presentation of interest focusing on Research and Next-Gen Catalogues « The Cataloguing Librarian

Presentation of interest focusing on Research and Next-Gen Catalogues « The Cataloguing Librarian: "Amy Eklund gave a very good presentation on the shortage of research we have examining next generation catalogues, and areas that need to be explored.

Key points?

We should examine next generation catalogues because:
1. So far, a 'build it and they will come' approach has been taken with these catalogues;
2. Discovery tool overlays, such as Encore and AquaBrowser, are not integrated with the catalogue, but sit on top, like an interface;
3. Next generation catalogue features are not based on a large body of evidence; and
4. Rich content contained in our bibliographic records is still not being used to its greatest potential."

Goddard

Linked Data tools: Semantic Web for the masses / Lisa Goddard and Gillian Byrne

Abstract
Semantic Web technologies have immense potential to transform the Internet into a distributed reasoning machine that will not only execute extremely precise searches, but will also have the ability to analyze the data it finds to create new knowledge. This paper examines the state of Semantic Web (also known as Linked Data) tools and infrastructure to determine whether semantic technologies are sufficiently mature for non-expert use, and to identify some of the obstacles to global Linked Data implementation.

Sindice - The semantic web index

Sindice - The semantic web index: "Sindice - Data Web Services

Billions of pieces of reusable information can already be found across hundreds of millions of web pages which embed RDF and Microformats. Start consuming this data today with Sindice Data Web services."

Swoogle Semantic Web Search Engine

Cloud Computing Needs Standards - Utility Computing

Cloud Computing Needs Standards - Utility Computing: "Cloud computing will not reach its full potential until management and contextual standards are fully developed and stable -- so buyers of cloud services need to choose carefully."

A Simple HTML5 RDFa Example « 3kbo

A Simple HTML5 RDFa Example « 3kbo: "A Simple HTML5 RDFa Example
As part of learning HTML5 and RDFa I put together a Simple HTML5 RDFa Example, using a photo Irene took of Minoan Figurines during a trip to Crete for the main content."

Resource Discovery Taskforce

Resource Discovery Taskforce: "The resource discovery taskforce (RDTF) vision poses a number of challenging technical questions such as:

What is an aggregation?
How do institutions contribute open metadata?
What metadata and standards do we use?
How do you build interfaces that developers will be keen to use?
What needs to be done to existing services and aggregations?"

Open Bibliographic Data Guide

Open Bibliographic Data Guide: "It’s all about the business case
Andy McGregor, the JISC Programme Manager explains:

Why are libraries around the world devoting time and resources to releasing their bibliographic data under an open licence? What’s in it for them and what are the costs and practical issues involved? JISC’s purpose for this guide is to try and provide some answers to these questions and to help academic librarians think about the potential implications for their own library.

One of the possibilities that open bibliographic data offers is the chance for libraries and indeed anyone to reuse the data to build innovative services for researchers, teachers, students and librarians. JISC will be exploring these possibilities through the work of the Resource Discovery Task Force."

Digital Libraries Initiative - Member States Expert Group (MSEG) | Europa - Information Society

Digital Libraries Initiative - Member States Expert Group (MSEG) | Europa - Information Society: "NEW Public hearing of the Comité des Sages on Bringing Europe's Cultural Heritage Online

On 28 October 2010 the Comité held a public hearing to gather the stakeholders' views to feed its reflection and the production of its final report.

Agenda
Video of the hearing Part 1 Part 2 (original language EN; with FR, DE interpretations)
The following position papers have been submitted to the Comité:"

Institutional Repository Bibliography

Institutional Repository Bibliography: "The Institutional Repository Bibliography (IRB) presents selected English-language articles, books, technical reports, and other scholarly textual sources that are useful in understanding institutional repositories. (See the scope note for further details.)

Most sources have been published between 2000 and the present; however, a limited number of key sources published prior to 2000 are also included. Where possible, links are provided to e-prints in disciplinary archives and institutional repositories for published articles. Note that e-prints and published articles may not be identical."

13 November 2010

RSP - Repository Software Survey, November 2010

The gencat blog. Generalitat de Catalunya » The Generalitat de Catalunya's Dades Obertes (Open Data) project is now a reality

Linked Open Data star scheme by example « Web of Data

Linked Open Data star scheme by example « Web of Data: "Linked Open Data star scheme by example
I like TimBL’s 5-star deployment scheme for Linked Open Data. However, every time I use it to explain the migration path from ‘no-data-on-the-Web’ to the ‘Full Monty’, no matter if to students, in training sessions or to industry partners, there comes a point where it would be very handy to refer to a concrete example that demonstrates the entire scheme.

Well, there we go. At

http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/"

Rough draft poem: Document, what art thou?

Rough draft poem: Document, what art thou?: "Rough draft poem: Document, what art thou?
I am the Data Container, Disseminator, and Canvas.
I came to be when the cognitive skills of mankind deemed oral history inadequate.
I am transcendent, I take many forms, but my core purpose is constant - Container, Disseminator, and Canvas.
I am dexterous, so I can be blank, partitioned horizontally, horizontally and vertically, and if you get moi excited I'll show you fractals.
I am accessible in a number of ways, across a plethora of media.
I am loose, so you can access my content too.
I am loose in a cool way, so you can refer to moi independent of my content.
I am cool in a loose way, so you can refer to my content independent of moi.
I am even cool and loose enough to let you figure out stuff from my content including how it's totally distinct from moi.
But...
I am possessive about my coolness, so all Containment, Dissemination, and Canvas requirements must first call upon moi, wherever I might be.
So...
If you postulate about my demise or irrelevance, across any medium, I will punish you with confusion!
Remember...
I just told you who I am.
Lesson to be learned..
When something tells you what it is, and it is as powerful as I, best you believe it.
BTW -- I am Okay with HTTP response code 200 OK :-)"

inkdroid › routers, webcams and thermometers

inkdroid › routers, webcams and thermometers: "At the end of the day, it would be useful if the W3C could de-emphasize httpRange-14, simplify the Architecture of the World Wide Web (by removing the notion of Information Resources), and pave the cowpaths we already are seeing for Real World Objects on the Web. It would be great to have a W3C document that guided people on how to put URIs for things on the web, that fit with how people are already doing it, and made intuitive sense. We’re already used to things like our routers, cameras and thermometers being on the web, and my guess is we’re going to see much, much more of it in the coming years. I don’t think a move like this would invalidate documents like Cool URIs for the Semantic Web, or make the existing Linked Data that is out there somehow wrong. It would simply lower the bar for people who want to publish Linked Data, who don’t necessarily want to go through the process of using URIs to distinguish non-Information Resources from Information Resources.

If the W3C doesn’t have the stomach for it, I imagine we will see the IETF lead the way, or for innovation to happen elsewhere as with HTML5."

Annotator | Open Knowledge Foundation

Annotator | Open Knowledge Foundation: "Annotator
Open-Source Annotation Toolkit for Inline, Online Web Annotation

Simple javascript (+backend) library for web-annotation. Main goals were and are:

Annotation of arbitrary text ranges
Annotate any web (html) document
Easy to use — 2 lines of javascript to insert this in your web page/app etc
Well-factored and library-structured — easy to integrate and easy to extend"

Catalogablog: VRA Core Schemas now Hosted by Library of Congress

Catalogablog: VRA Core Schemas now Hosted by Library of Congress: "The VRA Core is a data standard for the description of works of visual culture as well as the images that document them. The standard is now being hosted by the Network Development and MARC Standards Office of the Library of Congress (LC) in partnership with the Visual Resources Association. VRA Core’s schemas and documentation are now accessible at http://www.loc.gov/standards/vracore/ while user support materials, such as VRA Core examples, FAQs and presentations, will continue to be accessible at http://www.vraweb.org/projects/vracore4/

In addition, a new listserv has been created called The Core List (vracore@loc.gov). The Core List is an unmoderated computer forum that allows users of the VRA Core community to engage in a mutually supportive environment where questions, ideas, and tools can be shared. The Core List is operated by the Library of Congress Network Development and MARC Standards Office. Users may subscribe to this list by filling out the subscription form at the VRACORE Listserv site"

JISC Digital Media - Cross media: Open Source and Free Software Directory

JISC Digital Media - Cross media: Open Source and Free Software Directory: "The following tables comprise a selective guide to various free and open source software tools for a variety of digital media applications. This is an attempt to provide a first port of call for those looking for free and/or open source applications on the Internet. There is a bewildering array of such software and, as a result, we cannot hope to make this directory comprehensive. In addition, the almost daily changes in existing applications and the constant arrival of new ones means that the accuracy of information in these tables can’t be guaranteed. What this directory can provide is an easy way into the maze of applications on the Internet and a useful list of some of the most popular tools available. A quick glance at the tasks in the left hand column will immediately lead to one or more appropriate applications along with information about what machines it or they will work on. We hope to provide evaluations of some of these applications in the near future as well as keeping the directory as up-to-date as we can."

ICON: International Coalition on Newspapers

ICON: International Coalition on Newspapers: "The International Coalition on Newspapers is a coordinated multi-institutional effort to increase the availability of international newspaper collections by improving both bibliographic and physical access to global newspaper collections."

Research support services: What services do researchers need and use? | Research Information Network

Research support services: What services do researchers need and use? | Research Information Network: "The project’s goal was to discover researchers’ needs and desires in a small sample of UK and US universities and to identify the significant patterns, intersections, gaps and issues from researchers’ points of view, whatever the source of such services."

09 November 2010

Data.gov

W3C eGovernment Wiki

W3C eGovernment Wiki: "eGovernment Interest Group

The mission of the eGovernment Interest Group (eGov IG) is to explore how to improve access to government through better use of the Web and achieve better government transparency using open Web standards at any government level (local, state, national and multi-national). The eGov IG is designed as a forum to support researchers, developers, solution providers, and users of government services that use the Web as the delivery channel, and enable broader collaboration across eGov practitioners.
Find more in the executive summary. Learn more about eGovernment at W3C."

dl.org - DL.org Mission & Vision

dl.org - DL.org Mission & Vision: "DL.org is mobilising Digital Library (DL) designers, developers, end-users and researchers from diverse domains in the drive towards interoperability, best practices and modelling foundations for the enhanced development of next-generation DL systems. Specific outputs include an enhanced, community-driven DL Reference Model and a Technology & Methodology Cookbook with a portfolio of best practices and solutions on common issues for the development of large-scale interoperable DL systems. The Request for Comments version of the Cookbook is now available. Upcoming events: Education & Research on Digital Libraries, 9 Nov, Parma."

Nodalities » Blog Archive » LOD Around-the-Clock (LATC)

Nodalities » Blog Archive » LOD Around-the-Clock (LATC): "The challenge that all of these accidental technologists face is how to surface data and bring data together in meaningful ways. As Google’s chief economist Hal Varian has said, the scarce factor is no longer the data, which is essentially free and ubiquitous, but now the “scarce factor is the ability to understand that data and extract value from it.”

The emerging Web of Linked Data is the largest source of this data—multi-domain, real-world and real-time data—that currently exists. As data integration and information quality assessment increasingly depends on the availability of large amounts of real-world data, these new technologists are going to need to find ways to connect to the Linked Open Data (LOD) cloud."

Nodalities » Blog Archive » “Linked Data” at the Guardian

Nodalities » Blog Archive » “Linked Data” at the Guardian: "During October at Guardian News & Media we announced a change in our Open Platform Content API. For the first time, developers and users could query our database of over 1 million content items by using the common external identifiers of a MusicBrainz ID or an ISBN number. It is our first step into the world of ‘Linked Data’.

The Open Platform Content API was launched as a beta in 2009, and earlier this year was launched as a commercial product, allowing partners to re-use Guardian & Observer content in a variety of different ways. There is, for example, a Wordpress plugin that easily allows you to include Guardian content in your blog, and developers have built applications like a bespoke recipe search on top of the data. It is a unique proposition amongst news organisations on the web, and as well as the Content API itself, the Open Platform also includes publishing the source data behind Guardian journalism on the Data Store, and providing a search engine for Government datasets from around the world."

Home | CivicApps.org

Home | CivicApps.org: "Welcome to CivicApps!
Making public data easy to find and easy to use.

The first annual CivicApps Challenge is now open! This unique innovation event recognizes and rewards the best ideas and apps from the community. Join this growing community of innovative thinkers! Help us identify and recognize the best ideas and apps in the region. Share your own ideas. Submit an app to make life easier for everyone. So get your thinking caps on, share your ideas, and show us what you've got.

BE HEARD. Tell us the ideas you would like to see realized. Comment and vote for ways to make public information more accessible and useful.

GET INVOLVED. Show us how to use, combine and represent the information government holds in more useful and interesting ways. Your ideas provide data and input for developers to better understand the local communities' needs and create apps that matter.

TURN IDEAS INTO REALITY. Apps are what make it happen. Your participation is what turns the vision for public data into reality. Submit ideas that unlock the potential of local data and you could win cool stuff."

Public reports and deliverables | Arrow Project

DCMI Metadata Terms

DCMI Metadata Terms: "DCMI Metadata Terms

Title: DCMI Metadata Terms
Creator: DCMI Usage Board
Identifier: http://dublincore.org/documents/2010/10/11/dcmi-terms/
Date Issued: 2010-10-11
Latest Version: http://dublincore.org/documents/dcmi-terms/
Replaces: http://dublincore.org/documents/2008/01/14/dcmi-terms/
Translations: http://dublincore.org/resources/translations/
Document Status: This is a DCMI Recommendation.
Description: This document is an up-to-date specification of all metadata terms maintained by the Dublin Core Metadata Initiative, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes."

International Conference on Dublin Core and Metadata Applications

Coyle's InFormation: Beyond MARC-up
