Los árboles y el bosque: 2011

11 May 2011

eFoundations: LOCAH releases Linked Archives Hub dataset

eFoundations: LOCAH releases Linked Archives Hub dataset: "LOCAH releases Linked Archives Hub dataset

Posted by PeteJ at 17:07 11 May 2011 in Linked Data, Metadata, Resource Discovery, Semantic Web

The LOCAH project, one of the two JISC-funded projects to which I've been contributing, this week announced the availability of an initial batch of data derived from a small subset of the Archives Hub EAD data as linked data. The homepage for what we have called the 'Linked Archives Hub' dataset is http://data.archiveshub.ac.uk/"

Minimum Viable Record? « The Open Library Blog

Minimum Viable Record? « The Open Library Blog: "Minimum Viable Record?
By George Oates
Having worked more closely with bibliographic data than I had ever expected to over the last couple of years, I still can’t quite believe how complicated it can be. I keep holding tight something Karen Coyle told me when I first started at Open Library, that “library metadata is diabolically rational.”"

10 May 2011

Open data from Fukushima « Ivan’s private site

Open data from Fukushima « Ivan’s private site: "Open data from Fukushima
Filed under: Work Related,Semantic Web — Ivan Herman @ 10:35
Tags: SPARQL, Resource Description Framework, Linked Open Data
This is just an extended tweet… Masahide Kanzaki has just posted an announcement on the LOD mailing list about releasing some data he collected on radioactivity levels at different places in Japan, enriched with metadata (e.g., geo data or time). Though the original data were in PDF, the results are integrated in RDF with a SPARQL endpoint. He also added a visualization endpoint that gives a simple visualization of the SPARQL query results:"
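
An endpoint like the one described above is usually queried over HTTP, with the query text passed in a `query` parameter as the SPARQL protocol specifies. What follows is a minimal sketch and not the setup from the post: the endpoint address is a placeholder (the excerpt does not give the real one), and the query is deliberately generic so that it assumes nothing about the dataset's vocabulary.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint address; the real one is not given in the excerpt.
ENDPOINT = "http://example.org/sparql"

# Generic query: ten arbitrary triples, no dataset-specific terms assumed.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The sketch only builds the URL. Fetching it (for example with `urllib.request.urlopen`) and choosing a result format are left out because they depend on the actual endpoint.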

Inserting data from a SPARQL endpoint into a relational database - bobdc.blog

Inserting data from a SPARQL endpoint into a relational database - bobdc.blog: "Inserting data from a SPARQL endpoint into a relational database
27 April 2011

Via XML.
Retrieval of triples from relational databases is a popular topic in the semantic web world, but I was recently wondering how much trouble it would be to go in the opposite direction: to retrieve data from a SPARQL endpoint and load it into a relational database. It wasn't much trouble at all. When you retrieve the results in the SPARQL query results XML format, a straightforward XSLT stylesheet can convert it into the necessary SQL INSERT statements. I was able to automate the data retrieval, conversion to INSERT statements, and actual insertion into a MySQL database with a three-line batch file that used no Windows-specific tricks, so I'm sure it would work on Linux just as well."
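
The conversion step described above can also be sketched in Python. This is not the XSLT approach from the post: it is a stand-in that reads the SPARQL query results XML format and emits one INSERT statement per result row. The table name `people`, the sample data, and the helper names are all made up for illustration, and the quoting is deliberately naive; real loading code would more likely use parameterized queries.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL query results XML format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Made-up sample result set; the second row leaves ?city unbound.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="name"/><variable name="city"/></head>
  <results>
    <result>
      <binding name="name"><literal>O'Brien</literal></binding>
      <binding name="city"><literal>Dublin</literal></binding>
    </result>
    <result>
      <binding name="name"><literal>Ada</literal></binding>
    </result>
  </results>
</sparql>"""


def sql_literal(value):
    """Quote a value as an SQL string literal; None becomes NULL."""
    if value is None:
        return "NULL"
    return "'" + value.replace("'", "''") + "'"


def to_inserts(xml_text, table):
    """Turn a SPARQL XML result set into a list of INSERT statements."""
    root = ET.fromstring(xml_text)
    columns = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    statements = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            # The single child is a <uri>, <literal> or <bnode> element.
            row[binding.get("name")] = binding[0].text
        values = ", ".join(sql_literal(row.get(c)) for c in columns)
        statements.append(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values});"
        )
    return statements


for statement in to_inserts(SAMPLE, "people"):
    print(statement)
```

Unbound variables become NULL, and single quotes are doubled, so the first row yields `'O''Brien'`.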

blog.aksw.org » Blog Archive » LinkedGeoData Release 2

blog.aksw.org » Blog Archive » LinkedGeoData Release 2: "LinkedGeoData Release 2

April 27, 2011 - 6:32 pm by ThomasRiechert
The AKSW research group is happy to announce a new release of LinkedGeoData!
The aim of the LinkedGeoData (LGD) project is to make the OpenStreetMap (OSM) datasets easily available as RDF. As such, the main target audience is the Semantic Web community; however, it may turn out to be useful to a much larger audience. Additionally, we are providing interlinking with DBpedia and GeoNames and integration of class labels from translatewiki and icons from the Brian Quinion Icon Collection.

The result is a rich, open, and integrated dataset which we hope to be useful for research and application development. The datasets can be publicly accessed via downloads, Linked Data, and SPARQL-endpoints. We have also launched an experimental “Live-SPARQL-endpoint” that is synchronized with the minutely updates from OSM whereas the changes to our store are republished as RDF."

Open Knowledge Foundation Blog » Blog Archive » Open Data Search: finding useful datasets, worldwide

Open Knowledge Foundation Blog » Blog Archive » Open Data Search: finding useful datasets, worldwide: "Open Data Search: finding useful datasets, worldwide
March 16th, 2011
The following post is from Friedrich Lindenberg, who is a developer at the Open Knowledge Foundation working on CKAN, PublicData.eu and Open Spending.

Recently, there has hardly been a week in which there hasn’t been an announcement of a new local, regional or national open data initiative – including ever more extensive catalogues of data that is being opened up (CKAN alone now runs in 20 or more places). While this is great news for those of us interested in re-using the data, it also means it becomes increasingly hard to keep a good overview of what kind of data are available for which places. To get a better overview we’ve now started a meta search engine for open data, opendatasearch.org."

Go To Hellman: Open Access eBooks, Part 3. Business Models for Creation

Go To Hellman: Open Access eBooks, Part 3. Business Models for Creation: "Open Access eBooks, Part 3. Business Models for Creation
Here's the third section of my draft of a book chapter for a book edited by No Shelf Required's Sue Polanka. I previously posted the introduction and What does Open Access mean for eBooks; subsequent posts will cover Open Access E-Books in Libraries. Note that while the blog always uses 'ebook' as one word, the book will use the hyphenated form, 'e-book'. The comments on the second section prompted me to make significant revisions, which I have posted."

Go To Hellman: Open Access eBooks, Part 2. What does Open Access mean for e-books?

Go To Hellman: Open Access eBooks, Part 2. What does Open Access mean for e-books?: "Open Access eBooks, Part 2. What does Open Access mean for e-books?
Here's the second section of my draft of a book chapter for a book edited by No Shelf Required's Sue Polanka. I previously posted the introduction; subsequent posts will include sections on Business Models for Open Access E-Books, and Open Access E-Books in Libraries. Note that while the blog always uses 'ebook' as one word, the book will use the hyphenated form, 'e-book'. The comments on the first section have been really good; please don't stop!"

Go To Hellman: Open Access eBooks, Part 1

Go To Hellman: Open Access eBooks, Part 1: "Open Access eBooks, Part 1
I've been working on a book chapter for a book edited by No Shelf Required's Sue Polanka. My chapter covers 'Open Access E-Books'. Over the next week or two, I'll be posting drafts for the chapter on the blog. Many readers know things that I don't about this area, and I would be grateful for their feedback and corrections. Today, I'll post the introduction; subsequent posts will include sections on Types of Open Access E-Books, Business Models for Open Access E-Books, and Open Access E-Books in Libraries. Note that while the blog always uses 'ebook' as one word, the book will use the hyphenated form, 'e-book'."

Spanish Cadastral Mass Download Service Launched / News / News / Home - ePSIplus - Public Sector Information

Spanish Cadastral Mass Download Service Launched / News / News / Home - ePSIplus - Public Sector Information: "A new massive download option for data on 75 million properties has been launched by the Spanish Cadastre.

Madrid, 6 April 2011,

(by Ton Zijlstra)

The Spanish Cadastre has launched a mass download service for cadastral data, allowing re-use by citizens and businesses for both commercial and non-commercial purposes. The official launch of the service (which we announced late March) was attended by over 300 people from almost 200 interested organizations.

Data on 75 million real estate properties in Spain can now be accessed for download and re-use, and the service ensures free access to all digital cartography material. This is a massive amount of data."

jhove2 / main / wiki / JHOVE2-2.0.0 Download – Bitbucket

inkdroid › DOIs as Linked Data

inkdroid › DOIs as Linked Data: "DOIs as Linked Data
Last week Ross Singer alerted me to some pretty big news for folks interested in Library Linked Data: CrossRef has made the metadata for 46 million Digital Object Identifiers (DOI) available as Linked Data. DOIs are heavily used in the publishing space to uniquely identify electronic documents (largely scholarly journal articles). CrossRef is a consortium of roughly 3,000 publishers, and is a big player in the academic publishing marketplace.

So practically what this means is that all the places in the scholarly publishing ecosystem where DOIs are present (caveat below), it’s now possible to use the Web to retrieve metadata associated with that electronic document. Say you’ve got a DOI in the database backing your institutional repository:"
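
Retrieval of this kind is normally done by HTTP content negotiation: the client asks the DOI resolver for an RDF representation instead of the publisher's landing page. The sketch below only builds such a request and does not send it. The resolver address and the `application/rdf+xml` media type are assumptions based on how DOI content negotiation is commonly described, not details taken from the excerpt. The DOI is borrowed from another entry on this page purely as an example, and whether it is a CrossRef DOI has not been checked.

```python
from urllib.request import Request


def doi_rdf_request(doi: str) -> Request:
    """Build (but do not send) a request asking the resolver for RDF/XML."""
    return Request(
        "http://dx.doi.org/" + doi,                # assumed resolver address
        headers={"Accept": "application/rdf+xml"},  # assumed media type
    )


# Example DOI taken from elsewhere on this page, for illustration only.
req = doi_rdf_request("10.3789/isqv23n1.2011.05")
print(req.full_url, req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the lookup. What comes back depends on the registration agency behind the DOI.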

Automatic subject cataloging at the German National Library « all things cataloged

Automatic subject cataloging at the German National Library « all things cataloged: "Automatic subject cataloging at the German National Library
MAY 5
Posted by Saskia
The German National Library has been working on a project for automatic subject classification that is expected to go live at the end of this year. This project, called PETRUS, is explained in detail in this article (in German). Part of the Library’s mission is the creation of the German National Bibliography, and to keep pace with the onslaught of (digital) resources more efficiency is needed. The main areas where human intellectual work is supported by machines in the PETRUS project are indexing, classification, metadata extraction and metadata generation."

15 April 2011

Weibel Lines: Principles of Linked Data Recast

Weibel Lines: Principles of Linked Data Recast: "Not so fast. I'm afraid we need another principle. Data curation. Data worth linking to is cared for, managed, corrected, updated. In the words of the old song (and yes, I expose my age) 'Does your chewing gum lose its flavor on the bedpost overnight?' Bit-rot prevails, as sure as death and taxes."

Outgoing: Changes to VIAF's RDF

Outgoing: Changes to VIAF's RDF: "VIAF RDF has evolved over time and is about to change again to streamline the model. Using Barbara Tillett as an example, these before and after diagrams might give a sense of the changes in the next generation."

BibServer

BibServer: "BibServer

BibServer is a Python program which creates a network of displays of bibliographic data maintained in BibTeX format by contributing authors and editors. The displays link whenever possible to full text in open archives such as arXiv and PubMed, in electronic journals, and on authors' homepages. Links to related articles and citations are provided by Google Scholar. For articles in mathematics, links are provided if possible to abstracts and reviews in MathSciNet and Zentralblatt MATH.
BibServer index of personal bibliographies

Personal BibServer: documentation

BibServer MetaSearch Access to various search engines and databases to facilitate control of multiple searches and navigation between different sources."

Home - Bibliographica

Home - Bibliographica: "Welcome to Bibliographica

Bibliographica is an open catalogue of cultural works. There are currently 3020429 works in the database.

Search by name, (e.g. Dickens), title, (e.g. War and Peace), etc."

Neil Beagrie’s Blog » Blog Archive » New Project for 2011 – Digital Preservation Benefit Analysis Tools

Neil Beagrie’s Blog » Blog Archive » New Project for 2011 – Digital Preservation Benefit Analysis Tools: "The “Digital Preservation Benefit Analysis Tools” project is funded by the Joint Information Systems Committee (JISC) and will run from 1st February to 31 July 2011.

The project aims to test, review and promote combined use of the Keeping Research Data Safe (KRDS) Benefits Taxonomy and the Value Chain and Impact Analysis tool first applied in the I2S2 project for assessing the benefits and impact of digital preservation of research data. We will extend their utility to and adoption within the JISC community by providing user review and guidance for the tools and creating an integrated toolset. The project consortium consists of a mix of user institutions, projects, and disciplinary data services committed to the testing and exploitation of these tools and the lead partners in their original creation. We will demonstrate and critique the tools, and then create and disseminate the toolset and accompanying materials such as User Guides and Factsheets to the wider community."

Top 100 most popular RDF namespace prefixes | cygri’s notes on web data

Top 100 most popular RDF namespace prefixes | cygri’s notes on web data: "Top 100 most popular RDF namespace prefixes
Posted on February 15, 2011 by Richard Cyganiak
I run prefix.cc, a website for RDF developers where anyone can register and look up the expansion URIs for namespace prefixes such as foaf, dc, qb or void. The site tracks which prefixes get looked up most often. This allows some insight into the popularity of RDF vocabularies and datasets.

This post is a snapshot of the top 100 most requested prefixes as of today."
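
To make "expansion URI" concrete, here is a small sketch of expanding a prefixed name against a local table of the four prefixes named above. It does not call prefix.cc or reproduce its API. The table is hard-coded with the commonly used expansions, and the function name `expand` is made up for this example.

```python
# Commonly used expansions for the four prefixes mentioned in the post.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "qb": "http://purl.org/linked-data/cube#",
    "void": "http://rdfs.org/ns/void#",
}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'foaf:name' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand("foaf:name"))
print(expand("void:Dataset"))
```

An unregistered prefix raises `KeyError` rather than guessing, which roughly mirrors the point at which a developer would look the prefix up on a site like prefix.cc.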

State of the LOD Cloud

State of the LOD Cloud: "State of the LOD Cloud

Chris Bizer (Freie Universität Berlin)
Anja Jentzsch (Freie Universität Berlin)
Richard Cyganiak (DERI, NUI Galway)
Version 0.2, 03/28/2011

This document provides statistics about the structure and content of the LOD cloud. It also analyzes the extent to which LOD data sources implement nine best practices that are either recommended by W3C or have emerged within the LOD community.

All statistics within this document are based on the LOD data set catalog that is maintained on CKAN. If you spot any errors in the data describing the LOD data sets, it would be great if you would correct them directly on CKAN. For information on how to describe datasets on CKAN please refer to the Guidelines for Collecting Metadata on Linked Datasets in CKAN."

Connected Histories

Connected Histories: "British History Sources, 1500-1900
Connected Histories brings together a range of digital resources related to early modern and nineteenth century Britain with a single federated search that allows sophisticated searching of names, places and dates, as well as the ability to save, connect and share resources within a personal workspace.
Connected Histories is a not-for-profit project. We welcome proposals for new content."

SCA IPR and Licensing module

SCA IPR and Licensing module: "This IPR and Licensing module has been developed by the Strategic Content Alliance for staff working in public sector bodies to introduce to them the concepts of copyright and other Intellectual Property Rights and how they might deal with the rights and licensing issues associated with the curation and creation of digital content.
The module has been divided into 6 learning objects, organized by key theme:

1) Introduction to IPR and Licensing
2) Creative Commons Licences
3) Orphan Works and Risk Management
4) Digital Economy Act
5) Accessing and Using Third Party Content
6) Protecting and Managing Rights"

Final Report Online Consultation Published / News / News / Home - ePSIplus - Public Sector Information

Final Report Online Consultation Published / News / News / Home - ePSIplus - Public Sector Information: "Source: European Commission / Information Society

The final report on the online consultation on the PSI Directive has been published.

Luxembourg, 25 March 2011

(by Ton Zijlstra)

In the last months of 2010 an online consultation took place about the PSI Directive, its impact and the type of amendments the various stakeholders think are needed. The final report on this consultation is now available. With almost 600 responses to the consultation, 15 times more than the 2008 consultation, it is clear that PSI re-use is a topic that is getting a lot of attention. Responses came from 37 states, the majority from 11 EU member states (with Germany forming the largest group of responses). Five Member States entered official responses: Belgium, France, Denmark, UK, Netherlands.

Earlier we already reported on the publication of all the reactions received.

The final report on the online consultation is embedded below in full (and can be downloaded here, as well as from the EC DG Information Society PSI website in PDF)."

Topic Report 'Local and Regional Data' Published / News / News / Home - ePSIplus - Public Sector Information

Topic Report 'Local and Regional Data' Published / News / News / Home - ePSIplus - Public Sector Information: "Topic Report 27 'Local and Regional Data' is now available.

Luxembourg, 31 March 2011

(by Ton Zijlstra)

A new Topic Report is now available, titled 'Local and Regional Data'. Written by Rob Davies, this is the 27th and final report produced under the 2009-2011 service contract for the ePSI platform. The series of regular reports will be continued in the coming 2 years.

This report describes the recent progress being made in improving access to local government data for re-use, catalyzed by Open Data initiatives, reflecting some of the mechanisms which are being deployed. It summarises some apparent issues which remain to be resolved and outlines some initiatives and possible ways of improving discovery and access across localities and borders.

The report is also embedded below for browsing and downloading."

JISC Digitisation Programme » Presentations – New Strategies for Digital Content

JISC Digitisation Programme » Presentations – New Strategies for Digital Content: "JISC hosted the New Strategies for Digital Content conference in London on March 18 2011.

The event looked at two themes:

1) the need for institutions to develop the necessary skills and strategies to embed digitisation within institutional strategies and practices as well as devise effective business models for the long term sustainability of digitised content
2) the need to break down silos of content by clustering existing and complementary digitised resources and enhancing their offerings, thus making them more relevant and usable for target users

Presentations and links related to the day are below."

DigitalKoans » Blog Archive » "JISC CETIS 2011 Informal Horizon Scan"

DigitalKoans » Blog Archive » "JISC CETIS 2011 Informal Horizon Scan": "This report outlines some technology trends and issues of interest and relevance to CETIS. It should be seen as a set of un-processed perceptions rather than the product of a formal process; a great deal of ground is not scanned in this paper and it should be understood that no formal prioritisation process was undertaken."

DigitalKoans » Blog Archive » Preservation of Digitized Books and Other Digital Content Held by Cultural Heritage Organizations

DigitalKoans » Blog Archive » Preservation of Digitized Books and Other Digital Content Held by Cultural Heritage Organizations: "In one response to this need to develop models of digital preservation, the NEH and IMLS awarded a grant to Portico, in partnership with Cornell University Library, through the 'Advancing Knowledge: The IMLS/NEH Digital Partnership grant program' to develop a practical model for how preservation can be accomplished for digitized books. Through this initiative and other efforts, Portico had the opportunity to discuss digital collections and their long-term preservation with 27 cultural heritage organizations. In addition, Cornell University Library provided significant samples of content to analyze. Out of this research and the extensive experience in preservation at both Portico and Cornell University Library, we developed a model for the preservation of digitized books and other 'document like' digital content at cultural heritage organizations."

11 April 2011

eFoundations: RDTF metadata guidelines - Limp Data or Linked Data?

eFoundations: RDTF metadata guidelines - Limp Data or Linked Data?: "Having just finished reading thru the 196 comments we received on the draft metadata guidelines for the UK RDTF I'm now in the process of wondering where we go next. We (Pete and I) have relatively little effort to take this work forward (a little less than 5 days to be precise) so it's not clear to me how best we use that effort to get something useful out for both RDTF and the wider community."

Fihrist - Home

Fihrist - Home: "Welcome to Fihrist. This catalogue provides a searchable interface to more than 3,000 basic manuscript descriptions taken from printed and card catalogues of the collections of the Bodleian Libraries, Oxford and Cambridge University Library. It contains records for Arabic manuscripts at Oxford and both Arabic and Persian records for Cambridge. Fihrist was created with JISC funding by the OCIMCO project."

[1103.5046] From Linked Data to Relevant Data -- Time is the Essence

[1103.5046] From Linked Data to Relevant Data -- Time is the Essence: "From Linked Data to Relevant Data -- Time is the Essence

Markus Kirchberg, Ryan K L Ko, Bu Sung Lee
(Submitted on 25 Mar 2011)
The Semantic Web initiative puts emphasis not primarily on putting data on the Web, but rather on creating links in a way that both humans and machines can explore the Web of data. When such users access the Web, they leave a trail as Web servers maintain a history of requests. Web usage mining approaches have been studied since the beginning of the Web given the log's huge potential for purposes such as resource annotation, personalization, forecasting etc. However, the impact of any such efforts has not really gone beyond generating statistics detailing who, when, and how Web pages maintained by a Web server were visited."

[1103.5043] An Empirical Study of Real-World SPARQL Queries

[1103.5043] An Empirical Study of Real-World SPARQL Queries: "An Empirical Study of Real-World SPARQL Queries

Mario Arias, Javier D. Fernández, Miguel A. Martínez-Prieto, Pablo de la Fuente
(Submitted on 25 Mar 2011)
Understanding how users tailor their SPARQL queries is crucial when designing query evaluation engines or fine-tuning RDF stores with performance in mind. In this paper we analyze 3 million real-world SPARQL queries extracted from logs of the DBPedia and SWDF public endpoints. We aim to find which language elements are used most, from both syntactic and structural perspectives, paying special attention to triple patterns and joins, since these are among the most expensive SPARQL operations at the evaluation phase. We have determined that most of the queries are simple and include few triple patterns and joins, with Subject-Subject, Subject-Object and Object-Object being the most common join types. The graph patterns are usually star-shaped and, although triple pattern chains exist, they are generally short."
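
The join types named in the abstract can be illustrated with a toy classifier. This is not the paper's methodology. It assumes a simplified representation in which each triple pattern is a `(subject, predicate, object)` tuple of strings and variables start with `?`, and it labels every variable shared between two patterns by the positions in which it occurs.

```python
from itertools import combinations

POSITIONS = ("Subject", "Predicate", "Object")


def join_types(patterns):
    """Label each variable shared by two triple patterns by its positions."""
    found = []
    for a, b in combinations(patterns, 2):
        for i, term_a in enumerate(a):
            for j, term_b in enumerate(b):
                if term_a.startswith("?") and term_a == term_b:
                    # Order the pair so Object-Subject reads Subject-Object.
                    first, second = sorted((i, j))
                    found.append(f"{POSITIONS[first]}-{POSITIONS[second]}")
    return found


# Star shape: both patterns hang off the same subject variable.
star = [("?person", "foaf:name", "?name"), ("?person", "foaf:mbox", "?mbox")]

# Chain: the object of the first pattern is the subject of the second.
chain = [("?person", "foaf:knows", "?friend"), ("?friend", "foaf:name", "?name")]

print(join_types(star))
print(join_types(chain))
```

In this representation a star-shaped pattern produces Subject-Subject joins, while a chain produces Subject-Object joins.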

[1103.4295] Linking Literature and Data: Status Report and Future Efforts

[1103.4295] Linking Literature and Data: Status Report and Future Efforts: "Linking Literature and Data: Status Report and Future Efforts

Alberto Accomazzi
(Submitted on 22 Mar 2011)
In the current era of data-intensive science, it is increasingly important for researchers to be able to have access to published results, the supporting data, and the processes used to produce them. Six years ago, recognizing this need, the American Astronomical Society and the Astrophysics Data Centers Executive Committee (ADEC) sponsored an effort to facilitate the annotation and linking of datasets during the publishing process, with limited success. I will review the status of this effort and describe a new, more general one now being considered in the context of the Virtual Astronomical Observatory."

SUSHI Server - National Information Standards Organization

SUSHI Server - National Information Standards Organization: "Author(s): Brinda Shah is a web programmer at H.W. Wilson.
doi: 10.3789/isqv23n1.2011.05
Citation: Shah, Brinda. SUSHI Implementation: The Server Side Experience. Information Standards Quarterly, 2011 Winter, 23(1):20-22.
Abstract: The author describes her experience in implementing the server side of the Standardized Usage Statistics Harvesting Initiative (SUSHI) Protocol (ANSI/NISO Z39.93) at H.W. Wilson. She describes her learning curve with web services, the steps involved in implementing a SUSHI server for delivering usage data to clients. Tools used include J2EE framework, Apache Tomcat web application server, Axis SOAP engine, and Eclipse development tool."

SUSHI Client - National Information Standards Organization

SUSHI Client - National Information Standards Organization: "Author(s): Omar Villa is IT Development Manager at Grupo Integra in Mexico City, Mexico.
doi: 10.3789/isqv23n1.2011.04
Citation: Villa, Omar. SUSHI Implementation: The Client Side Experience. Information Standards Quarterly, 2011 Winter, 23(1):18-19.
Abstract: The author describes his experience in implementing the client side of the Standardized Usage Statistics Harvesting Initiative (SUSHI) Protocol (ANSI/NISO Z39.93) at Grupo Integra. He developed a module for their Kenvo Stats system, which generates statistics on the usage of electronic resources, to automate the retrieval of the COUNTER report statistics. After trying some PHP tools and a Java implementation, the final client was built using PHP Sockets."

Democratizing Information with Semantics » AI3:::Adaptive Information

Democratizing Information with Semantics » AI3:::Adaptive Information: "Self-service Information Management for Knowledge Workers

Though I have alluded to it numerous times in my past writings [1], I think one of the most pervasive and important benefits from semantic technologies in the enterprise will come from the democratization of information. These benefits will arise mostly from a fundamental change in how we manage and consume information. A new “system” of semantic technologies is now largely available that can put the collection, assembly, organization, analysis and presentation of information directly in the hands of those who need it most — the consumers of information."

ResourceBlog Article: Create QR Codes on Wolfram|Alpha

Gather Evidence to Inform Changes in MARC Metadata Practices [OCLC - Activities]

Mar 16: Metadata Harmonization (NISO/DCMI Webinar) - National Information Standards Organization

Mar 16: Metadata Harmonization (NISO/DCMI Webinar) - National Information Standards Organization: "Metadata Harmonization: Making Standards Work Together"

Coyle's InFormation: Open Data II

Coyle's InFormation: Open Data I

01 April 2011

Journal Article Tag Suite

Journal Article Tag Suite: "Journal Article Tag Suite
... is an application of NISO Z39.96, which defines a set of XML elements and attributes for tagging journal articles and describes three article models.
The content on this site is the supporting documentation for the standard. JATS is a continuation of the NLM Archiving and Interchange DTD work begun in 2002 by NCBI."

20 February 2011

Introducing UMBEL Version 1.00 - semanticweb.com

Introducing UMBEL Version 1.00 - semanticweb.com

After four years of tinkering, Structured Dynamics and Ontotext are ready to announce the release of UMBEL version 1.00, the first production-grade release of the system. According to the article, UMBEL (Upper Mapping and Binding Exchange Layer) “is primarily a reference ontology, which contains 28,000 concepts (classes and relationships) derived from the Cyc knowledge base. The reference concepts of UMBEL are mapped to Wikipedia, DBpedia ontology classes, GeoNames and PROTON.”

2010 Visualization Challenge

2010 Visualization Challenge

An “ocean” composed of a single layer of molecules; an intricate depiction of an HIV particle as a study in orange and gray; a phantasmagoria of fungi; a video tracing the long-distance travels of items dumped in the trash in Seattle: The four first-place winners in this year's International Science & Engineering Visualization Challenge grab your attention and draw you into unseen worlds in very different ways.

The Semantic Puzzle | Transforming spreadsheets into SKOS with Google Refine

CILIP | Cataloguing and Indexing Tools and resources - cataloguing, indexing and classification

Professional tools and resources

A taxonomy of professional tools, resources and publications supporting traditional cataloguing, metadata manipulation, classification and indexing. All links point to professional articles, tools and resources that are either open access or licensed for free use by cataloguers.

This section is maintained by David Bennett.

Please contact David at david.bennett@port.ac.uk if you have any comments or suggestions.

19 February 2011

DSPL: Dataset Publishing Language - Google Code

TILE | Text-Image Linking Environment

The Text-Image Linking Environment (TILE) is a web-based tool for creating and editing image-based electronic editions and digital archives of humanities texts.

This initial release of TILE features tools for importing and exporting transcript lines and images of text, an image markup tool, a semi-automated line recognizer that tags regions of text within an image, and plugin architecture to extend the functionality of the software.

DigitalKoans » Blog Archive » Library of Congress Funds Omeka + Neatline Project

Library of Congress Funds Omeka + Neatline Project

The Library of Congress has awarded $665,248 in funding to the Omeka + Neatline project.

Here's an excerpt from the announcement:

The Scholars' Lab at the University of Virginia Library and the Center for History and New Media (CHNM) at George Mason University are pleased to announce a collaborative "Omeka + Neatline" initiative, supported by $665,248 in funding from the Library of Congress.

The Omeka + Neatline project's goal is to enable scholars, students, and library and museum professionals to create geospatial and temporal visualizations of archival collections using a Neatline toolset within CHNM's popular, open source Omeka exhibition platform. Neatline, a "contribution to interpretive humanities scholarship in the visual vernacular," is a project of the UVa Library Scholars' Lab, originally bolstered by a Start-Up Grant from the Office of Digital Humanities at the National Endowment for the Humanities. Omeka is an award-winning web-publishing platform for the display of cultural heritage and scholarly collections and exhibits, funded by the Institute of Museum and Library Services, Alfred P. Sloan Foundation, Andrew W. Mellon Foundation, and Samuel H. Kress Foundation.

17 February 2011

Open Knowledge Foundation Blog » Blog Archive » DataMarket.com Launches with 100 Million Open Data Time Series – remains firmly committed to open data

Using the bibliographica sparql API | Open Bibliography and Open Bibliographic Data

TRLN Digitization Strategy Announcement [OCLC]

TRLN Digitization Strategy Announcement [OCLC]: "DUBLIN, Ohio, USA, 15 February 2011—This is the first formally published strategy for providing access to unpublished materials online based on an approach created by OCLC Research and the RLG Partnership.

This approach is described in a document titled, 'Well‐intentioned practice for putting digitized collections of unpublished materials online' and is the output of an 'Undue Diligence' invitational seminar held in the spring of 2010. During this event, OCLC Research convened a group of RLG Partner experts from archives, special collections and the law to develop and define streamlined, community-accepted procedures for managing copyright in the digital age that would cut costs and boost confidence in libraries' and archives' ability to increase visibility of and access to unpublished materials online. The group acknowledged that, although there is risk in digitizing materials that may be in copyright, this risk should be balanced with the harm to scholarship and society inherent in not making collections fully accessible. Based on this premise, they identified a practical approach to selecting collections, making decisions, seeking permissions, recording outcomes, establishing policy and working with future donors, which O"

Top 100 most popular RDF namespace prefixes | cygri’s notes on web data

Will RDA kill MARC?

http://pages.uoregon.edu/kelleym/KM_MWpresentation.pdf

14 February 2011

SKOS Now Interoperates with OWL 2 » AI3:::Adaptive Information

13 February 2011

Web 3.0 Could Lead to E-Government That Anticipates Citizens’ Needs

Web 3.0 Could Lead to E-Government That Anticipates Citizens’ Needs: "“Web 3.0” is an IT buzzword that’s appearing with greater frequency among the state and local government IT community. Explanations differ as to what it means in terms of implementation, but the overarching concept is “machine-to-machine” communication on the Internet."

What SKOS-XL adds to SKOS - bobdc.blog

What SKOS-XL adds to SKOS - bobdc.blog: "Extra flexibility for label metadata.
In my first few glances at SKOS eXtension for Labels, I didn't quite get it. Recently, though, while looking at a client's requirements document at TopQuadrant, when I saw that they wanted to attach metadata to individual terms, I started modeling this in my head and then I realized I didn't need to: SKOS-XL already had."

Special Online Collection: Dealing with Data

Special Online Collection: Dealing with Data: "In the 11 February 2011 issue, Science joins with colleagues from Science Signaling, Science Translational Medicine, and Science Careers to provide a broad look at the issues surrounding the increasingly huge influx of research data. This collection of articles highlights both the challenges posed by the data deluge and the opportunities that can be realized if we can better organize and access the data.

The Science cover (left) features a word cloud generated from all of the content from the magazine's special section.

Science is making access to this entire collection FREE (simple registration is required for non-subscribers)."

Open Knowledge Foundation Blog » Blog Archive » Playing around with Open Linked Data: data.totl.net

Open Knowledge Foundation Blog » Blog Archive » Playing around with Open Linked Data: data.totl.net: "The following guest post is by Christopher Gutteridge, a Web & Systems Programmer and Open Data Architect at the University of Southampton. When he was young he wrote the “coffee stain” filter for GIMP, and is the developer of Graphite RDF PHP library & tools. He is a member of the OKF Working Group on Open Bibliographic Information.

I know that it’s best to practice a new technique before employing it on anything major. I also like over-engineering for its own sake. (Beware the Modeller!) That’s how I ended up building data.totl.net."

Home - ePSIplus - Public Sector Information

Home - ePSIplus - Public Sector Information: "Europe's One-Stop Shop on Public Sector Information (PSI) Re-use

Working to Stimulate PSI Re-use

The aim of the ePSIplatform is to stimulate action, report developments and monitor progress towards a stronger and more transparent environment for the growth of national and European PSI re-use markets."

Official Google Blog: Register for Google I/O 2011

Official Google Blog: Register for Google I/O 2011: "The focus of I/O 2011 will be all about the cloud, and feature the latest Google products and technologies including Android, Google Chrome, App Engine, Google Web Toolkit and Google APIs. There will be many opportunities to meet members of Google’s engineering teams and take deep dives into the technologies with more than 100 technical sessions, roundtables and more. The Developer Sandbox, which we introduced at I/O 2009, will be back, featuring developers from more than 100 companies to demo their apps, share their experiences and exchange ideas."

HEFCE Review of JISC

1. This report sets out the findings and recommendations of the Review Group chaired by Professor Sir Alan Wilson into the strategy, activities and effectiveness of the Joint Information Systems Committee (JISC). The review’s terms of reference are at Annex A and the Review Group membership is listed at Annex B.

http://www.jisc.ac.uk/media/documents/aboutus/aboutjisc/JISCReview.pdf

China building a city for cloud computing | KurzweilAI

"Copy, paste, map" - O'Reilly Radar

"Copy, paste, map" - O'Reilly Radar: "Data, data everywhere, and all too many spreadsheets to think.

Citizens have a new tool to visualize data and map it onto their own communities. Geospatial startup FortiusOne and the Federal Communications Commission (FCC) have teamed up to launch IssueMap.org. IssueMap is squarely aimed at addressing one of the biggest challenges that government agencies, municipalities and other public entities have in 2011: converting open data into information that people can distill into knowledge and insight.

IssueMap must, like the data it visualizes, be put in context. The world is experiencing an unprecedented data deluge, a reality that my colleague Edd Dumbill described as another 'industrial revolution' at last week's Strata Conference. The release of more data under the Open Government Directive issued by the Obama Administration has resulted in even more data becoming available. The challenge is that for most citizens, the hundreds of thousands of data sets available at Data.gov, or at state or city data catalogs, don't lead to added insight or utility in their everyday lives. This partnership between FortiusOne and the FCC is an attempt to give citizens a mapping tool to make FCC data meaningful."

The Horizon Report: 2011 Edition

http://net.educause.edu/ir/library/pdf/HR2011.pdf

The internationally recognized series of Horizon Reports is part of the New Media Consortium’s Horizon Project, a comprehensive research venture established in 2002 that identifies and describes emerging technologies likely to have a large impact over the coming five years on a variety of sectors around the globe. This volume, the 2011 Horizon Report, examines emerging technologies for their potential impact on and use in teaching, learning, and creative inquiry. It is the eighth in the annual series of reports focused on emerging technology in the higher education environment.

Shared Services in Cloud Computing | Digital Curation Centre

Shared Services in Cloud Computing | Digital Curation Centre: "Today's announcement by HEFCE of funding for the development of cloud services in UK Higher Education will allow the DCC to develop some significant new services in the coming year. As well as deploying our own tools in the cloud infrastructure, we'll be developing advice and support for institutions considering use of the cloud for any and all aspects of research data management. This funding will also allow us to take forward many of the salient recommendations from the UK Research Data Service (UKRDS) project and intensify a number of our existing activities, such as the DCC roadshows. We'll be developing detailed plans in the next few weeks and expect to make a number of further announcements during and after that time. The DCC will be working closely with JISC, JANET, Eduserv and others to realise the benefits of this exciting but challenging programme. The full text of the announcement from HEFCE, containing details of all aspects of the programme, is here."

Tasty, New Sweet Tools Release » AI3:::Adaptive Information

Tasty, New Sweet Tools Release » AI3:::Adaptive Information: "Sweet Tools, AI3’s listing of semantic Web and -related tools, has just been released with its 17th update. The listing now contains more than 900 tools, nearly a 10% increase over the last version. Significantly, the listing is also now presented via its own semantic tool, the structSearch sComponent, which is one of the growing parts to Structured Dynamics’ open semantic framework (OSF).

So, we invite you to go ahead and try out this new Flex/Flash version with its improved"

SKOS Now Interoperates with OWL 2 » AI3:::Adaptive Information

SKOS Now Interoperates with OWL 2 » AI3:::Adaptive Information: "In the semantic Web, arguably SKOS is the right vocabulary for representing simple knowledge structures [1] and OWL 2 is the right language for asserting axioms and ontological relationships. In the early days we chose a reliance on SKOS for the UMBEL reference concept ontology, because of UMBEL’s natural role as a knowledge structure. Most recently we also migrated UMBEL to OWL 2 to gain (among other reasons) the metamodeling advantages of “punning”, which allows us to treat things as either classes or instances depending on modeling needs, context and viewpoint [2].

But — until today — we could not get SKOS and OWL 2 to play together nicely. This meant we could not take full advantage of each language’s respective strengths.

Happily, that gap has now been closed. By action of a relatively minor change and the addition of a simple statement, SKOS more fully interoperates with OWL 2."
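
As a rough illustration of the "punning" idea mentioned in the excerpt above, here is a minimal sketch in plain Python. It is not taken from the linked post and does not use any RDF library: the triples are ordinary tuples, and the `ex:` names (`ex:Mammal`, `ex:Dog`, `ex:Rex`) and the helper functions are hypothetical, chosen only for illustration. Only `rdf:type`, `rdfs:subClassOf`, `owl:Class`, `skos:Concept` and `skos:prefLabel` are real vocabulary terms. The point is simply that one identifier can be typed both as an OWL class and as a SKOS concept (an individual).

```python
# Illustrative stand-in for an RDF graph: a set of (subject, predicate, object) tuples.
# The ex: identifiers are hypothetical examples, not from UMBEL or the linked post.
triples = {
    ("ex:Mammal", "rdf:type", "owl:Class"),       # ex:Mammal used as a class
    ("ex:Mammal", "rdf:type", "skos:Concept"),    # same IRI used as an individual
    ("ex:Mammal", "skos:prefLabel", "Mammal"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),   # class-level statement
    ("ex:Rex", "rdf:type", "ex:Dog"),             # ordinary instance
}


def types_of(subject, graph):
    """Return the sorted rdf:type values asserted for a subject."""
    return sorted(o for s, p, o in graph if s == subject and p == "rdf:type")


def is_punned(subject, graph):
    """True when an IRI is typed as owl:Class and also carries a non-class type."""
    kinds = types_of(subject, graph)
    return "owl:Class" in kinds and any(k != "owl:Class" for k in kinds)


if __name__ == "__main__":
    for name in ("ex:Mammal", "ex:Rex"):
        print(name, types_of(name, triples), "punned:", is_punned(name, triples))
```

In this toy graph only `ex:Mammal` is reported as punned, because it is the only identifier that appears both as a class and as a SKOS concept. A real ontology would be checked with an OWL 2 reasoner rather than a set of tuples.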

Library Management Services in the Cloud [OCLC]

Library Management Services in the Cloud [OCLC]: "Presented at ALA Midwinter 2011, January 9, 2011

Listen as early members of the user community explain why they chose OCLC Web-scale Management Services and share their progress to-date.

Introduction [28 minutes]
Jay Jordan, OCLC President and CEO, and Andrew K. Pace, Executive Director of Networked Services
Jason Griffey [9 minutes]
Head of Library Information Technology, The University of Tennessee at Chattanooga
Jackie Beach [13 minutes]
Director, CPC Regional Libraries (North Carolina)
Michael Dula [12 minutes]
Director for Digital Initiatives and Technology Strategy, Pepperdine University Libraries
Discussion [17 minutes]"

06 February 2011

Open Knowledge Foundation Blog » Blog Archive » Open Public Data: Then What? – Part 2

Open Knowledge Foundation Blog » Blog Archive » Open Public Data: Then What? – Part 2: "One may believe that one of the three scenarios for the future of Open Public Data that I discussed in my previous post is more likely than the other. The problem is, why? What actions, decisions, or conditions, are more likely to get us going along one road rather than the other? Can we go wrong on one count, and right on another? I believe we have hardly begun to figure that out."

Open Knowledge Foundation Blog » Blog Archive » Open Public Data: Then What? – Part 1

Open Knowledge Foundation Blog » Blog Archive » Open Public Data: Then What? – Part 1: "We tend to assume that the opening up of public data will only produce positive outcomes for individuals, for society and the economy. But the opposite may be true. We should start thinking further ahead on the possible consequences of releasing public data, and how we can make sure they are mostly positive."

wg/humanities - Open Knowledge Foundation Wiki

wg/humanities - Open Knowledge Foundation Wiki: "Working Group on Open Resources in the Humanities
Purpose
Act as a central point of reference and support for people interested in open resources in humanities research and teaching.
Possible Projects
A list of free/open source software tools for facilitating research and teaching in the humanities (including and linking to existing directories).
Maintaining a registry of collections of public domain and open access humanities resources on CKAN.
Guide to using structured text formats when publishing or making available textual databases or other resources.
Guide on best practices for using licenses and other legal tools.
Created: 2008-06-30"

DigitalHumanities - Open Knowledge Foundation Wiki

Open Knowledge Foundation Blog » Blog Archive » Europe’s Energy: a new mini-app to put the European energy targets into context

Open Knowledge Foundation Blog » Blog Archive » Europe’s Energy: a new mini-app to put the European energy targets into context: "The application aims to help to put European energy policy (including the 2020 energy targets) into context, building on the work we did at the Eurostat Hackday in London just before Christmas."

Nodalities » Blog Archive » Linked Spending Data – How and Why Bother Pt2

Nodalities » Blog Archive » Linked Spending Data – How and Why Bother Pt2: "To help with this I am going to use some of the excellent work that Stuart Harrison at Lichfield District Council has done in this area as examples. Take a look at the spending data part of their site: spending.lichfielddc.gov.uk/. On the surface, navigating your way around the site looking at council spend by type, subject, month, and supplier is the kind of experience a user would expect. Great for a website displaying information about a single council."

Resource Discovery Taskforce

Resource Discovery Taskforce: "We realise that the scale and complexity of the RDTF Vision work requires a robust management framework. Over the next eight months we’ll be working closely with Mimas to design and implement that framework. In this post Joy Palmer from Mimas shares the approach and ethos in carrying forward this work with us, and she also provides an overview of the management framework activities which will be taking place between now and July 2011. – Andy McGregor."

eFoundations: Metadata guidelines for the UK RDTF - please comment

eFoundations: Metadata guidelines for the UK RDTF - please comment: "As promised last week, our draft metadata guidelines for the UK Resource Discovery Taskforce are now available for comment in JISCPress. The guidelines are intended to apply to UK libraries, museums and archives in the context of the JISC and RLUK Resource Discovery Taskforce activity.

The comment period will last two weeks from tomorrow and we have seeded JISCPress with a small number of questions (see below) about issues that we think are particularly worth addressing. Of course, we welcome comments on all aspects of the guidelines, not just where we have raised issues. (Note that you don't have to leave public comments in JISCPress if you don't want to - an email to me or Pete will suffice. Or you can leave a comment here.)

The guidelines recommend three approaches to exposing metadata (to be used individually or in combination), referred to as:

the community formats approach;
the RDF data approach;
the Linked Data approach."

Technical standards in education, Part 1: Introducing the educational standards

Cloud and industry, Part 1: PaaS best practices and patterns

Cloud and industry, Part 1: PaaS best practices and patterns: "Summary: This article is the first part of a series on enabling cloud computing in industry solutions. This introduction covers basic cloud computing philosophy and industry solution knowledge. You will learn about the requirements and functions of three models to deliver industry solutions, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS), and how you can use best practices and patterns with the PaaS framework in particular to deploy and manage cloud computing solutions. The next articles in the series will discuss how cloud computing capabilities can be applied specifically to the chemical and petroleum and telecommunications domains."

Catalogablog: MapFAST

Catalogablog: MapFAST: "MapFAST is a mashup prototype that uses a Google Maps interface to present FAST Geographic authority records. The prototype presents a different way to look at subject access to bibliographic records. It also demonstrates a strength of the subject faceting approach of FAST over coordinated subject headings."

Semantically Enhancing Collections of Library and Non-Library Content

Semantically Enhancing Collections of Library and Non-Library Content: "Many digital libraries have not made the transition to semantic digital libraries, and often with good reason. Librarians and information technologists may not yet grasp the value of semantic mappings of bibliographic metadata, they may not have the resources to make the transition and, even if they do, semantic web tools and standards have varied in terms of maturity and performance. Selecting appropriate or reasonable classes and properties from ontologies, linking and augmenting bibliographic metadata as it is mapped to triples, data fusion and re-use, and considerations about what it means to represent this data as a graph, are all challenges librarians and information technologists face as they transition their various collections to the semantic web. This paper presents some lessons we have learned building small, focused semantic digital library collections that combine bibliographic and non-bibliographic data, based on specific topics. The tools map and augment the metadata to produce a collection of triples. We have also developed some prototype tools atop these collections which allow users to explore the content in ways that were either not possible or not easy to do with other library systems."

Making Connections Real » AI3:::Adaptive Information

Making Connections Real » AI3:::Adaptive Information: "We are only days away from releasing the first commercial version 1.00 of UMBEL (Upper Mapping and Binding Exchange Layer) [1]. To recap, UMBEL has two purposes, both aimed to promote the interoperability of Web-accessible content. First, it provides a general vocabulary of classes and predicates for describing domain ontologies and external datasets. Second, UMBEL is a coherent framework of 28,000 broad subjects and topics (the “reference concepts”), which can act as binding nodes for mapping relevant content."

hangingtogether.org » Blog Archive » The Core of Bibliographic Description

26 January 2011

When supercomputers meet the Semantic Web… - semanticweb.com

When supercomputers meet the Semantic Web… - semanticweb.com: "Despite everything that has happened over the years, from technological advances to organisational wobbles, bankruptcies and buy-outs, the name retains a certain cachet. They make super computers! Their computers come (came) with seats and bubbling coolant systems and everything! To someone growing up with early examples of rudimentary computing in the home, Cray was the stuff of Tomorrow’s World, Bond villains, and more. This was what real computers were all about."

Microformats and RDFa deployment across the Web « Tripletalk

Microformats and RDFa deployment across the Web « Tripletalk: "Microformats and RDFa deployment across the Web
Posted January 25, 2011
Filed under: Uncategorized |
I have presented on previous occasions (at Semtech 2009, SemTech 2010, and later at FIA Ghent 2010, see slides for the latter, also in ISWC 2009) some information about microformat and RDFa deployment on the Web. As such information is hard to come by, this has generated some interest from the audience. Unfortunately, Q&A time after presentations is too short to get into details, hence some additional background on how we obtained this data and what it means for the Web. This level of detail is also important to compare this with information from other sources, where things might be measured differently."

LITA Standards Task Force White Paper | ALA Connect

LITA Standards Task Force White Paper | ALA Connect: "As part of the strategic initiatives, LITA believes it needs to be an active participant in the creation and adoption of standards that align with the library technology community. The LITA Executive Committee approved the creation of a LITA Standards Task Force in March, 2010. The Task Force was charged to:

● Explore and recommend strategies and initiatives LITA can implement to become more active in the creation and adoption of new technology related standards that align with the library community.

● Propose an organizational structure that will support and sustain LITA's increased involvement in the standards arena both within ALA and beyond."

Catalogablog: Metadata Scheme for the Publication and Citation of Research Data

Catalogablog: Metadata Scheme for the Publication and Citation of Research Data: "The DataCite Metadata Scheme is a list of core metadata properties chosen for the accurate and consistent identification of data for citation and retrieval purposes, along with recommended use instructions. At a minimum, the mandatory metadata scheme properties must be provided at the time of identifier registration. Data centres and other submitters may also choose to use the optional properties to identify their data more clearly. This metadata scheme can fulfill several key functions in support of the larger goals of DataCite. Primarily these are:"

: IFLA World Report 2010 :

: IFLA World Report 2010 :: "The World Report series is a biennial report series that reports on the state of the world in terms of freedom of access to information, freedom of expression and related issues. The reports are available online at http://www.ifla.org/en/publications/iflafaife-world-report-series and can be downloaded free of charge. The 2010 Report has been designed as a customizable interactive electronic publication and can be accessed in different formats through the maps below:"

23 January 2011

Open Knowledge Foundation Blog » Blog Archive » More library-related open data!

Open Knowledge Foundation Blog » Blog Archive » More library-related open data!: "Open Knowledge Foundation Blog
More library-related open data!
January 5th, 2009
You may have heard that lcsh.info - which explored how Library of Congress Subject Headings could be represented as a Semantic Web application - was closed down last month.

The good news is that there are now two new projects publishing library-related open data:

http://ckan.net/package/read/iconclass
http://ckan.net/package/read/hud-library-usagedata"

Open Knowledge Foundation Blog » Blog Archive » Introducing GetTheData.org: Ask and Answer Data Related Questions

Open Knowledge Foundation Blog » Blog Archive » Introducing GetTheData.org: Ask and Answer Data Related Questions: "Introducing GetTheData.org: Ask and Answer Data Related Questions
January 20th, 2011
The following post is by Tony Hirst, who has been working with Rufus Pollock of the Open Knowledge Foundation to create http://GetTheData.org/, a new question and answer site for data-related questions.

Where can I find a list of airports with their locations? Where can I find historical weather data? How do I find the county from a postcode or a state from a zipcode? How do I find a book title from its ISBN? What’s the best tool(s) for scraping data from websites? Is there a way to get RDF Linked Data in a format that you can use?"

Ebooks and Libraries: A Stream of Concerns | Information Wants To Be Free

Go To Hellman: eBook Identifier Confusion Shakes Book Industry

Go To Hellman: eBook Identifier Confusion Shakes Book Industry: "The Book Industry has been experiencing tectonic shifts as it moves from the solid foundation of print-based production and distribution to digital forms. The so-called 'supply chain' is a long-standing edifice of the book industry being shaken by the resulting quakes. One of the strings holding the supply chain together is the ISBN, and it has proven to be reasonably robust. Still, there's been enough 'damage' to the ISBN and the supply chain it holds together that many participants in the book industry have been concerned for its integrity. (I wrote about the situation in July.)"

DigitalKoans » Blog Archive » Digital Curation and Preservation Bibliography, Version 2

DigitalKoans » Blog Archive » Digital Curation and Preservation Bibliography, Version 2: "Digital Curation and Preservation Bibliography, Version 2
Version 2 of the Digital Curation and Preservation Bibliography is now available from Digital Scholarship as an XHTML website with live links to many included works. This selective bibliography includes over 500 articles, books, and technical reports that are useful in understanding digital curation and preservation. All included works are in English. It is available under a Creative Commons Attribution-Noncommercial 3.0 United States License."

[1101.3186] Data Preservation in High Energy Physics

[1101.3186] Data Preservation in High Energy Physics: "Data from high-energy physics (HEP) experiments are collected with significant financial and human effort and are in many cases unique. At the same time, HEP has no coherent strategy for data preservation and re-use, and many important and complex data sets are simply lost. In a period of a few years, several important and unique experimental programs will come to an end, including those at HERA, the b-factories and at the Tevatron. An inter-experimental study group on HEP data preservation and long-term analysis (DPHEP) was formed and a series of workshops were held to investigate this issue in a systematic way. The physics case for data preservation and the preservation models established by the group are presented, as well as a description of the transverse global projects and strategies already in place."

ALCTS E-Forum on Digital Preservation | Celeripedean

ALCTS E-Forum on Digital Preservation | Celeripedean: "There was also a considerable amount of discussion around difficult file formats, especially challenges with video. It’s also really clear that a lot of people are trying to find ways to get institutional support and a sensible infrastructure for digital preservation activities. Creating relevant policies, effectively engaging administrators, and planning with few resources are clearly things all of those in this area have in common."

Personanondata: BISG eBook ISBN Study Findings Released

Personanondata: BISG eBook ISBN Study Findings Released: "BISG eBook ISBN Study Findings Released
BISG held a meeting last Thursday to review the findings from the eBook ISBN study which I conducted for the group. BISG intends to use this study as a first step in defining what the industry should do to identify eBooks and eContent for the future."

Perceptions OCLC Membership Report [OCLC]

Perceptions OCLC Membership Report [OCLC]: "OCLC releases new membership report: Perceptions of Libraries, 2010: Context and Community
DUBLIN, Ohio, USA, 20 January 2011—Americans are using libraries a lot more as the economic downturn has impacted lives, careers and incomes. Americans see increased value in libraries and the value that libraries provide to their communities, and report even stronger appreciation of the value librarians bring to the information search experience, according to a new membership report by OCLC, a nonprofit library services and research organization.

Perceptions of Libraries, 2010: Context and Community is a follow-up to the 2005 Perceptions of Libraries and Information Resources. The new report provides updated information and new insights into information consumers and their online information habits, preferences and perceptions. Particular attention was paid to how the current economic downturn has affected information-seeking behaviors and how those changes are reflected in the use and perception of libraries."

Go To Hellman: 2010 Summary: Libraries are Still Screwed

Go To Hellman: 2010 Summary: Libraries are Still Screwed: "In mathematics, catastrophe theory is the study of nonlinear dynamical systems which exhibit points or curves of singularity. The behavior of systems near such points is characterized by sudden and dramatic changes resulting from even very small perturbations. The simplest sort of catastrophe is the fold catastrophe."

19 January 2011

The State of Open Data in Europe

UNCHARTERED WATERS

The State of Open Data in Europe

Alexander Schellong, Ekaterina Stepanets

Opening up government data to the public has been part of the European policy agenda since the introduction of the PSI directive in 2003. European Member States continue to lean towards a cautious approach of making their data available to citizens. This is partly caused by conflicting legal frameworks, cultural norms and the idea to recover the costs of data production. At the same time and inspired by activities in the U.S. and UK, the open data movement has emerged in many countries around the globe. They have a simple demand: Government agencies should put as much of their data online as possible in a machine-readable format so that everyone can re-use it since they were paid for by taxes. This study analyses the current state of the open data policy ecosystem and open government data offerings in nine European Member States. Since none of the countries studied currently offers a national open data portal, this study compares the statistics offices’ online data offerings. The analysis shows that they fulfill a number of open data principles but that there is still a lot of room for improvement. This study underlines that the development of data catalogues and portals should not be seen as means to an end

http://assets1.csc.com/de/downloads/CSC_policy_paper_series_01_2011_unchartered_waters_state_of_open_data_europe_English_2.pdf

17 January 2011

IBM Identifies Analytics, Cloud Computing as Key in IT Rebound

IBM Identifies Analytics, Cloud Computing as Key in IT Rebound: "New research commissioned by IBM suggests that most mid-sized companies have put the recession behind them and will be focusing on the deployment of new technologies, particularly analytics, over the next 12 months."

16 January 2011

FUMSI Article: ISKO UK's Legal Know-How Event: Organization & Semantic Analysis

FUMSI Article: ISKO UK's Legal Know-How Event: Organization & Semantic Analysis: "ISKO UK is a not-for-profit scientific/professional association with the objective of promoting research and communication in the domain of knowledge organisation. The Legal Know-How event was held with the support of the Department of Information Studies, University College London (http://www.slais.ucl.ac.uk) on 10 November 2010, attracting over 80 participants. There were six excellent presentations (slides available from http://digbig.com/5bdate) from the field of legal knowledge organisation, kicked off with a legal practitioner's viewpoint, then through practical knowledge management work, to exciting research into using ontologies and challenges for the future."

Turning the page: The future of eBooks

Turning the page: The future of eBooks: "This new study examines trends and developments in the eBooks and eReaders market in the United States, United Kingdom, the Netherlands, and Germany, and discusses major challenges and key questions for the publishing industry worldwide. It also identifies market opportunities and developments for eBooks and eReaders, and makes recommendations for publishers, traditional retailers, online retailers, and intermediaries.

Given that publishers, internet bookstores, and companies that manufacture eReaders have high expectations for the digital future of the book industry, the study asks if a new generation of eReaders may, at last, achieve the long-awaited breakthrough that lures consumers away from paper and ink."

Open Knowledge Foundation Blog » Blog Archive » Opening up linguistic data at the American National Corpus

Open Knowledge Foundation Blog » Blog Archive » Opening up linguistic data at the American National Corpus: "The American National Corpus (ANC) project is creating a collection of texts produced by native speakers of American English since 1990. Its goal is to provide at least 100 million words of contemporary language data covering a broad and representative range of genres, including but not limited to fiction, non-fiction, technical writing, newspaper, spoken transcripts of various verbal communications, as well as new genres (blogs, tweets, etc.). The project, which began in 1998, was originally motivated by three major groups: linguists, who use corpus data to study language use and change; dictionary publishers, who use large corpora to identify new vocabulary and provide examples; and computational linguists, who need very large corpora to develop robust language models—that is, to extract statistics concerning patterns of lexical, syntactic, and semantic usage—that drive natural language understanding applications such as machine translation and information search and retrieval (à la Google)."

Europeana - Europeana's Strategic Plan 2011-2015 - Europeana News - group

Europeana - Europeana's Strategic Plan 2011-2015 - Europeana News - group: "Europeana's Strategic Plan 2011-2015
January 14, 2011 10:08 AM
The Plan sets out a clear vision for the further development of Europeana. It focuses on four strategic tracks - aggregate, facilitate, distribute and engage - that will enable us to generate real value for our stakeholders.

In the light of the release this week of The New Renaissance, the Comité des Sages’ report on digital cultural heritage to the Commission, this is an opportune moment to look at Europeana’s future direction. Download the full colour version or the black and white print version of the Strategic Plan 2011-2015."

13 January 2011

VocabControl » Online Information Conference – day two

VocabControl » Online Information Conference – day two: "Linked Data in Libraries
I stayed in the Linked Data track for Day 2 of the Online Information Conference, very much enjoying Karen Coyle’s presentation on metadata standards - FRBR, FRSAR, FRAD, RDA - and Sarah Bartlett’s enthusiasm for using Linked Data to throw open bibliographic data to the world so that fascinating connections can be made. She explained that while the physical sciences have been well mapped and a number of ontologies are available, far less work has been done in the humanities. She encouraged humanities researchers to extend RDF and develop it."

Nodalities » Blog Archive » A Year of Open Government Data: Transparency, but also Innovation

Nodalities » Blog Archive » A Year of Open Government Data: Transparency, but also Innovation: "A Year of Open Government Data: Transparency, but also Innovation
12th January 2011, 05:32 pm by Zach Beauvais In: linked data
Towards the end of 2010, Wikileaks generates many headlines as it publishes information on the web, causing controversy and leading to talk about politicians hiding information from the public. Reporters and commentators express shock or admiration when telling the story of a rogue organisation making governmental information public. What has not been as mainstream is that for the past year or more, governments around the world have been doing something very similar themselves: publishing information online."

What to expect in EPUB3 - O'Reilly Radar

What to expect in EPUB3 - O'Reilly Radar: "Just as publishers are wrapping their heads — and workflows — around the current version of EPUB, a new release is scheduled for May. The EPUB3 draft is set to publish for comment later this month, giving publishers and developers their first blush at what the release will mean to them.

In the following interview, Bob Kasher, business development manager for integrated solutions at Book Masters and a member of the International Digital Publishing Forum EPUB Working Group, highlights some of the changes the new version will bring to the publishing industry. Kasher is scheduled to speak in depth on EPUB3 at February's Tools of Change for Publishing conference in New York."

Home - Yourtopia.net

Home - Yourtopia.net: "The idea: Construct a measure of social progress world-wide based on your preferences for development. Participate in a global effort to improve tracing of humanity's progress towards the Millennium Development Goals.
More about YourTopia »"

12 January 2011

Academic Bibliography data available from Acta Cryst E | Open Biblio (graphic) Projects

Academic Bibliography data available from Acta Cryst E | Open Biblio (graphic) Projects: "The bibliographic data from Acta Cryst E, a publication by the International Union of Crystallography (IUCr), has been extracted and made available with their consent.

You can find a SPARQL endpoint for the data here and the full dataset here.

I have also geocoded a number of the affiliations of the authors, plotting them on a timemap (visualising the time of publication against the location of the authors), and you can see this at this location."

OPDS Catalog 1.0

OPDS Catalog 1.0: "The Open Publication Distribution System (OPDS) Catalog format is a syndication format for electronic publications based on Atom and HTTP. OPDS Catalogs enable the aggregation, distribution, discovery, and acquisition of electronic publications. OPDS Catalogs use existing or emergent open standards and conventions, with a priority on simplicity."
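Since an OPDS Catalog is an Atom feed, a generic XML parser is enough to pull out the download links. The following is only a minimal sketch: the sample feed, the book title, and the helper name `acquisition_links` are made up for illustration, and it assumes acquisition links are marked with a `rel` value beginning with `http://opds-spec.org/acquisition`.

```python
import xml.etree.ElementTree as ET

# Atom namespace, in the {uri}tag form ElementTree expects.
ATOM = "{http://www.w3.org/2005/Atom}"
# Link relation prefix assumed here to mark acquisition (download) links.
ACQUISITION_REL = "http://opds-spec.org/acquisition"

# Hypothetical catalog feed, invented for this example.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example catalog</title>
  <entry>
    <title>Example Book</title>
    <link rel="alternate" type="text/html" href="/books/example.html"/>
    <link rel="http://opds-spec.org/acquisition"
          type="application/epub+zip"
          href="/books/example.epub"/>
  </entry>
</feed>"""


def acquisition_links(feed_xml):
    """Return (entry title, href, media type) for each acquisition link."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        for link in entry.findall(ATOM + "link"):
            # Keep only links whose relation starts with the acquisition prefix.
            if link.get("rel", "").startswith(ACQUISITION_REL):
                found.append((title, link.get("href"), link.get("type")))
    return found


if __name__ == "__main__":
    for title, href, media_type in acquisition_links(SAMPLE_FEED):
        print(title, href, media_type)
```

Matching on the prefix rather than the exact string means more specific relations built on the same base would also be picked up; that is a design choice of this sketch, not a requirement taken from the excerpt.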

http://ec.europa.eu/information_society/activities/digital_libraries/doc/reflection_group/final-report-cdS3.pdf

REPORT OF THE ‘COMITÉ DES SAGES’: REFLECTION GROUP ON BRINGING EUROPE’S CULTURAL HERITAGE ONLINE

07 January 2011

SLA Taxonomy Division - SLA Taxonomy - SLA's Wiki Spaces

SLA Taxonomy Division - SLA Taxonomy - SLA's Wiki Spaces: "Welcome to the home page for the SLA Taxonomy Division. We are very pleased to have you visit us here.
In this space you will find an increasing resource of information on Controlled Vocabularies including Taxonomies, Thesauri, Ontologies, Terminologies, and other Knowledge Organization and Classification Systems. We encourage you to contribute!"

FreePint Newsletter: 317

FreePint Newsletter: 317: "My Favourite Tipples
By Heather Hedden

Taxonomist Tipples

As a consultant and trainer in the field of taxonomies, people have often asked me what resources I would recommend for doing taxonomy work. There are various useful sites, including blogs, the collected presentations and articles of taxonomy consultancies, professional organization sites, and sites that are collections of links to publicly accessible taxonomy examples. Following are some resources that cover all the basics of taxonomies for those who, by accident or not, find themselves working as taxonomists."

Online Thesauri and Authority Files - The American Society For Indexing

OCLC and The Combined Regions announce plans to launch first Web [OCLC]

OCLC and The Combined Regions announce plans to launch first Web [OCLC]: "OCLC and The Combined Regions announce plans to launch Web-based public library national union catalogue in UK
BIRMINGHAM, UK, 06 January 2011—

New shared Web catalogue to boost visibility and usage of public library resources

OCLC and The Combined Regions (TCR) have announced plans to launch Britain's first freely accessible national public library union catalogue. Containing the bibliographic data from 80% of the UK's public libraries, the service will make it possible for Web users to simultaneously search 9 million bibliographic records and 50 million holdings.

Leveraging information already indexed in WorldCat, the world's largest online resource for finding library materials, this customised union catalogue will provide a view of holdings contributed by the 149 local authorities with a current full package subscription to UnityUK, the UK's only nationwide network for resource sharing.

The initiative will make bibliographic data more discoverable on the open Web. Indexing of WorldCat data through search engines such as Google and Yahoo! will vastly improve awareness of public library resources and drive significantly increased traffic back to local libraries."

Cloud-sourcing Research Collections Report Announcement [OCLC]

Cloud-sourcing Research Collections Report Announcement [OCLC]: "DUBLIN, Ohio, USA, 6 January 2011—This report presents findings from a year-long study designed and executed by OCLC Research, the HathiTrust, New York University's Elmer Bobst Library, and the Research Collections Access & Preservation (ReCAP) consortium, with support from The Andrew W. Mellon Foundation.

The objective of the project was to examine the feasibility of outsourcing management of low-use print books held in academic libraries to shared service providers, including large-scale print and digital repositories. The study assessed the opportunity for library space saving and cost avoidance through the systematic and intentional outsourcing of local management operations for digitized books to shared service providers and progressive downsizing of local print collections in favor of negotiated access to the digitized corpus and regionally consolidated print inventory."

04 January 2011

Catalogablog: New Vocabularies Added to LC Authorities and Vocabularies Service

Catalogablog: New Vocabularies Added to LC Authorities and Vocabularies Service: "Good news from LC: more linked data.
The Library of Congress is pleased to make available new vocabularies from its Authorities and Vocabularies web service (ID.LOC.GOV), which provides access to Library of Congress standards and vocabularies as Linked Data. The new additions include:

MARC Code List for Countries
MARC Code List for Geographic Areas
MARC Code List for Languages"

MagicTile - geometrical and topological analogues of Rubik's Cube

MagicTile - geometrical and topological analogues of Rubik's Cube: "...and many more!
This program aims to support twisty puzzles based on regular polygonal tilings having Schlafli symbols of the form {p,3} for any p>=2. That is, all regular tilings of polygons with two or more sides, where three tiles (puzzle faces) meet at a vertex. The Rubik's cube is the special case where faces are squares (p=4). The other familiar special cases are the Megaminx (p=5) and the Pyraminx (p=3), although you'll discover the last takes a slightly different form under this abstraction (akin to Jing's Pyraminx). All the other puzzles are new as far as I know, and some may be surprising, e.g. the puzzles based on digons (p=2)."
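As a rough illustration of the {p,3} family described above, the standard test on a Schläfli symbol {p,q} tells which kind of surface the tiling lives on: compare (p-2)(q-2) with 4. The function name `tiling_geometry` below is hypothetical and the snippet is not taken from MagicTile; it is only a small sketch of that rule.

```python
def tiling_geometry(p, q=3):
    """Classify the regular tiling {p,q} by comparing (p-2)(q-2) with 4."""
    if p < 2 or q < 2:
        raise ValueError("p and q must both be at least 2")
    k = (p - 2) * (q - 2)
    if k < 4:
        return "spherical"
    if k == 4:
        return "Euclidean"
    return "hyperbolic"


if __name__ == "__main__":
    # With q fixed at 3: p = 2..5 are spherical (digons, Pyraminx-like,
    # Rubik's cube, Megaminx), p = 6 is the flat hexagonal tiling,
    # and p >= 7 needs the hyperbolic plane.
    for p in range(2, 9):
        print(p, tiling_geometry(p))
```

Under this rule the cube (p=4) and the Megaminx (p=5) sit on the sphere, while every p above 6 gives a tiling that only fits in the hyperbolic plane.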

Semantic Web New Year's Resolutions - Web Science - the World of the World Wide Web Blog | Nature Publishing Group

Semantic Web New Year's Resolutions - Web Science - the World of the World Wide Web Blog | Nature Publishing Group: "Semantic Web New Year's Resolutions
Posted by James Hendler on Jan 3, 2011
I am using the New Year as an excuse to explore my Semantic Web research agenda. This past year was a great year for the Semantic Web in industrial uptake, in seeing respect grow in the government open data community, and in seeing a new intake of people at the Semantic Technologies conference in the US and related conferences overseas. (My slideshare talk on the status of the Semantic Web covers some of this)"

Peter Suber, SPARC Open Access Newsletter, 1/2/11

Peter Suber, SPARC Open Access Newsletter, 1/2/11: "Open access in 2010

The growth of OA over the past year was deep, wide, and steady. While this has been true every year since my first year-end review in 2003, the difficulty of documenting that growth with useful detail has become nearly unmanageable. In fact, this has also been true for several years. At some point --roughly now-- we'll have to accept that the OA movement is so large that annual reviews must either be sketchy or come out six months late. To cover the territory in a manageable time, I've long since dropped most new developments in open education, public-sector information, and wikis. I don't even try to list all new individual OA journals, OA repositories, or all new open-data or open-digitization projects. I'm keeping the section I added last year on the recession, since the recession continues to permeate action and policy nearly everywhere.

But with these caveats, here's a feast of the OA highlights from 2010. As always, apologies to the many projects I had to omit.

If you're in a hurry, jump to Section 10 for some highlights of the highlights."

03 January 2011

Guide to Managing IPR in Digital Repositories | Digital Curation Centre

Guide to Managing IPR in Digital Repositories | Digital Curation Centre: "The final outputs of the JISC TrustDR project, which examined the practical issues in setting up digital rights management systems (DRM) in repositories of learning objects, are now available: Managing Intellectual Property Rights (IPR) in Digital Learning Materials: A Development Pack for Institutional Repositories. Distributed under a Creative Commons License - Attribution 2.5 UK: Scotland, the pack is aimed at those who are setting up or running digital collections of learning materials that are managed at an institutional level."

Provenance XG Final Report

Provenance XG Final Report: "2. What is provenance

Provenance is too broad a term for it to be possible to have one, universal definition - like other related terms such as 'process', 'accountability', 'causality' or 'identity', we can argue about their meanings forever (and philosophers have indeed debated concepts such as identity or causality for thousands of years without converging). Our goal was to develop a working definition reflecting how the W3C Provenance Incubator Group views provenance in the context of the Web.

To develop this view, we first carried out the activities reported in the rest of this document. That is, we did not start out trying to agree on a definition of provenance but rather the group came to a shared view once we had a common background and context, based on months of discussions.

2.1 A Working Definition of Provenance

Provenance is a very broad topic that has many meanings in different contexts. The W3C Provenance Incubator Group developed a working definition of provenance on the Web:

Provenance of a resource is a record that describes entities and processes involved in producing and delivering or otherwise influencing that resource. Provenance provides a critical foundation for assessing authenticity, enabling trust, and allowing reproducibility. Provenance assertions are a form of contextual metadata and can themselves become important records with their own provenance."

W3C Provenance Incubator Group Wiki - XG Provenance Wiki

02 January 2011

2010 Gov 2.0 Year in Review - O'Reilly Radar

2010 Gov 2.0 Year in Review - O'Reilly Radar: "I recently talked with Federal News Radio anchor Chris Dorobek about Gov 2.0 in 2010 and beyond. While our conversation ranged over a wide variety of topics, it was clear afterwards that I'd missed many of the year's important stories in Gov 2.0 during the relatively short segment. I went back over hundreds of posts on Gov 2.0 at Radar and GovFresh, thousands of tweets and other year-end lists, including Govloop's year in review, Gartner's Top 10 for Government 2.0 in 2010, Bill Allison's end of year review, Andrew P. Wilson's memorables from 2010, Ellen Miller's year in Sunlight 2010, John Wonderlich's 2010 in policy and GovTwit's top Gov 2.0 stories. Following are the themes, moments and achievements that made an impact."

The Pedantic Web Group

hangingtogether.org » Blog Archive » OCLC Research 2010: Blue Ribbon Task Force on Sustainable Preservation and Access

hangingtogether.org » Blog Archive » OCLC Research 2010: Blue Ribbon Task Force on Sustainable Preservation and Access: "2010 marked the conclusion of work of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access. Formed in 2007, the Task Force was an international group convened to examine the issue of economic sustainability in a digital preservation context. Membership included experts from across the digital preservation community, including the public sector, the private sector, cultural heritage, and academia, and reflected a range of expertise, including librarians, archivists, computer scientists, and economists.

The Task Force produced two substantial reports which provide:

the first comprehensive study of the economics of digital preservation;
a clear definition of the conditions that must be met to achieve economic sustainability in a digital preservation context;
practical, actionable recommendations for achieving economic sustainability, based on detailed analysis of both the economic environment in which preservation decision-making takes place, and the attributes of digital preservation as an economic activity;
a list of priorities for near-term action;
a strong foundation to catalyze additional work on economically sustainable digital preservation."

TAI CHI Webinar Series [OCLC - Webinars]

TAI CHI Webinar Series [OCLC - Webinars]: "Technical Advances for Innovation in Cultural Heritage Institutions (TAI CHI) Webinar Series
OCLC Research, on behalf of the RLG Partnership and under the management of Senior Program Officer Roy Tennant, has launched a series of webinars to teach library staff new technology skills and educate them about new products to help increase their productivity in today's changing library, archive and museum environment. The goal of these webinars is to highlight specific innovative applications, often locally developed, that libraries, museums and archives may find effective in their own environments, as well as to teach technical staff new technologies and skills. The series, titled Technical Advances for Innovation in Cultural Heritage Institutions (TAI CHI), has two tracks:"
