Wednesday, June 13, 2007

Technique and Low-Cost Way to Make the Availability and Accessibility of Your Libraries’ Digital Assets Known Globally

If you are reading this article, chances are you or your institution is involved in the library world. Most likely, with the widespread interest in digital assets these days, you are either thinking of digitizing content or are already involved in a program of digitizing.

A topic that is at least as challenging as digitizing materials, perhaps even more so, is figuring out how to make the availability and accessibility of your digital materials known to others. It is one thing to have a valuable digital collection that is perfect for research and learning; it is another for potential users, both inside and outside your institution, to know about it.

The techniques that I am about to describe rely on freely available search engine tools, so they do not require you to purchase anything. They have been used successfully by the Ball State University Libraries and are recommended to you.

The first step to make your digital resources known to a global audience is for the various search engines, e.g., Google, Yahoo!, Windows Live Search (previously MSN), and others to “crawl” your website’s pages to index the site’s contents. The reason for alerting the search engines about your site, rather than waiting for them to eventually crawl the site, is that your resources become indexed faster. You can accomplish this notification in various ways:

· Register your site with search engines; a selective list is provided below. Some search engines do not support manual registration. The best way to find out whether you can register your site with an engine is to visit the engine’s website and look for webmaster information on its Help or About pages.
○ Google.com: www.google.com/webmasters/sitemaps
○ Yahoo.com: https://siteexplorer.search.yahoo.com/submit
○ Windows Live Search: http://search.msn.com/docs/submit.aspx
○ Ask.com: http://about.ask.com/en/docs/about/webmasters.shtml#18
· Register your site and information about it with the Open Directory Project (ODP), http://dmoz.org/add.html. Google retrieves information about a site from the ODP.

Since registering your site with various search engines does not guarantee that your newly created content will be crawled, there is more to do. In explaining this, I will focus mainly on an approach using Google Sitemap services because, in my experience, Google is the search engine that has tried the hardest to gather as much information as it can about our digital content. The approach for other search engines is similar.

A sitemap is comparable to a book’s table of contents. A table of contents lists the topics a book covers and identifies where to find each one, such as by page number. It directs readers to the exact location of the information they are looking for without requiring them to read the whole book. A sitemap works the same way.

A sitemap lists the links to the digital objects you have on your site. With this information, the search engine knows where to crawl on your website to find the digital assets that are to be indexed. Without this information, the search engine likely will not crawl the site.
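To make the idea concrete, here is a minimal sketch in Python of what such a file can look like when generated from a list of links. It assumes the sitemaps.org 0.9 XML namespace; the function name `build_sitemap` and the example URLs are hypothetical, and this is not the method used at Ball State.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace. Use whichever namespace the
# protocol page you are following specifies.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical links to two digital objects.
    example_urls = [
        "http://library.example.edu/collections/item?id=1",
        "http://library.example.edu/collections/item?id=2&view=full",
    ]
    print(build_sitemap(example_urls))
```

Each `<loc>` entry plays the role of a page number in the table of contents: it tells the crawler exactly where one digital object lives.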

Because a site’s webmaster knows the site best, Google asks webmasters to build a list of the links on their site and submit it, so that Google’s crawlers can index the digital assets as efficiently as possible. Google builds its keywords for the site after the webmaster has provided the location of the assets, which corresponds to the page number in the table of contents metaphor.

There are two ways to create a sitemap before submitting it to Google:
· Automatically, by using a sitemap generator, such as www.xml-sitemaps.com/index.php
· Manually, by building your own macros to generate and control the content of the sitemap

The latter method is more difficult, although it results in a more accurate and reliable sitemap. You must adhere to the sitemap protocol when creating a sitemap; see www.google.com/webmasters/tools/docs/en/protocol.html. You can also find a way to validate your sitemap there.
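Before submitting a hand-built sitemap, a quick local check can catch the most common mistakes. The sketch below, in Python, is not a substitute for the validation described on the protocol page. It assumes the sitemaps.org 0.9 namespace and checks only three things: the root element, the protocol's limit of 50,000 URLs per file, and that every location is an absolute http or https URL. The function name `check_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: the sitemaps.org 0.9 namespace.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50000  # per-file URL limit in the sitemap protocol


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}urlset" % SITEMAP_NS:
        problems.append("root element is not <urlset> in the expected namespace")
    locs = root.findall("{%s}url/{%s}loc" % (SITEMAP_NS, SITEMAP_NS))
    if len(locs) > MAX_URLS:
        problems.append("more than %d URLs in one file" % MAX_URLS)
    for loc in locs:
        parts = urlparse((loc.text or "").strip())
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append("not an absolute URL: %r" % loc.text)
    return problems


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="%s">'
        "<url><loc>http://library.example.edu/item/1</loc></url>"
        "<url><loc>/item/2</loc></url>"
        "</urlset>" % SITEMAP_NS
    )
    for problem in check_sitemap(sample):
        print(problem)
```

A passing check here means only that these three rules hold; the file should still be run through the validation referenced above.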

Building a sitemap for a CONTENTdm collection is possible. The time-consuming part is generating the list of URLs in a collection without copying and pasting each one by hand. The information needed to create sitemap URLs for CONTENTdm collections is stored in system files. A CONTENTdm system administrator has access to the files containing the necessary link components for each collection.

The following steps can be used to create URLs for assets (not including compound objects) in CONTENTdm collections:

· Make a copy of the CONTENTdm system file and open it in Microsoft Word
· Create two macros and name them “findDMRecord” and “makeURL”
· Copy and paste the code for the macros from www.bsu.edu/libraries/wiki/index.php?title=URL_Creator_for_Google_Sitemap
· Run the “findDMRecord” macro and change the code accordingly
· Clean up the document after the macro has completed
· Run the “makeURL” macro
· Create your sitemap based on the URLs generated in the step above
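The steps above rely on Word macros. For readers who prefer a script, the URL-building step can be sketched in Python as follows. This is an illustration rather than the Ball State macros: the URL pattern, host name, collection alias, and function name `make_urls` are all assumptions, and the sketch takes the record numbers as given instead of reading them from the CONTENTdm system file. Copy the real pattern from an item page on your own server.

```python
# Hypothetical URL pattern; replace it with the pattern your installation uses.
URL_TEMPLATE = (
    "http://digital.example.edu/cdm4/item_viewer.php"
    "?CISOROOT=/{alias}&CISOPTR={record}"
)


def make_urls(alias, record_numbers, template=URL_TEMPLATE):
    """Build one item URL per record number, skipping repeated numbers."""
    seen = set()
    urls = []
    for record in record_numbers:
        if record in seen:
            continue  # a record listed twice should appear only once
        seen.add(record)
        urls.append(template.format(alias=alias, record=record))
    return urls


if __name__ == "__main__":
    # "postcards" is a made-up collection alias; 12, 13, 15 are made-up records.
    for url in make_urls("postcards", [12, 13, 13, 15]):
        print(url)
```

The resulting list of URLs is what goes into the sitemap file, one location entry per URL.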

In my next article, I will discuss other methods that can be applied to further identify and promote your digital objects to the world. Using the available search engine tools to expand awareness of valuable, local digital assets is a low-cost, effective process.

For more information, contact P. Budi Wibowo, Ball State University Libraries’ Head of Digital Libraries and Web Services, BWibowo@bsu.edu, (765) 285-8032.
