Dec 31 2006
 

I love OpenSearch.  It’s one of those things I’ve been wanting to spend more time with, maybe incorporating it into DSpace or some of our other services like LibraryFind (which is now actually on the todo list).  Anyway, folks may not know it, but Kyle Banerjee and I are writing a book: a how-to of sorts for folks doing digital repositories.  I’ve been lights out for most of December, cranking out 4 finished and 1 1/2 nearly completed chapters.  So far so good.  One part of the book deals with exposing resources to larger audiences, and a discussion of OpenSearch falls into that section.  As I was looking through the specification this afternoon, I thought, wow, this would be easy to implement just about everywhere.  So I took half an hour and quickly whipped up some code that integrates OpenSearch into CONTENTdm.  I’ll post the code shortly.  What I thought was cool, though, is the number of applications that have embraced OpenSearch as a query method.  IE 7, for example, uses OpenSearch to query its search providers.  This means that by adding an OpenSearch server to my CONTENTdm instance, I can instantly add this resource as a search target in IE 7.
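
For readers who haven’t seen one, a client such as IE 7 learns about a search target from an OpenSearch description document. The sketch below shows roughly what generating one in PHP could look like. It is not the CONTENTdm integration mentioned above: the function name buildOpenSearchDescription(), the example.org server and the opensearch.php script path are all made up for illustration.

```php
<?php
// Illustrative sketch only: builds an OpenSearch 1.1 description document.
// buildOpenSearchDescription() and the URLs passed to it are hypothetical.
function buildOpenSearchDescription(string $shortName, string $description, string $template): string
{
    // Escape text so it is safe inside XML elements and attributes.
    $esc = function (string $s): string {
        return htmlspecialchars($s, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };

    return '<?xml version="1.0" encoding="UTF-8"?>' . "\n" .
        '<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">' . "\n" .
        '  <ShortName>' . $esc($shortName) . '</ShortName>' . "\n" .
        '  <Description>' . $esc($description) . '</Description>' . "\n" .
        '  <Url type="application/rss+xml" template="' . $esc($template) . '"/>' . "\n" .
        '</OpenSearchDescription>' . "\n";
}

// {searchTerms} is the placeholder the client replaces with the user's query.
echo buildOpenSearchDescription(
    'CONTENTdm',
    'Search the digital collections',
    'http://example.org/opensearch.php?q={searchTerms}&format=rss'
);
```

The client fetches this document once, then substitutes the query into the template URL each time someone searches.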

 

 

As I mentioned, writing the code took about 30 minutes and was much easier than I’d anticipated.  Given how quickly OpenSearch has caught on outside the library community (I was surprised at how many applications and services support it) and how simple it is to implement, I’m thinking that it’s almost crazy not to spend the time to integrate the protocol into our organization’s services, if only to give developers outside the library community a straighter line for service integration.

 

–TR 

 Posted by at 3:59 am
Sep 29 2006
 

At OSU, we’ve played with tagging in CONTENTdm on and off and finally decided to just take it live.  For those that use CONTENTdm, I’ve created a small document that discusses how this works and what it looks like.  As I said, it’s a simple implementation at this point, but if use takes off, I’ll look to add things like tag clouds, integrated search results, etc.  This won’t be interesting to anyone but folks using CONTENTdm.  Sorry.

Here’s a link to the document: CONTENTdm_Tagging.doc

 

–TR

 Posted by at 11:16 am
Aug 30 2006
 

Updated: Fixed a couple of typos below

Updated two: Thanks to Josh Kline for pointing out that the PubDate wasn’t RFC 822 compliant. This has been updated.
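
For reference, here is a minimal sketch of producing an RFC 822 date from an OAI datestamp in PHP. It assumes the datestamp is a UTC date or date-time, as OAI-PMH specifies, and the helper name oaiDateToRfc822() is made up for illustration; it is not part of the plugin.

```php
<?php
// Sketch: convert an OAI datestamp to the RFC 822 form that RSS 2.0 expects.
// oaiDateToRfc822() is a hypothetical helper, not part of the plugin.
function oaiDateToRfc822(string $datestamp): string
{
    $utc = new DateTimeZone('UTC');
    // Date-only stamps are read as midnight UTC; a trailing "Z" is honored as UTC.
    $when = new DateTimeImmutable($datestamp, $utc);
    // DATE_RSS is PHP's built-in "D, d M Y H:i:s O" format string.
    return $when->setTimezone($utc)->format(DATE_RSS);
}

echo oaiDateToRfc822('2006-08-30'), "\n";
echo oaiDateToRfc822('2006-08-28T15:48:41Z'), "\n";
```

Parsing in an explicit UTC zone keeps the result from shifting with the server’s default timezone.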

About a year ago, I created an RSS generator for CONTENTdm.  At the time, CONTENTdm really didn’t have an API that could be worked with, so in building the generator, I created a Perl script that would simply ping the server periodically and report back changes.  This has been working fine, even with the newer 4.0 interfaces, but a few things had broken (and who has the time to fix everything?).  :) 

Anyway, over the past month, I have been updating all my older tools, documenting new ones and getting things posted onto my CONTENTdm projects website (http://oregonstate.edu/~reeset/contentdm).  A number of the new tools that I’ve been creating relate to social software.  For example, I’ve created a commenting and tagging plugin for CONTENTdm (which I’ll likely post about once I finish documenting it), updated this RSS feed and finally added LDAP authentication support to our CDM interface (though I wish DiMeMa allowed more customization of the administration interfaces so I could integrate this better [and again, I’ll post a code snippet once the docs are completed]). 

The new RSS plugin is written entirely in PHP (to match the rest of the CONTENTdm files) and can generate feeds for the entire server or for individual collections.  The plugin uses the OAI server to extract information and then reformats it for delivery in an RSS 2.0 wrapper.  Here is an example of what this looks like:
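
In rough terms, the OAI-to-RSS mapping works along these lines. The sketch below is not the plugin itself: the sample record, the example.org URLs and the function oaiRecordToRssItem() are invented for illustration, and it uses DOMXPath where the plugin walks the parsed tag list.

```php
<?php
// Sketch only: map one OAI-PMH Dublin Core record to an RSS 2.0 <item>.
// The sample record and oaiRecordToRssItem() are hypothetical.
$sample = <<<XML
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:photos/12</identifier>
    <datestamp>2006-08-30</datestamp>
    <setSpec>photos</setSpec>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Barn &amp; silo</dc:title>
      <dc:creator>Unknown</dc:creator>
      <dc:identifier>http://example.org/u?/photos,12</dc:identifier>
    </oai_dc:dc>
  </metadata>
</record>
XML;

function oaiRecordToRssItem(string $recordXml): string
{
    $dom = new DOMDocument();
    $dom->loadXML($recordXml);
    $xp = new DOMXPath($dom);
    $xp->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

    // Return the text of the first node matching $query, escaped for XML.
    $text = function (string $query) use ($xp): string {
        $value = $xp->evaluate('string((' . $query . ')[1])');
        return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };

    return "<item>\n" .
        '<title>' . $text('//dc:title') . "</title>\n" .
        '<link>' . $text('//dc:identifier') . "</link>\n" .
        '<dc:creator>' . $text('//dc:creator') . "</dc:creator>\n" .
        "</item>\n";
}

echo oaiRecordToRssItem($sample);
```

Each Dublin Core field lands in the RSS element closest to it in meaning: dc:title becomes the item title and dc:identifier becomes the link.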

For the most part, the plugin requires no changes to current CONTENTdm interfaces other than the presence of a link to the feed.  We’ve done the following at our main collection page: http://digitalcollections.library.oregonstate.edu/
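
One common way to advertise a feed is an autodiscovery link in the page head. The sketch below shows how such a tag could be generated; it is an assumption about how a site might do it, not a description of the OSU page. The function feedLinkTag() and the rss.php script name are hypothetical, while the set query parameter matches the one the plugin reads.

```php
<?php
// Sketch: emit an RSS autodiscovery <link> for a collection page.
// feedLinkTag() and the rss.php script name are hypothetical.
function feedLinkTag(string $baseUrl, string $set = '', string $title = 'RSS feed'): string
{
    $href = rtrim($baseUrl, '/') . '/rss.php';
    if ($set !== '') {
        // Limit the feed to one collection when a set is given.
        $href .= '?set=' . rawurlencode($set);
    }
    return '<link rel="alternate" type="application/rss+xml" title="' .
        htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '" href="' .
        htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '" />';
}

echo feedLinkTag('http://example.org/', 'photos', 'New photos'), "\n";
```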

Now the code — drop dead easy. 

Code:

<?php
  /*
   * Terry Reese
   * Modified: August 30, 2006
   *
   * Changes:
   *    Updated PubDate to correct date format.  Thanks to Josh Kline for pointing this out
   */
  define("CONTENTdmPath", "/usr/local/Content/docs/");
  define("DMSCRIPTS", "dmscripts/");
  define("BaseURL", "http://" . $_SERVER['SERVER_NAME'] . "/");
  define("OAIURL", BaseURL . "cgi-bin/oai.exe?verb=ListRecords&metadataPrefix=oai_dc{set}&from={start}&until={end}");
  define("DEF_TITLE", "OSU CONTENTdm Image Collection");
  include(CONTENTdmPath . DMSCRIPTS . "DMSystem.php");
  
  if (isset($_GET['set'])) { $set = $_GET['set']; } else { $set = "";}

  class RSS {
     function header ($title,$link) {
        $string = '<?xml version="1.0" encoding="UTF-8"?>' .  "\n" .
		  '<rss version="2.0"' . "\n" .
		  'xmlns:content="http://purl.org/rss/1.0/modules/content/"' . "\n" . 
	 	  'xmlns:wfw="http://wellformedweb.org/CommentAPI/"' . "\n" . 
		  'xmlns:dc="http://purl.org/dc/elements/1.1/">' . "\n" . 
	          '<channel>' . "\n" . 
		  '<title>'.$title.'</title>' . "\n" . 
		  '<link>'.$link.'</link>' . "\n" . 
		  '<description></description>' . "\n" . 
		  '<pubDate>' . gmdate("D, d M Y H:i:s") . ' +0000</pubDate>' . "\n" . 
		  '<language>en</language>' . "\n"; 
	 return $string;
     }

     function footer () {
	$string = "</channel>\n";
    	$string .= "</rss>";
	return $string;
     }

     function buildItem($DCValues) {
  	$string = "<item>\n" . 
		  "<title>".$DCValues["title"]."</title>\n" . 
		  "<link>".$DCValues["identifier"]."</link>\n" .
		  "<pubDate>". date("D, d M Y",  strtotime($DCValues["datestamp"])) . " 00:00:00 +0000</pubDate>\n" . 
		  "<dc:creator>".$DCValues["creator"]."</dc:creator>\n";  
	if (strlen($DCValues["description"])>255) {
		$string .= "<description><![CDATA[" . substr($DCValues["description"], 0, 255) . "[...]]]></description>\n";
        } else {
		$string .= "<description><![CDATA[" . $DCValues["description"] .  "]]></description>\n";
	}
	$string .= "<content:encoded><![CDATA[" . $DCValues["description"] . "]]></content:encoded>\n";
	$string .= "</item>\n";   
        return $string;
     } 

     function encodeDescription($set, $description, $subjects,  $uri) {
	$tarr = explode("/", $uri);
	$parr = explode(",", $tarr[count($tarr)-1]);
	$ptr = $parr[1];
	$set = $parr[0];
        $string = "<img src=\"" . BaseURL .  "cgi-bin/getimage.exe?CISOROOT=/" . $set . "&CISOPTR=" . $ptr . "&DMSCALE=10.5&DMWIDTH=250&DMHEIGHT=250\" border=\"0\" />";
	$string .= "<p>" . $description . "<br /><br />\n" . 
		   "Subjects: " . $subjects . "<br />\n" .
		   "<a href=\"" . $uri . "\">Get MetaData</a></p>";
	return $string;
      }
  } 
 
  $objRSS = new RSS;
 
  if ($set!="") {
     $oai_url = str_replace("{set}", "&set=" . urlencode($set), OAIURL);
  } else {
     $oai_url = str_replace("{set}", "", OAIURL);
  }

  $date = date("m");
  $year = date("Y");

  $oai_url = str_replace("{start}", $year . "-" . sprintf("%02d", $date) . "-01", $oai_url);
  if ($date == "12") {
     $year = intval(date("Y")) + 1;
     $date = "01";
  } else {
     $date = intval(date("m")) + 1;
  }
  $oai_url = str_replace("{end}", $year . "-" . sprintf("%02d", $date) . "-01", $oai_url);

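  $_xml = file_get_contents($oai_url); 
  if ($_xml === false) {
     // Stop early if the OAI server cannot be reached rather than emit an empty feed
     header("HTTP/1.1 502 Bad Gateway");
     exit("Unable to retrieve records from the OAI server.");
  }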
  $p = xml_parser_create();
  xml_parse_into_struct($p, $_xml, $vals);
  xml_parser_free($p);

  $dc = array();
  $dc['title'] = "";
  $dc['setspec'] = "";
  $dc['identifier'] = "";
  $dc['subject'] = "";
  $dc['creator'] = "";
  $dc['description'] = "";
  $dc['datestamp'] = "";
 
  if ($set=="") {
	$coll_title = DEF_TITLE;
	$coll_link = BaseURL;
  } else {
        $rc = dmGetCollectionParameters("/" . $set, $coll_title, $path);
 	$coll_link = BaseURL . $set;
  }
  header("Content-Type: text/xml; charset=UTF-8");
  print $objRSS->header($coll_title, $coll_link);
  foreach ($vals as $tag) {
     if ($tag['type'] == 'complete') {
	if ($tag['tag']=='DC:TITLE' && $dc['title']=="") {
	   $dc['title'] = htmlspecialchars($tag['value']);
	} else if ($tag['tag'] == 'DC:IDENTIFIER') {
	   $dc['identifier'] = $tag['value'];
	} else if ($tag['tag'] == 'SETSPEC' && $dc['setspec'] == "") {
	   $dc['setspec'] = $tag['value'];
	} else if ($tag['tag'] == 'DATESTAMP' && $dc['datestamp'] == "") {
	   $dc['datestamp'] = $tag['value'];
  	} else if ($tag['tag'] == 'DC:DESCRIPTION' && $dc['description'] == "") {
	   $dc['description'] = $tag['value'];
	} else if ($tag['tag'] == 'DC:CREATOR' && $dc['creator'] == "") {
	   $dc['creator'] = htmlspecialchars($tag['value']);
	} else if ($tag['tag'] == 'DC:SUBJECT' && $dc['subject'] == "") {
	   $dc['subject'] = htmlspecialchars($tag['value']);
	}
     } else if ($tag['type']=='close' && $tag['tag']=='RECORD') {
	$dc['description'] = $objRSS->encodeDescription($set, $dc['description'], $dc['subject'],  $dc['identifier']);
	print $objRSS->buildItem($dc);
	$dc['title'] = "";
	$dc['setspec'] = "";
	$dc['identifier'] = "";
	$dc['subject'] = "";
	$dc['creator'] = "";
	$dc['description'] = "";
	$dc['datestamp'] = "";
     }
  }
  print $objRSS->footer();
  
?>

And that’s it.
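
As a footnote, the month-window arithmetic in the script (first of this month through first of next month, with the December rollover) could also be written with PHP’s date objects. This is only a sketch of an alternative, and monthWindow() is a made-up helper name, not part of the plugin.

```php
<?php
// Sketch: compute the OAI from/until window for the month containing $now.
// monthWindow() is a hypothetical helper, not part of the plugin above.
function monthWindow(DateTimeImmutable $now): array
{
    $start = $now->modify('first day of this month');
    // "first day of next month" handles the December-to-January rollover.
    $end = $start->modify('first day of next month');
    return [$start->format('Y-m-d'), $end->format('Y-m-d')];
}

list($from, $until) = monthWindow(new DateTimeImmutable());
echo "from=$from&until=$until\n";
```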

TR

 Posted by at 3:58 pm