Information Science Blog Aggregation

ISBLOGS
building a better blog database

FRANCESCA GIANNETTI

ELLIE DICKSON

FRANNY GAEDE

DARIEN LARGE

VIRGINIA TRUEHEART

Goals

✦ Pull in RSS feeds to show article snippets & other info

✦ Create a tag cloud to offer an additional entry point to the collection

✦ Create a resource for incoming and continuing iSchool students

Blog Curation
see also: sisyphean tasks

✦ Many (most?) incoming students are not info science people

✦ Info science is truly multi-disciplinary, blogosphere is doubleplusbig

✦ How to find the good stuff?

✦ Get your friends to find it for you

✦ Aren’t we friends?

Populating the DB
setting the table(s)

✦ Virginia the Architect structured the database.

✦ Look at all the table definitions. Look at ’em.

‣ author
‣ blog
‣ blog_author
‣ blog_cat
‣ blog_maintainer
‣ category
‣ feed
‣ maintainer
‣ tag
‣ tag_blog

$toreturn['title'] = $article->find('title', 0)->plaintext;
$toreturn['pubDate'] = $article->find('published', 0)->plaintext;
$toreturn['link'] = $article->find('link', 0)->href;

// Prefer the entry's <summary>; fall back to <description> if it is empty.
$articletext = trim($article->find('summary', 0)->xmltext);
if ($articletext == '') {
    print "<em style='background-color:yellow;'>Could not find article text in summary; trying description</em>";
    $articletext = $article->find('description', 0)->xmltext;
}

// Strip "[...]" truncation markers, images, iframes, src attributes,
// and opening <div>/<span> tags from the snippet.
$articletext = preg_replace("/\[...\]/", "", $articletext);
$articletext = preg_replace("/<img[^>]*\/>/", "", $articletext);
$articletext = preg_replace("/<iframe[^>]*>/", "", $articletext);
$articletext = preg_replace("/src *= *'[^']*'/", "", $articletext);
$articletext = preg_replace("/<div[^>]*>/", "", $articletext);
$articletext = preg_replace("/<span[^>]*>/", "", $articletext);
$firstparapos = strpos($articletext, "</p>");

// Quote the array key: a bare text is an undefined constant in PHP.
$toreturn['text'] = $articletext;

// Free the parsed document before returning.
$html->clear();
unset($html);
return $toreturn;
}
?>

RSS & PHP
acronym bros

✦ Select items to display by blog, category or maintainer

✦ Add & modify feed URLs for blogs

✦ Retrieve & display content from blog feeds


RSS & PHP
acronym bros

✦ Retrieve info from database

✦ Check format of URL

✦ RSS vs. Atom

✦ Retrieve contents as object

✦ Get latest item from contents

✦ Parse elements

✦ Search for text of latest blog entry

✦ Perform text processing

✦ Return the information found
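
The steps above can be sketched in a few lines of PHP. This is an illustration, not the project's code: it uses PHP's built-in SimpleXML instead of the parser shown on the earlier slide, the function name `latest_entry` is made up, and it assumes the feed lists its newest entry first (common, but not guaranteed).

```php
<?php
// Hypothetical helper, not the project's code: returns title, date, link,
// and a plain-text snippet for the first entry of an RSS 2.0 or Atom feed.
function latest_entry(string $xml): ?array
{
    libxml_use_internal_errors(true); // keep malformed feeds from emitting warnings
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        return null; // not well-formed XML
    }

    // Check the format: RSS 2.0 nests <item> under <rss><channel>,
    // while Atom puts <entry> directly under <feed>.
    $root = $feed->getName();
    if ($root === 'rss' && isset($feed->channel->item[0])) {
        $item = $feed->channel->item[0];
        $fields = [
            'title'   => (string) $item->title,
            'pubDate' => (string) $item->pubDate,
            'link'    => (string) $item->link,
            'text'    => (string) $item->description,
        ];
    } elseif ($root === 'feed' && isset($feed->entry[0])) {
        $entry = $feed->entry[0];
        // Atom entries may carry <summary>, <content>, or both.
        $text = (string) $entry->summary;
        if (trim($text) === '') {
            $text = (string) $entry->content;
        }
        $fields = [
            'title'   => (string) $entry->title,
            'pubDate' => (string) $entry->published,
            'link'    => (string) $entry->link['href'],
            'text'    => $text,
        ];
    } else {
        return null; // unknown format or empty feed
    }

    // Text processing: drop markup, decode entities, collapse whitespace.
    $plain = html_entity_decode(strip_tags($fields['text']), ENT_QUOTES, 'UTF-8');
    $fields['text'] = trim(preg_replace('/\s+/', ' ', $plain));
    return $fields;
}

// Usage with a made-up, in-memory Atom feed (no network access needed).
$demo = '<feed xmlns="http://www.w3.org/2005/Atom"><entry>'
      . '<title>Hello</title><link href="http://example.org/1"/>'
      . '<published>2011-05-01T00:00:00Z</published>'
      . '<summary>A short   summary</summary></entry></feed>';
print_r(latest_entry($demo));
```

Returning null for anything unparseable keeps the "RSS is not the most consistent medium" cases in one place: the caller decides whether to skip the blog or show a fallback.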

http://herbie.ischool.utexas.edu/isblogs/feedscode.html


Tag Cloud
chance of rain: 0%

✦ Ellie & Frankie installed and customized v-nessa.net’s PHP-based tag cloud like bosses

✦ And this is it:

http://herbie.ischool.utexas.edu/isblogs/blog_tagger2.php


http://herbie.ischool.utexas.edu/isblogs/blog_tagger.php
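
For readers who have not built one, the core of a tag cloud is just a mapping from tag counts to font sizes. The sketch below is a generic illustration, not the v-nessa.net code the team installed: the function name, the pixel range, and the sample tags and counts are all made up, and the counts are assumed to come from something like a per-tag count over the tag_blog table.

```php
<?php
// Hypothetical helper: scales each tag's font size linearly between
// $minPx and $maxPx according to how many blogs carry the tag.
function tag_cloud_sizes(array $counts, int $minPx = 12, int $maxPx = 32): array
{
    if ($counts === []) {
        return [];
    }
    $low = min($counts);
    $spread = max($counts) - $low;

    $sizes = [];
    foreach ($counts as $tag => $count) {
        // When every tag has the same count, use the midpoint size.
        $weight = $spread == 0 ? 0.5 : ($count - $low) / $spread;
        $sizes[$tag] = (int) round($minPx + $weight * ($maxPx - $minPx));
    }
    return $sizes;
}

// Usage with made-up counts: print one link per tag, sized by popularity.
$sizes = tag_cloud_sizes(['archives' => 1, 'metadata' => 3, 'libraries' => 5]);
foreach ($sizes as $tag => $px) {
    echo "<a href='?tag=" . urlencode($tag) . "' style='font-size:{$px}px'>"
        . htmlspecialchars($tag) . "</a>\n";
}
```

Linear scaling is the simplest choice; when a few tags dominate the counts, a logarithmic weight usually gives a more readable cloud.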

Putting it Together
bug squashing 4evar

✦ CSS is fun. And magic. And occasionally a pain in the ass

✦ RSS is not the most consistent medium

✦ Populating the tables through forms

✦ Backups through CRON

✦ Search

image credits
photoshop is hard

✦ Slide 5: http://www.moogo.com/blog/2010/06/15/106

✦ Slide 6: https://code.fluendo.com/elisa/trac/browser/trunk/elisa/core/plugins/data/weather/cloud.png?rev=1222




