Information Science Blog Aggregation
![Page 1: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/1.jpg)
ISBLOGS
building a better blog database
FRANCESCA GIANNETTI
ELLIE DICKSON
FRANNY GAEDE
DARIEN LARGE
VIRGINIA TRUEHEART
![Page 2: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/2.jpg)
Goals
✦ Pull in RSS feeds to show article snippets & other info
✦ Create a tag cloud to offer an additional entry point to the collection
Create a resource for incoming and continuing iSchool students
![Page 3: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/3.jpg)
Blog Curation
see also: sisyphean tasks
✦ Many (most?) incoming students are not info science people
✦ Info science is truly multi-disciplinary, blogosphere is doubleplusbig
✦ How to find the good stuff?
✦ Get your friends to find it for you
✦ Aren’t we friends?
![Page 4: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/4.jpg)
Populating the DB
setting the table(s)
✦ Virginia the Architect structured the database.
✦ Look at all the table definitions. Look at ‘em.
‣ author
‣ blog
‣ blog_author
‣ blog_cat
‣ blog_maintainer
‣ category
‣ feed
‣ maintainer
‣ tag
‣ tag_blog
![Page 5: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/5.jpg)
// Excerpt: the opening lines of the enclosing function are not shown on the slide.
// $article and $html are set earlier in that function.
$toreturn['title'] = $article->find('title', 0)->plaintext;
$toreturn['pubDate'] = $article->find('published', 0)->plaintext;
$toreturn['link'] = $article->find('link', 0)->href;

// Prefer the entry's <summary>; fall back to <description> if it is empty.
$articletext = trim($article->find('summary', 0)->xmltext);
if ($articletext == '') {
    print "<em style='background-color:yellow;'>Could not find article in summary; trying description</em>";
    $articletext = $article->find('description', 0)->xmltext;
}

// Remove literal "[...]" truncation markers (the dots must be escaped in the pattern).
$articletext = preg_replace("/\[\.\.\.\]/", "", $articletext);
// Remove images, iframes, src attributes, and opening <div>/<span> tags.
$articletext = preg_replace("/<img[^>]*\/>/", "", $articletext);
$articletext = preg_replace("/<iframe[^>]*>/", "", $articletext);
$articletext = preg_replace("/src *= *'[^']*'/", "", $articletext);
$articletext = preg_replace("/<div[^>]*>/", "", $articletext);
$articletext = preg_replace("/<span[^>]*>/", "", $articletext);
$firstparapos = strpos($articletext, "</p>"); // end of the first paragraph (not used in this excerpt)

$toreturn['text'] = $articletext; // the key must be quoted
$html->clear();
unset($html);
return $toreturn;
}
?>
RSS & PHP
acronym bros
✦ Select items to display by blog, category or maintainer
✦ Add & modify feed URLs for blogs
✦ Retrieve & display content from blog feeds
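The code excerpt computes `$firstparapos` but does not show it being used. One plausible use, given the goal of showing article snippets, is to cut the cleaned text down to its first paragraph. The sketch below assumes exactly that; `first_paragraph_snippet()` is a hypothetical helper name, not something taken from the slides.

```php
<?php
// Hypothetical helper (not from the slides): keep only the first paragraph
// of an article body so it can be displayed as a short snippet.
function first_paragraph_snippet(string $articletext): string
{
    $articletext = trim($articletext);
    $firstparapos = strpos($articletext, "</p>");
    if ($firstparapos === false) {
        // No closing </p> found: fall back to the whole text.
        return $articletext;
    }
    // Keep the closing tag itself so the markup stays balanced.
    return substr($articletext, 0, $firstparapos + strlen("</p>"));
}

$sample = "<p>First paragraph.</p><p>Second paragraph.</p>";
echo first_paragraph_snippet($sample), "\n";
```

Note the strict `=== false` check: `strpos()` returns `0` when the match is at the very start of the string, which a loose comparison would confuse with "not found".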
![Page 6: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/6.jpg)
RSS & PHP
acronym bros
✦ Retrieve info from database
✦ Check format of URL
✦ RSS vs. Atom
✦ Retrieve contents as object
✦ Get latest item from contents
✦ Parse elements
✦ Search for text of latest blog entry
✦ Perform text processing
✦ Return the information found
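The steps listed above (minus the database lookup) can be sketched roughly as follows. This is not the project's code: it uses PHP's built-in SimpleXML extension rather than the parser seen in the excerpt, `latest_entry()` and its array keys are made-up names, and the text processing is simplified to stripping all tags.

```php
<?php
// Sketch only: latest_entry() and the array keys it returns are illustrative
// names, not taken from the project's code.
function latest_entry(string $xml): ?array
{
    libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        return null; // not well-formed XML
    }

    // RSS vs. Atom: RSS 2.0 has an <rss> root element, Atom has <feed>.
    $root = $feed->getName();
    if ($root === 'rss') {
        $item = $feed->channel->item[0]; // first item; feeds usually list newest first
        if ($item === null) {
            return null;
        }
        $found = [
            'title'   => (string) $item->title,
            'pubDate' => (string) $item->pubDate,
            'link'    => (string) $item->link,
            'text'    => (string) $item->description,
        ];
    } elseif ($root === 'feed') {
        $entry = $feed->entry[0];
        if ($entry === null) {
            return null;
        }
        $found = [
            'title'   => (string) $entry->title,
            'pubDate' => (string) $entry->published,
            'link'    => (string) $entry->link['href'], // Atom keeps the URL in an attribute
            'text'    => (string) $entry->summary,
        ];
    } else {
        return null; // neither RSS 2.0 nor Atom
    }

    // Text processing, simplified: drop all markup and "[...]" truncation markers.
    $found['text'] = trim(preg_replace('/\[\.\.\.\]/', '', strip_tags($found['text'])));
    $found['format'] = $root === 'rss' ? 'RSS' : 'Atom';
    return $found;
}

// Made-up sample feed for illustration.
$atom = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry>
    <title>Newest post</title>
    <link href="http://example.com/newest"/>
    <published>2012-01-02T15:04:05Z</published>
    <summary type="html">&lt;p&gt;Snippet text [...]&lt;/p&gt;</summary>
  </entry>
</feed>
XML;

print_r(latest_entry($atom));
```

Returning `null` for anything that is not recognizably RSS or Atom lets the caller skip a broken feed instead of displaying an empty entry.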
![Page 7: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/7.jpg)
Tag Cloud
chance of rain: 0%
✦ Ellie & Frankie installed and customized v-nessa.net’s PHP-based tag cloud like bosses
✦ And this is it:
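For readers who have not built one, a tag cloud boils down to scaling each tag's font size by how often the tag is used. The sketch below shows that idea with simple linear scaling. It is not v-nessa.net's script: `tag_font_sizes()`, the pixel range, and the sample tags are all assumptions made for illustration.

```php
<?php
// Illustrative only: tag_font_sizes() is a hypothetical helper, not the
// tag cloud script the team installed.
function tag_font_sizes(array $counts, int $minPx = 12, int $maxPx = 32): array
{
    if ($counts === []) {
        return [];
    }
    $lo = min($counts);
    $hi = max($counts);
    $spread = max($hi - $lo, 1); // avoid dividing by zero when all counts are equal

    $sizes = [];
    foreach ($counts as $tag => $count) {
        // Linear interpolation between the smallest and largest font size.
        $sizes[$tag] = (int) round($minPx + ($count - $lo) / $spread * ($maxPx - $minPx));
    }
    return $sizes;
}

// Made-up tag counts (tag => number of blogs carrying it).
$sizes = tag_font_sizes(['archives' => 1, 'metadata' => 3, 'libraries' => 5]);
foreach ($sizes as $tag => $px) {
    echo "<span style='font-size:{$px}px'>" . htmlspecialchars($tag) . "</span>\n";
}
```

In a schema like the one described earlier, the counts would presumably come from counting rows per tag in `tag_blog`, though the slides do not show that query.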
![Page 8: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/8.jpg)
Putting it Together
bug squashing 4evar
✦ CSS is fun. And magic. And occasionally a pain in the ass
✦ RSS is not the most consistent medium
✦ Populating the tables through forms
✦ Backups through CRON
✦ Search
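The slide does not say which inconsistencies the team ran into. One well-known example is dates: RSS 2.0 uses RFC 822 style dates while Atom uses RFC 3339 timestamps. A minimal sketch of normalizing both to one format, assuming a hypothetical `normalize_feed_date()` helper:

```php
<?php
// Hypothetical helper (not from the slides): turn an RSS or Atom date string
// into a single UTC "Y-m-d H:i:s" format, or null if it cannot be parsed.
function normalize_feed_date(string $raw): ?string
{
    $raw = trim($raw);
    if ($raw === '') {
        return null; // an empty string would otherwise be parsed as "now"
    }
    try {
        $date = new DateTimeImmutable($raw);
    } catch (Exception $e) {
        return null; // unparseable date string
    }
    return $date->setTimezone(new DateTimeZone('UTC'))->format('Y-m-d H:i:s');
}

echo normalize_feed_date('Mon, 02 Jan 2012 15:04:05 GMT'), "\n"; // RSS style
echo normalize_feed_date('2012-01-02T15:04:05Z'), "\n";          // Atom style
```

Storing one normalized format makes it straightforward to sort entries from different feeds by date.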
![Page 9: Information Science Blog Aggregation](https://reader038.fdocuments.net/reader038/viewer/2022100602/5586d99bd8b42a5c718b4577/html5/thumbnails/9.jpg)
image credits
photoshop is hard
✦ Slide 5: http://www.moogo.com/blog/2010/06/15/106
✦ Slide 6: https://code.fluendo.com/elisa/trac/browser/trunk/elisa/core/plugins/data/weather/cloud.png?rev=1222