Greenstone Digital Library

imraninflibnet 17,477 views 33 slides May 03, 2011


Greenstone Digital Library Software:
An Overview
Imran Mansuri
Project Assistant (Library Science)
INFLIBNET Centre
7 March 2011 Prepared by Imran Mansuri

Agenda
•Introduction : Digital Library Software (DL)
•Greenstone Digital Library Software (GSDL)
•Introduction
•History
•Versions
•Features
•Unique Features
•Technology used
•Example Sites
•Example Collections

Digital Library Software
•The term “Digital Library” refers to a library in
which collections are stored in digital formats
(as opposed to print, microform, or other
media) and are accessible by computers
•The digital content may be stored locally or
accessed remotely via computer networks
•Books and images are accessed in digital format
•Using the Net, information can be accessed from
anywhere

Digital Libraries : Features
Dynamic Electronic Information Systems
Increase Portability
Efficiency of Access
Flexibility
Availability

Digital Library Software
DSpace
Fedora
EPrints
ResourceSpace
Greenstone

Greenstone Digital Library Software
•The Greenstone Digital Library Software (GSDL)
provides a way of building and distributing digital
library collections, opening up new possibilities
for organizing information and making it available
over the Internet or on CD-ROM
•Developed by the New Zealand Digital Library
Project (www.nzdl.org) at the University of
Waikato
•Distributed in co-operation with UNESCO and
Humanities Library Project, Romania

GSDL : Some Facts
•Current version:
2.82 and 3.03
Available from http://www.greenstone.org
•Software suite for building, maintaining, and
distributing digital library collections
•Comprehensive, open-source
•Distribution and promotion partners:
oUNESCO
oHuman Info NGO, Belgium

GSDL : History
1995 - Digital library of Computer Science
Technical Reports established by the New Zealand
Digital Library project
1997 - Decision to use the GPL (General Public
License); the name Greenstone adopted; work
with Human Info NGO to produce humanitarian
CD-ROMs
1998 Apr - First CD-ROM collection released:
Humanity Development Library
1998 Aug - Greenstone.org website established
1999 - BBC collection established

2000 Apr - Greenstone mailing list started
Aug - Formally established cooperative effort with
UNESCO and Human Info NGO
Nov - Software distributed on SourceForge
2002 Apr - Development of Greenstone3
Mar - Official opening of the Niupepa collection,
development of the Greenstone Librarian
Interface
Jun - First UNESCO Greenstone CD-ROM

2003 - A Java development that became
known as the Greenstone Librarian Interface
2005 Nov - Initial release of Greenstone3
2006 Apr - Greenstone Support Group for
South Asia launched

GSDL : Versions
2000 Feb - gsdl2.12
Apr - gsdl2.21
Dec - gsdl2.30
2001 Feb - gsdl2.31
2002 Jan - gsdl2.38
2003 Jun - gsdl2.40
2004 Feb - gsdl2.50
2005 Apr - gsdl2.60
Nov - gsdl3.00
2006 Mar - gsdl2.70
2007 Apr - gsdl2.80
2008 - gsdl3.03
Current release - gsdl2.82

GSDL : Features
Multi S/W Platform
Multi Lingual Support
Structured Metadata in XML using DC
Metadata Extraction
Plug-ins for Documents
Full-text mirroring
Text Level Penetration
Concurrent & Dynamic Content Development
Uniform Presentation
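
To make "Structured Metadata in XML using DC" concrete, here is a minimal Python sketch of a Dublin Core record serialized as XML. The `metadata` wrapper element and the helper name `dublin_core_record` are assumptions made for illustration; this is not Greenstone's own metadata file layout, only the general idea of Dublin Core elements in XML.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a <metadata> element with one dc:* child per (name, value) pair.

    The <metadata> wrapper is a hypothetical container chosen for this sketch.
    """
    record = ET.Element("metadata")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


record = dublin_core_record([
    ("title", "Greenstone Digital Library Software: An Overview"),
    ("creator", "Imran Mansuri"),
    ("date", "2011-03-07"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each Dublin Core element (title, creator, date) becomes a namespaced child element, which is what makes the metadata usable for field-based search and browsing.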

Collection Building
•Web and command line mode
•Input collections:
•GSDL server (files)
•Remote (FTP - files, HTTP - website pages)
•Collection input: batch mode, NOT interactive
•Document formats: HTML, PDF, Text, Word
(Doc, RTF), PS, e-mail, bibliographic

•Support for full text tagging for hierarchical
document browsing
•Automatic text extraction and indexing
‘Plugins’ for different document formats
(HTMLPlug, PDFPlug, etc.). May fail for some
documents!
XML representation - conversion to HTML for
display
Native document format - storage and display (via
browser plugins, helper applications)
•Data compression support
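
The plugin idea above can be sketched in a few lines of Python: each document is offered to a list of format-specific plugins, the first one that recognizes it extracts the text, and a file that no plugin handles is recorded as failed instead of stopping the batch. The functions `text_plugin`, `html_plugin`, and `import_documents` are hypothetical names for this sketch; they are not Greenstone's HTMLPlug or PDFPlug, and the tag stripping is deliberately crude.

```python
import re
from pathlib import Path


def text_plugin(path):
    """Hypothetical plugin: handles .txt files, returns None for anything else."""
    if path.suffix.lower() != ".txt":
        return None
    return path.read_text(encoding="utf-8")


def html_plugin(path):
    """Hypothetical plugin: handles .html/.htm files by stripping tags crudely."""
    if path.suffix.lower() not in {".html", ".htm"}:
        return None
    raw = path.read_text(encoding="utf-8")
    return re.sub(r"<[^>]+>", " ", raw)


# Plugins are tried in order; the first one that returns text wins.
PLUGINS = [html_plugin, text_plugin]


def import_documents(paths):
    """Return (extracted, failed): text per file name, and names nobody could process."""
    extracted, failed = {}, []
    for path in paths:
        try:
            for plugin in PLUGINS:
                text = plugin(path)
                if text is not None:
                    # Normalize whitespace so the text is ready for indexing.
                    extracted[path.name] = " ".join(text.split())
                    break
            else:
                failed.append(path.name)  # no plugin recognized this format
        except (OSError, UnicodeDecodeError):
            failed.append(path.name)  # unreadable file: record it and continue
    return extracted, failed
```

Collecting failures in a list rather than raising keeps a batch import running when a single document cannot be processed.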

•Metadata
Automatic extraction of simple metadata
(e.g. Title, date)
Explicit metadata via ‘Classifiers’
Hierarchical (e.g. Subject)
List (e.g. Organization, Author)
Used for browsing and field-based searching
Multi-language support via Unicode
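
As a rough illustration of the two classifier shapes named above, the Python sketch below builds a flat list grouping (e.g. by Author) and a hierarchical grouping (e.g. by Subject) from document metadata. The sample records, the `/` separator for subject levels, and the function names are all assumptions made for this example; Greenstone's own classifiers are configured per collection, not written this way.

```python
from collections import defaultdict

# Made-up sample metadata, for illustration only.
documents = [
    {"Title": "Soil conservation basics", "Author": "A. Rao", "Subject": "Agriculture/Soil"},
    {"Title": "Rice cultivation", "Author": "B. Singh", "Subject": "Agriculture/Crops"},
    {"Title": "Clean water at home", "Author": "A. Rao", "Subject": "Health/Water"},
]


def list_classifier(docs, field):
    """Group titles under each distinct value of `field`, sorted for browsing."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[field]].append(doc["Title"])
    return {value: sorted(titles) for value, titles in sorted(groups.items())}


def hierarchy_classifier(docs, field, sep="/"):
    """Build a nested dict from `field` values such as 'Agriculture/Soil'.

    Titles are stored under the reserved key '_docs' at the deepest level.
    """
    tree = {}
    for doc in docs:
        node = tree
        for part in doc[field].split(sep):
            node = node.setdefault(part, {})
        node.setdefault("_docs", []).append(doc["Title"])
    return tree


by_author = list_classifier(documents, "Author")
by_subject = hierarchy_classifier(documents, "Subject")
```

The list form suits fields with many unrelated values (authors, organizations), while the nested form suits subject schemes that readers drill down through.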

Collection Browse and Search
•Full text search
•Metadata (field) search and browse
•Boolean
•Ranked
•Multi-language support for browse/
search interface
•Search history, search term highlighting…
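
The difference between Boolean and ranked search can be shown with a toy inverted index in Python. This is only a sketch of the two query styles; the sample texts are made up, scoring is plain term frequency, and nothing here reflects how Greenstone's indexers are implemented.

```python
from collections import Counter, defaultdict

# Made-up sample collection, for illustration only.
docs = {
    "d1": "greenstone builds digital library collections",
    "d2": "digital library software for digital collections",
    "d3": "unesco distributes greenstone on cd-rom",
}


def build_index(docs):
    """Map each term to a Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index


def boolean_and(index, query):
    """Boolean search: documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], {}))
    for term in terms[1:]:
        result &= set(index.get(term, {}))
    return result


def ranked(index, query):
    """Ranked search: documents containing any term, best score first."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Sort by descending score, then by id so ties are ordered predictably.
    return sorted(scores, key=lambda d: (-scores[d], d))


index = build_index(docs)
print(boolean_and(index, "digital library"))
print(ranked(index, "digital collections"))
```

Boolean search returns an unordered set of exact matches, whereas ranked search orders partial matches by how strongly they match the query.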

Collection Presentation
•Search results formatting
Format strings in the configuration file
•Home page customization
Using macros
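
To give a feel for what a format string does, the Python sketch below fills bracketed metadata names such as [Title] into an HTML template for one search result. It assumes a bracket-placeholder syntax for illustration and implements only simple substitution; the function `render` is hypothetical and does not reproduce Greenstone's format statement processing.

```python
import html
import re


def render(format_string, metadata):
    """Replace each [Name] placeholder with the escaped metadata value.

    Placeholders with no matching metadata are replaced by an empty string.
    """
    def substitute(match):
        return html.escape(metadata.get(match.group(1), ""))

    return re.sub(r"\[([A-Za-z][\w.]*)\]", substitute, format_string)


# Hypothetical template and metadata for a single search result row.
fmt = "<td>[Title]</td><td>[Author], [Date]</td>"
row = render(fmt, {"Title": "Humanity Development Library", "Author": "Human Info NGO"})
print(row)
```

Keeping layout in a configurable string means the result list can be restyled per collection without touching the software itself.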

GSDL : Features
Easy Installation
Easy Maintenance
Hierarchy Structure
Interface Customization
–Front Page Design, Header for the Digital
Library, Collection Icon, Cover Images
Collection Configuration (collect.cfg) File
Scalability, Flexibility

Collection Distribution
•Web
•CD-ROM
Publish created collections to the CD-ROM
Windows only
Two possibilities:
oInstall GSDL software to HDD and access
content on CD
oRun GSDL search engine out of the CD!

GSDL : Unique Features
Incremental Collection Building
Content Development in 3 different ways
Good Documentation and Active Mailing
List
Variety of Plug-ins for different document
types
Publishing on CD-ROMs
Data Compression

GSDL : Technology Used
•Technology used in the current version
–Java 1.6 (or higher)
–ImageMagick
–Application Server : Apache 2.2
–GSDL 2.82 for Linux and Windows

GSDL : Example Sites
India: Archives of Indian Labour

United States: New York Botanical Garden

International: Global Library Services Network

Some Observations
Strengths:
Configurability: content extraction for indexing,
presentation layout, metadata for browsing and
field-based searching (a little difficult, though!)
Extensibility:
Plugins for content extraction, Unicode for
multi-language support, source code availability
Full-text search on a variety of document formats
XML, Unicode, Dublin Core support
Data compression
CD-ROM publishing

Limitations:
Interactive content updating and management
not possible
No duplicate identification
Metadata handling appears to be a little complex
Linux version seems to be more robust than
Windows
Hangs while processing some documents during
collection building, with no way to handle this
gracefully

Current Status
Strong development work by the CS department
at the University of Waikato, NZ
Z39.50 experimental interface now available
Promoted by UNESCO
Beginning to be used worldwide; can be
expected to reach CDS/ISIS-like popularity
(particularly in developing countries)

Documentation and Help
•Available at: http://www.greenstone.org
–Software
–Demo collections
–FAQ
–Tutorial materials
•Documentation:
Installer’s Guide, User’s Guide, Developer’s Guide,
and other reading materials

•Mailing lists:
–Greenstone Users List
–Greenstone Developers List
•Greenstone Documentation Wiki
http://wiki.greenstone.org/wiki/index.php/GreenstoneWiki