imraninflibnet
May 03, 2011
About This Presentation
Greenstone Digital Library
Size: 282.2 KB
Language: en
Added: May 03, 2011
Slides: 33 pages
Slide Content
Greenstone Digital Library Software:
An Overview
Imran Mansuri
Project Assistant (Library Science)
INFLIBNET Centre
7 March 2011 Prepared by Imran Mansuri
Agenda
•Introduction : Digital Library Software (DL)
•Greenstone Digital Library Software (GSDL)
•Introduction
•History
•Versions
•Features
•Unique Features
•Technology used
•Example Sites
•Example Collections
Digital Library Software
•The term “Digital Library” refers to a library in
which collections are stored in digital formats
(as opposed to print, microform, or other
media) and are accessible by computers
•The digital content may be stored locally or
accessed remotely via computer networks
•Books and images are accessed in digital format
•Information can be accessed from anywhere over the Net
Digital Libraries : Features
Dynamic Electronic Information Systems
Increase Portability
Efficiency of Access
Flexibility
Availability
Digital Library Software
DSpace
Fedora
EPrints
ResourceSpace
Greenstone
Greenstone Digital Library Software
•The Greenstone Digital Library Software (GSDL)
provides a way of building and distributing digital
library collections, opening up new possibilities
for organizing information and making it available
over the Internet or on CD-ROM
•Developed by the New Zealand Digital Library
Project (www.nzdl.org) at the University of
Waikato
•Distributed in co-operation with UNESCO and
Humanities Library Project, Romania
GSDL : Some Facts
•Current versions:
2.82 and 3.03
Available from http://www.greenstone.org
•Software suite for building, maintaining, and
distributing digital library collections
•Comprehensive, open-source
•Distribution and promotion partners:
o UNESCO
o Human Info NGO, Belgium
GSDL : History
1995 - Digital library of Computer Science
Technical Reports established by the New Zealand
Digital Library project
1997 - Decision to use the GPL (General Public
License); the name Greenstone adopted; work
with Human Info NGO to produce humanitarian
CD-ROMs
1998 Apr - First CD-ROM collection released:
Humanity Development Library
1998 Aug - Greenstone.org website established
1999 - BBC collection established
2000 Apr - Greenstone mailing list started
Aug - Formally established cooperative effort with
UNESCO and Human Info NGO
Nov - Software distributed on SourceForge
2002 Apr - Development of Greenstone3
Mar - Official opening of the Niupepa collection;
development of the Greenstone Librarian
Interface
Jun - First UNESCO Greenstone CD-ROM
2003 - A Java development that became
known as the Greenstone Librarian Interface
2005 Nov - Initial release of Greenstone3
2006 Apr - Greenstone Support Group for
South Asia launched
GSDL : Version
2000 Feb - gsdl2.12
Apr - gsdl2.21
Dec - gsdl2.30
2001 Feb - gsdl2.31
2002 Jan - gsdl2.38
2003 Jun - gsdl2.40
2004 Feb - gsdl2.50
2005 Apr - gsdl2.60 and in November - gsdl3.00
2006 Mar - gsdl2.70
2007 Apr - gsdl2.80
2008 - gsdl3.03
Current release - gsdl2.82
GSDL : Features
Multi S/W Platform
Multi Lingual Support
Structured Metadata in XML using DC
Metadata Extraction
Plug-ins for Documents
Full-text mirroring
Text Level Penetration
Concurrent & Dynamic Content Development
Uniform Presentation
Collection Building
•Web and command line mode
•Input collections:
•GSDL server (files)
•Remote (FTP - files, HTTP - website pages)
•Collection input: batch mode, NOT interactive
•Document formats: HTML, PDF, Text, Word
(Doc, RTF), PS, e-mail, bibliographic
•Support for full text tagging for hierarchical
document browsing
•Automatic text extraction and indexing
‘Plugins’ for different document formats
(HTMLPlug, PDFPlug, etc.); may fail for some
documents!
XML representation - conversion to HTML for
display
Native document format - storage and display (via
browser plugins, helper applications)
•Data compression support
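The idea of one plugin per document format, with some documents failing to import, can be sketched roughly as below. This is an illustration only, not Greenstone code: `PLUGINS`, `extract_text`, `extract_html`, and `import_documents` are hypothetical names, and the extraction itself is deliberately crude.

```python
import re
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Hypothetical extractors: each turns raw bytes into plain text for indexing.
def extract_text(data: bytes) -> str:
    return data.decode("utf-8")

def extract_html(data: bytes) -> str:
    # Crude tag stripping, good enough for a sketch.
    return re.sub(r"<[^>]+>", " ", data.decode("utf-8")).strip()

# One extractor per file extension, in the spirit of one plugin per format.
PLUGINS: Dict[str, Callable[[bytes], str]] = {
    ".txt": extract_text,
    ".html": extract_html,
}

def import_documents(docs: Dict[str, bytes]) -> Tuple[Dict[str, str], List[str]]:
    """Return (extracted text by file name, names that could not be processed)."""
    extracted: Dict[str, str] = {}
    failed: List[str] = []
    for name, data in docs.items():
        plugin = PLUGINS.get(Path(name).suffix.lower())
        if plugin is None:
            failed.append(name)  # no plugin claims this format
            continue
        try:
            extracted[name] = plugin(data)
        except ValueError:  # includes UnicodeDecodeError
            failed.append(name)  # record the failure and keep going
    return extracted, failed
```

Recording failures and continuing, rather than stopping the whole batch, is one possible way to keep a single bad document from blocking a build.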
•Metadata
Automatic extraction of simple metadata
(e.g. Title, date)
Explicit metadata via ‘Classifiers’
Hierarchical (e.g. Subject)
List (e.g. Organization, Author)
Used for browsing and field-based searching
Multi-language support via Unicode
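The difference between a list classifier and a hierarchical classifier can be pictured with a small sketch. It is not Greenstone's implementation: the function names, the "/" separator for hierarchy levels, and the `_docs` key are assumptions made for illustration.

```python
from collections import defaultdict

def list_classifier(docs, field):
    """Group document titles under each value of a flat metadata field."""
    groups = defaultdict(list)
    for doc in docs:
        for value in doc.get(field, []):
            groups[value].append(doc["Title"])
    return {value: sorted(titles) for value, titles in sorted(groups.items())}

def hierarchical_classifier(docs, field, sep="/"):
    """Build a nested dict from path-like values such as 'Agriculture/Soil'."""
    tree = {}
    for doc in docs:
        for value in doc.get(field, []):
            node = tree
            for part in value.split(sep):
                node = node.setdefault(part, {})
            # Titles filed at this level are kept under a reserved key.
            node.setdefault("_docs", []).append(doc["Title"])
    return tree
```

A browse page would then walk the returned structure: a flat A-Z style list for Author or Organization, an expandable tree for Subject.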
Collection Browse and Search
•Full text search
•Metadata (field) search and browse
•Boolean
•Ranked
•Multi-language support for browse/
search interface
•Search history, search term
highlighting…
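To make the Boolean versus ranked distinction concrete, here is a minimal sketch over a toy inverted index. It does not reflect Greenstone's actual search engine; the tokenizer, the all-terms Boolean query, and the term-frequency scoring are simplifying assumptions.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index

def boolean_and(index, query):
    """Return ids of documents that contain every query term."""
    postings = [set(index.get(term, {})) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()

def ranked(index, query):
    """Order documents containing any query term by summed term frequency."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken by document id for a stable order.
    return [d for d, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]
```

A Boolean query returns an unordered set of exact matches, while a ranked query returns every partially matching document in order of relevance.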
Collection Presentation
•Search results formatting
Format strings in the configuration file
•Home page customization
Using macros
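The idea behind a format string, a template into which each result's metadata is substituted, can be sketched as follows. The bracketed `[Name]` placeholders and the `render` helper are assumptions for illustration; Greenstone's own format-string syntax is richer and is not reproduced here.

```python
import re

def render(format_string, metadata):
    """Replace each [Name] placeholder with the document's metadata value."""
    def substitute(match):
        # Missing metadata renders as an empty string rather than failing.
        return str(metadata.get(match.group(1), ""))
    return re.sub(r"\[(\w+)\]", substitute, format_string)
```

For example, `render("<b>[Title]</b> ([Date])", {"Title": "Soil Basics", "Date": "1998"})` produces one formatted search-result line; changing the template changes every result's layout without touching the documents.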
GSDL : Features
Easy Installation
Easy Maintenance
Hierarchy Structure
Interface Customization
–Front Page Design, Header for the Digital
Library, Collection Icon, Cover Images
Collection Configuration (collect.cfg) File
Scalability, Flexibility
Collection Distribution
•Web
•CD-ROM
Publish created collections to the CD-ROM
Windows only
Two possibilities:
o Install GSDL software to HDD and access
content on CD
o Run GSDL search engine out of the CD!
GSDL : Unique Features
Incremental Collection Building
Content Development in 3 different ways
Good Documentation and Active Mailing
List
Variety of Plug-ins for different document
Types
Publishing on CD-ROMs
Data Compression
GSDL : Technology Used
•Technology used in the current version
–Java 1.6 (or higher)
–ImageMagick
–Application Server: Apache 2.2
–GSDL 2.82 for Linux and Windows
GSDL : Example Sites
India: Archives of Indian Labour
United States: New York Botanical Garden
International: Global Library Services Network
Some Observations
Strengths:
Configurability: content extraction for indexing,
presentation layout, metadata for browsing and field-based
searching (a little difficult, though!)
Extensibility:
Plugins for content extraction, Unicode for
multi-language support, source code availability
Full-text search on a variety of document formats
XML, Unicode, Dublin Core support
Data compression
CD-ROM publishing
Limitations:
Interactive content updating and management
not possible
No duplicate identification
Metadata handling appears to be a little complex
Linux version seems to be more robust than
Windows
Hangs while processing some documents during
collection building - no way to gracefully handle
this
Current Status
Strong development work - CS department
at University of Waikato, NZ
Z39.50 experimental interface now available
Promoted by UNESCO
Beginning to be used worldwide
Can be expected to reach CDS/ISIS-like popularity
(particularly in developing countries)
Documentation and Help
•Available at: http://www.greenstone.org
–Software
–Demo collections
–FAQ
–Tutorial materials
•Documentation:
Installer’s Guide, User’s Guide, Developer’s Guide,
and other reading materials
•Mailing lists:
–Greenstone Users List
–Greenstone Developers List
•Greenstone Documentation Wiki
http://wiki.greenstone.org/wiki/index.php/GreenstoneWiki