Parser (and parsing)
A concept from computational linguistics. A parser is a program that uses a grammar to analyze a text into its syntactic components. The text may be in a natural language or a programming language. Sentences in human language are not easily parsed by programs, because the structure of human language is substantially ambiguous.
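The ambiguity mentioned above can be made concrete with a small sketch. The toy grammar, the lexicon, and the example sentence below are assumptions chosen for illustration (they do not come from any of the cited sources); the function counts how many distinct syntax trees a CYK-style chart parser finds for a sentence.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (illustration only).
BINARY_RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "PP": [("P", "NP")],
}
# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words):
    """Return the number of distinct parse trees rooted in S."""
    n = len(words)
    # chart[(i, j)][symbol] = number of trees deriving words[i:j] from symbol
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, []):
            chart[(i, i + 1)][symbol] += 1
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for parent, rules in BINARY_RULES.items():
                    for left, right in rules:
                        chart[(i, j)][parent] += (
                            chart[(i, k)][left] * chart[(k, j)][right]
                        )
    return chart[(0, n)]["S"]


print(count_parses("I saw the man".split()))
print(count_parses("I saw the man with the telescope".split()))
```

Under this grammar the longer sentence receives two analyses, because the phrase "with the telescope" can attach either to the verb phrase or to the noun phrase "the man". A program has no purely syntactic basis for choosing between them.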


Harter (1986, p. 87ff.) writes:

 

 "When a database is loaded onto a search system for the first time or is being updated, an algorithm called a parsing rule is used to prepare the inverted index. A parsing rule is specific to a particular field of a given database. It refers to a set of separating and sorting operations performed by the search service on that data field. The parsing rule is applied when the inverted index is created from the linear file for that database.
    The simplest parsing rule is just to make an entry in the inverted index for every word in the field. There are at least two reasons why some modification of this rule for some fields may be useful. First, there are many common, function words in natural language that occur frequently and that would not be useful as search terms (for example, "and"). Such terms are often eliminated from the inverted index.
    Second, there may be reasons for wanting to preserve phrases in certain fields, so that *false drops in these fields can be minimized at the time of search. The descriptor field is an obvious example of this. Clearly, little is accomplished by indexing documents with phrases such as "chemical bonding" and then destroy this *pre-coordinated phrase by parsing the descriptor field on a word by word basis...."
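The two kinds of parsing rule that Harter describes can be sketched roughly as follows. This is a minimal illustration, not the procedure of any actual search service: the field names, the stop list, the semicolon as descriptor separator, and the sample records are all assumptions.

```python
import re
from collections import defaultdict

# Assumed, illustrative stop list of common function words.
STOPWORDS = {"a", "an", "and", "in", "of", "the"}


def parse_words(value):
    """Word parsing: one index entry per word, with stopwords eliminated."""
    words = re.findall(r"[a-z0-9]+", value.lower())
    return [w for w in words if w not in STOPWORDS]


def parse_phrases(value):
    """Phrase parsing: keep each semicolon-separated descriptor intact."""
    return [p.strip().lower() for p in value.split(";") if p.strip()]


# A parsing rule is specific to a field (hypothetical field names).
PARSING_RULES = {"title": parse_words, "descriptor": parse_phrases}


def build_inverted_index(records):
    """Map (field, term) to the set of record numbers containing the term."""
    index = defaultdict(set)
    for doc_id, record in records.items():
        for field, value in record.items():
            for term in PARSING_RULES[field](value):
                index[(field, term)].add(doc_id)
    return index


# Hypothetical records standing in for the linear file.
records = {
    1: {"title": "Chemical bonding and structure",
        "descriptor": "Chemical bonding; Crystals"},
    2: {"title": "Bonding in the family",
        "descriptor": "Family relations"},
}
index = build_inverted_index(records)
print(sorted(index[("title", "bonding")]))
print(sorted(index[("descriptor", "chemical bonding")]))
```

In this sketch a title search for "bonding" retrieves both records, including the unrelated one, whereas the phrase-parsed descriptor "chemical bonding" retrieves only the first. This is the reduction in false drops that the quotation refers to.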

 

 

 

Literature:


Fischler, M. A. & Firschein, O. (1987). Intelligence: The Eye, the Brain, and the Computer. Reading, Massachusetts: Addison-Wesley, pp. 175-186.


Harter, S. P. (1986). Online Information Retrieval: Concepts, Principles, and Techniques. London: Academic Press.

 

Wikipedia. The free encyclopedia. (2006). Parsing. http://en.wikipedia.org/wiki/Parsing

 

 

 

See also: Natural Language Processing
 

 

 

 

Birger Hjørland

Last edited: 31-05-2006

Appendix:

Example of using the commercial parser "Connexor" (http://193.185.105.50/cgi-bin/parser-demo.pl) on the following text:

Information retrieval is an important but generally neglected part of the research method in psychology. On the basis of a case study, which consists of an examination of the search strategy in a Swedish dissertation, the problems of searcing are overviewed, with regard to both the selection of sources, and the construction of the search profile. Attention is given to subject faceting in psychology. A model used by Psychological Abstracts in building on the concepts of experimental variables is replaced by a facet model developed on the basis of the Bliss Classification System. This model is illustrated using the above-mentioned dissertation as an example, and it is shown that the model can help in formulating search questions in psychology. Also discussed are problems that concern the use of abstracts or full texts in the selection of documents. In addition, attention is given to the question of types of research in psychology that can benefit from computer-based retrieval methods.

  

| Text | Baseform | Phrase syntax and part-of-speech |
| Information | information | premodifier, noun, noun phrase begins |
| retrieval | retrieval | nominal head, noun, noun phrase ends |
| is | be | main verb, indicative present |
| an | an | premodifier, determiner |
| important | important | premodifier, adjective, noun phrase begins |
| but | but | coordination marker, noun phrase continues |
| generally | generally | premodifier, adverb, noun phrase continues |
| neglected | neglected | premodifier, adjective, noun phrase continues |
| part | part | nominal head, noun, noun phrase continues |
| of | of | postmodifier, preposition, noun phrase continues |
| the | the | premodifier, determiner, noun phrase continues |
| research | research | premodifier, noun, noun phrase continues |
| method | method | nominal head, noun, noun phrase ends |
| in | in | preposed marker, preposition |
| psychology | psychology | nominal head, noun, single-word noun phrase |
| | | sentence boundary |
| On | on | preposed marker, preposition |
| the | the | premodifier, determiner |
| basis | basis | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| a | a | premodifier, determiner, noun phrase continues |
| case | case | premodifier, noun, noun phrase continues |
| study | study | nominal head, noun, noun phrase ends |
| , | | |
| which | which | nominal head, pro-nominal |
| consists | consist | main verb, indicative present |
| of | of | preposed marker, preposition |
| an | an | premodifier, determiner |
| examination | examination | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| the | the | premodifier, determiner, noun phrase continues |
| search | search | premodifier, noun, noun phrase continues |
| strategy | strategy | nominal head, noun, noun phrase ends |
| in | in | preposed marker, preposition |
| a | a | premodifier, determiner |
| Swedish | Swedish | premodifier, adjective, noun phrase begins |
| dissertation | dissertation | nominal head, noun, noun phrase ends |
| , | | |
| the | the | premodifier, determiner |
| problems | problem | nominal head, plural noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| searcing | searcing | nominal head, noun, noun phrase ends |
| are | be | auxiliary verb, indicative present |
| overviewed | overview | main verb, participle perfect |
| , | | |
| with | with | preposed marker, preposition |
| regard | regard | nominal head, noun, single-word noun phrase |
| to | to | postmodifier, preposition |
| both | both | nominal head, pro-nominal |
| the | the | premodifier, determiner |
| selection | selection | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| sources | source | nominal head, plural noun, noun phrase ends |
| , | | |
| and | and | coordination marker |
| the | the | premodifier, determiner |
| construction | construction | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| the | the | premodifier, determiner, noun phrase continues |
| search | search | premodifier, noun, noun phrase continues |
| profile | profile | nominal head, noun, noun phrase ends |
| | | sentence boundary |
| Attention | attention | nominal head, noun, single-word noun phrase |
| is | be | auxiliary verb, indicative present |
| given | give | main verb, participle perfect |
| to | to | preposed marker, preposition |
| subject | subject | main verb, infinitive |
| faceting | facet | main verb, participle progressive |
| in | in | preposed marker, preposition |
| psychology | psychology | nominal head, noun, single-word noun phrase |
| | | sentence boundary |
| A | a | premodifier, determiner |
| model | model | nominal head, noun, single-word noun phrase |
| used | use | main verb, participle perfect |
| by | by | preposed marker, preposition |
| Psychological | Psychological | premodifier, proper noun, noun phrase begins |
| Abstracts | Abstract | nominal head, plural proper noun, noun phrase ends |
| in | in | preposed marker, preposition |
| building | building | nominal head, noun, single-word noun phrase |
| on | on | preposed marker, preposition |
| the | the | premodifier, determiner |
| concepts | concept | nominal head, plural noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| experimental | experimental | premodifier, adjective, noun phrase continues |
| variables | variable | nominal head, plural noun, noun phrase ends |
| is | be | auxiliary verb, indicative present |
| replaced | replace | main verb, participle perfect |
| by | by | preposed marker, preposition |
| a | a | premodifier, determiner |
| facet | facet | premodifier, noun, noun phrase begins |
| model | model | nominal head, noun, noun phrase ends |
| developed | develop | main verb, indicative past |
| on | on | preposed marker, preposition |
| the | the | premodifier, determiner |
| basis | basis | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| the | the | premodifier, determiner, noun phrase continues |
| Bliss | Bliss | premodifier, proper noun, noun phrase continues |
| Classification | Classification | premodifier, proper noun, noun phrase continues |
| System | System | nominal head, proper noun, noun phrase ends |
| | | sentence boundary |
| This | this | premodifier, pro-nominal |
| model | model | nominal head, noun, single-word noun phrase |
| is | be | auxiliary verb, indicative present |
| illustrated | illustrate | main verb, participle perfect |
| using | use | main verb, participle progressive |
| the | the | premodifier, determiner |
| above-mentioned | above mentioned | premodifier, adjective, noun phrase begins |
| dissertation | dissertation | nominal head, noun, noun phrase ends |
| as | as | preposed marker, preposition |
| an | an | premodifier, determiner |
| example | example | nominal head, noun, single-word noun phrase |
| , | | |
| and | and | coordination marker |
| it | it | nominal head, pro-nominal |
| is | be | auxiliary verb, indicative present |
| shown | show | main verb, participle perfect |
| that | that | preposed marker, clause marker |
| the | the | premodifier, determiner |
| model | model | nominal head, noun, single-word noun phrase |
| can | can | auxiliary verb, indicative present |
| help | help | main verb, infinitive |
| in | in | preposed marker, preposition |
| formulating | formulate | main verb, participle progressive |
| search | search | premodifier, noun, noun phrase begins |
| questions | question | nominal head, plural noun, noun phrase ends |
| in | in | preposed marker, preposition |
| psychology | psychology | nominal head, noun, single-word noun phrase |
| | | sentence boundary |
| Also | also | adverbial head, adverb |
| discussed | discuss | main verb, participle perfect |
| are | be | main verb, indicative present |
| problems | problem | nominal head, plural noun, single-word noun phrase |
| that | that | nominal head, pro-nominal |
| concern | concern | main verb, indicative present |
| the | the | premodifier, determiner |
| use | use | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| abstracts | abstract | nominal head, plural noun, noun phrase ends |
| or | or | coordination marker |
| full | full | premodifier, adjective, noun phrase begins |
| texts | text | nominal head, plural noun, noun phrase ends |
| in | in | preposed marker, preposition |
| the | the | premodifier, determiner |
| selection | selection | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| documents | document | nominal head, plural noun, noun phrase ends |
| | | sentence boundary |
| In | in | preposed marker, preposition |
| addition | addition | nominal head, noun, single-word noun phrase |
| , | | |
| attention | attention | nominal head, noun, single-word noun phrase |
| is | be | auxiliary verb, indicative present |
| given | give | main verb, participle perfect |
| to | to | preposed marker, preposition |
| the | the | premodifier, determiner |
| question | question | nominal head, noun, noun phrase begins |
| of | of | postmodifier, preposition, noun phrase continues |
| types | type | nominal head, plural noun, noun phrase continues |
| of | of | postmodifier, preposition, noun phrase continues |
| research | research | nominal head, noun, noun phrase continues |
| in | in | postmodifier, preposition, noun phrase continues |
| psychology | psychology | nominal head, noun, noun phrase ends |
| that | that | nominal head, pro-nominal |
| can | can | auxiliary verb, indicative present |
| benefit | benefit | main verb, infinitive |
| from | from | preposed marker, preposition |
| computer-based | computer based | premodifier, adjective, noun phrase begins |
| retrieval | retrieval | premodifier, noun, noun phrase continues |
| methods | method | nominal head, plural noun, noun phrase ends |
| | | sentence boundary |
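
The phrase markers in the parser output above ("noun phrase begins", "noun phrase continues", "noun phrase ends", "single-word noun phrase") could be used to extract candidate index phrases. The following is a minimal sketch under that assumption. The function name and the row representation as (text, baseform, tags) tuples are hypothetical, and only the rows for the first sentence are reproduced.

```python
# Rows for the first sentence of the output: (text, baseform, tags).
rows = [
    ("Information", "information", "premodifier, noun, noun phrase begins"),
    ("retrieval", "retrieval", "nominal head, noun, noun phrase ends"),
    ("is", "be", "main verb, indicative present"),
    ("an", "an", "premodifier, determiner"),
    ("important", "important", "premodifier, adjective, noun phrase begins"),
    ("but", "but", "coordination marker, noun phrase continues"),
    ("generally", "generally", "premodifier, adverb, noun phrase continues"),
    ("neglected", "neglected", "premodifier, adjective, noun phrase continues"),
    ("part", "part", "nominal head, noun, noun phrase continues"),
    ("of", "of", "postmodifier, preposition, noun phrase continues"),
    ("the", "the", "premodifier, determiner, noun phrase continues"),
    ("research", "research", "premodifier, noun, noun phrase continues"),
    ("method", "method", "nominal head, noun, noun phrase ends"),
    ("in", "in", "preposed marker, preposition"),
    ("psychology", "psychology", "nominal head, noun, single-word noun phrase"),
]


def noun_phrases(rows):
    """Collect the noun phrases delimited by the begin/continue/end markers."""
    phrases = []
    current = None  # words of the noun phrase currently being assembled
    for text, _baseform, tags in rows:
        labels = [label.strip() for label in tags.split(",")]
        if "single-word noun phrase" in labels:
            phrases.append(text)
        elif "noun phrase begins" in labels:
            current = [text]
        elif "noun phrase continues" in labels and current is not None:
            current.append(text)
        elif "noun phrase ends" in labels and current is not None:
            current.append(text)
            phrases.append(" ".join(current))
            current = None
    return phrases


for phrase in noun_phrases(rows):
    print(phrase)
```

For these rows the sketch yields "Information retrieval", "important but generally neglected part of the research method", and "psychology". The determiner "an" is left out because the parser marks the start of the noun phrase on the following word.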