
Executive Summary

The objective was to analyse the grammar errors found during the work with the error typology and the error database in order to explore how to approach them in the grammar checker. In connection with this work, the order of priority of the different error types is also discussed.

The purpose of the analyses of grammatical errors is to distinguish between the following: (1) errors that can be handled by partial parsing, (2) errors that can be handled by local error rules, and (3) errors that lie outside the scope of partial parsing and local error rules. The fundamental difference is that partial parsing rules implement a positive grammar with constraint relaxation, while local error rules state a negative grammar. In principle, partial parsing rules express erroneous language usage implicitly by not allowing certain structures and at the same time permitting certain types of violations. Local error rules are explicit about the anticipated language errors that can be handled.

Generally, partial parsing handles non-structural errors and local error rules handle structural errors. Structural errors concern the syntactic structure of a clause or a phrase, and thereby the sequences of word categories. A missing finite verb is a structural error, since the verb category slot in the parsing rule is left empty. Non-structural errors concern the features; they are feature mismatches. An agreement error within the noun phrase is a non-structural error.

Each grammatical error in the Error Corpora Database has been assigned a utility value (1, 2, or 3) based on the analyses of its error type code. Frequency information about the different types of errors has also been taken into account when making the preliminary order of priority between the grammatical errors for the future work with the grammar checker.

Uppsala University, Department of Linguistics, SCARRIE, 3 March 1998

Three Types of Grammatical Errors in Swedish

Olga Wedbjer Rambell

SCARRIE

DEL 6.2.3

FINAL VERSION 1.0

Contents

1 Introduction

2 Analyses of Grammar Problems

2.1 Noun Phrase (GPNP)

2.2 Adjective Phrase (GPAP)

2.3 Adverb Phrase (GPAB)

2.4 Prepositional Phrase (GPPP)

2.5 Conjunctions and Conjunctive Adverbs (GPCN)

2.6 Verb Phrase in the Limited Sense (GPVF)

2.7 Verb Valency (GPVV)

2.8 Pronoun Case (GPPC)

2.9 Agreement (GPAG)

2.10 Referential Problems (GPRP)

2.11 Word Order (GPWO)

2.12 Wrong Word Category (GPWC)

2.13 Other Grammar Problems (GPOG)

3 Analyses of Other Error Types

3.1 Spelling Errors (SE)

3.2 Punctuation Problems (PU)

3.3 Graphical Problems (GR)

3.4 Style, Meaning, and Reference (SP)

4 Proposed Order of Priority

Literature


1 Introduction

The objective was to analyse the grammar errors found during the work with the error typology (Wedbjer Rambell 1998) and the error database (Wedbjer Rambell et al. 1998) in order to explore how to approach them in the grammar checker. In connection with this work, the order of priority of the different error types is also discussed.

The parser is to consult a dictionary with morphosyntactic information expressed in categories (e.g. noun and verb) and features (e.g. number and present tense). From this point of view, grammatical errors can be divided into two types: structural errors and non-structural errors (Bustamente & León 1996). Structural errors concern the syntactic structure of a clause or a phrase, and thereby the categories. A missing finite verb is a structural error, since the verb category slot in the parsing rule is left empty. Non-structural errors concern the features; they are feature mismatches. An agreement error within the noun phrase is a non-structural error. A parser may handle non-structural errors by feature propagation (Vosse 1994, p. 146 ff.).
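The distinction can be illustrated with a small Python sketch. It is not the SCARRIE parser: the rule format, the feature names and the function `diagnose` are assumptions made purely for illustration. A mismatch in the sequence of categories is reported as structural, and a clash between feature values that ought to be shared is reported as non-structural.

```python
# Toy noun-phrase rule: a sequence of category slots plus the features that
# all constituents must share (a crude stand-in for feature propagation).
NP_RULE = {"categories": ["det", "adj", "noun"], "agree": ["gender", "number"]}


def tok(form, cat, **features):
    """Build a token as a plain dict, e.g. tok("hus", "noun", gender="neu")."""
    return {"form": form, "cat": cat, **features}


def diagnose(rule, tokens):
    """Return "structural", "non-structural" or "ok" for one phrase."""
    # Structural: the category slots of the rule are not filled as required.
    if [t["cat"] for t in tokens] != rule["categories"]:
        return "structural"
    # Non-structural: the categories fit, but a shared feature has clashing values.
    for feature in rule["agree"]:
        if len({t[feature] for t in tokens}) > 1:
            return "non-structural"
    return "ok"


correct = [tok("ett", "det", gender="neu", number="sg"),
           tok("stort", "adj", gender="neu", number="sg"),
           tok("hus", "noun", gender="neu", number="sg")]
wrong_gender = [tok("en", "det", gender="utr", number="sg")] + correct[1:]
missing_head = correct[:2]

print(diagnose(NP_RULE, correct))        # ok
print(diagnose(NP_RULE, wrong_gender))   # non-structural
print(diagnose(NP_RULE, missing_head))   # structural
```

Whether an error counts as structural in such a scheme depends on which distinctions are encoded as categories and which as features.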

The division between the two error types is very much dependent on what information is expressed in terms of categories and features, respectively. For instance, participles are verb forms but since they behave as adjectives it could be an advantage to let them form a category of their own. The division is also dependent on the parser, its functionality, and its formalism.

The purpose of the analyses of grammatical errors is to distinguish between the following: (1) errors that can be handled by partial parsing, (2) errors that can be handled by local error rules, and (3) errors that lie outside the scope of partial parsing and local error rules. The fundamental difference is that partial parsing rules implement a positive grammar with constraint relaxation, while local error rules state a negative grammar. In principle, partial parsing rules express erroneous language usage implicitly by not allowing certain structures and at the same time permitting certain types of violations. Local error rules are explicit about the anticipated language errors that can be handled.
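The contrast between the two rule types might be sketched as follows. This is a minimal Python illustration under stated assumptions, not the project's formalism: the positive rule `parse_np`, its `relax` parameter, and the pattern list `ERROR_RULES` are all hypothetical names. The positive rule accepts a determiner plus noun with gender agreement and can be re-run with the agreement constraint relaxed, whereas the error rule explicitly names an anticipated wrong category sequence.

```python
def tok(form, cat, **features):
    """Build a token as a plain dict."""
    return {"form": form, "cat": cat, **features}


def parse_np(tokens, relax=()):
    """Positive grammar: accept 'det noun' with gender agreement.

    Constraints whose names are listed in `relax` are not enforced.
    """
    if [t["cat"] for t in tokens] != ["det", "noun"]:
        return False
    if "gender" not in relax and tokens[0]["gender"] != tokens[1]["gender"]:
        return False
    return True


def check_by_relaxation(tokens):
    """The error is implicit: it is whatever had to be relaxed to get a parse."""
    if parse_np(tokens):
        return None
    if parse_np(tokens, relax=("gender",)):
        return "gender agreement violated"
    return "no analysis"


# Negative grammar: each rule spells out an anticipated erroneous sequence.
ERROR_RULES = [(("det", "det"), "double article")]


def check_by_error_rules(tokens):
    cats = tuple(t["cat"] for t in tokens)
    for pattern, message in ERROR_RULES:
        width = len(pattern)
        if any(cats[i:i + width] == pattern
               for i in range(len(cats) - width + 1)):
            return message
    return None


en_hus = [tok("en", "det", gender="utr"), tok("hus", "noun", gender="neu")]
en_en_bil = [tok("en", "det", gender="utr"), tok("en", "det", gender="utr"),
             tok("bil", "noun", gender="utr")]

print(check_by_relaxation(en_hus))      # gender agreement violated
print(check_by_error_rules(en_hus))     # None
print(check_by_relaxation(en_en_bil))   # no analysis
print(check_by_error_rules(en_en_bil))  # double article
```

In this sketch the agreement error is only found by relaxation and the doubled article only by the error rule, which mirrors the general division of labour described in this report.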

One problem in making a division between errors that can be handled by partial parsing and local error rules, respectively, is that many errors may be dealt with in both ways. In addition, insertion of a local error rule will influence the parsing rules: they must be adapted so that they do not accept the error that the local error rule is meant to react upon.

Generally, partial parsing handles non-structural errors and local error rules handle structural errors (Sågvall Hein forthcoming). The error types in the typology are analysed according to the three ways of approaching them, and a utility value is attached to each entry in the Error Corpora Database based on its error type code to tell which error detection approach to prefer.

In the next chapter, the different grammar errors will be analysed. The analyses are primarily based on the error typology supplemented with information from the Error Corpora Database. In the third chapter, the other error groups in the typology are discussed: spelling errors, punctuation problems, graphical problems, and style, meaning, and referential problems. Finally, the fourth chapter focuses on the order of priority between the grammar problems, mainly based on error frequencies from the database.


2 Analyses of Grammar Problems

The grammar problems in the typology and in the database have been analysed to investigate how to approach the different types of grammar problems. For each error type, it has been established whether the errors ought to be handled by one of the following alternatives:

1 partial parsing

2 local error rules

3 neither

The discussion on each error type is brief. The chapter follows the structure of the error typology (Wedbjer Rambell 1998).

Based on the error type code, the utility value has been inserted for every grammar problem in the Error Corpora Database.
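How a utility value could be derived from an error type code is sketched below in Python. Several things here are assumptions: that the values 1, 2 and 3 correspond to the three alternatives in the order listed above, that a code is written as the group code followed by the two-digit subtype (e.g. `GPNPNB01`), and that lookup falls back from the full code to its group. The table `UTILITY` and the function `utility_value` are hypothetical names; the few entries shown follow the analyses in section 2.1.

```python
PARTIAL_PARSING, LOCAL_ERROR_RULES, NEITHER = 1, 2, 3

# Illustrative excerpt only; a real table would cover the whole typology.
UTILITY = {
    "GPNPAG": PARTIAL_PARSING,      # agreement in the noun phrase
    "GPNPNB01": NEITHER,            # the plural => the singular
    "GPNPNB04": NEITHER,            # the singular => the plural
    "GPNPNB02": PARTIAL_PARSING,    # uncountable / countable
    "GPNPNB03": PARTIAL_PARSING,    # semantic vs grammatical number
    "GPNPNN": LOCAL_ERROR_RULES,    # head noun missing, wrong category, doubled noun
}


def utility_value(code):
    """Return the utility value of the longest matching code prefix, or None."""
    for end in range(len(code), 0, -1):
        if code[:end] in UTILITY:
            return UTILITY[code[:end]]
    return None


print(utility_value("GPNPNN03"))  # 2
print(utility_value("GPNPNB01"))  # 3
print(utility_value("GPNPAG"))    # 1
```

The prefix fallback lets a whole group share one value while individual subtypes can still be given their own.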

2.1 Noun Phrase (GPNP)

Agreement (GPNPAG)

All agreement errors are non-structural and as such they can be handled by means of partial parsing. Some difficulties arise, but they do not change the choice of partial parsing as the approach.

In coordinated nouns, feature agreement is not always wanted. Gender is inherent to each noun, and number agreement is not always appropriate either, so problems arise concerning the feature values of the resulting NP. Another problem is that certain nouns may precede another noun without being in the genitive case (e.g. ett par byxor).

Gender (GPNPGE)

01 grammatical gender versus semantic gender To handle semantic gender, an extra feature is the most appropriate way to specify if a word is masculine or not. The problem can thus be handled by partial parsing.

02 wrong gender of the indefinite article in genitive premodifier The premodifier is governed by the noun in the genitive form and not by the head noun. Special rules may be written that capture the sequence of categories, and the gender agreement is handled by feature propagation. Partial parsing is thus the way to handle this error type.

Number (GPNPNB)

01 the plural => the singular It is very difficult to formalise what governs the choice between the singular and the plural forms. Therefore, this problem lies outside the scope of partial parsing and local error rules.

04 the singular => the plural As above: this problem lies outside the scope of partial parsing and local error rules.


02 number problems between premodifier - noun: uncountable / countable Problems concerning the distinction between countable and uncountable nouns may be solved by partial parsing.

03 semantic number different from grammatical number The mismatch between semantic and grammatical number is a problem solvable by partial parsing, provided that the information is assigned to the words.

Species (GPNPSS)

Problems with missing articles are structural errors, which can be detected by local error rules. Partial parsing, on the other hand, best solves problems with incorrect inflectional form of the head noun, provided that the context for such a decision can be formalised. If a preceding determiner governs the choice, the problem can be expressed in terms of categories and features and be solved by partial parsing. But if the decision is based on given or new information, the conditions can hardly be expressed by partial parsing or local error rules. Error types that may involve either a missing article or erroneous inflection are probably best handled by local error rules.

01 definite article missing or erroneous definite inflection in definite noun phrase with adjective attribute

Problems concerning missing constituents could be handled by local error rules.

02 erroneous definite inflection after genitive attribute This error type may be detected by partial parsing, as discussed above.

03 indefinite article missing or definite article (and definite inflection) missing in indefinite noun phrase in the singular without article

Problems concerning missing constituents could be handled by local error rules.

09 definite inflection missing in noun phrase in PP This error is partly due to the semantics of the prepositional phrase. But as for most usage problems, this error type lies outside the scope of partial parsing and local error rules.

12 other cases of missing definite inflection When to use the definite form is very much dependent on the context. Even if certain cases may be identifiable and formalised, the errors would probably best be solved with local error rules; the error type as such lies outside the scope of partial parsing and local error rules.

04 erroneous definite inflection before a necessary relative clause This error type is best solved by partial parsing, see discussion above.

05 erroneous definite inflection after certain pronouns and adjectives This error type is best solved by partial parsing, see discussion above.

10 erroneous definite inflection in titles This error type is best solved by partial parsing.


11 other cases of erroneous definite inflection When not to use the definite form is very much dependent on the context. Even if certain cases may be identifiable and formalised, the errors would probably best be solved with local error rules; the error type as such lies outside the scope of partial parsing and local error rules.

06 demonstrative pronoun / definite article should be removed When not to use a demonstrative pronoun or definite article is very much dependent on the context. Even if certain cases may be identifiable and formalised, the errors would probably best be solved with local error rules; the error type as such lies outside the scope of partial parsing and local error rules.

07 the indefinite article should be removed When not to use the indefinite article is very much dependent on the context. Even if certain cases may be identifiable and formalised, the errors would probably best be solved with local error rules; the error type as such lies outside the scope of partial parsing and local error rules.

08 double articles This error type is a structural error that is best solved by local error rules.

Case (GPNPCA)

When a noun or personal pronoun precedes another noun, it ought to be in the genitive case. This is not always true, but it is easy to capture in the grammar checker by using partial parsing. It is also easy, by using the same method, to handle nouns in the genitive form taking the place of the head noun. Elliptic expressions would, however, give rise to false alarms.

Adjectives may also take the genitive form. Accepting noun phrases that contain adjectives in the genitive is easy, but it could be more difficult to detect when the genitive is missing.
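A minimal Python sketch of such a case check is given below. It assumes that case is available as a feature on each noun and that known exceptions, such as the measure noun in ett par byxor, are listed separately; the token layout, the set `MEASURE_NOUNS` and the function `missing_genitive` are hypothetical.

```python
# Hypothetical exception list: nouns that may precede another noun
# without being in the genitive case (cf. "ett par byxor").
MEASURE_NOUNS = {"par"}


def missing_genitive(tokens):
    """Return the indices of nouns that directly precede another noun
    without carrying genitive case."""
    hits = []
    for i in range(len(tokens) - 1):
        first, second = tokens[i], tokens[i + 1]
        if first["cat"] == "noun" and second["cat"] == "noun":
            if first.get("case") != "gen" and first["form"] not in MEASURE_NOUNS:
                hits.append(i)
    return hits


flickans_bok = [{"form": "flickans", "cat": "noun", "case": "gen"},
                {"form": "bok", "cat": "noun", "case": "nom"}]
flickan_bok = [{"form": "flickan", "cat": "noun", "case": "nom"},
               {"form": "bok", "cat": "noun", "case": "nom"}]
ett_par_byxor = [{"form": "ett", "cat": "det"},
                 {"form": "par", "cat": "noun", "case": "nom"},
                 {"form": "byxor", "cat": "noun", "case": "nom"}]

print(missing_genitive(flickans_bok))   # []
print(missing_genitive(flickan_bok))    # [0]
print(missing_genitive(ett_par_byxor))  # []
```

As noted above, elliptic expressions would still trigger false alarms in a check of this kind.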

01 common noun should be in the genitive case This error type can be handled by partial parsing.

02 proper noun should be in the genitive case This error type can be handled by partial parsing.

03 the genitive case => the basic case This error type can be handled by partial parsing.

04 error in forming of the genitive case in word group This error type lies outside the scope of partial parsing and local error rules.

05 pronoun should be possessive pronoun This error type can be handled by partial parsing, provided that the subdivision between personal and possessive pronouns is made by feature values.


06 adjective used as a noun should be in the genitive case This error type could be handled by partial parsing, but the approach is very much dependent on the specific errors in the database.

07 other problems with case This error type could be handled by partial parsing, but the approach is very much dependent on the specific errors in the database.

Adjective phrase (GPNPAP)

Wrong word category problems are structural problems that can be solved by local error rules, while wrong type of adverb could be detected by partial parsing if an appropriate subdivision is made of the adverbs in terms of feature values. Partial parsing may also be used for choosing the proper form of the adjective when it is used as a noun.

01 wrong word category of the premodifier This error type can be handled by local error rules.

04 wrong word category of the head adjective This error type can be handled by local error rules.

02 wrong type of adverb in premodifier This error type can be handled by partial parsing.

03 adjective used as a noun This error type can be handled by partial parsing.

05 other problems This error type could be handled by partial parsing or local error rules, but the approach is very much dependent on the specific errors in the database.

Participles (GPNPPE)

Participles may be viewed either as verbs or as adjectives. Wrong verb form is a non-structural error if participles are treated as members of the verb category, and a structural error if the participles form a separate category. In the former case, the problem is easily handled by partial parsing. Otherwise local error rules of foreseen erroneous categories can be written. Since participles have more in common with adjectives, in these analyses they are treated as a separate category, thus the error types in this subcategory are to be handled by local error rules.

01 wrong verb form in premodifier This error type can be handled by local error rules, provided that participles are treated as a separate category.

02 wrong verb form in postmodifier This error type can be handled by local error rules, provided that participles are treated as a separate category.


Numerals (GPNPNL)

01 approximate number This error type could be handled by local error rules, since the problem is about word categories.

02 numeral missing in certain expressions This error type could be handled by local error rules, since the problem is about a missing category.

03 wrong word category This error type could be handled by local error rules, since the problem is about word categories.

Nouns (GPNPNN)

01 head noun missing A missing head noun is a structural error that can be handled by local error rules.

02 wrong word category Wrong word category is a structural error that can be handled by local error rules.

03 doubled noun Doubled noun is a structural error that can be handled by local error rules.

Pronouns (GPNPPN)

01 relative pronoun missing A missing relative pronoun is a structural error that can be handled by local error rules.

02 doubled pronoun Doubled pronoun is a structural error that can be handled by local error rules.

03 wrong type of pronoun Wrong type of pronoun is a non-structural error if the distinction between pronouns is made in a feature. If the pronouns are divided into different categories, the error type is a structural problem. The assumption is that pronouns are held together in one category, thus making this problem solvable by partial parsing.

04 wrong word category Wrong word category is a structural error that can be handled by local error rules.

05 other problems This error type could be handled by local error rules, but the approach is very much dependent on the specific errors in the database.


Choice of preposition after a noun (GPNPCP)

Error types addressing choice of preposition are all non-structural problems. Therefore, partial parsing is the best way to handle these problems.

01 noun + preposition + NP This error type can be handled by partial parsing.

02 noun + preposition + infinitive phrase This error type can be handled by partial parsing.

10 noun + preposition + att-clause This error type can be handled by partial parsing.

03 noun + preposition + subordinate clause This error type can be handled by partial parsing.

04 noun + PP + preposition + NP This error type can be handled by partial parsing.

Error types addressing removal of preposition are all structural problems, but could be handled by partial parsing provided that the valency information is handled as feature values.

05 noun + infinitive phrase [no preposition] This error type can be handled by partial parsing.

06 noun + att-clause [no preposition] This error type can be handled by partial parsing.

08 noun + noun [no preposition] This error type can be handled by partial parsing.

Error types addressing removal of superfluous preposition are structural problems best handled by local error rules:

07 doubled preposition This error type can be handled by local error rules.

09 one preposition too many This error type can be handled by local error rules.
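The idea of treating noun valency as a feature, which underlies the partial parsing analyses in this subsection, might look as follows in Python. The lexicon entries are illustrative and incomplete, and both `NOUN_VALENCY` and `check_noun_preposition` are hypothetical names rather than part of any existing dictionary format.

```python
# Hypothetical valency feature: the prepositions each noun may select.
NOUN_VALENCY = {
    "intresse": {"för", "av"},
    "brist": {"på"},
}


def check_noun_preposition(noun, preposition):
    """Compare the observed preposition with the noun's valency feature."""
    allowed = NOUN_VALENCY.get(noun)
    if allowed is None:
        return "unknown"            # no valency information: nothing to check
    if preposition in allowed:
        return "ok"
    return "wrong preposition"


print(check_noun_preposition("brist", "på"))      # ok
print(check_noun_preposition("intresse", "på"))   # wrong preposition
print(check_noun_preposition("hus", "i"))         # unknown
```

Because the check only compares feature values and leaves the category sequence untouched, it fits the non-structural side of the division.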

Preposition missing after a noun (GPNPMP)

Error types involving inserting a preposition are structural errors that can be handled by partial parsing if the valency information is given in a feature. Otherwise, local error rules provide the best solution.

02 noun + preposition This error type can be handled by local error rules.


03 noun + preposition + NP This error type can be handled by partial parsing.

06 noun + preposition + infinitive phrase This error type can be handled by partial parsing.

01 noun + preposition + att-clause This error type can be handled by partial parsing.

05 noun + preposition + subordinate clause This error type can be handled by partial parsing.

04 other missing prepositions This error type can be handled by local error rules.

Other noun valency problems (GPNPNV)

01 noun + preposition + att-clause – att missing This error type can be handled by local error rules.

02 wrong word category This error type can be handled by local error rules.

03 repetition or not of preposition It is difficult to formalise when to repeat the preposition and when not to, so the error type lies outside the scope of partial parsing and local error rules.

Coordination (GPNPCO)

Error types that consist of constituents that should be inserted or removed constitute structural errors. The problem with many coordinations is the task of establishing what is coordinated with what.

01 conjunction missing This error type can be handled by local error rules.

02 asymmetric coordination This error type can be handled by local error rules.

03 comma replaced by coordinating conjunction This error type can be handled by local error rules.

04 other coordination problem This error type embraces different kinds of problems, so the error type as such lies outside the scope of partial parsing and local error rules, although some of the errors in the database may be handled by local error rules.


Word order (GPNPWO)

Word order problems involve categories and are therefore always structural problems that could be handled by local error rules.

01 noun & adjective This error type can be handled by local error rules.

02 noun & participle This error type can be handled by local error rules.

Other problems (GPNPOP)

Since the errors gathered here are of various kinds, no general approach can be taken.

2.2 Adjective Phrase (GPAP)

Adjective phrases considered in the adjective phrase category occur as predicatives. There are three kinds of AP problems, the majority of them being structural problems.

Wrong word category (GPAPWC)

The erroneous word precedes the head adjective. Since word categories are involved, wrong word category is a structural problem.

01 adjective => adverb This error type could be handled by local error rules.

Choice of preposition after an adjective (GPAPCP)

Changing one preposition to another one is a non-structural problem, while removing a preposition is a structural problem. However, if the valency information is handled by a feature, partial parsing can be used for both types of problems.

02 adjective + preposition + att-clause This error type could be handled by partial parsing.

03 adjective + infinitive phrase [no preposition] This error type could be handled by partial parsing.

01 adjective + att-clause [no preposition] This error type could be handled by partial parsing.

Comparing "än" (GPAPCM)

After an adjective in the comparative the word än may follow when making a comparison. However, this word might be missing or replaced by another expression. In both cases, the errors are structural ones that could be handled by local error rules.


2.3 Adverb Phrase (GPAB)

Word missing, doubled word, and word order are all structural problems.

Word missing (GPABWM)

This error type could be handled by local error rules.

Doubled word (GPABDW)

This error type could be handled by local error rules.

Word order (GPABWO)

This error type could be handled by local error rules.

Other problems (GPABOP)

Different kinds of errors are gathered here, and no general approach towards them can be taken.

2.4 Prepositional Phrase (GPPP)

In general, removal or insertion of a word makes the error a structural one. Replacing one word with another word is a non-structural error. However, there might be difficulties that make it impossible to approach the error type by either partial parsing or local error rules. For instance, the choice of preposition is very much dependent on contextual information which is difficult to formalise.

Prepositions (GPPPPR)

01 preposition to be removed [should not be a prepositional phrase] In principle, this error type could be handled by local error rules.

02 one preposition too many This error type could be handled by local error rules.

07 doubled preposition This error type could be handled by local error rules.

03 preposition missing This error type could be handled by local error rules.

08 preposition missing in coordination of phrases – phrases of different types In principle, this error type could be handled by local error rules.

09 preposition missing in coordination of phrases – phrases of the same type In principle, this error type could be handled by local error rules.


05 wrong word category This error type could be handled by local error rules.

10 comma => preposition This error type is difficult to handle by partial parsing or local error rules.

04 wrong preposition; choice of preposition This error type is difficult to handle by partial parsing or local error rules.

06 wrong preposition in coordinated PPs This error type is difficult to handle by partial parsing or local error rules.

Complements (GPPPCO)

Problems in complements in prepositional phrases may be either structural or non-structural.

01 erroneous construction after a certain preposition This is a structural error that can be handled by local error rules.

02 consistency in complements This problem may be handled by partial parsing, since the word categories are the same. However, special rules may have to be written to capture in which cases features ought to be unified.

03 complement missing This is a structural error that can be handled by local error rules.

04 med phrase This is a structural error that can be handled by local error rules.

2.5 Conjunctions and Conjunctive Adverbs (GPCN)

Conjunction or conjunctive adverb missing (GPCNCM)

A missing conjunction or conjunctive adverb is a structural problem that can be approached by local error rules.

04 coordinating conjunction missing This error type can be handled by local error rules.

01 subordinating conjunction or conjunctive adverb missing This error type can be handled by local error rules.

02 comma => subordinating conjunction This error type can be handled by local error rules.

03 comma => coordinating conjunction This error type can be handled by local error rules.


Complex conjunction (GPCNCC)

A complex conjunction is a conjunction consisting of more than one word. If the words may not appear on their own, the error can be detected by looking in a dictionary allowing multi-word entries. Otherwise, local error rules are the appropriate approach.

01 continuous This error type can be handled by local error rules.

02 discontinuous This error type can be handled by local error rules.
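The dictionary-based alternative mentioned for complex conjunctions can be sketched in Python as below. The sketch assumes a dictionary that records, for a word that does not occur on its own, the partner it requires later in the clause; the single entry (antingen requiring eller) and the names `BOUND_WORDS` and `check_complex_conjunctions` are hypothetical illustrations, not entries from the project dictionary.

```python
# Hypothetical multi-word entries: first part -> part that must follow later.
BOUND_WORDS = {"antingen": "eller"}


def check_complex_conjunctions(words):
    """Return (index, word, missing_partner) for every bound word whose
    required partner does not occur later in the word sequence."""
    problems = []
    for i, word in enumerate(words):
        partner = BOUND_WORDS.get(word)
        if partner is not None and partner not in words[i + 1:]:
            problems.append((i, word, partner))
    return problems


print(check_complex_conjunctions(["antingen", "du", "eller", "jag"]))
# []
print(check_complex_conjunctions(["antingen", "du", "och", "jag"]))
# [(0, 'antingen', 'eller')]
```

When the parts of a conjunction can also occur independently, a lookup of this kind is not sufficient and local error rules are needed instead.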

Doubled conjunctions (GPCNDW)

Like other instances of repeated words, these cases are structural problems best handled by local error rules.

01 coordinating conjunction This error type can be handled by local error rules.

02 subordinating conjunction This error type can be handled by local error rules.

Erroneous conjunction (GPCNEC)

These errors are about sequences of categories, and therefore best handled by local error rules.

01 subordinating conjunction – no clause This error type can be handled by local error rules.

02 subordinating conjunction – two conjunctions This error type can be handled by local error rules.

Wrong word category (GPCNWC)

Wrong word category problems can be handled by local error rules.

01 pronoun This error type can be handled by local error rules.

03 adverb This error type can be handled by local error rules.

04 preposition This error type can be handled by local error rules.

05 verb This error type can be handled by local error rules.


06 adjective This error type can be handled by local error rules.

02 other This error type can be handled by local error rules.

2.6 Verb Phrase in the Limited Sense (GPVF)

The problems in the verb phrase in the limited sense are divided into subcategories depending on the combination of verbs. If the division is made according to the error types in terms of structural and non-structural errors, the main problems are illegal combinations of verbs or wrong type of verb and missing auxiliary verbs. Illegal combinations of verbs are structural problems if a verb should be removed, otherwise they are non-structural errors in the sense that the problems can be handled by feature propagation, although well-designed grammar rules are needed. Missing auxiliary verbs are always structural errors.

Main verb in the finite form (GPVFMF)

Error types dealing with two verbs of which one should be removed:

01 presens + presens => presens This error type can be handled by local error rules.

02 presens + preteritum => preteritum This error type can be handled by local error rules.

03 preteritum + preteritum => preteritum This error type can be handled by local error rules.

Error types dealing with one verb which should be changed:

04 infinitiv => presens This error type can be handled by partial parsing.

08 infinitiv => preteritum This error type can be handled by partial parsing.

05 supinum => imperativ This error type can be handled by partial parsing.

07 supinum => preteritum This error type can be handled by partial parsing.

10 perfektparticip => presens This error type can be handled by partial parsing.

06 perfektparticip => preteritum This error type can be handled by partial parsing.

09 presensparticip => preteritum This error type can be handled by partial parsing.

12 preteritum => imperativ This error type can be handled by partial parsing.

Error type addressing erroneous infinitive mark:

11 att + preteritum => preteritum This error type can be handled by local error rules.

Temporal auxiliary verb in the finite form + Main verb in the supine (GPVFTS)

Error types dealing with erroneous infinite verb forms:

01 har/hade + presens => har/hade + supinum This error type can be handled by partial parsing.

02 har/hade + infinitiv => har/hade + supinum This error type can be handled by partial parsing.

08 har/hade + perfektparticip => har/hade + supinum This error type can be handled by partial parsing.

07 auxiliary verb omitted + perfect participle This error type can be handled by partial parsing.

Error types dealing with erroneous auxiliary verbs:

09 ha + supinum => har/hade + supinum This error type can be handled by partial parsing.

03 wrong auxiliary verb This error type can be handled by partial parsing.

04 doubled auxiliary verb This error type can be handled by local error rules.

05 missing auxiliary verb This error type can be handled by local error rules.

06 wrong word category of the auxiliary verb This error type can be handled by local error rules.
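
To make the notion of a local error rule more concrete, the doubled auxiliary verb type (04) can be sketched as a pattern over a sequence of word categories. The following Python fragment is only an illustration under stated assumptions: the category label AUX, the (word, category) token format, and the function name are hypothetical and are not taken from the Scarrie rule formalism.

```python
# Minimal sketch of a "local error rule": a pattern over adjacent
# (word, category) pairs. The tag name "AUX" is a hypothetical label.

def find_doubled_auxiliary(tagged):
    """Return the indices at which the same auxiliary verb occurs twice in a row."""
    hits = []
    for i in range(len(tagged) - 1):
        word, category = tagged[i]
        next_word, next_category = tagged[i + 1]
        if (category == "AUX" and next_category == "AUX"
                and word.lower() == next_word.lower()):
            hits.append(i)
    return hits


# Toy input: "Hon har har sprungit" with hand-assigned categories.
sentence = [("Hon", "PRON"), ("har", "AUX"), ("har", "AUX"), ("sprungit", "SUP")]
print(find_doubled_auxiliary(sentence))  # [1]
```

The rule looks only at a fixed window of neighbouring categories, which is why it can catch a structural error without a full parse.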

Existential auxiliary verb in the finite form + Main verb in the perfect participle (GPVFEP)

01 är/var + presens => är/var + perfektparticip This error type can be handled by partial parsing.

02 auxiliary verb missing This error type can be handled by local error rules.

Auxiliary verb in the finite form + Main verb in the infinitive (GPVFAI)

Error types concerning the auxiliary verb:

01 infinitiv + infinitiv => presens/preteritum + infinitiv This error type can be handled by partial parsing.

05 supinum + infinitiv => presens/preteritum + infinitiv This error type can be handled by partial parsing.

10 wrong word category of the auxiliary verb This error type can be handled by local error rules.

13 missing auxiliary verb This error type can be handled by local error rules.

14 doubled auxiliary verb This error type can be handled by local error rules.

Error types concerning the main verb:

02 presens + preteritum => presens + infinitiv This error type can be handled by partial parsing.

03 presens + presens => presens + infinitiv This error type can be handled by partial parsing.

04 preteritum + preteritum => preteritum + infinitiv This error type can be handled by partial parsing.

08 preteritum + supinum => preteritum + infinitiv This error type can be handled by partial parsing.

11 preteritum + imperativ => preteritum + infinitiv This error type can be handled by partial parsing.

06 missing infinitive This error type can be handled by local error rules.

15 doubled infinitive or two infinitives This error type can be handled by local error rules.

09 wrong word category of the infinitive This error type can be handled by local error rules.

Error types concerning incorrect infinitive marks:

07 presens/preteritum + att + infinitiv [att should be removed] This error type can be handled by local error rules.

12 att + presens + infinitiv [att should be removed] This error type can be handled by local error rules.

Combination of auxiliary verbs + Main verb (GPVFAM)

The majority of these problems are non-structural, provided that appropriate grammar rules are available, and can thus be handled by partial parsing.

01 two finite auxiliary verbs This error type can be handled by partial parsing.

02 infinitive + infinitive This error type can be handled by partial parsing.

03 supine + imperative / perfect participle This error type can be handled by partial parsing.

04 ha + perfect participle This error type can be handled by partial parsing.

05 ha doubled This error type can be handled by local error rules.

06 modal auxiliary + supine This error type can be handled by partial parsing.

Coordination of verbs (GPVFCO)

Coordination problems may be handled by partial parsing.

01 auxiliary verb + coordinated infinitives This error type can be handled by partial parsing.

02 bliva + coordinated perfect participles This error type can be handled by partial parsing.

03 coordinating conjunction missing This error type can be handled by partial parsing.

Infinitive in infinitive phrase (GPVFIP)

Choice of verb form after the infinitive mark is a problem that can be captured by partial parsing. Deletion or insertion of a verb involves a structural change, and could thus be handled by local error rules.

01 presens => infinitiv This error type can be handled by partial parsing.

02 supinum => infinitiv This error type can be handled by partial parsing.

05 perfektparticip => infinitiv This error type can be handled by partial parsing.

06 presens + infinitiv => infinitiv This error type can be handled by local error rules.

03 missing infinitive This error type can be handled by local error rules.

04 wrong word category This error type can be handled by local error rules.

Other problems (GPVFOP)

There are erroneous sequences of verbs that do not easily fit into any of the other subcategories. In general, these problems lie outside the scope of partial parsing and local error rules.

2.7 Verb Valency (GPVV)

Verb valency problems may be divided into two main types: complements, and prepositions or adverbs governed by the verbs. To handle errors concerning complements, well-designed grammar rules are needed that state the verb valency as a feature. The errors can then be detected by partial parsing. Local error rules are also possible, especially if a certain correction strategy is preferred. Whenever partial parsing could be used, that approach has been suggested as a result of the analyses.
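
The idea of stating verb valency as a feature can be illustrated with a small sketch. It is written in Python under loud assumptions: the toy lexicon, the feature values, and the function name are invented for illustration, and a real grammar would derive the presence of an object from the partial parse rather than receive it as an argument.

```python
# Hypothetical toy lexicon: each verb form carries a valency feature.
VALENCY = {
    "sover": "intransitive",   # assumption: treated as strictly intransitive
    "saknar": "transitive",    # assumption: treated as strictly transitive
}


def valency_error(verb, has_object):
    """Compare the valency feature of the verb with its context.

    Returns a short description of the mismatch, or None if the verb is
    unknown or the context is compatible with the feature value.
    """
    valency = VALENCY.get(verb)
    if valency == "intransitive" and has_object:
        return "intransitive verb in transitive context"
    if valency == "transitive" and not has_object:
        return "transitive verb in intransitive context"
    return None


print(valency_error("sover", has_object=True))
print(valency_error("saknar", has_object=True))
```

The check is a plain comparison of feature values, which is the sense in which such errors are non-structural: the category sequence itself may be legal for some other verb.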

Intransitivity (GPVVIN)

01 transitive verb => intransitive verb This error type can be handled by partial parsing.

02 transitive context => intransitive context This error type can be handled by partial parsing.

Transitivity (GPVVTR)

01 transitive verb => transitive context This error type can be handled by partial parsing.

02 intransitive verb => transitive verb This error type can be handled by partial parsing.

Copula (GPVVCO)

01 transitive verb => copula This error type can be handled by partial parsing.

Reflexivity (GPVVRE)

01 reflexive => non-reflexive This error type can be handled by partial parsing.

02 non-reflexive => reflexive This error type can be handled by partial parsing.

03 one reflexive pronoun too many This error type can be handled by local error rules.

04 wrong word – should be a reflexive pronoun This error type can be handled by local error rules.

05 other problems with reflexivity This error type lies outside the scope of partial parsing and local error rules.

Passive constructions (GPVVPC)

01 s-form: active voice => passive voice This error type can be handled by partial parsing.

02 s-form: passive voice => active voice This error type can be handled by partial parsing.

03 construction with få This error type can be handled by partial parsing.

04 active context => passive context This error type can be handled by partial parsing.

05 passive context => active context This error type can be handled by partial parsing.

Object with infinitive (GPVVOI)

01 preteritum + NP + att + infinitiv [att to be removed] This error type can be handled by local error rules.

02 other erroneous construction Errors belonging to this error type may be handled by partial parsing or local error rules.

Prepositional phrase (GPVVPP)

Certain verbs and verbal expressions take a prepositional phrase as an obligatory complement. Problems with this complement type may, like the other error types, be handled by partial parsing provided that the verbs are categorised in an appropriate manner.

01 PP missing This error type can be handled by partial parsing.

Infinitive phrase (GPVVIP)

A missing infinitive mark is a structural problem that can be detected by partial parsing, provided that a distinction is made between auxiliary verbs (which take an infinitive without the infinitive mark) and main verbs that take an infinitive phrase as a complement. The problem may also be handled by local error rules. As in the previous verb valency categories, partial parsing is suggested as the appropriate approach whenever it is possible.

01 the infinitive mark att missing after the verb komma This error type can be handled by partial parsing.

02 the infinitive mark att missing – other cases This error type can be handled by partial parsing.

03 the infinitive mark att doubled This error type can be handled by local error rules.

04 the infinitive mark att to be removed No general approach may be suggested, neither partial parsing nor local error rules.

05 wrong word This error type can be handled by local error rules.

Clause (GPVVCL)

Clauses may also function as verb complements. The error types are probably best handled by local error rules, even though they could be detected by means of partial parsing.

01 att missing in att-clause This error type can be handled by local error rules.

Position holding ”det” (GPVVID)

Problems with position holding det could be handled by either partial parsing or local error rules. The clauses are of a certain kind and det is often erroneously replaced by de or another pronoun, which is why the suggested approach is partial parsing.

01 existential det This error type can be handled by partial parsing.

02 det in emphatic constructions This error type can be handled by partial parsing.

VF missing (GPVVVM)

Missing verb sequence is a structural problem best handled by local error rules.

01 verb inserted This error type can be handled by local error rules.

02 wrong word category This error type can be handled by local error rules.

NP missing (GPVVNM)

Missing subjects are best handled by local error rules, since the errors are structural.

01 subject in clause with inversion This error type can be handled by local error rules.

02 subject in clause without inversion This error type can be handled by local error rules.

03 subject in att-clause This error type can be handled by local error rules.

04 subject in relative clause This error type can be handled by local error rules.

Choice of preposition/adverb after verbs (GPVVCP)

Just as choices of prepositions governed by nouns and adverbs are preferably dealt with by partial parsing, so are choices of prepositions or adverbs governed by verbs. If the preposition or adverb is to be removed, the category sequence is changed. The problem may still be handled by partial parsing provided that valency information is handled in features and that the combination of categories might be appropriate on some occasions. Otherwise, local error rules could handle the problems properly.

Error types dealing with replacing one word with another:

01 verb + preposition + NP This error type can be handled by partial parsing.

02 verb + preposition/adverb + att-clause This error type can be handled by partial parsing.

03 verb + preposition + infinitive phrase This error type can be handled by partial parsing.

04 verb + adverb + NP This error type can be handled by partial parsing.

16 verb + adverb + preposition + NP This error type can be handled by partial parsing.

10 verb + adverb + PP This error type can be handled by partial parsing.

05 verb + pronoun + som + NP This error type can be handled by partial parsing.

08 verb + noun + preposition This error type can be handled by partial parsing.

14 verb + reflexive pronoun + preposition/adverb This error type can be handled by partial parsing.

09 verb + reflexive pronoun + preposition + NP This error type can be handled by partial parsing.

Error types that involve removing a word change the sequence of categories and are thus structural problems. The problems may nevertheless be handled by partial parsing using valency information in terms of feature values, provided that the sequence of categories could be appropriate for some other verb.

11 verb [no preposition or adverb] This error type can be handled by partial parsing.

06 verb + NP [no preposition or adverb] This error type can be handled by partial parsing.

12 verb + infinitive phrase [no preposition or adverb] This error type can be handled by partial parsing.

13 verb + att-clause [no preposition or adverb] This error type can be handled by partial parsing.

15 verb + adjective + NP [no preposition or adverb] This error type can be handled by partial parsing.

07 one preposition/adverb too many This error type can be handled by local error rules.

Preposition/adverb missing after verbs (GPVVMP)

A missing word is a structural problem, but it can be handled by partial parsing if the valency information is provided by means of feature values, and the combination of categories is legal for some verbs but not for others.

05 verb + preposition/adverb + NP This error type can be handled by partial parsing.

01 verb + preposition/adverb + clause This error type can be handled by partial parsing.

09 verb + reflexive pronoun + preposition + NP This error type can be handled by partial parsing.

08 verb + reflexive pronoun + preposition + infinitive phrase This error type can be handled by partial parsing.

02 verb + reflexive pronoun + noun + preposition + noun This error type can be handled by partial parsing.

07 verb + preposition + infinitive phrase [att may also be missing] This error type can be handled by partial parsing.

11 verb + adverb + preposition + NP This error type can be handled by partial parsing.

10 verb + adverb + preposition + att-clause This error type can be handled by partial parsing.

03 verb + adverb + preposition + infinitive phrase [att may also be missing] This error type can be handled by partial parsing.

12 verb + preposition + noun + infinitive phrase This error type can be handled by partial parsing.

04 verb + reflexive pronoun + som + NP This error type can be handled by partial parsing.

06 verb + som + clause This error type can be handled by partial parsing.

13 verb + noun + preposition This error type can be handled by partial parsing.

Repetition of preposition/adverb (GPVVRP)

It is difficult to formalise when to repeat the preposition and when not to, which is why the error type lies outside the scope of partial parsing and local error rules.

01 phrases of the same type The error type lies outside the scope of partial parsing and local error rules.

02 phrases of different types The error type lies outside the scope of partial parsing and local error rules.

2.8 Pronoun Case (GPPC)

Pronoun case problems are non-structural errors in the sense that they can be handled by features, although grammar rules are needed to specify in which position in the clause which case is to be used.

Subjective form correct (GPPCSF)

02 objective form => subjective form, followed by a relative clause This error type can be handled by partial parsing.

Objective form correct (GPPCOF)

01 subjective form => objective form This error type can be handled by partial parsing.

02 subjective form => objective form, followed by a relative clause This error type can be handled by partial parsing.

2.9 Agreement (GPAG)

Agreement errors are non-structural errors; it is thus possible to handle them by partial parsing.
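
As an illustration of why agreement errors count as non-structural, the check can be sketched as a comparison of feature values between two constituents. The Python fragment below is a minimal sketch, not the project's formalism: the feature names, their values, and the dictionary representation of a phrase are assumptions made for the example.

```python
def agreement_conflicts(first, second, features=("number", "gender", "species")):
    """Return the features whose values are specified on both phrases but differ.

    A feature that is missing on either side is treated as unspecified and
    therefore never causes a conflict.
    """
    return [f for f in features
            if f in first and f in second and first[f] != second[f]]


# "Huset är stor": neuter subject, complement adjective in the common gender form.
subject = {"number": "sg", "gender": "neuter", "species": "definite"}
complement = {"number": "sg", "gender": "common"}
print(agreement_conflicts(subject, complement))  # ['gender']
```

The sequence of categories is the same in the correct and the incorrect sentence; only a feature value differs, so no separate error rule describing the erroneous sequence is required.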

NP and AP – subject and complement (GPAGNA)

01 number in non-collective nouns This error type can be handled by partial parsing.

02 number in collective nouns This error type can be handled by partial parsing.

07 number in coordinated noun phrases It might be difficult to specify the feature values of the resulting noun phrase, but the error type could in principle be handled by partial parsing.

06 number in heading without copula This error type may be handled as an agreement error within a noun phrase, which can be done by partial parsing.

03 gender This error type can be handled by partial parsing.

04 gender in specific/general meaning It is difficult to formalise when the specific meaning is used, and when the general meaning is used. This error type lies outside the scope of partial parsing and local error rules.

05 head noun/relative pronoun and AP in relative clause This error type can be handled by partial parsing.

NP and AP – object and complement (GPAGNO)

Provided that well-designed grammar rules are available, the agreement problems may be handled by partial parsing.

01 gender This error type can be handled by partial parsing.

02 species This error type can be handled by partial parsing.

AP and AP – subject and complement (GPAGAA)

01 number This error type can be handled by partial parsing.

NP and perfect participle – subject and complement (GPAGNE)

01 gender This error type can be handled by partial parsing.

02 number This error type can be handled by partial parsing.

03 person This error type can be handled by partial parsing.

NP and pronoun – subject and complement (GPAGPN)

01 number This error type can be handled by partial parsing.

NP and NP – subject and complement (GPAGNP)

There might be cases for which the noun phrases should not agree, and it is difficult to formalise the conditions when agreement is correct and when it is not. Therefore, agreement between noun phrases lies outside the scope of partial parsing and local error rules.

01 number This error type lies outside the scope of partial parsing and local error rules.

NP and NP in ”som” phrases – subject and complement (GPAGNS)

The special constructions in which the noun following som should agree with the subject of the clause may be captured by local error rules.

01 number This error type can be handled by local error rules.

02 gender This error type can be handled by local error rules.

NP and NP – object and complement (GPAGNN)

As in the subcategory above, a noun in a so-called som-phrase should agree with another noun, here functioning as the object.

01 number This error type can be handled by local error rules.

2.10 Referential Problems (GPRP)

Reference problems within the sentence lie outside the scope of partial parsing and local error rules.

Pronoun reference (GPRPPN)

01 anaphoric reference This error type lies outside the scope of partial parsing and local error rules.

02 deictic reference This error type lies outside the scope of partial parsing and local error rules.

Choice of VF (GPRPVF)

01 conditional subordinate clause This error type lies outside the scope of partial parsing and local error rules.

02 comparative subordinate clause This error type lies outside the scope of partial parsing and local error rules.

03 consistency This error type lies outside the scope of partial parsing and local error rules.

04 combination of verb form and temporal adverbial This error type lies outside the scope of partial parsing and local error rules.

2.11 Word Order (GPWO)

Word order problems involve erroneous sequences of categories. The problems are therefore best handled by local error rules.

Inversion (GPWOIN)

Inversion takes place when the finite verb precedes the subject of the clause. The word order shift thus involves the finite verb and a noun phrase. The specifications do not state what causes the inversion, only if the inversion is correct or incorrect.

01 inversion => not inversion This error type can be handled by local error rules.

02 not inversion => inversion This error type can be handled by local error rules.

Inserted phrase (GPWOIP)

01 before => after the finite verb This error type can be handled by local error rules.

Adverb phrase (GPWOAB)

01 noun phrase This error type can be handled by local error rules.

02 preposition This error type can be handled by local error rules.

06 prepositional phrase This error type can be handled by local error rules.

03 finite verb This error type can be handled by local error rules.

04 infinite verb This error type can be handled by local error rules.

05 adverb governed by a verb This error type can be handled by local error rules.

Noun phrase (GPWONP)

01 reflexive pronoun This error type can be handled by local error rules.

02 infinite verb This error type can be handled by local error rules.

Prepositional phrase (GPWOPP)

01 infinitive phrase This error type can be handled by local error rules.

02 finite verb This error type can be handled by local error rules.

03 finite verb + adverb This error type can be handled by local error rules.

05 finite verb + noun phrase This error type can be handled by local error rules.

06 infinitive mark This error type can be handled by local error rules.

04 prepositional phrase This error type lies outside the scope of partial parsing and local error rules.

Other word order problems (GPWOOP)

01 både ... och ... This error type can be handled by local error rules.

02 såväl … som … This error type can be handled by local error rules.

03 other problems This error type can be handled by local error rules.

2.12 Wrong Word Category (GPWC)

Word category errors are structural errors, which can be handled by local error rules.

Adjective (GPWCAV)

01 verb This error type can be handled by local error rules.

02 preposition This error type can be handled by local error rules.

Adverb (GPWCAB)

01 noun This error type can be handled by local error rules.

02 verb This error type can be handled by local error rules.

03 adjective This error type can be handled by local error rules.

05 preposition This error type can be handled by local error rules.

04 other This error type lies outside the scope of partial parsing and local error rules.

Pronoun (GPWCPN)

01 other This error type can be handled by local error rules.

2.13 Other Grammar Problems (GPOG)

In this category, a variety of grammar problems are gathered, most of them involving structural issues. However, since the problems are quite complex, the error types lie outside the scope of partial parsing and local error rules.

Coordinations (GPOGCO)

No general approach can be taken.

Word missing (GPOGWM)

No general approach can be taken.

Doubled words (GPOGDW)

Doubled words problems can be handled by local error rules, if any generalisation can be made.

Heading (GPOGHE)

The error type lies outside the scope of partial parsing and local error rules.

Strange syntax and other grammatical problems (GPOGOP)

No general approach can be taken.

3 Analyses of Other Error Types

Grammar errors are not the only error group in the typology for which syntactic analyses are needed for detection and correction of the language errors, even though they form the most central part of the typology for the development of the grammar checker. In this chapter, spelling errors, punctuation problems, graphical problems, and style, meaning and reference problems are discussed.

3.1 Spelling Errors (SE)

The majority of the spelling errors are to be handled by the word checker. In the word formation category, there are problems whose detection involves a larger context than the single orthographic word.

Split words (SEWFSW)

One split words problem that may be handled by the word checker is when both words are non-lexical but the concatenated word is a lexical one. It is also possible to capture some errors in a negative dictionary, for instance christian names that can be recognised together with the surname.
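
The first case, in which both parts are non-lexical but their concatenation is a lexical word, lends itself to a simple lookup. The sketch below is in Python and rests on assumptions: the toy lexicon, the token list input, and the function name are hypothetical, and the actual word checker may work quite differently.

```python
# Hypothetical toy lexicon; a real word checker would use its full dictionary.
LEXICON = {"en", "bra", "grammatik"}


def find_split_words(tokens, lexicon):
    """Find adjacent token pairs that are both non-lexical but lexical when joined.

    Returns a list of (index_of_first_token, joined_word) pairs.
    """
    hits = []
    for i in range(len(tokens) - 1):
        left, right = tokens[i].lower(), tokens[i + 1].lower()
        joined = left + right
        if left not in lexicon and right not in lexicon and joined in lexicon:
            hits.append((i, joined))
    return hits


print(find_split_words(["en", "bra", "gramma", "tik"], LEXICON))  # [(2, 'grammatik')]
```

The lookup deliberately ignores pairs in which either part is itself a lexical word, since those are the problematic cases that a word-level check cannot decide.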

The most problematic cases involve correctly spelled words that, in the specific context, should be written as one word. In other cases, at least one word is misspelt. This word might be dealt with in the word checker, and the result would then be two correctly spelled words that are transferred to the grammar checker (with a notation about the correction or suggested correction of the misspelt word). The main problem thus occurs when two correct words ought to be concatenated: how can the parser be made to check for that solution when it fails to parse the erroneous sentence? As a general approach, the problem can be handled by neither partial parsing nor local error rules.

Coordination with common word part (SEWFCO)

Coordination with common word part involves more than one graphical word. The most common mistakes may be handled by a negative dictionary. Otherwise, some of the problems could be detected if the grammar rules are made to handle asymmetric coordinations, so that an adjective and a noun cannot form a noun phrase by coordination (e.g. enmans och familjeföretag).

The cases in which no common word part exists are more difficult, since a special compound analysis would be needed. This is also needed for detection of misplaced hyphens and spaces.

3.2 Punctuation Problems (PU)

Punctuation problems can involve end of sentence punctuation, capital letters at the beginning of sentences, usage of the comma, the dash within the sentence, the colon, and the semicolon. To detect punctuation problems in general, syntactic analyses are needed. However, some of the problems could be handled procedurally. For instance, doubled punctuation marks can be detected and corrected without a parser mechanism.

Local error rules for missing punctuation marks may be formed to detect certain punctuation problems. For instance, local error rules may capture that two main clauses should not follow each other without a conjunction or a punctuation mark between them. Such errors are classified with different error type codes depending on how the proof-reader chose to correct the errors. However, only a limited number of problems may be captured, since it is very difficult to formalise when to prefer one punctuation mark to another. Below, the most interesting error types are discussed.

3.2.1 End of Sentence Punctuation (PUES)

Punctuation mark missing (PUESPM)

Missing punctuation marks may be difficult to detect, especially if the word checker changes a capital letter in a common word when it does not stand at the beginning of the sentence; the capital letter could otherwise have been a sign that a punctuation mark should be inserted.

Choice of end of sentence punctuation (PUESEC)

The type of sentence may indicate which punctuation mark to use. For instance, a question ought to end with a question mark, but it can end with another mark and still be correct. The sentence type and the preferable punctuation mark can be incorporated in the grammar rules, either by partial parsing or by local error rules depending on the division of the punctuation marks in terms of categories and/or features. For certain error types, it may be the best solution to write local error rules.

Full stop together with quotation marks or parentheses (PUESFS)

To decide which comes first, the full stop or the quotation mark, one needs to know whether the whole sentence is a citation or only a part of it is. The same goes for full stops and parentheses. These error types can be handled by local error rules.

One punctuation mark too many (PUESPT)

Double punctuation marks can be handled by local error rules in which the erroneous combinations are stated.
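
A rule set in which the erroneous combinations are stated explicitly can be sketched as a lookup table applied to runs of punctuation marks. The following Python fragment is a hedged illustration: the table entries and the chosen replacements are assumptions made for the example and not a list taken from the error typology.

```python
import re

# Hypothetical table of erroneous combinations and their suggested correction.
ERRONEOUS_COMBINATIONS = {
    "..": ".",
    ",,": ",",
    "?.": "?",
    "!.": "!",
}

# A run of two or more punctuation marks.
PUNCTUATION_RUN = re.compile(r"[.,;:!?]{2,}")


def fix_doubled_punctuation(text):
    """Replace punctuation runs listed in the table; leave all other runs untouched."""
    def replace(match):
        run = match.group(0)
        return ERRONEOUS_COMBINATIONS.get(run, run)
    return PUNCTUATION_RUN.sub(replace, text)


print(fix_doubled_punctuation("Vad sa du?. Jag vet inte.."))
```

Because only listed combinations are changed, a run such as an ellipsis of three full stops passes through unmodified.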

Not end of sentence (PUESNE)

A punctuation mark is followed by a word beginning with a lower case character. This problem can be detected, but it may also be corrected by changing the lower case character to upper case instead of removing the punctuation mark. To suggest the proper correction, a syntactic analysis is needed which steps over the sentence boundary.
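
The detection step alone, without any decision about the correction, can be sketched with a regular expression. This Python fragment is an illustration under assumptions: the set of end-of-sentence marks and the lower case letter range are chosen for the example, and abbreviations ending in a full stop would produce false alarms that the sketch does not attempt to filter.

```python
import re

# An end-of-sentence mark, whitespace, then a lower case letter (captured).
LOWER_CASE_AFTER_MARK = re.compile(r"[.!?]\s+([a-zåäö])")


def find_lower_case_after_mark(text):
    """Return the character offsets of lower case letters that follow an end mark."""
    return [match.start(1) for match in LOWER_CASE_AFTER_MARK.finditer(text)]


print(find_lower_case_after_mark("Han kom. hon gick. Sedan åt de."))  # [9]
```

Whether the reported position calls for a capital letter or for removal of the preceding mark is left open, in line with the observation that the choice requires an analysis across the sentence boundary.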

3.2.2 Capital Letter (PUCP)

As mentioned above, the correction of a missing capital letter at the beginning of a sentence may collide with the problem of an erroneous punctuation mark that should be removed. Otherwise, a missing capital letter can be handled by the grammar checker if the punctuation mark is a full stop. The colon is much more difficult since it may be followed by a lower case character as well.

3.2.3 Comma (PUCO)

Problems with commas can be handled by local error rules. There are, however, many rules or guidelines concerning the usage of the comma sign which are contradictory and very difficult to formalise.

Main clauses (PUCOMC)

When main clauses are coordinated without any shared element, a comma should occur. This error type can be handled by local error rules.

Subordinate clause (PUCOSC)

If the subordinate clause is a necessary one, a comma should not occur. If it is not a necessary subordinate clause, a comma is appropriate. Syntactic information such as a determinative pronoun can be used for recognising the different types of subordinate clauses. Since deletion or insertion of a comma sign affects the sequence of categories, the comma problems are best handled by local error rules.

Phrases / units (PUCOPH)

To formalise when a phrase or a unit ought to or ought not to be surrounded by commas is a very difficult task. If it is achievable, local error rules would be the most appropriate way to handle the problems.

Parts of phrases / units (PUESPA)

Some of the cases may be captured by local error rules.

”Clarity criteria” (PUESCC)

The problems gathered in this category cannot be formalised in comma usage rules, and are therefore impossible to handle.

Comma instead of word (PUESIW)

The problems gathered in this category cannot be formalised in comma usage rules, and are therefore impossible to handle.

Comma correct (PUCOCO)

To capture when to use commas instead of other punctuation marks such as colon and dash is difficult to formalise. Therefore, these error types lie outside the scope of partial parsing and local error rules.

3.2.4 Dash within the Sentence (PUDW)

As for the majority of the comma problems, the usage of dash within the sentence is difficult to handle by partial parsing and local error rules.

3.2.5 Colon (PUCN)

The colon can end a sentence, but it can also be used within the sentence. As for other punctuation marks, a general approach for error types in the colon category is hardly possible. Some errors may, however, be detected by local error rules.

3.2.6 Semicolon (PUSN)

Semicolon is not perceived as a possible end of sentence mark. The basis for choosing semicolon over other punctuation marks is not easy to handle by partial parsing or local error rules.

3.2.7 Other Punctuation Problems (PUOP)

Erroneous punctuation in certain text types (PUOPEP)

Certain text types follow specific norms when it comes to punctuation. To handle these problems, the parser must be able to handle information about text types, e.g. headings. If that is possible, local error rules may properly handle the errors.

3.3 Graphical Problems (GR)

Graphical problems involve decisions about how to present the texts graphically, including signs. Some of the problems may be handled by a post-processing stage after the grammar checker, since a parser would not be needed for error detection and correction.

3.3.1 Space (GRSC)

There are well-defined guidelines for where a space before and after punctuation marks is appropriate and where there should be no space. A majority of the space sign problems can be properly handled by a post-processing stage.
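
Such a post-processing stage needs no parser, as the following Python sketch suggests. It is an illustration under assumptions: the two rules shown (no space before the listed punctuation marks, and no repeated spaces) are chosen for the example, whereas the actual guidelines are more detailed and are not reproduced here.

```python
import re


def normalise_spacing(text):
    """Apply two simple spacing rules to plain text.

    1. Remove spaces that directly precede a punctuation mark.
    2. Collapse runs of spaces into a single space.
    """
    text = re.sub(r" +([.,;:!?])", r"\1", text)
    text = re.sub(r" {2,}", " ", text)
    return text


print(normalise_spacing("Hej , hur  mår du ?"))
```

Only the space character is matched, so line breaks and tabs are left alone for the stages that deal with them.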

3.3.2 New Line / Paragraph (GRNL)

New line / paragraph to be removed (GRNLNR)

If a new line or a new paragraph token is not immediately preceded by an end of sentence mark, and if the text type is plain text, it can be possible to detect the error.
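
The condition described above can be sketched as a line-by-line check. The Python fragment below is a minimal illustration under assumptions: the text is taken to be plain text with the line feed character as the new line token, and the set of end-of-sentence marks (including the colon) is a choice made for the example.

```python
# Assumed set of marks that may legitimately precede a line break.
END_OF_SENTENCE_MARKS = ".!?:"


def suspicious_line_breaks(text):
    """Return 0-based indices of lines that end without an end-of-sentence mark.

    The last line is skipped because no line break follows it, and empty
    lines (paragraph breaks) are ignored.
    """
    lines = text.split("\n")
    suspicious = []
    for index, line in enumerate(lines[:-1]):
        stripped = line.rstrip()
        if stripped and stripped[-1] not in END_OF_SENTENCE_MARKS:
            suspicious.append(index)
    return suspicious


print(suspicious_line_breaks("Det här är en\nmening.\nNästa mening."))  # [0]
```

Headings and list items would also be reported by this check, which is why the restriction to plain text matters.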

Erroneously placed line break (GRNLAB)

A new line should not divide an abbreviation or a number. Such errors may be avoided if a hard space token is used instead of an ordinary soft space token. A mechanism for checking abbreviations and numbers in this respect is a possible solution.

New line / paragraph to be inserted (GRNLNI)

To change the structure of the text lies outside the scope of Scarrie.

3.3.3 Dash before Direct Speech (GRDS)

Problems concerning the graphical representation of the dash before direct speech can be handled without a parser. However, missing dashes lie outside the scope of Scarrie.

3.3.4 Dash within the Sentence (GRDW)

Problems concerning the graphical representation of the dash within the sentence can be handled. However, it may be difficult to distinguish between incorrect and correct usage of hyphens.

3.3.5 Quotation Marks (GRQM)

The usage of quotation marks is very difficult to formalise. Quotation marks not appearing in pairs may be detected, and so can erroneous combinations of så kallade ("so-called") and quotation marks. The most plausible solution is to look for these errors during post-processing.
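
The two detectable cases can be sketched as simple string checks. The sketch assumes that paragraphs are separated by blank lines, that the same character is used for opening and closing quotation marks (the straight mark or ”), and that the combination of interest is så kallade directly followed by a quotation mark. All names are hypothetical.

```python
import re

QUOTE_CHARS = '"\u201d'  # assumed quotation mark characters

# "så kallade" directly followed by a quotation mark.
SA_KALLADE_QUOTE = re.compile("så kallade\\s+[" + re.escape(QUOTE_CHARS) + "]")


def odd_quote_paragraphs(text):
    """Return 1-based numbers of paragraphs that contain an odd number
    of quotation marks, i.e. at least one mark without a partner."""
    hits = []
    for number, paragraph in enumerate(text.split("\n\n"), start=1):
        count = sum(paragraph.count(q) for q in QUOTE_CHARS)
        if count % 2:
            hits.append(number)
    return hits


def sa_kallade_with_quotes(text):
    """Return offsets where 'så kallade' is followed by a quotation mark."""
    return [m.start() for m in SA_KALLADE_QUOTE.finditer(text)]
```

Counting per paragraph cannot tell which of the marks is the unpaired one; it only signals that the paragraph needs attention.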

3.3.6 Parentheses (GRPA)

Parentheses normally appear in pairs. When they do not, the error may be detected during post-processing. Other types of errors involving parentheses lie outside the scope of Scarrie.
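
A pairing check of this kind needs only a stack of open parentheses. The following is a minimal sketch; the function name and the choice to report character offsets are assumptions made for this example.

```python
def unmatched_parentheses(text):
    """Return the offsets of all parentheses that lack a partner."""
    open_stack = []  # offsets of "(" still waiting for a ")"
    unmatched = []   # offsets of ")" with no preceding "("
    for offset, char in enumerate(text):
        if char == "(":
            open_stack.append(offset)
        elif char == ")":
            if open_stack:
                open_stack.pop()
            else:
                unmatched.append(offset)
    # Whatever remains on the stack was never closed.
    return sorted(unmatched + open_stack)
```

List markers such as "a)" would be reported as unmatched by this sketch and would need to be excluded in practice.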

3.3.7 Typographical Errors (GRTY)

Typographical errors primarily concern the choice of fonts (italics, bold, font size, etc.). They may be detected in connection with text type; otherwise they are outside the scope of Scarrie.

3.3.8 Other Graphical Problems (GROP)

Illegal signs may be detected by Scarrie.

3.4 Style, Meaning, and Reference (SP)

Style problems concerning the choice between correct word forms can be handled by using a special word list of preferred spelling alternatives, including proper names and abbreviations. Choice of syntactic construction is the only error type in this group for which partial parsing and/or local error rules can be used for detection within the Scarrie project.
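
The word list approach amounts to a lookup from a dispreferred form to the preferred one. The sketch below assumes that the text is already tokenised and that lookup is case-insensitive; the three entries are illustrative placeholders and not part of any actual Scarrie word list.

```python
# Illustrative placeholder entries: dispreferred form -> preferred form.
PREFERRED = {"mej": "mig", "sej": "sig", "nån": "någon"}


def suggest_preferred(tokens):
    """Return (position, token, preferred form) for every token that has
    a preferred alternative in the word list."""
    return [
        (position, token, PREFERRED[token.lower()])
        for position, token in enumerate(tokens)
        if token.lower() in PREFERRED
    ]
```

Because the check looks at individual word forms only, it needs neither partial parsing nor local error rules.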

There are several problems involving numbers that make it difficult to formulate rules for how to present them. This is also true of the other error categories not mentioned above.

4 Proposed Order of Priority

The order of priority is set partly according to the frequencies of the error types in the Error Corpora Database, and partly according to what can be achieved by partial parsing and local error rules. From the grammar problems group, error types have been singled out that will be the primary and secondary focus during the development of the grammar checker. The error types not mentioned below are also of great importance, although they have been given a lower priority because they are not as frequent.

Primary grammar problems:

– agreement within the noun phrase

– exceptions from agreement rules (species)

– case problems

– verb sequences

– structural errors: violations of category sequences of well-formed phrases and clauses

Secondary grammar problems:

– verb valency

– agreement between NP (subject) and AP (subjective complement)

– noun valency

– adjective valency

– pronoun case

Grammar problems that are outside the scope of partial parsing and/or local error rules involve difficulties with formalising the language usage. Such problematic issues may concern choices of number and species in noun phrases, and choices of prepositions. In general, referential problems and all stylistic and semantic problems going beyond the individual word form are also outside the scope of Scarrie.

Literature

Bustamente, Flora Ramírez & León, Fernando Sanchez (1996). GramCheck: A Grammar and Style Checker. In: Proceedings of the 16th International Conference of Computational Linguistics. Coling –96, p. 175–181.

Sågvall Hein, Anna (forthcoming). A Chart-Based Framework for Grammar Checking. Initial Studies. Nodalida 1998. Copenhagen.

Wedbjer Rambell et al. (1998). An Error Database of Swedish. SCARRIE, Deliverable 2.1.3.2, version 1.0. Uppsala University, Department of Linguistics.

Wedbjer Rambell, Olga (1998). Error Typology for Automatic Proof-reading Purposes. SCARRIE, Deliverable 2.1, version 1.1. Uppsala University, Department of Linguistics.

Vosse, Theo (1994). The Word Connection. Grammar-based Spelling Error Correction in Dutch. Amsterdam.
