TEI Recommendations for Encoding Language Corpora


Table of Contents

TEI Recommendations for Encoding Language Corpora

What is a corpus?

"linguistically motivated selection"

Sampling issues


BNC Composition

Sampling frame


User Requirements

1. Structural Issues

Newspapers and Periodicals

Transcribed speech

TEI basic structure

TEI basic structure (2)

TEI basic structure (3)

TEI Syntax

BNC Architecture


TEI proposals

BNC structure...

Low-level segmentation


Sample written text

Sample spoken text

2. Reference schemes

BNC reference scheme

Editorial practices

for example: skuzzy/SCSI


3. Texts are not just words...

For example...

Who does the work?

Types of linguistic annotation

Inherited properties

for example...

Word class categorization

BNC practice

Providing an analysis

using interp...

Hierarchic bundling of interps

Feature structures

Using a feature structure...

feature definitions may be stored as a feature library...

...and invoked by reference


discontinuous segments

discontinuous segments

discontinuous segments

Anaphoric reference

Translation pairs

4. Contextual Information

Information Discovery Needs

Describing a source

Contextualizing a source

The TEI header

TEI Header structure

The File Description

The publication statement

The source description

The Encoding Description

Editorial Declarations

The Profile Description

Text classification

For example...

BNC categorization scheme

Text classifications

Written text: medium

Spoken texts: region


Ancillary documentation

BNC Participant Description

BNC Participant Description

A BNC Setting Description

The revision description

Transcribing speech

The Spoken base tagset

Features of speech


Vocals and events

Voice quality and prosody

Another example





Defining a timeLine

Using a timeLine

The TEI as a standard

Why use this approach?

Email: lou.burnard@oucs.ox.ac.uk

Home Page: http://users.ox.ac.uk/~lou

Other information:
Presented at GLDV 99, 8 July 1999, Frankfurt a/M.