Human and computer learning of linguistic gender categories
Abstract
This dissertation addresses two issues: the importance of orthographic and semantic information in grammatical gender learning, and the ability of various categorization theories, implemented within connectionist frameworks, to model human learning of gender categories. Feature-based theories propose that a category is represented as a list of abstracted features, each weighted by how reliably it is associated with that category rather than with competing categories (termed cue reliability, from the Competition model); novel instances are categorized based on the summed weight of their features for a category. Exemplar-based theories propose that instances (termed exemplars) are stored in memory within their respective categories; novel items are categorized based on their similarity to the stored exemplars. The role of orthographic and semantic information is examined in two experiments, one using a measure of cue reliability, the other using a measure of exemplar similarity. In Experiment 1, 40 participants attempted to learn the correct adjective (petit or petite) for 24 French nouns. Participants demonstrated significantly higher levels of learning when orthographic cue reliability was high versus low, demonstrating the importance of orthographic information in gender learning. In Experiment 2, orthographic and semantic information (and their interrelationship) were manipulated based on exemplar similarity. Sixty-four participants attempted to learn the correct article (le or la) for a set of 20 pseudowords, and then generalized their learning to 7 additional pseudowords. Participants showed significantly higher levels of learning when orthographic similarity was high versus low, but did not show significant differences in learning for the semantic or interrelationship manipulations. For the generalization decisions, orthographic similarity predicted human performance more accurately than semantic similarity did.
Following the experiments, four connectionist models of categorization (the least mean squares, configural-cue, standard backpropagation, and exemplar-based backpropagation models) were compared for their ability to fit the Experiment 1 learning data. The exemplar-based model provided the best overall fit (average R² = .84). The fit of exemplar similarity to static portions of the Experiment 2 data supports the conclusion that the human participants placed more emphasis on orthographic than on semantic information. Extensions of the exemplar-based backpropagation model are discussed.
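
The two categorization rules described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the toy items, the cue names, the definition of cue reliability as a simple proportion, and the exponential similarity function are all assumptions chosen for illustration.

```python
import math

# Hypothetical training items: (set of cues, gender label). Not data from the study.
TRAINING = [
    (frozenset({"ends:-ette", "animate"}), "fem"),
    (frozenset({"ends:-ette"}), "fem"),
    (frozenset({"ends:-eau", "animate"}), "masc"),
    (frozenset({"ends:-eau"}), "masc"),
]
CATEGORIES = ("fem", "masc")


def cue_reliability(training, cue, category):
    """Assumed definition: share of items carrying `cue` that belong to `category`."""
    labels = [label for cues, label in training if cue in cues]
    if not labels:
        return 0.0
    return labels.count(category) / len(labels)


def classify_by_features(training, item_cues, categories):
    """Feature-based rule: sum the cue weights per category, pick the largest."""
    scores = {
        c: sum(cue_reliability(training, cue, c) for cue in item_cues)
        for c in categories
    }
    return max(scores, key=scores.get), scores


def similarity(a, b):
    """Assumed similarity: exponential decay in the number of mismatching cues."""
    return math.exp(-len(a ^ b))


def classify_by_exemplars(training, item_cues, categories):
    """Exemplar-based rule: sum similarity to stored exemplars per category."""
    scores = {
        c: sum(similarity(item_cues, cues) for cues, label in training if label == c)
        for c in categories
    }
    return max(scores, key=scores.get), scores


novel = frozenset({"ends:-ette", "animate"})
feature_choice, feature_scores = classify_by_features(TRAINING, novel, CATEGORIES)
exemplar_choice, exemplar_scores = classify_by_exemplars(TRAINING, novel, CATEGORIES)
print(feature_choice, exemplar_choice)
```

In this toy set the ending is a perfectly reliable cue while animacy is uninformative, so both rules assign the novel item to the same category. The two rules can diverge when individual cues are unreliable but whole stored items are distinctive.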