
are artificial languages scientifically interesting



Following is a post I'm making to sci.lang, but for which I'd like
comments from inside the community on how to improve (and suggestions for
additional examples along the same lines).  Post to the list if they
are of general interest.

Subject: The Scientific Value of Artificial Languages
Newsgroups: sci.lang
Organization: The Logical Language Group, Inc.
Summary: Several ideas, and more if you want them
Keywords: science, experiment, AL, model, Lojban

pautler@ils.nwu.edu (david pautler) says (12 Mar 91 15:59:02 GMT):
>        I did not say that ALs have no good use.  I said there's nothing
>particularly interesting about them (from a scientific viewpoint - this is
>a `sci' group) *because* they're artificial.  Some interesting sociological
>behaviors may appear if these languages come into widespread use, perhaps
>even some interesting linguistic phenomena if enough spontaneous innovation
>occurs (although AL enthusiasts seem to want to prevent this).  But there
>certainly doesn't appear to be anything interesting about them now, because
>AL enthusiasts in this group prefer to argue over which of several (truly
>arbitrary) conventions are "better".
>        I am willing to admit I am wrong about all this if some of you AL
>enthusiasts can give the rest of us some good reasons why ALs *are*
>scientifically interesting.

David later adds (15 Mar 91 04:46:32 GMT):
>        I still believe that knowing the design principles of any system
>beforehand makes a scientific study of those principles silly, but I'm
>going to get off my high horse and go back to being a level-headed
>contributor.

This addition definitely clarifies the goal, and the problem, especially
since it removes the loaded topic 'AL' from the question.  I will answer
primarily from the standpoint of Lojban, though some of my points are
applicable to Esperanto and other ALs.

David is taking a very limited view of science in presuming that the
design principles of a system are the only thing about that system of
interest to a scientist.  I can see a few other possibilities:

a) in a highly complex system (which even an AL is), the interaction of
the design features displays properties that are 'more than the sum of
the parts'.  Thus it is possible that all language is merely a system
comprised of a bunch of neurons releasing neurotransmitters.
Biochemistry may eventually devise a complete explanation for the
neuronic process (including genetic components), and we may then say we
"know the design principles of the system".  But this won't be the case,
because the complexity of those neuronic interactions is so great that
knowing the pieces does not give a total understanding of the >system<.
This indeed may be what defines the concept 'system'.

Knowing all the prescribed rules of an AL does not tell you how that AL
will be used communicatively, and I don't mean in the sociological
sense.  A sample question:  Given multiple ways of communicating the
same idea, do users of the language choose particular forms over others,
and why?  This is similar to a question that presumably is commonly
asked about natural languages.

I can come up with many other sample questions of science that can be
applied to the system of an AL that are not compromised by 'knowing the
design', but let's move on.  (Feel free to ask, though).

b) A simpler system, which can be more fully understood, may serve as an
excellent model for a less understood, more complex system.  Thus the
simpler system could be examined for parallels to hypotheses about the
more complex system.  Examination of the simpler system may suggest
properties to look for in the more complex system, or it may even
suggest hypotheses that can be tested in the more complex system.

A 'hot' topic in parts of the Lojban community is whether the language
has, or should have, an underlying semantic theory.  If there is one, it
is certainly not as developed or prescribed as the syntactic design and
theory.  As a result, filtering out syntactic ambiguity allows a more
direct examination of semantic ambiguities, including the properties of
modification and restriction, resolution of anaphora, and identification
of ellipses.  Any semantic theories proposed for natural language can be
looked at in terms of semantic usage in the simpler Lojban system.

It seems likely that any theory NOT true of Lojban is at least
suspicious with regard to natural language, thus allowing partial
verification of theories (not complete - I would never say that ALs
should be studied to the exclusion of natural languages, but rather in
relation to them); if, however, it IS true of natural language, then you
have found evidence that Lojban is in some way unnatural.  Then you get
to try to explain which of the known design features of Lojban causes
this unnaturalness.  By counterexample that design feature is NOT a
feature of natural languages.

Pragmatic effects can be more easily recognized in the simpler Lojban
system, and can clearly be identified as pragmatic.  Thus insights about
pragmatic effects may be more visible in Lojban, insights that would
then be tested in the natural languages.

Again, moving on.

c) Another aspect of a simple system is that it is easier to perform
experiments on than a more complex system.  There are fewer variables,
and if the system is 'designed', some things that are variables are in
effect TUNABLE constants, so that you can rerun the experiment with
minor changes to explore the effects of those variables.

Experimental linguistics is a virtually unthinkable possibility with the
natural languages.  The Sapir-Whorf Hypothesis (no I'm not trying to
ge susceptible to
the same analysis as natural language in terms of TG, GB, UG (or
whatever initials suit you %^).

Take even a few children during the critical period and teach them this
artificial language (at the same time as they learn their traditional
language).  Do they become truly bilingual?  If they are as fluently
 flexible; you can evolve slightly different versions of
the language very easily by simply changing some features.  Forbid a
given construct in the prescription, and do not teach it to a child.
Does the child develop that construct anyway by analogy to other
languages known, or does the child successfully adapt to whatever other
processes you've designed into the language instead of the construct?
It seems that all manner of linguistic universals could be investigated
in this way.

d) I've mentioned only child learning, because this is what many
linguists concentrate on, as revealing the essential nature of language.
But there are also the applied-linguistics problems of teaching foreign
languages.  It is much easier to test a method or theory of vocabulary
teaching/learning with an artificial language than with a natural
language; I don't think it particularly controversial to say that
ALs are more quickly (I didn't say easily!) learned than NLs.

The pragmatic problems of language learning are alone justification for
research using ALs.  But ALs may provide the solution as well as the
means of testing.  It seems to be well accepted that in learning a
second language and then learning a third, you learn the third MUCH more
quickly than the second.  The example I've heard is that it might take 4
years to learn French and then 2 to learn German thereafter; and vice
versa.  If this is true, then, if you can learn an AL comparably well in
1 year as French in 4, then you can learn the AL and German in 3 years
instead of 4, a gain of a year EVEN IF YOU NEVER AGAIN HAVE A USE FOR
THE AL.  But I don't claim this as a fact - it should be easily testable
in a controlled experiment, and this seems much more scientific than
arguments about which ALs and NLs are 'easier to learn'.
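
The arithmetic in that example can be laid out as a small Python sketch.
The figures (4 years for a natural language learned first, 2 for one
learned afterwards, 1 for the AL) are only the illustrative numbers from
the example above, not measured values, and the function and constant
names are made up for the illustration.

```python
def years_to_learn(first_language_years, later_language_years):
    """Total time when one language is learned first and another afterwards."""
    return first_language_years + later_language_years

# Hypothetical figures taken from the example in the text, not data.
NL_AS_FIRST = 4   # French or German learned as the first foreign language
NL_AS_LATER = 2   # the same language learned after some other language
AL_AS_FIRST = 1   # an AL learned to a comparable level

german_only = NL_AS_FIRST                                   # German learned directly
al_then_german = years_to_learn(AL_AS_FIRST, NL_AS_LATER)   # AL first, then German
gain = german_only - al_then_german                         # years saved by the detour

print(german_only, al_then_german, gain)
```

Whether the real numbers look anything like this is exactly what the
controlled experiment would have to establish.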

e) Lojban has one feature designed to explore a less-understood aspect
of language - the expression of emotion.  Lojban allows expressive
communication of emotions in words without suprasegmentals (in this it is
presumably unlike all natural languages, though not entirely, as many
languages have a limited set of indicators of attitude in the form of
interjections and some discursive function words, e.g. 'but').  Can
human beings manipulate the symbols of emotion in the same way they
manipulate the comparable symbols of non-emotional expression?  There is
a whole range of experimental questions raised by this design element,
probably the most 'unnatural' element of Lojban's design.

f) The latter points to the one other aspect of a well-designed
artificial language of scientific interest and value to linguistics - as
a tool of analysis.  An example is best.  Take the new Scientific
American Library book _The Science of Words_, by George A. Miller of
Princeton (just out, and I'm finding it quite interesting).  A picture caption
notes that Nootka (a Pacific Northwest language) has the single word
'inikwihl'minih'isit' meaning the equivalent of the entire English
sentence "Several small fires were burning in the house."  I won't
presume to know any more about Nootka than I've just told you, but in
Lojban, I can express that sentence parallelling the English:

so'i cmalu fagri puca      jelca   vine'i    le  prezda
Many small fires were-then burning at-within the person-nest.

and analytically as a single word (though not with the same structure as
Nootka):

prezdane'ikemcmafagyso'ikempruje'a
person-house-inside-type_of-small-fire-many_some-type_of-previous-burner

(Yes, I can say it!  :^)

Actually, according to Miller, the Nootka breaks down as:

inikw     -ihl         -'minih  -'is       -it
fire/burn in-the-house plural   diminutive past-tense

This order is also expressible in Lojban:

fagykemprezdanerso'icmapru
fire-type_of-person-nest-inside-many_some-small-past_thing/event

I don't know which of the two orders more accurately conveys how the
Nootka speaker thinks of the concept expressed by the word, or whether
others are better still.

The Lojban in either case more accurately tracks the semantics of the
Nootka, demonstrating the inadequacy of the English - the actual word as
broken out did not require two separate particles for fire and burn as
did the English equivalent, and the English translation used the more
complicated tense 'were-burning' instead of the simpler, and presumably
more accurate 'burnt'.  (I'll plainly admit that I'm relying on the given
explanations by Miller, which are in English, but it seems clear that in
translating the word-sentence into English there is a considerable
ambiguity introduced.)

I won't claim that Lojban can express EVERYTHING in the natural form of
any language.  Lojban has a less-marked syntactic word-order, and
expressing other orders requires marking particles that would not be
found in the source language.  Thus there is a tradeoff between semantic
representation and syntactic representation.

Still, I think a convincing case can be made that, as a predicate
language, Lojban is a much more effective tool for studying both the
forms and semantics of other languages than is English, which has its
own cultural, syntactic and semantic complexities to gum up the analysis
(especially if the analysis is being done by a non-native English
speaker - if there is any place where there is a justification for an
international, minimal-culture language, it is when linguists from
different native language backgrounds try to perform and communicate
their linguistic analyses).

g) There is also the 'other' tool aspect of an artificial language, in
computer and AI applications.  A predicate language like Lojban should
be especially amenable to AI processes - the programmers are familiar
with predicate language expression and manipulation, and often store the
data in predicate form internally for manipulation.  With Lojban, such
storage becomes a fairly trivial process.  If Lojban is proven by
experiment (per above) to have the systemic properties of a natural
language, and is easier to implement in computational linguistics
research problems, it serves as a tool to bridge those two disciplines,
leading to more rapid and effective NLP.  But only if it is tried.  Even
if it proves less than ideal, I have little doubt that study of natural
language using computational linguistic techniques and a Lojban-based
tool will be productive in ways not possible with any natural language.
(In effect, this argument is the same as the last one, except that
instead of two different-natural-language speakers trying to communicate
about language, you have a human and a computer, who obviously speak
different native languages, trying to communicate.)

h) This has been raised before, but not as clearly perhaps:  A highly
prescribed language is an ideal test bed for examining the processes of
language evolution.  In the case of an AL like Lojban, as the speaking
community in each culture grows, you can observe how the language
creolizes in contact with those other languages.  Because of the speed
of learning, artificial languages should tend to show effects more
quickly (by being mastered to a communicative level more quickly), and
anecdotal evidence about Esperanto tends to support this idea.

Does this mean that the conclusions are absolutely valid for natural
language evolutionary processes?  I don't claim so.  But again, we are
performing experiments with a model, somewhat idealized, of a natural
language.  Unlike a paper-theoretic model (as all linguistic theories
must inherently be), this is a model that can be experimented with using
live speakers.  Provided that we understand the model as it evolves,
that understanding comes closer and closer to an understanding of
natural language as time goes on.

i) The large majority of languages have some degree, more or less, of
prescription.  In addition, some 'natural' languages, like modern
Hebrew, formal Swahili, and some standardized dialects (e.g.  Mandarin,
which has been noted as being related but not identical to the Beijing
dialect), are not all that far from being true artificial languages, but
are much more interesting to linguists.  A predominantly prescribed
language would seem an especially effective tool for studying the
effects of prescription on language development and use (again,
linguistic and not sociological effects).


None of these scientific applications of Lojban inherently requires a
large fluent body of speakers, or any solely-native speaker of that
tongue.  If any of the less scientific applications of Lojban serve to
justify it developing such a speaker base, the nature of its usefulness
as a model will change.  New applications, as yet not really predictable,
will turn up, aided by our no doubt increased understanding of language.
But the model, even if well understood, no longer is as simple, and new
Loglans and other experimental linguistic tools, all artificial
languages, will be developed to take the next step.

I have hopefully given a bit of food for thought, yet with only a few
hours' preparation.  I have also thought about this only as somewhat of
an outsider to the profession of linguistics.  With a different
point-of-view others should be able to find many more questions of
scientific interest using an AL like Lojban either as a model, an
experimental test bed, or a tool.  And if even a small fraction of these
ideas are useful, then ALs have a valid scientific role in linguistics.

  --  lojbab = Bob LeChevalier, President, The Logical Language Group, Inc.
               2904 Beau Lane, Fairfax VA 22031-1303 USA 703-385-0273
      lojbab@snark.thyrsus.com