This website has been archived and is no longer updated.

The content featured is no longer current and is being made available to the general public for research and historical information purposes only.




 
The project
The Power project

The Power project was a joint project run by the Centre for Language Technology (CLT) at Macquarie University and CSIRO’s Intelligent Interactive Technology Group.

The goal of the project was to:

design software that would let people call up information from databases to fit their needs, taking account of how much they already know about the topic, what they have learned so far, and even the language they speak. The information the system provided would not have been previously composed by an author but generated automatically by the system.

The system takes information stored in a database and uses natural language generation techniques to automatically generate online hypertext documents for the user to read. This is dynamic document delivery (DDD).

The Dynamic Document Delivery model

As part of this project, the research team worked with the Powerhouse Museum, Sydney to develop the Power system. The intent was to use information from the Museum’s database to create a new system to dynamically generate web pages describing and comparing objects in the Museum’s collection.

Image: The Power project

Natural language
Once the system has selected the information it will present, and organised that information into the structure of a coherent text, it then needs to be able to turn that information into natural language. This means it needs to turn the structured information into a web page that you can just read like any other page on the Internet. Hence, the system needs to be able to generate ordinary language. To do this it needs a lexicon (or dictionary) of all the words and parts of words it needs, along with a grammar that it can use to put these building blocks together to make sentences. The system needs to be able to do something similar to what we do each time we construct a sentence when we are speaking.

But it’s not enough to just generate web pages in natural language. Normally, if you move from one page to another, information is presented each time as if you are reading it for the first time. One of the goals of the Power project was to make the system remember what you have already looked at and present new information in a way that follows on from that. For example, you might look at a description of a pair of shoes which partly says “these shoes are made of patent leather and have laces”. Then, if you go on to a comparison of the shoes and a pair of boots, the system needs to remember that you’ve just read about the shoes. Instead of saying “this pair of shoes and this pair of boots are both made of patent leather and both have laces”, it might say “like the shoes, the boots are made of patent leather and have laces”. This way the pages make more sense and are easier to read as you move from one page to the next. This is called discourse coherence. Taking this idea of context-sensitive generation even further, the Power system was designed so that you can enter information about yourself, and it will remember who you are and tailor the descriptions it gives you to what you already know.
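The shoes and boots example can be sketched in a few lines of Python. This is only an illustration of the idea, not the Power system's own code: the `DiscourseHistory` class, the `compare` function and the exact wording of the sentences are assumptions made for this example.

```python
class DiscourseHistory:
    """Remembers which objects the reader has already seen (hypothetical helper)."""

    def __init__(self):
        self.seen = set()

    def mark_seen(self, name):
        self.seen.add(name)

    def has_seen(self, name):
        return name in self.seen


def compare(known, new, shared, history):
    """Say what two objects have in common, phrased to fit the reading history.

    known, new: plural nouns such as "shoes" and "boots"
    shared: a phrase describing the properties the two objects share
    """
    if history.has_seen(known):
        # The reader has already met the first object, so refer back to it.
        sentence = f"Like the {known}, the {new} {shared}."
    else:
        # Nothing to refer back to, so introduce both objects together.
        sentence = f"Both the {known} and the {new} {shared}."
    history.mark_seen(known)
    history.mark_seen(new)
    return sentence


history = DiscourseHistory()
history.mark_seen("shoes")  # the reader has just read the shoes page
print(compare("shoes", "boots",
              "are made of patent leather and have laces", history))
```

The same facts produce a different sentence depending on what is stored in the history, which is the essence of context-sensitive generation.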


Natural language generation

Language generation applications
Language generation can be used for:

  • Multilinguality: multiple output languages from one source representation.
  • Customised information delivery: gives the user information tuned to what the system knows about them.
  • Dialogue: the system’s output takes account of the previous conversational context.
  • Self-generating documentation: removes the need to keep data and documents in step.

Different languages
The language that the pages are generated in is also an issue. One of the main goals of the Power project was to design a system that would generate texts tailored to suit each user’s needs. Part of this involved designing a system that could dynamically generate texts in different languages, so if you spoke French instead of English, for example, the system could generate a text in French for you to read. The natural language generation part of the system therefore had to work for languages other than English.
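One way to picture generating several languages from a single source is the Python sketch below. It is a deliberately simplified assumption: the fact, the tiny lexicons and the fixed sentence templates are all invented for this example, and a real generator would use a full lexicon and grammar for each language rather than templates.

```python
# A language-independent fact: concept identifiers, not words (hypothetical data).
FACT = {"object": "BOOTS", "material": "LEATHER"}

# One small lexicon per output language (assumed entries for this example).
LEXICON = {
    "en": {"BOOTS": "boots", "LEATHER": "leather"},
    "fr": {"BOOTS": "bottes", "LEATHER": "cuir"},
}

# One sentence pattern per language. These only work for plural objects;
# a real grammar would handle number, gender and agreement.
TEMPLATES = {
    "en": "The {object} are made of {material}.",
    "fr": "Les {object} sont en {material}.",
}


def realise_fact(fact, language):
    """Turn one fact into a sentence in the requested language."""
    words = {role: LEXICON[language][concept] for role, concept in fact.items()}
    return TEMPLATES[language].format(**words)


print(realise_fact(FACT, "en"))
print(realise_fact(FACT, "fr"))
```

Because the fact itself contains no English or French words, adding another language means adding a lexicon and a grammar for it, not rewriting the data.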

A page from the Power project in French

The system the Power project set out to design had to be able to select data from a database, structure the text that it was going to generate, then turn that into natural language. And it needed to be able to do this in a very flexible and interactive way.

To make all this work, the team needed to design a system architecture that combined the components of the system and integrated the user’s input.

Activity
Often the output from a database is in tabular form. How would you deliver the following information as natural language?

Functions and sources of minerals

Mineral     Function                           Food source
Chromium    Role in carbohydrate metabolism    Meat, cheese, legumes
Selenium    Component of antioxidant enzyme    Seafood, meat, cereals

System architecture
The key to making any system work is the system architecture. This is basically the structure that all the components of the system fit into, and the connections between the components that let them work together to do the job the system is designed to do.

The Power system had components that:

  • selected information from a database
  • structured this information into a text plan
  • turned the text plan into natural language.

The Power project developed new ways of making these components function. The traditional natural language generation architecture, however, is one way of making these components work together.

The standard natural language generation architecture picks information from a database (the knowledge base) according to the reason the text is being generated (the discourse goals). It structures this information into a text plan, then generates a surface realisation: in other words, a document in natural language.
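The three stages of the standard architecture can be sketched as a small Python pipeline. This is an illustrative assumption rather than the architecture's actual implementation: the function names are hypothetical, the knowledge base reuses the minerals table from the activity, and the verb phrases ("plays a role in", "is a component of an") have been added so the sentences read naturally.

```python
# Knowledge base: the minerals table, with verb phrases added for this example.
KNOWLEDGE_BASE = [
    {"mineral": "Chromium",
     "function": "plays a role in carbohydrate metabolism",
     "sources": ["meat", "cheese", "legumes"]},
    {"mineral": "Selenium",
     "function": "is a component of an antioxidant enzyme",
     "sources": ["seafood", "meat", "cereals"]},
]


def select_content(knowledge_base, discourse_goal):
    """Stage 1: pick the records that serve the discourse goal."""
    return [r for r in knowledge_base if r["mineral"] == discourse_goal["describe"]]


def plan_text(records):
    """Stage 2: order the selected facts into a text plan."""
    plan = []
    for record in records:
        plan.append(("function", record["mineral"], record["function"]))
        plan.append(("sources", record["mineral"], record["sources"]))
    return plan


def join_list(items):
    """Write a list as 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def realise_plan(plan):
    """Stage 3: surface realisation, turning each planned message into a sentence."""
    sentences = []
    for kind, mineral, value in plan:
        if kind == "function":
            sentences.append(f"{mineral} {value}.")
        else:
            # The mineral was named in the previous sentence, so use a pronoun.
            sentences.append(f"It is found in {join_list(value)}.")
    return " ".join(sentences)


def generate(knowledge_base, discourse_goal):
    """Run the whole one-way pipeline from database to text."""
    return realise_plan(plan_text(select_content(knowledge_base, discourse_goal)))


print(generate(KNOWLEDGE_BASE, {"describe": "Chromium"}))
```

Note that the pipeline runs in one direction only: nothing the reader does after receiving the text feeds back into the next request.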

The problem with this architecture was that it was not interactive enough for what the Power project wanted to achieve. It could not take into account the other web pages the user had just looked at and all the other issues that would make the system flexible enough to be able to tailor each page exactly to the user’s needs.

To make the system that interactive the Power team developed a system architecture called Dynamic Hypertext.

On the left you can see a traditional natural language generation architecture and on the right the dynamic hypertext architecture.

Natural language architecture Dynamic hypertext architecture

Dynamic hypertext architecture
Most of this architecture looks like the traditional natural language generation architecture. The big difference is that the end result of the language generation part of the system is not just a document in natural language. Instead it is an HTML document that can be viewed on the World Wide Web. The web page the user looks at to view the document also contains an HTML command capability that lets the user send instructions back to the beginning of the document generation process. It also lets the system remember what pages the user has looked at, and gather other information on the user so it can create documents tailored to the user’s needs.

The traditional architecture is basically a one-way process. You can ask the system for a document, and it can generate it. The dynamic hypertext architecture, on the other hand, is an interactive loop that constantly generates documents in response to the user’s use of the system. It’s a two-way process, so it’s a much more flexible and effective structure for presenting the information the user needs.
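The interactive loop can be sketched in Python as follows. Everything here is an assumption made for illustration: the `Session` class, the `?object=` link format and the object descriptions are hypothetical, and no real web server is involved. The point is only that each generated page carries links that send the next request back to the generator, and that the generator remembers what has been viewed.

```python
from html import escape

# Hypothetical object descriptions standing in for the museum database.
OBJECTS = {
    "shoes": "These shoes are made of patent leather and have laces.",
    "boots": "These boots are made of patent leather and have laces.",
}


class Session:
    """Keeps track of one user's visit (hypothetical helper)."""

    def __init__(self):
        self.visited = []

    def request(self, object_id):
        """Generate an HTML page for one object, in light of earlier requests."""
        text = OBJECTS[object_id]
        if self.visited:
            # Use the history to make the new page follow on from earlier ones.
            text = f"You have already looked at: {', '.join(self.visited)}. " + text
        # Each link is an instruction the user can send back to the generator.
        links = " ".join(
            f'<a href="?object={escape(other)}">{escape(other)}</a>'
            for other in OBJECTS
            if other != object_id
        )
        if object_id not in self.visited:
            self.visited.append(object_id)
        return f"<html><body><p>{escape(text)}</p><p>{links}</p></body></html>"


session = Session()
print(session.request("shoes"))  # first page: no history yet
print(session.request("boots"))  # second page: mentions the shoes
```

Following a link simply calls `request` again with the chosen object, so every page is generated fresh in response to what the user has done so far.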

A simpler version of the previous diagram can be seen below.


DDD architecture


The Centre for Language Technology
The Centre for Language Technology started life in 1994 as the Natural Language Processing Unit in the Microsoft Research Institute for Advanced Software Technology at Microsoft Australia. Later the unit moved to the Department of Computing at Macquarie University and became the Language Technology Group (LTG). CSIRO became a sponsor of the LTG (and later the Centre for Language Technology), and Cécile Paris and her team collaborate actively with the researchers in the LTG at Macquarie.

In 2001 LTG was expanded and became the Centre for Language Technology (CLT) as part of Macquarie University’s Division of Information and Communication Sciences.

CLT’s main job is to develop new technology to allow computers to use human language. This research aims to get computers to use the same kind of language that people use and to make it easier for people to interact with computers directly.

Activity
Develop a screen design for the Power project.
