Research funding issues
When the Centre for Language Technology started out as the Natural Language Processing Unit back in 1994, it was funded by Microsoft Australia through the Microsoft Institute for Advanced Software Technology. Sometimes it may be inappropriate for research to be funded by just one organisation, as the results may be biased in some way or may unfairly benefit that organisation. At other times the organisation keeps its distance and is simply the source of funding.
Activity
Read Robert Dale's responses to these questions and write a summary of the key points.
What are the ethics of research like this being tied to corporations like Microsoft? What happened with this research? Did Microsoft Australia have a stake in the outcome?
The funding of the research wasn't an issue for us. We weren't impacted by that in any significant way, says Robert Dale. The Microsoft Research Institute was set up out of something fairly close to altruism by Microsoft here in Australia, and so for a number of reasons there were no strings attached really, other than annual reporting and so on. In particular, because Microsoft Australia doesn't have a research arm, there was no local reporting channel that went through a research hierarchy. We basically reported what we did to the management in Microsoft locally, and because they didn't have a research agenda, as long as we did good stuff they didn't really mind what we did. We had quite free rein over the projects we carried out; it wasn't in any way tied back to Microsoft's particular interests.
Corporate funding of research isn't always that hands-off. If this research had been funded by Microsoft in the US, things would've been different, with both advantages and disadvantages.
If we'd reported into the US then we'd be hooked in to the Microsoft research agenda in the US, which is where the main corporate research gets done, and that would probably have had a much more significant influence on what we did. I actually think it might've been better if we had reported through the US channels, since there are also more dollars available via that route for funding neat ideas.
Ethical implications of designing software
Activity
Both ILEX and the Power project are designed to do things that museum curators sometimes do, like writing labels for museum objects, or writing descriptions of them.
- What are the ethical implications of designing software to do these tasks?
- Will it be a useful tool?
- Or will it intrude on the curator's territory?
Effectiveness versus maintenance
The fact that there aren't many people around who know LISP was another big issue. It showed that you can't always use the most effective programming language for the task at hand. You have to design software so that developers can actually maintain it. If developers in the future are going to be able to maintain and extend the code, it has to be in a programming language that a lot of people know.
Robert Dale tells a story about another project he's working on that makes this point.
Another project I've been working on is to do with the processing of legal text for a legal publisher. The idea is to find citations within a legal document to other legal documents, and then to automatically hyperlink these things all together so that you have one big network of legal information. That's a very core Natural Language Processing task. Because you're trying to find references to the legislation in free text, it's kind of hard to do. In fact we often use the language Prolog for things here, and I know I could sit down and in a weekend I could solve most of the problems in that project using Prolog. However, the client is not interested in these kinds of programming languages, and quite understandably so, because they want a piece of code that they can maintain; so it has to be in Java.
As a consequence it takes 10 to 15 times as much effort to write the code, but maintainability is important. It's got to be the case that when we go off this project and other people come in, those other people can maintain the code and extend it and so on, and if it's in Prolog then there's no chance. So those sorts of factors have a bearing. It's not just the speed of the language, it's a social consequence of the way things have developed and what's popular and what's not.
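To give a feel for the citation-linking task described above, here is a minimal Java sketch. It is not the project's code: the class name CitationLinker, the citation pattern (a capitalised title ending in "Act" followed by a four-digit year) and the example.org address are all assumptions made up for illustration. Real legal citations vary far more than this, which is part of why the task is hard.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CitationLinker {
    // Assumed citation shape: one or more capitalised words, then "Act",
    // then a four-digit year, e.g. "Privacy Act 1988".
    private static final Pattern ACT_CITATION =
        Pattern.compile("\\b((?:[A-Z][a-z]+ )+Act) (\\d{4})\\b");

    // Hypothetical address scheme for the linked documents.
    private static final String BASE_URL = "https://example.org/legislation/";

    // Wraps every citation found in the text in an HTML hyperlink.
    public static String linkCitations(String text) {
        Matcher m = ACT_CITATION.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one unchanged.
            out.append(text, last, m.start());
            // Build a link target from the title and year, e.g. "privacy-act-1988".
            String slug = m.group(1).toLowerCase().replace(' ', '-') + "-" + m.group(2);
            out.append("<a href=\"").append(BASE_URL).append(slug).append("\">")
               .append(m.group()).append("</a>");
            last = m.end();
        }
        // Copy whatever follows the final match.
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(linkCitations("Section 6 of the Privacy Act 1988 applies."));
    }
}
```

Even this small pattern shows where the difficulty lies. A sentence that begins "The Privacy Act 1988" would have the word "The" swept into the link, and abbreviated or indirect references such as "the Act" would be missed altogether.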