Monday, August 31, 2015

A Citizens' Dialogue event was organized at the University of Helsinki on Monday, 31 August. The topic was Open Science, Open Data and Open Innovation. The panel members were:

  • Commissioner for Research, Science and Innovation Carlos Moedas, European Commission
  • Chancellor, Professor Thomas Wilhelmsson, University of Helsinki
  • Co-founder and Senior Partner Valto Loikkanen, Grow VC Group
  • Director Kristiina Hormia-Poutanen, the National Library of Finland, and President of the Association of European Research Libraries (LIBER)

President, Professor Anneli Pauli of Lappeenranta University of Technology served as the moderator.

Before her current position at Lappeenranta University of Technology, Anneli Pauli worked as the Deputy Director-General of the Directorate-General for Research and Innovation of the European Commission and as the Deputy Director-General of the Joint Research Centre of the European Commission. Pauli first congratulated the University of Helsinki on its 375th anniversary. She continued by introducing the two main themes: Open Science and Open Data on the one hand, and new forms of funding on the other. The audience could participate by asking questions directly or by submitting questions through a message wall or social media (Twitter and Instagram).

Commissioner Moedas spoke about the importance of European collaboration at an early stage of his career. He was born in Beja, a town in southern Portugal. Moedas mentioned that the Erasmus programme changed his life forever by enabling him to study in Paris. These are the kinds of things that make Europe and that are easily forgotten. The main projects of the EU are peace and prosperity. Europe has been a conversion machine. First, the wellbeing of citizens has improved to a large degree. Second, the EU has 7-8% of the world's population, but 30% of knowledge is created here. Third, we are the only social platform in the world where people are taken care of. This is a unique feature of Europe.

Pauli introduced the first discussion theme: is openness a key to scientific excellence and to innovation?

The Commissioner reflected that the borders between disciplines are important for innovation in the digital age. He presented the example of Ada Lovelace, daughter of Lord Byron, who worked with Charles Babbage. Her mother told her to study mathematics and science. Thanks to her understanding and experience of art, Lovelace, who is often called the first programmer, envisioned programming music, foreseeing the concept of a digital world and connecting different areas. In summary, openness and the crossing of areas are preconditions for innovation. Digital humanities can be seen as an example of a crossing of areas that promotes the emergence of new knowledge and innovations.

Wilhelmsson underlined that the EU has an important role in the development of open science. Mandatory exceptions are needed for data and text mining. Moreover, the sustainability of research needs to be promoted. In Finland, the Ministry of Education's Open Science and Research Initiative has been important. Finland aims to be a leading country in this area by 2017, which is a very ambitious goal. Funding, changes in legislation, education, and research are needed.

Hormia-Poutanen stressed the need for raising awareness, training, clarifying concepts like open science, and stating clearly what kinds of benefits we can gain from data and text mining. Regarding open access, different stakeholders need to collaborate.

As an expert in crowdfunding, Loikkanen mentioned that he was eager to have a discussion with the audience. He emphasized innovation as a trial-and-error process.

The audience presented a number of questions and comments, including the following:

  • Professor Tuukka Petäjä (University of Helsinki) described a large body of research on the atmosphere at the University of Helsinki. Petäjä stressed the global potential of such work and called for ways to support the wide application of the methods and data created in the research efforts.
  • The president of the University of Helsinki, Professor Jukka Kola, asked how the openness of science in the EU compares with that of the USA and other parts of the world.
  • A member of the online audience posed a question related to the cost of openness.

Carlos Moedas described the traditional form of the publishing business. He stressed the negative effect of paywalls on innovation. The way of doing business has to change. Regarding the comparison with the USA, the Commissioner stated that unfortunately the situation in the EU is not as good as in the USA. An important example is the opening of genome data, which created a lot of new research and business. As a side remark, he said that he sees Open Science as synonymous with Science 2.0. Moedas said that the education system can be open, business-based, or a mixture of those, as the education system is up to each member state. There will be a new layer in education with increased use of digital tools. Experience in the USA has shown that online courses need to be combined with personal contact. The importance of personal contact remains. Horizon 2020: going from fundamental research to combining …

Valto Loikkanen stressed the importance of open data in the case of publicly funded research. He also raised the issue of making closed methods and data open in various ways to boost innovation processes. In computer science, open source software has a long tradition. In Finland, Linux, which originated at the University of Helsinki, can even be viewed as a source of national pride. In a private discussion after the event, Loikkanen stressed the global importance of crowdfunding with regard to its volume and its status as a means for reaching rational decisions.

Kristiina Hormia-Poutanen stressed that we need copyright legislation that allows data and text mining.

The first part of the event closed with a poll. To the question "Does opening research data foster scientific excellence and innovation?", more than 90 percent of the audience replied yes.

The second topic, Open Innovation, or more specifically, private investment in research and innovation and the European Fund for Strategic Investments (EFSI), seemed to be less familiar to the mostly academic audience than the first one. The Commissioner summarized the situation by saying that governments in Europe do well regarding research funding but Europe is lacking on the private side. Companies cannot be forced to invest, but conditions can be made more favourable. For instance, Europe cannot have 28 different markets. Moedas referred to Juncker and Katainen when discussing the investment plan to boost the European economy. The plan includes three key areas:

  • mobilising investments of at least 315 billion euros in three years,
  • supporting investment in the real economy, and
  • creating an investment-friendly environment.

Loikkanen mentioned that a wider understanding of risk investment processes is needed. Digital investing is becoming increasingly popular through crowd (sourcing) methods, which offer scalable value creation. Loikkanen stressed that investors want to invest in teams that have the necessary skills. A list of such skills should be given to all who wish to build a start-up company and are considering how to build the core team. Loikkanen mentioned that the cost of innovation is actually low, and that investments in the early stages are usually small. However, when the business is to grow, substantial investments are needed. Loikkanen used Google as an example: at Stanford University, investors are within walking distance of the inventors and developers.

Carlos Moedas responded to the concern about the role of basic research. He said that the Commission supports very good projects regardless of their nature, including basic research (cf. CERN). For innovators, a European investment council is needed. The Commissioner concluded nicely that "you invest to people, people make the difference!"

Friday, August 28, 2015

Oskar Kohonen: Weakly Supervised Learning of Morphology

Oskar Kohonen defended his dissertation "Advances in Weakly Supervised Learning of Morphology" at the Aalto University School of Science. Professor Lars Borin (Språkbanken, Institutionen för svenska språket, University of Gothenburg, Sweden) served as the opponent and Professor Emeritus Erkki Oja, Aalto University, as the custos. Among other things, Professor Borin is the director of SWE-CLARIN, the sister organization of FIN-CLARIN, which is directed by Dr. Krister Lindén. The thesis advisor was Dr. Krista Lagus, who originated the Morfessor method together with Dr. Mathias Creutz. Morfessor was introduced as an unsupervised, data-driven method for discovering the morphological segmentation of words. Dr. Creutz defended his thesis on this topic in 2006. Dr. Sami Virpioja extended the scope and the method. He defended his thesis "Learning Constructions of Natural Language: Statistical Models and Evaluations" in 2012.

In morphological segmentation, unsupervised methods do not typically model allomorphy, that is, non-concatenative structure. In English, one can give pretty/prettier as an example of allomorphy. In Finnish this phenomenon is very common. Moreover, the accuracy of unsupervised methods remains far behind that of rule-based methods. In this thesis, Oskar Kohonen studies the use of weakly supervised methods in order to alleviate these problems. With his colleagues, Kohonen has proposed a novel extension to the Morfessor Baseline method that models allomorphy through string transformations. Moreover, Kohonen and his colleagues have examined the effect of weak supervision on accuracy by training on a small annotated data set in addition to a large unannotated data set. Two novel semi-supervised morphological segmentation methods have been developed. First, a semi-supervised extension of Morfessor Baseline has been introduced, and, second, a means for morphological segmentation with conditional random fields (CRF) has been developed. The methods have been evaluated on several languages including English, Estonian, Finnish, German and Turkish.
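To make the idea of concatenative segmentation a bit more concrete, here is a minimal toy sketch in Python. It is not Morfessor and not the method of the thesis: it only assumes that a lexicon of morphs with probabilities is already available and finds the most probable split of a word with dynamic programming. The lexicon MORPH_PROBS, its probabilities and the function name segment are all made up for illustration.

```python
import math

# Hypothetical morph lexicon with made-up probabilities (illustration only).
MORPH_PROBS = {
    "talo": 0.2, "i": 0.1, "ssa": 0.2, "ni": 0.1,
    "talois": 0.01, "sa": 0.05,
    "pretty": 0.1, "pretti": 0.04, "er": 0.2,
}


def segment(word, morph_probs):
    """Return the most probable split of `word` into known morphs, or None.

    Each morph costs -log(probability); the best split minimises the
    summed cost (a Viterbi-style search over all split points).
    """
    n = len(word)
    best_cost = [math.inf] * (n + 1)  # best_cost[k]: cheapest split of word[:k]
    back = [None] * (n + 1)           # back[k]: start index of the last morph
    best_cost[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in morph_probs and best_cost[start] < math.inf:
                cost = best_cost[start] - math.log(morph_probs[piece])
                if cost < best_cost[end]:
                    best_cost[end] = cost
                    back[end] = start
    if n and back[n] is None:
        return None  # the word cannot be built from the known morphs
    morphs = []
    end = n
    while end > 0:
        start = back[end]
        morphs.append(word[start:end])
        end = start
    return morphs[::-1]


print(segment("taloissani", MORPH_PROBS))
print(segment("prettier", MORPH_PROBS))
```

With this toy lexicon, "prettier" can only be split if the variant stem "pretti" is stored separately from "pretty". That is the kind of redundancy a purely concatenative model runs into with allomorphy, and which the string transformations mentioned above are meant to address.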

The opponent, Professor Borin, represented the computational linguistics aspect of the work. He paid attention to a number of linguistic, methodological, terminological and conceptual issues. Borin first placed Kohonen's work in a broader context, mentioning, for instance, the fact that there are about 7,000 languages in the world. The discussion was conducted in a sophisticated manner. Both the defendant and the opponent speak Swedish, Finnish and English, and therefore the examples could be drawn from any of these languages. As a conceptual topic, Borin asked Kohonen to clarify what was meant by morphological analysis.

At the evening party, the opponent and I realized that we had met at a conference quite some time ago. First we thought it had been in Gothenburg, but afterwards I realized it must have been a conference in Copenhagen in 1987. At that time, we had a shared interest in morphology. As a young researcher I got the idea that one could try to learn morphological analysis rules through inductive inference. Almost thirty years ago the attempt did not give convincing results. Therefore, from a personal point of view, it has been a pleasure to see how substantial the developments have been.