Monday, August 31, 2015

A Citizens' Dialogue event was organized at the University of Helsinki on Monday, 31 August. The topics were Open Science, Open Data and Open Innovation. The panel members were:

  • Commissioner for Research, Science and Innovation Carlos Moedas, European Commission
  • Chancellor, Professor Thomas Wilhelmsson, University of Helsinki
  • Co-founder and Senior Partner Valto Loikkanen, Grow VC Group
  • Director Kristiina Hormia-Poutanen, the National Library of Finland and President of the Association of European Research Libraries LIBER
President, Professor Anneli Pauli of Lappeenranta University of Technology served as the moderator.

Before her current position at Lappeenranta University of Technology, Anneli Pauli worked as the Deputy Director-General of the Directorate-General for Research and Innovation of the European Commission and as the Deputy Director-General of the Joint Research Centre of the European Commission. Pauli first congratulated the University of Helsinki on its 375th anniversary. She continued by introducing the two main themes: Open Science and Open Data on the one hand, and new forms of funding on the other. The audience could participate by asking questions directly or by submitting questions through a message wall or social media (Twitter and Instagram).

Commissioner Moedas spoke about the importance of European collaboration at an early stage of his career. He was born in Beja, a town in southern Portugal. Moedas mentioned that Erasmus changed his life forever by enabling him to study in Paris. These are the kind of things that make Europe and that are easily forgotten. The main projects of the EU are peace and prosperity. Europe has been a conversion machine. First, the wellbeing of citizens has developed to a large degree. Second, the EU has 7-8% of the world's population, but 30% of knowledge is created here. Third, we are the only social platform in the world where people are taken care of. This is a unique feature of Europe.

Pauli introduced the first discussion theme: is openness a key to scientific excellence and to innovation?

The Commissioner reflected that the borders between different disciplines are important for innovation in the digital age. He presented the example of Ada Lovelace, daughter of Lord Byron, and Charles Babbage. Her mother told Lovelace to study mathematics and science. Thanks to her understanding and experience of art, Lovelace, who is often called the first programmer, envisioned programming music, seeing the concept of a digital world that connects different areas. In summary, openness and the crossing of areas are preconditions for innovation. Digital humanities can be seen as an example of a crossing of areas that promotes the emergence of new knowledge and innovations.

Wilhelmsson underlined that the EU has an important role in the development of open science. Mandatory exceptions are needed for data and text mining. Moreover, the sustainability of research needs to be promoted. In Finland, the Ministry of Education's Open Science and Research Initiative has been important. Finland aims to be a leading country in this area in 2017, which is a very ambitious goal. Funding, changes in legislation, education, and research are needed.

Hormia-Poutanen stressed the need for raising awareness, training, clarifying concepts like open science, and stating clearly what kind of benefits we can gain from data and text mining. Regarding open access, different stakeholders need to collaborate.

As an expert in crowd funding, Loikkanen mentioned that he was eager to have a discussion with the audience. He emphasized innovation as a trial-and-error process.

The audience presented a number of questions and comments, including the following.

  • Professor Tuukka Petäjä (University of Helsinki) described a large body of research on the atmosphere at the University of Helsinki. Petäjä stressed the global potential of such work and called for means to support the wide application of the methods and data created in the research efforts.
  • The president of the University of Helsinki, Professor Jukka Kola, asked how the openness of science in the EU compares with that in the USA and other parts of the world.
  • A member of the net audience posed a question related to the cost of openness.
Carlos Moedas described the traditional form of the publishing business. He stressed the negative effect of paywalls on innovation. The way of doing business has to change. Regarding the comparison with the USA, the commissioner stated that unfortunately the situation in the EU is not as good as in the USA. An important example is the opening of genome data. This created a lot of new research and business. As a side remark, he said that he sees Open Science as synonymous with Science 2.0. Moedas said that the education system can be open, business-based, or a mixture of those, as the education system is up to each member state. There will be a new layer in education with increased use of digital tools. The experience in the USA has shown that online courses need to be associated with personal contact. The importance of personal contact remains. Horizon 2020: going from fundamental research to combining …

Valto Loikkanen stressed the importance of open data in the case of publicly funded research. He also raised the issue of making closed methods and data open in various ways to boost innovation processes. In computer science, open source software has a long tradition. In Finland, Linux, which originated at the University of Helsinki, can even be viewed as a source of national pride. In a private discussion after the event, Loikkanen stressed the global importance of crowd funding regarding its volume and status as a means for reaching rational decisions.

Kristiina Hormia-Poutanen stressed that we need copyright legislation that allows data and text mining.

The first part of the event closed with a poll. Asked whether opening research data fosters scientific excellence and innovation, more than 90 percent of the audience replied yes.

The second topic, Open Innovation, or more specifically, private investment in research and innovation and the European Fund for Strategic Investments (EFSI), seemed to be less familiar to the mostly academic audience than the first one. The commissioner summarized the situation by saying that governments in Europe do well regarding research funding but Europe is lacking on the private side. Companies cannot be forced to invest, but conditions can be made more favourable. For instance, Europe cannot have 28 different markets. Moedas referred to Juncker and Katainen when discussing the investment plan to boost the European economy. The plan includes three key areas:

  • mobilising investments of at least 315 billion euros in three years,
  • supporting investment in the real economy, and
  • creating an investment friendly environment.

Loikkanen mentioned that a wider understanding of risk investment processes is needed. Digitalized investing is becoming increasingly popular through crowd (sourcing) methods. They offer scalable value creation. Loikkanen stressed that investors want to invest in teams that have the necessary skills. A list of such skills should be given to all who wish to build a start-up company and are considering how to build the core team. Loikkanen mentioned that the cost of innovation is actually low, and the investments in early stages are usually low. However, when the business is to grow, substantial investments are needed. Loikkanen used Google as an example. At Stanford University, investors are within walking distance of the inventors and developers.

Carlos Moedas responded to the concerns about the role of basic research. He said that the Commission supports very good projects regardless of their nature, even basic research (cf. CERN). For innovators, a European investment council is needed. The commissioner concluded nicely that "you invest to people, people make the difference!"

Friday, August 28, 2015

Oskar Kohonen: Weakly Supervised Learning of Morphology

Oskar Kohonen defended his dissertation "Advances in Weakly Supervised Learning of Morphology" at the Aalto University School of Science. Professor Lars Borin (Språkbanken, Institutionen för svenska språket, University of Gothenburg, Sweden) served as the opponent and Professor Emeritus Erkki Oja, Aalto University, as the custos. Among other things, Professor Borin is the director of SWE-CLARIN, the sister organization of FIN-CLARIN, which is directed by Dr. Krister Lindén. The thesis advisor was Dr. Krista Lagus, who originated the Morfessor method with Dr. Mathias Creutz. Morfessor was introduced as an unsupervised method for discovering the morphological segmentation of words in a data-driven manner. Dr. Creutz defended his thesis on this topic in 2006. Dr. Sami Virpioja extended the scope and the method. He defended his thesis "Learning Constructions of Natural Language: Statistical Models and Evaluations" in 2012.

In morphological segmentation, unsupervised methods do not typically model allomorphy, that is, non-concatenative structure. In English, one can give pretty/prettier as an example of allomorphy. In Finnish this phenomenon is very common. Moreover, the accuracy of unsupervised methods remains far behind that of rule-based methods. In this thesis, Oskar Kohonen studies the use of weakly supervised methods in order to alleviate these problems. With his colleagues, Kohonen has proposed a novel extension to the Morfessor Baseline method that models allomorphy through string transformations. Kohonen and his colleagues have also examined the effect of weak supervision on accuracy by training on a small annotated data set in addition to a large unannotated data set. Two novel semi-supervised morphological segmentation methods have been developed. First, a semi-supervised extension of Morfessor Baseline has been introduced, and, second, a means for morphological segmentation with conditional random fields (CRF) has been developed. The methods have been evaluated on several languages including English, Estonian, Finnish, German and Turkish.
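As a rough illustration of the segmentation task described above, the following Python sketch splits a word into morphs by dynamic programming over a fixed morph lexicon. It is not the Morfessor implementation or its API, and it does not reproduce the methods of the thesis: the lexicon `TOY_MORPHS`, its probabilities and the function `segment` are hypothetical, and real methods learn the lexicon from data instead of taking it as given.

```python
import math

# Hypothetical toy lexicon: morph -> probability.
# The numbers are made up for illustration and are not learned from data.
TOY_MORPHS = {
    "pretty": 0.05,
    "pretti": 0.01,
    "er": 0.20,
    "talo": 0.05,
    "ssa": 0.10,
    "i": 0.02,
}


def segment(word, morph_probs):
    """Return the lowest-cost split of `word` into known morphs, or None.

    The cost of a split is the sum of negative log probabilities of its morphs.
    """
    n = len(word)
    # best[i] holds (cost, start index of the last morph) for the prefix word[:i].
    best = [(math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in morph_probs and best[start][0] < math.inf:
                cost = best[start][0] - math.log(morph_probs[piece])
                if cost < best[end][0]:
                    best[end] = (cost, start)
    if best[n][0] == math.inf:
        return None  # the word cannot be built from the lexicon
    # Follow the back-pointers from the end of the word to recover the morphs.
    morphs = []
    pos = n
    while pos > 0:
        start = best[pos][1]
        morphs.append(word[start:pos])
        pos = start
    return morphs[::-1]


print(segment("prettier", TOY_MORPHS))
print(segment("talossa", TOY_MORPHS))
```

With this toy lexicon, "prettier" can only be split as pretti + er, so the stem variant "pretti" is treated as unrelated to "pretty". This is the kind of gap that modelling allomorphy with string transformations is meant to close.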

The opponent, Professor Borin, represented the computational linguistics aspect of the work. He paid attention to a number of linguistic, methodological, terminological and conceptual issues. Borin first placed Kohonen's work in a broader context, mentioning, for instance, the fact that there are about 7,000 languages in the world. The discussion took place in a sophisticated manner. Both the defendant and the opponent speak Swedish, Finnish and English, and therefore the examples could be related to any of these languages. As a conceptual topic, Borin asked Kohonen to clarify what was meant by morphological analysis.

At the evening party, the opponent and I realized that we had met at a conference quite some time ago. At first we thought it had been in Gothenburg, but afterwards I realized it must have been a conference in Copenhagen in 1987. At that time, we shared an interest in morphology. As a young researcher I got the idea that one could try to learn morphological analysis rules through inductive inference. Almost thirty years ago the attempt did not give convincing results. Therefore, from a personal point of view, it has been a pleasure to see how substantial developments have taken place.

Wednesday, April 22, 2015

Enhancing digital humanities at National Library of Finland

The National Library of Finland organized an internal seminar in which both organizational and content matters were presented and discussed. The director of the research library, Liisa Savolainen, gave a presentation on the developments related to the Digital Humanities area. Digital humanities is a naturally active area for the National Library, as an increasing proportion of material is in digital form and the library itself digitizes large quantities of materials. Savolainen discussed a conceptual model. She also gave examples of international and Finnish digital humanities projects and institutions including
  • style analysis of Shakespeare's texts,
  • Old Bailey corpus of London central criminal court decisions, published from 1674 to 1913,
  • FIN-CLARIN,
  • VARIENG,
  • Bible version comparison, and
  • sea traffic in antiquity.
Savolainen concluded that the library's natural role is to provide materials. It was also discussed that the availability of easy-to-use tools can be important for researchers, many of whom have only limited skills in computer science.

Jean Sibelius, who lived 1865-1957, is the internationally best known Finnish composer. Tuija Wicklund gave a presentation on a large-scale project called JSW - Jean Sibelius Works, in which a critical edition of Sibelius' works is compiled. The edition includes both musical scores and associated texts such as letters. As an example, Wicklund presented Lemminkäinen Tuonelassa and described the different stages of composition and how the information is transferred from the composer's table to the publisher's hands, where the score is presented separately for each of the 27 players. In the critical edition, different information sources are integrated. For example, potential errors in the original score are corrected, but in an open and transparent manner.

Thursday, April 16, 2015

Consumer Research Center becoming a part of University of Helsinki

The formerly autonomous National Consumer Research Center is becoming a part of the University of Helsinki. This move was celebrated today at the new premises of the institution, whose researchers described ongoing research and future plans.

One collaboration project with the Department of Modern Languages is called Citizen Mindspaces. In the project, social scientists, text analysts and computational linguists will study a large collection of social media discussions in the Suomi24 service and develop research questions, methods and tools in order to provide means for a deeper understanding of citizens' thoughts about the state of affairs.

Friday, April 10, 2015

Lauri Lahti: Computer-Assisted Learning Based on Cumulative Vocabularies, Conceptual Networks and Wikipedia Linkage

Lauri Lahti defended his PhD thesis "Computer-Assisted Learning Based on Cumulative Vocabularies, Conceptual Networks and Wikipedia Linkage" in the field of Computer Science and Engineering. Associate Professor Piet Kommers, University of Twente, the Netherlands, served as the opponent and Professor Jorma Tarhio, Aalto University, Department of Computer Science, as the custos. In the thesis, it was found that the conceptual networks of students, common language and Wikipedia inherently emphasize different themes that should be addressed when developing learning methods.

In his lectio precursoria, Lahti discussed the motivation and different aspects of his work. His motivation stems from education. A central task he considers is how to travel through a conceptual network during learning. At later stages of the work, Wikipedia became an important resource. In the work, educational methods were developed that are inspired by the collaboratively maintained knowledge structure of Wikipedia. Moreover, many of its features and contents related to representing, exploiting and mimicking were used. Due to Wikipedia's many unique characteristics, Lahti considered it to offer not just an encyclopedic reference for factual information but a holistic framework for knowledge representation. One can, for example, compare mind maps created by students at different stages of learning with Wikipedia as a socially constructed holistic resource.

The opponent discussed the relationship between literature and the keywords that are used to characterize the knowledge contained in the documents. This is an old question that libraries have solved in various ways. The basic question is how to link individual words with objects like books the contents of which are very complex and multifaceted.

Prof. Kommers had made the defence easy to follow by preparing the questions beforehand and by presenting them in one slide. The questions were discussed one by one, however widening the scope or delving into details whenever necessary. Among other things, various aspects related to networked representation of knowledge, students' use of it, measuring learning results, and different theories of learning were discussed. The opponent specifically mentioned Gordon Pask's work. With Heinz von Foerster and others, Pask was an early cybernetician who paid careful attention to systems theoretical aspects of complex phenomena.

A central critical point made by the opponent was related to the limitations of using recall as a means to study learning. He asked about the potential of using the knowledge, for instance, in active problem solving. In general, the opponent found the work substantial and warmly recommended that it be accepted.