CORPUS

The UNL EOLSS Interface processes metadata from the EOLSS as of July 2006. Our corpus included:

The corpus was automatically downloaded from the EOLSS website and semi-automatically represented as a relational database comprising 13 tables:

Because of the many inconsistencies in the EOLSS, populating the UNL EOLSS Database was not a trivial task. We found many divergent representations of the same institution or author: University of Tokyo vs. Tokyo University; Bagrov Alexei Mikhailovich vs. Bagrov Alexey Mikhailovich; Berman Cilingir Kayis vs. Berman C. Kayis; and so on. In many cases, authors were associated with specific departments; in others, with the university as a whole.
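
As an illustration only, the kinds of divergence listed above (word order, transliteration variants, abbreviated middle names) could be flagged for manual review with a small matching routine. The sketch below is not the procedure used to build the UNL EOLSS Database; the function names, the stopword list, and the 0.8 similarity threshold are all assumptions chosen for the example.

```python
import re
from difflib import SequenceMatcher

# Assumption: small words that carry no identity ("University of Tokyo").
STOPWORDS = {"of", "the"}

def tokens(name):
    """Lowercase, strip punctuation, drop stopwords, and sort the tokens."""
    words = re.findall(r"[a-z]+", name.lower())
    return sorted(w for w in words if w not in STOPWORDS)

def token_match(a, b, threshold=0.8):
    """Tokens match if one is an initial of the other or they are similar."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return SequenceMatcher(None, a, b).ratio() >= threshold

def likely_same(name_a, name_b):
    """Heuristic: same number of tokens and every sorted pair matches."""
    ta, tb = tokens(name_a), tokens(name_b)
    if len(ta) != len(tb):
        return False
    return all(token_match(a, b) for a, b in zip(ta, tb))

if __name__ == "__main__":
    pairs = [
        ("University of Tokyo", "Tokyo University"),
        ("Bagrov Alexei Mikhailovich", "Bagrov Alexey Mikhailovich"),
        ("Berman Cilingir Kayis", "Berman C. Kayis"),
    ]
    for a, b in pairs:
        print(a, "|", b, "|", likely_same(a, b))
```

A heuristic like this only proposes candidate pairs. Sorting the tokens can misalign names in which an initial and another word start with the same letter, and similar but distinct names can pass the threshold, so each proposed match would still need to be confirmed by hand.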

The resulting database can be downloaded from here (Microsoft Access 2003 database).