Collaborative Annotation of a Large Biomedical Corpus (CALBC)
Start date: Jan 1, 2009, End date: Jun 30, 2011 PROJECT FINISHED

Description

CALBC transforms a large set of documents into a corpus with rich semantic links to biomedical data resources.

The biomedical scientific literature is the key resource for the exchange of scientific facts: researchers write publications for their peer group to propose novel theories and report groundbreaking findings. The new open access policies of the publishers have removed the barriers that hindered integration of the literature content into the infrastructure of fact databases. This change has led to a standardization process in which scientific publications are seamlessly connected to the scientific databases.

The CALBC support action will engage the community of biomedical text mining researchers in a challenge that will lead to the exchange of a large set of annotated scientific documents. This community research effort will answer a very difficult question: "If we take all available semantic resources, for example terminologies, and use them to annotate a large set of documents, what will the documents finally look like under the best possible conditions?" The solutions to this problem will deliver biomedical literature in a standardized way and will enable more sophisticated retrieval methods for the literature, i.e. with better semantic support. In addition, automatic interlinking of the documents with the biomedical fact databases will become possible.

This project addresses the difficult problem of annotating an unrestricted number of text documents with a large set of semantic types from the biomedical domain. We propose a collaborative approach to this annotation task in the form of an open challenge to the biomedical text mining community. The task is the annotation of named entities in a large biomedical corpus, for a variety of semantic categories. The outcome of the project is a large, collaboratively annotated corpus, marked with the mentions of biomedical entities. The annotated corpus becomes a resource for the community, to be used as a reference for improving text-mining applications.

The biomedical text mining research community has a long tradition of organizing such challenges, as a way of evaluating techniques, sharing technical knowledge, and helping to improve the results from text mining programs. However, such challenges have typically addressed relatively small corpora in a narrow sub-domain, in part because the evaluation of the results is extremely long and costly. As a result, the generated annotated corpora are too small and too narrowly annotated to be useful in a variety of text mining applications.

In contrast, we propose to create a broadly scoped and large annotated corpus (at least 100,000 Medline abstracts annotated with 5-10 semantic types) by integrating the annotations from different named entity recognition systems. Metadata will also be added to the corpus. The participating systems have different application scopes and annotation strategies, and therefore complement each other. The annotated corpus therefore reflects these different scopes and strategies.

A secondary goal of this project is to define a standardized format for representing the annotations contributed by the participants and for comparing them effectively. Currently, the lack of such a format hinders progress in the evaluation of named entity recognition systems. The final corpus will also be made available in RDF for exploitation in Semantic Web applications.

The corpus will be used to organize challenges in which participants can download the corpus, annotate it with their own text mining solutions, submit the annotated corpus to a central server, and receive an assessment of their results through a fully automated analysis. Submissions can be contributed and assessed at any time over a half-year period. At the end of that period, all submitted annotated corpora will be used to generate the next fully annotated corpus, which will then be used for the next round of the challenge.
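The description does not say how annotations from different systems are integrated or how submissions are scored. As a rough illustration only, the sketch below assumes a simple agreement vote over exact-match mentions and an exact-match precision/recall/F1 assessment. The `Mention` fields, the `harmonize` and `score` functions, the `min_votes` threshold, and the sample data are all hypothetical and are not taken from the project.

```python
from collections import Counter
from typing import NamedTuple


class Mention(NamedTuple):
    """One annotated entity mention (hypothetical record layout)."""
    doc_id: str    # identifier of the annotated abstract
    start: int     # character offset where the mention begins
    end: int       # character offset just past the mention
    sem_type: str  # semantic type label, e.g. "disease"


def harmonize(system_outputs, min_votes=2):
    """Keep mentions that at least `min_votes` systems annotated identically.

    Assumption: agreement is an exact match on document, offsets and type.
    """
    votes = Counter()
    for mentions in system_outputs:
        votes.update(set(mentions))  # each system votes once per mention
    return {m for m, n in votes.items() if n >= min_votes}


def score(submission, reference):
    """Exact-match precision, recall and F1 of a submission against a reference."""
    submission, reference = set(submission), set(reference)
    tp = len(submission & reference)  # mentions found in both sets
    precision = tp / len(submission) if submission else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Made-up sample data: three systems annotating one document.
a = Mention("d1", 0, 6, "disease")
b = Mention("d1", 10, 15, "protein")
c = Mention("d1", 20, 24, "species")

reference = harmonize([[a, b], [a, c], [a, b]], min_votes=2)
precision, recall, f1 = score([a, c], reference)
print(sorted(reference))
print(precision, recall, f1)
```

In this sketch, raising `min_votes` trades coverage for agreement: a higher threshold keeps only mentions that more systems share, while a threshold of 1 keeps the union of all contributions.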

Coordinator

EUROPEAN MOLECULAR BIOLOGY LABORATORY

€ 685 697,00

Barbara Baron
Wellcome Trust Genome Campus - CB10 1SD Hinxton, Cambridge (Germany)

Details

68.3% € 1 499 687,00
FP7-ICT
Project on CORDIS Platform

Project Website

3 Partners Participants

FRIEDRICH-SCHILLER-UNIVERSITAET JENA

€ 262 364,00

Udo Hahn
FUERSTENGRABEN 07743 JENA (Germany)

ERASMUS UNIVERSITAIR MEDISCH CENTRUM ROTTERDAM

€ 442 700,00

Erik van Mulligen
's Gravendijkwal 3015CE ROTTERDAM (Netherlands)

LINGUAMATICS LIMITED

€ 108 926,00

Roger Hale
St Johns Innovation Centre, Cowley Road CB4 0WS Cambridge (United Kingdom)

