Personalized medicine (PM) seeks to identify therapies that provide safe and effective individualized treatment for specific patients. One of the great difficulties in carrying out this clinical practice effectively is that there are currently no flexible information systems capable of providing accurate, up-to-date, and interrelated knowledge based on stratified access to multiple heterogeneous data sources. All this information, generated in experimental studies, clinical trials, and daily clinical practice, and more recently through biomedical sensors and large, freely available, interlinked data sets (Open and Linked Data), should become an extraordinary source of knowledge for the advancement of PM. However, PM currently faces great challenges. It is necessary to integrate heterogeneous information that is scattered across multiple sources and differs in genre, domain, structure, and scale, and in which the textual component plays a very important role. To meet these challenges, this project proposes the coordinated application of techniques for integrating information from heterogeneous sources and of text and data mining techniques, in order to facilitate the extraction of the associated knowledge.

The main objective of the project is to design tools for integrated and intelligent access to related information in order to extract useful knowledge in the context of PM. We propose three usage scenarios: (i) assistance to healthcare professionals during decision making in clinical settings, (ii) access for chronic and dependent patients to relevant information about their health status, and (iii) support for evidence-based training of new medical students. More effective techniques are proposed for operations such as summarization, retrieval of images from text, information retrieval, named entity recognition, and extraction of information from large data sets coming from sensors and open data. Tools will be implemented to obtain knowledge from biomedical resources, mainly public ones. An architecture and Web application framework will be designed that enables text and data mining and information integration processes and techniques to be integrated in a fast, consistent, and reusable way (via plugins). Finally, we will develop intelligent user-support tools for the three scenarios defined: decision making for diagnosis and treatment, patients, and training. In addition, experiments will be conducted to evaluate both effectiveness and usability, through systematic evaluations and evaluations with users. The former will involve participating in competitions such as TREC Medical Records, CLEF, TAC, DDIExtraction, i2b2, BioCreative, the CoNLL Shared Task, or the BioNLP Shared Task. Evaluations with users will consider both open and controlled environments.

Related projects

MedicalMiner (TIN-2009-14057-C03-01) is a project for the integration of explicit knowledge in text mining techniques for the development of translational medicine tools.

FlipIT! Flipped Classroom in the European Vocational Education (2015-1-HU01-KA202-013555). The project is funded by the European Commission (European ICT Sector Skills Alliance - VET open course for mobile apps creators (AppSkil), public contract 554271-EPP-1-2014-1-UK-EPPKA2-SSA). It covers the analysis and subsequent development of a methodology based on the Flipped Classroom pedagogical model applied to VET. A MOOC for teachers will be developed, and pre- and post-tests will be carried out.

BIDAMIR (BIomedical Data Mining and Information Retrieval, TIC 07629). Excellence Research Project of the Junta de Andalucía. The main objective of this project is the development of an intelligent clinical information system that provides access to textual information and extracts useful knowledge from structured data sources.

VirtualcloudCarer (TSI-020100-2011-83) is a project for the development of a Service Platform that is highly personalized for each of the actors involved: dependents, family and primary care professionals, hospitals, and social services. On the one hand, it allows sensing and telemonitoring of the dependent person and his/her environment, both at home and outside. On the other, it makes possible the integration into the Digital Society of people whose physical capacities are severely degraded, adapting the systems and the environment to them and not vice versa.

MAVIR (Improving Access and Visibility of Multilingual Web Information for the Community of Madrid, S2009/TIC1542). The MAVIR Consortium is a research network co-financed by the Community of Madrid within the IV Regional Plan of Scientific Research and Technological Innovation (IV PRICIT). It is formed by a multidisciplinary team of scientists, technicians, linguists, and documentalists who carry out an integrated effort in the areas of research, training, and technology transfer.

NAVIGA (E!4583 / CDTI) is a European project funded by the "Eurostars Eureka" initiative. The Naviga Project aims to reduce the risk that social groups with special vulnerabilities, such as the elderly or disabled, become victims of digital progress, by making available to these groups the tools to stay active both mentally and socially.


Intelligent Systems Group (GSI), European University of Madrid (Madrid-UEM). Subproject 1 (UEM-IPHealth): Integration of access methods to open sources of information and sensor data for health education and decision-making (TIN2013-47153-C3-1-R)

Laboratory for Information Retrieval and Mining of Texts and Data (Laberinto), University of Huelva (Huelva-UHU). Subproject 2 (UHU-IPHealth): Text and data mining to support decision-making and learning in the field of health (TIN2013-47153-C3-2-R)

Next Generation Computer Systems Group (SING), University of Vigo (Ourense-UVigo). Subproject 3 (UVigo-IPHealth): Platform for Integration of Intelligent Techniques for Biomedical Information Analysis (TIN2013-47153-C3-3-R)



GSI (UEM)

Manuel de Buenaga Rodríguez
Diego Gachet Paez
Enrique Puertas Sanz
Margarita Rubio Alonso
María Asunción Hernando Jerez
María José Busto Martínez
María Teresa Villalba de Benito
María Cruz Gaya López
Rosa Belén Mohedano del Pozo
Fernando Aparicio Galisteo
Rafael Muñoz Gil
María de la Luz Morales Botello

Laberinto (UHU)

Manuel J. Maña López
Jacinto Mata Vázquez
Miguel Á. Vélez Vélez
Manuel de la Villa Cordero
Noa Patricia Cruz Díaz

SING (UVigo)

Florentino Fernández Riverola
Reyes Pavón Rial
Rosalía Laza Fidalgo
José Ramón Méndez Reboredo
Daniel González Peña
Miguel Reboiro Jato
Fernando Díaz Gómez
Francisco José González Cabrera
Mª del Carmen Rodríguez Otero
Eva María Lorenzo Iglesias
M. Lourdes Borrajo Diz
Adrián Seara Vieira
Rubén Romero González
David Ruano Ordás



Gómez-Vallejo, H. J., Uriel-Latorre, B., Sande-Meijide, M., Villamarín-Bello, B., Pavón, R., Fdez-Riverola, F., & Glez-Peña, D. (2016). A case-based reasoning system for aiding detection and classification of nosocomial infections. Decision Support Systems, 84, 104–116.
Iglesias, E. L., Vieira, A. S., & Diz, L. B. (2016). An HMM-Based Multi-view Co-training Framework for Single-View Text Corpora. In F. Martínez-Álvarez, A. Troncoso, H. Quintián, & E. Corchado (Eds.), Hybrid Artificial Intelligent Systems (pp. 66-78). Springer International Publishing.
Jácome, A. G., Fdez-Riverola, F., & Lourenço, A. (2016). BIOMedical Search Engine Framework: Lightweight and customized implementation of domain-specific biomedical search engines. Computer Methods and Programs in Biomedicine, 131, 63–77.
Rodrigues, M. F., Gonçalves, S. M., Santos, R., Fdez-Riverola, F., & Carneiro, D. (2016). Intelligent Tutoring: Active Monitoring and Recommendation. Interdisciplinary Perspectives on Contemporary Conflict Resolution, 205.
A. Seara Vieira, Maria Lourdes Borrajo Diz, Eva Lorenzo Iglesias: Improving the text classification using clustering and a novel HMM to reduce the dimensionality. Computer Methods and Programs in Biomedicine 136: 119-130.
D Gachet Paez, M de Buenaga Rodriguez, E Puertas Sanz, MT Villalba (2016). Healthy and wellbeing activities promotion using a Big Data approach. Health informatics journal, August 4, https://doi.org/10.1177/1460458216660754.
D Gachet, M de la Luz Morales, M de Buenaga, E Puertas, R Munoz (2016) Distributed Big Data Techniques for Health Sensor Information Processing. 10th International Conference on Ubiquitous Computing and Ambient Intelligence (UCAMI.2016).
M Teresa Villalba, M de Buenaga, D Gachet, F Aparicio (2016) Security Analysis of an IoT Architecture for Healthcare. Internet of Things and IoT Infrastructures: Second International Summit.
DG Paez, M de Buenaga Rodriguez, EP Sanz, MT Villalba, RM Gil (2016) Big data processing using wearable devices for wellbeing and healthy activities promotion. International Workshop on Ambient Assisted Living.


Manuel de Buenaga, Diego Gachet, Manuel J. Maña, Jacinto Mata, Lourdes Borrajo, Eva L. Lorenzo (2015). IPHealth: Plataforma inteligente basada en open, linked y big data para la toma de decisiones y aprendizaje en el ámbito de la salud. Procesamiento del Lenguaje Natural, Vol. 55, pp. 161-164.
M. Teresa Villalba, Manuel de Buenaga, Diego Gachet, Fernando Aparicio. (2015). Security analysis of an IoT architecture for Healthcare. In Proceedings of the 2nd EAI International Conference on IoT Technologies for HealthCare. Lecture Notes of ICST 2015.
Diego Gachet, M.L. Morales Botello, Manuel de Buenaga, Enrique Puertas. (2015). Health Sensors Information Processing and Analytics using Big Data approaches. In Proceedings of the 2nd EAI International Conference on IoT Technologies for HealthCare. Lecture Notes of ICST 2015.
Gachet Páez, D., Buenaga, M., Puertas, E., Villalba, M.T. (2015). Big Data Processing of Bio-signal Sensors Information for Self-management of Health and Diseases. In Proceedings of the 2015 seventh international conference on innovative mobile and internet services in ubiquitous computing. IEEE Computer Society.
Fernando Aparicio, Ma. Cruz Gaya, Manuel de Buenaga, Diego Gachet. (2015) M-health mobile app usability tested with engineering students. Proceedings of 9th International Technology, Education and Development Conference. Madrid. Spain.
Noa P. Cruz, Maite Taboada, Ruslan Mitkov (2015). A Machine Learning Approach to Negation and Speculation Detection for Sentiment Analysis. Journal of the American Society for Information Science and Technology (JASIST). DOI: 10.1002/asi.23533.
Noa P. Cruz (2015). Negation and Speculation Detection in Clinical and Review Texts. Procesamiento del Lenguaje Natural, vol. 54, pp. 107-110.
R. Romero, E. L. Iglesias, and L. Borrajo (2015). “A Linear-RBF Multikernel SVM to Classify Big Text Corpora”. BioMed Research International, vol. 2015, Article ID 878291, pp. 1-14. doi:10.1155/2015/878291
Borrajo, L.; Seara Vieira, A.; Iglesias, E.L. (2015). “TCBR-HMM, an HMM-based text classifier with a CBR system”. Applied Soft Computing Journal. Vol. 26, pp. 463-473, DOI: 10.1016/j.asoc.2014.10.019
Diego Gachet, Manuel de Buenaga, Enrique Puertas, María Teresa Villalba and Rafael Muñoz. Big Data Processing Using Wearable Devices for Wellbeing and Healthy Activities Promotion. 7th International conference on Ambient Assisted Living (IWAAL) 2015. Lecture Notes in Computer Science
Diego Gachet Páez, Manuel de Buenaga, Enrique Puertas, María Villalba. Big Data Processing of Bio-signal Sensors Information for Self-management of Health and Diseases. The Ninth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2015). Regional University of Blumenau (FURB), Blumenau, Brazil.
Jacinto Mata, Noa Patricia Cruz, Juan Luis Domínguez, Victoria Pachón, Manuel de la Villa, Alberto Moreno, Alicia Martínez, Carlos Luis Parra. BIDAMIR: BIomedical DAta Mining and Information Retrieval. XVIII Congreso Nacional de Informática de la Salud (InforSalud 2015), Madrid, 2015.
Juan L. Dominguez, Jacinto Mata, Victoria Pachón. Deterministic Extraction of Compact Sets of Rules for Subgroup Discovery. The International Conference on Intelligent Data Engineering and Automated Learning (IDEAL). Special Session on Discovering Knowledge from Data, Wroclaw (Poland), 2015. Publication: Lecture Notes in Computer Science (ISSN: 0302-9743).
Noa P. Cruz, Manuel J. Maña. An Analysis of Biomedical Tokenization: Problems and Strategies. Sixth International Workshop on Health Text Mining and Information Analysis (Louhi). Lisboa (Portugal), 2015.
A. Seara Vieira, E.L. Iglesias and L. Borrajo (2015) “A new dimensionality reduction technique based on HMM for boosting document classification”. Advances in Intelligent and Soft Computing (AISC). 9th International Conference on Practical Applications of Computational Biology and Bioinformatics (PACBB 2015). Vol. 1, pp. 69-78. Salamanca (Spain). DOI 10.1007/978-3-319-19776-0


Gachet-Páez, D., Aparicio-Galisteo, F., Buenaga-Rodríguez, M., Ascanio, J. R. (2014). Big data and IoT for chronic patients monitoring. In Ubiquitous Computing and Ambient Intelligence: Personalisation and user adapted services. 2014 (pp. 416-423). Springer International Publishing. Lecture Notes in Computer Science.
Dueñas Fuentes, Antonio Jesús; Mochón Doña, Ana; Escribano Dueñas, Ana Milagrosa; Piña Fernández, Juan Antonio; Gachet Páez, Diego (2014). Mathematical probability model for Obstructive Sleep Apnea Syndrome (OSAS). Chest, 145(3 Suppl), 597A-597A.
Rivas, A.R.; Iglesias, E.L.; Borrajo, L. (2014). “Study of query expansion techniques and their application in the biomedical information retrieval”. Scientific World Journal, vol. 2014, pp. 1-10. doi: 10.1155/2014/132158
Enrique Puertas, Manuel de Buenaga, María Lorena Prieto. (2014). BUSCLIMED: Mobile app for searching medical literature. In Proceedings of International Conference on Mobile and Information Technologies in Medicine and Health 2014.
Noa P. Cruz. Negation and Speculation Detection in Medical and Review Texts, ISBN: 978-84-617-0887-1. Published by Sociedad Española para el Procesamiento del Lenguaje Natural. 2014.
D. Gachet Paez, F. Aparicio, M. De Buenaga, Juan Ramon Ascanio. Chronic Patients Monitoring Using Wireless Sensors and Big Data Processing. The Eighth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2014), Birmingham City University, Birmingham, UK.
Seara Vieira, A.; Iglesias, E.L.; Borrajo, L. (2014). “T-HMM: A Novel Biomedical Text Classifier Based on Hidden Markov Models”. Advances in Intelligent and Soft Computing (AISC). 8th International Conference on Practical Applications of Computational Biology and Bioinformatics (PACBB 2014). Vol. 1, pp. 225-234. Salamanca (Spain). DOI: 10.1007/978-3-319-07581-5.
Romero, R.; Seara Vieira, A.; Iglesias, E.L.; Borrajo, L. (2014). “BioClass: A Tool for Biomedical Text Classification”. Advances in Intelligent and Soft Computing (AISC). 8th International Conference on Practical Applications of Computational Biology and Bioinformatics (PACBB 2014). Vol. 1, pp. 243-251. Salamanca (Spain).
Prieto, M. L., Puertas, E., & de Buenaga, M. (2014). Learning tool for medicine students based on biomedical named entity recognition and Linked Open Data. Biomedical Engineering and Environmental Engineering, 145, 29.



From the perspective of the end user, BioClass is a platform that focuses on applying reasoning models to text classification. It is designed to work with the results of an information retrieval process over a text database, where the documents may or may not be relevant to a specific topic. BioClass takes this data as input and offers multiple filters and machine learning algorithms to handle the automatic classification problem. From a developer perspective, BioClass offers an abstraction layer over the classification process. Thanks to this, developers can reuse its architecture and plug in new reasoning models.


Busclimed (Biomedical information finder for mobile devices) is an application for consulting information about medical terms through the use of Linked Data. The patient or healthcare professional can view terms related to a particular disease, symptom, or drug, consult scientific articles about them from sources such as PubMed or MedlinePlus, and also see information from the National Drug File when the searched term is a drug.

Download the Android app from the Google Play Store

Big Data Analytics in Cardiology

Big Data analysis technologies are having and will have a huge impact on health. The benefits of "Big Data" include improved quality and accuracy of clinical decisions, improved processing speed for large amounts of data, and detection of diseases at an early stage. Here we use tools compatible with Big Data technology to predict the mortality of patients in an intensive care unit using the R software.


The objective of the ClinLaP (Clinical Language Analysis Platform) application is to help patients better understand what their clinical history says by showing relevant information extracted from various biomedical sources. To do this, the system has a language analysis module that automatically extracts the relevant medical terms from the text and lets the user obtain information about the meaning of each term with a single click. It also shows related information of interest, such as symptoms, treatments, medications, and scientific articles in which the disease or drug is mentioned.
This application received the "Fujitsu Linked Open Data 2015" award.


jARVEST (Java web harvesting library) is a simple web scraping tool. It is implemented through a powerful domain-specific language based on JRuby, which makes development possible with minimal code.


DISSUM is a tool that supports the labeling of a biomedical corpus of clinical progress notes for the semi-automatic creation of hospital discharge reports using automatic summarization techniques. Annotating a corpus for summaries, which is essential for applying machine learning and quality assessment techniques, is a complex task: it requires chronological access to the documents, selection of meaningful sentences within size constraints, and typological differentiation of these sentences. The tool manages the whole process, making the work of the human annotators easier and less costly.

Technology transfer

Promoter entities (EPO)