Computational models for establishing means and methodological procedures for investigation of research data in bioinformatics - MCBio


Computation as a research tool has revolutionized the biological sciences, as it has many other fields of science. Scientific data are growing exponentially in volume and complexity, and they must be efficiently handled, translated, processed and communicated. New computational resources are therefore needed to turn this volume of data into knowledge and, as the final step of the process, to apply that knowledge to enable or accelerate technological advances that, in turn, modernize the productive sectors. Computational and mathematical models used as research tools make it possible not only to interpret the content readily identified in data deposited in various databases, but also to process large quantities of scientific data and convert them into innovative technologies, services or products (TSP) by identifying patterns and relationships that were not perceived a priori. Appropriate computational and mathematical models, based on the concepts and applications of data science, allow scientific questions to be approached from a new analytical perspective, as a new methodological strategy for examining results, on the premise that new ways of analysis may lead to new TSPs. Adopting computational models for this analysis complements commonly used methods such as the statistical approach, which generally tests experiments against a previously defined hypothesis. Current research projects, however, require the generation and evaluation of hundreds or even thousands of hypotheses, which in practice can be evaluated only by computational models.
This scenario is even more challenging given the complexity of the datasets now being generated. Their characteristics include, among others: large volume, with terabyte-scale datasets becoming common; high dimensionality, with hundreds or thousands of attributes; heterogeneity, since, unlike traditional methods of analysis, computational models are suitable for data of different types, discontinuous and uncategorized; and multiple physical locations, since such datasets are commonly distributed across several repositories. The public interest and the benefits to society arise when other research projects apply the methodology and other resources developed within MCBio and thereby become more efficient in their search for results. That is, in terms of potential results, MCBio has developed new resources for research on large masses of data, specifically genomic databases, so that its results are ready to be used by their intended recipients, the scientific and academic communities. In this way, MCBio's results potentially enable these communities to contribute to society in an applied and/or end-purpose way, building their studies and research on the methodology and other results obtained by the project.

Status: Completed

Start date: April 1, 2011

Conclusion date: March 31, 2014

Head Unit: Embrapa Dairy Cattle

Project leader: Wagner Antonio Arbex

Contact: wagner.arbex@embrapa.br