Importance of Data Management and Analyses
Historically, bioinformatics emphasized sequence and genome analysis. Today, however, the extensive use of microarrays and mass spectrometry has stimulated bioinformatic work in data acquisition, signal and image processing, and data mining. Simulation, modeling, and visualization tools are also becoming increasingly important. Because these activities differ, it is useful to distinguish them: we use the term bioinformatics for data capture, management, and retrieval, and the term biocomputation for data analysis (see Glossary).
Bioinformatics is defined here as the capture, manipulation, and retrieval of high-dimensional datasets. It is a multidisciplinary science that uses computers to address challenges in genomics, proteomics, and affiliated fields. Warehousing such diverse data may seem challenging to biologists, but other disciplines have already built scalable databases on an even larger scale. The primary tools of bioinformatics are powerful workstations and servers, together with the software applications that implement the computational algorithms.
- Database Management Systems
- Sample tracking
- Patient information management
- Data Acquisition
- Data Analysis
- Signal and Image Processing
- Pattern Recognition and Classification
- Data mining
- Simulation and Modeling
Biocomputation converts data into knowledge.
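To make the data-management side of the list above more concrete, here is a minimal sketch of sample tracking linked to participant records, using Python's built-in sqlite3 module. The table names, column names, identifiers, and storage locations are all hypothetical and chosen only for illustration; a real biobank system would need far more (consent, audit trails, access control).

```python
import sqlite3

def create_tracking_db(conn):
    # Hypothetical two-table schema: each sample must belong to a known participant.
    conn.executescript("""
        CREATE TABLE participant (
            participant_id TEXT PRIMARY KEY,
            birth_year     INTEGER
        );
        CREATE TABLE sample (
            sample_id        TEXT PRIMARY KEY,
            participant_id   TEXT NOT NULL REFERENCES participant(participant_id),
            sample_type      TEXT NOT NULL,
            collected_on     TEXT NOT NULL,  -- ISO 8601 date, so text order is date order
            storage_location TEXT
        );
    """)

def register_sample(conn, sample_id, participant_id, sample_type,
                    collected_on, storage_location):
    # Parameterized insert; fails if the participant is unknown.
    conn.execute(
        "INSERT INTO sample VALUES (?, ?, ?, ?, ?)",
        (sample_id, participant_id, sample_type, collected_on, storage_location),
    )

def samples_for(conn, participant_id):
    # Retrieve a participant's samples in order of collection date.
    rows = conn.execute(
        "SELECT sample_id, sample_type, storage_location "
        "FROM sample WHERE participant_id = ? ORDER BY collected_on",
        (participant_id,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
create_tracking_db(conn)

# Illustrative, made-up records.
conn.execute("INSERT INTO participant VALUES (?, ?)", ("P001", 1970))
register_sample(conn, "S-0002", "P001", "plasma", "2006-03-14", "Freezer 2, rack B")
register_sample(conn, "S-0001", "P001", "whole blood", "2006-01-09", "Freezer 1, rack A")

for row in samples_for(conn, "P001"):
    print(row)
```

The foreign-key constraint is the main design choice here: it keeps every tracked sample tied to a participant record, which is what allows later analyses to join molecular data with patient information.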
The processes of life are determined by interactions between an organism's genetic makeup and multiple environmental variables. Nutritional genomics seeks to identify the positive and negative connections between the common constituents of our diet and the genetic determinants of health and disease, as influenced by environmental factors. This makes it a high-dimensional problem, because nutritional genomics datasets are large, complex, and nonlinear.
The sources of nutritional genomic complexity are many and may include: (1) seasonal variations in food selection and content, (2) food fads and public response to news, studies, and ads, (3) food preparation and cooking, (4) cultural and religious practices, (5) socio-economic status, (6) access to health care, (7) age and health status, (8) exercise and lifestyle, (9) disease complexity, and (10) genetic background.
The analysis of these datasets poses significant challenges for nutrigenomics researchers. New software tools for such analyses are being developed rapidly as more and more high-dimensional datasets are acquired, but there may be no single "optimum" method of analysis. We expect that datasets will be re-analyzed as new biocomputational tools are developed and perfected.
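As a small illustration of why the choice of analysis method matters for nonlinear data, the sketch below compares two common measures of association on a synthetic, strictly increasing but nonlinear relationship. The variable names `intake` and `response` and the exponential relationship are assumptions made purely for the example; they are not drawn from any real study.

```python
import math

def pearson(xs, ys):
    # Pearson correlation: measures the strength of a *linear* relationship.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def ranks(values):
    # Rank from 1 (smallest) to n (largest); this simple version assumes no ties.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    # Spearman correlation: Pearson correlation applied to ranks,
    # so it captures any monotonic relationship, linear or not.
    return pearson(ranks(xs), ranks(ys))

# Synthetic data: the response grows exponentially with intake.
intake = [float(i) for i in range(1, 11)]
response = [math.exp(x) for x in intake]

print("Pearson: ", round(pearson(intake, response), 3))
print("Spearman:", round(spearman(intake, response), 3))
```

On these data the rank-based measure reports a perfect monotonic association, while the linear measure reports a noticeably weaker one, even though the underlying relationship is deterministic. Neither answer is wrong; each method answers a different question, which is one reason the same dataset may be worth re-analyzing with different tools.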
Dawson, K., Rodriguez, RL, and Malyj, W. 2005. Sample phenotype clusters in high-density oligonucleotide microarray data sets are revealed using Isomap, a nonlinear algorithm. BMC Bioinformatics 6, 195.
Dawson, K., Rodriguez, RL, Hawkes, WC, and Malyj, W. 2006. Biocomputation and the Analyses of Complex Data Sets in Nutritional Genomics. In Nutritional Genomics: Discovering the Path to Personalized Nutrition. Kaput, J and Rodriguez, R (eds). Wiley and Sons, Inc. NY.
Kibbe, W. 2006. The Informatics and Bioinformatics Infrastructure of a Nutrigenomics Biobank. In Nutritional Genomics: Discovering the Path to Personalized Nutrition. Kaput, J and Rodriguez, R (eds). Wiley and Sons, Inc. NY.