Data Input
Longitudinal Data
Our visualization pipeline starts with an optional pre-processing module with built-in functions. There are two cases:
If your datasets are multi-omics:
Here, we assume you provide a list of data tables, one per omics type. For instance, you might have separate tables for peptides, metabolites, and lipids. We recommend reviewing your data and considering quantile normalization and KNN imputation, which help address differences in library size and missing values, respectively.
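To make the normalization step concrete, here is a minimal base R sketch of quantile normalization. The function name `quantile_normalize` and the simulated matrix are assumptions made for illustration; they are not part of the package, and the sketch assumes a complete numeric matrix with no missing values.

```r
# Hypothetical helper (not part of the package): quantile-normalize the
# columns of a complete numeric matrix so they share one distribution.
quantile_normalize <- function(x) {
  x <- as.matrix(x)
  # Rank of every value within its own column (ties broken by position).
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the k-th smallest values across columns.
  sorted <- apply(x, 2, sort)
  ref <- rowMeans(sorted)
  # Replace each value by the reference value at the same rank.
  out <- apply(ranks, 2, function(r) ref[r])
  dimnames(out) <- dimnames(x)
  out
}

# Simulated data: 10 features (rows) measured at 4 time points (columns).
set.seed(42)
x <- matrix(rexp(40), nrow = 10,
            dimnames = list(NULL, paste0("Day_", 1:4)))
x_qn <- quantile_normalize(x)
```

After normalization every column contains the same set of values, while the ordering of features within each column is unchanged. For real data, a maintained implementation is likely a safer choice than this sketch.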
Then, run the pre_process() function, and you will get the standard input format as follows.
#>   ID      Day_1      Day_2       Day_3      Day_4       Day_5       Day_6
#> 1  1  0.8564458 -0.7026066  0.36425502  2.4817445  0.34595037  0.98407204
#> 2  2 -0.8276824 -1.0309664 -0.01834923 -0.1516639  0.41712822  1.94645479
#> 3  3  0.2413671 -0.6354232 -0.13154219  1.0836720 -0.24427531  0.04627503
#> 4  4 -1.3853057  1.3269031 -0.73650929  1.3218904 -1.08954020 -0.14244174
#> 5  5 -0.7863371  0.4455912 -1.81562430 -1.1050732 -0.06868605 -0.26191600
#>        Day_7      Day_8       type
#> 1  0.1947470  0.8254115    peptide
#> 2 -0.3697760  0.9036718    peptide
#> 3  0.5317324  0.2719372    peptide
#> 4 -1.5919935  0.7888163      lipid
#> 5  0.7955413 -0.1197570 metabolite
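As a rough illustration of how a list of per-omics tables relates to this format, the base R sketch below stacks a few simulated tables and records each table's origin in the type column. It is not the implementation of pre_process(); the helper `make_table`, the object names, and the simulated values are all assumptions made for illustration.

```r
# Hypothetical helper: simulate a table with one row per feature and
# one column per time point.
make_table <- function(n_features, n_days = 3) {
  m <- matrix(rnorm(n_features * n_days), nrow = n_features,
              dimnames = list(NULL, paste0("Day_", seq_len(n_days))))
  as.data.frame(m)
}

set.seed(1)
# A named list of tables, one per omics type (names become the type labels).
omics_list <- list(
  peptide    = make_table(3),
  lipid      = make_table(1),
  metabolite = make_table(1)
)

# Tag each table with its omics type, then stack the tables row-wise.
stacked <- do.call(rbind, lapply(names(omics_list), function(nm) {
  tab <- omics_list[[nm]]
  tab$type <- nm
  tab
}))

# Add a running feature ID as the first column and reset the row names.
stacked <- cbind(ID = seq_len(nrow(stacked)), stacked)
rownames(stacked) <- NULL
```

The result has the same layout as the output above: an ID column, one column per time point, and a final type column.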
If your datasets are NOT multi-omics:
You can still use the dashboard as long as the data are reformatted into the standard longitudinal format. You can manually fill the type column with any category label that describes the major groups in the data. In the case study, for example, we used Kingdom as the type label column for the cheese data.
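A minimal sketch of assigning the type column by hand, assuming a wide table with one row per feature. The values and the `kingdom` vector are hypothetical and only mirror the idea of using Kingdom as the grouping label.

```r
# Hypothetical abundance table: one row per feature, one column per time point.
abundance <- data.frame(
  ID    = 1:4,
  Day_1 = c(0.12, -0.40, 1.05, 0.33),
  Day_2 = c(0.50, -0.10, 0.80, -0.25),
  Day_3 = c(0.91, 0.22, 0.41, -0.60)
)

# Hypothetical group label for each feature, in the same row order.
kingdom <- c("Bacteria", "Bacteria", "Fungi", "Fungi")

# Use the group label as the type column of the standard format.
standard_input <- abundance
standard_input$type <- kingdom
```

Any categorical variable can play this role, provided it has exactly one label per row of the table.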
Annotation
In addition to the data type described above, our methods accept three levels of information: functional annotation, taxonomy annotation, and feature annotation. These are supplied as columns in the annotation data, which is a second input for generating a dashboard, and are matched to the features by ID. We support KeggID and GOID for automatic feature link generation; users should set the corresponding column name beforehand.
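The sketch below shows one way such an annotation table could be laid out and checked against the feature IDs before building a dashboard. Apart from ID, KeggID, and GOID, the column names are assumptions, and every value is an illustrative placeholder written in the usual KEGG and GO identifier formats.

```r
# Hypothetical feature IDs, as they would appear in the standard input table.
feature_ids <- 1:3

# Hypothetical annotation table: one row per feature, matched by ID.
# The column names functional, taxonomy, and feature are assumptions.
annotation <- data.frame(
  ID         = 1:3,
  functional = c("Glycolysis", "Glycolysis", "Lipid metabolism"),
  taxonomy   = c("Bacteria", "Fungi", "Bacteria"),
  feature    = c("feature A", "feature B", "feature C"),
  KeggID     = c("C00031", "C00022", "C00162"),
  GOID       = c("GO:0006096", "GO:0006096", "GO:0006629"),
  stringsAsFactors = FALSE
)

# Features without an annotation row would show up here.
missing_ids <- setdiff(feature_ids, annotation$ID)

# Reorder the annotation rows to follow the order of the input features.
annotation_matched <- annotation[match(feature_ids, annotation$ID), ]
```

Checking `missing_ids` before generating a dashboard is a cheap way to catch features that would otherwise appear without annotation.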