<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1d1 20130915//EN" "JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta id="journal-meta-1">
      <journal-id journal-id-type="nlm-ta">Biomedical research and Therapy</journal-id>
      <journal-id journal-id-type="publisher-id">Biomedical research and Therapy</journal-id>
      <journal-id journal-id-type="journal_submission_guidelines">http://www.bmrat.org/</journal-id>
      <journal-title-group>
        <journal-title>Biomedical research and Therapy</journal-title>
      </journal-title-group>
      <issn publication-format="print"/>
    </journal-meta>
    <article-meta id="article-meta-1">
      <article-id pub-id-type="doi">10.15419/bmrat.v8i3.666</article-id>
      <title-group>
        <article-title id="at-66fc79383d10">Advantages of functional analysis in comparison of different chemometric techniques for selecting obesity-related genes of adipose tissue from high-fat diet-fed mice</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes">
          <contrib-id contrib-id-type="orcid"/>
          <name id="n-6ad8d59ee84b">
            <surname>Dharmaraj</surname>
            <given-names>Saravanan</given-names>
          </name>
          <email>saravanandharmaraj@unisza.edu.my</email>
          <xref id="x-77fd60f1ab96" rid="a-887300e75529" ref-type="aff">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid"/>
          <name id="n-185473417587">
            <given-names>Mahadeva Rao U. S.</given-names>
          </name>
          <xref id="x-dd9afa6642f1" rid="a-887300e75529" ref-type="aff">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid"/>
          <name id="n-a2087d45ccaf">
            <surname>Simbak</surname>
            <given-names>Nordin</given-names>
          </name>
          <xref id="x-f46986bfe3b9" rid="a-887300e75529" ref-type="aff">1</xref>
        </contrib>
        <aff id="a-887300e75529">
          <institution>Faculty of Medicine, Universiti Sultan Zainal Abidin, Medical Campus, 20400 Kuala Terengganu, Terengganu, Malaysia.</institution>
        </aff>
      </contrib-group>
      <volume>8</volume>
      <issue>3</issue>
      <permissions/>
      <abstract id="abstract-06fedb30fc69">
        <title id="abstract-title-6605b4354335">Abstract</title>
        <p id="paragraph-50c22ab32df3"><bold id="s-d6ecf5070a2d">Introduction</bold>: Obesity is an increasingly prevalent lifestyle disease associated with a surplus in energy balance related to lipid metabolism, inflammation and hypoxic conditions, resulting in maladaptive adipose tissue expansion. This study used a publicly available gene expression dataset to identify a small subset of important genes for diagnostics or as potential targets for therapeutics. <bold id="strong-2">Methods</bold>: Chemometric analyses by principal component analysis (PCA), random forest (RF), and genetic algorithm (GA) were used to identify 50 genes that differentiate adipose samples from high-fat diet- and normal diet-fed mice. The 30 top-ranked genes from each method were evaluated for classifying the samples using six different classification techniques. Gene ontology (GO), pathway analysis, and protein-protein interaction studies on the 50 selected genes were subsequently performed to identify important functional genes. Finally, gene regulatory effects by microRNA were assessed to confirm the genes’ potential as targets for new therapeutic drugs. <bold id="strong-3">Results</bold>: The genes identified by RF were best for differentiating the samples, followed by those identified by PCA, with the least predictability shown by genes chosen by GA. However, PCA identified more genes with functional importance, such as the hub genes <italic id="emphasis-1">Atp5a1</italic> and <italic id="emphasis-2">Apoa1</italic>. <italic id="emphasis-3">Atp5a1</italic> is the main hub gene, whereas <italic>Apoa1</italic> is involved in cholesterol metabolism. <italic id="emphasis-4">Vapa</italic> and <italic id="emphasis-5">Npc2</italic> are crosstalk genes that link both of these main genes and could be targeted for therapeutic drug design. 
<bold id="strong-4">Conclusion</bold>: The combination of different chemometric techniques and functional analysis of genes could be used to select a small number of genes that could serve as more suitable diagnostic or therapeutic targets.</p>
      </abstract>
      <kwd-group id="kwd-group-1">
        <title>Keywords</title>
        <kwd>gene ontology</kwd>
        <kwd>obesity</kwd>
        <kwd>principal component analysis</kwd>
        <kwd>protein-protein interaction</kwd>
        <kwd>random forest</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title id="t-8feae88a53a9">
        <bold id="s-13f3a05b168c">Introduction</bold>
      </title>
      <p id="p-7ae77743ea1c">Obesity is defined as an accumulation of white adipose tissue, with the disease often occurring together with hyperglycemia, hypercholesterolemia and hypertension; this cluster is often termed metabolic syndrome <xref id="x-963cd5ce0870" rid="R104834421777534" ref-type="bibr">1</xref>. Data analysis between 1980 and 2015 from 68.5 million persons showed an increasing prevalence of obesity and overweight condition in children and adults. In 2015, approximately 108 million children and 604 million adults were designated as obese <xref id="x-f5dd288bd114" rid="R104834421777535" ref-type="bibr">2</xref>. </p>
      <p id="p-3f4a890fb6a5">Adipose tissue plays a key role in systemic energy homeostasis; indeed, any dysfunction involving adipocytes, such as hypertrophy, fibrosis, hypoxia and robust inflammation, is known to contribute to obesity <xref id="x-35f927740b14" rid="R104834421777536" ref-type="bibr">3</xref>. The wide imbalance between energy intake and expenditure in obesity results from a combination of genetic, epigenetic, physiological, behavioral, sociocultural and environmental factors, which makes the diagnosis and management of obesity difficult <xref id="x-efd9a3de9957" rid="R104834421777537" ref-type="bibr">4</xref>. Obesity can be divided into monogenic and polygenic obesity, with the monogenic type being further classified as syndromic or non-syndromic. People with monogenic obesity represent only a small percentage of the obese population, whereas common obesity with no obvious Mendelian inheritance pattern is polygenic and highly prevalent <xref id="x-7b92bce24a0c" rid="R104834421777538" ref-type="bibr">5</xref>. It has been noted that, for any disease, one of the greatest challenges lies not in identifying the associated genes but in ascertaining the molecular mechanisms by which those factors/genes modify the disease risk or phenotypic expression <xref id="x-e0a3c8964e09" rid="R104834421777539" ref-type="bibr">6</xref>. </p>
      <p id="p-960948973eaa"/>
      <p id="p-645fe6192cfb">The explosion of genomic data in terms of expression levels of thousands of genes from microarray studies, combined with chemometric and bioinformatic tools, has enabled the identification of candidate biomarker genes and pathways. The aim of this study was to use three chemometric analyses, namely principal component analysis (PCA), random forest (RF), and genetic algorithm (GA), to identify a small fraction of genes that differentiate adipose samples of high-fat diet-fed mice from those of normal diet-fed mice using the microarray dataset GSE39549. Various classification techniques were used to check which set of genes is best for classification purposes, whereas the underlying mechanisms were studied using functional gene annotation, pathway analysis, protein-protein interaction, and miRNA regulation. </p>
      <p id="p-505b1992f6ba"/>
    </sec>
    <sec>
      <title id="t-ad46ba2d65f2">
        <bold id="s-3559064d082b">Materials and Methods</bold>
      </title>
      <sec>
        <title id="t-b27799fca7ee">
          <bold id="s-0b7be3574b90">Overview of Methods</bold>
        </title>
        <p id="p-07ccd004fbea">The workflow consisted of dataset selection and pre-processing, selection of genes by three multivariate techniques, and evaluation of the classification accuracy of the selected genes. In addition, the biological mechanisms of the genes and their potential clinical significance were evaluated through functional annotation, protein-protein interaction, and miRNA-target gene interaction analyses.</p>
        <p id="p-848ee03aa342"/>
      </sec>
      <sec>
        <title id="t-745707265175">
          <bold id="s-e63e207fee33">Data retrieval and pre-processing</bold>
        </title>
        <p id="p-2865b1d9fa86">The Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/gds), a public functional genomics data repository, was searched for ‘obesity’, and the dataset was chosen on the basis of an adequate number of samples. The chosen dataset, GSE39549, was downloaded from GEO to gain insight into the relationship between obesity and hypoxia. This dataset consisted of both adipose and liver samples from mice fed a high-fat diet and the corresponding control diet <xref id="x-865e209fe3e1" rid="R104834421777542" ref-type="bibr">7</xref>. Only the gene expression data from the adipose samples were used in this study. Microsoft Access was used to map the probe sets of the genes (which were differentially expressed by more than 2.0-fold) to Entrez Gene IDs, and the average expression values of 15,000 genes were obtained <xref rid="R104834421777543" ref-type="bibr">8</xref>, <xref rid="R104834421777544" ref-type="bibr">9</xref>. The original data covered different time points, but in this study the data were pooled to compare the high-fat diet with the control/normal diet. This helped to mitigate the dimensionality problem associated with microarray data, in which the number of variables is very large but the number of samples is limited. </p>
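        <p>The probe-to-gene averaging step can be illustrated with a minimal Python sketch. This is not the Microsoft Access workflow used in the study; the function name, the probe IDs, the gene IDs and the expression values below are all hypothetical and serve only to show how several probes mapping to one Entrez Gene ID might be collapsed into a single averaged profile.</p>
        <preformat>
```python
from collections import defaultdict
from statistics import mean


def average_probes_per_gene(probe_rows, probe_to_gene):
    """Collapse probe-level expression to gene-level expression by averaging.

    probe_rows: dict of probe ID to a list of expression values (one per sample).
    probe_to_gene: dict of probe ID to gene ID (hypothetical mapping table).
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_rows.items():
        gene_id = probe_to_gene.get(probe_id)
        if gene_id is None:
            continue  # probes without a gene mapping are dropped
        grouped[gene_id].append(values)
    # zip(*rows) walks sample by sample across all probes of the same gene
    return {gene: [mean(col) for col in zip(*rows)] for gene, rows in grouped.items()}


# Hypothetical toy input: two probes for gene "1001", one for "1002", one unmapped
probe_rows = {"p1": [2.0, 4.0], "p2": [4.0, 8.0], "p3": [1.0, 1.0], "p4": [9.0, 9.0]}
probe_to_gene = {"p1": "1001", "p2": "1001", "p3": "1002"}
gene_expr = average_probes_per_gene(probe_rows, probe_to_gene)
```
        </preformat>
        <p>In this toy input, the two probes of gene "1001" are averaged sample by sample, and the unmapped probe "p4" is ignored.</p>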
        <p id="p-cccfee4e817b"/>
      </sec>
      <sec>
        <title id="t-13284885f4c3">
          <bold id="s-d76920db625d">Software and packages</bold>
        </title>
        <p id="p-a6a510c09bba">Three approaches were used to carry out the selection of genes. PCA and RF were carried out in the free R environment using the prcomp function and the randomForest library, whereas GA was run in Matlab R2019b. The ability of the selected genes to classify the samples was then assessed using the glm function and the e1071 library in R. Network analysis and visualization were carried out using Cytoscape 3.7.2 and related apps downloaded from the Cytoscape website (https://cytoscape.org/). All analyses were carried out on an Intel® Core™ i5-7400 CPU @ 3.0 GHz with 16.0 GB RAM. </p>
        <p id="p-766028fb9e41"/>
      </sec>
      <sec>
        <title id="t-2b21994fad73">
          <bold id="s-3ff6a102bac3">Gene selection algorithms </bold>
        </title>
        <p id="p-fc8db358080e">The PCA was carried out using the prcomp function in R. The RF method has only two main parameters that need to be chosen (mtry and ntree); mtry was set to 120 and ntree to 1000. The GA was carried out in Matlab using the approach described previously <xref rid="R104834421777545" ref-type="bibr">10</xref>, <xref rid="R104834421777546" ref-type="bibr">11</xref>. The parameters chosen were 100 chromosomes and ndims of 3, and the algorithm was run for 400 generations. Each chemometric method was used to select 50 genes.</p>
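        <p>Selecting genes by their loading on the first principal component can be sketched as follows. This is a pure-Python illustration under stated assumptions, not the prcomp call used in the study: the data are mean-centered but not scaled, the first component is approximated by power iteration, and the function names, gene labels and expression values are hypothetical.</p>
        <preformat>
```python
import math


def first_pc_loadings(samples, iterations=200):
    """Approximate the loading vector of the first principal component.

    samples: list of rows (one per sample), each a list of gene expression values.
    Uses power iteration on X^T X, where X is the column-centered data matrix.
    """
    n_samples = len(samples)
    n_genes = len(samples[0])
    means = [sum(row[j] for row in samples) / n_samples for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in samples]
    v = [float(j + 1) for j in range(n_genes)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        scores = [sum(c * w for c, w in zip(row, v)) for row in centered]  # X v
        new_v = [sum(centered[i][j] * scores[i] for i in range(n_samples))
                 for j in range(n_genes)]  # X^T (X v)
        norm = math.sqrt(sum(x * x for x in new_v))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in new_v]
    return v


def top_genes_by_loading(gene_ids, loadings, k):
    """Return the k genes with the largest absolute loading."""
    ranked = sorted(zip(gene_ids, loadings), key=lambda pair: abs(pair[1]), reverse=True)
    return [gene for gene, _ in ranked[:k]]


# Hypothetical toy data: 4 samples by 3 genes; gene g0 carries most of the variance
gene_ids = ["g0", "g1", "g2"]
samples = [[10.0, 1.0, 5.0], [12.0, 1.1, 5.0], [2.0, 0.9, 5.1], [1.0, 1.0, 4.9]]
loadings = first_pc_loadings(samples)
top_gene = top_genes_by_loading(gene_ids, loadings, 1)
```
        </preformat>
        <p>With the full dataset, the same ranking would be applied with k set to 50, or to 30 for the classification step; whether the original analysis scaled the variables is not stated, so the centering-only choice here is an assumption.</p>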
        <p id="p-a02e1d40aea2"/>
      </sec>
      <sec>
        <title id="t-31210fd95859"><bold id="s-fa936d7ae476">Use of machine learning for classification</bold> </title>
        <p id="p-8a5ebaa9467b">Each gene selection method (PCA, RF or GA) yielded 50 genes, and the first 30 genes from each method were evaluated for their ability to differentiate between the high-fat diet and control diet samples. The classifications were predicted using six different supervised chemometric techniques: k-nearest neighbors (kNN), logistic regression, linear discriminant analysis, Naïve Bayes, and two types of support vector machines (SVM) <xref rid="R104834421777547" ref-type="bibr">12</xref>, <xref rid="R104834421777548" ref-type="bibr">13</xref>. The first SVM used was a non-kernel or linear-based method, whereas the second SVM used the sigmoid-based kernel. The other parameters chosen for the above techniques were k = 5 for kNN and the binomial option for logistic regression.</p>
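        <p>As an illustration of one of the six classifiers, the following is a minimal kNN sketch with k = 5. It is not the R implementation used in the study; the function names, class labels and two-gene expression values are hypothetical, and Euclidean distance with majority voting is assumed.</p>
        <preformat>
```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Predict the label of query by majority vote among its k nearest neighbours."""
    distances = sorted((math.dist(x, query), label) for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the actual label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical training data: expression of two genes in 4 high-fat diet (HFD)
# and 4 normal diet (ND) samples
train_x = [[5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.3],
           [1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [1.2, 1.1]]
train_y = ["HFD"] * 4 + ["ND"] * 4

test_x = [[5.0, 5.0], [1.0, 1.0]]
test_y = ["HFD", "ND"]
predictions = [knn_predict(train_x, train_y, q, k=5) for q in test_x]
test_accuracy = accuracy(predictions, test_y)
```
        </preformat>
        <p>With k = 5 and four samples per class, each query is decided by four neighbours from its own cluster and one from the other, which shows why an odd k avoids tied votes in a two-class problem.</p>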
        <p id="p-c1040b80485b"/>
      </sec>
      <sec>
        <title id="t-3bd9d0b0efe0">
          <bold id="s-687dba10fd6e">Functional enrichment and pathway analysis (Functional annotation clustering)</bold>
        </title>
        <p id="p-40384b745ee8">Functional enrichment analysis was carried out on the genes chosen by the three methods by loading the selected genes into the Functional Annotation tool in the Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/) to identify Gene Ontology (GO) functions, especially those pertinent to biological processes, molecular functions and cellular components. The 50 genes chosen by each method were evaluated for functional annotation, with the similarity term overlap set to 3. The similarity threshold was 0.50, and a p-value &lt; 0.05 was used to identify statistically significant results. The enriched pathways of the genes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database were also evaluated <xref id="x-509a857d0d2e" rid="R104834421777549" ref-type="bibr">14</xref>.</p>
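        <p>The idea behind a term-enrichment p-value can be sketched with a plain one-sided hypergeometric test. This is an assumption-laden illustration rather than DAVID's own statistic (DAVID reports a modified Fisher exact score), and the function name and all counts below are hypothetical.</p>
        <preformat>
```python
from math import comb


def hypergeom_enrichment_p(hits, list_size, annotated, background):
    """Probability of seeing at least `hits` annotated genes in a gene list.

    hits: genes in the list that carry the GO term.
    list_size: number of genes in the list (for example, 50 selected genes).
    annotated: genes in the background that carry the GO term.
    background: total number of background genes.
    """
    upper = min(list_size, annotated)
    total = comb(background, list_size)
    tail = sum(comb(annotated, k) * comb(background - annotated, list_size - k)
               for k in range(hits, upper + 1))
    return tail / total


# Hypothetical counts: 5 of 50 selected genes carry a term that annotates
# 200 of 15000 background genes
p_value = hypergeom_enrichment_p(hits=5, list_size=50, annotated=200, background=15000)
is_significant = 0.05 > p_value
```
        </preformat>
        <p>Under these invented counts, fewer than one annotated gene would be expected by chance, so five hits give a p-value well below the 0.05 cut-off.</p>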
        <p id="p-768a28bbb0ff"/>
      </sec>
      <sec>
        <title id="t-eefb560d441d">
          <bold id="s-f54ac040cdfe">Protein-protein interactions</bold>
        </title>
        <p id="p-4ea58a50ac11">The genes identified by the three methods were submitted to the STRING (Search Tool for the Retrieval of Interacting Genes; https://string-db.org/) database to identify protein-protein interactions in adipose samples from the high-fat diet group. A confidence score of &gt;0.4 was used to identify the protein-protein interaction networks, and the disconnected nodes were hidden in the network to simplify the resulting display <xref id="x-f9f569e66c7a" rid="R104834421777562" ref-type="bibr">15</xref>. The active interaction sources were chosen to include “textmining, experiments, databases, co-expression, neighborhood, gene fusion, and co-occurrence”. The network obtained was downloaded as tab-separated values (tsv) and processed further in Cytoscape 3.7.2. </p>
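        <p>Filtering interactions by confidence score and counting the connections of each node can be sketched as below. This is a hypothetical illustration, not the STRING or Cytoscape procedure itself: the protein names and scores are invented, and the edge list stands in for rows read from an exported tsv file.</p>
        <preformat>
```python
from collections import Counter


def node_degrees(edges, min_score=0.4):
    """Count connections per node, keeping only edges scoring above min_score.

    edges: iterable of (protein_a, protein_b, combined_score) tuples.
    """
    degree = Counter()
    for protein_a, protein_b, score in edges:
        if score > min_score:
            degree[protein_a] += 1
            degree[protein_b] += 1
    return degree


# Hypothetical edge list with invented combined scores
edges = [
    ("P1", "P2", 0.90),
    ("P1", "P3", 0.70),
    ("P1", "P4", 0.45),
    ("P2", "P3", 0.30),  # below the threshold, ignored
    ("P4", "P5", 0.40),  # not strictly above the threshold, ignored
]
degrees = node_degrees(edges, min_score=0.4)
hub, hub_degree = degrees.most_common(1)[0]
```
        </preformat>
        <p>Nodes left without any retained edge, such as P5 here, correspond to the disconnected nodes that are hidden from the displayed network, and the node with the highest degree is the candidate hub.</p>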
        <p id="p-f403ff870a76"/>
      </sec>
      <sec>
        <title id="t-579a467516a4">
          <bold id="s-9584db5c2d3f">The associated miRNA-gene regulatory network in humans</bold>
        </title>
        <p id="p-f572721c4775">The genes chosen by the three multivariate analyses that showed protein-protein interactions were further assessed for biological meaningfulness by studying the regulation of the corresponding human genes by human microRNA (miRNA). The human protein-protein network associated with the mouse proteins was obtained by using the STRINGIFY network function of the STRING app in Cytoscape. The miRNA-gene regulatory network in humans was obtained by extending this human protein-protein interaction network with CyTargetLinker <xref id="x-dcea807dcad7" rid="R104834421777563" ref-type="bibr">16</xref>. The miRNA database chosen for this was the experimentally validated database miRTarBase (version 4.4). </p>
        <p id="p-97f65ed5a7ef"/>
      </sec>
    </sec>
    <sec>
      <title id="t-2e7e87ec6598">
        <bold id="s-973233af566d">Results</bold>
      </title>
      <sec>
        <title id="t-d359c0b1c68b">
          <bold id="s-f42b937f7d98">Differential genes between a high-fat diet and a normal diet</bold>
        </title>
        <p id="p-e691d66adc0f">The PCA showed that principal component 1 (PC1) contributed 38.2% of the overall variance and PC2 a further 17.0%, whereas a total of eight principal components were required to achieve a cumulative proportion of variance of 90%. The 30 genes with the highest loading or weightage on the first principal component were chosen for use in classification. From their Entrez Gene IDs, the first six of them were identified as <italic id="e-24c8221beb25">Mup3</italic>, <italic id="e-8310640ca463">Mup2</italic>, <italic id="e-4a69d97da90b">Mup1</italic>, <italic id="e-6db5dfa65125">Aldh6a1,</italic> <italic id="e-5a2607746f07">H2-Aa</italic> and <italic id="emphasis-6">Acadsb</italic>. The mean decrease in accuracy option of RF was used to select the 30 most important genes that differentiated between samples from a high-fat diet and those from a normal diet. The first six of these genes were <italic id="emphasis-7">Lilrb4a</italic>, <italic id="emphasis-8">Tef</italic>, <italic id="emphasis-9">Cdt1</italic>, <italic id="emphasis-10">Adam17</italic>, <italic id="emphasis-11">Gas7,</italic> and <italic id="emphasis-12">Mlxipl</italic>. The RF used for the selection of genes had the added advantage of also classifying the samples. It had an out-of-bag (OOB) error rate of 15%. Additionally, 9 out of the 10 test samples (90%) were classified correctly when mtry of 120 and ntree of 1000 were used. The GA had to be run for 400 generations in order to pick relevant genes that had higher loads by singular value decomposition; once again, 30 genes were chosen for classification. The first six genes were identified by their Entrez Gene IDs as <italic id="emphasis-13">Hoxa3</italic>, <italic id="emphasis-14">Igf2r</italic>, <italic id="emphasis-15">Rassf4</italic>, <italic id="emphasis-16">Armcx1</italic>, <italic id="emphasis-17">Klf4</italic> and <italic id="emphasis-18">Galr3</italic>. </p>
        <p id="p-fdb2dae7363e"/>
      </sec>
      <sec>
        <title id="t-d5dd5fcb1fdb">
          <bold id="s-4e0092438cb6">Evaluation of classification performance</bold>
        </title>
        <p id="p-a56c8260b3bb">The genes selected by each method to differentiate between adipose samples from mice on a normal diet or a high-fat diet were tested with the six different chemometric techniques. The RF-selected genes gave the best classification compared to those selected by PCA and GA: they classified 58 out of 70 (83%) tested samples correctly. The genes selected by PCA showed 74% correct classification, and those selected by GA showed 73% correct classification. Among the individual classification techniques, Naïve Bayes had the highest correct classification, reaching 85% for each of the three sets of genes, followed by SVM using the radial kernel. </p>
        <p id="p-15d40c37bde5"/>
      </sec>
      <sec>
        <title id="t-0232f5a5988c">
          <bold id="s-234d9b792dfa">Gene ontology and pathway analyses</bold>
        </title>
        <p id="p-5e5202f5a7c8">The functional annotation of genes using the online DAVID database showed that the genes obtained by PCA were more associated with GO terms of molecular function, biological process, and cellular component related to lipid metabolism than those obtained by the two other selection methods. The related GO terms, percentage of genes identified, and P-values are shown in Table 1. The genes chosen by PCA that are associated with GO annotations from ‘insulin activated receptor activity’ to ‘negative regulation of lipid metabolic processes’, as shown in Table 1, are <italic id="e-094b264cae62">Mup1</italic>, <italic id="e-77221b2511f5">Mup2</italic> and <italic id="e-3ade0b99467c">Mup3</italic>. The three genes associated with GO annotations linked with cholesterol, from ‘cholesterol transport’ to ‘cholesterol metabolic process’, are <italic id="e-285afa8eabe7">Apoa1</italic>, <italic id="e-acf2c8f2177c">Apoa2</italic> and <italic id="e-fb055ac30290">Npc2</italic>. The genes chosen by RF had one term directly related to obesity, the GO term ‘lipid metabolic process’; the five genes associated with it are sphingomyelin phosphodiesterase 3 (<italic id="e-183bbc72b118">Smpd3</italic>), ATP citrate lyase (<italic id="e-7d2b2341a056">Acly</italic>), <italic id="e-bfa06a8e3054">Smpdl3b</italic>, 11β-hydroxysteroid dehydrogenase type 1 (<italic id="e-e19e371045b3">Hsd11b1</italic>) and alpha/beta hydrolase domain containing 3 (<italic id="e-73a3016b3c69">Abhd3</italic>). The genes obtained by GA did not have any GO term related to molecular function or biological process; the term ‘extracellular exosome’ under cellular component was the only term with an enrichment score above 1 and a probability value under 0.05. Three of the nine genes associated with this term are <italic id="e-de6701e0b853">Aldh16a1</italic>, <italic id="e-8414e04d6d85">Igf2r,</italic> and <italic id="e-b855082afdcf">Hsp90aa1</italic>. 
The KEGG analysis revealed that only genes selected by PCA were significantly enriched. The two enriched pathways were mmu03010 (ribosome, under the class of translation in genetic information processing) and mmu00280 (valine, leucine and isoleucine degradation, under the class of amino acid metabolism).</p>
        <p id="p-c8cd20181133"/>
        <fig id="f-fc8f8c1ce6e9" orientation="portrait" fig-type="graphic" position="anchor">
          <label>Figure 1 </label>
          <caption id="c-a7ddd5434f6f">
            <title id="t-3bdd9999e643">
              <bold id="s-2e07e1b5e92b">Protein-protein interactions among genes chosen by principal component analysis.</bold>
            </title>
          </caption>
          <graphic id="g-ede18ea6f7db" xlink:href="https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/6b75bf38-92e9-4738-9368-75606e4a1d55/image/f1dfafbd-f83e-42df-bfa3-d20ea823e79e-ufig-1.png"/>
        </fig>
        <fig id="f-767b034f24c1" orientation="portrait" fig-type="graphic" position="anchor">
          <label>Figure 2 </label>
          <caption id="c-273fd1c82436">
            <title id="t-28d12f5809e5">
              <bold id="s-00acc6f05d6c">Protein-protein interactions among genes chosen by random forest with accuracy function.</bold>
            </title>
          </caption>
          <graphic id="g-7013174f31b6" xlink:href="https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/6b75bf38-92e9-4738-9368-75606e4a1d55/image/406bdaf5-c26a-497f-904f-8909bceb4dcc-ufig-2.png"/>
        </fig>
        <fig id="f-026ace35381c" orientation="portrait" fig-type="graphic" position="anchor">
          <label>Figure 3 </label>
          <caption id="c-c57d9cff4cdc">
            <title id="t-81cd6c8740f5">
              <bold id="s-1f98ec5a79e8">Protein-protein interactions among genes chosen by genetic algorithm.</bold>
            </title>
          </caption>
          <graphic id="g-0a8f837394ad" xlink:href="https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/6b75bf38-92e9-4738-9368-75606e4a1d55/image/a1943459-f039-40bc-934d-048207063008-ufig-3.png"/>
        </fig>
      </sec>
      <sec>
        <title id="t-038d1fd9d644">
          <bold id="s-aa9a144bd339">Protein-protein interaction and hub genes</bold>
        </title>
        <p id="p-33a3b5e32158">The network of protein-protein interactions showed that the 50 genes chosen by PCA formed the widest network, whereas the network of the genes chosen by GA was the least extensive. The interaction between genes was regarded as positive when it had a combined score of ≥ 0.4. The network for the PCA-chosen genes is shown in <bold id="s-6142151f013e"><xref id="x-f89e9d810c5c" rid="f-fc8f8c1ce6e9" ref-type="fig">Figure 1</xref>.</bold> Among the genes chosen by PCA, two genes are considered hub genes in the protein-protein interaction network: <italic id="e-64982a73ace8">Atp5a1</italic>, with nine connections, and <italic id="e-0077b9e34a5c">Apoa1</italic>, with slightly fewer at six connections. The networks from the RF- and GA-chosen genes are less extensive and are shown in <bold id="s-c4a3f7cdf13a"><xref id="x-d28e7483cf59" rid="f-767b034f24c1" ref-type="fig">Figure 2</xref> </bold>and <bold id="s-e2a08abfaecf"><xref id="x-e24b56076d00" rid="f-026ace35381c" ref-type="fig">Figure 3</xref>.</bold> The biggest network for the RF-selected genes had seven members and included the hub gene <italic id="e-3dd22a6e9cd7">Plk1</italic> with five connections. The GA-chosen genes formed two networks of four genes each; one of them was a linear network, with two of its members being <italic id="e-2fa64ad4f8f2">Igf2r</italic> and <italic id="e-bb60aa7d0c13">Hsp90aa1</italic>. </p>
        <p id="p-a4f2d1581b24"/>
      </sec>
      <sec>
        <title id="t-408fc6347caa">
          <bold id="s-54b7379f528e">Regulation of target genes by microRNA</bold>
        </title>
        <p id="p-c8ea21f16c01">The Stringify function of the STRING app in Cytoscape enabled the identification of similar protein-protein interactions in humans, and CyTargetLinker was then used to predict the miRNA-gene regulatory interactions of these proteins. The genes selected by PCA which showed protein-protein interactions in humans were regulated by a total of 578 miRNAs, with <italic id="e-b154fa980ace">ATP5A1 </italic>and <italic id="e-efe4cd1f0620">RPL18A</italic> being regulated by the greatest number of miRNAs (85). The number of miRNAs regulating the genes with protein-protein interactions chosen by RF was 390, whereas for GA, the number of miRNAs was at least 356 for the 16 genes with protein interactions. One of the genes chosen by GA, <italic id="e-90cf6bca85d8">HSP90AA1</italic>, was regulated by a total of 100 miRNAs. </p>
        <p id="p-ae9d0c8d63d7"/>
      </sec>
    </sec>
    <sec>
      <title id="t-8a536715a5ba">
        <bold id="s-62fa28b66d83">Discussion</bold>
      </title>
      <p id="p-cfcd04beb908">The use of data mining techniques combined with bioinformatics has facilitated finding biological meaning in large molecular datasets to diagnose, understand the underlying pathogenesis, and provide insight to develop treatments for various diseases. This study has compared the use of PCA, RF and GA to identify genes that differentiate adipose samples from high-fat diet treatment, compared to control, to understand the underlying biological mechanisms of obesity. The biological and molecular functions of each set of chosen genes were studied using gene annotation, pathway analysis, protein-protein interaction, and gene regulation. </p>
      <p id="p-f77ae37454fa">There are various approaches to selecting the relevant genes. For PCA, the smallest number of ‘principal gene components’ that best explain the experimental data is often selected, but in this study the decision was to use the first principal component only <xref id="x-c9d1586843d8" rid="R104834421777564" ref-type="bibr">17</xref>. This decision was based on the fact that the first principal component explained more than double the percentage of variance explained by the second component. Based on this, the genes that had the highest loading or weightage for this component were chosen for differentiating the samples. </p>
      <p id="p-7653e142fd1b">Moreover, it was found that using principal component two to select the important genes gave less correct classification, and the resulting genes were less associated with GO terms related to fat metabolism. Using PCA to select genes does not involve parameters that need to be optimized, but for GA the number of generations to be run and the number of chromosomes used can be varied. In this study, the number of generations was chosen to be large enough that the loads obtained for the variables showed a few characteristic peaks with higher values than those of the other variables. </p>
      <p id="p-880064412e9e">This study aimed to investigate the underlying mechanism of obesity, but if the aim were diagnosis only, then RF alone would have sufficed. This is because RF functions as a wrapper approach in which the selected genes are evaluated for classification accuracy at the same time. The selection of genes by RF was based on the mean decrease in accuracy, as this has been reported to be better than the decrease in Gini index <xref id="x-ae87a21375e3" rid="R104834421777565" ref-type="bibr">18</xref>. However, it should be noted that most of the genes selected by the decrease in accuracy were also selected by the Gini index, with the only difference being the ranking of the selected genes. PCA, in contrast, is a filter method that only performs the selection of genes; the selected genes then have to be classified with other statistical techniques. It should be noted that none of the three techniques (PCA, RF and GA) included, among its 50 chosen genes, any of the genes previously reported to be associated with obesity or hypoxia (a causative risk factor), such as <italic id="e-996fb4644dc1">FTO</italic>, <italic id="e-e0bcf8eb52f4">LEP</italic>, <italic id="e-bb45ed17ceed">HIF-2</italic>, <italic id="e-b3cd9230d73d">NFκB</italic>, <italic id="e-f8384be1e0ea">PPAR</italic> and <italic id="e-9da91d1d3042">NPC1</italic> <xref rid="R104834421777536" ref-type="bibr">3</xref>, <xref rid="R104834421777566" ref-type="bibr">19</xref>, <xref rid="R104834421777584" ref-type="bibr">20</xref>, <xref rid="R104834421777585" ref-type="bibr">21</xref>. However, NPC2 was among the first 30 genes chosen by PCA for differentiating between adipose samples from high-fat diet-fed and normal diet-fed mice. Dysfunction in either the NPC1 or the NPC2 protein leads to an altered storage pattern of cholesterol and sphingolipids in late endosomes/lysosomes <xref id="x-29416c3d16e6" rid="R104834421777586" ref-type="bibr">22</xref>. </p>
      <p id="p-b608a2fb43dd">Hypoxia in humans affects the expression of <italic id="e-6be1f465b48b">MMP2</italic> and <italic id="e-ec13626a0507">MMP9 </italic>in adipocytes <xref id="x-113f19332b43" rid="R104834421777584" ref-type="bibr">20</xref>, and although neither of these genes was among the genes selected by the three methods, the related gene <italic id="e-235db033cb08">Mmp13</italic> was selected by RF. <italic id="e-31491144aa14">MMP13</italic> codes for collagenase 3 in humans, which degrades the extracellular matrix <xref id="x-459eb211f2d2" rid="R104834421777587" ref-type="bibr">23</xref>. As Mmp13 is related to Mmp9, which is related to hypoxia, it can be noted that the combination of the different selection methods could identify different causative or related factors of a disease. The number of genes selected to be used for classification was limited to 30. The value of correct predictions was obtained by pooling six classification techniques, as relying on a single technique could introduce bias <xref rid="R104834421777589" ref-type="bibr">24</xref>, <xref rid="R104834421777590" ref-type="bibr">25</xref>. </p>
      <p id="p-aa10de7bdb87">The number of genes selected for GO and the study of pathogenesis was increased to 50 because the 30 genes used for classification were not enough to obtain biological meaning or provide an elaborate network of interactions. The number of genes used for gene annotation and the number of biological processes identified were lower than in the previous publication <xref id="x-c25d0762ebaf" rid="R104834421777542" ref-type="bibr">7</xref>, but the core processes involving lipid metabolism were identified. The use of the smallest possible set of genes is advantageous in the clinical setting for diagnostic purposes and for investigating disease mechanisms <xref rid="R104834421777591" ref-type="bibr">26</xref>, <xref rid="R104834421777592" ref-type="bibr">27</xref>.</p>
      <p id="p-0641d2c0ed1c">The genes selected using the decrease-in-accuracy measure of RF yielded fewer GO terms, but some of these terms, such as lipid metabolic process, contained more genes coding for important proteins (e.g. sphingomyelin phosphodiesterase 3 and sphingomyelin phosphodiesterase acid-like 3B). Proteins closely related to both of these, such as SMPDL3A and SMPD1, have been reported to have a role in cholesterol efflux <xref rid="R104834421777593" ref-type="bibr">28</xref>, <xref rid="R104834421777594" ref-type="bibr">29</xref>. The functional enrichment study with the GA genes identified only one GO term, which was related to extracellular exosome. The combination of the KEGG pathway and GO terms with protein-protein interaction networks suggests genes that are important for system-level regulation of cellular processes. The genes <italic id="e-87e8dea776be">Vapa</italic> and <italic id="e-b53ac24678dd">Npc2</italic> seem to form a bridge that links the hub genes <italic id="e-a7009fb779f8">ATP5a1</italic> and <italic id="e-07ec028e6398">Apoa1</italic>. <italic id="e-500132244db5">ATP5a1</italic> seems to link the protein cluster of Rp18a, Mrpl20, Rps3a1 and Rps3, which belongs to the KEGG ribosome pathway, with genes associated with the amino acid degradation pathway (mmu00280), such as <italic id="e-cff47c295f6f">Acadsb</italic>, <italic id="e-d37c5841a450">Aldh6a1</italic>, and <italic id="e-0d7bf1883144">Hadhb</italic>. As the <italic id="e-1b67e83e9ace">Apoa1</italic> gene seems to be involved in cholesterol transport, efflux and homeostasis, <italic id="e-8759f71ec3dc">Vapa</italic> and <italic id="e-870846a443d2">Npc2</italic> can be regarded as crosstalk genes that link the above three processes. The interaction between these genes also occurs in humans, with miRNAs regulating the human genes. 
For instance, the human gene <italic id="e-51555176007a">VAPA</italic> is regulated by 24 miRNAs, whereas <italic id="e-59a5caeaca96">hsa-miR-92a-3p</italic> regulates <italic id="e-c45c6234dd0a">NPC2</italic>. Both of these genes could be potential targets for drug intervention studies. It should be highlighted that although GA did not identify many protein-protein interactions, the genes it identified have been reported to be potential targets. For example, the <italic id="e-4b793ac1257b">IGF2R-miR-143-3p</italic> interaction has been reported to be a potential target in obesity-associated insulin resistance <xref id="x-198921168dcf" rid="R104834421777595" ref-type="bibr">30</xref>.</p>
      <p id="p-eb928a8e3be8">The present study has several limitations. First, the number of samples from which the data were obtained is small, and a larger sample would have avoided the need to pool the different time points. Second, given the complexity of the molecular mechanisms regulating disease development, limiting each chemometric technique to only 50 genes made a more comprehensive evaluation of mechanism difficult for the genes chosen by RF and GA. Finally, as some of the interactions were predicted through data mining techniques, <italic id="e-79d2761602df">in vitro</italic> or <italic id="e-b4a332e88b70">in vivo</italic> work to confirm the findings is warranted in future studies. </p>
      <p id="p-d4a98d3e0ed9"/>
    </sec>
    <sec>
      <title id="t-6e59bc92c7b1">
        <bold id="s-b9b612f2be96">Conclusion</bold>
      </title>
      <p id="p-c39ff62e7e78">The analysis of multivariate data in this study showed that selecting genes for classification, diagnosis, and elucidation of disease mechanisms can call for different chemometric techniques. The selected genes can be studied further using functional analyses such as GO, pathway analysis, and gene interactions to obtain a greater overall understanding. In this study, RF was better for classification purposes, whereas genes selected by PCA, such as <italic id="e-7de0711fe4c8">Atp5a1</italic>, <italic id="e-6d337f1ebe25">Apoa1</italic>, <italic id="e-9d8294f7745b">Vapa</italic> and <italic id="e-d56bed7bd0bc">Npc2</italic>, were more appropriate for revealing protein-protein interactions in general and disease mechanisms in particular. </p>
      <p id="p-59cb58dba0b7"/>
    </sec>
    <sec>
      <title id="t-b1fa3d6f3659">
        <bold id="s-503f2e1123bd">Abbreviations</bold>
      </title>
      <p id="p-c9c7f5da27a0"><italic id="e-227c77ff1444"><bold id="s-160ac6537d7b">Acadsb:</bold> </italic>acyl-Coenzyme A dehydrogenase, short/branched chain*<italic id="e-c42c7ad5e8b7"> </italic></p>
      <p id="p-57dae56d0135"><italic id="e-f3a9ad1bbd2f"><bold id="s-05979a934592">Adam17</bold>:</italic> a disintegrin and metallopeptidase domain<italic id="e-8a98ea330e27"> 9*</italic></p>
      <p id="p-bdfc627f45e3"><italic id="e-c07bf4969ec1"><bold id="s-e789ebf305e3">Aldh6a1</bold>: </italic>aldehyde dehydrogenase family 6, subfamily A1 * </p>
      <p id="p-862639516c00"><italic id="e-feaadbfc7d79"><bold id="s-84fabcc43f31">Apoa1</bold>: </italic>apolipoprotein A-I*<italic id="e-4b66671a13ae"> </italic></p>
      <p id="p-1ceb204ed081"><italic id="e-a4ecb29cb866"><bold id="s-0cf2403608ea">Apoa2</bold>: </italic>apolipoprotein A-II* </p>
      <p id="p-ec1467d68c99"><italic id="e-9803c0b338bd"><bold id="s-c4f34e5b34ac">Armcx1</bold>:</italic> armadillo repeat containing, X-linked 1* </p>
      <p id="p-c66d2727b658"><italic id="e-351341eaa5d2"><bold id="s-abb0e0ca25c8">ATP5a1</bold>: </italic>ATP synthase, H+ transporting, mitochondrial F1 complex, alpha subunit 1**<italic id="e-bbc0cbf92be6"> </italic></p>
      <p id="p-55dd0c75f882"><italic id="e-6cdfbdd36955"><bold id="s-2256947ded16">Cdt1</bold>:</italic> chromatin licensing and DNA replication factor 1* </p>
      <p id="p-534439b872fc"><bold id="s-ee97a49f12ea">DAVID</bold>: Database for Annotation, Visualization and Integrated Discovery</p>
      <p id="p-2f0210c90542"><italic id="e-c3c15ea34fde"><bold id="s-a70dbf7caa8b">FTO</bold></italic>: FTO alpha-ketoglutarate dependent dioxygenase*</p>
      <p id="paragraph-12"><bold id="s-4c638c1b7dbb">GA</bold>: genetic algorithm</p>
      <p id="paragraph-13"><italic id="e-2f814c102e45"><bold id="s-9f6d313c8675">Galr3</bold>: </italic>galanin receptor 3* </p>
      <p id="paragraph-14"><italic id="e-60b438bda3db"><bold id="s-20f17d86e7a2">Gas7</bold>: </italic>growth arrest specific 7*</p>
      <p id="paragraph-15"><bold id="s-6c82d2e2f4a1">GEO</bold>: Gene Expression Omnibus</p>
      <p id="paragraph-16"><bold id="s-d6c2da27ea25">GO</bold>: gene ontology</p>
      <p id="paragraph-17"><italic id="e-a53c109e37da"><bold id="s-7a507702daef">H2-Aa</bold></italic>: histocompatibility 2, class II antigen A, alpha* </p>
      <p id="paragraph-18"><italic id="e-ee74523cd68a"><bold id="s-2f6c5e69a3e3">Hadhb</bold>: </italic>hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl-Coenzyme A thiolase/enoyl-Coenzyme A hydratase (trifunctional protein), beta subunit* </p>
      <p id="paragraph-19"><italic id="e-10b8201f5662"><bold id="s-56c75ef92069">HIF-2</bold></italic>: hypoxia inducible factor<italic id="emphasis-19"> 2</italic>**</p>
      <p id="paragraph-20"><italic id="emphasis-20"><bold id="s-4a7a0eea259b">Hoxa3</bold></italic>: homeobox A3* </p>
      <p id="paragraph-21"><italic id="emphasis-21"><bold id="s-8dda6e26fb5e">Igf2r</bold></italic>: insulin-like growth factor 2 receptor* </p>
      <p id="paragraph-22"><bold id="s-6759abc5b94d">KEGG</bold>: Kyoto Encyclopedia of Genes and Genomes</p>
      <p id="paragraph-23"><italic id="emphasis-22"><bold id="s-1e9749030933">Klf4</bold></italic>: Kruppel-like factor 4* </p>
      <p id="paragraph-24"><bold id="s-a4ce73b18549">kNN</bold>: k-nearest neighbours </p>
      <p id="paragraph-25"><italic id="emphasis-23"><bold id="s-4e6164bf737d">LEP</bold></italic>: <italic id="emphasis-24">leptin</italic>**</p>
      <p id="paragraph-26"><italic id="emphasis-25"><bold id="s-937e15bbb1b2">Lilrb4a</bold></italic>: leukocyte immunoglobulin-like receptor, subfamily B, member 4A* </p>
      <p id="paragraph-27"><bold id="s-89ab03d9effd">miRNA</bold>: microRNA</p>
      <p id="paragraph-28"><italic id="emphasis-26"><bold id="s-ffb8b462a736">Mlxipl</bold>: </italic>MLX interacting protein-like*<italic id="emphasis-27"> </italic></p>
      <p id="paragraph-29"><bold id="s-bb40a6b2ca73">Mmp13</bold>: matrix metallopeptidase 13# </p>
      <p id="paragraph-30"><italic id="emphasis-28"><bold id="s-0b34a3387656">MMP2</bold>: </italic>matrix metallopeptidase 2**</p>
      <p id="paragraph-31"><italic id="emphasis-29"><bold id="s-97de4b49ad02">MMP9</bold>: </italic>matrix metallopeptidase 9**<italic id="emphasis-30"> </italic></p>
      <p id="paragraph-32"><bold id="s-4fefbf872eed">Mrpl20</bold>: mitochondrial ribosomal protein L20#</p>
      <p id="paragraph-33"><italic id="emphasis-31"><bold id="s-499799e84dda">Mup1</bold></italic>: major urinary protein 1*</p>
      <p id="paragraph-34"><italic id="emphasis-32"><bold id="s-c8ef16c39ed7">Mup2</bold></italic>: major urinary protein 2* </p>
      <p id="paragraph-35"><italic id="emphasis-33"><bold id="s-2a6d69597aeb">Mup3</bold>: </italic>major urinary protein 3*</p>
      <p id="paragraph-36"><italic id="emphasis-34"><bold id="s-2c4b47a908df">NFκB</bold>:</italic> nuclear factor kappa B**</p>
      <p id="paragraph-37"><italic id="emphasis-35"><bold id="s-7f66fc15a692">NPC1</bold>: </italic>Niemann-Pick type C1**<italic id="emphasis-36"> </italic></p>
      <p id="paragraph-38"><bold id="s-8a1bd34a6f4d">Npc2</bold>: Niemann-Pick type C2# </p>
      <p id="paragraph-39"><bold id="s-08ad3d49d441">PC</bold>: principal component</p>
      <p id="paragraph-40"><bold id="s-26881b11d87f">PCA</bold>: principal component analysis</p>
      <p id="paragraph-41"><italic id="emphasis-37"><bold id="s-beef480ba135">PPAR</bold>: </italic>peroxisome proliferator activated receptor** </p>
      <p id="paragraph-42"><italic id="emphasis-38"><bold id="s-2fcaf2d87442">Rassf4</bold></italic>: Ras association (RalGDS/AF-6) domain family member 4*</p>
      <p id="paragraph-43"><bold id="s-791287c6fad4">RF</bold>: random forest</p>
      <p id="paragraph-44"><bold id="s-cc1bea2d236f">Rp18a</bold>: ribosomal protein L8a#</p>
      <p id="paragraph-45"><bold id="s-1510b05325f8">Rps3</bold>: ribosomal protein S3# </p>
      <p id="paragraph-46"><bold id="s-edc097f20e91">Rps3a1</bold>: ribosomal protein S3A1#</p>
      <p id="paragraph-47"><bold id="s-9d89cec2a73e">SMPDL3A</bold>: sphingomyelin phosphodiesterase, acid-like 3A##</p>
      <p id="paragraph-48"><bold id="s-276156fd673d">SMPD1</bold>: sphingomyelin phosphodiesterase 1##</p>
      <p id="paragraph-49"><bold id="s-5fc226761380">STRING</bold>: Search Tool for the Retrieval of Interacting Genes</p>
      <p id="paragraph-50"><bold id="s-05656e557cfc">SVM</bold>: support vector machine</p>
      <p id="paragraph-51"><italic id="emphasis-39"><bold id="s-f30bf156be3e">Tef</bold></italic>: thyrotroph embryonic factor*</p>
      <p id="paragraph-52"><bold id="s-5d8e062eef55">Vapa</bold>: vesicle-associated membrane protein, associated protein A#</p>
      <p id="paragraph-54">(*: mouse gene; **: human gene; #: mouse protein, ##: human protein)</p>
      <p id="p-af57339155e4"/>
    </sec>
    <sec>
      <title id="t-e13c84083720">
        <bold id="s-4e25908a52bb">Acknowledgement</bold>
      </title>
      <p id="p-945d9a8b754e">The data analysis in this project was carried out as part of project FRGS/1/2014/SKK01/UNISZA/03/1. Dr Saravanan Dharmaraj acknowledges the financial support of the Ministry of Higher Education, Malaysia, for this research grant.</p>
      <p id="p-f66ae0009c6c"/>
    </sec>
    <sec>
      <title id="t-6f0e1f0f236a">
        <bold id="s-ddd70e703e5a">Authors’ Contributions</bold>
      </title>
      <p id="paragraph-57">SD made significant contributions to the study design and conceptualization, data mining, and the acquisition, analysis, and interpretation of the data. MRUS checked the molecular functional aspects of the paper. NS facilitated the final drafting of the manuscript and the critical revision of its content. All authors read and approved the final manuscript.</p>
      <p id="p-12e41a5c127d"/>
    </sec>
    <sec>
      <title id="t-888ae2df48ea">
        <bold id="s-f668199ffef6">Funding</bold>
      </title>
      <p id="t-093073861a85">None.</p>
      <p id="p-05ce1ce34c09"/>
    </sec>
    <sec>
      <title id="t-138f18c9c7d3">
        <bold id="s-36f08317f928">Availability of data and materials</bold>
      </title>
      <p id="p-08889bfa5d5f">The data used in this study are those for the 15 000 genes reported by Kwon et al. (PMID: 22947075) <xref id="x-db4e2631dd17" rid="R104834421777542" ref-type="bibr">7</xref>, which are available at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE39549. The processed data and algorithms for the multivariate analyses can also be obtained from the corresponding author on reasonable request.</p>
      <p id="p-cc1f1f924739"/>
    </sec>
    <sec>
      <title id="t-281ae3f38ff7">
        <bold id="s-4c029fde25f4">Ethics approval and consent to participate</bold>
      </title>
      <p id="p-c67d1a8db59e">Not applicable.</p>
      <p id="p-f49a66a63816"/>
    </sec>
    <sec>
      <title id="t-29261e06d9fe">
        <bold id="s-39e541d9ba37">Consent for publication</bold>
      </title>
      <p id="t-c0ba1aa64ec6">Not applicable.</p>
      <p id="p-6d30c4c39628"/>
    </sec>
    <sec>
      <title id="t-7b3f14e2adf3">
        <bold id="s-cc191afca524">Competing interests</bold>
      </title>
      <p id="p-95f9fe7c8fea">The authors declare that they have no competing interests. </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="R104834421777534">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Chen Y-K, Cheung C, Reuhl KR, Liu AB, Lee M-J, Lu Y-P, Yang CS. Effects of green tea polyphenol (-)-epigallocatechin-3-gallate on newly developed high-fat/Western-style diet-induced obesity and metabolic syndrome in mice. J Agric Food Chem [Internet]. 2011 Nov 9;59(21):11862-71</article-title>
          <pub-id pub-id-type="doi">http://pubs.acs.org/doi/abs/10.1021/jf2029016</pub-id>
          <pub-id pub-id-type="pmid">21932846</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777535">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>GBD 2015 Obesity Collaborators. Health Effects of Overweight and Obesity in 195 Countries over 25 Years. N Engl J Med [Internet]. 2017;377(1):13-27</article-title>
          <pub-id pub-id-type="doi">http://www.espeyearbook.org/ey/0015/ey0015.15-2.htm</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777536">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Lee M-J. Transforming growth factor beta superfamily regulation of adipose tissue biology in obesity. Biochim Biophys Acta - Mol Basis Dis [Internet]. 2018;1864(4):1160-71</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1016/j.bbadis.2018.01.025</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777537">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Unamuno X, Gómez-Ambrosi J, Rodríguez A, Becerril S, Frühbeck G, Catalán V. Adipokine dysregulation and adipose tissue inflammation in human obesity. Eur J Clin Invest [Internet]. 2018 Sep;48(9):e12997</article-title>
          <pub-id pub-id-type="doi">http://doi.wiley.com/10.1111/eci.12997</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777538">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Rohde K, Keller M, la Cour Poulsen L, Blüher M, Kovacs P, Böttcher Y. Genetics and epigenetics in obesity. Metabolism [Internet]. 2019;92:37-50</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1016/j.metabol.2018.10.007</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777539">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JPA, Hirschhorn JN. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet [Internet]. 2008 May;9(5):356-69</article-title>
          <pub-id pub-id-type="doi">http://www.nature.com/articles/nrg2344</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777542">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Kwon E-Y, Shin S-K, Cho Y-Y, et al. Time-course microarrays reveal early activation of the immune transcriptome and adipokine dysregulation leads to fibrosis in visceral adipose depots during diet-induced obesity. BMC Genomics [Internet]. 2012;13(1):450. PMID:22947075</article-title>
          <pub-id pub-id-type="doi">http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-13-450</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777543">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Li W, Zhu W, Che J, Sun W, Liu M, Peng B, Zheng J. Microarray Profiling of Human Renal Cell Carcinoma: Identification for Potential Biomarkers and Critical Pathways. Kidney Blood Press Res [Internet]. 2013;37(4-5):506-13</article-title>
          <pub-id pub-id-type="doi">https://www.karger.com/Article/FullText/355726</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777544">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Zhang X-M, Guo L, Chi M-H, Sun H-M, Chen X-W. Identification of active miRNA and transcription factor regulatory pathways in human obesity-related inflammation. BMC Bioinformatics [Internet]. 2015;16:76</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1186/s12859-015-0512-5</pub-id>
          <pub-id pub-id-type="pmid">25887648</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777545">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Kemsley EK. A genetic algorithm (GA) approach to the calculation of canonical variates (CVs). Trends Anal Chem. 1998;17(1):24-34</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1016/S0165-9936(97)00085-X</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777546">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Dharmaraj S, Gam L-Y, Sulaiman SF, Mansor SM, Ismail Z. The application of pattern recognition techniques in metabolite fingerprinting of six different Phyllanthus spp. Spectroscopy. 2011;26(1):69-78</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1155/2011/980109</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777547">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Statnikov A, Wang L, Aliferis CF. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification. BMC Bioinformatics [Internet]. 2008;9(1):319</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1186/1471-2105-9-319</pub-id>
          <pub-id pub-id-type="pmid">18647401</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777548">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Amaratunga D, Cabrera J, Shkedy Z. Exploration and Analysis of DNA Microarray and Other High-Dimensional Data [Internet]. Second Edition. Hoboken, NJ, USA: John Wiley &amp; Sons, Inc.; 2014. 1-317 p. (Wiley Series in Probability and Statistics)</article-title>
          <pub-id pub-id-type="doi">http://doi.wiley.com/10.1002/9781118364505</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777549">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Wang L, Huang W, Zhang L, Chen Q, Zhao H. Molecular pathogenesis involved in human idiopathic pulmonary fibrosis based on an integrated microRNA‑mRNA interaction network. Mol Med Rep [Internet]. 2018 Sep 5;18(5):4365-73</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.3892/mmr.2018.9456</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777562">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Wang W, Liu Q, Wang Y, et al. Integration of Gene Expression Profile Data of Human Epicardial Adipose Tissue from Coronary Artery Disease to Verification of Hub Genes and Pathways. Biomed Res Int [Internet]. 2019;2019:1-9</article-title>
          <uri>https://www.hindawi.com/journals/bmri/2019/8567306/</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777563">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Chai Y, Tan F, Ye S, Liu F, Fan Q. Identification of core genes and prediction of miRNAs associated with osteoporosis using a bioinformatics approach. Oncol Lett [Internet]. 2018;17(1):468-81</article-title>
          <uri>http://www.spandidos-publications.com/10.3892/ol.2018.9508</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777564">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Yeung KY, Ruzzo WL. Principal component analysis for clustering gene expression data. Bioinformatics [Internet]. 2001 Sep 1;17(9):763-74</article-title>
          <uri>https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/17.9.763</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777565">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Pang H, Lin A, Holford M, et al. Pathway analysis using random forests classification and regression. Bioinformatics [Internet]. 2006 Aug 15;22(16):2028-36</article-title>
          <pub-id pub-id-type="pmid">16809386</pub-id>
          <uri>https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btl344</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777566">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Ursu R-I. Obesity, a Gene Review. Bull Transilv Univ Brasov Med Sci Ser VI. 2013;6(55):1-8</article-title>
        </element-citation>
      </ref>
      <ref id="R104834421777584">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Trayhurn P. Hypoxia and Adipose Tissue Function and Dysfunction in Obesity. Physiol Rev [Internet]. 2013 Jan;93(1):1-21</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1152/physrev.00017.2012</pub-id>
          <pub-id pub-id-type="pmid">23303904</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777585">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Foti DP, Brunetti A. Editorial: "Linking Hypoxia to Obesity". Front Endocrinol (Lausanne) [Internet]. 2017 Apr;8:34</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.3389/fendo.2017.00034</pub-id>
          <pub-id pub-id-type="pmid">10766250</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777586">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Desnick JP, Kim J, He X, Wasserstein MP, Simonaro CM, Schuchman EH. Identification and Characterization of Eight Novel SMPD1 Mutations Causing Types A and B Niemann-Pick Disease. Mol Med [Internet]. 2010 Jul 6;16(7-8):316-21</article-title>
          <pub-id pub-id-type="pmid">20386867</pub-id>
          <uri>https://molmed.biomedcentral.com/articles/10.2119/molmed.2010.00017</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777587">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Fanjul-Fernández M, Folgueras AR, Cabrera S, López-Otín C. Matrix metalloproteinases: Evolution, gene regulation and functional analysis in mouse models. Biochim Biophys Acta - Mol Cell Res [Internet]. 2010;1803(1):3-19</article-title>
          <pub-id pub-id-type="doi">http://dx.doi.org/10.1016/j.bbamcr.2009.07.004</pub-id>
          <pub-id pub-id-type="pmid">19631700</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777589">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Sharbaf FV, Mosafer S, Moattar MH. A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization. Genomics [Internet]. 2016;107(6):231-8</article-title>
          <pub-id pub-id-type="doi">http://dx.doi.org/10.1016/j.ygeno.2016.05.001</pub-id>
          <pub-id pub-id-type="pmid">27154739</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777590">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Al-Rajab M, Lu J, Xu Q. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. Comput Methods Programs Biomed [Internet]. 2017;146:11-24</article-title>
          <uri>https://linkinghub.elsevier.com/retrieve/pii/S0169260716304163</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777591">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Yu H, Gu G, Liu H, Shen J, Zhao J. A Modified Ant Colony Optimization Algorithm for Tumor Marker Gene Selection. Genomics Proteomics Bioinformatics [Internet]. 2009 Dec;7(4):200-8</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1016/S1672-0229(08)60050-9</pub-id>
          <pub-id pub-id-type="pmid">20172493</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777592">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Díaz-Uriarte R, Alvarez de Andrés S. Gene selection and classification of microarray data using random forest. BMC Bioinformatics [Internet]. 2006;7:3</article-title>
          <pub-id pub-id-type="pmid">16398926</pub-id>
          <uri>http://www.biomedcentral.com/1471-2105/7/3</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777593">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Tamasawa N, Takayasu S, Murakami H, et al. Reduced cellular cholesterol efflux and low plasma high-density lipoprotein cholesterol in a patient with type B Niemann-Pick disease because of a novel SMPD-1 mutation. J Clin Lipidol [Internet]. 2012;6(1):74-80</article-title>
          <pub-id pub-id-type="doi">http://dx.doi.org/10.1016/j.jacl.2011.08.009</pub-id>
          <pub-id pub-id-type="pmid">22264577</pub-id>
        </element-citation>
      </ref>
      <ref id="R104834421777594">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Traini M, Quinn CM, Sandoval C, et al. Sphingomyelin Phosphodiesterase Acid-like 3A (SMPDL3A) Is a Novel Nucleotide Phosphodiesterase Regulated by Cholesterol in Human Macrophages. J Biol Chem [Internet]. 2014 Nov 21;289(47):32895-913</article-title>
          <pub-id pub-id-type="pmid">25288789</pub-id>
          <uri>http://www.jbc.org/lookup/doi/10.1074/jbc.M114.612341</uri>
        </element-citation>
      </ref>
      <ref id="R104834421777595">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <collab/>
          </person-group>
          <article-title>Xihua L, Shengjie T, Weiwei G, et al. Circulating miR-143-3p inhibition protects against insulin resistance in Metabolic Syndrome via targeting of the insulin-like growth factor 2 receptor. Transl Res [Internet]. 2019;205:33-43</article-title>
          <pub-id pub-id-type="doi">https://doi.org/10.1016/j.trsl.2018.09.006</pub-id>
          <pub-id pub-id-type="pmid">30392876</pub-id>
        </element-citation>
      </ref>
    </ref-list>
  </back>
</article>
