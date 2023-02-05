



It has been reported that genes are not independent of each other: co-expressed genes may have similar biological functions, and the effect of grouping genes is relatively strong.17 The WGCNA algorithm, widely used to study network evolution, uses topological overlap dissimilarity as a measure of distance between genes to identify network topologies and sub-networks (called modules). Given the complexity of RA pathogenesis, WGCNA was performed to identify biologically significant gene modules and to better understand the genes associated with RA pathogenesis (Fig. 2). Because the disease course is heterogeneous and variable, RA classification is important for the clinical management of patients. While it is necessary to identify DEGs, it is also important to determine their interconnections. Therefore, DEGs with p < 0.05 and logFC > 0.25 were chosen as input data for WGCNA to generate gene co-expression modules. Applying a soft-threshold power of 4 (scale-free R² = 0.85) (Fig. 2B, C) and a cut height of 0.25 (Fig. 3B), 12 RA-related modules were identified (Fig. 3A).
Figure 2 Adjacency function parameters and construction of the WGCNA modules. (A) Sample clustering to detect outliers. (B) Average connectivity of eigengenes. (C) Scale independence of eigengenes. The red line represents the squared correlation coefficient (r²) and the mean connectivity of the eigengenes under a soft-threshold power of 4. (D) Cluster dendrogram of the DEGs. Modules with high similarity are identified by clustering and merged dynamically.
Figure 3 WGCNA module analysis. (A) Heatmap of WGCNA module correlations with RA and normal clinical traits. (B) Cluster dendrogram of module eigengenes. Module eigengene dissimilarity is calculated to merge several similar modules with a height cutoff value of 0.25. (C) Scatter plot of RA gene significance and module membership of purple modules.
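As a rough illustration of the topological overlap dissimilarity that WGCNA uses as a distance between genes, the following minimal Python sketch assumes an unsigned network in which the adjacency between two genes is the absolute Pearson correlation raised to the soft-threshold power. The function names and the toy expression values are hypothetical, and this is not the analysis code used in this study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tom_dissimilarity(genes, power=4):
    """Return the matrix 1 - TOM for a list of per-gene expression vectors.

    Assumption: unsigned adjacency a_ij = |cor(i, j)| ** power, a_ii = 0.
    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where k_i is the connectivity (row sum of the adjacency matrix).
    """
    n = len(genes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = abs(pearson(genes[i], genes[j])) ** power
    k = [sum(row) for row in adj]
    diss = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Shared-neighbour term; u = i and u = j contribute 0 (zero diagonal).
            shared = sum(adj[i][u] * adj[u][j] for u in range(n))
            tom = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            diss[i][j] = diss[j][i] = 1.0 - tom
    return diss


# Toy data (hypothetical): genes 0 and 1 are co-expressed, gene 2 is not.
toy = [[1, 2, 3, 4], [2, 4, 6, 8], [1, -1, 1, -1]]
d = tom_dissimilarity(toy, power=4)
```

In this toy example the co-expressed pair ends up much closer (smaller dissimilarity) than either gene is to the third, which is the property that hierarchical clustering then uses to group genes into modules.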
In the WGCNA algorithm, hub modules are identified by the highest correlation between module eigengenes and clinical traits and the most significant correlation between module membership and gene significance. We then assessed the relationship between modules and clinical features to identify hub modules (Supplementary Figures S3 and S4). The blue module was significantly associated with RA (correlation coefficient = 0.77, p = 3e-08) (Fig. 3C). We therefore identified the blue module as the hub module of the WGCNA network and selected the top 20 genes based on gene significance and module membership as candidate genes: ADAMDEC1, TRBC1, CD27, SEL1L3, LCK, IGLL5, IGKC, IGLJ3, CD3D, CXCL13, TNFRSF17, IGHM, CD2, MS4A1, TRAF3IP3, HLA-DOB, IGLV1-44, TRAC, TPD52, and PSMB9 (Supplementary Figure S2C).
Identification of hub genes and verification of diagnostic effects
We then intersected the top 20 candidate genes from the PPI and WGCNA networks and identified the overlapping genes as common hub genes: LCK, CXCL13, IGHM, and MS4A1 (Supplementary Figure S2C). We then performed receiver operating characteristic (ROC) curve analysis to assess the individual predictive power of the four common hub genes. LCK had the highest AUC value (AUC = 0.773), followed by CXCL13 (AUC = 0.771), IGHM (AUC = 0.757), and MS4A1 (AUC = 0.739) (Fig. 4), suggesting that the four common hub genes have good predictive ability.
Figure 4 ROC curve analysis in the training cohort. (A) ROC curve of CXCL13 expression to predict RA. (B) ROC curve of IGHM expression to predict RA. (C) ROC curve of LCK expression to predict RA. (D) ROC curve of MS4A1 expression to predict RA.
To further validate the predictive ability of the common hub genes, we performed ROC curve analysis on the GSE89408 dataset. The four common hub genes also performed well (CXCL13: AUC = 0.899; IGHM: AUC = 0.874; LCK: AUC = 0.772; MS4A1: AUC = 0.877; Fig. 5).
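The AUC reported for each hub gene can be read as the probability that a randomly chosen RA sample has a higher expression value than a randomly chosen control. A minimal Python sketch of that calculation is given here, assuming binary labels (1 = RA, 0 = control) and a single gene's expression as the score; the function name and the example values are hypothetical and do not come from the study data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    Each (case, control) pair counts 1 if the case scores higher,
    0.5 if the scores tie, and 0 otherwise.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical expression values for one gene in two controls and two RA samples.
expression = [0.1, 0.4, 0.35, 0.8]
is_ra = [0, 0, 1, 1]
print(roc_auc(expression, is_ra))  # 0.75
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size; a rank-based formulation gives the same value more efficiently for larger datasets.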
In addition, another external RA validation cohort was retrieved from the GEO database to further validate the predictive power of the four common hub genes. In this external validation cohort of 36 rheumatoid arthritis patients and 22 control patients (GSE121894), ROC curve analysis showed that the four common hub genes had excellent predictive ability (CXCL13: AUC = 0.789; IGHM: AUC = 0.726; LCK: AUC = 0.622; MS4A1: AUC = 0.652; Supplementary Figure S5). These findings indicate that the four common hub genes may serve as effective biomarkers for RA diagnosis.
Figure 5 ROC curve analysis in the validation cohort. (A) ROC curve of CXCL13 expression to predict RA. (B) ROC curve of IGHM expression to predict RA. (C) ROC curve of LCK expression to predict RA. (D) ROC curve of MS4A1 expression to predict RA.
Correlations between biomarker genes and immune-related functions
Previous studies have shown that RA is an autoimmune disease in which pro-inflammatory cytokines secreted by fibroblasts and infiltrating immune cells can lead to gradual cartilage degeneration.9,10 Investigating the correlations between biomarker genes and immune-related features is therefore important for further understanding the pathogenesis of RA. Immune and stromal scores, calculated by the ESTIMATE algorithm, were used to compare the immune status of RA and control samples by estimating the overall infiltration of immune and stromal cells. RA patients had significantly higher immune, stromal, and ESTIMATE scores than control patients (Fig. 6). The ESTIMATE results demonstrated that there are indeed large differences in the immune microenvironment between RA and control samples. To further investigate which immune cells are responsible for these differences and the potential mechanisms, we estimated the infiltration of 24 immune cells using the ssGSEA method via the ‘GSVA’ package in R.
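To give an intuition for how a single-sample enrichment score summarizes the infiltration of one immune cell type, the following Python sketch computes a simplified score in the spirit of ssGSEA: genes are ranked by expression within one sample, and the score sums the difference between a rank-weighted running fraction of marker genes and the running fraction of all other genes. This is a simplified illustration under stated assumptions (rank weighting exponent 0.25, no normalization across samples), not the implementation in the ‘GSVA’ package; the function name, gene names, and values are hypothetical.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one gene set.

    expression: dict mapping gene name -> expression value in one sample.
    gene_set:   iterable of marker gene names for one cell type.
    alpha:      exponent applied to the rank weights (assumed 0.25 here).
    """
    members = set(gene_set) & set(expression)
    if not members or len(members) == len(expression):
        raise ValueError("gene set must cover some, but not all, measured genes")
    # Order genes from highest to lowest expression.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    # Rank n for the most expressed gene, 1 for the least expressed one.
    weights = [(n - pos) ** alpha if g in members else 0.0
               for pos, g in enumerate(ordered)]
    total_in = sum(weights)
    n_out = n - len(members)
    score = cum_in = cum_out = 0.0
    for gene, weight in zip(ordered, weights):
        if gene in members:
            cum_in += weight / total_in
        else:
            cum_out += 1.0 / n_out
        # Accumulate the gap between the two running distributions.
        score += cum_in - cum_out
    return score


# Hypothetical sample: markers g1 and g2 sit at the top of the ranking.
sample = {"g1": 10.0, "g2": 9.0, "g3": 8.0, "g4": 2.0, "g5": 1.0, "g6": 0.0}
high = ssgsea_score(sample, ["g1", "g2"])
low = ssgsea_score(sample, ["g5", "g6"])
```

A marker set concentrated among the most highly expressed genes yields a positive score and one concentrated among the least expressed genes yields a negative score, so comparing scores across samples indicates relative infiltration of the corresponding cell type.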
Compared to control samples, RA samples had significantly higher infiltration of T cells, aDCs, and B cells (Fig. 7A). In addition, Pearson correlation analysis showed that expression of the four biomarker genes was positively associated with infiltration of aDCs, B cells, CD8 T cells, cytotoxic cells, DCs, NK CD56bright cells, T cells,11 immune cells, T helper cells, Tcm cells, Tfh cells, and Th2 cells. Conversely, NK cell infiltration was negatively associated with biomarker gene expression (Fig. 7B).
Figure 6 ESTIMATE analysis of RA and control samples. (A) Comparison of ESTIMATE scores between RA and controls. (B) Comparison of immune scores between RA and controls. (C) Comparison of stromal scores between RA and controls. (D) Comparison of tumor purity between RA and controls.
Figure 7 Correlation between biomarker genes and immune infiltration. (A) Heatmap of the infiltration of 24 immune cells in RA and control samples, drawn with the ‘pheatmap’ package in R (version 3.6.3) (http://cran.r-project.org/bin/windows/base/old/3.6.3/). (B) Correlation of biomarker genes with the infiltration of 24 immune cells.

