MSA 2025


Preliminary GWAS Results

BSLMM: Associated Genes

The Bayesian sparse linear mixed model produced the fewest significantly associated genes of the models tested, mostly from the same regions identified by the linear model. This short list includes genes involved in carbon metabolism and protein synthesis.

PCA Correlation SNP ID SNP Scaff. SNP Pos. Effect Downstream Gene Scaff. Downstream Gene Type Downstream Gene Start Downstream Gene Stop Downstream Gene Strand Downstream Gene Phase Downstream Gene ID Downstream Gene Name Upstream Gene Scaff. Upstream Gene Type Upstream Gene Start Upstream Gene Stop Upstream Gene Strand Upstream Gene Phase Upstream Gene ID Upstream Gene Name
aga Chr1047:20020\T,\C 5 20020 -0.01993098 5 gene 18386 20564 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA480625 mannanase
aga Chr1047:362588\A,\G 5 362588 -0.0141599863274 5 gene 362276 364189 rna-gnl|WGS:QKKD|L226DRAFT_mRNA456986 homoserine O-acetyltransferase
aga Chr1047:779150\T,\G 5 779150 -0.02569876 5 gene 778480 779139 rna-gnl|WGS:QKKD|L226DRAFT_mRNA289929 hypothetical protein 5 gene 779329 781713 rna-gnl|WGS:QKKD|L226DRAFT_mRNA610287 hypothetical protein
aga Chr1047:837099\G,\A 5 837099 -0.02058652 5 gene 835929 838479 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA531717 hypothetical protein
aga Chr1047:959366\G,\A 5 959366 -0.0183342358056 5 gene 956932 959245 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA610344 hypothetical protein
aga Chr1047:1082394\A,\G 5 1082394 -0.02622976 5 gene 1082045 1083561 rna-gnl|WGS:QKKD|L226DRAFT_mRNA551180 phosphopantothenate-cysteine ligase
aga Chr1047:1134386\A,\G 5 1134386 -0.01627176 5 gene 1134353 1135371 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA531833 hypothetical protein
aga Chr1047:1167402\T,\C 5 1167402 -0.0193230869364 5 gene 1163998 1167020 rna-gnl|WGS:QKKD|L226DRAFT_mRNA531846 hypothetical protein 5 gene 1167451 1168064 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA457574 UPF0041-domain-containing protein
aga Chr1047:1317050\A,\C 5 1317050 -0.02262773 5 gene 1316530 1317709 rna-gnl|WGS:QKKD|L226DRAFT_mRNA568174 hypothetical protein
aga Chr1047:1350097\T,\C 5 1350097 -0.01868771
aga Chr1047:1423286\G,\T 5 1423286 -0.02649434 5 gene 1420120 1423186 rna-gnl|WGS:QKKD|L226DRAFT_mRNA610474 hypothetical protein 5 gene 1423453 1424456 rna-gnl|WGS:QKKD|L226DRAFT_mRNA610475 hypothetical protein
aga Chr1047:1546379\T,\C 5 1546379 -0.0205141 5 gene 1546158 1550440 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA531964 RNA polymerase II-associated protein
aga Chr1056:290190\T,\C 14 290190 -0.0183155 14 gene 281852 293292 rna-gnl|WGS:QKKD|L226DRAFT_mRNA613564 hypothetical protein
aga Chr1056:301085\C,\T 14 301085 -0.028432977118 14 gene 301299 302875 rna-gnl|WGS:QKKD|L226DRAFT_mRNA464361 NAD(P)-binding protein
aga Chr1056:372057\A,\G 14 372057 -0.02773642 14 gene 370578 372984 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA535672 S-adenosyl-L-methionine-dependent methyltransferase
aga Chr1056:379782\G,\T 14 379782 -0.04061589
aga Chr1056:577692\G,\A 14 577692 -0.02634238 14 gene 575623 579266 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA535743 hypothetical protein
aga Chr1056:657424\G,\A 14 657424 -0.02265925
aga Chr1177:1001\T,\C 135 1001 -0.02463647 135 gene 1352 3889 rna-gnl|WGS:QKKD|L226DRAFT_mRNA527722 hypothetical protein
sec Chr1047:581962\C,\T 5 581962 0.02195912 5 gene 581677 582232 + rna-gnl|WGS:QKKD|L226DRAFT_mRNA567930 hypothetical protein
sec Chr1047:1533358\C,\T 5 1533358 0.0174058
sec Chr1101:124187\A,\G 59 124187 0.01755269 59 gene 123327 124894 rna-gnl|WGS:QKKD|L226DRAFT_mRNA473141 hypothetical protein

GWAS with Updated Genome

Past work on Lentinus tigrinus used the publicly available Lenti7 genome, and the results in the poster above use that same genome. The four scaffolds associated with the agaricoid/secotioid morphology appear to belong to a single region, but this cannot be confirmed without a better genome assembly. Our lab recently produced a new Lentinus tigrinus genome from Lenti6, using Illumina data and new Nanopore data. We mapped the GWAS results onto this new genome to produce the following plots:

Manhattan plot: linear model SNP significance mapped onto Lenti6 genome
Manhattan plot: Bayesian sparse linear mixed model SNP effect mapped onto Lenti6 genome

Lenti6 BSLMM: Associated Genes

Six Bayesian sparse linear mixed model runs together identified 26 associated SNPs (25 distinct positions, since Chr1013:834604 was recovered in two runs). All but one lie on scaffold 13 of the new Lenti6 genome, and none were identified in the PCA analysis. The only two flanking genes with a known function were hexokinase and vacuolar protein sorting-associated protein 1.

Run SNP ID SNP Scaff. SNP Pos. Effect Downstream Gene Scaff. Downstream Gene Type Downstream Gene Start Downstream Gene Stop Downstream Gene Strand Downstream Gene Phase Downstream Gene ID Downstream Gene Name Upstream Gene Scaff. Upstream Gene Type Upstream Gene Start Upstream Gene Stop Upstream Gene Strand Upstream Gene Phase Upstream Gene ID Upstream Gene Name
0 Chr104:1032336\T,\A 4 1032336 0.0698733181312 4 gene 1032631 1033578   FUN_005221-T1 hypothetical protein                
5 Chr1013:688090\C,\T 13 688090 -0.2489215 13 gene 687692 689188   FUN_010699-T1 hypothetical protein                
5 Chr1013:696605\G,\A 13 696605 -0.2383887 13 gene 693661 696437 +   FUN_010701-T1 hypothetical protein 13 gene 699007 700773   FUN_010702-T1 vacuolar protein sorting-associated protein 1
2 Chr1013:696631\T,\G 13 696631 0.3346923 13 gene 693661 696437 +   FUN_010701-T1 hypothetical protein 13 gene 699007 700773   FUN_010702-T1 vacuolar protein sorting-associated protein 1
4 Chr1013:696677\C,\G 13 696677 0.203313200385 13 gene 693661 696437 +   FUN_010701-T1 hypothetical protein 13 gene 699007 700773   FUN_010702-T1 vacuolar protein sorting-associated protein 1
4 Chr1013:702665\T,\C 13 702665 0.23511962228 13 gene 702068 702784   FUN_010703-T1 hypothetical protein                
1 Chr1013:708962\G,\C 13 708962 0.1137285                                
3 Chr1013:712112\T,\C 13 712112 0.2080436                                
3 Chr1013:759975\C,\G 13 759975 -0.2796142                 13 gene 760036 761416   FUN_010719-T1 hypothetical protein
0 Chr1013:767076\A,\C 13 767076 0.487800357066 13 gene 766613 768561 +   FUN_010722-T1 hypothetical protein                
1 Chr1013:770510\G,\A 13 770510 0.1488966 13 gene 768993 771310   FUN_010723-T1 hypothetical protein                
1 Chr1013:773674\G,\A 13 773674 0.130354093192 13 gene 773584 774931   FUN_010724-T1 hypothetical protein                
2 Chr1013:823573\G,\C 13 823573 -0.2246705 13 gene 822010 824549 +   FUN_010745-T1 hypothetical protein                
5 Chr1013:828374\C,\G 13 828374 -0.1447514 13 gene 827711 828484   FUN_010747-T1 hypothetical protein                
5 Chr1013:828701\T,\C 13 828701 -0.1704084 13 gene 827711 828484   FUN_010747-T1 hypothetical protein 13 gene 828798 829330   FUN_010748-T1 hypothetical protein
4 Chr1013:829242\C,\T 13 829242 0.194681302877 13 gene 828798 829330   FUN_010748-T1 hypothetical protein                
0 Chr1013:834604\A,\C 13 834604 0.202942721124 13 gene 834394 836452   FUN_010751-T1 hypothetical protein                
3 Chr1013:834604\A,\C 13 834604 0.1226208 13 gene 834394 836452   FUN_010751-T1 hypothetical protein                
0 Chr1013:842534\G,\A 13 842534 -0.28680617856                                
4 Chr1013:848994\A,\G 13 848994 -0.252565932983 13 gene 846915 848958 +   FUN_010756-T1 hexokinase                
1 Chr1013:854661\A,\C 13 854661 -0.143584635222                 13 gene 855217 856061 +   FUN_010758-T1 hypothetical protein
1 Chr1013:854869\A,\C 13 854869 0.1346482                 13 gene 855217 856061 +   FUN_010758-T1 hypothetical protein
3 Chr1013:862356\G,\A 13 862356 0.1544015 13 gene 862003 862757   FUN_010762-T1 hypothetical protein                
5 Chr1013:862514\T,\C 13 862514 0.1712555 13 gene 862003 862757   FUN_010762-T1 hypothetical protein                
1 Chr1013:865973\T,\C 13 865973 -0.1337842 13 gene 865503 867116 +   FUN_010764-T1 hypothetical protein                
3 Chr1013:870821\T,\C 13 870821 -0.1515588 13 gene 871143 874676 +   FUN_010766-T1 hypothetical protein                
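
The run-level summary above can be reproduced from the table with a short tally. The following is a minimal Python sketch, not the script used for the poster: the `HITS` list is transcribed by hand from the Run, SNP Scaff., and SNP Pos. columns, and the function name `summarize_hits` is hypothetical.

```python
from collections import Counter, defaultdict

# (run, scaffold, position) for each row of the Lenti6 BSLMM table,
# transcribed by hand from the Run, SNP Scaff., and SNP Pos. columns.
HITS = [
    (0, 4, 1032336), (5, 13, 688090), (5, 13, 696605), (2, 13, 696631),
    (4, 13, 696677), (4, 13, 702665), (1, 13, 708962), (3, 13, 712112),
    (3, 13, 759975), (0, 13, 767076), (1, 13, 770510), (1, 13, 773674),
    (2, 13, 823573), (5, 13, 828374), (5, 13, 828701), (4, 13, 829242),
    (0, 13, 834604), (3, 13, 834604), (0, 13, 842534), (4, 13, 848994),
    (1, 13, 854661), (1, 13, 854869), (3, 13, 862356), (5, 13, 862514),
    (1, 13, 865973), (3, 13, 870821),
]


def summarize_hits(hits):
    """Tally hits per scaffold and find SNPs recovered by more than one run."""
    per_scaffold = Counter(scaffold for _, scaffold, _ in hits)
    runs_by_snp = defaultdict(set)
    for run, scaffold, pos in hits:
        runs_by_snp[(scaffold, pos)].add(run)
    # Keep only SNPs that were flagged in two or more independent runs.
    recurring = {
        snp: sorted(runs) for snp, runs in runs_by_snp.items() if len(runs) > 1
    }
    return per_scaffold, len(runs_by_snp), recurring


per_scaffold, n_distinct, recurring = summarize_hits(HITS)
print(dict(per_scaffold), n_distinct, recurring)
```

Keying on (scaffold, position) rather than on the full SNP ID string is a design choice of this sketch; it avoids depending on how alleles are encoded in the ID.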

Methods Details

The specimen SI.Lt.038 (aga/sec) was collected from the Ipswich River (Topsfield, MA, USA) in June 2023. Spores were isolated from that specimen and grown out as monokaryons. The monokaryons were then test crossed to the tester strain FP.102501.T_SSI.5 (sec), and mushrooms were grown from the resulting dikaryons. Hymenophore morphology of these test crosses was used to genotype the SI.Lt.038 single spore isolates (SSIs). Two SSIs, SI.Lt.038_SSI.4 (aga) and SI.Lt.038_SSI.6 (sec), were crossed together to produce the GWAS parent. From this parent, 150 SSIs were produced. The GWAS SSIs were then test crossed to FP.102501.T_SSI.5 and genotyped by fruiting.

DNA was extracted using a standard SDS extraction protocol and was cleaned using a Zymo DNA Clean & Concentrator-5 kit. Illumina sequencing was performed by Novogene; Nanopore sequencing was performed in-house on an R10.4.1 flowcell using the Ligation Sequencing Kit v14. Reads were aligned and SNPs were called using the standard GATK4 variant calling pipeline.

A new Lenti6 genome was produced by hybrid assembly of Illumina and Nanopore data. Reads were trimmed and quality controlled using fastp. Nanopore long reads were basecalled using Dorado with the super-accurate (SUP) model. Hybrid assembly was then performed using MaSuRCA with the CABOG assembler. After assembly, the genome was polished with POLCA and redundant scaffolds were collapsed using Redundans.

The genome-wide association was conducted with vcf2gwas, a pipeline that uses GEMMA for the statistical analysis. Samples GWAS_SSI.060, GWAS_SSI.096, and GWAS_SSI.007 were excluded from the analysis until their hymenophore morphologies can be re-checked. The program was run using either the Lenti7 genome and annotations available on NCBI or the newly produced Lenti6 genome and annotations. We supplied principal components 2-6 as covariates, which together explained 49% of the variation. Principal component 1 was not included because it correlated strongly with the phenotype of interest.

To identify significantly associated SNPs, GEMMA was run using the linear model, linear mixed model, and Bayesian sparse linear mixed model. For the linear model and linear mixed model, a SNP was considered significant if its p-value was lower than the lowest p-value of any SNP not identified in the PCA analysis. For the Bayesian sparse linear mixed model, a SNP was considered significant if its effect was higher than the highest effect, or lower than the lowest effect, of any SNP not identified in the PCA analysis. In runs where the most extreme effects came from SNPs not identified in the PCA analysis, the significant SNPs were instead taken to be those with effects more extreme than the most extreme PCA-identified SNPs.

The linear mixed model produced results similar to those of the linear model, but without intermediately significant SNPs, so for simplicity those results are not shown here. At least six runs of the Bayesian sparse linear mixed model were conducted; the ones shown here are the most representative. Results were visualized using ggplot2 in R version 4.4.1.
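
The significance rules described above can be written out as a small Python sketch. This is one reading of the rules, not the code used in the analysis: the function names, the dictionary inputs, and the SNP labels in the example are hypothetical, and the test for "the most extreme effects came from non-PCA SNPs" is assumed here to mean that the single largest absolute effect belongs to a SNP outside the PCA set.

```python
from typing import Dict, List, Set


def significant_by_pvalue(pvalues: Dict[str, float], pca_snps: Set[str]) -> List[str]:
    """Linear / linear mixed model rule: keep SNPs whose p-value is lower
    than the lowest p-value of any SNP NOT identified in the PCA analysis."""
    background = [p for snp, p in pvalues.items() if snp not in pca_snps]
    if not background:
        return []
    cutoff = min(background)
    return sorted(snp for snp, p in pvalues.items() if p < cutoff)


def significant_by_effect(effects: Dict[str, float], pca_snps: Set[str]) -> List[str]:
    """BSLMM rule: keep SNPs whose effect lies outside the range spanned by
    a reference set. The reference set is the non-PCA SNPs, unless the most
    extreme effect comes from a non-PCA SNP (assumed reading of the fallback),
    in which case the PCA-identified SNPs are used instead."""
    pca = [e for snp, e in effects.items() if snp in pca_snps]
    other = [e for snp, e in effects.items() if snp not in pca_snps]
    if not pca or not other:
        return []
    most_extreme = max(effects, key=lambda snp: abs(effects[snp]))
    reference = other if most_extreme in pca_snps else pca
    low, high = min(reference), max(reference)
    return sorted(snp for snp, e in effects.items() if e < low or e > high)


# Illustrative values only; these SNP labels and numbers are not from the study.
pca_set = {"snpA", "snpB"}
print(significant_by_pvalue({"snpA": 1e-9, "snpB": 1e-7, "snpC": 1e-5}, pca_set))
print(significant_by_effect({"snpA": -0.03, "snpB": 0.02, "snpC": 0.30, "snpD": -0.25}, pca_set))
```

Under the first rule only PCA-identified SNPs can pass, since the cutoff is itself set by the best non-PCA SNP. Under the fallback, the reverse holds, which is consistent with the Lenti6 table containing no PCA-identified SNPs.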