The scPlant pipeline adapts the Paired Motif Enrichment Tool (PMET) (Rich-Griffin et al., 2020) to predict pairs of TF binding motifs within the promoter regions of cell-type-specific marker genes. Three plant species (Arabidopsis thaliana, Oryza sativa, Zea mays) are currently supported, and more will be added.

Download and installation

Download databases and scripts

The required databases and scripts to perform Paired Motif Enrichment analysis can be downloaded here.

# On a Linux system:
unzip scPlantData.zip
cd scPlantData

Download genome files

Download the genome (FASTA) and gff3 annotation files that correspond to your scRNA-seq data. If you already have these files, copy them to scPlantData/PMET/.

For example, to download the Arabidopsis thaliana TAIR10 genome and gff3:

# For the program to work, we recommend downloading the files to the directory scPlantData/PMET/
wget -O PMET/arabidopsis_thaliana.fa.gz 'https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-55/fasta/arabidopsis_thaliana/dna/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz'
wget -O PMET/arabidopsis_thaliana.gff3.gz 'https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-55/gff3/arabidopsis_thaliana/Arabidopsis_thaliana.TAIR10.55.gff3.gz'
gunzip PMET/arabidopsis_thaliana.fa.gz
gunzip PMET/arabidopsis_thaliana.gff3.gz

Download genome and gff3 files for other species

Genome and gff3 files for other plant species can be downloaded from EnsemblPlants and Phytozome. For example, to download the Zea mays Zm-B73-REFERENCE-NAM-5.0 genome and gff3:

wget -O PMET/zea_mays.fa.gz 'https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-55/fasta/zea_mays/dna/Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.toplevel.fa.gz'
wget -O PMET/zea_mays.gff3.gz 'https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-55/gff3/zea_mays/Zea_mays.Zm-B73-REFERENCE-NAM-5.0.55.gff3.gz'
gunzip PMET/zea_mays.fa.gz
gunzip PMET/zea_mays.gff3.gz
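
Before moving on, it can help to confirm that the decompressed files are where the pipeline expects them. The following is a minimal sketch, not part of scPlant: the helper name check_genome_files is hypothetical, and it assumes the files are named <prefix>.fa and <prefix>.gff3 as in the download commands above.

```shell
# Check that <dir>/<prefix>.fa and <dir>/<prefix>.gff3 exist, are non-empty,
# and that the FASTA file starts with a '>' header line.
# (Hypothetical helper; file naming is assumed from the wget commands above.)
check_genome_files() {
    local dir="$1" prefix="$2" f
    for f in "$dir/$prefix.fa" "$dir/$prefix.gff3"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
    if [ "$(head -c 1 "$dir/$prefix.fa")" != ">" ]; then
        echo "not a FASTA file (no '>' header): $dir/$prefix.fa" >&2
        return 1
    fi
    echo "ok: $dir/$prefix.fa and $dir/$prefix.gff3"
}

# Run from inside scPlantData/
check_genome_files PMET zea_mays || echo "Download or copy the files first."
```

The check only looks at file presence and the first byte of the FASTA file; it does not verify that the genome and annotation come from the same release.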

Install software dependencies

Please make sure the following dependencies are installed: samtools, bedtools, meme, and parallel.
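
One quick way to see which dependencies are still missing is to look each one up on the PATH. This is a rough sketch under an assumption: the executable names are taken to be the same as the package names listed above, which may not hold for every installation (the MEME Suite, for instance, ships several separate executables). The helper name missing_tools is hypothetical.

```shell
# Print the name of every given executable that is not found on PATH.
# Returns non-zero if at least one is missing.
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed executable names; adjust them if your installation differs.
if missing=$(missing_tools samtools bedtools meme parallel); then
    echo "All dependencies found."
else
    echo "Missing: $(echo "$missing" | tr '\n' ' ')" >&2
fi
```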

Paired Motif Enrichment analysis

Help information:

bash RunPMET.sh -h
USAGE: RunPMET [options]
Perform Paired Motif Enrichment Test.

-r <PMET_path>  : Full path of directory where PMET scripts exist. Required.
-o <output_directory> : Output directory for results.
-m <gene_inputs>        : File containing genes to be tested. Required. A 2 column tab-delimited text file containing cluster ID in column 1 and genes in column 2.
-s <species>  : Species. currently At (Arabidopsis thaliana), Os (Oryza sativa), Zm (Zea mays) are supported.
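
Since the -m option expects a two-column tab-delimited file, a small format check before launching the analysis can catch problems early. The sketch below is illustrative only: validate_gene_input is a hypothetical helper, and the gene IDs in the example file are placeholders rather than real identifiers.

```shell
# Validate a gene input file: every line must have exactly two
# non-empty tab-separated fields (cluster ID, gene ID).
validate_gene_input() {
    awk -F'\t' '
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d: expected 2 non-empty tab-separated fields\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Hypothetical example file with placeholder gene IDs.
example=$(mktemp)
printf '0\tGENE_A\n0\tGENE_B\n1\tGENE_C\n' > "$example"
if validate_gene_input "$example"; then
    echo "format ok"
fi
rm -f "$example"
```

The check covers layout only; it does not confirm that the gene IDs exist in the gff3 annotation.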

Prepare gene input file

Using a real Arabidopsis thaliana scRNA-seq dataset (Zhang et al., 2019) as example data (download here), you can prepare the gene input file with the R code below:

library(dplyr)
# Keep markers enriched in their cluster that pass the significance, expression, and fold-change cutoffs
DEG <- SeuratObj@misc$topMarker %>%
  dplyr::filter(pct.1 > pct.2, p_val < 0.05, pct.1 > 0.25, avg_logFC > 0.25) %>%
  dplyr::select(cluster, gene)
write.table(DEG, file = "DEG.txt", quote = FALSE, sep = "\t", row.names = FALSE, col.names = FALSE)
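
After writing DEG.txt, it may be useful to see how many marker genes each cluster contributes, since a cluster with very few genes gives the enrichment test little to work with. A minimal sketch follows, assuming the two-column layout written above; genes_per_cluster is a hypothetical helper name.

```shell
# Count genes per cluster in a 2-column tab-delimited file.
# Output: cluster ID, tab, number of genes; sorted by cluster ID as text.
genes_per_cluster() {
    awk -F'\t' '{ n[$1]++ } END { for (c in n) printf "%s\t%d\n", c, n[c] }' "$1" | sort
}

if [ -f DEG.txt ]; then
    genes_per_cluster DEG.txt
fi
```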

Run the script

Note: Running the shell script could be time-consuming, particularly step 2.

source RunPMET.sh -r ./PMET -o output -m DEG.txt -s At
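
Because the run can take a long time, you may want to capture its output in a log file and record how long it took. The wrapper below is a generic sketch and not part of scPlant: run_logged and the log file name pmet.log are hypothetical, and the commented usage line assumes the script can be invoked with bash, as in the help example above.

```shell
# Run a command, send its stdout and stderr to a log file,
# and report the exit code and elapsed wall-clock time.
run_logged() {
    local log="$1" start end rc=0
    shift
    start=$(date +%s)
    "$@" >"$log" 2>&1 || rc=$?
    end=$(date +%s)
    echo "finished with exit code $rc after $((end - start)) s; log: $log"
    return "$rc"
}

# Example usage (assumed invocation; uncomment to run):
# run_logged pmet.log bash RunPMET.sh -r ./PMET -o output -m DEG.txt -s At
```

You can follow progress from a second terminal with tail -f on the log file.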

Once the program has run successfully, the results are written to output/output.txt in the output directory.

Process the result and visualization

Pre-process original PMET output:

PMETresult <- processPMET("output/output.txt")

Heatmap of PMET result:

PMEThmp(PMETresult, topn = 5)

Draw triangle heatmap of PMET result:

triHmp(PMETresult, clus = unique(PMETresult$Module)[1])

Network diagram showing top pairs of a cluster:

topPairsNet(PMETresult, clus = unique(PMETresult$Module)[1], topn = 20)