Set up PGx server
Install Python 3.8 (incl. pandas), Docker, the AWS CLI, Java, and samtools
Clone this repository along with the additional repos
git clone git@bitbucket.org:quantgene/pgx-engine-wrapper.git
git clone git@bitbucket.org:quantgene/pgx-engine.git
git clone https://github.com/SBIMB/StellarPGx.git pgx-main
Prepare main directory
cp -r pgx-engine-wrapper/* pgx-main/
mkdir pgx-main/pgx_results
rm pgx-main/data/*
Update the Chr definitions in main.nf
Change every Chr<#> contig name in main.nf to plain <#> (e.g. Chr1 becomes 1). For cyp2d6 the important lines are 226-230 of main.nf; after the change they should read:
if (params.gene=='cyp2d6') {
chrom = "22"
region_a1 = "22:42126000-42137500"
region_a2 = "042126000-042137500"
region_b1 = "22:42126300-42132400"
region_b2 = "042126300-042132400"
transcript = "ENST00000645361"
}
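The renaming can be done mechanically with sed. This is a sketch, assuming GNU sed and that the contig names appear as Chr<#>/chr<#> literals; it is demonstrated on a scratch file first so the effect can be inspected before touching the real main.nf:

```shell
# Demonstrate the Chr<#> -> <#> substitution on a scratch snippet first
printf 'chrom = "chr22"\nregion_a1 = "chr22:42126000-42137500"\n' > /tmp/chr_demo.nf

# Strip a leading Chr/chr from contig names (digits, X, Y, M/MT);
# "chrom" is untouched because 'o' is not in the character class
sed -E -i.bak 's/[Cc]hr([0-9XYMT]+)/\1/g' /tmp/chr_demo.nf

cat /tmp/chr_demo.nf
# chrom = "22"
# region_a1 = "22:42126000-42137500"
```

If the output looks right, run the same sed command against main.nf (the -i.bak flag keeps a backup) and diff against the .bak file to confirm only the intended lines changed.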
Place the FASTA reference files (hg38.fa, hg38.fa.fai) into the pgx-main directory. For example:
scp root@139.162.190.87:~/ajit/resources/hg38\* pgx-main/
Install Nextflow
curl -fsSL get.nextflow.io | bash
Move the nextflow launcher (installed in your current directory) to a directory in your $PATH, e.g. /bin:
mv nextflow /bin
(The full Nextflow documentation can be found at https://www.nextflow.io/docs/latest/)
Docker: StellarPGx
docker pull twesigomwedavid/stellarpgx-dev:latest
Docker: PGx Engine
cd pgx-engine
docker build -t pgx .
docker run -d -p 5000:5000 --name pgx-api pgx
docker stop pgx-api
cd ..
Run Entire PGx Pipeline
Change directory: cd pgx-main
The main script to run is: get_pgx_result.sh.
The input can be either a text file or standard input; each line should contain an S3 path to a VCF file. The location of the corresponding BAM file is inferred.
All output folders are placed in pgx_results/. A diplotype/rsID overview is written to pgx_diplotypes_rsids.tsv.
The output JSON for each sample is stored in the same S3 location as the VCF, with the filename <sample>.json.
The entire PGx output for a sample is uploaded to S3 in the same location as the BAM file, with the name <sample>_pgx_result.tar.gz.
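A minimal example run might look like the following. The bucket and key names are placeholders, and the argument form assumes the script accepts the list file as its first argument (per the text-file/stdin note above):

```shell
# Hypothetical S3 paths -- substitute your own bucket and prefixes
cat > samples.txt <<'EOF'
s3://my-bucket/run01/sampleA/sampleA.vcf
s3://my-bucket/run01/sampleB/sampleB.vcf
EOF

# Either pass the list file as an argument...
#   ./get_pgx_result.sh samples.txt
# ...or pipe it via standard input:
#   ./get_pgx_result.sh < samples.txt
```

Afterwards, check pgx_results/ locally (and pgx_diplotypes_rsids.tsv for the diplotype/rsID overview), or fetch the per-sample <sample>_pgx_result.tar.gz archive from S3 with aws s3 cp.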