pgx-main from prod added

2025-08-18 12:06:58 +02:00
parent fcb0e9aa4c
commit fe48df8676
984 changed files with 878657 additions and 0 deletions

pgx-main/README.md Normal file

@@ -0,0 +1,63 @@
## Set up PGx server
### Install Python 3.8 (incl. pandas), Docker, the AWS CLI (`aws`), Java, and samtools
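One quick way to confirm the prerequisites are in place is to check that each command resolves on `$PATH`. The sketch below is illustrative only: `missing_tools` is a hypothetical helper name, and the command names in the example call (`python3`, `docker`, `aws`, `java`, `samtools`) are assumptions about how the tools are installed on your machine.

```bash
# Hypothetical helper: print every command from the argument list that is
# not found on $PATH; return non-zero if at least one is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example call (command names are assumptions, adjust to your install):
#   missing_tools python3 docker aws java samtools
#   python3 -c 'import pandas'   # fails if pandas is not installed
```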
### Clone this and additional repos
```bash
git clone git@bitbucket.org:quantgene/pgx-engine-wrapper.git
git clone git@bitbucket.org:quantgene/pgx-engine.git
git clone https://github.com/SBIMB/StellarPGx.git pgx-main
```
### Prepare main directory
```bash
cp -r pgx-engine-wrapper/* pgx-main/
mv pgx-main/test3.bed pgx-main/resources/cyp2d6/cyp_hg38/test3.bed
mkdir pgx-main/pgx_results
rm pgx-main/data/*
```
### Place FASTA files into the pgx-main directory
The required files are `hg38.fa` and its index `hg38.fa.fai`. For example:
```bash
scp root@139.162.190.87:~/ajit/resources/hg38\* pgx-main/
```
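Before moving on, it can help to confirm that both reference files actually landed in the target directory. The following is a minimal sketch; `check_reference` is a hypothetical helper name and not part of any of the cloned repositories. If only the index is missing, `samtools faidx` can regenerate it from the FASTA file.

```bash
# Hypothetical helper: verify that hg38.fa and its index exist and are
# non-empty in the given directory (defaults to pgx-main).
check_reference() {
  dir="${1:-pgx-main}"
  rc=0
  for f in hg38.fa hg38.fa.fai; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f"
      rc=1
    fi
  done
  return "$rc"
}

# Example call:
#   check_reference pgx-main || echo "re-copy the files, or run: samtools faidx pgx-main/hg38.fa"
```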
### Install Nextflow
```bash
curl -fsSL get.nextflow.io | bash
```
Move the `nextflow` launcher (installed in your current directory) to a directory in your `$PATH`, e.g. `/bin`:
```bash
mv nextflow /bin
```
(The full Nextflow documentation can be found [here](https://www.nextflow.io))
### Docker: StellarPGx
```bash
docker pull twesigomwedavid/stellarpgx-dev:latest
```
### Docker: PGx Engine
```bash
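cd pgx-engine
# Build the image and tag it as "pgx"
docker build -t pgx .
# Create and start a container named "pgx-api", publishing port 5000
docker run -d -p 5000:5000 --name pgx-api pgx
# Stop the container again; it stays available under the same name
docker stop pgx-api
cd ..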
```
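Because the commands above leave a stopped container named `pgx-api` behind, checking that it exists can catch a failed build or run early. This is a minimal sketch, assuming the container name from the `docker run` line; `container_exists` is a hypothetical helper, and the `DOCKER` variable is only there so the Docker binary can be swapped out (for example with a stub during testing).

```bash
# Hypothetical helper: succeed if a container with exactly this name exists,
# whether it is running or stopped (`docker ps -a` lists both).
container_exists() {
  "${DOCKER:-docker}" ps -a --format '{{.Names}}' | grep -Fxq -- "$1"
}

# Example calls:
#   container_exists pgx-api && echo "pgx-api container is present"
#   docker start pgx-api      # starts the stopped container again
```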
## Run Entire PGx Pipeline
First change into the main directory with `cd pgx-main`.
The main script to run is `get_pgx_result.sh`.
The input can be either a text file or standard input; each line should contain an S3 path to a VCF file. The location of the corresponding BAM file is inferred.
All output folders are written to `pgx_results/`. A diplotype/rsID overview is given in `pgx_diplotypes_rsids.tsv`.
The `output.json` for each sample is stored in the same S3 location as the VCF file, under the filename `<sample>.json`.
The entire PGx output for a sample is uploaded to S3, to the same location as the BAM file, under the name `<sample>_pgx_result.tar.gz`.
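
As a sketch of how an input list might be checked before a run, the helper below flags lines that do not look like S3 VCF paths. The name `check_vcf_list`, the `s3://` prefix, the `.vcf`/`.vcf.gz` suffixes, and the file name `samples.txt` are all assumptions for illustration; the exact input form and invocation that `get_pgx_result.sh` accepts should be confirmed against the script itself.

```bash
# Hypothetical helper: read candidate paths from standard input, print the
# ones that do not look like S3 VCF paths, and return non-zero if any are bad.
check_vcf_list() {
  rc=0
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue              # ignore blank lines
    case "$line" in
      s3://*.vcf|s3://*.vcf.gz) ;;          # looks like an S3 VCF path
      *) echo "unexpected line: $line"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Example use (file name and invocation style are assumptions):
#   check_vcf_list < samples.txt && ./get_pgx_result.sh samples.txt
#   check_vcf_list < samples.txt && ./get_pgx_result.sh < samples.txt
```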