This repository contains Python scripts to automate the preparation and submission of IDR ensembles generated with IDPforge to the PED server.
It provides tools to generate JSON files with protein descriptions and chain construct information from PDB files, and to post these data to the PED database.
The workflow includes:
- Generating description and construct JSON files (description.py + construct.py + json_generation.py).
- Submitting data to the PED database (Job-description-PED.py and construct-post-PED.py). These scripts use the generated JSON files to create drafts and then upload constructs to PED. Constructs can only be uploaded once the drafts have been created.
This setup allows easy batch processing of multiple PDB files and ensures reproducible, consistent submissions to PED.
description.py provides helper functions to generate JSON description files for PDB entries in the PED database.
It contains three main functions:
- get_uniprot_name(uniprot_id): Queries the UniProt API to retrieve the full protein name for a given UniProt ID.
- get_disprot_id(uniprot_id): Queries the DisProt API to get the corresponding DisProt ID for a given UniProt ID.
- create_description_json(uniprot_id): Generates a JSON dictionary template for a PDB file description. The returned dictionary includes fields such as:
  - title
  - authors
  - publication_status
  - entry_cross_reference
  - experimental_cross_reference
  - experimental_procedure
  - structural_ensembles_calculation
  - ontology_terms
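
The name lookup and the description template can be sketched as follows. This is a minimal illustration, not the code in description.py: the helper names (extract_full_name, fetch_uniprot_name, description_template) are hypothetical, the lookup assumes the public UniProt REST endpoint at rest.uniprot.org, and the placeholder values in the template are assumptions rather than the schema PED requires. Only the top-level field names come from the list above.

```python
import json
import urllib.request
from typing import Optional

# Public UniProtKB REST endpoint (assumed here; the real script may use another URL).
UNIPROT_URL = "https://rest.uniprot.org/uniprotkb/{acc}.json"


def extract_full_name(entry: dict) -> Optional[str]:
    """Pull the full protein name out of a UniProtKB JSON entry."""
    desc = entry.get("proteinDescription", {})
    recommended = desc.get("recommendedName")
    if recommended:
        return recommended["fullName"]["value"]
    # Unreviewed entries usually carry submitted names instead.
    submitted = desc.get("submissionNames", [])
    if submitted:
        return submitted[0]["fullName"]["value"]
    return None


def fetch_uniprot_name(uniprot_id: str, timeout: float = 10.0) -> Optional[str]:
    """Download one entry and return its full name (needs network access)."""
    url = UNIPROT_URL.format(acc=uniprot_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_full_name(json.load(resp))


def description_template(uniprot_id, protein_name, disprot_id=None) -> dict:
    """Build a description dictionary with the fields listed above.

    The value shapes are placeholders chosen for this sketch.
    """
    cross_refs = [{"db": "uniprot", "id": uniprot_id}]
    if disprot_id:
        cross_refs.append({"db": "disprot", "id": disprot_id})
    return {
        "title": protein_name or uniprot_id,
        "authors": [],
        "publication_status": "",
        "entry_cross_reference": cross_refs,
        "experimental_cross_reference": [],
        "experimental_procedure": "",
        "structural_ensembles_calculation": "",
        "ontology_terms": [],
    }
```

Keeping the JSON parsing (extract_full_name) separate from the network call makes the lookup easy to check offline with a saved UniProt response.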
 
 
construct.py provides helper functions to generate JSON construct files for drafts that already exist in the PED database.
It contains two main functions:
- get_chain_sequences_and_last_residues(pdb_path): Extracts the amino acid sequence and last residue number for each chain in a PDB file.
- create_construct_json(chain_info, uniprot_id, protein_name): Generates a JSON dictionary template for a PDB file construct. The returned dictionary includes fields such as:
  - chain_name
  - fragments:
    - description (protein name from UniProt)
    - source_sequence (sequence)
    - start_position (always 1)
    - end_position
    - uniprot_acc (UniProt ID)
    - definition_type ("Uniprot ACC")
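
A rough sketch of both steps is shown below, using only the standard library and the fixed-column layout of PDB ATOM records. The function names (parse_chain_info, build_construct) are hypothetical stand-ins, not the functions in construct.py. Reading only the first model of a multi-model file and nesting fragments as a one-element list are assumptions of this sketch.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def parse_chain_info(pdb_lines):
    """Return {chain_id: {"sequence": str, "last_residue": int}} from ATOM records."""
    chains = {}
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # assumption: every model repeats the same chains, so read one
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20: residue name
        chain_id = line[21:22].strip()   # column 22: chain identifier
        res_seq = int(line[22:26])       # columns 23-26: residue number
        key = (chain_id, res_seq, line[26:27])  # column 27: insertion code
        if key in seen:
            continue  # another atom of a residue already counted
        seen.add(key)
        info = chains.setdefault(chain_id, {"sequence": "", "last_residue": res_seq})
        info["sequence"] += THREE_TO_ONE.get(res_name, "X")  # X = unknown residue
        info["last_residue"] = res_seq
    return chains


def build_construct(chain_id, info, uniprot_id, protein_name):
    """Assemble one construct dictionary with the fields listed above."""
    return {
        "chain_name": chain_id,
        "fragments": [{
            "description": protein_name,
            "source_sequence": info["sequence"],
            "start_position": 1,
            "end_position": info["last_residue"],
            "uniprot_acc": uniprot_id,
            "definition_type": "Uniprot ACC",
        }],
    }
```

To use it on a file, pass an open handle, for example `with open(pdb_path) as fh: chains = parse_chain_info(fh)`, and then call build_construct once per chain.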
 
 
 
json_generation.py is the main script for generating JSON files from PDB structures. For each PDB file, it creates:
- Description JSON: Metadata about the protein (title, authors, publication info, cross-references) using description.py. These files are used later in Job-description-PED.py to create drafts in the PED database.
- Construct JSON: Chain and fragment information (sequence, start/end positions, UniProt ID) using construct.py. These files are used later in construct-post-PED.py to post constructs to the PED database.
It reads from and writes to the following folders:
- pdb_files/: Contains the input PDB files.
- jsonFiles/: Contains the description JSON files for the PDBs in pdb_files/. Each file has the same base name as its PDB file, with a .json extension.
- const_files/: Contains the construct JSON files for the PDBs in pdb_files/. Each file has the same base name as its PDB file, with the suffix _const.json.
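
The naming convention can be expressed in a few lines of pathlib. This is only an illustration of the mapping described above. The helper names (output_paths, planned_outputs) are hypothetical and do not come from json_generation.py.

```python
from pathlib import Path

PDB_DIR = Path("pdb_files")
DESCRIPTION_DIR = Path("jsonFiles")
CONSTRUCT_DIR = Path("const_files")


def output_paths(pdb_path: Path):
    """Map one PDB file to its description and construct JSON paths."""
    stem = pdb_path.stem  # file name without the .pdb extension
    return DESCRIPTION_DIR / f"{stem}.json", CONSTRUCT_DIR / f"{stem}_const.json"


def planned_outputs(pdb_dir: Path = PDB_DIR):
    """List the JSON files expected for every .pdb file in pdb_dir."""
    return {p.name: output_paths(p) for p in sorted(pdb_dir.glob("*.pdb"))}
```

For example, a file named pdb_files/ens_01.pdb maps to jsonFiles/ens_01.json and const_files/ens_01_const.json.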
Job-description-PED.py automates the creation of drafts in the PED database from local PDB files and links them to their corresponding description information.
- Iterates through the pdb_files folder looking for .pdb files.
- For each PDB file:
  - Finds the corresponding description JSON file in the jsonFiles folder.
  - Creates a new draft in the PED server.
  - Updates the draft with the description.
  - Uploads the PDB file to create an ensemble job.
- Maintains a tracking log (job_tracking_log.csv) containing:
  - Processed filename
  - draft_id and job_id assigned by PED
  - Job status
  - PDB file size and start time
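
One way to keep such a log is to append a row per processed file with the csv module, as sketched below. The column names in LOG_FIELDS and the log_job helper are assumptions made for this example. The actual header written by Job-description-PED.py may differ, and the PED calls that produce draft_id and job_id are not shown.

```python
import csv
from datetime import datetime
from pathlib import Path

# Hypothetical column names covering the items listed above.
LOG_FIELDS = ["filename", "draft_id", "job_id", "status", "size_bytes", "start_time"]


def log_job(log_path, pdb_path, draft_id, job_id, status):
    """Append one row to the tracking log, writing the header on first use."""
    log_path = Path(log_path)
    pdb_path = Path(pdb_path)
    is_new = not log_path.exists()
    with log_path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "filename": pdb_path.name,
            "draft_id": draft_id,
            "job_id": job_id,
            "status": status,
            "size_bytes": pdb_path.stat().st_size,
            "start_time": datetime.now().isoformat(timespec="seconds"),
        })
```

Appending rather than rewriting the file means an interrupted batch keeps the rows for the drafts that were already created.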
 
The script expects the following folders:
- pdb_files/: Contains the input PDB files.
- jsonFiles/: Contains the description .json files (same base name as the PDB file).
construct-post-PED.py uploads construct information to existing drafts in the PED database.
After Job-description-PED.py has created the drafts and uploaded the description JSON files, this script will:
- Read the job_tracking_log.csv file to get all previously created drafts and their corresponding PDB filenames.
- For each PDB file, look for a corresponding construct JSON file in the const_files/ folder. Construct JSON filenames must match the PDB filename with the suffix _const.json.
- Post the construct JSON to the PED API.
- Report success or errors for each submission.
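
The steps above could look roughly like the sketch below. The log column names ("filename", "draft_id") and both helper names are assumptions. The request to PED is left as a caller-supplied function, post_construct, because the endpoint and authentication details are not described here.

```python
import csv
import json
from pathlib import Path


def pending_constructs(log_path, const_dir, filename_col="filename", draft_col="draft_id"):
    """Yield (draft_id, payload) for every logged PDB that has a construct file."""
    const_dir = Path(const_dir)
    with open(log_path, newline="") as handle:
        for row in csv.DictReader(handle):
            const_file = const_dir / f"{Path(row[filename_col]).stem}_const.json"
            if not const_file.exists():
                print(f"skipping {row[filename_col]}: {const_file.name} not found")
                continue
            yield row[draft_col], json.loads(const_file.read_text())


def post_all(log_path, const_dir, post_construct):
    """Call post_construct(draft_id, payload) per draft and collect the outcomes."""
    results = {}
    for draft_id, payload in pending_constructs(log_path, const_dir):
        try:
            post_construct(draft_id, payload)
            results[draft_id] = "ok"
        except Exception as exc:  # report the failure and move on to the next draft
            results[draft_id] = f"error: {exc}"
    return results
```

Passing the posting function in as an argument lets the matching logic be tried with a stub before any real submission is made.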
The script expects the following inputs:
- const_files/: Contains the construct JSON files (same base name as the PDB file, followed by "_const.json").
- job_tracking_log.csv: Created automatically by Job-description-PED.py.
- Place all PDB files in pdb_files/.
- Run the JSON generation script:

      python json_generation.py

- Run Job-description-PED.py:

      python Job-description-PED.py

- Once drafts are created, run construct-post-PED.py:

      python construct-post-PED.py