§Handle identified peptidoform files
Handling many different formats of identified peptidoform files. Supports these formats:
- mzTab
- Fasta
- Spectrum Sequence List (SSL)
- mzSpecLib (only with feature mzannotate)
And output from the following programs:
- DeepNovo
- PointNovo
- BiatNovo
- PGPointNovo
- InstaNovo
- MaxQuant
- MetaMorpheus
- MSFragger
- NovoB
- Novor
- OPair
- Peaks
- PepNet
- π-HelixNovo
- π-PrimeNovo
- pLink
- PLGS
- PowerNovo
- Proteoscape
- pUniFind
- Sage
§Compilation features
- mzannotate - Adds mzannotate as a dependency, allows mzSpecLib spectra to be used as IdentifiedPeptidoform, and allows other formats to parse annotated spectra
Modules§
- prelude - A subset of the types and traits that are expected to be used the most; importing this is a good starting point for working with the crate
Structs§
- BasicCSVData - The data for individual entries in BasicCSV files.
- BasicCSVFormat - The type to contain the format description for BasicCSV files.
- CVTerm - A CV term
- DeepNovoFamilyData - The data for individual entries in DeepNovoFamily files.
- DeepNovoFamilyFormat - The type to contain the format description for DeepNovoFamily files.
- FastaData - A single parsed line of a fasta file
- IdentifiedPeptidoform - A peptide that is identified by a de novo or database matching program
- IdentifiedPeptidoformIter - An iterator returning parsed identified peptides
- InstaNovoData - The data for individual entries in InstaNovo files.
- InstaNovoFormat - The type to contain the format description for InstaNovo files.
- MSFraggerData - The data for individual entries in MSFragger files.
- MSFraggerFormat - The type to contain the format description for MSFragger files.
- MZTabData - Peptidoform data from a mzTab file
- MaxQuantData - The data for individual entries in MaxQuant files. This can contain data from the database match and the de novo match at the same time when run with MaxNovo. In that case the de novo data will not be shown via the methods of the MetaData trait. If only the de novo data is needed, and not the database data, the easiest way is to detect this case and overwrite the data in place.
- MaxQuantFormat - The type to contain the format description for MaxQuant files.
- MaybePeptidoform - A structure where a peptidoform might be present
- MetaMorpheusData - The data for individual entries in MetaMorpheus files.
- MetaMorpheusFormat - The type to contain the format description for MetaMorpheus files.
- NovoBData - The data for individual entries in NovoB files.
- NovoBFormat - The type to contain the format description for NovoB files.
- NovorData - The data for individual entries in Novor files.
- NovorFormat - The type to contain the format description for Novor files.
- OpairData - The data for individual entries in Opair files.
- OpairFormat - The type to contain the format description for Opair files.
- PLGSData - The data for individual entries in PLGS files.
- PLGSFormat - The type to contain the format description for PLGS files.
- PLinkData - The data for individual entries in PLink files.
- PLinkFormat - The type to contain the format description for PLink files.
- PUniFindData - The data for individual entries in PUniFind files.
- PUniFindFormat - The type to contain the format description for PUniFind files.
- PeaksData - The data for individual entries in Peaks files.
- PeaksFamilyId - The scans identifier for a peaks identification
- PeaksFormat - The type to contain the format description for Peaks files.
- PepNetData - The data for individual entries in PepNet files.
- PepNetFormat - The type to contain the format description for PepNet files.
- PeptidoformPresent - A structure where a peptidoform definitely is present
- PiHelixNovoData - The data for individual entries in PiHelixNovo files.
- PiHelixNovoFormat - The type to contain the format description for PiHelixNovo files.
- PiPrimeNovoData - The data for individual entries in PiPrimeNovo files.
- PiPrimeNovoFormat - The type to contain the format description for PiPrimeNovo files.
- PowerNovoData - The data for individual entries in PowerNovo files.
- PowerNovoFormat - The type to contain the format description for PowerNovo files.
- Protein - A protein definition from mzTab
- ProteoscapeData - The data for individual entries in Proteoscape files.
- ProteoscapeFormat - The type to contain the format description for Proteoscape files.
- SageData - The data for individual entries in Sage files.
- SageFormat - The type to contain the format description for Sage files.
- SpectrumSequenceListData - The data for individual entries in SpectrumSequenceList files.
- SpectrumSequenceListFormat - The type to contain the format description for SpectrumSequenceList files.
Enums§
- BasicCSVVersion - All possible basic CSV versions
- DeepNovoFamilyVersion - All possible DeepNovoFamily versions
- FastaIdentifier - A fasta identifier following the NCBI identifier definition
- FileFormat - A file format that might not be (fully) known
- IdentifiedPeptidoformData - The definition of all special metadata for all types of identified peptides that can be read
- InstaNovoVersion - All possible InstaNovo versions
- KnownFileFormat - A file format that is fully known
- MSFraggerOpenModification - A MSFragger open search modification
- MSFraggerVersion - All possible MSFragger versions
- MaxQuantVersion - All possible MaxQuant versions
- MetaMorpheusMatchKind
- MetaMorpheusVersion - All possible MetaMorpheus versions
- NovoBVersion - All possible NovoB versions
- NovorVersion - All available Novor versions
- OpairMatchKind
- OpairVersion - All possible Opair versions
- PLGSCuration - PLGS curation categories
- PLGSVersion - All possible PLGS versions
- PLinkPeptideType - The different types of peptides a cross-link experiment can result in
- PLinkVersion - All possible pLink versions
- PUniFindVersion - All possible pUniFind versions
- PeaksVersion - All possible peaks versions
- PepNetVersion - All possible PepNet versions
- PiHelixNovoVersion - All possible π-HelixNovo versions
- PiPrimeNovoVersion - All possible π-PrimeNovo versions
- PowerNovoVersion - All possible PowerNovo versions
- ProteoscapeVersion - All available Proteoscape versions
- Reliability - The reliability of a PSM
- SageVersion - All possible Sage versions
- SpectrumId - A spectrum identifier
- SpectrumIds - Multiple spectrum identifiers
- SpectrumSequenceListVersion - All possible SpectrumSequenceList versions
- UsedModel - The model that produced the final prediction for an InstaNovoPlus
Constants§
- AB - Version Ab of PEAKS export
- BASIC - msms.txt
- DB_PEPTIDE - Version DB peptide of PEAKS export
- DB_PROTEIN_PEPTIDE - Version DB protein peptide of PEAKS export protein group, protein id, protein accession, unique, start, end,
- DB_PSM - Version DB psm of PEAKS export
- DEEPNOVO_V0_0_1 - The only known version of DeepNovo
- FRAGPIPE_V22 - v22
- FRAGPIPE_V20_OR_21 - v20 or v21
- INSTANOVOPLUS_V1_1_4 - The only known version of InstaNovoPlus
- INSTANOVO_V1_0_0 - InstaNovo version 1.0.0
- META_MORPHEUS - The only supported format for MetaMorpheus data
- MSMS - msms.txt
- MSMS_SCANS - msmsScans.txt
- NEW_DENOVO - denovo: # id, scanNum, RT, mz(data), z, pepMass(denovo), err(data-denovo), ppm(1e6*err/(mz*z)), score, peptide, aaScore,
- NEW_PSM - PSM: #id, spectraId, scanNum, RT, mz, z, pepMass, err, ppm, score, protein, start, length, origin, peptide, noPTMPeptide, aac, allProteins
- NOVOB_V0_0_1 - The only known version of NovoB
- NOVO_MSMS_SCANS - MaxNovo msmsScans.txt
- OLD_DENOVO - The older supported format for denovo.csv files from Novor
- OLD_PSM - The older supported format for psms.csv files from Novor
- O_PAIR - The only supported format for Opair data
- PEPNET_V1_0 - The only known version of PepNet
- PHILOSOPHER - Philosopher
- PIHELIXNOVO_V1_1 - The only known version of π-HelixNovo
- PIPRIMENOVO_V0_1 - The only known version of π-PrimeNovo
- POINTNOVOFAMILY - The only known version of the PointNovo Family
- POWERNOVO_V1_0_17 - The only known version of PowerNovo
- PUNIFIND_V0_1 - The only version of pUniFind
- SILAC - MaxQuant v2.4.14.0 SILAC evidence.txt
- SSL - General type of SSL files
- V2_3 - The only built in version of pLink export
- V11 - Version 11 of PEAKS export
- V12 - Version 12 of PEAKS export
- V11_FEATURES - Version 11 of PEAKS export
- V13_DIA - Version 13 Dia de novo missing: Delta RT, MS2 correlation, #precursors, gene, database, ion intensity, positional confidence
- V2025B - Version 2025b
- VERSION_0_14 - An older version of a Sage export
- VERSION_3_0 - An older version of a PLGS export
- VERSION_V4_2 - The only supported format for MSFragger data
- X - An older version of a PEAKS export
- XPLUS - Version X+ of PEAKS export (made for build 20 November 2019)
- X_PATCHED - Version X of PEAKS export (made for build 31 January 2019)
Traits§
- IdentifiedPeptidoformSource - The required methods for any source of identified peptides
- IdentifiedPeptidoformVersion - A version of an identified peptide file format
- MetaData - Generalised access to meta data of identified peptidoforms
- PeptidoformAvailability - A trait to mark all options for availability of peptidoforms
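The descriptions of MaybePeptidoform, PeptidoformPresent and PeptidoformAvailability suggest a marker-trait (type-state) pattern, where the type itself records whether a peptidoform is guaranteed to exist. The block below is a minimal, self-contained sketch of that general pattern only. All names in it (Availability, MaybePresent, AlwaysPresent, Identified) are hypothetical and are not taken from this crate; the actual types may be structured differently.

```rust
use std::marker::PhantomData;

/// Hypothetical marker: the peptidoform might be absent.
struct MaybePresent;
/// Hypothetical marker: the peptidoform is guaranteed to be there.
struct AlwaysPresent;

/// Hypothetical trait that marks all availability options.
trait Availability {}
impl Availability for MaybePresent {}
impl Availability for AlwaysPresent {}

/// An identification whose availability guarantee lives in the type parameter.
struct Identified<A: Availability> {
    sequence: Option<String>,
    marker: PhantomData<A>,
}

impl Identified<MaybePresent> {
    fn new(sequence: Option<String>) -> Self {
        Self { sequence, marker: PhantomData }
    }

    /// Callers have to handle the missing case.
    fn sequence(&self) -> Option<&str> {
        self.sequence.as_deref()
    }

    /// Upgrade to the "always present" state if a sequence exists.
    fn into_present(self) -> Option<Identified<AlwaysPresent>> {
        self.sequence.map(|s| Identified {
            sequence: Some(s),
            marker: PhantomData,
        })
    }
}

impl Identified<AlwaysPresent> {
    /// No Option in the signature: the only way to build this state
    /// is through `into_present`, which checked for a sequence.
    fn sequence(&self) -> &str {
        self.sequence
            .as_deref()
            .expect("guaranteed by construction")
    }
}
```

With this design, code that filters identifications once can hand the result to later stages that no longer need to check for a missing peptidoform.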
Functions§
- open_identified_peptidoforms_file - Open the selected path and automatically determine the filetype. It will decompress gzipped files automatically.
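To illustrate what automatic file type detection with gzip handling can involve, here is a small sketch using only the standard library. It is not this crate's implementation and does not call its API: the GuessedFormat enum, the is_gzip and guess_format helpers, and the extension list are all assumptions made for illustration. The only external fact relied on is that gzip streams start with the bytes 0x1f 0x8b.

```rust
use std::path::Path;

/// Hypothetical subset of formats, for illustration only.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum GuessedFormat {
    Fasta,
    MzTab,
    SpectrumSequenceList,
    Unknown,
}

/// Returns true if the bytes start with the gzip magic number (0x1f 0x8b).
fn is_gzip(header: &[u8]) -> bool {
    header.starts_with(&[0x1f, 0x8b])
}

/// Guess the format from the file name, ignoring case and a trailing `.gz`.
/// The extension list is an assumption, not the crate's actual mapping.
fn guess_format(path: &Path) -> GuessedFormat {
    let lower = path
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or_default()
        .to_ascii_lowercase();
    // Look through the compression suffix to find the real extension.
    let stem = lower.strip_suffix(".gz").unwrap_or(lower.as_str());
    match stem.rsplit_once('.').map(|(_, ext)| ext) {
        Some("fasta" | "fas" | "fa") => GuessedFormat::Fasta,
        Some("mztab") => GuessedFormat::MzTab,
        Some("ssl") => GuessedFormat::SpectrumSequenceList,
        _ => GuessedFormat::Unknown,
    }
}
```

A real reader would typically combine both checks: inspect the first bytes to decide whether to wrap the file in a decompressor, then pick a parser from the name or the header line. Formats that share an extension, such as the many CSV and TSV exports listed above, cannot be told apart by name alone and need their column headers inspected.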
Type Aliases§
- BoxedIdentifiedPeptideIter - Convenience type to not have to type out long iterator types
- GeneralIdentifiedPeptidoforms - A general generic identified peptidoform iterator from any source format