Crate mzident

§Handle identified peptidoform files

This crate reads identified peptidoform files in many different formats. It supports these formats:

  • mzTab
  • Fasta
  • Spectrum Sequence List (SSL)
  • mzSpecLib (only with feature mzannotate)

And output from the following programs:

  • DeepNovo
  • PointNovo
  • BiatNovo
  • PGPointNovo
  • InstaNovo
  • MaxQuant
  • MetaMorpheus
  • MSFragger
  • NovoB
  • Novor
  • OPair
  • Peaks
  • PepNet
  • π-HelixNovo
  • π-PrimeNovo
  • pLink
  • PLGS
  • PowerNovo
  • Proteoscape
  • pUniFind
  • Sage

§Compilation features

  • mzannotate - Adds mzannotate as a dependency, allows mzSpecLib spectra to be used as IdentifiedPeptidoform, and allows other formats to parse annotated spectra

Modules§

prelude
A subset of the types and traits that are envisioned to be used the most. Importing this is a good starting point for working with the crate.

Structs§

BasicCSVData
The data for individual entries in BasicCSV files.
BasicCSVFormat
The type to contain the format description for BasicCSV files.
CVTerm
A CV term
DeepNovoFamilyData
The data for individual entries in DeepNovoFamily files.
DeepNovoFamilyFormat
The type to contain the format description for DeepNovoFamily files.
FastaData
A single parsed line of a fasta file
IdentifiedPeptidoform
A peptide that is identified by a de novo or database matching program
IdentifiedPeptidoformIter
An iterator returning parsed identified peptides
InstaNovoData
The data for individual entries in InstaNovo files.
InstaNovoFormat
The type to contain the format description for InstaNovo files.
MSFraggerData
The data for individual entries in MSFragger files.
MSFraggerFormat
The type to contain the format description for MSFragger files.
MZTabData
Peptidoform data from a mzTab file
MaxQuantData
The data for individual entries in MaxQuant files. When MaxQuant is run with MaxNovo, an entry can contain data from the database match and the de novo match at the same time. In that case the de novo data will not be shown via the methods of the MetaData trait. If only the de novo data is needed and not the database data, the easiest way is to detect this case and overwrite the data in place.
MaxQuantFormat
The type to contain the format description for MaxQuant files.
MaybePeptidoform
A structure where a peptidoform might be present
MetaMorpheusData
The data for individual entries in MetaMorpheus files.
MetaMorpheusFormat
The type to contain the format description for MetaMorpheus files.
NovoBData
The data for individual entries in NovoB files.
NovoBFormat
The type to contain the format description for NovoB files.
NovorData
The data for individual entries in Novor files.
NovorFormat
The type to contain the format description for Novor files.
OpairData
The data for individual entries in Opair files.
OpairFormat
The type to contain the format description for Opair files.
PLGSData
The data for individual entries in PLGS files.
PLGSFormat
The type to contain the format description for PLGS files.
PLinkData
The data for individual entries in PLink files.
PLinkFormat
The type to contain the format description for PLink files.
PUniFindData
The data for individual entries in PUniFind files.
PUniFindFormat
The type to contain the format description for PUniFind files.
PeaksData
The data for individual entries in Peaks files.
PeaksFamilyId
The scans identifier for a peaks identification
PeaksFormat
The type to contain the format description for Peaks files.
PepNetData
The data for individual entries in PepNet files.
PepNetFormat
The type to contain the format description for PepNet files.
PeptidoformPresent
A structure where a peptidoform definitely is present
PiHelixNovoData
The data for individual entries in PiHelixNovo files.
PiHelixNovoFormat
The type to contain the format description for PiHelixNovo files.
PiPrimeNovoData
The data for individual entries in PiPrimeNovo files.
PiPrimeNovoFormat
The type to contain the format description for PiPrimeNovo files.
PowerNovoData
The data for individual entries in PowerNovo files.
PowerNovoFormat
The type to contain the format description for PowerNovo files.
Protein
A protein definition from mzTab
ProteoscapeData
The data for individual entries in Proteoscape files.
ProteoscapeFormat
The type to contain the format description for Proteoscape files.
SageData
The data for individual entries in Sage files.
SageFormat
The type to contain the format description for Sage files.
SpectrumSequenceListData
The data for individual entries in SpectrumSequenceList files.
SpectrumSequenceListFormat
The type to contain the format description for SpectrumSequenceList files.

Enums§

BasicCSVVersion
All possible basic CSV versions
DeepNovoFamilyVersion
All possible DeepNovoFamily versions
FastaIdentifier
A fasta identifier following the NCBI identifier definition
FileFormat
A file format that might not be (fully) known
IdentifiedPeptidoformData
The definition of all special metadata for all types of identified peptides that can be read
InstaNovoVersion
All possible InstaNovo versions
KnownFileFormat
A file format that is fully known
MSFraggerOpenModification
A MSFragger open search modification
MSFraggerVersion
All possible MSFragger versions
MaxQuantVersion
All possible MaxQuant versions
MetaMorpheusMatchKind
MetaMorpheusVersion
All possible MetaMorpheus versions
NovoBVersion
All possible NovoB versions
NovorVersion
All available Novor versions
OpairMatchKind
OpairVersion
All possible OPair versions
PLGSCuration
PLGS curation categories
PLGSVersion
All possible PLGS versions
PLinkPeptideType
The different types of peptides a cross-link experiment can result in
PLinkVersion
All possible pLink versions
PUniFindVersion
All possible pUniFind versions
PeaksVersion
All possible peaks versions
PepNetVersion
All possible PepNet versions
PiHelixNovoVersion
All possible π-HelixNovo versions
PiPrimeNovoVersion
All possible π-PrimeNovo versions
PowerNovoVersion
All possible PowerNovo versions
ProteoscapeVersion
All available Proteoscape versions
Reliability
The reliability of a PSM
SageVersion
All possible Sage versions
SpectrumId
A spectrum identifier
SpectrumIds
Multiple spectrum identifiers
SpectrumSequenceListVersion
All possible SpectrumSequenceList versions
UsedModel
The model that produced the final prediction for an InstaNovoPlus entry

Constants§

AB
Version Ab of PEAKS export
BASIC
msms.txt
DB_PEPTIDE
Version DB peptide of PEAKS export
DB_PROTEIN_PEPTIDE
Version DB protein peptide of PEAKS export: protein group, protein id, protein accession, unique, start, end,
DB_PSM
Version DB psm of PEAKS export
DEEPNOVO_V0_0_1
The only known version of DeepNovo
FRAGPIPE_V22
v22
FRAGPIPE_V20_OR_21
v20 or v21
INSTANOVOPLUS_V1_1_4
The only known version of InstaNovoPlus
INSTANOVO_V1_0_0
InstaNovo version 1.0.0
META_MORPHEUS
The only supported format for MetaMorpheus data
MSMS
msms.txt
MSMS_SCANS
msmsScans.txt
NEW_DENOVO
denovo: # id, scanNum, RT, mz(data), z, pepMass(denovo), err(data-denovo), ppm(1e6*err/(mz*z)), score, peptide, aaScore,
NEW_PSM
PSM: #id, spectraId, scanNum, RT, mz, z, pepMass, err, ppm, score, protein, start, length, origin, peptide, noPTMPeptide, aac, allProteins
NOVOB_V0_0_1
The only known version of NovoB
NOVO_MSMS_SCANS
MaxNovo msmsScans.txt
OLD_DENOVO
The older supported format for denovo.csv files from Novor
OLD_PSM
The older supported format for psms.csv files from Novor
O_PAIR
The only supported format for Opair data
PEPNET_V1_0
The only known version of PepNet
PHILOSOPHER
Philosopher
PIHELIXNOVO_V1_1
The only known version of π-HelixNovo
PIPRIMENOVO_V0_1
The only known version of π-PrimeNovo
POINTNOVOFAMILY
The only known version of the PointNovo Family
POWERNOVO_V1_0_17
The only known version of PowerNovo
PUNIFIND_V0_1
The only version of pUniFind
SILAC
MaxQuant v2.4.14.0 SILAC evidence.txt
SSL
General type of SSL files
V2_3
The only built in version of pLink export
V11
Version 11 of PEAKS export
V12
Version 12 of PEAKS export
V11_FEATURES
Version 11 of PEAKS export
V13_DIA
Version 13 Dia de novo missing: Delta RT, MS2 correlation, #precursors, gene, database, ion intensity, positional confidence
V2025B
Version 2025b
VERSION_0_14
An older version of a Sage export
VERSION_3_0
An older version of a PLGS export
VERSION_V4_2
The only supported format for MSFragger data
X
An older version of a PEAKS export
XPLUS
Version X+ of PEAKS export (made for build 20 November 2019)
X_PATCHED
Version X of PEAKS export (made for build 31 January 2019)

Traits§

IdentifiedPeptidoformSource
The required methods for any source of identified peptides
IdentifiedPeptidoformVersion
A version of an identified peptidoform file format
MetaData
Generalised access to meta data of identified peptidoforms
PeptidoformAvailability
A trait to mark all options for availability of peptidoforms
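
One way to picture the availability markers (MaybePeptidoform, PeptidoformPresent, and the PeptidoformAvailability trait that ties them together) is as a type-state pattern. The sketch below is written under that assumption and does not reproduce the crate's definitions: `Availability`, `Maybe`, `Present`, and `Identification` are hypothetical names, and `String` stands in for a real peptidoform type.

```rust
use std::marker::PhantomData;

/// Hypothetical marker trait, loosely modelled on the idea of `PeptidoformAvailability`.
trait Availability {}

/// Hypothetical marker: a sequence may or may not be present.
struct Maybe;
/// Hypothetical marker: a sequence is guaranteed to be present.
struct Present;

impl Availability for Maybe {}
impl Availability for Present {}

/// Hypothetical identification record; `String` stands in for a peptidoform.
struct Identification<A: Availability> {
    sequence: Option<String>,
    marker: PhantomData<A>,
}

impl Identification<Maybe> {
    fn new(sequence: Option<String>) -> Self {
        Self { sequence, marker: PhantomData }
    }

    /// The sequence, if this entry has one.
    fn sequence(&self) -> Option<&str> {
        self.sequence.as_deref()
    }

    /// Move to the `Present` state, or return `None` if the sequence is missing.
    fn into_present(self) -> Option<Identification<Present>> {
        self.sequence.map(|s| Identification {
            sequence: Some(s),
            marker: PhantomData,
        })
    }
}

impl Identification<Present> {
    /// No `Option` in the return type: the marker guarantees a sequence exists.
    fn sequence(&self) -> &str {
        self.sequence
            .as_deref()
            .expect("`Present` is only constructed with a sequence")
    }
}
```

With this shape, code that has already filtered out entries without a sequence can call `sequence()` without handling `None` again, because the check is carried in the type.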

Functions§

open_identified_peptidoforms_file
Open the selected path and automatically determine the filetype. It will decompress gzipped files automatically.
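
This page does not show how the file type is determined, so the following is only a sketch of one plausible first step: guessing from the file name while looking through a trailing `.gz`. `GuessedFormat` and `guess_format` are hypothetical names that are not part of this crate, and the extension list is an assumption. A real implementation would still need to decompress the file and, for delimited text, inspect the header to find the producing program.

```rust
use std::path::Path;

/// Hypothetical stand-in for a detected format; not the crate's `KnownFileFormat`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GuessedFormat {
    MzTab,
    Fasta,
    SpectrumSequenceList,
    /// Delimited text whose producing program must be found from its header row.
    DelimitedText,
    Unknown,
}

/// Guess the format from the file name and report whether a `.gz` layer was seen.
fn guess_format(path: &Path) -> (GuessedFormat, bool) {
    let name = path
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or_default()
        .to_ascii_lowercase();
    // Strip one trailing ".gz" so "psms.tsv.gz" is treated like "psms.tsv".
    let (stem, gzipped) = match name.strip_suffix(".gz") {
        Some(stem) => (stem, true),
        None => (name.as_str(), false),
    };
    // Assumed extension list, chosen for illustration only.
    let format = match stem.rsplit_once('.').map(|(_, extension)| extension) {
        Some("mztab") => GuessedFormat::MzTab,
        Some("fasta" | "fas" | "fa") => GuessedFormat::Fasta,
        Some("ssl") => GuessedFormat::SpectrumSequenceList,
        Some("csv" | "tsv" | "txt") => GuessedFormat::DelimitedText,
        _ => GuessedFormat::Unknown,
    };
    (format, gzipped)
}
```

For example, `guess_format(Path::new("psms.tsv.gz"))` yields `(GuessedFormat::DelimitedText, true)`, which tells the caller both to decompress the file and to look at the header before picking a parser.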

Type Aliases§

BoxedIdentifiedPeptideIter
Convenience type to not have to type out long iterator types
GeneralIdentifiedPeptidoforms
A general generic identified peptidoform iterator from any source format
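
The purpose of such iterator aliases can be illustrated with a small standalone sketch. It assumes nothing about the crate's actual alias definitions: `Record`, `BoxedRecordIter`, `parse_lines`, and `from_owned` are hypothetical, and errors are plain `String`s to keep the example short.

```rust
/// Hypothetical record type standing in for an identified peptidoform.
#[derive(Debug, PartialEq)]
struct Record {
    line: usize,
    sequence: String,
}

/// Alias that hides the concrete iterator type behind a boxed trait object.
type BoxedRecordIter<'a> = Box<dyn Iterator<Item = Result<Record, String>> + 'a>;

/// Source 1: one sequence per line of text; blank lines become errors.
fn parse_lines(text: &str) -> BoxedRecordIter<'_> {
    Box::new(text.lines().enumerate().map(|(index, raw)| {
        let sequence = raw.trim();
        if sequence.is_empty() {
            Err(format!("line {} is empty", index + 1))
        } else {
            Ok(Record {
                line: index + 1,
                sequence: sequence.to_string(),
            })
        }
    }))
}

/// Source 2: a different concrete iterator type, returned under the same alias.
fn from_owned(sequences: Vec<String>) -> BoxedRecordIter<'static> {
    Box::new(sequences.into_iter().enumerate().map(
        |(index, sequence)| -> Result<Record, String> {
            Ok(Record {
                line: index + 1,
                sequence,
            })
        },
    ))
}
```

Both functions return the same named type even though their underlying iterator chains differ, so callers can store or pass either one without spelling out the full adapter type.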