Module jaspar

Parser implementation for matrices in JASPAR (raw) format.

The JASPAR database stores manually curated DNA-binding sites as count matrices.

JASPAR files contain a FASTA-like header line for each record, followed by one line per symbol storing the tab-separated counts at each position. The “raw” format simply stores 4 lines per record, holding the counts for the letters A, C, G and T, in that order:

>MA1104.2 GATA6
22320 20858 35360  5912 4535  2560  5044 76686  1507  1096 13149 18911 22172
16229 14161 13347 11831 62936 1439  1393   815   852 75930  3228 19054 17969
13432 11894 10394  7066 6459   580   615   819   456   712  1810 18153 11605
27463 32531 20343 54635 5514 74865 72392  1124 76629  1706 61257 23326 27698
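
To make the layout above concrete, here is a minimal standalone sketch of how such a record could be parsed by hand. It is not this module's implementation: `RawRecord` and `parse_raw` are hypothetical names introduced only for illustration, and the sketch assumes that counts are non-negative integers separated by any whitespace and that the four rows appear in A, C, G, T order.

```rust
/// Hypothetical record type for this sketch (not the module's `Record`).
#[derive(Debug, PartialEq)]
struct RawRecord {
    /// Identifier from the header line, e.g. `MA1104.2`.
    id: String,
    /// Optional name following the identifier, e.g. `GATA6`.
    name: Option<String>,
    /// `counts[i]` holds one count per position for symbol `i` (A, C, G, T).
    counts: [Vec<u32>; 4],
}

/// Parse every record found in `text`, assuming the "raw" layout:
/// one `>` header line followed by exactly four rows of counts.
fn parse_raw(text: &str) -> Result<Vec<RawRecord>, String> {
    let mut records = Vec::new();
    // Blank lines carry no information in this sketch, so skip them.
    let mut lines = text.lines().map(str::trim).filter(|l| !l.is_empty());

    while let Some(line) = lines.next() {
        let header = line
            .strip_prefix('>')
            .ok_or_else(|| format!("expected a header line, found {line:?}"))?;
        // Split the header into an identifier and an optional name.
        let (id, name) = match header.split_once(char::is_whitespace) {
            Some((id, rest)) => (id.to_string(), Some(rest.trim().to_string())),
            None => (header.to_string(), None),
        };
        if id.is_empty() {
            return Err("header line has no identifier".to_string());
        }

        // Read the four count rows, one per symbol.
        let mut counts: [Vec<u32>; 4] = Default::default();
        for row in counts.iter_mut() {
            let data = lines
                .next()
                .ok_or_else(|| format!("record {id} is truncated"))?;
            *row = data
                .split_whitespace()
                .map(|f| f.parse::<u32>().map_err(|e| format!("bad count {f:?}: {e}")))
                .collect::<Result<Vec<u32>, String>>()?;
        }

        // Every symbol must have a count at every position.
        if counts.iter().any(|row| row.len() != counts[0].len()) {
            return Err(format!("record {id} has rows of unequal length"));
        }
        records.push(RawRecord { id, name, counts });
    }
    Ok(records)
}
```

Calling `parse_raw` on the GATA6 example would yield a single record whose four rows each hold 13 counts. The sketch collects everything into a `Vec` for simplicity; an iterative reader would instead yield one record at a time.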

Structs

Reader
An iterative reader for the JASPAR format.
Record
A JASPAR (raw) record.

Functions

read
Read the records from a file in JASPAR format.