nameme
na·me·me | \na-me-me\ 1. A simple utility to find the real filetype of a file based on its magic number and, optionally, rename it.
Usage
nameme is a CLI program that, given a path, prints the type of each file found under that path, based on the magic number of the content itself. For example, in a folder whose files all carry a .jpg extension, invoking nameme . would print:
$ nameme .
Format Results Erroneous
BMP 1 1
GIF 1 1
JPG 7 0
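The summary above can be approximated in a few lines of Python. This is a minimal sketch, not nameme's actual implementation: the `SIGNATURES` table and the `detect_format` and `summarize` functions are hypothetical names introduced for illustration, and only three well-known signatures are included.

```python
from collections import Counter
from pathlib import Path

# Hypothetical, deliberately tiny table: magic prefix -> (format, valid extensions).
SIGNATURES = {
    b"\xff\xd8\xff": ("JPG", {".jpg", ".jpeg"}),
    b"GIF87a": ("GIF", {".gif"}),
    b"GIF89a": ("GIF", {".gif"}),
    b"BM": ("BMP", {".bmp"}),
}


def detect_format(header: bytes):
    """Return (format, extensions) for the first matching signature, or None."""
    for magic, info in SIGNATURES.items():
        if header.startswith(magic):
            return info
    return None


def summarize(root):
    """Map each detected format to (files found, files with a wrong extension)."""
    results, erroneous = Counter(), Counter()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            info = detect_format(fh.read(16))  # the signatures above fit in 16 bytes
        if info is None:
            continue  # unknown content is skipped in this sketch
        fmt, extensions = info
        results[fmt] += 1
        if path.suffix.lower() not in extensions:
            erroneous[fmt] += 1
    return {fmt: (results[fmt], erroneous[fmt]) for fmt in sorted(results)}
```

Calling `summarize(".")` returns a dictionary keyed by format, which mirrors the Format, Results, and Erroneous columns.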
To see which files are misnamed, you can use nameme --verbose:
$ nameme --verbose .
Format Results Erroneous
BMP 1 1
./Cats/1.jpg
GIF 1 1
./Dogs/3.jpg
JPG 7 0
./Cats/3.jpg
./Cats/5.jpg
./Cats/2.jpg
./Cats/4.jpg
./Dogs/1.jpg
./Dogs/2.jpg
./Dogs/4.jpg
The other use of nameme is to rename files automatically according to their magic number[^1]:
$ tree .
.
├── Cats
│   ├── 1.jpg
│   ├── 2.jpg
│   ├── 3.jpg
│   ├── 4.jpg
│   └── 5.jpg
└── Dogs
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.jpg
    └── 4.jpg
$ nameme --rename --auto .
$ tree .
.
├── Cats
│   ├── 1.bmp
│   ├── 2.jpg
│   ├── 3.jpg
│   ├── 4.jpg
│   └── 5.jpg
└── Dogs
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.gif
    └── 4.jpg
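The renaming step amounts to swapping the extension for the one implied by the detected format. The sketch below illustrates the idea under stated assumptions: `plan_rename` and `apply_rename` are hypothetical helpers, and refusing to overwrite an existing file is a design choice of this example, not documented nameme behaviour.

```python
from pathlib import Path


def plan_rename(path: Path, detected_extension: str):
    """Return the corrected path, or None if the extension already matches."""
    if path.suffix.lower() == detected_extension:
        return None
    return path.with_suffix(detected_extension)


def apply_rename(path: Path, target: Path) -> bool:
    """Rename path to target; return False instead of overwriting a file."""
    if target.exists():
        return False  # assumption of this sketch: never clobber existing files
    path.rename(target)
    return True
```

For instance, `plan_rename(Path("Cats/1.jpg"), ".bmp")` yields `Path("Cats/1.bmp")`, matching the first rename in the tree above.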
If you prefer to be asked whether a given file should be renamed, drop the --auto flag from the previous invocation and nameme will ask for confirmation before each rename:
$ nameme --rename .
rename ./Cats/1.jpg -> ./Cats/1.bmp? [Y/n] y
rename ./Dogs/3.jpg -> ./Dogs/3.gif? [Y/n] y
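A `[Y/n]` prompt conventionally treats an empty answer as "yes", since the capital letter marks the default. Whether nameme follows that convention is an assumption here. The following sketch shows one way such a prompt could be handled, with `parse_answer` and `confirm` as hypothetical names.

```python
def parse_answer(answer: str, default: bool = True) -> bool:
    """Interpret a reply to a [Y/n] prompt; an empty reply means the default."""
    answer = answer.strip().lower()
    if not answer:
        return default
    return answer in ("y", "yes")


def confirm(question: str, ask=input) -> bool:
    """Ask a yes/no question; `ask` is injectable so the prompt can be tested."""
    return parse_answer(ask(f"{question} [Y/n] "))
```

Passing a custom `ask` callable keeps the interactive part separate from the parsing logic, which makes the behaviour easy to check without a terminal.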
Todo
Items marked as complete in this list can still be improved.
- Each header pattern may correspond to multiple extensions. A user should have the possibility to choose the desired extension among those available when renaming a file.
- A user should be able to decide whether the header is matched greedily (that is, the longest match is taken, as it is now) or lazily (that is, the shortest match is taken).
- When invoked on a large directory, nameme can get pretty slow. Figure out what the problem is.
- file is able to figure out filetypes of some text files (e.g. JSON). Figure out whether it makes sense to implement something similar here, and how.
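The difference between greedy and lazy matching in the second item can be made concrete with a small sketch. It assumes a hypothetical `match_signature` helper and a toy table in which one signature is a prefix of another; it is not how nameme stores or matches its patterns.

```python
def match_signature(header: bytes, signatures: dict, greedy: bool = True):
    """Return the label of the longest (greedy) or shortest (lazy) matching prefix."""
    candidates = [magic for magic in signatures if header.startswith(magic)]
    if not candidates:
        return None
    chosen = max(candidates, key=len) if greedy else min(candidates, key=len)
    return signatures[chosen]


# Toy table: the generic JPEG prefix is itself a prefix of the JFIF variant.
TOY_SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"\xff\xd8\xff\xe0": "JPEG (JFIF)",
}

header = b"\xff\xd8\xff\xe0\x00\x10JFIF"
print(match_signature(header, TOY_SIGNATURES, greedy=True))   # JPEG (JFIF)
print(match_signature(header, TOY_SIGNATURES, greedy=False))  # JPEG
```

Greedy matching gives the most specific label, while lazy matching falls back to the most general one.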
Credits
This library and application are based on GCK's File Signatures Table, distributed under MIT license.

[^1]: A single header pattern might correspond to multiple possible extensions. Ideally, the user should be able to choose which of the available extensions to use, but this can get tedious really fast when many files are involved.