nameme 0.2.3

A CLI to find the type of files based on their magic number and, optionally, rename them.

nameme

na·me·me | \na-me-me\ 1. A simple utility to find the real filetype of a file based on its magic number and, optionally, rename it.

Usage

nameme is a CLI program that, given a path, prints the type of the contents at that path, based on the magic number of the content itself. For example, in a folder such as the following

.
├── Cats
│   ├── 1.jpg  	(* <- Actually a BMP image *)
│   ├── 2.jpg
│   ├── 3.jpg
│   ├── 4.jpg
│   └── 5.jpg
└── Dogs
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.jpg 	(* <- Actually a GIF *)
    └── 4.jpg

invoking nameme . would print

$ nameme . 
Format	 	Results	Erroneous
BMP		1		1
GIF		1		1
JPG		7		0
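To make the idea of a magic number concrete, here is a minimal Python sketch of prefix-based detection. It is not nameme's implementation: the `SIGNATURES` table and the `detect_format` and `detect_file` helpers are hypothetical names, and only the well-known signatures of the three formats in the example are included.

```python
from typing import Optional

# Illustrative table only; the signature table nameme ships is not shown here.
SIGNATURES = {
    b"BM": "BMP",            # Windows bitmap
    b"GIF87a": "GIF",        # GIF, 1987 revision
    b"GIF89a": "GIF",        # GIF, 1989 revision
    b"\xff\xd8\xff": "JPG",  # JPEG start-of-image marker
}


def detect_format(header: bytes) -> Optional[str]:
    """Return the format whose signature is a prefix of `header`, or None."""
    for magic, fmt in SIGNATURES.items():
        if header.startswith(magic):
            return fmt
    return None


def detect_file(path: str) -> Optional[str]:
    """Read only the first bytes of the file; the extension is never consulted."""
    with open(path, "rb") as f:
        return detect_format(f.read(8))
```

Under these assumptions, `detect_file("./Cats/1.jpg")` would report `BMP` for the misnamed file in the example, because only the leading bytes are inspected.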

To see which files are misnamed, you can use nameme --verbose:

$ nameme --verbose . 
Format		Results	Erroneous
BMP		1		1
	./Cats/1.jpg
GIF		1		1
	./Dogs/3.jpg
JPG		7		0
	./Cats/3.jpg
	./Cats/5.jpg
	./Cats/2.jpg
	./Cats/4.jpg
	./Dogs/1.jpg
	./Dogs/2.jpg
	./Dogs/4.jpg
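The summary table can be thought of as a per-format tally. The Python sketch below groups already-detected files by format and counts those whose extension disagrees with the detected type. The names `summarize` and `format_table` are hypothetical, and the rule "expected extension is the lowercased format name" is an assumption made for illustration, not nameme's documented behaviour.

```python
from pathlib import PurePosixPath


def summarize(detected):
    """Group (path, format) pairs by format and count extension mismatches."""
    summary = {}
    for path, fmt in detected:
        entry = summary.setdefault(fmt, {"paths": [], "erroneous": 0})
        entry["paths"].append(path)
        # Assumption: a file is "erroneous" when its suffix is not ".<format>".
        if PurePosixPath(path).suffix.lower() != "." + fmt.lower():
            entry["erroneous"] += 1
    return summary


def format_table(summary, verbose=False):
    """Render the tally as tab-separated lines, optionally listing each path."""
    lines = ["Format\tResults\tErroneous"]
    for fmt in sorted(summary):
        entry = summary[fmt]
        lines.append(f"{fmt}\t{len(entry['paths'])}\t{entry['erroneous']}")
        if verbose:
            lines.extend("\t" + p for p in entry["paths"])
    return "\n".join(lines)
```

Passing `verbose=True` to `format_table` mirrors the layout above: each format row is followed by the paths detected as that format.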

nameme can also rename files automatically according to their magic number[^1]:

$ tree .
.
├── Cats
│   ├── 1.jpg
│   ├── 2.jpg
│   ├── 3.jpg
│   ├── 4.jpg
│   └── 5.jpg
└── Dogs
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.jpg
    └── 4.jpg
$ nameme --rename --auto .
$ tree .
.
├── Cats
│   ├── 1.bmp
│   ├── 2.jpg
│   ├── 3.jpg
│   ├── 4.jpg
│   └── 5.jpg
└── Dogs
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.gif
    └── 4.jpg
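The rename step amounts to swapping a file's extension for the one implied by its detected format. The Python sketch below shows one way to plan and apply such renames. The helpers `rename_target`, `plan_renames` and `apply_renames` are hypothetical, the lowercased format name is assumed to be the target extension, and skipping existing targets is a design choice of this sketch rather than something nameme is known to do.

```python
from pathlib import Path


def rename_target(path: Path, fmt: str) -> Path:
    """Replace the extension with the lowercased format name (an assumption)."""
    return path.with_suffix("." + fmt.lower())


def plan_renames(detected):
    """Return (old, new) pairs for files whose extension disagrees with fmt."""
    plan = []
    for path, fmt in detected:
        path = Path(path)
        if path.suffix.lower() != "." + fmt.lower():
            plan.append((path, rename_target(path, fmt)))
    return plan


def apply_renames(plan):
    """Apply the plan, skipping any target that already exists."""
    for old, new in plan:
        if new.exists():
            continue  # never overwrite an existing file
        old.rename(new)
```

Separating the plan from its application makes it easy to preview the changes before touching the filesystem.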

If you prefer to confirm each rename yourself, drop the --auto flag from the previous invocation and nameme will ask before each potential rename:

$ nameme --rename .
rename ./Cats/1.jpg -> ./Cats/1.bmp? [Y/n] y
rename ./Dogs/3.jpg -> ./Dogs/3.gif? [Y/n] y
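A prompt of this shape can be sketched in a few lines of Python. The `confirm` helper is hypothetical, and treating an empty answer as "yes" follows the usual reading of the capitalised `Y` in `[Y/n]`; whether nameme handles empty input this way is an assumption.

```python
def confirm(question: str, ask=input) -> bool:
    """Ask a yes/no question; an empty answer counts as yes (assumed default)."""
    answer = ask(f"{question} [Y/n] ").strip().lower()
    return answer in ("", "y", "yes")


def confirmed_renames(plan, ask=input):
    """Keep only the (old, new) pairs the user accepts."""
    return [
        (old, new)
        for old, new in plan
        if confirm(f"rename {old} -> {new}?", ask=ask)
    ]
```

The `ask` parameter defaults to the built-in `input`, and can be replaced with any callable to script or test the answers.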

Todo

Items marked as complete in this list can still be improved.

  • Each header pattern may correspond to multiple extensions. A user should have the possibility to choose the desired extension among those available when renaming a file.

  • A user should be able to decide whether the header is matched greedily (that is, the longest match is taken, as it is now) or lazily (that is, the shortest match is taken).

  • When invoked on a large directory, nameme can get pretty slow. Figure out what the problem is.

  • file is able to figure out the filetypes of some text files (e.g. JSON). Figure out whether it makes sense to implement something similar here, and how.
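The difference between greedy and lazy header matching, and the way one pattern can map to several extensions, can be illustrated with a short Python sketch. The `PATTERNS` table and the `match_header` function are hypothetical, and the byte patterns are made up so that one is a prefix of the other; none of this is taken from nameme's code.

```python
from typing import Dict, List, Optional

# Made-up patterns: the first is a prefix of the second, so both can match.
PATTERNS: Dict[bytes, List[str]] = {
    b"\x00\x01": ["short"],
    b"\x00\x01\x02\x03": ["long", "lng"],  # one pattern, several extensions
}


def match_header(header: bytes, patterns=PATTERNS, greedy: bool = True) -> Optional[List[str]]:
    """Return the candidate extensions of the chosen matching pattern.

    greedy=True picks the longest matching pattern, greedy=False the shortest.
    """
    matches = [p for p in patterns if header.startswith(p)]
    if not matches:
        return None
    chosen = max(matches, key=len) if greedy else min(matches, key=len)
    return patterns[chosen]
```

With the header `b"\x00\x01\x02\x03\x04"`, the greedy mode returns `["long", "lng"]` and the lazy mode returns `["short"]`. Returning a list leaves room for the user to choose among the available extensions.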

Credits

This library and application are based on GCK's File Signatures Table, distributed under the MIT license.

[^1]: A single header pattern might correspond to multiple possible extensions. Ideally, the user should be able to choose which of the available extensions to use, but this can get tedious really fast for large files.