Beware NuGet's Filename Encoding


Saturday 20 Aug 2016 at 21:00
Development  |  nuget files encoding

The other day, I was troubleshooting some issues that had occurred on a deployment of some code to a test server.  Large parts of the application were simply not working after deployment, however, the (apparently) same set of code and files worked just fine on my local development machine.

After much digging, the problem was finally discovered as being how NuGet handles and packages files with non-standard characters in the filename.

It seems that NuGet will ASCII-encode certain characters within filename, such as spaces, @ symbols etc.  This is usually not a problem as NuGet itself will correctly decode the filenames again when extracting (or installing) the package, so for example, a file named:

READ ME.txt

within your solution will be encoded inside the .nupkg file as:

READ%20ME.txt

And once installed / extracted again using NuGet will get it’s original filename back.  However, there’s a big caveat around this.  We’re told that NuGet’s nupkg files are “just zip files” and that simply renaming the file to have a .zip extension rather than a .nupkg extension allows the file to be opened using 7-Zip or any other zip archive tool.  This is all fine, except that if you extract the contents of a nupkg file using an archiving utility like 7-zip, any encoded filenames will retain their encoding and will not be renamed back to the original, correct, filename!

It turns out that my deployment included some manual steps which included the manual extraction of a nupkg file using 7-Zip.  It also turns out that my nupkg package contained files with @ symbols and spaces in some of the filenames.  These files were critical to the functioning of the application, and when manually extracting them from the package, the filenames were left in an encoded format meaning the application could not load the files as it was looking for the files with their correct (non-encoded) filenames.