Sunday, 13 February 2022

ADF Snack - Empty file when creating new folder

Case
I'm adding files to a Data Lake container via ADF, but when creating a new folder I'm also getting an empty file with the same name as the folder. How do I avoid the creation of these unwanted empty files?
Unwantend empty files with same name as folders










Solution
You are probably using an Azure Storage Account V2 (general purpose v2) without the hierarchical namespace setting enabled. You can check that on the Overview page of the Storage Account.
Hierarchical namespace disabled
















To solve this you need to enable that setting in your storage account. You can do that bij clicking on 'Disabled' behind Hierarchical namespace on the Overview page of the Storage Account. However there are some options that don't work with the hierarchical namespace like 'point-in-time restore for containers', 'versioning for blobs', 'blob change feed' and 'version-level immutability support''. When you enable the hierarchical namespace setting you will get a warning about this so that you can disable those settings. After Step 2 Validate account you can donwload a json file all the stuff you need to fix first.
Upgrading your storage account














Make sure to consult other users / Data Contributors of that Data Lake before changing those settings or at a last resort you could just create a new separate Storage Account if you are not sure about the settings. An other option could be to create a Python or PowerShell script to clean up the empty files.

Conclusion
In this post your learned how to fix an annoying needless empty file 'bug' for each directory. This is especially handy when you have those flexible pipelines that store your files in neat folder structures per year/month/day. It is just one setting, but it could have consequences for other settings. Thanks to colleague Derek Groen for finding the right setting.