dir_callback – Let's Play with Directories (PHP)

PHP provides some handy functions for file and directory handling, but it offers little for dealing with nested directories. There are basic directory functions and a directory iterator, but they do not cover directory-level operations such as moving, cloning, counting files, or deleting.

Here we have a very simple directory function that takes a path, iterates through all of its files and sub-directories (nodes), and passes the path of each node to a callback function.
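A minimal sketch of such a function follows. The signature dir_callback($path, $callback), the choice to visit a directory before its children, and the decision to skip symlinked directories are assumptions made for this sketch.

```php
<?php
// Sketch of dir_callback: walks $path recursively and hands the full path of
// every node (file or directory) to $callback. It returns nothing.
function dir_callback($path, $callback)
{
    $names = @scandir($path);
    if ($names === false) {
        return; // unreadable or not a directory: nothing to visit
    }
    foreach ($names as $name) {
        if ($name === '.' || $name === '..') {
            continue; // skip the self and parent entries
        }
        $node = rtrim($path, '/\\') . DIRECTORY_SEPARATOR . $name;
        call_user_func($callback, $node);
        // Recurse into real directories only, so symlink loops are avoided.
        if (is_dir($node) && !is_link($node)) {
            dir_callback($node, $callback);
        }
    }
}
```

Because a directory is passed to the callback before its contents, a callback that creates or inspects directories sees each parent first.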

Using this simple function, we can do much of what we want. The function does nothing except pass the full path of each node (directory or file) to the callback function, which then decides how to use that path. Notice that the function does not even return anything, so it does not force a fixed way of storing data across recursive calls; your callback has to define its own way to store data. For simplicity, I am using $GLOBALS. Alternatively, you may modify it to store data in a static variable or an array, or write your own data storage/retrieval class.

I have written some callback functions below to be used in combination with dir_callback.

Calculating Directory Size (in Bytes)
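A callback along these lines would do the job. The function name cb_dir_size and the $GLOBALS key 'dir_size' are names chosen for this sketch, and dir_callback($path, $callback) is assumed to call the callback with each node's full path.

```php
<?php
// Callback: adds the size of each regular file to a running total in bytes.
// Directories themselves contribute nothing to the total.
function cb_dir_size($path)
{
    if (is_file($path)) {
        $current = isset($GLOBALS['dir_size']) ? $GLOBALS['dir_size'] : 0;
        $GLOBALS['dir_size'] = $current + filesize($path);
    }
}

// Usage (assumed signature):
// $GLOBALS['dir_size'] = 0;
// dir_callback('/path/to/dir', 'cb_dir_size');
// echo $GLOBALS['dir_size'];
```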

List all Files and Sub-Directories
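The simplest possible callback just records every path it is given. The names cb_list_nodes and 'dir_nodes' are assumptions made for this sketch.

```php
<?php
// Callback: appends every node (file or directory) to a flat list.
function cb_list_nodes($path)
{
    $GLOBALS['dir_nodes'][] = $path;
}

// Usage (assumed signature):
// $GLOBALS['dir_nodes'] = array();
// dir_callback('/path/to/dir', 'cb_list_nodes');
// print_r($GLOBALS['dir_nodes']);
```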

Get Directory Structure

The above callback function returns all files and directories inside the parent directory. Here we have a function that returns an array of sub-directories only, constituting the whole directory structure.
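A sketch of the directories-only variant, with cb_list_dirs and 'dir_tree' as names invented for illustration:

```php
<?php
// Callback: records sub-directories only and ignores files.
function cb_list_dirs($path)
{
    if (is_dir($path)) {
        $GLOBALS['dir_tree'][] = $path;
    }
}

// Usage (assumed signature):
// $GLOBALS['dir_tree'] = array();
// dir_callback('/path/to/dir', 'cb_list_dirs');
// print_r($GLOBALS['dir_tree']);
```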

Styling Paths and File Names

This function colors the file name and the file path separately. Many online storage services use similar styling to display easy-to-read file names in deeply nested file paths.
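One way to sketch this is to split each path with dirname() and basename() and wrap the two parts in differently colored spans. The colors, the HTML markup, and the names cb_style_path and 'styled_paths' are all assumptions of this example.

```php
<?php
// Callback: builds an HTML fragment in which the directory part of the path
// is muted and the file name is highlighted.
function cb_style_path($path)
{
    $dir  = htmlspecialchars(dirname($path) . DIRECTORY_SEPARATOR);
    $name = htmlspecialchars(basename($path));
    $GLOBALS['styled_paths'][] =
        '<span style="color:#999">' . $dir . '</span>' .
        '<span style="color:#06c">' . $name . '</span>';
}

// Usage (assumed signature):
// $GLOBALS['styled_paths'] = array();
// dir_callback('/path/to/dir', 'cb_style_path');
// echo implode("<br>\n", $GLOBALS['styled_paths']);
```

Escaping with htmlspecialchars() keeps file names that contain characters such as & or < from breaking the markup.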

List all Files by Extension
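A possible callback groups file paths under their lowercased extension. The names cb_files_by_ext and 'files_by_ext' are assumptions; files without an extension end up under the empty-string key.

```php
<?php
// Callback: groups regular files by lowercased extension.
function cb_files_by_ext($path)
{
    if (is_file($path)) {
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        $GLOBALS['files_by_ext'][$ext][] = $path;
    }
}

// Usage (assumed signature):
// $GLOBALS['files_by_ext'] = array();
// dir_callback('/path/to/dir', 'cb_files_by_ext');
// print_r($GLOBALS['files_by_ext']);
```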

Count Files and Directories
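Counting needs only two counters. In this sketch they live under the hypothetical key 'node_count', and anything that is not a directory is counted as a file.

```php
<?php
// Callback: increments a 'dirs' or 'files' counter for each node.
function cb_count_nodes($path)
{
    $key = is_dir($path) ? 'dirs' : 'files';
    $current = isset($GLOBALS['node_count'][$key]) ? $GLOBALS['node_count'][$key] : 0;
    $GLOBALS['node_count'][$key] = $current + 1;
}

// Usage (assumed signature):
// $GLOBALS['node_count'] = array('dirs' => 0, 'files' => 0);
// dir_callback('/path/to/dir', 'cb_count_nodes');
// print_r($GLOBALS['node_count']);
```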

Getting List of Unique File Names

This function lists the unique names of all files in the given directory.
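A sketch that uses the file name as an array key, so a name seen twice is stored only once. The names cb_unique_names and 'unique_names' are assumptions.

```php
<?php
// Callback: remembers each distinct file name once, whatever its location.
function cb_unique_names($path)
{
    if (is_file($path)) {
        $name = basename($path);
        // Keyed by name to deduplicate; the value keeps the name as a string.
        $GLOBALS['unique_names'][$name] = $name;
    }
}

// Usage (assumed signature):
// $GLOBALS['unique_names'] = array();
// dir_callback('/path/to/dir', 'cb_unique_names');
// print_r(array_values($GLOBALS['unique_names']));
```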

Get Duplicate Files (Using File Names)

This function uses file names to identify duplicates at any level within a directory, and returns a list of their file paths.
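One way to sketch this is in two steps: a callback that groups paths by file name, and a helper that keeps only the groups with more than one path. Both function names and the 'paths_by_name' key are invented for this example.

```php
<?php
// Callback: groups the paths of regular files by their base name.
function cb_group_by_name($path)
{
    if (is_file($path)) {
        $GLOBALS['paths_by_name'][basename($path)][] = $path;
    }
}

// Helper: returns only the names that occur at more than one path.
function duplicates_by_name()
{
    $groups = isset($GLOBALS['paths_by_name']) ? $GLOBALS['paths_by_name'] : array();
    $dupes = array();
    foreach ($groups as $name => $paths) {
        if (count($paths) > 1) {
            $dupes[$name] = $paths;
        }
    }
    return $dupes;
}

// Usage (assumed signature):
// $GLOBALS['paths_by_name'] = array();
// dir_callback('/path/to/dir', 'cb_group_by_name');
// print_r(duplicates_by_name());
```

Two files with the same name can still differ in content, so treat the result as a list of candidates rather than confirmed duplicates.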

Get Duplicate Files (Using MD5)

This function uses the MD5 hash of each file within a directory (at any level) to identify duplicates, and returns a list of such duplicates.
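The same grouping idea works with content hashes. This sketch relies on PHP's md5_file(); the function names and the 'paths_by_md5' key are assumptions.

```php
<?php
// Callback: groups the paths of regular files by the MD5 hash of their content.
function cb_group_by_md5($path)
{
    if (is_file($path)) {
        $hash = md5_file($path);
        if ($hash !== false) { // false means the file could not be read
            $GLOBALS['paths_by_md5'][$hash][] = $path;
        }
    }
}

// Helper: returns only the hashes shared by more than one file.
function duplicates_by_md5()
{
    $groups = isset($GLOBALS['paths_by_md5']) ? $GLOBALS['paths_by_md5'] : array();
    $dupes = array();
    foreach ($groups as $hash => $paths) {
        if (count($paths) > 1) {
            $dupes[$hash] = $paths;
        }
    }
    return $dupes;
}

// Usage (assumed signature):
// $GLOBALS['paths_by_md5'] = array();
// dir_callback('/path/to/dir', 'cb_group_by_md5');
// print_r(duplicates_by_md5());
```

Hashing reads every file in full, so on large trees this is much slower than comparing names.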

Clone a Directory (Creating a Copy of Directory)
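A cloning callback has to know the source and destination roots. In this sketch they are passed through the hypothetical $GLOBALS keys 'clone_src' and 'clone_dst', and every path handed to the callback is assumed to start with the source root.

```php
<?php
// Callback: recreates each directory and copies each file under the
// destination root, preserving the path relative to the source root.
function cb_clone_node($path)
{
    $src = rtrim($GLOBALS['clone_src'], '/\\');
    $dst = rtrim($GLOBALS['clone_dst'], '/\\');
    $target = $dst . substr($path, strlen($src));

    if (is_dir($path)) {
        if (!is_dir($target)) {
            mkdir($target, 0777, true);
        }
    } else {
        // Make sure the parent exists even if it was not visited first.
        if (!is_dir(dirname($target))) {
            mkdir(dirname($target), 0777, true);
        }
        copy($path, $target);
    }
}

// Usage (assumed signature):
// $GLOBALS['clone_src'] = '/path/to/source';
// $GLOBALS['clone_dst'] = '/path/to/copy';
// dir_callback($GLOBALS['clone_src'], 'cb_clone_node');
```

The destination should not sit inside the source directory, otherwise the walk would keep discovering the files it has just copied.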

Truncate a Directory but Preserve Directory Structure
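Reading "truncate" as "delete every file but leave the directories in place", which is an assumption about the intended behavior, a callback can be as small as this. The name cb_truncate_node is hypothetical.

```php
<?php
// Callback: deletes regular files and leaves directories untouched,
// so the directory skeleton survives.
function cb_truncate_node($path)
{
    if (is_file($path)) {
        unlink($path);
    }
}

// Usage (assumed signature). This is destructive, so try it on a copy first:
// dir_callback('/path/to/dir', 'cb_truncate_node');
```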

List all Files by their Sizes
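A sketch that records each file's size against its path and then sorts the result largest first. The names cb_file_size, files_largest_first, and 'file_sizes' are assumptions.

```php
<?php
// Callback: stores the size in bytes of each regular file, keyed by path.
function cb_file_size($path)
{
    if (is_file($path)) {
        $GLOBALS['file_sizes'][$path] = filesize($path);
    }
}

// Helper: returns the collected sizes sorted from largest to smallest.
function files_largest_first()
{
    $sizes = isset($GLOBALS['file_sizes']) ? $GLOBALS['file_sizes'] : array();
    arsort($sizes); // sort by value, descending, keeping the path keys
    return $sizes;
}

// Usage (assumed signature):
// $GLOBALS['file_sizes'] = array();
// dir_callback('/path/to/dir', 'cb_file_size');
// print_r(files_largest_first());
```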

List all Files by their Last Modified Time
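Here the callback can collect filemtime() for each file, and a helper can sort newest first. The names cb_file_mtime, files_newest_first, and 'file_mtimes' are invented for this sketch.

```php
<?php
// Callback: stores the last-modified Unix timestamp of each regular file.
function cb_file_mtime($path)
{
    if (is_file($path)) {
        $GLOBALS['file_mtimes'][$path] = filemtime($path);
    }
}

// Helper: returns the collected timestamps, most recently modified first.
function files_newest_first()
{
    $times = isset($GLOBALS['file_mtimes']) ? $GLOBALS['file_mtimes'] : array();
    arsort($times);
    return $times;
}

// Usage (assumed signature):
// $GLOBALS['file_mtimes'] = array();
// dir_callback('/path/to/dir', 'cb_file_mtime');
// foreach (files_newest_first() as $file => $time) {
//     echo date('Y-m-d H:i:s', $time), '  ', $file, "\n";
// }
```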

List all Files by their Last Access Time
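For access times, a sketch can use fileatime() and sort the least recently accessed files first, which is the order a cache cleanup would usually want. The names cb_file_atime, files_stalest_first, and 'file_atimes' are assumptions.

```php
<?php
// Callback: stores the last-access Unix timestamp of each regular file.
function cb_file_atime($path)
{
    if (is_file($path)) {
        $GLOBALS['file_atimes'][$path] = fileatime($path);
    }
}

// Helper: returns the collected timestamps, least recently accessed first.
function files_stalest_first()
{
    $times = isset($GLOBALS['file_atimes']) ? $GLOBALS['file_atimes'] : array();
    asort($times); // ascending by value, keeping the path keys
    return $times;
}

// Usage (assumed signature):
// $GLOBALS['file_atimes'] = array();
// dir_callback('/path/to/dir', 'cb_file_atime');
// print_r(files_stalest_first());
```

Some file systems are mounted with access-time updates reduced or disabled, in which case these values may not reflect the most recent reads.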

All of the above functions are useful for operations such as disk cache management, uploads management, user account deletion (deleting all of a user's uploaded data), optimizing file system usage (by removing duplicates), and summarized reporting directly from the file system.