Fight corruption… with PowerShell!

I’ve often spoken of the importance of learning PowerShell if you’re working in a Windows shop because it can be used to work with just about every Microsoft system. Here’s a small script to help you fix corruption issues in SharePoint.

Yours truly, protesting a different kind of corruption in DC yesterday evening.

A few years ago, back in my help desk days, some users started reporting a strange issue. Syncing their files with certain SharePoint sites would suddenly stop working. The strange thing is that the sites identified as problematic worked fine in every other way, and the users experiencing the problems had no issues with other sites.

After some digging, an interesting cause was identified: some files on our farm were randomly becoming corrupted. Once we knew what to look for, they were easy enough to spot: they were all completely empty. Deleting the file allowed the site to sync again.

But… SharePoint sites can have a lot of files on them. (That’s the point of them, after all. SharePoint is a document repository.) Sites could have folders within folders within folders, each of them with hundreds of files. It could take hours of clicking just to find a single empty file!

When you’re faced with having to do a repetitive task like this, get in the habit of asking yourself how you can automate it. Let’s take a look at the simple two-liner I wrote to quickly do this tedious task for me. (Three lines if you count the comment at the beginning.)

My standard warning: always understand what a PowerShell script does before you run it! PowerShell is a great tool but can be very unforgiving.

And now, the script:

#This will look for file sizes of 0 in a SharePoint library.

cd "\\YourSharePointServer\sites\YourSharePointSite\shared documents"

Get-ChildItem -Recurse | Where-Object Length -eq 0

The first line is simply a comment and will not be executed by PowerShell. It’s meant for clarity if someone else ever needs to use your script. (Or, it can be useful for you in cases where you write a complicated script and have to come back to it later.)

The next line will look familiar if you’ve ever used a regular command prompt. (If you were to press Start, type “cmd”, and launch a command prompt, you could use the cd command to change to the directory (aka folder) of your choice.) In PowerShell, cd is an alias for the Set-Location cmdlet, so the command you already know works here too. No separate command window is launched; PowerShell runs it directly. Neat!

Let’s understand what’s happening on this line. CD means “change directory.” In other words, CD will change your location to a different folder or path. The next part starts with quotes. These are needed in this case because the path we’re using has a space in it (between “shared” and “documents”). If the path had no spaces, you could skip the quotes.

The path then starts with two backslashes. This means that you’ll be connecting to another computer. The next part of the line is the computer name. You can probably figure out the rest of the line without me walking you through it!
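If you’re curious what cd really is inside PowerShell, here’s a quick sketch you can try. It uses a throwaway folder in your temp directory (the folder name “cd demo” is made up for this example) rather than a real SharePoint path, so nothing important is touched.

```powershell
# cd is an alias; Get-Alias shows the PowerShell command behind it.
$realCommand = (Get-Alias cd).Definition
$realCommand

# Move to the temp directory, remembering where we started.
Push-Location ([System.IO.Path]::GetTempPath())

# Create a throwaway folder with a space in its name (hypothetical name).
New-Item -ItemType Directory -Name 'cd demo' -Force | Out-Null

# The quotes keep the path together despite the space.
cd "cd demo"
$whereWeAre = (Get-Location).Path
$whereWeAre

# Step back out, clean up, and return to the starting location.
cd ..
Remove-Item 'cd demo'
Pop-Location
```

Try the cd line without the quotes and PowerShell will complain, because it reads “cd” and “demo” as two separate arguments.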

The last line is a PowerShell command. Well, actually it’s two PowerShell commands. The first one is “Get-ChildItem”. This command is used to get files and folders. The “-recurse” part tells PowerShell that if it opens a folder and finds other folders, it should open those too. If you don’t add that in, PowerShell will only list out the files and folders it finds in the current folder you’re in.

To get a sense of the difference, open PowerShell (press start and type “PowerShell”) and type:

Get-ChildItem c:\

Compare the output to what’s in your C drive by opening it in a File Explorer window and you’ll see it’s the same. That’s great, but not so useful in our case. Now, add the recurse switch:

Get-ChildItem c:\ -recurse

Whoa!

PowerShell is now opening every single folder and printing out all the files it finds in your entire C drive. (Can you imagine how long that would take if you had to click into each folder and type out each file name?) If you ever want to interrupt a command that’s running, press Ctrl+C and PowerShell will stop it. Feel free to do that here unless you really want to read through all your files.

Now, you’ll remember that I didn’t just want to identify all the files in a particular site. I wanted to identify all the corrupted files. This is where the next part of the last line comes in. The | symbol is called a “pipe” symbol. The pipe works just like a real life pipe. Just like the water company uses a pipe to get water to your house, the pipe in PowerShell takes “objects” generated by one command and pipes them to another command. You can chain multiple commands together with pipes, by the way; you’re not limited to two commands.

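The next command is “Where-Object”. PowerShell can read pretty intuitively – this command is literally saying “where the object is…” The last part is the condition: Length is the property that holds a file’s size in bytes, and -eq means “equals.” Put together, it’s saying “file size is zero.”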

So, putting that last line together in plain English, you get:

“Go through each folder in the path, grab each file, then pass it down the pipe. At the other end of the pipe, look at each file passed to you and if it is empty, print it out on the screen.”
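To see the filter in action without touching a real SharePoint library, here’s a minimal sketch that builds a small throwaway folder in your temp directory (all folder and file names are made up) and runs the same search against it. It also shows the longer script-block spelling of Where-Object, which should give the same result.

```powershell
# Build a throwaway folder tree (hypothetical names, for illustration only).
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'EmptyFileDemo'
$sub  = Join-Path $root 'Subfolder'
New-Item -ItemType Directory -Path $sub -Force | Out-Null

# One healthy file and two empty ones, one of them in the subfolder.
Set-Content -Path (Join-Path $root 'healthy.txt') -Value 'some text'
New-Item -ItemType File -Path (Join-Path $root 'empty1.txt') -Force | Out-Null
New-Item -ItemType File -Path (Join-Path $sub 'empty2.txt') -Force | Out-Null

# Short form, as used in the script above.
$short = Get-ChildItem $root -Recurse | Where-Object Length -eq 0

# Script-block form: $_ stands for the object that just came down the pipe.
$long = Get-ChildItem $root -Recurse | Where-Object { $_.Length -eq 0 }

# Show the names of the empty files that were found.
$short.Name

# Clean up the demo folder.
Remove-Item $root -Recurse
```

Folders don’t have a file size, so they never match the condition and only the empty files come out the other end of the pipe.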

You’ll notice that nowhere did we say to print out to the screen. We don’t have to because this is PowerShell’s default behavior. If we wanted to, we could change that default behavior by piping the output somewhere else. For example:

Get-ChildItem -Recurse | Where-Object Length -eq 0 | Out-File C:\files.txt

This would pipe our output out to a text file. Hmmm… I wonder where else you could pipe output to? (You should look that up!)
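Since you’re not limited to two or three commands, here’s a sketch of a slightly longer chain. It adds one more stage so the text file contains only each empty file’s full path. The temp folder, file names, and report name are all invented for the example; Select-Object and its -ExpandProperty parameter are standard PowerShell.

```powershell
# Throwaway test data (hypothetical names, for illustration only).
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'PipeChainDemo'
New-Item -ItemType Directory -Path $root -Force | Out-Null
New-Item -ItemType File -Path (Join-Path $root 'empty.docx') -Force | Out-Null
Set-Content -Path (Join-Path $root 'fine.docx') -Value 'not empty'

# Write the report outside the folder being scanned.
$report = Join-Path ([System.IO.Path]::GetTempPath()) 'empty-files-report.txt'

# Four commands chained: list, filter, keep just the full path, write to a file.
Get-ChildItem $root -Recurse |
    Where-Object Length -eq 0 |
    Select-Object -ExpandProperty FullName |
    Out-File $report

# Read the report back to check what was written.
$reportLines = @(Get-Content $report)
$reportLines

# Clean up the demo folder and the report.
Remove-Item $root -Recurse
Remove-Item $report
```

Ending a line with the pipe symbol lets you continue the chain on the next line, which keeps longer pipelines readable.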

Note that the script merely outputs the corrupted files. It doesn’t delete them for you. I’ve left this part out for you to figure out, but I’ll give you a hint: if you were to pipe out the objects to a command that deletes instead of piping them out to a command that creates a file, the files would be automagically deleted for you.

Even though this script was a huge help with a SharePoint problem, it didn’t actually use any SharePoint-specific commands. However, there are a ton of product-specific PowerShell commands that can help you with just about anything you want to do (in the Microsoft world). I highly encourage you to start learning PowerShell today!
