Retrieve File Contents using SQL Server 2005 SQL CLR
I recently had a need to load a bunch of pdf files into my database. The problem was that the directory the files were in contained a bunch of other pdf files that I didn't want or that were no longer needed so I couldn't just upload every PDF in the directory. I did have a database table that contained all the file names that were still valid. I was wishing for a way to simply join from that table to the file system and load all those files. I certainly could have just written a quick exe that did this by retrieving the list of file names from the database and then grabbing the file and uploading back into the database. Instead I decided to write a CLR table value function that took in a directory path and a file search criteria and returned a table with the filename and the binary of the file. This way I could simply do an insert into my table all from t-sql. This ended up being a lot easier than what I thought and it worked quite well. Below is the code for the CLR function. Simply compile this code and then load it into your database. This will create a function called FileReader. You simply pass in a directory path and a search criteria (i.e. *.pdf).
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.IO;
using System.Collections;
public partial class FileReader
{
[SqlFunction(FillRowMethodName = "FillRow",
TableDefinition = "FileName nvarchar(200),FileContent varbinary(max)",
Name = "FileReader")]
public static IEnumerable InitMethod(String Directory, String FileCriteria)
{
DirectoryInfo d = new DirectoryInfo(Directory);
return d.GetFiles(FileCriteria);
}
public static void FillRow(Object obj, out SqlString FileName, out SqlBytes FileContent)
{
FileInfo fi = (FileInfo)obj;
FileContent = new SqlBytes(File.ReadAllBytes(fi.FullName));
FileName = fi.Name;
}
}
Here is an example of how to use the function.
select * from dbo.FileReader('d:\', '*.txt')
and what the result was....
FileName FileContent
-------------------- ----------------------------------------------------
Test.txt 0x6173646673646661736466...
Test2.txt 0x61736466736466617364660D0A0D0A6173626C61736661666C...
Here is some example t-sql to show how I used this to insert the files into a table.
Create table _FilesToGrab (FileName varchar(50))
Create table _Files (FileName varchar(50), FileContent varbinary(max))
Insert into _FilesToGrab VALUES ('test.txt')
Insert Into _Files (FileName, FileContent)
select f.FileName, fc.FileContent
from _FilesToGrab f
Cross Apply dbo.FileReader('d:\', f.FileName) fc
select * from _Files
Drop Table _FilesToGrab
Drop Table _Files
So my final solution consisted of this function and then one insert into statement with a select.
Hopefully this function comes in handy for someone else. If any has any suggestions on how to improve on it please let me know and I will see what I can do.