We recently had an opportunity to integrate Amazon S3 (Simple Storage Service) with Salesforce.com.  Salesforce.com provides some sample code that was very useful in getting the S3 services working.  However, along the way I ran into a number of challenging issues that I'm sure many other developers have tackled.

The first big issue was that we would be dealing with very large files, greater than 2GB in some cases.  Unfortunately, Salesforce.com can't handle large files (>100KB) through its own web services, but S3 can handle files up to 5GB.  Salesforce.com had some sample code that POSTs directly to the S3 service from a Visualforce page, which sidesteps the Salesforce limit.  This was all good, but it took a bit of work to get everything customized and working correctly.
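For readers unfamiliar with how a browser can POST a file straight to S3, the core of the trick is a signed "policy document" embedded in the upload form.  The sketch below is in Python rather than Apex, and it assumes the legacy (SigV2-era) signing scheme that the Salesforce sample code of that period used; the bucket name, key prefix, and credentials are all placeholders.

```python
import base64
import hashlib
import hmac
import json

def sign_post_policy(secret_key, bucket, key_prefix, expiration_iso):
    """Build and sign an S3 browser-based POST upload policy (legacy scheme).

    The base64-encoded policy and its signature go into hidden form fields on
    the page; the browser then POSTs the file directly to S3, so the large
    payload never passes through Salesforce.
    """
    policy = {
        "expiration": expiration_iso,
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", key_prefix],
            {"acl": "private"},  # keep uploaded objects private by default
            ["content-length-range", 0, 5 * 1024 ** 3],  # S3's 5GB object cap
        ],
    }
    policy_b64 = base64.b64encode(
        json.dumps(policy).encode("utf-8")
    ).decode("ascii")
    # The signature is an HMAC-SHA1 of the base64 policy, itself base64-encoded.
    signature = base64.b64encode(
        hmac.new(secret_key.encode("utf-8"),
                 policy_b64.encode("utf-8"),
                 hashlib.sha1).digest()
    ).decode("ascii")
    return policy_b64, signature

# Placeholder credentials, for illustration only:
policy_b64, sig = sign_post_policy(
    "EXAMPLE-SECRET", "my-bucket", "customer-uploads/", "2015-01-01T00:00:00Z"
)
```

In the Visualforce version, the same two values are computed server-side in Apex and rendered into the form, so the secret key never reaches the browser.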

One of the most challenging parts was figuring out how S3 handles access, because we didn't want our customers' files shared with the rest of the world.  There is a mechanism for controlling access, but if you want different access levels, you're stuck either handling access inside Salesforce or getting every user to sign up with Amazon as an S3 user.

The next issue I encountered was figuring out how to provide a properly encoded hash of the credentials for a specific file so it could be downloaded from S3.  There were no good examples of this, and although the S3 documentation was clear about how it should work, there was a fair amount of discovery involved to ensure I was passing all the right information over the wire.
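To make the "encoded hash of the credentials" concrete, here is a minimal Python sketch of S3 query-string authentication as the legacy (SigV2) scheme defines it.  The byte-for-byte layout of the string to sign is exactly the part that takes discovery; all names and keys below are placeholders.

```python
import base64
import hashlib
import hmac
import urllib.parse

def presigned_get_url(access_key, secret_key, bucket, key, expires_epoch):
    """Build a time-limited, query-string-authenticated S3 GET URL (legacy SigV2).

    The string to sign is: the HTTP method, two empty lines (Content-MD5 and
    Content-Type are blank on a GET), the Unix expiry timestamp, and the
    resource path.  Any deviation yields S3's SignatureDoesNotMatch error.
    """
    string_to_sign = "GET\n\n\n{0}\n/{1}/{2}".format(expires_epoch, bucket, key)
    signature = base64.b64encode(
        hmac.new(secret_key.encode("utf-8"),
                 string_to_sign.encode("utf-8"),
                 hashlib.sha1).digest()
    ).decode("ascii")
    # The signature must be URL-encoded ('+', '/', '=' are all unsafe here).
    return (
        "https://{0}.s3.amazonaws.com/{1}"
        "?AWSAccessKeyId={2}&Expires={3}&Signature={4}".format(
            bucket, key, access_key, expires_epoch,
            urllib.parse.quote(signature, safe="")
        )
    )
```

Because the link expires, you can keep the authorization decision inside Salesforce: check the user's permissions first, then hand out a short-lived URL for just that one file.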

The last, and unfortunately most challenging, issue was meeting Salesforce.com's test code coverage requirement.  S3 involves a callout in a class that was generated from the S3 WSDL in Salesforce.com.  The generation tool is nice, and test coverage works fine within that class, but the generated code isn't structured to support Salesforce.com's preferred method of testing callouts from your custom classes.  The recommended approach is to use a virtual class and override the callout in your test methods, replacing the return value with dummy data.  The S3 code provided simply isn't structured this way, and given the size of the class it is impractical to rewrite it.  So be careful with the construction of your custom classes: keep the callouts limited to as few lines of code as possible, thereby improving your chances of passing the coverage requirement.  Not a pretty picture, but neither is rewriting the entire S3 web services callout code.

While S3 is a great service, there are some real challenges in making it work with Salesforce.com.