Bugs with Non-Alphanumeric Credentials for AWS

I've recently had a chance to be working with AWS and S3 and encountered a somewhat frustrating bug that I wanted to share. I was using boto3, an AWS SDK for Python. I had created a script that would list the EC2 instances that make up an EMR cluster. It worked fine with my credentials, but another user would be a SignatureDoesNotMatch error every time. We spent a lot of time comparing the environment setup and permissions on the two accounts, but everything lined up. We started guessing a little more wildly when we noticed that my AWS credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) were entirely alphanumeric, but the other user's were not. My first thought was that there was some kind of encoding error, but I had no luck with that. After some searching I found this longstanding bug with the same issue. As it turns out having non-alphanumeric credentials can cause this error and has for at least a couple of years. The only fix is to regenerate the credentials until getting a set that is entirely alphanumeric.

I later ran into the same signature error trying to distcp some data from HDFS to S3, but I knew what to look for this time. HADOOP-3733 is a very similar issue to that in boto3, but this is fixed in Hadoop 2.8.0.

This was a troublesome bug for me, so hopefully this post helps someone else who encounters the same issue.