Debugging Access-Control-Allow-Origin issues in Apache

Here’s how I debugged and solved a cross-origin security issue that prevented iPad and Android devices from loading some of the files on my site, because those files were served from a different port. Client-side (browser) tools did not give me the insight I needed to debug the server side of the problem.

The problem:

I was running a Django server on port 8000, with Apache serving static files from port 80. On iPad and Android devices, my JWPlayer 6 video player was telling users that my custom skin file could not be found. The skin file was simply an XML file served by Apache on the same machine, and it loaded fine on laptops and desktops, Mac and PC, but not on phones or tablets.

I first found the underlying client-side problem with the iOS Simulator on my Mac and the Safari developer tools (and later by simply spoofing the Android user agent in my Chrome browser and checking for console errors). The error was:

XMLHttpRequest cannot load http://localhost.myhost.com/static/inc/jwsk/glow_gb.xml. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://localhost.myhost.com:8000' is therefore not allowed access.

I found a number of reports on the issue, including this Stack Overflow post. To correct my problem, I needed to set up CORS (Cross-Origin Resource Sharing) via some Apache directives.
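The error message above comes down to how browsers define an origin: scheme, host, and port together. A minimal Python sketch of that comparison follows, using only the standard library. The page URL is an assumption built from the Origin in the error message, and the helper names are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports assumed when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return the (scheme, host, port) triple used for origin comparison."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin_of(url_a) == origin_of(url_b)


# Page served by Django on port 8000 (assumed URL), skin served by Apache on port 80.
page = "http://localhost.myhost.com:8000/"
skin = "http://localhost.myhost.com/static/inc/jwsk/glow_gb.xml"

print(origin_of(page))          # ('http', 'localhost.myhost.com', 8000)
print(origin_of(skin))          # ('http', 'localhost.myhost.com', 80)
print(same_origin(page, skin))  # False
```

Same host, different port: that alone makes the skin request cross-origin, so the browser expects an Access-Control-Allow-Origin header on the response.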

I verified that the loosest setting in Apache did indeed fix the problem, so I was on the right track:


<FilesMatch "\.xml$">
    Header set Access-Control-Allow-Origin *
</FilesMatch>


However, I wanted stricter security in place, so that not just any old hostname would match. Based on some misleading statements in the Stack Overflow thread, I believed I could use multiple lines for this response header, using environment variables to generate the different combinations of host and port that I wanted to allow. I tried dozens of combinations unsuccessfully and struggled to figure out how to debug Apache's behavior. Finally I found this post (http://archive.ianwinter.co.uk/2010/11/18/log-response-headers-in-apache/), which got me headed in the right direction.

Debugging the issue

I added the following lines to my Apache conf file so that my logs would show both the environment variables I was setting and the request/response headers relevant to the issue:

# capture server & host from request origin
SetEnvIf Origin "^(.*\.?myhost\.com)(:[0-9]+)?$" ORIGIN_0=$0 ORIGIN_1=$1 ORIGIN_2=$2
# add variations of incoming host/port to the response headers
<FilesMatch "\.xml$">
    Header set Access-Control-Allow-Origin "%{ORIGIN_0}e" env=ORIGIN_0
    Header set Access-Control-Allow-Origin "%{ORIGIN_1}e" env=ORIGIN_1
</FilesMatch>

# emit the request "Origin" header, the response "Access-Control-Allow-Origin" header, and my 3 environment variables on each line of the log
LogFormat "%h %l %u %t \"%r\" %>s %b ORIG=\"%{Origin}i\" ALLOW=\"%{Access-Control-Allow-Origin}o\" O0=\"%{ORIGIN_0}e\" O1=\"%{ORIGIN_1}e\" O2=\"%{ORIGIN_2}e\"" common2

# use my custom log format
CustomLog /var/log/django/prototype_access.log common2

After restarting Apache, my logs showed output like this when I reloaded an affected page:

127.0.0.1 - - [15/Mar/2014:13:41:26 -0700] "GET /static/inc/i/logo2.png HTTP/1.1" 304 - ORIG="-" ALLOW="-" O0="-" O1="-" O2="-"
127.0.0.1 - - [15/Mar/2014:13:41:26 -0700] "GET /static/inc/i/unlimited.png HTTP/1.1" 304 - ORIG="-" ALLOW="-" O0="-" O1="-" O2="-"
127.0.0.1 - - [15/Mar/2014:13:41:26 -0700] "GET /static/inc/jwsk/glow_gb.xml HTTP/1.1" 304 - ORIG="http://localhost.myhost.com:8000" ALLOW="http://localhost.myhost.com:8000"  O0="http://localhost.myhost.com:8000" O1="http://localhost.myhost.com" O2=":8000"
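The O0, O1 and O2 values in the last log line are the capture groups of the SetEnvIf pattern. A quick way to experiment with the pattern outside Apache is a small Python sketch like the one below. It assumes Python's re module behaves like Apache's PCRE for a pattern this simple, and it writes the literal dots escaped. The function name is hypothetical.

```python
import re

# Same pattern as the SetEnvIf directive, with the literal dots escaped.
ORIGIN_RE = re.compile(r"^(.*\.?myhost\.com)(:[0-9]+)?$")


def setenvif_captures(origin):
    """Return the values SetEnvIf would assign, or None if Origin does not match."""
    match = ORIGIN_RE.match(origin)
    if match is None:
        return None
    return {
        "ORIGIN_0": match.group(0),        # whole match: scheme, host and port
        "ORIGIN_1": match.group(1),        # scheme and host, without the port
        "ORIGIN_2": match.group(2) or "",  # ":port", or empty when absent
    }


print(setenvif_captures("http://localhost.myhost.com:8000"))
print(setenvif_captures("http://example.com"))  # None
```

The same sketch shows that the pattern also accepts an origin such as http://evilmyhost.com, because the dot in front of myhost is optional. Whether that matters depends on how strict the whitelist needs to be.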

Eureka.

The Solution

I could verify via the Apache logs that my environment variables were working as expected. My problem was that only one of my Access-Control-Allow-Origin headers was taking effect, and not the right one. I needed to allow the origin without the Django port.
Once I removed the extra lines, I was left with this configuration, which solved my problem by allowing the Apache host (without the Django port) in a single response header directive.


SetEnvIf Origin "^(.*\.?myhost\.com)(:[0-9]+)?$" ORIGIN_1=$1
<FilesMatch "\.xml$">
    Header set Access-Control-Allow-Origin "%{ORIGIN_1}e" env=ORIGIN_1
</FilesMatch>

My final Apache logs looked like this:

127.0.0.1 - - [15/Mar/2014:13:41:26 -0700] "GET /static/inc/jwsk/glow_gb.xml HTTP/1.1" 304 - ORIG="http://localhost.myhost.com:8000" ALLOW="http://localhost.myhost.com" O0="http://localhost.myhost.com:8000" O1="http://localhost.myhost.com" O2=":8000"
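The header can also be checked without a browser or a log format change, by sending a request that carries an Origin header and reading back Access-Control-Allow-Origin. Below is a minimal sketch using Python's standard library. The function names are hypothetical, and the URL and origin in the usage note are the ones from the logs above.

```python
from urllib.request import Request, urlopen


def build_request(url, origin):
    """Build a GET request that carries the given Origin header."""
    return Request(url, headers={"Origin": origin})


def allowed_origin(url, origin):
    """Fetch url as if from origin; return the Access-Control-Allow-Origin
    response header, or None when the server did not send one."""
    with urlopen(build_request(url, origin)) as response:
        return response.headers.get("Access-Control-Allow-Origin")
```

Calling allowed_origin("http://localhost.myhost.com/static/inc/jwsk/glow_gb.xml", "http://localhost.myhost.com:8000") against the running server should return the same value the log shows as ALLOW, which makes it easy to re-check after each config change.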

This week in Startup Engineering: Decoupling Django DB & Web logic

In theory it sounded very straightforward: a simple refactoring.

My goal: separate the Django database logic from the web/UI/business logic code. Out of the box, Django worked like a charm, an all-in-one stack that ran very efficiently for a web/db prototype website on Amazon Web Services.

But in order to support future scalability, I needed to decouple these components, so they could live on the same or different servers transparently, and communicate completely through service APIs, a la the infamous Steve Yegge rant touting Jeff Bezos’s all-services-all-the-time mandate.

Things started simply enough: reviewing the existing views and models, and figuring out what type of generic APIs I would need for a decoupled world. Then it hit me. Separating user data from the web server meant an entirely new level of authentication and security would be needed between the database and the web servers. User-specific data would now make lots of back-and-forth trips across the network, and would need protection. Unlike in my sheltered days coding at Ask.com, I no longer have a team of brilliant network and system administrators dedicated to solving exactly these problems: masking networks, enabling access and authorization, setting up virtual clouds.

Time for another crash course in bootstrapped engineering. 

Note: I’m also enrolled in Secure Recurring Payments 101, Amazon AutoScaling Architecture 206, and of course, the toughest one for an introvert, Business Development 342a. Thankfully I’m coming off recent successes completing my studies in Video Security and Adaptive Bitrate Tuning, as well as Fitness Video Production 101 and 102.

So that’s where I am now: reviewing my options for Django/Apache authentication methods and frameworks, SSL certificates, and the like. I will report back when I get to the midterm (later this week!), or when I find a study buddy to give me a head start.