Everything was pointing towards a client problem, but it couldn't be a client problem.
The reason for this is how the "new" Exchange server came about. In December we suffered a big outage, resulting in our Exchange server crashing out completely, and repeatedly. We would rebuild it, it would test fine, and a day later it would randomly reboot and fail to come back up. Eventually, suspecting a deeper problem with our ESX infrastructure/SAN backend, we built an Exchange server on brand new (borrowed) physical hardware, and the problems went away.
Previously we were Exchange 2007 on Server 2003R2. Now, on the interim hardware, we were Exchange 2007 on Server 2008 SP2. Outlook Anywhere worked fine before, and worked fine now on the new hardware. However, the person who had graciously lent us the server was now getting slightly annoyed that, almost 3 months later, we were still using his server. Whoops. So we planned a migration back onto our own (now fixed) ESX infrastructure, which all went off without a hitch. As you can imagine, following on from our fun in December, we were all very proficient in Exchange 2007 recovery and migration.
So over the weekend we backed up the old Exchange server and took it off the network. We built the new Exchange server with the same name as the old box, and installed Exchange using the /RecoveryMode switch to tell it to pick up the configuration from AD. Then it was simply a matter of restoring the databases to the new server, and doing all the misc little jobs like re-keying and installing the SSL certs, and setting up everything else.
This took longer than we had expected, but worked well. OWA was up and working, ActiveSync was up and working, and client mailboxes were talking to the new server without needing to re-sync their entire Cached Mode OST file (the main benefit of doing the migration the way we did, as opposed to a Move Mailbox, which won't move the "cached?" bit and thus triggers a re-sync of the user's entire mailbox - potentially a 4GB mailbox over a wireless/3G VPN link - not nice!).
However, one niggling little problem remained. A number of Outlook 2007 users who previously were able to connect via Outlook Anywhere (RPC over HTTP) were now unable to do so - it was MAPI or nothing. The configuration on the server for Outlook Anywhere was correct. Even so, we disabled it and re-enabled it half a dozen times, both via the EMC and via PowerShell. Still no good. Into IIS manager to ensure that the RPC vdir was created, and had the right Auth settings (at one point I suspected the NTLM auth wasn't working). Certs were obviously OK - OWA and AS were working perfectly and reporting the right cert. AutoDiscover was working perfectly, presenting the correct HTTP details and MSSTD server name for mutual auth. But OA refused to work.
From the look of it, the client was being presented with all the right OA info, but was just refusing to make the call over HTTPS. But I knew this couldn't be correct, since my machine had previously worked perfectly, and nothing had changed on it - the only thing that had changed was the server. Something must not be right.
I was desperate at this point, and had even installed dotNET 1.1 in a vain attempt to get it working. (SIDE NOTE: this is not as random as it seems. A week earlier I was trying to rebuild a Forefront Client Security server which had also fallen victim to the same trouble we had experienced in December. It deserves a blog post of its own, due to the dearth of information available on the internet regarding FFCS in general. In all the MS doco, they specify dotNET and ASP.NET as a requirement for the FFCS server console. Most everyone assumes that since Server 2008 comes with dotNET 3.5, you're good to go. NO! You need to install dotNET 1.1 as a prereq! Otherwise the setup program gives you very weird, cryptic error messages - i.e., not just "you need to install dotnet 1.1", which would be the sensible thing to do. So despite appearances, I'm not completely insane. Yet.)
Hundreds of google-hours later, and all I had established was that I was pretty much alone with my situation. Everyone else that was having trouble with OA seemed to be doing the obvious things wrong - firewall ports blocked, self-signed certs, AutoDiscover not working correctly, etc. No one had a situation where OWA, AS and AutoDiscover were working, but OA not.
By sheer chance I stumbled onto this article: Exchange Genie - Configuring Outlook Anywhere for Exchange 2007 SP1
yup - yup - yup - same stuff. How to configure it via the console. How to configure it via the PowerShell if you really hate yourself. yup - yup - yup. Then, on about the 5th page down, the "Eureka!" moment. It was talking about reghacking a specific RPC-Proxy key to ensure that the RPC Proxy would forward requests onto specific servers - in this case, our CAS and mailbox servers.
I remember having to do this previously! Only the once though - other times it has just worked. But I remember once having to work with this reg key!
Some furious googling later brought up an article from the Exchange Team Blog: You Had Me At EHLO - How Does Outlook Anywhere work (and not work)?
I remembered this article from the last time it had happened to us. It's a great article, and I thoroughly encourage anyone troubleshooting Outlook Anywhere to read through it and ensure you understand it. I have a hard time searching for articles on the Exchange Team Blog, but when you find them, they're solid gold.
I quickly added the right servers and ports to that regkey. For us, it was the internal FQDN (i.e., the mailserver.local name) as well as the NetBIOS name (i.e., just the server name: mailserver), for both of the port ranges required. An IISReset, then I restarted the Outlook 2007 client on my laptop. Voila - connectivity via HTTPS! And this was still whilst connected to the VPN, showing that Outlook was correctly interpreting the "On fast networks connect via HTTPS first" setting as well. Excellent!
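To give a rough idea of what ended up in that key, here is a minimal Python sketch that assembles a ValidPorts-style string for both names. The "host:ports;" layout is my reading of the format shown in the Exchange Team Blog article, and the function name build_valid_ports and the mailserver names are purely illustrative, so compare it against your own key before pasting anything in.

```python
# Sketch only: assembles a ValidPorts-style string for the RPC proxy key.
# Assumption: entries look like "host:6001-6002;host:6004", separated by ";".

def build_valid_ports(hosts, port_ranges=("6001-6002", "6004")):
    """Return one 'host:ports' entry per host and port range, joined by ';'."""
    entries = []
    for host in hosts:
        for ports in port_ranges:
            entries.append(f"{host}:{ports}")
    return ";".join(entries)


if __name__ == "__main__":
    # NetBIOS name first, then the internal FQDN (both are example names).
    print(build_valid_ports(["mailserver", "mailserver.local"]))
```

Generating the string rather than typing it by hand mostly guards against the easy mistake of listing one name but forgetting the other.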
Here's the snippet from the Exchange Genie site, in case it changes:
Once we have Enabled Outlook Anywhere we can validate the registry key has configured correct ports for communication to our mailbox servers. Note only the name listed in the key can be used by clients to connect and you will notice there is no IP address listed so testing via IP will fail through the rpc proxy.
1. Click start Run
2. Regedit – this will open the registry editor
4. Notice the Dword called Enabled set to 1
5. There is a String value called “ValidPorts”
**Note: if the ports are not listed, it could take up to 15 minutes to update, or you can restart the Microsoft Exchange Service Host.**
we can see that the rpc proxy connects to our mailbox server on the following port 6001-6002 and 6004. Each port is defined below
Microsoft Exchange Information Store service: 6001
referral service of DSProxy: 6002
proxy service of DSProxy: 6004
Active Directory (if the global catalog server and Exchange Server are on the same server): 6004
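The point in that snippet about names versus IP addresses is easier to see with a small example. This is only a sketch in Python of how I picture the proxy's check, assuming the ValidPorts value is a ";"-separated list of "host:port" or "host:low-high" entries; parse_valid_ports and is_allowed are names I made up, not anything that ships with Exchange or Windows.

```python
# Sketch only: parses a ValidPorts-style string and checks a host/port pair.
# Assumption: the value is ";"-separated "host:port" or "host:low-high" entries.

def parse_valid_ports(value):
    """Return a list of (host, low_port, high_port) tuples."""
    rules = []
    for entry in value.split(";"):
        entry = entry.strip()
        host, _, ports = entry.rpartition(":")
        if not host or not ports:
            continue  # skip empty or malformed entries
        low, _, high = ports.partition("-")
        rules.append((host.lower(), int(low), int(high or low)))
    return rules


def is_allowed(value, host, port):
    """True if the rules in `value` list this exact host name and port."""
    host = host.lower()
    return any(h == host and low <= port <= high
               for h, low, high in parse_valid_ports(value))


if __name__ == "__main__":
    sample = "mailserver:6001-6002;mailserver:6004"
    print(is_allowed(sample, "mailserver", 6004))    # name is listed
    print(is_allowed(sample, "192.168.0.10", 6001))  # IP is not listed
```

The match is on the literal name, which is why a request for the server's IP address (or an FQDN that isn't in the list) gets turned away even though it points at the same box.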
I would highly recommend reading the Exchange Team Blog as well though, as it goes into much greater detail about these settings, and why they're required. There are also a couple of caveats around manually changing this reg key that the Exchange Team Blog explores a bit deeper.
So - it's all magically working!
But why did it happen in the first place? Well, I have a theory. The Exchange Team Blog article describes how this reg key is maintained by the RPCHTTPConfigurator servicelet that runs as part of the Exchange Service Host service. It's responsible for putting the right values for your Exchange server into this key. When you add a new CAS or mailbox server, the servicelet will update this key on the RPCHTTP Proxy server to ensure that clients with mailboxes on the new server can talk through that RPC Proxy server to the new backend machines.
What I think happened here is that we did not have the RPC over HTTP Proxy feature installed before we installed Exchange 2007. We had IIS installed, and all the other pre-reqs, but not the RPC over HTTP Proxy feature. So when we enabled Outlook Anywhere the first time, the servicelet couldn't add the correct values into that registry key, because it didn't exist.
(UPDATE - The servicelet didn't update the registry key because the Microsoft Exchange Service Host service wasn't running! Its startup type was set to Automatic, but it wasn't running. Because the RPCHTTPConfigurator servicelet runs inside the Service Host process, it wasn't running either, and so couldn't automatically update the reg key. I discovered this after manually hacking the key myself as described here, and started the service. Immediately my entries were removed and replaced with the correct ones as stated in the articles, and it was all good.
So - first step - check to see if your Microsoft Exchange Service Host service is running, and if not, start it up!!!)
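If you'd rather script that first check, something along these lines should do it. It's a minimal Python sketch, assuming the service's short name is MSExchangeServiceHost (confirm that in services.msc on your own box) and that sc query prints its usual English "STATE : 4 RUNNING" line; parse_state, service_state and advice are just names I picked.

```python
import re
import subprocess

# Assumption: "MSExchangeServiceHost" is the short name of the
# Microsoft Exchange Service Host service; check it on your server.
SERVICE_NAME = "MSExchangeServiceHost"


def parse_state(sc_output):
    """Pull the state word (e.g. RUNNING, STOPPED) out of `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def service_state(name=SERVICE_NAME):
    """Run `sc query <name>` (Windows only) and return the parsed state."""
    result = subprocess.run(["sc", "query", name],
                            capture_output=True, text=True)
    return parse_state(result.stdout)


def advice(state, name=SERVICE_NAME):
    """Turn a parsed state into a one-line hint."""
    if state == "RUNNING":
        return f"{name} is running"
    return f"{name} is {state or 'not found'}; try: net start {name}"
```

On the Exchange server itself, print(advice(service_state())) gives you the one-line answer. The parsing is kept separate from the sc call so it can be tried out against pasted output on any machine.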
The first thing I did when troubleshooting the problem was to look for the RPC over HTTP Proxy feature, and install it when I found it missing. It then installed itself with its default reg key, which obviously didn't include the specific Exchange settings required. I can only guess that the subsequent disabling/re-enabling of the Outlook Anywhere feature didn't trigger the servicelet to re-apply the settings to the regkey, because as far as it was concerned, it had already done it. Thus, we ended up with an Exchange server that tested fine and thought everything was OK, and an RPC over HTTP Proxy that was itself running perfectly fine, but the overall configuration wasn't correct.
That's just a guess, but I reckon it's pretty close to the money. I know previously when I rebuilt the server onto the borrowed hardware, I had ensured that the RPC over HTTP Proxy feature had been installed prior to the Exchange installation. This time, someone else was responsible for the actual Exchange installation, and had missed that step. Understandable, because it's not listed as a required pre-req by the Exchange setup program (especially when doing a /RecoveryMode install). Where I went wrong was in assuming that it would make no difference to install the feature after the Exchange install had happened.
Once we'd manually fixed up that inconsistency, and entered back in the details that the servicelet *should* have entered, everything magically worked! Hooray! The moral of the story is - if you're having issues with Outlook Anywhere but your OWA, ActiveSync and AutoDiscover are working perfectly, then check your RPC over HTTP Proxy reg key settings, and ensure you've enabled access to your mail servers!