Windows Home Server client join troubleshooting hints

This document is published here because there is no way to publish images on the forum. It was written primarily to help our support people, and we are publishing it to help Beta participants troubleshoot problems with the server join process for Windows Home Server. Please understand that this document is not official and is provided AS IS without any warranties... (see the disclaimer on the sidebar). Also, while I cannot help troubleshoot on this blog (use the Connect site for that), I would appreciate comments on how to make this document more useful, as well as notes on any typos or problems in it.

Important notice: the document mentions ports 55000 and 56000; however, if you are using the Beta2 build, the ports are the standard HTTP ports 80 and 443, and for the CTP build the ports are 88 and 444.

----------------------------------------------------------------------------

So, server join failed. Now what?

This document is not about troubleshooting client setup or discovery. It’s purely to troubleshoot server join.

Reminder: there are three phases of WHS client software installation:

  1. Install. That's the usual software installation. Files are copied and registered.
  2. Discovery. The client software tries to find WHS over UPNP. That's the screen with the Vista circle cursor going round and round.
  3. Server join. The client software makes web service calls to join WHS. That's what you see at the very end, with either two green marks on the screen (good!) or anything else, usually with some red circles and crosses (bad!).

This document is about handling red circles with crosses.

Step-by-step

  1. Do the trivial stuff: check that your server is up and running, that the wires are in place, that your server is connected to the network, that your PC is connected to the network, and that they are connected to the SAME network. Yes, it's trivial, but it is often the cause.

  2. Check that your name resolution works. Name resolution accounts for about 9 out of every 10 cases of this problem. An easy way to check name resolution is to go to the command line and type:

    nslookup SERVER

    where SERVER is the name of your WHS server. It should give you the IP address of your server. If it does not work, go to the Name resolution section.
    NB: It's still worth checking steps 3 and 4. Windows uses additional methods (WINS, hosts file) for name resolution, which may work even if the DNS service fails. If step 3 or 4 works, the name is resolved fine through alternative ways.

  3. If name resolution works, it's most likely that you have firewall issues. First, open your browser and type

    https://server

    You should get a nice picture of a prehistoric office worker sitting behind his desk in the savannah without cubicle walls or sunscreen. If you don't get it, you will get an IE error message. Read it. If it does not help, go to the Accessing public website section.

  4. Now type in browser

    https://server:55000/enrollid/id.xml

    You should get a nice XML with the number 1 in it. If you don't, take note of the error message and go to the Accessing internal website section.

  5. Now type in browser

    https://server:56000/enroll/id.xml

    You should be prompted for the admin user ID and password, and once you give the correct ones, you get a nice XML with the number 2 in it. If you don't, check the error message and go to the Accessing internal site with authentication section.

  6. Now type in browser

    https://server:55000/enrollid/id.aspx

    You should get a nice XML with a bunch of stuff in it. If you don't, then it's either a bug or somebody played with the IIS settings on the server and broke it. Anyway, take note of the error message you get; it may help. There is no simple troubleshooting from this point. On the positive side, we have not seen anybody hit this in months, so if you have a problem, it's most likely that you already skipped this text and went to the specific section.
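
If you want the checks from steps 3 to 6 in one place, here is a minimal Python sketch that just builds the four URLs to try. The function name join_check_urls is made up for this illustration, and the port defaults are the ones written in the steps above; for other builds, pass the ports from the notice at the top. Whether the scheme changes along with the ports is not covered here, so treat the output as a starting point, not as the definitive list.

```python
def join_check_urls(server, internal_port=55000, auth_port=56000):
    """Return (step, url, expectation) tuples for steps 3 to 6 above.

    The port defaults are the ones written in the steps; other builds use
    different ports (see the notice at the top of this document).
    """
    return [
        (3, f"https://{server}", "public WHS page"),
        (4, f"https://{server}:{internal_port}/enrollid/id.xml",
         "XML with the number 1"),
        (5, f"https://{server}:{auth_port}/enroll/id.xml",
         "password prompt, then XML with the number 2"),
        (6, f"https://{server}:{internal_port}/enrollid/id.aspx",
         "XML with a bunch of stuff"),
    ]


# Print a checklist to work through in the browser.
for step, url, expected in join_check_urls("server"):
    print(f"Step {step}: open {url} (expect: {expected})")
```

Replace "server" with the name of your own WHS server. The sketch deliberately does not fetch anything, because the error message IE shows you is the useful part.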

Name resolution

Name resolution is the ability of your machine to figure out an IP address from a name. You sit in front of your PC and type https://www.microsoft.com in IE, and the machine converts www.microsoft.com into an IP address like 207.46.19.190. Machines use IP addresses to reach each other.
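
As a rough illustration of what "resolving a name" means, here is a minimal Python sketch. It assumes Python is installed on the client PC, which this document does not otherwise require; the helper name resolve and the stand-in table are hypothetical. Unlike nslookup, socket.gethostbyname asks the operating system's resolver, so it may succeed through the hosts file or other mechanisms even when DNS alone fails.

```python
import socket


def resolve(name, resolver=socket.gethostbyname):
    """Return the IPv4 address for name, or None if it cannot be resolved.

    The resolver parameter exists only so the function can be tried offline.
    """
    try:
        return resolver(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


# Offline demonstration with a stand-in resolver (hypothetical table).
_known = {"SERVER": "192.168.0.13"}


def fake_resolver(name):
    if name not in _known:
        raise socket.gaierror("name not found")
    return _known[name]


print(resolve("SERVER", fake_resolver))  # the address from the table
print(resolve("NOSUCH", fake_resolver))  # None
```

On a real client you would call resolve("SERVER") with your own server name and no second argument. A result of None means the PC cannot turn the name into an address by any method it knows.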

Name resolution is critical, because many services that WHS provides are actually standard services of Windows Server 2003, and they all need name resolution to work well over a home (or any other) network. In fact, technically we could have made server join work without name resolution, but we decided to fail without it, because otherwise many other WHS services would fail all around.

Unfortunately, most home networks are configured just to share an Internet connection, not to let home PCs work together. A good example is a home printer. You connect it to one home PC; can you print on it from another home PC? Most people cannot, because their home networks don't support this. The situation is so bad that even manufacturers of home network routers don't take it seriously, and many routers, even expensive ones, don't do that simple job, depriving their customers of this important and useful functionality.

You see, to use a printer on another machine, your machine needs to be able to get to it. To do so, it needs to resolve the name of the PC with the printer, and most home networks are not set up for that. Similarly, WHS disk shares, backup services and health monitoring only work if your PC can get to the server.

Did I get too technical? Well, there is good news. Vista uses an extended name resolution mechanism that includes the UPNP/SSDP protocol. What it means is that a Vista client can normally get to WHS even if your router is not up to the task (it still has to support UPNP, though). So, upgrading to Vista helps in 95% of cases. But not everybody is ready to upgrade to Vista, and if this is your case, read on.

Why do most home networks fail with internal name resolution? Here is a typical home network:

 

 

Name resolution is done by name servers. When your home PC sees a name, it goes to a DNS server and asks it, "What's the IP address for this name?" To share an Internet connection you only need to resolve Internet names, like "www.microsoft.com". In this case all home PCs can be configured to go directly to your ISP's DNS server on the Internet. But once you try to make home PCs work with each other, you hit the problem: your ISP's DNS server has no clue about the names of your home PCs. And that's only logical, they are not on the Internet!

There is one more complicated version of the same problem. It's when your machine goes to your ISP's DNS server and it does resolve the name. Suppose you name your WHS "SERVER", and for some ridiculous reason your ISP has a machine named "SERVER" on their local (not yours!) network. Then your ISP's DNS server may resolve the name to an IP address, but it will be the wrong address! So the symptoms may be slightly different, but the root cause will be the same -- name resolution by a DNS server which does not know about your Home Server.
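
One quick sanity check for the "wrong SERVER" case is to look at what kind of address the name resolved to. The sketch below is only a heuristic, and the function name is hypothetical: a public address is a strong hint that the answer came from outside your home, while a non-public address does not prove it is your server (an ISP can use such ranges internally too).

```python
import ipaddress


def looks_like_home_address(ip_text):
    """True if ip_text is in a range reserved for non-public use.

    ipaddress treats 192.168.x.x, 10.x.x.x and 172.16-31.x.x as private,
    along with a few other special ranges such as loopback.
    """
    return ipaddress.ip_address(ip_text).is_private


print(looks_like_home_address("192.168.0.13"))   # address from a home range
print(looks_like_home_address("207.46.19.190"))  # the public example above
```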

To resolve home PC names, somebody else should take the job of DNS server. Good routers do that. They present themselves to home PCs as a DNS server, and when in doubt go to your ISP's DNS server for help. The router is on your home network, so it does know all your home PCs and their IP addresses. In fact, normally your router is the one that gives IP addresses to your home PCs (DHCP). But it also needs to share this knowledge by working as your home DNS server. Many routers out of the box don't do that. If you can configure your router right, the picture will look very similar but with one critical difference:

Many routers don't do that out of the box, but many modern routers can be configured to do so.

And if instead of a router you have a PC with two network cards running Windows XP Professional, Windows Server, or some Vista SKUs, it will do that for sure.

There are tricks to make it work without using DNS, mainly around using WINS name resolution.

Trick 1: If all your machines are in the same "workgroup", and the router does not block WINS, they will be able to see each other in most cases. If you have an OEM headless station, it means that you need to make your client PC belong to the WORKGROUP workgroup. In many cases that should help.

Trick 2: Also, you can potentially assign a static IP address to your WHS server and put it in the hosts table on your PCs. That solution is far from ideal, and you will still have problems with remote access, which will try to go from the server to the PCs. Also, if you ever change the server IP address, you will have to edit the hosts tables on all PCs manually again. Essentially, it's a hack; however, it will work. If you are using a headless server, it's even harder and carries a lot of risk of misconfiguring the system into an unusable state.
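
For Trick 2, the hosts table on Windows is the plain text file %SystemRoot%\System32\drivers\etc\hosts, with one "address name" pair per line. Here is a small Python sketch that formats such a line and refuses anything that is not a valid IP address. The function name and the sample address are only for illustration; the sketch does not touch the file itself, since editing it needs administrator rights and is better done by hand.

```python
import ipaddress


def hosts_line(ip_text, name):
    """Format one hosts-file line.

    Raises ValueError if ip_text is not a valid IP address.
    """
    ip = ipaddress.ip_address(ip_text)
    return f"{ip}\t{name}"


# Sample values only: use the static address you assigned to your server.
print(hosts_line("192.168.0.13", "SERVER"))
```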

Accessing public website

You came here because you typed in IE

https://server

and got an error message. But you already checked that name resolution works. Now, look at the IE error message.

If it says "cannot find server", then either your server is down, a network cable is not plugged in somewhere, or a firewall is in the way. Check the cables, see that the server is on (ping server will confirm that), and once that's OK, go to the Firewall problems section.

Another popular reason for failing at this step is the Internet Explorer setting "Internet Options | Connections | LAN Settings | Automatically Detect Settings", or "Use a proxy server" without "Bypass proxy server for local addresses". WHS Connector uses exactly the same Windows code as Internet Explorer and is affected by these settings. If "Automatically Detect Settings" is checked, your machine may be accidentally configured (by your router or your Internet service provider) to use a proxy server for all http calls. If this proxy is on the provider side, or if the proxy in the router has problems, you will not be able to access your WHS server over http from local machines. Unchecking this option helps if this was the problem.
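
To get a second opinion on the proxy settings, here is a minimal Python sketch. It relies on urllib.request.getproxies() and urllib.request.proxy_bypass(), which read the proxy environment variables and, on Windows, the manually entered Internet Options proxy from the registry. As far as I know they do not evaluate "Automatically Detect Settings", so a clean result here does not rule that option out. The function names proxy_verdict and check_host are made up for this illustration.

```python
import urllib.request


def proxy_verdict(proxies, bypassed):
    """Summarize whether a proxy may sit between this PC and a local host.

    proxies  -- mapping like the one returned by urllib.request.getproxies()
    bypassed -- True if the host is exempt from the proxy
    """
    used = {scheme: url for scheme, url in proxies.items()
            if scheme in ("http", "https")}
    if not used:
        return "no http/https proxy configured"
    if bypassed:
        return "proxy configured, but bypassed for this host"
    return ("requests to this host may go through a proxy: "
            + ", ".join(sorted(used.values())))


def check_host(host):
    """Apply proxy_verdict to the settings Python can see on this machine."""
    return proxy_verdict(urllib.request.getproxies(),
                         bool(urllib.request.proxy_bypass(host)))
```

Calling check_host("server") with your own server name gives a one-line summary. If it reports a proxy that is not bypassed, look at the LAN Settings dialog described above.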

Accessing internal website

You've come here because name resolution works, the public site is accessible fair and square, but when you typed:

https://server:55000/enrollid/id.xml

it failed.

There is a 90% chance that this is trouble with firewalls; go to the Firewall problems section.

Accessing internal site with authentication

You came here because name resolution works, public site is accessible, as well as internal site, but when you typed

https://server:56000/enroll/id.xml

it failed.

Read the error message.

If IE says that access is denied, try to recall your password. You typed it incorrectly.

If IE complains about a bad certificate, it means that server join was not able to install the WHS certificate. That happens in rare cases of PC misconfiguration, usually on older systems, due to malware or the user playing with security settings on the system. This is a very rare case.

If IE says it cannot find the site, it's most likely a firewall issue again. Go to the Firewall problems section.

Firewall problems

Guess how many firewalls are between your PC and WHS? You have three guesses.

Three. At least. In my case four, because I have two firewalls on my home PC. Here is what it looks like in a typical home network setup:

You see, three of them are in the way of your PC communicating with WHS.

WHS firewall

This one is the most harmless of all for WHS communications. After all, it's the WHS firewall and it is configured to be friendly for WHS use. Within reason, of course. Specifically, most WHS communications are configured to be allowed only on the same subnet. What does that mean?

Suppose you have a typical home network configuration to use private network addresses 192.168.0.*, for example:

Router 192.168.0.1
PC 192.168.0.5
WHS 192.168.0.13

This will work fine. However, make it

Router 192.168.0.1
PC 192.168.1.5
WHS 192.168.2.13

and the WHS firewall will start blocking attempts by the clients to connect. The rule of thumb is that the first three numbers in the IP addresses of the WHS, the PC and the router must be the same. Subnets like 192.168.3.* or 192.168.7.* will also work, as long as the first three numbers are the same. That's not technically 100% true, if you are ready to go into technical detail, but if you aren't, don't even try.

Some people feel uncomfortable with the idea of only 254 computers on their home network (not that they really have that many devices on it), and use other private address spaces like 10.*.*.* or 172.16.*.*-172.31.*.*, which is potentially OK if you configure the subnet mask right, although by default it is usually configured to let only the last number change… Too technical again? Yes, just don't use these ranges. Stick to good ol' 192.168.0.*.
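
For those who do want the technical detail, the "first three numbers" rule is just the subnet mask 255.255.255.0 at work. Here is a minimal Python sketch of the comparison using the addresses from the examples above. The function name same_subnet is made up for this illustration, and it only shows the arithmetic; it is not a description of how the WHS firewall is implemented.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask="255.255.255.0"):
    """True if both addresses fall in the same network under the given mask."""
    network = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    return ipaddress.ip_address(ip_b) in network


# The working example: PC and WHS share the first three numbers.
print(same_subnet("192.168.0.5", "192.168.0.13"))
# The failing example: the third number differs.
print(same_subnet("192.168.1.5", "192.168.2.13"))
# A wider mask lets more numbers change, which is the 10.*.*.* case.
print(same_subnet("10.0.1.5", "10.0.2.13", netmask="255.0.0.0"))
```

With the default mask only the last number may differ, which is why sticking to one 192.168.x.* range for everything is the safe choice.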

Actually, that’s the only trouble with WHS firewall that you may encounter (unless you do manual configuration of WHS, in which case you should know all this stuff already anyway).

Router firewall

That's not necessarily easy, but it is straightforward. Your router should let intranet traffic through, at least for WHS services. To join you need TCP connections on ports 55000 and 56000. For transport you need port 1138. There are also a couple of ports that you need for backup and remote access. The router also needs to let UPNP packets through, otherwise server discovery won't work.

You don't have to open them to the Internet. In fact, you had better not open them to the Internet. But they should be open for the computers on your internal home network.
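
If you suspect a firewall, a plain TCP connection attempt from the client PC tells you whether a port is reachable at all. Below is a minimal Python sketch; the function names are hypothetical, and the port list comes from the paragraph above, with the assumption that port 1138 is TCP as well (the text does not say). For the Beta2 or CTP builds, substitute the ports from the notice at the top.

```python
import socket


def port_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


def check_join_ports(host, ports=(55000, 56000, 1138)):
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: port_open(host, port) for port in ports}
```

Calling check_join_ports("server") with your own server name returns something like a port-to-result table. A False only says the connection did not get through; it does not say which of the firewalls stopped it.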

PC firewall

The default Windows firewall settings allow the WHS client software to go out to the server from both XP and Vista clients, no problems. Of course, if you configure it manually, see that the same rules as for the router firewall apply for outgoing connections and UPNP responses. Under no circumstances, except Remote Access, will WHS try to contact your PC. All connections (again, except remote access) go from the PC to your WHS server.

The OneCare firewall is supposed to let signed binaries out, and that's all the WHS client needs, although we've occasionally seen OneCare not letting connections out. You may need to open these ports manually.

Most troublesome are third-party firewalls. Any of them could block some WHS client connections. If you have one, you need to configure it manually to allow the WHS client's outgoing connections.

Conclusion

I realize that more information is needed on the subject, and maybe I'll be able to come back to it and extend this post or write additional posts. But right now the whole team is very busy, so it was tough to write even this. Still, the plan is to extend this document with more detail and more information/cases as we find them.

Also, please understand that I cannot troubleshoot your system through comments on this blog. The right way to submit beta bugs is through the Connect site, where there is a way to get reasonably full troubleshooting information about your problem.

First version: 4/28/07 11:55 pm
Next update: 5/2/07 8:10pm
Next update: 7/9/07
Next update: 7/13/07

Comments

  • Anonymous
    June 17, 2007
    I had problems installing on a hard drive that was dual boot with Home Server Beta 2 and Ubuntu 7.04 Linux.  The system just locked up following the loading of files.  No error messages or anything.  I solved the problem by using Partition Magic to delete the Home Server partition.  After deleting the old Home Server partition, I was able to install the system with no problem.

  • Anonymous
    June 25, 2007
    Very simple reason for joining problems: Wrong date! Something you normally don't think of...

  • Anonymous
    June 28, 2007
    Many of the home-based routers have a Domain Name field in their setup; if you leave it blank, this problem is avoided.

  • Anonymous
    December 11, 2007
    Sorry for the delay with an answer -- I don't have notifications about comments here. For the future, you can use the email link on the top right to send a message to me directly. Yes, OneCare is a known problem. It's not always interfering, but sometimes it does. If you turn it off for the join and turn it back on afterward, everything usually works. Overall, name resolution is one of the weakest links in modern home networks, interfering with Windows Home Server join almost as much as firewalls. Also, sometimes disconnecting from the Internet helps. That happens if your router sends all name resolution to your ISP even without a proxy. Then you get either "name cannot be resolved" or -- even worse -- your ISP really finds some SERVER or HPSERVER on its own network, but that's not your server...

  • Anonymous
    December 12, 2007
    Step 1: Disconnect the Internet (pull the network cable out of the cable or DSL modem) for a moment. Reboot, and try again. Resolving names on the Internet instead of on your home network is currently the #2 reason for connection failures. If that does not help, use the link on the top right to email me. Also, use the troubleshooter's "Error reporting" and tell me the CAB number it gives you. Minor hint: is the third computer on the same subnet as the server? If not, the firewall may be the issue. BTW, if it's the only wireless computer out of your three, that's likely to be the case.

  • Anonymous
    December 13, 2007
    Thanks for the reply. Did step 1. Same result. Used the link / top right. Cut / pasted results from TroubleShooter. Never found an "error reporting", nor a CAB. Too much consumer in me, I guess. Computer at issue is desktop. Second connection was laptop. It is having issues accessing shared folders, but... one day at a time. Sorry, but I'm trying to be calm ... rr

  • Anonymous
    December 13, 2007
    Ok, step 1 did not work. About the next one, I got your troubleshooter log... Are you sure that http://<yourserver>:55000/EnrollId/id.xml (as described in this article) does not work? Also, I noticed that you have at least four different versions of OneCare on your machine. Plus a strange application, GTOneCare, which looks like OneCare but strangely is not the real OneCare. Most of them are probably just residue from their betas, but are you sure none of them still work and block connections? Anyway, what are the results of the steps described in the article?

  • Anonymous
    December 27, 2007
    I'm having an issue with the WHS connector installing. I'm able to find the server and authenticate the password, but it fails at configuring computer backup, and in the application event viewer I get this error after every time I try to install:

    Log Name:      Application
    Source:        HomeServer
    Date:          12/27/2007 11:11:03 PM
    Event ID:      1813
    Task Category: (7)
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:      Radagast
    Description:
    The private key of the Windows Home Server's certificate is not accessible by the SYSTEM account.  File and folder permisions on the system volume may have been modified inappropriately.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="HomeServer" />
        <EventID Qualifiers="53247">1813</EventID>
        <Level>2</Level>
        <Task>7</Task>
        <Keywords>0x80000000000000</Keywords>
        <TimeCreated SystemTime="2007-12-28T04:11:03.000Z" />
        <EventRecordID>172250</EventRecordID>
        <Channel>Application</Channel>
        <Computer>Radagast</Computer>
        <Security />
      </System>
      <EventData>
      </EventData>
    </Event>

  • Anonymous
    February 07, 2008
    Wrong date caused me issues when I installed a new motherboard. This is worth checking before doing anything else and it would be useful if this was added as an early (first?) step in the troubleshooter.

  • Anonymous
    August 13, 2008
    I had the connector up and running fine on Vista Home Premium. I upgraded the OS to Ultimate 32-bit and am now having the same error after entering the password in connector setup. I can access the server shares, websites etc., and I'm also able to see my enroll id of 1, but cannot see any errors. Has anyone gotten around this problem? I've run out of ideas, and I work in the networking field, so I have half an idea. Any help much appreciated; you can email me at edster2007@gmail.com

  • Anonymous
    August 15, 2008
    CD unlikely.
    > if you are using an older version of the connector software, to use the one in the sharessoftware folder
    Yep, because it's updated with Windows Update. Again, please notice that I don't work for WHS anymore.

  • Anonymous
    August 26, 2008
    I have two routers connected to my network.

    Router 1 is an ADSL wireless N router running a firewall. The firewall is configured to allow access to ports 80/433/4125 to router 2. DHCP is enabled, providing IP addresses in the range 192.168.1.n.

    Router 2 is a gigabit/wireless router with its firewall disabled. It has an external IP address of 192.168.1.2 provided by router 1, and is connected to router 1 via the WAN port of router 2. Port forwarding is configured for ports 80/433/4125 to WHS. DHCP is enabled, providing IP addresses in the range 192.168.0.n. This network has several XP Pro PC clients and of course the WHS.

    Remote access from the Internet to WHS works fine. However, PC clients cannot connect to the WHS via the connector (although I know they can with only one router connected). My best guess is that, as router 1 is on a different subnet than the WHS, the WHS is blocking connections from/to it, as it normally will only accept connections from devices on the same subnet as itself.

    Option A: Is it possible to modify WHS to accept connections from different subnets?
    Option B: Is it feasible to turn off DHCP on router 1 and allocate it a static IP from the router 2 subnet?
    Option C: I could just run router 1 in bridged mode and authenticate, route and firewall (SPI) from router 2 (but would router 2 still connect to router 1 via its WAN port?)
    Option D: I could turn router 1 into a switch and connect router 1 to router 2 via a router 2 LAN port (not preferred, as I lose a gigabit LAN port).

    These options are in the order of my own preference. Views on these options would be most appreciated.

  • Anonymous
    October 13, 2008
    Woohooo! After many evenings reading blogs, addressing everything I could find, turning off firewalls etc. etc., I still could not understand why discover.exe and the WHS Connector showed timeouts whilst connecting to the WHS. YET, I could use the very same URLs from IE & FF and connect :( This is from my Windows XP SP3 workstation.

    On the off-chance, I pasted one of the timeout errors into Google, and clicked one of the apparently 'unrelated' issues concerning WSUS failing with the same error: http://www.tech-archive.net/Archive/Windows/microsoft.public.windowsupdate/2007-07/msg01132.html The check-list there further included checking the WinHttp proxy settings via the proxycfg tool.

    BINGO! -- The machine which became my WHS used to serve as a Squid Proxy Cache (may well do so again, one day soon!)... I had gone to great lengths to ensure all my browsers & Internet Connection Settings were configured as 'direct connection'... BUT of course, WinHttp was still configured to attempt to use the proxy! I issued a proxycfg -d command, and IMMEDIATELY, everything else clicked into place -- the toolkit passed all the connection tests, and WHS connected correctly, backups started etc. Boy, am I happy now!

    (So, it appears that in PP1, they've changed the way that the WHS connector works to use WinHttp?) WHY-OH-WHY does WinHttp require a separate utility to configure its proxy settings, and not simply follow the "Internet Options" settings!?!? Are you listening, Microsoft?